VPTrees.jl
Implementation of Vantage Point Trees in Julia. A Vantage Point Tree is a metric tree which can be used for nearest neighbor or radius searches in any metric space. See Data structures and algorithms for nearest neighbor search in general metric spaces.
Example:
using VPTrees
using StringDistances
data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
vptree = VPTree(data, Levenshtein()))
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
# "bla"
# "blub"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
# "baube"
# "blub"
# "bla"Related Packages
The following packages implement other data structures for use in nearest neighbor and radius search in metric space:
- BKTrees.jl
- NearestNeighbors.jl implementing Ball Trees
API
VPTrees.VPTree — TypeVPTree(data::Vector{InputType}, metric; threaded=nothing)Construct Vantage Point Tree with a vector of data and given a callable metric. threaded uses threading is only avaible in Julia 1.3+ to parallelize construction of the Tree. When not explicitly set, is set to true when the necessary conditions are met.
Example:
using VPTrees
using StringDistances
data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
vptree = VPTree(data, Levenshtein())
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
# "bla"
# "blub"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
# "baube"
# "blub"
# "bla"VPTrees.find — Methodfind(vptree::VPTree{InputType, MetricReturnType}, query::InputType, radius::MetricReturnType, skip=nothing)::Vector{Int}Find all items in vptree within radius of query with respect to the metric defined in the VPTree. Returns indices into VPTree.data. The optional skip argument is a function f(::Int)::Bool which can be used to omit points in the tree from the search based on their index.
Example
using VPTrees
using StringDistances
data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
metric = (a, b) -> evaluate(Levenshtein(),a,b)
vptree = VPTree(data, metric)
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
# "bla"
# "blub"VPTrees.find_nearest — Methodfind_nearest(vptree::VPTree{InputType, MetricReturnType}, query::InputType, n_neighbors::Int, skip=nothing)::Vector{Int}Find n_neighbors items in vptree closest to query with respect to the metric defined in the VPTree. Returns indices into VPTree.data. The optional skip argument is a function f(::Int)::Bool which can be used to omit points in the tree from the search based on their index.
Example:
using VPTrees
using StringDistances
data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
metric = (a, b) -> evaluate(Levenshtein(),a,b)
vptree = VPTree(data, metric)
query = "blau"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
# "baube"
# "blub"
# "bla"