VPTrees.jl

Implementation of Vantage Point Trees in Julia. A Vantage Point Tree is a metric tree which can be used for nearest neighbor or radius searches in any metric space. See Data structures and algorithms for nearest neighbor search in general metric spaces.

Example:

using VPTrees
using StringDistances

data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
vptree = VPTree(data, Levenshtein()))
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
#  "bla" 
#  "blub"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
#  "baube"
#  "blub" 
#  "bla"

The following packages implement other data structures for use in nearest neighbor and radius search in metric space:

API

VPTrees.VPTreeType
VPTree(data::Vector{InputType}, metric; threaded=nothing)

Construct Vantage Point Tree with a vector of data and given a callable metric. threaded uses threading is only avaible in Julia 1.3+ to parallelize construction of the Tree. When not explicitly set, is set to true when the necessary conditions are met.

Example:

using VPTrees
using StringDistances

data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
vptree = VPTree(data, Levenshtein())
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
#  "bla"
#  "blub"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
#  "baube"
#  "blub"
#  "bla"
source
VPTrees.findMethod
find(vptree::VPTree{InputType, MetricReturnType}, query::InputType, radius::MetricReturnType, skip=nothing)::Vector{Int}

Find all items in vptree within radius of query with respect to the metric defined in the VPTree. Returns indices into VPTree.data. The optional skip argument is a function f(::Int)::Bool which can be used to omit points in the tree from the search based on their index.

Example

using VPTrees
using StringDistances

data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
metric = (a, b) -> evaluate(Levenshtein(),a,b)
vptree = VPTree(data, metric)
query = "blau"
radius = 2
data[find(vptree, query, radius)]
# 2-element Array{String,1}:
#  "bla"
#  "blub"
source
VPTrees.find_nearestMethod
find_nearest(vptree::VPTree{InputType, MetricReturnType}, query::InputType, n_neighbors::Int, skip=nothing)::Vector{Int}

Find n_neighbors items in vptree closest to query with respect to the metric defined in the VPTree. Returns indices into VPTree.data. The optional skip argument is a function f(::Int)::Bool which can be used to omit points in the tree from the search based on their index.

Example:

using VPTrees
using StringDistances

data = ["bla", "blub", "asdf", ":assd", "ast", "baube"]
metric = (a, b) -> evaluate(Levenshtein(),a,b)
vptree = VPTree(data, metric)
query = "blau"
n_neighbors = 3
data[find_nearest(vptree, query, n_neighbors)]
# 3-element Array{String,1}:
#  "baube"
#  "blub"
#  "bla"
source