Public API

Neighborhood.jl is a Julia package that provides a unified interface for doing neighbor searches in Julia. This interface is described in this page.

Search Structures

searchstructure(S, data, metric; kwargs...) → ss

Create a search structure ss of type S (e.g. KDTree, BKTree, VPTree etc.) based on the given data and metric. The data types and supported metric types are package-specific, but typical choices are subtypes of <:Metric from Distances.jl. Some common metrics are re-exported by Neighborhood.jl.


All currently supported search structures are:


The following functions are defined for search structures:

datatype(::Type{S}) :: Type

Get the type of the data points (arguments to the metric function) of a search struture of type S.


Search functions

search(ss, query, t::SearchType [, skip]; kwargs... ) → idxs, ds

Perform a neighbor search in the search structure ss for the given query with search type t (see SearchType). Return the indices of the neighbors (in the original data) and the distances from the query.

Optional skip function takes as input the index of the found neighbor (in the original data) skip(i) and returns true if this neighbor should be skipped.

Package-specific keywords are possible.


Search types

WithinRange(r::Real) <: SearchType

Search type representing all neighbors with distance ≤ r from the query (according to the search structure's metric).

NeighborNumber(k::Int) <: SearchType

Search type representing the k nearest neighbors of the query (or approximate neighbors, depending on the search structure).


Bulk searches

Some packages support higher performance when doing bulk searches (instead of individually calling search many times).

bulksearch(ss, queries, t::SearchType [, skip]; kwargs... ) → vec_of_idxs, vec_of_ds

Same as search but many searches are done for many input query points.

In this case skip takes two arguments skip(i, j) where now j is simply the index of the query that we are currently searching for (j is the index in queries and goes from 1 to length(queries)).


Brute force searches

The BruteForce "search structure" performs a linear search through its data array, calculating the distance from the query to each data point. This is the slowest possible implementation but can be used to check results from other search structures for correctness. The Neighborhood.bruteforcesearch function can be used instead without having to create the search structure.

bruteforcesearch(data, metric, query, t::SearchType[, skip])

Perform a brute-force search of type t against data array data (by calculating the metric for query and and every point in data).


Theiler window

Theiler(w::Int, nidxs = nothing)

Struct that generates skip functions representing a Theiler window of size w ≥ 0. This is useful when the query of a search is also part of the data used to create the search structure, typical in timeseries analysis. In this case, you do not want to find the query itself as its neighbor, and you typically want to avoid points with indices very close to the query as well. Giving w=0 excludes the query itself as its neighbor, see the boolean expressions below.

Let theiler = Theiler(w). Then, theiler by itself can be used as a skip function in bulksearch, because theiler(i, j) ≡ abs(i-j) ≤ w. In addition, if the given argument nidxs is not nothing, then theiler(i, j) ≡ abs(i-nidxs[j]) ≤ w. (useful to give as nidxs the indices of the queries in the original data)

However theiler can also be used in single searches. theiler(n) (with one argument) generates the function i -> abs(i-n) ≤ w. So theiler(n) can be given to search as the skip argument. Notice that theiler is an instance of Theiler, and in summary you'd have to do Theiler(w)(n) and give that to search.