# Public API

Neighborhood.jl is a Julia package that provides a unified interface for doing neighbor searches in Julia. This interface is described in this page.

## Search Structures

`Neighborhood.searchstructure`

— Function`searchstructure(S, data, metric; kwargs...) → ss`

Create a search structure `ss`

of type `S`

(e.g. `KDTree, BKTree, VPTree`

etc.) based on the given `data`

and `metric`

. The data types and supported metric types are package-specific, but typical choices are subtypes of `<:Metric`

from Distances.jl. Some common metrics are re-exported by Neighborhood.jl.

All currently supported search structures are:

BruteForce KDTree

The following functions are defined for search structures:

`Neighborhood.datatype`

— Function`datatype(::Type{S}) :: Type`

Get the type of the data points (arguments to the metric function) of a search struture of type `S`

.

`Neighborhood.getmetric`

— Function`getmetric(ss)`

Get the metric function used by the search structure `ss`

.

## Search functions

`Neighborhood.search`

— Function`search(ss, query, t::SearchType [, skip]; kwargs... ) → idxs, ds`

Perform a neighbor search in the search structure `ss`

for the given `query`

with search type `t`

(see `SearchType`

). Return the indices of the neighbors (in the original data) and the distances from the query.

Optional `skip`

function takes as input the index of the found neighbor (in the original data) `skip(i)`

and returns `true`

if this neighbor should be skipped.

Package-specific keywords are possible.

`Neighborhood.isearch`

— Function`isearch(args...; kwargs... ) → idxs`

Same as `search`

but only return the neighbor indices.

`Neighborhood.inrange`

— Function`inrange(ss, query, r::Real [, skip]; kwargs...)`

`search`

for `WithinRange(r)`

search type.

`Neighborhood.knn`

— Function`knn(ss, query, k::Int [, skip]; kwargs...)`

`search`

for `NeighborNumber(k)`

search type.

## Search types

`Neighborhood.SearchType`

— TypeSupertype of all possible search types of the Neighborhood.jl common API.

`Neighborhood.WithinRange`

— Type`WithinRange(r::Real) <: SearchType`

Search type representing all neighbors with distance `≤ r`

from the query (according to the search structure's metric).

`Neighborhood.NeighborNumber`

— Type`NeighborNumber(k::Int) <: SearchType`

Search type representing the `k`

nearest neighbors of the query (or approximate neighbors, depending on the search structure).

## Bulk searches

Some packages support higher performance when doing bulk searches (instead of individually calling `search`

many times).

`Neighborhood.bulksearch`

— Function`bulksearch(ss, queries, t::SearchType [, skip]; kwargs... ) → vec_of_idxs, vec_of_ds`

Same as `search`

but many searches are done for many input query points.

In this case `skip`

takes two arguments `skip(i, j)`

where now `j`

is simply the index of the query that we are currently searching for (`j`

is the index in `queries`

and goes from 1 to `length(queries)`

).

`Neighborhood.bulkisearch`

— Function`bulkisearch(ss, queries, t::SearchType [, skip]; kwargs... ) → vec_of_idxs`

Same as `bulksearch`

but return only the indices.

## Brute force searches

The `BruteForce`

"search structure" performs a linear search through its data array, calculating the distance from the query to each data point. This is the slowest possible implementation but can be used to check results from other search structures for correctness. The `Neighborhood.bruteforcesearch`

function can be used instead without having to create the search structure.

`Neighborhood.bruteforcesearch`

— Function`bruteforcesearch(data, metric, query, t::SearchType[, skip])`

Perform a brute-force search of type `t`

against data array `data`

(by calculating the metric for `query`

and and every point in `data`

).

`Neighborhood.BruteForce`

— Type`BruteForce`

A "search structure" which simply performs a brute-force search through the entire data array.

## Theiler window

`Neighborhood.Theiler`

— Type`Theiler(w::Int, nidxs = nothing)`

Struct that generates skip functions representing a Theiler window of size `w ≥ 0`

. This is useful when the query of a search is also part of the data used to create the search structure, typical in timeseries analysis. In this case, you do not want to find the query itself as its neighbor, and you typically want to avoid points with indices very close to the query as well. Giving `w=0`

excludes the query itself as its neighbor, see the boolean expressions below.

Let `theiler = Theiler(w)`

. Then, `theiler`

by itself can be used as a skip function in `bulksearch`

, because `theiler(i, j) ≡ abs(i-j) ≤ w`

. In addition, if the given argument `nidxs`

is *not* `nothing`

, then `theiler(i, j) ≡ abs(i-nidxs[j]) ≤ w`

. (useful to give as `nidxs`

the indices of the queries in the original data)

However `theiler`

can also be used in single searches. `theiler(n)`

(with one argument) generates the function `i -> abs(i-n) ≤ w`

. So `theiler(n)`

can be given to `search`

as the `skip`

argument. Notice that `theiler`

is an instance of `Theiler`

, and in summary you'd have to do `Theiler(w)(n)`

and give that to `search`

.