Dev Docs

Here is what you have to do to bring your package into Neighborhood.jl.

  • Add your package as a dependency to Neighborhood.jl.
  • Add the search type into the constant SSS in src/Neighborhood.jl and export it.
  • Then, proceed through this page and add as many methods as you can.

An example implementation for KDTrees of NearestNeighbors.jl is in src/kdtree.jl.

Mandatory methods

Let S be the type of the search structure of your package. To participate in this common API you should extend the following methods:

searchstructure(::Type{S}, data, metric; kwargs...) → ss
search(ss::S, query, t::SearchType; kwargs...) → idxs, ds

for both types of t: WithinRange, NeighborNumber. search returns the indices of the neighbors (in the original data) and their distances from the query. Notice that ::Type{S} only needs the supertype, e.g. KDTree, without the type-parameters.

Performance methods

The following methods are implemented automatically from Neighborhood.jl if you extend the mandatory methods. However, if there are performance benefits you should write your own extensions.

isearch(ss::S, query, t::SearchType; kwargs...) → idxs  # only indices
bulksearch(ss::S, queries, ::SearchType; kwargs...) → vec_of_idxs, vec_of_ds
bulkisearch(ss::S, queries, ::SearchType; kwargs...) → vec_of_idxs

Predicate methods

The following methods are extremely useful in e.g. timeseries analysis.

search(ss::S, query, t::SearchType, skip; kwargs...)
bulksearch(ss::S, queries, t::SearchType, skip; kwargs...)

(and their "i" complements, isearch, bulkisearch).

These methods "skip" found neighbors depending on skip. In the first method skip takes one argument: skip(i) the index of the found neighbor (in the original data) and returns true if this neighbor should be skipped. In the second version, skip takes two arguments skip(i, j) where now j is simply the index of the query that we are currently searching for.

You can kill two birds with one stone and directly implement one method:

search(ss::S, query, t::SearchType, skip = alwaysfalse; kwargs...)

to satisfy both mandatory API as well as this one.

Insertion/deletion methods

Simply extend Base.insert! and Base.deleteat! for your search structure.

Testing

The Neighborhood.Testing submodule contains utilities for testing the return value of search and related functions for your search structure. Most of these functions use Test.@test internally, so just call within a @testset in your unit tests.

Neighborhood.Testing.cmp_bruteforceFunction
cmp_bruteforce(results, data, metric, query, t[, skip])::Bool

Check whether results returned from search match those computed with Neighborhood.bruteforcesearch(data, metric, query, t[, skip]) (up to order). skip may be nothing, which calls the 4-argument method.

Caution: results of a NeighborNumber search are only expected to match if the distances from query to each point in data are all distinct, otherwise there may be some ambiguity in which data points are included.

source
Neighborhood.Testing.search_allfuncsFunction
search_allfuncs(ss, query, t[, skip])

Call search(ss, query, t[, skip]) and check that the result matches those for isearch and knn/inrange (depending on search type) for the equivalent arguments.

skip may be nothing, in which case the 3-argument methods of all functions will be called. Uses Test.@test internally.

source
Neighborhood.Testing.check_search_resultsFunction
check_search_results(data, metric, results, query, t[, skip])

Check that results = search(ss, query, t[, skip]) make sense for a search structure ss with data data and metric metric.

Note that this does not calculate the known correct value to compare to (which may be expensive for large data sets), just that the results have the expected properties. Use [cmp_bruteforce] for a more exact test. skip may be nothing, in which case the 3-argument methods of all functions will be called. Uses Test.@test internally.

Checks the following:

  • results is a 2-tuple of (idxs, ds).
  • ds[i] == metric(query, data[i]).
  • skip(i) is false for all i in idxs.
  • For t::NeighborNumber:
    • length(idxs) <= t.k.
  • For t::WithinRange:
    • d <= t.r for all d in ds.
source
Neighborhood.Testing.test_bulksearchFunction
test_bulksearch(ss, queries, t[, skip=nothing])

Test that bulksearch gives the same results as individual applications of search.

skip may be nothing, in which case the 3-argument methods of both functions will be called. Uses Test.@test internally.

source