Dev Docs
Here is what you have to do to bring your package into Neighborhood.jl.
- Add your package as a dependency to Neighborhood.jl.
- Add the search type into the constant
SSS
insrc/Neighborhood.jl
and export it. - Then, proceed through this page and add as many methods as you can.
An example implementation for KDTrees of NearestNeighbors.jl is in src/kdtree.jl
.
Mandatory methods
Let S
be the type of the search structure of your package. To participate in this common API you should extend the following methods:
searchstructure(::Type{S}, data, metric; kwargs...) → ss
search(ss::S, query, t::SearchType; kwargs...) → idxs, ds
for both types of t
: WithinRange, NeighborNumber
. search
returns the indices of the neighbors (in the original data
) and their distances from the query
. Notice that ::Type{S}
only needs the supertype, e.g. KDTree
, without the type-parameters.
Performance methods
The following methods are implemented automatically from Neighborhood.jl if you extend the mandatory methods. However, if there are performance benefits you should write your own extensions.
isearch(ss::S, query, t::SearchType; kwargs...) → idxs # only indices
bulksearch(ss::S, queries, ::SearchType; kwargs...) → vec_of_idxs, vec_of_ds
bulkisearch(ss::S, queries, ::SearchType; kwargs...) → vec_of_idxs
Predicate methods
The following methods are extremely useful in e.g. timeseries analysis.
search(ss::S, query, t::SearchType, skip; kwargs...)
bulksearch(ss::S, queries, t::SearchType, skip; kwargs...)
(and their "i" complements, isearch, bulkisearch
).
These methods "skip" found neighbors depending on skip
. In the first method skip
takes one argument: skip(i)
the index of the found neighbor (in the original data) and returns true
if this neighbor should be skipped. In the second version, skip
takes two arguments skip(i, j)
where now j
is simply the index of the query that we are currently searching for.
You can kill two birds with one stone and directly implement one method:
search(ss::S, query, t::SearchType, skip = alwaysfalse; kwargs...)
to satisfy both mandatory API as well as this one.
Insertion/deletion methods
Simply extend Base.insert!
and Base.deleteat!
for your search structure.
Testing
The Neighborhood.Testing
submodule contains utilities for testing the return value of search
and related functions for your search structure. Most of these functions use Test.@test
internally, so just call within a @testset
in your unit tests.
Neighborhood.Testing
— ModuleUtilities for testing search structures.
Neighborhood.Testing.cmp_search_results
— Functioncmp_search_results(results...)::Bool
Compare two or more sets of search results ((idxs, ds)
tuples) and check that they are identical up to ordering.
Neighborhood.Testing.cmp_bruteforce
— Functioncmp_bruteforce(results, data, metric, query, t[, skip])::Bool
Check whether results
returned from search
match those computed with Neighborhood.bruteforcesearch
(data, metric, query, t[, skip])
(up to order). skip
may be nothing
, which calls the 4-argument method.
Caution: results of a NeighborNumber
search are only expected to match if the distances from query
to each point in data
are all distinct, otherwise there may be some ambiguity in which data points are included.
Neighborhood.Testing.search_allfuncs
— Functionsearch_allfuncs(ss, query, t[, skip])
Call search
(ss, query, t[, skip])
and check that the result matches those for isearch
and knn
/inrange
(depending on search type) for the equivalent arguments.
skip
may be nothing
, in which case the 3-argument methods of all functions will be called. Uses Test.@test
internally.
Neighborhood.Testing.check_search_results
— Functioncheck_search_results(data, metric, results, query, t[, skip])
Check that results = search(ss, query, t[, skip])
make sense for a search structure ss
with data data
and metric metric
.
Note that this does not calculate the known correct value to compare to (which may be expensive for large data sets), just that the results have the expected properties. Use [cmp_bruteforce
] for a more exact test. skip
may be nothing
, in which case the 3-argument methods of all functions will be called. Uses Test.@test
internally.
Checks the following:
results
is a 2-tuple of(idxs, ds)
.ds[i] == metric(query, data[i])
.skip(i)
is false for alli
inidxs
.- For
t::NeighborNumber
:length(idxs) <= t.k
.
- For
t::WithinRange
:d <= t.r
for alld
inds
.
Neighborhood.Testing.test_bulksearch
— Functiontest_bulksearch(ss, queries, t[, skip=nothing])
Test that bulksearch
gives the same results as individual applications of search
.
skip
may be nothing
, in which case the 3-argument methods of both functions will be called. Uses Test.@test
internally.