HNSW API

HNSW.HierarchicalNSW (Method)
HierarchicalNSW(data;
    metric=Euclidean(),
    M=10,
    M0=2M,
    m_L=1 / log(M),
    efConstruction=100,
    ef=10,
    max_elements=length(data)
)

Create HNSW structures based on data.

  • data: An AbstractVector of the data points to be indexed.
  • metric: The metric to use for distance calculation. Any metric defined in Distances.jl should work, as should any type for which evaluate(::CustomMetric, x, y) is implemented.
  • M: The maximum number of links per node on a level >1. Note that this value strongly influences recall, depending on the data.
  • M0: The maximum number of links per node on the bottom layer (level 1). Defaults to M0 = 2M.
  • efConstruction: Maximum length of the dynamic link lists during index creation. Low values may reduce recall, while large values increase the runtime of index creation.
  • ef: Maximum length of the dynamic link lists during search. May be changed afterwards using set_ef!(hnsw, value).
  • m_L: Prefactor for random level generation. Defaults to 1 / log(M).
  • max_elements: May be set to a value larger than length(data) if elements will be added to the structure after initial creation.
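
To make the role of m_L more concrete, here is a minimal sketch of how a level prefactor of this kind is commonly used to draw a random level per point, following the scheme in the original HNSW paper. The helper random_level is hypothetical and not part of the documented API; the package's internal implementation may differ.

```julia
# Sketch only: exponentially distributed level assignment, as described in the HNSW paper.
# `random_level` is a hypothetical helper, not a function exported by HNSW.
M = 10
m_L = 1 / log(M)

# 1 - rand() lies in (0, 1], so the logarithm is always finite
random_level(m_L) = floor(Int, -log(1 - rand()) * m_L) + 1

levels = [random_level(m_L) for _ in 1:100_000]

# Print the share of points whose top level is l; with m_L = 1 / log(M),
# roughly 1 - 1/M of all points are expected to stay on level 1
for l in 1:maximum(levels)
    println("level ", l, ": ", count(==(l), levels) / length(levels))
end
```

With this default, each higher level holds about a factor of M fewer points than the one below it, which is what keeps the upper layers of the graph sparse.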

Note: the data object is used as the primary storage of the vectors. Don't modify it outside HNSW after initialization.

Sample:

using HNSW

dim = 10
num_elements = 10000
data = [rand(dim) for i=1:num_elements]

#Initialize HNSW struct
hnsw = HierarchicalNSW(data; efConstruction=100, M=16, ef=50)

#Add all data points into the graph
#Optionally pass a subset of the indices in data to partially construct the graph
add_to_graph!(hnsw)

HNSW.HierarchicalNSW (Method)
HierarchicalNSW(vector_type::Type;
    metric=Euclidean(),
    M=10, #5 to 48
    M0=2M,
    m_L=1 / log(M),
    efConstruction=100,
    ef=10,
    max_elements=100000
)

This method constructs an empty HNSW graph for elements of type vector_type. Data should be added afterwards with the add! method.

Example:

using HNSW

dim = 5
num_elements = 100
data = [rand(Float32, dim) for n ∈ 1:num_elements]

hnsw = HierarchicalNSW(eltype(data))

# Now add new data
HNSW.add!(hnsw, data)

HNSW.add_to_graph! (Function)
add_to_graph!(notify_func, hnsw, indices, multithreading=false)

Add each i ∈ indices, referring to data[i], to the graph.

notify_func(i) provides an interface for progress notification and is called with the current index.

Indices that were already added are ignored.

add_to_graph!(hnsw, indices)

Short form of add_to_graph!(notify_func, hnsw, indices).
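
As an illustration of both forms, the sketch below builds the graph in two steps: first a subset of indices without notification, then the remainder with a progress callback passed via do-block syntax. It assumes notify_func is invoked once per processed index, as the description above suggests; the counter done and the reporting interval are made up for this example.

```julia
using HNSW

data = [rand(10) for _ in 1:2_000]
hnsw = HierarchicalNSW(data; efConstruction=100, M=16, ef=50)

# Short form: insert the first half without progress reporting
add_to_graph!(hnsw, 1:1_000)

# Long form: the do-block is passed as notify_func and receives the current index
done = Ref(0)   # illustrative counter, not part of the API
add_to_graph!(hnsw, 1_001:2_000) do i
    done[] += 1
    i % 250 == 0 && println("inserted index ", i)
end
```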

Missing docstring for set_ef!. Check Documenter's build log for details.

Missing docstring for knn_search. Check Documenter's build log for details.
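
Because set_ef! and knn_search have no docstrings on this page, the following is a hedged usage sketch rather than a reference. set_ef!(hnsw, value) is taken from the description of ef above; the call knn_search(hnsw, queries, k) and its return value (one collection of neighbor indices and one of distances per query) are assumptions based on the package's usage examples and should be checked against the installed version.

```julia
using HNSW

dim = 10
data = [rand(dim) for _ in 1:1_000]
hnsw = HierarchicalNSW(data; efConstruction=100, M=16, ef=50)
add_to_graph!(hnsw)

# Enlarge the dynamic link list used during search; in HNSW, larger ef values
# generally improve recall at the cost of slower queries
set_ef!(hnsw, 100)

queries = [rand(dim) for _ in 1:5]
k = 10

# Assumed signature and return value: indices and distances of the k
# approximate nearest neighbors, one entry per query
idxs, dists = knn_search(hnsw, queries, k)
```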