Find nearest neighbors
kd_nearest_neighbors(x, v, n, ...)
# S3 method for arrayvec
kd_nearest_neighbors(x, v, n, p = 2, a = 0, ...)
# S3 method for matrix
kd_nearest_neighbors(x, v, n, cols = NULL, p = 2, a = 0, ...)
# S3 method for data.frame
kd_nearest_neighbors(x, v, n, cols = NULL, w = NULL, p = 2, a = 0, ...)
kd_nn_indices(x, v, n, ...)
# S3 method for arrayvec
kd_nn_indices(x, v, n, distances = FALSE, p = 2, a = 0, ...)
# S3 method for matrix
kd_nn_indices(
  x,
  v,
  n,
  cols = NULL,
  distances = FALSE,
  p = 2,
  a = 0,
  validate = TRUE,
  ...
)
# S3 method for data.frame
kd_nn_indices(
  x,
  v,
  n,
  cols = NULL,
  w = NULL,
  distances = FALSE,
  p = 2,
  a = 0,
  validate = TRUE,
  ...
)
kd_nearest_neighbor(x, v)
# S3 method for arrayvec
kd_nearest_neighbor(x, v)
# S3 method for matrix
kd_nearest_neighbor(x, v)
x | an object sorted by kd_sort |
v | a vector specifying where to look |
n | the number of neighbors to return |
... | ignored |
p | exponent of p-norm (Minkowski) distance |
a | approximate neighbors within (1 + a) |
cols | integer or character vector or formula indicating columns |
w | distance weights |
distances | return distances as attribute if true |
validate | if FALSE, no input validation is performed |
kd_nearest_neighbors | one or more rows from the sorted input |
kd_nn_indices | a vector of row indices indicating the result |
kd_nearest_neighbor | the row index of the neighbor |
Distance is calculated as $$D_{ij} = \left[\sum_k w_k \, G(x_{ik}, x_{jk})^p\right]^{1 / p}$$ where \(i\) and \(j\) are records, \(k\) indexes the fields or tuple elements, and \(w_k\) is the weight in the \(k\)th dimension. The function \(G\) depends on the field type. For reals, \(G(a, b) = |a - b|\). For logicals and integers, \(G(a, b)\) is zero if \(a = b\) and one otherwise. For strings, \(G(a, b)\) is the Levenshtein (edit) distance. Convert strings to factors unless edit distance makes sense for your application.
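The formula can be checked by hand for purely numeric records. The following is a minimal base R sketch, not the package's implementation; weighted_minkowski is a hypothetical helper name, and only the real-valued case \(G(a, b) = |a - b|\) is covered.

```r
# Hypothetical helper, not part of the package: weighted Minkowski
# distance between two numeric records a and b.
weighted_minkowski <- function(a, b, w = rep(1, length(a)), p = 2) {
  stopifnot(length(a) == length(b), length(a) == length(w), p >= 1)
  # For reals, G(a, b) = |a - b|
  g <- abs(a - b)
  # D = [sum_k w_k * G_k^p]^(1/p)
  sum(w * g^p)^(1 / p)
}

weighted_minkowski(c(0, 0), c(3, 4))               # p = 2 (Euclidean): 5
weighted_minkowski(c(0, 0), c(3, 4), p = 1)        # p = 1 (Manhattan): 7
weighted_minkowski(c(0, 0), c(3, 4), w = c(1, 0))  # zero weight drops dimension 2: 3
```

Setting a weight to zero removes that dimension from the distance entirely, which is one way to see what the w argument controls.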
When using the cols argument, the search key v is handled specially. If the length of v is equal to the number of columns of x, then it is assumed that the key is given in the same order as the columns of x. In that case, the key v is mapped through the cols argument. This will possibly change the order of the elements and length of the search key. The reason for this is that it allows one to use a row of x as the key and it will respect the cols argument. Otherwise, or if validate is FALSE, the search key v is passed unchanged and must be given with the correct length and order to match the cols argument. The same is true of the w parameter.
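The key-mapping rule can be illustrated in base R. This is a sketch of the behavior as described here, not the package's own code; map_key is a hypothetical helper, and it handles only integer or character cols (formulas are left out).

```r
# Hypothetical helper illustrating the described rule; not the package's code.
map_key <- function(v, x, cols = NULL, validate = TRUE) {
  # Without cols, or with validation switched off, the key passes through as is
  if (is.null(cols) || !validate) return(v)
  # Translate column names to positions
  if (is.character(cols)) cols <- match(cols, colnames(x))
  # A full-width key is assumed to follow the column order of x,
  # so it is reordered and shortened through cols
  if (length(v) == ncol(x)) v[cols] else v
}

x <- data.frame(a = c(0.1, 0.9), b = c(0.5, 0.2), c = c(0.7, 0.3))
map_key(unlist(x[1, ]), x, cols = c("c", "a"))  # full row: becomes c, a
map_key(c(0.7, 0.1), x, cols = c("c", "a"))     # already matches cols: unchanged
```

Both calls end up with the same key values (0.7, 0.1). The first starts from a whole row of x, the second from a key already written in cols order.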
if (has_cxx17()) {
  x = matrix(runif(200), 100)
  y = matrix_to_tuples(x)
  kd_sort(y, inplace = TRUE)
  y[kd_nearest_neighbor(y, c(1/2, 1/2)),]
  kd_nearest_neighbors(y, c(1/2, 1/2), 3)
  y[kd_nn_indices(y, c(1/2, 1/2), 5),]
}
#>           [,1]      [,2]
#> [1,] 0.5262575 0.4970160
#> [2,] 0.4668478 0.4783650
#> [3,] 0.5236941 0.5392107
#> [4,] 0.4728633 0.4533170
#> [5,] 0.4516867 0.5455092
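To sanity-check results like those above, a brute-force search over a plain matrix can serve as a reference. The sketch below uses base R only; brute_nn_indices is a hypothetical helper and assumes unweighted Minkowski distance on numeric columns. Because the indices returned by kd_nn_indices refer to the sorted object, and the order in which they are returned is not stated here, it is safer to compare the selected rows as sets than to compare positions.

```r
# Hypothetical brute-force reference, independent of the package
brute_nn_indices <- function(x, v, n, p = 2) {
  # Minkowski distance from every row of x to the key v
  d <- rowSums(abs(sweep(x, 2, v))^p)^(1 / p)
  # Row indices of the n smallest distances, nearest first
  head(order(d), n)
}

set.seed(1)
x <- matrix(runif(200), 100)
idx <- brute_nn_indices(x, c(1/2, 1/2), 5)
x[idx, ]
```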