PRTools contents

PRTools manual

gendatk

GENDATK

K-Nearest neighbor data generation

    B = GENDATK(A,N,K,S)

Input
 A Dataset
 N Number of points (optional; default: 50)
 K Number of nearest neighbors (optional; default: 1)
 S Standard deviation (optional; default: 1)

Output
 B Generated dataset

Description

Generates N points using the K nearest neighbors of the objects in the dataset A. First, N points of A are chosen in random order. Next, for each of these points and for each direction (feature), a Gaussian-distributed offset is added. The offset has zero mean and a standard deviation equal to S times the mean signed difference between the point of A under consideration and its K nearest neighbors in A.

As a result, the generated points follow the local density properties around the points from which they originate.
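The procedure described above can be sketched in plain MATLAB for a single class. This is an illustration under stated assumptions, not the PRTools implementation: it works on an ordinary data matrix instead of a dataset object, uses Euclidean distances to find neighbors, and draws source points with replacement. The variable names X, srcIdx, diffs, and sd are hypothetical.

```matlab
% Sketch only: single class, plain matrix input, no PRTools objects.
X = randn(30, 2);                 % toy data: 30 points, 2 features
N = 50;  K = 3;  S = 1;           % same roles as N, K, S in GENDATK

[m, d] = size(X);
B = zeros(N, d);                  % generated points, one per row
for i = 1:N
    srcIdx = randi(m);            % source point (assumption: drawn with replacement)
    diffs = bsxfun(@minus, X, X(srcIdx, :));  % signed differences to all points
    dist2 = sum(diffs.^2, 2);     % squared Euclidean distances (assumption)
    dist2(srcIdx) = Inf;          % exclude the source point itself
    [~, order] = sort(dist2);
    nn = order(1:K);              % indices of the K nearest neighbors
    % Per-feature standard deviation: S times the mean signed difference.
    % abs() is applied because a standard deviation cannot be negative;
    % the sign does not change a zero-mean Gaussian.
    sd = S * abs(mean(diffs(nn, :), 1));
    B(i, :) = X(srcIdx, :) + sd .* randn(1, d);
end
```

In this sketch, a source point in a dense region has nearby neighbors and therefore receives small offsets, while a point in a sparse region receives larger ones.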

If A is a multi-class dataset, the above procedure is followed class by class, ignoring objects of other classes and any unlabeled objects.

If N is a vector of sizes, exactly N(I) objects are generated for class I. If N is omitted, the default of 50 points listed under Input is used.
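A possible call with per-class sizes is shown below. It assumes PRTools is on the MATLAB path and uses the PRTools routines gendatb (banana-shaped two-class data) and classsizes to create and inspect the datasets; the sizes and parameter values are chosen only for illustration.

```matlab
% Assumes PRTools is installed and on the path.
A = gendatb([20 20]);            % two-class training set, 20 objects per class
B = gendatk(A, [30 70], 3, 1);   % N = [30 70], K = 3, S = 1
classsizes(B)                    % objects per class in the generated dataset
```

With N given as the vector [30 70], class 1 of B should contain 30 objects and class 2 should contain 70, each generated only from objects of the same class in A.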

See also

datasets, gendatp
