PRTools contents |
GENDATK
B = GENDATK(A,N,K,S)
Input
  A   Dataset
  N   Number of points (optional; default: 50)
  K   Number of nearest neighbors (optional; default: 1)
  S   Standard deviation (optional; default: 1)
Output
  B   Generated dataset
Generates N points using the K nearest neighbors of the objects in dataset A. First, N points of A are chosen in random order. Next, for each of these points and for each direction (feature), a Gaussian-distributed offset is added with zero mean and a standard deviation equal to S times the mean signed difference between the point of A under consideration and its K nearest neighbors in A.
As a result, the generated points follow the local density properties of the points from which they originate.
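The procedure described above can be sketched in a few lines of Python. This is only an illustration of the prose description, not the PRTools source: the function name `gendatk_sketch`, the use of Euclidean distance, the exclusion of a point from its own neighbor list, and the cycling through the shuffled points when N exceeds the size of A are all assumptions.

```python
import math
import random

def gendatk_sketch(points, n=50, k=1, s=1.0, rng=None):
    """Generate n points around randomly chosen points (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    m = len(points)
    if m < k + 1:
        raise ValueError("need at least k + 1 points")
    dim = len(points[0])
    # Visit the points in random order; cycle if n > len(points) (assumption).
    order = list(range(m))
    rng.shuffle(order)
    generated = []
    for i in range(n):
        j = order[i % m]
        p = points[j]
        # Indices of the k nearest neighbors of p, excluding p itself.
        others = [q for q in range(m) if q != j]
        others.sort(key=lambda q: math.dist(p, points[q]))
        neighbors = others[:k]
        new_point = []
        for f in range(dim):
            # Mean signed difference between p and its neighbors along feature f.
            mean_diff = sum(p[f] - points[q][f] for q in neighbors) / k
            # Scaling a standard normal draw by s * mean_diff gives a zero-mean
            # Gaussian offset with standard deviation |s * mean_diff|.
            new_point.append(p[f] + rng.gauss(0.0, 1.0) * s * mean_diff)
        generated.append(new_point)
    return generated
```

With `s=0` every offset vanishes and the sketch simply returns the selected points of A; larger values of `s` spread the generated points further along the directions in which the neighbors lie.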
If A is a multi-class dataset, the above procedure is applied class by class, ignoring objects of other classes and any unlabeled objects.
If N is a vector of sizes, exactly N(I) objects are generated for class I. If N is omitted, the default listed for N in the input description above applies.
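The class-by-class handling can be illustrated with a small Python sketch. It is a minimal illustration under stated assumptions, not PRTools code: the helper name `generate_per_class`, the use of `None` to mark unlabeled objects, the sorted ordering of class labels, and the `generate` callback (any function that takes the points of one class and a count and returns that many new points) are all hypothetical.

```python
def generate_per_class(points, labels, sizes, generate):
    """Apply a generator class by class (hypothetical helper).

    sizes[i] is the number of objects requested for the i-th class,
    with classes taken in sorted label order (assumption).
    """
    # Group objects by class label; None marks an unlabeled object
    # (assumption), and such objects are skipped.
    by_class = {}
    for p, lab in zip(points, labels):
        if lab is not None:
            by_class.setdefault(lab, []).append(p)
    class_names = sorted(by_class)
    if len(sizes) != len(class_names):
        raise ValueError("need one size per class")
    out_points, out_labels = [], []
    for lab, n in zip(class_names, sizes):
        # Only the objects of the current class are passed to the generator,
        # so neighbors from other classes never influence the result.
        new = generate(by_class[lab], n)
        out_points.extend(new)
        out_labels.extend([lab] * len(new))
    return out_points, out_labels
```

Any per-class generator can be plugged in as `generate`; the sketch only shows how the size vector is matched to the classes and how unlabeled objects are left out.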