PRTools contents

PRTools manual

gendatd

GENDATD

Generation of 'difficult' normally distributed classes

    A = GENDATD(N,K,D1,D2,LABTYPE)

Input
 N Number of objects in each of the classes (default: [50 50])
 K Dimensionality of the dataset (default: 2)
 D1 Difference in mean in feature 1 (default: 3)
 D2 Difference in mean in feature 2 (default: 3)
 LABTYPE 'crisp' or 'soft' labels (default: 'crisp').

Output
 A Generated dataset

Description

Generates a K-dimensional 2-class dataset A of N objects. Within each class the variances of the first two dimensions differ strongly, which makes separating the classes 'difficult' for small sample sizes.

D1 is the difference between the class means for the first feature, and D2 is the difference for the second feature. In all other directions the means are equal. The two covariance matrices are equal, with a variance of 1 in all directions except for the second feature, which has a variance of 40. The first two features are rotated by 45 degrees to construct a strong correlation. Class priors are P(1) = P(2) = 0.5.
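The construction described above can be sketched in plain MATLAB, without PRTools. This is an illustration under stated assumptions, not the toolbox source: all variable names are hypothetical, class 1 is assumed to sit at the origin with class 2 shifted by D1 and D2, the shift is assumed to be applied before the rotation, and the rotation direction is assumed.

```matlab
% Sketch of the sampling scheme; names are illustrative, not PRTools internals.
N  = [50 50];   % objects per class
K  = 2;         % dimensionality
D1 = 3;         % mean difference in feature 1
D2 = 3;         % mean difference in feature 2

s = ones(1, K);          % standard deviation per feature
s(2) = sqrt(40);         % feature 2 has variance 40
m1 = zeros(1, K);        % class 1 mean (assumed at the origin)
m2 = zeros(1, K);
m2(1:2) = [D1 D2];       % class 2 differs only in features 1 and 2

% Draw both classes from Gaussians with the same diagonal covariance.
X1 = randn(N(1), K) .* repmat(s, N(1), 1) + repmat(m1, N(1), 1);
X2 = randn(N(2), K) .* repmat(s, N(2), 1) + repmat(m2, N(2), 1);
X  = [X1; X2];
labels = [ones(N(1), 1); 2 * ones(N(2), 1)];

% Rotate the first two features by 45 degrees to correlate them.
c = cos(pi/4);
R = [c -c; c c];         % assumed rotation direction
X(:, 1:2) = X(:, 1:2) * R;
```

Under these assumptions the rotated covariance of the first two features is R' * diag([1 40]) * R, which gives a variance of 20.5 for each feature and a covariance of 19.5, so the correlation has magnitude of about 0.95.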

If N is a vector of sizes, exactly N(I) objects are generated for class I, I = 1,2.
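A few calls illustrating the arguments, following the signature given above. This assumes PRTools is on the MATLAB path; the class sizes and mean differences are arbitrary example values.

```matlab
% Requires PRTools on the MATLAB path.
A = gendatd;                      % all defaults
B = gendatd([20 80], 5);          % 20 objects in class 1, 80 in class 2, 5 features
C = gendatd([20 80], 5, 1, 4);    % mean difference 1 in feature 1, 4 in feature 2
scatterd(A);                      % PRTools scatter plot of the dataset
```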

LABTYPE defines the desired label type: 'crisp' or 'soft'. In the latter case, the true posterior probabilities are used as the labels.
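For two Gaussian classes with a shared covariance matrix and equal priors, the true posteriors follow directly from Bayes' rule. The sketch below shows that computation for K = 2 in plain MATLAB. It is not necessarily how GENDATD computes soft labels internally: the variable names are hypothetical, and the class means and rotation direction are assumptions.

```matlab
% Illustrative posterior computation for K = 2; names are hypothetical.
D1 = 3; D2 = 3;
c  = cos(pi/4);
R  = [c -c; c c];            % assumed 45-degree rotation
S  = R' * diag([1 40]) * R;  % shared covariance after rotation
m1 = [0 0] * R;              % class 1 mean (assumed at the origin)
m2 = [D1 D2] * R;            % class 2 mean after rotation

x = [1 2; m1; m2];           % a few query points, one per row
n = size(x, 1);

% Squared Mahalanobis distance of every row of x to each class mean.
e1 = x - repmat(m1, n, 1);
e2 = x - repmat(m2, n, 1);
d1 = sum((e1 / S) .* e1, 2);
d2 = sum((e2 / S) .* e2, 2);

% Equal priors and equal covariances: the normalizing constants cancel.
p1 = 1 ./ (1 + exp(-0.5 * (d2 - d1)));   % P(class 1 | x)
p2 = 1 - p1;                             % P(class 2 | x)
post = [p1 p2];                          % one row of soft labels per point
```

Because the covariances are equal, the quadratic terms in d2 - d1 cancel and the posterior is a logistic function of a linear function of x.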

See also

datasets, prdatasets
