gendatd

GENDATD

Generation of 'difficult' normally distributed classes

A = GENDATD(N,K,D1,D2,LABTYPE)

Input
N Number of objects in each of the classes (default: [50 50])
K Dimensionality of the dataset (default: 2)
D1 Difference in mean in feature 1 (default: 3)
D2 Difference in mean in feature 2 (default: 3)
LABTYPE 'crisp' or 'soft' labels (default: 'crisp').

Output
A Generated dataset

Description

Generates a K-dimensional, 2-class dataset A of N objects. The variances of the first two features differ strongly (1 versus 40), which makes the classes 'difficult' to separate for small sample sizes.

D1 is the difference between the class means for the first feature, D2 is the difference between the class means for the second feature. In all other directions the means are equal. The two covariance matrices are equal, with a variance of 1 in all directions except for the second feature, which has a variance of 40. The first two features are then rotated by 45 degrees to introduce a strong correlation between them. Class priors are P(1) = P(2) = 0.5.
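
The construction described above can be sketched in plain MATLAB without PRTools. This is an illustration only, not the GENDATD source: the variable names, the direction of the rotation, and the choice to rotate the data after the class means have been added are all assumptions.

```matlab
% Sketch only: plain MATLAB, not the PRTools implementation.
rng(0);                               % fixed seed so the run is repeatable
n  = [50 50];                         % objects per class
k  = 2;                               % dimensionality
d1 = 3; d2 = 3;                       % mean differences in features 1 and 2

s  = ones(1, k);  s(2) = sqrt(40);    % standard deviations: 1, except feature 2
m2 = zeros(1, k); m2(1:2) = [d1 d2];  % mean of class 2 (class 1 sits at the origin)

X1 = randn(n(1), k) .* repmat(s, n(1), 1);
X2 = randn(n(2), k) .* repmat(s, n(2), 1) + repmat(m2, n(2), 1);
X  = [X1; X2];
labels = [ones(n(1), 1); 2 * ones(n(2), 1)];

% Rotate the first two features by 45 degrees (direction is an assumption).
R = [1 -1; 1 1] / sqrt(2);
X(:, 1:2) = X(:, 1:2) * R;

% Theoretical covariance of the rotated pair: [20.5 19.5; 19.5 20.5]
C   = R' * diag([1 40]) * R;
rho = C(1, 2) / sqrt(C(1, 1) * C(2, 2));   % 19.5 / 20.5, about 0.95
```

Under these assumptions the two rotated features share a variance of 20.5 and a correlation of about 0.95, which is the strong correlation the description refers to.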

If N is a vector of sizes, exactly N(I) objects are generated for class I, I = 1,2.

LABTYPE defines the desired label type: 'crisp' or 'soft'. In the latter case true posterior probabilities are set for the labels.
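
A few calls illustrating the arguments above. This is a usage sketch that assumes PRTools is on the MATLAB path; GETDATA, GETLABTYPE and SCATTERD are PRTools routines that do not appear on this page, and the variable names are arbitrary.

```matlab
% Usage sketch: requires PRTools on the MATLAB path.
A = gendatd;                            % all defaults: [50 50], K = 2, D1 = D2 = 3, 'crisp'
B = gendatd([20 80], 5, 3, 3);          % exactly 20 and 80 objects, 5 features
S = gendatd([50 50], 2, 3, 3, 'soft');  % soft labels instead of crisp ones

size(getdata(B))                        % data matrix of B: objects by features
getlabtype(S)                           % label type stored in S

scatterd(A);                            % scatter plot of the first two features
```

Passing N as a vector, as for B, fixes the class sizes exactly, which is convenient when repeated experiments should use identical sample sizes.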

See also