
Dataset, creation of artificial datasets

Datasets can be created in the following ways:

Commands for generating artificial datasets
gendatb Generation of two banana shaped classes.
gendatc Generation of two circular classes.
gendatd Generation of two normally distributed 'difficult' classes. The problem is 'difficult' because, for small sample sizes, a good classifier can only be constructed if the distribution is known.
gendath Generation of two normally distributed classes according to Highleyman.
gendatl Generation of the 'Lithuanian' classes as proposed by Raudys.
gendats Generation of two simple normally distributed classes.
gendatm Generation of eight 2D classes.
gentrunk Generation of Trunk's dataset, used to illustrate the peaking phenomenon (curse of dimensionality).
gendatgauss Generation of multivariate Gaussian distributed data.
gencirc Generation of a one-class circular dataset.
circles3d Create a dataset containing 2 circles in 3 dimensions (for MDS examples).
lines5d Create a dataset containing 3 lines in 5 dimensions (for MDS examples).
gendatr Generate regression dataset from data and target values (for regression examples).
gendat Random sampling of datasets for training and testing.
gensubsets Generation of a series of consistent subsets of a dataset.
gendatk Nearest neighbor data generation.
gendatp Parzen density data generation.

The last four commands generate new datasets from existing datasets.
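As a rough illustration of how a generator and a sampling command can be combined, the sketch below creates a banana-shaped dataset with gendatb and splits it at random with gendat. It assumes that PRTools is on the MATLAB path; the class sizes, the 0.5 split fraction and the variable names a, trn and tst are arbitrary choices made for this example.

```matlab
% Minimal sketch, assuming PRTools is installed and on the MATLAB path.
a = gendatb([50 50]);          % two banana-shaped classes, 50 objects each

% Random split: a fraction of 0.5 of the objects of each class goes to
% the training set trn, the remaining objects go to the test set tst.
[trn, tst] = gendat(a, 0.5);

% The two parts together contain every object of a.
disp(size(trn, 1) + size(tst, 1) == size(a, 1));

scatterd(a);                   % scatter plot of the 2D dataset
```

The other generators in the list can be substituted for gendatb in the same way; their argument lists differ, so consult the help text of each command before use.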


R.P.W. Duin, January 28, 2013

