PRTools contents |
DATAFILES
This is not a command, just an information file.
In MATLAB, PRTools datafiles are defined as objects of the class DATAFILE. They inherit most of their properties from the class DATASET and generalise it by allowing large datasets to be distributed over a set of files. Preprocessing can be defined before conversion to a dataset. There are five types of datafiles:
raw | Every file is interpreted as a single object in the dataset. These files may, for instance, be images of different size. |
cell | All files should be mat-files containing just a single variable being a cell array. Its elements are interpreted as objects. The file names will be used as labels during construction. This may be changed by the user afterwards. |
pre-cooked | In this case the user should supply a command that reads a file and converts it to a dataset. |
half-baked | All files should be mat-files, containing a single dataset. |
mature | This is a datafile created by PRTools, using the SAVEDATAFILE command after execution of all preprocessing defined for the datafile. |
Datafiles are mainly an administrative structure. Operations on datafiles are possible as long as they can be stored (e.g. filtering of images for raw datafiles, or object selection by GENDAT). Commands that can process objects sequentially, like NMC and TESTC, can be executed on datafiles.
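As a minimal sketch of the sequential processing described above, the following assumes a hypothetical directory `my_images` holding one sub-directory of image files per class. The directory name, the target size and the use of `imresize` are illustrative assumptions, and the argument lists shown should be checked against the help text of each command.

```matlab
% Sketch only: 'my_images' is a hypothetical directory with one
% sub-directory of image files per class.
a = datafile('my_images','raw');

% Assumption: the objects need a common size before a classifier can be
% trained, so every image is resized. imresize is not part of PRTools
% and requires the Image Processing Toolbox.
b = filtm(a,'imresize',{[16 16]});

[t,s] = gendat(b,0.5);   % object selection: half of each class for training
w = nmc(t);              % nearest mean classifier trained on the datafile
e = testc(s,w);          % error estimate on the remaining objects
```

Whether a datafile prepared this way can be passed to NMC without further settings may depend on the PRTools version.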
Whenever a raw datafile is sufficiently defined by pre- and postprocessing, it can be converted into a dataset. If the result is still too large for the available memory, it should be stored by the SAVEDATAFILE command, after which it is ready for later use. If it is sufficiently small, it can be converted directly into a dataset by DATASET.
Intermediate results of datafiles that, given the defined preprocessing, cannot yet be converted into a dataset can be stored as a new, raw datafile by CREATEDATAFILE.
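The two conversion routes for a raw datafile might be sketched as follows. The directory `my_images`, the switch `fits_in_memory` and the resize step are hypothetical, and SAVEDATAFILE is called with its defaults, which is an assumption. CREATEDATAFILE is left out because its arguments are not described here.

```matlab
a = datafile('my_images','raw');     % hypothetical directory of image files

% Define preprocessing. It is recorded in the datafile and executed
% when the datafile is converted or saved.
% imresize requires the Image Processing Toolbox.
b = filtm(a,'imresize',{[16 16]});

fits_in_memory = true;               % hypothetical switch, set by the user

if fits_in_memory
    x = dataset(b);                  % execute the processing, load all objects
else
    c = savedatafile(b);             % execute the processing, store as mat-files
end
```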
The main commands specific to datafiles are:
DATAFILE | constructor. It defines a datafile on a directory. |
ADDPREPROC | adds preprocessing commands (low level command) |
ADDPOSTPROC | adds postprocessing commands (low level command) |
FILTM | user interface to add preprocessing to a datafile. |
CREATEDATAFILE | executes all defined preprocessing and stores the result as a new, raw datafile. |
SAVEDATAFILE | executes all defined pre- and postprocessing and stores the result as a dataset in a set of matfiles. |
DATASET | conversion to dataset |
Datafiles have the following fields:
ROOTPATH | absolute path of the datafile |
FILES | names of directories (for raw datafiles) or mat-files (for converted datafiles) |
TYPE | datafile type |
PREPROC | preprocessing commands in a struct array |
POSTPROC | postprocessing commands as mappings |
DATASET | stores all dataset fields. Note that the DATA field as well as the target field are empty and that the IDENT.FILE_INDEX field is used to store for every object a pointer to a file or directory in FILES. |
Subscripting of datafiles is only defined for the first argument, the objects: A(M,:) or even, irregularly, A(M) refers to object number M. As the objects in datafiles (e.g. images or time signals) may have different lengths, the second subscript, which for datasets refers to the feature number, is undefined. A(M,N) causes an error for any N. Formally, the feature size of a datafile is set to 0. Checking of feature sizes when applying mappings to datafiles is disabled.
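The subscripting rules above can be illustrated with a short sketch. The directory `my_images` and the object number are hypothetical, and the exact error message is not specified here, so it is only displayed.

```matlab
a = datafile('my_images','raw');   % hypothetical directory of image files
m = 3;                             % object number, chosen arbitrarily

b1 = a(m,:);   % regular form: object m
b2 = a(m);     % irregular single-subscript form, also object m

failed = false;
try
    c = a(m,1);                    % feature subscripts are undefined for datafiles
catch err
    failed = true;
    disp(err.message);
end
```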
The possibility to define preprocessing of objects (e.g. images) with different sizes makes datafiles useful for handling raw data and measurements of features.
See also: datafile, addpreproc, addpostproc, filtm, filtim, createdatafile, savedatafile