8.1.0.3. Data Sources
(packages/gblearn2/data-sources.lsh)


Data sources are used in machine learning experiments as sources of training and testing examples. Data sources are pretty much like regular modules. To be compatible with most gradient-based learning algorithms, most data sources should implement a common set of methods; the classes documented below define seek, tell, next, size, and fprop. Optionally, some data sources may implement additional methods that can be used by algorithms that update samples or preprocessing internal to the data source, based on gradients coming from the learning machine. However, since those methods are used by a very small number of very special learning algorithms, implementing them is generally not necessary.

8.1.0.3.0. dsource
(packages/gblearn2/data-sources.lsh)


Semi-abstract class for a data source. Various subclasses are defined for special cases. This root class defines the methods seek, tell, next, and size. Subclasses should define (or redefine) at least fprop and size.

8.1.0.3.0.0. (==> dsource seek i)
[MSG] (packages/gblearn2/data-sources.lsh)


set pointer to the i-th item.

8.1.0.3.0.1. (==> dsource tell)
[MSG] (packages/gblearn2/data-sources.lsh)


return index of current item.

8.1.0.3.0.2. (==> dsource next)
[MSG] (packages/gblearn2/data-sources.lsh)


move pointer to next item.

8.1.0.3.0.3. (==> dsource size)
[MSG] (packages/gblearn2/data-sources.lsh)


returns the size of the data source.
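The cursor protocol described by seek, tell, next, and size can be pictured with a small Python model. This is only a sketch of the behaviour described above, not the Lush class: the name `DSource` and the wrap-around in `next` are assumptions made for illustration.

```python
class DSource:
    """Toy stand-in for the dsource cursor protocol (not the Lush class)."""

    def __init__(self, n_items):
        self._size = n_items
        self._current = 0

    def seek(self, i):
        # Set the pointer to the i-th item.
        self._current = i

    def tell(self):
        # Return the index of the current item.
        return self._current

    def next(self):
        # Move the pointer to the next item. Wrapping around at the end
        # is an assumption of this sketch; the text does not specify it.
        self._current = (self._current + 1) % self._size

    def size(self):
        # Return the number of items in the data source.
        return self._size


def visit_all(ds):
    """Visit every item once and return the indices that were seen."""
    seen = []
    ds.seek(0)
    for _ in range(ds.size()):
        seen.append(ds.tell())
        ds.next()
    return seen


print(visit_all(DSource(4)))  # [0, 1, 2, 3]
```

A training loop would typically follow the same pattern, calling fprop on the current item between tell and next.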

8.1.0.3.1. dsource-idx3l
(packages/gblearn2/data-sources.lsh)


An abstract data source class appropriate for most supervised learning algorithms that take input data in the form of an idx3-state and a label in the form of an idx0 of int. Most learning machines implemented in the library are designed to be compatible with this data source and its subclasses. This class just defines the fprop method prototypes. Subclasses actually implement the functionalities.

8.1.0.3.1.0. (==> dsource-idx3l fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


get the current item and copy the sample into out (an idx3-state) and the corresponding label into lbl (an idx0 of int).

8.1.0.3.2. dsource-idx3fl
(packages/gblearn2/data-sources.lsh)


A data source that stores samples as idx3 of floats. As a subclass of dsource-idx3l, this class is compatible with most learning machines defined in the gblearn2 library.

8.1.0.3.2.0. (new dsource-idx3fl inp lbl)
[CLASS] (packages/gblearn2/data-sources.lsh)


Create a dsource-idx3fl where the input samples are slices of a PxDxYxX idx4 of floats passed as argument inp and labels are slices of an idx1 of int passed as argument lbl. P is the number of samples, and D, Y, X are the dimensions of each sample. If your data needs fewer than three dimensions, simply set Y and/or X to 1 (i.e. inp would be a PxDx1x1 matrix).

8.1.0.3.2.1. (==> dsource-idx3fl fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


get the current item and copy the sample into out (an idx3-state) and the corresponding label into lbl (an idx0 of int).
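The PxDxYxX layout and the copy performed by fprop can be illustrated with a Python model built on nested lists. This is a sketch under stated assumptions, not the Lush implementation: `State3` and `FloatSource` are hypothetical names, the state models only an x slot, and the idx0 label is modeled as a one-element list.

```python
class State3:
    """Hypothetical stand-in for an idx3-state; only the x slot is modeled."""

    def __init__(self, d, y, x):
        self.x = [[[0.0] * x for _ in range(y)] for _ in range(d)]


class FloatSource:
    """Toy model of a source whose samples are slices of a P x D x Y x X array."""

    def __init__(self, inp, lbl):
        assert len(inp) == len(lbl), "one label per sample"
        self.inp, self.lbl, self.current = inp, lbl, 0

    def size(self):
        return len(self.inp)

    def seek(self, i):
        self.current = i

    def fprop(self, out, lbl):
        # Copy the current D x Y x X sample into out.x, row by row,
        # and the matching label into the one-element list lbl.
        for d, plane in enumerate(self.inp[self.current]):
            for y, row in enumerate(plane):
                out.x[d][y][:] = row
        lbl[0] = self.lbl[self.current]


# Two samples (P = 2) that are plain 3-dimensional vectors: D = 3, Y = X = 1.
inp = [
    [[[0.1]], [[0.2]], [[0.3]]],
    [[[0.4]], [[0.5]], [[0.6]]],
]
ds = FloatSource(inp, [0, 1])

out, label = State3(3, 1, 1), [None]
ds.seek(1)
ds.fprop(out, label)
print(out.x, label)  # [[[0.4]], [[0.5]], [[0.6]]] [1]
```

Because fprop copies values, later changes to `out.x` do not touch the stored samples.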

8.1.0.3.3. dsource-idx3ul
(packages/gblearn2/data-sources.lsh)


A data source that stores samples as idx3 of ubytes. As a subclass of dsource-idx3l, this class is compatible with most learning machines defined in the gblearn2 library. The ubyte values (between 0 and 255) can be shifted and scaled before being written to the output idx3-state. This data source is considerably more economical in terms of memory than dsource-idx3fl (1 byte per sample per variable, versus 4).

8.1.0.3.3.0. (new dsource-idx3ul inp lbl bias coeff)
[CLASS] (packages/gblearn2/data-sources.lsh)


Create a dsource-idx3ul where the input samples are idx3 slices of a PxDxYxX idx4 of ubytes passed as argument inp and labels are slices of an idx1 of int passed as argument lbl. P is the number of samples, and D, Y, X are the dimensions of each sample. If your data needs fewer than three dimensions, simply set Y and/or X to 1 (i.e. inp would be a PxDx1x1 matrix). Values in the item are shifted by bias and multiplied by coeff before being copied into the destination by fprop.

8.1.0.3.3.1. (==> dsource-idx3ul fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


get the current item and copy the sample into out (an idx3-state) and the corresponding label into lbl (an idx0 of int). Raw values are shifted by the bias parameter and multiplied by the coeff parameter before being copied into the x slot of out.
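The shift-and-scale step and the memory saving can be worked through in a few lines of Python. This is a sketch, not the library code: `shift_and_scale` is a hypothetical helper, and reading "shifted by bias" as an addition applied before the multiplication is an assumption based on the wording above.

```python
from array import array


def shift_and_scale(raw, bias, coeff):
    """Return (v + bias) * coeff for every ubyte value v in a D x Y x X sample.

    Adding the bias before multiplying follows the order given in the text;
    treating the shift as an addition is an assumption of this sketch.
    """
    return [[[(v + bias) * coeff for v in row] for row in plane] for plane in raw]


# One hypothetical 1 x 1 x 3 sample holding the extreme and middle ubyte values.
raw = [[[0, 128, 255]]]

# Map the 0..255 range onto roughly -1..1.
scaled = shift_and_scale(raw, bias=-128, coeff=1.0 / 128)
print(scaled)  # [[[-1.0, 0.0, 0.9921875]]]

# Storage cost of 1000 values as unsigned bytes versus single-precision floats.
n = 1000
ubyte_bytes = len(array("B", [0] * n)) * array("B").itemsize
float_bytes = len(array("f", [0.0] * n)) * array("f").itemsize
print(ubyte_bytes, float_bytes)
```

With these values of bias and coeff, a stored byte of 128 comes out as 0.0, which is a common way to center image data.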

8.1.0.3.4. dsource-image
(packages/gblearn2/data-sources.lsh)


A data source that stores images of variable sizes, with any number of components per pixel, and one byte per component. As a subclass of dsource-idx3l, this class is compatible with most learning machines defined in the gblearn2 library. The fprop method resizes the output argument to the size of the current image. Pixel values may be shifted and scaled before being written to the output idx3-state.

8.1.0.3.4.0. (new dsource-image bias coeff)
[CLASS] (packages/gblearn2/data-sources.lsh)


create an empty dsource-image. When accessing an item, the values are shifted by bias and multiplied by coeff.

8.1.0.3.4.1. (==> dsource-image size)
[MSG] (packages/gblearn2/data-sources.lsh)


returns the size of the database.

8.1.0.3.4.2. (==> dsource-image fprop dest lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


writes the current item into the x slot of dest (an idx3-state) and the corresponding label to lbl (an idx0 of int). dest is automatically resized to the size of the current item.

8.1.0.3.4.3. (==> dsource-image load-ppms flist clist)
[MSG] (packages/gblearn2/data-sources.lsh)


fills up the database with images from the PPM files listed in flist. clist is a list of labels (one integer for each image) or a single category, in which case all the loaded images will be in that category.

8.1.0.3.4.4. (==> dsource-image load-pgms flist clist)
[MSG] (packages/gblearn2/data-sources.lsh)


fills up the database with images from the PGM files listed in flist. clist is a list of labels (one integer for each image) or a single category, in which case all the loaded images will be in that category.
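The two accepted forms of clist (a list with one label per image, or a single category shared by all images) can be sketched in Python as follows. `expand_labels` and the file names are hypothetical and serve only to illustrate the rule; the length check is an assumption, since the text does not say how a mismatch is handled.

```python
def expand_labels(flist, clist):
    """Return one integer label per file in flist.

    clist is either a single integer category, applied to every file,
    or a sequence holding exactly one integer label per file.
    """
    if isinstance(clist, int):
        return [clist] * len(flist)
    labels = list(clist)
    if len(labels) != len(flist):
        # Rejecting a length mismatch is a choice made for this sketch.
        raise ValueError("clist must hold one label per file")
    return labels


files = ["a.pgm", "b.pgm", "c.pgm"]   # hypothetical file names
print(expand_labels(files, 7))          # [7, 7, 7]
print(expand_labels(files, [0, 1, 0]))  # [0, 1, 0]
```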

8.1.0.3.4.5. (==> dsource-image save basename)
[MSG] (packages/gblearn2/data-sources.lsh)


save database into pre-cooked IDX files. These files can be subsequently loaded quickly using the load or map methods. The database is saved in four files named basename images.idx, basename starts.idx, basename sizes.idx, and basename labels.idx

8.1.0.3.4.6. (==> dsource-image load basename)
[MSG] (packages/gblearn2/data-sources.lsh)


load database from pre-cooked IDX files produced through the save method. The database will be loaded from four files named basename images.idx, basename starts.idx, basename sizes.idx, and basename labels.idx

8.1.0.3.4.7. (==> dsource-image map basename)
[MSG] (packages/gblearn2/data-sources.lsh)


memory-map database from pre-cooked IDX files produced through the save method. This is MUCH faster than load, and consumes less memory and disk bandwidth. The database will be mapped from four files named basename images.idx, basename starts.idx, basename sizes.idx, and basename labels.idx

8.1.0.3.5. dsource-idx3l-narrow
(packages/gblearn2/data-sources.lsh)


a data source constructed by taking patterns in an existing data source whose indices are within a given range.

8.1.0.3.5.0. (new dsource-idx3l-narrow base size offset)
[CLASS] (packages/gblearn2/data-sources.lsh)


make a new data source by taking size items from the data source passed as argument, starting at item offset.

8.1.0.3.5.1. (==> dsource-idx3l-narrow fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


copy current item and label into out and lbl.
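The index arithmetic behind a narrowed data source can be modeled in Python. This is a sketch of the idea, not the Lush class: `ListSource` and `Narrow` are hypothetical names, samples are flat lists, the idx0 label is a one-element list, and the range check in the constructor is an assumption.

```python
class ListSource:
    """Toy base data source holding samples and labels in Python lists."""

    def __init__(self, samples, labels):
        self.samples, self.labels, self.current = samples, labels, 0

    def size(self):
        return len(self.samples)

    def seek(self, i):
        self.current = i

    def next(self):
        self.current += 1

    def fprop(self, out, lbl):
        out[:] = self.samples[self.current]
        lbl[0] = self.labels[self.current]


class Narrow(ListSource):
    """View of `size` consecutive items of `base`, starting at `offset`."""

    def __init__(self, base, size, offset):
        if offset < 0 or offset + size > base.size():
            raise ValueError("range falls outside the base data source")
        self.base, self._size, self.offset, self.current = base, size, offset, 0

    def size(self):
        return self._size

    def fprop(self, out, lbl):
        # Item i of the narrowed source is item offset + i of the base.
        self.base.seek(self.offset + self.current)
        self.base.fprop(out, lbl)


def read_labels(ds):
    """Walk through ds once and collect the labels in visiting order."""
    out, lbl, seen = [], [None], []
    ds.seek(0)
    for _ in range(ds.size()):
        ds.fprop(out, lbl)
        seen.append(lbl[0])
        ds.next()
    return seen


base = ListSource([[i, i * i] for i in range(10)], list(range(10)))
print(read_labels(Narrow(base, 3, 5)))  # [5, 6, 7]
```

A common use is to split one data source into training and validation parts by creating two narrowed views over disjoint ranges.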

8.1.0.3.6. dsource-idx3l-permute
(packages/gblearn2/data-sources.lsh)


a data source constructed by shuffling the items of a base data source.

8.1.0.3.6.0. (new dsource-idx3l-permute base)
[CLASS] (packages/gblearn2/data-sources.lsh)


make a new data source by taking the base data source and building a permutation map of its items. Initially, the permutation map is equal to the identity. It can be shuffled into a random order by calling the shuffle method.

8.1.0.3.6.1. (==> dsource-idx3l-permute fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


copy current item and label into out and lbl.

8.1.0.3.6.2. (==> dsource-idx3l-permute shuffle)
[MSG] (packages/gblearn2/data-sources.lsh)


randomly shuffles the samples of the db.
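The permutation map can be pictured as a list of base indices that starts as the identity and is reordered by shuffle. The Python model below is a sketch, not the Lush class: `ListSource`, `Permute`, and the `seed` argument are hypothetical, and the shuffle relies on Python's `random` module rather than whatever generator the library uses.

```python
import random


class ListSource:
    """Toy base data source holding samples and labels in Python lists."""

    def __init__(self, samples, labels):
        self.samples, self.labels, self.current = samples, labels, 0

    def size(self):
        return len(self.samples)

    def seek(self, i):
        self.current = i

    def next(self):
        self.current += 1

    def fprop(self, out, lbl):
        out[:] = self.samples[self.current]
        lbl[0] = self.labels[self.current]


class Permute(ListSource):
    """View of `base` whose items are visited through a permutation map."""

    def __init__(self, base, seed=None):
        self.base, self.current = base, 0
        self.perm = list(range(base.size()))  # initially the identity
        self._rng = random.Random(seed)       # seed is a hypothetical extra

    def size(self):
        return self.base.size()

    def shuffle(self):
        # Reorder the map in place; the base data source is left untouched.
        self._rng.shuffle(self.perm)

    def fprop(self, out, lbl):
        self.base.seek(self.perm[self.current])
        self.base.fprop(out, lbl)


def read_labels(ds):
    """Walk through ds once and collect the labels in visiting order."""
    out, lbl, seen = [], [None], []
    ds.seek(0)
    for _ in range(ds.size()):
        ds.fprop(out, lbl)
        seen.append(lbl[0])
        ds.next()
    return seen


base = ListSource([[float(i)] for i in range(6)], list(range(6)))
ds = Permute(base, seed=0)
before = read_labels(ds)   # identity order: same as the base
ds.shuffle()
after = read_labels(ds)    # same labels, in a shuffled order
print(before, sorted(after) == before)
```

Only the map is shuffled, so the cost of a shuffle does not depend on the size of the individual samples.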

8.1.0.3.7. dsource-idx3l-concat
(packages/gblearn2/data-sources.lsh)


a data source constructed by concatenating two base data sources.

8.1.0.3.7.0. (new dsource-idx3l-concat base1 base2)
[CLASS] (packages/gblearn2/data-sources.lsh)


make a new data source by concatenating two base data sources base1 and base2. The two data sources must be dsource-idx3l or subclasses thereof.

8.1.0.3.7.1. (==> dsource-idx3l-concat fprop out lbl)
[MSG] (packages/gblearn2/data-sources.lsh)


copy current item and label into out and lbl.
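Concatenation amounts to routing each index to one of the two bases. The Python model below is a sketch of that routing, not the Lush class: `ListSource` and `Concat` are hypothetical names, and placing the items of base1 before those of base2 is an assumption based on the argument order.

```python
class ListSource:
    """Toy base data source holding samples and labels in Python lists."""

    def __init__(self, samples, labels):
        self.samples, self.labels, self.current = samples, labels, 0

    def size(self):
        return len(self.samples)

    def seek(self, i):
        self.current = i

    def next(self):
        self.current += 1

    def fprop(self, out, lbl):
        out[:] = self.samples[self.current]
        lbl[0] = self.labels[self.current]


class Concat(ListSource):
    """View that presents the items of base1 followed by those of base2."""

    def __init__(self, base1, base2):
        self.base1, self.base2, self.current = base1, base2, 0

    def size(self):
        return self.base1.size() + self.base2.size()

    def fprop(self, out, lbl):
        n1 = self.base1.size()
        if self.current < n1:
            # Indices 0 .. n1-1 come from the first base.
            self.base1.seek(self.current)
            self.base1.fprop(out, lbl)
        else:
            # Remaining indices are shifted down by n1 into the second base.
            self.base2.seek(self.current - n1)
            self.base2.fprop(out, lbl)


def read_labels(ds):
    """Walk through ds once and collect the labels in visiting order."""
    out, lbl, seen = [], [None], []
    ds.seek(0)
    for _ in range(ds.size()):
        ds.fprop(out, lbl)
        seen.append(lbl[0])
        ds.next()
    return seen


a = ListSource([[1.0], [2.0]], [0, 0])
b = ListSource([[3.0], [4.0], [5.0]], [1, 1, 1])
both = Concat(a, b)
print(both.size(), read_labels(both))  # 5 [0, 0, 1, 1, 1]
```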