8.1.0. GBLearn2: Machine Learning Library
Author(s): Yann LeCun
The GBLearn2 Library is an object-oriented framework for building,
training and testing machine learning systems and algorithms. The "GB"
stands for "gradient-based", but the library can handle learning
algorithms that are not gradient-based.
The basic concepts of GBLearn2 are: modules, states, params, trainers,
data sources, and meters.
8.1.0.0. GBLearn2: Basic Concepts
Modules (subclasses of gb-module) are the
components of learning machines. A module generally implements at least
the method fprop, which computes the
output of the module from its inputs. Inputs and outputs are generally
subclasses of idx-state. Many
commonly used modules are pre-defined, including various types of
neural-net layers, convolutional layers, RBFs, Softmax, and many others.
Many of these modules use inputs and outputs of class
idx3-ddstate (more on this below).
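As a minimal sketch of this calling convention (the f-layer class and
every constructor argument list below are assumptions for illustration,
not guaranteed signatures):

  ;; build a module and propagate an input through it (argument lists assumed)
  (libload "gblearn2/gb-modules-nn")         ;; load the neural-net modules
  (setq prm   (new idx1-ddparam 0 100))      ;; container for trainable weights
  (setq layer (new f-layer 32 10 1 1 prm))   ;; fully-connected layer, 32 -> 10
  (setq in    (new idx3-ddstate 32 1 1))     ;; input: 32 features on a 1x1 grid
  (setq out   (new idx3-ddstate 10 1 1))     ;; output: 10 features
  (==> layer fprop in out)                   ;; compute out from in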
Some modules provide two other methods: bprop
and bbprop. Those methods are defined
for modules that are used in gradient-based machine learning algorithms.
Method bprop back-propagates gradients
through the module, while bbprop
back-propagates diagonal second derivatives.
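Continuing the sketch above, gradient propagation follows the same
calling pattern; the dx and ddx slots of the ddstates carry the
gradients and second derivatives:

  (==> layer bprop in out)    ;; propagate :out:dx back into :in:dx
  (==> layer bbprop in out)   ;; propagate :out:ddx back into :in:ddx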
A large collection of high-level "macro-modules" is provided, including
various types of fully-connected neural nets, convolutional networks,
time-delay neural nets, etc.
Machine learning systems are usually built by using one of these
predefined classes, or by subclassing them. Specific machines are
generally created by defining classes whose slots are
gb-modules and gb-states,
with appropriate fprop and
bprop methods.
Modules exchange information through subclasses of
gb-state.
Most predefined modules use idx3-state
and its subclasses as input and output. The class
idx3-ddstate stores numbers arranged in a tensor with 3
indices. Most of those modules (e.g. convolutional layers) interpret
the first index as a feature index, and the last two indices as spatial
dimensions. This makes it easy to replicate the modules spatially (e.g.
for applications that require scanning a classifier over all locations
of an image; more on this later).
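For instance (a sketch; the x slot holds the value tensor, while dx and
ddx hold its derivatives):

  (setq s (new idx3-ddstate 16 28 28))  ;; 16 feature maps on a 28x28 spatial grid
  (idx-dim :s:x 0)                      ;; -> 16, the feature index
  (idx-dim :s:x 1)                      ;; -> 28, the first spatial dimension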
8.1.0.0.2. Machines and Cost Modules
Basic modules generally do not assume much about the kind of learning
algorithm with which they will be trained. The most common form of
training is gradient-based training, which consists of finding the set
of parameters that minimizes a particular energy function (generally
computed by averaging over a set of training examples).
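In symbols (notation chosen here for illustration), training looks for
the parameter vector W that minimizes the average energy over a training
set of P examples (X^p, Y^p):

  E(W) = (1/P) * sum_{p=1..P} L(W, X^p, Y^p)

where L is the per-example energy computed by the cost module.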
Classes are provided to conveniently assemble trainable modules and cost
modules (e.g. energy functions) into a complete learning machine. An
example of such a class is idx3-supervised-module,
which can be used for supervised training of classifiers. The
idx3-supervised-module contains (see the sketch after this list):
- a learning machine with one input and one output (both
idx3-ddstate).
- a cost module whose first input is the output of the machine,
whose second input is a desired category (stored in an idx0 of int), and
whose output is an idx0-ddstate which
contains the value of the energy.
- a classifier module which takes the machine's output and computes a
class-state which stores the label of the recognized category
together with confidence scores and runner-up categories.
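A sketch of the assembly (edist-cost and max-classer appear in the
GBLearn2 demos as cost and classifier modules; all constructor argument
lists here are assumptions):

  ;; assemble a supervised classifier from the three pieces listed above
  (setq prm     (new idx1-ddparam 0 100))         ;; shared trainable parameters
  (setq net     (new f-layer 32 10 1 1 prm))      ;; stand-in learning machine
  (setq labels  (range 0 9))                      ;; the ten category labels
  (setq machine (new idx3-supervised-module
                     net                          ;; the learning machine
                     (new edist-cost labels 1 1)  ;; energy: distance to target code
                     (new max-classer labels)))   ;; picks the best-scoring class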
Modules may have trainable parameters that are computed or adjusted by
the various learning algorithms. Learning algorithms are designed to
operate on gb-param objects, so as to
make the algorithm implementations independent of the details of the
learning machine's architecture. Whenever an instance of a module is
created, a gb-param object is passed
to the constructor. The module's trainable parameters are added to the
param object.
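For example (a sketch reusing the assumed f-layer constructor from
above), two modules built on the same gb-param contribute their weights
to a single parameter vector, which is what the learning algorithm later
adjusts:

  (setq prm (new idx1-ddparam 0 100))     ;; one flat vector of trainable weights
  (setq l1  (new f-layer 32 20 1 1 prm))  ;; registers its weights in prm
  (setq l2  (new f-layer 20 10 1 1 prm))  ;; appends its weights to the same prm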
Learning machines created as above are then inserted in a
gb-trainer object. The gb-trainer
class (or rather, its subclasses) defines the learning algorithm used
to train the machine. The most commonly used
gb-trainer is supervised-gradient,
which implements learning algorithms based on stochastic gradient
descent.
The supervised-gradient trainer class
expects a machine whose fprop and
bprop methods take an input, an output, a desired output, and an
energy. supervised-gradient is a
generic interpreted class which makes no hard assumptions about the type
of the objects it manipulates, as long as those objects understand the
required set of methods.
Training a machine in such a trainer is performed by calling one of the
training methods (e.g. train in the
case of supervised-gradient) with a
data source and a gb-meter (and
possibly other specific parameters, depending on the trainer and method
considered).
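Putting the pieces together (a sketch continuing the examples above; the
supervised-gradient constructor and the train argument list are
assumptions based on the GBLearn2 demos):

  (setq trainer (new supervised-gradient machine prm))  ;; machine + params from above
  (setq mtr     (new classifier-meter))                 ;; accumulates error statistics
  (==> trainer train trainds mtr)                       ;; trainds: a data source (below)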
A data source is a source of training and testing examples. Data sources
are a bit like regular modules, but they generally don't have inputs.
Most gb-trainer subclasses expect data
sources to understand the methods size,
seek, tell, and
next, which they use to sweep through the set of examples.
Since many pre-defined machines expect an
idx3-state as input and an idx0 of int as desired output,
the data source class dsource-idx3l
exists for that purpose. Subclasses of
dsource-idx3l are defined for databases of examples stored as
vectors of floats or ubytes, or for databases of images of variable
size.
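The sweep performed by trainers follows this pattern (a sketch where ds
is a dsource-idx3l instance; seek, size, and next are the methods listed
above, while the fprop signature of data sources is an assumption):

  (let ((in      (new idx3-ddstate 32 1 1))  ;; reusable input buffer
        (desired (int-matrix)))              ;; desired category, an idx0 of int
    (==> ds seek 0)                          ;; rewind to the first example
    (for (i 0 (1- (==> ds size)))
      (==> ds fprop in desired)              ;; fill in and desired (assumed)
      (==> ds next)))                        ;; advance to the next example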
8.1.0.0.6. Performance Meters
Meters are used by gb-trainer objects to
measure the performance of learning machines. A commonly used meter is
classifier-meter, which measures the performance
of supervised classifiers.
8.1.0.1. Examples and Demos
8.1.0.7. Module Libraries