This package provides the functions of Neuristique's SN28 neural network
simulator. Paper documentation is available from Orlogia
(mailto:info@orlogia.com or http://www.orlogia.com).
8.1.2.0.0. Introduction to Netenv.


8.1.2.0.0.0. Overview of Netenv.


"Netenv" is the main component of the first generation libraries:

Primitive functions implement the elementary concepts of an unified
framework. There are primitive functions for allocating units, defining
connections and performing elementary operations on the units or the
connections.
 Connection functions are implemented in the file
"sn28/lib/connect.sn" . This library implement the most
common connection schemes: full connections, local connections, shared
weight connections, time delay connections,...
 Simulation functions are implemented in the file
"sn28/lib/netenv.sn" . This library calls the sequence of
primitive functions needed for training or testing a multilayer network
using various optimisation strategies. The user can program training and
test epochs, plot curves and adjust the various parameters of the
training algorithms.
 Example libraries take advantage of the Netenv mechanism for
implementing several neural network models. Since the sequencement is
performed in lisp, it is possible to build different algorithms by
writing new sequencement functions using the same primitives.
Curious readers might find Netenv based code for the following
algorithms:
 Kmeans and LVQ in directory
"sn28/examples/netlvq/" ,
 RBF networks in directory
"sn28/examples/netrbf/" ,
 Hopfield networks, Kanerva memories, Adalines and Rosenblatt's
perceptrons in directory "sn28/examples/netold/"
.
8.1.2.0.0.1. About this Netenv Manual.


This manual is a reference manual. It lists all the functions of Netenv
and describes their purpose. It also gives equivalent mathematical
formulas when such a description is adequate.
 Section 2
describes primitive functions directly implemented in C language. These
functions implement the elementary concepts of neural networks. There
are primitives for allocating units, defining connections and performing
elementary computations on units and connections.
 Section 3 introduces high level functions for training and testing
neural networks. These functions are implemented in file
"sn28/lib/netenv.sn" .
 Section 4 describes high level functions for creating standard
connection patterns between layers. These functions, located in file
"sn28/lib/connect.sn" , can handle full connections, local
connections, shared weight connections and time delay connections.
8.1.2.0.1. Low Level Functions in Netenv.


8.1.2.0.1.0. Introduction to Low Level Functions in Netenv.


In this section, we describe the primitive C functions of SN2.8 for the
simulation of neural networks. These functions usually are very simple
and thoroughly optimized.
Most of these functions, however, are seldom used directly: users
instead call high level functions which in turn use the low level
functions. Primitive functions are the basic elements of such higher
level functions. You may use them if you want to define a new algorithm
and thus write a new library.
Understanding these basic computation functions of the simulator,
however, is a very efficient way to use the full capacity of SN2.8.
8.1.2.0.1.1. Creating and Scanning Network Architectures in Netenv.


The network is the main data structure of the simulator. It is an
arbitrary oriented graph: a number of units or neurons are linked by
oriented connections or synapses. The biological analogy stops here...
A unit is identified by its number. A layer is represented as a
list of numbers.
The special unit 0 is often referred
to as the bias unit. It is always allocated and its initial state is set
to 1 . Establishing connections from
this unit is a simple way to introduce an adaptable bias parameter.
8.1.2.0.1.1.0. Allocation of Neurons and Connections in Netenv.


8.1.2.0.1.1.0.0. (allocnet units links [weights])


Allocates memory space for networks with at most units units and at most
links connections. The third argument weights is only allowed by kernels
sn28ite and sn28itenew. It defines the maximal number of independent
weights. If this third argument is not given, the number of weights
allocated is equal to the number of connections.
This function returns the number of bytes allocated.
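For instance, a typical session starts by reserving the network memory. A minimal sketch (the sizes are arbitrary, and the byte count returned depends on the kernel):

```lisp
; reserve room for up to 200 units and 5000 connections
? (allocnet 200 5000)
```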
8.1.2.0.1.1.0.1. (clearnet)


Destroys all units and connections created by previous calls to functions
newneurons or connect .
There is only one unit remaining after a call to
clearnet : the bias unit (unit number
0 ) whose state is set to 1
.
8.1.2.0.1.1.0.2. nmax

[VAR] 
Contains the maximum number of units, as set by function
allocnet . This number includes the bias unit. The largest
available unit number thus is nmax-1 .
8.1.2.0.1.1.0.3. smax

[VAR] 
Contains the maximum number of connections, as set by function
allocnet .
8.1.2.0.1.1.0.4. wmax

[VAR] 
With kernels sn28ite and sn28itenew only, this variable contains the
maximum number of independent parameters in a network, as set by
function allocnet .
8.1.2.0.1.1.0.5. nnum

[VAR] 
This variable contains the number of units currently used.
8.1.2.0.1.1.0.6. snum

[VAR] 
This variable contains the number of connections currently used.
8.1.2.0.1.1.0.7. wnum

[VAR] 
With kernels sn28ite and sn28itenew only, this variable contains the
number of shared weights currently used.
8.1.2.0.1.1.0.8. (newneurons n)


Creates n new units in the network
memory and returns a list of numbers. A sufficient storage space should
have been previously allocated using allocnet
. This function never connects the bias unit to the new units. This must
be done explicitly when required.
; create a set of 10 units and store the list of their indices in the variable mylayer
? (setq mylayer (newneurons 10))
= (5 6 7 8 9 10 11 12 13 14)
See: (allocnet units
links [ weights ])
8.1.2.0.1.1.1. Connections in Netenv.


8.1.2.0.1.1.1.0. (connect presyn postsyn [val [eps]])


Creates a new connection from unit presyn
to unit postsyn . Arguments
presyn or postsyn may also
be lists of units. In this case, no optional parameters are allowed and
all units of presyn are connected to
all units of postsyn .
The optional argument val specifies an
(initial) weight value for this connection. With kernels sn28new and
sn28itenew, the optional argument eps
specifies the learning rate of the connection.
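As an illustrative sketch, the following builds a small fully connected pair of layers by hand (the layer names are arbitrary, and a sufficient allocnet call is assumed beforehand):

```lisp
? (setq in  (newneurons 4))   ; four input units
? (setq out (newneurons 2))   ; two output units
? (connect in out)            ; connect every unit of in to every unit of out
? (connect 0 out)             ; connect the bias unit to the output units
```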
8.1.2.0.1.1.1.1. (dupconnection m1 m2 n1 n2 [val [eps]])


With kernels sn28ite and sn28itenew only, this function creates a new
connection from unit n1 to unit
n2 which shares the same weight as the connection from
m1 to m2 . This function is
used for creating networks with equality constraints among the weights.
Like function connect , arguments
m1 , m2 ,
n1 and n2 may be lists of
cell indices.
Like function connect , function
dupconnection handles the optional arguments
val for setting the weight value and
eps for setting the learning rate.
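As a sketch, the following constrains two connections to share a single weight (the unit numbers are illustrative and must have been allocated beforehand):

```lisp
? (connect 1 3 0.7)         ; ordinary connection with initial weight 0.7
? (dupconnection 1 3 2 4)   ; connection from 2 to 4 sharing that weight
; both connections now read and update the same weight value
```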
8.1.2.0.1.1.1.2. (amont n)

(netenv.sn) 
This function returns the list of all units that have a connection to
unit n (a list of the presynaptic
cells of cell n ). The order of the
units in the returned list is the order of creation of the corresponding
connections.
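For example, a hedged sketch (unit numbers are illustrative):

```lisp
? (setq in (newneurons 3))
? (setq cell (car (newneurons 1)))
? (connect in cell)
? (amont cell)     ; lists the units of in, in creation order
```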
8.1.2.0.1.1.1.3. (presyn n)


This function is a synonym for amont .
8.1.2.0.1.1.1.4. (aval n)

(netenv.sn) 
This function returns the list of all units that have a connection from
unit n (a list of the postsynaptic
cells of cell n ). The order of the
units in the returned list is the order of creation of the corresponding
connections.
8.1.2.0.1.1.1.5. (postsyn n)


This function is a synonym for aval .
8.1.2.0.1.1.1.6. (nfanin n)


Returns the number of units connected to unit n
. The fanin of a unit is not stored in a predefined field; function
nfanin recomputes it each time.
8.1.2.0.1.1.1.7. (cutconnection from to)


Function cutconnection removes the
connection between cells from and
to from the internal lists used for performing the network
computations.
Cutting a connection does not free the corresponding weight slot. The
weight value however is cleared when you cut the last connection sharing
this weight. This property ensures that the weight vector generated by
BPtool after pruning a network can be used by the regular C code
generated by Nettool.
8.1.2.0.1.2. Accessing Internal Variables in Netenv.


When a network is created, space is allocated for recording numerical
values on the units and connections of the network graph.
Internal variables associated with the units are called ``nfields'', an
acronym for ``neuron fields''. Internal variables associated with the
connections are called ``sfields'', an acronym for ``synapse fields''.
Fields are highly specialized: in order to increase speed, many
computational functions work on implicit fields. In addition, not all
possible fields are available on all versions of SN2.8. There is indeed
no reason to slow down a regular network by updating the fields
required by a shared weights network.
8.1.2.0.1.2.0. Accessing the Unit Fields ``nfields''.


Here is a list of the fields allocated for each unit, as well as their
availability on the major kernels of SN2.8. Each nfield is identified by
a number. Symbolic names are defined by the
"netenv.sn" library.
Fields marked with a star (*) do not exist in sn28new and
sn28itenew. Accessing these fields however is simulated by writing to or
reading from the equivalent sfields sepsilon
and ssigma of the incoming
connections.
8.1.2.0.1.2.0.0. Conversion between Fields and Numbers or Lists of Numbers in Netenv.


8.1.2.0.1.2.0.0.0. (nfield n f [v])


Set/get field number f of unit
n . If the third optional argument is given, this function
sets the field f of unit
n to value v . This
function always returns the current value of field
f .
A group of functions have been defined for directly accessing most
fields.
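A brief sketch of both access styles (the unit number is illustrative, and the symbolic field names are those defined by "netenv.sn" ):

```lisp
? (nval 5 0.5)       ; set the state of unit 5 through the direct accessor
? (nfield 5 nval)    ; read the same value through the generic accessor
```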
8.1.2.0.1.2.0.0.1. (nval n [v])


Set/get the state of unit n (in nfield
nval ).
8.1.2.0.1.2.0.0.2. (ngrad n [v])


Set/get the activation gradient attached to unit
n (in nfield ngrad ).
8.1.2.0.1.2.0.0.3. (nsum n [v])


Set/get the activation of unit n (in
nfield nsum ).
8.1.2.0.1.2.0.0.4. (nbacksum n [v])


Set/get the output gradient of unit n
(in nfield nbacksum ).
8.1.2.0.1.2.0.0.5. (nepsilon n [v])


Set/get the learning rate of unit n
(in nfield nepsilon ).
With kernels sn28ite and sn28itenew, this function sets the learning
rate of all the connections to cell n
and returns the mean of those learning rates.
8.1.2.0.1.2.0.0.6. (nggrad n [v])


Sets/gets the instantaneous second derivative of the cost function with
respect to the activation of unit n
(in nfield nggrad ). This is not the
real second derivative, but the contribution of the current pattern to
the overall second derivative.
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.0.0.7. (nsqbacksum n [v])


Sets/gets the instantaneous second derivative of the cost function with
respect to the output of unit n (in
nfield nsqbacksum ). This is not the
real second derivative, but the contribution of the current pattern to
the overall second derivative.
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.0.0.8. (nsigma n)


Returns the moving average of the second derivative of the cost function
with respect to the connections arriving on unit
n . Since field nsigma
does not exist, this function computes the average of the sfields
ssigma of the incoming connections of unit
n .
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.0.0.9. (nspare1 n [v]) (nspare2 n [v]) (nspare3 n [v])


Set/get one of the three spare fields attached to unit
n . These fields are not used by the simulator routines and
can be used freely by the user.
8.1.2.0.1.2.1. Accessing the Connection Fields: ``sfields''.


Similarly, fields associated to each connection are named
sfields . Here is a list of the fields in the connections.
Like nfields , the various
sfields are identified by a number.
Kernels sn28ite and sn28itenew implement shared weight networks by
setting up connections which physically share certain sfields: they
point to the same piece of memory. A star (*) in the above table
indicates those sfields which are shared when you set up a shared weight
connection with function dupconnection
.
8.1.2.0.1.2.1.0. (sval n1 n2 [v])


Set/get the value of the connection going from unit
n1 to unit n2 (in sfield
sval ). This sfield usually contains the weight of the
connection. This function returns the empty list if the connection does
not exist. It produces an error if you attempt to set the value of a
nonexisting connection.
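A sketch of reading and writing a weight (the unit numbers are illustrative):

```lisp
? (connect 1 2 0.25)   ; create the connection with initial weight 0.25
? (sval 1 2)           ; read the weight back
? (sval 1 2 -0.1)      ; overwrite the weight
? (sval 2 1)           ; reverse connection was never created: returns ()
```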
8.1.2.0.1.2.1.1. (sdelta n1 n2 [v])


Set/get the weight change of the connection going from unit
n1 to unit n2 (in sfield
sdelta ). This function returns the empty list if the
connection does not exist. It produces an error if you attempt to set
the value of a nonexisting connection.
8.1.2.0.1.2.1.2. (sepsilon n1 n2 [v])


Set/get the learning rate of the connection going from unit
n1 to unit n2 (in sfield
sepsilon ). This function returns the empty list if the
connection does not exist. It produces an error if you attempt to set
the value of a nonexisting connection.
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.1.3. (sacc n1 n2 [v])


Set/get the accumulated gradient for the weight associated to the
connection from unit n1 to unit
n2 . This function returns the empty list if the connection
does not exist. It produces an error if you attempt to set the value of
a nonexisting connection.
This function is defined by kernels sn28ite and sn28itenew only.
8.1.2.0.1.2.1.4. (ssigma n1 n2 [v])


Sets/gets the partial time average of the second derivative of the cost
function for the connection between unit n1
and n2 (in sfield
ssigma ). This function returns the empty list if the
connection does not exist. It produces an error if you attempt to set
the value of a nonexisting connection.
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.1.5. (shess n1 n2 [v])


Set/get the estimate of the Hessian's diagonal term associated to the
weight of the connection between units n1
and n2 (in sfield
shess ). This function returns the empty list if the
connection does not exist. It produces an error if you attempt to set
the value of a nonexisting connection.
This function is defined by kernels sn28new and sn28itenew only.
8.1.2.0.1.2.1.6. (meansqrweight n)


Returns the mean squared value of the weights (in sfields
sval ) of the connections arriving to cell
n .
8.1.2.0.1.2.1.7. (meansqrdelta n)


Returns the mean square value of the last weight change (in sfields
sdelta ) of the connections arriving to cell
n .
8.1.2.0.1.2.1.8. (scounter n1 n2)


Get the number of connections which share the same sfields as the
connection between units n1 and
n2 . This function returns 0
if the connection does not exist.
This function is defined by kernels sn28ite and sn28itenew only.
8.1.2.0.1.2.1.9. (sindex n1 n2 [v])


This function sets/gets the identification index of the weight
associated to the connection from unit n1
to unit n2 . Connections with the same
index actually share their sfields. Function sindex may be used either
to check or to modify the way connections are shared. It returns
() if the connection does not exist.
This function is defined by kernels sn28ite and sn28itenew only.
8.1.2.0.1.3. Fields Manipulation in Netenv.


Two kinds of functions operate on nfields and sfields. The first are
highly optimized algorithm-dependent functions. The second
implement several general purpose computations. We describe the
latter functions here.
8.1.2.0.1.3.0. Unit Fields Manipulation in Netenv.


8.1.2.0.1.3.0.0. (setnfield l f v)


Sets field f of all units in
layer l to the value
v .
(setnfield input nval 0)
; clears all the states of layer input.
8.1.2.0.1.3.0.1. (gaussstate l sdev)


Adds a zero mean gaussian noise of standard deviation
sdev to the states of the cells specified by list
l .
8.1.2.0.1.3.0.2. (flipstate l p)


Changes the state sign with probability p
for all units in list l .
8.1.2.0.1.3.0.3. (findwinner l) (findloser l)


These functions return the unit of list l
whose field nval has the largest
(respectively smallest) value.
8.1.2.0.1.3.0.4. (copynfield l1 f1 [l2] f2)


Copies field f2 of units in list
l2 into field f1 of units
in list l1 . The fields are identified
by integers or by their symbolic names as described with function
nfield .
If argument l2 is not defined, list
l1 is assumed. Execution is slightly faster in this case.
See: (nfield n
f [ v ])
8.1.2.0.1.3.0.5. (opnfield l f1 f2 g2 f3 g3)


Performs a linear combination of fields f2
and f3 of units in
l and puts the result in field f1
of the same unit. The operation performed is defined by the real numbers
g2 and g3 in the following
way:
f1 = g2*f2 + g3*f3
8.1.2.0.1.3.0.6. (mulnfield l f1 f2 f3)


Computes the product of fields f2 and
f3 of units in l and puts
the result in field f1 of the same
unit. The operation performed is defined in the following way:
f1 = f2*f3
Function mulnfield returns the sum of
all the computed terms i.e. the dot product of fields
f2 and f3 .
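Since mulnfield returns the dot product, it can for instance compute the squared norm of a layer's states; a sketch using a spare field as scratch storage:

```lisp
; squared norm of the states of mylayer; nspare1 receives nval*nval
? (mulnfield mylayer nspare1 nval nval)
```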
8.1.2.0.1.3.0.7. (invertnfield l f1 f2 g)


Divides the scalar g by fields
f2 of units in l and puts
the result in field f1 of the same
unit. The operation performed is defined in the following way :
f1 = g / f2
Function invertnfield returns the sum
of all the computed terms.
8.1.2.0.1.3.1. Connection Fields Manipulation in Netenv.


8.1.2.0.1.3.1.0. (setsfield l sf v)


Sets field sf of all incoming connections of layer
l to value v .
8.1.2.0.1.3.1.1. (copysfield l1 sf1 [l2] sf2)


Copies field sf2 of incoming
connections of layer l2 into field
sf1 of incoming connections of layer
l1 . If argument l2 is not
provided, l1 is assumed.
8.1.2.0.1.3.1.2. (opsfield l sf1 sf2 g2 sf3 g3)


Performs a linear combination of fields sf2
and sf3 of all incoming connections of
layer l and puts the result into
sfield sf1 . The operation is defined
by the reals g2 ,
g3 :
sf1 = g2 * sf2 + g3 * sf3
8.1.2.0.1.3.1.3. (mulsfield l sf1 sf2 sf3)


Computes the product of fields sf2 and
sf3 of all incoming connections of layer
l . The result is stored into field
sf1 .
8.1.2.0.1.3.1.4. (invertsfield l sf1 sf2 g)


Divides g by the value of field
sf2 of all incoming connections of layer
l . The quotients are stored into field
sf1 .
8.1.2.0.1.3.1.5. (setswn n sf f)


Takes field f from all units
connected to unit n and transfers it
into field sf of the corresponding
connections.
8.1.2.0.1.3.1.6. (accswn n sf f)


Takes field f from all units connected
to unit n and adds it to field
sf of the corresponding connections.
8.1.2.0.1.4. Non Linear Functions.


Non Linear Functions (NLF) are special lisp objects which describe a
numerical function f(x) of a real
variable. NLFs are used for specifying the activation functions of the
units and their derivatives.
8.1.2.0.1.4.0. (NLF x)

[NLF] 
A NLF object can be used like a Lisp function. This call returns
f(x) , where f is the
function associated to the NLF NLF .
Returns the null NLF:
f(x) = 0
8.1.2.0.1.4.2. (nlfftanh A B C D)


Returns a sigmoid NLF:
f(x) = A tanh (B x) + C x + D
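An NLF object built this way is applied like an ordinary lisp function; a sketch with arbitrary coefficients:

```lisp
? (setq f (nlfftanh 1.7 0.7 0 0))   ; f(x) = 1.7 tanh(0.7 x)
? (f 2)                             ; returns the value f(2)
```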
8.1.2.0.1.4.3. (nlfflin A B C D)


Returns a linear NLF:
f(x) = A B x + C x + D
8.1.2.0.1.4.4. (nlffpiecewise A B C D)


Returns a piecewise linear NLF:
f(x) = A g(Bx) + Cx + D
where
g(x) = -1 if x < -1
g(x) = +1 if x > +1
g(x) = x otherwise
8.1.2.0.1.4.5. (nlffthreshold A B C D)


Returns a threshold NLF:
f(x) = A g(Bx) + Cx + D
where
g(x) = +1 if x > 0
g(x) = -1 otherwise
8.1.2.0.1.4.6. (nlffbell)


Returns a bell shaped NLF:
f(x) = 0 if x < -1
f(x) = 0 if x > +1
f(x) = (1 - x**2)**3 otherwise
8.1.2.0.1.4.7. (nlfflisp f)


Returns a NLF associated to the lisp function f
. Applying this NLF to a number x
calls the lisp function f and returns
the result. This is especially fast if the lisp function is a DZ.
See: DZ
8.1.2.0.1.4.8. (nlffspline xl yl)


Returns a tabulated NLF. The graph of this NLF is a smooth interpolation
of the points whose abscissas are specified by list
xl and whose ordinates are specified by list
yl .
8.1.2.0.1.4.9. (nlfdfall nlf)


Returns a NLF describing the derivative of NLF
nlf . This derivative is computed with a finite difference
and thus is not very accurate.
8.1.2.0.1.4.10. (nlfddfall nlf)


Returns a NLF describing the second derivative of NLF
nlf . Again, this derivative is computed with finite
differences and is inaccurate.
8.1.2.0.1.4.11. (nlfdftanh A B C D)


Returns a NLF describing the derivative of a sigmoid NLF. This NLF
computes the derivative faster and more accurately than (nlfdfall
(nlfftanh A B C D)). It actually computes:
df(x) = A B (1 - tanh**2(Bx)) + C
8.1.2.0.1.4.12. (nlfdfbell)


Returns the derivative of the bell shaped NLF:
df(x) = 0 if x < -1
df(x) = 0 if x > +1
df(x) = -6 x (1 - x**2)**2 otherwise
8.1.2.0.1.4.13. (nlfdfspline X Y)


This function returns a NLF describing the first derivative of a
tabulated NLF. As usual, the computation is faster and more accurate than
(nlfdfall (nlffspline X Y))
8.1.2.0.1.4.14. (nlfddfspline X Y)


This function returns a NLF describing the second derivative of a
tabulated NLF. As usual, the computation is faster and more accurate than
(nlfddfall (nlffspline X Y))
8.1.2.0.1.5. Functions for Quasi-Linear Units.


8.1.2.0.1.5.0. Propagation Implementation for Quasi-Linear Units.


8.1.2.0.1.5.0.0. (updateweightedsum [theta]...layers...)


Computes the weighted sum (stored in nfield nsum
) of each cell in layers layers, according to the following formula:
nsum(i) = Sum on j ( sval(j,i) nval(j) )
The optional argument theta specifies
the standard deviation of a gaussian noise added to the weighted sum.
8.1.2.0.1.5.0.1. (updatestateonly nlf...layers...)


Applies NLF nlf to the units in layers
layers.
nval(i) = NLF [nsum(i)]
8.1.2.0.1.5.0.2. (updatestate [theta] nlf...layers...)


This function combines the two previous functions. It first computes the
weighted sum for each unit and then applies the NLF
nlf . This is the basic function for updating the state of
one or several layers.
See: (updateweightedsum [ theta
]... layers ...)
See: (updatestateonly nlf
... layers ...)
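Put together, a forward pass through a two-layer network can be sketched as follows (the layer names hidden and output and the NLF actfun are illustrative):

```lisp
; forward propagation, layer by layer
(updatestate actfun hidden)   ; weighted sums then activation of hidden
(updatestate actfun output)   ; same for the output layer
```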
8.1.2.0.1.5.1. Online Gradient Implementation for Quasi-Linear Units.


The following functions are used for implementing ``online'' or
``stochastic'' gradient algorithms. In these algorithms, the weights are
updated after each pattern presentation.
8.1.2.0.1.5.1.0. (initgradlms dnlf output desired)


Computes the gradient of the following cost function (Mean Squared
Error):
Cost = 0.5 * Sum on i ( nval(output i) - nval(desired i) )**2
with respect to the weighted sum and state of units in layer output:
Gradient of Cost on nval(output i)
= nbacksum(output i) = nval(desired i) - nval(output i)
Gradient of Cost on nsum(output i)
= ngrad(output i) = nbacksum(output i) * DNLF[nsum(output i)]
In addition, kernels sn28new and sn28itenew set the second derivative to
1:
nsqbacksum(output i) = 1
The desired values for a network are usually stored in a pseudo layer
desired. The function initgradlms
compares these desired values with the states of the output layer
output and initializes the computation of the gradients.
Function initgradlms returns the
value of the cost function.
8.1.2.0.1.5.1.1. (initgradthlms dnlf output desired threshold)


Computes the gradient of a thresholded cost function defined as follows:
nbacksum(output i)
= 0 if nval(output i) > threshold and nval(desired i) > threshold,
= 0 if nval(output i) < threshold and nval(desired i) < threshold,
= nval(desired i) - nval(output i) otherwise.
Gradient of Cost on nsum(output i)
= ngrad(output i) = nbacksum(output i) * DNLF[nsum(output i)]
This cost function is useful for implementing Rosenblatt's
perceptron-like algorithms. These algorithms however are often unstable
when the minimal cost is not zero.
8.1.2.0.1.5.1.2. (updatebacksum...l...)


Computes the derivatives of the cost with respect to the states of units
in layers l , given the derivatives
with respect to the weighted sums of the downstream units.
Gradient of Cost on nval(i) = nbacksum(i) = Sum on j ( sval(i, j) * ngrad(j) )
8.1.2.0.1.5.1.3. (updategradientonly dnlf...l...)


Computes the derivatives of the cost with respect to the activations
(nfields nsum ) of units in layers
l , given the derivatives with respect to their states.
 Gradient of Cost on nsum(i)
= ngrad(i) = nbacksum(i) * DNLF[nsum(i)]
The NLF object dnlf must be the
derivative of the NLF object given as argument to the function
updatestateonly called on that layer.
8.1.2.0.1.5.1.4. (updategradient dnlf...l...)


Calls both updatebacksum and
updategradientonly . This is the basic step of the well
known backpropagation algorithm.
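The functions above chain into the classical online backpropagation step; a hedged sketch for one pattern presentation (the layer names are illustrative, and dactfun is assumed to be the derivative NLF of actfun):

```lisp
(updatestate actfun hidden output)    ; forward pass
(initgradlms dactfun output desired)  ; cost gradient at the output layer
(updategradient dactfun hidden)       ; backpropagate into the hidden layer
(updateweight alpha decay)            ; stochastic update of all weights
```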
8.1.2.0.1.5.1.5. (updateweight alpha decay [...l...])


Updates the weights according to the gradient of the cost, the momentum
parameter alpha and the exponential
decay parameter decay .
sdelta(i, j) = alpha * sdelta(i, j) + nval(i) * ngrad(j) * nepsilon(j)
sval(i, j) = (1 - decay * nepsilon(j)) * sval(i, j) + sdelta(i, j)
Kernels sn28new and sn28itenew use sepsilon(i,j)
instead of nepsilon(j) .
When layers l are specified, the
weight update is only performed on the incoming connections to layers
l . Otherwise, the weight update happens in all the network.
When arguments alpha and/or
decay are zero, a faster optimized code is used.
With kernels sn28ite and sn28itenew, the possible presence of shared
weights makes the computation more complex and slower. Function
updateweight then calls the functions
clearacc , updateacc ,
updatedelta and updatewghtonly
described in the next section. Three differences with the non-iterative
version are then important:
- The execution is 40% slower.
- Argument decay is no longer scaled by the value of field nepsilon:
sval(i, j) = (1 - decay) * sval(i, j) + sdelta(i, j)
- Restricting the update to specific layers with the optional
argument l has an ambiguous meaning:
selected connections may share weights with unselected connections.
Since the right computation is very slow, it is considerably more
efficient to use clearacc
and updateacc directly, as discussed in the
sequencing section.
8.1.2.0.1.5.2. Batch Gradient Implementation for Quasi-Linear Units.


Batch gradient (also called true gradient) consists in updating the
weights after the presentation of several patterns. A new sfield,
sacc accumulates the contribution of each pattern. This
field is only present with kernels sn28ite and sn28itenew.
The sfield sacc is also used for
accumulating the contributions of several connections to the gradient of
shared weights. In fact, shared weights and batch gradient use the same
underlying mechanisms. Four functions then replace the usual call to
updateweight :
8.1.2.0.1.5.2.0. (clearacc)


Sets the sacc field of all weights to
zero.
sacc = 0
8.1.2.0.1.5.2.1. (updateacc [...l...])


Adds the contribution to the gradient of all incoming connections to
layer l . When argument
l is omitted, all existing connections are considered.
sacc(i, j) = sacc(i, j) + nval(i) * ngrad(j) * nepsilon(j)
Kernels sn28new and sn28itenew use sepsilon(i,j)
instead of nepsilon(j) .
8.1.2.0.1.5.2.2. (updatedelta alpha)


Updates the sfields sdelta of all
weights, according to the value of the sfields
sacc field and to the momentum parameter
alpha :
sdelta = alpha * sdelta + sacc
8.1.2.0.1.5.2.3. (updatewghtonly decay)


Updates the weights, according to sfields sdelta
and to the decay parameter decay .
sval = (1 - decay) * sval + sdelta
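With these four functions, a batch epoch accumulates the gradient over the whole training set before touching the weights; a sketch (the per-pattern forward and backward passes are elided):

```lisp
(clearacc)                 ; reset the gradient accumulator
; for each pattern: forward pass, initgradlms, updategradient, then
(updateacc)                ; accumulate this pattern's contribution
; after the last pattern:
(updatedelta alpha)        ; fold the accumulator into sdelta with momentum
(updatewghtonly decay)     ; finally update the weights
```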
8.1.2.0.1.5.3. Second Order Algorithms Implementation for Quasi-Linear Units.


Adjusting the learning rates in a neural network is sometimes difficult
and often tricky. A simple and efficient idea is used by the Newton
algorithm: relying on the curvature of the cost function. The stronger
the curvature, the smaller the learning rate in that direction.
The curvature is described by the matrix of the second derivatives of
the cost, the Hessian matrix. Since this matrix has a huge number of
coefficients, we only compute the diagonal coefficients.
The weight update equation then becomes:
W = W - Gradient of Cost on W * Inverse( Hessian of Cost on W )
This simple approach, however, is plagued by several problems. The
update rule actually used is:
W = W - sepsilon * Gradient of Cost on W / (mu + ssigma)
We have replaced our learning rate problem by three new parameters.
These new parameters however are much more robust: we always use the
same values, although more refined values may prove more efficient:
sepsilon = 0.0005
gamma = mu = 0.05
This scheme is further complexified by two important additional factors:
- The contributions to the second derivatives of shared connections are
accumulated in a new sfield shess .
- It is empirically more efficient to use the Levenberg-Marquardt
approximation of the second derivatives. This approximation ensures that
the value of field ssigma is greater
than or equal to zero!
These algorithms are implemented by kernels sn28new and sn28itenew. Two
sets of new functions are required for computing the derivatives and for
updating the weights.
8.1.2.0.1.5.3.0. (updateggradient dnlf ddnlf gamma...layers...)


This function first computes the nfields nggrad
of units in layers layers . It then
computes the sfields ssigma of all
incoming connections to layers layers. Argument
ddnlf is the second derivative of the NLF used for that
specific layer. Argument dnlf is the
first derivative of the NLF.
For each unit i in layers:
nsqbacksum(i) = Sum on k ([sval(i, k)]**2 * nggrad(k))
nggrad(i) = nbacksum(i) * DDNLF(nsum(i)) + nsqbacksum(i) * [ DNLF(nsum(i)) ]**2
ssigma(j, i) = gamma * nggrad(i) * [nval(j)]**2 + (1 - gamma) * ssigma(j, i)
This computation requires that the fields nggrad
of all downstream units are correct. All the fields
nggrad and ssigma thus
are computed during a single backward recurrence similar to the usual
backpropagation.
Remark: if a unit has no downstream connection, field
nsqbacksum is left unchanged. Applying
updateggradient to the output layer thus leaves the value
1 stored by initgradlms in
that field and still computes fields nggrad
and ssigma .
8.1.2.0.1.5.3.1. (updatelmggradient dnlf gamma...layers...)


This function computes the Levenberg-Marquardt approximation of
ssigma . This function takes no argument
ddnlf for the second derivative of the NLF because the
Levenberg-Marquardt approximation consists in ignoring this second
derivative!
The computation is similar to that of
updateggradient except that the nfield
nggrad is computed according to the following formula:
nggrad(i) = nsqbacksum(i) * [DNLF(nsum(i))]**2
8.1.2.0.1.5.3.2. (clearhess)


This function is similar to clearacc
. It clears the sfield shess in each
weight.
shess = 0
8.1.2.0.1.5.3.3. (updatehess [...l...])


Accumulates the sfield ssigma of each
connection into the sfield shess of
the corresponding weight. When arguments l
are present, only the incoming connections to layers
l are processed.
For each connection:
shess = shess + ssigma
8.1.2.0.1.5.3.4. (hessianscale mu)


This function is usually inserted before
updatedelta and updatewghtonly
. It scales field sacc according to
the ``second derivative'' stored in field shess
and to the parameter mu . The usual
functions updatedelta and
updatewghtonly then perform the desired weight
update .
For all weights:
sacc = sacc / (mu + shess)
8.1.2.0.1.5.3.5. (updatewnewton alpha decay mu [...layers...])


This function is similar to updateweight, but updates the weights
according to the quasiNewton equation. Actually, this function calls in
sequence:
(clearacc)
(clearhess)
(updateacc...layers...)
(updatehess...layers...)
(hessianscale mu)
(updatedelta alpha)
(updatewghtonly decay)
The performance remarks discussed with function
updateweight apply even more strongly here. With kernel
sn28itenew, function updatewnewton
becomes very slow if you specify argument layers. It is much better to
use the basic functions themselves.
See: (updateweight alpha
decay [... l ...])
8.1.2.0.1.5.4. Conjugate Gradient Implementation for QuasiLinear Units.


Conjugate Gradient is a powerful batch optimization algorithm which
outperforms all other algorithms implemented in SN when looking for a
high optimization accuracy. This algorithm is especially suitable to
function approximation tasks.
This state-of-the-art implementation of the conjugate gradient method
relies on partial line searches based on the analytical computation of
the curvature of the cost function. Such partial line searches are made
possible by a self-restarting Hestenes-Stiefel formula for computing the
conjugate directions.
8.1.2.0.1.5.4.0. (updatedeltastate dnlf...layers...)


Function updatedeltastate is the
basic component of the computation of the Gauss-Newton approximation to
the curvature information. This computation is performed during a single
forward propagation. If the connection field
sdelta contains the search direction, function
updatedeltastate stores in fields
ngrad the derivative of the cell state with respect to a
weight change along the search direction.
For each cell i in layers layers,
function updatedeltastate updates the
gradient field ngrad according to the
following formula:
ngrad(i) = DNLF(nsum(i)) * Sum on j ( sdelta(j,i) * nval(j) + sval(j,i) * ngrad(j) )
8.1.2.0.1.5.4.1. (updatewghtonlyconjgrad curvature)


Assuming that connection field sacc
contains the gradient, that connection field
sdelta contains the search direction, and that argument
curvature is the second derivative along the search direction, function
updatewghtonlyconjgrad performs a Newton step along the
search direction. This is the main component of the partial line search.
This function updates the weight vector according to the following
formula:
sval = sval + [ (sdelta . sacc) / curvature ] * sdelta
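A plain-Python sketch (not SN2.8 code) of this weight update; it assumes
sacc already holds a descent (negative-gradient) direction, and the function
name is hypothetical:

```python
def conj_grad_step(s_val, s_acc, s_delta, curvature):
    # Newton step along the search direction:
    # step length = (sdelta . sacc) / curvature, then move along sdelta
    step = sum(d * g for d, g in zip(s_delta, s_acc)) / curvature
    return [w + step * d for w, d in zip(s_val, s_delta)]
```

On a one-dimensional quadratic cost this single step lands exactly on the
minimum, which is what makes the partial line search cheap.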
8.1.2.0.1.6. Functions for Implementing Euclidian Units.


Instead of computing:
f( Sum on i ( wi * xi ) )
Euclidean units compute a function of the distance between the input
vector xi and the weight vector wi :
f( Sum on i ( wi - xi )**2 )
Euclidean units are useful for implementing several algorithms, like
Kohonen's maps, Learning Vector Quantization, Radial Basis Functions,
K-means and Nearest Neighbour. This implementation in Netenv provides an
alternative to KNNenv.
A few additional functions whose name contains
"nn" implement these units in SN2.8. These functions just
replace their standard counterparts for quasi-linear units. A Euclidean
unit in SN is the same object as a quasi-linear unit. The only
difference is the use of a different set of functions for computing
their states, gradients and weights.
8.1.2.0.1.6.0. Propagation Implementation for Euclidean Units.


8.1.2.0.1.6.0.0. (updatennsum [theta]...layers...)


This is the Euclidean counterpart of
updateweightedsum . It computes the nfield
nsum of each unit in layers
according to the following formula:
nsum(i) = Sum on j ( sval(j,i) - nval(j) )**2
Argument theta is the standard
deviation of an optional gaussian noise added during the computation.
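A plain-Python sketch (not SN2.8 code) of this formula for a single unit;
the function name is hypothetical:

```python
def nn_sum(weights, upstream_vals):
    # squared Euclidean distance between the incoming weight vector and
    # the upstream states: nsum(i) = Sum on j ( sval(j,i) - nval(j) )**2
    return sum((w - x) ** 2 for w, x in zip(weights, upstream_vals))
```

Compare with a quasi-linear unit, which would compute the dot product of the
same two vectors instead of their squared distance.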
8.1.2.0.1.6.0.1. (updatennstate [theta] nlf...layers...)


This is the counterpart of updatestate
. It first calls updatennsum then
calls updatestateonly , thus computing
field nval of each Euclidean unit in
layers layers .
8.1.2.0.1.6.1. Gradient Descent Algorithms Implementation for Euclidean Units.


Backpropagating through Euclidean units is essentially identical to
backpropagating through quasi-linear units. The equations however are
slightly different and are implemented by new SN2.8 functions.
8.1.2.0.1.6.1.0. (updatennbacksum...layers...)


This is the counterpart of updatebacksum
. It computes the field nbacksum of
units in layers according to the value of field
ngrad of their downstream units. These downstream units are
assumed to be Euclidean.
nbacksum(i) = 2 * Sum on k ( ngrad(k) * [nval(i) - sval(i,k)] )
Units in layers do not need to actually be Euclidean units. Their
downstream units however have to be Euclidean units.
8.1.2.0.1.6.1.1. (updatenngradient dnlf...layers...)


This is the counterpart of updategradient. It computes the field ngrad
by calling updatennbacksum and
updategradientonly .
 Units in layers do not need
to be Euclidean units. Their downstream units however have to be
Euclidean units.
 Argument dnlf is the derivative
of the NLF used by updatestateonly
for layers layers . It is not the
derivative of the NLF associated with the Euclidean units, which
actually do not belong to layers layers .
8.1.2.0.1.6.1.2. (updatennweight alpha decay...layers...)


This is the counterpart of updateweight
. It updates the weights for all incoming connections of units in
layers . These units are assumed to be Euclidean.
For each unit i in layers,
For each unit j connected to i:
sdelta(j,i) = alpha * sdelta(j,i) + (1 - alpha) * ngrad(i) * (sval(j,i) - nval(j)) * nepsilon(i)
sval(j,i) = (1 - nepsilon(i) * decay) * sval(j,i) + sdelta(j,i)
With kernels sn28ite or sn28itenew, this function has the same
limitations as function updateweight
. It is then better to explicitly use clearacc
, updatennacc ,
updatedelta and updatewghtonly
.
See: (updateweight alpha
decay [... l ...])
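A plain-Python sketch (not SN2.8 code) of the updatennweight recurrence for
a single connection (j,i), with alpha the momentum and decay the weight
decay; the function name is hypothetical:

```python
def nn_weight_update(s_val, s_delta, n_grad, n_val, n_epsilon, alpha, decay):
    # sdelta(j,i) = alpha*sdelta + (1-alpha)*ngrad(i)*(sval-nval(j))*nepsilon(i)
    s_delta = (alpha * s_delta
               + (1 - alpha) * n_grad * (s_val - n_val) * n_epsilon)
    # sval(j,i) = (1 - nepsilon(i)*decay)*sval + sdelta
    s_val = (1 - n_epsilon * decay) * s_val + s_delta
    return s_val, s_delta
```

With alpha = 0 and decay = 0 this reduces to a plain gradient step that pulls
the weight toward (or away from) the upstream state, depending on the sign
of ngrad.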
8.1.2.0.1.6.1.3. (updatennacc...layers...)


Adds the contribution of the incoming connections of units in
layers to the gradient according to the following formula.
This function is only available with kernels sn28ite and sn28itenew.
sacc(j,i) = sacc(j,i) + ngrad(i) * ( sval(j,i) - nval(j) ) * nepsilon(i)
8.1.2.0.1.6.2. Second Order Algorithms Implementation for Euclidean Units.


Similarly, a few functions allow the computation of the second
derivative in order to use the quasi-Newton or Levenberg-Marquardt
algorithms.
8.1.2.0.1.6.2.0. (updatennggradient dnlf ddnlf gamma...layers...)


This is the counterpart of updateggradient
. It computes the field nggrad
according to the following formula:
nsqbacksum(i) = 2 * Sum on k ( 2 * (sval(i,k) - nval(i))**2 * nggrad(k) - ngrad(k) )
nggrad(i) = nsqbacksum(i) * [DNLF(nsum(i))]**2 - nbacksum(i) * DDNLF(nsum(i))
The sfields ssigma of each connection
are then updated:
ssigma(i,j) = (1 - gamma) * ssigma(i,j)
+ gamma * [ 2 * nggrad(j) * (sval(i,j) - nval(i))**2 - ngrad(j) ]
8.1.2.0.1.6.2.1. (updatennlmggradient dnlf...layers...)


This is the counterpart of function
updatelmggradient . It behaves like function
updatennggradient but performs the Levenberg-Marquardt
approximation:
nsqbacksum(i) = 2 * Sum on k ( 2 * (sval(i,k) - nval(i))**2 * nggrad(k) - ngrad(k) )
nggrad(i) = nsqbacksum(i) * [DNLF(nsum(i))]**2
ssigma(i,j) = (1 - gamma) * ssigma(i,j) + gamma * [ 2 * nggrad(j) * (sval(i,j) - nval(i))**2 - ngrad(j) ]
8.1.2.0.1.6.2.2. (updatennwnewton alpha decay mu...layers...)


This is the Newton weight update for Euclidean cells. It calls in
sequence:
(clearacc)
(clearhess)
(updatennacc...layers...)
(updatehess...layers...)
(hessianscale mu)
(updatedelta alpha)
(updatewghtonly decay)
Like updateweight , this function is
much slower than calling the seven functions above.
See: (updatewnewton alpha
decay mu [...
layers ...])
See: (updateweight alpha
decay [... l ...])
8.1.2.0.1.7. Sequencement in Netenv.


In order to make clear how to use these functions, let us consider a
simple example. A network has 4 layers, layer1
to layer4 ; each layer uses its own
activation function, nlf2 to nlf4 .
In addition, layer3 is composed of Euclidean units:
layer1 Input
layer2 Quasi Linear (nlf2, dnlf2, ddnlf2)
layer3 Euclidian (nlf3, dnlf3, ddnlf3)
layer4 Quasi Linear (nlf4, dnlf4, ddnlf4)
Let us consider an additional layer des which contains the desired
states for layer layer4 .
The standard backpropagation algorithm thus is:
; present a pattern into layer layer1
; forward prop
(updatestate nlf2 layer2)
(updatennstate nlf3 layer3)
(updatestate nlf4)
; present a desired output into layer des
; backwardprop
(initgradlms dnlf4 layer4 des)
(updategradient dnlf3 layer3)
(updatenngradient dnlf2 layer2)
; updateweights
(updateweight 0 0 layer4 layer2)
(updatennweight 0 0 layer3)
With kernels sn28ite and sn28itenew, functions
updateweight and updatennweight
are very inefficient when they do not operate on the whole network. This
is the case here. The last two lines thus are better replaced by:
; updateweights (sn28ite or sn28itenew)
(clearacc)
(updateacc layer4 layer2)
(updatennacc layer3)
(updatedelta 0)
(updatewghtonly 0)
And finally, if we use the Levenberg-Marquardt algorithm, these lines
become:
; second backward pass (newton)
(updatelmggradient dnlf4 0.05 layer4)
(updatelmggradient dnlf3 0.05 layer3)
(updatennlmggradient dnlf2 0.05 layer2)
; weight update (newton)
(clearacc)
(clearhess)
(updateacc layer4 layer2)
(updatennacc layer3)
(updatehess)
(hessianscale mu) ; mu: a small positive constant
(updatedelta 0)
(updatewghtonly 0)
8.1.2.0.2. High Level Functions in Netenv.


This library of high level functions is automatically loaded on startup.
It provides an easier way to create and run networks in SN2.8.
Convenient functions are defined for quickly creating a network, running
the backpropagation algorithm and displaying the results. Hooks are
provided for calling user-defined algorithms, data processing and
display routines.
8.1.2.0.2.0. Overview of High Level Functions in Netenv.


We suggest that the reader browse the file
"sn28/lib/netenv.sn" which defines most functions in the
library. This file is rather small and has a simple structure.
Understanding it is a good way to use SN2.8 more efficiently. There are
four families of functions in this file:
 Functions for
creating a network ( buildnet ,
buildnetnobias ) and creating the internal data structures
used by the library ( definenet ).
 Functions for initializing the weights (
forgetXXX ) and setting the parameters (
epsiXXX , nlfXXX ).
 Functions for managing a data base of examples, loading these
examples into the network ( presentpattern
, presentdesired ) and accessing
nfields layer per layer ( state ,
weightedsum , etc...).
 Functions for running the simulation ( run
, trun , learn
) and testing the performance ( perf
).
These simulation functions are organised around a few key functions:
 Function testpattern computes
the output of the network for a specific pattern and sets a couple of
global variables ( localerror ,
goodanswer ) according to the result.
 Four functions implement the various training algorithms:
 Function basiciteration
implements the stochastic gradient backpropagation algorithm.
 Function batchiteration
implements the batch gradient backpropagation algorithm.
 Function basicnewtoniteration
implements both Newton's and Levenberg-Marquardt's stochastic
backpropagation algorithms.
 Function batchnewtoniteration
implements both Newton's and Levenberg-Marquardt's batch
backpropagation algorithms.
These four functions in turn make heavy use of eight simpler functions:
forwardprop docomputeerr
backwardprop backward2prop
doupdateweight doupdateweightnewton
doupdateacc accupdatehess
These functions are automatically defined when function
definenet is called, either manually by the user or
automatically during the network definition (the usual network
definition function, buildnet , calls
this function).
By changing the definition of these functions, you can create new
connectionist models. For instance, both the kMeans and LVQ algorithms
have been implemented by redefining these functions (see file
"sn28/examples/netlvq/lvq.sn" ).
8.1.2.0.2.1. Creating a Network.


8.1.2.0.2.1.0. (buildnet lunits llinks)

(netenv.sn) 
Function buildnet is a high level
function for building networks. It provides an easy way for creating
multilayer perceptrons. It also sets up all the data structures needed
by the library.
 The first argument
lunits defines the units. Argument
lunits must be a list with the following structure:
( (name1 num1) (name2 num2) ... (nameN numN) )
Each pair (name num) in list
lunits defines a layer named name with
num units. More precisely, buildnet
stores in symbol name the list returned by
(newneurons num) .
The order of the layers in argument lunits
defines the order of the computation during the forward pass. The first
pair thus is always the input layer and the last pair is always the
output layer.
Function buildnet also creates an
additional layer desiredlayer which
has the same size as the last layer in lunits
. This additional layer is used to store the desired outputs of the
network.
 The second argument llinks is a
possibly empty list of the form:
( (from1 to1) (from2 to2) ... (fromN toN) )
For each pair (from to) , function
buildnet creates a full connection from layer
from to layer to . This
capability is handy for creating simple networks. Complex connection
patterns however are usually created by calling directly the functions
of library "connect.sn" once
buildnet has returned.
In addition, function buildnet
always connects the bias unit (unit 0
) to all units in all layers but the input layer.
Function buildnet generates several
global variables. The global variables are assigned with functions
makedesired , makeoutput
and makeinput . Finally, function
buildnet calls definenet
for defining the eight forward and backward propagation functions.
Example: Creating a ``424 decoder''.
? (allocnet 50 100)
= 4400
? (buildnet
; create the cells
'((input 4) ; input layer
(hidden 2) ; a hidden layer
(output 4) ) ; output layer
; make connections
'((input hidden) ; connect input layer to the hidden layer
(hidden output))) ; connect hidden layer to the output layer
= ( ( ) ( ) )
? input
= ( 1 2 3 4 )
? inputlayer
= ( 1 2 3 4 )
? hidden
= ( 5 6 )
? nnum ; number of units
= 15
? snum ; number of connections
= 22
A variable netstruc is set for
backward compatibility purposes. It contains some additional information
about the structure of the network.
8.1.2.0.2.1.1. (buildnetnobias lunits llinks)

(netenv.sn) 
This function is quite similar to buildnet
. It does not connect however the allocated cells to the bias unit. This
function is especially useful for creating shared weight nets with
constraints on the biases.
8.1.2.0.2.1.2. (makeinput l)

(netenv.sn) 
This function stores layer l into
variable inputlayer . It should be
called whenever this layer is redefined or modified. This function is
automatically called by buildnet and
buildnetnobias .
8.1.2.0.2.1.3. (makeoutput l)

(netenv.sn) 
This function stores layer l into
variable outputlayer . It should be
called whenever this layer is redefined or modified. This function is
automatically called by buildnet and
buildnetnobias .
8.1.2.0.2.1.4. (makedesired l)

(netenv.sn) 
This function stores layer l into
variable desiredlayer . It should be
called whenever this layer is redefined or modified. This function is
automatically called by buildnet and
buildnetnobias .
8.1.2.0.2.1.5. (definenet arg)

(netenv.sn) 
This function defines the eight forward and backward propagation
functions. This function is automatically called by
buildnet and buildnetnobias
. Argument arg usually is a list of
single element lists. Each single element list contains the name of a
layer.
Remark: Limited support is offered by buildnet
, buildnetnobias and
definenet for defining specific activation functions per
layer. This feature is described in section 3.5.3.
8.1.2.0.2.2. Defining the Examples in Netenv.


The library handles the examples using functions
presentpattern and presentdesired
. These functions store example number n
in the input layer and the desired layer. The user can redefine these
functions for implementing preprocessing or for calling a more refined
database management.
The default functions presentpattern and presentdesired just store the
n th row of matrices patternmatrix
or desiredmatrix into the input layer
or into the desired layer.
8.1.2.0.2.2.0. (ensemble pmin pmax)

(netenv.sn) 
Defines the training set as being composed of examples number
pmin to pmax inclusive.
These bounds are stored in the global variables
pattmin and pattmax .
See: pattmin
pattmax
8.1.2.0.2.2.1. pattmin pattmax

[VAR] (netenv.sn) 
These two variables store the lower and upper bounds of the training set.
You should call function ensemble
instead of modifying directly these variables.
8.1.2.0.2.2.2. (testset pmin pmax)

(netenv.sn) 
Defines the test set as being composed of examples
pmin to pmax inclusive.
Examples in the training set are used for learning; examples in the test
set are only used for evaluating the performance of the neural network
on fresh data. This measure gives an idea of the generalization ability
of the network.
Function testset also prints a
pessimistic evaluation of a 90% confidence interval on the accuracy of
the performance measure. This evaluation is computed with function
hoeffding according to the value of variable confidence.
The bounds of the test set are stored in the global variables
tpattmin and tpattmax .
See: tpattmin
tpattmax
8.1.2.0.2.2.3. tpattmin tpattmax

[VAR] (netenv.sn) 
You should call function testset instead of directly modifying these
variables.
8.1.2.0.2.2.4. (hoeffding tau eta n)

(netenv.sn) 
Computes a bound epsilon on the
difference between the empirical average of n
independent identically distributed positive random variables
Xi and their mean m , with
confidence eta . Argument
tau is the maximal value of the variables
Xi .
The bound is computed according to Hoeffding's formula:
P( | m - (1/n) * Sum on i (Xi) | > epsilon )
< 2 * exp( -2 * n * (epsilon/tau)**2 ) = 1 - eta
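A plain-Python sketch (not SN2.8 code) of this computation, assuming the
Hoeffding bound is inverted to obtain epsilon; the function name is
hypothetical:

```python
import math

def hoeffding(tau, eta, n):
    # solve 2 * exp(-2 * n * (epsilon/tau)**2) = 1 - eta for epsilon
    return tau * math.sqrt(math.log(2.0 / (1.0 - eta)) / (2.0 * n))
```

Plugging the returned epsilon back into the bound recovers the target
failure probability 1 - eta.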
8.1.2.0.2.2.5. (presentpattern l n)

(netenv.sn) 
This function is called whenever the library needs a new example. It
must load the fields nval of layer
l with the input part of example number
n .
This function is often redefined by the user. The default function just
loads the n th row of the
2-dimensional matrix patternmatrix
into layer l using function
getpattern and adds a gaussian noise of standard deviation
inputnoise .
8.1.2.0.2.2.6. inputnoise

[VAR] (netenv.sn) 
Noise level added to the input patterns before any network calculations.
The computation is faster when inputnoise
is set to zero. This is the default value. A non-zero value (0.1 for
example) is sometimes used during training in order to get better
generalization.
8.1.2.0.2.2.7. (presentdesired l n)

(netenv.sn) 
This function is called by the library whenever a new example is needed.
It must load the fields nval of layer
l with the desired output part of example number
n .
This function is often redefined by the user. The default function just
loads the n th row of the
2-dimensional matrix desiredmatrix
into layer l using function
getpattern .
8.1.2.0.2.3. Accessing Internal Variables in Netenv.


When a network is created, space is allocated for recording numerical
values on the units and connections of the network graph.
Internal variables associated with the units are called ``nfields'', an
acronym for ``neuron fields''. Internal variables associated with the
connections are called ``sfields'', an acronym for ``synapse fields''.
Fields are highly specialized: in order to increase speed, many
computational functions work on implicit fields. In addition, not all
fields are available in all versions of SN2.8. There is indeed no
reason to slow down a regular network by updating the fields
required by a shared weights network.
8.1.2.0.2.3.0. Accessing Neural Fields of Groups of Units.


8.1.2.0.2.3.0.0. Conversion between Neural Fields and Lists.


8.1.2.0.2.3.0.0.0. (mapncar f arg)

(netenv.sn) 
This is a general function used by other functions. For instance, once
function state has been defined as:
? (de state l (mapncar 'nval l))
= state
we can easily set or get the states of a list of units:
? (setq foo (newneurons 4)) ; allocate 4 units
= ( 1 2 3 4 )
? (state) ; get their state (with the bias unit state)
= ( 1 0 0 0 0 )
? (state '(2 3) 0.5) ; set the state of units 2 and 3 to 0.5
= 0.5
? (state foo) ; display the states of layer foo
= ( 1 0 0.5 0.5 0 )
? (state 0.33) ; set the state of all the units to 0.33
= 0.33
? (state) ; display the network state
= ( 1 0.33 0.33 0.33 0.33 )
And in fact, the following functions are already defined:
8.1.2.0.2.3.0.0.1. (state args)

(netenv.sn) 
Set or get the states of a group of units (field
nval ).
8.1.2.0.2.3.0.0.2. (gradient args)

(netenv.sn) 
Set or get the gradients of a group of units (field
ngrad ).
8.1.2.0.2.3.0.0.3. (weightedsum args)

(netenv.sn) 
Set or get the total input of a group of units (field
nsum ).
8.1.2.0.2.3.0.0.4. (backsum args)

(netenv.sn) 
Set or get the backward sum of a group of units (field
nbacksum ).
8.1.2.0.2.3.0.0.5. (epsilon args)

(netenv.sn) 
Set or get the learning rate of a group of units (field
nepsilon ). With kernels sn28new and sn28itenew, this
function returns the average of the learning rates attached to the
weights of each unit.
8.1.2.0.2.3.0.0.6. (realepsilon args)

(netenv.sn) 
With kernels sn28new and sn28itenew only, this function gets the
effective learning rate of a group of units. The effective learning rate
is equal to:
effective_epsilon = epsilon / (mu + |sigma|)
where epsilon is the learning rate
attached to the unit (or the weight) and sigma
is the mean second derivative of the cost function with respect to the
weights of the unit extracted from the sfield
shess .
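A minimal Python sketch (not SN2.8 code) of this scaling; the function name
is hypothetical:

```python
def effective_epsilon(epsilon, sigma, mu):
    # the learning rate is scaled down by the curvature estimate |sigma|,
    # with mu keeping the division safe near zero curvature
    return epsilon / (mu + abs(sigma))
```

Units sitting in high-curvature regions of the cost function thus take
smaller effective steps than units in flat regions.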
8.1.2.0.2.3.0.0.7. (sqbacksum args)

(netenv.sn) 
With kernels sn28new and sn28itenew only, sets or gets the instantaneous
second derivatives of the cost function with respect to the state of a
group of units (field nsqbacksum ).
8.1.2.0.2.3.0.0.8. (ggradient args)

(netenv.sn) 
With kernels sn28new and sn28itenew only, sets or gets the instantaneous
second derivatives of the cost function with respect to the total input
of a group of units (field nggrad ).
8.1.2.0.2.3.0.0.9. (sigma args)

(netenv.sn) 
With kernels sn28new and sn28itenew only, sets or gets the average of
the second derivative of a group of units (field
nsigma ).
8.1.2.0.2.3.1. Conversion between Neural Fields and Matrices.


Several functions transfer data from a matrix to the
nval or sval field of
units.
 Functions getpattern
and getpattern2 are used for loading
the network input with new data.
 Functions matrixtoweight and
weighttomatrix are used for saving or loading weights.
8.1.2.0.2.3.1.0. (getpattern arr n l1...ln)


This function is used to transfer a pattern vector into the states of a
list of units. The units to be loaded are defined by the lists
l1 ... ln . Matrix
arr must have 2 dimensions; n
is the index of the vector that should be transferred. The sum of the
lengths of the lists l1 ...
ln should be equal to the length of the vector (i.e. the size
of the matrix in the second dimension).
(dim mypatterns 10 5) ; define a matrix with 10 vectors of dimension 5
(initializemypatterns) ; put interesting data into the matrix
; transfer elements (7 x) to units 1, 2, 3, 11, 12
(getpattern mypatterns 7 '(1 2 3) '(11 12))
8.1.2.0.2.3.1.1. (getpattern2 arr l1...ln)


This function is also used to transfer a pattern vector into the states
of a list of units. However it works slightly differently than
getpattern .
Matrix arr is transferred entirely into the units defined by
l1 ... ln . It may have any
number of dimensions. The sum of the lengths of the lists
l1 ... ln should be equal
to the number of elements in the matrix. This function is mainly
designed for dealing with submatrices.
Function getpattern2 is slightly
slower than getpattern. However, it implements a more general way to
transfer data from a matrix to a layer in the network.
; getpattern is now defined as
(de getpattern(arr n . l)
(apply getpattern2 (cons (submatrix arr n ()) l)) )
8.1.2.0.2.3.2. Accessing Synaptic Fields of Groups of Units.


8.1.2.0.2.3.2.0. Conversion between Synaptic Fields and Lists.


The following functions are not meant for saving or restoring weights on
disk but for temporarily saving a weight configuration in a lisp
variable. The content of this variable can be used to restore the weight
configuration afterwards. These functions are convenient for
experimenting with several sets of parameters or for comparing different
weight configurations.
8.1.2.0.2.3.2.0.0. (dumpweights)

(netenv.sn) 
Returns a list of lists. Each list contains the weights of a particular
unit. An empty list corresponds to a unit with no incoming weights. The
units are ordered according to their number.
; save the weight configuration into variable agoodsolution
? (setq agoodsolution (dumpweights))
=...
8.1.2.0.2.3.2.0.1. (restweights l)

(netenv.sn) 
Loads the network weights with the content of the variable passed as
argument.
Caution: This function does not create connections but just sets the
weight values. The network structure must be exactly identical to the
structure used when dumpweights has
been called.
; restore the weights stored in variable agoodsolution
? (restweights agoodsolution)
= ()
8.1.2.0.2.3.2.0.2. (weight n)

(netenv.sn) 
Returns the list of the incoming weights to cell
n (field sval ).
8.1.2.0.2.3.2.0.3. (setweight n l)

(netenv.sn) 
Sets the weights of unit n to the
values contained in list l .
8.1.2.0.2.3.2.0.4. (deltaw n)

(netenv.sn) 
Returns the list of the weight change of the incoming weights to cell
n (field sdelta ).
8.1.2.0.2.3.2.0.5. (setdeltaw n l)

(netenv.sn) 
Sets the delta's of the weights coming into unit
n to the values contained in l
.
8.1.2.0.2.3.2.1. Conversion between Synaptic Fields and Matrices.


8.1.2.0.2.3.2.1.0. (matrixtoweight mat)


Transfers the elements of a monodimensional matrix
mat into the network weights. This function is usually called
by loadnet .
8.1.2.0.2.3.2.1.1. (weighttomatrix [ mat ])


Transfers the weights of the network into a monodimensional matrix
mat . When no suitable argument is supplied, a new matrix is
created. This function is called by savenet
.
8.1.2.0.2.3.2.1.2. (matrixtoacc mat)


Transfers the elements of a monodimensional matrix mat into the
gradient accumulator sacc for each
weight in the network. This function is available in kernels sn28ite
and sn28itenew.
8.1.2.0.2.3.2.1.3. (acctomatrix [ mat ])


Returns a monodimensional matrix containing the gradient accumulator
sacc for each weight in the network. When provided, the
optional argument mat specifies a
destination matrix and saves the memory allocation time. This function
is available in kernels sn28ite and sn28itenew.
8.1.2.0.2.3.2.1.4. (matrixtodelta mat)


Transfers the elements of a monodimensional matrix
mat into the weight change field
sdelta for each weight in the network. This function is
available in kernels sn28ite and sn28itenew.
8.1.2.0.2.3.2.1.5. (deltatomatrix [ mat ])


Returns a monodimensional matrix containing the weight change field
sdelta for each weight in the network. When provided, the
optional argument mat specifies a
destination matrix and saves the memory allocation time. This function
is available in kernels sn28ite and sn28itenew.
8.1.2.0.2.3.2.1.6. (matrixtohess mat)


Transfers the elements of a monodimensional matrix mat into the hessian
curvature field shess for each
weight in the network. This function is available in sn28itenew only.
8.1.2.0.2.3.2.1.7. (hesstomatrix [ mat ])


Returns a monodimensional matrix containing the hessian curvature field
shess for each weight in the network. When provided, the
optional argument mat specifies a destination matrix and saves the
memory allocation time. This function is available in sn28itenew only.
8.1.2.0.2.4. Weight Files in Netenv.


The weights (sfield sval ) may be
saved in files under several formats.
First, it is possible to store them under ascii or binary formats.
Secondly, it is possible to store them with (resp. without) information
related to the neural net architecture, leading to an explanative (resp.
blind) format.
Functions loadnet and
mergenet recognize all these formats.
8.1.2.0.2.4.0. (savenet/merge str [ l ])


Saves the weight values of connections involving any two units in list
l and connections from the bias unit (unit
0 ) to any unit in list l .
When argument l is omitted, all units
are considered. All the weights then are saved.
Function savenet stores the weights
into file str with a binary
explanative format. The first 12 bytes of the file are:
 A ``magic'' number on 4 bytes for identifying the file type.
 The number of units in list l , as a 4-byte integer.
 The number of training iterations achieved so far, as a 4-byte
integer.
There is then one record per connection. Each record is composed of
12 bytes:
 A 4-byte integer gives the rank in list l
of the upstream unit.
 A 4-byte integer gives the rank in list l
of the downstream unit.
 A 4-byte floating point number gives the value of sfield
sval for this connection.
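A plain-Python sketch (not SN2.8 code) of parsing this record layout; the
big-endian byte order is an assumption, the magic number's actual value is
not checked, and the function name is hypothetical:

```python
import struct

def read_savenet(data):
    # 12-byte header: magic, number of units, training iterations
    magic, n_units, age = struct.unpack(">iii", data[:12])
    # then 12-byte records: upstream rank, downstream rank, sval (float32)
    records = []
    for off in range(12, len(data) - 11, 12):
        up, down, sval = struct.unpack(">iif", data[off:off + 12])
        records.append((up, down, sval))
    return n_units, age, records
```

Note that the ranks refer to positions in the list l passed to savenet, not
to absolute unit numbers.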
8.1.2.0.2.4.1. (saveasciinet/merge str [ l ])


This function is similar to savenet/merge
, but creates a human readable text file.
The first line of the file is a file header. It is composed of the
letters ".WEI" , of the number of
cells involved in the file and of the age of the network. Each
connection is described on one line, consisting of the number of the
upstream cell, the number of the downstream cell and the weight value.
Note that cell 0 is always the
threshold cell. After 160 iterations, a three-layer XOR network might
be saved as:
.WEI 6 160
0 3 1.7688258
0 4 2.2176821
0 5 1.5779345
1 3 2.3539493
1 4 2.0589452
2 3 0.8007546
2 4 1.9538255
3 5 1.5308707
4 5 1.4315528
In this file, unit 0 is the bias,
units 1 and 2
are the input units, units 3 and
4 are the hidden units and unit 5
is the output unit.
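A plain-Python sketch (not SN2.8 code) of parsing this ascii layout; the
function name is hypothetical:

```python
def parse_wei(text):
    # header line: ".WEI <number of cells> <age>", then one
    # "upstream downstream weight" triple per connection
    lines = text.strip().splitlines()
    tag, n_cells, age = lines[0].split()
    if tag != ".WEI":
        raise ValueError("not an ascii weight file")
    weights = {}
    for line in lines[1:]:
        up, down, w = line.split()
        weights[(int(up), int(down))] = float(w)
    return int(n_cells), int(age), weights
```

The dictionary keyed by (upstream, downstream) makes it easy to look up any
single connection, such as the bias weight (0, 3).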
8.1.2.0.2.4.2. (savenet str [ l ])

(netenv.sn) 
When argument l is provided, this
function works like savenet/merge and
create an old format file. Argument l
indicates indeed that the weights are stored for a subsequent use by
function mergenet .
When argument l is omitted, this
function saves the weights in file str
as a regular monodimensional TLisp matrix using function
savematrix . This matrix just contains the weights by their
order of creation.
8.1.2.0.2.4.3. (saveasciinet str [ l ])

(netenv.sn) 
When argument l is provided, this
function works like saveasciinet/merge
and create an explanative format file. Argument l
indicates indeed that the weights are stored for a subsequent use by
function mergenet .
When argument l is omitted, this
function saves the weights in file str
using blind format which means a regular monodimensional TLisp matrix
using function saveasciimatrix .
This matrix just contains the weights by their order of creation.
8.1.2.0.2.4.4. (loadnet str)

(netenv.sn) 
Loads a weight configuration saved with savenet
from file str .
See: (mergenet str [
l ])
8.1.2.0.2.4.5. (mergenet str [ l ])


When a list l is given as argument,
function mergenet loads the weights saved in file str and stores them
in the fields sval of the
corresponding connections between units in list l
.
When argument l is omitted, function
mergenet returns the number of units involved by the weight
file str . When argument
l is omitted and when str
is not an explanative format weight file, function
mergenet returns () . This
feature is used for sensing the file format (explanative vs. blind).
Function mergenet is usually called
with the same list of units as the corresponding call to
savenet . Using a different list lets you load the weights
of a network into a part of another network and vice versa.
Function mergenet never creates a
connection. Connections have to be created first.
;; The following function is defined in the file "netenv.sn".
;; It loads a weight file created by savenet without arguments
(de loadnet (str)
(mergenet str (range 1 (1- nnum))) )
8.1.2.0.2.5. Setting the Parameters in Netenv.


8.1.2.0.2.5.0. Initial Weights in Netenv.


8.1.2.0.2.5.0.0. (forget x)

(netenv.sn) 
Sets all the weights of a network to a random value chosen between
-x and x according to a
uniform probability distribution.
8.1.2.0.2.5.0.1. (forgetinv x)

(netenv.sn) 
Like forget but divides the bounds of the uniform distribution by the
fan-in of each unit.
8.1.2.0.2.5.0.2. (forgetsqrt x)

(netenv.sn) 
Like forget but divides the bounds of the uniform distribution by the
square root of the fan-in of each unit.
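A plain-Python sketch (not SN2.8 code) of this initialization for one unit;
the function name is hypothetical:

```python
import random

def forget_sqrt_unit(fan_in, x):
    # draw each incoming weight uniformly in
    # [-x/sqrt(fan_in), +x/sqrt(fan_in)]
    bound = x / fan_in ** 0.5
    return [random.uniform(-bound, bound) for _ in range(fan_in)]
```

Scaling the bounds by the square root of the fan-in keeps the variance of
the initial weighted sums roughly independent of the number of inputs.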
8.1.2.0.2.5.0.3. (smartforget)

(netenv.sn) 
Equivalent to (forgetsqrt 1) . This
weight initialisation is usually a good choice with the default
nonlinear function.
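The three initialisation rules above differ only in how the bound of the uniform distribution scales with the fan-in. Here is a minimal Python sketch of those scaling rules (illustrative only, not SN2.8 code; the name uniform_init is hypothetical):

```python
import random, math

def uniform_init(fanins, x, scale="none"):
    """Draw one initial weight per connection, uniformly from [-b, b],
    where b depends on the chosen scaling rule:
      "none" : b = x                    (like forget)
      "inv"  : b = x / fanin            (like forgetinv)
      "sqrt" : b = x / sqrt(fanin)      (like forgetsqrt)
    `fanins` lists the fan-in of each connection's destination unit."""
    weights = []
    for fanin in fanins:
        if scale == "inv":
            b = x / fanin
        elif scale == "sqrt":
            b = x / math.sqrt(fanin)
        else:
            b = x
        weights.append(random.uniform(-b, b))
    return weights
```

With scale="sqrt" and x=1 this mirrors the behaviour attributed to smartforget above.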
8.1.2.0.2.5.0.4. (forgetlayer x l)

(netenv.sn) 
This function is essentially equivalent to forget
. It reinitializes only the weights of incoming connections to layer
l . The remaining weights are left unchanged.
8.1.2.0.2.5.0.5. (forgetinvlayer x l)

(netenv.sn) 
This function is essentially equivalent to
forgetinv . It reinitializes only the weights of incoming
connections to layer l . The remaining
weights are left unchanged.
8.1.2.0.2.5.0.6. (forgetsqrtlayer x l)

(netenv.sn) 
This function is essentially equivalent to
forgetsqrt . It reinitializes only the weights of incoming
connections to layer l . The remaining
weights are left unchanged.
8.1.2.0.2.5.1. Learning Parameters in Netenv.


It is rarely advisable to use the same ``learning rate'' in the whole
network. This is of course possible with function
epsilon . The following functions usually are a better
choice:
8.1.2.0.2.5.1.0. (epsi x)

(netenv.sn) 
Sets the learning rate of each unit (each connection with kernels
sn28new and sn28itenew) to x divided
by the number of inputs to the unit.
8.1.2.0.2.5.1.1. (epsisqrt x)

(netenv.sn) 
Sets the learning rate of each unit (each connection with kernels
sn28new and sn28itenew) to x divided
by the square root of the number of inputs to the unit.
8.1.2.0.2.5.1.2. (maskepsi s)

(connect.sn) 
It is often useful to set small learning rates
on shared connections. Using maskepsi
rather than epsisqrt or
epsi is often a good choice for setting the learning rates in
a shared weights network.
Function maskepsi sets the learning
rate for each unit to s divided by the
square root of the number of incoming connections and by the square root
of the sharing count of the incoming connections.
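The scaling rules of epsi, epsisqrt and maskepsi can be summarised by a small Python sketch (illustrative only, not SN2.8 code; scaled_epsilon is a hypothetical name):

```python
import math

def scaled_epsilon(x, fanin, share_count=1, mode="epsi"):
    """Per-unit learning rate under the scaling rules described above:
      "epsi"     : x / fanin                              (epsi)
      "epsisqrt" : x / sqrt(fanin)                        (epsisqrt)
      "maskepsi" : x / (sqrt(fanin) * sqrt(share_count))  (maskepsi)"""
    if mode == "epsi":
        return x / fanin
    if mode == "epsisqrt":
        return x / math.sqrt(fanin)
    if mode == "maskepsi":
        return x / (math.sqrt(fanin) * math.sqrt(share_count))
    raise ValueError(mode)
```

For unshared connections (share count 1), maskepsi reduces to epsisqrt, which is why it is the safer default for shared-weight networks.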
8.1.2.0.2.5.1.3. (epsilayer x l)

(netenv.sn) 
This function is essentially similar to function
epsi . However it sets only the learning rates for the units
in layer l . The other learning rates
are kept unchanged.
8.1.2.0.2.5.1.4. (epsisqrtlayer x l)

(netenv.sn) 
This function is essentially similar to function
epsisqrt . However it sets only the learning rates for the
units in layer l . The other learning
rates are kept unchanged.
Most other learning parameters are constant throughout the network. They
are stored in a couple of global variables.
8.1.2.0.2.5.1.5. decay

[VAR] (netenv.sn) 
Decay factor. At each learning iteration, the weights are multiplied by
(1 - epsilon * decay) . This computation is disabled when
decay is set to 0 . The simulation is
slightly faster in this case. The default value is
0 .
8.1.2.0.2.5.1.6. theta

[VAR] (netenv.sn) 
Noise level theta . A zero-mean Gaussian
random variable with standard deviation theta
is added to the total input of each unit after each state update. The
state of a unit is thus equal to:
X(i) = f( A(i) + N(theta) )
where A(i) is the total input to unit
i and N(theta) is a zero-mean
Gaussian random variable with standard deviation
theta . This computation is disabled when theta is set to
0 . The simulation is slightly faster in this case. The
default value is 0 . A non-zero value
can be used for:
 limiting the bandwidth of the units
 getting a more robust solution
8.1.2.0.2.5.1.7. alpha

[VAR] (netenv.sn) 
Momentum factor alpha for backpropagation. The weight increment is computed
with the following formula:
increment W(i,j) := alpha * increment W(i,j) - epsilon * gradient of Cost on W(i,j)
This computation is disabled when alpha is set to
0 . The simulation is slightly faster in this case. The
default value is 0 .
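The interaction of the momentum term alpha with the decay factor can be sketched in Python (an illustrative translation of the formulas above, not SN2.8 code; weight_step is a hypothetical name and applying decay before the gradient step is an assumption):

```python
def weight_step(w, dw_prev, grad, epsilon, alpha=0.0, decay=0.0):
    """One weight update following the formulas above:
    w := w * (1 - epsilon*decay)                  (decay, skipped when 0)
    increment := alpha*increment - epsilon*grad   (momentum + gradient)
    Returns the new weight and the new increment."""
    w = w * (1.0 - epsilon * decay)
    dw = alpha * dw_prev - epsilon * grad
    return w + dw, dw
```

With alpha = 0 and decay = 0 this reduces to plain gradient descent, matching the remark that both computations are disabled (and faster) at their default value 0.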
8.1.2.0.2.5.1.8. mu

[VAR] (netenv.sn) 
With kernels sn28new and sn28itenew, variable mu
is used for computing the effective learning rate in the approximated
Newton method. The effective learning rate of a unit is defined by:
effective epsilon(i,j) = epsilon(i,j) / (mu + sigma(i,j))
8.1.2.0.2.5.1.9. gamma

[VAR] (netenv.sn) 
With kernels sn28new and sn28itenew, variable
gamma controls the time constant used for computing the
average second derivative sigma which
is updated according to the rule:
sigma(i) := (1 - gamma) * sigma(i) + gamma * (d2E/dW2) * X(i)**2
where X(i)**2 is the square of the state of unit i.
See: (updatestateonly nlf
... layers ...)
See: (updateweight alpha
decay [... l ...])
See: (updateggradient dnlf
ddnlf gamma ...
layers ...)
See: (updatewnewton alpha
decay mu [...
layers ...])
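The running average controlled by gamma and the effective learning rate controlled by mu can be sketched together in Python (illustrative only, not SN2.8 code; both function names are hypothetical):

```python
def update_sigma(sigma, gamma, d2e, x):
    """Running average of the second derivative, as in the rule above:
    sigma := (1 - gamma)*sigma + gamma * d2e * x**2
    where d2e stands for d2E/dW2 and x for the unit state X(i)."""
    return (1.0 - gamma) * sigma + gamma * d2e * x * x

def effective_epsilon(epsilon, mu, sigma):
    """Approximated-Newton learning rate: epsilon / (mu + sigma)."""
    return epsilon / (mu + sigma)
```

A small gamma gives a long time constant (slowly varying sigma); mu acts as a floor that keeps the effective learning rate bounded when the curvature estimate sigma is near zero.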
8.1.2.0.2.5.2. Choosing the Activation Functions in Netenv.


8.1.2.0.2.5.2.0. nlf, dnlf, ddnlf

[VAR] (netenv.sn) 
These three variables contain the default NLF used by the library for
all layers. They are set by the functions described below.
Since NLF objects have a functional behavior, writing
(nlf x) returns the value of the default NLF at
x , writing (dnlf x)
returns the value of its derivative at x ,
and writing (ddnlf x) returns the
value of its second derivative at x .
8.1.2.0.2.5.2.1. (nlftanh mx dmin [scale [offset]])

(netenv.sn) 
Sets the nonlinear function to a sigmoid of the following form:
f(x) = scale * tanh( mx * x ) + offset + dmin * x
Since the hyperbolic tangent is an odd function taking values in range
]-1,1[, argument scale determines the amplitude, argument
dmin the minimum value of the derivative (if
dmin is not 0 the
asymptotes are slanted), argument mx
defines the gain, i.e. the inverse of the ``temperature'', and argument
offset shifts the curve up or down.
Argument offset defaults to
0 . If argument scale is omitted, a scale is computed to
ensure that f(1) = 1 and
f(-1) = -1 .
The most common settings are the following:

mx=0.5 , dmin=0 ,
scale=0.5 , offset=0.5
corresponds to the standard logistic function going from
0 to 1 , with
f(0)=0.5 .
 mx=2/3 ,
dmin=0 gives a function which goes from
-1.715905 to +1.715905 .
This function fulfils the conditions f(1)=1
and f(-1)=-1 . Moreover, the maximum
of the second derivative of this function is at 1
. This is the default activation function of SN2.8, obtained with:
(nlftanh 0.666 0)
8.1.2.0.2.5.2.2. (nlflin dmin dmax [th])

(netenv.sn) 
Sets the nonlinear function to a piecewise linear function. This odd
function is made of three parts. The central part, between
-th and th , is a line
segment with a slope equal to dmax .
The remaining parts are straight lines of slope
dmin outside the range [-th, th]
. The three parts form a continuous curve.
Although this function is not differentiable, the computation of its
derivative causes no problem except at two points. SN2.8 assumes that
the derivatives at these two boundaries are those of the central part. A
typical piecewise linear function is obtained with:
(nlflin 0.1 1.1)
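The piecewise linear function can be sketched in Python (illustrative only, not SN2.8 code; the default threshold th=1 is an assumption):

```python
def nlflin(dmin, dmax, th=1.0):
    """Piecewise linear odd function: slope dmax on [-th, th], slope dmin
    outside, with the three segments joined continuously."""
    def f(x):
        if x > th:
            return dmax * th + dmin * (x - th)   # upper outer segment
        if x < -th:
            return -dmax * th + dmin * (x + th)  # lower outer segment
        return dmax * x                          # central segment
    return f
```

Matching the outer segments at +/-th (rather than giving them an independent intercept) is what makes the three parts form a continuous curve.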
8.1.2.0.2.5.2.3. (nlfbell)

(netenv.sn) 
Sets the nonlinear function to a bell shaped function whose equation is:
f(x) = (1-x**2)**3 if -1<x<1
f(x) = 0 otherwise
8.1.2.0.2.5.2.4. (nlfbin mx dmin [scale [offset]])

(netenv.sn) 
Sets the nonlinear function to a binary threshold with a smooth
pseudo-derivative similar to the derivative obtained by the
nlftanh function. The nlfbin
function, for instance (nlfbin 0.666 0 1) ,
computes a binary step during the forward pass. However, a smooth
derivative is used during the backward pass.
8.1.2.0.2.5.2.5. (nlfspline x y)

(netenv.sn) 
Sets the default nonlinear function (defined by
nlf , dnlf ,
ddnlf ) to a cubic spline interpolated between points whose
coordinates are in x and
y . Computing a spline interpolation is often faster than
computing a transcendental function. Here is code for tabulating the
standard NLF as a spline:
? (nlftanh 0.6666 0) ; sets a standard function
= 1.14393
? (let* ((x (range 4 4 0.05))
(y (all ((i x)) (nlf i))))
(nlfspline x y) ) ; copy it with a cubic spline
= 400
8.1.2.0.2.5.2.6. NLF support in buildnet.


The library allows the use of a different NLF per layer. In the
buildnet instruction, a layer specification may be
completed by a call to a function of the type
nlfXXX .
(buildnet
'((input 4)
(hidden 2)
(output 4 (nlflin 1 1)))
'((input hidden)
(hidden output)) )
In this example, layer hidden uses the default NLF, defined by variables
nlf , dnlf ,
ddnlf . This default NLF can be changed by subsequent calls
to a function nlfXXX . Layer output
however uses a fixed linear function, defined by
(nlflin 1 1) . Subsequent uses of functions
nlfXXX will never affect the activation function of layer
output.
8.1.2.0.2.6. Performance Evaluation in Netenv.


8.1.2.0.2.6.0. Single Example Performance in Netenv.


8.1.2.0.2.6.0.0. (testpattern n)

(netenv.sn) 
Tests the response to pattern n . This
function presents pattern number n
using presentpattern , propagates the
states forward using forwardprop ,
computes the error on that pattern, stores its average scalar value into
localerror using docomputeerr
, decides if the answer is correct using classify and finally calls a
display function.
Function testpattern is defined as follows:
(de testpattern (n)
  ; transfers pattern n into the input layer
  (presentpattern inputlayer n)
  ; propagates the states forward: computes the outputs
  (forwardprop netstruc)
  ; gets the desired outputs
  (presentdesired desiredlayer n)
  ; computes the error for this pattern
  ; and stores it into localerror
  (docomputeerr outputlayer desiredlayer)
  ; decides if the result is correct
  ; and stores the result in goodanswer
  (setq goodanswer (classify n))
  ; calls displaying functions
  (processpendingevents)
  (dispperfiteration n)
  ; returns the error
  localerror )
Functions forwardprop and
docomputeerr are defined at network creation time by the
function definenet . These functions
are described in the next section.
Function classify is defined by the setclassXXX
functions. It defines how to measure the quality of the system. The
default classify function returns t if
the outputs and the desired outputs have the same sign. This is too
restrictive for most applications: you should always call one of the
setclassXXX functions in order to define your
classification criterion.
8.1.2.0.2.6.0.1. localerror

[VAR] (netenv.sn) 
This global variable contains the average scalar training error for the
current pattern. It is set by functions
testpattern , basiciteration
and basicnewtoniteration .
8.1.2.0.2.6.0.2. goodanswer

[VAR] (netenv.sn) 
This global variable contains a flag indicating the success of the
network on the current pattern. It is set by functions
testpattern , basiciteration
and basicnewtoniteration .
8.1.2.0.2.6.1. Multiple Example Performance in Netenv.


8.1.2.0.2.6.1.0. (perf [n1 n2])

(netenv.sn) 
Without arguments, this function computes the performance of the network
on the training set. When integer arguments n1
and n2 are given, the performance is
evaluated on the patterns whose indices are between
n1 and n2 inclusive.
Function perf temporarily cancels the
inputnoise settings.
Function perf prints the average error
over the examples and over the outputs as well as the percentage of good
recognition (as defined by the function classify). This function also
calls the function saveperf which
saves the result of the measurement in the file whose name is contained
in the global variable perffile .
8.1.2.0.2.6.1.1. (performance n1 n2)

(netenv.sn) 
Lower level performance evaluation function called by
perf . Function performance
does not affect the inputnoise
settings. If inputnoise is not
0 , the measured performance may thus differ between two
calls to performance .
This function computes the average error over patterns between
n1 and n2 and stores the
value into global variable globalerror
. It also computes the percentage of good answers and stores it into
global variable goodanpercent . The
value of globalerror is returned.
8.1.2.0.2.6.1.2. globalerror, goodanpercent

[VAR] (netenv.sn) 
These variables contain the results of the last call to performance.
8.1.2.0.2.6.2. Performance File in Netenv.


8.1.2.0.2.6.2.0. perffile

[VAR] (netenv.sn) 
This variable defines an optional performance file.
This file is used for storing information about the main neural function
calls and results. The following functions write to this file:
perf,
epsilayer, epsisqrtlayer, forgetlayer, forgetsqrtlayer,
epsi, epsisqrt, mu, forget, forgetsqrt, ensemble, testset.
Example:
;;; ============ NEW PERFORMANCE
;;; ============ NEW NETWORK
;;; (forgetsqrt 1)
;;; (epsisqrt 0.1)
;;; (ensemble 0 319)
;;; (testset 320 479)
;;; trunaction using function 'learn'...
0 0.53568070 9.69 ;{0319}
0 0.53332978 10.00 ;{320479}
320 0.07470918 93.75 ;{0319}
320 0.08033946 91.25 ;{320479}
640 0.05402946 97.50 ;{0319}
640 0.06082892 95.62 ;{320479}
960 0.04978141 98.75 ;{0319}
960 0.05853230 96.88 ;{320479}
1280 0.04746896 99.06 ;{0319}
1280 0.05897417 97.50 ;{320479}
;;; (epsisqrt 0.05)
1600 0.03919556 100.00 ;{0319}
1600 0.05168496 97.50 ;{320479}
1920 0.03774762 100.00 ;{0319}
1920 0.05145141 97.50 ;{320479}
;;; (epsisqrt 0.025)
2240 0.03479438 100.00 ;{0319}
2240 0.04916938 98.12 ;{320479}
2560 0.03418482 100.00 ;{0319}
2560 0.04903031 98.12 ;{320479}
2880 0.03369421 100.00 ;{0319}
2880 0.04896870 98.12 ;{320479}
3200 0.03326655 100.00 ;{0319}
3200 0.04892774 98.12 ;{320479}
;;; break
8.1.2.0.2.6.2.1. (saveperf [n1 [n2...]])

(netenv.sn) 
This function takes numeric arguments which will be written on a single
line in a file whose name is in global variable
perffile .
 If variable
perffile contains the empty list or an empty string,
function saveperf does nothing.
 If variable perffile contains
the name of an existing file, the values of n1
, n2 , etc... are appended on a single
line at the end of the file.
Function saveperf is called by
function perf .
8.1.2.0.2.6.2.2. (plotperffile fname)

(netenv.sn) 
Function plotperffile creates a
performance plot and an error plot from the information stored in
performance file fname .
8.1.2.0.2.6.3. Classification Modes.


8.1.2.0.2.6.3.0. (classlms pn margin)

(netenv.sn) 
Tests if the mean square distance between the state of
desiredlayer and the state of
outputlayer is smaller than margin. Argument
pn is not used.
8.1.2.0.2.6.3.1. (setclasslms margin)

(netenv.sn) 
Defines the current classification function to be
classlms with margin margin
.
8.1.2.0.2.6.3.2. (classsgn pn tmin tmax)

(netenv.sn) 
Decides the correctness of the output based on the region in which each
output state falls compared to the desired output. Arguments
tmin and tmax are two
thresholds which define three regions: below
tmin , between tmin and
tmax , and above tmax .
The output is considered correct if all the states of the units in
outputlayer are in the same region as the states of units in
desiredlayer . This lisp function is slower than
classquadrant . Argument pn
is not used.
8.1.2.0.2.6.3.3. (setclasssgn tmin tmax)

(netenv.sn) 
Sets the classification function to classsgn
with thresholds tmin and
tmax . If both tmin and
tmax are 0 , the faster
function classquadrant is used.
8.1.2.0.2.6.3.4. (classmax pn)

(netenv.sn) 
Tests if the most active unit has the same rank in
outputlayer and desiredlayer
. Argument pn is not used.
8.1.2.0.2.6.3.5. (setclassmax)

(netenv.sn) 
Sets the classification function to classmax
.
8.1.2.0.2.6.3.6. (classhamming pn margin)

(netenv.sn) 
Tests if all the differences between the states of the output layer and
the corresponding desired states are less than margin. Argument
pn is not used.
8.1.2.0.2.6.3.7. (setclasshamming margin)

(netenv.sn) 
Sets the classification function to classhamming
.
8.1.2.0.2.6.3.8. (classquadrant pn)

(netenv.sn) 
Tests if the output vector and the desired vector are in the same
quadrant. Argument pn is not used.
8.1.2.0.2.6.3.9. (setclassquadrant)

(netenv.sn) 
Sets the classification function to classquadrant
.
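The classification criteria above can be summarised with a Python sketch (illustrative translations only, not SN2.8 code; the function names mirror the SN2.8 ones, and the unused pn argument is dropped):

```python
def classlms(output, desired, margin):
    """Mean-square distance between output and desired below margin."""
    mse = sum((o - d) ** 2 for o, d in zip(output, desired)) / len(output)
    return mse < margin

def classmax(output, desired):
    """Most active unit has the same rank in both layers."""
    return output.index(max(output)) == desired.index(max(desired))

def classhamming(output, desired, margin):
    """Every per-unit difference is smaller than margin."""
    return all(abs(o - d) < margin for o, d in zip(output, desired))

def classquadrant(output, desired):
    """Output vector and desired vector lie in the same quadrant,
    i.e. agree in sign component by component."""
    return all((o >= 0) == (d >= 0) for o, d in zip(output, desired))
```

classmax is the natural choice for one-of-N classification targets, while classhamming and classlms quantify how close the output vector must be to the desired vector.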
8.1.2.0.2.6.3.10. (setclassnil)

(netenv.sn) 
Sets the classification function to an empty function. This should be
used whenever possible to speed up the simulation.
8.1.2.0.2.6.3.11. (classify pn)

(netenv.sn) 
Main classification function. Function classify
calls one of the previous classification functions. It can be redefined
by the user.
8.1.2.0.2.7. Training the Network with Online Algorithms in Netenv.


8.1.2.0.2.7.0. Choice of the Patterns in Netenv.


Online algorithms update the weights after each pattern presentation.
The order of presentation thus becomes important. The following
functions define how the library presents the patterns.
Learning functions learn and run use the function nextpattern to choose
the next pattern to present to the network. The user can redefine
nextpattern to fit his needs or select a predefined method.
8.1.2.0.2.7.0.0. (nextpattern n)

(netenv.sn) 
Chooses a new pattern in the training set and returns its index. Argument
n is the index of the current pattern.
8.1.2.0.2.7.0.1. currentpattern

[VAR] (netenv.sn) 
Contains the index of the current pattern.
8.1.2.0.2.7.0.2. (setnextpatseq)

(netenv.sn) 
Redefines nextpattern to be
equivalent to nextpatseq . The
following pattern is chosen regardless of the result of the previous
classification.
8.1.2.0.2.7.0.3. (setnextpatstay)

(netenv.sn) 
Redefines nextpattern to be
equivalent to nextpatstay .
The same pattern is kept until the network returns the right answer
(until function classify returns a positive result). The next pattern is
then chosen using function nextchoice
.
8.1.2.0.2.7.0.4. (nextchoice n)

(netenv.sn) 
This function chooses the next pattern when a new pattern is needed.
8.1.2.0.2.7.0.5. (setnextchoseq)

(netenv.sn) 
This function redefines nextchoice so
that the patterns are chosen sequentially in the training set.
8.1.2.0.2.7.0.6. (setnextchornd)

(netenv.sn) 
This function redefines nextchoice so that the next pattern is chosen
at random in the training set with uniform probability.
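The predefined pattern-choice strategies can be sketched in Python (illustrative only, not SN2.8 code; make_nextpattern is a hypothetical name combining the nextpattern/nextchoice mechanism described above):

```python
import random

def make_nextpattern(pattmin, pattmax, choice="seq", classify=None):
    """Return a nextpattern-style function over patterns pattmin..pattmax.
    choice "seq" cycles sequentially (like setnextchoseq);
    choice "rnd" draws uniformly at random (like setnextchornd).
    When a classify predicate is given, the same pattern is kept until
    classify returns true (the setnextpatstay behaviour)."""
    def nextchoice(n):
        if choice == "rnd":
            return random.randint(pattmin, pattmax)
        return pattmin if n >= pattmax else n + 1  # sequential wrap-around
    def nextpattern(n):
        if classify is not None and not classify(n):
            return n  # keep the misclassified pattern
        return nextchoice(n)
    return nextpattern
```

The stay-until-correct strategy can stall on a pattern the network cannot learn, which is why the sequential and random strategies are the usual defaults.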
8.1.2.0.2.7.1. Forward and Backward Propagation Functions in Netenv.


Like testpattern , the learning
functions make heavy use of eight elementary functions defined at
network creation time by definenet .
Tampering with the definition of these functions is the preferred way to
explore new algorithms.
8.1.2.0.2.7.1.0. (forwardprop ns)

[DE] 
This function propagates the states forward in the network. It is
usually defined as a sequence of calls to
updatestate . Argument ns
is inherited from earlier versions of SN2.8 and is never used.
8.1.2.0.2.7.1.1. (forward2prop ns)

(netenv.sn) 
Function forward2prop is defined at
network creation time by definenet
and plays a role similar to that of functions
forwardprop , backwardprop
, backward2prop , etc. Function
forward2prop just calls function
updatedeltastate on each layer of the network.
This function is used by the conjugate gradient algorithms.
8.1.2.0.2.7.1.2. (docomputeerr out des)

[DE] 
This function computes the training error between the output layer
out and the desired layer des
, stores the average scalar error in the global variable
localerror and initializes the gradients by using
initgrad .
8.1.2.0.2.7.1.3. (backwardprop ns)

[DE] 
This function propagates the gradients backward in the network. It is
usually defined as a sequence of calls to
updategradient . Argument ns
is unused.
8.1.2.0.2.7.1.4. (doupdateweight ns)

[DE] 
Given the gradients and the states, this function performs the weight
update. It is usually defined as a single call to
updateweights . With kernels sn28ite and sn28itenew, this
function may be defined as:
(de doupdateweight (ns)
  (clearacc)
  (doupdateacc ns)
  (updatedelta alpha)
  (updatewghtonly decay) )
(doupdateacc ns)
With kernels sn28new and sn28itenew only, this function loops over the
connections and adds their contribution to the gradient in the field
sacc of the weights. Usually defined as a single call to
updateacc , this function may be redefined as a sequence of
calls to updateacc .
8.1.2.0.2.7.1.5. (backward2prop ns)

[DE] 
With kernels sn28new and sn28itenew only, this function propagates
backward the second derivatives of the cost (fields
nsqbacksum and nggradient
). This function is usually defined as a sequence of calls to
updateggradient or updatelmggradient
according to the status of the flag levmar
.
8.1.2.0.2.7.1.6. (doupdateweightnewton m)

[DE] 
With kernels sn28new and sn28itenew only, this function updates the
weights given the states, gradients and second derivatives of the cost
function. It is usually defined as a single call to
updatewnewton .
With kernel sn28itenew however, this function might be redefined as:
(de doupdateweightnewton (ns)
  (clearacc)
  (clearhess)
  (doupdateacc ns)
  (doupdatehess ns)
  (hessianscale mu)
  (updatedelta alpha)
  (updatewghtonly decay) )
(doupdatehess ns)
Accumulates the contributions of each connection to the second
derivatives (in fields ssigma ) into
the field shess of the weights. This
is usually done by a single call to function
updatehess .
8.1.2.0.2.7.2. Online Gradient.


8.1.2.0.2.7.2.0. (initgrad out des)

(netenv.sn) 
Sets the gradients (fields nbacksum
and ngrad ) of units in out using the
desired state stored in units of list des
. This function uses either initgradlms
or initgradthlms depending on the
active mode. It returns the total output error.
8.1.2.0.2.7.2.1. (setcostlms)

(netenv.sn) 
Defines the current error measure to be the classical mean-square:
C = 0.5 * || DesiredOutput - ActualOutput ||**2
Function initgrad then calls
initgradlms .
8.1.2.0.2.7.2.2. (setcostthlms threshold)

(netenv.sn) 
Defines the current error measure to be the thresholded mean-square.
If the actual output is larger than the desired output and the desired
output is larger than threshold , or if
the actual output is smaller than the desired output and the desired
output is smaller than threshold , then
the error is set to 0 and no gradient
is propagated. Otherwise the error is the classical mean-square error.
Function initgrad then calls initgradthlms .
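The thresholded mean-square rule can be sketched per output in Python (illustrative only, not SN2.8 code; thlms_error is a hypothetical name):

```python
def thlms_error(actual, desired, threshold):
    """Thresholded mean-square error for one output unit, as described
    above: zero error (and no gradient) when the actual output already
    overshoots a desired output lying beyond the threshold; otherwise
    the classical 0.5 * (desired - actual)**2."""
    if actual > desired and desired > threshold:
        return 0.0
    if actual < desired and desired < threshold:
        return 0.0
    return 0.5 * (desired - actual) ** 2
```

The effect is to stop pushing outputs that are already saturated past their targets, which avoids driving the sigmoid units deep into their flat regions.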
8.1.2.0.2.7.2.3. (basiciteration n)

(netenv.sn) 
This function performs an elementary learning iteration. It first gets
an input pattern using the redefinable function
presentpattern . Then it propagates the states using
forwardprop , gets the desired values using
presentdesired and sets the variables
localerror and goodanswer
using functions docomputeerr and
classify .
The gradients then are propagated backward using function
backwardprop and the weights are updated with function
doupdateweight .
Function basiciteration is also in charge of incrementing the variable
age and calling a few display functions. It is defined as follows:
(de basiciteration (n)
  ; gets a pattern
  (presentpattern inputlayer n)
  ; propagates the states
  (forwardprop netstruc)
  ; gets the desired output
  (presentdesired desiredlayer n)
  ; computes localerror and initializes the gradients
  (docomputeerr outputlayer desiredlayer)
  ; propagates the gradients
  (backwardprop netstruc)
  ; updates the weights
  (doupdateweight netstruc)
  ; increments "age"
  (incr age)
  ; computes "goodanswer"
  (setq goodanswer (classify n))
  ; display functions
  (processpendingevents)
  (dispbasiciteration)
  localerror )
The following learning functions provide convenient interfaces for
calling function basiciteration .
8.1.2.0.2.7.2.4. (learn n)

(netenv.sn) 
Performs n learning iterations. This
function modifies the variable currentpattern
after each learning iteration and uses the pattern-choosing function
nextpattern .
8.1.2.0.2.7.2.5. (learncycle n)

(netenv.sn) 
Performs n sweeps through the training
set as defined by ensemble. The patterns are taken in sequence,
regardless of the patternchoosing function. This function does not
modify the variable currentpattern .
8.1.2.0.2.7.2.6. (run cyclength cycnum)

(netenv.sn) 
This is the main learning function. Function run performs
cyclength learning iterations, starting at pattern
currentpattern and choosing the next pattern with the
function nextpattern . Then it
measures the performance on the training set (i.e. the patterns whose
indices are between pattmin and
pattmax as defined by the function
ensemble ). This whole sequence is repeated
cycnum times.
8.1.2.0.2.7.2.7. (trun cyclength cycnum)

(netenv.sn) 
This function behaves like function run
, but it also measures the global performance on the test set specified
by the function testset .
8.1.2.0.2.7.3. Online Second Order Algorithms in Netenv.


A set of functions are provided for implementing second order stochastic
backpropagation algorithms. The functions described in this section are
only available with kernels sn28new and sn28itenew.
8.1.2.0.2.7.3.0. levmar

[VAR] (netenv.sn) 
This flag controls the behavior of
backward2prop and thus the behavior of
basicnewtoniteration .
 When flag
levmar is the empty list, the quasi-Newton algorithm is
applied.
 When flag levmar is
t , the quasi Levenberg-Marquardt algorithm is applied.
8.1.2.0.2.7.3.1. (basicnewtoniteration n)

(netenv.sn) 
This function closely resembles basiciteration
. It however calls functions backward2prop
and doupdateweightnewton , thus
applying the quasi-Newton or the quasi Levenberg-Marquardt algorithm
according to the status of the flag levmar
.
8.1.2.0.2.7.3.2. (learnnewton n)

(netenv.sn) 
This function closely resembles learn
. However it clears the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi-Newton algorithm.
8.1.2.0.2.7.3.3. (learncyclenewton n)

(netenv.sn) 
This function closely resembles learncycle
. However it clears the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi-Newton algorithm.
8.1.2.0.2.7.3.4. (runnewton cyclength cycnum)

(netenv.sn) 
This function closely resembles run .
However it clears the flag levmar and
calls basicnewtoniteration ,
thus applying the quasi-Newton algorithm.
8.1.2.0.2.7.3.5. (trunnewton cyclength cycnum)

(netenv.sn) 
This function closely resembles trun
. However it clears the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi-Newton algorithm.
8.1.2.0.2.7.3.6. (learnlm n)

(netenv.sn) 
This function closely resembles learn
. However it temporarily sets the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi Levenberg-Marquardt algorithm.
8.1.2.0.2.7.3.7. (learncyclelm n)

(netenv.sn) 
This function closely resembles learncycle
. However it temporarily sets the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi Levenberg-Marquardt algorithm.
8.1.2.0.2.7.3.8. (runlm cyclength cycnum)

(netenv.sn) 
This function closely resembles run .
However it temporarily sets the flag levmar
and calls basicnewtoniteration ,
thus applying the quasi Levenberg-Marquardt algorithm.
8.1.2.0.2.7.3.9. (trunlm cyclength cycnum)

(netenv.sn) 
This function closely resembles trun
. However it temporarily sets the flag levmar and
calls basicnewtoniteration ,
thus applying the quasi Levenberg-Marquardt algorithm.
8.1.2.0.2.8. Training the Network with Batch Algorithms in Netenv.


These ``batch'' algorithms update the weights after the presentation of
several patterns.
8.1.2.0.2.8.0. Batch Gradient in Netenv.


A set of functions are provided for implementing the batch version of
backpropagation (among other learning algorithms). The functions
described in this section are only available with kernels sn28ite and
sn28itenew.
The batch version accumulates the gradients over a set of patterns
before performing a weight update. The batch version is much slower than
the online version (it usually takes many more forward and backward
passes to learn a given problem). It is sometimes useful for analyzing
the error surface and the behavior of an algorithm.
8.1.2.0.2.8.0.0. (runbatch cyclength cycnum)

(netenv.sn) 
This is the main simulation function for the batch version.
Function runbatch performs
cyclength batch learning iterations. Each iteration consists
of a sweep through the entire training set (i.e. patterns between
pattmin and pattmax ).
Then it measures the performance on this training set. This entire
sequence is repeated cycnum times.
8.1.2.0.2.8.0.1. (trunbatch cyclength cycnum)

(netenv.sn) 
This is the main simulation function for the batch version.
Function trunbatch performs
cyclength batch learning iterations. Each iteration consists
of a sweep through the entire training set (i.e. patterns between
pattmin and pattmax ).
Then it measures the performance on this training set and on the test
set (i.e. patterns between tpattmin
and tpattmax ). This entire sequence
is repeated cycnum times.
8.1.2.0.2.8.0.2. (learnbatch n)

(netenv.sn) 
Performs n batch learning iterations.
Each iteration consists of a sweep through the entire training set (i.e.
patterns between pattmin and
pattmax ).
8.1.2.0.2.8.0.3. (batchiteration pmin pmax)

(netenv.sn) 
This function implements the batch version of backpropagation.
Function batchiteration accumulates
the gradients and the errors of the training set patterns and then
performs a weight update. The global variable
globalerror is set to the average output error over the set
of patterns and its value is returned. The global variable age is
incremented after each pattern presentation in order to remain
consistent with the stochastic version.
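The accumulate-then-update structure of a batch iteration can be sketched in Python (illustrative only, not SN2.8 code; batchiteration operates on the network structure, while this hypothetical grad_fn abstraction works on a plain weight vector):

```python
def batch_iteration(weights, patterns, grad_fn, epsilon):
    """One batch backpropagation iteration: accumulate the gradient over
    all patterns, then perform a single weight update (contrast with the
    online version, which updates after every pattern).  grad_fn(w, p)
    returns the per-pattern (error, gradient) pair; the average error
    over the patterns is returned."""
    acc = [0.0] * len(weights)
    total_err = 0.0
    for p in patterns:
        err, grad = grad_fn(weights, p)
        total_err += err
        for i, g in enumerate(grad):
            acc[i] += g          # like accumulating into field sacc
    for i in range(len(weights)):
        weights[i] -= epsilon * acc[i]  # single update for the whole sweep
    return total_err / len(patterns)
```

Because the update uses the summed gradient of the whole sweep, one batch iteration costs as many forward and backward passes as an entire epoch of the online version, which is why the batch version is reported above as much slower in practice.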
8.1.2.0.2.8.1. Batch Conjugate Gradient in Netenv.


Function learnbatchcg performs the
conjugate gradient optimisation using two auxiliary functions,
batchcg1iteration and
batchcg2iteration . An elementary propagation function,
forward2prop , is also defined at network creation time by
function definenet . You can of
course find more information about our implementation of the conjugate
gradient by looking at these functions in file
".../sn28/lib/netenv.sn" .
8.1.2.0.2.8.1.0. (learnbatchcg ns [grad1] [grad2] [gradc])

(netenv.sn) 
Function learnbatchcg performs
ns epochs of the batch conjugate gradient algorithm. Each
epoch consists of a complete pass over the training set. There is no
need to set the learning rates or any other parameters before applying
the batch conjugate gradient algorithm. As a side effect, function
learnbatchcg sets all learning rates to
1 . You must therefore reset the learning rates before
applying another algorithm.
The optional arguments grad1 ,
grad2 and gradc are three
matrices used for continuing the conjugate gradient algorithm without
losing the information accumulated during the previous epochs. Function
learnbatchcg indeed returns a list
(it grad1 grad2 gradc) that you can use as the argument list
for the next call to function learnbatchcg
.
;; WRONG: Restarts the conjugate gradient every other epoch!
(repeat n
(learnbatchcg 2)
(perf) )
;; RIGHT: Runs the conjugate gradient during 2*n epochs!
(let ((cgs (list 2)))
(repeat n
(setq cgs (apply learnbatchcg cgs))
(perf) ) )
8.1.2.0.2.8.1.1. (runbatchcg cyclength cycnum)

(netenv.sn) 
Function runbatchcg repeats
cycnum times the following operations:
 Performing
cyclength epochs of batch conjugate gradient. Each epoch
consists of a sweep through the entire training set,
 Measuring and displaying the performance on the training set.
This function is similar to function run usually used for the online
gradient algorithm.
8.1.2.0.2.8.1.2. (trunbatchcg cyclength cycnum)

(netenv.sn) 
Function trunbatchcg repeats
cycnum times the following operations:
 Performing
cyclength epochs of batch conjugate gradient. Each epoch
consists of a sweep through the entire training set,
 Measuring and displaying the performance on the training set,
 Measuring and displaying the performance on the test set.
This function is similar to function trun used for the online gradient
algorithm.
8.1.2.0.2.8.1.3. (batchcg1iteration from to)

(netenv.sn) 
Function batchcg1iteration computes
the gradient of the cost function. It performs a complete pass over the
training set, accumulates the gradient of the cost function over all
examples, and stores this gradient in connection field
sacc . This function is close to function
batchiteration but does not modify the weights.
8.1.2.0.2.8.1.4. (batchcg2iteration from to)

(netenv.sn) 
Function batchcg2iteration returns
the Gauss-Newton approximation to the curvature of the cost function
along a search direction stored in connection field
sdelta . It performs a second pass over the training set and
computes the curvature using function
forward2prop which is defined at network creation time by
definenet .
8.1.2.0.2.8.2. Batch Second Order Algorithm in Netenv.


Finally, a set of functions is provided for implementing batch second
order versions of backpropagation. The
functions described in this section are only available with kernel
sn28itenew.
8.1.2.0.2.8.2.0. (batchnewtoniteration pmin pmax)

(netenv.sn) 
This function is similar to batchiteration
. It however applies the Quasi-Newton or the Quasi-Levenberg-Marquardt
algorithm according to flag levmar.
8.1.2.0.2.8.2.1. (learnbatchnewton n)

(netenv.sn) 
This function is similar to learnbatch
. However it clears the flag levmar
and calls batchnewtoniteration ,
thus applying the Quasi-Newton algorithm.
8.1.2.0.2.8.2.2. (runbatchnewton cyclength cycnum)

(netenv.sn) 
This function is similar to runbatch
. However it clears the flag levmar
and calls batchnewtoniteration ,
thus applying the Quasi-Newton algorithm.
8.1.2.0.2.8.2.3. (trunbatchnewton cyclength cycnum)

(netenv.sn) 
This function is similar to trunbatch
. However it clears the flag levmar
and calls batchnewtoniteration ,
thus applying the Quasi-Newton algorithm.
8.1.2.0.2.8.2.4. (learnbatchlm n)

(netenv.sn) 
This function is similar to learnbatch
. However it sets the flag levmar and
calls batchnewtoniteration ,
thus applying the Quasi-Levenberg-Marquardt algorithm.
8.1.2.0.2.8.2.5. (runbatchlm cyclength cycnum)

(netenv.sn) 
This function is similar to runbatch
. However it sets the flag levmar and
calls batchnewtoniteration ,
thus applying the Quasi-Levenberg-Marquardt algorithm.
8.1.2.0.2.8.2.6. (trunbatchlm cyclength cycnum)

(netenv.sn) 
This function is similar to trunbatch
. However it sets the flag levmar and
calls batchnewtoniteration ,
thus applying the Quasi-Levenberg-Marquardt algorithm.
8.1.2.0.2.9. Pruning Connections in Netenv: Optimal Brain Damage (OBD).


Optimal Brain Damage (OBD) is a weight pruning method. It consists of
computing a saliency criterion based on the curvature of the cost
function, and removing the weights with the lowest saliency.
This method sometimes improves the generalization performance by
adjusting the effective number of parameters. The simulation speed is
also improved. The optimized runtime speed however may remain identical
because modern computers spend more time performing a test than
performing a multiplication by zero.
Using pruning methods is quite a long process because the mathematics of
these methods usually require that the weight saliencies be computed at
an optimum of the cost function. The best results are obtained by
training the network for a long time before applying OBD.
Two library functions of "netenv.sn"
implement OBD. These functions require kernels sn28new or sn28itenew
because the computation of the saliency criterion requires the
evaluation of the second derivatives of the cost function.
A typical use of OBD consists in the following steps:
 1
Thoroughly training a network until it reaches an optimum.
 2 Computing the curvature information using
obdcomputecurvature .
 3 Pruning a small proportion of weights (such as 10%) using
obdprune .
 4 Retraining the network until it reaches a new minimum.
 5 Repeating steps 2, 3 and 4 until enough weights have been
removed.
An example file is provided in directory
"sn28/examples/obd.sn" .
8.1.2.0.2.9.0. (obdcomputecurvature pmin pmax)

(netenv.sn) 
Computes the diagonal of the Hessian matrix of the cost function
measured on patterns pmin to
pmax . These diagonal values are left in field
shess of the weights.
8.1.2.0.2.9.1. (obdprune fraction)

(netenv.sn) 
Removes a proportion fraction of the
connections with the lowest saliency using function
cutconnection and returns the exact number of connections
removed. Argument fraction must be a number between
0 (prune nothing) and 1
(prune all).
When pruning a shared weight network (using sn28itenew) you must
remember that the saliency criterion is a characteristic of the weight
and not of the connection. All the connections sharing the same weight
are either kept or removed as a whole.
8.1.2.0.2.10. Displaying and Plotting in Netenv.


8.1.2.0.2.10.0. Netenv Displaying Modes.


8.1.2.0.2.10.0.0. (dispbasiciteration)

(netenv.sn) 
This is the general display function which is called after each learning
iteration. This function can be dynamically redefined to one of the
builtin display functions or to a user defined function.
See: (basiciteration n )
8.1.2.0.2.10.0.1. (dispperfiteration)

(netenv.sn) 
This is the general display function which is called after testing each
pattern. This function can be dynamically redefined to one of the
builtin display functions or to a user defined function.
See: (testpattern n )
8.1.2.0.2.10.0.2. (setdispeverything)

(netenv.sn) 
Makes dispbasiciteration equivalent
to dispeverything and
dispperfiteration equivalent to
dispnil .
8.1.2.0.2.10.0.3. (setdisperror)

(netenv.sn) 
Makes dispbasiciteration equivalent
to disperror and
dispperfiteration equivalent to
dispnil . This function also creates a window
disperrorwindow and a plotport
disperrorport for plotting the error.
8.1.2.0.2.10.0.4. (setdispnet)

(netenv.sn) 
Makes dispbasiciteration and
dispperfiteration equivalent to
dispnet .
8.1.2.0.2.10.0.5. (setdispnetanderror)

(netenv.sn) 
Makes dispbasiciteration equivalent
to disperror and
dispperfiteration equivalent to
dispnet . SN2.8 will thus plot the instantaneous error
during the training phase and draw the network states during the
performance evaluation.
8.1.2.0.2.10.0.6. (setdisptext)

(netenv.sn) 
Makes both dispbasiciteration and
dispperfiteration equivalent to
disptext .
8.1.2.0.2.10.0.7. (setdispnil)

(netenv.sn) 
Makes both dispbasiciteration and
dispperfiteration equivalent to
dispnil .
8.1.2.0.2.10.0.8. (dispeverything n)

(netenv.sn) 
Function dispeverything can be
redefined by the user. However, its default definition is:
(de dispeverything (pattnum)
  (printlayer outputlayer)
  (printf "age=%d pattern=%d error=%9.5f %s\n"
          age pattnum localerror
          (if goodanswer " ok " "**arrgh**"))
  (ploterror age localerror)
  (drawnet netstruc pattnum) )
8.1.2.0.2.10.0.9. (disperror n)

(netenv.sn) 
Function disperror can be redefined
by the user. However, its default definition is:
(de disperror (pattnum)
(ploterror age localerror) )
8.1.2.0.2.10.0.10. (dispnet n)

(netenv.sn) 
Function dispnet can be redefined by
the user. However, its default definition is:
(de dispnet (pattnum)
(drawnet netstruc pattnum) )
8.1.2.0.2.10.0.11. (disptext n)

(netenv.sn) 
Function disptext can be redefined by
the user. However, its default definition is:
(de disptext (pattnum)
  (printf "age=%d pattern=%d error=%9.5f %s\n"
          age pattnum localerror
          (if goodanswer " ok " "**arrgh**")) )
8.1.2.0.2.10.0.12. (drawnet struc n)

(netenv.sn) 
The user defined function drawnet
must produce a graphic display of the network state on the current
graphic window. This function is initially defined as an empty function.
Function drawnet is called by certain
functions dispXXX . The first
argument is now obsolete. The second argument is the index of the
current pattern.
It is easy to write a display function using function
drawlist . Here is an example of how to define a
drawnet function for a small network:
; Define the network as in the "getting started" 424 encoder example,
; then define a display function for the 424 encoder
(de drawnet (a b)  ; a and b are dummy arguments, we won't use them
  (drawlist 100 100 (state output) 4 30 50 48)
  (drawlist 150 170 (state hid) 2 30 50 48)
  (drawlist 100 240 (state input) 4 30 50 48) )
(newwindow)   ; now open a graphic window
(setdispnet)  ; enable display
(learn 16)    ; do a couple of iterations, look at the graphic screen
The network editor ``NetTool'' automatically generates
drawnet functions. These functions create an interactive
display window. Clicking outside the network refreshes the window
and displays the states of the network. Clicking on a neuron displays
the weights of the connections leading to that neuron,
and the other neurons are turned gray (corresponding to weight
0).
8.1.2.0.2.10.0.13. (ploterror n err)

(netenv.sn) 
Function ploterror plots the output
error for the current pattern on a graph. It is called by certain
functions dispXXX . The default
function plots the instantaneous error in the plotport
disperrorport .
8.1.2.0.2.10.0.14. (printlayer l)

(netenv.sn) 
Very simple textual description of the states of a layer.
8.1.2.0.2.10.0.15. (prsyn)

(netenv.sn) 
Very simple textual description of the connections of the network.
8.1.2.0.2.10.0.16. (prneur)

(netenv.sn) 
Very simple textual description of the states of the network.
8.1.2.0.2.10.0.17. (prnet)

(netenv.sn) 
Very simple textual description of a network.
8.1.2.0.2.10.1. Monitoring Error and Performance in Netenv.


8.1.2.0.2.10.1.0. (initerrorplotting nsweep maxerr)

(netenv.sn) 
This function creates window
errorplottingwindow if it does not already exist. It
then creates two plot ports for plotting the mean squared error. These
plot ports are named trainingerrorport
and testerrorport .
As soon as these plot ports exist, functions run
and trun plot an error curve in this
window.
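For instance, assuming you plan 50 training sweeps, expect the mean squared error to stay below 1, and that trun takes its usual cyclength and cycnum arguments:
(initerrorplotting 50 1)  ; create the window and both plot ports
(trun 1 50)               ; trun now also plots both error curves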
8.1.2.0.2.10.1.1. (initperfplotting nsweep)

(netenv.sn) 
This function creates window
perfplottingwindow if it does not already exist. It
then creates two plot ports for plotting the performance. These plot
ports are named trainingperfport and
testperfport .
As soon as these plot ports are defined, functions
run and trun plot a
performance curve in this window.
Remark: You can destroy these error and performance windows, either by
using your window manager or by typing:
(delete perfplottingwindow)
(delete errorplottingwindow)
Plotting is then cancelled.
8.1.2.0.2.10.2. Netenv Miscellaneous Functions.


8.1.2.0.2.10.2.0. (allneuron (i) body)

(netenv.sn) 
Evaluates body with i taking all
possible unit indices.
8.1.2.0.2.10.2.1. (allsynapse (i j) body)

(netenv.sn) 
Evaluates body with i and
j taking all possible values for the upstream and downstream
units of a connection.
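As an illustration, these iterators can be used to write small inspection utilities. The following sketch only relies on standard TLisp functions used elsewhere in this manual:
;; count the units and the connections of the current network
(let ((nunits 0) (nconn 0))
  (allneuron (i)
    (setq nunits (+ nunits 1)) )
  (allsynapse (i j)
    (setq nconn (+ nconn 1)) )
  (printf "%d units, %d connections\n" nunits nconn) )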
8.1.2.0.3. Building Connections in Netenv.


8.1.2.0.3.0. Introduction to Building Connections in Netenv.


The functions introduced in this section are located in the file
"connect.sn" which is automatically loaded on startup. They
make the creation of complex networks easier by providing lisp functions
which connect layers according to various connectivity patterns.
These functions have been designed for both efficiency and simplicity.
They only use the two low-level functions connect
and dupconnection .
See: Connections in Netenv.
8.1.2.0.3.1. Functions for Locally Connecting Layers in Netenv.


A first class of functions creates local connections. The first
arguments always describe the upstream and downstream layers and their
sizes.
8.1.2.0.3.1.0. (local1dconnect layer1 n1 layer2 n2 step size)

(connect.sn) 
This function creates local connections between
layer1 (a list of n1 cells)
and layer2 (a list of
n2 cells). The cells of layer2
are connected to a window of size cells from
layer1 , stepped by step cells.
Function local1dconnect complains if
n1 != step*n2 + size - step .
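For instance, a 12-unit layer can be connected to a 5-unit layer through windows of 4 cells stepped by 2 cells, since 12 = 2*5 + 4 - 2 (the layer names are illustrative):
;; each of the 5 cells of layer2 sees a window of 4 cells of layer1,
;; and consecutive windows are shifted by 2 cells
(local1dconnect layer1 12 layer2 5 2 4)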
8.1.2.0.3.1.1. (local2dconnect layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function creates local connections between
layer1 (a list of row1*col1
cells) and layer2 (a list of row2*col2
cells). Cells in layer1 and
layer2 are listed by row order:
(r1c1 r1c2 r1c3...r1cN r2c1 r2c2...rMcN).
Cells of layer layer2 are connected to
a window of xsize*ysize cells from
layer1 , stepped by xstep
and ystep cells. Function
local2dconnect complains if row1 !=
ystep*row2 + ysize - ystep or col1 !=
xstep*col2 + xsize - xstep .
8.1.2.0.3.1.2. (localtoricconnect layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function is almost identical to function
local2dconnect : it creates local connections between
layer1 (a list of row1*col1
cells) and layer2 (a list of
row2*col2 cells). Cells of layer
layer2 are connected to a window of
xsize*ysize cells from layer1
, stepped by xstep and
ystep cells.
However, border effects are handled in a different way: when using
function local2dconnect , cells in
the periphery of layer1 have fewer
downstream connections than the central cells. Input data are thus best
located in the center of the input layer.
In certain cases, it is easier to use the function
localtoricconnect whose sliding window wraps around
layer1 .
Layer layer1 always has
col2*xstep columns and row2*ystep
rows. The input window of the rightmost cells of
layer2 wraps thus on the left of
layer1 . The input window of the bottommost cells similarly
wraps on the top.
8.1.2.0.3.2. Functions for Creating Shared Weights Masks in Netenv.


All the local connection functions have an equivalent function for
performing shared weights connections. Of course, these functions
require kernel sn28ite or sn28itenew.
8.1.2.0.3.2.0. (mask1dconnect layer1 n1 layer2 n2 step size)

(connect.sn) 
This function creates shared weight connections between
layer1 (a list of n1 cells)
and layer2 (a list of
n2 cells). Cells of layer2
are connected to a window of size cells from
layer1 , stepped by step cells, always using the same set of
shared weights.
8.1.2.0.3.2.1. (mask2dconnect layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function creates shared weights connections between
layer1 (a list of row1*col1 cells) and
layer2 (a list of row2*col2
cells). Cells of layer layer2 are
connected to a window of xsize*ysize
cells from layer1 , stepped by
xstep and ystep cells,
always using the same set of shared weights.
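For instance, a classical 5x5 convolution from a 32x32 layer to a 28x28 layer with unit steps satisfies the size constraints, since 32 = 1*28 + 5 - 1 (the layer names are illustrative):
;; a single set of 25 shared weights implements the 5x5 convolution kernel
(mask2dconnect layer1 32 32 layer2 28 28 1 1 5 5)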
8.1.2.0.3.2.2. (masktoricconnect layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function is almost identical to mask2dconnect: it creates shared
weights connections between layer1 (a
list of row1*col1 cells) and
layer2 (a list of row2*col2
cells). Cells of layer layer2 are
connected to a window of xsize*ysize
cells from layer1 , stepped by
xstep and ystep cells.
However, border effects are handled as in function
localtoricconnect .
8.1.2.0.3.2.3. (equalmask1dconnect layer1 n1 layer2 n2 step size)

(connect.sn) 
This function is basically similar to the
mask1dconnect function, except that all the connections
share a single weight. Function
equalmask1dconnect might be used for implementing
averaging/smoothing operations on signal data.
8.1.2.0.3.2.4. (equalmask2dconnect layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function is basically similar to the
mask2dconnect function, except that all the connections
share a single weight. Function
equalmask2dconnect might be used for implementing
averaging/smoothing operations on image data.
Example:
; divide the resolution of a 16x16 layer 'layer1' by 2 into an 8x8 layer 'layer2'
(equalmask2dconnect layer1 16 16
                    layer2 8 8
                    2 2 2 2 )
8.1.2.0.3.2.5. (tdnnconnect layer1 nframes1 ntraits1 layer2 nframes2 ntraits2 step size)

(connect.sn) 
This function helps implement Time Delay Neural Networks. Cells in
lists layer1 and
layer2 are ordered by trait order:
(t1f1 t2f1 t3f1...t1f2 t2f2...).
Each frame of layer2 is connected to a
sliding window in layer1 of
size frames, stepped by step
frames, always using the same set of weights.
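For instance, connecting a 20-frame, 16-trait input layer to a 9-frame, 30-trait layer through windows of 4 frames stepped by 2 frames, since 20 = 2*9 + 4 - 2 (the layer names are illustrative):
;; each frame of layer2 sees 4 consecutive frames of layer1,
;; always using the same set of weights
(tdnnconnect layer1 20 16 layer2 9 30 2 4)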
8.1.2.0.3.2.6. (mask2dreverse layer1 row1 col1 layer2 row2 col2 xstep ystep xsize ysize)

(connect.sn) 
This function creates shared weights connections between
layer1 (a list of row1*col1
cells) and layer2 (a list of
row2*col2 cells). Cells of layer
layer1 are connected to a window of
xsize*ysize cells from layer2
, stepped by xstep and
ystep cells, always using the same set of shared weights.
Function mask2dreverse complains if
row2 != ystep*row1 + ysize - ystep or col2
!= xstep*col1 + xsize - xstep .
8.1.2.0.3.3. Functions for Creating Shared Biases in Netenv.


Shared weights are usually useful for designing translation invariant
networks. Such networks also require the biases to be invariant.
Function buildnet , however, always
creates a separate bias per unit. The solution consists of using function
buildnetnobias instead of function
buildnet . It takes the same arguments, but creates no
biases.
The following functions let the user create biases by hand.
8.1.2.0.3.3.0. (biasconnect layer)

(connect.sn) 
Creates a separate bias for each cell in list layer.
8.1.2.0.3.3.1. (sharedbiasconnect layer)

(connect.sn) 
Creates a single bias shared by all the cells in list layer.
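For instance, after building a shared weight network with buildnetnobias , biases can be added by hand (the layer names are illustrative):
(sharedbiasconnect hiddenlayer)  ; one bias shared by all hidden units
(biasconnect outputlayer)        ; a separate bias for each output unit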
8.1.2.0.3.3.2. (tdnnbiasconnect layer nframes ntraits)

(connect.sn) 
Creates one bias per trait, shared by all the frames, for layer layer
which is composed of nframes*ntraits units.
8.1.2.1.0. Introduction to SN Tools.


The Netenv library is mainly used for simulating multilayer networks.
Indeed, SN2.8 contains efficient heuristics for training large
multilayer networks using several variants of the backpropagation
algorithm. This task is made even easier by two dedicated graphical
interfaces, ``NetTool'' and ``BPTool''.
 NetTool is a graphical
network editor. Using NetTool you can build a network using all the
connection patterns defined in the library
"connect.sn" . The program then generates lisp functions for
creating and drawing the network.
 BPTool is a graphical interface for controlling the training
process. BPTool allows you to perform the most common actions allowed by
the library "netenv.sn" . It provides
sensible default values for all the convergence parameters.
 A few other tools ``StatTool``, ``NormTool`` and ``ImageTool`` are
useful for browsing and preprocessing the data. These tools are briefly
discussed in the last chapter.
This manual describes all the functionalities of these tools.
8.1.2.1.1.0. Overview of NetTool.


NetTool is a graphic network editor dedicated to multilayer networks.
The NetTool window appears when function NetTool
is run. If a function NetToolnet
is defined, the corresponding network is loaded into the editor.
The NetTool window includes a menu bar and a workspace controlled by two
scrollbars.
Using NetTool you can build a network using all the connection patterns
defined in the library "connect.sn" .
NetTool deals with two kinds of objects: layers and layer-to-layer
connections.
 Layers are just collections of units or neurons.
Both 1-dimensional and 2-dimensional layers are supported. As a
convention, the leftmost layer is always the network input layer.
Similarly, the rightmost layer is always the output layer.
 Layer-to-layer connections define the connection pattern between
the units of two layers. These connection patterns are defined using the
library "connect.sn" . Connections
always link layers from left to right. It is thus not possible to edit
recurrent networks with NetTool.
The NetTool editor generates three lisp functions:
 Function
createnet creates a network with the specified architecture.
This function calls functions defined in the libraries
"netenv.sn" and "connect.sn"
.
 Function drawnet opens a graphic
window and displays the activation state of each layer. This function is
used in conjunction with the setdispnet mode of library Netenv.
 Function NetToolnet encodes the
structure of the network for the next runs of NetTool. You should not
call this function yourself.
These functions can be used by the interpreter or saved into a file.
8.1.2.1.1.0.0. (NetTool)

(NetTool.sn) 
This autoload function loads the libraries
"ogre.sn" and "NetTool.sn"
once and creates the NetTool interface.
8.1.2.1.1.1. Loading and Saving a Network Architecture in NetTool.


There are four items in menu Network .
Item "New Network" clears the
workspace and initializes the network editor. If the current network has
not been saved into a file, a confirmation dialog is displayed.
Item "Load Network" pops up a dialog
for loading a network description file. A mouse click on button
"Load Network" of the requester reads the specified network
description file. A network description file is just a file containing
three SN functions: NetToolnet ,
createnet and drawnet .
NetTool extracts the network architecture from the first function,
NetToolnet .
Item "Save or Create Network" pops up
a dialog for saving a network description file or creating a network.

The network description may be either saved into a file or used for
creating a network.
When item "Create network in Memory"
is selected, the network editor creates the description functions in the
interpreter memory. Then it calls function
createnet in order to actually create the network.
When item "Save network in File" is
selected, the network editor saves the description functions in the
specified file.
 Network descriptions are composed of three functions.
Function NetToolnet encodes the
structure of the network for subsequent runs of the network editor. It
is always generated.
Function createnet builds the
network using the Netenv functions. It is generated only when item
Generate createnet is checked.
Function drawnet draws the network
in connection with the Netenv library. It is generated only when item
Generate drawnet is checked.
 Function createnet usually
allocates memory for the exact number of units or connections required
by the network. The user can however ask for additional space using the
fields "Allocate more cells" and
"Allocate more connections" .
 A mouse click on button "Compile & Save"
creates a network description file.
Item "Quit" closes the network editor
window. If unsaved changes have been performed on the current network, a
confirmation dialog is displayed.
8.1.2.1.1.2. Layers in NetTool.


Layers are managed from menu Layer .
There are also four items in this menu for creating, editing,
duplicating and deleting layers:
 Item
"New" . Selecting this item or typing
n in the workspace pops up a dialog for creating a new layer
with various characteristics.
 Item "Edit" . Selecting this item
or typing e in the workspace pops up a
dialog for modifying the characteristics of the selected layer.
 Item "Duplicate" Selecting this
item or typing d in the workspace make
copies of the selected layers and of their connections.
 Item "Delete" Deletes the
selected layers.
8.1.2.1.1.2.0. Layers Creation in NetTool.


Selecting New in menu
Layers pops up the layer editing dialog.
Several parameters can be specified.
The name of the layer is controlled by an editstring. A default unique
name is provided.
The size of the layer is controlled by an editstring.
 A single
number defines a 1-dimensional layer. For instance a size of
"40" indicates a one-dimensional layer with 40 units.
 Two numbers define a 2-dimensional layer. For instance a size of
"6x10" defines a matrix of 60 units with 6 rows and 10
columns.
The biases of the units in the layer are controlled by four exclusive
buttons.
 Item "None"
specifies that no units in this layer are connected to the bias unit.
 Item "Full" Every unit is
connected to the bias unit with a different weight.
 Item "Shared" specifies that
every unit is connected to the bias unit with the same weight. This
function generates a network with shared weight. Only the kernels
sn28ite and sn28itenew can simulate such networks.
 Item "Column shared" specifies
that every unit in a column of the layer is connected to the bias unit
with the same weight. There is a different weight per column. This
option is useful for defining timedelay networks. This function
generates a network with shared weight. Only the kernels sn28ite and
sn28itenew can simulate such networks.
The display mode of the layer is controlled by three exclusive buttons.
This mode is only useful for generating function drawnet.

Item "None" specifies that function
drawnet will not display this layer.
 Item "Hinton" specifies that
function drawnet will display this
layer as small squares using function drawlist
. This display style has been popularized by several famous papers
written by Geoffrey Hinton.
 Item "Gray Levels" specifies that
function drawnet will display this
layer as gray levels using function
graydrawlist .
The new layer appears in the workspace when you click button
"Ok" .
8.1.2.1.1.2.1. Layers Manipulation in NetTool.


Layers are symbolized by a decorated rectangle and a label. The size and
the location of the rectangle denote the size and location of the layer
drawn by function drawnet .
Four zones can be activated:
 The layer label is a selection
zone. A first click on the layer label selects the layer. A second click
on the layer label pops up the layer editing dialog. A click
outside a layer deselects all layers. The labels of selected layers are
rendered in white on a black background.
 The gray triangle on the right side of the drag area is a
connection handle. It is used for creating a new connection.
 You can resize a layer using the resize box. The form factor of the
layer is constrained by the size and the display mode of the layer.
 You can press the mouse on the gray rectangle and move the layer
around. You will notice that an invisible grid helps you to align the
layers nicely.
8.1.2.1.1.2.2. Layers Edition in NetTool.


Layers are edited with the layer editing dialog, which is also used for
creating layers.
This dialog can be popped up for a given layer by first selecting the
layer to edit with a click on its label, then either calling item
Edit of menu Layer or
clicking on the label again.
See: Creating Layers in NetTool.
8.1.2.1.1.2.3. Layers Deletion in NetTool.


Layers are deleted after they have been selected by choosing item
Delete of menu Layer . A
confirmation dialog is popped up before effective deletion.
8.1.2.1.1.2.4. Layers Duplication in NetTool.


Layers are duplicated after they have been selected by choosing item
Duplicate of menu Layer or
typing key d .
The names of the new layers are derived from the names of the original
layers. The parameters of the new layers are the same as those of the
original layers. Connections equivalent to those of the original layers
are also created.
8.1.2.1.1.3. Connections in NetTool.


Layer-to-layer connections are created, deleted and edited with the
connection editing dialog.
8.1.2.1.1.3.0. Creating a Connection.


A connection is created by clicking on the connection handle of the
upstream layer (the gray triangle on the right side of the layer layout)
and dragging the mouse to the downstream layer.
After the initial click on the connection handle, while the mouse is
moved with its button still pressed, a dashed line is drawn between
the connection handle and the mouse position.
When the mouse button is released on the downstream layer, the
connection editing dialog appears and lets you specify a pattern of
connections among those offered by the functions of library
"connect.sn" .
Menu Full / Local proposes full and
local connection patterns.
 Item
"connect" specifies a full connection between the layers:
every unit in the upstream layer is connected to every unit in the
downstream layer.
 Items "local1dconnect" ,
local2dconnect and
localtoricconnect specify local connections identical to
the local connections established by the corresponding functions of
library "connect.sn" . See the
documentation of this library in manual "Netenv" for more details.
Menu Shared proposes shared weight
connection patterns. Networks using these connection patterns require
kernels sn28ite or sn28itenew.
 Items
"mask1dconnect" , "mask2dconnect"
and "masktoricconnect" specify
convolutional connection patterns identical to the patterns established
by the corresponding functions of library
"connect.sn" . See this library documentation in manual
"Netenv" for more details.
 Items "equalmask1dconnect" and
"equalmask2dconnect" specify subsampling connection
patterns identical to the patterns established by the corresponding
functions of library "connect.sn" .
See the documentation of this library in manual "Netenv" for more
details.
 Item "tdnnconnect" specifies a
time-delay connection pattern identical to the pattern established by
function tdnnconnect of library
"connect.sn" . See the documentation of this function in
manual "Netenv" for more details.
When you select certain patterns of connection, you must enter
additional parameters in the fields named "Size"
and "Step" . Each parameter can be a
single digit, like "4" or a pair of
digits, like "3x3" . You can find more
details about the standard connection patterns in the chapter describing
the library "connect.sn" of the manual
"Netenv".
Parameters "From" and
"To" enable the user to connect a part of the upstream layer
to the downstream layer. They default so that the whole upstream layer
is connected.
The connection is created when you click on button
"Ok" .
See: Building Connections in Netenv.
8.1.2.1.1.3.1. Editing and Deleting Connections.


Connections are symbolized by a line linking both layers. In the first
half of the line, there is a small black square which is in fact a small
button.
This button becomes white if the parameters of this connection do not
match the sizes of the layers. Some connection patterns indeed require
arithmetic relations between the sizes of the layers and the parameters
of the connection pattern.
Clicking on this button pops up the connection editing dialog.

You can modify the parameters or the type of the connection and validate
the modification by clicking on button "Ok"
.
 You can press button "Delete" to
remove the connection.
8.1.2.1.1.4. Conclusion to NetTool.


Both beginners and advanced programmers can take advantage of this
network editor.
 Using NetTool, a beginner to SN2.8 can easily
define complex networks.
 Advanced programmers can use the files produced by NetTool as a
template for defining even more complex network architectures, like
recurrent networks.
8.1.2.1.2.0. Overview of BPTool.


BPTool is a graphical interface for controlling the training of
multilayer networks.
The BPTool window appears when function BPTool
is run.
It allows you to perform the most common actions allowed by the library
"netenv.sn" . It provides sensible default values for all the
convergence parameters.
More precisely, BPTool provides a mouse-driven interface to:

define a network using NetTool,
 load the patterns,
 define a training set and a test set,
 select a training algorithm,
 select the initial weights,
 select the parameters of the training algorithm,
 select the performance criterion,
 train the network,
 measure the performance on both the training set and the test set,
 plot the evolution of these performances during training,
 display the network during training or measurement.
BPTool relies on the underlying mechanisms of library Netenv. Moreover,
the various menus and requesters of BPTool are equivalent to the most
common functions of this library. You can even mix BPTool actions and
TLisp commands.
When called, the BPTool interface first pops up a file requester for
setting the performance file.
BPTool contains a menu bar with three menus.
 Menu
File lets you define a network, load the databases, load or
save the weights.
 Menu Parameters lets you select
the parameters of the training algorithm.
 Menu Settings lets you choose a
performance criterion and select display options.
BPTool contains several logical zones.
 The ``data selection
zone'' lets you select the boundaries of the training and test sets.
 The ``algorithm selection zone'' lets you choose a proper training
algorithm.
 The ``learning rates control zone'' lets you change the learning
rates used for the weight updates.
 The ``training and control zone'' lets you start and stop the
training algorithms.
 The ``Message zone'' displays messages such as the age of the
current weights and the number of allocated neurons.
8.1.2.1.2.0.0. (BPTool)

(BPTool.sn) 
This autoload function loads the libraries
"ogre.sn" and "BPTool.sn"
once and creates the BPTool interface.
8.1.2.1.2.1. BPTool Menus.


Menus are somewhat arranged in logical order. You can use the menu items
in order: start with menu File to load
a network and a database, then specify the parameters with menu
Parameters and request some information display using menu
Settings .
8.1.2.1.2.1.0. BPTool File Menu.


Menu File lets you define a network,
load the databases, load or save the weights.
Item "Define Network" starts the
network editor NetTool. Using NetTool you can define or load a network
and build it by checking item Create to Memory
when you save the network description. Once the network has been
created, the Network Status String displays the number of units,
connections and weights in this network.
Item "Automatic Network" computes an
optimized simple network with one hidden layer and full connections.
This computation proceeds as follows:
 1 Until an overfitting
criterion is reached, a new architecture is chosen.
 2 For this architecture, a training run is carried out and stopped
using an overfitting criterion based on the test set. The best weight set
is kept.
 3 Go to step 1, or save the network architecture and weights.
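The grow-and-stop loop above can be sketched as follows. The function `train_with_early_stopping` is a hypothetical placeholder standing in for a full SN training run stopped by an overfitting criterion on the test set; it is not an SN function.

```python
# Sketch of the "Automatic Network" search loop. `train_with_early_stopping`
# is a hypothetical placeholder: it trains a one-hidden-layer network of the
# given size with early stopping and returns the best test error and the
# corresponding weights.
def automatic_network(train_with_early_stopping, max_hidden=64):
    best_error, best_config = float("inf"), None
    hidden = 1
    while hidden <= max_hidden:
        error, weights = train_with_early_stopping(hidden)
        if error >= best_error:        # overfitting criterion: no more progress
            break                      # stop growing, keep the best so far
        best_error, best_config = error, (hidden, weights)
        hidden *= 2                    # try a larger hidden layer
    return best_config, best_error
```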
Item "Load Patterns" pops up a
requester for loading the example files. You must specify the names of
two files. When you press button "Load"
, both files are loaded into the variables
patternmatrix and desiredmatrix
using function loadpatterns of the
Netenv library.
 The "patternmatrix"
file must contain a 2-dimensional matrix in TLisp format. Each row of
this matrix contains the input vector for one example.
 The "desiredmatrix" file must
contain a 2-dimensional matrix in TLisp format. Each row of this matrix
contains the output vector for one example.
Item "Load Weights" pops up a
requester for loading the weights from a file using function
loadnet of the Netenv library.
Item "Save Weights" pops up a
requester for saving the weight into a file using function
savenet of the Netenv library.
Item "Quit" closes the BPTool window.
8.1.2.1.2.1.1. BPTool Parameters Menu.


Menu "Parameters" lets you select the
parameters of the training algorithm.
 Item
"Initial weights" pops up a requester for setting initial
weights.
 Item "Learning rates" pops up a
requester for setting initial learning rates.
 Item "Alpha & Decay" pops up a
requester for setting momentum and decay.
 Item "Mu & Gamma" pops up a
requester for setting initial second-order parameters.
 Item "Transfer Function" pops up
a requester for setting the global transfer function.
8.1.2.1.2.1.1.0. Setting Initial Weights with BPTool.


Item "Initial weights" of menu
"Parameters" pops up a requester for initializing the weights
in the network.
The weights are initialized using uniform random values in a given
interval which depends on a parameter x
and on the fan-in of units (i.e. the number of incoming connections to a
downstream cell).
 Exclusive button
"random[-x,+x]" refers to function
forget of the Netenv library.
 Exclusive button "random[-x/fanin,+x/fanin]"
refers to function forgetinv of the
Netenv library.
 Exclusive button
"random[-x/sqrt(fanin),+x/sqrt(fanin)]" refers to function
forgetsqrt of the Netenv library.
Button "Default" sets the default
initialization which uses interval
random[-1/sqrt(fanin),+1/sqrt(fanin)].
Button "Ok" actually initializes the
weights and pops the requester down.
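The three initialization intervals can be sketched as follows; this is an illustration of the scaling rules only, not the code of forget, forgetinv or forgetsqrt.

```python
import math
import random

# Sketch of the three fan-in-scaled weight initializations: each weight is
# drawn uniformly in an interval whose width depends on the parameter x and
# on the fan-in of the downstream unit.
def init_weights(fanin, fanout, x, mode="sqrt", seed=0):
    rnd = random.Random(seed)
    bound = {
        "const": x,                     # random[-x, +x]               (forget)
        "inv":   x / fanin,             # random[-x/fanin, +x/fanin]   (forgetinv)
        "sqrt":  x / math.sqrt(fanin),  # random[-x/sqrt(fanin), ...]  (forgetsqrt)
    }[mode]
    return [[rnd.uniform(-bound, bound) for _ in range(fanin)]
            for _ in range(fanout)]
```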
8.1.2.1.2.1.1.1. Setting Initial Learning Rates with BPTool.


Item "Learning rate" of menu
"Parameters" pops up a requester for setting the learning
rates in the network.
Four methods are provided which depend on a parameter
x , on the fan-in (i.e. the number of incoming connections to
a downstream cell) and on the share of units (the number of connections
sharing this weight).
 Exclusive button "x"
refers to function epsilon of the
Netenv library.
 Exclusive button "x/fanin" refers
to function epsi of the Netenv
library.
 Exclusive button "x/sqrt(fanin)"
refers to function epsisqrt of the
Netenv library.
 Exclusive button "x/sqrt(fanin*share)"
refers to function maskepsi of the
Netenv library. This is dedicated to shared weights networks and is
available only in versions sn28ite and sn28itenew.
Button "Default" sets the default
value 0.1 / sqrt(fanin) .
Button "Ok" actually sets the learning
rates.
8.1.2.1.2.1.1.2. Setting Momentum and Decay with BPTool.


Item "Momentum & Decay" of menu
"Parameters" lets the user define two parameters used by the
first order weight update rule.
This menu item is disabled when a second order algorithm is selected,
such as Newton or Levenberg-Marquardt.
Using this menu merely sets the variables alpha
and decay which are used by the Netenv
library. Both values default to 0.
See function updateweight of the
Netenv library for more details.
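As an illustration, the usual first-order update rule with momentum alpha and weight decay looks as follows; the exact formulation used by updateweight may differ, so treat this sketch as an assumption.

```python
# Sketch of a first-order weight update with momentum (alpha) and weight
# decay; with alpha = decay = 0 (the defaults) it reduces to plain
# gradient descent. This is an assumed formulation, not SN's exact code.
def update_weight(w, grad, dw_prev, eps=0.1, alpha=0.0, decay=0.0):
    # dw = -eps * gradient + alpha * previous step - decay * weight
    dw = [-eps * g + alpha * p - decay * wi
          for g, p, wi in zip(grad, dw_prev, w)]
    return [wi + d for wi, d in zip(w, dw)], dw
```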
8.1.2.1.2.1.1.3. Setting Second Order Parameters with BPTool.


Item "Mu & Gamma" of menu
Parameters lets the user define two parameters used by the
second order weight update rules.
This menu item is only enabled when a second order algorithm is
selected, such as Newton or Levenberg-Marquardt.
Using this menu merely sets the variables gamma
and mu which are used by the Netenv
library. Both values default to 0.05.
See functions updateggradient and
hessianscale of the Netenv library for more details.
8.1.2.1.2.1.1.4. Setting the Global Transfer Function with BPTool.


Item "Transfer function" lets the user
choose a global transfer function. It calls a dialog providing
sigmoidal, piecewise linear and binary activation functions. BPTool
handles only the (single) global activation function.
The default function is a sigmoid.
 Checking item
"Sigmoid" sets a sigmoidal activation function using the
function nlftanh of the Netenv
library.
 Checking item "Piecewise linear"
sets a piecewise linear activation function using function
nlflin of the Netenv library.
 Checking item "Binary threshold"
sets a binary activation function using function
nlfbin of the Netenv library.
Button "Default" sets the default
sigmoid.
Button "Draw" draws the current
activation function.
8.1.2.1.2.1.2. BPTool Settings Menu.


Menu Settings lets you choose a
performance criterion and select display options.
8.1.2.1.2.1.2.0. Plotting Global Performances with BPTool.


Item "Plotting" pops up a requester
for setting up windows for plotting the mean squared error and the
network performance.
 Checking Error
Plotting sets up a window for plotting the mean square error
on the training set and on the test set (white and black circles
respectively) versus the number of training iterations. This is achieved
using Netenv library function initerrorplotting
.
 Checking Perf Plotting sets up a
window for plotting the performance on the training set (white circles)
and on the test set (black circles) versus the number of training
iterations. This is achieved using function initperfplotting of the
Netenv library. Performance is measured according to a criterion
selected with menu item "Classify" .
Field Number of sweeps specifies the
length of the plot as a number of passes of the training set.
Pressing button "Initialize Plotting"
actually creates the windows and recomputes the axes using the current
number of iterations and the current size of the training set.
8.1.2.1.2.1.2.1. Classification criterion in BPTool.


Item "Classify" lets the user define
how the performance is computed.
 Checking
None disables the performance computation. See function
setclassnil of the Netenv library.
 Checking Max states that a
pattern is well classified if the most active unit is the right unit.
See function setclassmax of the Netenv library.
 Checking Quadrant states that a
pattern is well classified if the state of each output unit has the
right sign. See Netenv library function setclassquadrant.
 Checking Sign states that a
pattern is well classified if the state of each output unit has the
right sign with a given margin. See function setclasssgn of the Netenv
library.
 Checking Lms states that a
pattern is well classified if the mean square error is less than a given
margin. See function setclasslms of the Netenv library.
 Checking Hamming states that a
pattern is well classified if the states of the output units are within
a given margin of the desired values. See function setclasshamming of
the Netenv library.
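For illustration, three of these criteria can be sketched for a single example as follows; these follow the textual definitions above and are not SN's code.

```python
# Sketches of three BPTool performance criteria applied to one example;
# `output` and `desired` are plain lists of output-unit states.
def class_max(output, desired):
    # Max: well classified if the most active unit is the desired one
    return output.index(max(output)) == desired.index(max(desired))

def class_quadrant(output, desired):
    # Quadrant: well classified if every output unit has the right sign
    return all((o > 0) == (d > 0) for o, d in zip(output, desired))

def class_lms(output, desired, margin=0.2):
    # Lms: well classified if the mean squared error is below the margin
    mse = sum((o - d) ** 2 for o, d in zip(output, desired)) / len(output)
    return mse < margin
```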
8.1.2.1.2.1.2.2. Display Options in BPTool.


Item "Display" of menu
Settings lets the user set up various displays updated during
every training iteration or every measurement iteration. This is
similar to function setdisp of
the Netenv library.
 Checking Text
outputs extensive information to the interpreter window whenever an
example is processed.
 Checking Error plots the
instantaneous error versus the number of training iterations whenever an
example is processed. This plot is created in a special window.
 Checking Network draws the
network whenever an example is processed. This picture is created in a
new window using the function drawnet defined by NetTool.
8.1.2.1.2.2. Training with BPTool.


8.1.2.1.2.2.0. Defining the test set and the training set with BPTool.


Four fields in the ``Data Selection Zone'' let you enter the boundaries
of the training set and the test set. These values are passed to
functions ensemble and testset of the
Netenv library.
8.1.2.1.2.2.1. Choosing the backpropagation version with BPTool.


You must also select a training algorithm using the ``Algorithm
Selection Zone''. The algorithm selection zone offers a choice among
four optimisation algorithms: Online Gradient, Online Levenberg-Marquardt,
Batch Gradient and Batch Conjugate Gradient.
Certain items are disabled when the current kernel does not support
the corresponding training algorithm.
The default training algorithm is the so-called
"Online Gradient Algorithm" (i.e. stochastic gradient) which
is faster when you train your network on extremely redundant data, like
handwritten digits.
If you are using sn28ite or sn28itenew, you can also select the
"Batch Gradient Algorithm" . This algorithm is sometimes
better for finishing the training and tuning the weights of a network.
You must progressively decrease the learning rate as learning
improves.
If you are using sn28new or sn28itenew, you can use second order
optimization of the cost function using the stochastic
Levenberg-Marquardt algorithm. These methods can be combined with both
the stochastic and batch gradient methods.
You can just specify a constant learning rate of about 0.0005 and let
the algorithm adjust the step sizes. Although slower than well tuned
parameters, these algorithms put the strain on your computer rather
than on you.
The "Conjugate Gradient Method" is
useful for deep optimization of small networks on medium size databases.
8.1.2.1.2.2.2. Controlling the training with BPTool.


Finally, the ``Training control Zone'' and ``Learning rate control
zone'' let you control the training.
The editable field labelled "Test performance
every  iterations" controls the number of pattern
presentations between two performance measurements. When a batch algorithm
is selected, this number is rounded to the next multiple of the training
set size. By default, performance measurements are done after every pass
over the training set.
Three buttons let the user start the training algorithm.

Button "Learn" starts the training
algorithm until you press button "Break"
. No measurements are performed during training.
 Button "Run" starts the training
algorithm until you press button "Break"
. Measurements are performed on the training set every n iterations. The
results are plotted according to the current settings and displayed in
the message string.
 Button "Trun" starts the training
algorithm until you press button "Break"
. Measurements are performed on both the training set and the test set
every n iterations. The results are plotted according to the current
settings and displayed in the message string.
On the first click, button "Break"
tells BPTool to stop training at the end of the current pass over the
training set. This condition is displayed in the message string. On the
second click, it stops training immediately.
The BPTool window remains active while the training algorithm is
running. In particular, you can use the File
, Parameters and
Settings menus.
Two buttons control the learning rates.
 Button
"Divide epsilon by 2" divides all learning rates by 2.
 Button "Multiply epsilon by 2"
multiplies all learning rates by 2.
At any time you can start a measurement pass on the training set or on
the test set using the following buttons.
 Button
"Training Set Perf" performs a measurement pass on the
training set. Results are plotted according to the current settings and
displayed in the message string.
 Button "Test Set Perf" performs a
measurement pass on the test set. Results are plotted according to the
current settings and displayed in the message string.
8.1.2.1.3. Other SN Tools.


Three tools are available for data preprocessing or postprocessing.

StatTool computes simple statistics on the values of a 2-dimensional
matrix.
 NormTool performs a linear normalization on a 2-dimensional matrix.
 ImageTool allows you to browse and preprocess a matrix containing
several images.
8.1.2.1.3.0. StatTool.


This tool computes simple statistics on the values of a 2-dimensional
matrix.
Item "Load" in menu
File lets you load a new matrix from a file or from a lisp
variable.
Button "Statistics" computes simple
statistics.
Button "Histogram" draws a histogram.
The histogram routine, however, is written in Lisp and may be
disappointingly slow.
8.1.2.1.3.0.0. (StatTool [mat])

(matrixtool.sn) 
This autoload function invokes the StatTool interface.
8.1.2.1.3.1. NormTool.


This tool performs a linear normalization on a 2-dimensional matrix.
There are several choices for entering the coefficients of the linear
normalization.
 Selecting Min/max
lets you enter the desired maximal and minimal values of the elements of
the matrix.
 Selecting Mean/Sdev lets you
enter the desired mean and standard deviation of the elements of the
matrix.
 Selecting AX+B lets you enter
directly the coefficients.
If Column by column is checked, these
normalization coefficients are computed independently for each column of
the matrix.
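As an illustration, the coefficients of the linear map y = A*x + B can be derived from the requested statistics as follows (a sketch of the arithmetic, not NormTool's code); for ``Column by column'', the same computation is simply applied to each column separately.

```python
import math

# Sketch of deriving the coefficients of the linear normalization
# y = a*x + b from the requested statistics (illustration only).
def coeffs_from_minmax(values, new_min, new_max):
    lo, hi = min(values), max(values)
    a = (new_max - new_min) / (hi - lo)
    return a, new_min - a * lo

def coeffs_from_meansdev(values, new_mean, new_sdev):
    mean = sum(values) / len(values)
    sdev = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    a = new_sdev / sdev
    return a, new_mean - a * mean
```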
8.1.2.1.3.1.0. (NormTool)

(matrixtool.sn) 
This autoload function invokes the NormTool interface.
8.1.2.2.0. Introduction to Knnenv.


8.1.2.2.0.0. Overview of Knnenv.


This manual discusses a family of algorithms known as ``codebook based
algorithms'' for pattern recognition. SN2.8 offers turnkey routines for:

nearest neighbor algorithms (kNN),
 codebook editing algorithms (Multiedit/Condense),
 clustering algorithms (kMeans and LVQ),
 network of locally tuned units (RBF),
 topological maps (TMaps).
It also contains a set of lower level routines for manipulating
codebooks and performing the essential computations of the above
algorithms. In fact, all these algorithms are partly implemented by
primitives hardcoded in the kernel, and partly as lisp programs
located in file ".../sn2.8/lib/knnenv.sn"
.
There is a major prerequisite for using codebook based algorithms:
codebook based algorithms rely on a simple Euclidean topology which is
often too simple for handling raw data. They usually perform much better
on adequately preprocessed data.
Preprocessing methods include many techniques:
 statistical
techniques, like scaling the principal axes as in the Mahalanobis distance,
 spectral analysis using Fourier transforms, wavelet methods or LPC,
 image analysis techniques, like histogram equalization or the Hough
transform,
 or even multilayer neural networks.
The algorithms themselves have been implemented with numerically robust
routines whose convergence is based on well understood properties.
8.1.2.2.0.1. About this Knnenv Manual.


This manual is divided into five major parts:
 Knnenv relies
mainly on the codebook data structure. Knnenv provides several
fundamental functions for creating, accessing, modifying, merging and
splitting codebooks.
 Codebooks are basically used for quantization and classification
tasks. Knnenv provides several basic operators for this purpose.
 Codebooks may be resource-consuming when the number of data
increases. Knnenv provides three main families of functions aimed at
obtaining reasonably small codebooks: editing algorithms like multiedit and
condense, clustering algorithms like kMeans and discriminant algorithms
like LVQ.
 Radial Basis Functions (RBF) are implemented as a hybrid algorithm
combining a Knnenv codebook and a Netenv network.
 Topological Maps are implemented as codebooks extended with
topological information.
8.1.2.2.1.0. The Codebook Data Structure.


A codebook is a set of pairs composed of a vector and of a numeric label
which encodes the class of the corresponding vector. Codebooks serve two
purposes:
 A codebook can store examples of a pattern
recognition problem. Each element of the codebook describes a pattern
(the vector) and its class (the label).
 A codebook can store prototypes extracted by a pattern recognition
algorithm. Each element of the codebook describes a typical pattern (the
vector) of a given class (the label). Such prototypes are then used for
recognizing unknown patterns.
SN2.8 provides a number of fast primitives for (a) creating and
accessing codebooks, (b) merging and splitting codebooks, (c) returning
the closest elements of a codebook to a given pattern and (d) creating
or improving a codebook of prototypes using a codebook of examples.
8.1.2.2.1.1. Codebook Creation and Access.


Codebooks are lisp objects of class CODEBOOK.
The patterns of a codebook are in fact stored in the rows of a matrix
(the ``wordmatrix'') associated to the codebook. You can obtain this
matrix either with function codebookword
or by evaluating the codebook with function eval
.
The labels are stored in the codebook itself. Labels are small positive
integer numbers. You can access these labels with function
codebooklabel .
8.1.2.2.1.1.0. (loadiris)

(knnenv.sn) 
This function creates two codebooks containing Fisher's iris database.
Codebook irisa contains a training set of 100 examples. Codebook irist
contains the total set of 150 examples (including the 100 training
examples, for historical reasons only!)
This database contains examples of three species of iris.
 Each
flower is described by four numerical measurements (the pattern)
concerning the sizes of the flower's petals and sepals. These four
numbers are stored in each row of the wordmatrix of the codebooks.
 The species of the flower, i.e. 0
, 1 or 2
, is stored in the label part of the codebook.
8.1.2.2.1.1.1. (newcodebook &lt;ncode|wordmatrix&gt; &lt;ndim|labelvector&gt;)


When function newcodebook is called
with numerical arguments, it returns a new codebook object able to
contain ncode patterns of dimension ndim. Both the patterns and the
labels are initialized with zeroes.
? (setq a (newcodebook 30 4))
= ::CODEBOOK:30x4
? (eval a)
= ::MAT:30x4
? (bound (eval a))
= (30 4)
When function newcodebook is called
with matrix arguments, it returns an initialized codebook object.
Argument wordmatrix is a 2-dimensional fmatrix whose rows contain the
patterns. Argument labelvector is a
1-dimensional vector containing the labels of the patterns stored in
every row of wordmatrix .
The resulting codebook just points to the patterns stored in matrix
wordmatrix . Modifying this matrix thus modifies the codebook
patterns.
8.1.2.2.1.1.2. (codebookword cb [ n [ word ]] )


With one argument, this function returns the wordmatrix of codebook
cb . This matrix addresses the same elements as the
codebook. Modifying this matrix thus modifies the codebook.
With two arguments, this function returns the n
th pattern of codebook cb as a
vector. This vector addresses the same elements as the codebook.
Modifying this vector thus modifies the n
th pattern of the codebook.
With three arguments, this function copies vector word into the
n th pattern of codebook cb
. It then returns the new pattern.
? (loadiris)
= ::CODEBOOK:150x4
? (codebookword irisa 2)
= ::MAT:4
8.1.2.2.1.1.3. (codebooklabel cb [ n [ label ]] )


With one argument, this function returns a new vector containing the
labels of each example of codebook cb
. Unlike the matrices returned by function
codebookword , modifying this vector does not modify the
codebook.
With two arguments, this function returns the label of the
n th example of cb .
With three arguments, this function sets the label of the
n th example of codebook cb
to integer label. It then returns the new label.
? (loadiris)
= ::CODEBOOK:150x4
? (for (i 0 4) (print (codebooklabel irisa i)))
0
2
2
0
2
= 2
8.1.2.2.1.1.4. (codebooklabels cb)


This function returns a list of all the labels present in codebook
cb .
? (loadiris)
= ::CODEBOOK:150x4
? (codebooklabels irist)
= (0 1 2)
8.1.2.2.1.2. Codebook Operations.


A few operations are extremely useful for manipulating codebooks as a
whole.
8.1.2.2.1.2.0. (codebooksplit cb min max )


This function returns a codebook composed of examples
min to max of codebook
cb . Note that this new codebook addresses the same patterns
as cb . Modifying a pattern in the
new codebook thus modifies a pattern in codebook
cb .
? (loadiris)
= ::CODEBOOK:150x4
? (codebooksplit irist 25 74)
= ::CODEBOOK:50x4
8.1.2.2.1.2.1. (codebookmerge cb1...cbn)


This function combines the examples of codebooks
cb1 to cbn and returns a
new codebook containing copies of these examples.
? (loadiris)
= ::CODEBOOK:150x4
? (codebookmerge irisa irist)
= ::CODEBOOK:250x4
8.1.2.2.1.2.2. (codebookselect condition cb1 ... cbn)


This function extracts the examples of codebooks
cb1 to cbn whose labels
fulfil condition condition. It returns a new codebook containing copies
of these examples.
Argument condition is a function of one argument (the label) which
returns a boolean value. An example is selected if a call to this
function with the example label as argument returns a non-nil result.
? (loadiris)
= ::CODEBOOK:150x4
;; Select all iris of first class in codebooks irisa and irist
? (codebookselect (lambda(c) (= c 1)) irisa irist)
= ::CODEBOOK:81x4
8.1.2.2.1.3. Codebook and Netenv.


If you develop an application which merges a classical network and a
codebookbased algorithm, you may wish to transfer data from a codebook
to a network or vice versa.
8.1.2.2.1.3.0. Conversions with the Netenv Format.


The codebook data structure is handy for storing or retrieving pattern
recognition data. A single data structure contains both the patterns and
the class.
The natural format used by the Netenv library is slightly different.
Patterns are stored in a first matrix (the pattern matrix) identical to
the wordmatrix of a codebook. Labels however are stored in a second
matrix (the desired matrix) whose columns correspond to each class. The
largest value in a row indicates the class of the corresponding example.
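The conversion between the two formats can be sketched as follows, using a +1/-1 coding of the desired matrix (one common choice, assumed here; the actual conversion functions may use different values):

```python
# Sketch of the codebook <-> Netenv format conversion. The desired matrix
# has one column per class; the largest value of a row marks the class.
# The +1/-1 coding is an assumption, one common choice among others.
def codebook_to_patdes(words, labels, nclasses):
    desired = [[1.0 if c == lab else -1.0 for c in range(nclasses)]
               for lab in labels]
    return words, desired

def patdes_to_labels(desired):
    # recover the label as the column holding the largest value
    return [row.index(max(row)) for row in desired]
```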
8.1.2.2.1.3.0.0. (codebooktopatdes cb)

(knnenv.sn) 
This function takes a codebook cb and
returns a list of the two matrices containing the same examples under
the Netenv natural format.
8.1.2.2.1.3.0.1. (patdestocodebook patternmatrix desiredmatrix)

(knnenv.sn) 
This function takes a pattern matrix
patternmatrix and a desired matrix
desiredmatrix and returns a codebook containing the same
examples.
8.1.2.2.1.3.1. Using Codebooks with Netenv Algorithms.


Besides these conversion functions, you might redefine functions
presentpattern and presentdesired
of the Netenv library to handle the codebook format. Indeed, these
functions are always called for accessing the examples.
For instance, the following functions extract the example data of a
backpropagation network from a codebook patterncodebook rather than
from the classical matrices patternmatrix
and desiredmatrix .
(de presentpattern(layer pnum)
  (get_pattern (eval patterncodebook) pnum layer)
  (if (0<> inputnoise)
      (gaussstate layer inputnoise) ) )

(de presentdesired(layer pnum)
  (let ((cl (codebooklabel patterncodebook pnum)))
    (each ((c desiredlayer))
      (if (0= cl)
          (nval c 1)
          (nval c -1) )
      (incr cl -1) ) ) )
8.1.2.2.1.4. Codebook Files.


8.1.2.2.1.4.0. Codebook File Format.


There are two formats for codebook files.
The ascii format records the codebook data in text files. The first two
lines contain the number of examples ncode
and the size of the patterns ndim .
Every other line contains a pattern y
as a collection of numbers separated by blanks. The corresponding labels
are recorded as integer numbers on the intermediate lines.
<ncode>
<ndim>
<y00> <y01> <y02> ...
<label0>
<y10> <y11> <y12> ...
<label1>
...
The binary format is almost identical. The number of examples <ncode>, the
size of the patterns <ndim> and the labels are stored as four byte integers
(int). The patterns are stored as vectors of single precision numbers (float).
CODEBOOK FILE:
int ncode;
int ndim;
REPEAT ncode TIMES:
float pattern[NDIM];
int labels;
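As an illustration, a reader for this binary layout can be sketched as follows; the byte order is assumed to be the machine's native order, which the text does not specify.

```python
import struct

# Sketch of a reader for the binary codebook format described above:
# two 4-byte ints (ncode, ndim), then ncode records of ndim single
# precision floats followed by a 4-byte integer label. Native byte
# order is an assumption.
def read_codebook(f):
    ncode, ndim = struct.unpack("ii", f.read(8))
    patterns, labels = [], []
    for _ in range(ncode):
        patterns.append(struct.unpack("%df" % ndim, f.read(4 * ndim)))
        labels.append(struct.unpack("i", f.read(4))[0])
    return patterns, labels
```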
8.1.2.2.1.4.1. Codebook I/O Functions.


8.1.2.2.1.4.1.0. (savecodebook cb filename)


This function saves a codebook cb into
a file named filename under the binary codebook format.
8.1.2.2.1.4.1.1. (saveasciicodebook cb filename)


This function saves a codebook cb into
a file named filename under the ascii codebook format.
8.1.2.2.1.4.1.2. (loadcodebook filename)


This function loads and returns a codebook from file named filename
using the binary codebook format. Unlike function
loadmatrix , these functions never check the type of the
data files.
8.1.2.2.1.4.1.3. (loadasciicodebook filename)


This function loads and returns a codebook from file named filename
using the ascii codebook format. Unlike function loadmatrix, these
functions never check the type of the data files.
8.1.2.2.2. kNearest Neighbors.


In this chapter, we describe elementary functions for using codebooks
for quantization and classification tasks. The basic algorithm consists
in extracting the k closest examples of a codebook to a given pattern.
8.1.2.2.2.0. Distances in Knnenv.


The very notion of closest examples depends on the selected distance.
For efficiency reasons, we have implemented functions using the
Euclidean distance. In many cases a non-Euclidean topology can be mapped
to a Euclidean topology by an adequate preprocessing of the patterns.
Consider for instance the Mahalanobis distance:
Mahalanobis(x,y) = (x-y)^T A (x-y)
where matrix A is the inverse of the
positive definite covariance matrix of the examples. By diagonalizing
matrix A , we obtain
A = P^T L P where L is a
diagonal matrix with positive terms. If we denote as
R(L) the diagonal matrix whose terms are the square roots of
the terms of matrix L , we have:
Mahalanobis(x,y) = Euclidean( R(L) P x, R(L) P y )
Actually A is often the inverse of the
covariance matrix plus a small regularizing term.
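This mapping can be sketched with a small numerical check (an illustration using an eigendecomposition, not SN code):

```python
import numpy as np

# Sketch of mapping a Mahalanobis distance onto a Euclidean one.
# Factor A = P^T L P with an eigendecomposition, then preprocess
# every pattern with x -> R(L) P x, where R(L) holds the square
# roots of the eigenvalues.
def mahalanobis_preprocessor(A):
    eigvals, eigvecs = np.linalg.eigh(A)   # A = eigvecs diag(l) eigvecs^T
    transform = np.diag(np.sqrt(eigvals)) @ eigvecs.T   # R(L) P
    return lambda x: transform @ x
```

After this preprocessing, the plain Euclidean distance between transformed patterns equals the original Mahalanobis distance.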
8.1.2.2.2.1. KNN for Vector Quantization.


Vector quantization consists in replacing a pattern by its closest
pattern in a codebook. This technique changes a continuous problem
(working on a continuous pattern space) into a discrete problem (working
on a finite set of codes). It has been used extensively in several areas
like speech processing.
8.1.2.2.2.1.0. (codebookdistances vector cbref)


This function returns a 1-dimensional matrix containing the Euclidean
distances between the vector vector
and all patterns in codebook cbref .
Argument vector must be a
1-dimensional matrix whose size matches the size of the examples in
codebook cbref .
8.1.2.2.2.1.1. (knn vector cbref k)


This function returns a list containing the order numbers of the
k closest patterns of codebook cbref
to vector vector . Argument
vector must be a 1-dimensional matrix whose size matches the
size of the examples in codebook cbref
.
? (loadiris)
= ::CODEBOOK:150x4
? (knn (codebookword irist 0) irisa 5)
= (0 73 59 56 87)
? (all ((n result)) (codebooklabel irisa n))
= (0 0 0 0 0)
8.1.2.2.2.2. KNN for Pattern Recognition.


As suggested by the previous example, we can assign a class to a new
pattern by selecting the most frequent label among the labels of the
k closest examples of a codebook. This is easily achieved by
the following functions:
8.1.2.2.2.2.0. (codebookdistandlabels vector cbref)

(knnenv.sn) 
This function returns a 2-dimensional matrix with 2 columns. The first
column contains the Euclidean distances between the vector
vector and all patterns in codebook
cbref . The second column contains the labels of the
corresponding patterns in codebook cbref
. Argument vector must be a 1-dimensional matrix whose size matches the
size of the examples in codebook cbref
.
8.1.2.2.2.2.1. (knnclass vector cbref k)


This function retrieves in codebook cbref
the k closest examples to vector
vector and returns the most frequent label. If two labels are
equally represented, this function returns -1.
? (loadiris)
= ::CODEBOOK:150x4
? (knnclass (codebookword irist 0) irisa 5)
= 0
This pattern recognition method is referred to as ``the kNN algorithm''.
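The rule can be sketched as follows (an illustration of the majority vote with rejection on ties, not SN's implementation):

```python
from collections import Counter

# Sketch of the kNN classification rule: take the majority label among
# the k nearest codebook patterns and reject (return -1, a common
# convention) when the two most frequent labels tie.
def knn_class(vector, words, labels, k):
    dists = sorted(
        (sum((w - v) ** 2 for w, v in zip(word, vector)), labels[i])
        for i, word in enumerate(words))            # squared Euclidean
    counts = Counter(label for _, label in dists[:k]).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return -1                                   # tie: rejection
    return counts[0][0]
```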
8.1.2.2.2.2.2. (testknn cbdata n cbref k)

(knnenv.sn) 
This function extracts the n th
pattern of codebook cbdata , retrieves
in codebook cbref the
k closest examples to this pattern
and tests whether the most frequent label is equal to the
n th label of codebook cbdata
. In this latter case, it returns t ;
otherwise () is returned.
? (loadiris)
= ::CODEBOOK:150x4
? (testknn irist 0 irisa 5)
= t
8.1.2.2.2.2.3. (perfknn cbdata cbref k [flag])


This function applies the kNN algorithm to every pattern of codebook
cbdata using the prototype codebook
cbref . It then compares the results with the labels stored in
codebook cbdata and returns the
percentage of correct classifications.
? (loadiris)
= ::CODEBOOK:150x4
? (perfknn irist irisa 2)
= 94
When the optional argument flag is t ,
this function returns a list (mis rej)
which indicates the misclassification rate mis
and the rejection rate rej . A
rejection occurs when two labels are equally represented among the
k closest examples.
? (perfknn irist irisa 2 t)
= (1.33333 4.66667)
The actual performance of the kNN algorithm improves when the number of
references increases. Given a reference codebook size, the optimal value
for parameter k is probably related to
the amount of noise present in the data.
8.1.2.2.2.2.4. (confusionknn cbdata cbref k)

(knnenv.sn) 
This function applies the kNN algorithm to every pattern of codebook
cbdata using the prototype codebook
cbref . It then compares the results with the labels stored in
codebook cbdata and returns a matrix
containing the number of examples of a given class recognized as members
of a second class.
? (loadiris)
= ::CODEBOOK:150x4
? (confusionknn irist irisa 1)
= [[50.00 0.00 0.00]
[ 0.00 48.00 2.00]
[ 0.00 0.00 50.00]]
In this matrix, each row denotes the actual class and each column
denotes the class returned by the kNN algorithm.
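Building such a matrix can be sketched as follows (an illustration, given the actual and predicted labels):

```python
# Sketch of building the confusion matrix: rows index the actual class,
# columns the class returned by the classifier.
def confusion(actual, predicted, nclasses):
    m = [[0] * nclasses for _ in range(nclasses)]
    for a, p in zip(actual, predicted):
        m[a][p] += 1
    return m
```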
8.1.2.2.3. Codebook Algorithms.


Several issues must be addressed to use the nearest
neighbour methods efficiently. On the one hand, the performance of these
algorithms depends crucially on the size of the codebook. On the other
hand, large codebooks slow down the search for the nearest neighbours.
This chapter presents several algorithms for computing small codebooks
with good properties from large codebooks. There are several families
of algorithms:
 Editing algorithms, like multiedit and
condense, select a limited number of examples from the large codebook.
 Clustering algorithms, like kMeans, compute a fixed number of
prototypical examples.
 Iterative algorithms, like LVQ, improve the patterns of a small
codebook to increase the pattern recognition performance.
8.1.2.2.3.0. Codebook Editing Algorithms.


Editing algorithms select a limited number of examples from a large
codebook in order to maintain or improve the rate of correct
classifications. We must therefore select patterns which implement a good
piecewise linear approximation of the Bayes boundary.
Certain patterns indeed seem less informative:
 Ambiguous
patterns just introduce noise effects on the decision boundary. In
practice, this noise is cancelled by selecting a larger number of
neighbours. The multiedit algorithm builds a smaller codebook with fewer
ambiguous patterns. This codebook achieves comparable performance using
a smaller number of nearest neighbours.
 Patterns deeply embedded within clusters bring little contribution
to the classification boundaries. The condense algorithm builds a
smaller codebook by removing these patterns. This codebook achieves
comparable performance with less computation.
Here are a few remarks for using these algorithms:
 Each
algorithm discards the patterns that the other algorithm would keep. The
multiedit algorithm keeps only the typical patterns of a class and
removes the ambiguous and noisy patterns which are closer to the decision
boundary. The condense algorithm would discard typical patterns and keep
the few limiting patterns which actually define the boundary.
 The multiedit algorithm is useful when the data are noisy for
reducing the number k of nearest neighbours with a limited impact on the
performance.
 The condense algorithm usually achieves a strong reduction of the
size of a multiedited codebook. It performs poorly, however, on a noisy
codebook because it keeps a large number of ambiguous patterns.
8.1.2.2.3.0.0. (multiedit cb n [verbose])

(knnenv.sn) 
This function applies algorithm multiedit to codebook
cb and returns the multiedited codebook. Here is the
algorithm multiedit :
 1 Divide the initial codebook in
n subsets.
 2 Classify the examples of subset i
using the nearest neighbour in the next subset.
 3 Discard all the misclassified examples.
 4 Iterate steps 1, 2 and 3 until no editing is done.
The parameter n controls the grain of
the multiedit algorithm. It must lie between 3 and the square
root of the codebook size. Using a larger n
eliminates more examples. We suggest starting with the smallest legal
value: 3.
When the flag verbose is t , a message
is displayed before each iteration. This message indicates the current
size of the multiedited codebook.
? (loadiris)
= ::CODEBOOK:150x4
? (perfknn irist irisa 1)
= 98.6667
? (setq mirisa (multiedit irisa 3))
= ::CODEBOOK:92x4
? (perfknn irist mirisa 1)
= 96.6667
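The four steps above can be sketched in Python (an illustrative
transcription, not the SN implementation: codebooks are represented as
lists of (vector, label) pairs, and the division into subsets is done by
a simple deterministic interleaving, which is one possible reading of
step 1):

```python
def nn_label(x, refs):
    """Label of the nearest reference to x (squared Euclidean distance)."""
    return min(refs, key=lambda r: sum((a - b) ** 2 for a, b in zip(x, r[0])))[1]

def multiedit(cb, n=3):
    """Multiedit: split into n subsets, classify subset i with 1-NN in
    subset i+1, discard the misclassified examples, iterate until stable."""
    cb = list(cb)
    while True:
        subsets = [cb[i::n] for i in range(n)]          # step 1
        kept = []
        for i, subset in enumerate(subsets):
            nxt = subsets[(i + 1) % n]
            if not nxt:
                kept += subset
                continue
            kept += [ex for ex in subset                # steps 2 and 3
                     if nn_label(ex[0], nxt) == ex[1]]
        if len(kept) == len(cb):                        # step 4
            return kept
        cb = kept

# Two clean, well-separated classes: no example is ambiguous,
# so nothing is discarded.
cb = ([((0.1 * i, 0.0), 0) for i in range(6)] +
      [((5.0 + 0.1 * i, 5.0), 1) for i in range(6)])
edited = multiedit(cb, 3)
```

On noisy data the loop shrinks the codebook until the remaining examples
are consistently classified by their neighbours.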
8.1.2.2.3.0.1. (condense cb [verbose])

(knnenv.sn) 
This function applies the condense algorithm to codebook
cb and returns the condensed codebook. Here is the condense
algorithm:
 1 Create an empty codebook.
 2 Classify every example of the initial codebook using the nearest
neighbour rule in the new codebook. Insert every misclassified example
into the new codebook.
 3 Iterate steps 1 and 2 until no editing is done.
When the flag verbose is t , a message
is displayed before each iteration. This message indicates the current
size of the condensed codebook.
? (setq cirisa (condense irisa))
= ::CODEBOOK:17x4
? (perfknn irist cirisa 1)
= 98.6667
When the data sets are noisy, you should use multiedit before condense.
The condense algorithm would indeed keep a lot of noisy patterns. The
situation is less obvious on clean data sets. For example, the 17
examples selected by the condense algorithm achieve the full performance
of the iris training set.
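The condense loop can be sketched in Python (an illustrative
transcription, not SN code; since the nearest neighbour of an example is
undefined on an empty codebook, this sketch assumes the new codebook is
seeded with the first example):

```python
def nn_label(x, refs):
    """Label of the nearest reference to x (squared Euclidean distance)."""
    return min(refs, key=lambda r: sum((a - b) ** 2 for a, b in zip(x, r[0])))[1]

def condense(cb):
    """Condense: insert every example misclassified by the 1-NN rule on
    the new codebook, and iterate until no insertion is done."""
    new = [cb[0]]            # seed: 1-NN on an empty codebook is undefined
    changed = True
    while changed:
        changed = False
        for ex in cb:
            if nn_label(ex[0], new) != ex[1]:
                new.append(ex)
                changed = True
    return new

# Two clean clusters collapse to one retained example per class.
cb = ([((0.1 * i, 0.0), 0) for i in range(6)] +
      [((5.0 + 0.1 * i, 5.0), 1) for i in range(6)])
condensed = condense(cb)
```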
8.1.2.2.3.1. Codebook Clustering Algorithms.


Clustering algorithms compute a codebook whose prototypes
xi satisfy an optimality criterion. The main clustering algorithm
in SN2.8 is the kMeans algorithm, which minimizes the quantization
error:
Q = Expectation( Min_i (x - xi)^2 )
This quantity measures the average distance between a pattern and the
closest prototype in the reference codebook.
8.1.2.2.3.1.0. (codebookdistorsion cbdata cbref)


This function returns the average distance between a pattern of codebook
cbdata and the closest pattern in codebook
cbref . This is exactly the quantity Q
described above.
8.1.2.2.3.1.1. (kmeans cbdata k)


This function returns a codebook of k
prototypes which minimizes the quantization distortion
Q computed over codebook cbdata
. This codebook is computed using the stochastic kMeans algorithm:

1 Fill the prototype codebook with the k
first patterns.
 2 Take a random pattern in the example codebook and pull the
closest pattern in the prototype codebook towards the example pattern.
 3 Repeat step 2 until the algorithm converges.
The kMeans algorithm does not depend on the labels of the example
codebook. The default labels of the k
references of the resulting codebook are those of the
k first references of cbdata
. It is thus useful to compute optimal labels using the function
codebookassignlabels .
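The stochastic kMeans loop can be sketched in Python (an illustrative
transcription, not SN code; the linearly decreasing step size and the
random seed are assumptions of this sketch, since the manual does not
specify the step schedule):

```python
import random

def kmeans(data, k, iters=2000, eps0=0.1, seed=0):
    """Stochastic kMeans: prototypes start as the k first patterns, then
    the closest prototype is pulled toward a randomly drawn pattern with
    a linearly decreasing step size."""
    rng = random.Random(seed)
    protos = [list(x) for x in data[:k]]        # step 1: k first patterns
    for t in range(iters):
        x = rng.choice(data)                    # step 2: random pattern
        eps = eps0 * (1.0 - t / iters)          # decreasing step size
        i = min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, protos[j])))
        # pull the closest prototype toward the example
        protos[i] = [p + eps * (a - p) for p, a in zip(protos[i], x)]
    return protos

# Two tight clusters: the prototypes drift toward the cluster centers.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
protos = kmeans(data, 2)
```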
8.1.2.2.3.1.2. (codebookassignlabels cbdata cbref [k])

(knnenv.sn) 
This function assigns labels to the examples of codebook
cbref in order to minimize the misclassification rate of the
kNN algorithm applied to the examples of codebook cbdata. This is
achieved by a voting procedure. The default value for
k is 1.
? (loadiris)
= ::CODEBOOK:150x4
? (setq kirisa (kmeans irisa 12))
= ::CODEBOOK:12x4
? (codebookassignlabels irisa kirisa 1)
= ::CODEBOOK:12x4
? (perfknn irist kirisa 1)
= 96
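The voting procedure can be sketched in Python for the k = 1 case (an
illustrative transcription, not SN code: each reference receives the
majority label of the data points for which it is the nearest
reference):

```python
def assign_labels(data, refs, n_classes):
    """1-NN voting: give each reference vector the majority label of the
    data points that it attracts. A reference that attracts no point
    defaults to class 0 in this sketch."""
    votes = [[0] * n_classes for _ in refs]
    for x, lab in data:
        i = min(range(len(refs)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, refs[j])))
        votes[i][lab] += 1
    return [max(range(n_classes), key=lambda c: v[c]) for v in votes]

data = [((0.0, 0.0), 0), ((0.1, 0.1), 0), ((5.0, 5.0), 1), ((5.1, 5.1), 1)]
refs = [(0.05, 0.05), (5.05, 5.05)]
labels = assign_labels(data, refs, 2)
```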
8.1.2.2.3.1.3. (kmeansclass cbdata kpc [classes])

(knnenv.sn) 
There is however a much better alternative for optimizing the pattern
recognition performance. Function kmeansclass
computes a codebook containing kpc
prototypes per class. These prototypes are computed by running algorithm
kMeans on codebook cbdata within each
class.
? (loadiris)
= ::CODEBOOK:150x4
? (setq kirisa (kmeansclass irisa 2))
= ::CODEBOOK:6x4
? (perfknn irist kirisa 1)
= 96.6667
The optional argument classes restricts this algorithm to the examples
whose labels belong to a given list. If this argument is omitted, all
the examples are considered. Important note: you should always test this
astounding ``supervised kMeans'' algorithm.
8.1.2.2.3.2. Codebook Iterative Algorithms.


Iterative algorithms try to improve the patterns of a small codebook in
order to decrease the misclassification rate. The main iterative
algorithms are the Learning Vector Quantization (LVQ) algorithms
introduced by Kohonen in the middle of the eighties.
The LVQ algorithms always require a good initial codebook of prototypes
returned by the kMeans or supervised kMeans algorithms. When a large
number of clean examples are available, a few iterations of the LVQ
algorithms improve the pattern recognition performance.
8.1.2.2.3.2.0. Initializing a Codebook.


8.1.2.2.3.2.0.0. (initlvq cbdata k)

(knnenv.sn) 
This function returns a good initial codebook with
k examples for the LVQ algorithms. This codebook is computed
by running the kMeans algorithm (see function
kmeans ) on the example codebook
cbdata . The labels are assigned using a voting procedure
(see function codebookassignlabels
).
8.1.2.2.3.2.0.1. (initlvqclass cbdata kpc)

(knnenv.sn) 
This function returns an even better initial codebook with
kpc examples per class by running the supervised kMeans
algorithm (see function kmeansclass
above) on the example codebook cbdata
. This initial codebook is so good that LVQ algorithms seldom improve
the misclassification rate...
8.1.2.2.3.2.1. Codebook Learning Vector Quantization Algorithms.


8.1.2.2.3.2.1.0. (learnlvq version cbdata cbref niter [ a0 [ win ]])


This single function implements the three major LVQ algorithms for
improving the prototype codebook cbref
using the example codebook cbdata .
Argument niter specifies the number of
iterations of the LVQ algorithm to perform. Argument version must be 1,
2 or 3; it selects the LVQ1, LVQ2 or LVQ3 algorithm.
Algorithm LVQ1 iterates niter times
over the examples of codebook cbdata .
For each example x , it retrieves the
closest pattern x* in codebook
cbref .
 If the prototype and example labels are
equal, the prototype pattern x* is
moved towards the example pattern x by
the quantity epsi * (x - x*) .
 If the labels are different, the prototype pattern
x* is pushed away from the example pattern
x by the same amount.
For each example x , algorithm LVQ2
retrieves the two closest patterns x*
and x** in the prototype codebook.

If the label of prototype x* is
different from the label of example x
,
 if the label of prototype x** is
equal to the label of example x ,
 and if the projection of the example x
on the line [x*,x**] falls in a window
of size win * |x** - x*| located in
the middle of segment [x*,x**] ,
then the closest prototype pattern x*
is pushed away from the example pattern x
by the quantity epsi * (x - x*) and the
second closest prototype pattern x**
is pulled towards the example pattern by the quantity
epsi * (x - x**) .
For each example x , algorithm LVQ3
retrieves the closest pattern x* and
the closest pattern x+ with the right
label in the prototype codebook.
 If the label of the closest
prototype x* is different from the
label of example x ,
 and if the projection of the example x
on the line [x*,x+] falls in a window
of size win * |x+ - x*| located in the
middle of segment [x*,x+] ,
 then the prototype pattern x* is
pushed away from the example pattern x
by the quantity epsi * (x - x*) and
the prototype pattern x+ is pulled towards
the example pattern by the quantity epsi * (x - x+)
.
It has been proven that algorithm LVQ3 minimizes the misclassification
error on the example codebook when the window size win decreases slowly
towards 0 .
Argument a0 indicates the initial
value of the step size. The step size decreases linearly from value
a0 to 0 . The initial step
size defaults to 0.1 , which is usually a
good value for all problems.
Argument win specifies the ``window'' size for the LVQ2 and LVQ3
algorithms. The default window size 0.01
however is too small to allow significant modification of the prototype
patterns. You should either select a larger window size or use the LVQ3
algorithm for fine tuning the patterns.
? (loadiris)
= ::CODEBOOK:150x4
? (setq lirisa (initlvqclass irisa 2))
= ::CODEBOOK:6x4
? (perfknn irist lirisa)
= 96.6667
? (learnlvq 3 irisa lirisa 20 0.1 0.1)
= ()
? (perfknn irist lirisa)
= 97.3333
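The LVQ1 update rule above can be sketched in Python (an illustrative
transcription, not SN code; the repulsion branch fires when the winning
prototype carries the wrong label):

```python
def lvq1(data, protos, niter=20, eps0=0.1):
    """data: list of (vector, label); protos: list of [vector, label]
    pairs with mutable vectors. Plain LVQ1 with a linearly decreasing
    step size: attract the winner on a correct label, repel it otherwise."""
    total = niter * len(data)
    t = 0
    for _ in range(niter):
        for x, lab in data:
            eps = eps0 * (1.0 - t / total)
            t += 1
            w, wlab = min(protos,
                          key=lambda p: sum((a - b) ** 2
                                            for a, b in zip(x, p[0])))
            sign = 1.0 if wlab == lab else -1.0   # attract or repel
            for i in range(len(w)):
                w[i] += sign * eps * (x[i] - w[i])
    return protos

# One well-placed prototype per class: both are attracted toward
# their cluster means.
data = [((0.0, 0.0), 0), ((0.2, 0.2), 0), ((5.0, 5.0), 1), ((5.2, 5.2), 1)]
protos = [[[0.5, 0.5], 0], [[4.5, 4.5], 1]]
lvq1(data, protos)
```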
8.1.2.2.4. Radial Basis Functions in Knnenv.


8.1.2.2.4.0. Overview of Radial Basis Functions in Knnenv.


Radial Basis Function networks have been shown to be quite efficient for
multivariate regression and classification.
They were introduced by Broomhead and Lowe in 1988 and popularized
later by Moody.
A RBF net is made of three layers:
 an input layer,
 a hidden layer made of k
Euclidean units. These units compute a non linear function (usually a
bell shaped function) of the distances between the input
x and the weights w :
fk(x) = g( (x - wk)^2 / s^2 )
 an output layer which computes linear combinations of the hidden
unit activities.
8.1.2.2.4.1. Implementation of Radial Basis Functions as a Hybrid Algorithm.


This chapter presents an example of neural network algorithm using both
codebooks and standard networks.
In order to take advantage of the capabilities of both the Netenv and
Knnenv libraries, we have designed a hybrid system. The computation of
the distances (x - wi)^2 is
implemented by a codebook. The bell shaped function and the output units
are implemented as a multilayer network.
The glue between both libraries is implemented by redefining locally the
functions presentpattern and
presentdesired .
These functions extract an example from a codebook
cbdata , compute the distances to the centers of the
euclidian units (see function codebookdistances
) and enter these distances as input to the multilayer network.
The first layer of the multilayer network multiplies the distances by
the inverse squared widths 1/s^2 and
applies the bell shaped function. The second layer of the multilayer
network computes the linear outputs.
The corresponding lisp code is located in file
"sn2.8/lib/knnenv.sn" .
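The hybrid forward pass can be sketched in Python (a hypothetical
transcription, not SN code: rbf_forward combines the codebook distance
computation with the two network layers described above, using a
Gaussian as the bell shaped function):

```python
import math

def rbf_forward(x, centers, inv_sq_widths, weights, biases):
    """Hybrid RBF forward pass: the 'codebook' part computes squared
    distances to the centers; the 'network' part scales them by 1/s^2,
    applies a bell-shaped (Gaussian) function, then a linear layer."""
    d2 = [sum((a - c) ** 2 for a, c in zip(x, ck)) for ck in centers]
    hidden = [math.exp(-inv * d) for inv, d in zip(inv_sq_widths, d2)]
    return [b + sum(w * h for w, h in zip(row, hidden))
            for row, b in zip(weights, biases)]

# One center per class; identity output weights expose the hidden
# activities directly.
centers = [(0.0, 0.0), (5.0, 5.0)]
inv_sq_widths = [1.0, 1.0]
weights = [[1.0, 0.0], [0.0, 1.0]]      # one linear output per class
biases = [0.0, 0.0]
out = rbf_forward((0.0, 0.0), centers, inv_sq_widths, weights, biases)
```

An input sitting on a center saturates that center's bell function and
leaves the other output near zero.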
8.1.2.2.4.1.0. (initrbf cbdata kpc)

(knnenv.sn) 
This function computes an initial codebook containing kpc prototypes per
class by running the supervised kMeans algorithm on the example set
cbdata. Function initrbf then builds
the multilayer network using the Netenv functions and initializes the
weights and the learning rates.
In order to implement the connection between the functions of Knnenv and
the functions of NetEnv, this function generates function
rbfpresentpattern and
rbfpresentdesired and variables
rbfcodebook and rbfclasses
.
8.1.2.2.4.1.1. rbfcodebook

[VAR] (knnenv.sn) 
The global variable rbfcodebook
contains the codebook containing the weights of the first layer of the
RBF network.
It is created by the function initrbf
for implementing the glue between the functions of Knnenv and the
functions of NetEnv. It is automatically handled by
learnrbf and perfrbf .
8.1.2.2.4.1.2. rbfclasses

[VAR] (knnenv.sn) 
The global variable rbfclasses
contains the list of labels present in the example codebook. The
multilayer network contains one output per class in this list.
It is created by the function initrbf
for implementing the glue between the functions of Knnenv and the
functions of Netenv. It is automatically handled by
learnrbf and perfrbf .
8.1.2.2.4.1.3. (rbfpresentpattern layer pnum)

(knnenv.sn) 
Function rbfpresentpattern extracts
the pnumth pattern of codebook cbdata
, computes the distances to the centers of the Euclidean units and
enters these distances into the states of the input layer
layer .
It is created by the function initrbf
for implementing the glue between the functions of Knnenv and the
functions of Netenv. It is automatically handled by
learnrbf and perfrbf .
8.1.2.2.4.1.4. (rbfpresentdesired layer pnum)

(knnenv.sn) 
Function rbfpresentdesired extracts
the pnumth label of codebook
cbdata and stores the corresponding desired output vector
into the states of the desired layer layer
.
It is created by the function initrbf
for implementing the glue between the functions of Knnenv and the
functions of Netenv. It is automatically handled by
learnrbf and perfrbf .
8.1.2.2.4.1.5. (learnrbf cbdata niter [epssigma] [epsclass])

(knnenv.sn) 
This function performs niter
backpropagation epochs on the multilayer network using the examples
stored in codebook cbdata . The
optional arguments epssigma and
epsclass let the user enter learning rates for the two layers
of the multilayer network.
8.1.2.2.4.1.6. (perfrbf cbtest)

(knnenv.sn) 
This function tests the current RBF network against the examples of
codebook cbtest . It prints the mean
squared error and the misclassification rate.
? (loadiris)
= ::CODEBOOK:150x4
? (initrbf irisa 4)
= t
? (learnrbf irisa 20)
= t
? (perfrbf irist)

patterns {0-149}, age=2000, error=0.0895082, performance=92
= 92
8.1.2.2.5. Topological Maps in Knnenv.


A generic Topological Maps package has been added to the Knnenv library.
The following sections provide:
 the basic principles of this
implementation of TMaps,
 the functions for running topological map algorithms,
 a simple example, discussed in Section 4.3.
More details on the Topological Map algorithms are given by the
following texts:
 Kohonen T.,
Self-Organization and Associative Memory, Springer Series in
Information Sciences, vol. 8, Springer-Verlag, (1984).
 Kohonen T., The neural phonetic typewriter, IEEE Computer, March
(1988).
8.1.2.2.5.0. Overview of Topological Maps in Knnenv.


Like the other algorithms of the Knnenv library, both the databases and
the Topological Maps reference points are stored as codebooks. There are
mainly three differences between the kMeans and the Topological Map
algorithms:
 The Topological Map algorithm introduces a
topology on the reference nodes. Reference nodes might be organized on a
line, on a grid, on a ring or on whatever structure pleases us.
 When a reference point is updated, all neighbouring reference
points in the structure are also updated with a similar term.
 Both the learning rate and the neighborhood size decrease during
the training procedure.
A single training function controls all these parameters. In fact, this
training function takes various hook functions as arguments. These hook
functions are called for finding the neighboring references, for
scheduling the decrease of the learning rate and the neighborhood size,
and for producing various displays. Several predefined hook functions
are available.
Topological Maps thus interact with all features of Knnenv. For
instance, function codebookassignlabels
assigns classes to the references of a Topological Map. You can then use
functions knn and
perfknn for a pattern recognition task. Although using a
Topological Map for pattern recognition is a common practice, you should
bear in mind that the Topological Map is an unsupervised algorithm. It is
usually less efficient for pattern recognition than a supervised
algorithm like kmeansclass or
learnlvq .
8.1.2.2.5.1. Programming Topological Maps in Knnenv.


Programming Topological Maps requires four steps:
 Obtain a
codebook for the data points,
 Create and initialize a codebook for the reference points,
 Define the characteristics of the map using predefined or custom
hook functions.
 Train the map using the Topological Map learning function.
8.1.2.2.5.1.0. Creating and Initializing the Codebooks for 2D Topological Maps.


The Topological Map databases are stored in a codebook. This codebook
might be created with newcodebook or
loaded from a file with loadcodebook
.
8.1.2.2.5.1.0.0. (bidimquery string [xmin ymin xmax ymax])

(knnenv.sn) 
This function displays string string
on top of the current window. It builds a codebook by gathering the
coordinates of the mouse clicks. These coordinates are scaled to fit in
range [xmin,xmax]x[ymin,ymax] . If the
range is not specified, the coordinates are scaled between
-1 and +1 . This function
returns the newly created codebook when the user clicks in the title
string.
The Topological Maps reference points are also stored in a codebook.
Such a codebook is easily created with the function
newcodebook and might be initialized randomly. It is however
more efficient to initialize the reference points using predefined
patterns. The following functions let you initialize a codebook using
standard patterns defined in a bidimensional space.
These functions only set the first two coordinates of the reference
points:
8.1.2.2.5.1.0.1. (bidiminitring cbd nrefs)

(knnenv.sn) 
Returns a codebook of nrefs references located on a circle whose radius
and center are the standard deviation and mean of the points of the
training codebook cbd .
8.1.2.2.5.1.0.2. (bidiminit2d cbd n m)

(knnenv.sn) 
Returns a codebook of n times
m references located on a grid of n lines and
m columns whose extent is determined by the statistical
properties of the points of codebook cbd
.
8.1.2.2.5.1.1. The Topological Maps Learning Function in Knnenv.


This single function trains all kinds of topological maps. The
characteristics of the map are specified by three hook functions
described below.
8.1.2.2.5.1.1.0. (learntmap cbdata cbref niter topofunc epsifunc distfunc [displayfunc])

(knnenv.sn) 
This function performs niter
iterations of the Topological Map training algorithm. Data points are
accessed in codebook cbdata . The
Topological Map reference points are stored in codebook
cbref . Arguments topofunc
, epsifunc and
distfunc are hook functions which specify the
characteristics of the map.
 (topofunc n mat)
The topology is encoded by function topofunc
.
When this function is called, integer n
is the subscript of the closest reference point to the current pattern
(in the input space). Vector mat
contains one component per reference node. This function must store into
matrix mat the distances (as defined
by the map topology) between the n th
reference node and every node of the map.
 (epsifunc coeff)
The learning rate is scheduled by function
epsifunc .
This function is called once before each iteration. Argument
coeff is a number which decreases linearly from
1 to 0 with the iteration
number. This function must return the current learning rate.
 (distfunc d coeff)
The neighborhood size is scheduled by function
distfunc .
This function is called before updating a reference node located at
distance d (as defined by the map
topology) of the closest node to the current pattern (in the input
space). Argument coeff is a number
which decreases linearly from 1 to
0 with the iteration number.
If this function returns () , the
reference point is not changed. If this function returns a number, the
learning rate is multiplied by this number before updating the reference
point.
The optional argument displayfunc
specifies a function called after each iteration for performing various
display tasks. This function takes three arguments: the data codebook
cbdata , the reference codebook cbref
and the iteration number.
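The sequencing performed by learntmap can be sketched as the following
Python loop (an illustrative transcription under the hook conventions
stated above; None plays the role of SN's () return value):

```python
def learn_tmap(data, refs, niter, topofunc, epsifunc, distfunc,
               displayfunc=None):
    """Generic Topological Map training loop driven by three hooks:
    topofunc(n, mat) fills mat with map distances to node n,
    epsifunc(coeff) returns the step size, distfunc(d, coeff) scales it
    per neighbour (None means: do not update this reference)."""
    for it in range(niter):
        coeff = 1.0 - it / niter          # decreases linearly from 1 to 0
        eps = epsifunc(coeff)
        for x in data:
            # closest reference point in the input space
            n = min(range(len(refs)),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(x, refs[j])))
            mat = [0.0] * len(refs)
            topofunc(n, mat)              # distances defined by the topology
            for j, d in enumerate(mat):
                scale = distfunc(d, coeff)
                if scale is not None:
                    refs[j] = [r + scale * eps * (a - r)
                               for r, a in zip(refs[j], x)]
        if displayfunc:
            displayfunc(data, refs, it)
    return refs

def topo1d(n, mat):
    """Monodimensional topology: map distance is the node index gap."""
    for j in range(len(mat)):
        mat[j] = abs(j - n)

# A 3-node 1-d map trained on points spread along a line: the node
# ordering is preserved while the map stretches over the data.
data = [(0.0,), (1.0,), (2.0,)]
refs = [[0.5], [1.0], [1.5]]
learn_tmap(data, refs, 50, topo1d,
           lambda coeff: 0.1 * coeff,                    # decreasing rate
           lambda d, coeff: 1.0 if d <= 1 else None)     # neighbourhood 1
```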
8.1.2.2.5.1.2. Topological Maps Predefined Hook Functions in Knnenv.


Standard hook functions are provided. The following calls generate a
standard hook function for various topologies, various scheduling
strategies and various simple displays.
8.1.2.2.5.1.2.0. Topology Hooks in Knnenv.


8.1.2.2.5.1.2.0.0. (tmaptopo1d n)

(knnenv.sn) 
Returns a topology hook function topofunc
which defines a monodimensional topological map. Argument
n indicates the number of nodes in the map. Such topological
maps are better initialized using function
bidiminitring .
8.1.2.2.5.1.2.0.1. (tmaptoporing n)

(knnenv.sn) 
Returns a topology hook function topofunc
which defines a monodimensional ringshaped topological map. Argument
n indicates the number of nodes in the map. Such topological
maps are better initialized using function
bidiminitring .
8.1.2.2.5.1.2.0.2. (tmaptopo2d n m)

(knnenv.sn) 
Returns a topology hook function topofunc
which defines a bidimensional topological map organized as a grid with
n lines and m columns. Such
topological maps are better initialized using function
bidiminit2d .
8.1.2.2.5.1.2.0.3. (tmaptopo3d n m p)

(knnenv.sn) 
Returns a topology hook function topofunc
which defines a tridimensional topological map organised as a grid with
n planes, m lines and
p columns.
8.1.2.2.5.1.2.0.4. (tmaptopokd n1...nk)

(knnenv.sn) 
Returns a topology hook function topofunc
which defines a kdimensional topological map organised as a grid whose
sizes are given by n1 to
nk .
8.1.2.2.5.1.2.1. Topological Maps Learning Rate in Knnenv.


8.1.2.2.5.1.2.1.0. (tmapepsiconst ieps)

(knnenv.sn) 
Returns a learning rate scheduling function
epsifunc which sets up a constant learning rate whose value
is ieps . When the learning rate is
equal to 1 , the algorithm exactly
moves the closest reference point over the current data point. A value
of 0.1 is often a good choice for
ieps .
8.1.2.2.5.1.2.1.1. (tmapepsilin ieps)

(knnenv.sn) 
Returns a learning rate scheduling function
epsifunc which sets up a linearly decreasing learning rate
whose initial value is ieps .
When the learning rate is equal to 1 ,
the algorithm exactly moves the closest reference point over the current
data point. A value of 0.3 is often a
good choice for ieps .
8.1.2.2.5.1.2.2. Neighborhood Size.


8.1.2.2.5.1.2.2.0. (tmapdstconst idist)


Returns a neighborhood size scheduling function
distfunc which selects a constant neighborhood size. This
scheduling function returns 1 for all
reference nodes whose distance to the closest node is smaller than
idist .
8.1.2.2.5.1.2.2.1. (tmapdstlin idist)


Returns a neighborhood size scheduling function
distfunc which selects a linearly decreasing neighborhood
size. This scheduling function returns 1
for all reference nodes whose distance to the closest node is smaller
than idist * coeff .
8.1.2.2.5.1.2.3. Topological Maps Display Hook in Knnenv.


Several simple display functions are provided. These functions display a
projection of both the data points and the reference points onto the
first two axes:
8.1.2.2.5.1.2.3.0. (tmapdisplaybidim1d [xmin ymin xmax ymax])

(knnenv.sn) 
Returns a display hook function displayfunc
for displaying in the current window a monodimensional Topological Map:
Reference points are linked by a line. Scaling is computed to
accommodate points whose first two coordinates fit in range
[xmin,xmax]x[ymin,ymax] . The default range is
[-1,+1]x[-1,+1] .
8.1.2.2.5.1.2.3.1. (tmapdisplaybidimring [xmin ymin xmax ymax])

(knnenv.sn) 
Returns a display hook function displayfunc
for displaying in the current window a monodimensional ring shaped
Topological Map. Reference points are linked by a closed line. Scaling
is computed to accommodate points whose first two coordinates fit in
range [xmin,xmax]x[ymin,ymax] . The
default range is [-1,+1]x[-1,+1] .
8.1.2.2.5.1.2.3.2. (tmapdisplaybidim2d n m [xmin ymin xmax ymax])


Returns a display hook function displayfunc
for displaying in the current window a bidimensional Topological Map.
The reference points are linked by a grid of n
lines and m columns. Scaling is computed to accommodate points whose
first two coordinates fit in range
[xmin,xmax]x[ymin,ymax] . The default range is
[-1,+1]x[-1,+1] .
8.1.2.2.5.1.3. How to Write Efficient Topology Hook Functions in Knnenv?


You must define a new topology hook function whenever you want to run a
Topological Map with a new topology. There is clearly a right way and a
wrong way to program such a hook function.
 The wrong way
consists in computing the distances whenever the hook function is
called. Such a hook function might dominate the running time.
 The right way takes advantage of the static nature of the Topological
Map structure. It consists in computing all the distances at once. The
hook function just copies the distances of interest into the destination
matrix.
Assume that we use a Topological Map organized as a grid of three lines
and four columns. The distance between two nodes of the grid will be the
number of vertical or horizontal segments between these nodes.
N0 N1 N2 N3
N4 N5 N6 N7
N8 N9 N10 N11
The topology hook function topofunc
must compute a matrix whose elements are the distances from a given node
to any other node. For instance, the distance matrices for nodes N5 and
N8 are:
N5:            N8:
2 1 2 3        2 3 4 5
1 0 1 2        1 2 3 4
2 1 2 3        0 1 2 3
The function tmaptopo2d first
computes a matrix with six lines and eight columns which contains the
distances to the central element of the matrix, located at (3,4):
5 4 3 2 3 4 5 6
4 3 2 1 2 3 4 5
3 2 1 0 1 2 3 4
4 3 2 1 2 3 4 5
5 4 3 2 3 4 5 6
6 5 4 3 4 5 6 7
It then returns a function which copies a 3x4 submatrix into the
destination matrix. This submatrix is positioned so that element (3,4)
of the big matrix falls at the location of the closest node. The copied
windows for nodes N5 and N8 are marked below:
N5:                    N8:
5 4 3  2 3 4 5  6      5 4 3 [2 3 4 5] 6
4 3 [2 1 2 3] 4 5      4 3 2 [1 2 3 4] 5
3 2 [1 0 1 2] 3 4      3 2 1 [0 1 2 3] 4
4 3 [2 1 2 3] 4 5      4 3 2  1 2 3 4  5
5 4 3  2 3 4 5  6      5 4 3  2 3 4 5  6
6 5 4  3 4 5 6  7      6 5 4  3 4 5 6  7
The hook function thus never computes a distance: it merely copies the
right section of the big matrix into the destination matrix. This copy
is achieved by an efficient C routine. The user is encouraged to design
custom topologies using the same efficient and flexible method.
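Following the 3x4 example above, the precomputation trick can be
sketched in Python (an illustrative reimplementation of the idea behind
tmaptopo2d, not the SN code):

```python
def make_topo_2d(n, m):
    """Topology hook for an n x m grid. All Manhattan distances are
    precomputed in one (2n) x (2m) matrix whose central element, at
    position (n-1, m-1), is 0; the hook only copies an n x m window."""
    big = [[abs(i - (n - 1)) + abs(j - (m - 1)) for j in range(2 * m)]
           for i in range(2 * n)]
    def topo(c, mat):
        r, q = divmod(c, m)             # grid position of the closest node
        for i in range(n):
            row = big[n - 1 - r + i]    # shift the window so that the
            for j in range(m):          # central 0 lands on node c
                mat[i * m + j] = row[m - 1 - q + j]
    return topo

topo = make_topo_2d(3, 4)
mat = [0] * 12
topo(5, mat)        # node N5 sits at row 1, column 1 of the 3 x 4 grid
```

The filled vector reproduces, row by row, the N5 distance matrix shown
above.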
8.1.2.2.5.2. A Complete Example of Topological Map in Knnenv


The following example is provided online:
8.1.2.2.5.2.0. (tspdemo)

(knnenv.sn) 
This function implements a ring shaped Topological Map for solving the
Travelling Salesman Problem. It creates a window in which you can
acquire cities with mouse clicks. The learning process begins and stops
automatically. Here is the text of this program:
(de tspdemo (&optional niter (epsi 0.33))
  ;; Create window,
  ;; Acquire cities,
  ;; Initialize the topological map
  (let ((window ()))
    (newwindow)
    (let* ((cbdata (bidimquery "Enter cities."))
           (ncities (car (bound (eval cbdata))))
           (nrefs (2* ncities))
           (cbref (bidiminitring cbdata nrefs)) )
      ;; Compute the solution
      (learntmap cbdata cbref (or niter ncities)
                 (tmaptoporing nrefs)
                 (tmapepsiconst epsi)
                 (tmapdstlin (2/ ncities))
                 (tmapdisplaybidimring) )
      ;; Move the nodes on the closest city
      ;; Note that epsilon=1!
      (learntmap cbdata cbref 1
                 (tmaptoporing nrefs)
                 (tmapepsiconst 1)
                 (tmapdstconst 0)
                 (tmapdisplaybidimring) ) )
    window ) )
If you look at the calls to function learntmap, you see that:
 the data is acquired using bidimquery ,
 the map is initialized using bidiminitring ,
 the topology hook is returned by tmaptoporing ,
 the learning rate hook is returned by tmapepsiconst ,
 the neighborhood hook is returned by tmapdstlin ,
 the display hook is returned by tmapdisplaybidimring .
This gives you a complete example of programming topological maps.
8.1.2.2.6. Programming with Knnenv: Examples.


Examples of Knnenv programs are located in
"sn2.8/examples/Knnenv/examples.sn" . Please copy this file
into your directory. You can then launch the function go described below.
(de go ()
(printf "*********LOADING IRIS DATA SETS*********\n")
;load learning set and test set
(loadiris)
;or
;(setq iris_a (load_codebook "../examples/Knnenv/iris_train"))
;(setq iris_t (load_codebook "../examples/Knnenv/iris_test"))
(printf "\n\n*********TESTING 1NN METHOD*********\n")
;nearestneighbor performance
(printf "performance=%f\n" (perfknn irist irisa 1))
(printf "Printing confusion matrix\n")
(matrixprint (confusionknn irist irisa 1))
(printf "\n\n*********TESTING SOME PREPROCESSING*********\n")
(printf "Multiediting learning set\n")
(setq multiedited (multiedit irisa 3))
(printf "1NN performance on multiedited set =%f\n"
(perfknn irist multiedited 1) )
(printf "Condensing the multiedited set\n")
(setq condensed (condense multiedited))
(printf "1NN performance on condensed set = %f\n"
(perfknn irist condensed 1) )
(printf "Condensing without multiedit:\n")
(setq condensed (condense irisa))
(printf "1NN performance after condensing learning set = %f"
(perfknn irist condensed 1) )
(printf " \n\n********* UNSUPERVISED KMEANS*********\n")
(printf "Learning process with 9 references\n")
(setq kirisa (kmeans irisa 9))
(printf "Assign class to references\n")
(codebookassignlabels irisa kirisa)
(printf "1NN performance = %f\n" (perfknn irist kirisa 1))
(printf "Performances of a few test examples at random\n")
(for (i 0 9)
(let*((random (int (rand 0 (bound (codebookword irist) 1))) )
(answer (testknn irist random kirisa 1)) )
(printf "testing example %3d of test set = " random)
(if (= answer t) (printf "wellclassified\n")
(printf "misclassified\n") ) ) )
(printf "Have a look at the labels of the 9 references\n")
(matrixprint (codebooklabel kirisa))
(printf "\n Note that there are:\n")
(printf "\t2 references dedicated to class 0\n")
(printf "\t3 references dedicated to class 1\n")
(printf "\t4 references dedicated to class 2\n")
(printf " \n\n********* SUPERVISED KMEANS*********\n")
(printf "Learning with 3 references per class\n")
(setq kirisa (kmeansclass irisa 3))
(printf "1NN performance = %f" (perfknn irist kirisa 1))
(printf "\n\n********* LVQ1*********\n")
(printf "Learning with 3 references per class\n")
(printf "on 20 iterations\n")
(setq kirisa (initlvqclass irisa 3))
(learnlvq 1 irisa kirisa 20)
(printf "LVQ1 performance = %f" (perfknn irist kirisa 1))
(printf "\n\n********* LVQ2*********\n")
(printf "Learning with 3 references per class\n")
(setq kirisa (initlvqclass irisa 3))
(learnlvq 2 irisa kirisa 20)
(printf "LVQ2 performance = %f"(perfknn irist kirisa 1))
(printf "\n\n********* LVQ3*********\n")
(printf "Learning with 3 references per class\n")
(setq kirisa (initlvqclass irisa 3))
(learnlvq 3 irisa kirisa 20)
(printf "LVQ3 performance = %f" (perfknn irist kirisa 1))
;saving any of the above codebooks
;for example kirisa is saved in file results.cb
(savecodebook kirisa "results")
(printf "\n\n*********RADIAL BASIS FUNCTIONS*********\n")
(printf "Learning with 3 nodes per class\n")
(initrbf irisa 2)
(learnrbf irisa 20)
(printf "Performances of RBF algorithm\n")
(perfrbf irist)
;saving the RBF weights
;saving the codebook containing the nodes
(savecodebook rbfcodebook "nodes")
;saving the nodes widths and linear weights
(savenet "sigmaw")
;for loading previously saved rbf weights
(initrbf irisa 2)
(setq rbfcodebook (loadcodebook "nodes"))
(loadnet "sigmaw")
(printf "\n Remove all files built by this program\n")
(sys "rm sigmaw.wei")
(sys "rm nodes.cb")
(sys "rm results.cb")
)