The first device that used this principle was the Perceptron. This machine was constructed at the NPL in the '60s. Its basis was an imitation of the general appearance of part of the CNS. Nerves have many inputs, called dendrites, and an output called an axon. The probability that the all-or-nothing output will fire depends on the degree of excitation arriving along the dendrites. To mimic this, one element of the Perceptron was a Schmitt trigger (using two valves) whose input signal was provided by currents through variable resistors. These resistors were connected to photocells arranged in a square array. Each photocell was connected to every Schmitt trigger through the variable resistors. The firing of a particular trigger depended on the settings of the resistors and the degree of illumination of the photocells. The output was wired to a display, and the system could "recognise" different inputs. To program the device, the resistors had to be altered until the correct response was produced.
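One element of this kind can be sketched in a few lines. This is a toy illustration, not a model of the actual machine: each photocell's signal is weighted by the conductance of its variable resistor (1/R), and the trigger fires when the summed current crosses a threshold. The threshold value and signals are invented for the example.

```python
# One Perceptron element as described above: each photocell feeds the
# trigger through a variable resistor, so the trigger's input is a
# conductance-weighted sum of illumination levels. Values are illustrative.

def trigger_fires(illumination, resistances, threshold=1.0):
    """Return True if the weighted current into the trigger reaches threshold.

    illumination : photocell signal levels (arbitrary units)
    resistances  : resistor values R; the effective weight is 1/R
    """
    current = sum(p / r for p, r in zip(illumination, resistances))
    return current >= threshold

# A brightly lit cell connected through a low resistance dominates:
print(trigger_fires([1.0, 0.2, 0.0], [1.0, 10.0, 1.0]))  # True (1.02 >= 1.0)
```

Lowering a resistance raises that photocell's weight, which is exactly what "programming" the device by adjusting the resistors amounts to.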
A protocol for altering the resistors was arrived at (called annealing), which was to alter them at random until the output was correct. This took some time, and the process was called "training".
The next step was to alter the resistances automatically, with a feedback loop indicating when the correct response had been obtained.
A mathematics was developed to cope with the system. In this, a photocell Pi is connected to a trigger Tj through a resistor Rij, so the probability of firing is governed by the conductance-weighted sum of the photocell signals:

P(Tj fires) ∝ Σi Pi / Rij
It was realised that a single layer could not perform all logical operations; in particular, the exclusive-OR was impossible. So another layer was added, with a connection network carrying positive and negative weightings. Since each output had only one line, a further layer was needed to give a variable output, again with a weighted network in between, and with each output unit connected to several motor outputs. Feedback was necessary to tell the network whether the correct response had been obtained, and this had to act on the weighting network. It is not clear how this could be achieved easily.

In these systems there is a network in which the interconnections develop through learning. In the nervous system this could be mediated by the synapses, though some authors think that the learning unit is the nerve itself. In another realisation, each receptor is connected to only a limited number of triggers, as with edge and movement detectors and higher-level detectors arranged in a hierarchy, as in the CNS. To do this, we have a first layer of discriminator cells connected to a subset of the receptors, then further discriminator layers, then the motor output cells. These last are the learning cells: their outputs are connected to a subset of the motor cells, and a feedback loop runs from the reward or punishment centres to switch a learning cell to a higher or lower level of excitability if it is active when the feedback loop is triggered.
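The two-layer point can be made concrete. The sketch below hard-wires one network of the kind described, with positive and negative weights feeding a second layer; the particular weights are chosen by hand for illustration, not learned.

```python
# A two-layer threshold network computing exclusive-OR, which no single
# layer can do. Hidden units detect OR and AND; the output layer combines
# them with a positive and a negative weight.

def step(x):
    return 1 if x >= 0 else 0

def xor_net(a, b):
    h1 = step(a + b - 1)        # fires if at least one input is on (OR)
    h2 = step(a + b - 2)        # fires only if both inputs are on (AND)
    return step(h1 - 2 * h2 - 1)  # OR but not AND, i.e. exclusive-OR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

The negative weight on h2 is the essential ingredient: it lets the second layer subtract the AND case from the OR case, which a single weighted sum cannot express.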
In this model, the L cells are the basis of the memory of the forward loop, at the level of the single cell.
In the CNS there must be another loop, from the output of the L cells back to the D cells, to give rise to the phenomenon of "thought". This feedback loop contains information about the outside world and enables planning. This region is loaded from the experiences of the animal and is a table of input / output / next input. This region is much more complicated and is a linear system. In man this loop is continually cycled, giving rise to the "stream of consciousness". It is rather like a microprocessor with a fetch cycle which gets the next association, whose address is formed from the last memory "frame".
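The fetch-cycle analogy can be shown as a short loop. The association table below is an invented placeholder: each frame addresses an output and the next frame, and cycling the loop produces the stream of outputs.

```python
# A toy of the "fetch cycle" above: the table maps a memory frame to an
# output and to the next frame; the loop cycles like a processor fetching
# its next instruction. Frames and outputs are invented for illustration.

associations = {
    "see food":  ("approach", "at food"),
    "at food":   ("eat",      "satisfied"),
    "satisfied": ("rest",     "satisfied"),
}

def stream(frame, steps):
    """Cycle the loop for a fixed number of steps, collecting the outputs."""
    outputs = []
    for _ in range(steps):
        output, frame = associations[frame]
        outputs.append(output)
    return outputs

print(stream("see food", 3))  # ['approach', 'eat', 'rest']
```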
An important function of this processor is to form the logical difference between the present information and the stored information in that frame, giving the "answer" to a problem. This can be extended to the problem of finding the "next" association, and involves a comparator. In a conversation, the speech output is the difference between the input and what you know about that subject. A nerve circuit to do this could be designed.
To visualise the associative memory, one has to imagine a sheet of cells. Inputs rise up to the surface, and then a pulse train leaves that point, spreading outwards in an increasing circle. It is a finite pulse train. At other points on this sheet, other pulse trains leave as inputs arrive, and the circles of pulse trains coincide at various points. When sufficient numbers of pulses coincide, the output cell at that point fires a pulse train. The cell is then activated, and fires at a lower level of input excitation next time. These patterns of activated cells form the "memories" of the system.
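A minimal simulation of this sheet, under assumed simplifications (unit propagation speed, two sources, a three-tick pulse train, and an arbitrary grid size), finds the cells where the two spreading trains arrive during overlapping ticks:

```python
import math

# Two pulse trains leave input points on the sheet and spread outwards at
# unit speed; a cell "coincides" where both trains are passing at once.
# All constants here are illustrative choices, not from the text.

SOURCES = [((2, 5), 0), ((12, 5), 0)]   # (position, start time of the train)
TRAIN = 3                               # each train lasts 3 ticks

def arrival_window(cell, source):
    (sx, sy), t0 = source
    start = t0 + math.dist(cell, (sx, sy))
    return start, start + TRAIN

def coincident(cell):
    """True if the two pulse trains overlap in time at this cell."""
    (a0, a1), (b0, b1) = (arrival_window(cell, s) for s in SOURCES)
    return max(a0, b0) < min(a1, b1)

active = [(x, y) for x in range(15) for y in range(11) if coincident((x, y))]
```

The set of coincident cells is a band where the two distances nearly agree, midway between the sources; with thousands of simultaneous inputs, as the text goes on to say, these bands overlay into an interference pattern.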
Each pair of active centres forms a parabola of activation. This on its own is not very interesting, but a real system would have millions of inputs taken thousands at a time, forming an interference pattern. This pattern is the Fourier transform of the input array. The complete system requires another Fourier transform to put the outputs back into the direct representation; this would be done with another associative system connected to the first. A simple mirror plane is all that is required, and these are common in biological structures.

If more than a few inputs at once make the coincidence patterns, then once the cells have become activated, a subset of the input pattern will produce an output which is the same as that produced by the whole input pattern. Thus we have pattern recognition. In this system, the output array goes to a learning-cell system which produces the output. If the output is modified by a transform that forms the difference between the present information and the stored information, we have an information retrieval system that will deal with partial knowledge. If the memory also stores the previous outputs, then this difference will be the solution to the problem (a one-step action). A many-step solution will involve a cycle of input/output/input/output until the problem is solved. I have written a program which models this cyclic process to solve problems.
There are three arrays: the input string array, the output string array, and the new input string array. By matching the two input arrays, a sequence of operations that solves a problem can be found.
INPUT1 : OUTPUT : INPUT2
AAAAAA : 111111 : CCCCCC
BBBBBB : 333333 : AAAAAA
CCCCCC : 444444 : DDDDDD
Problem: BBBBBB to DDDDDD; solution: 333333, 111111, 444444.
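One plausible reconstruction of such a solver (not the author's original program) treats each table row as a transition and chains rows by breadth-first search until the goal string appears:

```python
from collections import deque

# Each row maps an input string, through an output (the action), to a new
# input string; a problem is solved by chaining rows from start to goal.

TABLE = [
    ("AAAAAA", "111111", "CCCCCC"),
    ("BBBBBB", "333333", "AAAAAA"),
    ("CCCCCC", "444444", "DDDDDD"),
]

def solve(start, goal):
    """Return the list of outputs transforming start into goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for inp, out, nxt in TABLE:
            if inp == state and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [out]))
    return None

print(solve("BBBBBB", "DDDDDD"))  # ['333333', '111111', '444444']
```

This reproduces the worked example: BBBBBB leads through 333333 to AAAAAA, through 111111 to CCCCCC, and through 444444 to DDDDDD.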
A program like that could be used to control a micromouse to solve a maze. The mouse would explore the maze to build the map and then work out the shortest route. Its outputs would be the motor outputs. These two concepts, pattern recognition and problem solving, need to be linked in a practical system: PDP (parallel distributed processing) for pattern recognition, and serial processing for planning and problem solving.
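The route-finding half of the micromouse idea can be sketched directly. The maze below is invented; after "exploration" the map is held as a grid, and breadth-first search gives the shortest route from start to goal:

```python
from collections import deque

# A toy micromouse planner: the explored maze is a grid map with walls
# marked '#'; breadth-first search finds the shortest route from S to G.

MAZE = [
    "S.#.",
    ".##.",
    "....",
    "#.#G",
]

def shortest_route(maze):
    rows, cols = len(maze), len(maze[0])
    find = lambda c: next((r, k) for r in range(rows)
                          for k in range(cols) if maze[r][k] == c)
    start, goal = find("S"), find("G")
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, k), path = queue.popleft()
        if (r, k) == goal:
            return path
        for dr, dk in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nk = r + dr, k + dk
            if (0 <= nr < rows and 0 <= nk < cols
                    and maze[nr][nk] != "#" and (nr, nk) not in seen):
                seen.add((nr, nk))
                queue.append(((nr, nk), path + [(nr, nk)]))
    return None

route = shortest_route(MAZE)  # list of (row, col) cells from S to G
```

The route would then be replayed as the motor outputs, one move per step along the path.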
I have developed an algorithm for adjusting the weighting on a single-layer perceptron which converges quickly. The weights are initialised to random values. At each trial, a positive feedback (reinforcement) reduces the amount of randomness (the amplitude), a neutral feedback makes no change to the amplitude of the randomness, and a negative feedback (punishment) resets the randomness amplitude to its initial value.
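A sketch of this rule, under one plausible reading: the feedback signal is taken to be whether a randomly perturbed candidate reduces the number of errors on a small training set (a stand-in for the reward/punishment the real device would receive), and all constants and the example task are invented for illustration.

```python
import random

# Reinforcement shrinks the perturbation amplitude, neutral feedback leaves
# it unchanged, and punishment resets it to its initial value, as described
# in the text. The mapping of feedback to error counts is an assumption.

def step_unit(weights, inputs):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0

def count_errors(weights, samples):
    return sum(step_unit(weights, x) != y for x, y in samples)

def train(samples, n, trials=5000, init_amp=1.0, seed=1):
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(n)]
    amp = init_amp
    errors = count_errors(weights, samples)
    for _ in range(trials):
        if errors == 0:
            break
        candidate = [w + rng.uniform(-amp, amp) for w in weights]
        c_errors = count_errors(candidate, samples)
        if c_errors < errors:          # reinforcement: shrink the randomness
            weights, errors = candidate, c_errors
            amp *= 0.5
        elif c_errors == errors:       # neutral: amplitude unchanged
            weights = candidate
        else:                          # punishment: reset the randomness
            amp = init_amp
    return weights

# Learn logical AND (with a bias input), which a single layer can represent.
DATA = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
w = train(DATA, 3)
```

Shrinking the amplitude on reward narrows the search around a good setting, while the reset on punishment restores wide exploration, which is what makes the scheme converge quickly.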
Here the photocells produce an analogue signal, the feedback is analogue, and the output is digital. In a multilayer system the intermediate levels give an analogue output, depending on Tj in a linear manner.