GMoretti@massey.ac.nz (Giovanni Moretti) (02/26/90)
Since I posted an article replying to the use of Neural nets to decode morse code and indicating that I had an example of a back-propagation algorithm in Turbo Pascal (tm), I've had several requests for it. Rather than reply to each individually and since I have a question relating to it, here it is. ----------------------------------------------------------------------------- AND THE QUESTION: In this 2*2 net, the cells are arranged as a 3 * 3 matrix with row 0 being used as inputs - no problem with that - simplifies programming. However, why is column zero needed - It's filled with ones (0.95) and used only in the calculation of weights. If I take out this column (change "for i:= 0 ..." to "for i:= 1 ..." in the forward and backup updating procedures, convergence suffers badly - maybe fatally, I didn't wait to find out. WHY IS COLUMN ZERO NEEDED - what's it for ??? ----------------------------------------------------------------------------- Anyway here follows the program, and a big thank you to Dave Parker for something simple for neophytes to cut their teeth on. If this program is about your level, then get hold of the accompanying article - it's got some great insights into NNs, and, wait for it - its easy to understand :-) {---------program as in Dr Dobbs' follows with slightly altered layout------} { I've altered the indentation a little and added a few blank lines and added the USES statement, I don't believe there are any other differences. You can get the original from SIMTEL20 in the DDJMAG directory. Written by Dave Parker - one of the inventors of the back-propagation algorithm. From "Programming Paradigms" by Michael Swaine, Doctor Dobbs' Journal, October 1989, p112, listing starts p146. } Program BackPropagationDemo; uses crt,dos; Const NumOfRows = 2; (* Number of rows of cells. *) NumOfCols = 2; (* Number of columns of cells. *) LearningRate = 0.25; (* Learning rate. *) Criteria = 0.005; (* Convergence criteria. *) Zero = 0.05; (* Anything below 0.05 counts as zero. *) One = 0.95; (* Anything above 0.95 counts as one. *) Type CellRecord = Record Output : Real; (* Output of the current cell. *) Error : Real; (* Error signal for the current cell. *) Weights: Array[0..NumOfCols] Of Real; (* Weights in cell. *) End; Var CellArray : Array[0..NumOfRows,0..NumOfCols] Of CellRecord; (* Cells. *) Inputs : Array[1..NumOfCols] Of Real; (* Input signals. *) DesiredOutputs: Array[1..NumOfCols] Of Real; (* Desired output signals. *) Procedure CalculateInputsAndOutputs( Iteration: Integer ); Var I: Integer; Begin (* Calculate the inputs and desired outputs for the current iteration. *) (* The inputs cycle through the 4 patterns (0.05,0.05), (0.95,0.05), *) (* (0.05,0.95), (0.95,0.95). The corresponding desired outputs are *) (* (0.05,0.05), (0.05,0.95), (0.05,0.95), (0.95,0.05). The first *) (* desired output is the logical AND of the inputs, and the second *) (* desired output is the logical XOR. *) If (Iteration Mod 2) = 1 Then Inputs[1] := One Else Inputs[1] := Zero; If (Iteration Mod 4) > 1 Then Inputs[2] := One Else Inputs[2] := Zero; If (Inputs[1] > 0.5) And (Inputs[2] > 0.5) Then DesiredOutputs[1] := One Else DesiredOutputs[1] := Zero; If (Inputs[1] > 0.5) Xor (Inputs[2] > 0.5) Then DesiredOutputs[2] := One Else DesiredOutputs[2] := Zero; End; Procedure UpdateCellOnForwardPass( Row, Column: Integer ); Var J : Integer; Sum: Real; Begin (* Calculate the output of the cell at the specified row and column. *) With CellArray[Row,Column] Do Begin Sum := 0.0; (* Clear weighted sum of inputs. *) For J := 0 To NumOfCols Do (* Form weighted sum of inputs. *) Sum := Sum + Weights[J]*CellArray[Row-1,J].Output; Output := 1.0/(1.0+Exp(-Sum)); (* Calculate output of cell. This *) (* is called a sigmoid function. *) Error := 0.0; (* Clear error for backward pass. *) End; End; Procedure UpdateCellOnBackwardPass( Row, Column: Integer ); Var J: Integer; Begin (* Calculate error signals and update weights on the backward pass. *) Begin For J := 1 To NumOfCols Do (* Back propagate the error to the cells *) CellArray[Row-1,J].Error := (* below the current cell. *) CellArray[Row-1,J].Error+Error*Output*(1.0-Output)*Weights[J]; For J := 0 To NumOfCols Do (* Update the weights in the current cell. *) Weights[J] := Weights[J] + LearningRate*Error*Output*(1.0-Output)*CellArray[Row-1,J].Output; End; End; Var I, J, K : Integer; (* I loops over rows, J loops over columns,*) (* and K loops over weights. *) ConvergedIterations: Integer; (* Network must remain converged for four *) (* iterations (one for each input pattern).*) Iteration : Integer; (* Total number of iterations so far. *) ErrorSquared : Real; (* Error squared for current iteration. *) Begin ClrScr; (* Initialize the screen. *) Writeln('Iteration Inputs Desired Outputs Actual Outputs'); Iteration := 0; (* Start at iteration 0. *) ConvergedIterations := 0; (* The network hasn't converged yet. *) For I := 1 To NumOfRows Do (* Initialize the weights to small random numbers.*) For J := 1 To NumOfCols Do For K := 0 To NumOfCols Do CellArray[I,J].Weights[K] := 0.2*Random-0.1; For I := 0 To NumOfRows Do (* Initialize outputs of dummy constant cells. *) CellArray[I,0].Output := One; Repeat CalculateInputsAndOutputs(Iteration); For J := 1 To NumOfCols Do (* Copy inputs to dummy input cells. *) CellArray[0,J].Output := Inputs[J]; For I := 1 To NumOfRows Do (* Propagate inputs forward through network. *) For J := 1 To NumOfCols Do UpdateCellOnForwardPass(I,J); For J := 1 To NumOfCols Do (* Calculate error signals. *) CellArray[NumOfRows,J].Error := DesiredOutputs[J]-CellArray[NumOfRows,J].Output; For I := NumOfRows Downto 1 Do (* Propagate errors backward through *) For J := 1 To NumOfCols Do (* network, and update weights. *) UpdateCellOnBackwardPass(I,J); ErrorSquared := 0.0; (* Clear error squared. *) For J := 1 To NumOfCols Do (* Calculate error squared. *) ErrorSquared := ErrorSquared + Sqr(CellArray[NumOfRows,J].Error); If ErrorSquared < Criteria Then (* If network has converged, increment *) ConvergedIterations := ConvergedIterations + 1 (* convergence *) Else ConvergedIterations := 0; (* count, else clear convergence count. *) If (Iteration Mod 100) < 4 Then (* Every 100 iterations, write out *) Begin (* information on the 4 patterns. *) If (Iteration Mod 100) = 0 Then GotoXY(1,2); Write(' ',Iteration:5,' '); (* Write iteration number. *) For J := 1 To NumOfCols Do (* Write out input pattern. *) Write(Inputs[J]:4:2,' '); Write(' '); For J := 1 To NumOfCols Do (* Write out desired outputs. *) Write(DesiredOutputs[J]:4:2,' '); Write(' '); For J := 1 To NumOfCols Do (* Write out actual outputs. *) Write(CellArray[NumOfRows,J].Output:4:2,' '); Writeln; End; Iteration := Iteration + 1; (* Increment iteration count *) Until (ConvergedIterations = 4) Or (Iteration = 32767); (* Stop when the network has converged on all 4 input patterns, or when*) (* we are about to get integer overflow. *) If ConvergedIterations <> 4 (* Write a final message. *) Then Writeln('Network didn''t converge') Else Writeln('Network has converged to within criteria'); End. {-----------------------------------------------------------------------------} {Now a version that I've laid out according to my own taste and hacked slightly so I could better follow what was going on: NB although the matrix is supposedly only 2 * 2, in reality because it has a zero origin, it's 3*3. Row zero is used for the inputs (ie inputs across the top, outputs from the bottom. Column zero is filled with ONES (0.95) for reasons I don't yet understand but appear, for the purposes of evaluating the new weights, to be necessary. i.e. If you change the weight updating loop to start at 1 instead of 0 (as it is now, it doesn't seem to converge (or maybe it's just very slow). The filling of the weight matrix with -99 is just so I could see whether row and column zero of the weight matrix were altered - they weren't. The LearningRate must be less than one. If it's 0.25 the program converges after approximately 12000 iterations, with the rate = 0.95 it takes around 3000 iterations. } { Written by Dave Parker - From "Programming Paradigms" by Michael Swaine Doctor Dobbs' Journal, October 1989, p112 } Program BackPropagationDemo; uses crt,dos; Const NumOfRows = 2; (* Number of rows of cells. *) NumOfCols = 2; (* Number of columns of cells. *) LearningRate = 0.95 {0.25}; (* Learning rate. *) Criteria = 0.005; (* Convergence criteria. *) Zero = 0.05; (* Anything below 0.05 counts as zero. *) One = 0.95; (* Anything above 0.95 counts as one. *) Type CellRecord = Record Output : Real; (* Output of the current cell. *) Error : Real; (* Error signal for the current cell. *) Weights: Array[0..NumOfCols] Of Real; (* Weights in cell. *) End; Var CellArray : Array[0..NumOfRows,0..NumOfCols] Of CellRecord; (* Cells. *) Inputs : Array[1..NumOfCols] Of Real; (* Input signals. *) DesiredOutputs: Array[1..NumOfCols] Of Real; (* Desired output signals. *) Procedure CalculateInputsAndOutputs( Iteration: Integer ); Var I: Integer; Begin (* Calculate the inputs and desired outputs for the current iteration. *) (* The inputs cycle through the 4 patterns (0.05,0.05), (0.95,0.05), *) (* (0.05,0.95), (0.95,0.95). The corresponding desired outputs are *) (* (0.05,0.05), (0.05,0.95), (0.05,0.95), (0.95,0.05). The first *) (* desired output is the logical AND of the inputs, and the second *) (* desired output is the logical XOR. *) If (Iteration Mod 2) = 1 Then Inputs[1] := One Else Inputs[1] := Zero; If (Iteration Mod 4) > 1 Then Inputs[2] := One Else Inputs[2] := Zero; If (Inputs[1] > 0.5) And (Inputs[2] > 0.5) Then DesiredOutputs[1] := One Else DesiredOutputs[1] := Zero; If (Inputs[1] > 0.5) Xor (Inputs[2] > 0.5) Then DesiredOutputs[2] := One Else DesiredOutputs[2] := Zero; End; Procedure UpdateCellOnForwardPass( Row, Column: Integer ); Var J : Integer; Sum: Real; Begin (* Calculate the output of the cell at the specified row and column. *) With CellArray[Row,Column] Do Begin Sum := 0.0; (* Clear weighted sum of inputs. *) For J := 0 To NumOfCols Do (* Form weighted sum of inputs. *) Sum := Sum + Weights[J]*CellArray[Row-1,J].Output; Output := 1.0/(1.0+Exp(-Sum)); (* Calculate output of cell. This *) (* is called a sigmoid function. *) Error := 0.0; (* Clear error for backward pass. *) End; End; Procedure UpdateCellOnBackwardPass( Row, Column: Integer ); Var J: Integer; Begin (* Calculate error signals and update weights on the backward pass. *) Begin For J := 1 To NumOfCols Do (* Back propagate the error to the cells *) CellArray[Row-1,J].Error := (* below the current cell. *) CellArray[Row-1,J].Error+Error*Output*(1.0-Output)*Weights[J]; For J := 0 To NumOfCols Do (* Update the weights in the current cell. *) Weights[J] := Weights[J] + LearningRate*Error*Output*(1.0-Output)*CellArray[Row-1,J].Output; End; End; Var I, J, K : Integer; (* I loops over rows, J loops over columns,*) (* and K loops over weights. *) ConvergedIterations: Integer; (* Network must remain converged for four *) (* iterations (one for each input pattern).*) Iteration : Integer; (* Total number of iterations so far. *) ErrorSquared : Real; (* Error squared for current iteration. *) Begin ClrScr; (* Initialize the screen. *) Writeln('Iteration Inputs Desired Outputs Actual Outputs'); Iteration := 0; (* Start at iteration 0. *) ConvergedIterations := 0; (* The network hasn't converged yet. *) for i:= 0 to numofrows do {Fill cell weights with odd value} for j:= 0 to numofcols do {So I can tell which ones are altered} for k:= 0 to numofcols do cellarray[i,j].weights[k]:= -99; For I := 1 To NumOfRows Do (* Initialize the weights to small random numbers.*) For J := 1 To NumOfCols Do For K := 0 To NumOfCols Do CellArray[I,J].Weights[ K] := 0.2*Random-0.1; For I := 0 To NumOfRows Do (* Initialize outputs of dummy constant cells. *) CellArray[I,0].Output := One; Repeat CalculateInputsAndOutputs(Iteration); For J := 1 To NumOfCols Do (* Copy inputs to dummy input cells. *) CellArray[0,J].Output := Inputs[J]; For I := 1 To NumOfRows Do (* Propagate inputs forward through network. *) For J := 1 To NumOfCols Do UpdateCellOnForwardPass(I,J); For J := 1 To NumOfCols Do (* Calculate error signals. *) CellArray[NumOfRows,J].Error := DesiredOutputs[J]-CellArray[NumOfRows,J].Output; For I := NumOfRows Downto 1 Do (* Propagate errors backward through *) For J := 1 To NumOfCols Do (* network, and update weights. *) UpdateCellOnBackwardPass(I,J); ErrorSquared := 0.0; (* Clear error squared. *) For J := 1 To NumOfCols Do (* Calculate error squared. *) ErrorSquared := ErrorSquared + Sqr(CellArray[NumOfRows,J].Error); If ErrorSquared < Criteria Then (* If network has converged, increment *) ConvergedIterations := ConvergedIterations + 1 (* convergence *) Else ConvergedIterations := 0; (* count, else clear convergence count. *) If (Iteration Mod 100) < 4 Then (* Every 100 iterations, write out *) Begin (* information on the 4 patterns. *) If (Iteration Mod 100) = 0 Then GotoXY(1,2); Write(' ',Iteration:5,' '); (* Write iteration number. *) For J := 1 To NumOfCols Do (* Write out input pattern. *) Write(Inputs[J]:4:2,' '); Write(' '); For J := 1 To NumOfCols Do (* Write out desired outputs. *) Write(DesiredOutputs[J]:4:2,' '); Write(' '); For J := 0 To NumOfCols Do (* Write out actual outputs. *) Write(CellArray[NumOfRows,J].Output:4:2,' '); Writeln; End; Iteration := Iteration + 1; (* Increment iteration count *) Until (ConvergedIterations = 4) Or (Iteration = 32767); (* Stop when the network has converged on all 4 input patterns, or when*) (* we are about to get integer overflow. *) Writeln('Weights'); for i:= 0 to numofrows do begin for j:= 0 to numofcols do begin for k:= 0 to numofcols do write(' ',cellarray[i,j].weights[k]:4:2); write(' '); end; writeln; end; If ConvergedIterations <> 4 (* Write a final message. *) Then Writeln('Network didn''t converge') Else Writeln('Network has converged to within criteria'); End. -- ------------------------------------------------------------------------------- | GIOVANNI MORETTI, Consultant | EMail: G.Moretti@massey.ac.nz | |Computer Centre, Massey University | Ph 64 63 69099 x8398, FAX 64 63 505607 | | Palmerston North, New Zealand | QUITTERS NEVER WIN, WINNERS NEVER QUIT | -------------------------------------------------------------------------------
dano@ssc-vax.UUCP (Dan Olson) (02/28/90)
About your question - why is column zero needed? From just glancing at the pascal program, it looks like column zero is being used as the bias or constant input, which is typically set to 1. Usually the weight between this constant input and a cell is called the "threshold" of the cell. What does the constant input does: Think of each cell as trying to optimize a linear equation over a set of inputs I, where the variables are the weights W feeding into the cell. w1*i1 + w2*i2 + ... + wn*in = 0 As the equation stands, w1 through wn can be set so that it defines any line (or hyperplane) that passes through the axis. But in many cases the best solution for the linear equation should not pass though the axis. Adding a constant term to the equation moves the line from the axis. w1*i1 + w2*i2 + ... + wn*in + C = 0 So that the constant can be adjusted, in the neural net world it is represented by a weight attached to an input which is always one, or column zero in the example pascal program. -- Dan Olson (UUCP ..!uw-beaver!ssc-vax!dano)