Fahlman@C.CS.CMU.EDU ("Scott E. Fahlman") (09/29/87)
To answer your question about Boltzmann machines:

In the original Boltzmann Machine formulation, a pattern (think of this as both inputs and outputs) is clamped into the visible units during the teaching phase; the network is allowed to free-run, with nothing clamped, during the normalization phase. The update of each weight is a function of the difference between co-occurrence statistics measured across that connection during the two phases. The result (if all goes well) is a trained network that has no concept of input and output: clamp a partial pattern into the visible units, and the network will try to complete it in a way that is consistent with the training examples. Clamp nothing, and the network should settle into states whose distribution approximates the distribution of examples in the training set.

Later, someone (Geoff Hinton, I think) realized that if the network was really being trained to produce a certain input-to-output mapping, it was wasteful of links and training effort to train the network to reproduce the distribution of input vectors; an input will always be supplied when the network is performing. If the visible units are divided into an input set and an output set, if the teaching phase is done as before, and if the inputs (only) are clamped during the normalization phase, the network will "concentrate" on learning to produce the desired outputs, given the inputs, and will not develop the capability of reproducing the input distribution.

Some papers refer to the "completion" model, others to the "Input/Output" model. The distinction is not always emphasized; the learning procedure is essentially the same in either case.

Note that, unlike the Boltzmann Machine, the back-propagation model is inherently an I/O model, though it is possible to do completion tasks with some added work. For example, one might train a backprop network to map each of a set of patterns into itself, and then feed it partial patterns at the inputs.

-- Scott Fahlman, CMU
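The two-phase weight update described above can be sketched in code. This is a hypothetical minimal version, not Fahlman's or Hinton's actual implementation: it assumes stochastic binary units, uses short Gibbs-sampling chains in place of proper simulated annealing, and all names (gibbs_sweep, estimate_stats, the toy patterns) are illustrative. The key line is the update dW = eta * (p_plus - p_minus), the difference of co-occurrence statistics between the teaching (clamped) and normalization (free-running) phases.

```python
import numpy as np

rng = np.random.default_rng(0)

N_VIS, N_HID = 4, 2           # visible + hidden units (toy sizes)
N = N_VIS + N_HID
W = np.zeros((N, N))          # symmetric weights, zero diagonal

def gibbs_sweep(s, clamped):
    """One stochastic update of every unclamped unit."""
    for i in range(N):
        if i in clamped:
            continue
        p = 1.0 / (1.0 + np.exp(-(W[i] @ s)))   # sigmoid of net input
        s[i] = 1.0 if rng.random() < p else 0.0
    return s

def estimate_stats(clamped, sweeps=50, burn_in=10):
    """Estimate co-occurrence <s_i * s_j> with the given units clamped."""
    s = rng.integers(0, 2, N).astype(float)
    for i, v in clamped.items():
        s[i] = v
    stats = np.zeros((N, N))
    for t in range(burn_in + sweeps):
        s = gibbs_sweep(s, clamped)
        if t >= burn_in:
            stats += np.outer(s, s)
    return stats / sweeps

# Teaching phase: each training pattern clamped into the visible units.
patterns = [np.array([1., 0., 1., 0.]), np.array([0., 1., 0., 1.])]
eta = 0.1
for epoch in range(5):
    p_plus = np.mean([estimate_stats(dict(enumerate(p))) for p in patterns],
                     axis=0)
    # Normalization phase: the network free-runs with nothing clamped.
    p_minus = estimate_stats({})
    dW = eta * (p_plus - p_minus)
    dW = (dW + dW.T) / 2.0        # keep weights symmetric
    np.fill_diagonal(dW, 0.0)     # no self-connections
    W += dW
```

The I/O variant in the text would change only the normalization phase: instead of estimate_stats({}) with nothing clamped, one would clamp the input subset of the visible units to each training input, so the free phase never has to reproduce the input distribution.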