Modelling the 'Homunculus'

by András Lórincz

The Neural Information Processing Group of Eötvös Loránd University, Budapest, is researching reflexive systems that are capable of collecting experiences, including reinforcement. Such a system can learn from experience and can accept directions. The envisioned system will keep an eye on itself and may seem to know what it is doing.

Generally speaking, the processing of signals that convey information can be considered as a transformation into another form that still carries all or part of the original information. The environment feeds the system with inputs and the system output represents (a part of) the environment. Whilst most models address the problem of coding inputs and forming efficient internal representations, we are more concerned with the fundamental problem of making sense of these representations. In our view, the central issue of making sense, or meaning, is to provide answers to questions like "what does it mean?" in terms of our past experiences, or "how is it related?" in terms of known facts. In other words, making sense is inherently related to declarative memory. As a consequence, the homunculus (the little strange person who sits in the 'Cartesian theater' - to use the words of Dennett) becomes a central issue. A related fallacy says that the internal representation is meaningless without an interpreter: every level of abstraction requires at least one further level to act as its interpreter. Unfortunately, the interpretation - according to the fallacy - is just a new transformation, and we are trapped in an endless regression. This problem could be more than a philosophical issue. We fear that any model of declarative memory, or of the structures that play a role in forming declarative memory, could be questioned by the kind of arguments provided by the fallacy.

In our theoretical efforts, we start by claiming that the paradox stems from vaguely described procedures of 'making sense'. The fallacy arises when we say that the internal representation should make sense. To the best of our knowledge, this formulation of the fallacy has not been questioned except in our previous work, where the fallacy was turned upside down by reversing the roles: not the internal representation but the input should make sense. The proposal is the following. The input makes sense if the same (or similar) inputs have been experienced before and if the input can be derived or regenerated by means of the internal representation. According to this approach, the internal representation interprets the input by (re)constructing it.

Human-computer interfaces are known to need 'mind-reading' capabilities, such as recognizing (i) different emotional states and (ii) the actual state of thinking, for example concentrating, being tired or being lost. We are currently incorporating our reflexive architecture into human-computer interfaces that can collect information about the 'state' of the user by measuring head and eye movements as well as facial expressions. The system will be capable of responding by animating facial expressions. The computer will use adaptive probabilistic reflexive architectures and will make use of a visual thesaurus. For example, the figure depicts one frame of a short clip expressing 'wondering'. The interacting computer will use, interpolate, morph and combine a collection of such short clips. From the point of view of the interacting computer, each of these clips is an 'action', whereas the response of the user can form the reinforcing feedback. This is the basic scheme of reinforcement-based optimization of human-computer interaction, serving the user. Note that this is the same basic scheme that we use in our everyday interactions from early childhood, when we 'live together'.
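The clip-as-action scheme can be illustrated with a minimal sketch. It is not the group's probabilistic reflexive architecture: it swaps in a plain epsilon-greedy value estimate, and the class name, the clip names other than 'wondering', the reward values and the simulated user are all hypothetical.

```python
import random


class ClipSelector:
    """Epsilon-greedy choice among facial-expression clips (illustrative only)."""

    def __init__(self, clips, epsilon=0.2, seed=0):
        self.clips = list(clips)
        self.epsilon = epsilon
        self.rng = random.Random(seed)               # seeded for repeatability
        self.values = {c: 0.0 for c in self.clips}   # estimated mean reward per clip
        self.counts = {c: 0 for c in self.clips}     # how often each clip was shown

    def choose(self):
        # Explore with probability epsilon, otherwise show the best clip so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.clips)
        return max(self.clips, key=lambda c: self.values[c])

    def reinforce(self, clip, reward):
        # Incremental update of the running mean reward for this clip.
        self.counts[clip] += 1
        self.values[clip] += (reward - self.values[clip]) / self.counts[clip]


# Hypothetical stand-in for the user's reaction to each clip.
simulated_user = {"smiling": 0.2, "nodding": 0.5, "wondering": 1.0}

selector = ClipSelector(simulated_user)
for _ in range(500):
    clip = selector.choose()                          # the computer's 'action'
    selector.reinforce(clip, simulated_user[clip])    # the user's feedback

best = max(selector.values, key=selector.values.get)
```

In a real interface the reward would have to be derived from the measured head and eye movements and facial expressions of the user, rather than looked up in a table.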

The idea behind this approach is to keep the infinite regression, make it a converging regression and execute this regression in a finite architecture. The change of roles gives rise to a reconstructing loop structure. The loop has two constituents: the top and the bottom. The top contains the internal representation, which generates the reconstructed input. The bottom computes the difference between the actual input and the reconstructed input. This difference, the reconstruction error, corrects the internal representation, and so on. This is a finite architecture with a converging, but - in principle - endless iteration. The infinite regression is thus transformed into a converging iteration, and a route is opened to reduce the fallacy to problems of stability and convergence.
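The loop can be sketched in a few lines. This is a minimal illustration and not the group's model: it assumes a linear top-down mapping with a fixed weight matrix `W`, and the function names, the step size `rate` and the stopping tolerance are assumptions introduced only for this example.

```python
def matvec(W, h):
    """Top-down pass: reconstructed input x_hat = W h."""
    return [sum(w * hj for w, hj in zip(row, h)) for row in W]


def matvec_t(W, e):
    """Bottom-up pass: project the error onto the representation (W^T e)."""
    return [sum(W[i][j] * e[i] for i in range(len(W))) for j in range(len(W[0]))]


def reconstruction_error(x, W, h):
    """Bottom of the loop: actual input minus reconstructed input."""
    return [xi - xh for xi, xh in zip(x, matvec(W, h))]


def reconstruct(x, W, rate=0.1, tol=1e-9, max_iter=10_000):
    """Iterate the loop until the correction to the representation vanishes."""
    h = [0.0] * len(W[0])                  # internal representation (top)
    steps = 0
    while steps < max_iter:
        error = reconstruction_error(x, W, h)
        correction = matvec_t(W, error)
        if sum(c * c for c in correction) ** 0.5 < tol:
            break                          # the iteration has converged
        h = [hj + rate * cj for hj, cj in zip(h, correction)]
        steps += 1
    return h, reconstruction_error(x, W, h), steps


# Hypothetical example: three input channels, two internal units.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
x = [2.0, -1.0, 1.0]                       # generated by the representation [2, -1]
h, error, steps = reconstruct(x, W)
```

In this linear sketch the stability question is explicit: the iteration converges only if `rate` is smaller than 2 divided by the largest eigenvalue of `W^T W` (here 3), which is one concrete form of the stability and convergence problem.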

The approach - in principle - leads to a reflexive system that can make inferences about the external world, about its own state and about the actual interaction between the two. Notably, there are relatively strong (mathematical and computational) constraints on how such a reconstruction network should work. Constraints emerge, for example, from optimizing information transfer between bottom and top. Such constraints severely restrict modelling.

We accepted these constraints and started to develop a model of the 'homunculus'. The uniqueness of our model is that, starting from a relatively small set of hypotheses, many structural and functional features can be derived. These features are indirect predictions of the model. Without including biological constraints beforehand, we could show the emergence of some specific low-order memory functions. The resulting model:

explains general properties of the brain, such as priming and repetition suppression
recognizes novelty before searching the database
has a correct order of learning
explains properties of certain brain diseases
provides a unified view of control and sensory processing
is built of elements that have a local Kalman-filter-like structure with proper Hebbian learning rules
encompasses reinforcement learning in a natural fashion
finds domains that are almost deterministic
boosts prediction and goal-oriented planning
has striking similarities with known brain structures
provides a view for consciousness and introspection
has a philosophical ground, which goes back to the works of Locke and Hegel.

To give an example, consider novelty detection. Several studies have shown that maximization of information transfer gives rise to sparse representations of natural stimuli. Novel information (which could be a novel natural stimulus) then gives rise to a non-sparse representation. Consequently, the distribution of neural activities at the top indicates whether the actual input is familiar or novel: the distinction between novelty and familiarity can be made immediately upon (the first) bottom-up information transfer in the loop.
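One way to make this test concrete is to score how sparse the activity vector at the top is. The sketch below is an assumption-laden illustration: the article does not name a sparseness measure, so Hoyer's measure is used as a stand-in, and the threshold of 0.6 and both activity vectors are hypothetical.

```python
import math


def hoyer_sparseness(activities):
    """Hoyer's sparseness: 1 for a single active unit, 0 for uniform activity."""
    n = len(activities)
    if n < 2:
        raise ValueError("need at least two units")
    l1 = sum(abs(a) for a in activities)
    l2 = math.sqrt(sum(a * a for a in activities))
    if l2 == 0.0:
        return 1.0                         # convention chosen here: silence is sparse
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


def is_novel(activities, threshold=0.6):
    """Flag the input as novel when the top-level code is not sparse."""
    return hoyer_sparseness(activities) < threshold


# Hypothetical activities at the top after one bottom-up pass.
familiar = [0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.1, 0.0]   # a few strongly active units
novel = [0.4, 0.5, 0.3, 0.6, 0.5, 0.4, 0.5, 0.3]      # activity spread over all units
```

With these vectors, `is_novel(familiar)` is false and `is_novel(novel)` is true. No search of stored memories is needed for the decision, which is the sense in which novelty can be recognized before the database is searched.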

Link:
http://people.inf.elte.hu/lorincz/pub.html

Please contact:
András Lórincz, Eötvös Loránd University, Budapest, Hungary
Tel: +36 1 209 0555 / 8473
E-mail: lorincz@inf.elte.hu