Visual recognition based on coding in temporal cortex: Analysis of pattern configuration and generalisation across viewing conditions without mental rotation

D. I. Perrett*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

A model of recognition is described that is based on cell properties in the ventral cortical stream of visual processing in the primate brain. At a critical intermediate stage in this system, Elaborate feature-sensitive cells respond selectively to visual features in a way that depends on size (±1 octave) and orientation (±45°) but does not depend on position within central vision (±5°). These features are simple conjunctions of 2-D elements (e.g. a horizontal dark area above a dark smoothly convex area). Such features can arise either as elements of an object's surface pattern or as 3-D component parts of the object. By requiring a combination of several such features without regard to their position within the central region of the visual image, Pattern-sensitive cells at higher levels can become selective for complex configurations that typify objects experienced in particular viewing conditions. Given that input features are specified in approximate size and orientation, initial cellular ‘representations’ of the visual appearance of object type (or object example) are also selective for orientation and size. Such representations are sensitive to object view (±40-60°) because visual features disappear as objects are rotated in perspective. Combined sensitivity to multiple 2-D features independent of their position establishes selectivity for the configuration of object parts (from one view) because rearranged configurations yield images lacking some of the features present in the normal configuration. Different neural populations appear to be tuned to particular components of the same biological object (e.g. face, eyes, hands, legs), perhaps because the independent articulation of these components gives rise to correlated activity in different sets of input visual features. Generalisation over viewing conditions for a given object can be established by hierarchically pooling outputs of view-specific cells.
Such pooling could depend on the continuity in experience across viewing conditions: different object parts are seen together and different views are seen in succession when the observer walks around the object. For any familiar object, more cells will be tuned to the configuration of the object's features present in the view(s) frequently experienced. Therefore, activity amongst the population of cells selective for the object's appearance will accumulate more slowly when the object is seen in an unusual orientation or view. This accounts for the increased time to recognise rotated views without the need to postulate mental rotation or transformations of novel views to align with neural representations of familiar views. The model is in accordance with known physiological findings and matches the behavioural performance of the mammalian visual system, which displays view, orientation and size selectivity when learning about new pattern configurations.
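
The accumulation argument in the abstract can be illustrated with a toy calculation. The sketch below is not the model from the paper: the Gaussian tuning curve, the 50° half-width (chosen from within the ±40-60° range quoted above), the cell counts, the threshold and all function names are assumptions made purely for illustration. It shows only that, if more cells are tuned to frequently seen views, summed population activity reaches a fixed threshold later for unusual views, with no rotation step involved.

```python
import math


def view_tuning(view_deg, preferred_deg, half_width_deg=50.0):
    """Response (0..1) of one hypothetical view-specific cell.

    Assumed Gaussian tuning that falls to half height at half_width_deg
    away from the cell's preferred view.
    """
    # Smallest angular difference, wrapped into [-180, 180).
    diff = (view_deg - preferred_deg + 180.0) % 360.0 - 180.0
    sigma = half_width_deg / math.sqrt(2.0 * math.log(2.0))
    return math.exp(-(diff ** 2) / (2.0 * sigma ** 2))


def population_rate(view_deg, cells_per_view):
    """Summed response of all cells; cells_per_view maps preferred view -> cell count."""
    return sum(n * view_tuning(view_deg, pref) for pref, n in cells_per_view.items())


def time_to_threshold(view_deg, cells_per_view, threshold=100.0, dt=1.0, max_steps=10000):
    """Number of time units until accumulated population activity reaches threshold."""
    rate = population_rate(view_deg, cells_per_view)
    total = 0.0
    for step in range(1, max_steps + 1):
        total += rate * dt
        if total >= threshold:
            return step * dt
    return math.inf  # threshold never reached within max_steps


# Hypothetical population: many cells tuned to the familiar view (0 degrees),
# few tuned to rarely experienced views.
cells = {0: 20, 90: 5, 180: 2}

for view in (0, 90, 180):
    print(view, time_to_threshold(view, cells))
```

Under these assumed numbers the familiar view crosses threshold first and the rarely seen view last, which is the qualitative pattern the abstract attributes to slower accumulation rather than to mental rotation.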

Original language: English
Title of host publication: Artificial Neural Networks, ICANN 1996 - 1996 International Conference, Proceedings
Editors: Christoph von der Malsburg, Jan C. Vorbruggen, Werner von Seelen, Bernhard Sendhoff
Publisher: Springer-Verlag
Pages: 1
Number of pages: 1
ISBN (Print): 3540615105, 9783540615101
Publication status: Published - 1996
Event: 1st International Conference on Artificial Neural Networks, ICANN 1996 - Bochum, Germany
Duration: 16 Jul 1996 to 19 Jul 1996

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1112
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International Conference on Artificial Neural Networks, ICANN 1996
Country/Territory: Germany
City: Bochum
Period: 16/07/96 to 19/07/96
