Vision Science




 



THEORETICAL NEUROSCIENCE:
IMAGE REPRESENTATION IN THE VISUAL CORTEX

The signal that describes the irradiance at the photoreceptors (the input vector, a) undergoes a series of transforms as it travels from the retina to the cortex. The aim of theoretical visual neuroscience is to propose computational models of this image representation and to derive them from fundamental principles such as Barlow's efficient encoding idea [Barlow01].
The current view of the (achromatic) image representation in V1 involves a linear stage and a non-linear stage (see [Simoncelli01, Simoncelli03] for a review):

Linear transform (T)

First, the input images, a, are analyzed by a set of linear sensors (the rows of the matrix T). This set of linear filters is a wavelet-like filterbank.
According to the Barlow hypothesis, this transform (i.e. the shape of the receptive fields of the linear sensors) can be obtained by looking for the independent components of natural images [Field87, Field96, Bell97]. The resulting basis functions resemble V1 receptive fields.
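The linear stage amounts to a matrix-vector product: each sensor response is the inner product of one row of T with the image. The sketch below illustrates this on a four-sample signal. The orthonormal Haar matrix is only a toy stand-in for a wavelet-like filterbank (it is an assumption, not the V1 model or the ICA basis from the cited papers), and the names `haar_matrix_4` and `apply_linear` are hypothetical.

```python
import math


def haar_matrix_4():
    """Orthonormal 4x4 Haar analysis matrix (toy stand-in for T)."""
    s = 1.0 / math.sqrt(2.0)
    return [
        [0.5, 0.5, 0.5, 0.5],    # coarse average (DC sensor)
        [0.5, 0.5, -0.5, -0.5],  # coarse-scale edge sensor
        [s, -s, 0.0, 0.0],       # fine-scale sensor, left half
        [0.0, 0.0, s, -s],       # fine-scale sensor, right half
    ]


def apply_linear(T, a):
    """Each output is the inner product of one row (receptive field) with a."""
    return [sum(t * x for t, x in zip(row, a)) for row in T]


T = haar_matrix_4()
a = [1.0, 1.0, 3.0, 3.0]   # a step edge between the two halves of the signal
w = apply_linear(T, a)     # w is [4.0, -2.0, 0.0, 0.0]
```

Only the DC sensor and the coarse edge sensor respond to the step; the fine-scale sensors see a locally flat signal and stay silent.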

Scalar weighting (F)

Second, the gain of each linear sensor depends on the spatial and spatial-frequency content of its receptive field (impulse response). Therefore, the diagonal of the diagonal matrix F is closely related to classical linear perception models such as the Contrast Sensitivity Function (CSF). According to the Barlow hypothesis, this scalar weighting may be obtained from the variance of natural signals in the transform domain, assuming optimal information transmission by the human visual system [Atick92], or optimal information allocation to encode natural signals with minimum distortion [Malo00]. Our work on this issue also includes a prescription for an easy measurement of the filter function considering its multidimensional nature [Malo94], and an expression to obtain the filter function in different transform domains [Malo97a].
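As an illustration of a CSF-like diagonal weighting, the sketch below assigns each sensor a gain according to its centre frequency. It assumes the analytic CSF approximation of Mannos and Sakrison (1974), which is not the filter function measured in [Malo94]; the centre frequencies and the helper names are hypothetical.

```python
import math


def csf_mannos_sakrison(f_cpd):
    """Band-pass CSF approximation; f_cpd is in cycles/degree."""
    return 2.6 * (0.0192 + 0.114 * f_cpd) * math.exp(-((0.114 * f_cpd) ** 1.1))


def apply_gains(gains, w):
    """Multiply by a diagonal matrix: each coefficient gets its own gain."""
    return [g * x for g, x in zip(gains, w)]


# Hypothetical centre frequencies (cycles/degree) of four sensors
freqs = [1.0, 4.0, 8.0, 32.0]
gains = [csf_mannos_sakrison(f) for f in freqs]

# Equal-amplitude coefficients end up with very different weights
weighted = apply_gains(gains, [1.0, 1.0, 1.0, 1.0])
```

With these frequencies the sensor near 8 cpd receives the largest gain and the 32 cpd sensor the smallest, which is the band-pass behaviour expected from a CSF.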

Non-linear transform (R)

Third, there is an input-dependent saturating non-linearity, R, that accounts for the psychophysical masking effect and the physiological non-linearities of V1 responses. The psychophysical experiments used to derive R assume that the response domain is perceptually Euclidean [Watson97]. The parameters of the divisive normalization model [Heeger92] may also be obtained from the Barlow hypothesis [Schwartz01]. More recently, we have found that the non-linearities may also be obtained from the data using special non-linear ICA techniques that assume no explicit functional form for the non-linearity [Malo06b]. Conversely, we have shown that the psychophysical non-linear response factorizes the PDF of natural images [Malo&Laparra10].
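A minimal sketch of a divisive normalization non-linearity may help to see how saturation and masking arise. It assumes the commonly used form in which the rectified, exponentiated coefficient is divided by a constant plus a weighted pool of neighbouring energies. The exponent, the constant `beta` and the interaction kernel `H` are hypothetical values chosen for illustration, not the psychophysically fitted parameters of the cited papers.

```python
def divisive_normalization(w, beta, H, gamma=2.0):
    """r_i = sign(w_i) * |w_i|**gamma / (beta + sum_j H[i][j] * |w_j|**gamma)."""
    energy = [abs(x) ** gamma for x in w]
    responses = []
    for i, x in enumerate(w):
        pool = sum(h * e for h, e in zip(H[i], energy))  # activity of the neighbours
        sign = 1.0 if x >= 0 else -1.0
        responses.append(sign * energy[i] / (beta + pool))
    return responses


# Hypothetical parameters for two sensors tuned to similar features
beta = 0.1
H = [[1.0, 0.5],
     [0.5, 1.0]]

# Same test stimulus on sensor 0, without and with a high-contrast mask on sensor 1
test_alone = divisive_normalization([0.2, 0.0], beta, H)[0]
test_masked = divisive_normalization([0.2, 1.0], beta, H)[0]

# A very high-contrast input saturates: the response stays below 1
saturated = divisive_normalization([10.0, 0.0], beta, H)[0]
```

The response to the same test drops when the neighbouring sensor is strongly active (masking), and the response to a large input stays bounded (saturation).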

In most cases, our contributions in this field have addressed technical questions to clarify the formulation available in the literature [Malo94, Buades95, Malo97a]. Note that this is not a trivial issue, since empirically derived models are often very poorly formulated.
This better understanding of the formulation allowed us to propose new geometric tools, such as the differential analysis of the perceptual geometry of these image representations [Malo97b, Pons99, Watson02, Epifanio03, Malo04a, Laparra10a] (see below), which are the basis of our successful contributions to image and video coders.

More recently [Malo06b], this technical background allowed us to obtain a more fundamental result: new evidence that the original Barlow efficient encoding idea accounts not only for the linear part of the model (T and F), but also for the non-linearity R.
 

PUBLICATIONS 


J. Malo & V. Laparra
Psychophysically Tuned Divisive Normalization factorizes the PDF of Natural Images
Neural Computation (2010)

Abstract

Full Text



V. Laparra and J. Malo
Masking-like Non-Linearities from Non-linear PCA
GRC: Sensory Coding and The Natural Environment (July 2008)

 

Full Text



V. Laparra and J. Malo
Color and Luminance Discrimination by Non-Linear PCA
Computational Vision and Neuroscience symposium (April 2008)

 

Full Text



J. Malo, J. Gutiérrez
V1 non-linearities emerge from local-to-global non-linear ICA 
Network: Computation in Neural Systems Vol. 17, 1, pp 85-102  (2006)

Abstract

Full Text



J. Malo, J. Gutiérrez, J. Rovira
Perturbation Analysis of the Changes in V1 Receptive Fields due to Context
Presented at the Gordon Research Conference: Sensory Coding and the Natural Environment. Oxford, UK. (2004)

Abstract

Full Text



J.Malo, F.Ferri, J.Albert, J.Soret, J.M.Artigas
The role of perceptual contrast non-linearities in image transform quantization.
Image & Vision Computing, Vol. 18, 3, pp. 233-246 (2000)

Abstract

Full Text



J.Malo, A.Pons, A.Felipe, J.M.Artigas
Characterization of the human visual system threshold performance by a weighting function in a Gabor domain.
Journal of Modern Optics. Vol. 44, 1, pp. 127-148 (1997)

Abstract

Full Text



M.J. Buades, J.M. Artigas, A. Felipe, J. Malo
A statistical explanation of the effect of luminance on photopic visual acuity of speckled images.
Journal of Optics, Vol. 26, 4, pp. 175-176 (1995)

Abstract

Full Text



J.Malo, A. Felipe, M.J. Luque, J.M. Artigas
On the intrinsic two-dimensionality of the CSF and its measurement.
Journal of Optics. Vol. 25, 3, pp 93-103 (1994)

Abstract

Full Text





 


MEASURING DISTANCES BETWEEN IMAGES

The perception of distortions added to a particular image depends on the nature of the noise and the nature of the original image. In the example below, random signals of the same energy but different frequency content are added on top of a natural image and a synthetic image with different contrast. Note that in each case the frequency content of the noise matches the frequency of a particular patch of the synthetic image.
 

[Figure panels: noise with fnoise ~ 3 cpd, 6 cpd, 12 cpd and 24 cpd]

Basic perceptual facts:

  • The visibility of the different distortions is quite different. Therefore, Euclidean norms in the spatial domain are not a good representation of the perceptual distance.

  • Frequency dependence. Distortions in the medium frequency range are more noticeable.

  • Masking.

    • Distortions on top of high contrast signals are less noticeable.

    • Distortions on top of signals with similar position and spatial frequency are less noticeable. Conversely, the smaller the overlap in space and spatial frequency, the greater the visibility.

Our contribution in this field is a Riemannian formulation of the perceptual geometry of the image space [Pons99, Epifanio03, Malo04a, Laparra10a]. Specifically, assuming a quadratic pooling of the distortion in each coefficient, the perceptual distance, d, between an original image, a, and its distorted version, a + Δa, is:

d(a, a + Δa)² = Δaᵀ · Wa(a) · Δa

where the (Riemannian) perceptual metric matrix, Wa(a), depends on the transforms applied to the image in the retina-to-cortex model described above (T, F and R).
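One way to make this dependence explicit is sketched below. It assumes that the response is r = R(F · T · a) and that distortions are measured with the Euclidean norm in the response domain, as in the psychophysical assumption mentioned earlier. The symbol ∇R (the Jacobian of R) and the intermediate variable w are notation introduced here and may differ from the cited papers.

```latex
\begin{aligned}
r &= R(w), \qquad w = F \cdot T \cdot a \\
\Delta r &\approx \nabla R(w) \cdot F \cdot T \cdot \Delta a \\
d(a, a+\Delta a)^2 &= \Delta r^{\top} \Delta r = \Delta a^{\top} \, W_a(a) \, \Delta a \\
W_a(a) &= T^{\top} F^{\top} \, \nabla R(w)^{\top} \, \nabla R(w) \, F \, T
\end{aligned}
```

Because ∇R is evaluated at the response to the original image, the metric changes from image to image, which is what makes the geometry Riemannian rather than Euclidean.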

This way of computing distances (together with an appropriate vision model for T, F and R) reproduces the basic perceptual facts described above. In the example below, the distances for the distorted images shown above (low contrast pattern, natural image, high contrast pattern) have been computed using this framework with a wavelet transform (for T), the CSF (for F) and divisive normalization (for R).
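The following toy computation sketches how such a distance behaves. It is not the model used to produce the results on this page: it assumes a four-sample signal, a Haar filterbank for T, hypothetical CSF-like gains for F and a divisive normalization with hypothetical parameters for R. The distance is evaluated directly as the Euclidean distance between responses, which agrees with the quadratic metric form for small distortions.

```python
import math

S = 1.0 / math.sqrt(2.0)
T = [[0.5, 0.5, 0.5, 0.5],
     [0.5, 0.5, -0.5, -0.5],
     [S, -S, 0.0, 0.0],
     [0.0, 0.0, S, -S]]      # toy orthonormal Haar filterbank (stand-in for T)
F = [0.2, 1.0, 0.4, 0.4]     # hypothetical CSF-like gains (diagonal of F)
BETA, GAMMA = 0.1, 2.0       # hypothetical normalization parameters
H = [[1.0 if i == j else 0.25 for j in range(4)] for i in range(4)]  # hypothetical kernel


def response(a):
    """r = R(F . T . a), with a divisive normalization as R."""
    w = [F[i] * sum(T[i][k] * a[k] for k in range(4)) for i in range(4)]
    e = [abs(x) ** GAMMA for x in w]
    return [math.copysign(e[i], w[i]) / (BETA + sum(H[i][j] * e[j] for j in range(4)))
            for i in range(4)]


def perceptual_distance(a, da):
    """Euclidean distance between the responses to a and to a + da."""
    r0 = response(a)
    r1 = response([x + y for x, y in zip(a, da)])
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(r0, r1)))


def euclidean_norm(v):
    return math.sqrt(sum(x * x for x in v))


# Two distortions with the same energy but different frequency content
da_coarse = [0.1 * t for t in T[1]]   # falls on the high-gain sensor
da_fine = [0.1 * t for t in T[2]]     # falls on a low-gain sensor

blank = [0.0, 0.0, 0.0, 0.0]
d_coarse = perceptual_distance(blank, da_coarse)
d_fine = perceptual_distance(blank, da_fine)

# The same distortion on low- and high-contrast backgrounds of the same frequency
low_contrast = [0.2 * t for t in T[1]]
high_contrast = [2.0 * t for t in T[1]]
d_low = perceptual_distance(low_contrast, da_coarse)
d_high = perceptual_distance(high_contrast, da_coarse)
```

In this sketch the two distortions have identical Euclidean norm, yet `d_coarse` is larger than `d_fine` (frequency dependence), and `d_high` is much smaller than `d_low` (masking by a high-contrast background).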

In our work for the Video Quality Experts Group (VQEG), we have applied similar ideas to video quality assessment [Watson02], analyzing which parameters of the model are most relevant to accurately reproduce the subjective opinion of observers.
This kind of image distortion metric is the foundation of our successful contributions in image and video coding.
 

PUBLICATIONS


V. Laparra, J. Muñoz and J.Malo
Divisive Normalization Image Quality Metric Revisited
Accepted in JOSA A. (2010) 

Abstract

Full Text

MATLAB code


I. Epifanio, J. Gutiérrez and J.Malo
Linear Transform for Simultaneous Diagonalization of Covariance and Perceptual Metric Matrix in Image Coding.
Pattern Recognition, Vol. 36, pp. 1799-1811 (2003)

Abstract

Full Text



A.B. Watson and J.Malo
Video Quality Measures based on the Standard Spatial Observer.
Proc. IEEE Intl. Conf. Im. Proc. Vol. 3, pp. 41-44. (2002) 

Abstract

Full Text



A. Pons, J.Malo, J.M.Artigas, P.Capilla
Image quality metric based on multidimensional contrast perception models.
Displays Journal. Vol. 20, pp. 93-110. (1999) 

Abstract

Full Text



J.Malo, A.Pons, J.M.Artigas
Subjective image fidelity metric based on bit allocation of the human visual system in the DCT domain.
Image & Vision Computing. Vol. 15, pp. 535-548 (1997)

Abstract

Full Text


SOFTWARE FOR IMAGE QUALITY ASSESSMENT
 


 

VistaQualityTools1.0: (Improved) Wavelet-based Image Quality Measure

 

 




HUMAN OPTICS CHARACTERIZATION

Back in 1994, Jesús Malo won the European Vistakon Research Award for his project on the accurate evaluation of the optical quality of disposable contact lenses. However, not being very interested in human eye optics, Jesús only worked on the first double-pass experiment, carried out in Valencia in 1995/96 [Lorente97].

The aim of the double-pass method [Santamaria87] is to obtain the Modulation Transfer Function (MTF) of the human eye from the image of a spot projected on the retina of the observer. The experimental setup is described in the figure below:

The typical outcome of the above system is shown below: (a) the image of the recorded spot, and (b) the corresponding MTF (with axes in cycles/degree).


(a)


(b)
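The step from the recorded spot to the MTF can be sketched as follows. This is a one-dimensional illustration that assumes the standard double-pass relation, in which the Fourier modulus of the recorded image equals the squared MTF of a single pass (equal entrance and exit pupils). The Gaussian profile is synthetic, not a measurement, and the function names are hypothetical.

```python
import cmath
import math


def dft_magnitude(x):
    """Modulus of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n)))
            for f in range(n)]


def mtf_from_double_pass(profile):
    """Square root of the Fourier modulus, normalized to 1 at zero frequency."""
    mag = dft_magnitude(profile)
    return [math.sqrt(m / mag[0]) for m in mag[: len(profile) // 2 + 1]]


# Synthetic double-pass profile: a Gaussian spot (hypothetical data)
n, sigma = 64, 3.0
profile = [math.exp(-0.5 * ((k - n // 2) / sigma) ** 2) for k in range(n)]

mtf = mtf_from_double_pass(profile)   # mtf[0] is 1.0 and the curve falls with frequency
```

To label the frequency axis in cycles/degree, index f corresponds to f / (n · Δθ), where Δθ is the angular size of one sample in degrees; that calibration depends on the optical setup and is not given here.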

During 1995/96 we carried out a study of the MTF of the human eye over a long period of time to quantify the standard fluctuations of the retinal image quality. We evaluated an MTF-based merit function on normal observers three times a day for a month. The standard deviation of these fluctuations (5%) can be taken as an appropriate description of the behaviour of the average viewer. We used this result to study the behaviour of a time-varying compensation element: a disposable contact lens. The eye + disposable contact lens system was studied with four types of disposable contact lenses for one month. In spite of their generally good behaviour, statistically significant differences from the standard pattern were observed. This superimposed continuous fluctuation may be due to lens-dependent processes [Lorente97].
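The kind of comparison described above can be sketched in a few lines. The readings below are invented for illustration and are not data from [Lorente97]; the two-standard-deviation band is also an assumption, not the statistical test used in the paper. Only the 5% reference value comes from the text.

```python
import statistics


def relative_std(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.mean(values)


def outside_band(values, reference_mean, rel_sd, k=2.0):
    """Indices of readings farther than k reference SDs from the reference mean."""
    limit = k * rel_sd * reference_mean
    return [i for i, v in enumerate(values) if abs(v - reference_mean) > limit]


# Hypothetical merit-function readings in arbitrary units (not measured data)
naked_eye = [1.00, 1.04, 0.97, 1.02, 0.95, 1.03, 0.99]
with_lens = [1.01, 0.98, 0.93, 0.88, 0.86]

baseline_spread = relative_std(naked_eye)
flagged = outside_band(with_lens, statistics.mean(naked_eye), rel_sd=0.05)
```

Here `flagged` lists the lens readings that fall outside the band defined by the 5% standard fluctuation, which is the sense in which a lens can depart from the standard pattern.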

Unfortunately, physiological optics is not one of the current interests of VI(S)TA. For more information about the implementation of the double-pass method and the results obtained in our university, please contact Dr. A.M. Pons or Dr. A. Lorente (Dept. d'Òptica, Universitat de València).
 

PUBLICATIONS
 


A. Lorente, A.M. Pons, J.Malo, J.M.Artigas
Standard criterion for fluctuations of the Modulation Transfer Function in the human eye
Ophthalmic and Physiological Optics. Vol. 17, 3, pp. 267-272 (1997)

Abstract

Full Text