(VI(S)TA) Visual Statistics Group

Image Processing

Software

Publicat.

Vision.Sci.

Video.Proc.

Im.Stats.

Color.Sci.

People

Links

IMAGE CODING
Transform Coding, Image Representation and Quantizer Design

Image compression systems commonly operate by transforming the input signal into a new representation whose elements are then independently quantized. The success of such a system depends on two properties of the representation. First, the coding rate is minimized only if the elements of the representation are statistically independent. Second, the perceived coding distortion is minimized only if the errors in a reconstructed image arising from quantization of the different elements of the representation are perceptually independent.

According to the Barlow hypothesis, the perceptual representation of the image should also be statistically efficient. The match with natural image statistics and the fact that most of the image coding applications are intended to be judged by a human observer have motivated the use of human vision models and perceptual metrics to inspire the image representation in transform coders as well as the bit allocation in the selected representation (e.g. JPEG and JPEG2000). Our work in this field has been focused in using accurate models of the perceptual non-linearties to improve these standards. The key issue is making a uniform quantization in a perceptually uniform domain. We have developed a distortion criterion that unifies all the results in perceptually based quantization: making a uniform quantization in the perceptual domain is equivalent to restrict the Maximum Perceptual Error (MPE) in each component of the perceptual representation. When using this concept, all the proposed approaches (ours and those of other people, e.g. JPEG and JPEG2000) are just particular cases using different perception models.

The first attempts to use human vision models in tranform coding were too simple: the original JPEG standard [Wallace91] used the block-DCT as the first stage of the perceptual representation and a simple linear approximation of the second stage was used to design bit allocation (using achromatic and chromatic CSFs).

Over the last years, we have shown that using progressively more accurate approximations of the second non-linear stage significantly improve the JPEG results [Malo95, Malo99, Malo00, Epifanio03, Malo03, Malo04]. In particular, we have shown that linear transforms cannot achieve either of the (statistical and perceptual independence) goals, and we have proposed an adaptive non-linear image representation (the divisive normalization) that greatly reduces both the statistical and the perceptual redundancy amongst representation elements. We developed an efficient method of inverting this representation, and we demonstrated that this dual reduction in dependency can greatly improve the visual quality of compressed images. For an extended review of the proposed methods in the context of color image coding, see [Malo02] (in Spanish!).

An illustration of the results that summarizes our work in this field is given below (0.18 bits/pix):

bar

Original (8 bits/pix)

JPEG [Wallace91]

[Malo95, Malo99, Malo00]

[Epifanio03]

[Malo03] [Malo04]

Rate-Distortion performance of DCT based methods

Rate (Entropy in bits/pix)

Dotted: JPEG [Wallace91]

Dash-dot: [Malo95, Malo99, Malo00]
(Note that this simple non-linearity was intended to work in the JPEG range, i.e, between [0.4-0.9] bits/pix)

Dashed: [Epifanio03]

Solid: [Malo03, Malo04]

The new standard JPEG2000 [Taubman02] uses wavelets in the first linear stage; and it incorporates more complex expressions, yet not exact, for the non-linear perceptual stage [Daly02]. In particular, in addition of the simplest linear model (the CSF), it allows point-wise non-linearities and simplified versions of the divisive normalization. The major problem found in using the most accurate version of divisive normalization is the mathematical complexity of its inversion. This is why the standard just uses simplified versions of the non-linearity.

In [Navarro05] we improved the performance of JPEG2000 by using the perceptual non-linearities as well. We extended to wavelets the robust and fast algorithm to invert the exact expression of divisive normalization that was derived and used in the DCT context [Malo04]. In this way, significant improvements in rate-distortion performance and better color reproduction than in JPEG2000 have been found (see the example below 0.2 bits/pix).

Original (24 bits/pix)

Linear JPEG2000 [Daly02]

Simple non-linear JPEG2000 [Daly02]

Our method [Navarro05]

Rate-Distortion performance of Wavelet based methods

Dotted: Linear JPEG2000 [Daly02]

Dashed: Simple non-linear JPEG2000 [Daly02]

Solid: [Navarro05]

Due to our colaboration with Dr. Gustavo Camps (Dept. d'Eng. Electr. Universitat de Valencia) we are applied a different kind of non-linear processing after the linear transform representation. In particular, we trained perceptually weighted Support Vector Machines (SVMs) to select the subset of more relevant coefficients in the linear representation.

SVM learning has been recently proposed for image compression in the frequency domain using a constant insensitivity zone by Robinson and Kecman [Rob&Kec03]. However, according to the statistical properties of natural images and the properties of human perception (the Barlow hypothesis again!), a constant insensitivity makes sense in the spatial domain but it is certainly not a good option in a frequency domain. In fact, in their approach they made a fixed low-pass ad-hoc assumption: they neglected high-frequency coefficients in the SVM training.

In [Gomez04] we have proposed the use of adaptive insensitivity SVMs [Camps01] for image coding using an appropriate distortion criterion [Malo95-Malo04] based on a simple visual cortex model. Training the SVM by using an accurate perception model avoids any a priori assumption and reduces the blocking effect and improves the subjective rate-distortion performance of the original approach.

Our results compared to the results using straightforward SVMs in the DCT domain and with the method proposed in [Rob&Kec03] are shown below (0.3 bits/pix).

Original (8 bits/pix)	Euclidean SVM
[Robinson&Kecman03]	[Gomez04]

Rate-Distortion performance of DCT-SVM based methods

Dotted: Euclidean SVM

Dashed: [Rob&Kec03]

Solid: CSF-SVM [Gomez04]

PUBLICATIONS

J. Gutiérrez, G. Gomez, G. Camps and J.Malo
Perceptual image representations in Suppot Vector Machine image coding.
Book chapter in: Kernel methods in bioengineering, communications and image processing. Publisher Idea Group Inc. (submmited in June 2005)

Abstract

Full Text

G. Gomez, G. Camps, J. Gutiérrez and J. Malo
Perceptual Adaptive Insensitivity for Support Vector Machine Image Coding.
IEEE Trans. Neural Networks (Accepted June 2005)

Abstract

Full Text

Y. Navarro, J. Rovira, J. Gutiérrez and J.Malo
Gain control for the chromatic channels in JPEG2000.
Proc. of the 10th Intl. Conf. AIC. Vol. 1, pp. 539-542. (2005)

Abstract

Full Text

J.Malo, I.Epifanio, R.Navarro and E. Simoncelli,
Non-linear Image Representation for Efficient Perceptual Coding.
IEEE Trans. Im. Proc., (Accepted in oct. 2004)

Abstract

Full Text

J. Malo
Normalized Image Representation for Efficient Coding.
Proc. IEEE 37th Asilomar Conf. Signals Systems & Comp., Vol. 2, pp. 1408-1412 (2003)

Abstract

Full Text

I. Epifanio, J. Gutiérrez and J.Malo
Linear Transform for Simultaneous Diagonalization of Covariance and Perceptual Metric Matrix in Image Coding.
Pattern Recognition, Vol. 36, pp. 1799-1811 (2003)

Abstract

Full Text

J.Malo
Almacenamiento y transmisión de imágenes en color
Tecnología del color (Artigas, Capilla y Pujol Eds.). Cap. 4, pp 117-164.
Servei de Publicacions de la Universitat de València (2002)

Abstract

Full Text

J.Malo, R.Navarro, I.Epifanio, F.Ferri, J.M.Artigas
Non-linear Invertible Representation for Joint Statistical and Perceptual Feature Decorrelation.
Lecture Notes on Computer Science, Vol. 1876, pp. 658-667 (2000)

Abstract

Full Text

J.Malo, F.Ferri, R.Navarro, R.Valerio
Perceptually and Statistically Decorrelated Features for Image Representation:
Application to Transform Coding.
Proc. IEEE Int. Conf. Patt. Rec., Vol. 3, pp. 242-245 (2000)

Abstract

Full Text

J.Malo, F.Ferri, J.Albert, J.Soret, J.M.Artigas
The role of perceptual contrast non-linearities in image transform quantization.
Image & Vision Computing , Vol. 18, 3, pp. 233-246 (2000)

Abstract

Full Text

J.Malo, F.Ferri, J.Albert, J.Soret
Comparison of perceptually uniform quantization with average perceptual error minimization in image transform coding.
Electronics Letters , Vol. 35, 13, pp. 1067-1068 (1999)

Abstract

Full Text

J.Malo, A.Pons, J.M.Artigas
Bit allocation algorithm for codebook design in vector quantization fully based on HVS non-linearities for suprathreshold contrasts.
Electronics Letters , Vol. 31, 15, pp.1222-1224 (1995)

Abstract

Full Text

IMAGE RESTORATION
Perceptual Image Restoration

Image restoration requires a priori knowledge on the solution. The key idea behind conventional regularization techniques is smoothness. Smoothness means predictability of the signal from the neighborhood, therefore, the conventional approaches use simple statistical models such as band-limitation or autorregresive (AR) models of the signal.

However, natural images exhibit additional features, such as particular relations between local Fourier or wavelet transform coefficients. Biological visual systems have evolved to capture these relations (as assumed by the Barlow hypothesis). We have proposed to use this biological behavior in regularization as an alternative to smoothness-based functionals [Gutierrez03, Gutierrez04].

In particular, we have shown that the use of the human visual non-linear response in local frequency domains to design the penalty functional in regularization identifies the relevant features of the images. In this way, these features are discriminated from noise and preserved in the restoration process.

An illustration of how the perceptual penalty functional identifies the presence of a relevant feature, consider this patch of the Barbara image that contains relevant information around 18 cycles/deg. Figures (a) and (b) show the original and degraded (blurred and noisy) versions of the patch. Figures (c) and (d), show the penalty functionals computed from these images using the proposed method (solid lines). The classical 2nd derivative functional [Katsaggelos02] (dashed lines) is shown for reference purposes. Note how the proposed functional captures (and hence preserves!) the relevant feature even though it is almost invisible after the degradation process.

The examples below show the performance of the proposed regularization functionals compared with other stationary (2nd derivarive and CSF) and adaptive (AR and Adaptive Wiener) regularization methods in synthetically degraded and naturally degraded images. In every case, no prior information about the nature of the noise or the blurring process is assumed. Therefore, the performance of the methods fully depends on the suitability of the underlying assumptions on the nature of the image in each case.

Synthetically degraded images were generated by low-pass filtering the original (using a certain cut-off frequency, fc) and using additive gaussian white noise of certain variance, s. Playing with fc and s, we considered all the possibilities between deblurring and denoising.

(almost) Deblurring: fc = 3 cpd, s² = 15

2^nd derivative

CSF

AR4

AR8

AR12

[Gutierrez04]

(almost) Denoising: fc = 27 cpd, s² = 200

2^nd derivative

CSF

AR4

AR8

AR12

[Gutierrez04]

Naturally degraded images were obtained taking pictures of a given scene in poor lightning conditions. The example below shows that the proposed method works better than classical methods (such as [Lee80]) because it focuses in removing the (real) degradation: note how the Wiener method identifies as noise relevant features of the image such as edges.

Naturally degraded images

Original (good illumination)

Degraded (poor illumination)

Adaptive Wiener [Lee80]

[Gutierrez04]

Noise identified by Adaptive Wiener [Lee80]

Noise identified by [Gutierrez04]

PUBLICATIONS

J. Gutiérrez, F. Ferri and J. Malo
Regularization operators for natural images based on non-linear perception models
To appear in IEEE Trans. Im. Proc. (accepted nov. 2004)

Abstract

Full Text

J. Gutiérrez, J. Malo and F. Ferri
Perceptual Regularization Functionals for Natural Image Restoration
Proc. IEEE Intl. Conf. Im. Proc. 2003. CD Edition. ISBN 0-7803-7751-6 (2003)

Abstract

Full Text

Imag.Stats.

Software

Vision.Sci.

Im.Proc.

Video.Proc.

Color.Sci.

Publicat.

People

Links