BISP5
Fifth Workshop on
BAYESIAN INFERENCE IN STOCHASTIC PROCESSES
Valencia (Spain), June 14-16, 2007
 


Abstracts of posters


L. M. Acosta Argueta and P. Muñoz Gràcia: Benchmark Study of Some Particle Filter Variants in a Nonlinear Non-Gaussian Framework: Application to Parameter Estimation in a Stochastic Volatility Model
A Particle Filter (PF) is a sequential Monte Carlo method which, from a Bayesian perspective, allows one to generate samples from a target distribution as new observations arrive. We study and apply this flexible simulation-based methodology, using sampling importance resampling (SIR), to estimate the underlying time-varying state and/or the fixed model parameters in a possibly non-stationary, nonlinear and non-Gaussian framework. The benchmark study aims to assess the filtering performance of various PF variants in terms of both statistical and computational efficiency. First, four PF variants are applied to recursively estimate the states of a synthetic non-stationary, nonlinear, non-Gaussian dynamic model taken from the literature. These PF variants are: Kitagawa’s SIR PF, Pitt and Shephard’s auxiliary sampling importance resampling PF, the so-called extended PF and the unscented PF. For completeness, the non-simulation-based filters, the Extended Kalman Filter and the Unscented Kalman Filter, are included. Additionally, the deterministic and residual resampling schemes are compared. Next, to achieve parameter estimation in a stochastic volatility model, adapted versions of the aforementioned PF variants are benchmarked. Here, our modified PF, combining ideas from Kitagawa (1998) and Liu and West (2001), is included. Finally, the simulation study, results and conclusions are presented. All of the filters mentioned have been implemented in the R language.
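As a concrete illustration of the SIR mechanics discussed above, here is a minimal sketch of a bootstrap SIR particle filter for a standard stochastic volatility model; the parameterization, the parameter values and all variable names are illustrative assumptions, not the authors' benchmark setup.

```python
# A minimal sketch of a bootstrap SIR particle filter for a standard
# stochastic volatility model; all parameter values and names are
# illustrative assumptions, not the authors' benchmark setup.
import numpy as np

rng = np.random.default_rng(0)

# Assumed SV model: x_t = mu + phi*(x_{t-1} - mu) + sigma*eta_t,
#                   y_t = exp(x_t / 2) * eps_t,  with eta_t, eps_t ~ N(0, 1).
mu, phi, sigma = -1.0, 0.95, 0.25
T, N = 500, 2000                       # time steps, particles

# Simulate synthetic data from the model.
x = np.empty(T)
x[0] = mu + sigma / np.sqrt(1 - phi**2) * rng.standard_normal()
for t in range(1, T):
    x[t] = mu + phi * (x[t - 1] - mu) + sigma * rng.standard_normal()
y = np.exp(x / 2) * rng.standard_normal(T)

# Bootstrap SIR: propagate through the transition, weight by the
# observation likelihood y_t ~ N(0, exp(x_t)), then resample.
particles = mu + sigma / np.sqrt(1 - phi**2) * rng.standard_normal(N)
x_hat = np.empty(T)
for t in range(T):
    particles = mu + phi * (particles - mu) + sigma * rng.standard_normal(N)
    logw = -0.5 * y[t] ** 2 * np.exp(-particles) - particles / 2
    w = np.exp(logw - logw.max())      # stabilized importance weights
    w /= w.sum()
    x_hat[t] = np.dot(w, particles)    # filtered mean of the log-volatility
    particles = particles[rng.choice(N, size=N, p=w)]  # multinomial resampling

print("RMSE of filtered log-volatility:", np.sqrt(np.mean((x_hat - x) ** 2)))
```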



C. Archambeau, M. Opper, Y. Shen, D. Cornford and J. Shawe-Taylor: The Gaussian Variational Approximation of Stochastic Differential Equations
This work is an initial step towards the application of variational inference to general continuous-time stochastic process based models. Such processes are described by stochastic differential equations and arise naturally in a range of contexts, from financial to environmental modelling. The main innovation of this work resides in the fact that the posterior measure of the system governed by the stochastic differential equation is replaced by an optimal Gaussian measure over paths. The latter minimizes the Kullback-Leibler divergence between the true posterior measure and the approximate one. Furthermore, this leads to an exact lower bound on the marginal log-likelihood of the data, which can be used to estimate the model parameters by type II maximum likelihood. The method is applied to two simple problems: the Ornstein-Uhlenbeck process, for which the exact solution is known and can be used for comparison, and the double-well system, for which standard approaches such as the ensemble Kalman smoother fail to provide a satisfactory result. The approach is also compared to the solutions found with a generalized hybrid Markov chain Monte Carlo scheme. Experiments show that the method is able to cope with strongly nonlinear systems (for example, those inducing multimodal probability measures).
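In generic notation (not necessarily the authors'), writing $q$ for the approximating Gaussian measure and $p$ for the prior measure over paths, the variational lower bound referred to above takes the familiar form

\[
\log p(y) \;\ge\; \mathbb{E}_{q}\!\big[\log p(y \mid x)\big] \;-\; \mathrm{KL}\big(q \,\|\, p\big),
\]

with equality exactly when $q$ coincides with the true posterior; maximizing the right-hand side over Gaussian measures is equivalent to minimizing the Kullback-Leibler divergence from the posterior.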



R. Argiento, A. Guglielmi, A. Pievatolo and F. Ruggeri: Bayesian semiparametric inference for the accelerated failure time model using hierarchical mixture modeling with N-IG priors
We will pursue a Bayesian semiparametric approach for an AFT regression model, usually considered in survival analysis, when the baseline survival distribution is a mixture of parametric densities on the positive reals with a nonparametric mixing measure. A popular choice for the mixing measure is a Dirichlet process, yielding a DPM model for the error. Here, as an alternative to the Dirichlet process, the mixing measure is taken to be an N-IG prior, built from normalized inverse-Gaussian finite-dimensional distributions, as recently proposed in the literature. A comparison of the models will be carried out. MCMC techniques will be used to estimate the predictive distribution of the survival time, along with the posterior distribution of the regression parameters. The efficiency of the computational methods will also be compared, using both real and simulated data.
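In the notation commonly used for such models (chosen here for illustration, not necessarily the authors'), the model reads

\[
T_i \;=\; \exp(x_i^{\top}\beta)\, V_i, \qquad V_i \overset{\text{iid}}{\sim} F, \qquad
F(\cdot) \;=\; \int k(\cdot \mid \theta)\, G(\mathrm{d}\theta),
\]

where $k(\cdot \mid \theta)$ is a parametric density on the positive reals and the mixing measure $G$ is given either a Dirichlet process prior or the N-IG prior discussed above.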



M. C. Ausín, M. P. Wiper and R. E. Lillo: Bayesian Estimation of Finite Time Ruin Probabilities
In this article, we consider Bayesian inference and estimation of finite-time ruin probabilities for the Sparre Andersen risk model. The dense family of Coxian distributions is considered for the approximation of both the inter-claim time and claim size distributions. From a statistical point of view, the Coxian distributions are preferable to the whole over-parameterized class of phase-type distributions. We illustrate that the Coxian model can be well fitted to real, long-tailed claims data and that it compares well with the generalized Pareto model. Furthermore, we describe how to estimate the predictive distribution of the finite-time ruin probability making use of recent results from queueing theory. Predictive distributions are more informative than simple point estimates and take account of model and parameter uncertainty.
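For orientation, the quantity being estimated is the standard finite-time ruin probability of risk theory: with initial reserve $u$, premium rate $c$, claim-arrival process $N(t)$ and i.i.d. claim sizes $X_i$,

\[
U(t) \;=\; u + ct - \sum_{i=1}^{N(t)} X_i, \qquad
\psi(u, t) \;=\; \Pr\Big(\inf_{0 < s \le t} U(s) < 0 \;\Big|\; U(0) = u\Big),
\]

where, in the Sparre Andersen model, the inter-claim times are i.i.d. with a general (here Coxian) distribution.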



A. Barber, X. Barber, A. M. Mayoral and J. Morales: Multivariate geostatistical model to establish a Bioclimatic classification
We propose a Bayesian multivariate geostatistical model to establish a bioclimatic classification of the island of Cyprus. This is done by following the current Worldwide Bioclimatic Classification System to define and describe the bioclimates and bioclimatic belts (thermo-ombrotypes) of the island.



A. Beamonte, P. Gargallo and M. Salvador: Spatio-temporal evolution of the Real Estate Market: A Bayesian Approach based on Diffusion Processes
In this paper a Bayesian analysis of the spatio-temporal evolution of dwelling prices in continuous time is proposed. Starting from a hedonic regression model, a diffusion process is posited on the coefficients of the model. This framework lets us analyse their temporal stability. The estimation process is carried out by means of MCMC methods. The methodology is illustrated through an application to the real estate market in Zaragoza.



S. Cabras, M. E. Castellanos and E. Staffetti: Estimating Geometric Parameters and States in Industrial Robot Manipulators Programming by Demonstration
Programming industrial robots is a time-consuming and expensive operation to be performed at every change of product in manufacturing industries. We focus on robotic tasks, such as assembly, that involve contact between the objects manipulated by the robot and the environment in which it operates. We follow a strategy for robot programming based on human demonstration, in which an operator shows the robot the task to perform. A human operator moves the object to be manipulated by the robot, which is equipped with sensors to measure its position, velocity and also the forces arising during the interaction with the environment. The aim of the experiment is to estimate, based on the sensor data, the position and orientation of the manipulated object and of the environment. All these parameters are static, and are related to the collected data by kinetostatic equations via the contacts produced during the demonstration, which are unknown as well. These relations are highly non-linear, so to estimate the unknown quantities with enough precision we implement a sequential Monte Carlo method that takes into account the fact that the unknown geometrical parameters are static, while the contacts between the manipulated object and the environment change over time and can be modelled by a stochastic process whose dependency structure is described by a contact graph. We compare the results obtained with those produced by different Bayes filters that deal with the simultaneous presence of static and dynamic parameters.



A. Ertefaie and M. R. Meshkani: Bayesian Analysis of Software Reliability Models With Reference Prior
In this paper, we introduce a Bayesian analysis for nonhomogeneous Poisson processes in software reliability models. Posterior summaries of interest are obtained using Markov chain Monte Carlo methods. We compare the results obtained from using conjugate and reference priors. Model selection based on the mean squared prediction error is developed.
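For orientation, such analyses rest on the standard nonhomogeneous Poisson process likelihood: for an intensity $\lambda(t \mid \theta)$ observed over $[0, T]$ with failure times $t_1 < \dots < t_n$,

\[
L(\theta) \;=\; \Big[\prod_{i=1}^{n} \lambda(t_i \mid \theta)\Big]
\exp\!\Big(-\int_0^{T} \lambda(s \mid \theta)\, \mathrm{d}s\Big);
\]

the abstract does not specify the intensity family, so none is assumed here.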



M. A. R. Ferreira, R. Ruiz and A. M. Schmidt: Evolutionary Markov Chain Monte Carlo Algorithms for Expected Utility Maximization
We propose an evolutionary Markov chain Monte Carlo (EMCMC) framework for expected utility maximization. This is particularly useful when the optimal decision cannot be obtained analytically, as in problems of Bayesian estimation and optimal design with non-standard utility functions. In those situations, two main tasks have to be performed numerically: calculation of the expected utility and maximization over the decision space. Müller and coauthors have developed a clever simulation-based framework for Bayesian optimal design blending MCMC with simulated annealing. Nevertheless, their approach has difficulties with the exploration of highly multimodal decision spaces. Building upon their work, we develop an algorithm that simulates a population of Markov chains, each having its own temperature. The different temperatures allow hotter chains to easily cross valleys and colder chains to rapidly climb hills. The population evolves according to genetic operators such as mutation and crossover, allowing the chains to explore the decision space both locally and globally through information exchange between chains. As a result, our framework explores the decision space very effectively. We illustrate this with two applications: first, optimal design of a network of monitoring stations for spatio-temporal ground-level ozone; second, estimation of quantitative trait loci (QTL).
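The following is a minimal sketch of the population idea: several Metropolis chains at different temperatures with an exchange move between neighbouring temperatures. It is a simplified stand-in for the EMCMC framework described above (the full framework also uses crossover operators that blend coordinates across chains); the objective h and every tuning constant are hypothetical.

```python
# Minimal sketch of a population of tempered Metropolis chains with an
# exchange (swap) move, in the spirit of EMCMC; objective and constants
# are illustrative assumptions, not the authors' algorithm.
import numpy as np

rng = np.random.default_rng(1)

def h(d):
    # Hypothetical multimodal "expected utility" surface on R^2.
    return np.sin(3 * d[0]) * np.cos(3 * d[1]) - 0.1 * np.sum(d ** 2)

temps = np.array([0.05, 0.2, 0.5, 1.0, 2.0])   # cold ... hot
K = len(temps)
chains = rng.uniform(-2, 2, size=(K, 2))        # one chain per temperature
best_d, best_h = chains[0].copy(), h(chains[0])

for it in range(20000):
    # Mutation: random-walk Metropolis within each chain; chain k targets
    # the tempered density proportional to exp(h(d) / temps[k]).
    for k in range(K):
        prop = chains[k] + 0.3 * np.sqrt(temps[k]) * rng.standard_normal(2)
        if np.log(rng.uniform()) < (h(prop) - h(chains[k])) / temps[k]:
            chains[k] = prop
    # Exchange: propose swapping the states of two adjacent temperatures,
    # accepted with the usual parallel-tempering ratio.
    k = rng.integers(K - 1)
    log_r = (h(chains[k + 1]) - h(chains[k])) * (1 / temps[k] - 1 / temps[k + 1])
    if np.log(rng.uniform()) < log_r:
        chains[[k, k + 1]] = chains[[k + 1, k]]
    if h(chains[0]) > best_h:                   # track the coldest chain
        best_d, best_h = chains[0].copy(), h(chains[0])

print("approximate maximizer:", best_d, "value:", best_h)
```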



B. Flood, S. Wilson and T. Forde: A Spatio-Temporal Model of Electromagnetic Interference Temperature
In 2002, the Federal Communications Commission (FCC) proposed the interference temperature metric to control spectrum usage. Secondary users would be permitted to broadcast in the same band as licensed users only when their transmissions would not raise the interference temperature beyond a regulatory threshold. Secondary users must make broadcasting decisions in real time. Hence, a model is needed for predicting the value of interference temperature over the area to be affected by the broadcast, over the time interval required, given limited information. We present a spatio-temporal model for interference temperature where users of the spectrum are making real-time decisions on usage. Inference for interference temperature data collected using spectrum analysers is discussed. The model is validated by comparing test data to intervals of the predictive distribution.



S. Frühwirth-Schnatter: Bayesian Variable Selection for State Space Models
State space models are a widely used tool in time series analysis for dealing with processes that gradually change over time. Whereas estimation of these models has been studied by many authors, model selection is somewhat neglected, the main reason being that this issue leads in general to a non-regular statistical testing problem. For practical applications, however, it seems important to test whether the components in a state space model are actually dynamic or not. The usual strategy is to compute the marginal likelihood for each model under investigation and to choose the model with the largest likelihood. In this talk, the application of model space MCMC methods is suggested to deal with state space models under model uncertainty. This model uncertainty may concern whether a certain component, such as a dynamic trend, should be added to the model, and whether this component is static or dynamic. It will be shown how a Bayesian variable selection approach can be implemented which simultaneously allows adding and deleting components and choosing between static and dynamic components. This approach will be applied to Gaussian linear state space models as well as to non-Gaussian state space models based on the Poisson distribution and to binary and multinomial state space models.
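As a concrete instance of the static-versus-dynamic question, a local level model can be written with a binary indicator (illustrative notation, not the author's):

\[
y_t = \mu_t + \varepsilon_t, \qquad
\mu_t = \mu_{t-1} + \delta\,\omega_t, \qquad
\varepsilon_t \sim \mathcal{N}(0, \sigma^2_{\varepsilon}),\;
\omega_t \sim \mathcal{N}(0, \sigma^2_{\omega}),
\]

where $\delta = 0$ collapses $\mu_t$ to a static level and $\delta = 1$ makes it a dynamic random walk; placing a prior on such indicators turns the question into Bayesian variable selection.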



P. Galeano and M. C. Ausín: Bayesian Inference for multivariate GARCH models with Gaussian mixture innovations
We propose a new multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) model in which the vector of innovations is assumed to follow a mixture of two Gaussian distributions. This multivariate GARCH model can capture the stylized facts usually found in financial time series, including large dependence in the tails as well as volatility clustering, large kurtosis and extreme observations. Bayesian inference for this model is implemented via Markov chain Monte Carlo (MCMC) methodology. Bayesian prediction of future volatilities and other quantities of interest is also carried out. The proposed approach is illustrated with both simulated and real multivariate time series.



H. Gzyl, E. Ter Horst and S. W. Malone: Towards a Bayesian framework for option pricing
We describe a general method for constructing the posterior distribution of an option price. Our framework takes as inputs the prior distributions of the parameters of the stochastic process followed by the underlying, as well as the likelihood function implied by the observed price history for the underlying. Our work extends that of Karolyi (1993) and Darsinos and Satchell (2001), but with the crucial difference that the likelihood function we use for inference is that which is directly implied by the underlying, rather than imposed in an ad hoc manner via the introduction of a function representing "measurement error". As such, an important problem still relevant for our method is that of model risk, and we address this issue by describing how to perform a Bayesian averaging of parameter inferences based on the different models considered using our framework.



M. Hahn and J. Sass: Markov Chain Monte Carlo Estimation of Markov Switching Models
We want to estimate the parameters of a linear, non-autoregressive, multi-dimensional continuous-time Markov switching model given observations at discrete times only. In some applications we observe frequent regime switching, which causes the discrete-time expectation maximization algorithm to produce unstable results (a continuous-time expectation maximization algorithm is not available). On the other hand, moment-based methods require a very large number of observations. Hence we consider a Bayesian approach. Facing continuous time and a latent state process, two MCMC approaches seem reasonable: using time-discretization and augmenting the unknowns with the (discrete) state process, or working in continuous time and augmenting with the full state process. We present a new approach which combines the useful aspects of these two methods, allowing filtering for the states while introducing no discretization error; compared to the continuous-time method, the dimension of the augmented variables and the correlation between the update blocks can be greatly reduced. This is achieved by augmenting with the states at the observation times only. Using results on the distribution of occupation times in Markov processes, the likelihood of the observations given the boundary states can be computed exactly.



J. Haslett and A. Parnell: Non-parametric Bayesian monotonic regression with applications to radiocarbon dating
We develop a suite of models which can be used to examine radiocarbon-dated sediment cores. Such cores are used as the basis for estimating uncertainty in past events such as climate change (e.g. Haslett et al., 2006). The task is to reconstruct the sediment history of the core by linking depth to age. The nature of deposition is such that older events occur at lower depths; thus a valid sedimentation history must be monotonic. We incorporate this information via a random sum of gamma increments, a Tweedie distribution. An advantage of this method is that the number of parameters is random without the need for reversible jump techniques. The models are easily incorporated into existing radiocarbon dating technology to produce stochastic sedimentation histories which honestly assess uncertainty in the timing of core events, and hence climate change.
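A minimal sketch of the monotone construction: age increments between depths are random sums of gamma jumps (a compound Poisson-gamma, i.e. Tweedie-type, distribution), so every simulated age-depth history is non-decreasing. All depths, rates and shapes below are illustrative assumptions.

```python
# Minimal sketch of a monotone age-depth history built from random sums of
# gamma increments (compound Poisson-gamma, a Tweedie-type construction);
# all constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)

depths = np.linspace(0, 500, 51)          # core depths (cm), hypothetical
lam, shape, scale = 0.8, 1.5, 20.0        # Poisson rate per section; gamma pars

def simulate_age_history():
    # Age increment over each depth section: a Poisson(lam) number of
    # gamma(shape, scale) jumps, so increments are >= 0 and ages monotone.
    n_jumps = rng.poisson(lam, size=len(depths) - 1)
    increments = np.array([rng.gamma(shape, scale, n).sum() for n in n_jumps])
    return np.concatenate([[0.0], np.cumsum(increments)])   # ages (years)

# An ensemble of histories quantifies dating uncertainty at each depth.
ages = np.array([simulate_age_history() for _ in range(1000)])
lower, upper = np.percentile(ages, [2.5, 97.5], axis=0)
print("95% age interval at bottom of core:", lower[-1], "-", upper[-1])
```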



M. V. Ibañez and A. Simó: A Geostatistical Spatio-Temporal Model with Change Points
The aim of this work is to analyze a change-point problem in the context of spatio-temporal modeling, from a geostatistical point of view. A precedent in the spatio-temporal geostatistical literature is the work of Majumdar et al. (2005), in which a spatio-temporal geostatistical model is formulated under the assumption that there can exist an (unknown) temporal point at which the process undergoes a change. This change affects all the spatial sites and involves a change in the structure of the mean, the variability, or the correlation. Our purpose is to extend that model by allowing different (unknown) change points for the different spatial sites. To this end, a binary latent process, called the status, is defined at each spatio-temporal site. This process indicates, at each time, whether each spatial site has already undergone the change, and is modelled using a hidden binary Markov process. A geostatistical model is then used to model the observations given the status. A Bayesian methodology is used to estimate the parameters and hyperparameters of the model, and a simulation study is undertaken to check the model and the goodness of fit.



S. Koyama: Laplace's approximation of recursive Bayesian filter
Optimal filtering for general stochastic dynamical systems involves the iterative application of Bayes' rule, which is analytically intractable. In the case of linear Gaussian systems the problem is solved by the well-known Kalman filter. Constructing tractable filters generally involves either extending the Kalman filter approach or numerically integrating over the posterior distribution, as in particle filtering. In this work we consider an approximation scheme in which the posterior distribution is evaluated using Laplace's method at each step of the recursion. A natural question is whether repeatedly applying Laplace's method compounds the error over time steps, resulting in an estimate far from optimal. We give a negative answer, showing that under certain conditions the error of the one-step Laplace approximation does not accumulate over time. Combined with a smoothing algorithm, we also provide an efficient method for estimating model parameters using the EM algorithm.
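A minimal sketch of the scheme for a hypothetical one-dimensional model with Poisson observations; the model and all constants are illustrative assumptions, chosen so that the Laplace step has simple closed-form derivatives.

```python
# Minimal sketch of a recursive filter that applies Laplace's method at each
# step, for a hypothetical 1-D model:
#   x_t = a * x_{t-1} + w_t,  w_t ~ N(0, q);   y_t ~ Poisson(exp(x_t)).
import numpy as np

rng = np.random.default_rng(3)
a, q, T = 0.95, 0.05, 200

# Simulate data from the assumed model.
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + np.sqrt(q) * rng.standard_normal()
y = rng.poisson(np.exp(x))

m, P = 0.0, 1.0                 # Gaussian approximation of the filter state
means = np.empty(T)
for t in range(T):
    m_pred, P_pred = a * m, a * a * P + q          # linear-Gaussian predict
    # Laplace step: locate the mode of log p(y_t | x) + log N(x; m_pred, P_pred)
    # by Newton iterations, then use the curvature at the mode as the variance.
    z = m_pred
    for _ in range(20):
        grad = y[t] - np.exp(z) - (z - m_pred) / P_pred
        hess = -np.exp(z) - 1.0 / P_pred
        z -= grad / hess
    m, P = z, -1.0 / hess
    means[t] = m

print("posterior-mean RMSE:", np.sqrt(np.mean((means - x) ** 2)))
```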



T. M. Love, C. Joutard, E. Airoldi and S. Fienberg: The Dirichlet process prior for choosing the number of latent classes of disability and biological topics
The Dirichlet process prior can be used as a prior distribution on the class assignment of a set of objects. This can be naturally implemented in hierarchical Bayesian mixed-membership models (HBMMMs), which encompass a wide variety of models with latent structure for clustering and classification. As in most clustering methods, a principal aspect of implementing HBMMMs is the choice of the number of classes. Strategies for inference on the number of classes, such as RJMCMC methods (Green, 1995), can be difficult to implement without expertise. The Dirichlet process prior for class assignment can reduce the computational problem to a Gibbs sampler with book-keeping complications. We produce novel analyses of the following two data sets: (1) a corpus of scientific publications from the Proceedings of the National Academy of Sciences, examined earlier in Erosheva, Fienberg, and Lafferty (2004) and Griffiths and Steyvers (2004); (2) data on American seniors from the National Long Term Care Survey, examined earlier in Erosheva (2002) and Stallard (2005). Here, our aim is to compare models and specifications by their treatment of these two data sets. Our specifications generalize those used in earlier studies. For example, we make use of both text and references to inform the choice of the number of latent topics in our publications data. We compare our analyses with the earlier ones, for both data sets.
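The class-assignment mechanism implied by a Dirichlet process prior can be sketched via its Chinese restaurant process representation, which makes clear why the number of classes needs no reversible-jump machinery: it is random and grows as objects arrive. The concentration value below is an illustrative assumption.

```python
# Minimal sketch of the Chinese restaurant process: each object joins an
# existing class with probability proportional to its size, or a new class
# with probability proportional to alpha. Values are illustrative.
import numpy as np

rng = np.random.default_rng(4)

def crp_assignments(n_objects, alpha):
    counts = []                                   # objects per class so far
    labels = []
    for i in range(n_objects):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)                      # open a new class
        else:
            counts[k] += 1
        labels.append(k)
    return labels, len(counts)

labels, n_classes = crp_assignments(1000, alpha=2.0)
print("number of latent classes drawn a priori:", n_classes)
```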



P. Marttinen and J. Corander: Bayesian Learning of Causal Graphs for Multivariate Time Series
Graphical modelling strategies have recently emerged as a versatile tool for analyzing multivariate stochastic processes. However, the majority of existing statistical methods for this purpose are restricted to undirected graphs, which leave the most intriguing questions about causality open and may also yield spurious associations between variables in the dynamic context. We introduce a Bayesian method for unsupervised learning of causal graphical models for multivariate stochastic processes, which allows for non-decomposable graphs and structural breaks in the process. To obtain a numerically efficient approach we utilize a recently introduced Bayesian information-theoretic criterion for model learning, which has attractive properties when the potential model complexity is large relative to the size of the observed data set.



C. Nunes, A. Pacheco and C. Sernadas: Mixed Random Walk
We consider a generalization of the usual random walk in which the probability of moving up in each step is a random variable, in such a way that the movements at different steps are conditionally independent. We call this generalization a ``mixed random walk'' (MxRW); it has two levels of stochasticity, which induce properties quite different from those of the classical random walk. In particular, the MxRW tends to be transient, and it spreads quadratically faster than its classical counterpart, at the same order as the quantum random walk. Although the process of movements is not a Markov chain, we prove, using exchangeability arguments, that it is a mixture of Markov chains. In addition, if we set up a prior distribution for the probability U of moving up in each step, we can analyse the process using a Bayesian approach. For the MxRW we derive several asymptotic results, notably concerning the position process, the rate of visits to new states, and hitting times.
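A minimal simulation sketch of the MxRW, with the up-probability U drawn from a Beta prior (an illustrative choice): the faster spreading relative to the classical walk is visible directly in the standard deviations.

```python
# Minimal sketch of the mixed random walk: U is drawn once per path from a
# prior (Beta here, an illustrative assumption), then steps are conditionally
# i.i.d. given U. Spread is compared with the classical symmetric walk.
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps = 5000, 1000

# Mixed random walk: one U per path, then +/-1 steps with P(up) = U.
U = rng.beta(2.0, 2.0, size=n_paths)
steps = np.where(rng.uniform(size=(n_paths, n_steps)) < U[:, None], 1, -1)
mixed_final = steps.sum(axis=1)

# Classical symmetric random walk for comparison (U fixed at 1/2).
classical_final = np.where(rng.uniform(size=(n_paths, n_steps)) < 0.5, 1, -1).sum(axis=1)

# The mixed walk's standard deviation grows like n (ballistic spreading),
# versus sqrt(n) for the classical walk.
print("mixed s.d.:", mixed_final.std(), " classical s.d.:", classical_final.std())
```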



M. Palmer, S. Aryal, B. Bates and E. Campbell: Spatial-Temporal Modelling of Extreme Rainfall
Extreme rainfall over two regions of Australia, the south-west of Western Australia and the Sydney region of NSW, covering approximately the last fifty years, has been modelled using a Bayesian hierarchical approach. A convolution kernel approach is used to derive Gaussian processes that model the spatial variability of the rainfall distribution. This is a flexible approach, accommodating rainfall measured over different durations (from sub-daily to super-daily) and allowing for the possibility of linking the extremes to external drivers. The approach can be used to characterize the behaviour of extremes under present-day and projected future conditions. It can be used to derive intensity-frequency-duration curves and depth-area relationships, together with estimates of their associated uncertainties, for specific locations that are gauged or ungauged. Statistics associated with areal extremes can be derived from this approach and provide information for the design of engineering structures such as culverts, bridges, and stormwater and sewerage systems.
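A minimal sketch of the convolution kernel construction referred to above: a Gaussian process realization obtained by smoothing independent normal knot values with a Gaussian kernel. The one-dimensional domain, grid and bandwidth are illustrative assumptions.

```python
# Minimal sketch of the process-convolution construction of a Gaussian
# process: white noise on a coarse grid smoothed by a Gaussian kernel.
import numpy as np

rng = np.random.default_rng(6)

knots = np.linspace(0, 100, 21)          # coarse grid of kernel centres (km)
z = rng.standard_normal(len(knots))      # independent N(0, 1) knot values
bandwidth = 8.0

def gp_value(s):
    # Convolution kernel: a Gaussian bump centred at each knot.
    w = np.exp(-0.5 * ((s - knots) / bandwidth) ** 2)
    return w @ z

sites = np.linspace(0, 100, 401)
field = np.array([gp_value(s) for s in sites])   # one smooth GP realization
print("field range:", field.min(), "to", field.max())
```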



A. Pascarella, A. Sorrentino, C. Campi and M. Piana: A Grid-Based Particle Filter for Solving Non-Linear Problems with Linear Computational Cost
Particle filters (PFs) are powerful algorithms for applying Bayesian tracking to non-linear problems and non-Gaussian pdfs. However, when multi-target PFs are applied to solving inverse problems, the computational cost can be high, due to the great number of required forward computations. We present a grid-based particle filter which can be used when the state space can be divided into a finite number of elementary cells: the forward computations need be performed only once, and the computational cost decreases strongly. The method is different from a grid sampling of the pdf, as not all the grid points are used, only random subsets of them. We apply the grid-based PF to the problem of recovering neuronal electrical currents from measurements of the external magnetic field. In this problem the state space does not have finite volume, but it can be split into two subspaces: one contains the non-linear parameters and has finite volume; the other does not have finite volume but contains the linear parameters. Applying so-called Rao-Blackwellization to the linear parameters and the grid-based PF to the non-linear ones, we obtain a fast and accurate method for solving the source estimation problem.



D. Salmerón and J. A. Cano: Using Euler schemes to combine data augmentation techniques and Monte Carlo methods in a filtering context
Particle filters are iterative algorithms used to approximate a sequence of conditional distributions through a Monte Carlo approach. They are mainly useful in problems involving sequential observations, possibly paired with unobserved state variables. In situations modelled through diffusion equations which are observed with or without error in the presence of unknown parameters, the standard classical particle filter is often used to estimate the posterior distribution of the parameters. In this paper we propose a specific particle filter adapted to deal with diffusions. The filter we propose combines data augmentation techniques with the Euler approximation and Monte Carlo methods. The introduction of the augmented data in this setting improves the results obtained when the standard classical particle filter is used. Numerical examples show that the proposed filter is competitive against the standard classical one.
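A minimal sketch of the Euler (Euler-Maruyama) discretization with imputed intermediate points between observations, the data augmentation device on which the proposed filter rests; the Ornstein-Uhlenbeck drift and all constants are illustrative assumptions.

```python
# Minimal sketch of the Euler-Maruyama scheme for a diffusion
#   dX_t = mu(X_t) dt + sigma(X_t) dW_t,
# with imputed intermediate points between observations (data augmentation).
import numpy as np

rng = np.random.default_rng(7)

theta, sigma = 0.7, 0.4                  # hypothetical OU parameters
mu = lambda x: -theta * x                # drift
dif = lambda x: sigma                    # diffusion coefficient

def euler_bridge_path(x0, dt_obs, n_sub):
    # Simulate the latent path between two observation times by Euler steps
    # on a finer grid; n_sub imputed points shrink the discretization error.
    h = dt_obs / (n_sub + 1)
    path = [x0]
    for _ in range(n_sub + 1):
        x = path[-1]
        path.append(x + mu(x) * h + dif(x) * np.sqrt(h) * rng.standard_normal())
    return np.array(path)

print(euler_bridge_path(x0=1.0, dt_obs=1.0, n_sub=9))
```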



A. M. Schmidt, A. R. B. Moreira, T. C. O. Fonseca and S. M. Helfand: Spatial Stochastic Frontier Models: Accounting for Unobserved Local Determinants of Inefficiency
In this paper, we analyze the productivity of farms across n = 370 municipalities located in the Center-West region of Brazil. We propose a stochastic frontier model with a latent spatial structure to account for possible unknown geographical variation of the outputs. This spatial component is included in the one-sided disturbance term, for which we explore two different distributions, the exponential and the truncated normal. We use the Bayesian paradigm to fit the proposed models. We also compare an independent normal prior with a conditional autoregressive prior for the spatial effects. The inference procedure takes explicit account of the uncertainty associated with these spatial effects. As the resulting posterior distribution does not have a closed form, we make use of stochastic simulation techniques to obtain samples from it. Efficient sampling schemes are also discussed. Two different model comparison criteria provide support for the importance of including the latent spatial effects, even after considering covariates at the municipal level.
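For orientation, a stochastic frontier model with a latent spatial structure in the one-sided term can be written generically as

\[
\log y_i \;=\; x_i^{\top}\beta + v_i - u_i, \qquad
v_i \sim \mathcal{N}(0, \sigma_v^2), \qquad u_i \ge 0,
\]

where $y_i$ is the output of municipality $i$, $v_i$ is symmetric noise, and the inefficiency $u_i$ follows an exponential or truncated normal distribution whose parameters may carry a spatially structured (e.g. CAR) effect; the notation is generic rather than the authors' own.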



V. N. Smelyanskiy, D. G. Luchinsky, A. Duggento, A. Neiman and P. V. E. McClintock: Bayesian Inferential Framework for Diagnostics of Non-Stationary Physiological Signals
The long-standing problem of reconstructing slowly varying control parameters in physiological signals is considered. A novel, efficient Bayesian technique is introduced that allows real-time diagnostics of a slowly varying control parameter in coupled FitzHugh-Nagumo (FHN) systems with a resolution of 5-6 T, where T is the period of oscillation of the FHN system. Specifically, a multi-dimensional system of N coupled FHN oscillators, globally mixed by a known "measurement matrix", is analyzed. The time evolution of the parameters and noise coefficients is tracked even though the system is non-stationary, and the original hidden parameters of the pre-mixed driving equations are decoded. The method is applied to infer the parameters of a model of the spontaneous oscillatory firing system present in sensory receptors, where detector cells in a sensory epithelium are synaptically coupled to one or more excitable primary afferent neurons. Data extracted from an in-vivo recording of the electroreceptors of paddlefish are analyzed. Given an arbitrary, highly nonlinear vector field driving the dynamics, this Bayesian algorithm allows one to infer the parameters under the assumption that the noise is white and Gaussian. It is easily implementable for a large number of dynamical systems. The advantages and limitations of the technique are discussed.
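For reference, one common parameterization of a single FitzHugh-Nagumo oscillator (quoted for orientation; the coupled, noise-driven system studied above generalizes it) is

\[
\dot{v} \;=\; v - \frac{v^{3}}{3} - w + I, \qquad
\dot{w} \;=\; \epsilon\,(v + a - b\,w),
\]

with fast voltage-like variable $v$, slow recovery variable $w$, input $I$ and time-scale separation $\epsilon \ll 1$.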



V. N. Smelyanskiy, D. G. Luchinsky, A. Duggento, A. Stefanovska and P. V. E. McClintock: Bayesian Inference for Model Discovery in Cardiorespiratory System
A Bayesian dynamical inference technique is applied to reconstruct the nonlinear coupling between cardiac and respiratory oscillations, which makes it possible to characterise cardiorespiratory dynamics (CRD) in humans by inverse modelling from blood pressure time-series data. Using a technique applicable to a broad range of stochastic dynamical models without severe computational demands, a simple nonlinear stochastic dynamical model of the cardio-respiratory interaction has been identified that describes, within the framework of inverse modelling, the time-series data in a particular frequency band. The method was validated using synthesized data obtained by numerically integrating the inferred model itself. The main source of error in the method is the decomposition of the blood pressure signal into two oscillatory components. The dynamical model of the cardiorespiratory interaction identified in the present research can be related to the well-known beat-to-beat model of cardiovascular control introduced by DeBoer and co-workers. The accuracy of the method is investigated using model-generated data with parameters close to those inferred in the experiment.



A. Torokhti, P. Howlett and C. Pearce : Optimal Data Estimation from Partially Observed Stochastic Process
The tasks considered in this paper are motivated by restrictions arising in real applications, mainly concerning practical ways of gathering data. In many practical cases, to estimate a component $x_k$ of the reference data $x = (x_1,\dots, x_m)^T$, an estimator $\mathcal{A}$ uses (or 'remembers') no more than the $p_k$ most recent components $y_{s_k},\dots,y_{v_k}$ of the measurement data $y = (y_1,\dots, y_m)^T$, where $s_k = v_k - p_k + 1$, $v_k \in \{1,\dots,k\}$ and $p_k \in \{1,\dots,v_k\}$. We say that such an estimator $\mathcal{A}$ has arbitrarily variable incomplete memory $p = \{p_1,\dots,p_m\}$. This restriction makes the problem of finding the best $\mathcal{A}$ quite specific. This is, perhaps, a reason why, despite a long history of the subject (see the bibliographies in [1-3]), even for the simplest structure of the estimator $\mathcal{A}$, when $\mathcal{A}$ is defined by a matrix, the problem of determining the best $\mathcal{A}$ has only been solved under the hard assumption of positive definiteness of an associated covariance matrix, and only for the case of complete memory. We avoid such bottlenecks and solve the problem in the general case of a polynomial estimator with arbitrarily variable incomplete memory $p$. The proposed technique is substantially different from those considered for the case of linear estimators and for the case of non-linear regression in [1-3]. A distinguishing feature of our solution is that the estimator is non-linear and causal with finite memory. The simplest particular case of our desired estimator is an optimal linear estimator defined by a lower p-band matrix; there is no published solution even for this simplest case. The novelty of our approach derives from our formulation of a new class of estimators, a new reduction technique to determine an optimal representative, and a rigorous error analysis. Theoretical results are illustrated with numerical simulations.
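The simplest particular case mentioned above, a causal linear estimator whose matrix is lower p-band, can be sketched directly: the mean-square-optimal matrix decomposes row by row into small normal equations over each row's memory window. The synthetic data and all constants below are illustrative, and the sketch does not attempt the authors' polynomial technique.

```python
# Minimal sketch of an optimal *linear* causal estimator with limited memory:
# row k of the matrix A may use only the p most recent components of y.
# Solved row-by-row from sample covariances; data are synthetic/illustrative.
import numpy as np

rng = np.random.default_rng(8)
m, p, n_train = 30, 5, 5000

# Hypothetical reference signal x (exponential covariance) and y = x + noise.
K = np.exp(-0.1 * np.abs(np.subtract.outer(np.arange(m), np.arange(m))))
C = np.linalg.cholesky(K)
X = rng.standard_normal((n_train, m)) @ C.T
Y = X + 0.5 * rng.standard_normal((n_train, m))

A = np.zeros((m, m))
for k in range(m):
    j = np.arange(max(0, k - p + 1), k + 1)        # memory window for row k
    Ryy = Y[:, j].T @ Y[:, j] / n_train            # sample covariance of y_J
    rxy = Y[:, j].T @ X[:, k] / n_train            # cross-covariance with x_k
    A[k, j] = np.linalg.solve(Ryy, rxy)            # row-wise normal equations

x_hat = Y @ A.T
print("mean squared error:", np.mean((x_hat - X) ** 2))
```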



A. Torokhti and P. Howlett: Optimal Linear Filtering with Piecewise-Constant Memory
We interpret stochastic processes $y = [y_1,\dots,y_n]^T$ and $x = [x_1,\dots,x_n]^T$ as observable data and a reference stochastic process, respectively. It is assumed that $y$ is a function of $x$ and a random noise, and it is required to find a linear filter $\mathcal{A}$ so that $\mathcal{A}(y)$ estimates $x$ in the best possible way in terms of minimizing the mean square error. Let $\hat{x} = \mathcal{A}(y)$, where $\hat{x} = [\hat{x}_1,\dots, \hat{x}_n]^T$. We partition $\hat{x}$ so that $\hat{x} = [\hat{u}_1^T, \hat{u}_2^T,\dots, \hat{u}_l^T]^T$, where $\hat{u}_i = [\hat{x}_{p_1+\dots+p_{i-1}+1},\dots, \hat{x}_{p_1+\dots+p_i}]^T$, $i = 1,\dots,l$, $p_0 = 0$, $\hat{u}_i \in L^2(\Omega;\mathbb{R}^{p_i})$, and $p_1 + \dots + p_l = n$. To determine a best $\hat{u}_i$, the filter $\mathcal{A}$ may transform no more than $m_i$ components $y_{s_i},\dots, y_{p_1+\dots+p_i}$ of $y$, where $m_i = (p_1 + \dots + p_i) - s_i + 1$, $s_i \in \{q_i, q_i+1,\dots, p_1+\dots+p_i\}$, $q_i \in \{1, 2,\dots, p_1 + \dots + p_i\}$ and $i = 1,\dots, l$. Such a filter $\mathcal{A}$ is called a filter with piecewise-constant memory $\{m_1,\dots,m_l\}$. The above constraint implies that the filter $\mathcal{A}$ and an associated matrix $A$ must have a compatible structure. The essential conditions are that the components $\hat{x}_{p_1+\dots+p_i}$ and $y_{p_1+\dots+p_i}$ have the same subscript and that $s_i$ is different for each $i$, i.e., for each $\hat{u}_i$. This means, first, that all entries above the diagonal of the matrix $A$ are zero and, second, that for each $i$ there can be a rectangular zero block in $A$ to the left of the diagonal. To satisfy the special structure of the filter, we propose a new technique based on a block-partition of the lower stepped part of the matrix $A$ into lower triangular and rectangular blocks, $L_{ij}$ and $R_{ij}$, with $i = 1,\dots, l$ and $j = 1,\dots,s_i$, where $l$ and $s_i$ are given. We show that the original error minimization problem in terms of the matrix $A$ reduces to $l$ individual error minimization problems in terms of the blocks $L_{ij}$ and $R_{ij}$. The solution to each problem is provided and a representation of the associated error is given. The results generalize those given in [1-3].



J. Tressou: Bayesian Nonparametrics for Heavy Tailed Distributions. Application to Food Risk Assessment
Using the fact that any heavy-tailed distribution can be approximated by a, possibly infinite, mixture of Pareto distributions, this paper proposes two Bayesian methodologies for inference on distribution tails belonging to the Fréchet maximum domain of attraction (heavy-tailed distributions). First, a Bayesian Pareto-based clustering procedure is developed based on the weighted Chinese restaurant process, where the mixing distribution is chosen to be the classical conjugate prior of the Pareto distribution. A new estimator for the tail index is also exhibited and compared to the well-known Hill estimator. Second, a nonparametric extension of the model-based clustering is proposed, assuming that the mixing distribution is distributed according to a Dirichlet process. The estimation of the tail probability is conducted using a partition-based Monte Carlo method. As an illustration, both methodologies are applied to a series of simulated data sets, in an empirical validation perspective, and to a real data set concerning dietary exposure to a mycotoxin called Ochratoxin A, in order to propose a new tool to characterize the subpopulations possibly at risk.
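For comparison purposes, the classical Hill estimator mentioned above is easy to state and compute: from the k largest order statistics, the tail index is estimated by the reciprocal mean log-excess. The Pareto sample below is synthetic and illustrative.

```python
# Minimal sketch of the classical Hill estimator of the tail index, the
# benchmark against which the abstract's Bayesian estimator is compared.
import numpy as np

rng = np.random.default_rng(9)

alpha_true = 2.5
x = rng.pareto(alpha_true, size=5000) + 1.0      # Pareto(alpha) on [1, inf)

def hill_tail_index(x, k):
    # Hill estimator from the k largest order statistics: the reciprocal of
    # the mean log-excess over the (k+1)-th largest observation.
    xs = np.sort(x)[::-1]
    return 1.0 / np.mean(np.log(xs[:k] / xs[k]))

for k in (50, 200, 1000):
    print(f"k = {k:4d}  Hill estimate of alpha: {hill_tail_index(x, k):.3f}")
```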



J. Vanhatalo and A. Vehtari: The Effect of Inducing Inputs on the Model Performance of Sparse Log Gaussian Processes in Spatial Epidemiology
Log Gaussian processes (GPs) are an attractive way to construct intensity surfaces for the purposes of spatial epidemiology. The surfaces are naturally smoothed by the GP, and spatial correlations can be included in an explicit and natural way in the model via a correlation function. The drawback of GPs is the computational burden, which becomes prohibitive as the number of cases increases beyond a few thousand. In this work we use a Poisson model, and the spatial GP prior is given a fully independent training conditional (FITC) sparse approximation (Snelson and Ghahramani, 2006). The posterior inference is conducted by Markov chain Monte Carlo simulations. The FITC approximation is based on an additional set of variables, called inducing inputs, that are used to give a low-rank approximation of the covariance matrix. We study the performance of the approximation with a varying number of inducing inputs and different choices of their locations. A comparison of different models is conducted using various model criteria on simulated and real data. The dependence of the posterior of the covariance function's length-scale on the distance between inducing inputs is also studied.
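For reference, the FITC approximation replaces the full prior covariance $K_{ff}$ by a low-rank-plus-diagonal form built from the inducing inputs $u$ (standard notation following Snelson and Ghahramani, 2006):

\[
K_{ff} \;\approx\; Q_{ff} + \operatorname{diag}\big(K_{ff} - Q_{ff}\big), \qquad
Q_{ff} \;=\; K_{fu} K_{uu}^{-1} K_{uf},
\]

so the low-rank term carries the long-range structure while the diagonal correction keeps the marginal variances exact.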


Inquiries: bisp5@uv.es