Algorithmic methods for the analysis of complex, high-dimensional data

Scientific description of the project

The goal of the “persyvact” research project is to build tools for analyzing hierarchically structured models of complex, high-dimensional data. The tools proposed by “persyvact” are based on recent mathematical and algorithmic developments. A universal idea underpinning the concept of structure is that combining simple local components into a consistent model allows modelers to describe complex data with great accuracy.
The challenge of analyzing “big” data sets also requires that researchers in the applied sciences, in biology and medicine, or in signal processing interact with researchers in computational mathematics to extend the capacity of their traditional methods and to answer new questions.
Several traditional computational methods are limited by overly narrow assumptions, and their lack of robustness introduces errors that can bias data analyses.
This happens when the samples exhibit complex statistical dependencies, for example due to repeated experiments, uneven experimental designs, clustered or grouped data, and spatio-temporal relationships.
A second source of complexity arises from the measurement process itself, which may involve very different instruments or record data of very different kinds. High-dimensional data are often heterogeneous and potentially noisy or incomplete, and they can contain multi-level information about the spatial and temporal scales of the observed process.
The “persyvact” project proposes mathematical and statistical methods to facilitate the analysis of high-dimensional data exhibiting complex dependencies and heterogeneous or multi-level structure. It addresses important questions in the analysis of large-scale digital data, population genomic data and large sensor networks.

Results

The project has fostered collaborations between researchers from GIPSA-lab, LJK and TIMC-IMAG. Four PhD theses, each co-supervised by researchers from two of the three labs, have started since October 2013. Five papers and conference proceedings have been produced by Persyvact. The Persyvact project has organized many scientific events, including the international workshop Statlearn 2015.

5 March, 2015 meeting of the ADM research action

Slides about the Persyvact project

Slides about the PhD of Alessandro Chiancone

Seminars and meetings

Publications

  • Mairal J. (2015). Incremental majorization-minimization optimization with application to large-scale machine learning. SIAM Journal on Optimization, in press. HAL
  • Chiancone A., Chanussot J., & Girard S. (2014). Collaborative sliced inverse regression. Rencontre d’astrostatistique, Grenoble, 2014. HAL
  • Clausel M., Roueff F., Taqqu M., & Tudor C.A. (2014). Asymptotic behavior of the quadratic variation of the sum of two Hermite processes of consecutive orders. Stochastic Processes and their Applications, 124: 2517-2541 HAL
  • Duforet-Frebourg, N., & Blum, M. G. (2014). Non-stationary patterns of isolation-by-distance: inferring measures of local genetic differentiation with Bayesian kriging. Evolution, 68(4), 1110-1123. HAL
  • Duforet-Frebourg, N., Bazin, E., & Blum, M. G. (2014). Genome scans for detecting footprints of local adaptation using a Bayesian factor model. Molecular Biology and Evolution, msu182. HAL
  • Frichot, E., Mathieu F., Trouillon T., Bouchard G., & François O. (2014). Fast and efficient estimation of individual ancestry coefficients. Genetics, 196: 973-983. HAL
  • He X., Condat L., Bioucas-Dias J., Chanussot J., & Xia J. (2014). A new pansharpening method based on spatial and spectral sparsity priors. IEEE Transactions on Image Processing, 23: 4160-4174. HAL
  • Mairal, J., Koniusz, P., Harchaoui, Z., & Schmid, C. (2014). Convolutional kernel networks. In Advances in Neural Information Processing Systems (pp. 2627-2635). HAL
  • Prangle, D., Blum, M. G. B., Popovic, G., & Sisson, S. A. (2014). Diagnostic tools of approximate Bayesian computation using the coverage property. Australian & New Zealand Journal of Statistics, 56: 309-329. HAL
  • Clausel M., Roueff F., & Taqqu M. (2013). Large scale reduction principle and application to hypothesis testing. Electronic Journal of Statistics, 9: 153-203. HAL
  • Frichot, E., Schoville, S. D., Bouchard, G., & François, O. (2013). Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular Biology and Evolution, 30(7), 1687-1699. HAL

Members

  • Sophie Achard CNRS-Gipsa-lab
  • Hacheme Ayasso UJF-Gipsa-lab
  • Michael Blum CNRS-TIMC
  • Jean Marc Brossier INP-Gipsa-lab
  • Florent Chatelain, INP-Gipsa-lab
  • Jocelyn Chanussot, INP-Gipsa-lab
  • Marianne Clausel UJF-LJK
  • Jean François Coeurjolly UPMF-LJK
  • Laurent Condat, CNRS-Gipsa-lab
  • Michel Desvignes, INP-Gipsa-lab
  • Jean Baptiste Durand INP-Gipsa-lab
  • Florence Forbes INRIA-LJK
  • Olivier François INP-TIMC
  • Stéphane Girard INRIA-LJK
  • Radu Horaud INRIA-LJK
  • Sophie Lambert UPMF-TIMC
  • Julien Mairal INRIA-LJK
  • Marie-José Martinez UPMF-LJK
  • Olivier Michel, PR INP
  • Valérie Perrier INP-LJK
  • Nicolas Thierry-Mieg, CNRS-TIMC

Coordinators

  • Michael Blum CNRS-TIMC
  • Marianne Clausel UJF-LJK
  • Laurent Condat, CNRS-Gipsa-lab