Point process models are used to analyze data that take the form of a set of points in space, effectively a set of random dots on a map. Such data arise naturally in many fields, ranging from the observed locations of animals or trees (in ecology) to disease cases (in epidemiology). The usual goal is to understand why things happen where they do, and to predict future occurrences. In our particular applications, the data consist of ocular fixation locations on an image or a video, and the goal is to understand why subjects choose to fixate in one place rather than another. To do so, point process models tie the observed locations to spatial covariates. Animals are more likely to be found where food is available, so a map of food availability makes a good spatial covariate for a model of animal locations. In the same way, fixation locations can be predicted from local characteristics of the image (Privitera & Stark, 2000), local contrast being a good example: homogeneous image regions are less interesting (Itti et al., 1998).
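To make the link between locations and covariates concrete, a common choice (a generic sketch, not necessarily the model used here) is an inhomogeneous Poisson process whose intensity is a log-linear function of the covariates; the symbols $\lambda(s)$ and $c(s)$ below are introduced for illustration only:
\[
\lambda(s) = \exp\bigl(\beta_0 + \beta_1\, c(s)\bigr),
\]
where $\lambda(s)$ is the expected density of points (e.g., fixations) at location $s$, $c(s)$ is the value of a spatial covariate such as local contrast at $s$, and $\beta_0, \beta_1$ are parameters estimated from the data; a positive $\beta_1$ would mean that fixations concentrate where contrast is high.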
From a cognitive point of view, fixation sequences (so-called scanpaths) reveal non-stationary processes. For example, in an information-seeking task such as reading or visual exploration (visual search), phases of reading, visual search, decision-making, or confirmation are intertwined. To infer such latent states from data, an approach based on hidden Markov models (HMMs) and inverse problem modeling has been considered by Simola et al. (2008). These cognitive phases are expected to be reflected in both eye-movement and EEG sequences. However, when using such multimodal data, theoretical issues arise, concerning mainly the heterogeneity of the observations (continuous or discrete, whether categorical/nominal or ordinal), and also the time delays between the different modalities, which result in asynchrony between phases.
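As a concrete illustration of the HMM idea (a minimal sketch, not the method of Simola et al., 2008, and ignoring the multimodal heterogeneity and asynchrony issues raised above), a Gaussian HMM can be fitted to a sequence of eye-movement features to recover a most likely sequence of latent phases. The feature choice, the number of states, and the synthetic data below are assumptions made purely for illustration.

```python
# Minimal sketch: inferring latent "phases" from an eye-movement feature
# sequence with a Gaussian HMM. Features and number of states are
# illustrative assumptions, not taken from the study described above.
import numpy as np
from hmmlearn import hmm

# Each row describes one fixation: (duration in ms, saccade amplitude in deg).
# In practice these would be extracted from the recorded scanpath;
# here we generate placeholder data.
rng = np.random.default_rng(0)
features = rng.normal(loc=[250.0, 4.0], scale=[80.0, 2.0], size=(500, 2))

# One hidden state per hypothesised phase (e.g., reading, search, decision).
model = hmm.GaussianHMM(n_components=3, covariance_type="full", n_iter=100)
model.fit(features)

# Most likely sequence of latent phases (Viterbi decoding).
states = model.predict(features)
print(states[:20])
```

Extending such a model to jointly account for eye-movement and EEG observations would require emission distributions mixing continuous and discrete variables and a mechanism for the temporal misalignment between modalities, which is precisely the theoretical difficulty noted above.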