Say you want to model how a disease like HIV evolves over time in an individual, or how moods like stress or craving evolve as measured by self-reports or ecological momentary assessments (EMA), and you believe there is some underlying ‘true’ trajectory. For disease and moods, you likely want a continuous-time model that handles irregularly sampled data and allows the disease or mood process to change at arbitrary times. However, you still have a big choice to make: do you use a discrete-state latent variable model such as a continuous-time Markov chain (CTMC) or continuous-time hidden Markov model (CT-HMM), or do you assume a continuous-valued true trajectory and use a mixed model or Gaussian process (GP)?
There are several arguments for a discrete latent variable model:
- It is arguably more interpretable. Clinicians already describe diseases like cancer in terms of stages that they are all familiar with: stages 0-4. I’m not sure whether behavioral science has stage-wise models in the same way, but if it does, a discrete latent variable model would be appropriate.
- Because the states are discrete, we can describe the rates at which we move between states, how long we stay in a state on average, and how covariates affect the rate of moving between different states. For instance, if we had four states of stress, we could say that as you increase income by $100/month, the mean holding time for state two of stress goes down by x, but the mean holding time for state three of stress goes down by y. Statements like this are very difficult to make using mixed models or Gaussian processes. They are still somewhat difficult here, since they require extensive validation of whether the learned model makes sense, but the framework at least lends itself to such statements.
- It allows more flexible observation models, including observations of different types. For instance, if you want to model one binary question, one ordinal question, and one categorical question in longitudinal survey data, along with one continuous-valued observation, this is fairly easy with a discrete latent variable model. You simply assume that all observations are conditionally independent given the state, and then specify some exponential family observation distribution for each one. It’s less straightforward with mixed models (although [1] seems to do this) and even less straightforward with Gaussian processes.
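To make the holding-time statement from the list above concrete, here is a minimal sketch. It assumes a log-linear model for each transition rate, log q_ij(x) = intercept + slope * x, with x being income in units of $100/month. The parameter values, the names `RATE_PARAMS`, `generator`, and `mean_holding_times`, and the restriction to moves between adjacent states are all made up for illustration. The only fact relied on is that in a CTMC the holding time in state i is exponential with rate equal to the total exit rate, so its mean is one over that rate.

```python
import math

# Hypothetical parameters: (from_state, to_state) -> (intercept, slope).
# States 0..3 are stress levels; x is income in units of $100/month.
RATE_PARAMS = {
    (0, 1): (-1.0, -0.10),
    (1, 0): (-0.5, 0.20),
    (1, 2): (-1.5, -0.05),
    (2, 1): (-0.8, 0.10),
    (2, 3): (-1.2, -0.02),
    (3, 2): (-1.0, 0.05),
}
N_STATES = 4


def generator(x):
    """Build the CTMC generator matrix Q(x) under the assumed log-linear rates."""
    Q = [[0.0] * N_STATES for _ in range(N_STATES)]
    for (i, j), (intercept, slope) in RATE_PARAMS.items():
        Q[i][j] = math.exp(intercept + slope * x)
    for i in range(N_STATES):
        # Diagonal entry is minus the total exit rate, so each row sums to zero.
        Q[i][i] = -sum(Q[i])
    return Q


def mean_holding_times(Q):
    """Mean holding time in state i is 1 / (total exit rate) = -1 / Q[i][i]."""
    return [-1.0 / Q[i][i] for i in range(N_STATES)]


if __name__ == "__main__":
    before = mean_holding_times(generator(20.0))  # income of $2000/month
    after = mean_holding_times(generator(21.0))   # $100/month more
    for state in (1, 2):
        change = after[state] - before[state]
        print(f"state {state}: mean holding time changes by {change:.4f}")
```

The two printed changes play the role of x and y in the example above: the same $100/month increase shifts the mean holding time of each state by a different amount, in whatever time unit the rates are expressed in.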
Alternatively, we could use a continuous-valued model like a GP or a mixed model. The advantages here are:
- You may believe the true process is continuous-valued and want that extra resolution. In particular, a CTMC/CT-HMM assumes that dynamics within a state don’t matter, but maybe they do.
- With a Gaussian process, while you lose flexibility in the observation model, you can gain flexibility in the evolution of the true trajectory. For instance, in a CT-HMM (discrete state model), the influence of old observations decays exponentially fast. That is, old data becomes stale. In a Gaussian process, this doesn’t have to be the case: old measurements can easily retain a large influence on the present, depending on the covariance function you specify between time points.
- The long-run evolution may be easier to describe intuitively with either a GP (via the mean function) or a mixed model (via the slope and intercept). In Markov models, unless you have an absorbing state, it’s difficult to describe long-run dynamics.
- A mixed model is ‘personalized.’ Each individual has a different set of unknown parameters. Given the latent variable (random effects), everyone has different long-run dynamics.
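The point above about stale data can be illustrated with a small sketch. It uses a two-state CTMC, where the transition probabilities have a closed form, and compares how quickly the chain forgets its starting state with how two common GP covariance functions correlate values a time gap apart. The rates, length scales, period, and function names are all hypothetical, and dependence on the starting state is only a stand-in for the influence of old observations in a full CT-HMM.

```python
import math


def p_to_state1(t, a, b, start):
    """P(X_t = 1 | X_0 = start) for a two-state CTMC.

    a is the rate of moving 1 -> 2 and b is the rate of moving 2 -> 1.
    """
    pi1 = b / (a + b)                      # stationary probability of state 1
    decay = math.exp(-(a + b) * t)
    if start == 1:
        return pi1 + (1.0 - pi1) * decay
    return pi1 - pi1 * decay


def ctmc_memory(t, a, b):
    """How much the starting state still matters after time t.

    This difference equals exp(-(a + b) * t), so it decays exponentially.
    """
    return p_to_state1(t, a, b, start=1) - p_to_state1(t, a, b, start=2)


def squared_exponential_kernel(dt, length_scale):
    """GP correlation that fades smoothly as the time gap dt grows."""
    return math.exp(-dt ** 2 / (2.0 * length_scale ** 2))


def periodic_kernel(dt, period, length_scale):
    """GP correlation that returns to 1 whenever dt is a multiple of the period."""
    return math.exp(-2.0 * math.sin(math.pi * dt / period) ** 2 / length_scale ** 2)


if __name__ == "__main__":
    for dt in (0.0, 1.0, 3.5, 7.0, 14.0):
        print(
            f"gap={dt:5.1f}  "
            f"ctmc={ctmc_memory(dt, a=0.5, b=0.3):.4f}  "
            f"se={squared_exponential_kernel(dt, length_scale=1.0):.4f}  "
            f"periodic={periodic_kernel(dt, period=7.0, length_scale=1.0):.4f}"
        )
```

With a weekly period, a measurement from exactly one or two weeks ago is as informative under the periodic kernel as one taken just now, while the chain's dependence on its starting state has all but vanished over the same gap. Which behavior is right depends on the process; the sketch only shows that the covariance function gives you the choice.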
Note that there are ways to address many of the limitations I described, but they either involve approximations with few or no guarantees, or make implementation substantially more difficult. For instance, one could use a Gaussian process prior with a discrete observation likelihood, but then the posterior is no longer a Gaussian process, and one would probably need a variational approximation or sampling methods. Further, one could work on relaxing the Markov assumption in discrete-state latent variable models. However, for a first analysis one should probably start by thinking about the simpler trade-offs.
[1] Yu, Tingting, Lang Wu, and Peter B. Gilbert. “A joint model for mixed and truncated longitudinal data and survival data, with application to HIV vaccine studies.” Biostatistics 19, no. 3 (2017): 374-390.