6  Inferring the parameters of UNTB from outcomes

6.1 Key questions

  1. How do we use the outcomes of a process model to guess at underlying processes (parameter values)?
  2. How can we leverage machine learning (random forest) to infer parameter values from the outcomes of a neutral model?
  3. What are the limiting factors in inferring process from outcome in a process model?

6.2 Learning objectives

  1. Understand the outcome-to-parameter inference model.
  2. Fit and evaluate a random forest model to our UNTB data.
  3. Identify the limiting features of this approach (particularly model identifiability) and brainstorm possible solutions.

6.3 Lesson outline

Video lecture: Inferring the parameters of neutral theory from model outcomes

6.3.1 (Lecture/discussion) Going from outcome to parameters

Key question:

  1. How could we use a process model to better understand actual data?

Key points:

  1. As a starting point, we can see if we can recover generative parameters when we know how they were produced.

6.3.2 (Lecture/discussion) Random forest regression

Key points:

  1. To guess at the parameters that generated some outcome data, we’re writing a model of the general form parameter ~ results. This isn’t necessarily a linear or otherwise tidy relationship, so we’ll use machine learning.
  2. Random forest can do regression with many parameters predicting nonlinear relationships.
  3. We will begin by using a random forest to try to recover parameter values for simulations whose generative parameters we already know.

6.3.3 (Lecture) Fitting a random forest to our neutral data
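
A minimal sketch of this kind of fit, written with scikit-learn (an assumption; the course toolchain is unstated). The "summary statistics" here are fabricated toy stand-ins for real UNTB output, and the variable names (theta, richness, diversity) are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical stand-in for UNTB output: each "simulation" has a known
# generative parameter, and we record two summary statistics of the
# resulting community (stand-ins for, e.g., richness and a diversity index).
n_sims = 1000
theta = rng.uniform(0.01, 1.0, n_sims)                  # known parameter
richness = 50 * theta + rng.normal(0, 1.0, n_sims)      # toy summary stat 1
diversity = np.log(theta) + rng.normal(0, 0.1, n_sims)  # toy summary stat 2
X = np.column_stack([richness, diversity])              # outcomes as predictors

# parameter ~ results: train on sims with known parameters, score on held-out sims
X_train, X_test, y_train, y_test = train_test_split(
    X, theta, test_size=0.25, random_state=0
)
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
print(f"held-out R^2: {rf.score(X_test, y_test):.2f}")
```

Because the toy statistics carry most of the information about the parameter, the held-out R^2 should be close to 1; real UNTB summaries will generally be noisier.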

6.3.4 (Discussion) What problems arise in the random forest?

Key points:

  1. Multiple sets of parameters can lead to the same outcomes, making it difficult to infer backwards from outcome to parameters (a model identifiability problem).
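
One way to see this concretely is with a toy model (not UNTB itself) in which the recorded outcome depends only on the product of two parameters, so many parameter pairs map to the same outcome. This sketch, assuming scikit-learn is available, shows the random forest's recovery accuracy plateauing well below perfect:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Toy non-identifiable model: the outcome depends only on the PRODUCT of
# two parameters, so e.g. (a=2, b=3) and (a=3, b=2) produce exactly the
# same outcome and cannot be told apart from the outcome alone.
n = 2000
a = rng.uniform(0.5, 2.0, n)
b = rng.uniform(0.5, 2.0, n)
outcome = (a * b).reshape(-1, 1)  # the only recorded summary statistic

rf = RandomForestRegressor(n_estimators=100, random_state=0)
r2 = cross_val_score(rf, outcome, a, cv=5).mean()
print(f"R^2 recovering a from a*b alone: {r2:.2f}")  # stays well below 1
```

No amount of extra training data fixes this: the information about a simply is not in the outcome.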

6.3.5 (Discussion) How might we address these challenges?

Key points:

  1. Adding more outcome dimensions (e.g., additional summary statistics of the simulated communities) can break a many-to-one mapping.
  2. It remains critical that the underlying process model be an appropriate description of the system that generated the data.
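
The first key point can be sketched with a toy product model whose outcome alone cannot distinguish (a, b) pairs: recording a second, independent summary statistic (here a/b, invented purely for illustration) resolves the degeneracy, since a = sqrt((a*b) * (a/b)). Again this assumes scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# With only a*b recorded, many (a, b) pairs map to one outcome.
# Adding a second statistic, a/b, makes a recoverable exactly.
n = 2000
a = rng.uniform(0.5, 2.0, n)
b = rng.uniform(0.5, 2.0, n)
one_stat = (a * b).reshape(-1, 1)
two_stats = np.column_stack([a * b, a / b])

rf = RandomForestRegressor(n_estimators=100, random_state=0)
r2_one = cross_val_score(rf, one_stat, a, cv=5).mean()
r2_two = cross_val_score(rf, two_stats, a, cv=5).mean()
print(f"R^2 with one statistic: {r2_one:.2f}; with two: {r2_two:.2f}")
```

The jump in R^2 from the one-statistic to the two-statistic fit is the "more data dimensions" point in miniature; the catch for real UNTB output is finding summary statistics that actually carry the missing information.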