6 Inferring the parameters of UNTB from outcomes
6.1 Key questions
- How do we use the outcomes of a process model to infer the underlying processes (parameter values)?
- How can we leverage machine learning (random forest) to infer parameter values from the outcomes of a neutral model?
- What are the limiting factors in inferring process from outcome in a process model?
6.2 Learning objectives
- Understand the outcome-to-parameter inference model.
- Fit and evaluate a random forest model to our UNTB data.
- Identify the limitations of this approach (particularly issues of model identifiability) and brainstorm possible solutions.
6.3 Lesson outline
Video lecture: Inferring the parameters of neutral theory from model outcomes
6.3.1 (Lecture/discussion) Going from outcome to parameters
Key question:
- How could we use a process model to better understand actual data?
Key points:
- As a starting point, we can see if we can recover generative parameters when we know how they were produced.
6.3.2 (Lecture/discussion) Random forest regression
Key points:
- To estimate the parameters that generated some outcome data, we fit a model of the general form `parameter ~ results`. This isn't necessarily a linear or otherwise tidy relationship, so we'll use machine learning.
- Random forest regression can handle many predictors and capture nonlinear relationships.
- We will begin by using a random forest to try to recover the known parameter values of our simulations.
6.3.3 (Lecture) Fitting a random forest to our neutral data
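A minimal sketch of this fitting step, with a hypothetical stand-in for the neutral simulations: here a parameter `theta` generates two made-up outcome summaries (richness-like and evenness-like statistics), and a random forest is fit in the `parameter ~ results` direction on simulations with known parameters, then evaluated on held-out simulations. The generative functions and variable names are illustrative assumptions, not the actual UNTB model.

```python
# Sketch: recover a generative parameter from simulated outcomes.
# The "neutral model" here is a hypothetical stand-in: two nonlinear
# summary statistics computed from a known parameter theta, plus noise.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_sims = 2000

# "True" parameter values used to generate each simulated community
theta = rng.uniform(1, 50, n_sims)

# Hypothetical outcome summaries (richness- and evenness-like statistics)
richness = 10 * np.log(theta) + rng.normal(0, 1, n_sims)
evenness = 1 / (1 + theta / 10) + rng.normal(0, 0.02, n_sims)
X = np.column_stack([richness, evenness])

# parameter ~ results: fit on known sims, evaluate on held-out sims
X_tr, X_te, y_tr, y_te = train_test_split(X, theta, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
r2 = r2_score(y_te, rf.predict(X_te))
print(round(r2, 2))
```

With real neutral-model output, the outcome summaries would instead be statistics computed from simulated species-abundance distributions.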
6.3.4 (Discussion) What problems arise in the random forest?
Key points:
- Multiple sets of parameters can lead to the same outcomes, making it difficult to infer parameters from outcomes.
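A toy illustration of this many-to-one problem, using a made-up outcome function (the product of two hypothetical parameters) rather than the neutral model itself:

```python
# Non-identifiability in miniature: distinct parameter pairs map to the
# same outcome under this made-up outcome function, so the outcome alone
# cannot be inverted back to a unique parameter set.
param_sets = [(2.0, 12.0), (3.0, 8.0), (6.0, 4.0)]
outcomes = [a * b for a, b in param_sets]
print(outcomes)  # every pair yields the same outcome: [24.0, 24.0, 24.0]
```

No regression method, random forest included, can distinguish these parameter sets from this outcome.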
6.3.5 (Discussion) How might we address these challenges?
Key points:
- More data dimensions can break a many-to-one mapping.
- It remains critical that the underlying process be appropriate.
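The first key point can be sketched with another hypothetical setup: recovering a parameter `a` from the single summary `a + b` is ambiguous (many pairs share a sum), but adding a second summary `a - b` pins `a` down exactly as `(f1 + f2) / 2`, and the forest's cross-validated score improves accordingly. The outcome functions here are illustrative assumptions.

```python
# Sketch: an extra outcome dimension can break a many-to-one mapping.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
a = rng.uniform(0, 10, n)  # parameter we want to recover
b = rng.uniform(0, 10, n)  # nuisance parameter
f1 = a + b                 # ambiguous summary: many (a, b) share a sum
f2 = a - b                 # second summary; together they identify a

rf = RandomForestRegressor(n_estimators=100, random_state=0)
one_dim = cross_val_score(rf, f1.reshape(-1, 1), a, cv=5).mean()
two_dim = cross_val_score(rf, np.column_stack([f1, f2]), a, cv=5).mean()
print(round(one_dim, 2), round(two_dim, 2))  # two_dim is much higher
```

For the neutral simulations, the analogue is computing additional summary statistics from each simulated community so that different parameter sets no longer collapse onto the same outcomes.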