# 6 Inferring the parameters of UNTB from outcomes

## 6.1 Key questions

- How do we use the outcomes of a process model to guess at underlying processes (parameter values)?
- How can we leverage machine learning (random forest) to infer parameter values from the outcomes of a neutral model?
- What are the limiting factors in inferring process from outcome in a process model?

## 6.2 Learning objectives

- Understand the outcome-to-parameter inference model.
- Fit and evaluate a random forest model to our UNTB data.
- Identify the limiting features of this approach (particularly model identifiability) and brainstorm possible solutions.

## 6.3 Lesson outline

Video lecture: Inferring the parameters of neutral theory from model outcomes

### 6.3.1 (Lecture/discussion) Going from outcome to parameters

Key question:

- How could we *use* a process model to better understand actual data?

Key points:

- As a starting point, we can see if we can recover generative parameters when we know how they were produced.
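This starting point can be sketched with a toy simulation. The model below is a minimal zero-sum neutral drift loop (a deliberate simplification standing in for the course's actual UNTB simulation code): we pick a known speciation rate `nu`, run the dynamics, and record outcome statistics that we could later try to invert.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_neutral(nu, J=200, steps=5000):
    """Zero-sum neutral drift: each step one individual dies and is
    replaced by a new species (prob nu) or by a copy of a random
    individual from the community (prob 1 - nu)."""
    community = np.zeros(J, dtype=int)  # everyone starts as species 0
    next_species = 1
    for _ in range(steps):
        dead = rng.integers(J)
        if rng.random() < nu:
            community[dead] = next_species  # speciation event
            next_species += 1
        else:
            community[dead] = community[rng.integers(J)]  # local birth
    return community

def summarize(community):
    """Outcome statistics a downstream model could regress on."""
    counts = np.unique(community, return_counts=True)[1]
    p = counts / counts.sum()
    return len(counts), -np.sum(p * np.log(p))  # richness, Shannon H

# Known parameter -> observed outcome
for nu in (0.001, 0.01, 0.1):
    richness, shannon = summarize(simulate_neutral(nu))
    print(f"nu={nu}: richness={richness}, Shannon H={shannon:.2f}")
```

Higher speciation rates should yield higher richness, which is exactly the signal an inference model would exploit.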

### 6.3.2 (Lecture/discussion) Random forest regression

Key points:

- To guess at the parameters that generated some outcome data, we fit a model of the general form `parameter ~ results`. This isn't necessarily a linear or otherwise tidy relationship, so we'll use machine learning.
- Random forest regression can handle many predictors and nonlinear relationships.
- We will begin by using a random forest to try to recover parameter values for known sims.

### 6.3.3 (Lecture) Fitting a random forest to our neutral data
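A minimal sketch of this fitting step using scikit-learn's `RandomForestRegressor`. Note the "simulation output" here is faked with an arbitrary noisy nonlinear function, purely to illustrate the `parameter ~ results` workflow; in the lesson, each row would come from a real neutral-model run with a known parameter.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for simulation output: each row is a set of summary
# statistics (e.g. richness, Shannon diversity) produced by a
# known parameter value theta.
n_sims = 500
theta = rng.uniform(0.001, 0.1, n_sims)                    # known parameter
richness = 200 * np.sqrt(theta) + rng.normal(0, 1, n_sims)  # fake outcome 1
shannon = np.log1p(50 * theta) + rng.normal(0, 0.05, n_sims)  # fake outcome 2

X = np.column_stack([richness, shannon])  # outcomes = predictors
y = theta                                 # parameter = target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)
print("R^2 on held-out sims:", rf.score(X_test, y_test))
```

Evaluating on held-out simulations (rather than the training set) is what tells us whether parameter recovery actually works.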

### 6.3.4 (Discussion) What problems arise in the random forest?

Key points:

- Multiple sets of parameters can lead to the same outcomes, making it difficult to infer backwards.

### 6.3.5 (Discussion) How might we address these challenges?

Key points:

- More data dimensions can break a many-to-one mapping.
- It remains critical that the underlying process be appropriate.
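The first point can be illustrated with a toy example (synthetic data, not neutral-model output): when the only available outcome dimension reflects the *sum* of two parameters, many parameter pairs map to the same outcome and neither parameter can be recovered well. Adding a second dimension that separates them breaks the degeneracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 1000
a = rng.uniform(0, 1, n)  # the parameter we want to recover
b = rng.uniform(0, 1, n)  # a second, confounding parameter

# Outcome 1 only "sees" a + b: many (a, b) pairs collide.
combined = a + b + rng.normal(0, 0.01, n)
# Outcome 2 depends on a alone, breaking the many-to-one mapping.
a_only = a + rng.normal(0, 0.01, n)

rf = RandomForestRegressor(n_estimators=100, random_state=0)
r2_one = cross_val_score(rf, combined.reshape(-1, 1), a, cv=3).mean()
r2_two = cross_val_score(
    rf, np.column_stack([combined, a_only]), a, cv=3
).mean()
print(f"R^2 for a from 1 dim: {r2_one:.2f}; from 2 dims: {r2_two:.2f}")
```

The second point still stands, though: extra dimensions only help if the underlying process model actually generates data like the observations.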