tags : Machine Learning, Statistics

## FAQ

### “Correlation does not imply causation” meme

- “Correlation does not imply causation”: yes, commonly said
- But causation does not imply correlation either. There can be hidden variables.
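A minimal sketch of the second point, under made-up assumptions: here `x` really does cause `y`, but a hidden variable `z` flips the sign of the effect half of the time, so the overall correlation washes out. The model `y = z * x`, the sample size, and the helper `pearson` are all hypothetical choices for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(0)
xs, ys, zs = [], [], []
for _ in range(10_000):
    x = rng.gauss(0, 1)
    z = rng.choice([-1, 1])  # hidden variable (assumed): flips the effect of x
    xs.append(x)
    zs.append(z)
    ys.append(z * x)         # x causes y, with a sign decided by z

# Ignoring z: x and y look uncorrelated.
overall = pearson(xs, ys)

# Conditioning on z = 1: the causal link shows up as a perfect correlation.
x_pos = [x for x, z in zip(xs, zs) if z == 1]
y_pos = [y for y, z in zip(ys, zs) if z == 1]
given_z_pos = pearson(x_pos, y_pos)

print(f"overall r = {overall:.3f}, r given z=1 = {given_z_pos:.3f}")
```

The overall `r` should land close to zero, while the within-group `r` is 1 up to floating-point error.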

### What are associations?

- `correlation` is a limited measure of `association`
- `variables` can be `associated` but have no `correlation`
- `associations` are bi-directional: `associations` between `variables` run in both directions
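A small sketch of variables that are associated but have no correlation: `y` is fully determined by `x`, yet the Pearson coefficient is numerically zero because the relationship is symmetric rather than linear. The choice of `y = x**2` and the helper `pearson` are assumptions for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

xs = [i / 10 for i in range(-50, 51)]  # grid symmetric around 0
ys = [x ** 2 for x in xs]              # y depends on x exactly

r = pearson(xs, ys)
print(f"Pearson r = {r:.6f}")          # numerically zero despite full dependence
```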

## Basics

### What is Causal Inference

- What happens when some intervention is done
- Can only be done if we have a causal model

#### Causal Prediction

- Different from normal prediction
- Predicting the effect
  - Being able to `predict the consequences` of an `intervention`.
  - Eg. `Movement of the trees` and `wind` are statistically `associated`. But nothing in the data tells you that `wind` causes the trees to move.
- What if I do this?
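The wind and trees example can be sketched as a tiny simulation. It assumes a causal model in which wind drives tree movement (that direction is supplied by us, not learned from data), and the function `simulate` and its `do_wind` / `do_trees` parameters are hypothetical names for this sketch.

```python
import random

def simulate(n, seed, do_wind=None, do_trees=None):
    """Sample (wind, trees) pairs from an assumed model where wind -> trees.

    Passing do_wind or do_trees overrides that variable, i.e. an intervention.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        wind = rng.gauss(0, 1) if do_wind is None else do_wind
        # Trees follow the wind (plus noise) unless we force them ourselves.
        trees = (wind + rng.gauss(0, 0.1)) if do_trees is None else do_trees
        rows.append((wind, trees))
    return rows

def mean(values):
    return sum(values) / len(values)

# Intervening on the wind (a giant fan) moves the trees ...
trees_under_do_wind = mean([t for _, t in simulate(5000, seed=1, do_wind=2.0)])

# ... but shaking the trees by hand does not change the wind.
wind_under_do_trees = mean([w for w, _ in simulate(5000, seed=2, do_trees=2.0)])

print(f"mean trees under do(wind=2):  {trees_under_do_wind:.2f}")
print(f"mean wind under do(trees=2): {wind_under_do_trees:.2f}")
```

Observational data from this model would show the same association in both directions; only the interventions tell the two directions apart.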

#### Causal Imputation

- Knowing the cause
  - Being able to `construct` some `unobserved counterfactual outcome`
- What if I had done something else?
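A minimal sketch of constructing an unobserved counterfactual, assuming we already know the structural equation. The equation `Y = 2*X + U` and the observed values are made up for illustration; the three steps (recover the unit's noise, change the action, re-predict) are one common way to compute a counterfactual.

```python
def outcome(x, u):
    """Assumed structural equation Y = 2*X + U (hypothetical, for illustration)."""
    return 2.0 * x + u

# What actually happened to this unit.
observed_x, observed_y = 1.0, 3.5

# Step 1: recover the unit-specific background term U from the observation.
u = observed_y - 2.0 * observed_x

# Step 2: imagine a different action, X = 0, keeping U fixed.
counterfactual_x = 0.0

# Step 3: re-run the model to get the outcome we never observed.
counterfactual_y = outcome(counterfactual_x, u)

print(f"observed y = {observed_y}, counterfactual y = {counterfactual_y}")
```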

### Related topics to Causal Inference

- Description (of `population`)
  - The `sample` is `caused` by `things`
- Design (of research project)
  - `things` need to be drawn with a causal logic which will help us `design/calculate` around them
  - thinking about why `sample` differs from the `population`

## Models

### DAGs

- Example
  - Here `A` influences the treatment `X`
  - Here `B` influences the outcome `Y`
  - Here `C` influences both `X` and `Y` (Confound, we’d want to control it)
- We can ask multiple questions to this model
- DAGs are intuition pumps: get head out of data, into science
- Gives you a strategy for which `control variables` you need to play with
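The example DAG can be sketched as a simulation to see why `C` is the variable to control. Everything numeric here is an assumption: a binary `C`, linear effects, and a true effect of `X` on `Y` of 1.0. "Controlling" is done in the simplest possible way, by estimating the slope separately within each level of `C` and averaging.

```python
import random

def slope(xs, ys):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

TRUE_EFFECT = 1.0  # assumed causal effect of X on Y
rng = random.Random(0)
rows = []
for _ in range(20_000):
    a = rng.gauss(0, 1)              # A -> X
    b = rng.gauss(0, 1)              # B -> Y
    c = rng.choice([0, 1])           # C -> X and C -> Y (the confound)
    x = a + 2.0 * c
    y = TRUE_EFFECT * x + b + 3.0 * c
    rows.append((c, x, y))

# Ignoring C mixes the causal path with the path through the confound.
naive = slope([x for _, x, _ in rows], [y for _, _, y in rows])

# Controlling for C: estimate the slope within each level of C, then average.
within = [
    slope([x for c, x, _ in rows if c == k], [y for c, _, y in rows if c == k])
    for k in (0, 1)
]
adjusted = sum(within) / len(within)

print(f"naive slope = {naive:.2f}, adjusted for C = {adjusted:.2f}")
```

In this setup the naive slope overstates the effect, while the estimate adjusted for `C` should land close to the assumed 1.0. Note that `A` and `B` never need to be controlled for here.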

### GOLEMS

Statistical models to produce scientific insight

- They require additional `scientific (causal) models`
- The `reasons` are not found in the data, but rather in the `causes` of the data
  - i.e. we should not try to infer the reason from the data alone.

`No causes in (the data), no causes out (from the data)`

#### Decision Tree/Flowchart

- We use it for selecting an appropriate statistical procedure.

- But using a decision tree like this to select the statistical procedure is not very helpful for a research scientist, because it is quite limiting
- Each of these procedures can have Bayesian and frequentist versions
- Mostly useful in industrial testing