As we have already seen, we sometimes have to deal with data that fail the linearity and normality assumptions.
Transforming variables can help with linearity and normality (for the response variable, since we do not need normality of the predictors).
The most common transformation is the natural logarithm; for the response variable, that is $\log_e(y)$ or $\ln(y)$.
It is often the default choice because it is the easiest to interpret.
Suppose
$$\ln(y_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \epsilon_i.$$
Then it is easy to see that
$$y_i = e^{\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \epsilon_i} = e^{\beta_0} \times e^{\beta_1 x_{i1}} \times e^{\beta_2 x_{i2}} \times \dots \times e^{\beta_p x_{ip}} \times e^{\epsilon_i}.$$
That is, the predictors actually have a multiplicative effect on $y$.
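As a quick illustration, here is a minimal R sketch with simulated data (the variable names `x1`, `x2`, `y`, and `fit` are made up for this example):

```r
# Simulate data whose response is multiplicative in the predictors
set.seed(42)
n  <- 200
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 5)
y  <- exp(0.5 + 0.10 * x1 - 0.10 * x2 + rnorm(n, sd = 0.2))

# Fit the linear model on the log scale: ln(y) = b0 + b1*x1 + b2*x2 + e
fit <- lm(log(y) ~ x1 + x2)

# Back-transforming the coefficients gives the multiplicative effects on y
exp(coef(fit))
```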
The estimated $\beta_j$'s can be interpreted in terms of approximate proportional differences.
For example, suppose $\beta_1 = 0.10$; then $e^{\beta_1} = 1.1052$.
Thus, a difference of 1 unit in $x_1$ corresponds to an expected positive difference of approximately 11% in $y$.
Similarly, $\beta_1 = -0.10$ implies $e^{\beta_1} = 0.9048$, which means a difference of 1 unit in $x_1$ corresponds to an expected negative difference of approximately 10% in $y$.
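You can check these numbers directly in R:

```r
exp(0.10)   # 1.105171: +1 unit in x1 corresponds to roughly +11% in y
exp(-0.10)  # 0.904837: +1 unit in x1 corresponds to roughly -10% in y
```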
When making predictions using the regression of the transformed variable, remember to transform back to the original scale so that the predictions are meaningful.
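Continuing the hypothetical fit above, a sketch of prediction on the original scale (a simple back-transform that exponentiates the prediction made on the log scale):

```r
# Predict ln(y) for a new observation, then exponentiate
new_obs  <- data.frame(x1 = 5, x2 = 2)
pred_log <- predict(fit, newdata = new_obs)
exp(pred_log)  # predicted y on the original scale
```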
While the natural logarithm transformation is the most common, there are several options.
For example, logarithms with other bases, squaring, taking square roots, etc.
Which one should you use? Well, it depends on what you are trying to fix.
For example, you might need a logarithm transformation on the response variable but a square root transformation on one of the predictors to fix violations of linearity and normality.
Overall, if you do not know which options to consider, you could try Box-Cox power transformations (to fix non-normality).
We will not spend time on those in this course, but I am more than happy to provide resources to anyone who is interested.
To start, see the `boxcox` function in R's MASS library.
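A minimal sketch of what that might look like, reusing the hypothetical model from earlier (the formula is just an illustration):

```r
library(MASS)

# Profile log-likelihood over the Box-Cox parameter lambda;
# lambda = 0 corresponds to the natural log transformation
bc <- boxcox(lm(y ~ x1 + x2), lambda = seq(-2, 2, 0.1))
bc$x[which.max(bc$y)]  # lambda with the highest log-likelihood
```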