# Section 6 Axes scaling

Here, we consider the `medals` dataset from the `MAS6005` library. The aim is to compare the total number of medals won against population size in the 2016 Rio Olympics.

```
# Scatter plot of medals won against raw population size
plot(medals$population,
     medals$total,
     xlab = "Population size",
     ylab = "Total number of medals won")
```

A couple of problems here:

- the scientific notation (“e+”) on the \(x\)-axis isn’t very friendly;
- there are two countries with much larger populations than the rest. This ‘distorts’ the plot somewhat, in that a lot of the remaining points are bunched together.

We could use a log transformation of the population size (perhaps the first thing you’d try if you were fitting a regression model). We’ll use a \(\log_{10}\) transformation:

```
# Transform the data: the x-axis now shows log10(population)
plot(log10(medals$population),
     medals$total,
     xlab = "log (base 10) population size",
     ylab = "Total number of medals won")
```

This fixes the two problems, but if you’re writing for a general audience, the reader may struggle with the idea of ‘log population size’.

An alternative is to use a log (base 10) scale on the \(x\)-axis. Visually, this gives the same pattern as plotting total medals against \(\log_{10}\) population, but the plot is (perhaps) a little easier to interpret, as the axis labels still refer to the raw population, not the \(\log_{10}\) population. We can tidy things up further by plotting the population in millions.

```
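# Plot population in millions on a log-scaled x-axis;
# axes = FALSE suppresses the default axes so we can draw our own
plot(medals$population / 1e6,
     medals$total,
     log = "x",
     axes = FALSE,
     xlab = "population (millions)",
     ylab = "total number of medals won")
# Place x-axis ticks at powers of 10, labelled with the raw values
ticks <- c(1, 10, 100, 1000)
axis(side = 1, at = ticks, labels = ticks)
# Default y-axis
axis(side = 2)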
```
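The tick positions above were typed in by hand. As a rough sketch, they could instead be derived from the range of the data, one tick per power of 10. The vector `population_millions` below holds made-up values (it is not the `medals` data), and the names `lowest_power` and `highest_power` are illustrative only.

```r
# Hypothetical population sizes in millions (made-up values, not the medals data)
population_millions <- c(0.4, 3.2, 11, 65, 127, 320, 1380)

# Smallest and largest powers of 10 that bracket the data
lowest_power  <- floor(log10(min(population_millions)))
highest_power <- ceiling(log10(max(population_millions)))

# One tick per power of 10 across that range
ticks <- 10^(lowest_power:highest_power)
ticks
```

The resulting `ticks` vector can be passed to `axis()` through the `at` and `labels` arguments, in the same way as the hand-written vector.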

However, you may now need to explain to your reader how to interpret the log-scaled axis! In the following, for example, some readers may think the dashed line represents a population of 50 million (whereas it actually represents a population of \(100^{\frac{3}{4}} \approx 31.6\) million).
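The arithmetic behind \(100^{\frac{3}{4}}\) can be checked directly. Since \(100^{\frac{3}{4}} = 10^{1.5}\), this value lies visually halfway between the ticks at 10 and 100 on a \(\log_{10}\) axis. The sketch below assumes that is where the line is drawn; the variable names are illustrative.

```r
# Two neighbouring ticks on the log10-scaled axis (in millions)
lower <- 10
upper <- 100

# The axis is linear in log10(x), so the visual midpoint is found by
# averaging the logs and transforming back
midpoint <- 10^((log10(lower) + log10(upper)) / 2)
midpoint

# The same value written two other ways
100^(3 / 4)          # the form used in the text
sqrt(lower * upper)  # the geometric mean of the two ticks
```

In other words, a position halfway between two ticks on a log-scaled axis corresponds to their geometric mean (about 31.6 here), not their arithmetic mean (55).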