Section 5 Bar charts

Here we consider the inequality dataset from the MAS6005 package. Suppose we want to show how the UK’s Gini coefficient compares with those of other countries in the OECD. We could try a bar plot:

library(MAS6005)
attach(inequality)
barplot(gini, 
        names.arg = country,
        ylab = "Gini coefficient")

Figure 5.1: An attempt to produce a bar plot with a simple barplot command. The bar labelling clearly hasn’t worked here.

Obviously, the labelling of the bars is hopeless here. We could rotate the labels by 90 degrees so that all the country names fit, but a horizontal bar plot may look better. Note that we need to create more space in the left-hand margin for the labels: see ?par and look at the mar option for details.

# Widen the left margin; mar is c(bottom, left, top, right) in lines of text
par(mar = c(5, 8, 4, 2) + 0.1)
barplot(gini, 
        names.arg = country,
        xlab="Gini coefficient",
        horiz = TRUE,
        las = 1)

Figure 5.2: With a horizontal bar plot, it’s easier to fit the labels in. But we still need to think about the ordering of the bars.
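
The other option mentioned above, rotating the labels on a vertical bar plot, can be sketched as follows. This is a minimal illustration only: the data frame toy and its values are made up (they are not the inequality data), and the bottom margin of 10 lines is a guess that would need adjusting for longer names. Since par returns the previous settings, we can store them and restore the margins afterwards.

```r
# Hypothetical toy data standing in for the inequality dataset
toy <- data.frame(country = c("Borduria", "Syldavia", "San Theodoros"),
                  gini = c(0.27, 0.31, 0.36))

# Enlarge the bottom margin; par() returns the old settings
old.par <- par(mar = c(10, 4, 4, 2) + 0.1)

# las = 2 draws axis labels perpendicular to the axis,
# so the country names are written vertically
barplot(toy$gini,
        names.arg = toy$country,
        ylab = "Gini coefficient",
        las = 2)

# Restore the original margins for later plots
par(old.par)
```

With 36 countries the vertical names are readable but force the reader to tilt their head, which is one reason to prefer the horizontal layout.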

We can now at least see all the country labels. The countries are presented in the (alphabetical) order in which they appear in the inequality dataframe, but this isn’t very helpful: what’s likely to be of more interest here is which countries have the highest Gini coefficient, and which have the lowest. It would be better to present the countries in ranked order, so we create a new dataframe in which the observations are ordered by Gini coefficient.

# Sort the rows by Gini coefficient, smallest first
inequality.ordered <- inequality[order(inequality$gini), ]
par(mar = c(5, 8, 4, 2) + 0.1)
barplot(inequality.ordered$gini, 
        names.arg = inequality.ordered$country,
        las = 1,
        xlab = "Gini coefficient",
        horiz = TRUE)

Figure 5.3: We have now arranged the bars in order of Gini coefficient. This makes comparison between the countries easier.
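
To see what order is doing here, a small sketch with made-up values may help (the data frame toy is hypothetical). order returns the row indices that would sort the vector, smallest first. Since barplot with horiz = TRUE draws the first bar at the bottom, sorting in ascending order puts the largest value at the top of the plot.

```r
# Hypothetical toy data standing in for the inequality dataset
toy <- data.frame(country = c("Borduria", "Syldavia", "San Theodoros"),
                  gini = c(0.31, 0.27, 0.36))

# Row indices that sort gini from smallest to largest
idx <- order(toy$gini)
idx
# 2 1 3

# Reorder the whole data frame so countries stay matched to their values
toy.ordered <- toy[idx, ]

# To put the smallest value at the top of a horizontal bar plot instead,
# sort in decreasing order
toy.reversed <- toy[order(toy$gini, decreasing = TRUE), ]
```

Indexing the whole dataframe, rather than sorting the gini column on its own, is what keeps each country name attached to its coefficient.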

Tufte (2013) suggests having gaps within the bars to create a nice grid effect (“data-ink maximisation”). (He also suggests doing away with the axis line, but I prefer to keep it in.) A Google search for ‘R Tufte bar chart’ will give a solution: we draw white vertical lines over the bars with abline. To get the right effect, we need to set the border colour of each bar to be the same as the fill colour (grey, which is the default fill in barplot).

par(mar = c(5, 8, 4, 2) + 0.1)
barplot(inequality.ordered$gini, 
        names.arg = inequality.ordered$country,
        las = 1,
        xlab="Gini coefficient",
        horiz = TRUE,
        border = "grey",
        xlim = c(0, 0.5),
        xaxp = c(0, 0.5, 5))
abline(v=seq(0.1, 0.4, by=0.1), col='white', lwd=2)

Figure 5.4: Adding in ‘grid lines’ to the bar plot. We have actually drawn less here than in the previous figure, yet achieved a nice effect.
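
As a self-contained sketch of the same idea, with made-up values (toy, bar.col and grid.at are hypothetical names), we can set the fill and border colours explicitly rather than relying on the default fill, and store the grid positions in a vector. On a white background the white lines only show up where they cross a bar.

```r
# Hypothetical toy data standing in for the inequality dataset
toy <- data.frame(country = c("Borduria", "Syldavia", "San Theodoros"),
                  gini = c(0.27, 0.31, 0.36))

bar.col <- "grey"                    # one colour for both fill and border
grid.at <- seq(0.1, 0.4, by = 0.1)   # where the 'grid lines' cut the bars

old.par <- par(mar = c(5, 8, 4, 2) + 0.1)
barplot(toy$gini,
        names.arg = toy$country,
        las = 1,
        xlab = "Gini coefficient",
        horiz = TRUE,
        col = bar.col,
        border = bar.col,
        xlim = c(0, 0.5))

# Draw white vertical lines over the bars
abline(v = grid.at, col = "white", lwd = 2)
par(old.par)
```

If the border were left at its default (black), the white lines would break the bar outlines and spoil the effect.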

5.1 Highlighting an individual bar

We can highlight an individual bar by changing its colour (and the colour of its label as well). This is quite fiddly (at least I’ve not found an easier way to do it). We specify vectors of colours for the bars, the bar borders and the bar labels, draw the bar chart without text labels, and then use the text command to add the labels manually.

par(mar = c(5, 8, 4, 2) + 0.1)

# Specify colours for the bars and bar borders
n <- nrow(inequality.ordered)
border.vec <- color.vec <- rep("grey", n)
# Specify colours of text labels for the bars
country.vec <- rep("black", n)

# Change bar and text colour to red for United Kingdom
index <- which(inequality.ordered$country == "United Kingdom")
border.vec[index] <- color.vec[index] <- country.vec[index] <- "red"

# Leave out the bar labels in the main barplot...
# (barplot returns the midpoints of the bars, which we store in bp)
bp <- barplot(inequality.ordered$gini, 
              las = 1,
              xlab = "Gini coefficient",
              horiz = TRUE,
              border = border.vec,
              col = color.vec,
              xlim = c(-0.01, 0.5),
              xaxp = c(0, 0.5, 5))
abline(v = seq(0.1, 0.4, by = 0.1), col = "white", lwd = 2)

# ... and add them in manually at the end.
# adj = 1 right-aligns the text; xpd = TRUE allows drawing in the margin
text(y = bp, x = -0.01, 
     labels = inequality.ordered$country,
     xpd = TRUE, col = country.vec, adj = 1)
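
The colour vectors can also be built in one step with ifelse, which avoids having to know the number of rows. The following is a sketch of the same approach on made-up data (toy, its values, and the names is.uk, bar.col and label.col are all hypothetical), not a different method: we still draw the bars without labels and add the labels with text.

```r
# Hypothetical toy data standing in for the inequality dataset
toy <- data.frame(country = c("Borduria", "United Kingdom", "Syldavia"),
                  gini = c(0.27, 0.35, 0.31))
toy <- toy[order(toy$gini), ]

# Logical vector marking the bar to highlight
is.uk <- toy$country == "United Kingdom"

# One colour per bar: red for the highlighted bar, default colour otherwise
bar.col <- ifelse(is.uk, "red", "grey")
label.col <- ifelse(is.uk, "red", "black")

old.par <- par(mar = c(5, 8, 4, 2) + 0.1)
bp <- barplot(toy$gini,
              xlab = "Gini coefficient",
              horiz = TRUE,
              col = bar.col,
              border = bar.col,
              xlim = c(0, 0.5))

# Add the labels manually, right-aligned just left of the bars
text(x = -0.01, y = bp, labels = toy$country,
     col = label.col, adj = 1, xpd = TRUE)
par(old.par)
```

Highlighting several countries only requires changing the comparison, for example to toy$country %in% c("United Kingdom", "Syldavia").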

References

Tufte, Edward R. 2013. The Visual Display of Quantitative Information. Second edition. Graphics Press.