I was looking for data I could use with the geom_text() layer in ggplot2 and came across World Health Organization data on suicide rates by country, which turned out to be very handy for my example.
I used scale_colour_gradient2() with three colors (red, gray and black), but it only picked up gray and black, and I still don't know why.
Anyway, here is the graph:
The number of suicides is per 100,000 people and the number of divorces is per 1,000 people. (I know, I should have added this to the graph.)
The size and color of each country label show the number of female suicides for every 10 male suicides (ratio_f_m, if that makes sense!?). So China has the same number of suicides among women as among men, followed by Kuwait and South Korea. The correlation coefficient was 0.02867, and 0.5507 excluding Maldives.
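For anyone who wants to repeat the with/without comparison, here is a minimal sketch of how I would compute the two correlations. The helper name cor_with_without and the toy numbers are made up for illustration; only the column names country, divorce and suicide match the spreadsheet.

```r
# Hypothetical helper: Pearson correlation of divorce vs. suicide,
# on all complete rows and again with one country left out.
cor_with_without <- function(df, drop_country) {
  complete <- na.omit(df[, c("country", "divorce", "suicide")])
  kept <- complete[complete$country != drop_country, ]
  c(all = cor(complete$divorce, complete$suicide),
    excluded = cor(kept$divorce, kept$suicide))
}

# Toy data (NOT the WHO figures): four points on a rising trend plus one outlier
toy <- data.frame(
  country = c("A", "B", "C", "D", "Outlier"),
  divorce = c(1, 2, 3, 4, 11),
  suicide = c(5, 10, 14, 21, 1)
)

res <- cor_with_without(toy, "Outlier")
print(res)
```

With the real data the call would be cor_with_without(df, "Maldives"), assuming the country is spelled that way in the sheet.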
As you can see, Maldives sits far away from the cloud, and I bet $10 that the key drivers behind that are the sun, beaches and really small bikinis, not population size.
With a little bit of Photoshop, the above graph will look like this one.
Code and data can be found on GitHub.
Thanks to Louis, I've managed to show the three colors properly. Code also updated accordingly.
See comments for further details.
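Judging from the updated code, the fix seems to come down to the midpoint argument: scale_colour_gradient2() centers the middle color on 0 by default, so when every value is positive only the mid-to-high half of the gradient gets used. Here is a small sketch with made-up values (not the real ratio_f_m data) comparing the default with a midpoint placed in the middle of the data range.

```r
library(ggplot2)

# Toy data: all ratios are positive, as in the real data set
toy <- data.frame(
  x = 1:5, y = 1:5,
  label = c("a", "b", "c", "d", "e"),
  ratio = c(2, 4, 6, 8, 10)
)

base <- ggplot(toy, aes(x = x, y = y, label = label, colour = ratio)) +
  geom_text()

# Default midpoint = 0: every value sits on the gray-to-black side
p_default <- base +
  scale_colour_gradient2(low = "red", mid = "gray", high = "black")

# Midpoint in the middle of the observed range: all three colors are used
p_centred <- base +
  scale_colour_gradient2(low = "red", mid = "gray", high = "black",
                         midpoint = mean(range(toy$ratio)))
```

Printing p_default and p_centred side by side should show the difference.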
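library(XLConnect)
library(ggplot2)

# Read the sheet and drop columns Col6 and Col7
wb <- loadWorkbook('divorce_vs_suicide.xlsx')
df <- wb['Sheet1']
df$Col6 <- NULL
df$Col7 <- NULL

# Country names as labels, colored and sized by ratio_f_m
p <- ggplot(na.omit(df), aes(x = divorce, y = suicide, label = country))
p <- p + geom_text(aes(colour = ratio_f_m, size = ratio_f_m)) +
  scale_colour_gradient2(low = 'red', mid = 'gray', high = 'black',
                         midpoint = mean(range(na.omit(df$ratio_f_m))))

# In current ggplot2, scale_size(range = ...) replaces to = ...,
# and theme()/element_blank() replace opts()/theme_blank()
p <- p + scale_size(range = c(3, 5)) + theme_bw()
p <- p + theme(panel.grid.major = element_blank(),
               panel.grid.minor = element_blank())
p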