Suicide vs Divorce rates by country using ggplot

I was looking for data to use with the geom_text() layer in ggplot2 and came across this World Health Organization data on suicide rates by country, which turned out to be very handy for my example.

I used scale_colour_gradient2() with three colors (red, gray and black), but it only picked up gray and black and I still don’t know why. 😦

Anyway, here is the graph:

The number of suicides is per 100,000 people and the number of divorces is per 1,000 people. (I know, I should have added this to the graph.)

The size and color of each country label show the number of female suicides for every 10 male suicides (ratio_f_m, if that makes sense). So China has the same number of suicides among women as among men, followed by Kuwait and South Korea. The correlation coefficient between the two rates was 0.02867, and 0.5507 when the Maldives is excluded.
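To see how a single far-away point can move a correlation that much, here is a minimal sketch in base R. The toy data frame is invented purely for illustration (it is not the WHO data), and the column names simply mirror the ones used in the plot code.

```r
# Toy data, invented for illustration only (not the WHO figures)
toy <- data.frame(
  country = c("A", "B", "C", "D", "E", "Outlier"),
  divorce = c(1.0, 1.5, 2.0, 2.5, 3.0, 10.0),
  suicide = c(5, 8, 9, 13, 15, 1)
)

# Pearson correlation over all rows
r_all <- cor(toy$divorce, toy$suicide, use = "complete.obs")

# Same correlation after dropping the single outlying row
trimmed <- subset(toy, country != "Outlier")
r_trimmed <- cor(trimmed$divorce, trimmed$suicide, use = "complete.obs")

print(c(all = r_all, without_outlier = r_trimmed))
```

With the real data the same two cor() calls would apply, filtering on the country name instead of the made-up "Outlier" label.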

As you can see, the Maldives sits far away from the cloud, and I bet $10 that the key drivers behind that are the sun, beaches and really small bikinis, not population size 😉

With a little bit of Photoshop, the above graph will look like this one:

Code and data can be found on GitHub.


Thanks to Louis, I’ve managed to show the three colors properly. The code has been updated accordingly.

See comments for further details.
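For what it's worth, a likely explanation is that scale_colour_gradient2() defaults to midpoint = 0, so when every ratio is positive nothing lands on the "low" (red) side of the scale. The rough check below uses made-up ratio values (not the real ratio_f_m column) to show which side of each midpoint the values fall on.

```r
# Hypothetical ratio values, for illustration only
ratio <- c(1.5, 2.5, 3, 4, 10)

# Default midpoint of scale_colour_gradient2()
mid_default <- 0
# Midpoint halfway between the smallest and largest value
mid_centred <- mean(range(ratio))

# How many values sit below each midpoint, i.e. on the 'low' colour side
n_low_default <- sum(ratio < mid_default)
n_low_centred <- sum(ratio < mid_centred)

print(c(default = n_low_default, centred = n_low_centred))
```

With the default midpoint no value sits on the low side, so only the mid-to-high part of the gradient (gray to black) gets used. Centring the midpoint inside the data range lets all three colors show up.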


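library(XLConnect)  # provides loadWorkbook() and the wb['Sheet1'] accessor
library(ggplot2)

# Read the first sheet and drop the two unused columns
wb <- loadWorkbook('divorce_vs_suicide.xlsx')
df <- wb['Sheet1']
df$Col6 <- NULL
df$Col7 <- NULL

# Country names as labels, coloured and sized by the female-to-male suicide ratio
p <- ggplot(na.omit(df), aes(x=divorce, y=suicide, label=country))
p <- p + geom_text(aes(colour=ratio_f_m, size=ratio_f_m)) +
  scale_colour_gradient2(low='red', mid="gray", high="black",
                         midpoint=mean(range(na.omit(df$ratio_f_m))))
# In current ggplot2, range=, theme() and element_blank() replace to=, opts() and theme_blank()
p <- p + scale_size(range=c(3,5)) + theme_bw()
p <- p + theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())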
  1. January 11, 2012 at 12:10 am

    I did not spend a lot of time with this (and I could not find the data you used). I think you might be able to get the “red” color to show better if you use the “midpoint” parameter. So the scale_colour_gradient2 function should look like:
    scale_colour_gradient2(low='red', mid="gray", high="black", midpoint=4)

    Of course you might want a formula instead of hardcoding a 4. Since the distribution may be skewed, try something like:
    scale_colour_gradient2(low='red', mid="gray", high="black", midpoint=mean(range(df$ratio_f_m)))

    • January 11, 2012 at 1:06 am

      Thanks for the tip. I’ll follow your suggestion and will update it as soon as possible!

    • January 11, 2012 at 1:09 am

      The data is in the R folder on GitHub.

  2. February 8, 2012 at 8:50 pm

    Interesting. I just googled divorce maldives and saw this:


    I like this data set, I was looking at it a little with this new MIC / MINE thing (using base R graphics): https://plus.google.com/photos/111418366927480232429/albums/5688997682113209921?authkey=CPTmi82N9cnQeA

