A very quick introduction to ggplot2

Why learn ggplot2?

ggplot2:

Using ggplot2

ggplot2 offers at least three ways of producing plots, in increasing order of verbosity:

  1. qplot
  2. ggplot+geom_xxx
  3. ggplot+layer

Today, we'll focus on ggplot+geom_xxx because it's verbose enough to make the workings of ggplot2 transparent while staying readable.
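As a rough illustration of the three styles listed above, here is the same scatter plot written each way. This is a sketch, not taken from the original slides: it uses the diamonds data set that ships with ggplot2, and the object names p1, p2 and p3 are only for illustration. Note that qplot is deprecated in recent ggplot2 releases and may emit a warning.

```r
library(ggplot2)

# 1. qplot: terse, close to base graphics (deprecated in recent releases)
p1 <- qplot(carat, price, data = diamonds, colour = cut)

# 2. ggplot + geom_xxx: the style used in the rest of this introduction
p2 <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = cut))

# 3. ggplot + layer: every component of the layer is spelled out
p3 <- ggplot(diamonds) +
  layer(geom = "point", stat = "identity", position = "identity",
        mapping = aes(x = carat, y = price, colour = cut))
```

Printing any of the three objects (for example print(p2)) should draw the same kind of plot; the difference is only in how much of the layer specification is written explicitly.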

A first example 1/2

A first example 2/2

A first look at the relation between carat and price:

ggplot(small)+
  geom_point(aes(x=carat,y=price,colour=cut))+
  scale_y_log10()+
  facet_wrap(~cut)+
  ggtitle("First example")
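The examples use a data frame called small that is not defined in this text. A minimal sketch for creating it, assuming it is a random sample of 1000 rows from the diamonds data set bundled with ggplot2 (the sample size and the seed are assumptions, not values from the source):

```r
library(ggplot2)

# Assumption: `small` is a random subset of diamonds.
# The seed and the size of 1000 are arbitrary choices for reproducibility.
set.seed(42)
small <- diamonds[sample(nrow(diamonds), 1000), ]

# Quick look at the columns used in the examples
head(small[, c("carat", "price", "cut", "color", "clarity")])
```

A sample keeps the scatter plots readable and fast to draw; the full diamonds data set has tens of thousands of rows.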

Let's decompose the ggplot2 command.

Mapping aesthetics

All geom_xxx() require some aesthetics.

?geom_point
...
Aesthetics
The following aesthetics can be used with geom_point. Aesthetics are mapped 
to variables in the data with the aes function: geom_point(aes(x = var))
x: x position (required)
y: y position (required)
shape: shape of point
colour: border colour
size: size
fill: internal colour
alpha: transparency

Thus:

ggplot(small)+geom_point(aes(x=carat,y=price,colour=cut))

Also possible:

ggplot(small,aes(x=carat,y=price,colour=cut))+geom_point()
ggplot(small,aes(x=carat,y=price))+geom_point(aes(colour=cut))
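One way to convince yourself that the three forms are equivalent is to compare the data each plot actually draws. The sketch below uses layer_data() from ggplot2 for that; the names a, b, c3 and cols are illustrative, and small is assumed to be a random sample of diamonds.

```r
library(ggplot2)

# Assumption: `small` is a random sample of the diamonds data set
set.seed(42)
small <- diamonds[sample(nrow(diamonds), 1000), ]

# Aesthetics given in the geom, in ggplot(), or split between the two
a  <- ggplot(small) + geom_point(aes(x = carat, y = price, colour = cut))
b  <- ggplot(small, aes(x = carat, y = price, colour = cut)) + geom_point()
c3 <- ggplot(small, aes(x = carat, y = price)) + geom_point(aes(colour = cut))

# Compare the computed position and colour of every point
cols <- c("x", "y", "colour")
isTRUE(all.equal(layer_data(a)[cols], layer_data(b)[cols]))
isTRUE(all.equal(layer_data(a)[cols], layer_data(c3)[cols]))
```

Aesthetics given in ggplot() are inherited by every layer, which is convenient when several geoms share the same x and y.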

A subtlety to be aware of

Caveat: there is a difference between setting and assigning aesthetics.

p<-ggplot(small)

Assigning

Assigning (or mapping) ties an aesthetic to a variable in the data and is done inside aes.

p+geom_point(aes(x=carat,y=price,colour=cut))

Setting

Setting fixes an aesthetic to a constant value and is done outside aes.

p+geom_point(aes(x=carat,y=price),colour="blue")

Don't mix

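But trying to set an aesthetic inside aes produces unwanted results: "blue" is treated as a variable with a single level, so the points get the default palette colour and a legend, not the colour blue: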

p+geom_point(aes(x=carat,y=price,colour="blue"))
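The difference can be checked without looking at the plot, by inspecting the colour each layer ends up using. This is a sketch under the assumption that small is a random sample of diamonds; the names set_plot and mapped_plot are illustrative.

```r
library(ggplot2)

# Assumption: `small` is a random sample of the diamonds data set
set.seed(42)
small <- diamonds[sample(nrow(diamonds), 1000), ]
p <- ggplot(small)

# Setting: colour is a constant given outside aes()
set_plot <- p + geom_point(aes(x = carat, y = price), colour = "blue")

# Mixing: "blue" inside aes() becomes a one-level variable
mapped_plot <- p + geom_point(aes(x = carat, y = price, colour = "blue"))

# Colours actually used by the point layer
unique(layer_data(set_plot)$colour)     # the constant that was set
unique(layer_data(mapped_plot)$colour)  # a colour chosen by the default scale
```

In the second plot the string "blue" only ends up as a legend label; the colour itself comes from the default discrete colour scale.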

Faceting

The facet_xxx functions condition a plot on one or two variables.

p+geom_point(aes(x=carat,y=price))+facet_wrap(~cut)
p+geom_point(aes(x=carat,y=price))+facet_wrap(~cut,nrow=1)
p+geom_point(aes(x=carat,y=price))+facet_wrap(~cut,ncol=1)
p+geom_point(aes(x=carat,y=price))+facet_grid(cut~color)

Other geoms: geom_smooth

geom_smooth is useful for displaying a trend in the data.

p<-ggplot(small,aes(x=carat,y=price))

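By default, geom_smooth picks a smoother based on the size of the data: loess for fewer than 1,000 observations, otherwise a GAM with a spline basis.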

p+geom_point()+geom_smooth()+facet_wrap(~cut)

You can also specify the fitting method:

p+geom_point()+geom_smooth(method="lm")+facet_wrap(~cut)
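To make the choice of smoother explicit, the method can always be named. The sketch below also shows what method="lm" amounts to inside one panel by fitting the same straight line with lm() directly. The names p_loess, p_lm and fit_ideal are illustrative, se=FALSE simply hides the confidence band, and small is assumed to be a random sample of diamonds.

```r
library(ggplot2)

# Assumption: `small` is a random sample of the diamonds data set
set.seed(42)
small <- diamonds[sample(nrow(diamonds), 1000), ]
p <- ggplot(small, aes(x = carat, y = price))

# Name the smoother explicitly instead of relying on the default
p_loess <- p + geom_point() + geom_smooth(method = "loess") + facet_wrap(~cut)

# Straight-line fit per panel, without the confidence band
p_lm <- p + geom_point() + geom_smooth(method = "lm", se = FALSE) + facet_wrap(~cut)

# The line drawn in the "Ideal" panel corresponds to this model
fit_ideal <- lm(price ~ carat, data = subset(small, cut == "Ideal"))
coef(fit_ideal)
```

Because the plot is faceted, the smoother is fitted separately within each panel, which is why the lm() call above uses only the rows for one cut.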

Putting it all together:

p<-ggplot(diamonds,aes(x=carat,y=price,colour=cut))
p<-p+scale_x_log10()+scale_y_log10()
p<-p+geom_point(alpha=0.3)+geom_smooth(method="lm",colour='black')
p<-p+facet_wrap(~cut)
print(p)

Other geoms: bar charts

Bar charts show counts for categorical variables (geom_histogram expects a continuous x in current ggplot2 releases, so geom_bar is used here):

ggplot(small)+geom_bar(aes(x=clarity))

Show the composition of each bar:

ggplot(small)+geom_bar(aes(x=clarity,fill=cut))

With position="dodge", the sub-bars sit side by side and are easier to compare:

ggplot(small)+geom_bar(aes(x=clarity,fill=cut),position="dodge")

position="fill" is useful for showing relative proportions:

ggplot(small)+geom_bar(aes(x=clarity,fill=cut),position="fill")
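The proportions that position="fill" displays can be computed by hand with base R, which helps when reading the plot. This is a sketch assuming small is a random sample of diamonds; the names counts, props and p_fill are illustrative.

```r
library(ggplot2)

# Assumption: `small` is a random sample of the diamonds data set
set.seed(42)
small <- diamonds[sample(nrow(diamonds), 1000), ]

# Counts of each cut within each clarity level (what the stacked bars show)
counts <- table(small$clarity, small$cut)

# Share of each cut within a clarity level (what position = "fill" shows)
props <- prop.table(counts, margin = 1)
round(props, 2)

# The corresponding plot, drawn with geom_bar
p_fill <- ggplot(small) +
  geom_bar(aes(x = clarity, fill = cut), position = "fill")
```

Each row of props sums to 1, which is why every bar in the filled plot has the same height.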

Other geoms: density plots

Density plots show the distribution of a continuous variable:

ggplot(small)+geom_density(aes(x=price))
ggplot(small)+geom_density(aes(x=price,fill=cut))
ggplot(small)+geom_density(aes(x=price,fill=clarity))

We can use some transparency to distinguish between the distributions:

ggplot(small)+geom_density(aes(x=price,fill=cut),alpha=0.5)

Or use colour instead of fill:

ggplot(small)+geom_density(aes(x=price,colour=cut))

Boxplots

geom_boxplot draws box plots.

ggplot(small)+geom_boxplot(aes(x=cut,y=price))
ggplot(small)+geom_boxplot(aes(x=cut,y=price,fill=color))

Many other geoms available

geom_abline		geom_jitter
geom_area		geom_line
geom_bar		geom_linerange 
geom_bin2d		geom_path 
geom_blank		geom_point 
geom_boxplot		geom_pointrange 
geom_contour		geom_polygon 
geom_crossbar		geom_quantile 
geom_density		geom_rect 
geom_density2d		geom_ribbon
geom_errorbar		geom_rug 
geom_errorbarh		geom_segment 
geom_freqpoly		geom_smooth 
geom_hex		geom_step 
geom_histogram		geom_text 
geom_hline		geom_tile
geom_vline

Good resources for ggplot2

The list of geom_xxx is finite but not limiting!

Christophe Ladroue, University of Warwick, UK
