Data Visualization with ggplot2 :: CHEATSHEET

Basics

ggplot2 is based on the grammar of graphics, the idea that you can build every graph from the same components: a data set, a coordinate system, and geoms (visual marks that represent data points).

[Figure: data + geom (x = F, y = A) + coordinate system = plot]

To display values, map variables in the data to visual properties of the geom (aesthetics) like size, color, and x and y locations.

[Figure: data + geom (x = F, y = A, color = F, size = A) + coordinate system = plot]

Complete the template below to build a graph.

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>), stat = <STAT>, position = <POSITION>)

... color, group, linetype, size

i + geom_step(direction = "hv")
x, y, alpha, color, group, linetype, size

visualizing error

df <- data.frame(grp = c("A", "B"), fit = 4:5, se = 1:2)
j <- ggplot(df, aes(grp, fit, ymin = fit-se, ymax = fit+se))

j + geom_crossbar(fatten = 2)
x, y, ymax, ymin, alpha, color, fill, group, linetype, size

j + geom_errorbar()
x, ymax, ymin, alpha, color, group, linetype, size, width (also geom_errorbarh())

j + geom_linerange()
x, ymin, ymax, alpha, color, group, linetype, size

j + geom_pointrange()
x, y, ymin, ymax, alpha, color, fill, group, linetype, shape, size

maps

data <- data.frame(murder = USArrests$Murder, state = tolower(rownames(USArrests)))
map <- map_data("state")
k <- ggplot(data, aes(fill = murder))

k + geom_map(aes(map_id = state), map = map) + expand_limits(x = map$long, y = map$lat)
map_id, alpha, color, fill, linetype, size

THREE VARIABLES

seals$z <- with(seals, sqrt(delta_long^2 + delta_lat^2))
l <- ggplot(seals, aes(long, lat))

l + geom_contour(aes(z = z))
x, y, z, alpha, colour, group, linetype, size, weight

l + geom_raster(aes(fill = z), hjust = 0.5, vjust = 0.5, interpolate = FALSE)
x, y, alpha, fill

l + geom_tile(aes(fill = z))
x, y, alpha, color, fill, linetype, size, width

RStudio® is a trademark of RStudio, Inc.
• CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com • Learn more at http://ggplot2.tidyverse.org • ggplot2 2.1.0 • Updated: 2016-11

discrete

d <- ggplot(mpg, aes(fl))

d + geom_bar()
x, alpha, color, fill, linetype, size, weight

Stats
An alternative way to build a layer

A stat builds new variables to plot (e.g., count, prop).

[Figure: data + stat + geom (x = x, y = ..count..) + coordinate system = plot]

Visualize a stat by changing the default stat of a geom function, geom_bar(stat = "count"), or by using a stat function, stat_count(geom = "bar"), which calls a default geom to make a layer (equivalent to a geom function). Use ..name.. syntax to map stat variables to aesthetics.

i + stat_density2d(aes(fill = ..level..), geom = "polygon")
Here stat_density2d is the stat function, aes() holds the geom mappings, and ..level.. is a variable created by the stat.

c + stat_bin(binwidth = 1, origin = 10)
x, y | ..count.., ..ncount.., ..density.., ..ndensity..

c + stat_count(width = 1)
x, y | ..count.., ..prop..

c + stat_density(adjust = 1, kernel = "gaussian")
x, y | ..count.., ..density.., ..scaled..

e + stat_bin_2d(bins = 30, drop = T)
x, y, fill | ..count.., ..density..

e + stat_bin_hex(bins = 30)
x, y, fill | ..count.., ..density..

e + stat_density_2d(contour = TRUE, n = 100)
x, y, color, size | ..level..

e + stat_ellipse(level = 0.95, segments = 51, type = "t")

l + stat_contour(aes(z = z))
x, y, z, order | ..level..

l + stat_summary_hex(aes(z = z), bins = 30, fun = max)
x, y, z, fill | ..value..

l + stat_summary_2d(aes(z = z), bins = 30, fun = mean)
x, y, z, fill | ..value..

f + stat_boxplot(coef = 1.5)
x, y | ..lower.., ..middle.., ..upper.., ..width.., ..ymin.., ..ymax..

f + stat_ydensity(kernel = "gaussian", scale = "area")
x, y | ..density.., ..scaled.., ..count.., ..n.., ..violinwidth.., ..width..

e + stat_ecdf(n = 40)
x, y | ..x.., ..y..

e + stat_quantile(quantiles = c(0.1, 0.9), formula = y ~ log(x), method = "rq")
x, y | ..quantile..

e + stat_smooth(method = "lm", formula = y ~ x, se = T, level = 0.95)
x, y | ..se.., ..x.., ..y.., ..ymin.., ..ymax..
ggplot() + stat_function(aes(x = -3:3), n = 99, fun = dnorm, args = list(sd = 0.5))
x | ..x.., ..y..

e + stat_identity(na.rm = TRUE)

ggplot() + stat_qq(aes(sample = 1:100), dist = qt, dparam = list(df = 5))
sample, x, y | ..sample.., ..theoretical..

e + stat_sum()
x, y, size | ..n.., ..prop..

e + stat_summary(fun.data = "mean_cl_boot")

h + stat_summary_bin(fun.y = "mean", geom = "bar")

e + stat_unique()

Scales

Scales map data values to the visual values of an aesthetic. To change a mapping, add a new scale.

(n <- d + geom_bar(aes(fill = fl)))

n + scale_fill_manual(
  values = c("skyblue", "royalblue", "blue", "navy"),
  limits = c("d", "e", "p", "r"),
  breaks = c("d", "e", "p", "r"),
  name = "fuel",
  labels = c("D", "E", "P", "R"))

In scale_fill_manual, fill is the aesthetic to adjust and manual is the prepackaged scale to use. values is a scale-specific argument, breaks sets the breaks to use in the legend/axis, name sets the title to use in the legend/axis, and labels sets the labels to use in the legend/axis.

GENERAL PURPOSE SCALES
Use with most aesthetics

scale_*_continuous() - map continuous values to visual ones
scale_*_discrete() - map discrete values to visual ones
scale_*_identity() - use data values as visual ones
scale_*_manual(values = c()) - map discrete values to manually chosen visual ones
scale_*_date(date_labels = "%m/%d", date_breaks = "2 weeks") - treat data values as dates.
scale_*_datetime() - treat data x values as date times. Use same arguments as scale_x_date(). See ?strptime for label formats.
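As a rough sketch of how a scale is added to a plot, the R snippet below rebuilds the bar chart n from the mpg data, swaps in a manual fill scale, and then applies a date scale to a second plot. It assumes the ggplot2 package is installed. The `sales` data frame and the object names `n_manual` and `p_dates` are made up for illustration and do not come from the cheatsheet.

```r
library(ggplot2)

# Bar chart of fuel type, with fill mapped to the same variable
n <- ggplot(mpg, aes(fl)) + geom_bar(aes(fill = fl))

# Replace the default fill scale with manually chosen colours.
# limits leaves out fuel type "c", so that bar gets no colour from values.
n_manual <- n + scale_fill_manual(
  values = c("skyblue", "royalblue", "blue", "navy"),
  limits = c("d", "e", "p", "r"),
  breaks = c("d", "e", "p", "r"),
  name   = "fuel",
  labels = c("D", "E", "P", "R")
)

# A date scale needs a Date column; `sales` is a hypothetical data frame
sales <- data.frame(
  day   = as.Date("2016-01-01") + seq(0, 84, by = 7),
  units = seq(10, 130, by = 10)
)

# Label the x axis as month/day, with one break every two weeks
p_dates <- ggplot(sales, aes(day, units)) +
  geom_line() +
  scale_x_date(date_labels = "%m/%d", date_breaks = "2 weeks")
```

Printing `n_manual` or `p_dates` draws the plots. Adding a second scale for the same aesthetic replaces the first, which is why each aesthetic gets at most one scale call here.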
X & Y LOCATION SCALES
Use with x or y aesthetics (x shown here)

scale_x_log10() - Plot x on log10 scale
scale_x_reverse() - Reverse direction of x axis
scale_x_sqrt() - Plot x on square root scale

COLOR AND FILL SCALES (DISCRETE)

n <- d + geom_bar(aes(fill = fl))

n + scale_fill_brewer(palette = "Blues")
For palette choices: RColorBrewer::display.brewer.all()

n + scale_fill_grey(start = 0.2, end = 0.8, na.value = "red")

COLOR AND FILL SCALES (CONTINUOUS)

o <- c + geom_dotplot(aes(fill = ..x..))

o + scale_fill_distiller(palette = "Blues")

o + scale_fill_gradient(low = "red", high = "yellow")

o + scale_fill_gradient2(low = "red", high = "blue", mid = "white", midpoint = 25)

o + scale_fill_gradientn(colours = topo.colors(6))
Also: rainbow(), heat.colors(), terrain.colors(), cm.colors(), RColorBrewer::brewer.pal()

SHAPE AND SIZE SCALES

p <- e + geom_point(aes(shape = fl, size = cyl))

p + scale_shape() + scale_size()

p + scale_shape_manual(values = c(3:7))

[Figure: chart of the point shapes numbered 0 to 25]

p + scale_radius(range = c(1, 6))

p + scale_size_area(max_size = 6)

Coordinate Systems

r <- d + geom_bar()

r + coord_cartesian(xlim = c(0, 5))
xlim, ylim
The default cartesian coordinate system

r + coord_fixed(ratio = 1/2)
ratio, xlim, ylim
Cartesian coordinates with fixed aspect ratio between x and y units

r + coord_flip()
xlim, ylim
Flipped Cartesian coordinates

r + coord_polar(theta = "x", direction = 1)
theta, start, direction
Polar coordinates

r + coord_trans(ytrans = "sqrt")
xtrans, ytrans, limx, limy
Transformed cartesian coordinates. Set xtrans and ytrans to the name of a window function.

π + coord_quickmap()

π + coord_map(projection = "ortho", orientation = c(41, -74, 0))
projection, orientation, xlim, ylim
Map projections from the mapproj package (mercator (default), azequalarea, lagrange, etc.)
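To make the coordinate systems above concrete, here is a small R sketch that applies three of them to the same bar chart r. It assumes the ggplot2 package is installed. The object names `r_flip`, `r_polar`, and `r_fixed` are chosen for this example only.

```r
library(ggplot2)

# The bar chart used throughout this section
r <- ggplot(mpg, aes(fl)) + geom_bar()

# Same data and geom, three different coordinate systems
r_flip  <- r + coord_flip()                             # bars run horizontally
r_polar <- r + coord_polar(theta = "x", direction = 1)  # x mapped to angle
r_fixed <- r + coord_fixed(ratio = 1/2)                 # fixed y/x aspect ratio
```

Only the coordinate system differs between the three plots. The data, the stat, and the geom are untouched, so a coord function can be swapped without rewriting the rest of the layer.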
Position Adjustments

Position adjustments determine how to arrange geoms that would otherwise occupy the same space.

s <- ggplot(mpg, aes(fl, fill = drv))

s + geom_bar(position = "dodge")
Arrange elements side by side

s + geom_bar(position = "fill")
Stack elements on top of one another, normalize height

e + geom_point(position = "jitter")
Add random noise to X and Y position of each element to avoid overplotting

e + geom_label(position = "nudge")
Nudge labels away from points

s + geom_bar(position = "stack")
Stack elements on top of one another

Each position adjustment can be recast as a function with manual width and height arguments

s + geom_bar(position = position_dodge(width = 1))

Themes

r + theme_bw()
White background with grid lines

r + theme_gray()
Grey background (default theme)

r + theme_dark()
Dark for contrast

r + theme_classic()
r + theme_light()
r + theme_linedraw()
r + theme_minimal()
Minimal themes

r + theme_void()
Empty theme

Faceting

Facets divide a plot into subplots based on the values of one or more discrete variables.

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

t + facet_grid(. ~ fl)
facet into columns based on fl

t + facet_grid(year ~ .)
facet into rows based on year

t + facet_grid(year ~ fl)
facet into both rows and columns

t + facet_wrap(~ fl)
wrap facets into a rectangular layout

Set scales to let axis limits vary across facets

t + facet_grid(drv ~ fl, scales = "free")
x and y axis limits adjust to individual facets
"free_x" - x axis limits adjust
"free_y" - y axis limits adjust

Set labeller to adjust facet labels

t + facet_grid(. ~ fl, labeller = label_both)
fl: c   fl: d   fl: e   fl: p   fl: r

t + facet_grid(fl ~ ., labeller = label_bquote(alpha ^ .(fl)))
α^c   α^d   α^e   α^p   α^r

t + facet_grid(. ~ fl, labeller = label_parsed)
c   d   e   p   r

Labels

t + labs(x = "New x axis label", y = "New y axis label",
  title = "Add a title above the plot",
  subtitle = "Add a subtitle below title",
  caption = "Add a caption below plot",
  <aes> = "New <aes> legend title")

t + annotate(geom = "text", x = 8, y = 9, label = "A")
geom is the geom to place. The remaining arguments are manual values for the geom's aesthetics.

Use scale functions to update legend labels

Legends

n + theme(legend.position = "bottom")
Place legend at "bottom", "top", "left", or "right"

n + guides(fill = "none")
Set legend type for each aesthetic: colorbar, legend, or none (no legend)

n + scale_fill_discrete(name = "Title", labels = c("A", "B", "C", "D", "E"))
Set legend title and labels with a scale function.

Zooming

Without clipping (preferred)
t + coord_cartesian(xlim = c(0, 100), ylim = c(10, 20))

With clipping (removes unseen data points)
t + xlim(0, 100) + ylim(10, 20)
t + scale_x_continuous(limits = c(0, 100)) + scale_y_continuous(limits = c(0, 100))
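
The difference between the two zooming approaches can be checked directly. The R sketch below is an illustration rather than part of the cheatsheet: it assumes the ggplot2 package is installed, uses narrower limits than the examples above so that some points fall outside them, and introduces the names `zoomed`, `clipped`, `n_zoomed`, and `n_clipped` for this example only.

```r
library(ggplot2)

t <- ggplot(mpg, aes(cty, hwy)) + geom_point()

# Zoom without clipping: every row is kept, only the visible window changes
zoomed <- t + coord_cartesian(xlim = c(10, 20), ylim = c(20, 30))

# Zoom with clipping: values outside the scale limits are set to NA,
# and those rows are dropped when the plot is drawn
clipped <- t + xlim(10, 20) + ylim(20, 30)

# Count the rows with usable x and y positions in the built layer data
count_usable <- function(plot) {
  layer_data <- ggplot_build(plot)$data[[1]]
  sum(complete.cases(layer_data[, c("x", "y")]))
}

n_zoomed  <- count_usable(zoomed)   # all rows of mpg
n_clipped <- count_usable(clipped)  # only rows inside both limits
```

This matters most when a layer computes a summary, such as a smoother or a boxplot: with clipping the statistic is computed from the remaining rows only, while coord_cartesian() leaves the statistic unchanged.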