---
title: "Introduction to gsplot."
author: "Jordan Read, Laura DeCicco, Jordan Walker, Phethala Thongsavanh, Lindsay Carr"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
    fig_caption: yes
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{gsplot Intro}
  \usepackage[utf8]{inputenc}
---
```{r message=FALSE, echo=TRUE, fig.cap="Demo workflow", fig.width=6, fig.height=6}
library(gsplot)
MaumeeDV <- MaumeeDV
plot.new()
demoPlot <- gsplot() %>%
  points(y=c(3,1,2), x=1:3, xlim=c(0,NA),ylim=c(0,NA),
         col="blue", pch=18, legend.name="Points", xlab="Index") %>%
  lines(c(3,4,3), c(2,4,6), legend.name="Lines", ylab="Data") %>%
  abline(b=1, a=0, legend.name="1:1") %>%
  legend(location="topleft",title="Awesome!") %>%
  grid() %>%
  error_bar(x=1:3, y=c(3,1,2), y.high=c(0.5,0.25,1), y.low=0.1) %>%
  error_bar(x=1:3, y=c(3,1,2), x.low=.2, x.high=.2, col="red",lwd=3) %>%
  callouts(x=1, y=2.8, lwd=2, angle=250, labels="Weird data") %>%
  title("Graphing Fun")
demoPlot
```
## Overview
`gsplot` uses plotting functions similar to those in R base graphics, but allows users to call them in a more intuitive manner. Additionally, as the complexity of the plot features increases, `gsplot` code stays simpler than the equivalent base graphics code. `gsplot` also includes features not present in base graphics that are useful when working with USGS data, such as `callouts` (combines `segments` and `text` into a single call), `error_bar` (allows an error to be given as `y.high`, `y.low`, `x.high`, and `x.low` and automatically builds an error bar), and the argument `legend.name` (an argument within `points`, `lines`, etc., so that colors, line types, and other par information do not have to be redefined within the `legend` call).
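For comparison, here is a minimal base-graphics sketch of the points, lines, and 1:1 line from the demo at the top of this vignette. It is an illustration rather than an exact reproduction: the fixed axis limits and the name `base_leg` are assumptions made for this example. Note that the `legend` call has to restate the color, symbol, and line type of every series by hand, which is the bookkeeping that `legend.name` is meant to remove.

```{r echo=TRUE, fig.width=6, fig.height=6}
# Base-graphics version of the points, lines, and 1:1 line from the demo
# (axis limits are fixed here for simplicity; this is an assumption)
plot(x = 1:3, y = c(3, 1, 2), col = "blue", pch = 18,
     xlim = c(0, 6), ylim = c(0, 6), xlab = "Index", ylab = "Data")
lines(c(3, 4, 3), c(2, 4, 6))
abline(a = 0, b = 1)

# Every style choice made above must be repeated here, in matching order
base_leg <- legend("topleft", title = "Awesome!",
                   legend = c("Points", "Lines", "1:1"),
                   col = c("blue", "black", "black"),
                   pch = c(18, NA, NA),
                   lty = c(NA, 1, 1))
```

If a series is restyled later, the matching entry in `legend` has to be updated as well, which is easy to forget in longer scripts.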
## Data manipulation
Data from the Maumee River are used to showcase the workflow and features that `gsplot` offers. First, the data are split by site to extract the timeseries as four separate variables: dates (formatted as yyyy-mm-dd), flow (discharge in cubic feet per second), pH, and Wtemp (water temperature in degrees Celsius). Additionally, the USGS site IDs for the sampling stations are identified.
```{r echo=TRUE, message=FALSE}
library(gsplot)
MaumeeDV <- MaumeeDV
sites <- unique(MaumeeDV$site_no)
dates <- sapply(sites, function(x) MaumeeDV$Date[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE)
flow <- sapply(sites, function(x) MaumeeDV$Flow[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE)
pH <- sapply(sites, function(x) MaumeeDV$pH_Median[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE)
Wtemp <- sapply(sites, function(x) MaumeeDV$Wtemp[which(MaumeeDV$site_no==x)], USE.NAMES=TRUE)
```
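To make the shape of these variables concrete, the sketch below repeats the same `sapply` pattern on a small made-up data frame. The `toy_*` names and values are hypothetical and are not USGS data. It also shows `split()`, a base R alternative that always returns a list, whereas `sapply` would simplify the result to a matrix if every site happened to have the same number of records.

```{r echo=TRUE}
# Hypothetical stand-in for MaumeeDV with two sites of unequal length
toy_dv <- data.frame(
  site_no = c("A", "A", "A", "B", "B"),
  Flow    = c(10, 20, 30, 5, 15),
  stringsAsFactors = FALSE
)

toy_sites <- unique(toy_dv$site_no)

# Same pattern as above: one vector per site, named by site ID
toy_flow_sapply <- sapply(toy_sites,
                          function(x) toy_dv$Flow[which(toy_dv$site_no == x)],
                          USE.NAMES = TRUE)

# Alternative: split() groups the values by site and always returns a list
toy_flow_split <- split(toy_dv$Flow, toy_dv$site_no)

# Either result is indexed by site ID with [[ ]]
toy_flow_sapply[["A"]]
toy_flow_split[["B"]]
```

Indexing with `[[site]]`, as in the plots that follow, relies on the result being a named list.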
## Simple timeseries
First, `gsplot` is used to create a simple timeseries graph for discharge, and a grid is added to help with data readability.
```{r echo=TRUE, fig.cap="Fig. 1 Simple flow timeseries using `gsplot`.", fig.width=6, fig.height=6}
site <- '04193500'
demoPlot <- gsplot() %>%
  lines(dates[[site]], flow[[site]], col="royalblue") %>%
  title(main=paste("Site", site), ylab="Flow, ft3/s") %>%
  grid()
demoPlot
```
## Simple timeseries using a log scale
These data may be better represented on a log scale because of the wide range of flow values. The y-axis is easily switched to a log scale by adding the argument `log='y'`. To make sure that the grid lines correspond to the logged axis, the argument `equilogs=FALSE` is passed to `grid`, which lets grid lines be drawn at unequal distances from each other.
```{r echo=TRUE, fig.cap="Fig. 2 Simple flow timeseries with a logged y-axis using `gsplot`.", fig.width=6, fig.height=6}
site <- '04193500'
demoPlot <- gsplot() %>%
  lines(dates[[site]], flow[[site]], col="royalblue", log='y') %>%
  title(main=paste("Site", site), ylab="Flow, ft3/s") %>%
  grid(equilogs=FALSE)
demoPlot
```
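One rough way to judge whether a log axis is worthwhile is to count how many orders of magnitude the values span. The sketch below does this in base R for a made-up flow vector (`toy_flow` and its values are hypothetical) and then draws the base-graphics counterpart of the plot above, using the same `log` and `equilogs` arguments.

```{r echo=TRUE, fig.width=6, fig.height=6}
# Hypothetical discharge values in ft3/s, chosen to span a wide range
toy_flow <- c(12, 150, 2300, 48000, 900)

# Orders of magnitude between the smallest and largest value
flow_orders <- diff(log10(range(toy_flow)))
flow_orders

# Base-graphics counterpart: log-scaled y-axis with tick-aligned grid lines
plot(seq_along(toy_flow), toy_flow, type = "l", log = "y",
     col = "royalblue", xlab = "Index", ylab = "Flow, ft3/s")
grid(equilogs = FALSE)
```

A span of several orders of magnitude, as here, usually means that low-flow detail would be flattened on a linear axis.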
## Multiple plots in one figure
What if you wanted to see if there was any relationship between the pH and water temperature? Consider the following three graphs: pH vs water temperature, pH timeseries, water temperature timeseries. To view these three plots at one time, use `layout` to "append" the three different plots.
```{r echo=TRUE, fig.cap="Fig. 3 (a) pH vs water temperature, (b) pH timeseries, (c) water temperature timeseries.", fig.width=6, fig.height=6}
site <- '04193490'
plot1 <- gsplot() %>%
  points(Wtemp[[site]], pH[[site]], col="black") %>%
  title(main=paste("Site", site), xlab="Water Temperature (deg C)", ylab="pH")
plot2 <- gsplot() %>%
  lines(dates[[site]], pH[[site]], col="seagreen") %>%
  title(main="", xlab="time", ylab="pH")
plot3 <- gsplot() %>%
  lines(dates[[site]], Wtemp[[site]], col="orangered") %>%
  title(main="", xlab="time", ylab="Water Temperature (deg C)")
layout(matrix(c(1,2,3), byrow=TRUE, nrow=3))
plot1
plot2
plot3
```
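The matrix passed to `layout` is a map of the figure region: each cell holds the number of the plot drawn there, and a number repeated across cells makes that plot span them. As a sketch of a different arrangement, the example below uses plain base-graphics plots with placeholder data (the name `panel_map` and the plotted values are illustrative only) to put one wide panel on top and two panels side by side underneath.

```{r echo=TRUE, fig.width=6, fig.height=6}
# Plot 1 fills the whole top row; plots 2 and 3 share the bottom row
panel_map <- matrix(c(1, 1,
                      2, 3), nrow = 2, byrow = TRUE)

# layout() returns the number of panels it defined
n_panels <- layout(panel_map)

plot(1:10, main = "Panel 1 (spans the top row)")
plot(10:1, main = "Panel 2")
plot(rep(5, 10), main = "Panel 3")

# Return to a single-panel device for later plots
layout(1)
```

Plots are drawn in the order of the numbers in the matrix, so the order of the plotting calls decides which plot lands in which panel.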
## Compare timeseries of different units
For timeseries, it is sometimes helpful to plot data on the same graph to make comparisons; however, this becomes difficult when the data differ in units. In that case, a second y-axis can easily be added to plot the second timeseries. In this example, we can compare the water temperature and pH timeseries to identify relationships over time. pH is plotted on the secondary y-axis by specifying `side=4`.
```{r echo=TRUE, fig.cap="Fig. 4 Water temperature timeseries on primary y-axis with pH timeseries on secondary y-axis.", fig.width=6, fig.height=6}
site <- '04193490'
demoPlot <- gsplot(mar=c(7.1, 4.1, 4.1, 4.1)) %>%
  lines(dates[[site]], Wtemp[[site]], col="orangered",
        legend.name="Water Temperature", ylab='Water Temperature (deg C)') %>%
  lines(dates[[site]], pH[[site]], col="seagreen", side=4,
        legend.name="pH", ylab='pH (pH Units)') %>%
  title(main=paste("Site", site), xlab='time') %>%
  legend(location="below")
demoPlot
```
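For readers used to base graphics, a common way to get a secondary y-axis there is to overlay a second plot with `par(new=TRUE)` and draw its axis on side 4. The sketch below shows that approach with made-up values (all `toy_*` vectors are hypothetical). It is not how `gsplot` implements `side=4`; it only illustrates the manual steps that `side=4` stands in for.

```{r echo=TRUE, fig.width=6, fig.height=6}
# Hypothetical data: ten time steps of water temperature and pH
toy_time <- 1:10
toy_temp <- c(2, 4, 9, 15, 21, 25, 24, 18, 11, 5)
toy_ph   <- c(8.1, 8.0, 7.9, 7.8, 7.7, 7.6, 7.7, 7.8, 7.9, 8.0)

# Widen the right margin to make room for the second axis label
old_par <- par(mar = c(5.1, 4.1, 4.1, 4.1))

# First series on the primary (left) y-axis
plot(toy_time, toy_temp, type = "l", col = "orangered",
     xlab = "time", ylab = "Water Temperature (deg C)")

# Overlay the second series without axes, then add its axis on the right
par(new = TRUE)
plot(toy_time, toy_ph, type = "l", col = "seagreen",
     axes = FALSE, xlab = "", ylab = "")
axis(side = 4)
mtext("pH (pH Units)", side = 4, line = 3)

# Restore the previous margins
par(old_par)
```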
## Adding to the plot retroactively
Oftentimes, data are plotted first, and observations about missing or abnormal data are made afterwards. `gsplot` makes it easy to add to any plot retroactively by passing the plot object (`demoPlot` in this example) as the first argument in the call for the plot feature.
```{r echo=TRUE, fig.cap="Fig. 5 Initial plot of water temperature timeseries.", fig.width=6, fig.height=6}
# initially plot the data
site <- '04193490'
demoPlot <- gsplot() %>%
  lines(dates[[site]], Wtemp[[site]], col="orangered") %>%
  title(main=paste("Site", site), xlab='time', ylab='Water Temperature (deg C)')
demoPlot
```
```{r echo=TRUE, fig.cap="Fig. 6 Plot of water temperature timeseries with 'Missing Data' callout retroactively added.", fig.width=6, fig.height=6}
# notice the missing data from ~ 1991 through ~2011 and add a callout
demoPlot <- callouts(demoPlot, x=as.Date("2000-01-01"), y=10,labels="Missing Data")
demoPlot
```
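As noted in the Overview, `callouts` combines `segments` and `text` into a single call. The sketch below shows roughly what that looks like when done by hand in base graphics. The helper `base_callout` is hypothetical and written only for this illustration; it measures the leader line in data units and always places the label to the right, so it is a simplification and not the `gsplot` implementation.

```{r echo=TRUE, fig.width=6, fig.height=6}
# Hypothetical helper: a leader line from (x, y) at a given angle, plus a label
base_callout <- function(x, y, labels, angle = 30, length = 0.5, ...) {
  # End point of the leader line (angle in degrees, length in data units)
  x1 <- x + length * cos(angle * pi / 180)
  y1 <- y + length * sin(angle * pi / 180)
  segments(x, y, x1, y1, ...)             # the line
  text(x1, y1, labels = labels, pos = 4)  # the label, right of the end point
  invisible(c(x = x1, y = y1))
}

plot(1:3, c(3, 1, 2), pch = 18, col = "blue",
     xlim = c(0, 4), ylim = c(0, 4), xlab = "Index", ylab = "Data")
end_pt <- base_callout(1, 3, "Weird data", angle = 30, lwd = 2)
```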