5.2 Data preview
First, let us inspect the distribution of values:
hist(df$`discharge (cf/s)`, main = "Distribution of Rio Grande Discharge", xlab = "Discharge (cf/s)")
Like many other hydrological variables, streamflow is positively (right) skewed: the distribution is asymmetric, with most values concentrated at the low end of the range and a long right tail produced by a few instances of very high streamflow.
Now let us inspect the temporal behavior:
ggplot(df, aes(x = datetime, y = `discharge (cf/s)`)) +
  geom_line() +
  labs(x = "Time", y = "Discharge (cf/s)") +
  theme_light()
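The positive skew seen in the histogram can also be checked numerically. The following is a minimal sketch, not part of the original analysis: `sample_skewness` is a hypothetical helper written here in base R, and `q` is synthetic log-normal data standing in for the discharge column, since the real `df` is not reproduced in this section. A positive skewness coefficient, together with a mean larger than the median, is consistent with a long right tail.

```r
# Moment-based sample skewness using base R only.
# `sample_skewness` is a hypothetical helper, not a function from any package.
sample_skewness <- function(x) {
  x <- x[!is.na(x)]          # drop missing values
  m <- mean(x)
  m2 <- mean((x - m)^2)      # second central moment
  m3 <- mean((x - m)^3)      # third central moment
  m3 / m2^(3 / 2)
}

# Synthetic stand-in for the discharge series (assumption: NOT the Rio Grande record).
set.seed(1)
q <- rlnorm(1000, meanlog = 3, sdlog = 1)

sample_skewness(q)     # positive for a right-skewed sample
mean(q) > median(q)    # the long right tail pulls the mean above the median
```

On the actual data, the same check would be `sample_skewness(df$`discharge (cf/s)`)`, assuming `df` is loaded as in the earlier steps.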