A weather generator is a numerical tool that resamples an input
timeseries many times, while preserving observed or projected
characteristics of importance, such as the statistics of the transition
between wet and dry days. The resulting large group, or ensemble, of
likely rainfall and temperature timeseries represents a range of
possible amounts, daily patterns, and seasonality. This weather
generator is, to our knowledge, novel in that it includes
*seasons* (up to 26) in training the simulation algorithm.

The goal of `wxgenR`

is to provide users a tool that can
simulate, with fidelity, an ensemble of precipitation and temperature
based on training data that could include, for example, station based
point measurements, grid cell values derived from models or remotely
sensed data, or basin averages. The incorporation of seasonality as a
covariate in the training algorithm allows users to examine the effects
of shifts in seasonality due to climate warming (e.g., earlier snowmelt
seasons or prolonged summer dry periods). `wxgenR`

is an
effective and robust scenario planning tool for a wide variety of
potential futures.

`wxgenR`

All that is needed to run `wxgenR`

is a single time series
of precipitation, temperature, and season. Up to 20 seasons may be
defined, but most users will likely only need two to four based on their
study region. For example, `wxgenR`

is provided with single
station data from Blacksburg, Virgina, a temperate locale that is better
defined by four seasons. Within the data used to train the weather
generator, these four seasons should be noted with an index of either 1,
2, 3, or 4 for each day in the time series. The varying statistics of
each season will impact the resulting simulations. Precipitation and
temperature data are point measurements taken at the Blacksburg National
Weather Service office (GHCND:USC00440766).

For example, using the Blacksburg, VA National Weather Service
station-based precipitation, temperature, and season from 1991 to 2020,
we can generate simulated precipitation and temperature for any desired
time length and ensemble size. Your variables *must* be named as
the following: ‘year’, ‘month’, ‘day’, ‘prcp’, ‘temp’, ‘season’. All
input variables must be contained within the same dataframe or text
file. If inputting a text file, it must be comma separated (.csv). The
weather generator can handle NA values for precipitation or temperature,
but all other variables should be numeric values.

```
library(wxgenR)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(reshape2)
#>
#> Attaching package: 'reshape2'
#> The following object is masked from 'package:tidyr':
#>
#> smiths
library(ggpubr)
#> Warning: package 'ggpubr' was built under R version 4.2.1
#> Loading required package: ggplot2
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:reshape2':
#>
#> dcast, melt
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
#> The following objects are masked from 'package:lubridate':
#>
#> hour, isoweek, mday, minute, month, quarter, second, wday, week,
#> yday, year
library(moments)
library(seas)
#> Loading required package: MASS
#>
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#>
#> select
data(BlacksburgVA)
# head(BlacksburgVA)
```

Use the variables within the wx() function like `syr`

and
`eyr`

(start and end year) to set the temporal boundaries
from which to sample, otherwise, if left empty the start and end years
will default to the beginning and end of your training data. Use
`nsim`

to set the length (in years) of your simulated
weather, and `nrealz`

to set the ensemble size (number of
traces).

The variable `wwidth`

will set the sampling window for
each day of year (Jan. 1 through Dec. 31) for every year in the
simulation. The sampling window for each day of year is +/-
`wwidth`

+ 1, effectively sampling `wwidth`

number
of days before and after each day of year, including that day of year. A
lower value for `wwidth`

will sample fewer surrounding days
and a higher value will sample more days, resulting in dampened and
heightened variability, respectively. Typical setting of
`wwidth`

is between 1 and 15, resulting in a daily sampling
window of 3 days and 31 days, respectively. Generally, higher and lower
values of `wwidth`

result in higher and lower variance,
respectively, in the simulated data.

For example, to simulate precipitation on day 1 of the simulation
(Jan. 1 of year 1), with `wwidth`

= 1 (a 3-day window), the
algorithm will sample days in the training record between (and
including) December 31 and January 2 (for all years in the training
record). For day 2 of the simulation (Jan. 2 of year 1), the algorithm
will sample days in the training record between January 1 and January 3.
Simulation day 3 (Jan. 3) will sample between January 2 and January 4,
and so on. Increasing `wwidth`

to 2 (a 5-day window) will
sample between December 30 and January 3 for Jan. 1 simulations,
December 31 to January 4 for Jan. 2, and January 1 to January 5 for
Jan. 3, and so on.

In some cases, the `wwidth`

will be automatically
increased through an adaptive window width if precipitation occurred on
a given day but there were less than two daily precipitation values over
0.01 inches during the window for that day. `wwidth`

will
adaptively increase by 1 until two or more daily precipitation values
over 0.01 inches are in each window. Adaptive window width is most
likely to occur in regions with high aridity, dry seasons, a small
initial value of `wwidth`

is used, or if the number of years
in the training data is relatively short (e.g., less than 30 years). To
display the results of the adaptive window width, set
`awinFlag = T`

.

Here, our training data spans 1991-01-01 to 2020-12-31, but we don’t
want to use the full historical record, so we set `syr`

and
`eyr`

to 2000 and 2004, respectively, in the
`wx()`

function so that the training data is subset between
those years. We want a simulation length of 5-years (`nsim`

)
in order to match the length of the subset training record and 2 traces
in our ensemble (`nrealz`

) for computational efficiency
(although more traces, e.g. 50, are recommended). Sampling for each day
of the year will sample from the preceding 1-day, the day of, and the
following 1-day (`wwidth`

) for a total window size of
3-days.

We may also want to increase the variability of our simulated
precipitation by sampling outside the historical envelope with an
Epanechnikov Kernel (`ekflag = T`

). For more details on the
Epanechnikov kernel and its use in a weather generator, see Rajagopalan
et al. (1996). Setting `tempPerturb = T`

will increase the
variability of the simulated temperature by adding random noise from a
normal distribution fit using a mean of zero and a standard deviation
equal to the monthly standard deviation of simulated temperature
residuals. Given that simulated daily temperature at time t is a
function of temperature(t-1), cosine(t), sine(t), precipitation
occurrence(t), and monthly mean temperature(t), the standard deviation
of daily residuals from this model is calculated for each month and used
to add random noise to the simulated temperature. The temperature
simulation approach is inspired by- and adapted from- Verdin et
al. (2015, 2018).

Since the training data has units of inches and degrees Fahrenheit
for precipitation and temperature, respectively, we must set
`unitSystem = "Metric"`

. Setting `numbCores`

to 2
or greater will enable parallel computing for precipitation simulation
which is the most computationally intensive aspect of the weather
generator.

```
= 5 #number of simulation years
nsim = 2 #number of traces in ensemble
nrealz
<- Sys.time() #benchmark run time
startTime
= wx(trainingData = BlacksburgVA, syr = 2000, eyr = 2004,
z nsim = nsim, nrealz = nrealz, aseed = 123,
wwidth = 1, unitSystem = "Metric", ekflag = TRUE,
awinFlag = FALSE, tempPerturb = TRUE, pcpOccFlag = FALSE,
numbCores = 2)
#> ...Simulate precipitation occurrence and temperature...
#> 1
#> 2
#> ...Simulate precipitation amount...
= Sys.time() endTime
```

The wx() function will return a list containing both your
input/training data, and a variety of processed outputs, named here as
the variable `z`

. Within `z`

, `dat.d`

is the original input data as well as some intermediary variables.
`simyr1`

contains the years within your training data that
were sampled to generate simulated values for each trace. `X`

is the occurrence of daily precipitation for each trace, where 1 and 0
indicate the presence and absence of precipitation, respectively.
`Xseas`

is the season index for each day and trace.
`Xpdate`

shows which days from the training data were sampled
for each simulated day and trace, if precipitation was simulated to
occur on a given day. `Xpamt`

is the simulated precipitation
amount for each day and trace. `Xtemp`

is the simulated
temperature for each day and trace.

Generally, `Xpamt`

and `Xtemp`

will be of most
interest to users as these are the desired outputs of simulated daily
precipitation and temperature.

`# glimpse(z)`

```
#parse variables from wx() output
= z$dat.d
dat.d = z$simyr1
simyr1 = z$X
X = z$Xseas
Xseas = z$Xpdate
Xpdate = z$Xpamt
Xpamt = z$Xtemp
Xtemp
#write simulation output
#
<- seq(1, length(X[,1]), 366)
it1 = it1+366-1
it2
#initialize storage
= matrix(NA, nrow = nsim*366, ncol = nrealz+3)
sim.pcp = matrix(NA, nrow = nsim*366, ncol = nrealz+3)
sim.tmp = matrix(NA, nrow = nsim*366, ncol = nrealz+3)
sim.szn
#loop through realization
= 1
irealz for (irealz in 1:nrealz){
<- vector()
outmat
#loop through simulation years
= 1
isim for (isim in 1:nsim){
= FALSE
leapflag = simyr1[isim, irealz]
ayr if (lubridate::leap_year(ayr)) leapflag = TRUE
= rep(isim, 366) #column 1, simulation year
col1 = ayr*10^4+01*10^2+01; d2 = ayr*10^4+12*10^2+31
d1 = which(dat.d$date1 == d1)
i1 = which(dat.d$date1 == d2)
i2 = dat.d$date1[i1:i2] #column 2, simulation date
col2 if (leapflag == FALSE) col2 = c(col2,NA)
= it1[isim]
i1 = it2[isim]
i2 = Xseas[i1:i2, irealz] #column 3, simulation season
col3 = X[i1:i2, irealz] #column 4, precipitation occurrence
col4 = Xpdate[i1:i2, irealz] #column 5, precipation resampling date
col5 = Xpamt[i1:i2, irealz] #column 6, resampled precipitation amount
col6 = Xtemp[i1:i2, irealz] #column7, simulated temperature
col7
#create time series of 'simulation day'
= rep(isim, length(col2))
sim.yr = month(ymd(col2))
sim.month = day(ymd(col2))
sim.day
= rbind(outmat, cbind(sim.yr, sim.month, sim.day, col6, col7, col3))
outmat #isim
}
colnames(outmat) = c("simulation year", "month", "day", "prcp", "temp", "season")
if(irealz == 1){
1:3] = outmat[,1:3]
sim.pcp[,1:3] = outmat[,1:3]
sim.tmp[,1:3] = outmat[,1:3]
sim.szn[,
}
+3] = outmat[,4]
sim.pcp[,irealz+3] = outmat[,5]
sim.tmp[,irealz+3] = outmat[,6]
sim.szn[,irealz
#irealz }
```

#Format dataframes for simulated precip, temperature, and season

```
# df = sim.pcp
= function(df){
formatting
= as.data.frame(df)
df
colnames(df) = c("simulation year", "month", "day", paste0("Trace_", 1:nrealz))
#remove 366 days for non-leap years
= drop_na(df, c(month, day))
df
#assign simulation year to start at the same time as training data
$`simulation year` = df$`simulation year` + dat.d$year[1] - 1
df
#format date
$Date = ymd(paste(df$`simulation year`, df$month, df$day, sep = "-"))
df
#remove years that aren't leap years
# df = drop_na(df, Date)
= df %>%
df mutate(yday = as.numeric(yday(Date)),
week = as.numeric(week(Date))) %>%
relocate(c(Date,yday,week), .after = day) %>%
melt(id = 1:6)
return(df)
}
```

```
= formatting(sim.pcp)
sim.pcp #> Warning: 1 failed to parse.
#> Warning in melt(., id = 1:6): The melt generic in data.table has been passed a
#> data.frame and will attempt to redirect to the relevant reshape2 method; please
#> note that reshape2 is deprecated, and this redirection is now deprecated as
#> well. To continue using melt methods from reshape2 while both libraries are
#> attached, e.g. melt.list, you can prepend the namespace like reshape2::melt(.).
#> In the next version, this warning will become an error.
= formatting(sim.tmp)
sim.tmp #> Warning: 1 failed to parse.
#> Warning: The melt generic in data.table has been passed a data.frame and will
#> attempt to redirect to the relevant reshape2 method; please note that reshape2
#> is deprecated, and this redirection is now deprecated as well. To continue using
#> melt methods from reshape2 while both libraries are attached, e.g. melt.list,
#> you can prepend the namespace like reshape2::melt(.). In the next version, this
#> warning will become an error.
= formatting(sim.szn)
sim.szn #> Warning: 1 failed to parse.
#> Warning: The melt generic in data.table has been passed a data.frame and will
#> attempt to redirect to the relevant reshape2 method; please note that reshape2
#> is deprecated, and this redirection is now deprecated as well. To continue using
#> melt methods from reshape2 while both libraries are attached, e.g. melt.list,
#> you can prepend the namespace like reshape2::melt(.). In the next version, this
#> warning will become an error.
```

```
colnames(dat.d)[11] = "yday"
= dat.d[,c(1:3,8:9,11,4)]
obs.pcp = dat.d[,c(1:3,8:9,11,5)] obs.tmp
```

If your data contained NA values, they can propagate to simulated
temperature values (NA precip values in your data are set to 0), so use
`na.rm = T`

for any subsequent analysis. You may also choose
to replace `NA`

values with daily or monthly averages.
Additionally, leap years may be included in the simulated weather if
they are included in your training data, so all non-leap years include a
row of ‘NA’ values at the end of the calendar year as a book-keeping
measure so that the total number of rows in each trace is the same.

```
#plot simulated daily data
# simDat = sim.tmp
# obsDat = obs.tmp
# Tag = "Temp"
= function(simDat, obsDat, Tag){
dailyPlot
= simDat %>%
simD drop_na() %>%
group_by(variable, yday) %>%
summarise(
mean = mean(value, na.rm = T),
max = max(value, na.rm = T),
sd = sd(value, na.rm = T),
skew = skewness(value, na.rm = T)
%>%
) ungroup()
<- simD %>%
simDq group_by(yday) %>%
summarise(
mean_q5 = quantile(mean, 0.05, na.rm = T),
mean_med = median(mean, na.rm = T),
mean_q95 = quantile(mean, 0.95, na.rm =T),
max_q5 = quantile(max, 0.05, na.rm = T),
max_med = median(max, na.rm = T),
max_q95 = quantile(max, 0.95, na.rm = T),
sd_q5 = quantile(sd, 0.05, na.rm = T),
sd_med = median(sd),
sd_q95 = quantile(sd, 0.95, na.rm = T),
skew_q5 = quantile(skew, 0.05, na.rm = T),
skew_med = median(skew, na.rm = T),
skew_q95 = quantile(skew, 0.95, na.rm = T)
%>%
) drop_na() %>%
ungroup()
if(Tag == "Temp"){
<- obsDat %>%
obs drop_na() %>%
group_by(yday) %>%
summarise(
mean = mean(temp, na.rm = T),
max = max(temp, na.rm = T),
sd = sd(temp, na.rm = T),
skew = skewness(temp, na.rm = T)
%>%
) ungroup()
else if(Tag == "Precip"){
} <- obsDat %>%
obs drop_na() %>%
group_by(yday) %>%
summarise(
mean = mean(prcp, na.rm = T),
max = max(prcp, na.rm = T),
sd = sd(prcp, na.rm = T),
skew = skewness(prcp, na.rm = T)
%>%
) ungroup()
}
colnames(obs)[-1] = paste0("obs_", colnames(obs)[-1])
= left_join(simDq, obs, by = "yday")
df.comb
#plotting --------------------------------
= c(0.8, 0.9)
lgdLoc
if(Tag == "Temp"){
= "Daily Temperature "
yLabel = "(°F)"
units else if(Tag == "Precip"){
} = "Daily Precipitation "
yLabel = "(inches)"
units
}
= 0.65
trnAlpha
#daily mean
= ggplot(df.comb) +
p1 geom_ribbon(aes(x = yday, ymin = mean_q5, ymax = mean_q95), alpha = 0.25) +
geom_line(aes(x = yday, y = mean_med, color = "red"), size = 1, alpha = 0.8) +
geom_line(aes(x = yday, y = obs_mean), size = 0.3, alpha = trnAlpha, linetype = "solid", color = "blue") +
geom_point(aes(x = yday, y = obs_mean), size = 0.6, alpha = trnAlpha, color = "blue") +
scale_colour_manual(values =c('blue'='blue','red'='red', 'grey' = 'grey'), labels = c('Training Data','Simulation Median', '95% Confidence')) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
# text=element_text(size=14),
panel.grid.major = element_line(),
legend.title=element_blank(),
legend.position = lgdLoc,
legend.background = element_blank(),
legend.box.background = element_blank(),
legend.key = element_blank()) +
xlab("Day of Year") + ylab(paste0("Mean ", yLabel, units))
#daily SD
= ggplot(df.comb) +
p2 geom_ribbon(aes(x = yday, ymin = sd_q5, ymax = sd_q95), alpha = 0.25) +
geom_line(aes(x = yday, y = sd_med, color = "red"), size = 1, alpha = 0.8) +
geom_line(aes(x = yday, y = obs_sd), size = 0.3, alpha = trnAlpha, linetype = "solid", color = "blue") +
geom_point(aes(x = yday, y = obs_sd), size = 0.6, alpha = trnAlpha, color = "blue") +
scale_colour_manual(values =c('blue'='blue','red'='red', 'grey' = 'grey'), labels = c('Training Data','Simulation Median', '95% Confidence')) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
# text=element_text(size=14),
panel.grid.major = element_line(),
legend.title=element_blank(),
legend.position = lgdLoc,
legend.background = element_blank(),
legend.box.background = element_blank(),
legend.key = element_blank()) +
xlab("Day of Year") + ylab(paste0("Std. Deviation of ", yLabel, units))
#daily skew
= ggplot(df.comb) +
p3 geom_ribbon(aes(x = yday, ymin = skew_q5, ymax = skew_q95), alpha = 0.25) +
geom_line(aes(x = yday, y = skew_med, color = "red"), size = 1, alpha = 0.8) +
geom_line(aes(x = yday, y = obs_skew), size = 0.3, alpha = trnAlpha, linetype = "solid", color = "blue") +
geom_point(aes(x = yday, y = obs_skew), size = 0.6, alpha = trnAlpha, color = "blue") +
scale_colour_manual(values =c('blue'='blue','red'='red', 'grey' = 'grey'), labels = c('Training Data','Simulation Median', '95% Confidence')) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
# text=element_text(size=14),
panel.grid.major = element_line(),
legend.title=element_blank(),
legend.position = lgdLoc,
legend.background = element_blank(),
legend.box.background = element_blank(),
legend.key = element_blank()) +
xlab("Day of Year") + ylab(paste0("Skew of ", yLabel, " (-)"))
#daily Max
= ggplot(df.comb) +
p4 geom_ribbon(aes(x = yday, ymin = max_q5, ymax = max_q95), alpha = 0.25) +
geom_line(aes(x = yday, y = max_med, color = "red"), size = 1, alpha = 0.8) +
geom_line(aes(x = yday, y = obs_max), size = 0.3, alpha = trnAlpha, linetype = "solid", color = "blue") +
geom_point(aes(x = yday, y = obs_max), size = 0.6, alpha = trnAlpha, color = "blue") +
scale_colour_manual(values =c('blue'='blue','red'='red', 'grey' = 'grey'), labels = c('Training Data','Simulation Median', '95% Confidence')) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
# text=element_text(size=14),
panel.grid.major = element_line(),
legend.title=element_blank(),
legend.position = lgdLoc,
legend.background = element_blank(),
legend.box.background = element_blank(),
legend.key = element_blank()) +
xlab("Day of Year") + ylab(paste0("Maximum ", yLabel, units))
= ggarrange(p1, p2, p3, p4, nrow = 2, ncol = 2, common.legend = TRUE, legend = "bottom")
p.comb
print(p.comb)
# p.out = paste0(tempdir(), "/outputPlots/dailyStats_", Tag, ".png")
# ggsave(filename = p.out, plot = p.comb, device = "png")
}
```

```
dailyPlot(sim.pcp, obs.pcp, "Precip")
#> `summarise()` has grouped output by 'variable'. You can override using the
#> `.groups` argument.
#> Warning: Removed 2 row(s) containing missing values (geom_path).
#> Warning: Removed 43 rows containing missing values (geom_point).
```

```
dailyPlot(sim.tmp, obs.tmp, "Temp")
#> `summarise()` has grouped output by 'variable'. You can override using the
#> `.groups` argument.
```

Boxplot whiskers are in the style of Tukey (1.5 x interquartile range)

```
#plot simulated daily data
= sim.tmp
simDat = obs.tmp
obsDat = "Temp"
Tag
= function(simDat, obsDat, Tag){
monthlyPlot
if(Tag == "Temp"){
= simDat %>%
simM drop_na() %>%
group_by(variable, month, `simulation year`) %>%
summarise(
mean = mean(value, na.rm = T),
max = max(value, na.rm = T),
sd = sd(value, na.rm = T),
skew = skewness(value, na.rm = T)
%>%
) ungroup()
<- simM %>%
simMM group_by(variable, month) %>%
summarise(
mean=mean(mean),
max=mean(max),
sd=sqrt(mean(sd^2)),
skew=mean(skew, na.rm=T)
%>%
) ungroup()
<- obsDat %>%
obs drop_na() %>%
group_by(month, year) %>%
summarise(
mean = mean(temp, na.rm = T),
max = max(temp, na.rm = T),
sd = sd(temp, na.rm = T),
skew = skewness(temp, na.rm = T)
%>%
) ungroup()
<- obs %>%
obsMM group_by(month) %>%
summarise(
mean = mean(mean, na.rm = T),
max = mean(max, na.rm = T),
sd = sqrt(mean(sd^2)),
skew = mean(skew, na.rm=T)
%>%
) mutate(variable = "Observed") %>%
relocate(variable) %>%
ungroup()
# colnames(obsMM)[-1] = paste0("obs_", colnames(obsMM)[-1])
else if(Tag == "Precip"){
}
= simDat %>%
simM drop_na() %>%
group_by(variable, month, `simulation year`) %>%
summarise(
sum = sum(value, na.rm = T),
max = max(value, na.rm = T),
sd = sd(value, na.rm = T),
skew = skewness(value, na.rm = T)
%>%
) ungroup()
<- simM %>%
simMM group_by(variable, month) %>%
summarise(
sum=mean(sum),
max=mean(max),
sd=sqrt(mean(sd^2)),
skew=mean(skew, na.rm=T)
%>%
) ungroup()
<- obsDat %>%
obs drop_na() %>%
group_by(month, year) %>%
summarise(
sum = sum(prcp, na.rm = T),
max = max(prcp, na.rm = T),
sd = sd(prcp, na.rm = T),
skew = skewness(prcp, na.rm = T)
%>%
) ungroup()
<- obs %>%
obsMM group_by(month) %>%
summarise(
sum = mean(sum, na.rm = T),
max = mean(max, na.rm = T),
sd = sqrt(mean(sd^2)),
skew = mean(skew, na.rm=T)
%>%
) mutate(variable = "Observed") %>%
relocate(variable) %>%
ungroup()
# colnames(obsMM)[-1] = paste0("obs_", colnames(obsMM)[-1])
}
= rbind(obsMM, simMM)
df.comb
#plotting --------------------------------
if(Tag == "Temp"){
= ggplot(df.comb) +
p1 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = month, y = mean, group = month)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = month, y = mean, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = month, y = mean, color = "Observed")) +
xlab("Month") + ylab("Temperature (°F)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = 1:12) +
ggtitle("Average Mean Monthly Temperature")
else if(Tag == "Precip"){
}
= ggplot(df.comb) +
p1 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = month, y = sum, group = month)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = month, y = sum, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = month, y = sum, color = "Observed")) +
xlab("Month") + ylab("Precipitation (inches)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = 1:12) +
ggtitle("Average Total Monthly Precipitation")
}
if(Tag == "Temp"){
= "Temperature "
yLabel = "(°F)"
units else if(Tag == "Precip"){
} = "Precipitation "
yLabel = "(inches)"
units
}
#monthly SD
= ggplot(df.comb) +
p2 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = month, y = sd, group = month)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = month, y = sd, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = month, y = sd, color = "Observed")) +
xlab("Month") + ylab(paste0("Standard Deviation ", units)) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = 1:12) +
ggtitle(paste0("Average Standard Deviation in Monthly ", yLabel))
#monthly Skew
= ggplot(df.comb) +
p3 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = month, y = skew, group = month)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = month, y = skew, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = month, y = skew, color = "Observed")) +
xlab("Month") + ylab("Skew (-)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = 1:12) +
ggtitle(paste0("Average Skew in Monthly ", yLabel))
#monthly max
= ggplot(df.comb) +
p4 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = month, y = max, group = month)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = month, y = max, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = month, y = max, color = "Observed")) +
xlab("Month") + ylab(paste0("Maximum ", units)) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = 1:12) +
ggtitle(paste0("Average Monthly Maximum ", yLabel))
= ggarrange(p1, p2, p3, p4, nrow = 2, ncol = 2, common.legend = TRUE, legend = "bottom")
p.comb
print(p.comb)
= paste0(tempdir(), "/outputPlots/monthlyStats_", Tag, ".png")
p.out # ggsave(filename = p.out, plot = p.comb, device = "png", height = 8, width = 8, units = "in")
}
```

```
monthlyPlot(sim.pcp, obs.pcp, "Precip")
#> `summarise()` has grouped output by 'variable', 'month'. You can override using
#> the `.groups` argument.
#> `summarise()` has grouped output by 'variable'. You can override using the
#> `.groups` argument.
#> `summarise()` has grouped output by 'month'. You can override using the
#> `.groups` argument.
```

```
monthlyPlot(sim.tmp, obs.tmp, "Temp")
#> `summarise()` has grouped output by 'variable', 'month'. You can override using
#> the `.groups` argument.
#> `summarise()` has grouped output by 'variable'. You can override using the
#> `.groups` argument.
#> `summarise()` has grouped output by 'month'. You can override using the
#> `.groups` argument.
```

Boxplot whiskers are in the style of Tukey (1.5 x interquartile range)

```
#plot simulated daily data
= sim.tmp
simDat = obs.tmp
obsDat = "Temp"
Tag
= function(simDat, obsDat, Tag){
weeklyPlot
if(Tag == "Temp"){
= simDat %>%
simW drop_na() %>%
group_by(variable, week, `simulation year`) %>%
summarise(
mean = mean(value, na.rm = T),
max = max(value, na.rm = T),
sd = sd(value, na.rm = T),
skew = skewness(value, na.rm = T)
%>%
) ungroup()
<- simW %>%
simWW group_by(variable, week) %>%
summarise(
mean=mean(mean),
max=mean(max),
sd=sqrt(mean(sd^2)),
skew=mean(skew, na.rm=T)
%>%
) ungroup()
<- obsDat %>%
obs drop_na() %>%
group_by(week, year) %>%
summarise(
mean = mean(temp, na.rm = T),
max = max(temp, na.rm = T),
sd = sd(temp, na.rm = T),
skew = skewness(temp, na.rm = T)
%>%
) ungroup()
<- obs %>%
obsWW group_by(week) %>%
summarise(
mean = mean(mean, na.rm = T),
max = mean(max, na.rm = T),
sd = sqrt(mean(sd^2)),
skew = mean(skew, na.rm=T)
%>%
) mutate(variable = "Observed") %>%
relocate(variable) %>%
ungroup()
# colnames(obsMM)[-1] = paste0("obs_", colnames(obsMM)[-1])
else if(Tag == "Precip"){
}
= simDat %>%
simW drop_na() %>%
group_by(variable, week, `simulation year`) %>%
summarise(
sum = sum(value, na.rm = T),
max = max(value, na.rm = T),
sd = sd(value, na.rm = T),
skew = skewness(value, na.rm = T)
%>%
) ungroup()
<- simW %>%
simWW group_by(variable, week) %>%
summarise(
sum=mean(sum),
max=mean(max),
sd=sqrt(mean(sd^2)),
skew=mean(skew, na.rm=T)
%>%
) ungroup()
<- obsDat %>%
obs drop_na() %>%
group_by(week, year) %>%
summarise(
sum = sum(prcp, na.rm = T),
max = max(prcp, na.rm = T),
sd = sd(prcp, na.rm = T),
skew = skewness(prcp, na.rm = T)
%>%
) ungroup()
<- obs %>%
obsWW group_by(week) %>%
summarise(
sum = mean(sum, na.rm = T),
max = mean(max, na.rm = T),
sd = sqrt(mean(sd^2)),
skew = mean(skew, na.rm=T)
%>%
) mutate(variable = "Observed") %>%
relocate(variable) %>%
ungroup()
# colnames(obsMM)[-1] = paste0("obs_", colnames(obsMM)[-1])
}
= rbind(obsWW, simWW)
df.comb
#plotting --------------------------------
if(Tag == "Temp"){
= ggplot(df.comb) +
p1 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = week, y = mean, group = week)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = week, y = mean, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = week, y = mean, color = "Observed")) +
xlab("Week") + ylab("Temperature (°F)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = seq(1,52,2)) +
ggtitle("Average Mean Weekly Temperature")
else if(Tag == "Precip"){
}
= ggplot(df.comb) +
p1 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = week, y = sum, group = week)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = week, y = sum, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = week, y = sum, color = "Observed")) +
xlab("Week") + ylab("Precipitation (inches)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = seq(1,52,2)) +
ggtitle("Average Total Weekly Precipitation")
}
if(Tag == "Temp"){
= "Temperature "
yLabel = "(°F)"
units else if(Tag == "Precip"){
} = "Precipitation "
yLabel = "(inches)"
units
}
#weekly SD
= ggplot(df.comb) +
p2 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = week, y = sd, group = week)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = week, y = sd, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = week, y = sd, color = "Observed")) +
xlab("Week") + ylab(paste0("Standard Deviation ", units)) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = seq(1,52,2)) +
ggtitle(paste0("Average Standard Deviation in Weekly ", yLabel))
#weekly Skew
= ggplot(df.comb) +
p3 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = week, y = skew, group = week)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = week, y = skew, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = week, y = skew, color = "Observed")) +
xlab("Week") + ylab("Skew (-)") +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = seq(1,52,2)) +
ggtitle(paste0("Average Skew in Weekly ", yLabel))
#weekly max
= ggplot(df.comb) +
p4 geom_boxplot(data = subset(df.comb, variable != "Observed"), aes(x = week, y = max, group = week)) +
geom_line(data = subset(df.comb, variable == "Observed"), size = 0.5, aes(x = week, y = max, color = "Observed")) +
geom_point(data = subset(df.comb, variable == "Observed"), size = 1.5, aes(x = week, y = max, color = "Observed")) +
xlab("Week") + ylab(paste0("Maximum ", units)) +
theme_classic() +
theme(axis.title = element_text(face = "bold"),
text=element_text(size=12),
panel.grid.major = element_line(),
legend.title=element_blank(),
plot.title = element_text(size=10)
+
) scale_x_continuous(breaks = seq(1,52,2)) +
ggtitle(paste0("Average Weekly Maximum ", yLabel))
= ggarrange(p1, p2, p3, p4, nrow = 2, ncol = 2, common.legend = TRUE, legend = "bottom")
p.comb
print(p.comb)
= paste0(tempdir(), "/outputPlots/weeklyStats_", Tag, ".png")
p.out # ggsave(filename = p.out, plot = p.comb, device = "png", height = 8, width = 10, units = "in")
}
```

`# weeklyPlot(sim.pcp, obs.pcp, "Precip")`

`# weeklyPlot(sim.tmp, obs.tmp, "Temp")`

Save your simulated weather ensemble to a file via the
`writeSim`

function. It will conveniently save each trace to
a .csv file.

```
# setwd() to desired location for writeSim to save .csv files containing the simulated precipitation and temperature
# setwd(tempdir())
#
# writeSim(wxOutput = z, nsim = nsim, nrealz = nrealz, debug = TRUE)
```

Running the `wx()`

weather generator code for a 5-year,
10-trace simulation on a laptop with the following characteristics
results in the below run time. Parallel computing was enabled via
`numbCores = 11`

in the wx() function. OS: Microsoft Windows
10 Enterprise 10.0.19044 Build 19044 Hardware: Intel(R) Core(TM)
i7-10850H CPU @2.70GHz, 2712 Mhz, 6 Cores,
12 Logical Processors. 16 GB installed physical memory.

```
#wxgenR weather generation run time:
print(difftime(endTime, startTime, units='mins'))
#> Time difference of 0.1447891 mins
```

For more details and examples, including analysis of the dataset used in this vignette, see the following works:

```
Bearup, L., Gangopadhyay, S., & Mikkelson, K. (2021). Hydroclimate Analysis Lower Santa Cruz River Basin Study (Technical Memorandum No ENV-2020-056). Bureau of Reclamation. <https://www.usbr.gov/lc/phoenix/programs/lscrbasin/LSCRBStudy.html>
Gangopadhyay, S., Bearup, L. A., Verdin, A., Pruitt, T., Halper, E., & Shamir, E. (2019, December 1). A collaborative stochastic weather generator for climate impacts assessment in the Lower Santa Cruz River Basin, Arizona. Fall Meeting 2019, American Geophysical Union. <https://ui.adsabs.harvard.edu/abs/2019AGUFMGC41G1267G>
Rajagopalan, B., Lall, U., and Tarboton, D. G.: Nonhomogeneous Markov Model for Daily Precipitation, Journal of Hydrologic Engineering, 1, 33–40, <https://doi.org/10.1061/(ASCE)1084-0699(1996)1:1(33)>, 1996.
Verdin, A., Rajagopalan, B., Kleiber, W., and Katz, R. W.: Coupled stochastic weather generation using spatial and generalized linear models, Stoch Environ Res Risk Assess, 29, 347–356, <https://doi.org/10.1007/s00477-014-0911-6>, 2015.
Verdin, A., Rajagopalan, B., Kleiber, W., Podestá, G., and Bert, F.: A conditional stochastic weather generator for seasonal to multi-decadal simulations, Journal of Hydrology, 556, 835–846, <https://doi.org/10.1016/j.jhydrol.2015.12.036>, 2018.
```

This information is preliminary and is subject to revision. It is being provided to meet the need for timely best science. The information is provided on the condition that neither the U.S. Bureau of Reclamation nor the U.S. Government may be held liable for any damages resulting from the authorized or unauthorized use of the information.