diff --git a/vignettes/Plotting_Time_Series.Rmd b/vignettes/Plotting_Time_Series.Rmd new file mode 100644 index 0000000..f300387 --- /dev/null +++ b/vignettes/Plotting_Time_Series.Rmd @@ -0,0 +1,146 @@ +--- +title: "Plotting Time Series Data" +author: "Rami Krispin (@Rami_Krispin)" +date: "2018-09-17" +output: rmarkdown::html_vignette +vignette: > + %\VignetteEngine{knitr::rmarkdown} + %\VignetteIndexEntry{Vignette Title} + %\usepackage[UTF-8]{inputenc} +--- + +```{r setup, include=FALSE} +knitr::opts_chunk$set(echo = TRUE) +options(list(menu.graphics = FALSE, scipen=99, digits= 3)) +options("getSymbols.warning4.0"=FALSE) +options("getSymbols.yahoo.warning"=FALSE) +``` + +The `ts_plot` is a versatile function for interactive plotting of time series objects. It is based on the [plotly](https://plot.ly/r/) engine and supports multiple time series objects such as `ts`, `mts`, `zoo`, `xts` and as well the data frame family (`data.frame`, `data.table`, and `tbl`). + +```{r} +# install.packages("TSstudio") +library(TSstudio) +packageVersion("TSstudio") +# Function documentation +?ts_plot +``` + + +#### Plotting a univariate time series + +Plotting a univariate time series data with the `ts_plot` function is straightforward: + +```{r } +# Loading the USgas dataset +data("USgas") + +ts_info(USgas) + +``` + +```{r fig.height=5, fig.width=10} +ts_plot(USgas) +``` + + + +By default, the function will set the input object name as the plot title and leave the X and Y axises empty. It is possible to label the axises and set a different title: + +```{r fig.height=5, fig.width=10} +# Setting the plot titles +ts_plot(USgas, + title = "US Natural Gas Consumption", + Xtitle = "Year", + Ytitle = "Billion Cubic Feet") +``` + +#### Line setting + +There are several arguments which allow modifying the line main characteristics such as the colors, width, and type of line. + +The `line.mode` argument is equivalent to the `mode` argument of the `plot_ly` function, and there are 3 options: + +1. `line` - the default option, a clean line +2. `line+markers` - a line with a markers +3. `markers` - only markers + +The `dash` argument allows modifying the line to dashed or dotted by setting the argument to `dash` or `dot` respectively. The line width, by default, set to 2 and can be modified with the `width` argument: + +```{r fig.height=5, fig.width=10} +ts_plot(USgas, + title = "US Natural Gas Consumption", + Xtitle = "Year", + Ytitle = "Billion Cubic Feet", + line.mode = "lines+markers", + width = 3, + color = "green") +``` + +In addition, we can add grid lines for the Y and X axises and slider by setting the `Xgrid`, `Ygrid` and slider arguments to TRUE: + +```{r fig.height=5, fig.width=10} +ts_plot(USgas, + title = "US Natural Gas Consumption", + Xtitle = "Year", + Ytitle = "Billion Cubic Feet", + line.mode = "lines+markers", + width = 1, + color = "#66C2A5", + Xgrid = TRUE, + Ygrid = TRUE, + slider = TRUE) +``` + +#### Ploting a multiple time series object + +The `ts_plot` can handle multiple time series objects such as `mts`, `xts`, `zoo` and data frame family objects. The example below demonstrates the plot options for a multiple time series object with the closing prices of **Apple**, **Facebook**, **Google** and **Microsoft** stocks over time: +```{r fig.height=5, fig.width=10, message=FALSE, warning=FALSE} +library(TSstudio) +library(xts) +library(zoo) +library(quantmod) +# Loading the stock price of key technology companies: +tckrs <- c("GOOGL", "FB", "AAPL", "MSFT") +getSymbols(tckrs, + from = "2013-01-01", + src = "yahoo" + ) + +# Creating a multiple time series object +closing <- cbind(AAPL$AAPL.Close, FB$FB.Close, GOOGL$GOOGL.Close, MSFT$MSFT.Close) +names(closing) <- c("Apple", "Facebook", "Google", "Microsoft") + +ts_info(closing) +``` + +The `type` argument defines whatever to plot all the series in a single plot (`single` option) or plot each series separately (`multiple` option): + +```{r fig.height=5, fig.width=10} +# Plot each series in a sepreate (default option) +ts_plot(closing, + title = "Top Technology Companies Stocks Prices Since 2013", + type = "multiple") + + +# All the series in one plot +ts_plot(closing, + title = "Top Technology Companies Stocks Prices Since 2013", + type = "single") + + +``` + +#### Plotting data frame objects + +Currently, the `ts_plot` function supports three classes of data frame - `data.frame`, `data.table` and `tbl`. To be able to plot a data frame object, it must contain one column with a date or time object (`Date` or `POSIXlt`/`POSIXct`) and at least one numeric column. The **US_indicators** is an example of a `data.frame` object with time series data. It contains the monthly vehicle sales and the unemployment rate in the US since 1976 and a date object: + +```{r} +data(US_indicators) +str(US_indicators) +``` + +```{r fig.height=5, fig.width=10} +ts_plot(US_indicators) +``` + diff --git a/vignettes/Seasonal_and_Correlation_Analysis.Rmd b/vignettes/Seasonal_and_Correlation_Analysis.Rmd deleted file mode 100644 index ac80bae..0000000 --- a/vignettes/Seasonal_and_Correlation_Analysis.Rmd +++ /dev/null @@ -1,227 +0,0 @@ ---- -title: "Seasonal Analysis with the TSstudio Package" -author: "Rami Krispin (@Rami_Krispin)" -date: "2018-09-17" -output: rmarkdown::html_vignette -vignette: > - %\VignetteEngine{knitr::rmarkdown} - %\VignetteIndexEntry{Vignette Title} - %\usepackage[UTF-8]{inputenc} ---- - -```{r setup, include=FALSE} -knitr::opts_chunk$set(echo = TRUE) -options(list(menu.graphics = FALSE, scipen=99, digits= 3)) -``` -### Seasonality analysis - -The **TSstudio** package provides a variety of functions for seasonality analysis, with the use of interactive data visualizations tools such as seasonal, heatmap, quantile, surface and polar plots, based on the [plotly](https://plot.ly/r/) package engine. Most of those functions support the main time series classes (`ts`, `xts` and `zoo` objects, and as well data frame objects `data.frame`, `data.table` and `tbl`) with a frequency between daily and quarterly (with the exception of the quantile plot, which support half-hour and hourly frequencies). - -```{r} -# install.packages("TSstudio") -library(TSstudio) -packageVersion("TSstudio") -``` - - -#### The USgas dataset - -In the following examples, we will use the **USgas** dataset. This dataset is one of the **TSstudio** package datasets, and a good example of a time series with a strong seasonal pattern. This dataset represents the monthly natural gas consumption in the US since 2000: - -```{r fig.height=5, fig.width= 7, message=FALSE, warning=FALSE} -# Load the US monthly natural gas consumption -library(TSstudio) -data("USgas") - -ts_info(USgas) - -ts_plot(USgas, - title = "US Natural Gas Consumption", - Xtitle = "Year", - Ytitle = "Billion Cubic Feet" - ) - -``` - - -#### The `ts_seasonal` function - -The `ts_seasonal` function was designed for plotting time series data by its full frequency cycle (hence, hence break the series by years for monthly data) and/or by its frequency units (e.g., by the months of the year for monthly data). The function supports `ts`, `xts`, `zoo` and data frame family objects with a frequency between daily to quarterly. The function has three modes (and a fourth one that includes all the three modes together): - -1. `normal` - type provides a break of the series by the cycle units (or years). This allows identifying whether the series has a reputable seasonal pattern: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_seasonal(USgas, type = "normal") - -``` -The colors are being set automatically by a sequential color palette in chronical order. - -2. `cycle` - this option provides a view of each frequency unit of the series across the full cycle units of the frequency, in the example below you can notice, for example, that the consumption during January in most of the years was the peak of the year: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_seasonal(USgas, type = "cycle") -``` - -3. `box` - for representing the cycle units with a box plot: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_seasonal(USgas, type = "box") -``` - - -Alternatively, setting the `type = all`, print the three options above together in one plot: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_seasonal(USgas, type = "all") -``` - -Note: that the colors of the months in the `cycle` plot corresponding to the colors in the `box` plot. - -While the `box` mode provides a useful information about the distribution of each frequency units, it might be misleading as the series was not detrended. Similarly, the `cycle` represents the variation and trend, if exists, of each frequency unit over time, yet it is hard to identify the distribution. The `all` mode provides the full picture, as it is allowed to identify a seasonal pattern within the series without detrending it. - -#### Colors setting - -The colors of the plots can be modified to any of the **RColorBrewer** or **viridis** packages palettes options (see below the available palettes). The `palette_normal` argument set the colors of the `normal` mode. As mentioned above, colors set according to the chronological order of the lines. The `palette` argument set the colors of the both the `cycle` and `box` options. In the example below, the colors of the `normal` plot are set to `inferno` palette from the **viridis** package and the `cycle` and `box` plots are set to `Accent` palette from the **RColorBrewer** package. - - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_seasonal(USgas, type = "all", palette_normal = "inferno", palette = "Accent") -``` - -Below are the possible palettes of the RColorBrewer and viridis packages: - -```{r fig.height= 7, fig.width= 7, message=FALSE, warning=FALSE} -RColorBrewer::display.brewer.all() -``` - -```{r fig.height= 5, fig.width= 7, message=FALSE, warning=FALSE} -n_col <- 128 - -img <- function(obj, nam) { - image(1:length(obj), 1, as.matrix(1:length(obj)), col=obj, - main = nam, ylab = "", xaxt = "n", yaxt = "n", bty = "n") -} - -par(mfrow=c(5, 1), mar=rep(1, 4)) -img(rev(viridis::viridis(n_col)), "viridis") -img(rev(viridis::magma(n_col)), "magma") -img(rev(viridis::plasma(n_col)), "plasma") -img(rev(viridis::inferno(n_col)), "inferno") -img(rev(viridis::cividis(n_col)), "cividis") -``` - - - -#### The `ts_heatmap` function - -Another useful visualization tool for seasonality analysis is the `ts_heatmap` function for time series objects, where the y axis represents the cycle units and x axis represents the years: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_heatmap(USgas) -``` - -Likewise the `ts_seasonal` function, the heatmap colors can be set by any of the palettes in the **RColorBrewer** and **viridis** packages with the `color` argument: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_heatmap(USgas, color = "Reds") -``` - -One of the main improvement in the current version is the ability to handle daily and weekly frequencies in addition to the monthly and quarterly. By default, daily data (whenever it has "Date" index) will appear in a format of weekday by year. For example, we will plot the daily demand for electricity in the UK (available in the **UKgrid** package): - -```{r fig.height=6, fig.width= 7, message=FALSE, warning=FALSE} -#install.packages("UKgrid") -library(UKgrid) - -UKgrid_daily <- extract_grid(type = "tbl", aggregate = "daily") -head(UKgrid_daily) - -ts_heatmap(UKgrid_daily, color = "BuPu") -``` - - -#### Quantile plot - -Generally, as the frequency of the series is more granular (hence, the data was captured in higher frequency such as seconds, minutes, etc.), potentially the data would have different seasonality (or multi-seasonality) patterns. The `ts_quantile` function creates a quantile plots for time series data with the ability to use different time frequency subset. This function supports only time series objects with a `Date` or `POSIX` object as an index such as `xts`, `zoo` and data frame family (`data.frame`, `data.table`, and `tbl`) with a half-hourly, hourly, daily, monthly and quarterly frequencies. - -A good example of a time series data with a multi-seasonality is the hourly demand for electricity which could have hourly (high consumption during the day lower during the night), weekly (high consumption during working days and lower during weekend and holidays), and monthly seasonality (consumption patterns depends on the region climeat across the year). To demonstrate the usages of this function we will load the UK national grid demand for electricity dataset again, however, this time will use an half-hour intervale format: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -UKgrid_half_hour <- extract_grid(type = "xts", aggregate = NULL) -library(xts) -ts_info(UKgrid_half_hour) -``` - - - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_plot(UKgrid_half_hour) -``` - -By default, the `ts_quantile` function will calculate and plot the 25th and 75th percentiles (the lower and upper lines) and the median (the solid line) by the series frequency units: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_quantile(UKgrid_half_hour, period = NULL, title = "The UK National Grid Net Demand for Electricity - Quantile Plot") -``` - -It is possible to modify the percentile range with the "lower" and "upper" argument. For instance, set the range between the 15th and 85th percentiles (the solid line remains the median): - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_quantile(UKgrid_half_hour, - period = NULL, - lower = 0.15, - upper = 0.85, - title = "The UK National Grid Net Demand for Electricity - 24 Hours Quantile Plot") -``` - -The `period` argument subset and plot the series quantiles by a specific frequency hierarchy. Hence, if the series frequency is a half-hour, we can subset and plot it by weekdays, months quarters and years. This allows checking if the series seasonal patterns vary on a different period's subsets. For instance, when setting the period for `weekdays` the output is a 24-hour quantile (by half-hour intervals) for each day of the week: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_quantile(UKgrid_half_hour, - period = "weekdays", - title = "The UK National Grid Net Demand for Electricity - 24 Hours Quantile Plot by Weekdays") -``` - -The `n` argument set the plot rows number. This allows spreading the plots when using a higher number of plots: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_quantile(UKgrid_half_hour, - period = "weekdays", - title = "The UK National Grid Net Demand for Electricity - 24 Hours Quantile Plot by Weekdays", - n = 2) -``` - -#### Surface and polar plots - -The `ts_surface` function provides a 3D representative for time series data - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_surface(USgas) -``` - -The `ts_polar` function provides a polar plot demonstrative of time series data where the year is represented by color and the magnitude is represented by the size of the cycle unit layer: -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_polar(USgas) -``` - -#### Correlation Analysis - -The `ts_lag`, as the name implies, create a lag plot of time series data currently support only `ts`, `xts`, and `zoo` objects with monthly or quarterly frequency: - -```{r fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_lags(USgas) -``` - -By default, the function plots the first 12 lags, however you can set the number of lags by modifing the `lags` argument to either a sequence of lags (e.g., `lags = 1:24` for the first 24 lags) or for a specifics lags. For example, you can plot the seasonal lags of the `USgas` dataset by selecting the 12, 24, 36 and 48 lags: - -```{r fig.height= 4, fig.width= 7, message=FALSE, warning=FALSE} -ts_lags(USgas, lags = c(12, 24, 36, 48)) -``` - -The `ts_acf` and `ts_pacf` are wrappers for the **stats** package `acf` and `pacf` functions, providing a colorful and interactive version for those functions: - -```{r, fig.height=4, fig.width= 7, message=FALSE, warning=FALSE} -ts_acf(USgas, lag.max = 36) -ts_pacf(USgas, lag.max = 36) - -``` -