Time Series Analysis and Forecasting: Novel Business Perspectives


Time series forecasting is hardly a new problem in data science and statistics. The term is self-explanatory and has been on business analysts’ agenda for decades now: The very first practices of time series analysis and forecasting trace back to the early 1920s.

The underlying idea of time series forecasting is to look at historical data from the time perspective, define the patterns, and yield short or long-term predictions on how — considering the captured patterns — target variables will change in the future. The use cases for this approach are numerous, ranging from sales and inventory predictions to highly specialized scientific works on bacterial ecosystems.

Although even an intern analyst can work with time series in Excel today, the growth of computing power and data tooling makes it possible to apply time series to far more complex problems than before and to achieve higher prediction accuracy.

Time Series Problems

Time series problems are always time-dependent, and we usually look at four main components: trends, seasonality, cycles, and irregular components.


Source: Forecasting: Principles & Practice, Rob J Hyndman, 2014
Trends and seasonality are clearly visible

The graph above is a clear example of how trends and seasons work.

Trends. The trend component describes how the variable — drug sales in this case — changes over long periods of time. We see that the sales revenues of antidiabetic drugs have substantially increased during the period from the 1990s to 2010s.

Seasons. The seasonal component showcases each year’s wave-like changes in sales patterns. Sales were increasing and decreasing seasonally. Seasonal series can be tied to any time measurement. We can consider monthly or quarterly patterns for sales in midsize or small eCommerce, or track microinteractions across a day.

Cycles. Cycles are long-term patterns that have a wave form and recurring nature similar to seasonal patterns but with variable length. For example, business cycles have recognizable elements of growth, recession, and recovery. But the cycles themselves stretch in time differently for a given country throughout its history.

Irregularities. Irregular components appear due to unexpected events, like cataclysms, or are simply representative of noise in the data.
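To make these components concrete, here is a minimal sketch that decomposes a made-up monthly sales series into trend, seasonal, and residual (irregular) parts with statsmodels. The data and the 12-month period are assumptions for illustration only, not figures from the chart above.

```python
# A minimal sketch: splitting a synthetic monthly series into
# trend, seasonal, and residual (irregular) components.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Fabricated monthly "sales": upward trend + yearly seasonality + noise
index = pd.date_range("2010-01-01", periods=120, freq="MS")
trend = np.linspace(10, 30, 120)
seasonal = 5 * np.sin(2 * np.pi * index.month / 12)
noise = np.random.normal(scale=1.0, size=120)
sales = pd.Series(trend + seasonal + noise, index=index)

# Classical additive decomposition with a 12-month period
result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())     # the long-term movement
print(result.seasonal.head(12))         # the repeating yearly pattern
print(result.resid.dropna().head())     # what's left: the irregular component
```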

Today, time series problems are usually solved by conventional statistical methods (e.g. ARIMA) and machine learning methods, including artificial neural networks (ANN), support vector machines (SVM), and others. While these approaches have proved their efficiency, the tasks, their scope, and our ability to solve them keep changing, and the set of use cases for time series has the potential to expand. As statistics steps into the era of big data processing, the Internet of Things with its countless trackable devices, and social media analysis, analysts are looking for new approaches to handle this data and convert it into predictions.

So, let’s survey the main things that are happening in the field.

Methods to combat non-stationary data

Traditional forecasting methods strive to bring stationarity into time series, i.e. to make key statistical properties, such as the mean and variance, stay constant over time. Raw data usually doesn't provide enough stationarity to yield confident predictions. For instance, we must apply multiple mathematical transformations to the graph of antidiabetic drug sales above to render the non-stationary time series at least approximately stationary. Only then can we find patterns and make predictions that are more accurate than coin tossing, which is right in 50 percent of cases.


Source: Forecasting: Principles & Practice, Rob J Hyndman, 2014
Bringing stationarity to data
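As a rough illustration of those transformations, the sketch below applies the common log-transform-and-difference recipe and checks the result with an augmented Dickey-Fuller test. The `make_stationary` helper and the `monthly_sales` series are hypothetical names made up for the example, not code from the book cited above.

```python
# A minimal sketch of the usual stationarity recipe:
# log-transform to stabilize variance, then difference away seasonality and trend,
# and check the result with an augmented Dickey-Fuller test.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def make_stationary(series: pd.Series, seasonal_lag: int = 12) -> pd.Series:
    """Log-transform, then take seasonal and first differences."""
    logged = np.log(series)
    deseasonalized = logged.diff(seasonal_lag)   # remove yearly seasonality
    differenced = deseasonalized.diff()          # remove the remaining trend
    return differenced.dropna()

# `monthly_sales` is assumed to be a pandas Series indexed by month,
# e.g. a drug sales series like the one in the figure above.
# stationary = make_stationary(monthly_sales)
# adf_stat, p_value, *_ = adfuller(stationary)
# print(f"ADF p-value: {p_value:.3f}  (small values suggest stationarity)")
```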

But time series in some fields strongly resist these efforts because too many irregular factors drive the changes. Look at travel disruptions, especially those caused by political unrest and the threat of terrorism: traveler streams shift, destinations change, and airlines adjust their prices differently, making year-old observations nearly obsolete. Or take crude oil prices, which are critical for players across many industries to predict, yet have so far defied attempts to build time series algorithms precise enough to rely on.

Traditional machine learning methods

In most machine learning problems, the data can be split into training and validation sets more or less at random. With time series, the main difference is that a data scientist needs a validation set that immediately follows the training set on the time axis to see whether the trained model is good enough. The problem with non-stationary records is that the training data may not be homogeneous with the validation data, as the properties of the series change substantially over the period the two sets cover.
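A minimal sketch of such a time-aware split, using scikit-learn's TimeSeriesSplit so that every validation fold strictly follows its training fold. The fabricated features, target, and Ridge model are placeholders for illustration only.

```python
# A minimal sketch of time-aware validation: each validation fold strictly
# follows its training fold on the time axis, so the model never sees the future.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

# X, y stand in for lagged features and the target; fabricated for the example.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.1, size=200)

tscv = TimeSeriesSplit(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    model = Ridge().fit(X[train_idx], y[train_idx])
    mae = mean_absolute_error(y[val_idx], model.predict(X[val_idx]))
    print(f"fold {fold}: train ends at {train_idx[-1]}, "
          f"validation starts at {val_idx[0]}, MAE={mae:.3f}")
```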

Stream learning approach

Stream learning updates the model incrementally as new observations arrive rather than retraining it on a fixed dataset, and two questions define the setup.

Data Horizon. How many new training instances are needed to update the model? For example, Shuang Gao and Yalin Lei from the China University of Geosciences recently applied stream learning to increase prediction accuracy in such non-stationary time series as the crude oil prices mentioned above. They set the data horizon as small as possible so that every new oil price observation immediately updates the algorithm.

Data Obsolescence. How long does it take before historic data, or some of its elements, should be considered irrelevant? The answer can be tricky, as it requires assumptions based on domain expertise: basically, an understanding of how the market you work with changes and how many non-stationary factors bombard it. If your eCommerce business has grown significantly since last year, both in customer base and product variety, data from the same quarter of the previous year may already be obsolete. On the other hand, if the country is in an economic recession, recent short-term data may be less enlightening than data from the previous recession.
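The sketch below illustrates the general stream-learning idea with these two knobs; it is not the method from the paper. A scikit-learn SGDRegressor is updated after every single observation (a data horizon of one), and a fixed-length buffer treats everything older than `window` observations as obsolete. The stream itself is fabricated.

```python
# A minimal sketch of the stream-learning idea (not the paper's method):
# the model is updated after every new observation (data horizon = 1),
# and observations older than `window` steps are considered obsolete.
from collections import deque
import numpy as np
from sklearn.linear_model import SGDRegressor

window = 500                      # data obsolescence: keep only the last 500 points
buffer = deque(maxlen=window)     # what a periodic full refit would still use
model = SGDRegressor(learning_rate="constant", eta0=0.01)

def update(model, x_new, y_new):
    """Incorporate a single new observation as soon as it arrives."""
    buffer.append((x_new, y_new))
    model.partial_fit(x_new.reshape(1, -1), [y_new])   # data horizon = 1
    return model

# Example usage with a fabricated stream of (features, price) pairs:
rng = np.random.default_rng(1)
for t in range(1000):
    x_t = rng.normal(size=3)       # e.g. lagged prices, trade volume, etc.
    y_t = x_t.sum() + rng.normal(scale=0.1)
    model = update(model, x_t, y_t)
```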

While crude oil forecasts based on stream learning eventually perform better than conventional methods, they are still only slightly better than a coin flip, staying in the ballpark of 60 percent accuracy. They are also more complicated to develop and deploy, and they require prior business analysis to set the data horizon and obsolescence.

Ensemble Methods

Building robust forecasts is expensive and time-consuming, and it doesn't boil down to making and validating one or two models and then picking the best performer. With time series, non-stationary components, such as cycles of varying duration, low weather predictability, and other irregular events that affect multiple industries, make things even harder.

This was the problem for the Google team that built time series forecasting infrastructure to analyze the business dynamics of its search engine and YouTube, and to then disaggregate these forecasts by region and by smaller time scales such as days and weeks. When Google engineers recently disclosed their approach, it became clear that even the Mount Olympus of AI-driven technologies chooses simpler methods over complex ones. They don't use stream learning yet, settling instead for ensemble methods. But their main point is that you need as many methods as possible to get the best results:

“So, what models do we include in our ensemble? Pretty much any reasonable model we can get our hands on! Specific models include variants on many well-known approaches, such as the Bass Diffusion Model, the Theta Model, Logistic models, bsts, STL, Holt-Winters and other Exponential Smoothing models, Seasonal and other ARIMA-based models, Year-over-Year growth models, custom models, and more,” write Eric Tassone and Farzan Rohani.

By averaging the forecasts of many models that each perform differently in different situations, they achieved better predictive power than any single model could deliver. Some models handle one kind of non-stationary data well; others shine elsewhere. The average they yield acts like a consensus expert opinion and turns out to be very precise.
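A toy version of the same idea, assuming a monthly series called `monthly_sales`: fit a few standard forecasters from statsmodels plus a seasonal naive baseline and average their forecasts. Google's production ensemble is, of course, far richer than this sketch.

```python
# A minimal sketch of the ensembling idea: fit several simple forecasters
# and average their predictions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def ensemble_forecast(series: pd.Series, horizon: int = 12) -> pd.Series:
    """Average the forecasts of a few standard models (assumes a monthly series)."""
    forecasts = []

    # Seasonal naive baseline: repeat last year's values
    naive = series.iloc[-12:].values
    forecasts.append(np.tile(naive, int(np.ceil(horizon / 12)))[:horizon])

    # Holt-Winters exponential smoothing with additive trend and seasonality
    hw = ExponentialSmoothing(series, trend="add", seasonal="add",
                              seasonal_periods=12).fit()
    forecasts.append(hw.forecast(horizon).values)

    # A basic ARIMA model
    arima = ARIMA(series, order=(1, 1, 1)).fit()
    forecasts.append(arima.forecast(horizon).values)

    index = pd.date_range(series.index[-1], periods=horizon + 1, freq="MS")[1:]
    return pd.Series(np.mean(forecasts, axis=0), index=index)

# `monthly_sales` is again a hypothetical monthly pandas Series:
# print(ensemble_forecast(monthly_sales))
```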


Source: Our quest for robust time series forecasting at scale, Eric Tassone and Farzan Rohani, 2017, Forecast procedure in Google

However, the authors note that this approach may only be the best one for their specific situation. Google services stretch across many countries, where factors like electricity availability, internet speed, and user work cycles add many non-stationary patterns. So if you aren't operating across a multitude of locations or a large set of varying data sources, ensemble models may not be for you. But if you track time series patterns across countries or business units in different regions, they might be the best fit.

Automation of time series forecasting

The problem with automation in prediction and machine learning operations is that the technologies are still in their infancy. Fully automated solutions lack flexibility: they perform many operations under the hood and can either do straightforward, general tasks (like object recognition in pictures) or fail to capture business specifics. On the other hand, hiring a full-blown data science team may be too costly in the early stages of an analytics initiative. A happy medium is instruments like TensorFlow, which still require some engineering talent on board but provide enough automation and convenient tooling to avoid reinventing the wheel.

Facebook’s Prophet

Prophet is positioned as “Forecasting at Scale,” which, according to its authors, means mainly three things:

1. A broad variety of people can use the package. Potential users are both data scientists and people who have the domain knowledge to configure data sources and integrate Prophet into their analytics infrastructures.

2. A broad variety of problems can be addressed. Facebook used the tool for social media time series forecasting, but the model is configurable to match various business circumstances.

3. Performance evaluation is automated. Here comes the sweetest part. Evaluation and the surfacing of problems are automated, so human analysts only have to visually inspect forecasts, do the modeling, and react when the machine flags forecasts with a high probability of error.


Source: Forecasting at Scale, Sean J. Taylor and Benjamin Letham, 2017

Modeling, in this case, means that analysts use their domain knowledge and external data to tweak Prophet's work. For instance, you can input market size or other capacity information so the algorithm considers these factors and adjusts to them. Since you know when you are going to roll out game-changing updates, like a site redesign or a major new feature, you can signal the algorithm about these as well. And finally, you can define the relevant scale of seasonality and even add holidays as recurring patterns in your time series. Every retailer knows how different Black Friday or Christmas is from the rest of the year.
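A minimal sketch of this kind of tuning with Prophet's public API, where the `sales.csv` file, the market-size cap, and the Black Friday dates are assumptions made up for the example:

```python
# A minimal sketch of feeding domain knowledge into Prophet:
# a market-size cap for logistic growth and a recurring holiday effect.
import pandas as pd
from prophet import Prophet   # the package is called `fbprophet` in older releases

# Prophet expects a dataframe with columns `ds` (dates) and `y` (the metric).
# `sales.csv` is a hypothetical file holding your historical daily sales.
df = pd.read_csv("sales.csv", parse_dates=["ds"])
df["cap"] = 120_000            # assumed market-size ceiling for logistic growth

black_friday = pd.DataFrame({
    "holiday": "black_friday",
    "ds": pd.to_datetime(["2015-11-27", "2016-11-25", "2017-11-24"]),
    "lower_window": 0,
    "upper_window": 1,          # the effect spills into the following day
})

model = Prophet(growth="logistic", holidays=black_friday,
                yearly_seasonality=True, weekly_seasonality=True)
model.fit(df)

future = model.make_future_dataframe(periods=90)
future["cap"] = 120_000
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```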

Twitter and Microsoft

The reason we’re mentioning Twitter here is that both Prophet and AnomalyDetection are representatives of the emerging automation trend in the time series field. Pretty soon these operations are going to become more affordable and potentially move to the popular cloud infrastructures. For example, Microsoft recently rolled out its Azure Time Series Insights for IoT that doesn’t seem to add the prediction capacity yet but already provides data streaming from devices and allows for anomaly detection.

Time Series Forecasting as a Sales and UX Lever

Internal analytics has usually been employed to gain business insights. But things take on a new perspective when you give some prediction results away: sharing forecasts, especially time series forecasts, is a great opportunity to improve and personalize the user experience.

One of those cases is our client Fareboom.com. Fareboom is a flight-booking service that excels at finding the lowest airfares possible for its customers. The problem with airfares is that they change rapidly and without obvious reasons. Unless you're buying tickets right before a trip, knowing where prices are headed is valuable. A great UX solution was to predict whether prices are going to drop or rise in the near or distant future and share this information with customers. This encourages customers to return and makes Fareboom their go-to platform for optimizing travel budgets.


Source: Fareboom.com
The engine has 75 percent confidence that the fares will rise soon

Giving away at least some of your analytics is a particularly good strategy for the travel industry and generally all businesses that connect people with end-service providers. If you have seasonal or trending data on the hotels that people enjoy during Christmas, why not turn it into recommendations?

The Main Trend

The main concern for today's executives should be defining an analytics strategy, whether customer-facing or internal, and leading the initiative.

Originally published at AltexSoft's blog: “Time Series Analysis and Forecasting: Novel Business Perspectives”
