Better Data for Better Forecasts

00:08 Introduction
00:30 If a company is already collecting data, is there something more they can do?
02:10 If companies weren’t meaning to collect data when they first started, does this mean that there are huge amounts of data just left in storage?
04:42 How granular does the data need to be?
07:21 What are the scenarios you may be missing out on?
10:39 Is it useful to re-import historical data into a new ERP?
13:28 Should companies clean their data?
17:22 Besides prices, what are the other factors companies should take into account?
19:14 What can companies do to build upon the raw data they already have?
22:05 What is the core message of today’s episode?


Arthur Conan Doyle’s iconic character Sherlock Holmes famously said ‘it is a capital mistake to theorize before one has data’. We certainly agree, as data is the foundation of any optimization process. Here, we discuss what more companies that already collect data can do with it, and we tackle the myth that data has to be “perfect” for a machine to be able to work with it.

Most companies actually collect data in a way that could be described as accidental. An ERP, for example, is not designed to collect data; it is designed so that the mundane operations that occur constantly within a company are supported by a centralised IT system.

Similarly, at the point of sale in a store, the electronic cash register is there to take your payment faster; it was not put in place with the intent of recording a customer’s entire transaction history. In this manner, companies end up incidentally amassing a vast quantity of data over a number of years.

Typically, this data ends up spread across different layers of software and contains unintended complexities. Often, when a company wishes to improve its forecasting process to better serve demand, it puts in place a system to extract a simplified form of this data, usually daily or weekly sales, and then builds its forecasts on top of that. While this may appear very reasonable on the surface, a lot of critical information has actually been lost.
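
As a minimal sketch of that loss, the Python snippet below collapses two invented transaction histories into weekly sales. The record layout, the SKU name, the figures and the `weekly_sales` helper are all assumptions made for illustration; they are not taken from the episode or from any particular ERP.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction lines: (date, sku, quantity, unit_price).
# Negative quantities stand for returns. All values are invented.
steady_week = [
    (date(2024, 1, 1), "A", 2, 10.0),
    (date(2024, 1, 2), "A", 2, 10.0),
    (date(2024, 1, 3), "A", 2, 10.0),
    (date(2024, 1, 4), "A", 2, 10.0),
    (date(2024, 1, 5), "A", 2, 10.0),
]
promo_week = [
    (date(2024, 1, 1), "A", 14, 7.0),  # one bulk purchase at a discounted price
    (date(2024, 1, 4), "A", -4, 7.0),  # part of it returned later in the week
]


def weekly_sales(lines):
    """Collapse transaction lines into net units per (ISO year, ISO week, sku)."""
    totals = defaultdict(int)
    for day, sku, qty, _price in lines:
        iso = day.isocalendar()
        totals[(iso[0], iso[1], sku)] += qty  # price and sign of qty are discarded
    return dict(totals)


print(weekly_sales(steady_week))
print(weekly_sales(promo_week))
```

Both histories reduce to the same weekly figure for SKU "A", even though one describes steady full-price demand and the other a single discounted bulk order followed by a return. A forecast built only on the weekly extract cannot tell the two situations apart.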

Later in the episode, we look at the history of ERPs and computing hardware, and at the present-day difficulties of transferring data when a company changes its ERP. We also go into more detail about the fallacy that faulty data is the source of many problems. In fact, it is more a question of looking at the correct data sources, as the data a company stores is rarely faulty.

In conclusion, we explain that when looking to improve forecasting, you shouldn’t simply look at time series of sales data, nor concentrate on fads like social media trends. More often than not, the data you should be looking at is somewhat mundane and already in the system, i.e. pricing, returns and stock levels. The data that should be examined in the most detail also depends greatly on the sector the company operates in. Aerospace and fresh food, for example, each have their own very specific challenges.