Half a century of forecasting science (with Spyros Makridakis)

May 12, 2021

00:00:08 Introduction and Spyros Makridakis’ background.
00:01:36 Skepticism of statistical forecasting in the early days.
00:04:44 The evolution of forecasting in retail and consumer goods.
00:05:44 Results from Spyros’ initial forecasting studies.
00:07:21 The M5 competition and its significance for the forecasting industry.
00:08:01 Lokad’s performance at a forecasting competition.
00:09:01 Simple models and their effectiveness in the competition.
00:10:20 Evolution of forecasting techniques over the years.
00:11:46 Introduction of computers and their impact on forecasting.
00:14:32 Test scenario versus real-world application and methodological bias.
00:16:00 Discussion on challenges of forecasting erratic time series.
00:17:20 Shift in perception towards forecasting as a science.
00:18:28 The problem of overconfidence and unrealistic expectations in forecasting.
00:21:00 Dealing with uncertainty and fat-tail events in forecasting.
00:23:01 Injecting structural priors to account for extreme events in forecasting models.
00:24:00 Discussing the impact of tail events on forecasting models.
00:24:46 Injecting structural priors for more resilient supply chain decisions.
00:25:52 Recommending Nassim Taleb’s work for understanding black swan events.
00:26:35 Achievements from M competitions: simplicity, understanding uncertainty, and risk management.

Summary

In this interview, Kieran Chandler discusses forecasting with Joannes Vermorel, founder of Lokad, and Spyros Makridakis, a professor at the University of Nicosia. They explore the impact of M-Competitions, the effectiveness of simple methods, and the role of uncertainty in forecasting. Vermorel shares his experience in the M5 Competition, emphasizing the power of simple models and the importance of understanding uncertainty. Makridakis highlights the significance of empirical evidence and the need for preparedness in the face of risk. They stress the limitations of forecasting and the challenge of conveying the acceptance of uncertainty to clients.

Extended Summary

In this interview, host Kieran Chandler speaks with guests Joannes Vermorel, founder of Lokad, and Spyros Makridakis, a professor at the University of Nicosia and organizer of the M-Competitions. The discussion revolves around forecasting science and the impact of the M-Competitions on the industry.

Spyros Makridakis shares his background as a teacher and organizer of the M-Competitions, which have influenced forecasting in both academia and industry. The interview later delves deeper into the M5 Competition.

Joannes Vermorel discusses the skepticism he faced when founding Lokad in 2007, with some people believing that statistical forecasting was unreliable. Over time, this perspective has mostly disappeared from the industry, and Vermorel credits Makridakis’ M-Competitions for helping normalize the field as a scientific endeavor.

Makridakis highlights the importance of forecasting by mentioning the current situation in Texas, where supermarkets are out of goods. He explains that most of the time, consumers can find what they need because companies like Walmart and Target forecast millions of items weekly. When shortages occur, it actually highlights the effectiveness of forecasting, as people only notice when things go wrong.

Makridakis also reflects on his early career, when he conducted the first study on the accuracy of different forecasting methods. The results, which showed that simple methods were more accurate than sophisticated ones and that combining methods improved accuracy, were surprising and initially met with skepticism. However, these findings have since been proven and have greatly impacted the field of forecasting.

They discuss forecasting techniques, the M5 Competition, and how technology has impacted the field.

Makridakis explains that new forecasting techniques using deep learning involve generating a large number of models and taking the median as the best forecast. Vermorel shares his experience participating in the M5 Competition, noting that Lokad ranked 6th out of 909 teams on the quantile forecasting side. He highlights the absence of their major competitors in the top 100 and the disconnect between market share and performance in such competitions.

Vermorel then points out that Lokad used a very simple parametric model, demonstrating the power of simple methods in forecasting. Additionally, he emphasizes that accuracy is not the only important aspect in forecasting; understanding the structure of uncertainty is crucial as well. Makridakis concurs, adding that the M5 Competition results showed simple machine learning methods to be more accurate and effective than sophisticated ones.

The conversation shifts to the introduction of computers in forecasting, with Makridakis explaining that the key to success lies in simplicity. He describes the importance of separating forecasting data into training and testing parts to avoid overfitting the past and account for changes between the past and the future. Vermorel agrees, highlighting the challenge of accurately predicting data that is not yet available, and the importance of not assuming the future will be exactly like the past.

They discuss the evolution of forecasting, the importance of not overfitting data, and the significance of uncertainty in predictions.

Vermorel explains the development of forecasting theory at the end of the 20th century, when the work of Vapnik and Chervonenkis gave rise to support vector machines. This theory highlighted the need to minimize both structural and empirical error, and it provides an upper bound on the real error, that is, the error made on data not yet seen.

Makridakis emphasizes the importance of competitions, where a portion of data is held back, as a means to establish a clean methodology for forecasting. He contrasts this with real-world scenarios, where there is a temptation to overfit data to achieve a perfect fit for past events, which can lead to less accurate future predictions.

Vermorel shares an example from his experience at Lokad, where clients were often surprised by the smoother forecast generated for erratic time series, such as the consumption of alcohol in hypermarkets. Competitors would often present forecasts that closely mimicked the erratic nature of the historical data, leading clients to be skeptical of Lokad’s smoother predictions.

Makridakis discusses the shift in perception towards forecasting as a science, emphasizing the importance of separating the past from the future and not trying to overfit past data. He highlights the importance of considering uncertainty in predictions and acknowledges that while clients may not appreciate this aspect, it is crucial for realistic forecasting.

The discussion then turns to what people expect from forecasting. Vermorel notes that some competitors in the retail industry make outlandish claims of high accuracy, which is unrealistic given the nature of consumer behavior. This raises the question of whether people now expect too much from forecasting and perceive it as infallible.

The conversation revolves around forecasting, the limitations and challenges in the field, and the impact of uncertainty and rare events on supply chain optimization.

The participants discuss how some vendors and consultants tend to oversell the idea of incredibly accurate forecasts, leading to unrealistic expectations from users. They emphasize that forecasting is not perfect, and uncertainty is inherent, especially in areas such as retail. Makridakis points out that not only is there normal uncertainty, but also “fat tail” uncertainty, which consists of rare and extreme events that can cause significant disruptions, such as the COVID-19 pandemic.

Vermorel agrees with the issue of consultants promising too much and shares that the challenge in probabilistic forecasting is not the technical aspect, but rather conveying the acceptance of uncertainty and the limits of control. He explains that simple forecasting models can be useful in injecting structural priors to account for tail events, even if the quantification is vague. By doing so, supply chain decisions can be steered towards more robust and resilient solutions in the face of infrequent events.

Makridakis highlights the importance of empirical evidence in determining what works and what doesn’t in forecasting. Through the M-Competitions, they have discovered that simplicity works best, acknowledging the randomness and unpredictability of the past. He emphasizes the importance of recognizing the uncertainty and risk associated with forecasts, and the need to be prepared to face them.

The interview touches upon the challenges and limitations of forecasting, the role of uncertainty in decision-making, and the importance of incorporating rare events in supply chain optimization.

Full Transcript

Kieran Chandler: When it comes to forecasting, we often take it for granted that there are tried and tested techniques which have been used for generations. However, one person who didn’t have this luxury is our guest today, Spyros Makridakis, who, as one of the forefathers of the industry, actually invented many of the techniques which we use as standard. Today, we’re going to learn a little bit more about his career and what we can learn from over 50 years of experience in the industry. So Spyros, thanks very much for joining us live from Cyprus today. And as always, we like to start off by learning a little bit more about our guests. So perhaps you could just start off by telling us a little bit more about yourself.

Spyros Makridakis: Well, as you know, I have been a teacher for a long time, and that’s where I started working on forecasting. I retired 15 years ago, and here I am now in Cyprus at the university, because we are continuing the work. And I know your company participated in both the M4 and the M5 competitions, something I have been organizing for the last 40 years. So you know what my contribution is and how the M competitions, which stands for Makridakis competitions, have affected the forecasting industry, and the companies and academics that use the findings.

Kieran Chandler: Brilliant! So we’ll get on to discussing the M5 competition a little bit later. In the first part, we’re going to look back at the last 50 years of forecasting science as we know it, and it seems we have quite a lot of ground to cover today. Joannes?

Joannes Vermorel: Yes, what is interesting is that when I founded Lokad, when I was going through the motion of creating the company in 2007, at the time there were still people that were heavily skeptical about the very idea of statistical forecasting of any kind. It was very strange because, at the time, I was myself wondering whether I should continue my PhD in machine learning, which I had started but never completed by the way, or if I should go ahead with Lokad, this project. And when I applied to a startup incubator, the first time I applied, my application was rejected because there were two people in the jury that held very firmly the belief that statistical forecasting was just complete nonsense. It was literally like, “No, we don’t accept startups where the business plan is basically to sell divinations.” I mean, there is no question that you can make money with divination; people have been doing that for ages. But, are we okay with the idea that such a company would actually enter the incubator? The answer was a firm no. But the funny thing is, it was pretty much the last generation. I think during the decade or so that I’ve been running Lokad, there is pretty much nobody left in this industry that is holding this belief. So, it’s very funny; it was literally science in action.

Kieran Chandler: Joannes and Spyros, thank you for joining us today. Joannes, I believe that your contributions and the M-competitions have been really key elements in normalizing the field of forecasting. It has become more like normal science and no longer fringe science. What’s happening in Texas right now is interesting. The supermarkets are completely out of any goods, and people can’t find food or other essentials. When they talk about forecasting, I tell them to think of all the other times they go to a supermarket and find what they want. Supermarkets have millions of items, and they make forecasts for each one of them. Companies like Walmart and Target forecast millions of items weekly, so consumers can find what they want to buy. When there’s not enough supply, like now in Texas, it surprises people, but it actually proves how good forecasting is because most of the time, they can find what they want.

Spyros Makridakis: Absolutely, Kieran. Forecasting has been shaped in many ways by the work that I’ve done. When I first started out as a young professor, the landscape was quite different. We did the first study about how accurately different forecasting methods could predict. What we found surprised the statisticians of the time. I presented the results in London at the Royal Statistical Society, and everyone was attacking me, saying that we found these results because we were inexperienced in forecasting. We found that very simple methods were more accurate than sophisticated ones, and if you combined more than one method, the accuracy improved. Both of these findings were anathema for the statisticians of that time, who believed that you could find the best method and that more sophisticated methods would be more accurate. But now, new techniques using deep learning forecast 500 different models and then take the median of them, which they find to be the best forecast.
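
The idea of combining many models by taking their median can be sketched in a few lines. This is only an illustration under stated assumptions: the "models" below are hypothetical (a known true path plus random noise, with a few deliberately bad runs), and the names `median_combination` and `mae` are made up for this example. It is not the procedure used by any specific deep learning method.

```python
import random
import statistics

def median_combination(forecasts):
    """Combine several forecast paths by taking the median at each horizon step."""
    # forecasts: list of equal-length lists, one list per model
    return [statistics.median(step) for step in zip(*forecasts)]

def mae(forecast, actual):
    """Mean absolute error between a forecast path and the actual values."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)

random.seed(0)
true_path = [100.0, 102.0, 104.0, 106.0]

# Hypothetical ensemble: each model is the true path plus its own noise,
# and the first three models are badly biased (simulating unstable runs).
models = []
for i in range(25):
    bias = 40.0 if i < 3 else 0.0
    models.append([y + bias + random.gauss(0, 5) for y in true_path])

combined = median_combination(models)
individual_errors = [mae(m, true_path) for m in models]
combined_error = mae(combined, true_path)

print(f"mean individual MAE: {statistics.mean(individual_errors):.2f}")
print(f"median-combined MAE: {combined_error:.2f}")
```

The median is used here rather than the mean because it is insensitive to a handful of wildly wrong models, which is one plausible reason for preferring it when hundreds of models are combined.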

Kieran Chandler: Joannes, the M5 competition is one that you took part in not so long ago. From a vendor perspective, what does the M5 competition mean to you?

Joannes Vermorel: The M5 competition is quite fun. It’s one of those few opportunities where we can showcase our forecasting abilities and collaborate with others in the industry. It helps us improve our methods and keeps the field competitive, driving innovation and progress.

Kieran Chandler: Welcome everyone to today’s interview. Today, we have Joannes Vermorel, the founder of Lokad, and Spyros Makridakis, a professor at the University of Nicosia, Director of the Institute for the Future, and Emeritus Professor of Decision Sciences at INSEAD. Joannes, you participated in the M-Competitions, can you tell us more about it?

Joannes Vermorel: Yes, the M-Competitions are globally recognized events where people compete based on skill, unlike trade shows, which mostly focus on marketing. The interesting thing is that none of our biggest competitors were present in the top 100, regardless of which side of the competition you look at. This is surprising because what they sell is forecasting. So, there’s a massive disconnect between what happens during an actual test and the typical market shares observed in this market. Another thing I’d like to comment on is the simplicity of our model. Lokad ranked sixth out of 909 teams on the quantile side of the competition using a very plain parametric model with only three cyclicities: day of the week, beginning and end of the month, and week of the year. We used an ISSM (innovation state space model) and achieved results within 1% accuracy of the best model, which used gradient boosted trees and a massive data augmentation scheme. The interesting thing is that we used only 0.001 of the complexity. I believe this demonstrates that very simple methods can be very powerful. The competition also showed that accuracy in the classical sense is not the only element that matters. Other dimensions of the forecast, such as having a better understanding of the structure of the uncertainty itself, are also important. That’s what probabilistic forecasts are about, and we at Lokad have been working hard on this for almost a decade.
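
To make "quantile forecasts from a very plain model" concrete, here is a minimal sketch. It is not Lokad's M5 model: it assumes hypothetical sales data, uses only one of the calendar patterns mentioned above (day of the week), and produces quantiles empirically rather than from a parametric distribution. The pinball loss shown is the standard way to score a quantile forecast; the function names are invented for this example.

```python
import random
import statistics

def pinball_loss(actual, predicted_quantile, tau):
    """Pinball (quantile) loss for one observation at quantile level tau."""
    diff = actual - predicted_quantile
    return tau * diff if diff >= 0 else (tau - 1) * diff

random.seed(1)
# Hypothetical daily sales with a weekly pattern (index 0 = Monday).
weekday_level = [4, 4, 5, 5, 7, 10, 8]
history = [max(0, round(random.gauss(weekday_level[d % 7], 2))) for d in range(7 * 52)]

# Group past sales by weekday: the only "cyclicity" used in this sketch.
by_weekday = [[] for _ in range(7)]
for day, sales in enumerate(history):
    by_weekday[day % 7].append(sales)

def quantile_forecast(weekday, tau):
    """Empirical tau-quantile of past sales observed on that weekday."""
    values = sorted(by_weekday[weekday])
    index = min(len(values) - 1, int(tau * len(values)))
    return values[index]

# Quantile forecasts for a Saturday (weekday 5).
for tau in (0.05, 0.5, 0.95):
    print(f"tau={tau}: {quantile_forecast(5, tau)} units")

# Average pinball loss of the 0.95-quantile forecast over the history (in-sample).
tau = 0.95
avg_loss = statistics.mean(
    pinball_loss(sales, quantile_forecast(day % 7, tau), tau)
    for day, sales in enumerate(history)
)
print(f"average pinball loss at tau={tau}: {avg_loss:.3f}")
```

A set of quantiles per day, rather than a single number, is what describes the structure of the uncertainty: the gap between the 0.05 and 0.95 quantiles says how wide the plausible range is.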

Spyros Makridakis: You’re right, Joannes. In the first M-Competition, simple statistical methods were more accurate than sophisticated ones. In the M5 competition, we found that simple machine learning methods were more accurate than sophisticated machine learning methods, like deep learning. The top competitors in both the accuracy and uncertainty challenges used simple machine learning methods, and they were the most accurate and effective in predicting the Walmart data. One of the interesting aspects of the M5 competition is that everyone’s using computer-based forecasting techniques.

Kieran Chandler: And that’s now standard across the industry, but if you look back to when you were starting out, Spyros, as a professor, that was before the dawn of computers. So how did the introduction of computers change the way you were doing things? What kind of opportunities did that give you?

Spyros Makridakis: Well, the opportunities are that it made things simpler. In forecasting, there are two parts: one is to fit what happens in the past, which is the easy thing. Before we started the competition, people were overfitting the past, thinking that the future would be exactly like the past. There was not the idea of separating the forecasting data into a training part and a testing part. So we try to predict as accurately as possible, not the training part, but the test part - the future, in other words. Because the future is not exactly like the past, it changes, and the idea now is we don’t want to overfit the past because there will be some changes between the past and the future. So we try to figure out how these changes are going to take place and use them to predict more accurately in the future. This is a very big difference because in the past, they were not considering that; they were thinking that the future would be exactly like the past, but we know very well that’s never happening.
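
The separation into a training part and a testing part can be sketched as follows. The series is hypothetical (a constant level plus noise), and the two methods are deliberately extreme illustrations chosen for this example: one that fits the past perfectly by memorizing it, and one that fits the past poorly with a flat line.

```python
import random
import statistics

def holdout_split(series, horizon):
    """Hold back the last `horizon` points as the test part (the 'future')."""
    return series[:-horizon], series[-horizon:]

def mse(forecast, actual):
    """Mean squared error between two equal-length sequences."""
    return statistics.mean((f - a) ** 2 for f, a in zip(forecast, actual))

random.seed(42)
# Hypothetical erratic series: a constant level of 50 plus random noise.
series = [50 + random.gauss(0, 10) for _ in range(500)]
train, test = holdout_split(series, horizon=100)

# Method A fits the past perfectly: it memorizes the training data
# (zero in-sample error) and replays its last 100 values as the forecast.
fit_a = list(train)
forecast_a = train[-100:]

# Method B fits the past poorly: a flat line at the training mean.
level = statistics.mean(train)
fit_b = [level] * len(train)
forecast_b = [level] * len(test)

for name, fit, forecast in (("memorize", fit_a, forecast_a),
                            ("flat mean", fit_b, forecast_b)):
    print(f"{name:10s} train MSE={mse(fit, train):8.2f}  test MSE={mse(forecast, test):8.2f}")
```

Ranking the two methods on the training part and on the test part gives opposite answers, which is the point of holding data back: only the test part says anything about the future.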

Kieran Chandler: Would you agree with that, Joannes? I mean, how have you seen forecasting techniques evolve through the decades if you’re going to look back?

Joannes Vermorel: I think what Spyros Makridakis is pointing out is fundamental. There is this apparent paradox that you want to be accurate on the data that you don’t have. That’s something very puzzling when you think about it because, obviously, whenever you want to measure accuracy, by definition, you’re going to measure it against the data that you have, but this is not what you want to do. This problem was partly addressed at the end of the 20th century with the theory of Vapnik and Chervonenkis. It’s a very abstract theory that gave birth to support vector machines, which are very complex. They started to formalize the idea that you have the empirical error and the structural error. Basically, the idea is that you want to minimize the real error, the real error being defined as the error that you’re about to make on data that you don’t have. To do that, you need to minimize the sum of the structural error and the empirical error, and that’s what support vector machines are about. Support vector machines have a very theoretical perspective. They have been implemented and have enjoyed great successes as a machine learning technique in a couple of fields. I think their most important contribution was to clarify, from a more theoretical perspective, what was going on. And then, when you want to actually get real results, I think resorting to those competitions where you really hold back a part of the data for real is probably the way to go in terms of getting accurate forecasts.
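
For reference, the split between empirical and structural error can be written down explicitly. The block below gives one commonly quoted form of the Vapnik–Chervonenkis bound, stated for binary classification; the symbols are not from the interview and are defined in the comments.

```latex
% R(f):      real (expected) error on data not yet seen
% R_emp(f):  empirical error measured on the n training samples
% h:         VC dimension of the model class (its capacity)
% The bound holds with probability at least 1 - \eta.
\[
  R(f) \;\le\; R_{\mathrm{emp}}(f)
  + \sqrt{\frac{h\left(\ln\frac{2n}{h} + 1\right) - \ln\frac{\eta}{4}}{n}}
\]
```

The first term rewards fitting the past; the second grows with the capacity h of the model class. A more complex model lowers the first term but raises the second, which is the formal version of the warning against overfitting.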

Kieran Chandler: In order to have, I would say, a very clean methodology, how does the approach you take in a test scenario, which is very much competition-based, vary from what you would be doing in the real world? I mean, in the real world, you have access to all the data all the time, and that gives you this strong methodological bias that Professor Makridakis was pointing out. It’s incredibly tempting to just have something that is going to fit the data, you know.

Spyros Makridakis: That’s what they used to do in the past. The famous Box-Jenkins methodology was to fit as well as possible in the past, and that’s why it was losing to all the simple methods that were not fitting very well in the past but were predicting more accurately in the future. If you overfit, then you’re losing the essence of forecasting. The future is never exactly like the past.

Joannes Vermorel: Exactly. One of the puzzling examples dates from when I started Lokad. Clients were usually super surprised when they looked at an incredibly erratic time series, for example the consumption of alcohol in hypermarkets, a very erratic product with spikes. During the first few years of Lokad, we were doing classic forecasts, not yet probabilistic forecasts. For those super erratic time series, I would show a forecast that was much smoother than the original time series, while most of my competitors exhibited forecasts that were exactly as erratic as the original time series. I went into many great debates with prospects, who were not yet clients, and who just did not believe that this super smooth forecast could be correct, because it was so unlike the historical time series, which was super erratic and spiky, whereas my competitors put on display very spiky forecasts that looked just like the historical data.
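
A short worked calculation shows why the smooth forecast wins on such series. It rests on a simplifying assumption that is not from the interview: the spikes are purely random fluctuations around a stable level, independent from one period to the next.

```latex
% Assumed model: stable level \mu plus independent noise of variance \sigma^2.
\begin{align*}
  y_t &= \mu + \varepsilon_t,
  \qquad \mathbb{E}[\varepsilon_t] = 0,
  \quad \operatorname{Var}(\varepsilon_t) = \sigma^2 \\[4pt]
  % Smooth forecast: always predict the level.
  \hat{y}_{t+h} = \mu
  &\;\Rightarrow\;
  \mathbb{E}\big[(y_{t+h} - \mu)^2\big] = \sigma^2 \\[4pt]
  % Spiky forecast: replay an earlier observation (m >= h, so it lies in the past).
  \hat{y}_{t+h} = y_{t+h-m}
  &\;\Rightarrow\;
  \mathbb{E}\big[(y_{t+h} - y_{t+h-m})^2\big]
  = \mathbb{E}\big[(\varepsilon_{t+h} - \varepsilon_{t+h-m})^2\big]
  = 2\sigma^2
\end{align*}
```

Under this assumption, a forecast that looks as spiky as the history has twice the expected squared error of the flat one, because it adds the noise of the past to the noise of the future.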

Kieran Chandler: So, Spyros, one of the things I’m really curious about is how the perception towards forecasting has changed throughout your career. When did that shift towards forecasting being seen as a science happen, and when did it become more accepted into the mainstream?

Spyros Makridakis: Well, it took some time. In the beginning, the classical statisticians did the same thing that was done to you: they said that what is important is to follow the fluctuations of the series. But that’s not how you forecast. It took some time to realize that you cannot predict randomness, and what the M Competitions have proven beyond any doubt is that what is important is to separate the past from the future, and that you don’t try to overfit the past but try to have a model that is adaptive to changes from the past to the future. And that’s the major change. And now it’s accepted that in addition to looking at forecasting, we have to look at the uncertainty in our predictions. A lot of people don’t like it at all because, psychologically, it’s not a very good thing to talk about uncertainty, to say that I’m going to forecast but I’m uncertain.

Kieran Chandler: Joannes, how inaccurate would you say your forecasts are? Clients often tell me that you claim to forecast but also admit that you cannot forecast due to a lot of uncertainty in the future.

Spyros Makridakis: That’s reality, you cannot avoid being realistic. It introduces this kind of idea of a confidence level in your forecast. Joannes, would you say that the perception has almost shifted so far that people expect too much from forecasting and they’re expecting it to be infallible?

Joannes Vermorel: That’s an interesting question. I’ve been to the National Retail Federation trade show in New York quite a few times, and what I saw is that most of my competitors were very frequently making completely outlandish claims of having 99% accuracy in retail. Frankly, I don’t even know what 99% accuracy means in hypermarkets, where most products are sold in small quantities daily. It’s ridiculous to think that you would know, to the last unit, whether somebody is going to pick a product, while this person might not even know themselves. I’ve seen many vendors trying to oversell the idea that you can have incredibly accurate forecasts everywhere, which is absolutely not the case. They use the aura of science that statistical forecasting has gained in other areas, such as demographics, electricity consumption, and water consumption, where the amount of uncertainty is comparatively very low, to claim that they can achieve the same level of accuracy in hypermarkets, which is just not quite the same. You can do a lot, but it’s not the same order of magnitude in terms of accuracy.
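
The point about predicting "to the last unit" can be quantified with a small sketch. It assumes, purely for illustration, that daily demand for a slow-moving product follows a Poisson distribution; the function names are invented for this example.

```python
import math

def poisson_pmf(k, mean):
    """Probability of selling exactly k units when demand is Poisson with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def best_exact_hit_rate(mean, max_units=50):
    """Highest chance that any single-number forecast matches demand to the unit."""
    return max(poisson_pmf(k, mean) for k in range(max_units + 1))

for mean in (0.5, 1, 3, 10):
    rate = best_exact_hit_rate(mean)
    print(f"mean daily demand {mean:>4}: best exact-hit probability = {rate:.3f}")
```

Under this assumption, even the best possible point forecast for a product selling one unit a day on average is exactly right on fewer than four days out of ten, which is why a blanket "99% accuracy" claim is hard to interpret for such products.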

Spyros Makridakis: One of the biggest problems with forecast users is that their expectations are too high, because consultants try to sell them forecasts that they’re overconfident in. So part of what we have to do in the field is to say, “Look, we cannot be prophets. Our data, in particular in areas like retailing, tell us that the uncertainty is very big, and we have to do something about this.” We don’t talk about only the normal uncertainty; we also have uncertainty which is fat-tailed, like Nassim Taleb’s famous black swan events, which destroy a lot of our forecasts and create problems like the pandemic has. We must take this also into account; you cannot avoid uncertainty.

Kieran Chandler: Joannes, would you agree with that? We’ve spoken a bit in the past about consultants coming in and promising a little bit too much from forecasts.

Joannes Vermorel: Yes, I agree. What was actually hard about probabilistic forecasting was not the technicality of producing it, but managing the expectations that had been set too high by consultants who promised too much.

Kieran Chandler: So Joannes, you’ve talked about the importance of probabilistic forecasting and injecting structural priors. Can you explain a bit more about that?

Joannes Vermorel: Certainly. The probabilities themselves are not that difficult. What was hard was to convey the acceptance that yes, Lokad was going for probabilistic forecasts, and not because our forecasts were crappy. Even if we are not the best in those competitions, we certainly rank quite high. The problem was not that we had crappy forecasts; it was to gain acceptance that there are things that are beyond our control. And the interesting thing about those tail events is that suddenly you have something where it’s very, very difficult to trust your DR. And that is something of very high interest to me: if you keep your, I would say, non-tail forecasting model relatively simple and manageable, you can, obviously with a degree of subjectivity, inject structural priors to basically inject this dose of super rare and super extreme events.

Kieran Chandler: I see, and how does that help in supply chain optimization?

Joannes Vermorel: So that’s basically what we are doing at Lokad. For example, things like pandemics: we can’t forecast pandemics. But what we can do, and that’s not even super complicated to do, is to say, “Well, I can inject a prior that says there is, let’s say, a yearly probability of two percent that there will be a 50 percent downturn that impacts the company.” I don’t know why, I just know that it’s a reasonable assumption. It’s subjective, you know: why two percent, and why a fifty percent downturn? All of that is very subjective, but the interesting thing is that if you inject a dose of tail events in your forecasting models, even if it’s fairly inaccurate, even if the quantification is very vague, then when you build your supply chain optimization on top, you steer the decisions towards things that are a lot more robust against those tail events without investing too much money in it. So that’s the sort of thing that we do: we keep those forecasting models simple so that we can inject those structural priors that are, I would say, very much made up, although they are reasonable. They are not precise, but the consequence is that, at the end of the day, you can have supply chain decisions that end up with a lot more resilience with regard to things that happen infrequently. And the process is quite simple, but in practice, it takes a lot of convincing to bring people to understand those black swans. Indeed, I frequently resort to “Please read the work of Nassim Taleb,” but it’s a tough sell when you want to convince a prospect that you want to give them, you know, a 600-page book written by yet another, you know, great Greek thinker of Nassim Taleb.
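
The prior described above (a two percent yearly probability of a 50 percent downturn) can be sketched as a simple mixture. Everything else in this example is a hypothetical assumption: the baseline annual demand model, its parameters, and the function names. It is an illustration of the idea, not Lokad's implementation.

```python
import random

def simulate_annual_demand(n_draws, downturn_probability=0.0, downturn_factor=0.5, seed=7):
    """Draw annual demand scenarios, optionally mixing in a rare downturn."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        demand = max(0.0, rng.gauss(1000, 100))   # hypothetical baseline forecast model
        if rng.random() < downturn_probability:    # structural prior: rare, extreme shock
            demand *= downturn_factor
        draws.append(demand)
    return draws

def quantile(values, tau):
    """Empirical tau-quantile of a list of simulated values."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(tau * len(ordered)))]

baseline = simulate_annual_demand(100_000)
with_prior = simulate_annual_demand(100_000, downturn_probability=0.02)

for tau in (0.01, 0.5):
    print(f"quantile {tau:<4}: baseline={quantile(baseline, tau):7.1f}  "
          f"with prior={quantile(with_prior, tau):7.1f}")
```

The median scenario barely moves, while the low quantile drops sharply. A supply chain optimization that looks at the whole distribution, rather than at the median alone, therefore sees the downside risk even though the two percent figure itself is a rough guess.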

Spyros Makridakis: Joannes, can I ask a question? Because it seems to me that you are injecting structural priors, which could be seen as bias in your forecasting models. Do you believe that this bias can be detrimental to the forecasting accuracy of your models?

Joannes Vermorel: Yeah, that’s a very good question, Spyros. There are two things to that. First of

Kieran Chandler: Spyros, you have had half a century where you’ve been involved in the industry. What is it that you’re most proud of when you look back at your career?

Spyros Makridakis: Well, I’m most proud of the fact that we provide empirical evidence about what works in forecasting and what doesn’t. It’s not just talk, but we have conducted experiments through the M-Competitions. From these experiments, we can tell you which methods work and which don’t. What we can tell you is that simplicity works. We realize that there’s a lot of randomness in past events, and we cannot predict everything precisely. Because our forecasts are uncertain, there is risk, and we need to do something to anticipate this risk and be prepared to face it.

Kieran Chandler: Thank you both for your time today.

Joannes Vermorel: Thank you very much.

Spyros Makridakis: Thanks for interviewing me.

Kieran Chandler: That’s everything for this week. Thanks very much for tuning in, and we’ll see you again in the next episode.
