00:00:07 Introduction to the topic of machine learning in the supply chain industry.
00:00:46 Introduction to the guest Alexander Backus, who is the data and analytics leader at IKEA.
00:02:20 Explanation of the concept of self-fulfilling prophecy.
00:03:03 Discussion of how a self-fulfilling prophecy affects the supply chain, such as business targets and the influence of demand and supply.
00:07:14 Explanation of how feedback loops in the supply chain make the world more complex and how a surplus of a certain product can influence its sales.
00:08:53 Discussion about feedback loops in supply chains and the influence of human behavior on them.
00:10:41 Use of sales data in demand forecasting and the potential consequences of using a naive approach.
00:13:08 Zero forecasting problem in machine learning systems and the bullwhip effect.
00:15:17 Explanation of the stock-out bias and techniques to deal with it.
00:17:22 Discussion about the prevalence of stock-outs and the effectiveness of the method to deal with the stock-out bias.
00:18:15 Explanation of how the customer’s perception of a product can affect demand and the impact of stock levels on sales.
00:20:17 Explanation of loss masking and its purpose.
00:20:26 Explanation of how giving the model access to stock levels can help it understand the effect of stock level fluctuations on sales.
00:22:14 Discussion of the limitations of using a machine learning model for causal inference and the effects of confounding variables.
00:25:54 Explanation of how probabilistic forecasting can help reduce the impact of zero forecasting by acknowledging the “fuzziness” of the information available.
00:27:04 Explanation of the benefits of using a probabilistic forecasting model.
00:28:44 Advantages of using a probabilistic forecasting model over a point forecast.
00:30:42 Feedback loops and how they affect the forecast.
00:34:35 How prices can affect the forecast.
00:36:32 Explanation of partial observability and its challenge in creating a model for supply chain management.
00:37:04 Comparison to the concept of bandit feedback and its well-known application in e-commerce recommendation systems.
00:37:17 Discussion on the limitations of supervised learning in predicting the impact of decisions in supply chain management.
00:38:01 Explanation of policy-based reinforcement learning algorithm.
00:41:06 Discussion on the challenges in applying reinforcement learning algorithm to real-world supply chain management and the solution of starting with offline learning from historical data.
00:44:55 Discussion on how habits and past practices affect price movements in a company.
00:46:41 Explanation of exploitation and exploration in reinforcement learning.
00:50:57 The need to acknowledge feedback loops in forecasting as a change of paradigm.
00:52:45 The technical and cultural challenges in incorporating AI into business processes.
00:53:57 Discussing the challenges in modeling and making decisions in the supply chain industry.
00:54:55 Acknowledging the existence of feedback loops in the supply chain process.
00:55:06 Moving towards a decision-based approach rather than a forecast-based approach.
00:57:27 The trend in the supply chain industry, especially among large e-commerce companies.
01:01:03 What qualities are sought after when on-boarding new people to work on supply chain challenges at IKEA.

Summary

In an interview moderated by Nicole Zint, Joannes Vermorel, founder of Lokad, and Alexander Backus, Data and Analytics leader at IKEA, discuss the application of machine learning and AI in the supply chain industry. The interview highlights the impact of self-fulfilling prophecies and feedback loops on supply chain management and stresses the challenges of utilizing machine learning models in forecasting. The interview also explores approaches to avoiding the zero forecasting problem, such as using probabilistic forecasting, and the importance of acknowledging uncertainty in supply chain forecasting. The panelists emphasize the need for embracing uncertainty, moving towards decision-making models, and incorporating changes in a step-by-step manner to improve supply chain management.

Extended Summary

In this interview, Nicole Zint moderates a discussion between Joannes Vermorel, founder of Lokad, and Alexander Backus, Data and Analytics leader at IKEA, about the application of machine learning and AI in the supply chain industry. They discuss the concept of self-fulfilling prophecy and its potential impact on supply chains, the role of feedback loops, and the challenges of utilizing machine learning models in forecasting.

A self-fulfilling prophecy is a prediction that directly or indirectly causes itself to become true due to the feedback between belief and behavior. In supply chain management, forecasts can impact decision-making processes and ultimately change the future. Vermorel points out that self-fulfilling prophecies are not inherently good or bad; they simply make the situation more complex.

Feedback loops are prevalent in supply chains, as humans react to forecasts, which can then affect future predictions. Vermorel highlights how these loops can manifest in various ways, such as adjusting prices or product placements based on stock levels. He also notes that competitors may alter their strategies in response to a company’s forecasts, creating additional feedback loops.

Backus explains that sales data is a key input for machine learning models in forecasting, but sales are not the same as demand. Sales data can be influenced by supply and other factors, while demand is an unobserved quantity that needs to be inferred. He emphasizes the importance of distinguishing between the two and considering their interplay in the forecasting process.

Machine learning models can be problematic in supply chain forecasting if they’re not designed to account for feedback loops and self-fulfilling prophecies. Backus mentions the “bullwhip effect,” where small deviations in the supply chain can be amplified by the system. This can lead to detrimental effects, such as spiraling sales or inaccurate predictions. He contrasts predicting weather, which is not influenced by human behavior, with predicting business outcomes, which are subject to these complex feedback loops.

To mitigate the challenges posed by feedback loops and self-fulfilling prophecies, Vermorel suggests that companies should embrace the complexity of supply chain systems and recognize that point forecasts may be insufficient. Instead, they should seek to understand and anticipate the potential impacts of their forecasts on human behavior and decision-making processes.

In summary, the interview explores the intricacies of using machine learning and AI in supply chain management, highlighting the importance of understanding self-fulfilling prophecies and feedback loops to improve forecasting accuracy and decision-making.

The zero forecasting problem occurs when a system orders less stock due to a perceived drop in demand, causing demand to drop further and leading to a continuous decline in orders. To avoid this issue, Vermorel suggests removing the stockout bias by changing the metric used in the forecasting model. One approach is to zero out the measurements on days with stockouts. This method works well when stockouts are relatively rare but is less effective in industries with high stockout rates.
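As an illustration of zeroing out stockout days, here is a minimal sketch of a masked error metric. It is not Lokad's implementation: the function name `masked_mae`, the choice of mean absolute error, and the sample numbers are all assumptions made for the example.

```python
def masked_mae(actual_sales, forecasts, in_stock):
    """Mean absolute error computed over in-stock days only.

    Days flagged as stockouts contribute nothing to the metric, so a
    zero-sales day caused by an empty shelf does not pull the forecast down.
    (Hypothetical helper; averaging over in-stock days is an assumption.)
    """
    errors = [abs(a - f) for a, f, ok in zip(actual_sales, forecasts, in_stock) if ok]
    if not errors:
        raise ValueError("no in-stock days to evaluate")
    return sum(errors) / len(errors)


sales = [10, 12, 0, 0, 11]                    # units sold per day (made-up data)
in_stock = [True, True, False, False, True]   # False marks a stockout day
forecast = [11, 11, 11, 11, 11]

# Uniform metric: the two stockout days dominate the error.
plain_mae = sum(abs(a - f) for a, f in zip(sales, forecast)) / len(sales)
# Masked metric: stockout days are removed from the measurement.
masked = masked_mae(sales, forecast, in_stock)

print(plain_mae, masked)
```

With the uniform metric, a model is rewarded for forecasting lower to match the zeros; with the masked metric, the same forecast of 11 is scored only against the days where sales could actually reflect demand.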

Another approach is to give the machine learning model access to historical and future stock level data, allowing it to learn the effect of stock level fluctuations on future sales or demand. This method requires feeding all decisions and factors affecting demand, such as promotions, pricing, capacity, warehouse constraints, and market forces, into the forecasting model.

However, Backus warns that using a standard machine learning model without all the necessary information can lead to mistakes, such as confusing the cause and effect of stock level changes and demand fluctuations. To avoid these issues, he suggests using probabilistic forecasting, which acknowledges the fuzziness of available information and avoids converging on an absolute confidence in the demand being zero.

Probabilistic forecasting spreads probabilities across many values, making it more difficult to converge on an absolute confidence in zero demand. This approach avoids inventory freezing at zero by estimating non-zero probabilities for future demand. It also accounts for the asymmetry between serving a customer and keeping extra stock for an extra day, favoring higher service levels.
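The effect of this asymmetry can be sketched with the classic newsvendor critical-ratio rule, which is swapped in here purely for illustration and is not claimed to be the method discussed in the interview. The Poisson demand assumption, the cost figures, and the function names are all hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k units of demand under a Poisson assumption."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def stock_quantile(mean_demand, stockout_cost, holding_cost, max_units=100):
    """Smallest stock level whose cumulative demand probability reaches
    the critical ratio stockout_cost / (stockout_cost + holding_cost)."""
    critical_ratio = stockout_cost / (stockout_cost + holding_cost)
    cumulative = 0.0
    for units in range(max_units + 1):
        cumulative += poisson_pmf(units, mean_demand)
        if cumulative >= critical_ratio:
            return units
    return max_units


mean_demand = 0.4  # a slow mover after a few bad weeks (made-up figure)

# A point forecast rounded to whole units collapses to zero stock.
point_decision = round(mean_demand)

# A probabilistic view keeps some weight on 1, 2, 3... units; with a missed
# sale assumed ten times costlier than holding a unit, one unit is kept.
prob_decision = stock_quantile(mean_demand, stockout_cost=5.0, holding_cost=0.5)

print(point_decision, prob_decision)  # 0 1
```

Because the distribution never puts all of its mass on zero, the stocking decision stays above zero as long as a missed sale costs much more than one extra day of holding.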

Despite its advantages, probabilistic forecasting is not a perfect solution. It can still underestimate future demand in cases of repeated stockouts. However, it does provide a more robust method for managing inventory and avoiding the zero forecasting problem.

In conclusion, adopting machine learning techniques and probabilistic forecasting can help supply chain professionals better predict demand and manage inventory levels. By considering various factors that influence demand and accounting for the uncertainties in the available data, businesses can make more informed decisions and improve their supply chain performance.

Joannes Vermorel emphasized the importance of acknowledging uncertainty in supply chain forecasting, as perfect modeling of future events is unrealistic. He discussed the concept of probabilistic forecasting, which reflects the inherent uncertainty of supply chain events, and how it differs from point forecasts. Probabilistic forecasts, he explained, involve probability distributions, making the future look very different from the past. He also touched upon feedback loops as an extra dimension to enrich forecasts by making them dynamic and conditional upon future behavior.

Alexander Backus agreed with Vermorel’s points and elaborated on how giving models access to previous decisions, such as pricing, can alleviate issues with forecasting. He introduced the concept of partial observability, which involves only observing the effect of a decision without knowing the counterfactual. In order to better predict the impact of decisions, Backus suggested reframing machine learning problems to output optimal decisions instead of predictions about the future. This approach is called reinforcement learning.

The conversation revolves around the challenges of forecasting and decision-making in supply chain management due to feedback loops, limited data, and non-random decisions. They emphasize the need for embracing these feedback loops and moving towards a model that outputs decisions rather than forecasts. The trend among technologically-minded companies like Amazon and Alibaba is to let go of the idea of a perfect forecast and focus on decision-making. Despite existing challenges, the panelists agree that the industry should work towards incorporating these changes in a step-by-step manner to improve supply chain management.

Vermorel highlights the importance of embracing uncertainty and the irreducible complexity of supply chains, which are composed of humans, machines, and processes. He advocates for being approximately correct rather than exactly wrong. Backus emphasizes the need for great data science talent to tackle challenges within large corporations like IKEA, stressing the potential for global impact and the importance of challenging the status quo.

Full Transcript

Nicole Zint: Welcome Alexander Backus with us here today at our offices. Alexander is an expert in this field and is the Data and Analytics leader at IKEA. So, as always, we’d like to start off by letting our guests introduce themselves. Alexander, if you wish, the floor is yours.

Alexander Backus: Thanks, Nicole. Thanks for having me here. It’s great to be here in Paris with you. My name is Alexander Backus, and I’m leading data analytics in the inventory and logistics operations domain of IKEA Ingka Group Digital. I’m managing a group of data scientists, data engineers, and data analysts working in cross-functional product teams on a mission to optimize inventory logistics operations planning. I have a background in data science, and I’ve worked as a consultant for big companies like KLM Airlines, Heineken, Vodafone Ziggo, and ING Bank. After doing a PhD in Cognitive Neuroscience, I think working in supply chain as a data scientist is a really exciting field because it combines a lot of favorable conditions for data science. There’s a lot of data, there’s an impact on real-world decision-making, so it’s something tangible, and you don’t impact only the bottom line, but also help create a more sustainable world by reducing waste in the supply chain. So that’s how I ended up here.

Nicole Zint: Before we delve into these topics, let’s first explain the concept that we will be discussing. Let’s just start off: What is a self-fulfilling prophecy?

Alexander Backus: The idea is that the forecast you make to optimize your business process actually impacts a certain decision-making process. There’s a decision being made based upon your forecast, at least that’s what you want. When that happens, it means that your forecast itself is changing the future and also changing the data that is being used to forecast the next time. This can pose certain challenges. Essentially, a self-fulfilling prophecy is when a prediction happens because it was predicted. So, you affect the future because you thought it would be a certain way. You’re not only affecting the future, but you also create a reality where the forecast becomes the truth, and that can happen in various ways. For example, if you have a forecast of your business or your sales, then this can become the target for your business.

Nicole Zint: So, marketing people make certain decisions, they say, okay, we should reach this target because we’re a bit low now, so we need to sell a bit more, and we need to do some promotions. So actually, the forecast that you made has become the target that has led to decision making along the way, that impacts what will be the end result of sales in this example. And that can happen in many ways. So another example is where you will have a certain forecast that makes it such that you secure a given delivery capacity or picking capacity in your warehouses, and that has an impact on the lead time. So when a customer looks at your e-commerce website and sees that the lead time is very high or very low, it can go either way; that actually influences the demand from the customers.

Alexander Backus: Exactly, so demand influences your supply, and supply influences the demand. It goes either way, and that is actually this effect that you’re hinting at, Joannes. When it comes to the forecasts that become business targets, how do you see that affecting the business itself? What are the drawbacks when the forecast is something people aim to reach rather than actually looking at their supply chain performance?

Joannes Vermorel: There are no drawbacks per se. It is more a matter of this being the way supply chain operates. You know, feedback loops are all over the place. We are dealing with essentially human affairs, and what surprises practitioners is that in many engineering schools and even in many companies, people approach forecasting as if they were forecasting the movement of planets, something where you have a very clean framework where you have past observations, and you can make a statement about the future position of the planet. But you, being the forecaster, have no impact whatsoever on those elements being observed, like the planets.

Nicole Zint: So you mean to say that a self-fulfilling prophecy is not necessarily good or bad, it just is?

Joannes Vermorel: Yes, exactly. You can’t pretend that it doesn’t have an effect, but it certainly makes the situation more complex and complicated, actually a bit of both. And so, where it becomes a little bit puzzling is that many companies have a hard time coming to terms with anything that is not a point forecast. Say, you have one future; this is it. And it is essentially something that is completely symmetric to the past. You have your past observations, and you would like to have the future that is just as clean and neat as the past, essentially more of the same.

Nicole Zint: Yes, more of the same but also really the same nature. So you have a perfectly clear vision about the past and a perfectly clear vision about the future. And by the way, in the case of the movement of planets, as long as you’re not looking millions of years ahead, you can have a completely perfect vision for the position of those planets one century from now.

Joannes Vermorel: Now, where it becomes interesting is that in supply chain, you have feedback loops all over the place. Whenever you are committing yourself to a product by buying a lot, then indeed, you create expectation, and people feel that they have to sell the product, and they will do whatever it takes so that the company is not left with massive overstock that they have not managed to push. They will organize themselves so that this massive supply transforms into massive sales, or at least that’s what they will try to do. They adjust the price according to how much they have in stock, or sometimes things that are even more mundane. If there are stores, if you

Nicole Zint: In a slightly different direction, just to establish a bigger differentiation, you see those feedback loops, they are all over the place and they are not bad. They are just present, and again the core reason is because in the middle we have humans that can think and act based on those statements about the future. So whenever there are humans in the loop, whenever you’re making a statement about the future, people are going to react according to those statements. Supply chains are very complex, so those reactions can take many forms. But all supply chains have in common that they involve plenty of people, and sometimes, for example, the feedback loop also takes the form of announcing a shortage of something. Then people rush to buy this something, and so you can have a man-made shortage just because of a psychological effect.

Joannes Vermorel: Exactly. And the idea that if you announce a shortage, you’re most likely going to cause a shortage, it’s nothing new. It’s relatively predictable, but nonetheless, it is difficult to anticipate all those cues because suddenly you have to be perfect. Yes, and suddenly you have to model, in a way, the psyche of the people who are in the middle of the supply chain.

Nicole Zint: Joannes, you keep mentioning these feedback loops. Alexander, may I ask you what actual data is fed back into these systems for our viewers to understand? So at what point in the supply chain do we feed the data back?

Alexander Backus: Good question. I think one very important source for doing any type of forecast is your sales data, and this is also the key data that is affected by the effects that we just talked about. So if we go back to what Joannes was explaining, the naive approach to demand forecasting, or business forecasting in general, is where you take a supervised machine learning model and you start treating it as a basic regression problem. So you say, “Okay, I’m just going to predict this quantity based on historical data using a supervised learning algorithm.” And then if you take that model that is trained to predict future sales and now think back about the examples of the feedback loops that we discussed, you can have detrimental or degenerate cases here. So where your model predicts low demand or low sales, let’s be very cautious in not confusing the two, but let’s ignore for a moment that sales are not demand.

And so it will get you into a situation where you predict low sales, so you also do low capacity planning, and therefore you also sell less, and then you will go down and down until you get to zero. So the model will start to learn that the demand is falling, and it keeps falling. And it can also go the other way around, actually. So it can also spiral up in that sense.
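The downward spiral can be sketched with a toy simulation. Everything here is an assumption made for illustration: the function name `simulate_naive_loop`, the rule that stock equals the forecast, the rule that the next forecast equals the last observed sales, and the demand figures.

```python
def simulate_naive_loop(demand, initial_forecast):
    """Replay a naive feedback loop: stock is planned from the forecast,
    and the next forecast is simply the last observed (censored) sales."""
    forecast = initial_forecast
    sales_history = []
    for true_demand in demand:
        stock = forecast                  # capacity planned from the forecast
        sales = min(true_demand, stock)   # sales are capped by available stock
        sales_history.append(sales)
        forecast = sales                  # the model learns from sales, not demand
    return sales_history


# True demand fluctuates around 19 units per day (made-up figures).
demand = [20, 22, 18, 21, 15, 23, 20, 12, 22, 19]
sales = simulate_naive_loop(demand, initial_forecast=20)

print(sales)  # [20, 20, 18, 18, 15, 15, 15, 12, 12, 12]
```

Every low-demand day ratchets the forecast down, and no high-demand day can pull it back up because sales never exceed the stock that the previous forecast allowed. In this sketch the forecast only reaches zero if demand itself hits zero on some day; the point is that it can never recover on its own.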

Joannes Vermorel: Yeah, there are these detrimental effects if you use a machine learning model to learn from history to predict the future in this more naive way that can go completely wrong here.

Nicole Zint: It kind of sounds like a bullwhip effect, where a mistake in a supply chain or a deviation from the norm just gets amplified by the system. And you also mentioned the fact that sales are not necessarily demand, because you may sell 50 units of your stock, but if the demand was 100, it will still only register that your sales are 50. That distinction is actually related to the core of this problem.

Alexander Backus: Yeah, demand itself is, of course, an unobserved quantity. You cannot measure it, so you need to infer it. And sales data are the closest to that, but that’s definitely not the whole

Nicole Zint: So, we’re discussing the idea that forecasts produced can influence demand and sales, creating a feedback loop. Some have described the difference between predicting the weather and predicting business, where predicting the weather doesn’t affect it, while predicting business can actually impact it. Alexander, could you elaborate on this feedback loop, and how do we avoid the zero forecasting problem you mentioned?

Alexander Backus: Certainly. When a machine learning model learns from its own output data, it can amplify deviations from the norm. For example, if the demand drops a little for any reason, the model may tell the system to order less. As a result, demand drops even more because less is ordered, and the model then suggests ordering even less, leading to a zero forecasting problem. This issue is particularly common with time series forecasting. Joannes, how do we avoid this problem with machine learning systems?

Joannes Vermorel: The zero forecast is something you get when you don’t remove the stock-out bias, which can be quite strong. If you run out of stock, you observe zero sales, but that doesn’t mean there’s zero demand. We have at least three techniques in production at Lokad to deal with the stock-out bias. One approach involves changing the metric that you’re optimizing against with your forecasting model. Instead of applying the metric uniformly across time, you zero out the measurements on days where you have stock-outs. That’s a crude approach, but it can work.

Nicole Zint: What metric is typically used initially that you suggest changing from?

Joannes Vermorel: There are thousands of metrics, but the simplest ones are L1, L2, or even MAPE. The question is whether you apply the metric uniformly across time. The answer is typically no, you don’t want to apply it uniformly. You want to zero out your measurements on days where you have stock-outs.

Nicole Zint: So, to zero out means to remove the contribution of a day where there was a stock-out?

Joannes Vermorel: Yes, you remove the contribution of a day when you know your signal is heavily distorted. It works fine to cut out that signal, but it’s a rather crude approach.

Nicole Zint: Not if your stockouts happen to be very prevalent. For many businesses, stockouts are statistically relatively rare. They have like a 95% plus service level, so this sort of method works well if stockouts are somewhat exceptional, kind of like a natural disaster that happens quite rarely.

Joannes Vermorel: No, I mean just like, let’s say, a general merchandise store, you know, your supermarket. They have a 95% plus service level every single day, that’s fine. Where it would not work would be, for example, for a store of hard luxury. In this case, a store of hard luxury, just to give you an idea, would typically have, let’s say, 500 articles out of a catalog of 5,000. So, by definition, you have like 90% plus stockout all the time. In this case, this method doesn’t make much sense. So, you see, it really depends on the industry. There are industries like, for example, food, where you expect very high service levels. Your assortment is geared towards things that you’re supposed to have. For example, if your supermarket is usually selling a pack of soda bottles, you should be able to walk in the store with confidence that you will find those units. Sometimes you won’t, but those events will be rare. So again, it depends on the verticals that you’re looking at.

Nicole Zint: Okay, and essentially, the sales can send a wrong signal about the demand, like you explained. If it’s zero sales, it can just quickly be wrongfully assumed that it means zero demand, but in reality, it could be because you don’t have that stock. In fact, there may be very high demand for it. And then the opposite is true as well. If you have a stockout for another product that happens to be a nice substitute, then you can see the sales for an item surge while it is just reflecting the fact that you’re running out of stock of something that is like a loose substitute. Nonetheless, the perception of the customer might be that it’s bad service.

Joannes Vermorel: Yes, because customers might still be okay to take the substitute, but they might still think it is an inferior option. So, again, what is interesting is that you have to consider the agent, the customers, and what they think, and try to adjust your modelization of the demand to capture the sort of basic line of thinking that is going to go into your customer base.

Nicole Zint: How do we avoid this zero forecasting problem so that zero sales isn’t assumed to be zero demand?

Alexander Backus: Joannes has mentioned just not taking that signal into account, to just avoid those days. In technical terms, that’s called loss masking.

Joannes Vermorel: Yeah, you basically remove the contribution of that data point. Another straightforward technique is giving the model access to stock levels historically and maybe some future projections of it, so you can make sense of how these sales are influenced by the stock levels.

Alexander Backus: The model can then learn what the effect of certain stock level fluctuations on future sales or demand is if you model it. Essentially, the effect of the decisions.

Joannes Vermorel: Yeah, that’s where everyone wants to go, where you take all the decisions that have been made based on your previous forecasts and you feed them as input to your forecasting model.

Nicole Zint: When you train it, that’s not only stock decisions that impact stock levels, but it can also be marketing decisions, like even a target set by the business steering. They say, “Hey, this is how much we want to sell.” That’s a decision in itself because we have all these market forces.

Alexander Backus: Yes, market forces. You put all of that into the forecast as an input, such as promotions, pricing data, and capacity data. Capacity can also influence demand. If the lead times skyrocket, people go and find alternatives. Essentially, all the constraints in the business, warehouses, and everything that can affect demand serve as input signals to your model. Then the model can learn from history what the effect of these signals is on demand and therefore correct for it.

This is sort of step two in your modeling because there’s a lot of things you have to be wary about here. An interesting side step is that business users want to use your model to do what is called, in technical terms, causal inference. They want to tweak things like, “What happens if we do this promotion or if we reduce the stock levels? What happens with demand?” It’s kind of like a simulation.

For this to work, you need to take a lot more care in the modeling. If you do it the way I explained it, your model can easily learn effects like when the stock is low, the demand is high, just because some marketing campaign, which is the actual cause, made the stock go down and the demand go up. It confuses the concept. That’s called a confounder or reverse causality. A standard machine learning model, not given all the information it needs, will make this kind of mistake.
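The confounding effect can be sketched with synthetic data. All numbers and names below are assumptions made for the example: a hypothetical campaign flag raises demand and lowers stock, while stock has no direct effect on demand at all.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to stay dependency-free."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = []
for day in range(2000):
    campaign = day % 2  # hypothetical marketing campaign, on every other day
    demand = 100 + 60 * campaign + rng.gauss(0, 5)  # campaign lifts demand
    stock = 150 - 60 * campaign + rng.gauss(0, 5)   # campaign drains stock
    rows.append((campaign, stock, demand))

# Without the campaign flag, low stock looks like it "causes" high demand.
overall = pearson([r[1] for r in rows], [r[2] for r in rows])

# Once the campaign is known, stock and demand are essentially unrelated.
within = {
    c: pearson([r[1] for r in rows if r[0] == c],
               [r[2] for r in rows if r[0] == c])
    for c in (0, 1)
}

print(round(overall, 2), {c: round(v, 2) for c, v in within.items()})
```

A model that sees only stock and demand would pick up the strong negative correlation and answer a "what if we reduce stock?" question incorrectly. Supplying the campaign as an input removes the spurious link in this toy setup, though real data rarely separates this cleanly.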

A classical example is when you try to predict whether it will be hot weather. You can predict that by the number of ice cream sales. Well, of course, that’s a typical example of reverse causality. But maybe they cut their price down or had a stock out, and that was the actual reason. There are many things possible.

But you have to be careful. This is a way to get started with giving your model more information about the decisions that were made upon it and make sure that it learns how to relate. However, this will still be pretty challenging for the model itself to learn these relationships, especially if there are a lot of steps in between where you don’t have data from. If you give a forecast, it’s not one-on-one that someone in the business will take that and make decisions on it. There will be information added, changes made by planners in the business, and then you’re blind to that to some extent. It becomes again problematic and complex.

Before we delve into how we actually approach these new challenges faced by creating a…

Nicole Zint: Machine learning is a smarter model that outputs decisions and learns. Alexander, how does every decision impact the business, and how can we compare them to find what decisions we should take? We don’t just want to forecast, but also understand the intermediate steps. But before we delve into that, Joannes, we mentioned a bit earlier this zero forecasting problem, which is an important concept in this machine learning model. What is the difference in forecasting approaches that we take at Lokad? Do probabilistic forecasts help solve the problem with zero forecasting and sort of amplify, as we discussed, these deviations from the norm that just become bigger mistakes? How does probabilistic forecasting change that?

Joannes Vermorel: Probabilistic forecasting is very interesting in this respect and more generally for the feedback loop. There are two completely different reasons for this. The first one is the idea that we introduce a notion of fuzziness, so we try to be at least approximately correct as opposed to exactly wrong.

When it comes to situations with zero forecasts, for example, what happens is that when you have probabilistic forecasts, you acknowledge that the quality of the information you have tends to be quite fuzzy. You don’t have a perfect vision of what is happening, and thus it is going to be much more difficult, numerically speaking, to converge to an absolute confidence that the demand is really at zero. So it’s not that the probabilistic forecasting model is so much better, it is just that it will be spread out and it will avoid deadlocking on this zero position. It considers all the probabilities across many values, and when you add into the mix the fact that you typically have strong asymmetries between being able to serve one unit of demand versus just keeping one extra unit in stock for one extra day, typically in many situations, you are very much in favor of keeping one extra unit for a day rather than taking the risk of facing a stock-out. The trade-off is very much geared toward higher service levels.

Thus, what you get out of probabilistic forecasts is a situation where you have probabilities that are spread. You don’t have your forecast, which is your numerical statement about the future, that just collapses swiftly toward a degenerate state, which is where we are just saying that the future demand will be zero. It will still suffer from problems: if you have repeated stockouts, probabilistic forecasting is not magic. You will most likely underestimate the actual future demand. However, you will most likely avoid the inventory freezing at zero just because you’re still estimating that there is a non-zero probability of having one or two or three units of demand. That’s the first argument; it avoids amplifying in one direction.

Alexander Backus: Yes, it’s also important to consider that, especially when we have feedback loops, situations are very difficult to control completely. It’s better to have something that does not amplify in one direction, as Joannes mentioned.

Joannes Vermorel: We cannot pretend to have complete mastery of everything. Again, this is not the movement of planets we are talking about. These are phenomena where 30 to 60 percent inaccuracy is nothing too surprising.

Joannes Vermorel: So we are talking about a very high degree of inaccuracy in the sort of numerical statements that we make about the future. Probabilistic forecasting at least gives something that reflects this enormous ambient uncertainty. Again, we are trying to model humans, people who can react. It’s very, very difficult, and the first thing to acknowledge is that you’re not in control. Those people, those clients, those suppliers, those competitors, are smart; they are playing their own games and doing a lot of stuff. It would be, I would say, a bit of hubris to claim that you can perfectly model whatever is going to happen. That would be Foundation, the science fiction series by Asimov, where you can have a perfect statistical model of the future of large civilizations. It is extremely difficult and most likely unrealistic.

Joannes Vermorel: Probabilistic forecasting is also of high interest for a completely different reason. With a point forecast, you have complete symmetry between the past and the future. You have essentially one measurement per day per SKU, which would be your sales, for example, or your demand, and when you project into the future, you end up with one value per day per SKU. So the forecast is very much symmetrical to your past observations. However, when you go into the realm of probabilistic forecasting, suddenly what you’re looking at is a probability distribution, or a series of probability distributions. And so you have this very strong asymmetry between the past and the future. Suddenly, the future is completely unlike the past. In the past, you have observations; they are unique, and there is no uncertainty, or if there is, it’s just the uncertainty of the measurement itself. There might be a clerical error in your sales record, but in terms of order of magnitude, this is very, very small. In supply chain, this can almost always be approximated as no uncertainty compared to the future, where the uncertainty is vast, and that is what your probability distributions express.

Joannes Vermorel: And thus, what is very interesting, and that brings me to the feedback loop, is that the feedback loop is yet another extra dimension. It’s a way to enrich the forecast to make it more robust, but in a way that is very different because if probabilistic forecasting was about introducing probabilities, the feedback loop is about making the forecast a higher-order function. So fundamentally, your forecast is suddenly not a result, not even a probability distribution, it is a mechanism in which you can inject a policy, a sort of reaction, and you will get a different outcome. So you see, it becomes somehow something where you just know that if somebody acts – and this somebody can even be yourself in a certain way – you will still have an impact on the forecast.

Nicole Zint: So the situation becomes more dynamic and holistic when you go into the realm of feedback loops. Can you explain how this affects forecasting and how it becomes more elusive?

Joannes Vermorel: When you go into the realm of feedback loops, you’re dealing with something dynamic that needs a functional ingredient at its core, like a policy. This policy dictates how you react in terms of stocks, price, and different factors that represent your forecast. The forecast becomes more elusive because it’s not a simple object anymore. It’s affected by these feedback loops, and when people say “forecast,” they usually think of a point forecast. When we go into the realm of policy forecasts, we’re already stretching what people can think of. When we say it’s going to be probability distributions, it becomes much harder to visualize.

For example, your prices are going to evolve to help maintain the flow of goods in your supply chain. If a company is about to suffer a massive shortage, the most natural response is to gradually raise the price so that the shortage is less severe. Conversely, if you’re about to suffer a massive overstock situation, the natural response is to lower the price to increase demand and liquidate the overstock. The forecast you have about the future depends on your pricing policy in these examples. When you start thinking about feedback loops, your forecast becomes conditional, taking into account a policy that is under your control to some extent.
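One way to picture a forecast as a higher-order function is the toy sketch below, in Python. Everything in it is a hypothetical assumption: the two pricing policies, the demand response, and the helper names `simulate_sales` and `forecast`. The only point is that the forecast takes a policy as input and returns a different answer for each policy.

```python
import random
from typing import Callable, List

# A policy maps the current stock level to a price (hypothetical signature).
Policy = Callable[[int], float]


def fixed_price(stock: int) -> float:
    """Baseline policy: the price never reacts to the stock level."""
    return 10.0


def scarcity_pricing(stock: int) -> float:
    """Raise the price when stock runs low, to make the shortage less severe."""
    return 15.0 if stock < 20 else 10.0


def simulate_sales(policy: Policy, stock: int, days: int, rng: random.Random) -> List[int]:
    """One possible future: daily sales under a pricing policy (toy demand model)."""
    sales = []
    for _ in range(days):
        price = policy(stock)
        # Assumed demand response: between 0 and 100/price units wanted per day.
        wanted = rng.randint(0, int(100 / price))
        sold = min(wanted, stock)
        stock -= sold
        sales.append(sold)
    return sales


def forecast(policy: Policy, stock: int = 40, days: int = 10,
             runs: int = 2000, seed: int = 0) -> float:
    """Conditional forecast: probability that the stock is fully depleted
    by the end of the horizon, given the policy that is injected."""
    rng = random.Random(seed)
    depleted = sum(
        sum(simulate_sales(policy, stock, days, rng)) >= stock
        for _ in range(runs)
    )
    return depleted / runs


print("P(stock depleted | fixed price):     ", forecast(fixed_price))
print("P(stock depleted | scarcity pricing):", forecast(scarcity_pricing))
```

The same starting stock and horizon yield two different statements about the future, one per policy, which is the sense in which the forecast is conditional.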

Nicole Zint: Alexander, do you agree with the strengths and differences Joannes just outlined with the probabilistic forecasting approach compared to a time series?

Alexander Backus: Yes, giving your model access to previous decisions like pricing can alleviate this issue. Joannes talked about time series and probabilistic forecasting in that respect. However, we don’t only have the effect of your forecast affecting future decisions and training data; we also have what’s called partial observability. This means you only observe the effect of the decision that’s taken, and you don’t know what would have happened if you had more capacity or more stocks. That’s a counterfactual. The challenge is to create a model that’s good enough to accurately predict the impact of all the decisions.

This phenomenon is very well known in e-commerce recommendation systems and arguably less so in supply chain. It’s called bandit feedback. The term comes from multi-armed bandits, a slot machine setup in a casino where you only observe the reward from the arm you actually pull.

It’s the same effect in a recommendation system: if you show a certain advertisement, you don’t know what would have happened if you had shown a different one to the customer. There are specific modeling approaches that are well suited to this, and this is where the naive supervised learning setup that I talked about in the beginning falls short. It is not good at predicting the effect of an action. Rather, what you want to do is reframe your machine learning problem so that the model does not output a prediction about the future; it outputs an optimal decision. This is what I think Joannes alluded to as well; it’s called a policy. So you learn a model that says what you should do: this is the ad that you should show or, in a supply chain context, this is the stock you should move from A to B, this is the amount of capacity you should reserve. These are the actual things that directly affect your supply chain, rather than a forecast on its own from which you make decisions yourself, decisions the machine never gets to know about. In theory, you could completely skip the whole forecasting step and just say, this is what you should do.

Alexander Backus: There are specific machine learning algorithms for this, and the broader class is called reinforcement learning. That’s where you take an action in the real world, you observe its effect, and you frame it in terms of rewards, financial rewards. You get that feedback and then update your model based on it.

Nicole Zint: You mentioned financial rewards. So an example would be that you make the decision to order this much stock, then you observe how the supply chain performs and how much money came into the account, and that is fed back into the system so it understands that when we took these decisions, this was the outcome, and it continues from there?

Joannes Vermorel: Yeah, and that sort of financial objective can be more complex, taking into account storage costs, missed opportunities, and so on. There’s a lot that can be elaborated on that, or we can keep it at that. That’s what you then optimize with the reinforcement learning algorithm. That way, you directly learn the policy, the decisions you should output. So you embrace this self-fulfilling prophecy rather than avoiding it, which is what we started to talk about at the very beginning of our discussion. It’s not good or bad; it just cannot be ignored. And that is a way to get around the problem: a model that takes the decisions into account and learns from the impact of previous decisions to produce better and better decisions.
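As a rough illustration of such a financial objective, the Python sketch below scores one day of operations for one SKU. The function name `daily_reward` and the three cost parameters are hypothetical placeholders; a real objective would likely include more terms.

```python
def daily_reward(demand: int, stock: int,
                 unit_margin: float = 4.0,
                 holding_cost: float = 0.1,
                 stockout_penalty: float = 2.0) -> float:
    """Financial reward for one day and one SKU (all parameters are assumptions).

    Margin earned on units sold, minus the cost of storing what is left over,
    minus a penalty for the missed opportunity of each unserved unit.
    """
    sold = min(demand, stock)
    missed = demand - sold
    leftover = stock - sold
    return unit_margin * sold - holding_cost * leftover - stockout_penalty * missed


# Too much stock: full margin on 5 units, small storage cost on 3 leftovers.
print(daily_reward(demand=5, stock=8))
# Too little stock: margin on 5 units, penalty on 3 missed sales.
print(daily_reward(demand=8, stock=5))
```

A reinforcement learning algorithm would receive a number of this kind after each decision and adjust its policy to increase it over time.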

Alexander Backus: We should think a bit about the implications of that because that means you should also be able to experiment. And that is, in this setup, of course, very challenging if the model needs to learn and see what happens if it does A or B.

Nicole Zint: So why hasn’t this been essentially applied before, or is it not applied everywhere?

Alexander Backus: Well, this is one of the reasons. Also, typical reinforcement learning algorithms learn in an online fashion, that is to say, they take an action and then learn from the reward feedback they get from it. This is problematic in real-world settings where there’s a lot of risk involved. You also don’t have something to start the algorithm with, to make it output sensible things in the first place; it starts randomly initialized. Or you need to have a very good simulation environment, which is what you often see in other reinforcement learning settings, like AlphaZero from Google DeepMind learning to play chess. They have a computer simulation where the reinforcement learning algorithm can play around.

Nicole Zint: So you don’t essentially sacrifice someone else’s supply chain.

Alexander Backus: Exactly, you don’t want guinea pigs. But this is a chicken-and-egg problem in our case, because you then need a very accurate model of reality, and if you have that, you have already solved the problem. Either you need a real supply chain to experiment on, which you don’t want to do, or you need a model of your supply chain, and if you have that, you should not need to train; you should already be able to figure out the opportunity. Back to where we started.

Yeah, but there is a promising direction nowadays where you learn from historical data. It’s called offline reinforcement learning, where you basically learn from historical decisions that were taken. Even though they are not as nicely spread as you would have liked them to be, it’s still possible to train algorithms based on real-world data that has been gathered previously.

Nicole Zint: Like a starting point?

Alexander Backus: Yeah, like a starting point. From there, you can move to more online settings without sacrificing your supply chain, or you train it offline before you release it in batches. There are several options there, but this also comes with its own challenges.

Nicole Zint: Joannes, what is your take on what Alexander just described: starting offline, learning from previous data, so that the machine essentially bypasses this chicken-and-egg problem, becomes good enough to be applied to a real supply chain, therefore has more real data to work with, and goes on from there?

Joannes Vermorel: Data efficiency is almost always a concern for any kind of machine learning algorithm in supply chain, because you never have the luxury of a gigantic amount of data, at least not at the granularity at which the decisions need to be made. In supply chain, decisions typically need to be taken at the SKU level. And because batching takes place, even if we are looking at the SKU in a store, it’s not going to be millions of units per day. If we are looking at the SKU in a factory, there will be big batches, let’s say batches of 10,000 units, and again, it’s not going to be millions of batches per day. So the number of relevant observations is still relatively limited.

That’s one aspect that is always a challenge for reinforcement learning, because we don’t have that much data. A simulator is of very high interest, and it is also a point that I briefly touched on in one of my lectures. Essentially, there is a duality between a probabilistic forecast and a simulator. If you have a probabilistic forecast, you can always sample observations from it, and thus you get your simulator out of your probabilistic forecast. And if you have a simulator, you can just run many simulations and compute the respective probabilities, and you’re back to your probabilistic forecast. So there is a very strong duality.
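The duality can be sketched in a few lines of Python. The forecast below is a made-up distribution over daily demand, and the helper names `sample_demand` and `estimate_forecast` are hypothetical. Sampling turns the forecast into a simulator, and counting simulation outcomes turns the simulator back into an approximate forecast.

```python
import random
from collections import Counter

# Hypothetical probabilistic forecast: probability of each daily demand value.
forecast = {0: 0.5, 1: 0.3, 2: 0.2}


def sample_demand(probs, rng):
    """Forecast -> simulator: draw one demand observation from the distribution."""
    values = list(probs)
    weights = list(probs.values())
    return rng.choices(values, weights=weights, k=1)[0]


def estimate_forecast(simulator, runs, rng):
    """Simulator -> forecast: run many simulations and count the outcomes."""
    counts = Counter(simulator(rng) for _ in range(runs))
    return {value: counts[value] / runs for value in sorted(counts)}


rng = random.Random(42)
recovered = estimate_forecast(lambda r: sample_demand(forecast, r), 20000, rng)
print(recovered)  # empirical frequencies, close to the original probabilities
```

The recovered probabilities are only approximate, and they are only as good as the forecast or simulator they were derived from.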

Yes, that is interesting, but that relies on having a very accurate probabilistic forecast, which is very challenging.

Joannes Vermorel: Partial observability is a particularly tough nut to crack. When you take a dataset, let’s say you want to investigate price movements, the company might have operated in a specific way for the last decade, where they were not doing price movements at random; they had very strong habits. Sometimes the habits are so strong that it becomes a problem to differentiate what the actual cause of something is.

What if the company, every single year at the end of January, decides to have its first sales of the year? They have a practice of applying big discounts across a large variety of products at the end of January, which you will observe as a surge of demand at the end of the month. But what is the effect of seasonality? Would they observe a spike in demand at the end of the month even without the discounts? And what proportion of the impact comes from the discounts alone?

Alexander Backus: That’s the problem, indeed. The decisions were not taken randomly, and so what you observe reflects quite extensively the usual practices. One way to tackle that in reinforcement learning is to introduce a mix of exploration and exploitation. Exploitation means you do the best you can based on what you have observed, and exploration means you try something new, with the expectation that, because it is partially random, it will be inferior.
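A minimal sketch of this trade-off is the epsilon-greedy strategy for a multi-armed bandit, shown below in Python. It is one simple technique among many, not a method the speakers say they use. The arms could stand for candidate price points, and the true mean rewards are invented numbers that the algorithm never sees directly.

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, rounds=5000, seed=0):
    """Epsilon-greedy bandit: explore with probability epsilon, otherwise exploit.

    `true_means` are hypothetical average rewards per arm. Only the reward of
    the arm actually pulled is observed, which is the bandit feedback setting.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Exploration: try a random arm, knowing it is probably inferior.
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm that looks best from past observations.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward for that arm only
        pulls[arm] += 1
        # Incremental update of the running average reward for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls, estimates


pulls, estimates = epsilon_greedy([1.0, 1.5, 3.0])
print("pulls per arm:", pulls)
print("estimated mean rewards:", [round(e, 2) for e in estimates])
```

Without the exploration branch, the algorithm would keep pulling the first arm that produced a positive reward and might never find out that another arm pays more.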

Joannes Vermorel: So why would you ever try something that you know is most likely going to be inferior? The answer is, well, because it’s the only way that you can ultimately discover something that turns out to be superior. That’s the idea of sacrificing; essentially, it’s an investment in research and development. And it can take forms that are very mundane. Let’s say, for example, that you’re in a store and you’re selling candles.

You wonder: what if you were trying to sell the same candles but at a price point that was four times higher or four times lower? Both options may be valid. Maybe if you go for a very big bulk order from one of your suppliers and you vastly increase the quantity, you could vastly lower the price of a basic product. I’m taking a candle on purpose. You could have a much lower price, and maybe you would multiply the demand that you observe by 10. That would be a worthy trade-off.

Or you go the other route, change your path completely, and say, “I’m going to go for something much more premium, add a fragrance, something else, better packaging, and multiply the price by four.” Instead of having a tenth of the demand that I used to have, I still have half of the demand, but for a product with a much higher price.

Alexander Backus: However, if we look at history, most likely the variations that we have observed were just small variations compared to the baseline. Our history doesn’t encompass these more crazy, if you wish, scenarios.

Joannes Vermorel: Yes, and again, it can be about what happens if you take a product and say, “I’ll introduce five variants in five different colors.” What is the degree of cannibalization that I will observe, or am I actually reaching new markets? Again, if I take candles and say that I’m going to introduce multiple colors, to what degree will those candles of different colors cannibalize each other, and to what degree will I actually fulfill entirely new demand?

Joannes Vermorel: I don’t know, and maybe the historical record might give me some glimpse of this. But to a large extent, what we usually see is that, until companies start to introduce some kind of machine-driven randomness, there is very little randomness. It’s much more a matter of habits and patterns. And again, it also boils down to the way those companies operate. When there is a pricing decision, for example, it’s typically not just one person who came up with the idea. There is a method to it, and people have been trained to say, “In this sort of situation, you should be discounting the product because it’s the usual practice and it makes sense.” It’s fine, but it also means that most of the price variations that you observe in the historical data follow a small number of patterns, which are precisely the methods that are in place.

Nicole Zint: But surely that’s still a good starting point, though? As you mentioned, what else do you do? Either you sacrifice a supply chain, or you create a great simulation, but that is also based on the idea that you have good data to go on. As I mentioned, if we do it the offline way and look at the existing sales history or data that we have, even with the drawback that we might not see large deviations from the norm and their consequences, is that still the right starting point, in your opinion?

Joannes Vermorel: I believe that the right starting point is slightly different. The right starting point is first to acknowledge that whenever we have feedback loops, this is fundamental. If we acknowledge that those feedback loops are real and we want to tackle them, it is a change of paradigm in the way we approach forecasting itself. You see, that’s the real starting point. The rest is technicalities. There are plenty of models. The simplest reinforcement learning models, like bandits, can be incredibly simple. Some are incredibly complex, but those are technicalities. What I have observed in real-world supply chains is that the biggest challenge in starting to embrace something as simple as those feedback loops is to acknowledge that it will have deep consequences for the forecasts themselves. The forecasts are never going to be the same, and I’m not saying quantitatively. I’m saying that, in terms of paradigm, you cannot look at those forecasts the same way. This is not even the same object anymore. This is something of a different nature, and that’s very difficult, because usually the question that I get is, “Will my forecast be more accurate?” One of the challenges is, as soon as we start looking at those feedback loops, how do you even measure accuracy? That’s a whole question of its own. It’s a difficult question.

Alexander Backus: Yeah, if I can tie into that, I think we’ve been discussing the technical challenges and data availability challenges. But I completely agree with Joannes that the main reason it hasn’t been applied or adopted in enterprise settings is also that it has a profound impact on your business process. So, in this sort of theoretical setting…

Nicole Zint: So, who do you think are the most technologically-minded players in the e-commerce industry?

Joannes Vermorel: If I look at the very aggressive, technologically minded players, I believe that would be JD.com, Amazon.com, Alibaba.com. Those e-commerce companies are ahead of the game. They are really on top of their game, and they are very, very effective.

Alexander Backus: I would agree with that. Those companies are definitely leaders in the industry when it comes to technology and innovation.

Nicole Zint: So, the world has changed a lot over the years. What do you think, Joannes, about the world we live in today?

Joannes Vermorel: Well, it’s not as simple as it used to be. The world is still progressing, but we’ve had a lot of surprises in the last couple of years. It’s clear that we’re not at the end of history where everything is predictable. The world is chaotic, and we have to embrace the uncertainty and complexity of humans, machines, and processes in supply chains. We can’t have complete control, so my approach is to be approximately correct to capture everything rather than being exactly wrong.

Nicole Zint: That’s a really interesting take. And what about you, Alexander? What kind of talent do you look for when onboarding new people onto your team?

Alexander Backus: At IKEA, we’re always looking for great data science talent to solve challenges in a big corporation. We have a lot of data and the potential to have an impact on a global scale, so we need to challenge the status quo.

Nicole Zint: Thank you both for your insights. It’s been a pleasure having you with us today.

Joannes Vermorel: Yes, thank you.

Alexander Backus: Thanks for having me.