Mainstream statistics is all about the law of large numbers. Yet supply chains are the opposite: it is the law of small numbers that prevails. For decades, this misunderstanding has generated problems for practitioners through misdesigned tools, methods and processes. In this episode, we discuss the challenges, and the appropriate perspectives, when it comes to small numbers.

As a company that specialises in big data, it may seem surprising that we are focusing on small numbers. Yet it is small numbers, not large ones, that are ubiquitous in supply chain.

By “small numbers” we mean all the numerical choices and quantities that really matter, not streams of barcode digits. When it comes to quantities in supply chain, you are often dealing with single-digit numbers. Most statistics, however, are geared towards large numbers.

The performance of most calculations is driven by the size of the data: the bigger the data, the slower the calculation. The bottleneck here isn’t the CPU, but the simple act of loading and unloading the data, which can sometimes take days. Great gains in computation speed can be made simply by shrinking and compacting the data.
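The gains from compacting data are easy to illustrate. A minimal sketch (using NumPy and synthetic data, not anything from the episode): supply chain quantities are typically tiny, so storing them in one byte instead of the default eight shrinks the data to move by a factor of eight.

```python
import numpy as np

# Hypothetical daily sales quantities for one million SKU-days.
# Small numbers dominate: a Poisson(2) draw almost never exceeds a single byte.
rng = np.random.default_rng(42)
qty = np.minimum(rng.poisson(2, size=1_000_000), 255)

naive = qty.astype(np.int64)    # 8 bytes per value, the common default
compact = qty.astype(np.uint8)  # 1 byte per value, enough for 0..255

print(naive.nbytes)    # 8_000_000 bytes to load and unload
print(compact.nbytes)  # 1_000_000 bytes for the exact same information
```

Since moving bytes, not arithmetic, is the bottleneck, this one-line change in representation translates almost directly into throughput.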

Even at Lokad, when we talk about investigating “all possible futures”, we are still thinking in small numbers, as reality and reasoned judgement are already able to eliminate many of the futures.
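To make the “small number of futures” concrete, here is a minimal sketch (a toy illustration, not Lokad’s actual engine): once implausible futures are pruned, next week’s demand for one SKU collapses to a handful of scenarios, each with a probability, and decisions can be evaluated by simple enumeration.

```python
# Hypothetical pruned futures for one SKU: demand in units -> probability.
futures = {0: 0.35, 1: 0.30, 2: 0.20, 3: 0.10, 4: 0.05}

stock = 2

# Probability of a stockout: the futures where demand exceeds on-hand stock.
p_stockout = sum(p for demand, p in futures.items() if demand > stock)

# Expected unmet demand across the same futures.
expected_lost = sum((demand - stock) * p for demand, p in futures.items() if demand > stock)

print(round(p_stockout, 2))     # 0.15
print(round(expected_lost, 2))  # 0.2
```

Five scenarios suffice here; no large-sample machinery is needed to reason about all the futures that matter.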

To wrap things up, we discuss how, if basic supply chain science concepts from the 1980s were applied, Walmart could in fact be run on a modern-day smartphone. We also discuss the challenges of aggregating data, the granularity needed within supply chain, and how to make the best use of the computing power that is available today.

## Timestamps

00:08 Introduction

00:23 As a company that deals with Big Data, it is somewhat surprising to be talking about small numbers. What is the key idea today?

02:21 What do you mean by the expression “sending a number”?

06:17 Why should we really care about how many bits are used to send data?

08:34 How do you sort between what is small data and what is big data?

10:05 If we are looking at all the possible futures, does this mean that compute costs will be exceedingly high?

13:06 In a hypermarket with thousands of transactions per day, how do you know where to draw the limits for each item?

15:02 How well does consolidating sales by week or month work?

17:31 For someone watching this, what should they look to exploit in order to make the best use of their processing power?

20:47 Are people heading towards a more Big Data perspective?