I predict that predictive analytics will be increasingly relied upon, and this won't always be a good thing: predictive analytics, like predictive text, can be wildly wrong. I once published a newsletter without realising that autocorrect had renamed 'Emma Saunders' to 'Emu Saunters'. Neat, but so very wrong.
Predictive analytics can't predict the future. It spots trends in past data and applies them to particular situations in the future.
There are many forms of predictive analytics, some more complex than others. Banks have been at it since at least 2000 to produce Value at Risk (VaR) reports. There are three main methods for calculating VaR, and each is a form of predictive analytics.
The most sophisticated is the Monte Carlo method, which simulates many hypothetical outcomes based on historic price movements for a current portfolio of stocks, then concludes with 95% confidence the maximum loss over the coming days or weeks. As we saw in 2008, that less confident 5% can cause a lot of pain.
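The flavour of the calculation can be sketched in a few lines. This is a minimal illustration only: real VaR engines model correlated multi-asset price paths, whereas here I simply resample a single hypothetical return series (all numbers are made up).

```python
import random

def monte_carlo_var(historical_returns, portfolio_value, confidence=0.95,
                    n_simulations=10_000, seed=42):
    """Estimate one-day Value at Risk by resampling historical daily returns.

    A minimal bootstrap sketch, not a production VaR model: it draws past
    one-day returns at random and reads off the loss at the chosen quantile.
    """
    rng = random.Random(seed)
    # Simulate hypothetical one-day P&L outcomes by drawing past returns.
    simulated_pnl = sorted(
        portfolio_value * rng.choice(historical_returns)
        for _ in range(n_simulations)
    )
    # VaR is the loss at the (1 - confidence) quantile of simulated P&L.
    cutoff = int((1 - confidence) * n_simulations)
    return -simulated_pnl[cutoff]

# Hypothetical daily returns: mostly small moves, one large loss.
returns = [0.01, -0.005, 0.002, -0.02, 0.015, -0.01, 0.007, -0.03]
var_95 = monte_carlo_var(returns, portfolio_value=1_000_000)
```

Note how the number that comes out looks precise, yet everything hinges on the assumption that the past return distribution will repeat itself.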
As far as I'm concerned, the more complicated the algorithm, the greater the risk in using it. Low-risk predictive analytics uses methods that are easy to grasp. Lognormal price distributions and the ability to borrow at a risk-free rate (both assumptions behind VaR) fail this test. If we can't grasp the method, we have no hope of working out the implications of those assumptions, or of spotting the moment when they no longer hold.
There is something comforting about producing a number that quantifies our risk: we can somehow outsource our responsibility to it. Things might look a bit dicey in the market, but if our VaR remains steady, we're OK.
Road accidents dropped from 8 to 1 per year in Drachten (Netherlands) Town Centre once all road signs and traffic lights were removed. Drivers had to look, make eye contact and fully engage their brains during their drive. No technology needed, just logical human behaviour.
I'm not suggesting we stop modelling everything, not least because I'd be out of a job. But I do think we should limit our reliance on predictive analysis. Complex algorithms, black box forecasting, simulation-based modelling, scenario analysis and unexamined assumptions should all be red flags to decision makers.
I have been employed to do all these things and they have their place when done well. But they should either be presented with clear interpretation or be restricted to an audience that understands them.
Predictive analysis works best when its method is easy to grasp and its limitations are obvious.
Extrapolation is a basic example. Take a bar chart of annual turnover figures, identify the trend and extrapolate. Your audience grasps the method and sees its limitations. Of course, extrapolation tends to work better at a level where the trend is salient, such as per business unit or per product. Recombining the extrapolated figures then gives a better overall result. It's crude, yes, but it's not wrong.
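The extrapolate-then-recombine idea can be sketched as a straight-line trend fitted per business unit and projected one year ahead. The figures below are invented for illustration; the method assumes evenly spaced periods.

```python
def extrapolate_next(values):
    """Fit a least-squares straight line to a series and project one step ahead.

    A deliberately simple sketch: the audience can grasp the method and see
    its limitations at a glance.
    """
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return intercept + slope * n  # trend value at the next period

# Hypothetical annual turnover per business unit: one growing, one shrinking.
unit_a = [100, 110, 121, 130]
unit_b = [50, 48, 45, 44]

# Extrapolate each unit separately, then recombine for the group forecast.
forecast = extrapolate_next(unit_a) + extrapolate_next(unit_b)
```

Extrapolating each unit where its trend is salient, then summing, avoids the two opposing trends cancelling into a misleadingly flat group-level line.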
Frequency analysis is another example. It's my personal favourite and I think it is remarkably underused. When an e-commerce giant tells you that "Customers who bought this item went on to buy these other items", the assumptions are obvious: other people liked both widget and gadget, so you might too. Retailers are increasingly making use of frequency analysis in this way and we can learn from them.
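At its core this is just co-occurrence counting. Here is a minimal sketch with hypothetical order baskets and item names; a real retailer's system would add normalisation and scale, but the method stays graspable.

```python
from collections import Counter

def also_bought(orders, item, top_n=3):
    """Rank the items most frequently bought alongside `item`.

    A sketch of basket co-occurrence counting; assumes each order is a
    set of item names. No model, no prediction: just counting the past.
    """
    counts = Counter()
    for order in orders:
        if item in order:
            counts.update(other for other in order if other != item)
    return [name for name, _ in counts.most_common(top_n)]

# Hypothetical order baskets.
orders = [
    {"widget", "gadget"},
    {"widget", "gadget", "gizmo"},
    {"widget", "doohickey"},
    {"gadget", "gizmo"},
]
suggestions = also_bought(orders, "widget")  # gadget co-occurs most often
```

The assumptions sit in plain sight: frequent past pairings are offered up as suggestions, nothing more.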
Science is full of cause and effect data. Those who model this data typically want to use some kind of flow model.
I think there is a better way in some cases. I had a long chat with a doctor about the future of health technology and he revealed he was working on a cause and effect database for health conditions. For instance, a patient would report that they had a headache, were losing weight and had loose bowels. Then the database would work out which conditions could produce all three symptoms.
Since we were sharing health tech ideas, I revealed that I was working on something with a similar output but a wildly different method. If this health tech app were adopted by enough hospitals, then after a couple of years I could use frequency data to flag possible conditions.
My system would not predict a diagnosis: it would simply report the frequency of past diagnoses. "80% of patients with these symptoms were ultimately diagnosed with IBS; 15% with bowel cancer; 5% with stomach cancer."
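The reporting step is nothing more than counting matching records. A minimal sketch, with entirely invented records and percentages (they do not match the quoted figures, which were themselves illustrative):

```python
from collections import Counter

def diagnosis_frequencies(records, symptoms):
    """Report historical diagnosis frequencies for patients whose recorded
    symptoms include the given set.

    A sketch only: each record pairs a set of presented symptoms with the
    eventual diagnosis. The output is a statement of past frequencies,
    not a prediction or a diagnosis.
    """
    matches = [diag for syms, diag in records if symptoms <= syms]
    total = len(matches)
    return {diag: count / total
            for diag, count in Counter(matches).most_common()}

# Hypothetical historical records: (presented symptoms, eventual diagnosis).
records = [
    ({"headache", "weight loss", "loose bowels"}, "IBS"),
    ({"headache", "weight loss", "loose bowels"}, "IBS"),
    ({"headache", "weight loss", "loose bowels", "fatigue"}, "IBS"),
    ({"headache", "weight loss", "loose bowels"}, "IBS"),
    ({"headache", "weight loss", "loose bowels"}, "bowel cancer"),
]
freqs = diagnosis_frequencies(records, {"headache", "weight loss", "loose bowels"})
```

Every number in the output is a verifiable fact about past patients, which is precisely what keeps the method honest.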
Identifying cause and effect is a fundamental scientific endeavour. It moves the human race forward. We should in no way replace it with frequency analysis. But wherever there is cause and effect data, predictive analytics can easily be accurate if we focus on frequency analysis. It has the benefit of informing the doctor, not competing with him or her. It is also a statement of historical fact rather than a prediction, and that appeals to my need for accuracy in analytics. I hope it appeals to yours.