We have all heard the “garbage in, garbage out” cliché when it comes to computing and data quality. But there is one aspect of Data Quality that is important yet often overlooked when conducting data analysis: the Currency of Data.
To explain the concept, I will use two examples, together with the impact of data currency in each. Both come from the Risk Management world within Banking.
When individuals apply for a home loan they are assessed to see whether they are suitable for credit: assets, a stable job, good credit history, level of income and expenses, and so on. That assessment feeds into their Credit Rating, which is associated with a Probability of Default (PD); this is a fundamental measure for calculating Credit Risk and, ultimately, the Capital the bank must hold in case of financial difficulty for the institution. Home loans are commonly issued as 30-year contracts. If the Credit Rating is not reviewed regularly within those 30 years, it does not matter how fancy your analytic tools are: you will be calculating your Credit Risk and Capital using data that is not current.
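To make the stakes concrete, here is a minimal sketch of how a stale PD flows into an Expected Loss figure. Expected Loss = PD × LGD × EAD is a standard credit-risk formula; all of the numbers below are hypothetical placeholders, not real bank parameters.

```python
# Hypothetical figures only: a stale Probability of Default (PD)
# understates the expected loss on a single home loan.

def expected_loss(pd_, lgd, ead):
    """Standard credit-risk formula: EL = PD * LGD * EAD."""
    return pd_ * lgd * ead

ead = 400_000              # exposure at default: remaining loan balance
lgd = 0.25                 # loss given default: 25% of exposure is lost

pd_at_origination = 0.01   # rating assigned years ago, never reviewed
pd_current = 0.04          # borrower's circumstances have since worsened

print(f"Stale EL:   ${expected_loss(pd_at_origination, lgd, ead):,.0f}")
print(f"Current EL: ${expected_loss(pd_current, lgd, ead):,.0f}")
```

The same four-fold gap, compounded across millions of loans, is what stale ratings do to a Capital figure.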
Now let’s imagine a large bank with millions of customers holding home loans, credit cards, personal loans, and so on, which could easily represent many billions of dollars in credit products. Of those millions of customers, year after year, how many do you think will be changing jobs (or losing them), increasing their expenses, selling their assets? Perhaps many. Even if all of the customer data were complete and clean, it might not be up to date, and that could distort the bank’s reporting. Sometimes these changes in customer status aren’t noticed until the customer defaults (i.e. can no longer afford repayments), and by then it is already too late in the process.
If calling your customers for a Credit Assessment re-evaluation is difficult for a number of reasons (e.g. inconvenient for them, hard to chase), you could take an Analytics approach; Analytics does magic sometimes. You could simulate people changing jobs, losing income, increasing expenses, and so on, and recalculate their Credit Assessment under each simulated scenario. If a significant number of simulations are run, a realistic approximation of the Credit Risk and Capital figures could be achieved.
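The simulation idea above can be sketched as a tiny Monte Carlo exercise. All the rates and uplifts here are hypothetical, and the shock model (a flat PD bump for a random fraction of customers) is a deliberate simplification of what a bank’s own rating methodology would do.

```python
import random

def simulated_portfolio_pd(n_customers, base_pd, shock_rate, pd_uplift,
                           n_sims, rng):
    """Average portfolio PD when each customer may suffer an income shock.

    In each simulation, every customer keeps base_pd unless a random
    shock (job loss, higher expenses, ...) bumps their PD by pd_uplift.
    """
    sim_means = []
    for _ in range(n_sims):
        total = 0.0
        for _ in range(n_customers):
            pd_ = base_pd
            if rng.random() < shock_rate:
                pd_ += pd_uplift          # downgraded rating after the shock
            total += pd_
        sim_means.append(total / n_customers)
    return sum(sim_means) / n_sims

rng = random.Random(42)
stale_pd = 0.010                          # rating on file, never reviewed
refreshed = simulated_portfolio_pd(
    n_customers=200, base_pd=0.010,
    shock_rate=0.05, pd_uplift=0.030,     # 5% of customers hit, +3pp PD
    n_sims=500, rng=rng)

print(f"Stale portfolio PD:     {stale_pd:.4f}")
print(f"Simulated portfolio PD: {refreshed:.4f}")   # converges to ~0.0115
```

With enough simulations, the result converges to base_pd + shock_rate × pd_uplift, giving a portfolio-level view of the risk the stale ratings are hiding.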
Let’s look at another example, this time on the Market Risk front:
Brilliant quantitative analysts can come up with great pricing or risk models, but those models become misleading if the data that sustains their underlying assumptions is not constantly revised. Such data must be current.
A good example is a Value at Risk calculation, where a set of benchmarks is selected to reduce the number of products and risk factors entering the computation. If those benchmarks are not revised regularly, the Value at Risk model could yield incorrect results. The idea of data as part of a model’s assumptions also applies to Credit Risk, Liquidity Risk and, to a lesser extent, Operational Risk. Data currency is equally vital in ‘Scenario Analysis’.
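A rough sketch of the benchmark problem, using one-day historical-simulation VaR: the return series and the two benchmark volatilities below are invented purely to show how a stale benchmark choice shifts the result, not to model any real market.

```python
import random

def var_95(pnl_history):
    """95% one-day VaR: the loss exceeded on only 5% of historical days."""
    losses = sorted(-p for p in pnl_history)   # losses as positive numbers
    return losses[int(0.95 * len(losses))]     # 95th-percentile loss

rng = random.Random(7)
position = 1_000_000  # dollar exposure mapped to the benchmark

# Daily P&L drawn from a calm (stale) vs a volatile (current) benchmark.
stale_pnl = [position * rng.gauss(0, 0.005) for _ in range(1_000)]
fresh_pnl = [position * rng.gauss(0, 0.015) for _ in range(1_000)]

print(f"VaR with stale benchmark:   ${var_95(stale_pnl):>10,.0f}")
print(f"VaR with current benchmark: ${var_95(fresh_pnl):>10,.0f}")
```

A position still mapped to the calm benchmark reports roughly a third of the VaR it should, which is exactly the kind of error a regular benchmark review would catch.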
As illustrated, it doesn’t take much for obsolete data to accumulate in a repository, leading to inaccuracies in the Analytics performed downstream. The reality is that we can only assume our data is correct and reliable if we have enough data controls in place. Which is a big if.