In November, the European Banking Authority (EBA) issued a Discussion Paper on Machine Learning for IRB Models. As described therein, the aim of the Discussion Paper ‘is to understand the challenges and opportunities coming from the world of machine learning (ML) should they be applied in the context of internal ratings-based (IRB) models to calculate regulatory capital for credit risk.’ As I explore in my new book Driverless Finance, authorizing machine learning algorithms for use in IRB models is a very bad idea. While machine learning may have some useful internal applications for banks assessing credit risks, a bank’s regulatory capital requirements should not be tied to machine learning models.

The 2008 crisis revealed significant flaws in the IRB approach to assessing regulatory capital requirements. Perhaps the most famous and elegant criticism can be found in Andrew Haldane’s ‘The Dog and the Frisbee’ speech, which observed that ‘[h]owever well they perform in theory or in sample, complex capital rules do not appear to have performed well in practice and out-of-sample.’ Instead, Haldane argued that ‘[i]n complex environments, decision rules based on one, or a few, good reasons can trump sophisticated alternatives. Less may be more.’ He recommended that we ‘take a more sceptical view of the role and robustness of internal risk models in the regulatory framework’, imposing strict limits or floors on model outputs.

Haldane’s speech was given in 2012, before most banks were seriously contemplating using machine learning to assess risks. However, its critique of complexity-in-the-face-of-uncertainty is even more trenchant in the context of these machine learning models. Even if regulators won’t follow Haldane’s general admonition to put the genie back in the bottle by limiting reliance on IRB models, they should heed his warnings about letting another, even more complex genie out of the bottle. Unfortunately, despite identifying significant complexity and ‘explainability’ problems as key hurdles to incorporating machine learning-driven models into the IRB framework, the EBA nonetheless seems primed to adopt ‘a set of principle-based recommendations which should ensure an appropriate use of such techniques by institutions in the context of IRB models.’

While I have previously argued that using principles-based regulation to address new technology sometimes makes sense, there is a better and simpler alternative in this case—just say no. While machine learning analyzing big data may increase predictive capacity in some circumstances, that can only work if there is enough big data for the machine learning algorithm to learn from. As Rama Cont, the chair of mathematical finance at Imperial College London, said in 2017, ‘[w]e are not in a big data situation really. The only situation where we are really strong with data is consumer loans, credit cards and so on. We only have one market history, so is the pattern which led to Lehman the same which leads to the fall of bank X the next time?’

Like previous generations of technology used in IRB models, machine learning algorithms still learn by assessing probabilities. This means that, by nature, they tend to neglect low-probability events. But low-probability, high-consequence tail events are the very things capital regimes are supposed to prepare for, and we simply do not have enough real-world data on financial system failures to train these algorithms to prepare for them.

Not only are machine learning algorithms unlikely to be sufficiently predictive to justify their inclusion in IRB models, their complexity and inscrutability will also invite new types of bias and regulatory arbitrage into bank capital regulation. Training machine learning algorithms is not a passive process: the selection of algorithm and data and the labeling, tuning, and testing processes are all fertile ground for humans to tweak the operation of the algorithm. Notwithstanding this human involvement, though, answers generated by computers are often perceived as having in-built legitimacy, and so regulators may take the mental shortcut of deferring to them unquestioningly (this is known as ‘automation bias’). The more complex and automated the technology, the more likely regulators are to defer to its outputs, and so regulated banks can exploit automation bias by selecting the most complex and least explainable forms of machine learning algorithms. This can hide from regulators the many opportunities that a bank has during the selection, testing, and training processes to magnify machine learning algorithms’ tendencies to ignore tail events, underestimate risk-weightings, and ultimately minimize the bank’s capital requirements. All of these problems will be exacerbated if banks using machine learning in their IRB models buy their algorithms and/or data from the same vendors. Then they will all underestimate tail risks in the same ways, which could create havoc in the event of a future crisis.

Many people see enhanced ‘explainability’ as a solution to machine learning’s problems of complexity and inscrutability. In its Discussion Paper, the EBA recommends that ‘institutions find an appropriate balance between model performance and explainability of the results’, simplifying the model where possible. The EBA’s notion that sometimes it’s appropriate to sacrifice performance for the goal of explainability makes sense, but only to the extent that explainability is actually helpful for bank regulators’ specific purposes. If our goal is to determine how a specific decision was made after the fact (for example, scrutinizing a bank’s lending decision to see if it was based on discriminatory data points), then explainability is helpful. However, explainability doesn’t tell us much about how the model as a whole works or how it is likely to make decisions in an uncertain future, which is what regulators really need to know when assessing IRB models.

During the financial crisis of 2008, IRB models showed themselves to be highly flawed, and while machine learning may remedy some of those flaws, there are many flaws it will not address, and others that it will exacerbate. The EBA, and other financial regulators around the world, should heed the cautions offered by Haldane and others to resist the march towards greater complexity in capital regulation. Machine learning has no place in IRB models. As Haldane concludes in ‘The Dog and the Frisbee’, ‘[a]s you do not fight fire with fire, you do not fight complexity with complexity.’

Hilary J. Allen is a Professor of Law at the American University Washington College of Law, and author of the new book Driverless Finance: Fintech’s Impact on Financial Stability.