Commercial vendors of LegalTech products and legal scholars in academic research labs are actively attempting to infuse Artificial Intelligence (AI) into the practice of law. A dizzying array of new AI-enabled legal systems and state-of-the-art prototypes is regularly touted to the public, but separating the wheat from the chaff is arduous.

Questions arise such as how much intelligent-like behaviour has been achieved, whether the latest system is an improvement over prior instances, and if so, how large the advancement is. Usually, the wordy narratives describing the added features and functions are chock-full of technological buzzwords and fail to indicate the actual calibre or degree of advancement.

A key reason for this difficulty is the ambiguity of the AI moniker itself, a broad and vague umbrella term that lacks any substantive demarcation of what the advanced automation constitutes. What is needed to rectify this ambiguity is a numeric scale, akin to the Richter scale, that denotes the level of AI that has been infused into a legal system. A definitive and standardized set of Levels of Autonomy (LoA) for AI-powered legal reasoning systems would provide a succinct and rigorous means of denoting the capabilities attained. In short, the everyday use of a universal scale would aid in unravelling and rationalizing the claims made by LegalTech vendors, giving a no-malarkey indication of what the latest wares actually achieve.

In my recent open-access article, I propose an LoA framework that identifies seven core levels of autonomy for AI applied to legal reasoning. The set is modelled on an established standard that demarcates the levels of autonomy for self-driving cars. Doing so leverages lessons learned about how best to differentiate AI autonomy and provides a substantive foundation for recasting the realm of AI and the law.

Here are the seven proposed levels of autonomy associated with AI and the law:

  • Level 0: No Automation for AI Legal Reasoning 
  • Level 1: Simple Assistance Automation for AI Legal Reasoning 
  • Level 2: Advanced Assistance Automation for AI Legal Reasoning 
  • Level 3: Semi-Autonomous Automation for AI Legal Reasoning 
  • Level 4: Domain Autonomous for AI Legal Reasoning 
  • Level 5: Fully Autonomous for AI Legal Reasoning 
  • Level 6: Superhuman Autonomous for AI Legal Reasoning

Consider two brief examples of how this scale can be put to use.

A vendor comes out with a boosted e-Discovery tool that claims to have Natural Language Processing (NLP) and utilizes Machine Learning, which seems impressive. But what level does this attain? 

Envision that, once appropriately rated, the e-Discovery tool is classified as being at Level 2. It is therefore considered advanced assistive automation rather than an autonomous capacity. Meanwhile, suppose that a competing vendor has an e-Discovery tool that is rated at Level 3. One can readily conclude that the Level 3 product offers a greater degree of autonomy than the Level 2 offering. Thus, the use of this pragmatic scale creates a level playing field and facilitates head-to-head comparison.
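To make the comparison concrete, here is a minimal sketch of how the seven levels might be represented as an ordered scale in code. The level numbers and names come from the list above; the class name, the helper functions, and the two vendor tool names are hypothetical and serve only as illustration. It is a sketch of the idea, not a rating method.

```python
from enum import IntEnum


class LegalReasoningLoA(IntEnum):
    """The seven proposed levels; IntEnum makes them directly comparable."""
    NO_AUTOMATION = 0
    SIMPLE_ASSISTANCE = 1
    ADVANCED_ASSISTANCE = 2
    SEMI_AUTONOMOUS = 3
    DOMAIN_AUTONOMOUS = 4
    FULLY_AUTONOMOUS = 5
    SUPERHUMAN_AUTONOMOUS = 6


def is_assistance_only(level: LegalReasoningLoA) -> bool:
    """True for the two levels whose names describe assistance automation."""
    return level in (
        LegalReasoningLoA.SIMPLE_ASSISTANCE,
        LegalReasoningLoA.ADVANCED_ASSISTANCE,
    )


def highest_rated(ratings: dict[str, LegalReasoningLoA]) -> str:
    """Return the name of the product with the highest level of autonomy."""
    return max(ratings, key=ratings.get)


# Hypothetical ratings for two competing e-Discovery tools.
ratings = {
    "Vendor A e-Discovery": LegalReasoningLoA.ADVANCED_ASSISTANCE,  # Level 2
    "Vendor B e-Discovery": LegalReasoningLoA.SEMI_AUTONOMOUS,      # Level 3
}

for name, level in ratings.items():
    print(f"{name}: Level {level.value} ({level.name})")

print("More autonomous:", highest_rated(ratings))
```

Because the levels are ordered integers, a head-to-head comparison reduces to an ordinary greater-than check, which is the property that makes a shared scale useful in the first place.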

The same benefit can be realized in the legal research sphere. Suppose a legal scholar argues that AI legal reasoning algorithms are weak at identifying suitable sentencing recommendations. That might be a valid concern, though it could be based on studying only, say, Level 1 systems and would therefore provide a narrow perspective. Other researchers might misconstrue the result and assume that all AI-based legal reasoning systems are equally deficient, when in fact Level 2 and Level 3 systems might be more robust and have overcome the identified weakness. Researchers could use the scale in their own legal research efforts, including applying it to other research results to uncover hidden assumptions.

All told, a measuring scale of this nature can be applied both to the day-to-day world of business and the law and to legal scholarship.

Furthermore, this scale for AI legal reasoning serves as a catalyst for engaging in a timely and vital dialogue about how to best rate or assess the emerging plethora of AI-enabled legal applications. Businesses that are gradually and inevitably going to be adopting these systems will need a convenient and apt means to compare and contrast competing products. Academics also are in need of a robust method for assessing how far along the advances in AI legal reasoning capacities have progressed.

Management scholar Peter Drucker opined that you cannot suitably manage that which you are not measuring. For those in business law, having a set of autonomy levels for legal reasoning can provide a substantive benefit in winnowing the LegalTech wheat from the chaff.

 

Lance Eliot is Chief AI Scientist at Techbrium Inc. and a Stanford Fellow at Stanford University CodeX Centre for Legal Informatics.