Can Lexis+ AI pass 1L year? An experiment.
This fall, Lexis+ AI will be attempting final exams in Torts, Contracts, Civil Procedure, Property, Legal Research, Professional Responsibility, and Environmental Law alongside the law students at the University of Wyoming College of Law. The catch? The professors will not know, as they grade, which exam answers were generated by Lexis+ AI. The AI’s answers will be graded blindly. How do you think it’ll stack up?
Is AI going to take my job and then take over the world?
For me, the popular large language models, like ChatGPT, are general-use companions for creative writing tasks. But that’s the limit of their usefulness. These AI platforms severely underperform when it comes to research, especially legal research. They hallucinate answers and fabricate law. There are minimal guardrails around legal questions, too. They aren’t trained to tell a user when a problem requires consulting an attorney.
Lexis’ new AI tool changes the legal research game. Lexis+ AI is actually good at identifying the relevant federal or state statutory authority and appropriate case law. And it should be, considering that it is trained on Lexis’ database full of cases, statutes, regulations, and more. It answers procedural and jurisdictional questions. It finds the statutory authority that gives a plaintiff standing and the rules for statutes of limitations and filings. In my brief time working with Lexis+ AI, it has never answered with hallucinations, though it does make up its own jurisdiction occasionally. Its answers include case law from as recently as October 2023. It has been impressively accurate when the questions are specific and the answers have clear or implied legal authority. In sum, it is better at research than many of my students.
Maybe the bar for human legal research will be raised because of the accuracy of AI legal research, and everyone will be more proficient at research. Or maybe everyone will get worse at legal research because they will rely on AI. There are unlimited future scenarios. However, there are many parts of my job that I would love to let robots do and never do again myself. At a minimum, robots should be doing my legal citations, or bluebooking.
Legal research AI is a sophisticated finding aid, similar to an index, table of contents, or libguide. It will increase information accessibility. It will be a companion to anyone answering legal questions. It will help legal professionals become more efficient and productive, hopefully increasing access to justice on a wide scale. But the transition into that ideal future will have vast challenges.
Using AI is a Skill
Over the past year, Chat and I have conversed at length about many of my interests. For those who do not pretend to be friends with AI, Chat is short for ChatGPT. Chat and I are well acquainted. I am getting skilled at typing my queries in such a way that Chat responds with acceptable answers. And by that I mean, I am getting better at prompting.
AI prompting communicates key background information and your expectations. To receive accurate or useful answers, I need to carefully craft prompts. I can’t type whatever I want and expect the AI to read my mind for context. For Lexis+ AI, prompting includes jurisdiction, areas of law, material facts, and timeline. You need to specify your desired output, like a memo, contract, or research. Simple and concise language is best. You can continue to refine the answer with additional prompts, but after about five prompts on the same query it is best to start a new conversation. The AI can get a little confused after that many prompts and spit out nonsense. If you ask it multiple questions in one query, it fails to answer one or both of them thoroughly. Lexis+ AI’s limited context length (its ability to remember me and my preferences) is also a potential drawback. Lexis+ AI does not seem to understand priming. If you ask it not to respond, it will still respond. This all might be related to the context length limitation.
Lexis+ AI likes it when you use lawyer words, like mens rea, perfected security interest, statute of limitations, and retainer agreement. Lexis+ AI was trained on data written by legal professionals; it’s best to interact with it similarly. Also, this specialized vocabulary communicates important specifics within legal questions.
For example, a horrible prompt is: Is it illegal for my landlord to kick me out of my apartment?
AI might respond with: Not necessarily. It would be wise to consult a legal professional (guardrail language). Here are some New Jersey statutes on landlord-tenant law (see statutory authority).
A much better prompt: Draft an intent to sue letter to a landlord from a tenant in New York with statutory authority. The tenant continues to be in possession of the premises after the expiration of his term and continues paying rent. However, today the landlord smoke bombed the premises for bed bugs, which resulted in the death of the tenant’s pets, for which the tenant is seeking damages. Make the intent to sue letter extremely aggressive.
AI will do much better at answering this type of prompt. I am curious and excited to see what AI does with the final exam hypotheticals. For the sake of my experiment, should I rephrase my prompts to make them more AI friendly?
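For readers who like to see the pattern spelled out, the prompt elements above (task, jurisdiction, area of law, material facts, desired output) can be sketched as a small template builder. This is purely illustrative Python of my own invention, not any actual Lexis+ AI interface, and the function and field names are hypothetical:

```python
def build_legal_prompt(task, jurisdiction, area_of_law, facts, output_format):
    """Assemble a legal research prompt from the elements discussed above:
    the task, jurisdiction, area of law, material facts, and desired output."""
    parts = [
        f"Task: {task}",
        f"Jurisdiction: {jurisdiction}",
        f"Area of law: {area_of_law}",
        f"Material facts: {facts}",
        f"Desired output: {output_format}",
    ]
    return "\n".join(parts)

# Roughly the landlord-tenant example from above, broken into its elements.
prompt = build_legal_prompt(
    task="Draft an intent to sue letter to a landlord with statutory authority",
    jurisdiction="New York",
    area_of_law="Landlord-tenant",
    facts=(
        "Holdover tenant in possession after expiration of the term, still "
        "paying rent; landlord's bed bug smoke bombing killed the tenant's "
        "pets, and the tenant seeks damages"
    ),
    output_format="An extremely aggressive intent to sue letter",
)
print(prompt)
```

However you assemble it, the point is the same: the model answers best when each of these elements is stated explicitly rather than left for it to infer.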
Diagnostic Assessment for AI
The most obvious diagnostic assessment for AI to me, an academic law librarian, is to test it in the same way we test everyone: a law school exam. I remember reading that ChatGPT passed the bar exam. Maybe, depending upon the jurisdiction, it (he?) should have passed law school first! AI should certainly take the MPRE too if we’re being thorough.
But what other ways are there to assess? Self-assessments and real-world simulations are out of the question for AI. Passing 1L year is at least a good benchmark for AI achievement. But there are so many unanswered questions that come to mind next. Is it fair for AI and law students to be ranked against each other? What are 1L exams even measuring in the first place?
I think before I can create an AI diagnostic assessment, I need to answer a threshold question: What purpose and role will AI serve in the legal world? I can then measure AI’s ability to serve in that capacity. However, this threshold question feels unanswerable now.
For now, my experiment will be to test whether AI can pass law school. On each exam day, I will take the same exam as all the other students, but I will submit only answers generated by Lexis+ AI or ChatGPT. AI will have the same time constraints. The essay questions will be graded the same way as those of all the other students.
Are you interested in the results? Let me know below! I will keep you updated in 2024 with the results.