The Stack Overflow Podcast

Legal advice from an AI is illegal

Episode Summary

Mark Doble, CEO of Alexi, an AI-powered litigation platform, joins Ben to talk about GenAI’s transformative effect on the legal world. Their conversation touches on the importance of ensuring accurate results and eliminating hallucinations when AI tools are used for legal work, how lawyers (like the rest of us) can adapt to GenAI, and what Alexi’s tech stack looks like.

Episode Notes

Alexi leverages AI to streamline litigation workflows and speed up research, with an eye to giving lawyers more time and energy to devote to client strategy and support. 

Find Mark on LinkedIn

Shoutout to Stack Overflow user ycr for dropping some knowledge in our CI/CD Collective: How to get the BUILD_USER in Jenkins when a human rebuilds a job triggered by timer?

Here’s a quick preview of the episode:

“The founding thesis was, let’s try and build an AI that knows the law. And if we do that, there'll be lots of applications throughout the legal field. We knew that these foundational models, the underlying technology, were going to continue to improve and allow us to do more and more.” 

“I mean, law is one of the fields where it seems like these large language models could have the most utility, because often what you're doing is taking on a case with potentially an enormous amount of case law that you need to search through to find the needle in a haystack that will help you and/or enormous amount of documents that you need to search through. And so a system that's capable of understanding, synthesizing, and annotating and pointing you to the ground truth is incredibly valuable.”

“It's not supposed to give legal advice if it doesn't have the licensure and the insurance.”

“Part of the problem is we have these laws that are just not being enforced at all. And so either the laws have to change or they need to start getting enforced.”

“We realized that if we have almost 100% recall in the top 5,000 documents, why don't we just apply some sort of agentic flow to filter down from these 5,000 to the 10 documents that were really needed?”

Episode Transcription

[intro music plays]

Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, Director of Content here at Stack Overflow. Today, I have an episode for you about the way in which generative AI is changing the legal industry, what it means for lawyers, and what it means for what is a very big business across many, many countries and across the globe. I have an interesting relationship with this question: I have a friend who's a working trial lawyer who has very strong opinions on AI. We've argued about it quite a bit over the years, and I'll share some of those tales with you. Our guest today is Mark Doble, who is the CEO over at Alexi and brings some strong opinions on lawyers needing to embrace generative AI or be left behind. So Mark, welcome to the Stack Overflow Podcast. 

Mark Doble Thank you for having me, Ben. 

BP So this podcast is kind of aimed at the world of software developers, engineers, CIOs, CTOs. Do you yourself have a technical background at all or just coming from the world of law?

MD I studied science and philosophy in my undergrad. I came to it a little bit later, but it was during my undergrad that I fell in love with programming, and during law school I spent about half my time building stuff and the other half doing the readings.

BP Nice. So were you building apps for yourself? What kind of tech stack were you using back then? 

MD I started as a Rails programmer. It was a great entry point for me, and really it started as a huge creative outlet– the power of code to create and build things, and that's what really drew me in. 

BP Rails is a framework that a lot of people love, one that some startups committed to because it was the most popular thing at the moment, and they've stuck with it, maybe without regretting it. Although these days, I think it's difficult to hire Rails developers because not a lot of new people are picking it up, unfortunately. But I was a tech reporter from 2010 to 2018, and lots of folks at the time were deep in that world. All right, so you've got some background in the world of programming, learning that and dabbling, but you're also getting your JD in Canada. Is that any different from in the United States? In the United States, typically you go to undergrad, then you go to law school, then you pass the bar exam, then you're admitted to practice. Is that how it works in Canada as well? 

MD The exact same. The only difference is that we have a three-year law school and then a year of practical training, similar to what a clerkship would be, but they also happen within law firms.

BP And so did you spend any time at a law firm? I checked your LinkedIn, it looks like you went from university right to Alexi, so is there a gap there? You did one year of practice or, what do I want to call it, not interning, but– 

MD It's called articling. In Canada, it's called articling. So like clerkship, articling. So I did that and then a little bit of practice at the same firm. I got just enough to know what the practice of law was like, and in particular, litigation, which is what I was most interested in. Enough to know what it was like, not enough to constrain my thinking in any way. I think I was still able to think outside the box, and so I had just enough of it, I think, to allow me to innovate. 

BP Okay, so CEO of Alexi starting in 2017. Were you also a founder at this company? 

MD Yeah, that's right. 

BP Okay, so 2017, let's jump back in time: nobody's talking about generative AI, no GPT system of any kind had shown off the ability to reason over language. In 2017, 2018, transformers are just being invented. Deep learning is showcasing some amazing skills, it's beating people at Go, it's beating people at poker. I used to write about it all the time, and the thing I always used to say was that it's amazing, and a little bit startling, that it can beat people at these games that experts told us it would never win, but think about it like Deep Blue and Garry Kasparov. AlphaGo is amazing at Go, but it cannot play chess. It cannot write a poem. It's brittle, it's narrow, it's deep in one direction. In 2017, what was your thought? “Hey, I enjoy law. I've gotten a taste of it, but really I want to do technology.” What was your company predicated on at the founding?

MD Well, it was abundantly clear that the rate of improvement in AI was so fast. Every year we were clearly on this exponential curve of improvement, and it was right around that 2017/2018 time when the first transformer papers came out and models were able to predict the next word. There seemed to be a very credible path towards full narrative drafting of text, and that was super exciting, and it was clear the potential in the legal field was significant. There was still a huge gap at that time, obviously, and still in some ways today, between what the technology could do and what lawyers needed, these industry-grade requirements. And so really the founding thesis was, “Let's try and build an AI that knows the law, and if we do that, there'll be lots of applications throughout the legal field.” We knew that these foundational models, the underlying technology, were going to continue to improve and allow us to do more and more. We were a bit early in 2017, and it wasn't really until 2020 when we actually raised the first round of capital and got the company going. We had some customers, we had some idea of what the initial products needed to look like, and we were getting the initial technical and product teams together. But it really wasn't until 2020 that we started to have more of an impact. 

BP All right, well this is interesting maybe for some of the developers and hopeful entrepreneurs listening. How'd you survive those three years in the wilderness? Were other people working other jobs? Do you have some friends and family funds? It sounds like you have a few customers, you're iterating on ideas, you've got pilots going, but you haven't latched onto that product market fit, so what were you doing? 

MD We scraped together, I think it was maybe $50,000, and I was building the entire product myself. 

BP In Rails. 

MD In Rails, exactly. And there were some NLP components at the time, but there was no application of large language models or anything yet. That came shortly after, but even then there was some sort of NLP stuff that we could do. We could do some sort of retrieval, some basic search to help get a product running, and we began in legal research. So we had a Python back end that I was still maintaining, a Rails product. Horrible decisions were being made early on from a technical architecture perspective, but that's what we were doing, and we were getting some revenue, getting some customers. It was pretty tough to raise any sort of money at that time. We went through Techstars, and that got us another couple hundred thousand dollars in equity, and then we were trying to validate a little bit more. We made the mistake of mainly starting in Canada with customers, and now about half of our customers are Canadian, half are US. The US is growing a lot more than the Canadian side, but really getting those commercialization proof points in the US was critical, and we waited far too long to do that.

BP Gotcha. All right, well I won't make any comparisons between Canada and the US. That's not my bag, I'll let you do that. It's interesting that you said there were some NLP libraries or connections or ways you could tap into that. Now obviously, we live in a world where you often go in and they say, “You have your choice of five foundation models, and you have your choice of three sizes of each. Pick the one that suits you for speed versus inference cost versus X, Y, and Z.” But let's focus in a little bit on that turning point. Once this technology started to mature, once we saw the sort of explosion of its intelligence and capabilities in November of 2022, what did you hone in on? What is the sort of key product offering you have now to customers, and where do you see the most traction? 

MD Well, it was really interesting for us. So even in 2020 when we had our first real product that was working, it was essentially a RAG implementation. We would do retrieval, but then apply an extractive summarization model. So we would retrieve primary law, case law, and legislation that was responsive to a question. We had retrieval models that were pretty good, nowhere near what they are today, but pretty good at identifying relevant cases that were responsive to questions, and then we would use extractive summarization models to present an answer in an essay or memo-like format, similar to what you would get from ChatGPT today, although these were 10-20 page documents. And even at the time, it was clear to us that a huge chunk of general, open-domain search could shift to this user experience: instead of getting a list of links that the user then has to go and review, it was way easier to just summarize the content in each of those links and present it to the user as a coherent answer. 
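The retrieve-then-extract flow Mark describes can be sketched roughly as follows. This is a toy illustration only: the word-overlap scorer and every name here are hypothetical stand-ins for the trained retrieval and extractive summarization models, not Alexi's actual code.

```python
# Toy sketch of extractive RAG: retrieve documents, then build an answer by
# *extracting* the most responsive sentences rather than generating new text.

def score(query, sentence):
    """Toy relevance score: fraction of query words found in the sentence."""
    q = set(query.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / max(len(q), 1)

def extractive_summary(query, documents, top_k=3):
    """Rank all sentences across retrieved documents and keep the top_k."""
    sentences = [s.strip() for doc in documents for s in doc.split(".") if s.strip()]
    ranked = sorted(sentences, key=lambda s: score(query, s), reverse=True)
    return ". ".join(ranked[:top_k]) + "."

docs = [
    "The limitation period for contract claims is two years. Courts apply it strictly.",
    "Negligence requires a duty of care. The limitation period may be extended for minors.",
]
print(extractive_summary("what is the limitation period for contract claims", docs, top_k=2))
```

The key property, as in the memos Mark describes, is that every sentence in the output is copied verbatim from a retrieved source, so the answer cannot cite anything that is not in the corpus.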

BP Law is one of the fields where it seems like these large language models could have the most utility, because often what you're doing is taking on a case with potentially an enormous amount of case law that you need to search through to find the needle in a haystack that will help you, and/or an enormous number of documents that you need to search through. And so a system that's capable of understanding, synthesizing, and then not just that, but annotating and pointing you to the ground truth, is incredibly valuable. For so many people in their first X years at a large law firm, their job is to sit in a room with 50 cardboard boxes and go through all the documentation that opposing counsel sent over to see if they can find things that will be critical, or to read through 50 years of case law in a new state, a new territory, a new jurisdiction, to see if there's something practical that can be applied when they don't feel they understand the argument they're going to win on based on the simple facts. So is that where your application was first applied: to case law or to documentation? 

MD It was research. So primary law, which includes case law, legislation, regulation, making sense of this huge corpus of content which serves as the source of law. Where do we find what the law is, and it's these primary sources of law. So really it was questions about what the law is and using AI to dig through all of this corpus of content to surface an answer. And so at that turning point, certainly when ChatGPT was launched and we had access to GPT-3.5 APIs, it was certainly a significant turning point in a number of respects in how we thought about the technology, and it wasn't until probably another six months after the launch when RAG became much more mainstream and it was like, “Oh, this is actually what we've been doing for the past two years now.” 

BP Did you call it RAG? 

MD We didn't. We had no idea that that's what it was, but that's exactly what we were doing. And the big change really was that we had these Siamese encoder models, these retrieval models. We had generative models that would generate questions on case law, and then we would train these Siamese encoder models to do retrieval really well. On these generated questions on case law, we'd have 20-30 million question/answer pairs to train this retrieval model. And we spent a lot of time, we had a team of very smart AI scientists doing this work to get these recall scores higher, fighting for every percentage point improvement in recall. And then we realized that if we have almost 100% recall in the top 5,000 documents, why don't we just apply some sort of agentic flow to filter down from these 5,000 to the 10 documents that were really needed? So instead of spending months trying to improve the recall score through further training of this retrieval model, if the latency tolerance is high enough, you just make another 100 LLM calls to validate and you improve. Let's not spend any more time trying to improve this retrieval model, let's just add some agentic element. We didn't even call it an agentic element at the time either. It was just like, “Let's just add a few more LLM validation calls here to improve the quality. Does this case actually respond to this question or no? If no, okay, get it out. Let's go on to the next one.” And so it totally transformed how we even thought about retrieval. There's an initial retrieval that's required, but after that, let's leverage the power of these models to validate outputs. And then it turned into all these things that we know of now: scaffolding, chain of thought, all of these concepts that emerged naturally through the inherent power of the technology. I think they're quite obvious concepts to people now. 
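The filtering step he describes, going from a high-recall candidate set down to the handful of truly responsive documents via extra per-document validation calls, might look roughly like this sketch. `llm_is_responsive` is a hypothetical stub standing in for a real LLM API call; nothing here is Alexi's actual implementation.

```python
# Sketch of the "agentic" filter: spend one extra LLM validation call per
# high-recall candidate (a stand-in for the top 5,000 documents), keeping
# only documents the model confirms are responsive to the question.

def llm_is_responsive(question: str, document: str) -> bool:
    """Stub validator: a real system would prompt an LLM with the question
    and the document and parse a yes/no answer from its reply."""
    return any(word in document.lower() for word in question.lower().split())

def agentic_filter(question, candidates, keep=10):
    """Validate each candidate with an extra call, stopping once `keep`
    confirmed documents are found (trading latency for precision)."""
    confirmed = []
    for doc in candidates:  # one validation call per candidate
        if llm_is_responsive(question, doc):
            confirmed.append(doc)
        if len(confirmed) == keep:
            break
    return confirmed

candidates = ["statute of frauds overview", "recipe for soup", "frauds in contract law"]
print(agentic_filter("frauds", candidates, keep=10))
```

The design tradeoff is exactly the one Mark names: rather than months of retraining to squeeze out another point of recall, you accept a few extra seconds of latency and let cheap validation calls raise precision.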

BP So for a Siamese encoder, just so folks understand, you're trying to ask it, “Is there something within legal text we need, a precedent or within this document, that matches what we're looking for, that establishes there's some case law here where a ruling used the same language or had the same set of facts?” Is that what we're talking about? 

MD Exactly. So basically, if you look at questions about primary law, about case law, and then the passages that answer that, if you look at the vector representations of these two cases, they actually live in fairly different worlds. And so you train these Siamese encoder models to map from question vectors to answer vectors, and so it brings them together so that when you have a question coming in, you're not looking really for passages that have the exact same words, because they rarely do. It's different words that are used to answer the questions. And so these are the– still, I believe– state of the art retrieval models today, but it's just far more powerful to really leverage the LLMs. They are large language models, just without the generative head. 
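The bi-encoder idea Mark describes can be sketched like this, with hand-made toy vectors standing in for learned embeddings. The vectors, passages, and `cosine` helper are illustrative assumptions, not the actual models.

```python
import math

# Sketch of Siamese (bi-encoder) retrieval: questions and answer passages
# are mapped into a shared vector space, so retrieval becomes nearest-
# neighbor search by cosine similarity even when the question and the
# answering passage share few words. Real systems learn these embeddings
# from millions of question/answer pairs; the vectors below are toy values.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" a trained question encoder / passage encoder might emit.
question_vec = [0.9, 0.1, 0.0]
passages = {
    "the limitation period for contract claims is two years": [0.8, 0.2, 0.1],
    "negligence requires a duty of care": [0.1, 0.1, 0.9],
}

# Retrieve the passage whose vector lies closest to the question's.
best = max(passages, key=lambda p: cosine(question_vec, passages[p]))
print(best)
```

Training pulls question vectors toward the vectors of passages that answer them, which is the "mapping from question vectors to answer vectors" he describes.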

BP Gotcha. And so you mentioned that it became possible to start deploying these agentic workflows. It became possible to, let's not call it brute force, but sometimes say, “Look, we don't need to improve the capability anymore. We can now work with a certain level of quantity instead.” And was there also something different that you found in terms of the capabilities once we got to GPT-3.5 and above where you said, “Okay, there's different applications that are now possible given the sort of increased ability to reason over language.”

MD Oh, yeah. Immense. It certainly wasn't immediate; it's taken a fair bit of time. Even maybe about a year ago, it wasn't clear whether we needed to do some of this foundation model training ourselves. Is this where our company is going to have any sort of proprietary advantage going forward? Do we need to do it just to have some differentiated moat in the market? It took us quite a bit of time to figure out what we needed to do and what we didn't. And I think where we're at today mirrors that earlier question: do we spend time improving this retrieval model, or do we just implement this agentic concept of, “Well, let's just make another 100 LLM calls, assuming the latency tolerance is there and the user is okay waiting a couple of minutes for a memo to get an answer”? This approach, which we're heavily leaning into, these concepts of chaining potentially hundreds of LLM calls together for a single task or function for a user, along with scaffolding, having models validate the output of other models, and then tool use, being able to go and run Google searches to find content that you might have missed, all of these things now are just incredibly powerful and have completely opened up the possibility of what we can build for our users. We do not need to be doing any sort of pre-training, and even fine-tuning is not really a thing we do very much anymore. Maybe if we are trying to reduce latency on a very narrowly defined task we might do some fine-tuning, but by and large, let's just put all of these pieces together to build incredible value for our customers. 

BP That's interesting. I would have initially assumed that you see what's happening and you think, “Oh, gosh. We're going to have to build our own sort of legal-specific foundation model,” and yada, yada, yada. But then you say, “Okay, great. These capabilities at the foundation level are almost becoming commodified. There are open source versions of them, they're getting cheaper, and the tokens you're allowed to use are increasing. Okay, maybe we'll do fine-tuning. Then, not even that: this is good enough on its own. What we need to do is figure out the practical applications for our clients and box that in, almost productizing what's happening.” So let's dive into that. Now that you have these capabilities, what are the areas where you find the most traction? What are clients doing with this day to day? Actually, let's level set for a second. For people who don't know, how prevalent is this kind of technology in the legal world? And for your company, how many clients do you have actively using it? Then we'll dive into what they're doing with it day to day. 

MD There are several companies out there doing something similar. I think legal research is a very difficult problem, and there are very well funded startups. There's the Thomson Reuters of the world too, which famously acquired Casetext shortly after GPT-4 was launched. Casetext had early access and built CoCounsel, which has been integrated into the Thomson Reuters products. But what we see from our customers and in the market is that, with these generative models, I think the real problem in research got obscured, which is retrieval. People thought that generative models solved all the problems for legal research. They certainly didn't. You still had to retrieve the right sources that answered questions and present them in the right way, which you can do with retrieval models, but there are still famous examples from LexisNexis, a huge company in the legal research space, of citing cases that were published in the year 2025, which obviously don't exist. And so this hallucination problem has long been solved for us; we do not have even a tiny percent of hallucination issues in our products. So I think it's been largely overstated how easy it is to do legal research really well with these technologies, and we haven't seen another company who does it anywhere near as well as we do. This is certainly not our long term advantage. It's still kind of our wedge into the litigation space. And now we're applying similar approaches, similar technologies to things that you talked about earlier, really understanding the evidence of the case. There are a lot of cases that have thousands, tens of thousands, if not hundreds of thousands of documents of significant length that lawyers need to sift through to understand what might be relevant, how to plan the case, how to strategize, what legal issues we really should focus on, all of these things. 
So it's the facts and the law that we combine, applying the technology to help lawyers understand both.

BP So let's get back to that claim you just made about hallucination. In what way are you able to completely avoid that? You said that you're not really using the generative side of these things, you're only using them in sort of a search and recall kind of way? 

MD So the way we think about it is: are we using the intelligence or the knowledge built into the model? The more we rely on that, the more likely we are to get these hallucinations. And we broadly define hallucinations as provable errors whose truth is asserted with confidence by these models. So if the model says, “Yes, this is definitely true,” but it turns out to be false, that would be a hallucination. If instead the model says, “This might be true, we're not totally confident,” and it turned out to be false, we probably wouldn't call that a hallucination, because maybe somebody needs to go verify it and nobody has done that, and so it wouldn't necessarily be wrong to say that it might be true. So really it's the combination of asserting something is true with high confidence that makes a hallucination. And definitely, referencing cases that don't exist would be a hallucination. The way we get around this is, one, relying on the model's built-in knowledge, which is incredible in many ways, only when we can be confident that it's going to be accurate, and two, giving the model room to say, “I don't know the answer to this.” If you force the model to answer a question, it's going to make something up. If you give it room to say, “I'm actually not that confident,” then it's far less likely to make things up. 
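The abstention idea can be sketched as a simple confidence gate. `model_answer` below is a hypothetical stub, and the threshold is an arbitrary illustrative value; in a real system the answer and its confidence would come from LLM and validator calls.

```python
# Sketch of abstention: a confident-but-wrong answer counts as a
# hallucination, so the system prefers "I don't know" to a forced guess.

def model_answer(question: str):
    """Stub returning (answer, confidence). A real system would call an
    LLM here and derive confidence from validation calls."""
    known = {"limitation period for contract claims": ("two years", 0.95)}
    return known.get(question, ("", 0.2))

def answer_or_abstain(question, threshold=0.8):
    """Only assert an answer when confidence clears the threshold."""
    answer, confidence = model_answer(question)
    if confidence < threshold:
        return "I'm not confident enough to answer this."
    return answer

print(answer_or_abstain("limitation period for contract claims"))
print(answer_or_abstain("something obscure"))  # abstains rather than guessing
```

Under Mark's definition, the second response can never be a hallucination: nothing is asserted with confidence, so there is nothing to be provably wrong about.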

BP So is that a matter of prompt engineering, of telling the model, “Listen, I want you to look, and then I want you to assess, and then I want you to form a confidence score, and then I want you to reply based on that.” Also I've seen versions of this where, again, there's an agentic workflow, like, “Send it to one, they read the data, send it to a validator, they make a check, send it to a judge or they pass it on, or send it to a QA kind of process.” How are you doing that? 

MD Well, certainly the prompts matter. The way the prompts are written is important, but what matters far more to the overall output is the series of LLM calls that are linked together, how they're validated, and the testing that is done at each stage. If you test things enough and you get to a certain threshold of confidence, then it's very unlikely that the model is going to make a mistake in the future, so you test appropriately at each stage. It's the combination of all the models in the right way, looking for the right output and validating at the right steps, that the overall output is far more sensitive to than the specific drafting of an individual prompt. Certainly, that's important, but it's really the architecture, the building of the agent and the workflow from input to eventual output.
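The architecture he describes, a chain of calls with validation between stages, might be sketched like this. The stage and validator functions are illustrative stubs, not Alexi's pipeline.

```python
# Sketch of a validated LLM chain: each stage's output is checked before
# the next stage runs, and the pipeline fails fast rather than passing a
# bad intermediate result forward.

def run_pipeline(task, stages):
    """stages: list of (stage_fn, validator_fn) pairs applied in order."""
    result = task
    for stage, validate in stages:
        result = stage(result)
        if not validate(result):
            raise ValueError(f"validation failed after stage {stage.__name__}")
    return result

# Toy stages: retrieve -> summarize, each with a cheap validation check.
def retrieve(q):
    return ["case A discusses " + q]

def summarize(docs):
    return "; ".join(docs)

stages = [
    (retrieve, lambda docs: len(docs) > 0),          # did we find anything?
    (summarize, lambda text: isinstance(text, str)),  # is the output well-formed?
]
print(run_pipeline("limitation periods", stages))
```

In practice each stage and each validator would itself be an LLM call, which is how a single user request fans out into the hundreds of chained calls mentioned earlier in the conversation.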

BP Today, you have clients in both the US and Canada, large law firms as well as small boutique ones? Is this mostly used just by bigger players who probably have a team of first-years for whom a lot of this work would be done? Or who's using this, what do your clients look like, and then on a practical day to day basis, where do you see them using it most, in what applications? 

MD We're focused on mid-market and below; the Am Law 100, the enterprise law firms, are not our focus at all. There are very specific requirements for these firms that make it much more difficult to build a business around, and even the sales cycles, the process of selling to these firms, are far, far more difficult. In the mid-market we have repeatedly seen sales cycles of a couple of weeks, and these are still great contracts that come in really quickly. And we have a PLG motion that does not work in the enterprise. PLG is super important for understanding the product and getting really quick feedback cycles in the product. So: mid-market and below, and there are thousands of lawyers that use our product on a daily basis. With recent updates, we're now at repeat daily use. 

BP So even an independent practitioner could sign up at a consumer small business cost and use it. 

MD Exactly. $150 a month is our smallest plan. That's only for solo lawyers running their own firm, may not have any staff at all. And there's a seven-day free trial. It's a pretty typical PLG motion that we have. 

BP That's cool. I had assumed, I think because of my friend who I mentioned, whose firm I believe brought on Harvey, that this was happening more at the level you're describing as difficult, where it is very regulated, very bureaucratic, and the cases themselves are global in nature, so I can see that being a bit harder. But if you focus in on, “Okay, we understand US and Canada case law and precedent and we have practitioners who are doing X, Y, and Z,” then there are quite a number of those clients you could go after. So are you offering them access to any kind of database, documentation, or information that your model searches that they may not have access to, or just what's already publicly available, what the models have trained on, and what they give it? Do they upload documents?

MD So it's a combination of all of the above, and we've got many licenses in place to be able to get access, bulk access to this content. It still is a bit of a challenge to get bulk access to everything, but that's changing very quickly. And so we worked really hard to get the licenses in place that we needed to confidently provide the product and the service to our customers. But also users are uploading content, uploading cases, and we do have various other strategies. As I mentioned, there's lots of real time Google searching that's done as part of several agentic workflows that we rely on to pull in content as needed.

BP Now I want to ask the question that I think is in the back of every software developer's mind, which is what does this mean for my career as a lawyer, hypothetically speaking? What does this mean for my career if I'm still in law school? What does this mean for my career if I want to get a very junior role at a 10-person law firm in my town and the way I would have cut my teeth is doing exactly this kind of documentation research and then eventually moving up to be part of a trial or more of a partner? What role does the very junior human have when LLMs are honestly better at a lot of this stuff now? 

MD What we've seen in recent years is a pretty important turning point in legal technology generally: previous versions were limited to making firms a little bit more efficient, which is not the best incentive for them to adopt these technologies. But now the technology is very clearly making lawyers significantly better at their jobs, and that shift has had a really important impact on the market. And I would say there's so much unmet need for legal services within broader society that there's very limited risk, very little risk I would say, of somebody losing their job. The real risk is if you're not actively learning how to use these tools, how to incorporate them. 

BP You're in the ‘AI is not taking your job, someone with AI is taking your business’ camp.

MD There's definitely a part of that. It's all about timing. It's really hard to predict 20 years, 30 years from now. Certain jobs might be eliminated but I do think there's something fundamentally human about the rights and liabilities of other humans. If aliens landed on earth today, we probably wouldn't let them be lawyers no matter how smart they were, because we can't be confident that their interests are perfectly aligned with somebody being accused of a crime or somebody being sued for hundreds of millions of dollars. Why would we let an alien who doesn't really know what it's like to be a human represent and determine the outcomes of that case? And I think having human professionals as at least intermediaries with the legal system I think is pretty critical for the next little while. But again, it's all about timing and it's very hard to predict the timing. 

BP Also, to become a lawyer, you have to move through a pretty well-gated set of academic and then judicial gates. You can't even practice in different states in the United States without having gotten permission, for various reasons. And I do think it seems like we're pretty far from me sitting down with a robo-lawyer and having them defend me. That's a little bit distinct from fewer first-year research associates getting hired, or whatever the answer may be. To get to Jevons paradox, I understand there's plenty of legal work out there, and maybe this just means more one-person shops are available in every town, and that means more people can get cheap legal representation. And the answer to that is, “Great. Nobody is representing themselves or working with an exhausted public defender who can't manage 15 cases at once,” or, “Every public defender now can manage 15 cases at once. Great. That person is a 10x lawyer now.” 

MD I think we're still on this exponential curve. Even with this plateauing, this flattening of the scaling laws of these large language models, we are still 100% on this exponential curve of improvement. That means the next two years are going to bring more improvement than the previous two, and it would be so naive, very foolish, to think that AI is not going to be as good if not better at running an entire litigation file from beginning to end and representing a client. But the reason why we don't let that happen is for other important reasons, primarily around alignment, in my view.

BP Okay, I'll accept that. I won't say that there isn't a slightly dystopian cast to the idea that the robot does all the work and then I show up as the flesh bag to present the argument and make sure that, ethically speaking, a human is responsible for the final verdict. But I interviewed Joshua Browder back in August of 2022, and DoNotPay was already having robo-lawyers fight parking tickets, get people refunds, and engage in chats with other human beings (or who knows if there's a robot on the other end of that customer service line), working through a script, making legal arguments, and getting people results in the most pro forma sense, which is a nice thing. And I recently also interviewed someone from Consumer Reports. They run an app called Permission Slip which is all about getting control of your personal digital privacy, and at least in the US, you can have an agent work on your behalf. So I sign up for the app, and then an agent goes out and tells all these data brokers, “You need to take my stuff down. You need to delete this.” They send the email on my behalf; they'll respond. So there are some automated lawyerly tasks being done for people now, and I can see the benefit of that. What is your tech stack right now, and do you see that evolving in the future? I know you started with Rails. You mentioned Python, foundation models you can access through APIs. What does your tech stack look like right now? What's your staff like? Do you have engineers, machine learning folks, and lawyers? Let's go over that piece of the puzzle. 

MD Sure. Maybe first on the team side: we have a product development team, and the product is still built in Rails, with a React front end and a minimal back end running through Rails. Most of the heavy lifting happens in various Python services and frameworks. The vector data stores are important. There are various tools we continue to explore to see what's working; it's a constant effort to stay on top of that, so we've got an applied AI team. We previously had much more of an R&D AI focus and have now shifted entirely to applied AI. That still requires some R&D and experimentation, but not like it did even two years ago, and we're not training models anywhere near like we were before. We also have a data team and a platform team. But we run a pod structure, so everything is product-oriented. These are multidisciplinary pods with very specific product KPIs that they work towards, and this includes everyone from back end to applied AI to product development. We're still a small team, about 45-50 people in the company right now. Over 30 are entirely focused on writing code, in product and engineering. 
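The vector data stores Mark mentions, combined with the agentic filtering flow described in the episode notes (high-recall retrieval narrowed down by a second pass), can be sketched with a minimal, self-contained example. The bag-of-words "embedding" and the keyword-based filter below are hypothetical stand-ins for a real embedding model and an LLM relevance check; none of this is Alexi's actual code.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Hypothetical stand-in for a real embedding model:
    # a simple bag-of-words term-count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int) -> list[str]:
    # Stage 1: high-recall similarity search over the whole corpus.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def agentic_filter(candidates: list[str]) -> list[str]:
    # Stage 2: hypothetical stand-in for an LLM-based relevance check
    # that narrows a broad candidate set to the documents that matter.
    return [d for d in candidates if "negligence" in d.lower()]

corpus = [
    "Smith v. Jones: negligence standard for property damage",
    "Tax code update for fiscal year filings",
    "Doe v. Roe: negligence and duty of care in tort claims",
    "Municipal parking bylaw amendments",
]
broad = retrieve("negligence duty of care", corpus, k=3)
final = agentic_filter(broad)
print(final)
```

A production version would swap `embed` for a real embedding model, keep vectors in a dedicated vector store, and replace `agentic_filter` with model calls that actually read each candidate document before keeping or discarding it.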

BP Nice. And you mentioned that there are regulations. Are there areas of the law, either in the US or Canada, where AI or computer-assisted legal work is banned in some sense, not because the systems aren't capable but because the judicial or legal system isn't accepting it at this moment?

MD Well, I would argue that it is currently banned and that a vast majority of outputs from OpenAI's ChatGPT are illegal, but nobody seems to care. There's a fairly well understood (if still ambiguous) meaning of engaging in the practice of law, and I would argue that AIs are doing that all the time. And throughout the US and Canada, you have to be a licensed legal professional who carries professional liability insurance in order to engage in the practice of law. 

BP Okay, I see. So you're saying that somebody logs on and says, “Look, I've got this problem with my neighbor. I'm thinking about suing him,” and they ask a question and it responds. It's not supposed to do that. It's not supposed to give legal advice if it doesn't have the licensure and the insurance. 

MD Right. You're not allowed to. There are lots of laws and regulations that prevent non-lawyers from practicing law. It's called UPL: unauthorized practice of law. There's lots of litigation over it all the time, but if you build an AI that goes and does that, I'm not aware of any action being taken against any company right now, which I think is probably not good and will likely change. Part of the problem is that we have these laws that are just not being enforced at all, and so either the laws have to change or they need to start getting enforced. And likely, and hopefully, the laws just get changed. 

BP Can the LLM just always caveat with, “I'm not a lawyer and this is not legal advice, but,” and then take it from there?

MD And then you just say, “Well, talk to me like I'm a lawyer,” and it says, “Okay,” forgets all of that, and just gives you legal advice.

BP Okay, that's pretty funny. I thought you were going to say that they've all been trained on tons of copyrighted data and are using that without paying for the permission, but that's a separate issue. 

MD Primary law is not copyrightable. It's in the public domain so there's less of an issue with that in the legal field. 

BP All right, so look out for me, 6, 12, 18 months out. You're in the camp that says we have not reached the end of any kind of scaling curve, maybe even the exponential one. We hear all these reports of the big AI labs needing to delay the release of their next numbered model because they're not happy with the results; they're only seeing incremental gains. And to me, there is a certain logic to that. It's clear that we've used up almost all the text on the internet, so how can you 10x the model's intelligence again? Let's stay in the world of text, forget multimodal: you want a chat model that's going to be 10 times as intelligent as the last one, but we don't have 10 times the data to feed it. Where do you see things going? Let's talk big picture, and then maybe we'll home in, after your thoughts, on what you hope your firm will be doing in 18 months that it's not doing today.

MD Well, I heard something similar to this, which prompted the idea; it's not a totally original thought, but I think the analogy is appropriate. If you take a smart individual and lock them away in a room, maybe with some books, they're not going to think their way to ever higher levels of intelligence. Now put three, four, maybe five people in a room debating ideas, discussing ideas, correcting each other, and then make it a thousand, ten thousand. It becomes more plausible that, regardless of the fixed constraint on the amount of data they have access to, they might get to higher levels of intelligence. And I think that's exactly what's happening right now with multiple agents interacting. Synthetic data probably needs a new name for what's actually happening now; it's not really synthetic data. It really is this debate, this dialogue, with models validating other models, and I think there's something really powerful there. The scaling laws are real. They're plateauing, 100%. But I think it's really hard to measure civilization's ability to build AI over generations. How do you measure that? I think data compression is a pretty compelling way, and it has continued on an exponential curve over the past 40 to 50 years and is not slowing down. It's a straight line on a log chart. If we're just looking at the immediate technical limitations, there will always be immediate technical limitations, but we've always found ways to break through and find some other innovation. So that might be needed, but it's coming. But even if it isn't, this whole idea of unhobbling that's been popularized now: tool use is unbelievable. 
The scaffolding, this chain of thought, all of these things are incredibly powerful, so even without some other fundamental change from companies like ours, all these other products out there are going to get better and better and better every single month. And I think we're nowhere near the ceiling on what's possible for these tools right now. 
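The multi-agent dynamic Mark describes, models debating and validating one another's outputs, can be reduced to a toy sketch. The "agents" below are plain majority-vote stubs, not LLM calls; everything here is a hypothetical illustration of the consensus idea, not anyone's real system.

```python
from collections import Counter

def debate(initial_answers: list[str], rounds: int) -> list[str]:
    # Toy consensus loop: each round, every "agent" sees its peers'
    # answers and adopts the majority view -- a crude stand-in for
    # models critiquing and validating one another's outputs.
    answers = list(initial_answers)
    for _ in range(rounds):
        majority = Counter(answers).most_common(1)[0][0]
        answers = [majority for _ in answers]
    return answers

# Five hypothetical agents start in disagreement; debate drives consensus.
start = ["42", "41", "42", "43", "42"]
after = debate(start, rounds=2)
print(after)
```

In a real multi-agent setup, each agent would be a model call that reads its peers' rationales and can change its answer based on the arguments made, not just the vote count.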

BP All right, so given that, what services do you hope to offer in 12-18 months that you don't offer today? Or how will the services you offer clients be different or improved? 

MD We're focused exclusively on the litigation workflow, everything from before a file is even opened right through to resolution, and on providing really high-value functionality to litigators for the entire lifecycle of a litigation file. We're focused on doing that really well right now, and we know that about 2-3% of US GDP is spent on resolving legal disputes. It's a huge problem, and it could be even bigger: a lot of people are priced out of resolving their legal disputes, and I think we see a huge opportunity there that you'll hear a lot more about over the next year.

[music plays]

BP All right, everybody. It is that time of the show. Let's shout out a real live human being who came on Stack Overflow and shared a little knowledge or curiosity, building that ever-growing pool of intelligence we chatted about on this episode. Awarded one hour ago to ycr for their answer to “How to get the BUILD_USER in Jenkins when a human rebuilds a job triggered by timer?” ycr was awarded a Populist Badge because their answer is so good that it got way more votes than the accepted answer. This question is part of our CI/CD Collective, so shout out to them. 163,000 people have viewed this question, so ycr, thanks for your answer; you've helped a lot of folks with your knowledge. As always, I am Ben Popper, Director of Content here at Stack Overflow. Find me on X @BenPopper and shoot me a DM if you want to come on the show. If you have suggestions or questions for us, things you want to hear or are sick of hearing us talk about, email me: podcast@stackoverflow.com. And if you enjoyed today's conversation, the nicest thing you could do for me is leave us a rating and a review, or subscribe and get more episodes in the future.

MD My name is Mark Doble, CEO at Alexi. You can find us at alexi.com. And feel free to email me: mark@alexi.com. 

BP Very cool. All right, everybody. We'll put those links in the show notes, and thanks for listening. We'll talk to you soon.

[outro music plays]