The Stack Overflow Podcast

Using AI to find patient zero in marketing campaigns

Episode Summary

Ben Popper chats with CTO Abby Kearns about how Alembic is using composite AI and lessons learned from contact tracing and epidemiology to help companies map customer journeys and understand the ROI of their marketing spend. Ben and Abby also talk about where open-source models have the edge and the challenges startups face in building trust with big companies and securing the resources they need to grow.

Episode Notes

Alembic is an AI platform that helps organizations see how marketing spend connects to revenue so they can illuminate blind spots and make more informed business decisions.

Connect with Abby on LinkedIn.

Stack Overflow user krishnan muthiah pillai earned a Lifeboat badge with a top-notch answer to How to forget a wireless network in android programmatically?.

Episode Transcription

(Intro music)

RYAN DONOVAN: Tired of bugs crashing the party? No other tool covers end-to-end API testing. From functional to process to performance, all in one platform. Visit Qyrus.com/QAPI/stackoverflow. That’s Q-Y-R-U-S dot C-O-M forward slash Q-A-P-I forward slash Stack Overflow to try QAPI today for free. That’s Qyrus with a Q.

BEN POPPER: Hello everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am Ben Popper, one of the hosts of the Stack Overflow Podcast, and I am very excited today because we are going to be talking about data science and complicated and interesting technology in the service of marketing, which unfortunately is my career. Today, we will be chatting with Abby Kearns, who is the CTO over at Alembic. The promise here is that customers like NVIDIA and Delta can actually understand what happens to their money, every dollar in marketing spend. Does it give you ROI? There's a classic line about advertising: I know that it works, I just don't know which half. Maybe we can use some of this technology to solve the problem. So without further ado, Abby, welcome to the Stack Overflow Podcast.

ABBY KEARNS: Thank you for having me.

BEN POPPER: So, first thing we usually do is just let our audience get a little grounded in who you are. Can you tell us a bit about how you got into the world of software and technology and what led you to the role you're at today?

ABBY KEARNS: Yeah, it's been a crazy journey, but I've been in tech for over 25 years and all of my experience has been in the enterprise technology space. Most recently, I joined an early stage martech company, but prior to that I spent most of my career at enterprise infrastructure and dev tooling. So it was an exciting adventure to dig into this fascinating world of taking really novel bleeding edge technology and applying it to one of the world's hardest problems, which is figuring out what do I have in my data, my massive amounts of data, of marketing data.

BEN POPPER: Right.

ABBY KEARNS: And then figuring out how to derive insights, impacts, causality, and ultimately ROI.

BEN POPPER: Right. You're taking the guesswork out of calculating marketing ROI, and then there were a few different technologies mentioned, not all of which I'm familiar with. It says using composite AI, a graph neural network, and contact tracing mathematics developed during the pandemic. A neural network, I think, I've got a pretty good handle on that, but can we talk first about what is composite AI and then, even more interestingly, how are you using mathematics developed during the pandemic to track what happens to my marketing spend?

ABBY KEARNS: Yeah, there's a lot to unpack in that statement. Composite AI is just a fancy way of saying we're using multiple LLMs as part of our data pipeline. A lot of the heavy lifting, though, that we use to surface the most interesting impacts and insights, as well as to drive causality, is done by our proprietary algorithm. And one of the things that our founder, Tomás Puig, was really thinking through during the pandemic, when he was starting to build Alembic, is what's the difference between a contact trace, those things we're all familiar with from when we were trying to identify the source of a COVID outbreak, and a marketing trace. Like, what's the difference between those two?

And he said, “Well, they seemed really common,” which is where does a buyer start their journey and what drives them to complete a purchase or an engagement with a company or a brand, and how is that different from the contact tracing? And so, he really had the brilliant idea of taking the mathematics that were applied to that contact tracing and applying them to marketing. And that really began a multi-year journey to look into not just the mathematics around contact tracing, but the value add that neural networks can provide. To say, how do I want to build a graph of all the events that occur in a buyer's journey that takes them to the final, ultimate buying your product.
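The tracing idea described here can be pictured with a small sketch. This is not Alembic's proprietary algorithm, which the episode does not detail. It is a minimal illustration, assuming a hypothetical graph of touchpoints where each edge weight says how strongly activity at one touchpoint tended to be followed by activity at another across a population. The touchpoint names, the weights, and the `trace_back` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical touchpoint graph: an edge (a, b, weight) means that, across a
# population, activity at touchpoint a tended to be followed by activity at b.
# The touchpoint names and weights are invented for illustration.
EDGES = [
    ("tv_ad", "site_visit", 0.6),
    ("billboard", "site_visit", 0.2),
    ("site_visit", "webinar_signup", 0.5),
    ("email", "webinar_signup", 0.3),
    ("webinar_signup", "purchase", 0.7),
]


def trace_back(edges, outcome):
    """Walk from an outcome back to its strongest upstream source.

    Mirrors the contact-tracing idea of following exposures back to a likely
    origin: at each step, follow the incoming edge with the largest weight.
    """
    incoming = defaultdict(list)
    for src, dst, weight in edges:
        incoming[dst].append((weight, src))

    path = [outcome]
    seen = {outcome}
    node = outcome
    while incoming[node]:
        _, src = max(incoming[node])
        if src in seen:  # guard against cycles in the graph
            break
        path.append(src)
        seen.add(src)
        node = src
    return list(reversed(path))


journey = trace_back(EDGES, "purchase")
print(" -> ".join(journey))
```

A greedy walk like this only follows the single strongest link at each step. A real system would presumably weigh many paths at once, which is where the graph neural network mentioned in the conversation would come in.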

BEN POPPER: Right.

ABBY KEARNS: And for many of us, that journey includes everything from “I saw your advertisement during a Super Bowl commercial” to “I saw your sponsorship” to “Hey, I got an email from you and I signed up to do a webinar,” or “I showed up at your conference and I went by your booth, and then I took a class and then I bought your product or signed up for a service.”

BEN POPPER: Right.

ABBY KEARNS: And I think, all of us know in our day-to-day lives, it's never just the one thing. We don't just see a billboard ad and are like, “Yup, done…buying it!”

BEN POPPER: Right.

ABBY KEARNS: It really just kicks off a journey. And so what we set out to do is to really figure out how do we map that journey and show the path that a group– and so we really look at population base, we're not looking at individual users, but how does a group, what is the path that they follow and what is the revenue impact to your organization? So really tie that very clearly to the impact that's most important for you as a company, and for most, it's revenue. For some, it can be things like what drives foot traffic, but really a lot of it really comes down to what is driving the revenue.

BEN POPPER: So I loved the example you gave. Somebody comes and reads a blog post. Then they're sitting, waiting for the bus and they see a billboard. Later on, they attend an event and sign up for a class, and eventually they get to a point where they realize they'd like to demo the product, and so they head to the website via a Google search.

I have trouble imagining how you would connect the dots on those, especially because some of it is happening in the real world without a digital footprint, in the case of the bus stop, but talk to me a little bit about how you might do it. And I understand now you're saying not this individual's journey necessarily, but the journey of an individual like their demographic or their geo.

ABBY KEARNS: Exactly. And I think one of the things that was really important to us when the company was founded, and it was very important to the founding team, was to really figure out a way to get away from cookies. And, you know, philosophically we're against cookies and tracing an individual's experience through cookies. But we also envisioned a world where eventually we're not gonna be able to use cookies, and cookies aren't going to be an accurate reflection of someone's journey.

BEN POPPER: Right.

ABBY KEARNS: And so that really begat the journey into saying, how do we look at the aggregate in a population and start to identify patterns? And again, we're looking at things that are driving meaningful impact. We weren't trying to build another dashboard. I think marketing has plenty of those, but what we were really striving to do is build something that sifted through all of the data. So all of these data lakes that the world's largest companies have been building for the last 10 years…

BEN POPPER: Right.

ABBY KEARNS: …Which, one of the things I didn't realize, having spent so much time on the infrastructure side, is that for the marketing divisions in the world's largest companies, think Fortune 500 companies, their data set is probably the biggest in the company. There's massive amounts of data that these teams are sitting on. If you think about it, these teams are sitting on not just email campaigns, and digital ads, and SEO data, but they're also sifting through POS data, and Google Analytics data, and foot traffic data, and all of the sponsorships that are done that are one-offs, think F1 sponsorship or arena sponsorship. But they're also trying to figure out TV ads, radio broadcast, social media. So if you think about all of this data, it's massive amounts of data that marketing teams are having to, not just sift through, but connect the dots and understand what questions should I ask and how do I connect the dots amongst all of these different things.

And so, really what we set out to do is say, how do we sift through all that data for you and do some pattern matching and identify those anomalistic activities that are really driving the most impact to your business and connect the dots. So maybe your CEO goes and gives an update on CNBC on The Marketplace and people see them and are like, great, let me go to the website and let me read a little bit more about this company. Oh, you know what, we actually really need to buy this hardware. So that spawns off a whole series of events. And so what we're trying to do is connect the dots on all of those things and say, hey, based on this activity and based on this time window, we're able to ascertain the activity that a population takes, based on what drives and triggers these events, to result in an outcome and attribute that outcome.

And so, specifically what that looks like for some of our customers is to say, hey, our CEO was on CNBC and that drove a lot of users to these particular websites and from there it led them to contacting our sales department and they showed up at our global conference and that resulted in this amount of revenue that we're able to affiliate with that.
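To make the "anomalous activity inside a time window" idea concrete, here is a minimal sketch using a plain z-score over population-level daily counts. It stands in for, and is much simpler than, the signal processing described in the conversation. The visit numbers, the event day, and the helper names `find_spikes` and `spikes_after` are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical daily site visits for a whole population (no per-user tracking).
# A made-up TV appearance happens on day 9; day 10 is the day after.
daily_visits = [100, 104, 98, 101, 99, 103, 97, 102, 100, 101, 180, 150, 110, 102]
tv_appearance_day = 9


def find_spikes(series, baseline_days=7, threshold=3.0):
    """Return indices whose value is more than `threshold` standard deviations
    above the mean of the preceding `baseline_days` values."""
    spikes = []
    for i in range(baseline_days, len(series)):
        window = series[i - baseline_days:i]
        mu, sigma = mean(window), stdev(window)
        if sigma > 0 and (series[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes


def spikes_after(event_day, spikes, max_lag_days=3):
    """Keep only spikes that fall within `max_lag_days` after the event."""
    return [day for day in spikes if 0 < day - event_day <= max_lag_days]


spikes = find_spikes(daily_visits)
linked = spikes_after(tv_appearance_day, spikes)
print(linked)
```

A spike that follows an event within a window is only a candidate link. Lining things up in time does not by itself establish causality, which is why the conversation describes a separate causal graph step.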

BEN POPPER: You mentioned at the beginning that you use a number of different LLM models and then after using them, the data analytics would pass through some custom algorithms that you've created. When we say we're using a number of different LLMs, is that similar to the mixture of experts approach that I've read about and understand is becoming foundational to a lot of people's approach to agentic AI?

ABBY KEARNS: Sort of, and I would actually reverse the flow here. So we really leverage both– a couple of different things that we do that's novel. One, we ingest tons of disparate data from all these different sources, so not just customer data. We have deep partnerships with companies that provide everything from TV, radio, podcast transcripts to social media and integrations to third-party data that includes everything from brand and competitive landscape to foot traffic data. So we're ingesting all of this data. Then we have our own– we were just talking about the neural network, we have our own novel neural network algorithm along with signal processing algorithm that allows us to identify, connect the dots amongst these things to derive, in our parlance, a causal graph, which allows us to really map all of these events that occurred, that are related.

And then, we generate an output from that. The LLMs we use are not really a mixture of experts, but what we do is we have them at different parts of our pipeline and they do everything from helping us enrich the data to summarizing the data. I like to say that we use LLMs to make things pretty. So essentially we use LLMs to really summarize and format the output of these algorithms and then we display that in our platform so that it's super easy and approachable for anyone to read and understand what it means.

BEN POPPER: Right. The LLMs are sort of doing the ETL part of the pipeline: they’re finding the data, transforming it into something great, and outputting it in a way that makes it easy for the customer to understand.

ABBY KEARNS: Less ETL and more just, can you take this output of, say, this JSON file and make it look pretty. ETL happens way before that.
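As a rough picture of the "take this JSON and make it look pretty" step, the sketch below turns a structured result into a plain-language prompt that could be handed to a self-hosted model. The JSON field names, the figures, and the `build_summary_prompt` helper are assumptions for illustration, and no model is actually called.

```python
import json

# Hypothetical pipeline output; the field names and figures are invented.
raw_output = json.dumps({
    "period": "last_30_days",
    "top_events": [
        {"event": "CEO interview on TV", "attributed_revenue_usd": 1200000},
        {"event": "Global conference", "attributed_revenue_usd": 800000},
    ],
})


def build_summary_prompt(payload: str) -> str:
    """Turn structured results into a plain-language prompt for a model.

    Only the prompt text is built here; sending it to a model is left out.
    """
    data = json.loads(payload)
    lines = [
        f"- {e['event']}: ${e['attributed_revenue_usd']:,} attributed revenue"
        for e in data["top_events"]
    ]
    return (
        "Summarize these marketing results for an executive in two sentences.\n"
        f"Period: {data['period']}\n" + "\n".join(lines)
    )


prompt = build_summary_prompt(raw_output)
print(prompt)
```

Keeping the numbers in the structured payload and asking the model only to phrase them is one way to keep the summary traceable back to the underlying events.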

BEN POPPER: Right, right. No, I just meant in the sense that like, you're sort of extracting and transforming the data so that the end product is something that your users can get value out of as opposed to, I guess, something a little bit too complex for them.

ABBY KEARNS: Making it approachable and so that you don't have to think too much about it, which, hey, if these are the top five things that happened in my environment, in my company, in the last month, wouldn't that be great if those things were summarized and I could copy and paste that and stick it in an email to my CMO or my CEO or put it into a presentation? And great, how do we validate that so I can double click on that and follow that chain of events all the way down to understand the specifics?

BEN POPPER: Gotcha. Do you assign a probability score to this or do you come away? I mean, when I think about contact tracing, it's like this is probably what happened. It sounds like one of the interesting things you're doing is grabbing from data sources far outside the company, foot traffic data based on where people checked their phones and stuff of that nature, based on everything we're collecting and our unique blend of proprietary algorithms, we're gonna say with 95% confidence that this appearance on TV, this ad campaign that you ran, and this event were the biggest drivers. Is it something like that or do you have even more certitude?

ABBY KEARNS: We have more certitude because we're able to go all the way down to the specific events that drove and what those events were. And so, for example, if you're saying, hey, the CEO's appearance on CNBC, we can tell you exactly the ratings when that was aired. Potentially, it has a different air date on local and national, so we can tell specifically when that was aired. So we can go all the way down to the details, but we really start the user journey on what are the things that matter the most for me right now, and then you can click in.

BEN POPPER: Gotcha.

ABBY KEARNS: Versus, where a lot of– unfortunately, a lot of marketers today have to start with, here's all the raw data, now figure out how they all map.

BEN POPPER: And so, do you avail yourself of open-source models? It seems like one of the things that's exciting about this space is that a lot of powerful frontier models come from research communities or are made available by folks like Meta. Do you feel like the open-source aspect of LLMs is something that you can leverage?

ABBY KEARNS: A hundred percent. We only leverage open-source models. We self-host open-source models. We want to make sure that our customer data is not being used to train any other models. And we also just want the control over the models that are used and when they're used and how they're used. And we wanna make sure that we keep the data within our data center.

BEN POPPER: Gotcha. That means you are operating, at least in part, with sort of an on-prem, bare metal approach to this in order to have control over all this data and ensure that nothing of your customers gets out or is used improperly.

ABBY KEARNS: Yeah. We have our own infrastructure. We do have our own colo facility with our own cluster of shiny NVIDIA DGX H100s.

BEN POPPER: Right.

ABBY KEARNS: Having come fresh off GTC, along with all the other disciples and acolytes of NVIDIA these days, we do run our own NVIDIA hardware, but we also have infrastructure in the cloud as well.

BEN POPPER: Okay, cool. Do you find that doing this kind of work now as a start-up can be challenging because you're competing with so many other players, including some of the largest companies in the world, for scarce resources, for example, the latest NVIDIA has to offer?

ABBY KEARNS: No, I think we're very thoughtful about what we need and how we leverage it, so I think we've been pretty lucky in that respect. Also, NVIDIA's a customer and was kind enough to help us get hardware last year when we needed it.

BEN POPPER: Okay. A little quid pro quo. I like it.

ABBY KEARNS: We did. So they were really kind enough to help us get ahold of some of the ever elusive H100s last year. The Blackwells are obviously going to be the hotness for this year.

BEN POPPER: Right.

ABBY KEARNS: We were able to get the hardware we needed, which gave us the access and the ability to innovate as quickly as possible.

BEN POPPER: Awesome. So how big, roughly, is the team at your company at this point?

ABBY KEARNS: We're around 40 people total in the company, so we're pretty early and pretty small and scrappy. But we work with the world's largest companies and are having fun really pushing not only the art of the possible with kind of bleeding edge neural network algorithms and the latest and greatest on LLMs, but it gives us a chance to also say, what could we do if we have all the data, the access, the technology, and the ability to pull all of these things together?

BEN POPPER: What gave these Fortune 500 and Fortune 50 and Fortune 5 companies the confidence to work with a relatively young and small start-up? What do you think it was that they saw that made them say, you know, this is a company and a service we want to turn to as a source of ground truth for some of our biggest spend?

ABBY KEARNS: I think one– you know, it's a huge problem. You just referenced the old adage which is, “I know that 50% of my marketing spend is wasted, I just don't know which 50.”

BEN POPPER: Right.

ABBY KEARNS: And I think it remains a big problem. For many of these companies, the investment in marketing across-the-board is one of their biggest investments. For many of these companies, they're spending upward of hundreds of millions of dollars to close to a billion dollars a year in marketing spend. And so, they are really looking for products and capabilities that can help them connect the dots on that data and then help them start to really understand what that data actually means and how that powers the business. And I think we were able to develop the right product at the right time and that story is continuing to resonate with the enterprises of the world, who are really looking to understand exactly what actions that are being taken in their organization are providing the most impact to the business and where they can double down, or potentially pull back in order to make sure they're investing their money and their resources effectively.

BEN POPPER: Right. You make a good point there, which is that this has always been a part of business in which we acknowledge there's a large amount of waste. If I said, I'm gonna build out some infrastructure, over the next year, but only half of it's going to work, and I don't know which half, I probably wouldn't have my job for very long as the Head of Infrastructure.

ABBY KEARNS: Right?

BEN POPPER: There were a few other things that I wanted to get at in terms of the company structure. You are the CTO. Are you mostly hiring folks who are data scientists, machine learning experts, engineers of that type, or do you have a mix of those alongside your front-end engineers, your SREs?

ABBY KEARNS: I have a whole team of people that are a mixed group of people. We have AI researchers, so we have people that have good deep data science and ML experience. Alongside, I do have a couple of amazing SREs and some front-end engineers. We're also hiring right now. I am looking for some additional full-stack engineers and some more data engineers and others to help us really take this team to the next level, so we're always growing the team.

BEN POPPER: What, if anything, have been areas of friction that you've had to overcome over the last year or so? I do think that there has been a sense of excitement when ChatGPT came out and then a year of furious building, but not putting into production, and within maybe the last year to nine months, a sense that things are being productized and real businesses are getting created on top of the LLM technologies that have fundamentally reshaped the tech industry and maybe society. In that journey, what were some of the obstacles and the points of friction you had to overcome? I would love to hear about that problem solving.

ABBY KEARNS: I suspect we had the same challenges that everyone else had, which is things were changing– are changing, continue to change really, really quickly. It's like, there's always a new, quarterly, there's a new update to an LLM. We saw LLMs really largely– we saw a couple two years ago that were really out in front, and now there is a whole host that are getting more and more similar across the benchmarks. And the open-source LLM models are very competitive and give us an opportunity to really stay deep on the open-source offerings, but it really requires us paying attention to what's coming up and what's going on and where the benchmarks are. The tooling around all of these pipelines is, as I'm learning, not as robust as the tooling I'm used to on the infrastructure side.

BEN POPPER: Right.

ABBY KEARNS: You know, when you think about CI/CD, and we think about deployment, and we think about observability or APM, those tools exist. And when you start thinking about LLMs, and you start thinking about machine learning workloads, the tooling is still a little nascent. And so, I think that building these– stringing these things together in a workflow that is highly automated and highly efficient still seems a little bit like moving the Lego blocks in just the right way.

BEN POPPER: Yeah, it's been interesting to watch the conversation around these things, you know, LangChain was something early on that people used to string things together. The Model Context Protocol that was recently introduced is one thing that people have been talking about. And I was reading a conversation, or reading the comments this morning on Hacker News, something you should never do, about Model Context Protocol, and just folks saying like, look, I understand why they're trying to solve these problems. Definitely, standardization and protocols are great. However, most of this seems like it's three times as complex as a simple API or HTTP request would be. Are we sure we're building this right? So yeah, people attempting to put some of that stuff out there and then, obviously, folks wondering like what horse to bet on.

ABBY KEARNS: Yeah, it's funny. I was actually talking to someone not that long ago that had built another start-up that was in the model testing and validation, and I was like poking on that and I was like, “Explain this to me.” Because this is a bit of a new world for me and I'm so used to having all of the tools at my disposal to do things like CI/CD and automation. And it's just baffling to me that we're having to reinvent a lot of that tooling in real time for a machine workload, which really honestly operates in a very similar construct to deploying a containerized workload. And so, it's been a really interesting journey for me. And I oftentimes feel like, “Are we really doing this right? 'Cause this feels a little harder than it needs to be.”

BEN POPPER: Right.

ABBY KEARNS: But then I was assured that, no, everyone's struggling with this. It's not just you. So I think that made me feel a little better, but also real sad. I'm like, alright, we've got a lot of work to do to get this up to these, what I view as solved problems. When we think about running distributed systems at scale, some of this is largely a solved problem, technically.

BEN POPPER: Yeah. I mean, I think one of the most fascinating things about this wave of technology is that LLMs are non-deterministic, which is not something most software engineers and creators of frameworks and languages would aim for. Andrej Karpathy once called them “Dream Machines,” you know, meant to create novel things based on our requests and instead a lot of people are trying to shoehorn them into search engines, and analyzers, and things, which to your point, can be reliable and robust to that five nines. I think that that's a fascinating new dynamic that's probably emerging.

ABBY KEARNS: It is, but it makes things both fascinating and a little challenging all at the same time.

BEN POPPER: (laughs) Right. Well, if it wasn't challenging, it wouldn't be any fun. So, Abby, looking forward over the next year at your product roadmap, is there anything you're particularly excited about or want to highlight?

ABBY KEARNS: I'm excited about everything on our product roadmap, but we’ve been working long and hard on the next version of our neural network and we've been working with the spiking neural network, which is a little bleeding edge, but is fascinating in that the way we can slice and dice data and really start to sift through spatiotemporal data has been a lot of fun and gives us a whole lot of crazy insights and the ability to carve up data in interesting ways. And then, obviously I continue to stay eyes glued, like everyone else, to the big leaps with LLMs and the opportunity that they're gonna provide us to do a whole lot more with less, as these models get more and more efficient and a little bit smaller.

BEN POPPER: Cool. I don't know much about spiking neural networks, but I've heard of spikes in neurons. So this is like a version of an LLM and SNN that tries to mimic that in an even deeper way?

ABBY KEARNS: Yeah. What we really wanted to do is– spiking neural networks really simulate the way the brain works and so the goal was to really help us generate not just graphs and understand the nodes and the edges and generate the outcome in the graphs that we wanna really be able to walk and identify the path the buyer took, but how do we do that in a three-dimensional way and start to connect very different edges and understand the truly complex pattern that emerges as we think about what that journey is for a customer and a buyer and give us all of the opportunity to really look at that from all sides.
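For readers who, like Ben, have not worked with spiking neural networks, the basic building block can be sketched with a textbook leaky integrate-and-fire neuron. This is a generic illustration of how a spiking unit responds to input over time, not a description of Alembic's network. The `leak` and `threshold` values and the input sequence are arbitrary assumptions.

```python
def simulate_lif(inputs, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    At each step the membrane potential decays by `leak`, adds the input,
    and the neuron emits a spike (1) and resets to 0 once it reaches
    `threshold`. Otherwise it emits 0 and keeps accumulating.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0
        else:
            spikes.append(0)
    return spikes


# Hypothetical input: a weak steady drive, a pause, then one strong burst.
inputs = [0.3, 0.3, 0.3, 0.3, 0.0, 0.0, 1.2, 0.0]
print(simulate_lif(inputs))
```

Because the output depends on when inputs arrive and not only on how large they are, units like this are a natural fit for the time-dependent data mentioned in the conversation.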

BEN POPPER: Very cool. Well, maybe you'll give me a free demo for a week and I can see what it's like on the other side with my crystal ball for marketing.

(Outro music)

BEN POPPER: Alright, I'm gonna take us to the outro. That means I'm gonna shout out somebody from the community who came on and spread a little knowledge or shared a little bit of curiosity. Awarded yesterday, a Populist badge to Krishnan Pillai, “How to Forget a Wireless Network in Android Programmatically.” If you've ever wanted to forget a wireless network in Android, Krishnan has a great answer. So good, in fact, that it received more upvotes than the accepted answer and earned a Populist badge. So we appreciate you coming on to Stack Overflow and sharing some of that knowledge with everyone.

As always, I am Ben Popper. I'm one of the hosts of the Stack Overflow podcast. You can find me on X @BenPopper, or shoot us an email podcast@stackoverflow with questions or suggestions, ideas for guests, or topics you wanna hear more about or hear less about. And if you like the show, you could leave us a rating and a review, but the nicest thing you could do is just tell one other person, spread by word of mouth. That's the best way for things like this to grow.

ABBY KEARNS: Thank you, Abby Kearns, CTO. You can learn more about what we're doing at Alembic at getalembic.com, G-E-T-A-L-E-M-B-I-C.com. You can also find me on LinkedIn, if you wanna reach out and connect directly.

BEN POPPER: Alright everybody, thanks for listening. We'll put those links in the show notes and we'll talk to you soon.

(Outro music)