The Stack Overflow Podcast

Building GenAI features in practice with Intuit Mailchimp

Episode Summary

SPONSORED BY INTUIT. Ryan and Ben chat with Shivang Shah, Chief Architect, and Jon Fasoli, Chief Design & Product Officer, both of Intuit Mailchimp. Where we talked last time about building their generative AI operating system, this time we talk about implementing it and how all the pieces came together to make a better end user experience.

Episode Notes

Intuit shares more about their generative AI operating system (GenOS) in this Medium blog

If you want to try out generative AI in MailChimp, sign up here

Learn more about Intuit technology here.

Many thanks (and a Lifeboat badge) to Dherik for dropping an answer on cURL: how can I return 0 if status is 200?

Connect with Shivang on LinkedIn.

Episode Transcription

[intro music plays]

Ben Popper Hello, everybody. Welcome back to the Stack Overflow Podcast, a place to talk all things software and technology. I am your host, Ben Popper, Director of Content here at Stack Overflow, joined as I often am by my comrade, Ryan Donovan. How's it going, Ryan? 

Ryan Donovan It's going pretty well today. 

BP So Gen AI, the new hotness, it's everywhere, it's in everything, everybody needs it, it's practically going to be an OS some people say, but why? Why do you need it? Does every business really need it? All the investment you're going to have to put in– we've been over this– vector databases, embedding this, RAG that. Is it worth all that time and money? Should you build it? Should you buy it? So today I'm excited, we have a great episode coming up. It's a sponsored episode brought to you from the fine folks at Intuit. We're going to be talking with Shivang and Jon about all of these things, what they've been building, what they've been seeing in the market, and a bit of the why. Why does a company that does your tax filings and sends your emails need great Gen AI? So without further ado, Shivang, Jon, welcome to the Stack Overflow Podcast. 

Shivang Shah Pleasure to be here, Ben. 

Jon Fasoli Thank you for having us. 

BP First off, let's just introduce you to the audience. Take a second, let them know who you are and kind of what it is you do day to day. 

JF Hey, Ben. Hey, Ryan. Hey, audience. I'm Jon Fasoli. I'm the Chief Product and Design Officer for Intuit MailChimp. 

SS Hey, everyone. This is Shivang, Chief Architect for MailChimp. 

BP So for people who don't know, can you just explain quickly the relationship between Intuit and MailChimp? 

JF Sure. Intuit acquired MailChimp over two years ago, and MailChimp plays a critical role as a part of the Intuit portfolio in that we serve small businesses and MailChimp serves small businesses as well. Whereas Intuit's, prior to MailChimp, small business portfolio was focused on finances, getting paid, paying workers, accounting, MailChimp really focuses on customer growth. So together that front office and back office gives us a full end to end growth platform for small businesses. 

BP Nice.

RD And so for folks listening who may not have known that, like Ben said, what's a company that does tax filing doing with generative AI? Can you talk through the use cases that you all are using Gen AI for?

JF For sure. Across Intuit, as y'all would imagine, we've been deploying AI and versions of generative AI for quite some time. These are use cases like accounting. You have transactions that are coming in that ultimately need to wind up into the right category with the right treatments, and tax, which is not too far off from the same type of workflow going from data into a static form. Shivang and I are joining from MailChimp, who's also been working with Generative AI for a very long time. Those use cases started with just coaching, so helping to evaluate emails that were being written to help with some best practices and ways to improve those emails. And as the technology has evolved, we've had a lot of fun creating much more dynamic experiences where we're co-creating more so than just reviewing the content being created. 

BP When you say ‘co-creating,’ do you mean keeping a human in the loop there? 

JF Yes. Yes, we do. I'm sure many of your listeners and the two of you have experienced this as well, this is the most fun age of product thus far because you truly can see what each and every customer is doing alongside the AI. And there are patterns for sure, but the creative use cases and the brainstorming that you see happening in the product is very, very cool to watch.

BP Nice. And so Shivang, are you also participating in these experiments? Are there things you've worked on recently that get you excited? 

SS I would say essentially that I'll double down on what Jon mentioned, specifically on the co-create part. The technology has advanced quite a lot in the generative AI world and being able to have users explicitly co-create with a technology solution that has never been done in the past is extremely exciting to see, especially when you are in the sessions and people are like, “Whoa, this looks amazing. I did not expect it to do this. I did not know it was supposed to be this advanced.” Those have been really exciting conversations with our customers. 

BP Nice. So last time we talked, we were sort of looking at a high level at GenOS, but today I'd like to drill a little bit into the product side. What did you actually use to build MailChimp features and how do those function from a sort of software perspective? 

JF Maybe, Ben, I could tee up for the audience some of the features we're talking about and then Shivang and I could collectively bring to life some of the technology that we've used to build them. As I mentioned earlier, we've been leveraging generative AI for a long time to do the coaching of content, and we moved from that into subject line generation, which is fairly easy to do when someone has created an email. But from there, we've made really rapid progress in actually co-creating emails, so removing that cold start problem and starting someone with an actual, “Here's a first draft to react to.” Not just the content, but also the layout. And MailChimp has a powerful automation engine, so coupling it with the ability to connect to a use case such as that you’ve got a new product in your store. So we're able to generate a new product campaign for you that's triggered by that new product showing up in your store. So not only creating the content, but now starting to do a lot more of the work for customers. And then most recently, we've been essentially going through the product and in any place where we have workflows that we can just leapfrog with a more conversational workflow, we've been doing so. So these range from the co-creations or co-creator within the construct of an email or an SMS through allowing customers to create a segment by just describing the segment, like, “I want to send a campaign to those in Atlanta who are into the Braves and the Hawks together.” And that would come back with a list from your audience, and then things like analytics and some of the more visually pleasing results that we put in front of customers. And I promised I would hand Shivang off to you to maybe speak to how we bring some of this to life at Intuit. 

SS Yeah, absolutely. In the previous podcast, Ben, we chatted about Intuit Assist, which is the generative AI assistant for Intuit, and how it gets powered by the generative AI operating system GenOS that is essentially the system for Gen AI at Intuit. A lot of this co-creation essentially gets powered through it. We can go into very specific details. Imagine you are a content creator or marketer trying to set up campaigns and email, and you are on this page where you are about to put in content or a subject line or something and you're essentially hitting what you call an empty cursor at this point. This is where Intuit Assist comes into play. This is where we essentially send the context, the intent of what this email needs to be. Is it a Thanksgiving campaign? Is it a Fourth of July campaign? And you know the intent because that started this journey in the first place. We gather all of the content, we send it to GenOS, and eventually we get this content ready to go for them. At least it gets them started. Doubling down on the aspect of co-creating, you don't want to get kind of a writer's block there. You want to get something up there in the email and then you can start editing it, start making it better. So those are the kind of use cases and that's how we kind of leverage the power of LLMs essentially, where we are right with them as and when they're starting to work on building this creative content. 
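The flow described here, gathering the campaign intent and whatever context is already known and sending it off for a first draft, might look roughly like the sketch below. Everything named in it (DraftRequest, build_draft_request, the field names, the example business) is hypothetical and is not GenOS's or Intuit Assist's actual interface; it only illustrates bundling intent and context into a single payload.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DraftRequest:
    """Hypothetical payload for a first-draft request; not a real GenOS schema."""
    intent: str    # e.g. "thanksgiving_campaign"
    channel: str   # e.g. "email" or "sms"
    context: dict = field(default_factory=dict)  # what is already known about the business


def build_draft_request(intent: str, channel: str, **context) -> str:
    """Bundle the intent and the available context into one JSON payload."""
    if not intent:
        raise ValueError("intent is required: it is what started the journey")
    # Drop empty context values so the model is not handed blanks.
    cleaned = {k: v for k, v in context.items() if v not in (None, "", [])}
    request = DraftRequest(intent=intent, channel=channel, context=cleaned)
    return json.dumps(asdict(request), sort_keys=True)


payload = build_draft_request(
    "thanksgiving_campaign",
    "email",
    business_name="Example Bakery",
    new_product="pumpkin pie",
    audience_size=50000,
    tone=None,  # unknown, so it is left out of the payload
)
print(payload)
```

The point of the sketch is only that the intent is known before the user reaches the empty cursor, so it can travel with the request instead of being asked for again.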

BP Nice. I like it for the rubber ducking. It's sort of like, “Hey, it's Thanksgiving. I'm not feeling too inspired, but I know I've got this new product. Actually, you know I've got this new product, and this has been working with this audience segment, but I want to try to reach out to these folks,” and then it gives you something and you can respond. What's the back and forth look like there? Is there an iterative process where each side can give a little and change a little?

SS Yes. What we essentially do is we capture as much context as we have about the user and the small business and we provide it as a part of our request. “This customer is sending this email to 50,000 customers. Make sure that it's done with the right level of personalization, the right level of content as a part of that email.” Now, some of these are welcoming ones. You're essentially personalizing it for, “Hey, welcome Shivang joining my newsletter,” for that matter. So there's a back and forth there in the sense that, “Oh, I wanted a welcome newsletter, but I want it to be a lot more specific in marketing a specific product or a specific service.” So don't just welcome them, but also take this opportunity to add more context of marketing my business. So there's definitely that back and forth between the Intuit Assist framework and also the user. 
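One way to picture that back and forth is a session that keeps the original context plus every draft and every piece of user feedback, so each revision builds on the last. This is a minimal sketch under that assumption; DraftSession and its methods are invented for illustration and say nothing about how Intuit Assist actually stores a conversation.

```python
from dataclasses import dataclass, field


@dataclass
class DraftSession:
    """Hypothetical record of one co-creation session."""
    base_context: str
    turns: list = field(default_factory=list)  # (role, text) pairs, oldest first

    def add_draft(self, text: str) -> None:
        self.turns.append(("assistant", text))

    def add_feedback(self, text: str) -> None:
        self.turns.append(("user", text))

    def next_prompt(self) -> str:
        """Render the context and the full history as the prompt for the next revision."""
        lines = [f"Context: {self.base_context}"]
        lines += [f"{role}: {text}" for role, text in self.turns]
        lines.append("assistant:")
        return "\n".join(lines)


session = DraftSession("Welcome email for a newsletter with 50,000 subscribers.")
session.add_draft("Welcome to our newsletter!")
session.add_feedback("Also promote one specific product.")
print(session.next_prompt())
```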

RD So how much of these features, these tools, were already available and implemented in GenOS and how much did you have to kind of create and add yourself? 

SS As a consumer of GenOS, you probably bucket the developer experience in three places. The very first one is the paved path to adopt the system. It exists, you go in, you essentially do a few clicks in our internal tooling system, and boom, you're ready to go. In a matter of, I would say, hours, you should be able to make these changes within your product. The next point, and I think, Ryan, this is where you were going to, was the extensibility of the platform. And in the previous discussion I mentioned, it's essentially like a Lego table and you have all of these Lego pieces that are already available, but the extensibility is important. One of the key use cases that Mailchimp had from a domain perspective is that not only do we need agents and the tools and making sure that the orchestration happens the right way, but once in a while we also have these use cases where we just want to get to the LLM and get a response back because it's literally a stateless transactional answer that we need from the LLM. Those are the things that are very easily extensible. You look at this whole orchestration layer of agents and tools, and then we essentially had this request of, “Hey, we're spending some time here. We don't really need to from a latency perspective. Why don't we just go straight in and get the answer?” We are able to extend that pretty easily, I would say maybe within a couple of days, if I remember correctly. So the extensibility was very important. And then finally, the third bucket for us is the actual contribution, like a big Lego piece that didn't exist and needed to be built and put on the Lego table for others to use. The majority of our use cases require very strong context generation and prompt engineering, especially when you're writing this creative content. So one of the things that we explicitly needed was governance around these prompts because system prompts are pretty critical. 
So what we very clearly needed was some form of a system that manages prompts. We essentially called it Prompt Manager, and what we ended up doing was we built a governance system around it. We had versioning of prompts, we had security and ACLs around prompts, who can update them, how they update them, and also contributing that as a component back to the operating system to make sure that if others want to leverage it they can, because now that system already exists, the management of prompt engineering kind of already exists now. So we definitely had to extend the operating system. We definitely had to contribute to the operating system, but because all of this system is very loosely coupled, that is comparatively easy for MailChimp developers to do.
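To make the governance idea concrete, here is a toy in-memory sketch of a prompt store with the two properties mentioned: every update creates a new version, and only users on an access list may update. It is an illustration under those assumptions only; the class, method names, and users are hypothetical and do not describe the real Prompt Manager.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PromptRecord:
    versions: list = field(default_factory=list)  # version n is versions[n - 1]
    editors: set = field(default_factory=set)     # users allowed to publish new versions


class PromptManager:
    """Toy prompt store with append-only versioning and a simple ACL."""

    def __init__(self) -> None:
        self._prompts = {}

    def create(self, name: str, text: str, owner: str) -> int:
        if name in self._prompts:
            raise KeyError(f"prompt {name!r} already exists")
        self._prompts[name] = PromptRecord(versions=[text], editors={owner})
        return 1

    def grant(self, name: str, granter: str, user: str) -> None:
        """An existing editor may add another editor."""
        record = self._prompts[name]
        self._require_editor(record, granter)
        record.editors.add(user)

    def update(self, name: str, text: str, user: str) -> int:
        """Append a new version; earlier versions are never overwritten."""
        record = self._prompts[name]
        self._require_editor(record, user)
        record.versions.append(text)
        return len(record.versions)

    def get(self, name: str, version: Optional[int] = None) -> str:
        """Return a specific version, or the latest when no version is given."""
        versions = self._prompts[name].versions
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise ValueError(f"no version {version} of prompt {name!r}")
        return versions[version - 1]

    @staticmethod
    def _require_editor(record: PromptRecord, user: str) -> None:
        if user not in record.editors:
            raise PermissionError(f"{user!r} may not modify this prompt")


manager = PromptManager()
manager.create("welcome_email", "Write a warm welcome email.", owner="alice")
manager.grant("welcome_email", granter="alice", user="bob")
latest = manager.update(
    "welcome_email",
    "Write a warm welcome email that mentions one product.",
    user="bob",
)
print(latest, manager.get("welcome_email"))
```

Keeping versions append-only means a system prompt that turns out to behave badly can be rolled back by asking for an earlier version number.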

RD Do you know if anybody is using that new Lego piece you added? 

SS So far, it's MailChimp heavily using it because we have a very strong presence of prompts and prompt engineering, but it is built in a way where literally anyone can use the Lego piece. That's the power of the system. 

BP What qualifications do you look for in a great prompt engineer? I myself have an English degree. What are you thinking? 

SS I'll give my two cents on that. Maybe Jon, you can probably provide some, too. As a language, definitely it'd be good to have a very strong domain knowledge of the business that you are in. You need to very clearly understand that it's not just about marketing, it's also about the context that you give, what context is available within the domain of marketing, within the domain of MailChimp, within the domain of any product that you are in. And then I think you definitely need some good level of understanding of how large language models generally operate. You need to find that way to navigate into, “Oh, these are the couple of keywords or couple of reverse prompt engineering that you'll have to do in order to get to the right prompt that the LLM will understand.” 

BP It seems to me that a great prompt engineer is someone who doesn't mind playing around with the system for a long, long time to find their way to the right prompt. Go ahead, Jon.

JF No, I was going to add something similar, Ben. I think the most important attribute that we're seeing success with is just extreme curiosity. Trying to design something that you have a high confidence that customers will use that way is no longer a construct that makes any sense. So not only designing a good prompt to begin with but then attacking it with really strong listening posts and then that curiosity to see what customers are actually doing with it and that iteration cycle is what I look for right now.

RD In any of my experimentations with Gen AI, it's been a lot of playing with it and trying stuff and seeing what I can do. And I'm wondering, when I first approached it it was a little intimidating because that costs money. How do you address that within your Gen AI offerings and even in the internal stuff?

JF We are strongly embracing it. We deeply believe that every little piece of insight that we're learning from customers is so tremendously more valuable than whatever it costs at the moment, and so we're encouraging, embracing, and really pushing for as many iterations as it takes. And then as Shivang mentioned, and as I mentioned before with some of our customer use cases, we're finding these deep grooves that are really big unlocks that are not necessarily constraining the prompt, but the area where the technology is not just a parlor trick, where it actually connects with what customers want to be using on a regular daily/weekly basis. 

SS Exactly the same. You can't put a price on curiosity and customer learning right now, especially when the technology is so new and we know it will absolutely have a huge impact on how businesses behave. 

BP Nice. So I guess give us a sense– you're excited to be working on this. You're getting it into the hands of customers. Are customers comfortable using AI generated content? At what scale are they using it and what kind of feedback are you getting from them?

JF To your first question, Ben, yes, customers are comfortable using it. I had to recheck the data actually when I saw how many customers were reusing it, reengaging, coming back, and almost immediately making it a part of their daily usage pattern. And typically it takes some time for that habit to form, so I have been pleasantly surprised at how quick the uptake has been, and I think it's because it's so versatile. It's not one rigid feature that was built, it's very versatile in terms of how customers are applying it. 

SS Yeah, similar sentiments. I think experimentation with velocity is the name of the game right now. There are a lot of places you can and will leverage Gen AI, it's just a matter of how we fast track our learnings to a point where it will be most impactful. 

RD Are you seeing any best practices emerge? 

JF One of my favorites: for a long time, we've all looked at customer surveys, at feedback that comes in in a categorized way. And then of course you always have that “other” bucket, and that “other” bucket is typically the biggest bucket. And even the stuff that's not othered, you just have to sometimes wonder what's underneath the hood there. So one of my favorite best practices that we've adopted is, instead of firing up a dashboard and looking at any sort of qualitative feedback from customers, we just dump that feedback into GenOS and we start having conversations with our customers. And it also allows us to be a lot more dynamic in terms of the way in which we ask those survey questions up front. That has been the most fun thing for me. 

SS And specific learnings for us are mostly around having a very tight feedback loop of the responses getting generated, the responses that were supposed to be generated, and having that continuous loop of, “Did the customer like it? Did the customer actually apply it in their email or apply it within their content? Did they actually send it out to their customers as an email?” That tight feedback loop really helps us understand the systems that we have built and the ones we are going to build. Getting to a point where we can have that level of automated feedback understanding is one of the best practices. Because otherwise you're kind of running blind here, so it helps to have that feedback loop. 
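The three questions in that loop (was a response generated, did the customer apply it, did they send it) form a simple funnel. The sketch below computes what share of generated drafts reach each stage; the stage names, the event format, and the sample data are all assumptions made for illustration, not Mailchimp's telemetry.

```python
# Hypothetical event stages, in funnel order.
STAGES = ("generated", "applied", "sent")


def funnel_rates(events):
    """Given (draft_id, stage) pairs, return the share of generated drafts reaching each stage."""
    reached = {stage: set() for stage in STAGES}
    for draft_id, stage in events:
        if stage in reached:  # ignore stages outside the funnel
            reached[stage].add(draft_id)
    generated = reached["generated"]
    if not generated:
        return {stage: 0.0 for stage in STAGES}
    # Only count drafts that were actually generated, so stray events cannot inflate a rate.
    return {stage: len(reached[stage] & generated) / len(generated) for stage in STAGES}


events = [
    ("d1", "generated"), ("d1", "applied"), ("d1", "sent"),
    ("d2", "generated"), ("d2", "applied"),
    ("d3", "generated"),
    ("d4", "generated"),
]
rates = funnel_rates(events)
print(rates)  # {'generated': 1.0, 'applied': 0.5, 'sent': 0.25}
```

A drop between "applied" and "sent" would suggest customers edit the draft but then abandon it, which is a different problem from drafts that are never applied at all.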

BP Often when I'm interacting with a Gen AI, there's a thumbs up, a thumbs down. We have one internally. There's the Pinocchio nose if it hallucinates. There's, “I'm confused by this,” but how do customers give you feedback and can they send both a positive and a negative signal that helps inform what you would do in the future?

JF Yes, although we don't have the Pinocchio nose. I'm very interested in this. I think I’m going to follow up afterwards to see what this looks like. 

BP That’s for when it gives false information.

[music plays]

BP All right, everybody. It is that time of the show. Let's shout out someone who came on Stack Overflow and helped to spread a little knowledge. “How to return 0 if status is 200 in cURL.” If you've ever wondered, we've got an answer for you, and shout out to Dherik on your Lifeboat Badge for saving this question from the dustbin of obscurity. I am Ben Popper, I'm the Director of Content here at Stack Overflow. You can always find me on X @BenPopper. If you have questions or suggestions or you want to be on the show, hit us up– podcast@stackoverflow.com. And if you like the show, leave us a rating and a review, because it really helps. 

RD I'm Ryan Donovan. I edit the blog here at Stack Overflow. You can find it at stackoverflow.blog, and you can still find me on X @RThorDonovan. 

SS I'm Shivang Shah, Chief Architect for MailChimp. You can find me on LinkedIn, shivangshah15@gmail.com, that's my email too right there. And if you want to play around with Gen AI stuff, please sign up at MailChimp.

JF And I'm Jon Fasoli, Chief Product and Design Officer for MailChimp. On the socials as @Fasoli, except for X, where you can find me at @JonFasoli. 

BP All right, everybody. Thanks for listening, and we will talk to you soon.

[outro music plays]