The Stack Overflow Podcast

How we keep Stack Overflow's codebase clean and modern

Episode Summary

We chat with Roberta Arcoverde, the tech lead on Stack Overflow for Teams. She explains why we ignored several "best practices" when building Stack Overflow's public site 12 years ago and how we're working to adapt and modernize our codebase so that it's approachable and powerful more than a decade after inception.

Episode Notes

You can find Roberta on Twitter. For anyone who understands Portuguese, you can also check out her podcast.

Check out Roberta's recent blog post on best practices, and when to ignore them.

If you're interested in Dapper, an open source project built by Stack Overflow folks that works as a simple object mapper for .NET, you can check it out here.

Thanks to our lifeboat badge winner of the week, Colonel Panic, for explaining: What the boolean literals in PowerShell are

 

 

Episode Transcription

Roberta Arcoverde Our code base is 12 years old, but it doesn't look like legacy code at all. We are very concerned with making sure that we have the latest versions of .NET Core and the latest C# running, and whenever a new version is launched, we actually proactively make the effort of going through the entire code base and updating it, making it look fresh, making it look new, which as an engineer is great. Everybody wants and loves to work on new things and the latest and greatest.

[INTRO MUSIC]

BP Hello, everybody. Welcome to the Stack Overflow Podcast, a place to chat about software and technology. I'm Ben Popper, Director of Content here at Stack Overflow. And I'm here with my wonderful co-hosts Paul and Sara. Hi y'all.

SC Hey everyone! How's it going?

PF Hello friends! Good day! 

SC Good day!

BP Yeah, it's pretty good. 

PF Here we are. Trying not to say good morning, even though we're recording in the morning.

SC I know!

BP I didn't say it. 

PF Yeah, no. Asynchronicity is mentally challenging in programming as well as life, we're just getting through.

BP Paul, I was playing around trying to make the dog park app last week and I don't know do your kids do this, whenever they see this, they're like "You know what? I'm gonna grow up to be a hacker too!"

SC That's nice!

BP They say hacker. That's like their word for somebody who uses a computer.

PF Yeah, that's, they learned that from you. You know what my kids are into? My son likes to talk about glitches, he's really into glitches. 

SC Oh, really? Yeah. 

PF That you could, like, go down the rabbit hole in Minecraft and there'd be like a whole 'nother world, or that the structure of reality is actually just a construct. It's really flimsy and you could kind of tear it away.

BP Has he seen The Matrix or he came up with this all on his own?

PF No, you know what has it? It's Wreck It Ralph.

BP Ah, yeah. 

SC So I thought he was like building a glitch, like Ben is. But no, you mean like glitch in the matrix. 

PF Glitch in the matrix. Yes. But you know, it helps you remember just what good branding that is for Glitch.com.

SC Yeah, it's true! 

BP So today, we have a wonderful guest, Roberta Arcoverde. Today, she is a software engineer here at Stack Overflow and has been for many years. I think, congratulations on your seven year anniversary?

SC Seven year!

RA Thank you! Yeah. So yeah, I started in 2014, March 2014. And, curiously, the first person to greet me when I first started on Twitter, was Sara Chipps asking me "awesome, have you ever played MTG?"

SC Oh, that's great. [Sara laughs]

PF Wait, wait, MTG meaning Magic: The Gathering?

SC Yes, of course. 

PF Neerrdds. [Sara laughs]

SC We used to play at the office. Yeah. 

BP For welcoming new people. Yeah, that's a common way to just be like, check the levels.

PF It's weird that they pay you in cards. That's, I don't think that's cool.

SC They're valuable cards, though. 

BP It's like Bitcoin, you get paid in original black Lotus. And six years later, it's worth a whole bundle. 

PF That's what they told me.

BP But Roberta, you're joining us today. You wrote a great story for the blog, along with my colleague, Ryan Donovan, about best practices and when you should ignore them. So tell us a little bit, I guess, about how you came to Stack Overflow and what you've been building. And then we can sort of segue into, you know, the argument you were making in this piece?

RA Yeah, absolutely. So I'm a tech lead on the Stack Overflow for Teams team. So that's the team that is building and responsible for Stack Overflow for Teams, one of our products.

SC It's very confusing internally sometimes. But yes, very great team. Very great product.

PF Wait, let's, let's, I mean, what is the Teams product? Just in case people aren't keeping close track at home?

RA Oh, yeah. So Stack Overflow for Teams is a private instance of a Q&A site like Stack Overflow. And you can use it at a company, or a university, or places where it wouldn't make sense to ask questions on public Stack Overflow or other public network sites. So we can sell you your very own instance of Stack Overflow.

BP So yeah, just like, for people, Microsoft has, you know, 70,000 seats, lots and lots of people. There's little startups like Astro VR, who I think I interviewed once. They have like 40 seats. And then Delft University in the Netherlands, they have a computer science department and a couple hundred seats, but they use that, you know, for every, like, incoming class.

PF I mean, it's already the knowledge management tool of the entire internet. Why not have one back at the office? It absolutely makes sense.

BP So yeah, Roberta, what's your sort of like, day to day like, you know, on Teams, but I guess more centrally to sort of this piece, you're talking a lot about building, you know, some of the sort of fundamental infrastructure of Stack Overflow, like why certain decisions were made about the public side, right?

RA Yeah, absolutely. So I actually thought this would be non-controversial, but we do have an industry, [Roberta laughs] we do have a set of so-called best practices for building systems and good quality architectures. And one thing that I wanted to emphasize with that piece is that every best practice, not even in the software development industry, but in general, has to be in a certain context, so you cannot remove the context from whichever practice you adopt. In software development, that's the same. We also kind of have the habit of saying that there's no silver bullet in the software development industry. But at the same time, sometimes we treat some of those best practices as if they were silver bullets, like if you adopt a certain way of building systems, that's gonna make it, you know, not fail, that's going to make it perfect. And we do know that every decision that we make comes as a trade-off: you're giving something else up. And in our case, that's where the piece came from, actually. We were reflecting on our history. We have a 12-year-old codebase, and it comes with a lot of problems. And we did make a lot of mistakes along the way. But we also made a lot of right decisions. And one of those was prioritizing performance when we first started. So that's actually what the piece is about: trade-offs and context, and when to adopt best practices and when to adopt other practices because you have other priorities.

PF Out in podcast land, 40,000 middle-aged men just stood up holding their, like, Gang of Four patterns book and said "Over my dead body!" It is what it is, right? I remember this coming into this industry: you can spend years trying to do the right thing and understand the abstractions. And actually, I think that's what killed me about that part of programming. Java was getting really big, design patterns. This way of programming was getting really big. And it was like they threw the book at you the minute you showed any interest in code. It was just like, okay, good. Here's what it is. And you didn't know how to add two integers together. But it was like, you need to think about singletons. I remember just being horrified for years. Like, I couldn't understand anything. It actually turned out that they were dealing with problems that were on such a larger scale, right, than anything I was dealing with when I was learning and getting my start. And yet people want to get you into that big, you know, best practices to me are always really big and really enterprise and not about doing things in a small way so that you can learn. But now you represent working on this really, really big platform. Like, I have to imagine you need a lot of that sort of more abstract stuff to get your work done. So how do you balance that out?

RA Right. Yeah, absolutely. I have the design patterns book myself, big fan of the Gang of Four. I have a master's degree in architecture quality, actually. But I think that we have to observe those sorts of best practices and the way that we design our systems in the light of the context where they are, right? And 12 years ago, our context was we didn't have a lot of hardware, we didn't have a lot of resources. And how can we write software that runs really fast using as few resources as possible? And unfortunately, the code design decisions that we made back then were to emphasize that, okay, performance is our primary goal. Testability is not. And unfortunately, a lot of the design patterns that we have in place are there to improve modularity and decoupling. And that's exactly the sort of thing that we made very hard by ignoring things like dependency injection, for example, because that involves instantiating new objects that will have to be collected, and that will have to use more memory than we wanted to. But like I said before, that was the context that we had. And for a long, long time, those decisions paid off. And they were successful, right? But now our context has changed a bit. Now, we are actively trying to improve areas of the code to make them more testable. And we know that we can do that in certain places, because we have built a system that runs so fast that, while we still need to worry about performance, we can also give a little of it up for a little bit less coupling, so that we can make the code a little bit more testable. And that's specifically important now that we're handling private data, especially on my team, the Stack Overflow for Teams team, where security testing is actually very, very important. So we are happy to change our minds now, and to go back and reassess and rebuild what we have to rebuild.

BP I want to ask a question. But first, Sara, let me throw that to you other things from your career or from your time in Stack Overflow that you can think of in this in this context, like areas where you either broke the rules or came across some stuff that didn't make sense, given what you had at the time? 

SC Yeah, well, I would, actually, that's a good question. I think Roberta said it better than I could have, a lot of the trade-offs that we made as a company in order to do those things. I think my question for Roberta is, why do you think there's so much anxiety with that? Because what I heard from you is that there's a lot of anxiety that you hear from new coders about doing things correctly and doing them right. Where do you think that comes from? And when in your career do you think you start to realize, oh, these rules aren't always hard and fast?

RA Absolutely. So at Stack Overflow for Teams, a vast majority of our audience has actually never interacted with the site before. Those are new users who are creating an account at Stack Overflow because of Teams. And they are not necessarily as familiar with the site dynamics as our main site users are. And we also have audiences that are not developers, people from other departments of the companies that may have purchased Teams. And we need to build this product thinking about this new audience as well. So that reflects in new UX, that reflects in new features. But from a completely engineering perspective, what we have now are different concerns, right? We are not as concerned with performance as we are for the main site, because our audience is smaller. But at the same time, we are way more concerned with things like usability, security, of course, because we are handling private data, and other engineering problems that we do not have in the public platform. Like, we don't have access to the users' data, we don't have access to their databases, and therefore some things that we can troubleshoot on Stack Overflow, the main site, by having direct access to the data, that's something that we cannot do on Teams. So we need to improve our troubleshooting infrastructure as well.

[MUSIC]

BP Complex, multi-cloud environments, siloed teams, a huge volume, velocity and variety of data that overwhelms human teams. Well download Dynatrace's free ebook to learn how you can overcome these challenges, innovate faster, and transform the way you work with AI and automation. Visit dynatr.ac/SOpodcast to learn more. 

[MUSIC]

PF Can we brainstorm some silver bullets? Like there's automated testing, design patterns. What are some of the other things that have showed up? Functional programming in the last few years. Other silver bullets?

SC Normalized databases.

RA I don't have any favorites. But one that I would mention for sure is microservices. That has been showing up a lot in meetups and conferences and discussions about how to best build scalable systems, which is a little bit ironic, because we are very invested in making our system perform. And it is a monolith.

SC Yeah, that's one thing, Roberta, you mentioned in the article, is something that we use called Dapper, which I was really surprised about when I looked into working here. It's literally, I mean, I don't want to be reductive, but it's literally just SQL statements. Like, a lot of ORMs have a lot of bells and whistles where you can just do, like, object dot blah, blah, blah.

PF Dapper is an ORM that Stack built. Is that right? 

SC Yes. Yeah.

PF Okay.

RA So Dapper, actually, is one of those systems we wrote back in the day because we wanted to run real fast and using as little memory as possible. It's a micro ORM, an object relational mapper, and it is really, really fast. And it has a really, really small memory footprint. Dapper was created by Marc Gravell. And it is currently, curiously, one of the most popular open source projects for .NET on GitHub.

PF It's got 13,400 stars, that's a lot of stars. 

SC Lotta stars.

BP Not to toot our own horn.

PF That's like as many as there are in the Northern Hemisphere.

SC In New York, maybe.

BP  Roberta, just to sort of back out for one second, the piece that you wrote about Stack Overflow. What was the trade off that we did there, just so people can have the context? It was about testing and speed? 

RA Yeah, absolutely. So it's not necessarily testing versus speed, it's more like what kind of code design we want to write, right? And unfortunately, our design is very coupled. And it gives us a lot of static methods and service locators. And those things make our code a little bit harder to test. They make it faster, but they do make it also harder to test. And right now, what we are doing is revisiting those paradigms, because our context has changed, so we actually need to pay more attention to automated testing now. The software industry as a whole has also evolved. So there are new practices and patterns that we can adopt now without damaging performance too much, but also making our code easier to test than it used to be.
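
The coupling trade-off Roberta describes can be sketched in a few lines. This is a hypothetical Python analogy, not Stack Overflow's code (which is C#): the names `Current`, `is_recent_static`, and `RecencyChecker` are invented for illustration. The first style reads a shared global directly, so nothing is allocated per call but a test has to patch global state; the second takes its dependency as an argument, which costs an extra object but lets a test pass in a fake.

```python
import time


class Current:
    """Hypothetical global access point, similar in spirit to a service locator."""

    clock = time.time


def is_recent_static(posted_at: float) -> bool:
    # Tightly coupled: reads the shared global directly.
    # Nothing to construct per call, but a test must patch Current.clock.
    return Current.clock() - posted_at < 3600


class RecencyChecker:
    """Same logic, with the clock injected through the constructor."""

    def __init__(self, clock):
        self._clock = clock  # any zero-argument callable returning seconds

    def is_recent(self, posted_at: float) -> bool:
        return self._clock() - posted_at < 3600


def fixed_clock() -> float:
    # A fake clock for tests: always reports the same moment.
    return 10_000.0


if __name__ == "__main__":
    # Injected version: the test just passes the fake in.
    checker = RecencyChecker(clock=fixed_clock)
    print(checker.is_recent(9_000.0))

    # Static version: the test has to swap shared state and restore it.
    original = Current.clock
    Current.clock = fixed_clock
    try:
        print(is_recent_static(9_000.0))
    finally:
        Current.clock = original
```

The injected version needs one more object per caller, which is the kind of allocation cost mentioned earlier in the conversation; the static version avoids it at the price of tests that share mutable global state.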

PF This is not a popular attitude in our broken industry, right? How do you, you're a manager. How do you get people to calm down long enough about whatever they believe is the right way in order to just get some work done? Do you find that this is something you have to help people get over? Or they just--how does it work in practice?

RA Right. Yeah, that's interesting, because our code base is 12 years old, but it doesn't look like legacy code at all. We are very concerned with making sure that we have the latest versions of .NET Core and the latest C# running, and whenever a new version is launched, we actually proactively make the effort of going through the entire code base and updating it, making it look fresh, making it look new, which as an engineer is great. Everybody wants and loves to work on new things and the latest and greatest. But the other thing that I really like about the way that we design our software is that because we focus so much on being clean and being fast, it turns out as a very objective kind of code, you know. So when you read it, it's very easy to tell what's going on. And Dapper, that we mentioned before, is actually a great example. As a micro ORM, Dapper forces you to write your own SQL; it does not write SQL for you. And the advantage that that brings to me is that I can look into code and I know exactly what query is going to be run against my database. And that makes it easier for me to reason about all the things that are happening across all layers of our infrastructure. Stack traces of exceptions are very short, very lean, because there are not a lot of dependencies and modules involved. It is a monolithic application. So we have a single code base. But that is really direct, objective and consistent. We have, I think, around 20 developers working on it right now. But it's very hard to tell who wrote what, because everybody follows the same patterns and coding styles, which also makes it really easy to work with and easy to read.
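
To make the "you write your own SQL" point concrete, here is a minimal sketch of the micro-ORM idea in Python using the standard library's `sqlite3`. It is not Dapper's API (Dapper is a .NET library); the `query` helper, the `Question` class, and the `questions` table are all assumptions made up for this example. The helper only does the mapping step: it runs exactly the SQL it is handed and copies each row into an object by column name.

```python
import sqlite3
from dataclasses import dataclass, fields


@dataclass
class Question:
    id: int
    title: str
    score: int


def query(conn, row_type, sql, params=()):
    """Run hand-written SQL and map each row to row_type by column name."""
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    wanted = {f.name for f in fields(row_type)}
    return [
        row_type(**{name: value for name, value in zip(columns, row) if name in wanted})
        for row in cursor
    ]


def demo():
    # In-memory database with a made-up schema, just for the illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO questions (id, title, score) VALUES (?, ?, ?)",
        [
            (1, "What are the Boolean literals in PowerShell?", 25),
            (2, "A question nobody liked", -3),
        ],
    )
    # The SQL below is exactly what reaches the database; nothing is generated.
    return query(
        conn,
        Question,
        "SELECT id, title, score FROM questions WHERE score >= ? ORDER BY id",
        (20,),
    )


if __name__ == "__main__":
    for question in demo():
        print(question)
```

Because the query text lives in the calling code, a reader can see what will hit the database without tracing through a query builder, which is the readability benefit described above.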

PF I mean, let's let's brainstorm with that, again, for a sec. Like, what, what makes for pleasant code?

BP I was listening to a podcast this week. It's interesting what you said, the way they phrased it was, you know, sometimes there's best practices or thing you get, but the ergonomics of the code, how easy it is for the team to work with is actually sort of like often an overlooked value that, you know, for something like our company where we have to sustain all these people for many, many years is pretty critically important, right? Like that it's pleasurable to work with. 

SC I like that the ergonomics of the code, that's a great way to put it. 

RA Absolutely. Well, it has certainly grown; it's a much bigger code base than it used to be. And the interesting thing about it as well is that it changes a lot. The last time that I ran some numbers, I noticed that over one year, we had touched over 80% of the lines of code in our code base, which is a lot. And that is a mixture of our efforts to make it look fresh, you know, and to update the coding patterns that we use, but also because we're always adding more features, we're always interested in making performance improvements too, and just evolving the product. You know, it's a very successful 12-year-old product. What I am very excited about for the future is our initiatives to start breaking it down a little bit more. Of course, every code base, and especially a 12-year-old code base, has its problems and areas that could be improved. So I'm really looking forward to us breaking it down, perhaps into APIs and services that we can consume. And this whole idea of breaking down the monolith is not because it's a hype and a bandwagon everybody seems to be jumping on, but because we really understand that it's going to make it easier for us to reason about and to work on it.

SC That makes a ton of sense. I've typed into the code just a few times to do things. And I found it really easy to navigate. It's super self-documenting, right? There's, like, not a lot of comments. And the only comments you'll see will be like, "I did this in a terrible way. And I apologize to you and your children for how this is written." But it's mostly very easy to navigate and understand, which I thought was really nice, especially given how many people have had their hands on that code for so many years.

RA Well, we actually use Teams internally a lot. Which is great, because, well, dogfooding and all. But yeah, the code is actually so easy and direct. There's not a lot of code comments, right? But we are actually really good at keeping Stack Overflow for Teams up to date with the latest questions. When onboarding developers, we encourage them to ask questions on SO for Teams, and therefore everybody can benefit from that shared base of knowledge.

PF You know, I always feel that, for me, the test is, I always think of it as the squint test, where you can kind of just tell from across the room by the level of indentation, or, you know, just sort of the granularity, if you're getting into a mess or not. And I'm looking at Dapper now. And I'm like, oh, yeah, this is all healthy and sane. You know, what it approaches is standard library code. Like, I like Python. If you look inside the Python standard library, it is a lot of really boring, well documented, nicely architected, very granular code, and you're like, oh, this is how you're supposed to do it. It's just, adults were here. That's all it looks like. It looks like a nice organized library. And I'm looking at Dapper and I'm like, oh, yeah, give me an hour and I can tell you roughly how this thing works. And then I'm sure it's a lot of edge cases and complicated things to figure out. But that hour is so important. When that hour is a couple of days to weed through a mess, you always lose track of why you're there in the first place. Then suddenly, everybody's refactoring everything and building a new standard library instead of solving the problem they set out to solve. So that makes sense to me.

RA Absolutely, wholeheartedly agree. It's the kind of thing that that's why it's also so great to have new people come in because they come with their own opinions and experiences and the diversity of opinions, especially on engineering helps us build a better product.

[MUSIC]

BP Alright everybody, every week we read a question and shout out somebody who got a lifeboat badge. That is a question that had a score of negative three or less and got up to a score of 20 plus. This week, the award goes to Colonel Panic. Thank you, Colonel Panic. "What are the Boolean literals in PowerShell?" And I have a comment here. "This is a good question and it got an upvote from me. I always upvote SO questions which show up in my Google search history. Those four downvoters should really rethink their attitude." Thank you for that, and shout out to Colonel Panic. Alright, everybody, I am Ben Popper, Director of Content here at Stack Overflow. You can always find me on Twitter @BenPopper. And you can always email us podcast@stackoverflow.com.

PF Roberta if people wanted to get in touch with you. What would they do?

RA Oh, yeah, I live on Twitter. So you can ping me and talk to me @rla4.

SC I'm Sara Chipps, Director of Community here at Stack Overflow. And you can find me at @SaraJo on GitHub.

PF And I'm Paul Ford, friend of Stack Overflow, check out my company Postlight. We're growing, we're hiring. Would love to talk to you. And can I leave us with a workplace.stackexchange.com question? [Ben laughs]

BP Always.

PF Ready? Okay, this is from today. It's been viewed 6k times. That's 6000 for people who don't know. "Is it okay if I tell my boss that I cannot read cursive?" That's a Generation Z problem right there.

BP Interesting one.

SC Who's writing cursive?

PF That's all. Just wanted to leave us with that. See everybody soon.

SC Okay, great. 

BP Alright, everybody.

[OUTRO MUSIC]