Changelog & Friends — Episode 59

Kaizen! The best pipeline ever™

Gerhard Lazu discusses continuous improvements to Changelog's infrastructure, including migrating from CUE to Go within Dagger, adopting asdf for dependency version management, and achieving faster deployment times.

Transcript (25 segments)
  1. SPEAKER_00

    We are go to seven, three, two, one.

  2. SPEAKER_01

    Back to Changelog & Friends, the weekly talk show dedicated to continuous improvement. Thanks to our partners for helping us bring you the best developer pods each and every week. Check them out at fastly.com and fly.io. Okay, let's talk. Well, we're here for Changelog & Friends and we have one of our oldest friends with us here to kick off this new talk show. It's Gerhard Lazu, what's up, man? It's good to be back. Everything's up. I was just telling Adam, everything that should be up is up. Nothing is down. Don't ask Gerhard that question. Nothing is down, okay? Oh my gosh. It's all up from here. And of course, Adam's with us as well. What's up, Adam? What's up? Chasing the nines. Chasing the nines. How many can we fit? They get more expensive as you go, don't they? They do, yes. Orders of magnitude, each of them. That's right. What's our SLA and SLOs, Gerhard? We're kaizening. I don't know. Okay. What do you want them to be? We'll make them whatever you want them to be. I want them to be just right. Not too much, right? Not too little, just right. Yeah. I think for changelog.com, it's 100%. That's just right. It's gonna be a billion

  3. SPEAKER_00

    dollars,

  4. SPEAKER_01

    please. Because they didn't go down. Like changelog.com did not go down. That's true. And that's Fastly. Thank you. Yep, that's Fastly. Speaking of being an old friend, how long have you known Gerhard? 2016? Yes. 15? Something, 16, I think. So 16 is when you started working on the code base. In fact, I've been creating this. I haven't got it finished yet because it hasn't turned out very well. But I'm creating this visualization with this tool I found called Gource. I don't know if that's how you say it, but it's G-O-U-R-C-E, like source. And it goes through your version control and the whole history. And it creates a visualization of people working on the code. And I did one across the eight years of our code base. And it was like 45 minutes long. So it's too long; like, who's gonna watch 45 minutes? So I started futzing with it, trying to make it better. Anyways, long story short, I know exactly when you started contributing. Because it was the summer of 2016. I remember your little avatar coming in on this visualization and touching all the files. But I knew before that, briefly at least, or I knew of you before that. I think you wrote a blog post for us about Ansible prior. And I think we've told the story before on Kaizens past. But what is that? Four plus three, seven years? That's it. That's a long time. I think we started talking in 2015. I think it was December or something around that time. And then it took us a couple of months, right, to figure out how and why. The why was important for me. And I'm glad that we got that right. Remember the 10 questions? I mean, I still have the email. We can dig it up. I think we've done this. I think we've done this before. We have. That's why I say we've told this story before on Ship It, on the previous Kaizen. So here we are now. We're on Changelog & Friends, but we're still Kaizen-ing. There was a big question on Ship It 90. Where would we Kaizen next? And this is where we decided to put our Kaizen episodes. 
Of course, it's also on the Changelog. This is our new show. It's also our old show, but it's just a different flavor of the Changelog. We're happy to have you here, Gerhard. We could maybe throw this on the Ship It feed for those folks if we want to, but we can talk about that perhaps offline. Maybe you'll be mad because it'll be like episode 91 and it'll ruin your flow. That's okay. I wanted fewer episodes. So I think this is okay. Okay. Why not? They're happening, but just less frequently now. Well, for those people who listen to the Changelog and we haven't Kaizen-ed on the Changelog, except for maybe a cross-post years ago when we first started Ship It, Gerhard, why don't you give the conceit of what Kaizen is and then what we do on these episodes? So Kaizen stands for continuous change for the better. And it has a strong association with Agile. And I know that somehow has fallen out of favor with industry. I'm not sure why exactly, but I've seen that there's a lot of anti-agile feelings out there. Maybe people have been doing it wrong. I know, that's my joke. Okay, that's possible. You're holding it wrong. Yeah, and one of the principles in Agile is to keep small improvements, keep iterating, and keep continuously delivering those small improvements. So for us, what that meant was that on a cadence, for us, it was every 10 Ship It episodes, that was roughly every two and a half months, we would talk about the improvements that we're doing for Changelog. So we don't have a release as such. There is no SemVer, we're continuously releasing. But what we do is we talk about all those improvements that we have shipped in that timeframe, whatever it is, on Ship It it was, and now on Changelog. So this idea of continuously improving something and then pausing and taking stock of what we've improved, it's almost like a retro for those, again, that have done Agile the right way, or even the wrong way. You can do retros the wrong way. 
Anyways, it was one of my favorite meetings because it brought everyone together and we would get so much better for it. But this idea of continuously improving, if you just keep focusing on that and always be better today than you were yesterday, that's all you have to care about. It doesn't matter how big or how small it is, they compound, that's the beauty of it. All those improvements compound over time. So keep improving, one of my mottos. So we're obviously fans of this process and we're putting our money where our mouth is, so to speak, because we're not just talking about our improvements, but we're committing to kaizening every so often. Our new cadence we're gonna try is gonna be every other month here on Changelog & Friends. We were doing it once every 10 episodes, like Gerhard said, which was roughly two and a half months. Now we've been a bit on hiatus because we've lost our groove. We've got our groove back now. So it's been longer than a typical kaizen time period. Doesn't necessarily mean I accomplished more than I normally do. I think we did actually. I think you did. Yeah, I think you did for sure. So let's start hitting through some of the stuff we've been working on in and around changelog.com. So for those first coming to Kaizen, changelog.com is our open source podcasting platform. It's written in Elixir and Phoenix. It's deployed to Fly with Fastly in front of it, has a Postgres backend and a pretty cool, I don't know, deployment infrastructure. What do you call it, a pipeline? I would say infrastructure. I think all of it, it's infrastructure. The way it's structured, the way we talk about it, the way we capture it, even in documentation, the way it's documented, I think it's really cool. Yeah, and so we're doing this. We've been coding on this, like I said, since 2016, for many years, kind of in fits and starts. We go heavy sometimes, we go light, we continuously improve it, and we also experiment. 
So one of the reasons we have this platform is so that we can try new services, try new techniques, hit ourselves on the thumb with a hammer and tell you all about it so you can avoid said hammer or give it a try if you like. And so that's kind of one of the ideas behind this. It's not merely to make changelog.com a better website or make the distribution of our podcasts better, although that's definitely a huge part of it. It's also for learning, experimentation and hacking because we're hackers and it's nice to have something to hack on together. I think that's very rare, right? Being able to do this in the open source, in the spirit that it was intended, to have the time, right? We are very busy during our day and during our work week and then the weekend comes and there's like all sorts of pressure on your time. Being able, just give yourself permission to try things just for the fun of it. It's so easy to just get down in ruts, or delivering off a backlog or whatever. There's never time to try things out. So this is, again, us giving ourselves permission to do fun things, talk about them, but also keep improving the platform, the whole changelog platform in an open way. My favorite approach. Yeah, so many tools I come across, I don't have a good enough excuse to try them because at a certain point in your life, it's just opportunity costs left and right. And it's like, if I try this thing, I can't do that other thing on my list of things to do. And so we almost need an excuse to tinker. And this has been, for me at least in certain ways, one of my excuses where I can feel like I'm also pushing the ball forward while I do something versus merely futzing around with my git config, my vimrc, right? Gosh, Jared.

  5. SPEAKER_00

    The

  6. SPEAKER_01

    old vimrc. Signed commits, finally, I see. I think since the last time at Kaizen, you have signed commits now, Jared. I did do something with that. We succeeded. I still see that, when we merge some stuff, there's still PRs that come in where there's still, what do you call it? A DCO check that's failing because the sign-offs aren't correct. And maybe it's because I rebased via the web interface. I don't know. I've had some problems where I'm like, why can't we just do it the old easy way? Why are you gonna have all this security and stuff? But you're dragging me along, kicking and screaming. All right, let's get into some of the major changes since our last time talking about this. The biggest one it seems like was the upgrade of Dagger and the switch from CUE, C-U-E, the configuration language for configuring our pipelines, to Go. And I think we had that last time, but now we're actually using it for more stuff. So obviously you can tell by the way I'm talking about it that Gerhard should be talking about it, not me. So go ahead. So in the last Kaizen, let me go back to the beginning. I think it was November 2021. That's when the story started. This experiment was a long, long one. And the idea was, why are we using YAML for all these pipelines? And at the time we were using CircleCI, we wanted to migrate to GitHub Actions. We're just trading one YAML for another. So I came across this tool called Dagger, and the whole idea was, you write CUE, you don't write YAML, and you can run your pipeline locally. And what that means is that you can run the same pipeline written in CUE, whether it's locally on your laptop, whether it's in GitHub Actions, CircleCI, it doesn't matter where you run it, it will always run the same way. It has a container runtime. So all operations run in various containers. You have the caching, you have a lot of nice things. That was, again, November; it was 0.1. We were very courageous, but it worked well. 
And it was a good improvement. We talked about it plenty. We wrote about it. Where do we write? It's all in pull requests. If you go to our GitHub repo, it's changelog.com. Even the topics that we're discussing today, there's a discussion, #452, where you can go and see all the various topics. And this is the first one, migrate Dagger 0.1 to Dagger Go SDK 0.5. And what that meant is that there was a big shift in Dagger from people that liked CUE to people that wanted to do more. They wanted to do Go, they wanted to do Python. Like why should you write your pipeline in any one language? Why can't it be the language that you love? And for me, it was Go, right? We didn't have Elixir at the time. By the way, that's changing. We can talk about it later. Oh, you're getting Elixir support? By the way, you came across Dagger, but then you also went and got a job at Dagger. So when you say we, people might be confused. When you say we, you're talking about you and your cohorts at Dagger, right? Yes, exactly. There's like, we means different things depending on the context. But yes, I joined Dagger as well. Yeah, I really liked it beyond just the tool. You can tell I'm really passionate about deployment, about pipelines, about CI/CD, all that space. That's the space where Dagger sits. And remember how we met, Jared? Deliver? I do. E-deliver, deliver and e-deliver. Exactly. E-deliver was a fork of deliver, the Bash framework for deployment that I wrote. So there you go, 10 years later, boom, Dagger came along. The rest was history. All right. So 0.5, which, actually 0.3, Dagger 0.3 introduced these SDKs. You can write your pipeline in Go, in Python, or in Node.js. We picked Go and we transitioned. We migrated from having our pipeline declared in CUE to a programming language, which has, again, a lot of nice things, right? 
When you have a programming language, you have very nice templating, you have nice loops, you have functions or whatever you may have, right? Can't you loop and do other crazy stuff in YAML too? You can, but it gets very, very messy if you do that. Very messy, yeah. And to be honest, why not use your language as much as you can? And again, we didn't have Elixir at the time, that's slowly changing. But I prefer to write my pipeline in Go. So the first thing which we did, we migrated 0.1, we were running it in the new one. So I think the last Kaizen when we talked about it, it was just a straight import, right? Like wrapping the previous version in the new version. A nice gradual improvement. Now, all of that has been rewritten in Go. We're using Mage locally to run the pipeline. So many things happened. Again, this was two and a half months ago. Now, I wanna show you something really cool. I'm going to share my screen. I'm going to go quite a bit. Do you see that? This is a pull request that has not been submitted yet. And this is Dagger engine on Fly apps version two. It's exactly what it says. We're experimenting with Fly apps version two, which is the latest implementation of apps in Fly. We're running Dagger on it. I'm connected here, experimental Dagger runner host, via WireGuard tunnel, and I'm running dagger run, the CLI, and I'm wrapping mage ci. If your mind is blown, that's okay. I think you need to watch this video. So you're running Mage inside Dagger, inside Fly on their v2 platform or locally. I'm wrapping mage in dagger run. dagger run is just a CLI command that connects my local command to a remote engine. It gives me this very nice view, which is my pipeline. I'm showing my Dagger pipeline that is running in this Dagger engine on Fly apps v2 that I'm connecting to via WireGuard tunnel. So what we can see here is we have three pipelines in one, and this is something that starts becoming even crazier. So we are building the runtime image. 
We are building the production image, which by the way, makes use of the runtime image. And down here, we're also running tests, mix test, but because nothing changed, everything is cached. It completes in seven seconds. So let me go into application.ex very quickly, and let me do a foo two, just a comment, foo two, okay? And I'm going to run the same pipeline again. So now what's going to happen, it will detect the code change, and now it has to resolve the dependencies, compile the app, run the tests, compile the assets, and all this in a very nice UI that shows us the different pipelines, how they run and how they combine. I'm really excited about this. I don't know about you. Maybe you're still like trying to process what you're seeing. Well, I'm still watching it stream by, and obviously our listener here is imagining this in their mind, in their mind's eye, but it does seem very nice. I like the fact that it's going to cache everything. So let's say I just update an image in my assets folder. I don't touch any Elixir code, and I deploy that out. And with this new code, it's going to run just the mix phx.digest command. It's not going to run compile and stuff. Not currently. Ah, see, I went to the logical conclusion. That's the next improvement. That's the next improvement. You beat me to it. Okay, so I'm almost excited. You'll have me excited later. Almost excited, yes. Oh

  7. SPEAKER_00

    gosh,

  8. SPEAKER_01

    this is still good. Now, if you're going to run this from the start, by the way, it should finish right now. It will finish in a minute and a half. Start to finish? Start to finish in a minute and a half. With no cache? Well, you do have a cache, right? If you try to compile all the dependencies from scratch locally, it will take you about six to seven minutes to get the dependencies, compile the dependencies. Remember, you have to do it for both test and prod. You have to digest the assets, build the image. There's like a lot of stuff happening behind the scenes. The next improvement would be to split the static assets from the dependencies from the actual application, right? So now all of a sudden you have like three inputs. Right now, we have just the application, and the application means the dependencies, the static assets, and the application code. So it's all seen as one, which is why it would run everything if something changes in any of those files. One minute and 39 seconds, right? It just recompiled everything. And by the way, this would also rebuild the image if it needs to. Let's say we bumped Erlang, part of the same pipeline. It would rebuild the runtime image, it would publish the runtime image. All of that would happen automatically.

  9. SPEAKER_00

    And

  10. SPEAKER_01

    that's why you don't have to worry. There's like a couple of good pull requests worth checking out. The first one is #454. If you go in our repo, you can see the whole migration: we removed the last Makefile, we introduced Mage. That's a fairly big one. There's a lot of refactoring there. There's another one, #464, where we are reading the versions for all the different dependencies, the runtime dependencies, Erlang, Elixir, Node.js, from .tool-versions. And that's an asdf configuration file. asdf is, I think I need to refer to the page: asdf manages multiple runtime versions with a single CLI tool. You could use Brew to install Elixir, but then which version are you installing? It doesn't do versioning very well, right? Having multiple copies or multiple versions of a binary, of a language runtime on your machine and being able to switch between them is not easy with Homebrew. I think it's maybe possible with Homebrew, but not easy, or maybe not possible. Yeah, I'm not sure which one. But with asdf, it's built specifically for this purpose. So this goes way back to the days of rbenv and what was the big Ruby one? Cause we had to do this in the Ruby world. RVM. RVM, Ruby Version Manager. And then there was NVM, Node Version Manager. And then there's probably EVM, maybe not, Elixir version manager. And so each little ecosystem, each language had their own version manager. And then the asdf folks came along and said, hey, let's build one that could handle all these distinct things with one API, one CLI. That's a really nice tool. I mean, I've been using it for years and excited to see this getting, you know, further into what we're doing. So keep going. This also solves the one string to rule them all in terms of versioning too, right? Like we had an issue where there were multiple versions of Elixir in the code base. And this is like one now, only one, right? Yes. 
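For reference, the .tool-versions file being discussed is plain text, one tool per line. A minimal sketch using versions mentioned elsewhere in this conversation (treat them as illustrative, not the repo's current pins):

```text
erlang 25.3.2
elixir 1.14.4
postgres 14.1
```

asdf resolves these per directory, and the Changelog pipeline reads the same file to pin the versions it builds and tests with.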
If you're using it, so if you follow the updates to the contributing guide, you can see how to use asdf to install all the things. Now we are versioning .tool-versions. We capture everything, including Mage, for example. Now you wouldn't want to use Mage other than if you want to run the pipeline locally. There's also PostgreSQL, right? So PostgreSQL is also versioned using this. If, for example, you're on Linux, there's like a couple of extra things that you need to do, because maybe you don't have some system dependencies; on macOS, it just works. So we are capturing every single dependency in this file, every single dependency version, like down to the patch. So for example, with Brew, because you mentioned Brew, Jared, you're right. There is a certain flexibility in terms of which version you can run, down to the minor. Some only have the major. So for example, PostgreSQL 14 or PostgreSQL 13, I think 13 is still available there. What you can't do, you can't say 14.1

  11. SPEAKER_00

    or

  12. SPEAKER_01

    14.8. And then there may be differences that you don't realize between your local development version that you're running. Everything looks fine. You push to production, guess what? Things start breaking suddenly. You don't know why. Well, we had one of those a long time ago where it was actually the Erlang patch version that was different. Not even the minor or major version, it was the patch, which had this bug in the TLS library or something that didn't exist on my local, but only existed in production. We ended up having to debug that sucker. Exactly. So asdf to manage all our dependencies locally, and you just need asdf, that's it. I know there's certain tools which use Nix, but then I forget the name. We mentioned it on changelog. But if you have that, then you have to use the Nix package manager and you have to install that. And I ran it for a while, but to be honest, you really want NixOS to get the best experience. But then you're running Mac, you can get the Nix package manager. Some things will be a bit weird, especially when you restart. At least that was my experience. So asdf is fairly lightweight for what it is and what it does. And we are reusing the .tool-versions file for our pipeline. We're reading this file in our pipeline, and that determines what version of Erlang we're using to compile in the image that we're using for tests, for example, as well as building the final image. What version of PostgreSQL we are running in production, and so on and so forth. Right, so is this, it's the .tool-versions, that's a hidden file, .tool-versions. Is this an asdf creation? It is, okay. And what is the syntax, or what is the format of this? It looks like it's just plain text. Do they have a spec? Like, here's how it works, or is it just simple enough they don't need it? It's very simple. It has multiple lines. On each line, you have the name of the dependency, space, the version of that dependency. So you have erlang, space, 25.3.2. 
New line, elixir, space, 1.14.4, and so on and so forth. Too simple to even write the spec down. Just look at it and you can see how it works. I know, right? I do like simple tools. So as a Homebrew user and an asdf user, I sometimes have to ask myself the question of like, what do I install this with? And my go-to logic is like, well, if I could ever imagine myself having to have two different versions on my machine at the same time, for instance, I have this project over here requiring Ruby 2.3.7 and then this project requiring Ruby 3.2.3 and I don't wanna deal with switching, then I go with asdf. Because of that, I'm a Homebrew-install-PostgreSQL guy, and now we have it inside of asdf, and so do I need to uninstall and install with asdf? I don't know, does that screw things up for me? It doesn't screw things up for you, no. It will keep things as they are. You can, for example, it won't prevent you from running a PostgreSQL that you've installed via Homebrew, so there's like no such guard in place. Because it's all containerized or at least isolated. If you're using asdf, it doesn't use containers. When you run the pipeline locally, basically it will ignore whatever you have running locally. It will read the versions from this asdf file, which is .tool-versions, and those are the versions that it will use for the pipeline, both locally and remotely. Now, if you, for example, want to switch to PostgreSQL, the same version that you're running in production, the same version that we're testing with in our CI/CD system, what you want is obviously to do the asdf integration. Usually when you integrate asdf with your shell, that will prepend the asdf path, which means that the PostgreSQL from asdf will have precedence over the PostgreSQL from Homebrew. 
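Since the pipeline reads .tool-versions to decide which Erlang, Elixir, and Node.js to build with, parsing it takes only a few lines. A minimal sketch in Go, assuming the usual asdf conventions (one name-version pair per line, # starts a comment); parseToolVersions is a hypothetical helper, not the project's actual code:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseToolVersions reads .tool-versions content: one "name version" pair
// per line, with blank lines and #-comments ignored.
func parseToolVersions(content string) map[string]string {
	versions := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) >= 2 {
			versions[fields[0]] = fields[1]
		}
	}
	return versions
}

func main() {
	// Versions from the conversation; treat as illustrative.
	content := `erlang 25.3.2
elixir 1.14.4
postgres 14.1`
	v := parseToolVersions(content)
	fmt.Println(v["erlang"], v["elixir"], v["postgres"])
	// → 25.3.2 1.14.4 14.1
}
```

The pipeline can then feed these strings straight into image tags or build arguments, which is how one file ends up pinning dev, test, and CI to the same runtime versions.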
Same is true for Erlang, Elixir, so you can have Erlang installed via Homebrew, but as soon as you're in asdf and you have it nicely integrated with your shell, then automatically anything that you have installed with asdf, that's what you'll be using in that specific directory. Gotcha. So the pipeline though, it does use containers, correct? Yes. Everything, all the operations actually run in containers. Imagine all the commands that you have in Dockerfiles, right, you have like those one-liners. Now imagine if you could write those one-liners in code. In our case, it's actually Go. We capture the equivalent of those lines. They're operations, basically. We're capturing the equivalent of those operations in Go code. They get submitted to the Dagger engine. They create a DAG view of everything that needs to run, and it can reconcile what ran, what still needs to run, what has changed, which parts of the cache need invalidating; it has volumes. Behind the scenes, really, it leverages BuildKit, which is at the core of Docker too. So anything that you can do, and by the way, the Dockerfile is just an interface to BuildKit. So I love this. I love having one place to specify versions and have all tooling and pipelining and production just do their thing. The one that always has scared me historically or been more complicated was Postgres, because sometimes a Postgres upgrade requires a data migration. How does this handle that circumstance? So currently, the version that we have specified in .tool-versions is the one that we have deployed in Fly. So in Fly, when you deploy Postgres, you do it via flyctl, and it's the version that we had at the time of running this command. It is a platform, right? So you run it and you have a CLI, and you also have a web interface. But in our case, we ran flyctl postgres create, created the cluster with the whole clustering setup and everything, and at the time, we had 14.1 deployed. 
One of the things that are on my list to do is to deploy PostgreSQL again, like have another cluster using Fly machines, which is apps v2, and basically pick whatever the latest 14 version is. I think it was 14.8 or 14.9 when I checked. I'm not sure one or the other. So once we do that, there will be like a data migration, and then we will capture this version in our development environment via asdf, which will automatically be picked up by the pipeline, so that when we run any tests in our pipeline, in our GitHub Actions, GitHub Actions will obviously be connected to a Dagger engine, and the correct, actually the same PostgreSQL version will be used there as well. So we have dev, test, and production, same version, but the production is what determines the PostgreSQL version, and that happens when we deploy the cluster to begin with. I follow. So if I want to upgrade Elixir, it's as easy as changing the version in the .tool-versions file. But if I want to upgrade Postgres, it's the other way around. PostgreSQL, yeah, Postgres is different because it's like a stateful service. It's, you're right, you have data. There's like a bunch of things. You can obviously change it in the asdf file, but that won't change what's deployed in production, because you're right, there's a data migration part to that. Fair enough. It was not quite as cool as I thought it was, but it's still amazing, Gerhard. Well, everything except that, right? Like the yarn, no JS, right? I'm two for two on finding the one thing it doesn't do. You know where to look. Oh yeah, I do. Oh, I know what I like. I know what, usually what I want is the hard parts taken care of for me. So still, that's very cool. 
So the reason why I asked if this .tool-versions thing was asdf or if it was broader is because it seems like this little bit of our infrastructure at least could be generalizable enough to the point where maybe it's useful for people to say, here is an asdf-based pipeline integration that you can use. Do you think that's the case maybe? That is possible, yes. That requires a bunch of things on the Dagger side as well. So right now, obviously you can get inspired by our pipeline. You can take it as is and change it and adapt it to your needs. It is a starting point, but really what you want is reusable pipeline fragments or components, right? So for example, if you had like Elixir tests, like how would you run those? You would want a component that you can just consume. That doesn't exist in Dagger today, but it's somewhere there on the roadmap. Very cool. Well, that's progress and not perfection, but it's progress over perfection. And besides, we have to Kaizen again soon. So we'd have to have something else to strive for. You can't just - Can't do it all now. You haven't asked me something important. So I'm going to ask myself the question. Hey Gerhard, how long does our pipeline take to run? Good question. So it depends, right? If you have it cached. So even today, we are connecting from GitHub Actions to a Docker running on Fly. And again, if you look through our GitHub Actions workflow, you will see that there is a Fly WireGuard tunnel set up, and from GitHub Actions, we connect to Fly. That's where Docker runs. Internally, whenever you run an SDK, it automatically provisions whichever version of the Dagger engine it needs. So what that means is that whenever our GitHub Action runs, if it's in our repo, we have everything cached. Even though we have things cached, we weren't parallelizing our pipelines. 
So we were basically running: build me the runtime image, sequentially, then move on to building the test image, then run the tests, then build the production image, and so on and so forth. So the whole thing was like one long line. What we did as part of #464, we parallelized these pipelines. So now they all run at the same time. So the last pull request, which we merged, there weren't any code changes. It just had to recompile everything, rebuild everything, just make sure everything is fine. It took, I think, about two minutes. There was just like a markdown file change. So what I'm curious about is, next time that you run it, Jared, I think it used to be six to seven minutes for the pipeline to run. I think it will be around four minutes now, maybe even three minutes. The next thing is to switch, and this is the pull request which I haven't submitted yet, to the Fly apps v2 Dagger engine that, by the way, we can stop from within GitHub Actions. So because these are very small, like they're Firecracker-based, you can start them within 20 seconds. So you don't have to have this thing running all the time. You spin it up, the state is there. There's like a local NVMe volume, and this is all basically managed by the Fly platform, super fast. You run your pipeline, the cache is there, the state is there, whatever needs to change changes. A few minutes later, again, in my tests, one minute and 35 seconds to deal with a code change, recompile everything, everything runs in parallel, and the deployment part, that's the one that basically just depends on how long it takes for it to be deployed. You can add another few minutes, but within three to four minutes, actually even like three minutes, I think, you can get a code change into production. That's awesome. And that's much faster than what you had to deal with, Jared, right? It was like eight, nine minutes. Yes. So much faster. Love it. 
So much so that I would forget it, and then sometimes the build process would fail, or something would happen strangely, mainly for my stuff, really. I don't know if it's like just a me thing or a you thing, Jared, but I feel like... No, it happens for me too. Okay, okay, I don't feel bad then. I'd forget about it and, like, okay, I have no idea how to restart that action, I guess. I guess I can go back to GitHub Actions and say try again, but then there's like another thing going on. So then Jared deploys and it succeeds, and it takes mine with it, so that's fine, because we're committing to main. Very cool, Gerhard, I love it. Chasing not the nines in that case, right? Like, that's the anti-nines. Chasing the zeros, man, we're chasing the zeros. Chasing the zeros. How fast can we get this thing? Well, I don't think we can get it down to zero, but... Maybe we get zero in the minutes column, you know? You can dream it, it's already there, you know? It will take some seconds. Yeah, I think that will be very difficult, because even if you run it locally, right, where everything is as fast as it gets, right, you can have the fastest Mac. If you're trying to compile some code, it will take some number of seconds. And it's not just that, right? You have to spin the process up, you have to reestablish connections, you have to do all those things, right? You have to check the health checks, make sure it's healthy. So it will always take some number of minutes, because you're bringing up a new instance, and by the way, this is like running in production, so you have blue-green, you have like a lot of traffic, you're shifting traffic, so it will take some number of minutes to get the code change out in production. And that's only with one instance, which brings us to the clustering part. To go to Fly Machines with our app, we'll definitely need to solve that problem. We will need to be able to cluster multiple instances of the changelog app.
Without that, we may... I mean, if the host has a problem, right, with Nomad, with apps v1, the instance could be migrated. With apps v2, the instance cannot be migrated, because it's tied to a specific host. If that host becomes unavailable, there's nowhere to schedule the app, because again, that's how the platform was designed. So the idea is you need to have more than one instance. So we need to solve clustering, Jared, it's time. So you're pushing this on me, I see what we're doing here. Yeah, this one's on me. By the way, as a follow-up to your last statement, I just ran our test suite locally. It took 17 seconds to run, so I will accept a deploy of 20 to 25 seconds, no problem. Sure. Yes, I did not get around to this, although to my credit, I stated that I probably would not get around to this during this Kaizen period, because most of my efforts have been in and around the migration of Changelog News onto its own podcast, and the meta feed, which is our three shows, which are all distinct shows, so they can have their own feeds, their own subscriber base, et cetera, and then The Changelog show, which is all three of those shows in one show, and the one-time migration of stuff. We had to re-implement how we do our newsletter and stuff, and so that was like what I've been doing the last three months, and it's pretty much finished now.
We do have an idea for how we can make Changelog News's webpages better, which I would love to do, because it's quite the upgrade from what it looks like right now, and it simplifies things as well, so that's kind of like what I was thinking about doing before this. But we have Honeycomb tracing now from Phoenix, which we didn't have previously, thanks to you, Gerhard, so I'm now without excuse, because I can monitor the speed changes as I make these caching changes, and I have a prototype from our friend, Lars Wikman, who showed me a way of doing a clusterable caching solution, which doesn't completely rip out the guts of what we're currently doing, which was my previous plan. So the skids are greased, and the observability is observable. By the way, you guys have seen Honeycomb has their new OpenAI integration in there. I just saw it today. No, I didn't see this. I haven't. Yeah, man. Spill the beans. This is my new favorite thing: never make me write a SQL query again, right? Never make me write a whatever-Honeycomb-queries-are again. Just let me explain in plain English what I want you to do, and then you figure it out, and they integrated that. It's still beta. It errored out 50% of the time I've used it, and I've used it twice. So the first one errored, the second one worked, but you just kind of tell it what you want to see, and it's going to have the OpenAI API come back with what looks like a pretty good query to get you started inside of Honeycomb. So kudos to them for rolling that out quickly, and I think this is just every tool. I mean, this was one of my complaints with Grafana, was like Loki or whatever.
Gerhard, the stuff that you wrote inside of there was like learning a new language, and I liked them because you could save them as dashboards, and I could look at them, but I was like, I ain't never going to write one of these from scratch, because I don't have time for that. But if I can go in and tell Grafana what I want, and it can put together the actual query, which GPTs are very good at... I mean, I haven't written a SQL query in months. I've edited some, right? You tell it what you want. It writes the query, and you edit it to work correctly. And so anyways, it's cool that we're starting to see this stuff get integrated. So it's there inside of Honeycomb. It's limited, 25 a day or something, and it errored out a couple of times, but they did quote it as useful, in terms of saying they don't know what useful is for you, so they put the word useful in quotes. They hope it's useful. Right, well, they said that Query Assistant will produce, in quotes, useful queries, and they said it's a loose term, because they're not sure; it's impossible for them to know up front what the perfect query is for you. So I guess erroring out is one thing, but a bad query or something that's not useful is another. Yeah, and that could have been a one-off. Like, I literally have used it a couple times. It errored one time, no big deal. I'm looking at this, though. It's pretty cool. They have a GIF at the top, and it's like, users with the highest shopping cart totals, results, and slowest DB queries, results. Because that's the thing: you almost always know what you want to see, right? Or at least you have an idea of where you'd like to start. I need to know this, but it's hard to get from there to the query. It's not like hard-hard, but it's like speed bumps. Sometimes that pain, too, as a user, will make you not even use the tool at all. Totally. You know, that's what I love about the way that generative AI is, I wouldn't even say disrupting.
I would just say, like, man, fine-tuning the useful experience of something, because Honeycomb is so powerful, but only if you have certain keys to unlock doors. And this just lets you bypass the keys, because they're just like, hey, go find me keys... in. It's kind of cool. Sentry is doing something similar. I've been inside the Sentry dashboard, by the way. Those are both sponsors of ours. This is not a sponsored conversation, by any means. It just happens to be the tools that we're using. And Sentry has this thing where they're like, do you want us to tell you what this might be, the problem? Obviously, that's not how it's worded inside their dashboard. That's terrible copy. Rewrite that. I know. I'm not giving it credit, because I don't have the dashboard open. But it's like, click on this button, and we will try to determine not just what this error is, but it's kind of like, let me Google that for you, but on steroids. Anyways, I just like the idea that all these developer tools are just getting these upgrades that make them more usable, at clips that are really fast, and super useful. So anyways. Yeah,

  13. SPEAKER_01

    I concur. Side combo. But I have what I need with Honeycomb. Yeah, that was pull request 456, by the way. If you wanna go and check out how it's integrated, you can see the code changes. It wasn't that many, but you can see how we added it, how we were sending app traces to Honeycomb. I want to give a shout out to AJ Foster. Shout out. And yeah, shout out to AJ Foster. He's AJ-Foster on GitHub. He had a good blog post about how to do that integration, which, by the way, was inspired by Dave Lucia's blog post. And that's Davydog187 on GitHub. So shout out to Davydog187. Thank you, Dave. Between the two of you, that was super easy. And I added links in that pull request 456. We can check it out. So if anyone wants to do this integration, again, it was super easy following these two blog posts. Super easy, barely an inconvenience. Yep. Just took an hour. That was it. And most of it was like figuring out where to get a credential, how to get it right, things like that. Awesome. Thanks, guys. Just the wiring. I guess I could have asked ChatGPT for that. There's the cutoff date; it depends when they wrote it. Although the new version is getting browsing, which Bard has. So I've done a little bit of Bard versus ChatGPT. Just literally copy-paste the same command. I don't know if you guys have tried Bard yet. Nope, not yet. And Bard has access. I mean, Google has an answer, to a certain extent. I still think GPT-4 specifically is better than Bard, because Bard, by default... A, no cost. So I am paying for the Plus to get access to the faster features, 20 bucks a month on OpenAI. Bard is free to use. You have to be signed in, obviously, but it has default access to the internet. So you can, like, paste any URL and say summarize this for me, or whatever you want to say, and it will go curl that sucker and spit it into itself. So that's super useful for documentation purposes. And the new... I mean, OpenAI will be there briefly.
Like they have with browsing, but the same exact command, all coding stuff. I don't care as much about the other stuff, but just the comparisons. And Bard has been 100% inaccurate on Elixir code. Okay. It's completely failed on every attempt. There's a lot of improvements they have to ship. There is. That's what I'm thinking. Yeah, Kaizen. So they need to Kaizen that sucker. But it's nice for other... I mean, it's cool to see the different responses. They both need to get better, but I always just thought that was really funny, because the best thing is that I don't have in-RAM storage of certain Elixir functions. Like, I just... I use it enough, and I know, like, okay, the Enum module has these functions, but then, like, other ones are in List, other ones are in Map. You know, what's the function signature? And I'm so insecure about my knowledge that every time, and ChatGPT has done this as well, that Bard was wrong, I 100% believed it. I'm like, oh, cool. I love this. And I went and tried it. I'm like, no, you can't do that. Oh my gosh. But I would love it if you could. So I think it's very optimistic. Yes. Well, if you look in changelog discussion 452, which is again the one for this episode... Yes. I asked ChatGPT: if we deployed 120 times between March 8th and May 20th... Yes. ...how many deploys per day did we do? And it was accurate. It could figure out how many days there are between March 8th and May 20th. It was so nice. The answer is 1.62. There's a screenshot, 1.62. That's how many deploys per day we did. Really? And most of those were Gerhard's changes. Yeah. Obviously, not every commit results in a deploy, because you batch them. So it's not quite that many, but one per day... I think we had one per day on average. Right. Well, there's definitely some heavy lifting there. So also, I am not a pull request person, as you can tell, like you keep referring to your PRs by number. You probably have like little tattoos with PR numbers on them. Nope.
You're like, this was the time that I... my first commit to Dagger, PR number... No, just messing with you. But I'm just like a trunk developer. I'm just like, right there, I don't... I just want it to go out. Me too. I think for this context it's perfect. The changelog, right? That's how we developed it from the beginning. Every commit to main/master will go into production. That's it. Right. So yeah. That's why I just keep doing that.

  14. SPEAKER_00

    I

  15. SPEAKER_01

    don't have the patience. The only reason why it's pull requests is because of these Kaizens, right? Like, what do we reference? Oh, you know that commit, and there's that other commit, and there's that third commit. If you put them together, that's how we did it. It's a lot more difficult to talk about them. Right. Well, that's why we're talking about your work more than mine, because I don't remember any of the individual things that I did. Exactly. And by the way, Jared did a lot more work than I did, just to make it clear, okay? I just have pull requests to prove it. He has lots of commits. That's true. You got receipts. Fair enough. Well, your stuff is cooler, too. The one thing with this ChatGPT question is you didn't reference the year. Does it matter? It shouldn't. With the year, I guess not. Like, May 8th through... or sorry, March 8th through May 20th will always be the same no matter the year, right? As long as it's not a leap year. Gerhard knows leap years. That's right. I do, yes. Leap year, baby. Yep. February 29th. As long as you don't have to worry about February 29th. Yeah, because March will always have 31 days, and I think May is also a 31-day month. And here's a question. Would it account for that if you went January to May, asked for the day count? Would it say, well, it depends on the year because of leap years? Let's try it and see what happens. All right, follow back up on that. Yeah. All right, in the meantime, what else have we done during this time period, or what else at least is worth talking about? Well, I'd like to give a shout out to Ken Cost for W3C HTML validation fixes. That's pull request 462. Oh yes. Thank you for contributing that. That was a good one. Ken would like to get more involved. He hopped into our channel and said, hey, I'd like to hack on this. And I just honestly don't have a great answer to that. I'm just like, I would love help, but we don't know what we're doing. Pretty much, we're just making this stuff up as we go along.
Yeah. I mean, when I see a thing, I do a thing, and I have visions in my head, but I don't write those visions down, because a lot of those visions aren't ever going to be worked on, you know? So it's tough having contributions. I love that he found a way of contributing, right? It was awesome. He just went through and fixed all of our HTML to be valid. It was an easy merge, but I wish we had a better story around contributions. Something I want that he might be able to do is the ability to have Tailwind built into the app alongside our current Sass pipeline, so we can incrementally move away from the old design to something that's Tailwind powered. That's a good one. And that requires a bit of infrastructure to do that, right, Jared? I know you said you could do it. That one's on my hit list. So there's things that I'm just planning on doing that I'm not going to write up for someone else, because then I'll have to tell them exactly what I'm thinking. You know, it just becomes, now they work for us or something, and I'm like, hmm. It'll be a task. You find your own job. You make your own job around here if you want a job, okay? We're not making your job for you. But that's the hard part, though. It is hard. Our repo is almost like open source, not so much anti-contribution, but we don't know how to tell you to contribute. Right. It's not quite Ben Johnson level, where it's like: my code, limited contribution welcome, but in certain contexts. It's just we don't have a label out there for easy changes to make, for example, and certain tasks just kind of hanging out there. And we never have, and I don't know if we ever will. It's just a hard problem to solve without having a very clear and specified public roadmap of where we're headed with the website. But I mean, we make decisions all the time that change where we're headed, and they could be like 30 minutes before I code them up. Yeah. There probably are things that we could have out there, but we just don't.
So thank you, Ken, for finding a way. Even this show here... This show was named something else. No, not this show. Changelog News. It was named something else for a bit there. And up to, like, legit the 11th hour, was it renamed back to something that was more meaningful, I suppose, to the brand? Yeah, or more aligned, at least. And then you had done some schema, some visualizations in Obsidian with how the feeds would shake out and which one would be the primary feed. Right. It wasn't until like the 11.5th hour of that change, and then we were like, okay, cool. Because you and I both had... Because we don't always agree on things. I don't know if everybody knows that, but Adam and I don't always agree. Well, we partially agree, and we're just still sort of in that unclear state. It's like, well, I kind of agree with that. Yeah, or just not confident yet. Sometimes we just need, like, the other person to become confident before we will be, you know? I'm like, well, Adam's not sure, I'm not sure. But once he becomes more sure, it's like, this is starting to feel good. Or I'll be 100% sure and have to convince him. And if I can, then he becomes sure. And if I can't... I mean, this is typical business stuff, but it makes it hard. And I definitely have empathy for people building in public, you know? And there are teams that have, like, SaaS products that are open, either open source or open source-ish SaaS products with public roadmaps, where they're taking their users' input and stuff and trying to pick a direction. And I think that's just a very, very hard thing to do well. Tell me about it.

  16. SPEAKER_00

    I

  17. SPEAKER_01

    actually watched the Homer Simpson car clip the other day, like the four-minute extract on YouTube of when they asked Homer Simpson to design the car. It's such a good little analogy for the software world. And the best part is at the very end, they go to present the car, you know? And it has all the features. I'm not sure if you guys have seen this. I know you know the meme. Yeah, it's ugly, but it's souped up. I mean, it's got tons of stuff. And they present it to this crowd, you know? Cause it's like a concept car. And the presenter goes to name how much it'll cost, you know? And the guy just whispers it into his ear. He's like, it costs what? And it's like $80,000 or something, and nobody wants it. Cause they just, you know, just packed it to the brim. That's right. And so not only is it ugly, such that nobody would ever buy it, but it's also completely too expensive. Which, software, right? It is half the battle, getting the price right. The right product, the right price, the right availability, the right quality, all the rights. Gerhard, in that example you shared earlier, that visualization of all the dependencies and whatnot, the pipeline being built, you had apps v2 mentioned. That's not in our list to talk about, but I'm curious, since you were playing with apps v2, about the Machines versus apps v2 consideration we talked about the last time we talked. Can you share anything on that front? Has there been experimentation, evaluation? That was the one that I was showing. So we have... we will ignore Changelog Social for now. There's like a whole Changelog Social side of our infrastructure that we can ignore. We have three things today important for Changelog that are running on apps v1. Apps v1 is the Fly.io platform that is based on Nomad scheduling. So there's a couple of limitations with that. Basically, it's the scale that Fly reached, and that basically meant that various things weren't working quite as well.
There's like a whole thread on their Discourse where Kurt was very transparent about some of the issues they encountered with apps v1, and what they're doing about that. So apps v2 is a complete redesign of the scheduling. The way I understand it, it's like their own proprietary scheduling, which does not use Nomad, does not use Kubernetes, does not use any of the things that you may expect. And as a result, it's a lot more robust. It's built for their scale, for what they need right now. And what it means in practice for us is it's really, really fast to use and deploy things onto. So it means that you can get very fast VM-like containers. It's using Firecracker microVMs, again, running behind the scenes. And they spin up very quickly, but they have all the security that you would expect from a VM. So spin-up time is seconds, really, even less depending on what you're doing. But obviously, by the time you do health checks and a bunch of other things, it can take, in our case, for some things, 20 seconds, 30 seconds to come up and be healthy and for us to be able to use them. These apps v2, they are pinned to specific machines, hosts, like physical hosts in this case. What that means for us is that we can't run just one. So right now we have a single instance of changelog running. It's been like that for quite some time. That's why the clustering is important, so we have more than one. But that's okay, because we have a CDN in front. We do a bunch of things where even if the app goes down, we are still available. And Jared did a couple of improvements, which means, like, even the posts, for example, now they continue being available when you do certain requests against, like, news items and whatnot. So our availability really relies on Fastly. And what that means is that it takes, like, a once-every-five-years sort of event to take us offline, which, again, we discussed... I forget which Kaizen it was, but we talked about that when half the internet was down.
Like two-thirds of it. It happened once. Exactly, only once, right? And that was like in seven, eight years. So I think it's okay. And even then, it was fixed fairly quickly. It's kind of like the end of Cable Guy. Have you guys seen the end of Cable Guy, where Jim Carrey falls onto the broadcast dish? No. And everyone's TVs just go to fuzz. The moment of truth must

  18. SPEAKER_00

    have been. Jacob's sweet, has been found.

  19. SPEAKER_01

    And they suddenly realize there's a whole world outside, you know, and they look out their window, they step outside, their feet touch the grass, and they're like, wow. That's what we did that day. You know, when Fastly went down, the whole internet was basically down, because Fastly powers so much of it, that we just took a walk in the park. Exactly. Systems were affected that you didn't even know existed. I just texted Gerhard, just like, Gerhard, what are you doing? It's down. Sunbathing. That's what I remember. That's right, you were sunbathing, weren't you? Exactly, it was like a day off. So it was good, yeah. Yeah, TMI. BBC was down, and New York Times was down, and The Guardian was down, like, you know, Changelog? It doesn't matter, it's okay. All the other big news companies are down. Anyways, the point was, coming back: with apps v2, if the host is unavailable, the app cannot be scheduled somewhere else. So we have to have more than one. We can't have just one. With Nomad, with apps v1, let's say the host was down, the same app could be scheduled elsewhere, because, like, it was more of a... basically, the apps could move around the platform, right? In this case, in apps v2, they're pinned to physical hosts. So even though they're faster, they're better, they're, you know, like, more self-contained, like when one thing fails, it doesn't affect the rest... basically, it's a lot more resilient. The platform as a whole is a lot more resilient. But that means that you have to design with this in mind, right, with blast radiuses, and when one thing goes down, like, how do you basically work with that, with partial unavailability? And it makes sense, but it means we need clustering. So moving something like, for example, Dagger Engine, it's okay, or Docker, right? Because if that's down, that's okay, it can fall back to whatever's local. And by the way, we do that in our pipeline. If PostgreSQL is down... well, you're already clustering that.
You know, we have, again, some sort of redundancy there. In our case, we had one primary and one replica. And by the way, I think we should look into that as well, to see if we can get a managed, proper managed PostgreSQL service. We keep talking about that, maybe Crunchy, maybe Supabase, try a few and see what sticks. To continue, we can still run on Fly for the PostgreSQL instance, even though I think they were mentioning at some point they will want to invest in that. So we'll see where that goes. Things may have changed. Again, this was on their forum. So we can move Dagger Engine, Docker, we can move PostgreSQL to apps v2, to Machines, but the app itself is a bit more problematic, not without the clustering part, because of what they explained, like the limitation with hosts. So while it's a modern platform, it's very performant. Again, in all the tests which I ran, you get very nice CPUs, the AMD EPYCs, they're yours, especially if you get the performance instances. You get local NVMes, again, very fast, but if the host becomes unavailable, it won't get moved. So we have to get the clustering to do apps v2 properly. Exactly. All right, Jared, the whole world is on my shoulders. Well, the whole Changelog world, just to quantify that. There's a lot of things, too, we just talked about. Jared, you mentioned news items. I know they're still in our infrastructure, Jared, but there's a lot with this change from Weekly and news items and this news homepage to this world we're in now, which is sort of obsolete and primed-to-delete code. There's a lot of places that we could delete from the app code, we just haven't, because, A, you wanna make sure that you like this new home that you're living in for a little while, and you're not gonna be like, why'd I throw that furniture away? I like it a lot better already. I got my feet up. I'm liking it too. I don't think we're gonna go back, but I wasn't gonna rush into deleting the code.
I do love to delete code, but my least favorite command is git revert, you know?

  20. SPEAKER_00

    Like

  21. SPEAKER_01

    a lot of stuff that we built, even the whole impressions thing and all that, just... it's obsolete. We're just not using it anymore. We're not doing news items like we used to. And in fact, my concept... which, I guess before we started recording, we were talking about STI, single table inheritance. Gerhard and I were reminiscing on war stories. From the old Rails days, this was a feature, it's probably still in there, where you could do classical object-oriented inheritance with a single database table and use a type column to instantiate different classes. So you could have a table called pets and classes called Dog and Cat, which both inherit from Pet, but they get stored together in one table. And my concept when I built this site... you know, you make certain decisions that are foundational to an application. And one of them, which served us very well for many years, was: everything's a news item. And every news item points to something. So it can be on our website, it can be on somebody else's website, it can be audio, it can be video, it can be text, it can be a tweet, it can be a whatever. And we decorate those differently based on information we have about the news item itself. Now, I didn't actually use single table inheritance to implement this. It's just similar conceptually. And that was great. And it served us very well for probably all the years until right now, where we literally are abandoning the news item part of what we do. And we're only publishing blog posts and episodes, right? Audio episodes. And every other news item is just kind of like an appendage that is still there in the data structures, but only represents things that could just be represented directly, you know, if I hadn't made that decision, so. Aren't comments also hanging on the news items too? They are. Yeah, they're attached to news items. Yeah. Because that was like the foundational, you know, atomic unit of content in the CMS.
Like I said, it was good for us for many years, but as we simplify and change now, where we're just publishing... I write Changelog News in Markdown, which I love, but Changelog Weekly was generated from a list of news items that we published throughout the week. And that's a foundational change to the way that we do content. This one's simpler and, I think, more sustainable, and hopefully produces better content over time, but all of our infrastructure is built for that other way. And so there's stuff that we can delete. There's also stuff that's just gonna be there, cause it's like ripping out your circulatory system. Well, when you look at, for example, changelog.com/podcast, which is where this show is at right now, right? The list of things there are news items. Right. Everything we fetch is news items. Right. Unless we change the design dramatically to render this obsolete, like, this still is pertinent, right? This is still useful. Oh yeah. We're still using that for sure. That's why I say we're not gonna rip out certain things, but there's areas of the code base, like, that are news item related, that could definitely go away. Like in the admin, for creating a new issue, which was once Changelog Weekly, that can go away completely, cause, like... We don't use that anymore. Yeah, we're not turning back from that. And we attach the newsletter content. So Changelog News goes out at the exact same time, both the episode and the email. And we attach the email content directly to the episode, not to a news item. And so, yeah, issues are gone. News ads can go away. There's a bunch of stuff in there around scheduling sponsorships. All that stuff was based on the old system. So yeah, we could definitely do some spring cleaning, probably summer cleaning at this point. But then you're like, well, it's also not hurting anything. And they're sure waiting on me for this clustering feature. So maybe I'll work on that instead. I love it, man.
Digging the changes, digging the Kaizen, digging the commitment to incremental change. Like, it reminds me of the book Atomic Habits. If you can make a 1%, a half a percent change today, and you do that for three or four days straight, well, what did you do, right? You made an almost 5% change. And you stress out about, I can't do all of it today. That's okay. Just do what you can today, just enough to move the needle forward. Whatever it takes to progress, do that thing, right? We've been doing that for many, many years now, collectively. We've embodied that with Kaizen on Ship It. Now we've brought it here to Changelog & Friends with our oldest friend, Gerhard, with a sheer commitment to chasing the nines. And the zeros. The best pipeline ever. Like, TM that at the end, right? Trademark that thing. The best pipeline ever. It is world-class, honestly. Gosh. No one has a better one. No one has a better pipeline. Do you hear that out there? Show us your pipeline that's better than ours. If you check out the code to see, like, a pipeline as complicated as ours in code, and how it all combines with the app's dependencies, with the versioning, with how the app gets compiled... we even have, and again, I haven't finished this, it's there: Erlang releases. Let's get that sucker done. And it makes releases. It's not there, again. And no one has, like, thought of this, like, hey, Gerhard, can you please get to those Erlang releases? Or hey, Gerhard, can you please upgrade PostgreSQL? These things happen organically. Because you kind of get in the middle of it, and you kind of understand what the system needs. It's like a living, breathing thing. And you know, if you have half an hour, and we even talked about this, like, way back when: what is the most important thing I can do right now that will make this thing slightly better? An hour tomorrow, whatever the case. I have a weekend, two hours. Okay, so what can I do in those two hours? And all those things add up.
And two months later, it's amazing how much has changed. And that's why we're trying to capture these changes, because again, if you look at it every day, it's not a lot there. Same thing with news. I've seen, like, there's, like, so many things that you had to do, Gerhard, like, to get those news items out. It was way too much work. I don't know what we were thinking. It was worth it. We had great ambitions, you know? It's always good to have good ambitions, but it's okay to admit you failed, right? You know, we just didn't quit soon enough, Gerhard. We just kept going for years. We had, like, the ultimate limp for many years. Right. Well, so when you're dedicated to consistency, it's hard to stop something and do something different. 'Cause, like, one of your major aims is to consistently do the thing that you said you're gonna do. And so yeah, stopping and starting and making changes, these are hard decisions to make. And especially when they require heavy lifting, you know? And you're experimenting. We didn't know if people would like Changelog News. So we started shipping it inside of our current infrastructure. You know, if we had done all that work and then people were like, this show sucks. Why are you guys making, please stop putting this into my Changelog feed. Then it would have been a bunch of wasted effort. So I feel like we went in the right order at least. And we're here now. There's a lot of things that we can improve and we are dedicated to consistency on Kaizen every two months. So we won't have an episode number quite as sweet as every 10th, like Gerhard had on Ship It. But we do want to do every two months. And we do have this discussions thread. So Gerhard, lay this out for folks. You can participate in our Kaizens. You can give us ideas of things you want us to try, things you want us to talk about on these episodes. You can obviously hop into the code and find your way where it makes sense and how you can help out, and get shout-outs.
We even have, our most popular Changelog t-shirt is the Kaizen t-shirt. I wore it at Open Source Summit and got some compliments. So maybe you can get yourself a Kaizen shirt by getting involved. But each Kaizen episode has a discussion that we use on github.com/thechangelog/changelog.com, so hop in there. Tell us what we should do for Kaizen 11. We will be in there discussing what we're gonna work on for Kaizen 11 and what else? Anything else left to say here? I would like to give a shout out to Jason Bosco. He's the one that participated in Kaizen 10 with Typesense. Thank you, Jason. Our partner is Typesense. They are powering our search. And we had some questions around how it worked. Sounds like it's working as advertised, at least at the basic integration. The key answer: direct matches will beat fuzzy matches every time, I guess. Makes sense. It's just unfortunate, 'cause "LLMs" versus "LLM" was the search term we were using, and you tend to search for one, not the other, even though you tend to say the other one, 'cause there's many of them. So yeah, thanks Jason for hopping in there and explaining that to us. It's a great example. And we can certainly delete the Algolia code. So if someone wants to help with that, it's right there. Well, I already deleted the account and everything, so we have no account there. So if the code's still there, then. I thought the Algolia code was all gone. What's left? Was it? I don't know. I mean, that's what it says. April 9th. Oh yeah, I ripped out some Algolia stuff. Did you? Okay. So that's done. Cool. Yeah. See, no pull requests. I don't know. It's a commit, one of 50. Good luck finding it. I'm a follower of that gardening principle, you know, like whatever area of the code base you're in. You do the same thing with upgrades and stuff. Like while I'm working on this, if I see something and I'm like, oh yeah, we can get rid of that. I just do it. Boy Scout rule.
I don't have the PR, but I'm just, like, constantly improving any area of the code base that I'm touching. And once it became clear that we were done with Algolia and I was just futzing with the search stuff, I deleted it, so. The cool thing I want to plug real quick, because Jason was so cool and Kaizen 10's mentioned there: if you're not familiar with Typesense, the cool thing about Typesense really is it's super fast in-memory search. So, like, if you can hold a terabyte of your data in memory, like, Jason can speak better than I can. I'm speaking out of turn really, but super fast and it's in memory, which is just the best. So check it out if you don't know about it, typesense.org. Well, what's next? Where should we go from here? This is the first official Changelog & Friends, which is a brand new flavor of this show. And Jared, I'm excited about this because we've got lots of friends and we don't talk to them frequently enough. And every two months we're coming back here with you at least, Gerhard, so maybe sooner. I know we have some home lab material coming up. We want to talk about UniFi, Ubiquiti, networking, maybe some VLANs. I'm trying to talk Jared into VLANs. He's like, I don't want to do that. Should I have VLANs, Gerhard? You should, especially for your kids. You should have a kids network. I found out the hard way. That's exactly what Adam's telling me. I think that needs a drink, that story. Yeah, I have a separate Wi-Fi for kids that's on the kids VLAN. And IoT, yeah, you want, like, an IoT network for sure. Okay, sell me. You want two networks. Yes, always two. Two routers, two networks, two ISPs, oh yeah. You know how hard it was for me to get one ISP out here in the boondocks? Getting two is not easy. I'm sure it was, yeah. No Starlink? I could have Starlink, yeah, I suppose, but that's expensive, right? 500 bucks down and then 100 bucks a month?
It's like 700-ish dollars in just hardware alone, and then it's about 100 bucks a month, I want to say, for decent speeds. As a backup, can I turn it off and turn it on? I don't think so. Doubt it. That's too much for a backup. My main line's costing 115. That would be cool, actually, if they can have a plan that's just for backup service, because I mean, you already own the hardware. How hard is it to just do metered use, right? Failover. I don't know if your Dream Machine Pro supports it, but I don't think that has WAN failover. Does it have WAN failover? Not yet. Dream Machine Pro has it, yeah. It does have WAN failover. I know my USG Pro did, and I say did, because I used to own that, and I sold it like a fool. I want that USG Pro back so bad, it's so cool. The UDM Pro's pretty cool, too, but. UniFi has, is it UniFi as the company, or is that the device? I always forget. Ubiquiti's the company, UniFi is the product line. Oh, Ubiquiti's the company. Ubiquiti's the company. They do live at ui.com, though, which is like. Yeah, ui.com is a sweet domain, right? Phenomenal domain. So they have a product that I really want. So, you know, I have a small acreage, I got eight and a half acres. Do you really want that still? Of course. You do, okay. Why wouldn't I? I wasn't sure if this was a long-term thing. Oh, I still want it, so. What do you want? WiFi BaseStation XG. It's an outdoor, basically, dish that just broadcasts your Wi-Fi for hundreds of miles. No, 1,500-plus, 5,000 square foot coverage. So one of the things that I do around my land is that I walk and I listen to our shows for QA, for clipping, for stuff like that. And sometimes I'm uploading stuff. I'm basically working and walking, which is a spectacular aspect of what we do. I love that part of the job. I get down to the orchard and I don't have Wi-Fi. I'm on cellular. I want blanket coverage. I want Wi-Fi all over my land. And this sucker will get me there.
The problem with it, A, it's sold out. B, it's 1,500 bucks. But it is powered over PoE. And I can run it up to my roof. I think you'd probably go with the AC Mesh Pro, though, too. That might do a similar thing. Is that one cheaper? Check this out. Oh, gosh. Check the wireless systems that MikroTik has. MikroTik? Gerhard is sharing links with us. I have both. Of course you do. You have two? Of course, yeah. Ubiquiti and MikroTik. So I have UDM Pro. I basically have two networks, one inside the other and one alongside the other. So they are interweaved. So I have two 10G trunks, basically, 10 gigabit trunks. And I really like the MikroTiks. I think they are underappreciated. You need to be a bit of a hacker to get into them. But once you set it up, you have terminal access. You have nice auto-completion on the shell. You can run containers if you're crazy enough. But you can run containers natively on the router if you want to. They have container support. It's a proper operating system. And if you use Ubiquiti, it's great. It's almost like a Mac. MikroTik is almost like Linux. So that's the difference. Oh, OK. But however, some of their latest hardware is just so good, so, so good. The UI will not be as polished, but do you need that? Or do you want the stability of, just think Linux for your network? You can do 25 gigabits. You can do SFP. You can do all sorts of crazy things. If there's good docs and there's an LLM to power it, I would totally configure it. netplan apply. I configure my stack stuff on Linux every day. No GUIs here, so cool with that. Their YouTube channel lately, it's just so good. If you look at the MikroTik YouTube channel, they have so much good content. Check it out. Yeah, definitely have to check it out. That's a friend recommendation. OK. Well, that'll just be a tease then. So what I was really trying to do is just tease what could be the future of this show. Having friends like you to come up and talk with us about things.
And I'm a nerd, and I'll nerd out with Jared, and he's maybe slightly less of a nerd than I am on networking and IoT and homelab-y stuff, but way more nerdy than I am in back-end coding and databases and stuff. It feels nice to be in a place where I'm the least nerdy person. It's usually I'm the most nerdy of all people in the room, but not in this room. I like this room. Not in this room. Maybe we could talk about MikroTik and UniFi and the Mac versus Linux of networking. That'd be kind of cool. Coming to a Changelog & Friends near you in the future. OK, cool. We do have a little list here at the very end of this discussion which says what else is coming up. Can you blaze through that in one minute, Gerhard, so we can close out this show and tease the next Kaizen? So I really think we should improve our integration with 1Password and secrets in general, right? Because right now it's copy pasta in our GitHub Actions, and that's not good. Now we have the code. We can connect programmatically to wherever we store our secrets in a nice way and get them just in time and keep them secret throughout, right? No copy pasting, none of that stuff. We switched off Pingdom, and now we have Honeycomb, just Honeycomb. So we need to configure the SLOs, just have a better understanding of what's happening there. I have Uptime Kuma running locally, and I think we should have an instance on Fly.io as well, at least, like, a few. Just deploy them, see what's happening from the outside world. There's a bunch of upgrades, like PostgreSQL that has to be upgraded, Apps V2, like, roll everything that we can to Apps V2. Follow up with the Fastly guys, just, like, to share our VCL madness and our UI experience and just figure out what exactly we're doing wrong when it comes to Fastly. On my list, part of that is to look into Cloudflare. Like, what did Cloudflare add to logs and how easy is it to integrate with Honeycomb?
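As a hedged sketch of the just-in-time secrets idea (not the actual Changelog workflow; the vault, item, and field names are made up), GitHub Actions can pull values from 1Password at run time with their load-secrets-action, so only the service-account token lives in GitHub:

```yaml
# Hypothetical workflow step: fetch secrets from 1Password at run time.
# Only OP_SERVICE_ACCOUNT_TOKEN is a GitHub secret; everything else stays in 1Password.
- name: Load secrets from 1Password
  uses: 1password/load-secrets-action@v2
  with:
    export-env: true
  env:
    OP_SERVICE_ACCOUNT_TOKEN: ${{ secrets.OP_SERVICE_ACCOUNT_TOKEN }}
    # op://<vault>/<item>/<field> references below are illustrative
    HONEYCOMB_API_KEY: op://infra/honeycomb/api-key
    AWS_SECRET_ACCESS_KEY: op://infra/aws/secret-access-key
```

The `op://` references resolve at job time, so nothing gets copy-pasted into repository settings and rotation happens in one place.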
Because if they have a good logging story, maybe we should check them out. R2 has no egress fees. Currently we're storing stuff on AWS S3. If we went to Cloudflare R2, if we did, like, a couple of things, I have, like, some amazing stuff, like, Cloudflare, a bunch of things. They are a really cool company in terms of the technology which they develop. And as you know, I like running two of everything. So why not two CDNs? I know it's crazy, but you know, many things were crazy to begin with and they turned out to be great ideas. I'm one step ahead of you on the R2 thing. I have an open tab. I have an account that I've created for us over there. I began testing the waters. One thing that I immediately don't like about it is that it doesn't support Transmit. Like, it's an S3-compatible API, but not enough that you can use your S3 client Transmit, my S3 client Transmit. And that just hurts my heart. It's not gonna be a blocker, but it hurts my heart. So there's gonna be an R2 CLI though, right? Where you could do the same thing that Transmit gives us with the CLI? Oh, there's definitely ways you can do it. You can use Cyberduck if you want to. Just, like, years and years of using Transmit, technically Transmit doesn't support it is probably the way that they would say it. But they say they're S3 API compatible and Transmit is an S3 client, but it does not work. Their S3 does not work with R2. So it's not a one-for-one. It has something to do with streaming data versus chunked uploads or, I don't know, I didn't read the description, but it's, like, a known thing that Transmit is, like, yeah, we might add support, but they haven't yet. So

  22. SPEAKER_00

    anyways,

  23. SPEAKER_01

that just bugged me, because I just like to drag and drop stuff in there with the same SFTP client I've used forever. But what really needs to happen is: can we simply change all the environment variables in our code, and will ExAws work seamlessly? Will all of our stuff work seamlessly with R2 to upload to R2? And if that works, then we're pretty much good to go. I also need to go through and change a bunch of our hard-coded URLs in our text areas from the S3 URLs over to cdn.changelog.com, which we should have been using the whole time. I just didn't even think about that, because then we can seamlessly switch the backend without changing anything. So that's like a migration step that'll have to happen. But

  24. SPEAKER_00

    anyways,

  25. SPEAKER_01

I'm down the road a little bit on that because our fees at AWS are going up and we would love to have zero egress fees. I was gonna say, if we need some motivation, in March our bill was $50, $53.69 actually. And here in May 2nd, and June's coming here soon, May 2nd, it's $132.32. So we've almost tripled our bill in three months. Yeah, it's on track for like $450 this month. Is it really? I thought so, inside the Cost Explorer. $450 this month? You said May 2nd, it was $150 something. I have to check this out. Remember we did this last time when we set up shielding? That was exactly, like, during a Kaizen. So I have to check this out to see what exactly is going on here. I used the Cost Explorer to try to figure it out and I just saw it was S3, but I didn't actually drill down on what changed and when. So if you wanna do a little bit of research on that, Gerhard, that'd be good. But then I just started thinking, well, now is the time, R2, like, let's just eliminate this problem for us. So that's on my hit list. Also on my hit list, obviously, is this clustering stuff with the caching stuff. Changelog News improvements will continue. And then my big, not for the next Kaizen, but my very big desire this year, this calendar year, I'm not gonna guarantee it, but I'm gonna shoot for it, is I want to migrate off Supercast and I want to provide a first-party Changelog++ offering that solves a few of the pain points for our ++ members. Specifically, one feed to rule them all is just not good for people. They don't want one feed. They want their ++ feed, but not the master feed, which is what it currently is. And so we can provide that if we're off Supercast. And so that's just a thing I'm putting out there as a big item for me here soon. That would be good, because they want it. Gotta give them what they want. They do. It's like 50% of the people who sign up for ++ say, hey, can I have just JS Party or just The Changelog and JS Party?
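The URL-rewriting step mentioned earlier (moving hard-coded S3 asset URLs over to the CDN host so the storage backend can change underneath) could be sketched like this; the bucket hostname here is an assumption, not the actual value:

```python
import re

# Hypothetical bucket host; the real hard-coded S3 URLs may look different.
S3_URL = re.compile(r"https://changelog-assets\.s3\.amazonaws\.com/(\S+)")

def rewrite_to_cdn(text: str, cdn_host: str = "cdn.changelog.com") -> str:
    """Point hard-coded S3 asset URLs at the CDN so switching backends (S3, R2)
    never requires touching stored content again."""
    return S3_URL.sub(rf"https://{cdn_host}/\1", text)

print(rewrite_to_cdn("art: https://changelog-assets.s3.amazonaws.com/news/cover.png"))
# → art: https://cdn.changelog.com/news/cover.png
```

Run once over the text areas in the database, every asset reference goes through the CDN hostname, and the origin behind it becomes a pure infrastructure decision.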
I don't want the other shows. And I always had to say, nope, you gotta have all of them. And that just sucks to say, 'cause I don't understand it. So unfortunately, Supercast doesn't have that functionality inside there. We can only give them one feed. Whereas once we're on our own platform, obviously, everybody gets their own feed and you can customize it and have multiples or whatever you want to do. So that'll be super nice. We've proved enough though, that it makes sense to bring inside. But it also requires some of this caching stuff to be taken care of. It does. It's just, there's so many little things with subscription services and, like, Stripe integration and blah, blah, blah, blah. Where you're like, you start to realize all of the emails you're gonna have to send, you know. That is true. At least a few. I mean, there's probably five or six different emails just to manage a subscription. Plus a dashboard where you can go and change things, cancellations, this person's card expired. Refunds. Refunds. Yeah, Stripe handles the refunds part, but yeah. You still gotta trigger it though. You gotta get that you want to do something or other. You gotta just manage the fact that either you're doing it as an admin or they're doing it on, they're requesting something. Then they're like, where's support? We don't have support. This is just. You're looking at him. Yeah, this is support right here. And you're figuring out why our bill went from basically 30 bucks to 130 bucks. As soon as I get permissions to billing, I'll have a look at that, too. Back on you, Adam. All right. Oh my gosh, okay. One more thing. This is important. How can we run multiple instances of the changelog app with the same data without the second instance behaving as, like, the live one, without sending any emails, without anything like that?
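One hedged way to answer that question (a sketch, not the actual Changelog config; the env var name and mailer module are assumptions) is to swap the mailer adapter for a local, capture-only one whenever an instance is marked as the secondary:

```elixir
# config/runtime.exs (hypothetical): a clone instance captures outbound mail
# locally instead of delivering it, so a second "production" can't email anyone.
if System.get_env("SECONDARY_INSTANCE") == "true" do
  config :changelog, Changelog.Mailer, adapter: Bamboo.LocalAdapter
end
```

The same flag could gate any other side effect with the outside world (webhooks, feed pings), which makes a long blue-green migration safe to leave running.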
Because when I'll be setting up the new 2023 infrastructure, and that is right next on my list, that starts with PostgreSQL, with the app, with whatever we can migrate, I do, like, a long blue-green. And that means getting another production and having, like, a nice, easy way of basically migrating the data, migrating everything. So it's, like, very easy to switch back and forth if we need to. But always my concern has been: when you have a second production, it shouldn't be sending any emails or anything like that. So I think that's a follow-up for us. We obviously, we don't have to solve it now, but that's something on my mind. We're getting long in the tooth. Let's call this a Kaizen. Let's call this a Changelog & Friends. I'm very happy to have been here. Thank you very much for having me over and I'm looking forward to the next one. Thank you. Thanks for hanging with us. Kaizen always. Kaizen. Kaizen always. If you're digging Changelog & Friends so far, tell your friends. And if you have feedback for us, let us know in the comments. Link is in your show notes. Our intro episodes sparked a great discussion. Wow. So many shower listeners. I knew I was right, but I didn't know just how right I was. Thanks once again to our partners for supporting our work: Fastly.com, Fly.io and Typesense.org. And to Breakmaster Cylinder for this epic new outro track. It's called Link-A-Rama and you'll only hear it right here on Changelog. Monday is Apple's big WWDC keynote and they're expected to unveil the Reality Pro and xrOS. Watch along with us in the Apple Nerds channel of our community Slack and tune in next Friday. We'll be discussing all the interesting bits with Homebrew lead maintainer Mike McQuaid. That's all for now, but let's talk again real soon.