Coinbase
2020 - 2026
Go, developer tooling, infrastructure, cloud services, databases, compiler technology, blockchains, elliptic-curve cryptography, consensus algorithms, and much more.
I joined Coinbase's Infrastructure team and worked on developer tooling and code quality efforts. At the time I joined, Coinbase had recently pivoted from Ruby as its main implementation language to Go, but the engineering staff lacked adequate Go expertise. I was able to instill some good habits and unwind bad ones.
Even after the staff became much more proficient, I remained the go-to expert for tips, techniques, and best practices in Go. I developed a reputation for teaching through detailed code reviews. I also mentored several junior engineers, and published periodic essays and internal tools to help level up the whole organization.
Part of my time at Coinbase was as an individual contributor, but after my leadership of the MrExit project (see below) I spent a couple of years as an engineering manager before switching back in order to work on improving CSF (see below). Here are some of the highlights from both parts of my Coinbase tenure.
The monorepo and MrExit
In the pivot to Go, Coinbase sought to adopt some other engineering practices from Google, including a monorepo. But the implementation was incomplete (crucially, it lacked a shared cache of build artifacts to speed compilation and testing) and governance was entirely absent, with the result that changes were hard to make and builds were broken more often than they were healthy. What's more, the Go toolchain already solves many of the same problems that Google originally built its monorepo to address (for C++, Java, and Python).
After several months of lobbying to reverse the monorepo policy, I got buy-in from leadership, along with a team to tech-lead, for the monorepo exit effort, also known as "MrExit." We had to migrate hundreds of projects to their own repos. Identifying project boundaries was a challenge, as was the fact that the projects followed no single template. There were dozens of combinations of different build and testing pipelines, containerization practices, and deployment strategies.
Luckily the team was nimble and talented. Ours was the first companywide code migration in Coinbase history to reach 100% completion. As a result of teasing apart the monorepo, teams had greater autonomy and looser coupling with one another. A breakage in one project had a much smaller "blast radius," affecting far fewer other teams. Concretely, we reduced the time that engineers spent waiting for CI to complete by a whopping 91%.
The Go module proxy
I inherited a homegrown Go module proxy server when I joined. (At that time there was no good commercial or open-source option that could handle private repositories like ours.) It was poorly written: an instance of the server cached downloaded modules only on its local disk, meaning it couldn't share that cache with other instances, and the cache began ice-cold on every restart.
Later, we needed the service to handle our GitHub migration (see below), spanning both old and new GitHub locations, finding migrated Go modules in their new locations even though their module paths continued to reference the old server. Finally I decided to reimplement the module proxy from scratch. Thanks to better caching and other improvements, p90 response times went from seconds to milliseconds, and operational costs plummeted.
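The lookup of migrated modules can be illustrated with a minimal sketch. It assumes a simple table mapping old module-path prefixes to new repository URLs; the hosts, the migration type, and the resolveRepo function are all hypothetical names chosen for illustration, not the proxy's actual design.

```go
package main

import (
	"fmt"
	"strings"
)

// migration maps a module-path prefix at the old location to the
// repository URL where that code now lives. (Hypothetical type.)
type migration struct {
	OldPrefix  string
	NewRepoURL string
}

// sampleMigrations uses made-up hosts purely for illustration.
var sampleMigrations = []migration{
	{OldPrefix: "ghes.example.com/team/lib", NewRepoURL: "https://github.com/example-org/lib"},
	{OldPrefix: "ghes.example.com/team", NewRepoURL: "https://github.com/example-org/team-misc"},
}

// resolveRepo returns the new repository URL for a module whose path still
// names the old server. The longest matching prefix wins, so a specific
// entry overrides a broader one. The boolean is false if nothing matches.
func resolveRepo(modulePath string, migrations []migration) (string, bool) {
	best := -1
	for i, m := range migrations {
		if modulePath != m.OldPrefix && !strings.HasPrefix(modulePath, m.OldPrefix+"/") {
			continue
		}
		if best < 0 || len(m.OldPrefix) > len(migrations[best].OldPrefix) {
			best = i
		}
	}
	if best < 0 {
		return "", false
	}
	return migrations[best].NewRepoURL, true
}

func main() {
	url, ok := resolveRepo("ghes.example.com/team/lib/sub", sampleMigrations)
	fmt.Println(url, ok)
}
```

Matching on whole path segments (prefix plus "/") avoids accidentally treating "team/library" as part of "team/lib".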
The Go doc server
Documentation for open-source Go modules is found at the public Go doc server at pkg.go.dev. We wanted the same thing for our private Go modules, but nothing suitable existed, so I wrote one, based on pkgsite but heavily modified for our private-repository environment. It became the standard way to document and to learn about Coinbase Go libraries. Later it became how our AI agents learned about those libraries, too.
Profile-guided optimization
A recent version of the Go compiler added a feature called profile-guided optimization (PGO). Using CPU profiles collected at runtime in production, it allows the compiler to optimize programs as much as 30% better than it can without that data, reducing CPU usage and therefore cloud costs. But getting every team to adopt PGO individually was a non-starter because of the many other competing priorities, and centralizing the use of PGO was effectively impossible because of the wide variance in different projects' build pipelines. Nevertheless I managed to identify chokepoints that allowed us to instrument most Coinbase services to record the necessary runtime profiles, and to inject those profiles into most services' builds to create better-optimized binaries.
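For readers unfamiliar with PGO, here is a minimal sketch of the profile-collection half, using the standard runtime/pprof package. The captureCPUProfile and busyWork functions and the sampleWindow constant are hypothetical names for illustration; how profiles were actually gathered and injected at Coinbase is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// sampleWindow is an arbitrary, short profiling window for the demo.
const sampleWindow = 200 * time.Millisecond

// sink keeps the compiler from optimizing busyWork away.
var sink int

// busyWork stands in for real service work.
func busyWork() {
	for i := 0; i < 100000; i++ {
		sink += i % 7
	}
}

// captureCPUProfile runs work repeatedly for roughly d while the CPU
// profiler is active, and returns the profile in pprof format.
func captureCPUProfile(d time.Duration, work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	deadline := time.Now().Add(d)
	for time.Now().Before(deadline) {
		work()
	}
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	profile, err := captureCPUProfile(sampleWindow, busyWork)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	// Saved as default.pgo in the main package's directory, a profile like
	// this is picked up by "go build" automatically (Go 1.21 and later), or
	// it can be named explicitly with "go build -pgo=<path>".
	fmt.Printf("captured %d bytes of CPU profile\n", len(profile))
}
```

A long-running service would more typically expose profiles through net/http/pprof and collect them from outside the process; the in-process version above just keeps the example self-contained.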
Gosins
The Go ecosystem has a rich collection of static-analysis tools (linters, etc.). But certain Go errors that are rare in the wild were endemic to Coinbase, largely because of the history of Ruby developers slowly learning different idioms and paradigms for Go. I wrote a new linter for use at Coinbase called gosins that could identify those. This resulted in thousands of lines of code cleanup that I know about and probably a lot more that I don't.
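To give a flavor of how a check of this kind works, here is a minimal sketch built on the standard go/ast and go/parser packages. The specific check (flagging getter methods with a Java-style "Get" prefix, which Go convention discourages) is an assumed example and not necessarily one of the checks in gosins; the finding type, findGetPrefixedGetters, and lintSample are hypothetical names.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// lintSample is a small input with one offending method.
const lintSample = `package sample

type User struct{ name string }

func (u *User) GetName() string { return u.name }

func (u *User) Name() string { return u.name }

func (u *User) GetByID(id int) string { return "" }
`

// finding describes one flagged method.
type finding struct {
	Method string
	Line   int
}

// findGetPrefixedGetters flags methods named GetXxx that take no
// arguments and return exactly one value, i.e. plain getters.
func findGetPrefixedGetters(src string) ([]finding, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	var findings []finding
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv == nil {
			continue // not a method
		}
		name := fn.Name.Name
		if !strings.HasPrefix(name, "Get") || len(name) == len("Get") {
			continue
		}
		if fn.Type.Params.NumFields() != 0 || fn.Type.Results.NumFields() != 1 {
			continue // takes arguments or is not a single-value getter
		}
		findings = append(findings, finding{Method: name, Line: fset.Position(fn.Pos()).Line})
	}
	return findings, nil
}

func main() {
	findings, err := findGetPrefixedGetters(lintSample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range findings {
		fmt.Printf("sample.go:%d: method %s: consider dropping the Get prefix\n", f.Line, f.Method)
	}
}
```

A production linter would more likely be built on the golang.org/x/tools/go/analysis framework so that it has type information and plugs into existing lint drivers; the syntax-only version above keeps to the standard library.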
Adoption of my open-source tooling: Modver and Decouple
After the monorepo exit (see above) it became very important for projects to observe semantic versioning rules, lest a change in one module break the build of another. But most teams lacked the expertise to do so. Working with the owners of our shared CI infrastructure I added a versioning check based on my open-source tool Modver that allowed teams to observe the rules much more easily.
One of the problems endemic to Coinbase Go code written soon after the pivot from Ruby was too-tight coupling between packages, the typical result of trying to hang on to object-oriented style when coming to Go from Ruby, Java, Python, C++, etc. I educated our developers on the differences between abstract base classes in object-oriented languages on the one hand, and Go interfaces on the other; and I explained how to use my open-source tool Decouple to identify much-smaller interfaces that would suffice in many places where giant types were being required.
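The kind of narrowing described above can be shown with a small hand-written sketch. It does not show Decouple's output; the countLines and countLinesInString functions are hypothetical. The idea is that a function which only ever calls Read on its argument does not need a concrete file type, and can accept the one-method io.Reader interface instead.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// An over-specified version of this function might have been declared as
//
//	func countLines(f *os.File) (int, error)
//
// even though its body needs nothing beyond Read. Accepting io.Reader
// decouples it from the filesystem and from any particular concrete type.
func countLines(r io.Reader) (int, error) {
	scanner := bufio.NewScanner(r)
	n := 0
	for scanner.Scan() {
		n++
	}
	return n, scanner.Err()
}

// countLinesInString shows a caller that has no file at all: any value
// with a Read method satisfies the narrower parameter type.
func countLinesInString(s string) (int, error) {
	return countLines(strings.NewReader(s))
}

func main() {
	n, err := countLinesInString("alpha\nbeta\ngamma\n")
	fmt.Println(n, err)
}
```

Unlike an abstract base class, the interface here is satisfied implicitly: neither strings.Reader nor os.File declares that it implements io.Reader, which is what makes small, caller-defined interfaces practical in Go.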
Cobacobadaba
To aid in the analysis of our Go codebase (beyond what our Sourcegraph installation could do), I wrote a tool called Cobacobadaba (for "Coinbase Codebase Database"). It periodically scraped the Go modules in all of our repositories and performed most of the steps of the Go compiler: parsing, type annotation, call resolution, etc. It then froze all of that data in the form of a queryable SQL database.
Querying that database allowed us to perform sophisticated codebase-wide refactors, deduplication, and dead-code elimination. It also allowed us to quantify the prevalence of certain antipatterns and informed the design of gosins (see above).
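As a rough illustration of the idea, the sketch below extracts caller/callee rows from source code and then runs a dead-code "query" over them in memory. It is a simplification under stated assumptions: it is purely syntactic (no type checking), it only sees calls to plain identifiers within one file, and the callRow type and the extractCalls and uncalled functions are hypothetical names. The real tool stored far richer data in a SQL database.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// callSample is a small input containing one function nobody calls.
const callSample = `package sample

func main()   { used() }
func used()   { helper() }
func helper() {}
func orphan() {}
`

// callRow is one row of a hypothetical "calls" table.
type callRow struct {
	Caller string
	Callee string
}

// extractCalls returns the declared function names (in source order) and
// one row per direct call of the form name(...).
func extractCalls(src string) ([]string, []callRow, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, nil, err
	}
	var funcs []string
	var calls []callRow
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue
		}
		funcs = append(funcs, fn.Name.Name)
		if fn.Body == nil {
			continue // declaration without a body
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if call, ok := n.(*ast.CallExpr); ok {
				if id, ok := call.Fun.(*ast.Ident); ok {
					calls = append(calls, callRow{Caller: fn.Name.Name, Callee: id.Name})
				}
			}
			return true
		})
	}
	return funcs, calls, nil
}

// uncalled plays the role of a dead-code query: functions other than
// main that never appear as a callee.
func uncalled(funcs []string, calls []callRow) []string {
	called := make(map[string]bool)
	for _, c := range calls {
		called[c.Callee] = true
	}
	var out []string
	for _, f := range funcs {
		if f != "main" && !called[f] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	funcs, calls, err := extractCalls(callSample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(uncalled(funcs, calls))
}
```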
The Coinbase Service Framework
Near the end of my tenure at Coinbase, I switched from focusing on developer tooling to taking over stewardship of the Coinbase Service Framework (CSF). This was a kitchen-sink library that undergirded most services. It had had no proper ownership for most of the time I was there. Instead, teams from all over the company threw into that library whatever they felt might be helpful to other teams. The result after many years was a jumble of tangled dependencies, differing styles and patterns, and of course a heap of bugs. Many teams had reimplemented what they needed rather than depend on CSF.
I tech-led a small team that narrowed and stratified API surfaces, unblocked upgrades away from end-of-life dependencies (notably v1 of the AWS Go SDK), standardized on idiomatic Go context-object usage and error handling, identified and fixed dozens of latent bugs (with agentic AI help), established proper versioning practices, and more. Several of our improvements required callers of CSF to switch from deprecated API calls to safer, more idiomatic new ones. To aid in this I wrote a tool called csffix (modeled on go fix) for automatically rewriting the code of callsites, making it easy for teams across all of Coinbase to modernize their usage of CSF.
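A callsite rewriter in the spirit of go fix can be sketched with the standard go/ast and go/format packages. The example below only renames a deprecated call; the package name csf as used here, the function names MustStart and Start, and the import path are invented for illustration and are not real CSF APIs, and the actual csffix rewrites are not shown.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// fixSample contains one call that should be rewritten and one
// similarly named call on a different package that should not.
const fixSample = `package sample

import "example.com/csf"

func run() {
	csf.MustStart()
	other.MustStart()
}
`

// rewriteCalls renames calls of pkg.oldName(...) to pkg.newName(...) and
// returns the reformatted source plus the number of callsites changed.
func rewriteCalls(src, pkg, oldName, newName string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	changed := 0
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if id, ok := sel.X.(*ast.Ident); ok && id.Name == pkg && sel.Sel.Name == oldName {
			sel.Sel.Name = newName
			changed++
		}
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), changed, nil
}

func main() {
	out, n, err := rewriteCalls(fixSample, "csf", "MustStart", "Start")
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Printf("rewrote %d callsite(s):\n%s", n, out)
}
```

Matching on the package identifier alone is a simplification: a robust tool would resolve imports (and aliases) with type information before rewriting, and many real migrations change arguments as well as names.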
Project Teflon
I led this pet project of CEO Brian Armstrong's. His idea was to form a "tiger team" of engineers spanning all of Coinbase to address day-to-day friction that could be eliminated with the right automation. Between four and six team members, each on loan for a six-month rotation from their home team, helped to identify and address these opportunities. We devised a ranking system for our automation ideas: an idea's "Teflon number" was roughly the number of person-hours we expected to be able to eliminate divided by the effort involved in automating them away. In the 18 months of the team's existence we developed around a dozen high-impact tools and processes that became part of everyone's daily workflow.
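The ranking arithmetic can be sketched in a few lines. The idea type, its field names, the units (person-hours for both numerator and denominator), and the sample figures are all assumptions made for illustration; the text above only says the number was roughly expected savings divided by automation effort.

```go
package main

import (
	"fmt"
	"sort"
)

// idea is a candidate automation project. (Hypothetical type.)
type idea struct {
	Name        string
	HoursSaved  float64 // person-hours expected to be eliminated
	EffortHours float64 // person-hours expected to build the automation
}

// sampleIdeas holds made-up figures, not real project data.
var sampleIdeas = []idea{
	{Name: "A", HoursSaved: 400, EffortHours: 40},   // 400/40 = 10
	{Name: "B", HoursSaved: 1000, EffortHours: 500}, // 1000/500 = 2
	{Name: "C", HoursSaved: 120, EffortHours: 8},    // 120/8 = 15
}

// teflonNumber is savings divided by effort; higher is better.
// EffortHours is assumed to be positive.
func (i idea) teflonNumber() float64 {
	return i.HoursSaved / i.EffortHours
}

// rank returns a copy of ideas ordered from highest to lowest Teflon number.
func rank(ideas []idea) []idea {
	ranked := append([]idea(nil), ideas...)
	sort.Slice(ranked, func(a, b int) bool {
		return ranked[a].teflonNumber() > ranked[b].teflonNumber()
	})
	return ranked
}

func main() {
	for _, i := range rank(sampleIdeas) {
		fmt.Printf("%s: %.1f\n", i.Name, i.teflonNumber())
	}
}
```

Note how the ratio favors the small, cheap idea C over the idea B with the largest absolute savings, which is the point of ranking by return on effort.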
GitHub migration
Coinbase operated its own GitHub Enterprise Server (GHES), which is the red-headed stepchild of GitHub products: last to receive features and fixes, imperfectly supported, and just different enough from github.com to be annoying. It also didn't scale very well - which didn't become apparent until after GHES had become integral to essentially all internal development processes, not just version control. It acted as an authentication server and a database for numerous other services, and was being used in dozens of off-spec ways causing noisy-neighbor problems that we couldn't control without hobbling the rest of Coinbase.
My team owned GHES and was responsible for addressing the incidents that arose from these problems and for doing what scaling we could. Eventually we saw the writing on the wall - we were going to hit some hard limits - and began a gigantic project to migrate to GitHub Enterprise Cloud (GHEC), which effectively is github.com but for private enterprises. This took months of planning, coordination with our internal security teams and several others (chiefly the worst abusers of the GHES API), and 6-10 contractors to augment my team for the duration. Even though my role at this time was as a manager, I also contributed meaningful amounts of code, such as the aforementioned Go module proxy rewrite.
This was a migration many times more complex than MrExit (see above) but, thanks to a lot of dedicated talent and (in the later stages) help from AI agents, at the time I left Coinbase the migration was more than 90% complete, and shutdown of GHES was in sight.
Other migrations: GitHub Actions and Artifactory
My teams owned most of the company's build and release infrastructure. During my tenure as manager, we migrated CI from Buildkite to GitHub Actions, and migrated a variety of artifact-publishing mechanisms to JFrog Artifactory. In those cases we also tamed the wild west of differing project layouts and workflows so that many more teams followed our "paved roads," simplifying support, maintenance, incident resolution, and feature development.