Rewriting quic-go's test suite
Over the course of the last year, we have completed an exceptionally large behind-the-scenes refactoring in quic-go: we rewrote almost 50,000 lines of test code.
Comparing Testing Frameworks
When quic-go was started more than 9 years ago, Go was a pretty young language. Using a test framework like Ginkgo felt natural, especially coming from a Ruby background.
Ginkgo is very opinionated, and encourages users to write tests in a way that’s very specific to this framework. Let’s take a look at an example from the project’s README:
var _ = Describe("Checking books out of the library", Label("library"), func() {
	var library *libraries.Library
	var book *books.Book
	var valjean *users.User
	BeforeEach(func() {
		library = libraries.NewClient()
		book = &books.Book{Title: "Les Miserables", Author: "Victor Hugo"}
		valjean = users.NewUser("Jean Valjean")
	})
	When("the library has the book in question", func() {
		BeforeEach(func(ctx SpecContext) {
			Expect(library.Store(ctx, book)).To(Succeed())
		})
		Context("and the book is available", func() {
			It("lends it to the reader", func(ctx SpecContext) {
				Expect(valjean.Checkout(ctx, library, "Les Miserables")).To(Succeed())
				Expect(valjean.Books()).To(ContainElement(book))
				Expect(library.UserWithBook(ctx, book)).To(Equal(valjean))
			}, SpecTimeout(time.Second * 5))
		})
		Context("but the book has already been checked out", func() {
			var javert *users.User
			BeforeEach(func(ctx SpecContext) {
				javert = users.NewUser("Javert")
				Expect(javert.Checkout(ctx, library, "Les Miserables")).To(Succeed())
			})
			It("tells the user", func(ctx SpecContext) {
				err := valjean.Checkout(ctx, library, "Les Miserables")
				Expect(err).To(MatchError("Les Miserables is currently checked out"))
			}, SpecTimeout(time.Second * 5))
			// ...
		})
	})
})
The Describe block usually contains all tests for a given object / struct. Individual tests are wrapped in It blocks, which can be grouped using When and Context blocks. BeforeEach blocks are executed before each test, with multiple BeforeEach blocks being allowed.
At first glance, the hierarchical structure might look very powerful and expressive, but once the test suite grows to thousands of lines, it becomes very hard to keep track of the nested BeforeEach blocks (there’s also a JustBeforeEach to make things even more complex) and of the state shared between tests. Notice that even this supposedly simple example defines three variables (library, book and valjean) that are shared by every test in the block and effectively act as globals.
Let’s now take a look at how the above example would look using standard Go testing, with testify for slightly nicer-looking assertions.
func TestLibrary(t *testing.T) {
	library := libraries.NewClient()
	book := &books.Book{Title: "Les Miserables", Author: "Victor Hugo"}
	user1 := users.NewUser("Jean Valjean")
	require.NoError(t, library.Store(context.Background(), book))
	// lend it to the reader
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	require.NoError(t, user1.Checkout(ctx, library, "Les Miserables"))
	require.Contains(t, user1.Books(), book)
	require.Equal(t, user1, library.UserWithBook(ctx, book))
	// the book has already been checked out
	user2 := users.NewUser("Javert")
	require.EqualError(t,
		user2.Checkout(ctx, library, "Les Miserables"),
		"Les Miserables is currently checked out",
	)
}
This test does everything the Ginkgo example does, but it’s much more straightforward: it doesn’t use any global variables and it has only a single level of indentation. Most importantly, it’s really easy to read and understand: a book is checked out by one user, and then cannot be checked out by another user. It also needs only about half as many lines of code.
When we started the process of translating the test suite in September last year, all of our testing code looked similar to the Ginkgo example. Package by package, file by file, we migrated the test suite to use standard Go testing, starting with the simpler cases and gradually moving on to the more complex ones.
Unfortunately, the various AI tools were not really able to help with that task beyond very simple things like translating an Expect to a require.NoError statement, no matter the amount of prompting and context provided.
We completed this work in June this year, after a whopping 68 pull requests. The tracking issue is easily the issue with the most linked pull requests in the history of the project.
What’s Next
Now that the transition has been complete for a couple of months, we can already see it paying dividends. For example, we just added support for the QUIC Stream Resets with Reliable Delivery extension, which required quite extensive changes to the QUIC stream state machinery (which is not trivial to implement in the first place). It’s hard to imagine making these kinds of changes to a high-stakes code path without rock-solid test coverage.
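We’ve also started using the new testing/synctest package, which became generally available in Go 1.25 (after an experimental release in Go 1.24). synctest runs a test in an isolated “bubble” with a synthetic clock, which makes the execution of timing-dependent tests reproducible.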
At the moment, we’re adopting synctest to combat flaky tests. In the future, we’re planning to make full use of the synthetic clock to test QUIC’s loss recovery mechanism without having to wait for real timeouts.
The Importance of Sustaining Open-Source
Behind-the-scenes work like this is absolutely vital to the success of any open-source project. It’s tedious, it’s time-consuming, and it’s not the most glamorous work. But it’s the basis for future work on new features and performance improvements. As an open-source maintainer, I rely on the support of the community to make this work possible. If you or your company is using quic-go, especially in a commercial setting, please consider sponsoring the project.
The quic-go project and the QUIC Interop Runner are community-funded projects.
If you find my work useful, please consider sponsoring: