In this article, I will try to cover some do’s and don’ts for system testing from the perspective of an open-source project. To keep things simple, it all boils down to running the system as our customers would: think of the different use-cases of your system, the environment where it runs, the configuration options, and more.
As an early-stage open-source project, we face many unknowns here. In enterprise software, users are managed as named accounts and visibility into actual use cases is high; an open-source community doesn’t always share how it uses the project.
Sure, when you designed your system, you planned for certain use-cases and wrote your system tests accordingly, but how many of the actual use-cases are they covering?
Why do you need system tests (OSS perspective)?
There’s the trivial answer that applies to almost every system out there: make sure your system is functional end-to-end, meets client requirements, and that integrations with external services work as expected (there’s some overlap with integration testing, but we won’t get into that). For open-source projects it’s more challenging: you are not in control of who deploys the system or how they use it, since anyone can simply run it. You can try to collect some data on those runs, but inferring usage patterns isn’t going to be easy. The system has use-cases you planned for and others you didn’t.
Another consideration is the contribution of external developers. Contributors come from different backgrounds and with different expertise. Their understanding of the entire system may not be complete. System tests give more confidence in the quality of pushed changes, regardless of their origin.
Lastly, there’s the environment in which the system runs. Users can deploy it however they like: as a simple binary, or as a container with or without an orchestration system. Good system testing can also cover some of the unknown use-cases and environments that unit and integration tests usually miss.
What should you test?
Services and dependencies
First of all, it’s important to pick the services your clients are most likely to choose. Take lakeFS as an example. An installation consists of three pillars: the server, the metadata DB (Postgres) and the underlying data lake (storage layer). The server is the SUT (system under test), so we always deploy the Docker image of the version being tested.
For the DB, we chose a simple Postgres container: performance doesn’t matter for functional system testing (benchmarks are measured elsewhere), and a container costs less than a managed service such as RDS. For the storage, both S3 and Google Cloud Storage buckets seemed like great candidates, since compatibility with the two storage options is one of our key promises to users. So we ended up running all system tests twice, once for each storage solution (for lakeFS it’s a simple change of configuration). Even though this tests the same logic twice, it was important to exercise both storage options.
Choosing a starting point
We found it is better to deploy the system from scratch each time, instead of upgrading existing servers or relying on old data. We took this approach because:
- The first experience with the system is very significant for the user, so we need to guarantee that we always test the system from initial setup.
- As an OSS project, we enjoy the collaboration of dozens of developers (30 contributors and counting 😊). We made it our goal to ensure their development experience:
  - Is easier. There are no in-between states to deal with.
  - Is repeatable. System tests that run locally and produce the same failures contributors see when they open a PR are part of that effort.
Having said that, we’ll also add tests for upgrades, since users will upgrade their servers from time to time. You don’t want a smooth first installation followed by a system crash during an upgrade.
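The from-scratch approach can be sketched as a small helper that brings up a brand-new stack for each run and removes it, volumes included, afterwards. This is only a sketch under assumptions: it presumes the stack is described by a Docker Compose file, and the `fresh_stack` name and `systest-` prefix are made up for illustration.

```python
import subprocess
import uuid
from contextlib import contextmanager


@contextmanager
def fresh_stack(compose_file, run=subprocess.run):
    """Start an isolated stack for one test run and tear it down afterwards.

    `run` is injectable so the helper can be exercised without Docker.
    """
    # A unique project name keeps runs from sharing containers or volumes.
    project = f"systest-{uuid.uuid4().hex[:8]}"
    base = ["docker", "compose", "-f", compose_file, "-p", project]
    run(base + ["up", "-d"], check=True)
    try:
        yield project
    finally:
        # --volumes also removes named volumes, so no old data survives.
        run(base + ["down", "--volumes"], check=True)
```

Because the same helper can run on a laptop and in CI, a contributor should see locally the same starting state the PR checks see.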
Don’t test everything
For most projects, system tests aren’t the only layer of testing. After covering most of the flows with unit and integration tests (100% coverage?), don’t try to redo everything with system tests: it wastes both development time and test execution time.
We have more than 60 different API operations. All of them are covered by unit tests and integration tests, so there’s really no need to repeat the majority of those calls in system tests. As we mentioned before, find the use-cases that best fit your clients, and make the calls that reflect those. If more use-cases come up later, you can always add more tests.
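To make the idea of testing a use-case rather than individual operations concrete, here is a minimal sketch. The `FakeClient` and its `upload`, `commit` and `read` methods are hypothetical and exist only so the example runs; they are not the real lakeFS API, and a real system test would point the same scenario at the deployed server.

```python
class FakeClient:
    """In-memory stand-in for an API client (hypothetical, for illustration)."""

    def __init__(self):
        self.staged = {}
        self.committed = {}

    def upload(self, path, data):
        self.staged[path] = data

    def commit(self, message):
        # Move everything staged into the committed state.
        self.committed.update(self.staged)
        self.staged.clear()

    def read(self, path):
        return self.committed[path]


def scenario_upload_commit_read(client):
    """One user-facing flow: write an object, commit it, read it back."""
    client.upload("data/a.txt", b"hello")
    client.commit("add a.txt")
    assert client.read("data/a.txt") == b"hello"


if __name__ == "__main__":
    scenario_upload_commit_read(FakeClient())
```

The scenario touches three operations in the order a user would, which is the kind of coverage unit tests for each call in isolation do not give.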
Where/When should you run your system tests?
It’s best practice to require system tests for every PR. In practice, projects sometimes skip them to save time during PRs, to reduce ops costs, or because the tests are flaky. Open source contributors rock, but their familiarity with and understanding of the system vary, so make sure your tests run on every PR change and on every merge to a release branch. For that to happen, they need to complete in a reasonable time and also be resilient. The latter might sound trivial, but switching off flaky system tests is more common than you would think.
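One common source of flakiness, though certainly not the only one, is waiting for a service with a fixed sleep. A minimal sketch of a polling helper with a deadline follows; the `wait_until` name and its default values are assumptions chosen for illustration.

```python
import time


def wait_until(check, timeout=30.0, interval=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `check` until it returns True or `timeout` seconds have passed.

    Returns True on success and False on timeout, so the caller can fail
    the test with a clear message instead of hanging.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test would call it with a readiness probe, for example a function that returns True once the server answers a health request, and assert on the result. Polling finishes as soon as the system is ready, which helps with run time as well as resilience.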
As maintainers, you want to avoid friction in contributors’ development cycles. They already have to familiarize themselves with an unknown codebase, understand the open tasks and where they can help, deep-dive into one or more areas and adhere to the project’s code standards. You want them to feel the project is well-maintained on the one hand, and that they can reach out with any question on the other. Constantly failing checks send a bad signal to new and existing developers.
We use GitHub Actions for our CI pipeline, which deploys the SUT and connects it to AWS and GCP services. Secrets (like AWS secret keys) are stored as GitHub Secrets and injected at runtime. External contributors fork our repository and submit PRs from their forks. Up until recently, secrets were inaccessible from forked PRs, which made our CI fail for external contributors.
GitHub recently introduced a new trigger that allows access to secrets from forked PRs, but you need to be very careful when using it. A malicious user can still change the code to extract the secrets, submit a PR from a fork, and steal them.
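The trigger is not named here; it is presumably GitHub Actions' `pull_request_target` event, which does expose secrets to workflows triggered by forked PRs. One way to stay careful is to gate secret-dependent steps on an explicit maintainer decision. The sketch below assumes the workflow has parsed the event payload into a dict; the `safe-to-test` label and both function names are hypothetical, not part of the project's actual CI.

```python
def is_fork_pr(event):
    """Return True when the event is a PR whose head lives in another repo."""
    pr = event.get("pull_request")
    if pr is None:
        return False  # not a pull request event (e.g. a push)
    head_repo = pr["head"].get("repo")
    if head_repo is None:
        return True  # head repository deleted: treat as untrusted
    return head_repo["full_name"] != pr["base"]["repo"]["full_name"]


def may_use_secrets(event, trusted_label="safe-to-test"):
    """Allow secrets for same-repo events, or fork PRs a maintainer labeled."""
    if not is_fork_pr(event):
        return True
    labels = {label["name"] for label in event["pull_request"].get("labels", [])}
    return trusted_label in labels
```

A label is only as good as the review behind it: it stays on the PR when new commits are pushed, so a real setup would also need to remove it on every update and have a maintainer re-review before re-applying it.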
We hope we made developing system tests for OSS a bit easier. As with any other system, testing it as a whole is fundamental for validating functionality. With OSS, you need to design the tests to match the use-cases of unknown clients, while maintaining a welcoming development experience for external contributors.