28 July 2023

Realising PHPUnit is not just Unit tests

It is far too easy to assume that PHPUnit only does Unit tests. I was guilty of this until a recent shift in thinking and now the latest linguistic faux pas on my list of things to hate is when colleagues casually refer to the PHPUnit tests as "the unit tests". They are not the unit tests, they are just "the tests"!

As of this month I'm encouraging us all to watch our language and form a new habit, because when we run the vendor/bin/phpunit --testdox command in our project it actually executes a mix of unit tests (tiny bits of test code that operate on 1 method in isolation) and feature tests (full end-to-end application features, with a working database and minimal mocking).

I now put PHPUnit into the group of technologies with misleading names. I suspect that if the maintainer, Sebastian Bergmann, could go back in time knowing what he knows now, he would probably choose a less misleading one.

Here's what my chain of logic used to look like until recently...

  • Rule No. 1: Unit tests must not require an environment with a database
  • + Misperception: PHPUnit is only Unit tests
  • + Behat is the thing for feature tests (e.g. with a database)
  • ...Therefore we must use Behat for tests that test the database schema / queries / repositories.
  • ...Therefore it is appropriate to use both PHPUnit and Behat in our project despite the complexity.

That line of thinking, with its innocent misperception, led to unnecessary complexity, and if there's one thing that defines my style of software engineering, it's a never-ending battle against unnecessary complexity. With 2 test frameworks everything had to be nested 1 level deeper: the PHPUnit tests in one folder, the Behat ones in another. This doubles the number of config files and bootstrap files. The maintainers have to run both suites. The developers have to choose whether something should be tested in one suite or the other, and they might waste time writing tests in both that ultimately test the same thing.

Complexity and confusion in tests naturally lead to resistance to maintaining them, and they inflate the estimates for future changes when features inevitably evolve.

Now I strongly believe in the value of the natural language scripts that sit in front of Behat tests. The ability to share those scripts with semi-technical colleagues and hide all the foreign-looking code from them is valuable. But the overhead of producing these is high. Writing neat reusable snippets in your context files is an exhausting task and you never manage to avoid duplication. Not all developers are good at striking the balance between including too much or too little detail. Then when you think you have achieved the perfect self-describing natural language script, you share it with a semi-technical colleague and they still need a developer to explain some delicate nuance and you end up questioning if the effort was worth it.

But all the stress points described above should be somewhat resolved by my latest approach...

Essentially I reached a point of frustration that led me to just let go of the rule that "unit tests should not require a database under them". I decided this because, as I did a deep dive into some colleagues' Behat tests, I found they were mocking the 3rd-party API response (fair enough) but they were also mocking the repository layer (which in turn cuts the database schema out of being tested) and the logging, and the queues and even some of the transformers. They had written so much test code ensuring that their tests do NOT test certain parts of the application that they then had to write more tests to test the bits they'd mocked! We were never really sure if the whole thing played nicely together, and it was infuriating when we had to refactor something or change the way it behaved. Madness!

So I envisaged a feature test that mocked only the minimal parts: just the bits we absolutely cannot have in a test environment (like connections to 3rd-party APIs). Everything else, the database, the repos, the logging, the queues, all the interconnected fragile bits that are just as likely to break, still gets tested as it actually is in the application code.
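To make that idea concrete, here is a minimal sketch of such a feature test. Everything named in it is hypothetical and not from my project: the GeocoderClient interface stands in for a 3rd-party API, LocationRepository stands in for a real repository, and the in-memory SQLite database (which needs the pdo_sqlite extension) is only an assumption to keep the example self-contained. The point is simply that the API is the one thing replaced with a stub, while the schema and the queries run for real.

```php
<?php

declare(strict_types=1);

use PHPUnit\Framework\TestCase;

// Hypothetical boundary to a 3rd-party API: the only thing this test replaces.
interface GeocoderClient
{
    /** @return array{lat: float, lng: float} */
    public function lookup(string $postcode): array;
}

// Hypothetical repository that runs real SQL through PDO.
final class LocationRepository
{
    public function __construct(private PDO $pdo)
    {
    }

    public function save(string $postcode, float $lat, float $lng): void
    {
        $statement = $this->pdo->prepare(
            'INSERT INTO locations (postcode, lat, lng) VALUES (?, ?, ?)'
        );
        $statement->execute([$postcode, $lat, $lng]);
    }

    public function countAll(): int
    {
        return (int) $this->pdo->query('SELECT COUNT(*) FROM locations')->fetchColumn();
    }
}

final class AddLocationFeatureTest extends TestCase
{
    public function testAddingALocationStoresItInTheDatabase(): void
    {
        // A real (if throwaway) database, so the schema and queries are exercised.
        $pdo = new PDO('sqlite::memory:');
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $pdo->exec(
            'CREATE TABLE locations (postcode TEXT NOT NULL, lat REAL NOT NULL, lng REAL NOT NULL)'
        );

        // The 3rd-party API is stubbed because it cannot exist in a test environment.
        $geocoder = $this->createStub(GeocoderClient::class);
        $geocoder->method('lookup')->willReturn(['lat' => 51.5, 'lng' => -0.12]);

        $repository = new LocationRepository($pdo);
        $coordinates = $geocoder->lookup('SW1A 1AA');
        $repository->save('SW1A 1AA', $coordinates['lat'], $coordinates['lng']);

        self::assertSame(1, $repository->countAll());
    }
}
```

In a real project the interface and repository would live in the application code rather than in the test file, and the test would more likely drive them through the application's own entry point than call them directly.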

It was only after deciding to rebel against the rules and run PHPUnit with a database under it that I discovered I was operating on a misconception. I found PHPUnit docs elegantly describing how to run feature tests with real reading/writing to a database as well as the classic unit tests.

After that point the blocks all started falling into place. I deleted all the Behat stuff in the project and moved the PHPUnit tests up a level into the root of the "tests" folder. This made it easier to use the generate test stub feature in my IDE because that's the default location. I removed an entire step from the pipeline, stripped a bunch of setup and boilerplate and simplified the testing section of the project's README.md.

But then, the real cherry on the cake was discovering the TestDox attributes section in the docs. These allow you to override the sentence that is displayed when a test is run. For example, if the test method is named testPostOrganisationLocationEndPointReturns201() then the sentence that will be displayed is just the camel-case split version, "Post organisation location end point returns 201". But with a TestDox attribute you can define it as "POST /organisation/{org}/location end-point returns 201", which is much more descriptive: just by uppercasing the word "post" and adding the forward slashes to the route, it looks like how you'd describe it in a spec. And woah! Suddenly I realise that I've gotten back the 1 thing that I thought I was losing by dropping Behat... the natural language scripts that can be shared with semi-technical colleagues.
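For anyone who hasn't seen one, this is roughly what a TestDox attribute looks like. It is a sketch that assumes PHPUnit 10 or later, where attributes are available (older versions use a @testdox annotation in the docblock instead). The class name and the postLocation() helper are hypothetical placeholders; in a real feature test the helper would send the request through the application rather than return a fixed status code.

```php
<?php

declare(strict_types=1);

use PHPUnit\Framework\Attributes\TestDox;
use PHPUnit\Framework\TestCase;

final class OrganisationLocationTest extends TestCase
{
    // Without this attribute, --testdox prints a sentence derived from the method name.
    #[TestDox('POST /organisation/{org}/location end-point returns 201')]
    public function testPostOrganisationLocationEndPointReturns201(): void
    {
        $statusCode = $this->postLocation('acme', ['name' => 'Head office']);

        self::assertSame(201, $statusCode);
    }

    /**
     * Hypothetical stand-in for a real HTTP call into the application.
     * It always returns 201 so that the sketch stays self-contained.
     *
     * @param array<string, string> $payload
     */
    private function postLocation(string $org, array $payload): int
    {
        return 201;
    }
}
```

Running vendor/bin/phpunit --testdox against this class should list the test under its class heading using the sentence from the attribute instead of the one generated from the method name.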

R.I.P. Behat and all your complexity, messy context files and awkward mocking. Long live neatly-structured PHPUnit running both unit tests and feature tests, with a light dusting of TestDox attributes.