If you happen to be in a meeting full of IT specialists talking about a project, or designing a new product or feature, you are bound to hear the word test coming up at some point.
No one disagrees that tests are important; no wonder we spend so much time building different environments to fully check our changes. However, even though we all agree that we must test, it is quite hard to agree on how we will test.
We all know different kinds of tests we can run; I will not discuss in this article what a unit test, a stress test or an integration test is. From testing our code in isolation to testing our whole infrastructure, today we have the technology to build and run every test our organisations might dream of. But why are we even doing it? What are we trying to achieve?
Confidence, according to the Oxford Learner's Dictionary, is defined as “the feeling that you can trust, believe in and be sure about the abilities or good qualities of somebody/something”. In IT terms, I define it as “I am sure I am not gonna be paged this weekend”.
Jokes aside, confidence is something we build to bring evidence that what we deliver is fulfilling what we have previously agreed with someone. Let’s break down some concepts here and, together, try to find some sense out of it.
We create things for our customers, and these customers can be external ones like other organisations or our society. They can also be internal customers like other teams and departments. The role of a customer can be played by a myriad of different people. However, for our discussion, it does not matter who our customer is but what expectations we have established with them.
When it comes to agreements, the tests we build will collect the necessary evidence that we are meeting what was promised; therefore, if our agreement is unclear, that might directly impact us.
Figuring out what the customer wants is hard, especially because most of the time the customer does not fully know their own needs.
Agile plays a critical role here: it taught us that iteratively building the product and putting the customer at the centre leads to a feedback cycle that positively impacts the product. Therefore, if we design tests following this flow, they will validate the feedback our customers give and will be the evidence that we are going in the right direction.
There is a rule in software architecture that states that everything is a tradeoff. What this means is that there is no right answer but actually consequences to the decisions we take, positive or negative. It also means that a decision might be right in a context but totally wrong in another, therefore software architects spend much of their time understanding the context they are part of so they can assert the tradeoffs accurately.
Context is also an umbrella term; it helps us enrich the meaning of what we do. Properly defining the context will then improve our chances of understanding the impact of a decision and, when it comes to building a test strategy and culture, will be our guiding flashlight in the darkness of unknowns and biased opinions.
Personally, I cringe every time I hear someone start an argument with “in my experience” (I am not gonna lie to you, I also do it). Although looking back in time and learning from our mistakes and successes is totally valid, we need to be careful with those stories since they happened in a different context. So, remember, taking a decision based on past experiences might lead to completely different results simply because of the differences in context.
What is part of a context? The team(s), the department, the company, the customers, the society, the government, local laws, the nature of the business, the time when it is happening, the culture, the technologies involved and a lot more things that I might be forgetting here. All of these dimensions play a smaller or bigger part in your context, and understanding them will be key to building a cohesive vision for the parties involved and reducing friction, since everyone will be on the same page.
Now we can talk about the core concept of this article: confidence. Tests are the building blocks of confidence. When we build tests, we are also building our team’s and organisation’s confidence. When we hear statements like “it’s prohibited to deploy changes on a Friday afternoon”, the message behind that actually is “our tests do not meet our level of confidence, therefore we are not comfortable with changes on a Friday, since the risk of disruption during the weekend would be higher, and our business and engineers want to avoid that”.
If we look at things from a different perspective, we see that companies with a solid DevOps and SRE culture embrace change and make it part of daily operations; it becomes routine, business as usual. That does not mean accepting higher risks; instead it means building a test strategy that, in our context, meets our confidence level and fulfills our agreements.
Risk is part of business; we all hear and repeat this mantra. What I will add is that, if we truly believe this statement, then we should mitigate risk instead of avoiding it. When we mitigate risk, we directly think about tests; when we avoid risk, we think about control. Avoiding risk is about building fences and processes to prevent or delay changes from happening (read: bureaucratic change management and multi-level approval processes), while mitigating risk is about paving roads where engineers can drive safely (read: automated continuous integration, delivery and deployment pipelines).
Although I paint this in a binary way, as if you need to select one in favour of the other, reality is not that simplistic. Some contexts will require a mix of both approaches to meet their level of confidence. That's not necessarily wrong.
Remember, it is all about reading our context, finding our confidence level and building a test strategy that will help us gather evidence that we are fulfilling the agreements we made. Those concepts may mean different things if you are trying to build a nuclear reactor, developing a vaccine, changing a payment API or building a mobile game. Again, context and confidence.
Let’s get something clear here, it is way easier to write about building a test culture than to implement one in practice. I know, I have been there, and I felt the pain. Believe me. One of the sayings we have at Schuberg Philis is that making two specialists agree on a topic is almost impossible. So, prepare for some conflict management.
Having said that, the main goal of this article is to help you create a foundation or a common vision, so everyone has the same goal and speaks the same language.
One of the fallacies we face when building a test strategy is that the quantity of tests we run is directly proportional to the confidence we build. Maybe that is true in some contexts, and if that is your reality, I am sorry. However, in most cases deciding which tests we need is critical, and we might not be able to test everything in all possible scenarios due to multiple constraints.
Keep in mind that tests cost money and time to build and run. Maybe having a shared pipeline to run unit tests is cheap, but having multiple environments replicating production, available all the time, will require some serious budget discussions. Infrastructure can be expensive, tests will require lots of it, and having to explain to a financial board that does not understand technology why your development, sandbox, integration, acceptance and QA environments cost ten times the price of production is not a position you want to find yourself in.
When the production environment stops, your customers get mad, but when your test environments stop, your engineers get mad. Remember that each environment you create, whether production or not, will need dedicated support to keep it available and up to date.
I am raising those points because the idea of testing everything in all possible scenarios is seductive (google “the nirvana fallacy”). But tests can be expensive and time consuming. So, always think about the consequences of your trade-offs in your context.
There is one more dimension I want you to be aware of before I change the topic: people. Tests do not grow on trees, nor are they found in nature spontaneously growing in a meadow with bees spreading them during spring (wow, that was quite specific). Tests take knowledge and huge amounts of time to develop, and there is a learning curve in this process that might be steep or not depending on your context. Therefore, be kind to people and understand that everyone has limitations; a good and successful test culture is not just one that mitigates risks but one that people can actually implement. In some contexts that might happen as if it were second nature, but in other contexts it might take a long time.
When a test finishes running, it generates evidence that tells us whether our hypothesis was correct or not. Humans came up with a pretty cool method to collect evidence: science. Obviously, in IT we are not as rigorous as a researcher, so we can cut some corners, but we can learn a lot from the scientific method to guide our test strategy and create a common vision on how our team should design tests.
A test must always start with a hypothesis, some truth that we want to uncover. It does not necessarily need to be anything intricate. However, it must be something we can design a test for. This hypothesis needs to be falsifiable, which means that before your test passes, it needs to fail (did I hear test-driven development?).
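A minimal sketch of that idea in Python may help. The `slugify` function and its expected behaviour are hypothetical examples of my own, not something from a real project; the point is only the red-to-green cycle:

```python
def slugify(title: str) -> str:
    """Turn a title into a lowercase, dash-separated slug (the behaviour our hypothesis describes)."""
    return "-".join(title.lower().split())

def test_spaces_become_dashes():
    # Hypothesis: "titles become lowercase, dash-separated slugs".
    # Written before slugify() existed, this test failed first (red)
    # and passes only once the implementation fulfils it (green).
    assert slugify("Building Confidence") == "building-confidence"

def test_already_lowercase():
    # A second falsifiable claim about the same behaviour.
    assert slugify("context matters") == "context-matters"

test_spaces_become_dashes()
test_already_lowercase()
```

If the assertions cannot fail under any input, the test is not falsifiable and tells us nothing about our hypothesis.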
When we were discussing the agreements we make, I stated that a lot of the time our customers will not know all the details of what they need. How can we then design tests for something that is unknown? Logically, we cannot.
However, that does not mean we should not make the effort to find those details! A test culture, just like innovation, is a tool to help organisations dip their toes into the unknown and perhaps uncover hidden gems.
After defining our hypothesis, we need a model where the hypothesis can be validated and data can be collected. Since most of us are testing software, our test environment needs to be isolated enough that our data reflects the truth. For example, if you are running a performance test for your application, you cannot have another process consuming all the machine's resources during the test, otherwise your data will be tainted and the results useless.
It is also good practice to reflect production reality as closely as possible. That means the architecture of the test environment should be as close as possible to production's. Not only the architecture: the data needed to bootstrap the test must also be compatible with production standards. However, watch out that you are not breaking any policies if you are thinking about copying data directly from production databases, or your security and compliance teams will not like you anymore (and they will have every reason not to).
Think of models like a smaller production environment. Although you don’t want them to be an exact copy (probably due to cost reasons) you want them to provide similar features, at least for the things that matter in the context of your test.
A running test is like a living organism. What I mean by this is that the subject of your test (mostly a software change and its surroundings) will contain dynamic properties. If those properties were constant, there would not even be a need to run the test. Having said that, part of a test strategy will be to come up with a set of properties (or variables) that we want to measure over time.
I used to work for a large bank, and every new change to our system required a stress test. Our hypothesis during this test was that the change would not impact the servers' resource consumption (read: CPU, memory and I/O utilisation) and page load times. Our model environment was a third of the size of the production environment, therefore we had to size the test to input a third of production's load volume.
After the test was completed, we would analyse the resource utilisation report for our infrastructure and compare it to a baseline. This baseline would be previous successful test reports and the resource consumption of the production environment during normal circumstances. This would give us an idea if the change would generate more or less stress in our production environment if deployed, therefore requiring us to scale up our production infrastructure or go back to the drawing board and optimise our software change.
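The comparison against a baseline can be sketched in a few lines of Python. The metric names, the baseline values and the 10% tolerance below are illustrative assumptions of mine, not the bank's actual thresholds:

```python
# Compare a stress-test report against a baseline and flag regressions.
# Metric names, baseline values and the 10% tolerance are illustrative assumptions.
BASELINE = {"cpu_pct": 62.0, "memory_pct": 71.0, "page_load_ms": 240.0}
TOLERANCE = 0.10  # allow up to 10% degradation before raising a flag

def find_regressions(report, baseline=BASELINE, tolerance=TOLERANCE):
    """Return the metrics whose measured value exceeds baseline * (1 + tolerance)."""
    regressions = {}
    for metric, expected in baseline.items():
        measured = report[metric]
        if measured > expected * (1 + tolerance):
            regressions[metric] = (expected, measured)
    return regressions

latest_run = {"cpu_pct": 64.5, "memory_pct": 83.0, "page_load_ms": 251.0}
flagged = find_regressions(latest_run)
# memory_pct grew from 71% to 83%, more than 10% worse, so it is flagged;
# cpu_pct and page_load_ms stay within tolerance.
print(flagged)  # {'memory_pct': (71.0, 83.0)}
```

An empty result means the change stayed within tolerance and the hypothesis held; a non-empty one sends us back to the drawing board, exactly the decision the baseline comparison was there to support.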
It is important to have a plan not only to capture these metrics but also to store them for a long time, since this data is vital to understand how your product is changing over time. Without this historical information, combined with the high turnover at certain organisations, we are forced to rely only on the experience of senior engineers who have been around long enough to tell us whether the results are fine or not.
Another lesson science taught us is that memory is not trustworthy (google: “memory biases”), so I cannot stress enough how important a retention plan for your data is.
How you want to communicate the results is also important. Some tests can be kept internal to your team, but some organisations will require formal communication for the deployment approval process to go through. So, keep in mind that test results are also an amazing communication tool (infographics, right?).
Well, that was quite a trip we went through here, and I appreciate you staying until the end. Remember that a big part of building a test culture is understanding the level of confidence and reading the context you are part of.
Always strive to use your tests as a tool to collect evidence to support your hypotheses, so you do not deviate from your goals and the vision of your customers.
I hope this gives you and your team the common mindset to build a test culture, reap customer satisfaction, and help engineering and business adapt faster to change.