As systems become more isolated and autonomous, teams move away from shared, centrally managed test data. Legacy test data, which many teams still use, can cause problems when it changes without team members’ knowledge. More independent systems bring plenty of advantages, but you now have to deal with the pain when your older test data management strategy doesn’t hold up. You need a new way to manage your test data and to make your problems visible.
In this post, you’ll learn how to strategize to conquer test data, just as a general strategizes to conquer a territory. You’ll find out the key elements of a test data management strategy, including what tools to use and how to measure success. We’ll also cover some tactics you can apply to ensure a strong strategy. Consider this an Art of War for test data management.
I hope this post gives you the confidence to implement your own strategy successfully. Let’s get started!
Map the Landscape
An effective general surveys the landscape where a battle is likely to occur. Based on that landscape, they choose the best troops, equipment, and movement to win. Too many times it seems big consulting companies come in and give “expert advice” on best practices. They base these practices on what other companies have done, or what their “gut” tells them. But they didn’t pay attention to the landscape those other companies are in. If a 10-person shop tries to follow Facebook’s test policies, that shop will burn through tons of cash with little benefit.
Like the effective general, you need to map your landscape before you can form your strategy. First, you need to know which views of your test data are worth looking at, across the organization. Second, you need to invest in tools to give you those views.
Quite a View!
Here’s a non-exhaustive list of ways to look at your test data in order to see your landscape. Such views will give you insights into where your problems lie.
- Dependency graph. Looking at who uses which test environments and data shows you the potential impact any given data set can have. The more dependents a test data environment has, the worse it is for you if there’s a problem. On the flip side, the higher the impact, the more valuable it may be to include those environments first in your strategy.
- Steps to get data ready. Looking at the significant steps it takes to get data into a system can let you know where people may be blocked, waiting to test their own systems. This can let you easily identify lead-time bottlenecks.
- Test environment uptimes. Unstable (low uptime) test environments will continually cause problems for anyone using them. Teams that deal with instability have to burn time diagnosing such problems, only to realize there’s nothing they can do but wait. This view also pairs well with your dependency graph. The more dependents an unstable system has, the more money it’s costing you.
- Test data sensitivity. There’s a growing set of laws and standards around data that you need to keep ahead of. These include HIPAA, PCI DSS, and GDPR, along with the many rules that cover personally identifiable information (PII). You’ll need visibility into which data sets fall under which rules. This will help you avoid unnecessary liability.
You’ll want to find tooling and processes that will support these views of your test data as part of your strategy.
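To make the idea of combining views concrete, here’s a minimal Python sketch that joins a dependency view with an uptime view. Everything in it is an assumption for illustration: the team names, environment names, and uptime figures are made up, and "dependents multiplied by downtime fraction" is just one plausible way to score impact, not a standard metric.

```python
from collections import defaultdict

# Hypothetical inventory: (team, test environment it depends on).
USAGE = [
    ("checkout-team", "staging-payments"),
    ("orders-team", "staging-payments"),
    ("mobile-team", "staging-payments"),
    ("search-team", "staging-catalog"),
    ("orders-team", "staging-catalog"),
]

# Hypothetical uptime fractions measured over some recent window.
UPTIME = {"staging-payments": 0.92, "staging-catalog": 0.99}


def dependents_by_environment(usage):
    """Group the teams that depend on each test environment."""
    dependents = defaultdict(set)
    for team, environment in usage:
        dependents[environment].add(team)
    return dependents


def rank_by_impact(usage, uptime):
    """Order environments by (number of dependents) x (downtime fraction).

    Environments with no uptime record are treated as fully available.
    """
    dependents = dependents_by_environment(usage)
    scores = {
        env: len(teams) * (1.0 - uptime.get(env, 1.0))
        for env, teams in dependents.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for env in rank_by_impact(USAGE, UPTIME):
        print(env)
```

In this made-up inventory, the payments environment ranks first because it has both more dependents and more downtime, which makes it a natural candidate to bring into your strategy early.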
Understand Your Problems
After a general maps the landscape, they pinpoint their biggest risks and weaknesses. They want to play to their strengths in the coming battle while shoring up these weaknesses. In a similar way, you’ll want your landscape data to drive your focus, spotting weaknesses in your test data management.
Costs can really hit you in two main areas: lead time until data readiness and liability.
- Lead time until data readiness is how long it takes from knowing you need data until someone on your team uses it for that need. This includes putting the work in your team’s backlog, getting approvals to add or change the data, and putting the data into the environment. With your tooling in place, you can see where the bottlenecks lie. One big issue is manual approval gates. When an outside team must approve the data before another team uses it, expect much idle time until the data is ready. A long lead time also hurts more the more teams depend on that data in your dependency graph. Think of it this way: cost equals lead time multiplied by number of dependents.
- Liability is a classic risk. As I mentioned earlier, more kinds of data are being treated as sensitive and are protected by a growing number of laws. This includes credit card numbers, health information, and even the full names of your users. You’ll want to profile your test data so you understand what level of sensitivity each data set has. Then you can analyze the risk that someone may fine you or sue you for being out of compliance.
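The rule of thumb above, cost equals lead time multiplied by number of dependents, can be sketched in a few lines of Python. The stage names (backlog, approval, load) mirror the steps described in the lead-time bullet; the `DataRequest` class, its field names, and the example numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataRequest:
    """One request for test data, with time spent in each stage (in days)."""

    name: str
    backlog_days: float
    approval_days: float
    load_days: float
    dependents: int  # teams waiting on this data

    @property
    def lead_time_days(self) -> float:
        """Total time from knowing the data is needed to it being usable."""
        return self.backlog_days + self.approval_days + self.load_days

    @property
    def cost(self) -> float:
        """Rule-of-thumb cost: lead time multiplied by number of dependents."""
        return self.lead_time_days * self.dependents


def slowest_stage(request: DataRequest) -> str:
    """Name the stage that contributes the most to the lead time."""
    stages = {
        "backlog": request.backlog_days,
        "approval": request.approval_days,
        "load": request.load_days,
    }
    return max(stages, key=stages.get)


if __name__ == "__main__":
    request = DataRequest("customer-profiles", 2.0, 5.0, 1.0, dependents=4)
    print(request.lead_time_days, request.cost, slowest_stage(request))
```

With these example numbers, the lead time is 8 days, the cost is 32 team-days, and the approval gate is the slowest stage, which matches the point about manual approvals being a common bottleneck.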
A general going to war has an end goal. It could be to gain new land. Perhaps it’s to drive the enemy out of their base. The noblest may be to defend territory from attack. Whatever the goal is, an effective general knows when they’ve achieved their objectives. Like that general, you’ll want to develop objectives that you can lead your team toward.
When you’re creating goals, try the objectives and key results (OKRs) template popularized by Google. You’d start with a high-level objective, usually centered around a significant problem. For example, you may want to reduce your data compliance liability by 30%. Then you tie that objective to the key results you’ll see when it happens. In this case, you may see a reduction in exposed PII data. Or you may increase how many teams pass third-party compliance audits.
A key aspect of these objectives is that they can be broken down into smaller steps. Your teams should be able to break down your large objectives into smaller ones they can tackle themselves, as well as into finer-grained key results.
When you know your goals, you’ll need a cadence—in other words, a rhythm, pace, or tempo at which you’ll implement your strategy and review your key results. Generals break wars into battles; you can break your strategy into a cadence of review. It’s worthwhile to do this because as in battles, no plan survives first contact.
In line with the idea of Plan, Do, Check, Act, you’ll want to look at the strengths and weaknesses of your plan so you can adapt to your landscape and improve. These reviews need to be psychologically safe for team members and focused on blamelessness. The strategy needs to improve, not the people. If you don’t have these guidelines in place, then you won’t get the accurate feedback you need to succeed.
Also, start small. Since you can slice your objectives down to smaller goals, you’ll be able to get feedback quickly—ideally within a couple of weeks of executing your strategy. Perhaps just start with one team that has many dependents, or even with one type of data set for one team. For example, if de-risking liability is a goal, you may give just one team the tooling to scrub their staging test data and see how it goes.
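As a sketch of what scrubbing tooling for that one team might look like, the Python below replaces sensitive fields in a record with deterministic pseudonyms. It is a minimal illustration under stated assumptions: the field names in `SENSITIVE_FIELDS`, the salt value, and the record layout are all hypothetical, and salted hashing is only one possible scrubbing technique. Whether it satisfies a particular regulation is a question for your compliance review, not something this sketch settles.

```python
import hashlib

# Hypothetical list of fields this team has profiled as sensitive.
SENSITIVE_FIELDS = {"full_name", "email", "card_number"}


def pseudonymize(value: str, salt: str) -> str:
    """Return a stable, non-reversible stand-in for a sensitive value."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "scrubbed-" + digest[:12]


def scrub_record(record: dict, salt: str = "staging-salt") -> dict:
    """Copy a record, replacing sensitive fields and keeping the rest as-is."""
    return {
        key: pseudonymize(str(value), salt) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    row = {"id": 7, "full_name": "Ada Lovelace", "email": "ada@example.com", "plan": "pro"}
    print(scrub_record(row))
```

The pseudonyms are deterministic on purpose: the same input always maps to the same output, so records that reference the same user still line up across tables after scrubbing.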
Lead, Don’t Manage
Now let’s talk about one of the trickiest parts of your strategy: how to lead it. You need people in your organization to lead this in two ways:
- Inspire people and serve them. Don’t command them or manage their work. Managing your team will just inhibit team members’ ability to execute.
- Align the interest of the members of the team to the test data management goals you put forth. This is what the book Accelerate defines as transformational leadership.
Like inspiring officers marching into battle, such leaders will free your teams to perform at their best and to adapt to the inevitable challenges that will come as you implement your strategy.
Unfortunately, this sort of leadership is rare in many organizations. Take an honest look at yours to see how mature your leadership is. Even the greatest strategies may not hold up if your leaders are missing these qualities.
Autonomy and Learning
Effective transformational leadership will allow teams to autonomously execute on your strategy. This helps you for many reasons, not the least of which is that autonomy and self-service systems decrease the lead time for data readiness. Another great benefit is that you don’t need to worry about how the teams are going to achieve your goals. You don’t need to even give them solutions directly. They’ll figure it out.
While you don’t need to give solutions directly, you do want to invest in effective test data management coaching for struggling teams. Too much autonomy without competence can cause a team to go into chaos. Consider using the information on the DataOps Zone blog to create a community of practice around establishing good patterns and avoiding anti-patterns, creating a place where experts in test data management can teach others. Ensure that leaders provide a time budget for their teams to do this.
Put tooling and processes in place that let teams self-serve test data. This encourages autonomy.
A Cheat Sheet
To summarize, here’s what developing an effective strategy looks like:
- Map your landscape. Investigate and invest in the right tools and processes to see your test data and what’s happening with it. If you’re investing in such tooling for the first time, start with giving it to just one team.
- Find your problems. Look for signals in your landscape that tell you either that test data is taking too long to get ready or that you’re exposed to significant liability.
- Create goals. Use the template of OKRs to create high-level objectives. Understand how you’ll measure those objectives. Find ones that your teams can break into smaller objectives.
- Create a cadence. Ensure your strategy is made of short-term iterations that your leadership and individual teams can review. For example, you can create a monthly leadership review, and each team may have a weekly review. This cadence can also help your teams continuously manage their test data for the foreseeable future, adapting as necessary to changing business needs.
- Lead, coach, and grant autonomy. Avoid managing your teams. Put transformational leaders in place to inspire. Trust your teams to execute on your objectives, and give them tools to self-serve data. Invest time and coaching resources for your teams to gain the skills they need to do their best. The vast majority of people will use that time if properly inspired.
Attack Bad Test Data
Now that you understand the components of a test data strategy, you can make this happen in your organization. Remember, make sure you understand your landscape and know how to spot problems. By taking this data-driven approach, you’ll get real value from your strategy. Put the proper tools in place, and then take your plan to the field of battle. Incrementally get your teams ready to bring this plan to reality, letting the teams decide how to carry out the plan. Be willing to revise as necessary. By following these guidelines, you may be shocked by how quickly your teams’ test data problems get routed.
This post was written by Mark Henke. Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.