The Combined Engineering Software Model

Happy New Year, everyone! I find myself looking forward to 2016 despite, perhaps, some of the worrisome predictions I may have made in our most recent podcast. 2016 should be the year when I finally complete my Master's degree, celebrate 20 years of marriage with my adorable wife, take a really exciting vacation (TBD) with my family (my eldest turns 18 this year, so we want to do a "big bang"), and make some serious progress advancing the principles of customer-focused quality and engineering in my Data Science job. Big year! I hope your year turns out even better than mine will be. :)

Today's topic is something Alan and I have received lots of tweets, emails, and questions about. It IS something we've talked about a lot on the podcast, but not in a very cogent fashion. A quick search on the web yields nothing, so I thought I would add some detail.

Please note: on the podcast, Alan and I will often switch between discussing Combined Engineering and Unified Engineering models. These are the same thing. The model is called Combined Engineering, but honestly, we both think Unified Engineering is the better name. I think of an oil-and-vinegar salad dressing. You can shake it really hard and combine it, but it's not really cohesive and will separate again. You will have to shake it up again to reap the rewards. Too much work. Unified, in my mind, is more like Ranch dressing: several ingredients melded together to make a whole new awesome thing.

A quick note

This is not my model. It was first deployed by a pair of leaders in the Bing team (I am leaving out their names to "protect the innocent"), one of whom grew up in Test and the other in Dev. Both are brilliant guys and have ascended to high-ranking positions in my company. I was in Bing when the model was piloted and, a year and a half later when I left Bing, I was instrumental in helping my next organization move to the model. About 2 years after the model was first piloted, it hit the rest of the company like gangbusters. The majority of the company has now switched to this model, and dedicated positions in Test are remarkably rare. I've met only 1 person in the last year whose title is still Test Manager and, honestly, I felt sad for her. It takes time and effort to make the transition, and she and her team were already about 1 year behind the rest of the testers in the company. I remember thinking at the time: "How can I help this person? Surely being in Test as well as being in management is going to make this person a target for the next set of layoffs (God forbid that they happen again sometime soon)."

I am not fully aware of what inspired the guys who piloted the CE model, but I am quite positive it was grounded in Agile.

History

As of this month, I have been with the company for 22 years, and during that time I have experienced 3 very different organizational models. The first was the PUM (Product Unit Manager) model, where a business was run entirely by one person, the PUM, who would have middle managers for each of the disciplines reporting to them. The PUM would report to a VP, and the middle managers would have frontline managers as directs. The frontline managers would generally partner with the managers of the other disciplines, and their collective teams would work to add features together, but with clear Dev, Test, and PM boundaries.

The second was known as the Functional model. It took the idea of clear distinctions between the disciplines even further. PUM roles were eliminated and replaced by 3 new heads, one for each discipline, and this continued up the entire organizational stack. Directors and even VPs in Test were quite common. The idea of this model was that each discipline would be far more efficient because it had greater control over optimizing its craft. It would reduce waste and duplicated effort.

Lastly, the CE model. I think of this model as mostly the polar opposite of the Functional model. It rejects the notion that discipline optimization is the key to producing ROI, and suggests instead that getting teams to use their very different skills together towards business goals, in a tight-knit fashion, is more effective. It is important to note that on all teams I have encountered, the CE model has almost no impact on PM. However, Dev and Test are combined into the same frontline team under a single leader and are accountable for both the development and testing of the features they produce.

CE Goal

OK, there's a LOT to discuss with respect to the Combined Engineering model, but first and foremost, beyond a shadow of a doubt, should be its goal. Be forewarned: in no way should "implementing CE" be the goal. CE is one of many organizational models to consider when executing on a business strategy, and it should be weighed against the business outcomes and strategies. The goal, in a nutshell, is to create an environment where empowered individuals with complementary, but different, specializations work together towards common team goals.

If it makes sense for your business, do enter into CE, but don't do it flippantly. The model can incur a lot of change, and such change brings risk. Consequences I have seen include significant slowdowns in project effectiveness and speed, and massive morale decreases. Where change management strategy has been neglected, I've seen teams lose their best people, which further feeds the downward spiral. A strong implementation, however, results in the opposite: huge performance and morale boosts, and teams that feel like families and work together to achieve the best and most important goals for the business.

Combined Engineering, accordingly, is a very strong complement to your favorite Agile methodology. I've expounded before on the importance of considering software development as knowledge work. Dig into that statement further than just the surface level. As Peter Drucker tells us, knowledge work is distinct in that the task is usually not known upfront but, rather, must be determined. On an assembly line, the work is known upfront. In knowledge work, there are only outcomes and a selection of choices. Each person on a knowledge-working team holds some of the pieces of the business jigsaw puzzle. CE works to activate knowledge sharing amongst team members and get the collective knowledge of the team working together towards business goals. It creates collaboration and unity and, from this, high-ROI productivity.

When to implement

My inner Agile coach screams "always" in response to when to implement CE as part of the business strategy, but this is not true. If your product has static, relatively stable goals and/or long release cycles, then your product likely has no fast feedback loop between the engineers and customers, and no real reason to change or adapt. Examples include software released to governments, packaged products, or software embedded in your coffee maker or car. Honestly, I would push to make product changes that enable fast feedback loops before I would consider a CE model. Otherwise, the benefits are unlikely to be worth the cost, pain, and risk incurred by the changes.

However, if you are able to ship at "internet speed", have strong leadership that is trying to push decision making down to its soon-to-be-empowered staff, have the ability to react quickly to competitive pressures and/or customers' demands, AND have a discipline-segregated workforce, then, IMHO, this is a no-brainer.

How to Implement

On paper, the implementation is cake. I recommend using the simplest implementation. Yes, it won't be perfect, but you can change it 3 to 6 months later, once you've identified the next problem to solve. Making everything "perfect" up front takes time and creates anxiety. Better to just start by making the smallest amount of required change NOW. Here are the simple (naïve) steps:

  • All frontline Dev/Test managers are now Engineering managers and no longer representatives of just their discipline. Managers who were once partners (in the Functional model) are now peers. (Note: for simplicity, I will continue to use their former titles.)
  • The product they used to co-own as peers should be split in half as vertical slices, one half going to the former Test manager and the other to the former Dev manager.
  • Team accountabilities are the same as they were previously, except that instead of being split by discipline, each leader now owns all of them within their team. Both teams are responsible for feature development, unit tests, test harness improvements, etc.
  • The individual developers and testers need to reorg. How?
    • Split into 2 teams of equal sizes
    • The TOP 20% of Testers go automatically to the Dev Manager's team
    • The TOP 20% of Developers go automatically to the Test Manager's team
    • The Test and Dev Managers work together to appropriately place the rest of the individuals into each team (see the sketch after this list)
  • Create monthly, team-based goals for each vertical slice, and do so *without* acknowledging where the team members came from. I.e., the Test manager should not have easy-to-achieve development goals, nor should the Dev manager have softened quality goals.
  • The Test manager gets a new mandate:
    • He was given the best developers in order to bootstrap his success and help transform the rest of his team.   He should use these folks to aggressively train the others.
    • Goal: At the end of 6 months, every former developer should have created and executed their own test suite and every former tester should have checked in their own bug fixes/features into the product.
    • Training will be offered, of course, but the accountability for success falls to the Test manager
    • He has 6 months to achieve that goal. At the end of the period, each of the developers who were forced to move will be given the opportunity to change teams, if desired.
  • The Dev manager gets a similar mandate, especially regarding the best testers
  • Obviously, this relies on a strong leadership team that can communicate the expectations and outcomes in a fashion that alleviates concerns. We are forcing people to move teams. Most uncool, but the 6-month escape clause helps here and, honestly, you really do need these experts to help with the training.
  • Lastly, you will need a public "incentive" plan that clearly spells out how performance reviews will be judged, based on the ROI of the features that individuals produce. You will not be able to get a good review score by either the previous Test review standard or Dev's equivalent. You are NOT incorporating testers into the dev team (or vice versa), nor are you now going to judge testers by the long-practiced Dev standard. A new Engineering standard will need to be developed that makes sense for your business/company, weighted 50% towards Test's strengths and 50% towards Dev's. (Remember, you will change this later when you learn more.) The goal here is to break down the old us-vs.-them discipline thinking. Your ideal standard is one that makes both Dev and Test equally comfortable that they can succeed.
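
For the mechanically minded, here is a minimal sketch of the naive split described above. The 20% figure and the swap rule come from the steps; the data model, ranking scheme, and the balancing of the remainder are my own assumptions about how a pair of managers might work through it.

```python
from dataclasses import dataclass

@dataclass
class Engineer:
    name: str
    discipline: str  # "Dev" or "Test"
    rank: int        # 1 = strongest performer, per your existing calibration

def split_teams(engineers: list[Engineer]) -> tuple[list[Engineer], list[Engineer]]:
    devs = sorted([e for e in engineers if e.discipline == "Dev"], key=lambda e: e.rank)
    tests = sorted([e for e in engineers if e.discipline == "Test"], key=lambda e: e.rank)

    top_devs = devs[: max(1, len(devs) // 5)]     # top 20% of devs cross over
    top_tests = tests[: max(1, len(tests) // 5)]  # top 20% of testers cross over

    dev_mgr_team = list(top_tests)   # best testers seed the Dev manager's team
    test_mgr_team = list(top_devs)   # best devs seed the Test manager's team

    # The two managers place everyone else, keeping the teams equal-sized.
    for e in engineers:
        if e in top_devs or e in top_tests:
            continue
        (dev_mgr_team if len(dev_mgr_team) <= len(test_mgr_team) else test_mgr_team).append(e)
    return dev_mgr_team, test_mgr_team
```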

The agile-wary reader will notice that, in essence, you are trying to take a team of specialists (Dev) and a team of generalists (Test) and merge them into more fluid teams of generalizing specialists and specializing generalists. What I have found is that the developers have a much harder time with this than the testers. The testers will mostly be concerned about all the bugs that will now ship to customers without their protection. They will also be concerned about how they will be judged against the Dev standard. Leadership simply needs to stick to their guns on the 6-month objective. Consistently sticking to these goals is key to achieving them. You should not call it a failure or a success until you've seen movement in the new "muscle memory" of the team.

If it ain’t broke…

Many will tell you the old model wasn't broken, so why change it? Rarely have I found their assertion to be true. More often than not, the reason these folks proclaim it is not broken is that they are not measuring the right goals. The name of the game in today's software world is speed, and the single most precious resource you have is calendar time. We can no longer afford to find huge design flaws a month before we ship, and we certainly can't afford the month-long test pass at the end. These are common consequences of the Functional model and of optimizing for discipline strengths. They are harmful to today's business strategies. They are too slow.


Optional ingredients

To close, I will list some challenges I have encountered, alongside recommendations.

Lack of training – training is key for success. We've already given teams their own in-house experts, but this may not be enough. Listen carefully to the struggles of the team and bring in outside help if necessary. Look at external coding bootcamps for your former testers, as well as training on design patterns, algorithms, etc. Treat your high-ranking individuals as the leaders they should be and get them presenting brownbags and setting the example of the transformative outcomes you are expecting.

Specialist vs. Generalist battle – many devs will proclaim loudly and proudly that they are specialists and don't want to be generalists. Much of this comes from how they got good review scores in the past (by focusing on owning deep, complex code). Respect their strengths, but be firm on expecting them to broaden. At the end of the day, you want neither pure specialists nor pure generalists; you want people who can go deep when it matters, but who can also fluidly move to higher-ROI tasks. This creates stronger adaptability – a key business imperative.

Shared assets and true specialties – some topics of concern, such as security, performance, and government compliance, are deep by their nature. It is unreasonable to expect one team to master these AND do development AND do testing. Consider funding a separate team to own them for everyone. That team's goal is to create the assets that make it easy for the regular engineering teams to just "do the right thing" automatically. Test harnesses, bug databases, and other internal "products" should be owned by a single team with a strong sense of customer focus. Their job is to accelerate the success of the product engineering teams and prevent duplication of effort. Customer focus is key here. Too often I see these teams build what they want to build, and senior leadership force others to use it. Far too often, these systems fail to meet the needs of the engineering teams as a result. They need to be built to achieve goals, not to look good for the veep.

Area Owners – I strongly recommend banning individuals from owning areas. I encourage centers of gravity and pursuing areas of passion, but make it clear that these areas are owned by the team and that anyone can and will work on them. This helps remove the reliance on specialists to solve problems and removes a key bottleneck from your system flow. Lastly, it really amps up knowledge sharing on the team. If multiple people are familiar with an area of the product that's broken, the odds of a fast and righteous fix being implemented are much, much higher than otherwise.

Command and Control – leadership is the single easiest way to break or stall this transformation. Get rid of command-and-control methods. As mentioned above, they don't align with knowledge work anyway. Tell your people the outcomes you want and the criteria that define completeness. They are professionals; they will be able to figure out the next action to take, as well as the subsequent ones.


I hope this is helpful.   Please feel free to post any comments and questions below.


Happy New Year!


In Pursuit of Quality: Shifting the Tester mindset

Last time, I wrote a book review on Lean Analytics. Towards the end of that post, I lamented that I see a lot of testers in my neck of the woods trying to map their old way of thinking onto what's coming next. Several folks (individual contributors and managers of same) have come to me wondering why Test should move into this world of "data crap" and why the way they have previously been operating is so wrong now. It is my hope today to explain this.

But before continuing, I'd like to try something new and offer you a poll to take.

Please consider the following:

So which did you pick? Over time, it will be interesting to track how people view this simple comparison. I have been running this example for almost a year now. When I first started, about 1 in 2 testers polled would select the bug-free code, whereas among testers I talk to lately, about 1 in 3 will select it. I definitely view this as a good sign: folks are starting to reflect on these changes and adapt. My ideal is that 1 year from now the ratio is closer to 1 in 10.

Why is this poll so hard for folks?

Primarily, it is due to our training. Test was the last line of defense – a safety net – needed to assure we didn't do a recall after we released the product to manufacturing. When I first started in the software development world, 1.44 MB floppy disks were the prevailing way customers installed new software onto their systems. Windows NT 3.1, as an example, required 22 of them. It was horrible. Installing a new machine would take the better part of a day, disks would be asked for out of order and, lastly, people would often get to the end of the install only to discover that a setting they were asked for at the very beginning was wrong, and that it was easier to just redo the install than to hunt through the manual to figure out how to fix it afterwards.

Customers who got their system up and running successfully and found a major bug afterwards would be quite sore with us. Thankfully, I have not heard of this in quite some time, but back then, Microsoft had the reputation of shipping quality in version 3.0. There was a strong and successful push within the company to train our testers with a singular mission: find the bugs before our customers do and push to get them fixed. I was proud to state back then that Microsoft was the best in the world at doing this.

The problem I am attempting to address is the perceived loss of value in Test's innate ability to prevent bugs from hitting the customer. A couple of months ago I presented to a group of testers, and one of the questions asked was, "All of this reacting to customer stuff is great, but how can we prevent bugs in the first place?" Thankfully, someone else answered that question more helpfully, as my initial response would've been "Stop trying to do so".

The core of the issue, IMO, is that we have continued to view our efforts as statically valuable: that our efforts to find bugs up front (assuring code correctness) will always be highly regarded. Unfortunately, we neglected to notice that the world was changing. That, in fact, the value was dynamic: our need to get correctness right before shipping was actually tied to another variable, our ability to react to bugs found by customers after shipping. The longer it takes us to react, the more we need to prevent correctness issues up front.

“Quality redefinition” – from correctness to customer value

A couple of years ago, I wrote a blog post, Quality is a 4 letter word. Unfortunately, it seems that I wrote it well before its time. I have received feedback recently from folks stating that that series of posts is quite helpful to them now. One such person had read it back then and had a violent allergic reaction to the post:

“Brent, you can’t redefine quality”.

“I’m not!”, I replied, “We’ve *always* had it wrong! But up until now, it’s been okay. Now we need to journey in a different direction.”

While I now refer to the 4 pillars of Quality differently, their essence remains the same. I encourage you to read that post.

The wholeness of Quality should now be evaluated on 4 fronts:

  • Features that customers use to create value
  • The correctness of those features
  • The extent to which those features feel finished/polished
  • The context in which those features should be used for maximum value.

Certainly, correctness is an important aspect of quality, but usage is a significantly greater one. If you take anything away from today’s post, please take this:

Fixing correctness issues on a piece of code that no one is using is a waste of time & resources.

We need to change

In today’s world, with services lighting up left and right, we need to shift to a model that allows us to identify and improve Quality faster. This is a market differentiator.

It is my belief that in the short term, the best way to do this is to focus on the following strategy:

    • Pre-production
      • Train your testers to rewrite their automation such that pass/fail is not determined by the automation itself but, rather, by leveraging the instrumentation and data exhaust emitted by the system. Automation becomes a user simulator, while testers grow muscle in using product logs to evaluate the truth. This set of measurements can be applied directly to production traffic when the code ships live (see the sketch after this list).
      • Train your testers to be comfortable with tweaking and adding instrumentation to enable measurement of the above.
    • Next, move to Post-production
      • Leverage their Correctness skillset and their new measurement muscle to understand the system's behavior under actual usage load.
      • This is an evaluation of QoS, Quality of Service. What you want Testers learning is what the system does under production traffic, and why.
      • You can start here in order to grow their muscle in statistical analysis.
    • Then, focus their attention on Customer Behavior
      • Teach them to look for patterns in the data that show:
        • Places in the code where customers are trying to achieve some goal but encountering pain (errors, crashes, etc.) or friction (latency issues, convoluted paths to the goal, etc.). Generally, this is very easy to find.
        • Places in the code where customers are succeeding in achieving their goal and walking away delighted. These are patterns that create entertainment or freedom for the customer. Unlike the above, this is much harder to find; it will require hypothesis testing, flighting, and experimentation, but it is significantly more valuable to the business at hand.
      • Being stronger in stats muscle will be key here. Since Quality is a subjective point of view, this will force Test away from a world of absolutes (pass/fail) and into one of probabilities (likelihood of adding value to customers vs. not). It is definitely wise to befriend your local Data Scientist and get them to share the magic. This will help you and your team scale sustainably.
      • This is an evaluation of QoE, Quality of Experience. What you want Testers learning is what the Customers do, and why they do it.
    • You will then want to form a dynamic set of metrics and KPIs that capture the up-to-date learnings and help the organization quickly operationalize its goal of taking action towards adding customer value. This will generate Quality!
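
To make the first pre-production bullet concrete, here is a toy, self-contained sketch of "automation as user simulator, logs as the oracle". In a real product, `simulate_user_session` would drive the actual UI or API and `query_events` would read your telemetry pipeline; here both are faked in memory, and the event names are my invention, so only the shape of the test is being shown.

```python
import uuid

TELEMETRY: list[dict] = []  # stand-in for the product's data exhaust

def simulate_user_session(scenario: str) -> str:
    """Drive the product like a user would; the product logs as a side effect."""
    session_id = str(uuid.uuid4())
    # Imagine real client calls here; the product's own instrumentation emits:
    TELEMETRY.append({"session": session_id, "name": "CartItemAdded", "severity": "info"})
    TELEMETRY.append({"session": session_id, "name": "CheckoutCompleted", "severity": "info"})
    return session_id

def query_events(session_id: str) -> list[dict]:
    return [e for e in TELEMETRY if e["session"] == session_id]

def test_checkout_flow_via_telemetry():
    session_id = simulate_user_session("add_to_cart_and_checkout")
    events = query_events(session_id)

    # Pass/fail is a judgment over the log record, not over the automation's
    # own observations -- the same judgment can later be run over real
    # production sessions, not just lab traffic.
    assert not [e for e in events if e["severity"] == "error"], "session emitted errors"
    assert any(e["name"] == "CheckoutCompleted" for e in events), "checkout never completed"

test_checkout_flow_via_telemetry()
print("telemetry-evaluated check passed")
```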

Lastly, while executing on these mindshifts, it will be paramount to remain balanced. The message of this blog is NOT that we should stop preventing bugs (despite my visceral response above). Bugs, in my world view, fall into 2 camps: Catastrophes and Other. For Quality to be high, it is critical that we continue to work to prevent Catastrophe-class bugs from hitting our customers. At the same time, we need to build infrastructure that enables us to react very quickly.

I simply ask you to consider that:

    As the speed with which we can react to our customers INCREASES, the number of equivalence classes of bugs that fall into the Catastrophe class DECREASES. Sacrificing speed of delivery in the name of quality makes delivering actual Quality so much harder. Usage now defines Quality better than correctness does.

Ken Johnston, a friend, former manager, and co-author of the book "How We Test Software at Microsoft", recently published a blog post on something he calls "MVQ". Ken is still quite active on the Test conference scene (he has one next month), but if you ever get the chance, ask him: "If you were to start writing the Second Edition of your book, how much of the content would still be important?" His response is quite interesting, but I'll not steal that thunder here. :)

Here’s a graphic from his post for your consideration. I think it presents a very nice balance:

Thank you for reading.

Automation doesn’t kill productivity. People do.

Shortly after I wrote Forgive me, Father, for I have sinned, I received the following email from a colleague of mine:

Professore!

I read your most recent blog. Your blog is actually dangerously close to sinning as well. In principle I agree with your sentiment, but be aware of violent pendulum swings. There is still a lot of value in the type of automation systems we have built, but it has to be tempered with a self-enforcing quality of quality, and quality of developer code measures. Good test teams actually do enable bad developer behavior. We become like a cheap compiler. Test will catch any issues, and quickly too. Developers are perfectly capable of writing solid (not bug free) code. They are just not always incentivized to do so. With a good test team, they don’t have to. At [my company], they don’t get rewarded to do so. The test team carries the burden, and the blame, for quality. There are many factors that play into the subject you have chosen. You are only tackling one facet.

Also, you are not really presenting a fix in your “how to fix” section, but rather pointing out a possible end result of the automation effort.

H

I really appreciate this sort of feedback, as it really helps me understand where I communicated well and where I did so poorly. That blog post can be read as written by someone who was newly "enlightened", with automation not invited to the renaissance. That was not my intent, nor the case. (Aside: I am very nearly at that point when it comes to UI automation… I get a visceral, nauseous feeling lately when I hear of folks investing in it…) When used properly, automation is one of the most important tools in a software engineer's arsenal. That is the crux of it, though. It must be used properly. The point of my story is that I had not done so, and it led to some bad outcomes: thoughtlessness and poor code quality. I had done a really great job at something the business wanted me to do but, in retrospect, it was not the right way to solve the problem. In fact, perhaps it was solving the wrong problem…

Damned if you do

My eyes really began to open about 10 years ago. I had changed teams and become a middle manager on a product I used every day and loved. I quickly learned the team had 2 big problems. First, they could not get their Build Verification Tests (BVTs) to pass 100%. I later learned that this had been the case for 6 years running. This by itself was interesting to me. In my experience, no team kept moving forward when BVTs failed; they stopped and fixed the problem. When I asked about it, they mentioned they had tried several things, but none of them worked. Second, the test team did not have the payroll they needed to keep up with Dev. This was during the first wave of Agile development at Microsoft, and this team had decided to experiment with it. Dev believed documentation was overhead and velocity was all that mattered. As a consequence, Dev would move *really* fast and ask for "smoke tests" – testing done by the test team before checkin. When the product still failed BVTs the next day, they would rally around the need for even deeper smoke testing. I saw a vicious loop and asked to own the solution. My manager readily agreed… The problem had gotten so bad, he was seriously considering banning all automation. He dreamed of the untrained Test Engineer world that had dominated Microsoft only a few years earlier. He felt automation killed productivity.

To solve the problem, I first measured it. I learned my teams were spending 50% of their time doing smoke testing and another 20% fixing automation. I was also able to show that these efforts were not *in any way* helping the BVTs to pass. The more things failed, the more time they would spend trying to fix them, without success. It was depressing. Once I got to the bottom of the problem, it was fairly easy to fix. The hardest part was getting people to let go of sacred principles that they held to be true without proof. This team refused to recognize that their automation program, as implemented, was never going to work. In a nutshell, they were stuck in a vicious loop. They had super complex automation running in their simplest suite (no unit testing existed in those days), and they were using it to validate the build. Since they had not pre-validated the individual components, the tests *always* failed when integration occurred. This high-level automation was hard to debug. As a result, the Test team kept slowly losing more and more resources to maintenance. Bigger than that, the team was so overloaded, they did not notice that they were not fixing the problem but making it worse.

Once I realized how much it was costing the project, we did three things: 1) banned E2E automation in that suite, 2) limited smoke requests to 8 hours per week per feature team, and 3) built a tool for devs to run the new BVT suite themselves on their desktops. Once this was done, the automation began to work consistently and correctly. The dysfunctional bottleneck was removed from the system.
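
The third fix is the one worth sketching: putting the BVT signal on the developer's desktop, before checkin. A minimal sketch of such a tool, assuming a pytest-style suite (the suite paths are hypothetical):

```python
#!/usr/bin/env python3
# Run the BVT suite locally so devs get the pass/fail signal before checkin,
# rather than discovering the break in the build lab the next day.
import subprocess
import sys

BVT_SUITE = ["tests/bvt/test_install.py", "tests/bvt/test_startup.py"]  # hypothetical paths

def main() -> int:
    failures = []
    for test in BVT_SUITE:
        print(f"running {test} ...")
        result = subprocess.run([sys.executable, "-m", "pytest", test])
        if result.returncode != 0:
            failures.append(test)
    if failures:
        print("BVT FAILED -- do not check in:", ", ".join(failures))
        return 1
    print("BVT passed -- OK to check in.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```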

I would come to believe that I had learned the true point of automation:

To reduce the overall cost of development.

I concluded: automation that didn't do this should be stopped. I would later learn this was wrong.

Damned if you don’t

Years later, I would join another team that had the opposite problem. Their system at that time was "not automatable" (or so I heard over and over). What this really meant was that automation was really hard and expensive, and no one had created the hooks to make it possible. Because of this, they had a small army of vendor testers who would do manual testing every day. The team (including me) thought this was super expensive, so we looked into starting an automation program (after all, that would make it cheaper, right?).

Our constraints:

1) They did (yet another) variant of "agile", where they planned out their 2-week sprints based on dev capacity only. As a result, time for automation was often scarce.

2) There were far too few unit tests. As a result, dev "needed" test to work night and day at the end of each sprint to validate the new code in time.

3) As I mentioned above, test hooks were missing and/or unstable.

4) The vendor team was only able to keep running the same tests… They did not have the ability to absorb more tests into their runs. As a result, monthly test passes had to be funded by the sprinting testers. This caused a starvation problem in the sprint teams for 50% of each month.

Lack of automation was killing productivity.

My manager and I worked on this over and over and finally came up with a solution. I would take a few of my team and create a new team responsible for curating the automation.

Their goal would be to understand and optimize the execution of test cases for the division.

NOTE: this next part is not really needed for this story, but I am including it mostly because I think it was a nifty process invention. You can skip ahead to "THE POINT" should you like.

Here’s how we started:

1) The Optimization team started by getting all teams to document and hand off their tests, automated or not. Teams were motivated: a team that handed off its tests would no longer be responsible for running them during the monthly test pass.

2) The Optimization team would own these passes instead.

3) The Sprint teams were required to write whatever automation they needed in order to get to done and exit the sprint. This largely meant sparse unit tests at best, but it enabled the sprint teams to have higher confidence that the code worked as expected each sprint. This by itself was a massive improvement.

4) The Sprint teams were also required to write the test hooks needed for that automation.

5) After the initial handoff, sprint teams were required to hand off again at the end of each sprint.

Once tests were handed off, the Optimization team owned the following work:

1) Establish SLA: adjust the priorities on the test cases into 4 different SLA buckets: Daily, Sprintly, Monthly, Quarterly. (Aside: this team shipped every 4-6 months.)

2) Drive getting these tests executed using the Vendor team

3) Prune: length of time ignored was used to determine a test's importance. Any test case that had been failing consistently for "too long" (initially set to 3 months) would be moved to an 'archive' folder (essentially deleting it), and mail would be sent to the team that owned the relevant area.

4) Categorize and Automate: go through each test case and categorize it by the type of automation problem it represented. UI? Stress? Backend storage? API? Etc. There were eventually around 15-20 categories. The team would then automate whole categories based on their ROI. This was considerably more efficient than automating all of the P1s across all of the categories.

5) Maintenance: frontline investigation of any test automation failure reported by the vendor team; either fix the problem or move it to the sprint team's backlog.

It took a good while to get the priorities right based on business need and the team’s desire/ability to react to a failure, but once we did, we had an efficient model for funding the execution of the manual suite.
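
For illustration, here is a small sketch of the pruning and SLA bookkeeping described above. The bucket names and the 3-month cutoff come from the post; the data model and the notification mechanism (a print standing in for mail) are assumptions of mine.

```python
from dataclasses import dataclass
from datetime import date, timedelta

SLA_BUCKETS = ("daily", "sprintly", "monthly", "quarterly")
PRUNE_AFTER = timedelta(days=90)  # "too long" was initially set to 3 months

@dataclass
class TestCase:
    name: str
    owner_team: str
    sla: str                    # one of SLA_BUCKETS
    failing_since: date | None  # None while the test is passing

def prune(tests: list[TestCase], today: date) -> list[TestCase]:
    """Archive long-ignored failures and notify the owning team."""
    keep = []
    for t in tests:
        if t.failing_since and (today - t.failing_since) > PRUNE_AFTER:
            # Archiving is effectively deletion unless the owner objects.
            print(f"mail {t.owner_team}: archiving {t.name} (failing since {t.failing_since})")
        else:
            keep.append(t)
    return keep
```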

Every day the vendor team would get a backlog of tests to run (see Fig. 1):

  • 2/3 of the Vendor team's time would be spent running the daily tests… all of them.
  • 2/3 of the remaining time would be spent on the sprint tests. (A small chunk would be executed each day, so that all would be executed at least once each sprint.)
  • 2/3 of the then-remaining time would be spent on the monthly tests.
  • The rest would be spent on the remaining (quarterly) tests.

Fig 1: Capacity allocation plan for test execution
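
The figure itself hasn't survived the reposting, but the arithmetic it captured is simple: each SLA bucket takes two-thirds of whatever vendor time remains. A quick sketch, assuming an 8-hour vendor day (the 8 hours is my assumption):

```python
def allocate(hours: float) -> dict[str, float]:
    """Geometric '2/3 of what's left' split across the SLA buckets."""
    plan = {}
    remaining = hours
    for bucket in ("daily", "sprintly", "monthly"):
        plan[bucket] = remaining * 2 / 3
        remaining -= plan[bucket]
    plan["quarterly"] = remaining  # whatever is left
    return plan

print(allocate(8.0))
# -> {'daily': 5.33, 'sprintly': 1.78, 'monthly': 0.59, 'quarterly': 0.30} (hours)
```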

This allocation meant we could predict and control our payroll costs for manual test execution. If the number of tests in a category exceeded its funding level, some other test got demoted. A test being demoted out of the quarterly runs meant a conversation: either 1) the test no longer represented a risk we cared about, or 2) more resources were needed on that team.

THE POINT

Once we had done all of this work and socialized it, we were able to reduce the vendor team by almost one half. In addition, the rest of the test team loved us. We had enabled them to focus on their sprint work and taken the tiresome test pass off their shoulders. "WooHoo!" I thought. "Look how we reduced the cost, mitigated the risk, and boosted team morale…" That saved a TON of payroll money. Greedily, I went to the manager I had put in charge of the Optimization team and asked how we could reduce the cost further (we were still 80% or so manual, so I assumed we could use automation to make this super cheap!).

He then pointed out that, in general, for every 1000 test cases we automated or pruned from here on, we would be able to get rid of 1 of these vendors.

“That’s fantastic”, I said, “That doesn’t seem like very many tests to have to automate. Do you know the breakeven point? What’s the max we can pay for the automation in order for it to pay off?”

“$50 per test case per year”, he replied.

“What?!? $50 per test case?!? That’s impossible! That’s essentially 1 hour per test per year. I’m not certain we can even develop the automation at that pace.”

The really great thing was that we had built a system that made it easy to see the numbers and make the call. Though I am drastically simplifying things for this post, he could readily show me the math… It was all true. Over time, the automation system would improve and its price tag would lessen, but not to the degree necessary. At the time, this news was shocking. It turned out manual testers were very effective and a lot cheaper than the automated equivalent for our product.
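
The math is easy to reconstruct from the numbers quoted, with one assumption of mine: a fully loaded vendor cost of roughly $50,000 per year, which is what makes "1,000 test cases" and "$50 per test case per year" line up.

```python
VENDOR_COST_PER_YEAR = 50_000   # assumed fully loaded cost; not stated in the post
TESTS_PER_VENDOR = 1_000        # tests automated or pruned to free up one vendor

breakeven_per_test = VENDOR_COST_PER_YEAR / TESTS_PER_VENDOR
print(breakeven_per_test)  # 50.0 dollars per test case per year

# At a ~$50/hour loaded engineering rate, that budget buys about one hour of
# automation development *and* maintenance per test per year -- hence the shock.
```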

Automation on this team, clearly, was not reducing the cost of development.

Cost savings was not the reason to automate. Automation was a tax.

The moral of the story is that automation's purpose is not saving money. It's saving time. It's accelerating the product to shippable quality.

My colleague, H, is right, of course. There is "a lot of value in the type of automation systems we have built". We have built great tools, but any tool can be abused. I believe the fix lies in transparency and measurement: understanding that the goal is accelerating the product to shippable quality, not accelerating the intellectual laziness of its Dev and Test teams. A dev team that leverages the automation system test built as a safety net might be making choices that contribute to slower releases and greater expense. Please send these folks to ATDD/TDD classes to start them in a better direction.

Ultimately, it comes down to choices. What do we choose to measure and what do we choose to believe? Automation is a tool; how we use it is a decision.

All your Leftover are belong to Test

Today in the United States, we are observing Labor Day, which celebrates a strong history of economic and social benefits wrought by the American worker. So for you testers out there, can you find the bug in this code? (Being from the US is not required.)
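
The snippet was embedded as an image in the original post and hasn't survived; a minimal reconstruction of its shape (the work items are my invention, the punchline is the author's) might look like:

```python
def assign_owner(work_item: str) -> str:
    # Requires Python 3.10+ for match/case.
    match work_item:
        case "write spec":
            return "PM"
        case "write feature code":
            return "Dev"
        case _:
            return "Test"  # <-- whatever is left over falls to Test
```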

Notice Test as default? Some may not even view this as a bug. That is the case if you believe a colleague of mine, who stated a few weeks ago, "The definition of Test's job is easy… We are the remainder. We pick up the work that's left over after the other disciplines do their part." Is that right? Is Test the cleanup crew of software engineering? I can see where that viewpoint is coming from. In TEST is a four letter word, I point out that the word TEST is meaningless because of all of the different types of work that can land on our shoulders. However, even given this, I don't view Test as essentially a technical janitor. As I wrote in The Tester's Job, we have a valuable and important job to do. While how we do it may take on a diverse set of actions, I don't think it is those actions that define us, but rather the goal we are striving to achieve… the acceleration of the product to shippable quality. I do think that if you are on a Test team and the work you are doing does not help the product reach shippable quality, then you are doing the job of some other discipline (which may or may not be OK… it's just not Test work).

About a year ago, I predicted my work environment was heading for a tipping point: the roles and responsibilities for Test would change in a dramatic fashion. I predicted it wouldn't land for another 1 to 2 years. I was wrong. I believe the tipping point has already happened. There are a number of indicators that point in that direction, the biggest of which is a large shift towards Service orientation. I bring this up because I think Test is fundamentally changing in a predictable fashion, and those who understand the job of testing, as well as the direction Test is heading, can shape their own careers according to their preferences. If you view the value-add of Test as the actions Testers do, then you may be heading for a surprise when the reformation finally lands on your doorstep.

So what are the options? This is the number one question I get from folks when I bring up this topic. I see several options available to Testers, but I think they will likely be one of the following:

1) Development – I think the bulk of test will shift into development roles. Testers who are strong coders and can’t shake the view that finding bugs is all that matters will go into this role, which optimizes for same.

2) PM – Those strong in customer outreach will head this way.

3) Dev Ops – Integration experts of the machine topology and software…

4) Toolsmiths – Title will likely be Developer, but will be developing tools that accelerate the rest of the business.

5) NFR Specialists – Non-Functional Requirements (E2E, Stress, Performance, Load, etc.) are a needed and deep technical specialization. Once tooling makes this simple and friction-free, these jobs will likely diminish, but today, it is often more work for Dev to own this than they can afford.

6) Data Scientists/Analysts – Beancounters of the world unite.

7) New Career/ Company

I think when the world stabilizes again after this shake up, we will find the role of “the remainder” gone. If that was what you liked about test, then you may be heading for a surprise.

Why not instead take a moment and think through how you really like to accelerate quality and then make a conscious decision to pursue it?

I, myself, am aggressively heading down the #6 path. And loving it… Unlike prior times in my career, when my job was to point out flaws in other people's code, I can truly state that I am accelerating the product towards quality.

Forgive me, father, for I have sinned.

I have sinned. In the pursuit of silly and stupid goals and beliefs, I have committed crimes against quality, delivery dates, and the businesses I have worked in.

Goals like:

  • We need to move quality upstream
  • We need to automate everything
  • We need to automate in order to scale
  • Friction-free automation is the key to unlocking limitless potential

What did I do?

I (along with my accomplices) created a hugely powerful, popular, and flexible automation system. It's even patented (though not available publicly).

I created this system with 3 primary goals in mind:

  • Freedom – I wanted an automation system that let feature teams decide how to automate without becoming a burden to the execution staff. Oftentimes, I encountered folks trying to "standardize" automation efforts in order to "avoid duplication". But these standardization efforts often entailed teams having to rewrite their functioning automation suites just to toe the line, often at greater long-term cost than maintaining the already stabilized, but no longer in favor, suite. In addition, these efforts often took a least-common-denominator approach, where the system solved 80% of the problem for everyone but could not solve the P1 issue for a feature team, making it a non-starter or an excessive burden.
  • Friction-Free – I wanted a system that a "brainless monkey could use while sleeping"… I argued that by making it so easy to utilize, we could hire much cheaper staff to get the work done, saving money.
  • Cheaper – by doing the above, the system would enable teams to create more automation, thereby making test passes cheaper.

If only I could go back 15 years and slap myself in the back of the head…. Hard…

Um, Brent, those goals seem awesome. What gives?

First off, I succeeded in all 3 goals. But here's what happened over time:

  • By enabling team freedom, I enabled the execution team to scale and handle more and more automation frameworks, test cases, etc.
  • By doing this, teams got more invested in their automation infrastructure and created more and more collateral, faster and faster.
  • As more and more collateral got added, the system remained "friction-free", so there was no reason to think deeper about improvements. The reduction in payroll cost justified the increase in machine execution cost (though neither was measured).
  • So more and more test passes got scheduled; this, in turn, increased automation maintenance cost, total payroll, total machine cost, etc…

In short, I helped to create Test Zombies, and not only that: I created Dev Zombies too. By trying to move quality upstream, I did the exact opposite.


I had enabled brainlessness to thrive.

How to fix?

As I learned from my Agile training, Systems Thinking helps show the way. Lean Agile has an important principle that says we need to "Optimize the Whole". This means that one should consider the final goal and determine the right things to do in the immediate context to achieve it. In my case, by making test passes cheaper and encouraging simplicity, I essentially helped the team optimize for *more* test passes using cheaper resources. Ultimately, this cost the business more money: we needed more and more machine resources; while we could hire cheaper people, the test pass load was such that we needed more and more of them to maintain the machines, schedules, etc.; and lastly, we had enabled the dev team to produce arguably sub-standard code and still succeed. And what's worse: almost all of the participants, when they think of their own sub-goals, will today *praise* me for what I have done…

By changing the "Cheaper" principle to be about decreasing the price tag of regression testing for each milestone and/or release, and increasing the efficiency of the program, the implementation strategy, and therefore the automation strategy, changes in very important ways.

But that, my friends, is a story for another day.

MyTestCareer++;

It has been almost 4 months since I last blogged. Several big things (big to me, at least) have occurred in that period. The least relevant to this post, but the hardest on my schedule, has been a house move. We are still living 33% of the time out of boxes, but things for my family are slowly getting back to normal at our new location. In addition, the old house has most of its laundry list completed and will appear on the market very soon. <Fingers crossed> It has definitely been a rough adventure. If I wait another 15 years to move again, that will be fine by me.

That isn't the only change for me. Today's post is, as is often the case, the result of deep thinking about themes that are problematic for me. As an example, I wrote my last post while I was thinking through what I wanted to work on next in my career and where I wanted to do it. As a result, I have made a non-trivial decision to move back into a test role (Gory details? Check out my LinkedIn profile). Several really great things happened after my last post. I was offered several great Development Lead roles and even a Development Manager role. These were all very interesting and, if I hadn't already committed to the new team, I might have taken that DM job. I definitely still think of it often. Why? Even though I am back in Test now, I suspect the Test discipline is not my final destination.

I have gone back to Test because I have some new personal and "organizational" goals to aggressively pursue, and I believe that role is the best place to do that while contributing to the delivery of ground-breaking, customer-delighting products.

To explain further, I want to improve lives and solve problems. I want to contribute to building businesses that satisfy a customer need and excel at doing it. In particular, I want to advance the science of producing Quality in the context of that business. This is a strong passion of mine.

In my last year in Development, I had zero testers. My team had to own their testing themselves. As a result, I learned 2 very important things: 1) far too much of our focus on quality is around code correctness, and 2) Testers are *not* needed to deliver improvements there. In fact, Testers who grew up the way I did might actually be slowing down the product and contributing to a much more expensive bottom line. I think Whittaker is right when he claims "All that testing is getting in the way of quality".

My developers, without a test team, had to focus on testing things themselves. Since we were an Agile team, we shipped each ticket when it was completed. Every day we were releasing something and, every day, if someone on my team messed up, the whole team would swarm to fix or revert. We discovered over time what tests helped, what hurt, what policies protected the customer, and what was just burdensome cost. Over time, we learned to rely *more* on design practices that let us make changes quickly and effortlessly than on costly overtesting. As we focused on our ability to react *quickly* to failure, our ability to prevent failure grew and grew. Our designs became much more cohesive and much less coupled. Easily detangled and simple to test.

With code correctness managed, we were able to turn to quality and the customer satisfaction of our feature (link here for what I mean by Quality). This ended up being a much harder and more interesting problem. The team I left behind is still aggressively working to nail this. They are an Agile team, able to self-optimize based on what they learn on a weekly basis. They will get there. I have no doubt of it.

They are working towards having pinpoint accuracy around customer insight and quality for their customer base. They will achieve QWAN.

So why the new team? And why test?

The test org of my new team is going in a different direction than most. They are aggressively shifting functional testing over to development and changing their focus to higher-value output. They want to create a much, much stronger investment in infrastructure that reduces the friction to high-value customer insight and work prioritization. They want to shift to a monthly release cycle and a service orientation.

I went there to help them execute the shift and bring about an age of Data Science and Agility in the organization.

I just found out on Friday that I am going to be the leader of one of the key initiatives needed to make this happen.

My boss told me this along with the phrase “be careful of what you ask for….”

This is going to be one of the biggest tasks I have ever tackled. I do not yet know everything that I and my team will need to do, but it’s going to be very exciting and the final goal very rewarding.

TEST is a four letter word

(note: this claim has not been tested in all global markets)

Last Friday, I attended a meeting with several technical Test leaders from around the company. We were focused on Test PR. Specifically, we wondered what the Test community could do to get college students really excited about the role or, alternatively, to find those CompSci folks who would thrive more in the Test discipline than in Dev or PM.

It was a productive meeting – the ideas generated will be great. One thing, though, bugged me about the meeting: we spent a large amount of time trying to converge on a definition of Test that we felt was valuable for the exercise.

Earlier the same day, I met with a tester who currently works in Office Test.  This tester had not been out of school for long and was frustrated with their job.  They came to me to sort this out: was it their job, their manager, the team they were on, Microsoft?  

“I really love interacting with my Developer.   I’m really helping to make them productive and they appreciate me for it”. 

 This person was upset because they knew they were providing real value, but their manager was “old school” and focused on the tester’s low bug count.

On the web, there are examples as well.  Here are a couple of recent ones:

  • Alan Page puzzles over the Test Job (and whether or not he’s doing it)
  • Scott Barber tries to scope the Value Add of Test (and does so quite beautifully, imho)

This is a pattern that I've noticed a lot lately (2 or 3 times a week). Folks are trying to define the value/purpose/meaning of the Test position.

And struggling.

It occurred to me that the problem might be Test itself.   Not the discipline, mind you, the word.

Ever since I learned Design Patterns, I have been fascinated by the idea that a single simple word can immediately invoke an entire concept.  If you were looking at a piece of my code and stated, “Brent, don’t you think this code would work better as a Singleton?”  I would know exactly what you meant. 

Definitions are key to understanding.  Singleton is nice.   It has multiple definitions, but in the CompSci context, only one.  You know it or you don’t.

Test doesn’t quite do the same.  If I say I programmed a solution, it means I wrote a bunch of code that I believe works to solve the problem I intended.    If I hand that code over to someone to Test, what are they going to do with/to it?  Is it clear?

Alan shares his perspective on the problem:  

Roles that testers play on teams vary. They vary a lot. You can’t compare them. That’s ok, and (IMO) part of the growth of the role. I find it disturbing and detrimental when testers not only assume that their version of testing is “the way”, or that some roles (or people in those roles) do not qualify as testing.

I tend to agree. I do not consider myself part of the "Context-Driven" school of testing. I think Testing, by its nature, is context-driven, and therefore including that term is redundant. (Though it does occur to me that this viewpoint *might* actually make me a member of the school. I've struggled to find the other schools of Test to compare against. (Edited: it's here.))

I find myself puzzling:  How many other industries that use the word Test have this very same ‘Context-driven’ property? 

For example, if my doctor were to schedule a CA 125 test for me, would the lab technician recognize that I was a male and question the validity of the test?   If the results came back low (or high), would the lab technician offer me nutritional tips to improve my experience?     Would the technician defer my sample to someone else while they worked hard to improve the efficiency of the paperwork handoff between the doctor and the technician?

To all three I answer:  Maybe, but I doubt it.

IMHO, over the years, Test has gotten so overloaded that it has lost much of its meaning. I am reminded of another article, in which Michael Bolton makes a distinction between Checking and Testing. I love this. While I don't necessarily agree with where he finally landed regarding the definition of Testing, I can roll with it. Michael is helping to disambiguate the terms. I've personally found them helpful when communicating with others.

So how did we get here?  My hypothesis is we got here through business demands.

Scott says this:

"When it comes to software, on average, testing is on [the business'] radar, and testing is not providing the value they are looking for (or, at least, is not being communicated in a way they understand)."

Many business leaders have differing expectations of what they want from Test. When software testing began, it was essentially for scientific endeavors. It should be no surprise to find that it grew from debugging as a mechanism to "prove" the correctness of a program (i.e., prove it works). This then grew into finding bugs. However, I believe that as commercialized software began to dominate, making it "user friendly" became a differentiator for most businesses. Suddenly "correctness" meant quality, and we hired hordes of testers to pound on the product and find bugs. The cost of recurring test passes forced us to learn to automate in order to move forward. Since Testing has gone through such dramatic shifts, and very quickly, it really shouldn't be surprising that many folks still cling to older definitions and older value propositions.

We are at another transitional stage. Quality is now not necessarily about prevention, as it once was, but about quick reaction. We are going faster and faster, and storing more and more data. I believe the Testers of the future are Data Scientists.

One thing I am digging already is the cool new title.  I hope we don’t overload this one too.