Quality is meeting the true end goals of the customer. Quality indicates how well, from a Customer’s perspective, a product meets its functional and non-functional requirements – without problems. Quality is an indirect measure of how much business value a product delivers but a direct measure of the relationship between users and a product.

Holistic Software Development does not include a separate set of “testing”, “verification and validation” or “quality” activities because quality is everyone’s responsibility. Building quality into a product through close customer collaboration, early identification, investigation and fixing of problems removes much of the need for extensive post-build testing and formal quality assurance.
Quality is subjective and difficult to define and so overly formalizing quality activities leads to a separation between “development” and “testing” specialisms that dysfunctionally reduces a focus on quality throughout the entire lifecycle. In HSD quality related activities such as testing are expected to be done as part of every activity.

Quality Activities

Although quality is “designed in” as part of modern professional development, problems will still occur due to the inherent complexity of software development as a team activity. For this reason, testing activities are an essential part of product development. At each level of the Definitions of Done different types of testing or other quality activities might be useful. We recommend focusing quality activities on quality risks, using a mix of techniques and quality activities whenever possible. For example, if an integration test across an entire Product Family can be quickly and cheaply run as part of an automated build at the lowest level of done then it should be.

A focus on quality is necessary across all parts of software development to prevent causing avoidable problems. Some problems will always occur despite the best intentions and so a balance between prevention and detection must be struck. In both cases we recommend using an understanding of risk in different areas to help focus quality efforts.

When decomposing requirements, we recommend doing so in collaboration with users and customers. This helps to avoid the risk that teams build the wrong solution, refining collective understanding with each build. Using visual techniques such as User Experience Modelling helps to bridge the gap between “techies” and “business users”. Remember requirements elaboration will happen continuously during the lifecycle of development, not just at the beginning.

Architecture exists to mitigate risks and focusses on the implementation of cross-cutting and non-functional requirements. Good architecture is the foundation of quality, whereas bad architecture, or the wrong choice of architecture, undermines quality efforts. Rapid high level Solution Architecture sketching with Business Customers, and System Architecture sketching with a suitably technically minded Product Customer, are excellent ways of building quality into architecture and design activities. Even without technically minded Product Customers, understanding and spiking risks is the key to successful architecture. The Architectural Profile is an excellent way of finding where extra quality activities or effort may be needed, as quality risks tend to be in the high complexity areas.

Many development practices have evolved to design in quality and prevent or detect Problems early: Pair Development, Mob Development, Test Driven Development and Unit Testing are ways of building quality into technical development.

Development work leads to integration and at each level of integration changes are combined into builds, higher level components, systems, products and product families. Each integration activity is a point at which we can check that output matches the requirements, acceptance criteria, example or plan (these activities are traditionally called “verification”) and check that the customer is getting what they want and will accept the output (traditionally called “validation”). Verification checks that the product is being built correctly; validation checks that the right product is being built.

We recommend that quality checks happen at each level of feedback cycle with outputs being checked with customers at every available opportunity to find problems as early as possible and to actively include customers in development cycles. We recommend that a focus on quality is built into every activity and explicitly addressed by testing, and other quality assurance activities, to find problems that have crept into products. Testing activities can help teams feel confident in the promotion of a product release through the stack of Definitions of Done.

However, it’s a mistake to only do higher level testing on major releases. Every change should be pushed as far up the stack of “done” as is practical and useful. Whenever a quality risk can be addressed by testing, the test should be applied regardless of levels or process.

Test activities typically involve a mix of manual testing approaches, tool supported exploratory testing and automated testing using development environments, continuous integration environments and test environments. The top level of the Definitions of Done is focused on “Acceptance” and so is normally driven by Acceptance Testing with Product Customers and Business Customers.

Definition of Done

A “Definition of Done” describes what “done” means in an unambiguous way so that interacting individuals, teams or organizations can work together effectively.

Most organizations are made up of teams of teams and at each team boundary work requests and results are handed between teams at a certain level of completeness. Although close collaborative relationships, organically formed, are often the most effective, this transactional behaviour is often necessary when teams have been organized to decompose large populations. Where transactional boundaries do exist, clarity over definitions of done, both in terms of incoming requests and quality of output, is necessary.

One of the most common causes of conflict when teams need to work together is an inconsistent Definition of Done between the teams. Aligning on known definitions prevents conflict.

We recommend developing acceptance criteria associated with work requests, requirements and bugs/changes so that requests are fully rounded and results won’t be produced without the requisite level of quality. These levels of completeness and quality are “definitions of done”. In Holistic Software Development we explicitly define a stack of levels of done that correlate to the governance structures, decomposition of requirements, promotion through integration streams and loosely to levels of testing. Where possible we recommend testing to as high a level of done as possible to identify release candidates (or problems) as early as possible.



A common term used in development businesses that’s focused on preventing problems is “assurance”. We often see businesses apply “assurance” to plans, designs and other activities – but crucially not against working software. This kind of practice, reminiscent of the process police checking that the artifacts are all there, doesn’t tend to positively affect quality.
Spending a small amount of time checking plans, roadmaps and architectural approaches is a good idea but spending considerable time or effort is actually counter-productive. We learn far more from working software that we can test, try out and otherwise play with than we ever learn from studying the plans. We’ve seen many projects where the process assurance people checked all of the processes were being followed and yet the resulting product was bad. We’ve also seen many examples of the opposite: teams that didn’t follow “the process” but succeeded in making high quality products their customers loved.
The lesson we learn from this is that working software really is the only meaningful measure of progress, and the only meaningful thing we can test. Software development, other than specific edge cases such as model driven development, simply isn’t transformational. We don’t just take the requirements and translate them into design, then into code. We learn and evolve as we’re going along.
Avoid spending time and money on “assurance” of plans, designs and other intermediary artifacts. This activity slows you down in terms of building the product, the thing you actually can assure. We recommend putting the emphasis on the output of activity, not the plans and designs. Regular customer/user demos, automated and exploratory testing are all valuable activities.

Specialists vs. Cross-Functional Developers

As software development practices have evolved people have specialized into specific roles such as:

  • Developer
  • Designer
  • Architect
  • Tester
  • Requirements Engineer
  • Etc.
This is further compounded by processes introducing roles; some process frameworks have 10s of roles! The problem is that as people specialize and go deeper into an area of specialization they increasingly abstract away from actually building software. Sometimes “professionalization” is the enemy of progress.
One of the drivers for these specializations occurring is the scaling of hierarchical organizations. The supporting functions around the core of problem solving have evolved new practices and tools, becoming “special”. In turn they become separated from “development” due to organizational size, further encouraging their specialization away from collaborative co-creation. As organizations grow these separate areas grow in size and separation. Often this leads to status, power, “empire building”, reward and recognition in the hierarchy, encouraged by Human Resources practices that emphasize responsibility in terms of numbers of people and size of budget under control rather than skill and application of that skill. This has led to a significant vested interest by old power hierarchies in maintaining this status quo, reinforced by traditional business consultancy.
This is one of the reasons why a small team can run circles around a large organization. Small teams have agility built in; large organizations often have specialization, separation and excessive management built in. We recognize that this point may be difficult reading for people who’ve specialized and abstracted away from development. But consider the following questions and, even if you recognize yourself in one of them, consider the others:
  • How many architects do you encounter that can’t build the software but spend time drawing diagrams?
  • How many testers do you encounter that don’t realize that platform-as-a-service needs testing but spend time test planning?
  • How many Project Managers do you encounter that don’t understand modern software development but spend time planning it?
  • How many developers do you encounter that think testing is someone else’s job?
A well rounded developer knows how to test, how to elicit requirements and how to design software. A professional developer is skilled in all of these areas, and more. These types of people are called “cross-functional developers”.
Of course, there is also room for expertise. Practicing experts, who are masters of their specialism and yet still up to date with development and technology, can bring expertise to teams and organizations. The best types of experts are those with both breadth and depth, sometimes called “T-shaped people”. They understand the various disciplines involved in software development reasonably well but have specialist expertise in one or two areas. Few people are experts in all areas! Often niche specialists such as high end security or information assurance professionals simply don’t exist in enough numbers to have one embedded in every team and so we see specialists float between teams. We recommend using the Team Forming practices in the People view to help identify the need for specialists, and help them become part of a team on a part time basis.
For testing specifically, we often see a separation between developers and testers or, worse, separate test teams. Separating the concerns of development and test splits responsibility for quality. Splitting verification into a separate role de-emphasizes prevention of problems and often leads to personal conflict. We’ve seen many testers taking a wider view of quality than simply testing, which then puts them into a process management role, often resented by other team members. Alternatively, we’ve seen developers feeling in direct confrontation with testers who are constantly attacking their work.
Specialist professional testers are skilled in test techniques and can act as leaders and/or mentors in test activities but we recommend that teams collectively work together to achieve the required levels of Done in whatever manner best suits the team rather than a rigid adherence to roles. At higher levels of integration, especially as part of Product Families, there will be less “coding” and more “testing” required to draw out integration issues. Integration Scenarios and their acceptance criteria are the primary source of requirements for testing at this level where more specialist professional testing skills are required over cross-functional or developer-tester skills.
Architecture professionalism often seems to grow out of the top end of development mastery but frequently leads into drawing more and more abstract diagrams about solutions that aren’t useful to the development team. Again the balance between coding and spiking vs. diagram drawing is often more towards the coding end for system architecture and conversely towards the diagram drawing end for Product Family development.
We recommend that teams are solely responsible for producing products according to their required Definition of Done – that means they need to do quality activities as part of their workflow. We recommend against the use of separate test, architecture or other specialist teams.
Internally, teams may be comprised of cross-functional developers and a number of single specialism people. We recommend using the Team Forming practice to create useful teams from these mixes. We encourage everyone to be involved in producing, designing, testing and building the software, taking the guidance of experts where necessary but learning as much about the holistic business of software development in the process.

These specialisms tend to emerge in large organizations, but startups rarely recruit requirements engineers; instead they need smart, creative, cross-functional people.

When hiring new people, we recommend:
  • Hiring experts in their field, with recent practical experience of general development
  • Hiring cross-functional developers
  • Avoiding hiring junior specialists

Test Planning

Because quality activities are expected to be part of every activity we do not recommend creating test plans separate from development plans. Such a separation in planning causes a fragmentation in the focus on quality. Planning in HSD is holistic and covers definition, production and quality aspects of software engineering. The separation of planning from development and testing in traditional software methods has led to silo mentalities, transactional behavior and an abdication of responsibility for quality by anyone who isn’t a specialist tester. We recommend instead that teams build and test their products cohesively as part of their collective plan, either in an agile/iterative way or continuously.

Builds will typically be the focus of low level developer testing, unit testing and component testing, whereas Releases will be the focus of System Testing and above. However, the more higher level tests that can be run against any form of integration the better, so if system and integration tests can be run against a build they should be. The earlier a team finds out if a build is a potential release candidate, a candidate for promotion through levels of done, the better.

A Build is a change or collection of changes, integrated into a product, that has had some (typically minimal) level of inspection and quality assurance.

A Build may be performed at an individual personal level or at a team level where changes from more than one team member are integrated into the build. Inspection and quality assurance activities are normally a little more involved at team level as the Definition of Done is a little higher but builds do tend to have a relatively low Definition of Done – not every build will be fully tested. Builds happen frequently, often many times a day.

A Release has a dual nature:

A Release is typically formed once in a Release planning timebox after a succession of internal builds but can be created from every successful team build when Continuous Deployment practices are used. Releases will reach a high Definition of Done, often as far as “End 2 End Tested” but may not necessarily be adopted by the business just because they are ready as Adoption and Business Change cycles may be decoupled from Release cadences.

Test planning is a critical and indivisible part of Release planning. Quality activities and testing are vital to reaching the De-Risked Milestones.

Planning is the process of defining an intention for, and making arrangements for, achieving something of Business Value.

In Holistic Software Development a plan may be a formal or informal description of how work will be done and controlled, defining scope, schedule and resource usage targets. We focus plans on achieving business value to ensure that plans are aligned to the business strategy and “doing the right thing”. Planning does not constitute progress, only working products should be considered progress.
Plans should be understood to be “best guesses” rather than accurate predictions of the future.
“No plan survives contact with the enemy.”
The purpose of a plan is to answer:
  • What is the work?
  • Where are we now?
  • When will the work be done?
  • What resources will be used?

Quality Risks

There are many types of testing ranging from functional and non-functional testing, to security testing, accessibility and usability testing, installation testing etc. We recommend a risk driven approach, choosing the appropriate types of testing that will address likely quality risks based on the business needs as described by the Requirements stack and Product Customers.

A Quality Risk is an upcoming or ongoing issue that has a significant probability of negatively affecting the quality of the Product or Product Family. Quality Risks can be assessed across many quality dimensions. As a starting point we recommend the FURPS+ model used with the Architectural Profile and an examination of Fringe Cases.

Fringe Cases

Fringe cases are unusual paths through requirements that are outside of the normal intended flow through a system. Fringe Cases are a fertile ground for finding bugs and other problems.

Integration Scenarios, Scoping Use Cases or even simply collections of related User Stories form pathways through a system. Often these requirements form an intentional logical flow through a system. Sometimes this flow is called a “basic flow” or a “happy path” and is the primary input to development. Typically this workflow is the most tested workflow because it’s tested during development and is the most “obvious” path through the system. Although testing this plain flow is necessary, as it typically must work, bug density and quality risk exposure tend to be relatively low. Testing of these flows is often automated and, even when it is done manually, the flows are normally required to pass testing at the lowest levels of done.

Often these workflows will have a number of alternative paths, expressed as User Stories tracing back to the same high level requirement or simply defined in an Integration Scenario. These work flows are intentional alternates to the simple, or normal, operation of the system. Since these alternates have been considered prior to development they again are likely to be tested as part of normal development and again can be often automated and part of low levels of done. In Use Case based processes bug density was typically higher in alternative flows, but in User Story based processes there isn’t an explicit definition of a “basic flow” vs. “alternative flows” amongst stories so bug density (and quality risks) tend to be more consistent.

Fringe Cases, especially those across Integration Scenarios, are the paths through a system that haven’t been written down and are often sources of higher bug density and quality risks. Professional test techniques, such as exploratory testing, are often designed to draw out Fringe Cases and professional testers (as opposed to developers doing testing) can be very skilled at finding Fringe Cases. Sometimes random techniques can be used to try to find Fringe Cases such as a “Chaos Monkey” or “Random User Simulation”.
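As a rough illustration of the random techniques mentioned above, the following Python sketch runs a simple “Random User Simulation” against a toy system. Everything in it is hypothetical: the Basket class, its deliberately planted bug, the action names and the invariant are assumptions made for the example, not part of HSD or of any real tool. The idea is only that random action sequences can wander off the written-down flows and surface a path nobody specified.

```python
import random

# Hypothetical user actions for a toy system under test.
ACTIONS = ("add", "remove", "checkout")


class Basket:
    """Toy system with a planted fringe-case bug (illustration only)."""

    def __init__(self):
        self.count = 0

    def apply(self, action):
        if action == "add":
            self.count += 1
        elif action == "remove":
            self.count -= 1  # planted bug: no guard for an empty basket
        elif action == "checkout":
            self.count = 0


def replay(actions):
    """Return the index of the first step that breaks the invariant, or None."""
    basket = Basket()
    for index, action in enumerate(actions):
        basket.apply(action)
        if basket.count < 0:  # invariant: a basket never holds fewer than 0 items
            return index
    return None


def find_fringe_case(sessions=200, steps=20, seed=0):
    """Try random sessions; return the shortest failing prefix of the first failure."""
    rng = random.Random(seed)  # seeded so a failing path can be reproduced
    for _ in range(sessions):
        actions = [rng.choice(ACTIONS) for _ in range(steps)]
        failed_at = replay(actions)
        if failed_at is not None:
            return actions[: failed_at + 1]
    return None


failing_path = find_fringe_case()
print(failing_path)
```

Seeding the generator is a design choice worth keeping in any real version: a randomly discovered path is only useful if it can be replayed and turned into a regular test.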

Dealing with Problems and Bugs

Testing and other quality activities often uncover potential issues with products that aren’t necessarily bugs or changes. We recommend capturing “problems” which can be thought of as instances of quality risks. Problems might be bugs, design problems or user interaction problems, misunderstandings etc. Moving to a focus on enhancing the relationship between a Customer and the Product rather than spending time categorizing types of problems helps drive quality.

A Problem is an issue negatively affecting the relationship between a User/Stakeholder and the Product. Problems are often represented as “Bugs” or “Changes”.

Bugs (sometimes called “Defects”) and Changes are both ways of representing “problems” where the difference is often described dysfunctionally as:

“Bugs the supplier pays for, Changes the customer pays for” or “Bugs are when the product doesn’t meet the requirements, changes are when the customer wants a change to the requirement.”

The difficulty with the former approach is that it leads to a commercial pressure against bugs, causing suppliers to argue that everything is a change. Indeed, many large software vendors have explicit internal policies to not accept bugs but argue that they are changes wherever possible. This often leads to time-consuming conference calls arguing the details of individual problems – adversarially negotiating the difference between bugs and changes damages the collaborative relationship between suppliers and customers causing transactional behavior.

The latter approach incorrectly implies that a software system can be exactly and perfectly specified, using an up-front requirements approach. We know from the vast weight of industry evidence that this idea is either simply wrong for the majority of software projects or excessively cost-prohibitive and time consuming in predictable cases. There is too much implicit information inherent in requirements, software and user-product interactions to explicitly define everything at this level.

Instead we promote thinking of both bugs and changes in terms of:
“…an issue negatively affecting the relationship between a User/Stakeholder and the Product”

We recommend that Product Customers prioritize bugs, changes and new requirements alongside existing requirements as part of iterative cycles on Backlogs during each Release. If any bug, change or new requirement is too large to be incorporated without significant change in scope, time or cost then the Product Customer may escalate to the Business Customer. Where supply is managed by a software development contract we recommend that tolerances are set around changes to scope, time and cost within which the suppliers and customers can make immediate decisions, with only very major changes requiring a contractual change.

If a bug, change or new requirement is outside of tolerances and sufficiently large then it can be considered as a new Customer Request for consideration in the portfolio.
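The routing described above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions: the Impact fields, their units, the numeric limits and the second “portfolio_limit” threshold (standing in for “sufficiently large”) are all hypothetical, since the text does not define how tolerances are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impact:
    """Estimated effect of a bug, change or new requirement (units are assumptions)."""
    scope_points: float
    days: float
    cost: float


def exceeds(impact: Impact, limit: Impact) -> bool:
    """True if the impact is over the limit on any of scope, time or cost."""
    return (
        impact.scope_points > limit.scope_points
        or impact.days > limit.days
        or impact.cost > limit.cost
    )


def route(impact: Impact, tolerance: Impact, portfolio_limit: Impact) -> str:
    """Decide who handles a work item, following the three cases in the text."""
    if not exceeds(impact, tolerance):
        return "decide immediately"  # within agreed tolerances
    if exceeds(impact, portfolio_limit):
        return "new Customer Request"  # large enough for the portfolio
    return "escalate to Business Customer"


# Hypothetical limits agreed between supplier and customer.
tolerance = Impact(scope_points=5, days=3, cost=2_000)
portfolio_limit = Impact(scope_points=40, days=30, cost=50_000)

print(route(Impact(2, 1, 500), tolerance, portfolio_limit))       # decide immediately
print(route(Impact(8, 5, 3_000), tolerance, portfolio_limit))     # escalate to Business Customer
print(route(Impact(60, 45, 80_000), tolerance, portfolio_limit))  # new Customer Request
```

Note that the function never asks whether the item is a bug or a change; only its size relative to the agreed tolerances matters.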

We recommend that conflict is avoided by organizations, teams and stakeholders by collectively making group decisions regarding whether a problem is a bug, change or new requirement (if it matters) as well as consideration of whether that problem is within tolerances or not. The earlier problems (of any kind) are identified in development the easier and cheaper they are to fix. For that reason we recommend that quality assurance is built into every activity using feedback cycles rather than left to late lifecycle testing.

In practical terms

Most teams will not be familiar with the term “problem” and so will tend to use bugs (or “defects”), changes (or “change requests”) alongside their requirement types (features, stories etc.) in work item tracking tools. We recommend using these familiar terms without worrying about the differences between them too much. They all represent work that is prioritized on a Backlog and selected for development so the exact difference between a bug and a change or new requirement vs. old requirement is not really relevant.

Measurement driven behavior will often drive items of one type to be turned into another (e.g. bugs into changes to avoid dipping quality metrics). This tends to happen when a type (such as bug) has numerous different contexts such as an internal bug (found by team testing) or an external/escaped bug found in operational usage by customers. Teams generally do not want to mix these different contexts when examining work item metrics as the truth can be easily confused (for example, high numbers of internal bugs is a good thing as it means quality problems are found before the product is released whereas high numbers of escaped bugs are generally considered a bad thing). We recommend using flags in work item tracking tools to differentiate contexts (such as internal vs. escaped bugs) and filtering them to examine metrics.
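A small Python sketch of this flag-and-filter idea is shown below. The WorkItem shape and the “internal”/“escaped” flag values are assumptions made for illustration; a real work item tracking tool will have its own fields and query language.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    item_id: int
    kind: str     # "bug", "change" or "requirement"
    context: str  # hypothetical flag: "internal" or "escaped"


def bug_counts_by_context(items):
    """Count bugs per context flag so the two contexts are never mixed."""
    return Counter(item.context for item in items if item.kind == "bug")


items = [
    WorkItem(1, "bug", "internal"),
    WorkItem(2, "bug", "internal"),
    WorkItem(3, "bug", "escaped"),
    WorkItem(4, "change", "internal"),
]

counts = bug_counts_by_context(items)
print(counts["internal"], counts["escaped"])  # prints "2 1"
```

Keeping the context as a flag on a single “bug” type, rather than inventing extra item types, means the item itself never has to be reclassified to make a metric look better.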

Is it a bug or an unfinished requirement?

During development, especially when feedback cycles are small, testing activity may identify problems with the current development of a requirement since it hasn’t been finished yet – or more accurately it hasn’t yet achieved its required level of Done. Often a requirement might be thought of as “done” and yet found to be lacking by testing or user acceptance activities. In these cases, teams need to understand whether to raise a bug against the original requirement or re-open the requirement.

Re-opening will typically raise workflow productivity metrics such as Lead and Cycle Time (which can be seen as bad); however, it may be the more honest approach for requirements in active development. Raising bugs against a requirement will typically lower quality metrics such as quality confidence (which can also be seen as bad); however, this may be the more honest approach for requirements once they have been accepted as “done”. As a result, we recommend that requirements that have not yet hit the highest level of acceptance at the team’s scope have “Changes” raised against them (since they are not “done” yet) and that bugs are raised against requirements only once they have been accepted as “done”.

This means that for development teams delivering a product, once development against a requirement has been accepted by the product customer (reached the “Product User Acceptance” Definition of Done) it should not be re-opened; instead, bugs should be raised against the product. Below this level of done the requirement has not been closed or marked as “done”; instead, more work is required to reach the “acceptance” level of done, and that work is best represented as “Changes”.

For integration teams at a Product Family level whose Product Family has not yet, as a whole, reached the “Product Family User Acceptance” Definition of Done, problems (including unexpected work) are best represented as either Changes or new Integration Scenarios if significant enough. Problems identified with constituent components or systems from contributing teams may be raised against those teams as problems (or bugs/changes). Once “Product Family User Acceptance” has been achieved then “Bugs” can be raised against the Product Family.

Treating problems in this way requires not considering “done” as a simple binary state, but instead adopting a stack of “Definitions of Done”. Most work item tracking tools can do this as either sub-states on “Closed” or by having multiple “Closed” states.
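One way to picture a non-binary “done” is an ordered set of states, sketched here in Python. Only “Product User Acceptance” is taken from the text; the lower level names, their order and the helper function are hypothetical placeholders, and a real team would substitute its own stack of Definitions of Done.

```python
from enum import IntEnum


class DoneLevel(IntEnum):
    """Ordered levels of done; all but the last name are hypothetical."""
    IN_PROGRESS = 0
    DEVELOPER_TESTED = 1
    TEAM_BUILD_TESTED = 2
    PRODUCT_USER_ACCEPTANCE = 3


def problem_record_type(
    level_reached: DoneLevel,
    acceptance_level: DoneLevel = DoneLevel.PRODUCT_USER_ACCEPTANCE,
) -> str:
    """Below the acceptance level raise a Change; at or above it raise a Bug."""
    return "Bug" if level_reached >= acceptance_level else "Change"


print(problem_record_type(DoneLevel.TEAM_BUILD_TESTED))        # Change
print(problem_record_type(DoneLevel.PRODUCT_USER_ACCEPTANCE))  # Bug
```

An integration team could, under the same assumptions, pass a different acceptance_level to apply the equivalent rule at Product Family scope.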

Where software development is managed by a contract this distinction helps drive a need for closer customer collaboration.


We can measure Quality in a number of ways. In the Metrics and Reporting view we discuss Bug Frequency and Quality Confidence. However, the most effective way to measure the quality of a system is through direct user feedback. For internal systems, services and open market apps alike, we recommend instrumenting basic usage so that developers can track how much their code is being used.

Quality Confidence is a lead indicator for the quality of a software release based on the stability of test pass rates and requirements coverage. Quality Confidence can be implemented at any level of the requirements stack mapped to a definition of done.