Quality is meeting the true end goals of the customer. Quality indicates how well, from a Customer's perspective, a product meets its functional and non-functional requirements - without problems. Quality is an indirect measure of how much business value a product delivers but a direct measure of the relationship between users and a product.
Although quality is "designed in" as part of modern professional development, problems will still occur due to the inherent complexity of software development as a team activity. For this reason, testing activities are an essential part of product development. At each level of the Definitions of Done, different types of testing or other quality activities might be useful. We recommend focusing quality activities on quality risks, using a mix of techniques and quality activities whenever possible. For example, if an integration test across an entire Product Family can be quickly and cheaply run as part of an automated build at the lowest level of done then it should be.
A focus on quality is necessary across all parts of software development to prevent causing avoidable problems. Some problems will always occur despite the best intentions and so a balance between prevention and detection must be struck. In both cases we recommend using an understanding of risk in different areas to help focus quality efforts.
When decomposing requirements, we recommend doing so in collaboration with users and customers. This helps to avoid the risk that teams build the wrong solution, refining collective understanding with each build. Using visual techniques such as User Experience Modelling helps to bridge the gap between “techies” and “business users”. Remember requirements elaboration will happen continuously during the lifecycle of development, not just at the beginning.
Architecture exists to mitigate risks and focusses on the implementation of cross-cutting and non-functional requirements. Good architecture is the foundation of quality, whereas bad, or the wrong choice of, architecture undermines quality efforts. Rapid high level Solution Architecture sketching with Business Customers, and System Architecture sketching with a suitably technically minded Product Customer, are excellent ways of building quality into architecture and design activities. Even without technically minded Product Customers, understanding, and spiking, risks is the key to successful architecture. The Architectural Profile is an excellent way of finding where extra quality activities or effort may be needed, as quality risks tend to be in the high complexity areas.
Many development practices have evolved to design in quality and prevent or detect Problems early: Pair Development, Mob Development, Test Driven Development and Unit Testing are ways of building quality into technical development.
Development work leads to integration and at each level of integration changes are combined into builds, higher level components, systems, products and product families. Each integration activity is a point at which we can check that output matches the requirements, acceptance criteria, example or plan (these activities are traditionally called “verification”) and check that the customer is getting what they want and will accept the output (traditionally called “validation”). Verification is checking that the product is built correctly; validation checks that the right product is being built.
We recommend that quality checks happen at each level of feedback cycle with outputs being checked with customers at every available opportunity to find problems as early as possible and to actively include customers in development cycles. We recommend that a focus on quality is built into every activity and explicitly addressed by testing, and other quality assurance activities, to find problems that have crept into products. Testing activities can help teams feel confident in the promotion of a product release through the stack of Definitions of Done.
However, it’s a mistake to only do higher level testing on major releases. Every change should be pushed as far up the stack of “done” as is practical and useful. Whenever a quality risk can be addressed by testing, the test should be applied regardless of levels or process.
Test activities typically involve a mix of manual testing approaches, tool supported exploratory testing and automated testing using development environments, continuous integration environments and test environments. The top level of the Definitions of Done is focused on "Acceptance" and so is normally driven by Acceptance Testing with Product Customers and Business Customers.
Definition of Done
A "Definition of Done" describes what "done" means in an unambiguous way so that interacting individuals, teams or organizations can work together effectively.
Most organizations are made up of teams of teams, and at each team boundary work requests and results are handed between teams at a certain level of completeness. Although close collaborative relationships, organically formed, are often the most effective, this transactional behaviour is often necessary when teams have been organized to divide work across a large population of people. Where transactional boundaries do exist, clarity over definitions of done, both in terms of incoming requests and quality of output, is necessary.
One of the most common causes of conflict when teams need to work together is an inconsistent Definition of Done between the teams. Aligning on known definitions prevents conflict.
We recommend developing acceptance criteria associated with work requests, requirements and bugs/changes so that requests are fully rounded and results won't be produced without the requisite level of quality. These levels of completeness and quality are "definitions of done". In Holistic Software Development we explicitly define a stack of levels of done that correlate to the governance structures, decomposition of requirements, promotion through integration streams and loosely to levels of testing. Where possible we recommend testing to as high a level of done as possible to identify release candidates (or problems) as early as possible.
Specialists vs. Cross-Functional Developers
As software development practices have evolved people have specialized into specific roles such as:
- Requirements Engineer
These specialisms tend to emerge in large organizations, but startups rarely recruit requirements engineers; instead they need smart, creative, cross-functional people. When building teams we recommend:
- Hiring experts in their field, with recent practical experience of general development.
- Hiring cross-functional developers
- Avoiding the hiring of junior specialists
Because quality activities are expected to be part of every activity we do not recommend creating test plans separate from development plans. Such a separation in planning causes a fragmentation in the focus on quality. Planning in HSD is holistic and covers definition, production and quality aspects of software engineering. The separation of planning from development and testing in traditional software methods has led to silo mentalities, transactional behavior and an abdication of responsibility for quality by anyone who isn't a specialist tester. We recommend instead that teams build and test their products cohesively as part of their collective plan, either in an agile/iterative way or continuously.
Builds will typically be the focus of low level developer testing, unit testing and component testing, whereas Releases will be the focus of System Testing and above. However, the more that higher level tests can be run against any form of integration the better, so if system and integration tests can be run against a build they should be. The earlier a team finds out whether a build is a potential release candidate, a candidate for promotion through levels of done, the better.
A Build is a change or collection of changes, integrated into a product, that has had some (typically minimal) level of inspection and quality assurance.
A Build may be performed at an individual level or at a team level where changes from more than one team member are integrated into the build. Inspection and quality assurance activities are normally a little more involved at team level as the Definition of Done is a little higher, but builds do tend to have a relatively low Definition of Done – not every build will be fully tested. Builds happen frequently, often many times a day.
A Release has a dual nature:
- a Build, with a high level of Done, packaged for operational deployment that is a viable cohesive product delivering Business Value.
- a Planning concept (Release Plan), typically containing a number of deliveries produced through iterations/sprints or continuous flow, that combines a set of "done" high level requirements, architecture and tests - often represented as a Milestone.
A Release is typically formed once in a Release planning timebox after a succession of internal builds but can be created from every successful team build when Continuous Deployment practices are used. Releases will reach a high Definition of Done, often as far as "End 2 End Tested" but may not necessarily be adopted by the business just because they are ready as Adoption and Business Change cycles may be decoupled from Release cadences.
Test planning is a critical and indivisible part of Release planning. Quality activities and testing are vital to reaching the De-Risked Milestones.
Planning is the process of defining an intention for, and making arrangements for, achieving something of Business Value
- What is the work?
- Where are we now?
- When will the work be done?
- What resources will be used?
There are many types of testing ranging from functional and non-functional testing, to security testing, accessibility and usability testing, installation testing etc. We recommend a risk driven approach, choosing the appropriate types of testing that will address likely quality risks based on the business needs as described by the Requirements stack and Product Customers.
A Quality Risk is an upcoming or ongoing issue that has a significant probability of negatively affecting the quality of the Product or Product Family
Quality Risks can be assessed across many quality dimensions. As a starting point we recommend the FURPS+ model used with the Architectural Profile and an examination of Fringe Cases.
Integration Scenarios, Scoping Use Cases or even simply collections of related User Stories form pathways through a system. Often these requirements form an intentional logical flow through a system. Sometimes this flow is called a "basic flow" or a "happy path" and is the primary input to development. Typically this workflow is the most tested workflow because it's exercised during development and is the most "obvious" path through the system. Although testing this plain flow is necessary, as it typically must work, bug density and quality risk exposure tend to be relatively low. Testing of these flows is often automated and, even when performed manually, these flows are normally required to pass testing at the lowest levels of done.
Often these workflows will have a number of alternative paths, expressed as User Stories tracing back to the same high level requirement or simply defined in an Integration Scenario. These workflows are intentional alternates to the simple, or normal, operation of the system. Since these alternates have been considered prior to development they are again likely to be tested as part of normal development and can often be automated and included in low levels of done. In Use Case based processes bug density was typically higher in alternative flows, but in User Story based processes there isn't an explicit distinction between a "basic flow" and "alternative flows" amongst stories, so bug density (and quality risks) tend to be more consistent.
Fringe Cases, especially those across Integration Scenarios, are the paths through a system that haven't been written down and are often sources of higher bug density and quality risks. Professional test techniques, such as exploratory testing, are often designed to draw out Fringe Cases and professional testers (as opposed to developers doing testing) can be very skilled at finding Fringe Cases. Sometimes random techniques can be used to try to find Fringe Cases such as a "Chaos Monkey" or "Random User Simulation".
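As an illustrative sketch, a "Random User Simulation" can drive a model of the system with random action sequences while checking invariants, surfacing Fringe Cases nobody wrote down. The `VendingMachine` model and its actions below are hypothetical examples, not part of HSD:

```python
import random

class VendingMachine:
    """Toy system under test (hypothetical example)."""
    def __init__(self):
        self.credit = 0
        self.stock = 2

    def insert_coin(self):
        self.credit += 1

    def vend(self):
        if self.credit >= 1 and self.stock > 0:
            self.credit -= 1
            self.stock -= 1

    def refund(self):
        self.credit = 0

def random_user_session(machine, steps=50, seed=None):
    """Apply a random sequence of user actions, checking invariants after each.

    Returns the action trace so any failure can be reproduced."""
    rng = random.Random(seed)
    actions = [machine.insert_coin, machine.vend, machine.refund]
    trace = []
    for _ in range(steps):
        action = rng.choice(actions)
        trace.append(action.__name__)
        action()
        # Invariants: credit and stock must never go negative.
        assert machine.credit >= 0 and machine.stock >= 0, trace
    return trace

trace = random_user_session(VendingMachine(), steps=100, seed=42)
print(len(trace))  # → 100 actions executed without an invariant violation
```

Recording the seed and the trace is what makes random testing useful in practice: a Fringe Case found by chance can be replayed deterministically as a new regression test.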
Dealing with Problems and Bugs
Testing and other quality activities often uncover potential issues with products that aren't necessarily bugs or changes. We recommend capturing "problems", which can be thought of as instances of quality risks. Problems might be bugs, design problems, user interaction problems, misunderstandings and so on. Moving to a focus on enhancing the relationship between a Customer and the Product, rather than spending time categorizing types of problems, helps drive quality.
A Problem is an issue negatively affecting the relationship between a User/Stakeholder and the Product. Problems are often represented as "Bugs" or "Changes"
Bugs (sometimes called "Defects") and Changes are both ways of representing "problems" where the difference is often described dysfunctionally as:
"Bugs the supplier pays for, Changes the customer pays for"
"Bugs are when the product doesn't meet the requirements, changes are when the customer wants a change to the requirement."
The difficulty with the former approach is that it leads to a commercial pressure against bugs, causing suppliers to argue that everything is a change. Indeed, many large software vendors have explicit internal policies to not accept bugs but argue that they are changes wherever possible. This often leads to time-consuming conference calls arguing the details of individual problems - adversarially negotiating the difference between bugs and changes damages the collaborative relationship between suppliers and customers causing transactional behavior.
The latter approach incorrectly implies that a software system can be exactly and perfectly specified using an up-front requirements approach. We know from the vast weight of industry evidence that this idea is either simply wrong for the majority of software projects, or prohibitively expensive and time consuming in the cases where it is possible. There is too much implicit information inherent in requirements, software and user-product interactions to explicitly define everything at this level.
We recommend that Product Customers prioritize bugs, changes and new requirements alongside existing requirements as part of iterative cycles on Backlogs during each Release. If any bug, change or new requirement is too large to be incorporated without significant change in scope, time or cost then the Product Customer may escalate to the Business Customer. Where supply is managed by a software development contract we recommend that tolerances are set around changes to scope, time and cost within which the suppliers and customers can make immediate decisions, with only very major changes requiring a contractual change.
If a bug, change or new requirement is outside of tolerances and sufficiently large then it can be considered as a new Customer Request for consideration in the portfolio.
We recommend that organizations, teams and stakeholders avoid conflict by collectively making group decisions regarding whether a problem is a bug, change or new requirement (if it matters), as well as whether that problem is within tolerances. The earlier problems (of any kind) are identified in development, the easier and cheaper they are to fix. For that reason we recommend that quality assurance is built into every activity using feedback cycles rather than left to late lifecycle testing.
In practical terms
Most teams will not be familiar with the term "problem" and so will tend to use bugs (or "defects"), changes (or "change requests") alongside their requirement types (features, stories etc.) in work item tracking tools. We recommend using these familiar terms without worrying about the differences between them too much. They all represent work that is prioritized on a Backlog and selected for development so the exact difference between a bug and a change or new requirement vs. old requirement is not really relevant.
Measurement driven behavior will often drive items of one type to be turned into another (e.g. bugs into changes to avoid dipping quality metrics). This tends to happen when a type (such as bug) has numerous different contexts such as an internal bug (found by team testing) or an external/escaped bug found in operational usage by customers. Teams generally do not want to mix these different contexts when examining work item metrics as the truth can be easily confused (for example, high numbers of internal bugs is a good thing as it means quality problems are found before the product is released whereas high numbers of escaped bugs are generally considered a bad thing). We recommend using flags in work item tracking tools to differentiate contexts (such as internal vs. escaped bugs) and filtering them to examine metrics.
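For example, with a simple flag on each bug record the two contexts can be separated before computing metrics. The record shape and field names here are hypothetical, not from any particular tracking tool:

```python
# Hypothetical work items exported from a tracking tool; the "escaped"
# flag records whether the bug was found in operational use.
bugs = [
    {"id": 1, "escaped": False},  # found by team testing (internal)
    {"id": 2, "escaped": False},
    {"id": 3, "escaped": True},   # found by a customer in operation
]

# Filter by the flag so the two contexts are never mixed in one metric.
internal = [b for b in bugs if not b["escaped"]]
escaped = [b for b in bugs if b["escaped"]]

# High internal counts are generally good (problems caught before release);
# high escaped counts are generally bad - report them separately.
print(f"internal bugs: {len(internal)}, escaped bugs: {len(escaped)}")
# → internal bugs: 2, escaped bugs: 1
```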
Is it a bug or an unfinished requirement?
During development, especially when feedback cycles are small, testing activity may identify problems with the current development of a requirement since it hasn't been finished yet - or more accurately it hasn't yet achieved its required level of Done. Often a requirement might be thought of as "done" and yet found to be lacking by testing or user acceptance activities. In these cases, teams need to understand whether to raise a bug against the original requirement or re-open the requirement.
Re-opening will typically raise workflow productivity metrics (such as Lead and Cycle Time which can be seen as bad) however it may be the more honest approach for requirements in active development. Raising bugs against a requirement will typically lower quality metrics (such as quality confidence which can also be seen as bad) however this may be the more honest approach for requirements once they have been accepted as "done". As a result, we recommend that requirements that have not yet hit the highest level of acceptance at the team's scope have "Changes" against them (since they are not "done" yet) and only once requirements have been accepted as "done" are bugs raised against them.
This means that for development teams delivering a product, once development against a requirement has been accepted by the product customer (reached the "Product User Acceptance" Definition of Done) it should not be re-opened - instead bugs should be raised against the product. Below this level of done the requirement has not been closed or marked as "done"; instead more work is required to reach the "acceptance" level of done, and such problems are best represented as "Changes".
For integration teams at a Product Family level whose Product Family has not yet, as a whole, reached the "Product Family User Acceptance" Definition of Done, problems (including unexpected work) are best represented as either Changes or new Integration Scenarios if significant enough. Problems identified with constituent components or systems from contributing teams may be raised against those teams as problems (or bugs/changes). Once "Product Family User Acceptance" has been achieved then "Bugs" can be raised against the Product Family.
Treating problems in this way requires not considering “done” as a simple binary state, but instead adopting a stack of “Definitions of Done”. Most work item tracking tools can do this as either sub-states on “Closed” or by having multiple “Closed” states.
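One way to treat "done" as a stack rather than a binary state is to order the levels and compare against the acceptance threshold when deciding between a Change and a Bug. The level names below loosely follow those mentioned in the text; the code itself is an illustrative sketch, not a tool's actual data model:

```python
# Ordered stack of Definitions of Done, lowest to highest
# (assumed names for illustration).
DONE_LEVELS = [
    "Built",
    "System Tested",
    "Product User Acceptance",
    "Product Family User Acceptance",
]

def problem_type(current_level, acceptance_level="Product User Acceptance"):
    """Return how a newly found problem is best represented.

    Before acceptance the requirement is not yet "done", so the problem is
    a Change against it; after acceptance it is a Bug against the product."""
    if DONE_LEVELS.index(current_level) < DONE_LEVELS.index(acceptance_level):
        return "Change"
    return "Bug"

print(problem_type("System Tested"))            # → Change
print(problem_type("Product User Acceptance"))  # → Bug
```

In a work item tracking tool the same ordering can be realized as sub-states on "Closed" or as multiple "Closed" states, as described above.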
Where software development is managed by a contract this distinction helps drive a need for closer customer collaboration.
We can measure Quality in a number of ways. In the Metrics and Reporting view we discuss Bug Frequency and Quality Confidence. However, the most effective way to measure the quality of a system is through direct user feedback. For internal systems and services as well as open market apps, we recommend instrumenting basic usage so that developers can track how much their code is being used.
Quality Confidence is a lead indicator for the quality of a software release based on the stability of test pass rates and requirements coverage. Quality Confidence can be implemented at any level of the requirements stack mapped to a definition of done.
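A minimal sketch of how such an indicator might be computed, combining requirements coverage with the latest pass rate and the stability of recent pass rates. The weighting and the stability formula are assumptions for illustration, not the HSD definition of Quality Confidence:

```python
def quality_confidence(pass_rate_history, covered, total_requirements):
    """Illustrative quality-confidence indicator in the range [0, 1].

    pass_rate_history: recent test pass rates (0..1), oldest first.
    covered / total_requirements: requirements with passing tests vs all."""
    if not pass_rate_history or total_requirements == 0:
        return 0.0
    coverage = covered / total_requirements
    latest = pass_rate_history[-1]
    # Stability: penalize fluctuating pass rates via the average step change
    # between consecutive test runs (assumed penalty, for illustration).
    if len(pass_rate_history) > 1:
        steps = [abs(b - a)
                 for a, b in zip(pass_rate_history, pass_rate_history[1:])]
        stability = 1 - sum(steps) / len(steps)
    else:
        stability = 1.0
    return coverage * latest * stability

# A stable, well-covered release scores higher than a fluctuating one.
stable = quality_confidence([0.90, 0.92, 0.93], covered=45, total_requirements=50)
erratic = quality_confidence([0.50, 0.95, 0.60], covered=45, total_requirements=50)
print(stable > erratic)  # → True
```

The key property is that confidence only rises when pass rates are both high and stable over time, which is what makes it a lead indicator rather than a snapshot.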