Below is a summary of key findings from the research and science behind effective software delivery and DevOps. Links to the original research and book are available in the references.
The four characteristics found to be the most effective measures of software delivery performance are deployment frequency, lead time for changes, time to restore service, and change failure rate.
Software delivery performance is categorized into the following levels:
Aspect of Software Delivery Performance | High Performer | Medium Performer | Low Performer |
---|---|---|---|
Deployment frequency: For the primary application or service you work on, how often does your organization deploy code to production or release it to end users? | On-demand (multiple deploys per day) | Between once per day and once per week | Between once per month and once every six months |
Lead time for changes: For the primary application or service you work on, what is your lead time for changes (i.e., how long does it take to go from code committed to code successfully running in production)? | Less than one hour | Between one week and one month | Between one week and one month |
Time to restore service: For the primary application or service you work on, how long does it generally take to restore service when a service incident or a defect that impacts users occurs (e.g., unplanned outage or service impairment)? | Less than one hour | Less than one day | Between one day and one week |
Change failure rate: For the primary application or service you work on, what percentage of changes to production or released to users result in degraded service (e.g., lead to service impairment or service outage) and subsequently require remediation (e.g., require a hotfix, rollback, fix forward, patch)? | 0-15% | 0-15% | 31-45% |
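
As a rough illustration of how the four measures in the table might be computed from a team's own deployment history, here is a minimal sketch. The `Deployment` record, its field names, and the `delivery_metrics` helper are hypothetical and do not come from the research. The sketch also assumes that an incident starts at the moment the failing change is deployed, which is a simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it was running in production
    failed: bool = False                     # did it degrade service and need remediation?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def delivery_metrics(deployments, period_days):
    """Summarize the four delivery measures over an observation period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")

    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    # Assumption: the incident clock starts when the failing change is deployed.
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]

    return {
        "deploys_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "median_time_to_restore": median(restore_times) if restore_times else None,
        "change_failure_rate": len(failures) / len(deployments),
    }


# Usage with made-up data: four deployments observed over two days.
base = datetime(2024, 1, 1)
history = [
    Deployment(base, base + timedelta(hours=1)),
    Deployment(base, base + timedelta(hours=3)),
    Deployment(base, base + timedelta(hours=2), failed=True,
               restored_at=base + timedelta(hours=2, minutes=30)),
    Deployment(base, base + timedelta(hours=4)),
]
for name, value in delivery_metrics(history, period_days=2).items():
    print(f"{name}: {value}")
```

The median is used here instead of the mean so that a single unusually slow change does not dominate the result; that is a design choice for this sketch, not something the research prescribes.
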
Lean methodology extends beyond manufacturing and DevOps; it also complements product development.
Pathological (power-oriented) | Bureaucratic (rule-oriented) | Generative (performance-oriented) |
---|---|---|
Low co-operation | Modest co-operation | High co-operation |
Messengers shot | Messengers neglected | Messengers trained |
Responsibilities shirked | Narrow responsibilities | Risks are shared |
Bridging discouraged | Bridging tolerated | Bridging encouraged |
Failure leads to scapegoating | Failure leads to justice | Failure leads to inquiry |
Novelty crushed | Novelty leads to problems | Novelty implemented |
Westrum was originally interested in improving safety outcomes in medical environments by adopting generative cultures and avoiding pathological ones. In my experience working in healthcare, I saw how impactful culture was there as well, and there is research that supports this. Little did he realize that his work would also become a huge driver in DevOps and continuous delivery. Of the human factors needed to adopt continuous delivery, it is one of the most significant.
His original research paper on organizational culture is available at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1765804/pdf/v013p0ii22.pdf
High performers have better employee loyalty, as measured by employee Net Promoter Score (eNPS). Employees in high-performing organizations were 2.2 times more likely to recommend their organization as a great place to work. eNPS was significantly correlated with:
- The extent to which the organization collects customer feedback and uses it to inform the design of products and features
- The ability of teams to visualize and understand the flow of products or features through development all the way to the customer
- The extent to which employees identify with their organization's values and goals, and the effort they are willing to put in to make the organization successful

Employees in high-performing teams are 2.2 times more likely to recommend their organization, and 1.8 times more likely to recommend their team, as a great place to work. Job satisfaction predicts organizational performance.
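
For readers unfamiliar with how a Net Promoter Score is derived, the following sketch applies the conventional NPS formula to employee survey answers: on a 0 to 10 "how likely are you to recommend" scale, scores of 9 or 10 count as promoters, 0 to 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. The `enps` function name and the sample data are illustrative, and the report's exact survey wording is not reproduced here.

```python
def enps(scores):
    """Return the employee Net Promoter Score for 0-10 survey answers.

    Uses the conventional NPS buckets: 9-10 promoters, 7-8 passives,
    0-6 detractors. The result ranges from -100 to 100.
    """
    if not scores:
        raise ValueError("at least one survey answer is required")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("answers must be on a 0-10 scale")

    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Made-up answers: 4 promoters, 3 passives, 3 detractors out of 10.
answers = [10, 9, 8, 7, 6, 3, 10, 9, 5, 8]
print(enps(answers))  # prints 10.0
```
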
We observed significant differences in leadership characteristics among high-, medium-, and low-performing teams.
High-performing teams reported having leaders with the strongest behaviors across all dimensions: vision, inspirational communication, intellectual stimulation, supportive leadership, and personal recognition. Low-performing teams reported the lowest levels of all five leadership characteristics. These differences were all at statistically significant levels.
The fear and anxiety that engineers and technical staff feel when they push code into production can tell us a lot about a team’s software delivery performance. We call this deployment pain, and it is important to measure because it highlights the friction and disconnect that exist between the activities used to develop and test software and the work done to maintain and keep software operational. This is where development meets IT operations, and it is where there is the greatest potential for differences: in environment, in process and methodology, in mindset, and even in the words teams use to describe the work they do.
Our research shows that teams that improve key technical capabilities decrease their deployment pain.
In the 2019 edition of the “State of DevOps” report, a new “Elite” category was added to distinguish the extreme high performers that emerged when the data was grouped. The benefits described above for “high performers” also apply to “Elite” performers.
Aspect of Software Delivery Performance | Elite Performer | High Performer | Medium Performer | Low Performer |
---|---|---|---|---|
Deployment frequency: For the primary application or service you work on, how often does your organization deploy code to production or release it to end users? | On-demand (multiple deploys per day) | Between once per day and once per week | Between once per week and once per month | Between once per month and once every six months |
Lead time for changes: For the primary application or service you work on, what is your lead time for changes (i.e., how long does it take to go from code committed to code successfully running in production)? | Less than one day | Between one day and one week | Between one week and one month | Between one month and six months |
Time to restore service: For the primary application or service you work on, how long does it generally take to restore service when a service incident or a defect that impacts users occurs (e.g., unplanned outage or service impairment)? | Less than one hour | Less than one day | Less than one day | Between one week and one month |
Change failure rate: For the primary application or service you work on, what percentage of changes to production or released to users result in degraded service (e.g., lead to service impairment or service outage) and subsequently require remediation (e.g., require a hotfix, rollback, fix forward, patch)? | 0-15% | 0-15% | 0-15% | 46-60% |
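
To show how one row of the table above could be applied to measured data, here is a minimal sketch that maps a lead time onto the four bands from the "Lead time for changes" row. The `classify_lead_time` function is hypothetical. Treating a month as 30 days, the handling of exact boundary values, and returning `None` for anything longer than six months are all assumptions of this sketch, since the table does not define them.

```python
from datetime import timedelta


def classify_lead_time(lead_time):
    """Map a commit-to-production lead time onto the table's bands.

    Assumptions (not defined by the table): one month is 30 days,
    upper boundaries are inclusive for High, Medium, and Low, and
    lead times over six months fall outside every band.
    """
    if lead_time < timedelta(0):
        raise ValueError("lead time cannot be negative")
    if lead_time < timedelta(days=1):
        return "Elite"      # less than one day
    if lead_time <= timedelta(weeks=1):
        return "High"       # between one day and one week
    if lead_time <= timedelta(days=30):
        return "Medium"     # between one week and one month
    if lead_time <= timedelta(days=180):
        return "Low"        # between one month and six months
    return None             # not covered by the table


for sample in (timedelta(hours=3), timedelta(days=3),
               timedelta(days=14), timedelta(days=90)):
    print(sample, "->", classify_lead_time(sample))
```

Only lead time is classified because its bands do not overlap. Rows such as change failure rate share the same range across several levels, so a single measure from those rows cannot identify a level on its own.
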