Scaling Agile using the Scaled Agile Framework (SAFe)

“If you’ve lost the ability to do small things then you’ve not scaled, you’ve just gotten big and slow!!!”

Abstract

The Scaled Agile Framework (SAFe) has become a hot discussion topic in the agile community as a solution to scaling agile in the enterprise. Amongst the options available to scale agile, SAFe appears to be leading the pack as the popular choice for consideration and implementation. The framework has its critics, and some prominent, respected people in the agile community have spoken out against it. With everything you read you have to consider the motivations behind the narrative to evaluate how impartial a critique actually might be. I decided I needed an education in SAFe to develop an informed opinion on the advantages and disadvantages of the framework and when it might make sense to use it. Last week I flew to Melbourne to undergo the SAFe Program Consultant (SPC) certification training. This blog gives an account of what I learned (the good, the bad and the ugly), the opinions I have formed and what recommendations I would make.

WARNING – This blog is a long read but hopefully valuable for those considering SAFe.

The Pitch

SAFe is pitched as a template framework for implementing agile practices in the enterprise at scale. It recognizes what works well today at the team level on agile deliveries and attempts to protect and scale that for the enterprise. It specifically pitches to the power brokers in large organisations (e.g. executives, managers, architecture etc.) to bring them on board in order to scale successfully.

The framework is founded on Lean and Agile principles and it looks to integrate and leverage much that has already proven to work when undertaking knowledge based work (e.g. Scrum, XP, Kanban etc.).

Based on Lean principles SAFe is looking to achieve a goal consisting of:

  • Speed: sustainably shortest lead time
  • Value: most customer delight, lowest cost, high morale, safety
  • Quality: best quality and value to people and society

The business benefits it targets include:

  • Significant increase in employee engagement
  • Increase in productivity
  • Faster time to market
  • Higher quality

Key Concepts

See the SAFe website for more details. This is just a brief introduction to some key concepts in SAFe to assist reading this blog.

SAFe defines these three levels within the enterprise:

  • Portfolio
    • Program
      • Team

Portfolio Level

  • Focused on the Portfolio Vision
  • Investment Themes are created, each theme has funding assigned
  • Themes are broken down into business and architectural epics, which form the portfolio backlog. Each epic draws down a portion of the investment theme funding and is subject to a very lightweight business case
  • Kanban is used at the Portfolio level

Program level

  • Focused on a specific business value stream
  • Portfolio epics are broken down into features to help form the program backlog
  • It is a scaled up version of the team level scrum model so it is based around a timebox and not flow (kanban)
  • The timebox at the program level is called a PSI (Potentially Shippable Increment) and by default is 5 sprints (iterations) in length (10 weeks).
  • Each PSI is delivered through the concept of an Agile Release Train (ART), the analogy is trains running continuously with (relatively) short intervals between each one (like a subway/underground rail system) so you just turn up and board without the need for much upfront planning.
  • Release Train Engineer (RTE) is the equivalent to a ScrumMaster at the program level (a ScrumMaster of ScrumMasters).
  • The Product Manager is the equivalent to the Product Owner at the program level and forms the Product Management team with the team level Product Owners.

Team level

  • Program level features are broken down into Stories to help form the team backlog
  • The standard implementation is to operate Scrum with some XP practices at the team level (alternatives such as kanban can be used)
  • The standard Scrum roles apply at the Team level
  • The sprint (iteration) size is 2 weeks

HIP Sprint

  • The last sprint (iteration) of a PSI is known as a HIP sprint – it does not deliver new features but instead is focused on:
    • Hardening
    • Innovation
    • Planning

Weighted Shortest Job First (WSJF) Prioritisation

  • SAFe uses an algorithm to determine the prioritisation of Epics and Features at the Portfolio and Program levels. The formula tries to consider business value, date criticality, risk reduction, opportunity enablement and the size of the work in deriving a number that is then used to rank backlog items.
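
As a rough illustration of how such a ranking might be computed, here is a minimal Python sketch. It assumes the commonly described form of WSJF, in which a cost-of-delay score (business value plus date criticality plus risk reduction/opportunity enablement) is divided by the size of the work. The `BacklogItem` class, its field names and the sample scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    business_value: int    # relative score (invented scale)
    date_criticality: int  # relative score
    risk_opportunity: int  # risk reduction / opportunity enablement
    job_size: int          # relative size of the work

    @property
    def cost_of_delay(self) -> int:
        # Assumed form: the three value-related scores are simply summed.
        return self.business_value + self.date_criticality + self.risk_opportunity

    @property
    def wsjf(self) -> float:
        # Assumed form: cost of delay divided by job size.
        return self.cost_of_delay / self.job_size


def rank_by_wsjf(items):
    """Return items ordered with the highest WSJF score first."""
    return sorted(items, key=lambda item: item.wsjf, reverse=True)


backlog = [
    BacklogItem("Feature A", 8, 5, 2, 13),
    BacklogItem("Feature B", 5, 3, 1, 3),
    BacklogItem("Feature C", 3, 8, 5, 8),
]

ranked = rank_by_wsjf(backlog)
for item in ranked:
    print(f"{item.name}: WSJF = {item.wsjf:.2f}")
```

Note how the small Feature B outranks the larger Feature A even though A scores higher on business value: dividing by size favours short jobs.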

Aha!! Moments

Projects Disappear

  • Projects as a method to deliver scope disappear in the SAFe framework. A delivery capacity is set up through the regular departure of a release train (ART) every 10 weeks (default). There is a Program and Portfolio Management team at the Portfolio level where a PMO, program and/or project managers may still exist. The RTE may also be a project manager if they have an agile mindset and experience.

It’s Scrum at the Team Level

  • SAFe at the team level uses Scrum with some XP practices (it can also use Kanban or adaptions like ScrumBan). SAFe only changes the following:
    • Cadence alone is not enough, SAFe requires rituals to be synchronised across teams
    • Team backlogs are pushed down from the Program level by the Product Management and Architecture teams.
    • Teams have to take part in additional rituals at the Program level for planning, managing dependencies, inspection and adaption.

Scrum of Scrums (SoS) is used to help coordination across teams

  • SAFe uses the SoS approach within PSI boundaries to coordinate across teams (e.g. managing dependencies, removing impediments etc.)
  • In addition to SoS, SAFe introduces the PSI planning meeting at PSI boundaries.

PSI <> the only time we can ship

  • A PSI is a vehicle to plan (coordinate and synchronise) the work for a 10-week horizon (default); it is not a constraint to when software can actually be shipped. SAFe incorporates the idea of Deliver On Demand as long as release and governance criteria have been met.

Opinion: I think SAFe has evolved to this state. The PSI was probably once the release point, but constraining releases to a set point every 10 weeks would have attracted much criticism. I expect Deliver On Demand will need much localized thought when applying SAFe, as concepts like the HIP sprint (used to address maturity weakness) may well present some challenges.

The Good

Promotes lean and agile mindsets in a traditionally aligned corporate world

  • SAFe is based on lean and agile principles. Many large corporations that will be looking to adopt SAFe will have strong footholds in traditional delivery methods and ways of working. SAFe will help with the cultural shift required to more contemporary delivery methods and mindsets.
  • Organisational structures will typically be aligned functionally to different specialisms. Whilst SAFe does not attempt to force a change to those reporting structures it does require the creation of virtual teams (valuing empowerment, co-location and cross-functional capability).
  • SAFe promotes a shift from “command and control” style of management to leadership based on agile and lean principles such as:
    • Develop people, not things
    • Decentralise control
    • Embracing the Agile Manifesto
  • The Program and Portfolio Management team is part of the framework at the Portfolio level. This inclusion provides an opportunity to shift the traditional thinking of many corporate PMOs.

Promotes a shift to shorter term planning horizons

  • SAFe takes away the notion of projects to deliver outcomes; the delivery vehicle becomes the train (ART) to deliver the features assigned to each PSI. A PSI is 10 weeks in length (default), which is a much shorter planning horizon than most projects are set up to deliver. Knowledge based work is full of uncertainty, so shorter planning horizons with regular feedback loops and adaptive planning based on empirical evidence produce more reliable plans (does SAFe go far enough though? See the Ugly section).
  • With the shift from projects, Investment Themes become the container to which funding is assigned, Investment Themes though get broken down to Business and Architectural Epics, an epic draws down funding from the Investment Theme. This promotes:
    • An incremental funding model
    • Lightweight business cases to justify the release of funds for an epic (many project based business cases are too heavy and often would not look out of place in the “Science Fiction” section of the library).
    • A shift away from the annual funding cycle (e.g. “the bank is open once a year”) and the unethical behaviour that this can create

SAFe is an advocate for long lived delivery teams

  • Traditionally teams are created to deliver projects; they often disband at the end of projects and new teams are created to support new projects. Whenever teams form they go through different stages of development (see Bruce Tuckman’s stages of team formation). It takes a long time for a team to reach the Performing stage, and often teams never get there because they are disbanded before that stage has been reached. SAFe promotes not only long-lived delivery teams but also, at scale, long-lived teams of teams aligned to a particular value stream.

The Bad

Normalised Story Points and Velocity

  • The program level uses the Scrum model scaled up. It therefore operates a time box of 10 weeks (default) and makes use of story points and velocity for prioritisation and planning. As features at the program level can be assigned to any team a consistency in estimation units is required. SAFe looks to normalise the size of a story point across teams by anchoring 1 Story Point = 1 Ideal Day.
  • The framework describes a four-step process for normalising story point estimating. At best it is crude and simplistic, at worst it creates confusion by including 2 steps that are instead about deriving a team’s initial velocity. Whilst linked for planning purposes, Normalised Story Points and Calculating Velocity should be tackled separately along with guidance on different strategies for each. If you listen to the recorded presentation on the SAFe Foundations delivered by Dean (Q&A section at the end) it sounds like even the creator is confused between normalising story points and normalising velocity. Dean’s deck also has this bullet point:
[Image: deck bullet point on normalisation]

I wonder how many initial PSIs go awry given this confusing guidance? It can never be perfect but it definitely could be a whole lot better.

  • With regard to strategies:
    • For Normalised Story Points my preference is to avoid ideal days, break the association between story points and time from Day 1 (what different people can achieve in a day varies). We should estimate relative size but we should derive the duration (through velocity). The anchoring could instead be achieved by seeding some common people in the very initial estimation sessions across teams (this only needs to be done once).
    • For Velocity, less crude strategies could be employed like:
      • Let’s build some empirical data first
      • Let’s run the PSI planning day providing a list of prioritised features and see what the teams feel they can get done
      • Let’s simulate the planning for a few selected estimated features and see what velocity we can derive (and throw away when we have that empirical data)
  • Story point estimates are created for both Epics and Features; use of the WSJF formula for prioritisation necessitates this. Epics are things that will typically get delivered across multiple PSIs (by default a PSI is 10 weeks). Have you ever estimated something that large using planning poker and the Fibonacci sequence? If you need to estimate something that large, without breaking it down, the level of uncertainty in that estimate will be extremely high. If it’s just used for prioritisation of something large, coarser-grained T-shirt sizing might be more appropriate – reduce the time you spend estimating when all you seek is something in the right ballpark. Alternatively SAFe could consider not estimating at all: breaking the work up, measuring throughput and cycle time, and prioritising based on value is a valid alternative.
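
The no-estimate alternative mentioned above (measuring throughput and cycle time) can be sketched in a few lines of Python. This is only an illustration under assumptions: the start and finish dates are invented, the three-week observation window is assumed, and `cycle_times` and `weekly_throughput` are hypothetical helper names rather than anything SAFe defines.

```python
from datetime import date
from statistics import mean

# (start, finish) dates for completed work items; all values are invented.
completed = [
    (date(2014, 3, 3), date(2014, 3, 7)),
    (date(2014, 3, 4), date(2014, 3, 12)),
    (date(2014, 3, 10), date(2014, 3, 14)),
    (date(2014, 3, 11), date(2014, 3, 21)),
]


def cycle_times(items):
    """Elapsed calendar days from start to finish for each item."""
    return [(finish - start).days for start, finish in items]


def weekly_throughput(items, weeks):
    """Average number of items finished per week over the observed window."""
    return len(items) / weeks


times = cycle_times(completed)
avg_cycle = mean(times)
throughput = weekly_throughput(completed, weeks=3)  # assumed 3-week window

# Forecast from item count alone, with no story point estimates involved.
remaining_items = 12
weeks_needed = remaining_items / throughput

print(f"Average cycle time: {avg_cycle:.1f} days")
print(f"Throughput: {throughput:.2f} items/week")
print(f"Forecast for {remaining_items} items: {weeks_needed:.1f} weeks")
```

The forecast only holds if the remaining items are broken down to a broadly similar size to those already completed.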

Separate queuing of functional and non-functional work

  • SAFe separates Business Epics (functional type work) and Architecture Epics (typically non-functional type work) and each get queued separately by the Product Management (content authority) and Architecture groups (design authority) respectively. An agreement is made between the Business Owners (a group of 3-5 people from areas like Product, Vertical Leads, Development/Delivery Managers, Architecture and Operations) on the capacity allocation for each work type for a PSI. The WSJF formula is used to prioritise backlog items within each work type.
  • At the team level on agile deliveries I have seen two schools of thought:
  • Technical Stories as an anti-pattern – the thought here is that technical stories should be able to be written as a User Story and described in terms of the business value they provide so that the PO can prioritise.

OR

  • Technical Stories can exist but they must be sponsored by at least one User Story, they cannot exist without being associated to a User Story. When the PO prioritises that User Story it is clear which technical stories will also need to be done. For example, in a data warehouse a User Story (with acceptance criteria) might need specific data available in the BI layer, this Story might require a number of technical stories (each with their own test criteria) that promotes the data through the various warehouse layers to eventually realize the User Story.

In both these scenarios the PO, who in Scrum is responsible for realising ROI, prioritises the work; it’s up to the team to influence the PO on why some technical work must be done.

  • In the SAFe framework the Product Management Team does not prioritise all of the work and the responsibility for ROI is spread across the Business Owners.
  • The architecture team can inject work into a PSI without linkage to any particular business epic. They should still use WSJF prioritisation (which considers business value) for architectural epics and features but this prioritisation is only within the architecture class of work. Effectively some architecture work that has a lower WSJF rating can get queued ahead of some business functional work that is rated much higher. The architecture team can queue the work without reference to the product management team.
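
A small Python sketch may make this effect easier to see. It assumes a hypothetical 70/30 capacity split between business and architecture work, and the WSJF scores and sizes are invented. `fill_psi` is an illustrative helper, not something SAFe defines.

```python
def fill_psi(items, capacity):
    """Take items in WSJF order within one work type, skipping any that do not fit."""
    selected, used = [], 0
    for name, wsjf, size in sorted(items, key=lambda item: item[1], reverse=True):
        if used + size <= capacity:
            selected.append(name)
            used += size
    return selected


total_capacity = 100  # story points available in the PSI (invented)
allocation_pct = {"business": 70, "architecture": 30}  # hypothetical split

# (name, WSJF score, size in story points); all values are invented.
business = [("B1", 5.0, 40), ("B2", 4.0, 30), ("B3", 3.5, 30)]
architecture = [("A1", 2.0, 20), ("A2", 1.0, 10)]

business_plan = fill_psi(business, total_capacity * allocation_pct["business"] // 100)
architecture_plan = fill_psi(
    architecture, total_capacity * allocation_pct["architecture"] // 100
)

print("Business items planned:", business_plan)
print("Architecture items planned:", architecture_plan)
```

With these numbers B3 (WSJF 3.5) misses the PSI while A1 (2.0) and A2 (1.0) are planned, because each work type is ranked only against its own capacity allocation.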

Backlog Injection

  • Features and Stories can be injected into the Program and Team backlogs respectively without total transparency at the higher level. The fact that teams can inject work into these backlogs is a good thing, it demonstrates empowerment and localised decision making. However SAFe, despite having decentralised control as a principle, does exhibit a high level of controlled alignment in the execution of release trains (to be covered in the Ugly section) so this backlog injection is a paradox. The way it is controlled in SAFe is again through capacity allocation, which means, like architecture items, teams might not be working on the next most valuable thing.

Scaling Agile in the Enterprise requires cultural transformation

  • SAFe does pitch to some of the power brokers in large corporate organisations (e.g. executives, architecture, PMO). Many functionally aligned middle managers will not feel the same level of inclusion and feel the framework is a threat to their empires. Whilst power is often an illusion this fear will be real. The framework is missing the change management required for implementation in large corporations backed up with a toolkit of strategies and techniques to address the common challenges that will occur.

The Ugly

PSI planning days

  • At PSI boundaries there is a 2 day mass meeting (PSI planning) involving everyone involved on the release train (ART) including the management team. This meeting will be a gathering of 50 to 125 people to agree a plan for the next PSI (10 weeks by default).
  • We already know that to get anything done effectively you need 7 +/-2 people (no more than you are able to feed with 2 pizzas). Within these 2 days the teams do break out for team planning, but based on the suggested agenda only 5 hours out of the 2 days will be spent in groups of this preferred size (less than 50%). Even in these breakouts people are running between teams to discuss and resolve dependencies – really how effective can this be and how much truly gets missed?
  • This problem is a direct result of the SAFe implementation that is attempting to enforce a high level of controlled alignment by pushing features into teams rather than letting them self-organise around particular goals that need to be achieved. Features are pushed on to teams and they are given 5 hours team planning time to work out how they will collectively achieve it and then commit to the next set of PSI objectives for a 10 week (default) period. At the team level with Scrum, teams have a 4-hour timebox for sprint (iteration) planning to commit to a sprint goal for a 2-week sprint. Compare that to the 5 hours a team gets to commit to a 10-week horizon for a PSI. What’s worse, they are not committing to high-level goals that give them flexibility to shape the PSI if needed; they instead have to commit to a detailed list of PSI objectives! No wonder SAFe regards an achievement of 80% of PSI objectives a success and a process that is under control – what is the external reception however based on the expectations that are being set?
  • There needs to be greater emphasis in SAFe on eradicating the dependencies over managing the dependencies to water down the need for such mass planning days. Strategies might include how you break up the work (e.g. different teams working on different Minimum Viable Products (MVPs)), how you organise the teams (e.g. feature teams over component teams), architectural patterns that promote loose coupling, integration and deployment strategies (e.g. feature toggling).
  • Minimising dependencies and removing heavily controlled alignment in favour of alignment achieved through vision, goals, values + principles with regular feedback loops will allow teams to better self-organise and reduce the need for PSI planning days of up to 125 people. Teams will instead coordinate across much smaller groups when required based on the reduced set of dependencies that exist.
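
The planning-time comparison in the bullets above can be checked with some simple arithmetic. The Python sketch below uses the figures quoted in this blog (4 hours for a 2-week sprint, 5 hours of team breakout for a 10-week PSI); the 8-hour working day is an assumption.

```python
sprint_planning_hours = 4
sprint_weeks = 2

psi_breakout_hours = 5
psi_weeks = 10

planning_days = 2
hours_per_day = 8  # assumption: an 8-hour working day

# Team planning hours available per week of planning horizon.
sprint_ratio = sprint_planning_hours / sprint_weeks
psi_ratio = psi_breakout_hours / psi_weeks

# Share of the 2-day event spent in team-sized breakout groups.
breakout_share = psi_breakout_hours / (planning_days * hours_per_day)

print(f"Sprint planning: {sprint_ratio} hours per week of horizon")
print(f"PSI planning:    {psi_ratio} hours per week of horizon")
print(f"Breakout share of the event: {breakout_share:.0%}")
```

On these assumptions a team gets a quarter of the planning time per week of horizon that it would get in sprint planning, while being asked to commit further ahead.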

Poor stakeholder expectation setting

  • The 2 day mass PSI planning meeting starts with the content and design authorities bringing a list of their top 10 features that they will assign to the teams. It’s not the ordered program backlog of features, it’s the top 10 – expectations have already begun to be seeded.
  • The two deliverables from PSI planning are:
    • PSI Objectives – shows committed and stretch objectives for the PSI
    • PSI Program Plan – a visualisation of PSI milestones, the features that will be complete at each sprint (iteration) boundary within the PSI plus a representation of dependencies using cards and lots of red string
  • We know that when we undertake knowledge-based work there is much uncertainty (we cannot reliably predict the future); adaptive planning practices based on tight feedback loops are the way we best manage that uncertainty. Whilst the PSI objectives reflect some of that uncertainty through the inclusion of stretch objectives, they do not go far enough. The PSI objectives are too granular; they should be more coarse-grained goals that allow teams flexibility to shape their work (when required) to fit the timebox and meet that goal. Expectations have been set at too fine a level through the PSI objectives that are not sufficiently reflecting the uncertainty ahead.
  • The PSI Program Plan has set expectations with the audience about what features will land where. There is no reflection on this plan for the uncertainty that exists and the level of confidence in delivery of each feature. When release planning at the team level this is often shown through a release burnup/burndown chart and the variability is shown through optimistic, most likely and pessimistic views on the rate of travel (velocity). This communicates the uncertainty and sets expectations correctly: it can show that at a certain point of time you will have these features delivered, you will probably also have these features delivered and at a long shot you might also get these. The PSI Program Plan does nothing to communicate this uncertainty and so on its own it is badly setting stakeholder expectations.
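
To make the idea of a velocity range concrete, here is a minimal Python sketch of a team-level release forecast. The feature sizes and the three velocities are invented for illustration, and `features_done_by` is a hypothetical helper.

```python
from itertools import accumulate

# Ordered backlog: (feature, size in story points); all values are invented.
backlog = [("F1", 20), ("F2", 30), ("F3", 25), ("F4", 40), ("F5", 35)]

# Points per sprint under three views of the rate of travel (invented).
velocities = {"pessimistic": 15, "most likely": 23, "optimistic": 30}


def features_done_by(backlog, velocity, sprints):
    """Features whose cumulative size fits within velocity * sprints."""
    budget = velocity * sprints
    cumulative = accumulate(size for _, size in backlog)
    return [name for (name, _), total in zip(backlog, cumulative) if total <= budget]


sprints = 5
forecast = {
    label: features_done_by(backlog, velocity, sprints)
    for label, velocity in velocities.items()
}

for label, features in forecast.items():
    print(f"{label}: {', '.join(features)}")
```

Read across the three rows, this gives the "will have", "will probably have" and "might get at a long shot" groupings for a given date.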

NB. There is a Release burndown in the metrics section of SAFe; it is not a deliverable from PSI planning. SAFe provides no detail on how it handles different velocities of different teams to create the chart and the example only shows a single rate of travel not a range.

SAFe is not doing a good job at ensuring we “Build the Right Thing”

  • From the SAFe Program Level features are pushed down to Team backlogs. The Product Management team and Architecture are deciding which features to build and it is done without the involvement of the team until that team has had the feature pushed into their backlog. Teams are effectively treated as “order takers”, the collective collaboration between those that know what’s valuable (understand the market), know what’s usable (sit with the users) and know what’s feasible (can build it) is not occurring. The SAFe model assumes that product management and architecture are the authorities and know what to build next, in reality it is normally a collaborative game between people with the above skillsets that together create the best ideas on what to do next. The closest SAFe gets to this is at the PSI planning event where the team collaborate for 5 hours to cover scope for a 10 week period (using defaults) – this is insufficient time and in most cases will not have all the skills represented to be effective.
  • SAFe has no strength in ensuring that validated learning is part of the feedback loop. Even the best ideas are experiments and need validation. Features are built to achieve business goals, so a feature needs to include how you will measure its success. After deployment, metrics should be captured and, with the new knowledge gained (validated learning), you adapt what you build next and so evolve to “build the right thing”.

Riding the paradox

Reduce the batch size YET the Release Train (ART) is a batch

  • One of the SAFe principles is to “Reduce Batch Sizes”. SAFe does successfully reduce batch sizes by eradicating projects in favour of the release train to deliver a PSI every 10 weeks (default). SAFe does not go far enough; the release train itself is still a significant batch. Teams operating Scrum have a sprint (iteration) as their batch (typically 2 weeks). As teams become more advanced this batch is often seen as an overhead and teams look to gradually transition from timeboxes to flow using Kanban to eradicate the batch (NB. Scrumban is a good transition strategy for this shift). So SAFe enlarges the team level batch problem, scaling it up to 10 weeks (default) for a PSI. If SAFe was focused less on highly controlled alignment and focused more on dependency reduction strategies the need for this batch would disappear. The Program Level could then be managed through Kanban and flow rather than a timebox.

Decentralise control YET SAFe enforces controlled alignment

  • Another principle is to “Decentralise Control” yet SAFe’s model is supporting tightly controlled alignment. It is a push model dictating direction from Portfolio Epics to Program Features to Team Stories. The whole Program Level exists to support this through the delivery of PSIs that deliver features pushed down from above.

Conclusions

Organisations will typically be looking for a scaled agile approach for one of two reasons:

  1. They are already operating agile delivery teams at the team level and have seen the benefits and successes it provides. They now want to scale up this capability in the organisation and achieve a wider success.
  2. A large organisation with traditional roots wants to make the transformation to agile but needs a solution that provides a scaled solution beyond the team.

In either case SAFe is probably the forerunner in terms of options that will be considered as it is pitched to address these concerns.

So should you use SAFe? Well, like with most things, it depends…

I do believe SAFe has its place but I don’t think it’s what you must use in order to scale.

“Scaling Agile <> SAFe”

If I had to write a headline pitch for SAFe it would read:

“SAFe, a transitional strategy for the large corporate on the path to agility”.

More specifically the criteria might be:

  1. Large Corporation organised around traditional delivery methods that wants to undergo a large-scale cultural transformation (even though SAFe does not provide the toolset to assist with the change management required).
  2. You need highly controlled alignment from the top down on what gets delivered.
  3. You will have 5+ agile delivery teams working on a single value stream and foresee high levels of dependencies that in your context will be difficult (or take a long time) to eradicate.

NB. If you have 5+ teams overall but they work on different value streams that do not require high coordination then SAFe is probably not for you.

So if SAFe is not the option for you, what is?

You could still use SAFe and tailor heavily to suit your context – this blog might help you determine some areas in which to focus.

There are other frameworks, I know which one I am going to investigate next (another future blog)…

Alternatively the principles underlying SAFe are good but they do not originate from SAFe, there is no reason why you should not start with lean and agile principles and develop what will work in your own context.

Whatever your choice, here are some example considerations for when you scale:

  • Do smaller things and let them flow
    • Break the work up into smaller units of work (Minimum Viable Products (MVP) is the popularised term)
  • Target flow over creating large batches
    • Visualise and manage the flow of MVPs using Kanban at the Portfolio level. I would question the need for the existence of the Program level in SAFe – it may depend on your context, just understand what problem you are trying to solve.
    • Limit Work In Progress (WIP) to improve throughput to become more predictable and deliver more.
  • Coordination and alignment
    • Coordination – reduce the dependencies
      • Look at how you structure the work
        • Favour feature teams over component teams
        • Ensure different teams work on different MVPs
        • Consider architectural patterns that will help minimize dependencies and promote loose coupling
        • Consider integration and deployment techniques (e.g. feature toggling)
    • Alignment
      • Look to achieve alignment through Vision and setting goals, supported by a common set of values + principles. Providing teams with more autonomy to self-organise around the goals to be achieved will reduce the need for big 2 day planning days involving up to 125 people!
  • Create long lived teams of teams
    • It takes a long time for teams to achieve a performing stage of development, once they reach this stage keep them together – don’t break them up. At scale this translates to the creation of long-lived teams of teams.
    • Remember: we are all part of one larger team. One team may be working at one end of the boat and another team at the opposite end, but we are all on the same boat – all teams should be concerned (and looking to help) when another team has a hole at their end of the boat.
  • Create working groups (virtual teams) that focus on identified improvement areas and are able to cross-pollinate learnings across multiple teams
  • Consider the funding model to support the delivery of outcomes, can you shift to a smaller incremental funding model to support shorter term planning horizons when delivering smaller chunks of work (MVPs).
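
The point about limiting WIP can be illustrated with Little's Law, which relates average lead time to average work in progress and average throughput in a stable system. The numbers below are invented, and the sketch assumes throughput stays constant as WIP changes, which is a simplification.

```python
def average_lead_time(wip, throughput_per_week):
    """Little's Law: average lead time = average WIP / average throughput."""
    return wip / throughput_per_week


throughput = 4  # items finished per week (invented)

# Lead time in weeks for three different WIP levels.
lead_times = {wip: average_lead_time(wip, throughput) for wip in (20, 12, 8)}

for wip, weeks in lead_times.items():
    print(f"WIP {wip}: average lead time {weeks:.1f} weeks")
```

Under these assumptions, cutting WIP shortens the average lead time in direct proportion, which is one reason lower WIP tends to make delivery more predictable.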

“If you’ve lost the ability to do small things then you’ve not scaled, you’ve just gotten big and slow!!!”

8 thoughts on “Scaling Agile using the Scaled Agile Framework (SAFe)”

  1. Great overview, although I have not taken the course yet, I will have to since Agile expertise is now about certifications. I thought this was very insightful.

  2. Steve,

    Balanced, insightful, and well-written. Kudos!

    Thank you for sharing this.

  3. Thoughtful article. We have started using SAFe to address the number of synchronization issues that occurred when delivering massively complex integrated systems (hardware & software and configurations). For businesses that develop these types of systems where the systems’ PSI must be shown to be cohesive (Aerospace, Automotive, Medical Devices, etc), the synchronized release trains concept has really helped.

    You are spot on with having Feature-oriented teams vs Component teams

    With regard to the dependencies topic, while I agree that it is ideal to design these away through architecture or work allocation, there are many many scenarios where this is simply not possible. For example, in systems that are a mix of 3rd party legacy software, 3rd party new software that the teams have no control over, on Release Planning Day the teams must consider and account for these dependencies (mostly the risks due to them) in the Release Plan. Plan for Success.

    We just completed our Release Planning Day yesterday with the entire team (~45 engineers) together and it always amazes me how much proactive and spontaneous communication occurs…and how the teams really do work stuff out between each other. The Scrum Team really does get to shape how they are going to implement the Feature List and have the mindset that not only are we delivering our PSI but a PSI that is known to work with the overall systems.

    Like your article conveys, it is by no means perfect however it does provide a thought-through foundation based on lean principles for teams developing highly integrated software systems.

  4. Thanks Todd for the insights, it’s providing some benefits in your context then which is great. In my current context I’m adopting LeSS which feels much more natural. In LeSS Craig Larman is advocating feature teams and moving the dependencies to the code level, with strong technical practices, continuous integration and avoiding undone work that requires a hardening iteration. He argues that release trains add additional planning and management overhead which is why SAFe has a release management team and these big PSI planning days. As you say, SAFe is providing you with improvements but maybe it’s not the end game.
