Pflogging

the never-ending quest for pragmatic solutions, useful plans, flawless execution, and designs that endure

Managing flow

Submitted by Bryan Pflug on Sat, 11/03/2007 - 09:38.
  • Lean practices

[Image: Cars in all directions]

It is important to balance the flow of resources, parts, and information through a work cell so that the proper items reach the right people at the needed times, and so that appropriate capacity is available to act on these items, just in time. The size and expected pace of that work can vary widely. Because the nature and quality of the information that feeds production systems (in both planning and operational phases) changes over time, acting on that information before it is ready can easily create a bottleneck. In addition, work that has not yet been reconciled with other required information or dependencies may leak into production, resulting in further waste and inefficiency.

[Image: Water flowing down drain]

The primary purpose of a Kanban is to provide a visual system that highlights undesirable conditions in such a production system, so that the right actions are apparent and can be taken quickly by empowered individuals with a minimum of overhead. The kanban's goal is to achieve smooth flow, as opposed to stop-and-go behavior. For an excellent case study on applying Kanban techniques to managing flow through a development team, see this presentation from Dave Anderson of Corbis.

I attended a seminar in which one of Dave's managers, Darren Davis, discussed in detail how Corbis uses Kanbans. Darren's background information was very helpful in understanding the mechanics behind their use. Service Level Agreements (SLAs) are very important to Corbis, since they establish customer commitments for sustainment requests, and failing to meet these commitments has significant financial consequences. The work groups that act upon these requests within Corbis manage a variety of assignments and priorities (change requests, problem reports, maintenance requests, and hot fixes), and processing them at prescribed rates and flow times is key to meeting their SLAs. Some of these work items are date dependent and some are not, but the overall job mix must be managed effectively and efficiently to achieve the required throughput.

Designing a Kanban board

Corbis had previously been using a ticketing system to track information through their groups, but had problems in using their database to understand what was really going on. There were several reasons for this. For example, information in the database tended to lag what was really happening in the system. Queues of problem reports and change requests would stack up at one or more bottlenecks, but no one could tell until someone noticed that the overall rate of flow through the system was not being kept up. When this occurred (typically, when the system or its people were under high demand due to emergent problems or unusual conditions, or when they fell behind schedule), there was a tendency to 'travel work' downstream, especially when it was not apparent whether a particular item had a problem to be worked or not. When it turned out these problems did exist, they caused further problems in the system, as variability and bottlenecks tended to accumulate. Yet there was no good indication of what was causing these bottlenecks, or where they were occurring.

Corbis's approach to this situation was to begin to use a Kanban board. For an example of the different ways that Kanban boards can be used, see this article. When they initially launched this concept to support their sustainment process, they attempted to use an arrangement of the existing workflow states from their problem reporting system (some 20 in number), but found this overly complex. The work did not move at the rates they wanted, and the tracking tools were never up-to-date. Everyone was frustrated, and the time spent planning and reconciling the two systems was consuming far more resources than they could afford. They believe their migration to kanban boards as the primary control mechanism, and the associated visual aspects of using post-it notes for planning, made significant changes to their throughput, and allowed them to rapidly reconfigure their approaches 'overnight' until they found the right mix for managing their work situation.

The mechanics of creating Kanban boards are easy. They use erasable whiteboards and thin pinstriping tape from Office Depot to lay out lines on boards of various sizes (several different teams use them in different configurations), and they have found this combination allows them to easily redesign the contents of these kanbans as needed, but not be constantly redrawing the board structure itself.

Categorization and flow management

They use different colored sticky notes to distinguish their different types of work, and thus highlight whether individual jobs are customer-driven (and thus time-critical) or internally driven (typically, longer-term improvements that are desired but not required in a particular timeframe). They originally spent a lot of time trying to get the states and reports in their underlying problem tracking system to match the kanban states, then decided that tracking things at the same level in both places created too much overhead. Thus, they now track things at a high level on the board and at a lower level in the job tracking system. For example, when work transitions into a closed state, whether the job was released or withdrawn is meaningful to record, but that difference is tracked only in the job tracking system, not on the kanban board.

They have standard flows for most work, so once work shows up in the proposed queue, they have a good sense of how long a standardized task will take, and customers have come to rely on this. Only jobs that are ready for processing are allowed into the proposed queue, and only a designated number are allowed into it at any one time. This forces work upstream to be prioritized before it is put into their queue, and allows only work that is suitable for processing to be absorbed into the job processing queue. In practice, they have found that their customers, while initially balking at this idea, found that predictability was far more important to them than instantaneous response (except for expedites). They indicated that once customers came to understand the importance of well-defined work and its impact on the predictability of performance and flow through the system, the customers could live with prioritizing their work up front; their prior resistance to prioritization came from not believing that such priorities would do them any good.
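The admission rule described above (only ready work, and only a designated number of items) can be sketched roughly as follows. This is a minimal illustration rather than anything Corbis described using: the `Request` fields, the `admit` function, the numeric priority convention, and the sample job IDs are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    job_id: str
    priority: int   # lower number = more important (assumed convention)
    ready: bool     # True once the work is well-defined enough to process


def admit(backlog, proposed, limit):
    """Move ready requests into `proposed`, best priority first, up to `limit`."""
    candidates = sorted((r for r in backlog if r.ready), key=lambda r: r.priority)
    for request in candidates:
        if len(proposed) >= limit:
            break
        proposed.append(request)
        backlog.remove(request)
    return proposed


# Hypothetical upstream backlog, already prioritized by the customer.
backlog = [
    Request("CR-1", priority=2, ready=True),
    Request("CR-2", priority=1, ready=False),  # not yet well-defined, so it waits
    Request("PR-3", priority=3, ready=True),
    Request("PR-4", priority=1, ready=True),
]
proposed = admit(backlog, [], limit=2)
print([r.job_id for r in proposed])
```

Anything that is not ready, or that does not fit under the limit, simply stays upstream, which is what pushes the prioritization decision back to the customer.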

It is the responsibility of the owners of jobs on the kanban board to synchronize the information on the board with the associated database tracking systems within a defined period (in their case, by the end of the day), and at that time to ensure that their computer databases match the states indicated on the kanban board. Peer pressure seems to ensure that this now happens, whereas previously it was rare for the job tracking system to ever be up-to-date ('that which is used, is maintained'). When they first designed the tracking states on their kanban (represented as columns), they also confused individual activities that one or more persons had to perform with the workflow states they needed to track visually. They soon learned this was also inefficient. For example, since build, as an activity, didn't take long for them, there was no need to represent build work on the board; that activity was bounded by other states that were represented on the board, and could be reliably performed in a fixed time once begun.
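The end-of-day check amounts to comparing two views of the same jobs. The sketch below is only illustrative: it assumes both the board and the database can be reduced to a mapping from job ID to one of the high-level board states, and the `out_of_sync` function and the sample IDs are hypothetical.

```python
def out_of_sync(board, database):
    """Return {job_id: (board_state, database_state)} for every mismatch.

    A job missing from one side is reported with None for that side.
    """
    mismatches = {}
    for job_id in sorted(board.keys() | database.keys()):
        on_board = board.get(job_id)
        in_db = database.get(job_id)
        if on_board != in_db:
            mismatches[job_id] = (on_board, in_db)
    return mismatches


# Hypothetical snapshot taken at the end of the day.
board = {"J-101": "active", "J-102": "resolved", "J-103": "proposed"}
database = {"J-101": "active", "J-102": "active"}
for job_id, (on_board, in_db) in out_of_sync(board, database).items():
    print(f"{job_id}: board={on_board}, database={in_db}")
```

In this arrangement the board is treated as the source of truth, so each mismatch is something the job's owner fixes in the database, not on the board.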

Job processing

To represent the processing of each job on the kanban board, each post-it records one job, with a color corresponding to the job type. Written on the post-it are an ID (from their computerized job tracking system), the date of the SLA clock start (to help in self-expediting things that are late), a title, and hand-drawn icons (exclamation points, asterisks, circles, etc.) that indicate the level of risk the item is assessed to carry relative to the SLA for that task. The icons are chosen so that they can be elaborated (circles drawn around asterisks, for example) if risk escalates over time.

Issues that arise while processing a particular job (such as when a roadblock to progress is encountered) are recorded on pink sticky notes, which are placed on top of the job-tracking post-its they correspond to; these capture the actions required on work that is stalled or requires attention.

Problems found while processing a job are highlighted separately: an orange post-it is placed on top of the sticky note for the job during which the problem was discovered, and stays there until resources can be freed up to address the issue. This helps communicate quality problems 'stacking up'. The orange tickets are a visual indication of the drain that future rework will place on resources, so stacks of orange stickies across sets of items show the overall health of quality in the system and where the key offending items are. They then pull these orange post-its off the corresponding jobs (after adding the number of the impacted function as a cross-reference) and inject that work into the corresponding column as resources become available to work it, so that the processing of that change or problem is tracked on the board as a visual indicator of the rework cost of quality rippling through their system.
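For readers who think in data structures, the information carried by one post-it could be modeled along these lines. This is a sketch under stated assumptions: the `Ticket` class, its field names, the order of the `RISK_ICONS` ladder, and the sample job are all invented for illustration; the source names the icons but not the order in which they escalate.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical escalation ladder, from no risk to a circled asterisk.
RISK_ICONS = ["", "!", "*", "(*)"]


@dataclass
class Ticket:
    job_id: str          # ID from the job tracking system
    sla_start: date      # date the SLA clock started
    title: str
    color: str           # sticky-note color, one per job type
    risk_level: int = 0  # index into RISK_ICONS
    blockers: list = field(default_factory=list)  # pink notes stuck on top
    defects: list = field(default_factory=list)   # orange notes stuck on top

    def escalate(self):
        """Elaborate the risk icon one step, stopping at the top of the ladder."""
        self.risk_level = min(self.risk_level + 1, len(RISK_ICONS) - 1)

    def days_on_clock(self, today):
        """Whole days elapsed since the SLA clock started."""
        return (today - self.sla_start).days

    def label(self):
        """Text as it might appear on the note: ID, risk icon (if any), title."""
        parts = [self.job_id, RISK_ICONS[self.risk_level], self.title]
        return " ".join(p for p in parts if p)


# Made-up job used only to exercise the sketch.
ticket = Ticket("J-207", date(2007, 10, 29), "Fix thumbnail cache", color="yellow")
ticket.blockers.append("waiting on test data")
ticket.escalate()
ticket.escalate()
print(ticket.label())
print(ticket.days_on_clock(date(2007, 11, 3)))
```

Keeping the pink and orange notes as separate lists mirrors the board: a glance at how many of each are stacked on a ticket says whether it is stalled or accumulating rework.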

They have now settled in to tracking things through the following states:

  • proposed (added but not yet accepted for work),
  • active (committed to),
  • resolved (ready for release), and
  • closed.

[Image: Example kanban]

Each of these states is represented on the kanban across several spanning columns, with individual cells in the sub-columns representing work queues for sub-states. For example, work in the active state is further broken down into Analysis, Dev, and Test to indicate the status of that work. They don't track these lower-level sub-states in their problem reporting tool, but use them on the kanban to keep pressure on their teams to progress work through the system. They use a fixed queue size (4 for work in proposed, 2 for work in an active state per work cell) for each 'line' (in their case, a lead group), and represent these work cells as separate rows on the kanban. These queue size limits are based on what engineers can realistically do between daily standup sessions. They have a policy that limits who can promote things across these columns, and thus through their process sub-states, to manage the flow and variability injected into the system.
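A rough model of one row of such a board, with the queue limits quoted above, might look like the following. The `WorkCell` class and its method names are hypothetical, and the sketch assumes the limit of 2 applies to the active state as a whole rather than to each sub-state, which the text does not spell out.

```python
# Limits per work cell (row), taken from the text: 4 proposed, 2 active.
WIP_LIMITS = {"proposed": 4, "active": 2}
ACTIVE_SUBSTATES = ("analysis", "dev", "test")


class WorkCell:
    """One row of the board: the jobs a lead group holds in each state."""

    def __init__(self, name):
        self.name = name
        self.proposed = []
        self.active = {}    # job_id -> current sub-state
        self.resolved = []
        self.closed = []

    def propose(self, job_id):
        """Accept a job into the proposed queue unless the queue is full."""
        if len(self.proposed) >= WIP_LIMITS["proposed"]:
            return False
        self.proposed.append(job_id)
        return True

    def start(self, job_id):
        """Pull a proposed job into active/analysis if the limit allows."""
        if job_id not in self.proposed or len(self.active) >= WIP_LIMITS["active"]:
            return False
        self.proposed.remove(job_id)
        self.active[job_id] = ACTIVE_SUBSTATES[0]
        return True

    def advance(self, job_id):
        """Move an active job to its next sub-state, or to resolved after test."""
        position = ACTIVE_SUBSTATES.index(self.active[job_id])
        if position + 1 < len(ACTIVE_SUBSTATES):
            self.active[job_id] = ACTIVE_SUBSTATES[position + 1]
        else:
            del self.active[job_id]
            self.resolved.append(job_id)


cell = WorkCell("lead group A")
accepted = [cell.propose(f"J-{n}") for n in range(1, 6)]  # fifth job is refused
started = [cell.start(f"J-{n}") for n in range(1, 4)]     # third start is refused
for _ in ACTIVE_SUBSTATES:
    cell.advance("J-1")                                   # analysis, dev, test, done
print(accepted, started, cell.resolved, cell.active)
```

A refused `propose` or `start` is the software analogue of finding no empty cell on the whiteboard: the work waits upstream until something moves.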

Queue management

Each queue has a corresponding line manager who is responsible for working issues within that queue, and can thus recognize common patterns and deal with them over time.  The handoffs through these queues can be thought of as batons being passed in a relay race. They have learned you cannot inject unbalanced work beyond these queue sizes into their system, because it really impacts predictability. They find that expedites are still sometimes necessary to meet their SLAs, but they control their overall impact by only allowing one expedite into their system at a time. Outages, which also occasionally arise, are managed via assessments of the impact to their business, and force a reshuffling of work.
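The one-expedite-at-a-time rule is simple enough to sketch directly. The `ExpediteLane` class is hypothetical, and the waiting list is an assumption: the source says only that a single expedite is allowed into the system at a time, not what happens to further requests.

```python
class ExpediteLane:
    """Holds at most one expedited job; further requests wait their turn."""

    def __init__(self):
        self.current = None
        self.waiting = []   # assumed behavior: later requests queue up

    def request(self, job_id):
        """Admit the job if the lane is free; otherwise make it wait."""
        if self.current is None:
            self.current = job_id
            return True
        self.waiting.append(job_id)
        return False

    def finish(self):
        """Complete the current expedite and admit the next waiting one, if any."""
        done = self.current
        self.current = self.waiting.pop(0) if self.waiting else None
        return done


lane = ExpediteLane()
first = lane.request("HF-1")    # lane is free, so this is admitted
second = lane.request("HF-2")   # lane is busy, so this waits
finished = lane.finish()        # HF-1 completes and HF-2 takes the lane
print(first, second, finished, lane.current)
```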

Prior to stand-ups, people who are ready to make a hand-off move the corresponding job (post-it) below a line at the bottom of the current column (state) to highlight this situation. The job is presumed to satisfy its exit criteria, but it is held at the bottom until it is confirmed to meet the entrance criteria for the follow-on queue. For example, development groups pull stickies from the available pool at the bottom of the analysis column when they need work, and push work to the bottom of their own column when it is ready for a build; the build team then does the overnight build and, if it succeeds, moves that work into the test queue on the kanban.
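The 'below the line' hand-off is a pull mechanism, which can be sketched as a column with two areas. The `Column` class, the `pull` function, and the entrance check passed to it are hypothetical names for illustration; the real entrance criteria are whatever the receiving group requires.

```python
class Column:
    """A board column split by a line: in progress above, ready to hand off below."""

    def __init__(self, name):
        self.name = name
        self.in_progress = []
        self.ready = []

    def mark_ready(self, job_id):
        """The current owner signals that exit criteria are believed to be met."""
        self.in_progress.remove(job_id)
        self.ready.append(job_id)


def pull(source, target, meets_entrance_criteria):
    """Downstream pulls the oldest ready job that passes its entrance check."""
    for job_id in source.ready:
        if meets_entrance_criteria(job_id):
            source.ready.remove(job_id)
            target.in_progress.append(job_id)
            return job_id
    return None


analysis = Column("analysis")
dev = Column("dev")
analysis.in_progress = ["J-1", "J-2", "J-3"]
analysis.mark_ready("J-2")
analysis.mark_ready("J-1")

# Hypothetical entrance check: pretend J-2 is still missing something dev needs.
pulled = pull(analysis, dev, lambda job_id: job_id != "J-2")
print(pulled, analysis.ready, dev.in_progress)
```

A job that fails the entrance check stays below the line in its original column, where it remains visible at the next stand-up.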

Standups

Standup meetings take no longer than 15 minutes (ever); their only goal is to pair up people to take action on blocking problems, not to ask everyone to report on each individual item. No calling in via phone is allowed - "it just doesn't work - groups must get an on site rep" - and at a minimum, the queue managers (or their representatives) always need to attend. Ideally, they say, everyone is there to reinforce their interdependencies in achieving overall goals, which has also reinforced alignment of work schedules. After the 'all-hands' standup, there are often quick, small, informal face-to-face meetings (they call this the "after glow"), in which small numbers of people work on key items that result from stand-up decisions.

Scalability

They have also built a two-tier approach for scaling these kanbans to larger teams, with a front end that takes requirements (which have greater variability) and evolves them into the specific changes that are then pulled into development. They use these tiers to pass their work through filters upstream and elaborate it into roughly equivalent sizes, and thus strive to reduce variation in follow-on kanbans. The control of variability required in sustainment is not as tight as in manufacturing, but they acknowledge that this approach works much better in situations where something is working and has to be fixed or changed, as opposed to an all-new development, where nothing is working yet.

Results and future directions

They are now making functional releases every two weeks, each totalling approximately 200 changes, with just a handful of people (10% of their total staff). They feel overall flow and predictability have significantly improved, and in practice expedites have been needed only infrequently (4 of 200 changes per two-week release period, on average).

Other groups have also explored the automation of kanbans, and there are now tools available for this purpose... but caution is advised in attempting to automate too quickly. They have found that the physical act of manipulation of work in queues, and the community focus (and peer pressure) on collective achievement of SLAs, is a powerful concept. 

The philosophy, terminology, and concepts behind these kanbans are explained in more detail in the child pages of this writing. See the following articles extracted from partner sites for this information:


  • The challenges of managing flow in engineering systems
  • How a Kanban can help
  • Kanban bootstrap
  • The essential difficulty of lean scheduling
  • Visual controls to manage buffers
  • Balancing resources and concurrent pipelines
  • Synchronizing production
  • Lean's ultimate impact on development and release cycles
  • Spreadsheet example for a small Kanban team

Copyright

Copyright © 2009 Pflogging
All Rights Reserved