About Me

Experienced Information Technology leader, author, system administrator, and systems architect.

Saturday, April 6, 2013

Book Review: An Introduction to Project Management

Great PMBOK Textbook

Schwalbe has created a textbook for Project Management that is complete and well-written. There is a wealth of information in the text, and it covers most of the typical artifacts that practitioners are likely to see in the field.

This is a great book for project managers to keep in their library as a reference, and it is a great textbook.

This book sticks mostly to the kind of material, and the treatment of it, that one would expect from a reference or textbook. For experienced project managers who are looking for a good book about motivating project teams, I recommend Berkun's "Making Things Happen."

Friday, April 5, 2013

Book Review: Team Geek

The authors have drawn on their extensive experience in the open source community to describe the characteristics of a strong software development team. Their framework of Humility, Respect and Trust is a sound foundation for the long-term success of any type of team.

Their comments about removing barriers to entry were spot on. "A first-time user usually isn't thinking about whether your software is more or less powerful than a competitor's; she just wants to get something done. Quickly." Good software environments need to be all about setting up an appropriate level of abstraction, and helping most users get work done quickly. (Better software environments provide the power user a way to pull back the curtain and play directly with the gears and cogs.)

Team Geek focuses on the importance of the end user experience, which is a mind-set that successful technical people need to develop. Businesses don't pay us to play with really cool technology. They pay us to help other people get their work done.

Their descriptions of team dynamics and organizational culture were also very helpful, especially for people who may not have been part of a well-functioning team previously. The anecdotes were engaging and pointed, and were well-chosen to illustrate the particular point that the authors were making.

Thursday, April 4, 2013

Managing Meetings

There are two basic kinds of meetings that are worth having:
  • Informational/Status Meetings
  • Problem-Solving Meetings

These are two distinct types of meetings, and they need to be treated differently. But there are some things that are common to all meetings that are worth having:

  • An agenda exists, and is distributed beforehand (even if the agenda is a question or a problem statement)
  • Someone is in charge of the meeting and keeps it moving along.

For Status or Informational meetings, the purpose is to share information among the participants. If issues are mentioned, typically the people affected by those issues should schedule a breakout session or problem-solving meeting. Then the status on that issue will be updated in the next status meeting.

Problem-solving meetings are tougher. One key is to limit the participation to people who can actually help resolve the problem, or representatives of teams who would implement the solution. These meetings can easily run out of control, so it is important to try to focus the meeting while still allowing input from the participants. This can be a difficult balancing act.

Sometimes the role of the person running the meeting may be as a facilitator rather than a direct contributor. Write things on a whiteboard. Draft points for discussion, and keep the pace moving. People should be able to speak, but should not be allowed to take over. Sometimes a long-running monologue can be disrupted by asking a clarifying question that requires a yes or no answer, then asking another person in the room for their opinion on that answer.

Take Responsibility for Your Meetings

Make sure that your meetings are well-organized and focused:
  • Only call meetings that are necessary, with a clearly stated purpose.
  • Only invite people who need to be there.
  • Provide an agenda before the meeting, with ample time for participants to request clarifications or changes to the agenda.
  • Run the meeting professionally. This includes introducing participants (if needed), stating the purpose for the meeting, and laying down the ground rules before starting on the agenda. Participation should be encouraged, but the schedule should be kept. Breakout or follow-up meetings may be scheduled as needed. Minutes are distributed shortly after the meeting, including only critical issues and decisions addressed in the meeting.

Wednesday, April 3, 2013

From Techie to Boss: Transitioning to Leadership


My latest book, From Techie to Boss, is scheduled for publication on April 24.

Let’s face it. Non-technical managers just don’t understand what we do for a living. The good ones try really hard and stand up for their team, but they just don’t feel it in their bones. If technology is not stamped into your DNA, you just don’t get it.

So that means that only technical people should manage technical people, right?

Here’s the problem: technical people frequently do not make good managers. It isn’t that they aren’t smart enough; usually the best technicians are the people who are asked to step into leadership roles. But the skills that make a good techie are not necessarily the skills that make a good leader.

When you become a leader, the focus shifts. It is no longer about what you can accomplish as an individual contributor. You will be judged by your team’s accomplishments.

Good technical people have developed good study habits, a sense of responsibility, and a solid work ethic. All of these are important, and can translate into skills that will help you be a good leader. But you will only be an effective leader when you inspire your team members to reach their potential.

Moving into a leadership role can be a bumpy ride. But it can also be hugely rewarding. Make sure to approach it from the right frame of mind. It isn’t about you anymore. It is about your team.

This book lays out some of the lessons I have learned during my own transition from a front-line techie to a manager.

I welcome your stories and your suggestions about how to make the transition to management an easier one!

Scott Cromar
St Augustine, FL
25 March 2013

PS:
A pre-release ebook is available from the Apress web site. Don't worry: updated versions can be downloaded from their web site during the production process.

Book Review: Manager's Toolkit

The Manager's Toolkit is a compact reference that should be required reading for new managers.

The authors have identified the most common knowledge gaps that new managers will have. When people are promoted up from the ranks, they usually do not have formal training on management techniques, and there is seldom anyone available to help them develop the necessary skills. This book can help fill the gap.

The chapter organization makes it easy to find suggestions for different types of problems, as new managers come across them.

Manager's Toolkit covers topics at a level of depth that is appropriate for most new managers, including managers who do not have a business school background. Since the coverage is of management in general, not every piece of information will be relevant to every manager. It is still interesting, and young managers should still read the entire book to see whether those approaches can be applied to other issues they may face.

Tuesday, April 2, 2013

Root Cause Analysis

Sometimes we end up "fixing" the same problem over and over. Root Cause Analysis helps us make sure that we have actually resolved the root cause of the problem.

5 Whys

For most problems, we can get to the root cause by drilling into proposed explanations by repeatedly asking "Why?" The 5 Whys method was developed by the Toyota Motor Corporation. It is based on the observation that five iterations of asking "Why?" is usually enough to get to the root cause of most real world problems.

For example:
Problem Statement: The system crashed. (Why?)
A memory chip failed. (Why?)
The machine room temperature exceeds recommendations. (Why?)
The HVAC unit is undersized given our heat load. (Why?)
Our projections for heat load were lower than what has been observed. (Why?)
We did the heat load projections ourselves rather than bringing in a qualified expert.

Some disadvantages of the 5 Whys method are:

  • The results are not repeatable. We may well end up with different results depending on who runs the exercise. For example, what if we had answered the second "why" with some other plausible explanation?
  • We are limited to the participants' knowledge of the system. In particular, we aren't going to find any answers that the participants don't already suspect.
  • We may not ask "why?" about the right symptoms of the problem.
  • We may stop short and not proceed to the actual root cause of the problem. For example, people may stop at the point about the HVAC unit being undersized, run the estimates themselves, and promptly purchase a larger (but still undersized) unit.

Current Reality Tree

The CRT's primary components are boxes describing symptoms and arrows representing relationships between them. Symptoms are divided into Undesirable Effects (UDE) and Neutral Effects (NE). This allows us to recognize the effects of things in our environment that are not viewed as undesirable, but which may contribute to a UDE.

Arrows may flow in both directions if necessary. In particular, this allows us to identify a negative feedback loop.

Two or more symptoms may have their arrows combined with an ellipse. This means that the combination of those symptoms is sufficient to provoke the following UDE, but that all of them are required.

To build a CRT, we ask a Key Question with our Problem Statement. The question will usually be of the form "Why is this happening?" Next, we need to create a list of several Undesirable Effects which are related to the Key Question. Each symptom (UDE or NE) gets a box. Wherever we can say something like "If A, then B," we would draw an arrow from A to B. Where we can say something like "If A is combined with B, then we get C," we would draw arrows from A and B to C, then group the arrows with an ellipse.

At the lowest level of the CRT, we should ask "Why?" and continue to build the tree down until we are at the Root Causes, also known as "Problems." If the lowest level boxes are still just symptoms of an underlying problem, build down as far as possible by asking "Why?" at each stage.
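As a rough sketch of the structure described above, a CRT can be treated as a directed graph in which each "If A, then B" statement is an arrow from A to B. The Python below is only an illustration: the `arrows` list and the `find_root_causes` helper are hypothetical names, the box labels are adapted from the 5 Whys example, and the ellipse (combined-arrow) grouping is not modeled. Boxes that have no incoming arrows are the candidates for root causes.

```python
# Each tuple is one "If A, then B" arrow, from cause A to effect B.
# The labels are illustrative, adapted from the 5 Whys example.
arrows = [
    ("Heat load projections done in-house", "HVAC unit undersized"),
    ("HVAC unit undersized", "Machine room too hot"),
    ("Machine room too hot", "Memory chip failed"),
    ("Memory chip failed", "System crashed"),
]


def find_root_causes(arrows):
    """Return the boxes that have outgoing arrows but no incoming ones."""
    causes = {a for a, _ in arrows}
    effects = {b for _, b in arrows}
    return sorted(causes - effects)


print(find_root_causes(arrows))
```

A box that sits inside a feedback loop always has an incoming arrow, so this simple check will not report it. Loops still need to be examined by hand.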

Some cases, like the one diagrammed here, end up with the root cause ending in a conflict between two Neutral Effects.

Evaporating Cloud and Future Reality Diagrams

The Evaporating Cloud refers to Goldratt's method for dealing with conflicts. In particular, Goldratt discusses the Core Conflict Cloud representing the Core Conflict in our CRT.

In an Evaporating Cloud Diagram, the end goal (aka the Systemic Objective) is placed in a box on the left. The two conflicting Prerequisite Conditions are placed in boxes at the right hand side of the drawing, with a lightning bolt arrow between them. The Necessary Conditions for the Systemic Objective are placed in boxes next to their respective conflicting prerequisite conditions.

The Evaporating Cloud Diagram illustrates the age-old conflict between upgrades and system stability. On the one hand, upgrades will increase the system reliability and performance. Neglecting upgrades for too long will eventually result in system problems. On the other hand, changes always carry some risk, so there is a strong desire to avoid the pain of changes, including upgrades.

In this case, we need to recognize the end goal of providing a reliable service. Upgrades need to be performed, but should be performed in a way that allows for adequate planning and testing in order to avoid introducing problems to a working system. This sort of solution "evaporates" the cloud.

We can use this solution to build a Future Reality Tree, which is like a Current Reality Tree, but with our solution injected into the diagram.

Monday, April 1, 2013

A Troubleshooting Methodology

Troubleshooting generally consists of the following steps. Different methodologies may call them by slightly different names, but the similarities are pretty obvious.
  • Investigation
    • Problem Statement: Create a clear, concise statement of the problem.
    • Problem Description: Identify the symptoms. What works? What doesn't?
    • Identify Differences and Changes: What has changed recently? What is unique about this system?
  • Analysis
    • Brainstorm: Gather Hypotheses: What might have caused the problem?
    • Identify Likely Causes: Which hypotheses are most likely?
    • Test Possible Causes: Schedule the testing for the most likely hypotheses. Perform any non-disruptive testing immediately.
  • Implementation
    • Implement the Fix: Complete the repair.
    • Verify the Fix: Is the problem really fixed?
    • Document the Resolution: What did we do? Get a sign-off from the system owner.

Problem Statement

The problem statement must be broad enough to describe the problem, but narrow enough to focus the investigation. It should not contain value judgements. It should be a factual answer to the question "What is wrong?"

Problem Description

Gather all symptoms, including error messages, core dumps, descriptions of any service outages, and contrasting descriptions of what still works. We also need to identify the time of the incident as precisely as possible.

Identify Differences and Changes

Identify differences between the faulted system and any similar working systems. Also identify any recent changes to the system.

Brainstorm

In this stage, we need to come up with as many explanations for the problem as possible. It is sometimes helpful (especially in a group setting) to use an Ishikawa diagram to organize our thoughts so that we don't leave any possibilities unconsidered.

Generate an Ishikawa diagram by drawing a “backbone” arrow pointing to the right at the problem statement. Then attach 4-6 “ribs,” each of which represents a major broad category of items which may contribute to the problem. Each of our components should fit on one or another of these ribs.

Identify Likely Causes

We need to consider how likely each potential cause is. We should only eliminate hypotheses when they are absolutely disproven.

For more complex problems, something like an Interrelationship Diagram may be useful in identifying which potential cause might be a root cause.

Interrelationship Diagrams use boxes containing phrases describing the potential causes. Arrows between the potential causes demonstrate influence relationships between these issues. Each relationship can only have an arrow pointing in one direction. (Where the relationship's influence runs in both directions, the troubleshooters must decide which one is predominant.) Items with more “out” arrows than “in” arrows are causes. Items with more “in” arrows are effects.
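The arrow-counting rule can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the `influences` list, the item names, and the `classify` helper are all made up for the example, and the "balanced" label for ties is an assumption, since the rule above only covers items with more arrows in one direction than the other.

```python
from collections import Counter

# Hypothetical influence arrows: (influencing item, influenced item).
influences = [
    ("Undersized HVAC", "High room temperature"),
    ("Undersized HVAC", "Frequent HVAC alarms"),
    ("High room temperature", "Memory chip failures"),
    ("Memory chip failures", "System crashes"),
    ("High room temperature", "System crashes"),
]


def classify(influences):
    """Label each item by comparing its "out" and "in" arrow counts."""
    out_count = Counter(src for src, _ in influences)
    in_count = Counter(dst for _, dst in influences)
    labels = {}
    for item in sorted(set(out_count) | set(in_count)):
        outs, ins = out_count[item], in_count[item]
        if outs > ins:
            labels[item] = "cause"
        elif ins > outs:
            labels[item] = "effect"
        else:
            labels[item] = "balanced"  # tie: an assumption, not part of the rule
    return labels


for item, label in classify(influences).items():
    print(f"{item}: {label}")
```

Items labeled as causes are the ones worth investigating first as possible root causes.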

Test Possible Causes

We need to perform testing in the least disruptive fashion possible. Data should be backed up if possible before testing proceeds.

The best approach is to schedule testing of the most likely hypotheses immediately. Then start to perform any non-disruptive or minimally disruptive testing of hypotheses. If several of the most likely hypotheses can be tested non-disruptively, so much the better. Start with them.
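One way to picture this ordering is the small Python sketch below. It is illustrative only: the `Hypothesis` class, the `plan_tests` helper, the example hypotheses, and the likelihood numbers are assumptions made for the example. In practice the likelihood is a rough team estimate rather than a measured value.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    description: str
    likelihood: float  # rough estimate between 0 and 1 (an assumption)
    disruptive: bool   # True if testing needs an outage or change window


def plan_tests(hypotheses):
    """Split hypotheses into tests to run now and tests to schedule.

    Both lists are ordered from most likely to least likely.
    """
    by_likelihood = sorted(hypotheses, key=lambda h: h.likelihood, reverse=True)
    run_now = [h for h in by_likelihood if not h.disruptive]
    schedule = [h for h in by_likelihood if h.disruptive]
    return run_now, schedule


hypotheses = [
    Hypothesis("Failed memory chip", 0.6, disruptive=True),
    Hypothesis("Full filesystem", 0.3, disruptive=False),
    Hypothesis("Recent patch regression", 0.5, disruptive=False),
]

run_now, schedule = plan_tests(hypotheses)
for h in run_now:
    print("Test now:", h.description)
for h in schedule:
    print("Schedule:", h.description)
```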

In some cases, it may be possible to test the hypothesis directly in some sort of test environment. This may be as simple as running an alternative copy of a program without overwriting the original. Or it may be as complex as setting up a near copy of the faulted system in a test lab. If a realistic test can be carried out without too great a cost in terms of money or time, it can really help nail down whether we have identified the root cause of the problem.

Depending on the situation, it may even be appropriate to test out the hypotheses by directly applying the fix associated with that problem. If this approach is used, it is important to perform only one test at a time, and to back out the changes made for each failed hypothesis before trying the next one. Otherwise, you will not have a good handle on the root cause of the problem, and you may never be confident that it will not re-emerge at the worst possible moment.

Implement the Fix

The fix needs to be implemented in the least-disruptive, lowest-cost manner possible. Ideally, the fix should be performed in a way that will completely verify that the fix itself has resolved the problem.

Verify the Fix

We need to check that the problem is resolved, and also that we have not introduced any new problems. Each service in our environment should have a test suite associated with it so that we can quickly eliminate the possibility that we have introduced a new problem.

Part of this verification should include a root-cause analysis to make sure that the real problem has been resolved. Band-Aid solutions are not really solutions.

Document the Resolution

Over time, the collection of data on resolved problems can become a valuable resource. It can be referenced to deal with similar problems. It can be used to track recurring problems over time, which can help with a root cause analysis. Or it can be used to continue the troubleshooting process if it turns out that the problem was not really resolved after all.

Book Review: Enterprise Architecture as a Strategy

This book presents a raft of high-quality information about how Enterprise Architecture affects the success of several organizations. The authors summarize real-world research and draw actionable conclusions.

I first encountered this book as a graduate-level textbook in Enterprise Architecture, but I have re-read the book several times in order to find new insights about how other people had worked around problems similar to the ones I had.

This is not a beginner-level book, but experienced managers both inside and outside of IT will find this book an invaluable resource.

Sunday, March 31, 2013

Advantages and Disadvantages of Globally Dispersed Teams

More and more teams are becoming globally distributed, especially as businesses look to take advantage of lower staff salaries overseas. The decision to distribute teams globally is usually made several levels above most technology managers. The task of the technology manager is to make the environment work with the team that is available.

Based on your own experience, you can probably list several obvious advantages of having staff in the same geographic location.

  • Communications efficiency. 80% of the content of a conversation is nonverbal (body language, voice intonations, etc.), and most of that is most effectively transmitted person-to-person. Telephones let you pick up things like the tone of someone's voice, and let you get immediate feedback on statements that are made. Video conferencing lets you see some of the body language. None of these is a replacement for daily face-to-face contact, but they are tools you can use to make the situation better.
  • Ease of coordination. When you sit next to someone, there is a lot of informal communication that takes place. All of that makes it easier to coordinate activities. If you at least share a timezone, you can set up regular phone conversations to communicate. When you work across timezones, you have to think a lot more carefully about how to coordinate activities.
  • Local control. You can't "manage by walking around" when you manage people in remote locations. Your subordinates can't get immediate feedback without calling you on the phone and maybe waking you up. Employees usually don't like to wake up the boss if it isn't an absolute emergency.
  • Cultural norms. When people have a similar background, cultural differences are less likely to garble communications. Implicit assumptions about requirements are also likely to be similar.
  • Cohesion. When people are nearby, it is easier to build a sense of teamwork.
  • Local responsiveness. It is easier to get immediate attention when the rest of the team is nearby.
  • Uneven workload expectations. If all the escalations go to the remaining small group of employees at the home office, those people end up having to resolve all of the hard problems with a reduced staff.
  • Loss of core competencies. Managers need to have an understanding of which functions are strategic advantages to the organization, and which tasks can be carried out by an external person without losing a core competency.

There are also advantages to dispersing the team globally:

  • Avoid groupthink. People from different locations and cultures bring different assumptions to the table. This can help avoid groupthink, as assumptions are challenged by having a more diverse group of team members.
  • Standardized, documented processes. In order to communicate effectively, senior team members will need to be more diligent about producing high-quality documented processes and procedures.
  • More work hours in a day. “Follow the sun” scheduling has the potential to allow teammates to have better work-life balance. Once processes are set up, and a culture of trust is in place, project teams can continue to work problems around the clock.

Book Review: Offshoring Information Technology