TREASURY INSPECTOR GENERAL FOR TAX ADMINISTRATION

 

 

Service Operations Command Center Management Can Do More to Benefit From Implementing the Information Technology Infrastructure Library

 

 

 

August 16, 2011

 

Reference Number:  2011-20-078

 

 

This report has cleared the Treasury Inspector General for Tax Administration disclosure review process and information determined to be restricted from public release has been redacted from this document.

 

Phone Number | 202-622-6500

Email Address | TIGTACommunications@tigta.treas.gov

Web Site | http://www.tigta.gov

 

 

HIGHLIGHTS

 


Final Report Issued on August 16, 2011

Highlights of Reference Number:  2011-20-078 to the Internal Revenue Service Chief Technology Officer.

IMPACT ON TAXPAYERS

The Enterprise Operations organization Service Operations Command Center Branch (SOCCB) exists in part to ensure that normal information technology service operations are maintained for servers and mainframes by using the three Information Technology Infrastructure Library® (ITIL) processes of Event Management, Incident Management, and Problem Management.  If the SOCCB does not effectively implement these ITIL best practices, service outages may not be addressed efficiently and the Internal Revenue Service (IRS) will not be effectively utilizing taxpayer resources.

WHY TIGTA DID THE AUDIT

The overall objective of this review was to determine whether the SOCCB has effectively implemented ITIL best practices to ensure service delivery management for Enterprise Operations organization products and services.

WHAT TIGTA FOUND

SOCCB management incorporated Event Management, Incident Management, and Problem Management best practices into SOCCB policies and procedures and daily operations.  In addition, personnel resolved the majority of incident tickets within the required time periods. 

TIGTA analyzed the 312 Fiscal Year 2010 incident tickets worked by the systems administrators and computer support specialists and determined that 145 (46 percent) of the tickets pertained to three systems.  Most of the incidents occurred because of problems with software and were resolved by performing a system reboot or stop/restart.  TIGTA determined that the SOCCB needs to examine incident reports to identify trends within the information technology infrastructure, making its Problem Management activities proactive. 

TIGTA also determined that SOCCB management did not conduct a baseline assessment of SOCCB staffing and workload and does not have a documented strategic plan to communicate its goals and priorities with milestone and target dates.  In addition, the current performance measures do not address whether work is performed efficiently and effectively. 

WHAT TIGTA RECOMMENDED

TIGTA recommended that the Associate Chief Information Officer, Enterprise Operations, ensure that SOCCB management revises SOCCB procedures to address ticket trending, perform a staffing and workload analysis of the SOCCB, update the Enterprise Operations organization’s strategic plan whenever a SOCCB ITIL best practice is required to support the goals or objectives of the organization, ensure development and execution of a training plan, and identify and implement additional ITIL performance measures.

In their response to the report, IRS officials agreed with all of the recommendations.  The IRS plans to revise procedures to account for trending activities of its incident tickets, perform a staffing and workload analysis, update the Enterprise Operations strategic plan for Problem Management, develop and execute a training plan, and identify and implement performance measures.

August 16, 2011

 

 

MEMORANDUM FOR CHIEF TECHNOLOGY OFFICER

 

FROM:  Michael R. Phillips /s/ Michael R. Phillips

Deputy Inspector General for Audit

 

SUBJECT:  Final Audit Report – Service Operations Command Center Management Can Do More to Benefit From Implementing the Information Technology Infrastructure Library (Audit # 201020006)

 

This report presents the results of our review to determine whether the Service Operations Command Center Branch has effectively implemented Information Technology Infrastructure Library® best practices to ensure service delivery management for Enterprise Operations products and services.  This review was included in our Fiscal Year 2010 Annual Audit Plan and addresses the major management challenge of Modernization of the Internal Revenue Service.

Management’s complete response to the draft report is included as Appendix V.

Copies of this report are also being sent to the IRS managers affected by the report recommendations.  If you have questions, please contact me at (202) 622-6510 or Alan R. Duncan, Assistant Inspector General for Audit (Security and Information Technology Services), at (202) 622-5894.

 

 

Table of Contents

 

Background

Results of Review

Actions Have Been Taken to Implement Information Technology Infrastructure Library Best Practices and Provide Timely Service

Service Operations Command Center Branch Management Can Do More to Become Proactive and Closer to the Target State

Recommendation 1:

Recommendation 2:

Additional Management Actions Are Needed to Ensure Long-Term Success of the Service Operations Command Center Branch

Recommendations 3 through 5:

Appendices

Appendix I – Detailed Objective, Scope, and Methodology

Appendix II – Major Contributors to This Report

Appendix III – Report Distribution List

Appendix IV – Glossary of Terms

Appendix V – Management’s Response to the Draft Report

 

 

Abbreviations

 

FY | Fiscal Year
IRS | Internal Revenue Service
ITAMS | Information Technology Assets Management System
ITIL | Information Technology Infrastructure Library®
MITS | Modernization and Information Technology Services
RCA | Root Cause Analysis
SOCCB | Service Operations Command Center Branch

 

 

Background

 

The Enterprise Operations organization supports the Modernization and Information Technology Services (MITS) organization by providing efficient, cost effective, secure, and highly reliable computing (mainframe and server) services for all Internal Revenue Service (IRS) business entities and taxpayers.  The Enterprise Operations organization has seven organizations that work to fulfill this mission.  One of the organizations, the Enterprise Computing Center,[1] is responsible for providing support for the systems used to receive and process tax returns and payments, all infrastructure servers enterprise-wide, and application servers located in the 10 campuses and non-Enterprise Computing Center sites.

The Service Operations Command Center Branch (SOCCB) – also referred to as the Command Center – falls under the purview of the Enterprise Computing Center and ensures that normal information technology service operations are maintained for mainframes and servers.  The SOCCB consists of two sections:

·  Service Operations Command Center Section – employs 64 systems administrators and computer support specialists to perform Event Management and Incident Management activities on the IRS’s mainframes and servers.

·  Service Operations Management Section – employs 11 information technology specialists to perform Problem Management activities for the entire MITS organization and to facilitate the Service Restoration Team’s part of Incident Management.

Event Management, Incident Management, and Problem Management processes originate from a set of concepts and techniques called the Information Technology Infrastructure Library® (ITIL).  The ITIL provides a set of best practices for managing information technology services and aims to deliver those services in a way that satisfies the business requirements of the organization.  The ITIL also provides a common set of terminology for use across MITS organization operations, which helps increase customer service and reduce costs.  Because the ITIL provides general guidance for what to do rather than how to do it, it is often described as a framework or approach.  Figure 1 summarizes a timeline of events that SOCCB management highlighted regarding its accomplishments in standing up and implementing ITIL best practices over the last 5 years.

Figure 1:  Timeline of ITIL Best Practices Implementation at the SOCCB

Date | Highlighted Events
October 1, 2006 | The SOCCB stood up.  The Service Restoration Team process and limited Root Cause Analysis (RCA) support were in place at stand-up.
Fiscal Year (FY) 2007 | SOCCB management negotiated a stand-up/reassignment agreement with the National Treasury Employees Union, solicited volunteers to come to the SOCCB, and competitively announced/filled remaining vacancies.
FY 2008 | SOCCB management represented that, to their knowledge, the SOCCB was the first organization with employees outside of the Enterprise Computing Center-Martinsburg supporting workloads (without increasing authorized staffing).
May 2009 | The SOCCB established the Knowledge Database.
September 2009 | The SOCCB completed migration of monitoring/triage/resolution for Priority 1/Priority 2 service tickets for Enterprise Computing Center networked production servers.
September 2010 | The SOCCB completed migration of monitoring/triage/resolution for Priority 1/Priority 2 service tickets for Enterprise Computing Center mainframe workloads.
FY 2011 | The SOCCB implemented a process to ensure follow-up and Knowledge Database updates for all SOCCB tickets that have to be reassigned for lack of documentation.

Source:  SOCCB management.

This review was performed at the SOCCB locations in Martinsburg, West Virginia, and Memphis, Tennessee, during the period July 2010 through April 2011.  We conducted this performance audit in accordance with generally accepted government auditing standards.  Those standards require that we plan and perform the audit to obtain sufficient, appropriate evidence to provide a reasonable basis for our findings and conclusions based on our audit objective.  We believe that the evidence obtained provides a reasonable basis for our findings and conclusions based on our audit objective.  Detailed information on our audit objective, scope, and methodology is presented in Appendix I.  Major contributors to the report are listed in Appendix II.

 

 

Results of Review

 

Actions Have Been Taken to Implement Information Technology Infrastructure Library Best Practices and Provide Timely Service

The SOCCB updated its policies and procedures to incorporate ITIL best practices

In September 2010, the Chief Technology Officer outlined a goal to have the MITS organization implement ITIL best practices over the next several years.  The SOCCB has incorporated the ITIL best practice principles of Event Management, Incident Management, and Problem Management into its Concept of Operations and policies and procedures.  In addition, the SOCCB has made these best practices a part of the way it does business by utilizing a Knowledge Database.  This database provides personnel with the source for resolving incident tickets and is continually updated with new information.

Incident ticket resolutions occur within documented service level agreement time periods

The MITS organization service level agreement provides for a 4-hour response time on a Priority 1 ticket and an 8-hour response time on a Priority 2 ticket.  Our analysis of Information Technology Assets Management System (ITAMS) data showed that 305[2] of the 312 tickets worked by Command Center personnel were resolved within the documented time periods.  Figure 2 shows the average time to resolve these types of tickets.

Figure 2:  Average Resolution Times for Priority 1 and 2 Tickets

FY 2010 | Priority 1 | Priority 2
Number of Tickets Received | 50 | 262
Average Time to Resolve | 1 hour 39 minutes | 55 minutes
Response Time Per Service Level Agreement | 4 hours | 8 hours

Source:  Our analysis of ITAMS data.
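As an illustration only, the comparison behind Figure 2 can be sketched in a few lines of Python.  The ticket counts, average resolution times, and service level agreement limits come from the report; the function and variable names are hypothetical and are not part of ITAMS or any IRS system.  The report does not break out the 305 compliant tickets by priority, so only an overall rate is computed.

```python
from datetime import timedelta

# Service level agreement response times stated in the report.
SLA_LIMITS = {1: timedelta(hours=4), 2: timedelta(hours=8)}

# Figure 2 values for FY 2010.
TICKETS_RECEIVED = {1: 50, 2: 262}
AVERAGE_RESOLUTION = {
    1: timedelta(hours=1, minutes=39),
    2: timedelta(minutes=55),
}
TICKETS_WITHIN_SLA = 305  # reported for all tickets, not by priority


def within_sla(priority: int, elapsed: timedelta) -> bool:
    """Return True when the elapsed time is inside the SLA window."""
    return elapsed <= SLA_LIMITS[priority]


def compliance_rate(met: int, total: int) -> float:
    """Percentage of tickets resolved within the documented time periods."""
    return 100.0 * met / total


total_tickets = sum(TICKETS_RECEIVED.values())
overall_rate = compliance_rate(TICKETS_WITHIN_SLA, total_tickets)

print(f"{TICKETS_WITHIN_SLA} of {total_tickets} tickets "
      f"({overall_rate:.1f}%) were resolved within the SLA")
for priority, average in AVERAGE_RESOLUTION.items():
    headroom = SLA_LIMITS[priority] - average
    print(f"Priority {priority}: average {average}, headroom {headroom}")
```

On these figures, roughly 98 percent of the tickets met the documented time periods, and both averages fall well inside their limits.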

While the SOCCB has implemented Event Management, Incident Management, and Problem Management best practices into its environment, additional improvements are needed to show continued progress with and demonstrate efficiencies gained from implementing them.

Service Operations Command Center Branch Management Can Do More to Become Proactive and Closer to the Target State

We reviewed the incident tickets worked by SOCCB personnel during FY 2010, the current staff size within the SOCCB, and a recent independent contractor’s assessment of the SOCCB’s efforts to adopt ITIL best practices to evaluate whether the SOCCB is moving closer to the target state.  Our analyses found that SOCCB management can improve the following activities to become more proactive.

Command Center Section personnel should examine incident reports to identify trends within the information technology infrastructure

We analyzed the 312 FY 2010 incident tickets worked by the systems administrators and the computer support specialists looking for trends among the project type, cause of the incident, and how the incident was resolved.  Figure 3 shows that 145 (46 percent) of the tickets pertained to 3 systems.  Most of the incidents occurred because of problems with software (e.g., system, server, or application) and were resolved by performing a system reboot or stop/restart. 

Figure 3:  Top Three Causes and Most Common Resolution of Incident Tickets by System Name

System Name | # of Incident Tickets | Top Three Ticket Causes | Most Common Resolution
Account Management Services | 63 | Application Software (28); System Error[3] (8); No Trouble Found (8) | Stop/Restart and Reboot
E-Services | 53 | Unknown (18); Application Software (12); Other[4] (10) | Reboot
Totally Automated Personnel System | 29 | System Error (22); Application Software (3); Server Software (3) | Stop/Restart and Reboot
Total | 145 | |

Source:  Our analysis of ITAMS data.
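The 46 percent figure can be reproduced from the counts in Figure 3.  The short Python sketch below uses only the numbers reported above; the names are hypothetical and chosen for illustration.

```python
# Incident ticket counts by system, taken from Figure 3.
TICKETS_BY_SYSTEM = {
    "Account Management Services": 63,
    "E-Services": 53,
    "Totally Automated Personnel System": 29,
}
TOTAL_FY2010_TICKETS = 312  # all FY 2010 incident tickets analyzed


def share_of_total(count: int, total: int) -> float:
    """Express a ticket count as a percentage of all tickets."""
    return 100.0 * count / total


top_three_total = sum(TICKETS_BY_SYSTEM.values())
top_three_share = share_of_total(top_three_total, TOTAL_FY2010_TICKETS)

# List the systems from most to fewest tickets.
for system, count in sorted(TICKETS_BY_SYSTEM.items(),
                            key=lambda item: item[1], reverse=True):
    share = share_of_total(count, TOTAL_FY2010_TICKETS)
    print(f"{system}: {count} tickets ({share:.1f}%)")
print(f"Top three systems: {top_three_total} tickets ({top_three_share:.0f}%)")
```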

When asked to provide details about the incident ticket trends we observed, SOCCB management explained that the high volume of tickets for Account Management Services occurred partly because of a new release of the system for which Command Center personnel had to perform the work.  In addition, Account Management Services generally has the most users, thus increasing the likelihood for generating more incident tickets.  SOCCB management also described e-Services as a complex system requiring a MITS-wide approach to resolution, while simultaneously implementing workarounds (e.g., reboots) to minimize impact to the customer until the underlying root cause was addressed.

According to ITIL foundations, Problem Management is both a reactive and proactive process.  Reactive means solving problems uncovered by Incident Management or other sources, whereas proactive means looking for potential problems before they are reported as an incident.  The Operations Management Section of the SOCCB conducts RCAs to facilitate its Problem Management activities and mitigate the impact of system downtime/lost productivity by identifying the root cause, a workaround, and ultimately a permanent solution.  However, the current RCA process performed by the Operations Management Section is reactive and only performed if requested by an internal MITS organization stakeholder or if directed by an IRS executive. 

Reviewing incident ticket data to discover the types of problems that occur more frequently can help the SOCCB identify problems that may occur in other places within the IRS’s information technology infrastructure, as well as show that repeated failures have not been adequately resolved and are likely to continue to occur.  This move towards proactive Problem Management will help the SOCCB improve its overall effectiveness by posturing itself to identify problems before they occur and result in outages or lost productivity to its customers.
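One way such a review might be carried out is sketched below in Python.  This is a minimal illustration, not the SOCCB's process: the ticket records are hypothetical sample data rather than ITAMS data, and the record fields, the function name, and the threshold of three occurrences are all assumptions.  The sketch counts how often each system and cause combination recurs and lists the combinations that meet the threshold as candidates for an RCA.

```python
from collections import Counter
from typing import NamedTuple


class Ticket(NamedTuple):
    """Hypothetical incident record; field names are assumptions."""
    system: str
    cause: str
    resolution: str


def recurring_patterns(tickets, threshold):
    """Return (system, cause) pairs seen at least `threshold` times,
    most frequent first, as candidates for root cause analysis."""
    counts = Counter((t.system, t.cause) for t in tickets)
    return [(pair, n) for pair, n in counts.most_common() if n >= threshold]


# Hypothetical sample records; not actual ITAMS data.
SAMPLE_TICKETS = [
    Ticket("System A", "Application Software", "Reboot"),
    Ticket("System A", "Application Software", "Stop/Restart"),
    Ticket("System A", "Application Software", "Reboot"),
    Ticket("System B", "System Error", "Reboot"),
    Ticket("System B", "Unknown", "Reboot"),
]

candidates = recurring_patterns(SAMPLE_TICKETS, threshold=3)
for (system, cause), occurrences in candidates:
    print(f"RCA candidate: {system} / {cause} ({occurrences} tickets)")
```

A repeated reboot for the same system and cause is the kind of pattern that suggests the underlying failure has not been permanently resolved.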

Figure 4 shows a recent ITIL assessment completed by an independent contractor that identified the current and target state for each SOCCB ITIL best practice.  The independent contractor’s assessment supports our observations that the current state of Problem Management activities within the SOCCB is reactive and notes the desired target state of trending incidents as part of proactive Problem Management.

Figure 4:  Current and Future State of ITIL Best Practices at the SOCCB

Best Practice | Current State | Target State
Event Management | The SOCCB reacts to and investigates service outages after they have been reported.  Some events automatically create incident tickets. | The SOCCB resolves events before user notices/reports problem.  Correlate repeat events to provide for additional automatic escalation to an incident.
Incident Management | All incidents are logged and tracked.  Basic service level agreements are in place to monitor resolution. | Incidents proactively tracked and escalated to ensure service level agreements are met.
Problem Management | The bulk of problem resolution is recurring incident-based as opposed to proactive Problem Management. | Trend incidents to create problems on an ongoing basis.  Report incident trends and resolutions through proactive management.

Source:  Enterprise Operations organization, ITIL Maturity Assessment dated September 30, 2010.

The Government Accountability Office’s Internal Control Management and Evaluation Tool[5] explains that managers at all activity levels should review performance reports, measure results against targets, and analyze trends. 

SOCCB management stated they currently do not have sufficient staffing in the Operations Management Section to move more toward proactively identifying recurring incidents and creating RCA tickets that will identify why problems occur and provide a permanent solution.  To address the staffing issue, Enterprise Operations organization management negotiated the transfer of vacant positions from other MITS organization programs to help perform this kind of work.  

SOCCB management needs to baseline their staffing with workload

The SOCCB formally stood up in FY 2007, with approximately 25 personnel operating in a 24 hours a day, 7 days a week, 365 days a year environment.  At that time, SOCCB management submitted a request for organization change indicating that a total of 92 personnel (systems administrators, information technology specialists, and customer support specialists) were needed to support program activities.  As of July 2010, the SOCCB employed 75 personnel to handle Event Management, Incident Management, and Problem Management activities.

The 64 systems administrators and computer support specialists (who handle Event and Incident Management) support three different workloads split between mainframes and servers.  According to SOCCB management, there are more than 3,300 servers and 15 different mainframes that require monitoring.  During FY 2010, these personnel resolved 312 incident tickets (an average of 5 tickets per employee), updated the Knowledge Database, performed event monitoring activities, and reviewed and updated more than 700 Probe and Response guides.  According to SOCCB management, these personnel also assist with other work (e.g., communicating with stakeholders regarding changes to monitoring thresholds) that supports Event and Incident Management activities.  SOCCB management also indicated that the amount of time spent per activity varies based on the task/situation.
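The ratios implied by these figures can be worked out as follows.  The Python sketch below uses only numbers stated in this report; the variable names are hypothetical, and the server count is treated as a lower bound because the report says "more than 3,300."  These simple ratios are not a staffing and workload analysis, since they ignore the time spent on monitoring, Knowledge Database updates, and other work.

```python
# Figures stated in the report.
REQUESTED_STAFF = 92        # total requested in the organization change request
STAFF_JULY_2010 = 75        # SOCCB personnel as of July 2010
EVENT_INCIDENT_STAFF = 64   # systems administrators and support specialists
INCIDENT_TICKETS = 312      # FY 2010 incident tickets resolved
SERVERS_MONITORED = 3300    # lower bound: "more than 3,300 servers"

# Average incident tickets per Event/Incident Management employee.
tickets_per_employee = INCIDENT_TICKETS / EVENT_INCIDENT_STAFF

# Share of the originally requested staffing level that was on board.
staffing_fill_rate = 100.0 * STAFF_JULY_2010 / REQUESTED_STAFF

# Minimum number of monitored servers per Event/Incident Management employee.
servers_per_employee = SERVERS_MONITORED / EVENT_INCIDENT_STAFF

print(f"Tickets per employee: {tickets_per_employee:.1f}")
print(f"Staffing versus request: {staffing_fill_rate:.1f}%")
print(f"Servers per employee (at least): {servers_per_employee:.0f}")
```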

According to industry expert Gartner Group,[6] “developing a portfolio of standardized information technology services with repeatable process methodologies for service delivery and support will help information technology organizations improve information technology service quality, reduce costs, and increase business value and agility,” and “a cost-effective information technology organization can do more with less – or at least do more with the same.”

One way for an organization to measure its success and know if it is “doing more with less” is to perform a staffing and workload analysis.  When the SOCCB stood up in FY 2007, it did not initially baseline its staffing needs against its workload, and it has yet to perform a staffing and workload analysis.  Having this information readily available will allow SOCCB management to demonstrate to upper management the direct relationship between staffing and support levels and the impact any reductions or increases in staffing might have on service levels.

Management’s Responsibility for Internal Control (Office of Management and Budget Circular A-123) states that managers are responsible for increasing productivity and controlling costs of agency operations.  The Government Accountability Office’s Standards for Internal Control in the Federal Government[7] stipulates that program managers need both operational and financial data to determine whether they are meeting their goals for effective and efficient use of resources. 

Recommendations

The Associate Chief Information Officer, Enterprise Operations, should:

Recommendation 1:  Ensure SOCCB management revises SOCCB standard operating procedures to account for MITS-wide trending activities of its incident tickets to help identify repeat occurrences of incidents that can be used to proactively address problems, increase efficiency, and result in fewer repeat incidents in the future.

Management’s Response:  The IRS agreed with this recommendation.  The SOCCB has completed the hiring of the additional Problem Management staff, and all were onboard on June 19, 2011.  The SOCCB will revise the standard operating procedures to account for MITS-wide trending activities of its incident tickets.

Recommendation 2:  Perform a staffing and workload analysis of the SOCCB to demonstrate the relationship between staffing and service levels and help identify opportunities to improve information technology service quality, reduce costs, and increase business value.

Management’s Response:  The IRS agreed with this recommendation.  Currently, the SOCCB has the minimum staff per each work specialty and, due to budgetary, National Treasury Employees Union, and Labor Relations constraints, is unable to make any adjustments to the staffing model.  However, the SOCCB will perform a staffing and workload analysis to determine the relationship between staffing and service levels and help identify opportunities to improve information technology service quality, reduce costs, and increase business value in anticipation of the ability to acquire or adjust resources.

Additional Management Actions Are Needed to Ensure Long-Term Success of the Service Operations Command Center Branch

We reviewed the organizational goals, personnel training history, and currently established performance measures to evaluate whether the SOCCB can effectively and efficiently accomplish its mission.  Our review found that SOCCB management can make improvements in the following areas to help ensure future program success.

The SOCCB needs a strategic plan and vision for maturing the implementation of the ITIL best practices

Although SOCCB management has incorporated the ITIL best practice principles of Event Management, Incident Management, and Problem Management into SOCCB daily operations and updated its policies and procedures, they have not documented any milestones leading to maturing these best practices or long-range goals that will show the intended benefits.  SOCCB management has been more focused on getting policies and procedures updated and responding to new workload requests versus documenting a baseline organization, vision for the future, and a plan to show how they will attain their future vision.  SOCCB management has developed an action plan for when they transition new work, but the plan does not show specific time periods associated with completing the transition (of planned new workloads) or the desired outcomes of the transition efforts to include how those efforts will help the SOCCB most efficiently and effectively leverage and mature implementation of ITIL best practices. 

The first quarter FY 2011 version of the Enterprise Operations organization’s 5-year strategic plan,[8] which SOCCB management provides input to, contains two objectives and goals regarding the ITIL.  To support the objective of improving filing season execution, the strategic plan states a goal of continuously improving service to customers and identifies that the SOCCB will expand its Problem Management activities.  However, the strategic plan does not contain any milestones or target dates for the expansion, disclosing that these actions are pending the additional resources negotiated from the other MITS organizations.  In addition, the strategic plan states another goal of delivering improved business capabilities and governance, identifying that building the foundation of the ITIL will help strengthen program management capabilities and accountability.  The strategic plan does not contain a description of what this means, reflect any milestones or target dates, or provide a list of anticipated benefits.

According to industry best practices, to be successful in the ITIL, a deliberate, well-planned project management approach should be adopted.  Key activities in this approach include creating an overall vision and strategy, developing a project plan and managing it effectively, and assigning accountability for desired outcomes.  A strategic plan outlines an organization’s priorities and communicates to employees, internal stakeholders, and external stakeholders how it plans to accomplish those priorities.  Without a documented plan and strategy for maturing ITIL best practices relevant to the SOCCB, it will be difficult for management to demonstrate their efforts to make Command Center processes more efficient and influence decisions among their stakeholders. 

Personnel need customized training to effectively implement the ITIL

In our review of the training records, we determined that 44 of the 50 SOCCB personnel who received ITIL training completed an ITIL foundations course.  The foundations course is an entry-level course designed to provide candidates with a general awareness of the key elements, concepts, and terminology used in the ITIL.  We also determined that almost 90 percent (44 of 50) of the SOCCB personnel received training from 2 years to 4 years ago (during Calendar Years 2006 through 2008).  In addition, some personnel completed training in an ITIL best practice area (e.g., financial and security management, service level, and capacity management) that did not align with the specific work completed by the SOCCB.  According to SOCCB management, this training was completed as part of mandatory annual Federal Information Security Management Act[9] security training requirements.

Industry best practices require that ITIL training be customized to suit individual roles and responsibilities.  SOCCB management previously expressed concerns about their ability to fund ITIL training for their personnel, and this might explain why a formal training plan has not been developed and implemented.  Creating and executing a training plan will allow SOCCB management to ensure that their personnel consistently remain a valuable part of a highly skilled and high-performing workforce to support the SOCCB and IRS missions.

Additional measures are needed to capture the improved efficiency and effectiveness resulting from ITIL implementation

The SOCCB workload primarily consists of monitoring operations and working incident tickets.  An incident ticket is routed to systems administrators or computer support specialists by one of three methods:  1) a user calls in a ticket, 2) the ITAMS generates a ticket, or 3) the Command Center receives an alert announcing, for example, that a server is going down.  An alert is designed to prevent/minimize a work stoppage.  When the SOCCB receives a ticket, personnel will search for a resolution via the Knowledge Database.  Generally, systems administrators and computer support specialists have 30 minutes to close and/or reassign an incident ticket before it is elevated to a subject matter expert. 
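The ticket handling described above can be modeled roughly as shown in the Python sketch below.  It is an illustration under stated assumptions, not a description of ITAMS behavior: the class and function names are hypothetical, and treating exactly 30 minutes as the point of elevation is an assumption, since the report says only that personnel "generally" have 30 minutes.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class TicketSource(Enum):
    """The three routing methods named in the report."""
    USER_CALL = "user called in the ticket"
    ITAMS_GENERATED = "ITAMS generated the ticket"
    ALERT = "Command Center received an alert"


# Time generally allowed to close and/or reassign a ticket.
TRIAGE_WINDOW = timedelta(minutes=30)


@dataclass
class IncidentTicket:
    """Hypothetical ticket record; field names are assumptions."""
    source: TicketSource
    elapsed: timedelta
    closed: bool = False
    reassigned: bool = False


def next_step(ticket: IncidentTicket) -> str:
    """Decide what happens to a ticket under the 30-minute guideline."""
    if ticket.closed:
        return "closed"
    if ticket.reassigned:
        return "reassigned"
    if ticket.elapsed >= TRIAGE_WINDOW:  # boundary handling is an assumption
        return "elevate to subject matter expert"
    return "search Knowledge Database"


new_alert = IncidentTicket(TicketSource.ALERT, timedelta(minutes=10))
stale_call = IncidentTicket(TicketSource.USER_CALL, timedelta(minutes=45))
print(next_step(new_alert))
print(next_step(stale_call))
```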

Currently, SOCCB management has established the following goals to measure SOCCB performance each fiscal year:

·  Tickets Opened for Errors Recognized Through Event Management.  Goal:  20 percent increase by 4th quarter (compared to 1st quarter).

·  Probe and Responses Updated to Reflect the SOCCB as the Primary Assignment Group for Priority 1 and 2 Tickets.  Goal:  100 percent for Enterprise Computing Center networked servers.

·  Number of Priority Tickets Triaged/Closed.  Goal:  80 percent worked/closed without reassignment.

However, the first 2 of the 3 measures are broad goals that do not address whether the work is performed efficiently [i.e., work completed within reported time periods (e.g., number and percentage of tickets resolved in 30 minutes)] or effectively (i.e., quality resolutions or nonrecurrence of problems).  In addition, none of the current measures allow SOCCB management to track continuous improvements, and there are no measures to ensure that the SOCCB is meeting its RCA goals.[10]
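To make the distinction concrete, the Python sketch below computes one possible efficiency measure (number and percentage of tickets resolved within 30 minutes) and one possible effectiveness measure (share of tickets that repeat an earlier system and cause combination).  Both measures, the function names, and the sample values are hypothetical illustrations; they are not measures the SOCCB has adopted, and the data are not ITAMS data.

```python
def resolved_within(durations_minutes, limit_minutes=30):
    """Return the count and percentage of tickets resolved within the limit."""
    if not durations_minutes:
        return 0, 0.0
    count = sum(1 for minutes in durations_minutes if minutes <= limit_minutes)
    return count, 100.0 * count / len(durations_minutes)


def recurrence_rate(ticket_keys):
    """Percentage of tickets whose (system, cause) key appeared earlier."""
    if not ticket_keys:
        return 0.0
    seen = set()
    repeats = 0
    for key in ticket_keys:
        if key in seen:
            repeats += 1
        else:
            seen.add(key)
    return 100.0 * repeats / len(ticket_keys)


# Hypothetical sample data for illustration only.
SAMPLE_DURATIONS = [12, 25, 31, 48, 9, 30, 75, 18]
SAMPLE_KEYS = [
    ("System A", "Application Software"),
    ("System A", "Application Software"),
    ("System B", "System Error"),
    ("System A", "Application Software"),
]

count_in_30, percent_in_30 = resolved_within(SAMPLE_DURATIONS)
repeat_percent = recurrence_rate(SAMPLE_KEYS)
print(f"Resolved within 30 minutes: {count_in_30} ({percent_in_30:.1f}%)")
print(f"Recurring incidents: {repeat_percent:.1f}%")
```

Tracked each quarter, measures of this kind would show whether resolution times and repeat incidents are improving.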

Industry best practices emphasize that identifying the appropriate measures, creating a process for collecting and analyzing the data, and effectively using the data to guide and direct continued improvement are essential to establishing a successful measurement process.  Meaningful key performance indicators should align with organizational goals and provide insight into the following:

Also, metrics should be specific, measurable, attainable, realistic, and time driven.  Metrics help to ensure that the process in question is running effectively and efficiently.

Figure 5 shows the SOCCB experienced declines in each of its measures from FY 2009 to FY 2010. 

Figure 5:  Comparison of FYs 2009 and 2010 Measures

Goal/Measure | FY 2009 | FY 2010
Tickets Opened for Errors Through Event Management | 9,432 | 8,241
Probe and Response Guide Changed | 140 | 71
Priority Tickets Closed | 86% | 69%

Source:  Enterprise Computing Center web site.
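The size of each decline can be computed directly from Figure 5.  The Python sketch below uses the reported values; the names are hypothetical.  Comparing the Priority Tickets Closed percentage against the 80 percent goal assumes that the Figure 5 measure corresponds to the "worked/closed without reassignment" goal listed earlier, which the report does not state explicitly.

```python
# Figure 5 values.
TICKETS_OPENED = {"FY 2009": 9432, "FY 2010": 8241}
GUIDES_CHANGED = {"FY 2009": 140, "FY 2010": 71}
PRIORITY_CLOSED_PERCENT = {"FY 2009": 86, "FY 2010": 69}
CLOSED_GOAL_PERCENT = 80  # goal: 80 percent worked/closed without reassignment


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, as a percentage of old."""
    return 100.0 * (new - old) / old


tickets_change = percent_change(TICKETS_OPENED["FY 2009"],
                                TICKETS_OPENED["FY 2010"])
guides_change = percent_change(GUIDES_CHANGED["FY 2009"],
                               GUIDES_CHANGED["FY 2010"])
# For a measure already expressed as a percentage, report the
# difference in percentage points rather than a relative change.
closed_point_change = (PRIORITY_CLOSED_PERCENT["FY 2010"]
                       - PRIORITY_CLOSED_PERCENT["FY 2009"])

print(f"Tickets opened: {tickets_change:.1f}%")
print(f"Probe and Response guides changed: {guides_change:.1f}%")
print(f"Priority tickets closed: {closed_point_change} percentage points")
for year, value in PRIORITY_CLOSED_PERCENT.items():
    status = "met" if value >= CLOSED_GOAL_PERCENT else "not met"
    print(f"{year}: {value}% closed, goal {status}")
```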

When asked about the decline in the percentage of priority tickets closed, SOCCB management attributed this to the types of tickets the SOCCB received during the later months of the fiscal year, introduction of a new workload, and more complete ticket reporting. 

A September 2010 independent contractor assessment of the ITIL within the Enterprise Operations organization also identified additional performance measures that will allow the SOCCB to ensure it makes progress in achieving its future state. 

According to SOCCB management, there is limited staffing available to perform the ticket trending that would lead to identifying new performance measures to implement.  There is currently only one person available to do any ticket trending, and this individual has other duties to perform within the SOCCB.  Another reason SOCCB management has not implemented additional measures is an ongoing initiative to improve the accuracy of ITAMS incident ticket data.  The Customer Relationship and Service Delivery staff began this process in summer 2009 by training its staff on ticket accuracy and in fall 2009 targeted Priority 1 and Priority 2 tickets to ensure the accuracy of the data input into key fields (e.g., ticket start time, ticket stop time, and cause code).

Without effective measures to demonstrate continued improvements and quantify cost savings resulting from implementing new processes, like the ITIL, the SOCCB is at risk of being unable to fully support the MITS organization in its efforts to effectively reallocate operational savings to program enhancements. 

Recommendations 

The Associate Chief Information Officer, Enterprise Operations, should:

Recommendation 3:  Update the Enterprise Operations organization strategic plan whenever an SOCCB ITIL best practice is required to support the goals or objectives of the organization.  The update needs to address the goals, as well as milestone and target dates, completion dates, benefits, and any associated risks.

Management’s Response:  The IRS agreed with this recommendation.  The SOCCB will update the Enterprise Operations strategic plan for Problem Management whenever an SOCCB ITIL best practice is required to support the goals or objectives of the organization.  

Recommendation 4:  Ensure SOCCB management develops and executes a training plan to ensure personnel continue to receive customized training in ITIL best practices relevant to the Command Center.

Management’s Response:  The IRS agreed with this recommendation.  The SOCCB will develop and execute a training plan to ensure personnel continue to receive customized training in ITIL best practices relevant to the Command Center.  The training plan will be based on available online courses and other courses as budget constraints allow.

Recommendation 5:  Identify and implement performance measures that will demonstrate the efficiencies and effectiveness of implementing Event Management, Incident Management, and Problem Management.

Management’s Response:  The IRS agreed with this recommendation.  The SOCCB will identify and implement performance measures that will demonstrate the efficiencies and effectiveness of implementing Event Management, Incident Management, and Problem Management.  

 

Appendix I

 

Detailed Objective, Scope, and Methodology

 

Our overall objective was to determine whether the SOCCB had effectively implemented ITIL[11] best practices to ensure service delivery management for Enterprise Operations products and services.  In prior audits,[12] our overall assessment has been that ITAMS data are of undetermined reliability.  However, in our opinion, using these data did not weaken our analysis or lead to an incorrect or unintentional message.  Prior audit reports included language that clearly stated the data limitations.  To accomplish our objective, we:

I.                   Reviewed SOCCB program management controls over ITIL implementation.

A.    Interviewed management to obtain their understanding of ITIL best practices, how they communicated to employees and stakeholders regarding implementation of the ITIL, and the training provided to employees.

B.     Reviewed standard operating procedures, the Internal Revenue Manual, and the Concept of Operations to determine whether SOCCB policies and procedures were updated to reflect ITIL best practices.  In addition, we reviewed documentation which showed what benefits the SOCCB expected to accomplish by implementing the ITIL.

C.     Evaluated the process the SOCCB has in place to ensure continuous improvements and whether those improvements are delivered.

II.                Determined whether the SOCCB is performing its Event Management, Incident Management, and Problem Management functions in accordance with the ITIL.

A.    Reviewed Event Management statistics maintained on the Enterprise Computing Center web site and interviewed SOCCB management about Event Management activities.

B.     Determined the maturity/status of the SOCCB’s Incident Management activities.

1.      Analyzed all 312 FY 2010 Priority 1 and Priority 2 tickets obtained from SOCCB management and the ITAMS to identify average ticket resolution time.

2.      Interviewed Customer Relationship and Service Delivery function management about the reports it generates for the SOCCB.

3.      Using the ITAMS, selected a judgmental sample of 20 from 97 Priority 1 and Priority 2 tickets resolved during FY 2010 and traced them to the Knowledge Database to ensure it was updated.  We selected tickets from those systems that had a higher volume of reported incidents (e.g., Account Management Services, E-Services, and the Totally Automated Personnel System) and for which the causes of the incidents included System Error, Unknown, or Application Software.  We used judgmental sampling because we did not intend to project our results.

C.     Interviewed SOCCB management about their Problem Management activities.

D.    Reviewed policies and procedures that define how the SOCCB adds new services and applications to its workload.

III.             Determined whether the SOCCB is monitoring and measuring program performance in accordance with the ITIL.

A.    Interviewed management to determine how they monitored and measured program performance for FY 2010.  We determined how and when results are communicated to employees and stakeholders.

B.     Compared SOCCB measures to determine whether they align with Enterprise Operations organization goals.

C.     Identified the long-range goals that management envisions will show progress in SOCCB operations based on ITIL implementation.

D.    Identified the type of quality review process performed and how that information influences overall performance.

E.     Determined whether SOCCB management performs any trend analyses of performance metrics to ensure repeat incidents have been identified and resolved.

F.      Evaluated how SOCCB management quantifies the efficiencies they gain through proactive monitoring (i.e., cost savings).
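The average ticket resolution time in Step II.B.1 and the repeat-incident trending in Step III.E can be illustrated with a short sketch.  This is not the procedure used in the audit; the record layout, field names, sample values, and function names are all hypothetical, chosen only to mirror the key ticket fields named in this report (ticket start time, ticket stop time, and cause code).

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; these are NOT actual ITAMS data.
tickets = [
    {"priority": 1, "start": "2010-03-01 08:00", "stop": "2010-03-01 10:30", "cause": "System Error"},
    {"priority": 2, "start": "2010-03-02 09:00", "stop": "2010-03-02 15:00", "cause": "Unknown"},
    {"priority": 2, "start": "2010-03-05 13:00", "stop": "2010-03-05 14:30", "cause": "System Error"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed timestamp format


def resolution_hours(ticket):
    """Elapsed hours between a ticket's start and stop times."""
    start = datetime.strptime(ticket["start"], TIME_FORMAT)
    stop = datetime.strptime(ticket["stop"], TIME_FORMAT)
    return (stop - start).total_seconds() / 3600


def average_resolution_hours(records):
    """Mean resolution time, in hours, across the given tickets."""
    return sum(resolution_hours(t) for t in records) / len(records)


def repeat_causes(records):
    """Cause codes that appear on more than one ticket."""
    counts = Counter(t["cause"] for t in records)
    return {cause: n for cause, n in counts.items() if n > 1}


print(f"Average resolution time: {average_resolution_hours(tickets):.2f} hours")
print(f"Repeat cause codes: {repeat_causes(tickets)}")
```

A calculation of this kind depends on accurate start time, stop time, and cause code fields, which is why ticket data accuracy matters for any trending effort.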

Internal controls methodology

Internal controls relate to management’s plans, methods, and procedures used to meet their mission, goals, and objectives.  Internal controls include the processes and procedures for planning, organizing, directing, and controlling program operations.  They include the systems for measuring, reporting, and monitoring program performance.  We determined the following internal controls were relevant to our audit objective:  the MITS organization’s policies and procedures for implementing an effective SOCCB to address the critical issues of addressing service outages efficiently and utilizing taxpayer resources effectively.  We evaluated these controls by interviewing management and reviewing policies and procedures, such as the Internal Revenue Manual, Federal guidance such as the Clinger-Cohen Act of 1996,[13] and Office of Management and Budget Circulars and relevant supporting documentation.

 

Appendix II

 

Major Contributors to This Report

 

Alan Duncan, Assistant Inspector General for Audit (Security and Information Technology Services)

Danny Verneuille, Director

Diana Tengesdal, Audit Manager

Mark Carder, Senior Auditor

Myron Gulley, Senior Auditor

Allen Henry, Program Analyst

Sarah White, Program Analyst

 

Appendix III

 

Report Distribution List

 

Commissioner  C

Office of the Commissioner – Attn:  Chief of Staff  C

Deputy Commissioner for Operations Support  OS

Deputy Chief Information Officer for Operations  OS:CTO

Associate Chief Information Officer, Enterprise Operations  OS:CTO:EO

Associate Chief Information Officer, Strategy and Planning  OS:CTO:SP

Director, Enterprise Computing Center  OS:CTO:EO:EC

Chief Counsel  CC

National Taxpayer Advocate  TA

Director, Office of Legislative Affairs  CL:LA

Director, Office of Program Evaluation and Risk Analysis  RAS:O

Office of Internal Control  OS:CFO:CPIC:IC

Audit Liaison:  Director, Risk Management Division  OS:CTO:SP:RM

 

Appendix IV

 

Glossary of Terms

 

Term

Definition

Account Management Services

A project that will modernize the capability to collect, view, retrieve, and manage taxpayer information.

Best Practice

A technique or methodology that, through experience and research, has proven to reliably lead to a desired result.

Campus

The data processing arm of the IRS.  The campuses process paper and electronic submissions, correct errors, and forward data to the Computing Centers for analysis and posting to taxpayer accounts.

Computer Support Specialist

Position within the SOCCB whose duties include monitoring the mainframes.

Concept of Operations

A framework that includes a defined vision, strategic goals, operational themes, and program capabilities.  It identifies key organizational concepts required to achieve the organization’s vision.

E-Services

Provides a set of web-based business products as incentives to third parties to increase electronic filing, in addition to providing electronic customer account management capabilities to all businesses, individuals, and other customers.

Enterprise Computing Center

Supports tax processing and information management through a data processing and telecommunications infrastructure.

Event Management

The first line of defense for preventing an interruption to or reduction in the quality of service.  Event monitoring is used to sustain and improve quality service, identify significant events and initiate actions before an incident occurs, and automate the process.

Incident Management

The process for managing incidents with the goal of restoring service as quickly as possible and minimizing the adverse impact on the customer.

Information Technology Assets Management System

The workflow tool for all MITS organization service providers.  This module reports and tracks all MITS organization incidents and service requests.

Information Technology Infrastructure Library®

A set of concepts and techniques for managing information technology infrastructure, development, and operations.

Mainframe

A powerful, multiuser computer capable of supporting many hundreds or thousands of users simultaneously.

Priority 1 Ticket

An incident ticket exhibiting the following characteristics:  1) resulting in severe mission-critical work stoppage or any issue relating to safety or health (e.g., fire, electrical shock), 2) impacting on vital IRS customer commitments of national or area-wide scope, 3) affecting multiple internal or external customers and service to taxpayers, and 4) requiring immediate action.

Priority 2 Ticket

An incident ticket with the potential to result in a work stoppage (could have a direct impact on the service to taxpayers or if scope is multi-user and there is no work-around) and/or to lead to severe mission-critical work stoppage if actions are not taken to resolve incident.

Probe and Response

Provides information for enhanced triage, first contact resolution, timely incident assignment, standard incident coding, and resolution solutions. 

Problem Management

Consists of performing an RCA to mitigate the impact of system downtime/lost productivity caused by errors in the information technology infrastructure and to prevent the recurrence of events.

Release

A specific edition of software.

Server

A computer that carries out specific functions (e.g., a file server stores files, a print server manages printers, and a network server stores and manages network traffic).

Service Level Agreement

A document that describes the minimum performance criteria a provider promises to meet while delivering a service, typically also setting out the remedial action and any penalties that will take effect if performance falls below the promised standard.

Service Restoration Team

A group of individuals who work to rapidly escalate, coordinate, and resolve Priority 1 or Priority 2 outages.  The team can leverage resources from outside the Enterprise Computing Center to assist with restoration activities.

Systems Administrator

Position within the SOCCB whose duties include monitoring Enterprise Computing Center networked production servers as well as participating in RCA and Service Restoration Team efforts.

Totally Automated Personnel System

Automated personnel system used by management for processing requests for personnel actions, as well as employee information report generation.

Triage

Refers to the analysis work that determines the priority of computer applications that need to be remediated.

 

Appendix V

 

Management’s Response to the Draft Report

 

DEPARTMENT OF THE TREASURY

INTERNAL REVENUE SERVICE

WASHINGTON, D.C. 20224

 

 

CHIEF TECHNOLOGY OFFICER

 

JULY 20, 2011

 

MEMORANDUM FOR DEPUTY INSPECTOR GENERAL FOR AUDIT

 

FROM:                            Terence V. Milholland /s/ Terence V. Milholland

   Chief Technology Officer

 

SUBJECT:                      Draft Audit Report - Service Operations Command Center Management Can Do More to Benefit From Implementing the Information Technology Infrastructure Library (Audit # 201020006) (e-trak # 2011-22851)

 

Thank you for the opportunity to review and respond to the subject audit report.

 

We appreciate your comments and observations on how the Enterprise Operations organization is implementing the Information Technology Infrastructure Library (ITIL) best practices in the Service Operations Command Center Branch (SOCCB). We have made considerable strides and are working toward fully utilizing our resources in each of the process areas, Event Management, Incident Management and Problem Management.

 

We agree with the five recommendations and as we continue our tasks toward ITIL Level 3, we will implement them on or before December 30, 2012.

 

We value your continued support and the guidance your team provides. If you have any questions, please contact me at (202) 622-6800 or Andrea Greene-Horace, Senior Manager of Program Oversight, at (202) 283-3427.

 

Attachment

 

RECOMMENDATION #1: The Associate Chief Information Officer, Enterprise Operations should ensure SOCCB management revises its standard operating procedures to account for MITS-wide trending activities of its incident tickets to help identify repeat occurrences of incidents that can be used to proactively address problems, increase efficiency, and result in fewer repeat incidents in the future.

 

CORRECTIVE ACTION #1: We agree with the recommendation. SOCCB completed the hiring of the additional problem management staff and all were on board on June 19, 2011. We will revise the standard operating procedures to account for MITS-wide trending activities of its incident tickets to help identify repeat occurrences of incidents that can be used to proactively address problems, increase efficiency, and result in fewer repeat incidents in the future.

IMPLEMENTATION DATE: December 30, 2012

 

RESPONSIBLE OFFICIAL: Associate Chief Information Officer, Enterprise Operations

 

CORRECTIVE ACTION MONITORING PLAN: We enter accepted Corrective Actions into the Joint Audit Management Enterprise System (JAMES) and monitor them on a monthly basis until completion.

 

RECOMMENDATION #2: The Associate Chief Information Officer, Enterprise Operations should perform a staffing and workload analysis of the SOCCB to demonstrate the relationship between staffing and service levels and help identify opportunities to improve information technology service quality, reduce costs, and increase business value.

 

CORRECTIVE ACTION #2: We agree with the recommendation. Currently, SOCCB has the minimum staff per each work specialty and due to budgetary, NTEU and LR constraints SOCCB is unable to make any adjustments to the staffing model. However, SOCCB will perform a staffing and workload analysis to determine the relationship between staffing and service levels and help identify opportunities to improve information technology service quality, reduce costs, and increase business value in anticipation of the ability to acquire or adjust resources.

 

IMPLEMENTATION DATE: December 30, 2012

 

RESPONSIBLE OFFICIAL: Associate Chief Information Officer, Enterprise Operations

 

CORRECTIVE ACTION MONITORING PLAN: We enter accepted Corrective Actions into the Joint Audit Management Enterprise System (JAMES) and monitor them on a monthly basis until completion.

 

RECOMMENDATION #3: The Associate Chief Information Officer, Enterprise Operations should update the Enterprise Operations strategic plan whenever a SOCCB ITIL® best practice is required to support the goals or objectives of the organization. The update needs to address the goals, as well as milestone and target dates, completion dates, benefits and any associated risks.

 

CORRECTIVE ACTION #3: We agree with the recommendation. SOCCB will update the Enterprise Operations strategic plan for Problem Management whenever a SOCCB ITIL® best practice is required to support the goals or objectives of the organization. The update will address the goals, as well as milestones and target dates, completion dates, benefits, and any associated risks.

 

IMPLEMENTATION DATE: December 30, 2012

 

RESPONSIBLE OFFICIAL: Associate Chief Information Officer, Enterprise Operations

 

CORRECTIVE ACTION MONITORING PLAN: We enter accepted Corrective Actions into the Joint Audit Management Enterprise System (JAMES) and monitor them on a monthly basis until completion.

 

RECOMMENDATION #4: The Associate Chief Information Officer, Enterprise Operations should ensure SOCCB management develops and executes a training plan to ensure personnel continue to receive customized training in ITIL® best practices relevant to the Command Center.

 

CORRECTIVE ACTION #4: We agree with the recommendation. SOCCB will develop and execute a training plan to ensure personnel continue to receive customized training in ITIL best practices relevant to the Command Center. The training plan will be based on available online courses and other courses as budget constraints allow.

 

IMPLEMENTATION DATE: December 30, 2012

 

RESPONSIBLE OFFICIAL: Associate Chief Information Officer, Enterprise Operations

 

CORRECTIVE ACTION MONITORING PLAN: We enter accepted Corrective Actions into the Joint Audit Management Enterprise System (JAMES) and monitor them on a monthly basis until completion.

 

RECOMMENDATION #5: The Associate Chief Information Officer should identify and implement performance measures that will demonstrate the efficiencies and effectiveness of implementing Event Management, Incident Management, and Problem Management.

 

CORRECTIVE ACTION #5: We agree with the recommendation. SOCCB will identify and implement performance measures that will demonstrate the efficiencies and effectiveness of implementing Event Management, Incident Management, and Problem Management.

 

IMPLEMENTATION DATE: December 30, 2012

 

RESPONSIBLE OFFICIAL: Associate Chief Information Officer, Enterprise Operations

 

CORRECTIVE ACTION MONITORING PLAN: We enter accepted Corrective Actions into the Joint Audit Management Enterprise System (JAMES) and monitor them on a monthly basis until completion.



[1] See Appendix IV for a glossary of terms.

[2] This excludes the Priority 1 and Priority 2 tickets considered Unscheduled Maintenance Requests and worked by an outside vendor.

[3] SOCCB management described a system error as one that occurs due to a problem with the service software suite. 

[4] SOCCB management indicated this cause code is used synonymously with “Unknown” (i.e., the employee could not identify what caused the incident ticket to occur). 

[5] GAO-01-1008G, dated August 2001.

[6] IT Infrastructure and Operations Leaders Key Initiative:  ITIL and Process Improvement, dated January 16, 2009.

[7] GAO/AIMD-00-21.3.1, dated November 1999.

[8] Enterprise Operations organization management reviews and updates its strategic plan quarterly.

[9] 44 U.S.C. Sections 3541 – 3549.

[10] The goals include mitigating the impact of system downtime/lost productivity and preventing the recurrence of incidents. 

[11] See Appendix IV for a glossary of terms. 

[12] Management Advisory Report:  Review of Lost or Stolen Sensitive Items of Inventory at the Internal Revenue Service (Reference Number 2002-10-030, dated November 29, 2001), Progress Has Been Made in Using the Tivoli® Software Suite, Although Enhancements Are Needed to Better Distribute Software Updates and Reconcile Computer Inventories (Reference Number 2006-20-021, dated December 14, 2005), and Management Practices Over End-user Computer Server Storage Need Improvement to Ensure Effective and Efficient Storage Utilization (Reference Number 2007-20-103, dated July 3, 2007).

[13] Pub. L. No. 104-106, 110 Stat. 642 (codified in scattered sections of 5 U.S.C., 5 U.S.C. app., 10 U.S.C., 15 U.S.C., 16 U.S.C., 18 U.S.C., 22 U.S.C., 28 U.S.C., 29 U.S.C., 31 U.S.C., 38 U.S.C., 40 U.S.C., 41 U.S.C., 42 U.S.C., 44 U.S.C., 49 U.S.C., 50 U.S.C.).