Site Reliability Engineer - IT Operations

  • icrunchdata Network
  • Chickamauga Creek, Tennessee, USA
  • Jun 24, 2020
Information Technology (IT)

Job Description

FUNCTIONS AND RESPONSIBILITIES:

- Provide authoritative advice on and perform operations work in support of IT infrastructure and applications at all of the organization’s facilities to support its administrative and business computing needs
- Oversee the evaluation of events and alarms from operational tools to ensure effective response practices are employed
- Develop and enhance the event handling process; provide training and coaching as warranted in a peer environment; and perform post-mortem reviews for lessons learned. Event categories include but are not limited to: software engineering, debugging, automation, routers, switches, servers, databases, voice, transport, environment, backups, cyber security, workstations, and data transfer
- Serve as the staff advisor and subject matter expert for all event management operations tools
- Leverage knowledge to train and mentor junior analysts to become more effective
- Enhance trending, research, and proactive monitoring practices within the workgroup
- Utilize administrator monitoring tools to maintain, update, and optimize system policies, thresholds, and triggers to provide robust problem detection
- Develop operational reports that add value to the workgroup (top 10 worst, recurring alarms, etc.)
- Represent the event management workgroup in operational and project planning meetings
- Analyze emergent requirements in the best interest of the organization to deliver effective operational readiness

Job Requirements

REQUIRED SKILLS AND EXPERIENCE:

- Comprehensive knowledge of HP Operations Manager, HP Service Manager, SolarWinds, and similar event management tools
- ITIL v3 experience
- Thorough understanding of service operations processes and site reliability
- Minimum of a bachelor’s degree in Computer Science, Engineering, Mathematics, Business Administration, or a related field of study; or, in lieu of a degree, equivalent education, training, and experience
- Expert knowledge of and extensive experience with one or more IT infrastructure technologies such as networking, client-server architectures, server and desktop operating systems, databases, server virtualization, storage area networks (SAN), messaging, etc.
- Ability and willingness to work a rotational shift coverage schedule to staff a 24x7 Operations Center
- Ability to obtain and maintain a Sensitive or equivalent security clearance
- Ability and willingness to assume on-call rotational assignments, which may include 24-hour, 7-day-per-week availability
- Ability to travel as required to carry out project work or perform temporary duty at other locations
- May be required to obtain and maintain a security clearance based on position and access requirements and essential job functions

Johnson Service Group, Inc