Monitoring Tool Jobs in Bangalore - For a client of TeamLease Services Ltd
Job Description
The Enterprise Monitoring Tools SME is responsible for operating the monitoring tools and processes that enable proactive event, performance, availability, and capacity monitoring of enterprise IT services, such as servers, network devices, operating systems, database engines, storage capacity, applications, cloud, container platforms, and virtualization, across all IT service lines, and for providing the first line of resolution.
Required Experience/Skills:
· Experience in the use and administration of monitoring tools such as Grafana, Prometheus, Graphite, Nagios Core, ELK stack, Splunk, Nagios XI, PRTG, and Zabbix.
· Experience in configuring, operating and/or troubleshooting various monitoring tools (specified above)
· Experience in building queries on multiple platforms and in multiple languages, such as Lucene, SQL, KQL, SPL, etc.
· Experience using the AppDynamics and New Relic APM tools.
· Good to have: knowledge of administering Discovery & CMDB tools such as ServiceNow, UCMDB, Device42, etc.
· Knowledge of and experience in writing technical documents for an IT environment, including but not limited to disaster recovery procedures for IT infrastructure
· Strong ability to monitor and diagnose IT infrastructure, application service, server, and network alerts, events, or issues
· Experience in analyzing system and network performance using monitoring tools and historical / real-time data.
· Knowledge of SNMP and of using SNMP traps/MIBs for monitoring
· Able to collate and interpret data from various monitoring sources and perform log analysis
· Ability to assess and prioritize faults and respond or escalate accordingly.
· Quick learner across a wide range of issues, with the ability to identify improvement opportunities
· Excellent problem-solving skills and a results-oriented attitude. Excellent teamwork skills and a strong work ethic.
· Ability to be proactive and prioritize tasks, including resolving urgent matters without impacting deliverables and productivity.
· Self-motivated, with a professional attitude and the ability to work well under pressure.
· Should be able to function independently, with limited supervision.
Responsibilities
· Proactive monitoring of IT infrastructure and application/service alerts via monitoring tools
· End-to-end ownership of tickets created from alerts, accurately recording progress and escalations in ticketing systems
· First-level investigation and diagnosis
· Fast and effective response to service failure alerts and notifications
· Timely escalation and follow-up with the IT / Application Support team on pending high-priority alerts, according to the incident management process
· Prepare and maintain documentation and reports (availability, capacity trend analysis), and provide follow-up status on identified tasks
· Prepare daily / weekly reports in the agreed format and send them to the pre-assigned set of recipients
· Maintain, update, and implement the standard escalation procedures, complete with notification matrix and escalation standards