Site Reliability Engineer Job Opening for a Client of TeamLease Digital in Bengaluru
Site Reliability Engineer
Job Description (Site Reliability Engineer – Azure, AKS, and DevOps)
Key Responsibilities:
· Design, build, and maintain highly available and scalable cloud infrastructure on Microsoft Azure, leveraging AKS, Service Bus, Event Grid, Cosmos DB, and PostgreSQL.
· Develop and maintain Infrastructure as Code (IaC) using Bicep for automated provisioning and consistent environment setup.
· Implement and manage CI/CD pipelines using GitHub Actions to streamline build, test, and deployment processes for microservices and infrastructure.
· Drive observability and incident management by integrating Grafana Cloud Stack (Loki, Tempo, IRM, and Prometheus) to collect logs, traces, and metrics for proactive system monitoring.
· Apply DevOps and SRE best practices such as automation-first principles, blameless postmortems, and continuous improvement.
· Ensure secure cloud operations through Azure networking and security features: VNets, NSGs, Load Balancers, Private Endpoints, Azure Firewall, and Key Vault.
· Integrate code quality and vulnerability scanning tools like SonarQube, Black Duck, and Dependabot within CI/CD pipelines to maintain high code and security standards.
· Lead incident response and root cause analysis (RCA) to minimize downtime and improve mean time to recovery (MTTR).
· Collaborate with development, architecture, and security teams to optimize system design for performance, reliability, and cost efficiency.
· Continuously evaluate and implement emerging cloud-native technologies and observability tools to enhance reliability and developer productivity.
________________________________________
Required Qualifications:
· Bachelor’s degree in Computer Science, Information Technology, or a related field.
· Minimum 6 years of IT experience, including at least 5 years in SRE, DevOps, or Cloud Engineering roles.
· Strong hands-on experience with Microsoft Azure services – App Services, AKS, Service Bus, Event Grid, Cosmos DB, PostgreSQL.
· Proven experience implementing CI/CD pipelines with GitHub Actions or Azure DevOps.
· Expertise in Docker and Kubernetes (AKS) for containerized application deployment and management.
· Proficiency in Infrastructure as Code (IaC) using Terraform, Bicep, or ARM templates.
· In-depth knowledge of Grafana Cloud Stack (Loki, Tempo, IRM) and Azure Monitor for observability.
· Experience with networking and security configurations within Azure (VNets, NSGs, Load Balancers, Firewalls).
· Familiarity with code quality and security tools such as SonarQube and Black Duck.
· Strong scripting skills in PowerShell, Bash, or Python for automation and system management.
· Excellent problem-solving, debugging, and performance optimization skills.
· Ability to work collaboratively across development, operations, and security teams in fast-paced environments.
________________________________________
Preferred Skills:
· Familiarity with cost optimization techniques and Azure governance frameworks.
· Experience in Agile/Scrum environments and with DevOps practices, with a focus on automation, observability, and reliability engineering.
· Familiarity with containerization technologies (e.g., Docker, Kubernetes).
· Knowledge of CI/CD pipelines using Azure DevOps or similar tools.
· Understanding of version control systems (e.g., Git).