SRE Manager

Entain

India

Posted January 09, 2026

Full-time Mid-Senior Level

Job Overview

We are seeking a talented and motivated SRE Manager to join our dynamic team. In this role, you will execute a range of site reliability activities, ensuring optimal service performance, reliability, and availability. You will collaborate with cross-functional engineering teams to develop scalable, fault-tolerant, and cost-effective cloud services.

If you are passionate about site reliability engineering and ready to make a significant impact, we would love to hear from you!

Key Responsibilities:

A Site Reliability Engineering (SRE) Team Manager plays a crucial role in ensuring system reliability, scalability, and operational efficiency. Their responsibilities typically include:

Leading & Mentoring – Guiding a team of SREs, fostering a culture of automation, resilience, and continuous improvement.
Defining SLOs & SLIs – Establishing service level objectives (SLOs) and indicators (SLIs) to measure and maintain system performance.
Incident Management – Overseeing incident response, conducting post-mortems, and implementing preventive measures.
Collaboration with Engineering Teams – Working closely with developers to build scalable and resilient systems.
Automation & Efficiency – Driving automation initiatives to reduce toil and enhance operational workflows.
Risk & Compliance Management – Ensuring adherence to security, compliance, and reliability standards.
Optimizing Observability – Implementing monitoring tools and strategies to proactively detect and resolve issues.

implement automation tools, frameworks, and CI/CD pipelines, promoting best practices and code reusability.
Enhance site reliability through process automation, reducing mean time to detection, resolution, and repair.
Identify and manage risks through regular assessments and proactive mitigation strategies.
Develop and troubleshoot large-scale distributed systems in both on-prem and cloud environments.
Deliver infrastructure as code to improve service availability, scalability, latency, and efficiency.
Monitor support processing for early detection of issues and share knowledge on emerging site reliability trends.
Analyze data to identify improvement areas and optimize system performance through scale testing.

SRE Manager

Job Overview

Ready to Apply?