Senior Site Reliability and Production Engineer

Posted October 12, 2025
Full-time
Associate

Job Overview

Your Career

As a Senior DevOps Engineer on our Production Engineering team, you will be at the forefront of ensuring the stability, scalability, and performance of our production systems. You’ll be responsible for the health of large-scale cloud environments: investigating incidents, driving root-cause analysis, and implementing long-term solutions that improve system reliability. You’ll also own and continuously improve the production release process, ensuring deployments are safe, automated, and well-orchestrated. You’ll collaborate closely with engineering, platform, and SRE teams to deliver operational excellence for our customer-facing services.

Your Impact

  • Own the end-to-end release process: plan, coordinate, and execute deployments across environments with a strong focus on safety, reliability, and automation

  • Ensure stability and performance of all production systems, maintaining high availability through proactive monitoring and incident management

  • Investigate and resolve complex production issues, driving post-incident reviews and implementing long-term fixes

  • Respond to critical incidents and customer escalations with a calm, structured approach and clear communication

  • Define and uphold best practices for change management, observability, and system reliability

  • Manage infrastructure-as-code using Terraform for scalable cloud deployments

  • Improve monitoring, alerting, and recovery mechanisms to detect and resolve issues faster

  • Automate repetitive operational tasks through scripting and tooling

  • Collaborate with development teams to ensure smooth delivery and stable operation of new features

  • Participate in an on-call rotation to support production systems
