Senior Site Reliability Engineer

Posted April 15, 2026
Full-time, Mid-Senior Level

Job Overview

As a Senior Site Reliability Engineer (SRE) within the Web Center of Excellence (Web COE), you will be responsible for ensuring the reliability, security, scalability, and performance of enterprise web platforms. You will support and optimize web applications built on Sitecore, WordPress, and IIS-based solutions, while driving proactive monitoring, anomaly detection, and vulnerability remediation.

This role blends hands-on engineering, operational excellence, and forward-looking innovation, including participation in AI-driven observability and automation initiatives. You will work closely with developers, QA, solution architects, and business stakeholders to ensure highly available, secure, and resilient web services.

Key Responsibilities

Reliability & Operations

  • Own the availability, performance, and stability of web applications hosted on Azure, including PaaS and IaaS workloads.
  • Proactively monitor systems to detect anomalies, performance degradation, and reliability risks, and take preventive actions before customer impact occurs.
  • Lead incident response, root cause analysis (RCA), and post-incident reviews, ensuring long-term corrective actions are implemented.

Azure & Microsoft Ecosystem

  • Design, operate, and optimize solutions using Azure services such as App Services, Azure VMs, Azure Monitor, Log Analytics, Application Insights, Azure Front Door, and Azure Networking.
  • Automate operational tasks using PowerShell and Azure-native automation capabilities.
  • Ensure adherence to Microsoft security and compliance best practices.

Web Platform Support

  • Support hosting, deployment, and operational health of Sitecore, WordPress, and legacy IIS-based applications.
  • Collaborate with development teams to ensure applications are production-ready, scalable, and operationally sound.
  • Guide teams on web hosting architecture, DNS governance, SSL/TLS, and traffic management.

Security & Vulnerability Management

  • Proactively identify security vulnerabilities, misconfigurations, and exposure risks across infrastructure and applications.
  • Partner with security teams to implement remediation plans, patching strategies, and hardening standards.
  • Ensure secure-by-design principles are embedded into web hosting and operational processes.

Observability, Monitoring & AI Initiatives

  • Build and enhance monitoring, alerting, and observability across the web ecosystem.
  • Leverage data, logs, and metrics to identify trends and systemic risks.
  • Contribute to AI-driven initiatives such as intelligent alerting, anomaly detection, predictive reliability, and automated remediation.
  • Continuously improve operational maturity through tooling, dashboards, and insights.

Collaboration & Leadership

  • Work closely with Software Engineers, QA, and Architects to deliver reliable web services.
  • Provide technical mentorship to junior SREs and engineers, setting operational best practices.
