OpenShift Architect / SME
Contract, Mid-Senior Level
Job Overview
Anticipated Contract End Date/Length: November 30, 2026
Work setup: Hybrid
Our client in the Information Technology and Services industry is looking for an OpenShift Architect / SME to join the Virtualization Engineering Team within Distributed Compute in CTO Enterprise Services. The OpenShift Engineering SME is responsible for designing, building, operating, and continuously improving the enterprise OpenShift platform for critical business services. This role owns platform reliability, lifecycle management, security hardening, automation, and integration with core enterprise services, acting as the technical authority for OpenShift architecture and operations.
What you will do:
- Own OpenShift cluster lifecycle including Day 0 provisioning, Day 1 configuration, and Day 2 operations.
- Plan and execute platform upgrades, patching, and version compatibility management across OCP, RHCOS, and Operators.
- Design, build, and continuously improve enterprise OpenShift architecture.
- Implement and maintain platform security controls including RBAC, SCC, network policies, image governance, and compliance baselines.
- Build and maintain automation using Ansible, GitOps tools such as Argo CD and OpenShift GitOps, and Infrastructure as Code tools such as Terraform or OpenTofu.
- Manage and optimise OpenShift networking including OVN-Kubernetes, Multus, ingress, egress, and load balancer integrations.
- Manage storage integration and lifecycle including StorageClass configuration, CSI drivers, provisioning, and performance tuning.
- Integrate OpenShift with enterprise services including DNS, NTP, LDAP, Active Directory, OIDC, PKI certificates, monitoring, logging, and ServiceNow.
- Define observability standards, SLOs, alerting thresholds, and incident response playbooks.
- Lead root cause analysis for platform incidents and drive preventative actions to improve reliability.
- Support capacity planning, scaling strategy, and platform performance optimisation.
- Partner with application, security, network, and infrastructure teams to enable secure and efficient onboarding.
- Manage vendor engagement including alignment with Red Hat TAM, support case management, and escalation handling.
- Create and maintain technical documentation, runbooks, standards, and knowledge transfer materials.
- Mentor engineering teams and establish best practices for platform operations and governance.
- Collaborate with cross-functional teams to deliver infrastructure solutions aligned to business needs.
- Participate in disaster recovery and business continuity planning and testing.
- Engage stakeholders to communicate issues, progress, risks, and resolutions for critical incidents.
- Monitor compliance and mitigate risks while ensuring adherence to regulatory standards and internal policies.
- Manage and improve platform offerings throughout their lifecycle using Agile methodologies.
- Share product updates, ensure supported version compliance, and communicate CVEs promptly.
- Provide technical direction across OpenShift, distributed systems, servers, and storage technologies.
- Represent the team during major incidents and provide technical expertise in crisis situations.
- Implement robust monitoring and alerting systems to proactively manage platform health.
- Ensure system availability and performance in line with defined SLAs and SLOs.
- Plan future capacity requirements and optimise system performance through tuning and scaling activities.