Site Reliability Engineer (SRE)
Full-time, permanent | Mid-level
Job Overview
At NOVACARD, we’re redefining how people use credit.
We offer the first interest-free, no-annual-fee credit card in Mexico, designed to simplify personal finances and give users complete control, all from a mobile app. With NOVACARD, users can access up to $200,000 MXN in credit, pay only when they use it, and manage everything digitally in under 5 minutes. Our mission is to empower people to make smarter financial decisions by offering flexibility, transparency, and the freedom they need to reach their goals. Simple finances, big goals.
About the Role:
We’re looking for a Site Reliability Engineer (SRE) to ensure the stability, performance, and reliability of our critical production systems. You’ll work at the intersection of development and operations: building automation tools, improving observability, and preventing incidents before they occur.
Key Responsibilities:
Ensure the stability, performance, and fault tolerance of production systems.
Develop and maintain infrastructure automation and observability tools.
Monitor system health, respond to incidents, and perform root cause analysis (RCA).
Collaborate with development teams to improve scalability and reliability of services.
Define and manage SLIs, SLOs, and error budgets.
Lead incident response: organize recovery, document RCA, and run blameless post-mortems.
Configure and administer Grafana and Zabbix, design insightful dashboards, and fine-tune alerting.
Integrate and monitor external vendor systems, collaborating with vendor technical support when needed.