Datadog L3 Engineer

Posted December 13, 2025
Full-time Associate

Job Overview

This role is for one of Weekday's clients.

Min Experience: 5 years

Location: Singapore

Job Type: Full-time

As a Datadog L3 Engineer, you will play a critical role in designing, implementing, and operating advanced observability solutions for complex, large-scale technology environments. Based in Singapore, this full-time role is ideal for a highly skilled professional with deep hands-on experience in monitoring, logging, metrics, and real-user monitoring (RUM). You will act as a subject matter expert for Datadog, supporting mission-critical systems, driving operational excellence, and ensuring high availability, performance, and reliability across infrastructure and applications. This role requires strong collaboration with engineering, DevOps, and operations teams, along with a solid understanding of ITIL practices and modern cloud-native tooling.

Requirements

Key Responsibilities

  • Design, configure, and maintain end-to-end observability solutions using Datadog, including logs, metrics, traces, and RUM for distributed systems
  • Act as an L3 escalation point for complex monitoring, performance, and availability issues, performing deep root cause analysis and remediation
  • Implement and optimize log management pipelines, dashboards, alerts, and service-level indicators (SLIs/SLOs) to improve system visibility and reliability
  • Lead the setup and tuning of infrastructure and application monitoring across containerized and cloud environments
  • Build and manage monitoring infrastructure as code using Terraform, ensuring consistency, scalability, and repeatability
  • Support Docker-based platforms by monitoring container health, performance, and resource utilization
  • Integrate Datadog with CI/CD pipelines and cloud services to enable proactive detection of issues
  • Collaborate with DevOps, SRE, and application teams to define observability standards and best practices
  • Ensure adherence to ITIL processes for incident, problem, and change management
  • Create and maintain detailed documentation, runbooks, and operational guides
  • Continuously evaluate system performance trends and recommend improvements to enhance stability and user experience
  • Mentor junior engineers and provide technical guidance on observability and monitoring strategies

What Makes You a Great Fit

  • At least 5 years of hands-on experience in monitoring, observability, or site reliability engineering roles
  • Strong expertise with Datadog, including logs, metrics, dashboards, alerts, and Real User Monitoring (RUM)
  • Proven experience using Terraform to manage infrastructure and monitoring configurations
  • Solid hands-on knowledge of Docker and container-based environments
  • Strong understanding of ITIL processes and experience working in structured operational environments
  • Ability to troubleshoot complex, large-scale production issues with a methodical and analytical approach
  • Experience working with cloud platforms and modern DevOps toolchains
  • Excellent communication skills, with the ability to collaborate across technical and non-technical teams
  • A proactive mindset with a strong focus on automation, reliability, and continuous improvement
  • Comfortable working in fast-paced, high-availability environments with ownership and accountability
