(Senior) Machine Learning Platform/Ops Engineer, Remote/Europe (f/m/x)
Full-time, Mid-Senior Level
Job Overview
- Own the ML lifecycle: Design, implement, and maintain robust, containerized, and reproducible pipelines for model training, evaluation, and deployment—across both batch and real-time settings.
- Operationalize models at scale: Build and manage ML services, APIs, and model serving infrastructure using tools like MLflow, Amazon SageMaker, and Feature Store.
- Automate and monitor: Set up and maintain monitoring, observability, and alerting systems to ensure high availability and performance (including model/data drift, feature logging, and inference latency).
- Accelerate experimentation: Develop and maintain internal libraries, templates, and platform tooling to improve reproducibility and simplify deployment workflows for all model teams.
- Ensure reliability and quality: Implement CI/CD pipelines for model and data workflows using Docker, Terraform, and Jenkins. Share best practices, mentor less experienced engineers, and foster strong collaboration across teams.
- Stay current: Continuously evaluate emerging MLOps technologies to improve efficiency, scalability, and reliability.