Machine Learning Engineer (Training Optimization)

Posted April 16, 2026
Full-time, Mid-Senior Level

Job Overview

About the Role/Specialty

As a Machine Learning Engineer, you’ll lead efforts to scale and optimize the training system for our large-scale multimodal and foundation models. You’ll design distributed training systems using Megatron-LM, NVIDIA NeMo, FSDP, and Triton, pushing the limits of performance across the compute, memory, and communication layers. You’ll sit at the intersection of systems and AI research, directly shaping how we train the models that will power Canva’s next generation of products.

What you’ll do (responsibilities)

  • You’ll design, implement, and optimize large-scale machine learning systems for training.
  • You’ll improve all aspects of performance, including GPU utilization, communication overhead, and memory efficiency.
  • You’ll partner with research and modeling teams to align systems with algorithmic needs.
  • You’ll evaluate and apply best practices for distributed training using industry-leading frameworks.
  • You’ll dive deep into low-level optimization, including custom CUDA or Triton kernels.
  • You’ll debug, profile, and fine-tune training workflows to unlock new levels of scalability.
