Senior Software Engineer, Infrastructure
Full-time, USD 170,000 - 250,000 per year
Job Overview
About Luminai
Nearly every organization in the world relies on complex manual work to carry out critical internal processes. These are the processes that keep the world going: enrolling patients in a hospital, underwriting loans inside a bank, or processing new transactions for an airline. Yet most companies don’t have enough resources to properly automate these tasks and are stuck with manual, decades-old ways of doing things.
At Luminai, we develop technology that uses AI to automate long-form, organization-wide workflows of any complexity, easily and safely. Luminai helps some of the world’s most critical organizations, in sectors like Healthcare, Finance, and Telecommunication, delegate mission-critical workflows that previously required hands-on human involvement to autonomous AI systems. Our approach combines frontier AI development with a purpose-built workflow execution engine to achieve this goal.
We've raised significant amounts of capital (some of it unannounced) from many of the best Silicon Valley VCs, including General Catalyst and Y Combinator, and from investors including Kevin Weil (Chief Product Officer at OpenAI), Arash Ferdowsi (co-founder of Dropbox), Katie Stanton (former VP Global Media, Twitter) and CEOs of companies including Flexport, Notion, Front, Ramp and Twitch.
About the Role
We’re looking for a Senior Platform Engineer to join our Infrastructure team and help build and scale a self-hosted, cloud-native platform for both production and air-gapped/on-prem environments.
You will work closely with our existing senior engineers (who have built the current platform from the ground up) to evolve our AWS/Azure-based Kubernetes infrastructure, optimize CI/CD workflows, maintain GitOps pipelines, and ensure local/dev/prod environment parity.
This is a high-ownership role. You’ll be expected to contribute architecture-level decisions, write production-grade code, and own infrastructure that supports mission-critical services.
What You’ll Be Working On
Extend and maintain a multi-cluster Kubernetes setup on AWS/Azure via Terraform modules (EKS, VPC, IRSA, ECR, IAM, S3, KMS, etc.)
Develop and maintain Kubernetes Helm charts and distroless Docker images for self-hosted deployments
Own and evolve the platform stack: cert-manager, external-dns, ingress-nginx, istio ambient, minio, karpenter, otel/signoz, velero, pomerium, temporal, redis, etc.
Support and optimize CI/CD systems with GitHub Actions, custom self-hosted runners, and Skaffold-based PR environments
Maintain a robust GitOps deployment model using ArgoCD, SOPS, and external-secrets
Contribute to our local development experience using k3d clusters, helping teams onboard and maintain environment parity
Improve platform observability and reliability through Signoz, Pyroscope, custom dashboards, and alerts
Enable and support air-gapped / on-prem installations by ensuring self-hostability and minimal third-party dependencies
You Should Have
5+ years of experience in DevOps, SRE, or Platform Engineering roles
Proven expertise with Kubernetes and production-grade Helm-based deployments
Strong experience building infrastructure with Terraform, including complex module systems and environment separation
Experience deploying and managing GitOps pipelines using ArgoCD or Flux
Proficiency in designing secure, scalable CI/CD pipelines with tools like GitHub Actions, Skaffold, and Docker build optimizations
Deep understanding of networking, ingress controllers, TLS, and service-to-service communication (esp. with Istio or ambient mesh)
Experience with cloud-native observability: tracing, metrics, logging (OpenTelemetry, Signoz, Pyroscope, Prometheus, etc.)
Practical knowledge of security best practices: IRSA, KMS, SOPS, secrets management
Comfortable with air-gapped / self-hosted system constraints
Ability to write clean, modular Go or Python when needed for controllers/operators/scripts
Bonus Experience
Building or maintaining custom Kubernetes operators with CRDs (esp. for ephemeral workloads)
Prior experience migrating from SaaS to self-hosted alternatives
Exposure to on-premise infrastructure and air-gapped delivery pipelines
Familiarity with key developer tooling like Keycloak, Temporal, MinIO, and Pomerium
Experience using or maintaining self-hosted GitHub Actions runners with docker-in-docker caching setups