
AI Platform Engineer

Posted October 31, 2025
Full Time

Job Overview

About the Role 
We're building an ambitious internal AI Platform to power Bright's next generation of AI-driven products and services. This Kubernetes-hosted platform provides teams across the organisation with the tools to build, deploy, and observe AI-powered applications without managing complex infrastructure themselves. 

As an AI Platform Engineer, you'll join a small, high-impact team building critical platform infrastructure for LLM operations (LLMOps). Working under the supervision of two senior/principal platform engineers and reporting to the Head of AI, you'll be instrumental in delivering self-service AI capabilities that enable developers across Bright to build sophisticated AI applications with confidence. 
This is an opportunity to work on cutting-edge AI infrastructure, learn from experienced platform engineers, and make a significant impact on how Bright leverages AI technology at scale. 



Key Responsibilities

Our roadmap spans multiple interconnected platform epics. You'll contribute to key initiatives including: 
Core Platform Services 
  • Observability & Experimentation: Enhancing Langfuse for LLM tracing, evaluation, and experimentation capabilities 
  • Developer Self-Service: Building and improving Backstage as an internal developer portal for platform discoverability 
  • LLM Operations: Deploying and maintaining LiteLLM proxy, Langflow runtime, and other core LLM services 
  • Monitoring & Logging: Implementing platform-wide monitoring (Prometheus/Grafana) and logging infrastructure (Loki) 
Security & Compliance 
  • LLM Ops Security: Implementing guardrails (LlamaGuard, Azure Guardrails) and security controls 
  • GDPR & PII Management: Building automated PII detection, minimization strategies, and compliance tooling 
  • Incident Response: Establishing security incident response procedures for LLM operations 
Infrastructure & Reliability 
  • Kubernetes Operations: Managing AKS clusters and implementing reliable deployment tooling via ArgoCD 
  • Infrastructure as Code: Productionizing infrastructure with Terraform, eliminating manual configuration 
  • Autoscaling & Performance: Implementing workload management and autoscaling for AI services 
  • Storage Solutions: Migrating from self-hosted MinIO to managed Azure Blob Storage 
Applications Support 
You'll also support the deployment and operation of AI applications built on the platform, including: 
  • RAG (Retrieval-Augmented Generation) applications such as Ask IPASS and Ask UK Pay Centre 
  • Document processing applications (BrightCapture) 
  • Employee onboarding automation (Oscar) 
  • Internal AI assistant (Bright GPT) 

Skills, Knowledge and Expertise

What We're Looking For 
Essential Skills & Experience 
  • Platform Engineering Fundamentals: 2-4 years of experience with cloud infrastructure, preferably Azure 
  • Kubernetes: Practical experience deploying and managing applications in Kubernetes (AKS experience is a plus) 
  • Infrastructure as Code: Hands-on experience with Terraform or similar IaC tools 
  • CI/CD: Experience with GitOps workflows and tools like ArgoCD, GitHub Actions, or similar 
  • System Programming: Proficiency in Python or Go for automation and tooling; shell scripting is essential 
  • Linux & Containers: Solid understanding of containerization with Docker and container orchestration 
Desirable Experience 
  • Exposure to LLM technologies or AI/ML infrastructure 
  • Experience with observability tools (Prometheus, Grafana, Loki) 
  • Knowledge of Helm and Helmfile for Kubernetes deployments 
  • Knowledge of Kustomize 
  • Understanding of security best practices and compliance requirements (GDPR) 
  • Backend-as-a-Service platforms (Supabase or similar) 
  • Developer portal platforms (Backstage or similar) 
  • Application programming experience with .NET and/or TypeScript 
What Makes You a Great Fit 
  • Learning Mindset: You're excited to learn about LLM operations and emerging AI infrastructure patterns 
  • Systems Thinking: You understand how distributed systems work and can reason about failure modes 
  • Pragmatic Approach: You weigh the pursuit of an ideal solution against the need to ship value quickly 
  • Collaboration: You work well with both technical and product stakeholders 
  • Documentation: You believe good documentation is as important as good code 
  • Ownership: You take responsibility for your work from development through to production 
Team Structure & Reporting 
  • Reports to: Head of AI 
  • Works closely with: Two senior/principal platform engineers 
  • Collaborates with: Application development teams, product managers, and security/compliance stakeholders 
  • Team size: Small, full-stack AI team covering development, DevOps, operations, and support 
What Success Looks Like 
In your first 3 months: 
  • You've contributed to multiple platform epics from our roadmap 
  • You understand the architecture of our AI platform and can navigate the codebase 
  • You've successfully deployed services to our Kubernetes clusters 
  • You're participating in the on-call rotation and can troubleshoot platform issues 
In your first 6 months: 
  • You're independently owning epics and driving them to completion 
  • You're contributing to architectural decisions and technical direction 
  • You've improved platform reliability, observability, or developer experience 
  • You're mentoring junior engineers or helping onboard new team members 
Technical Stack 
  • Infrastructure: Azure (AKS, Blob Storage, Cognitive Services), Kubernetes, Terraform
  • Platform Services: LiteLLM, Langflow, Langfuse, Supabase, Open WebUI, Backstage
  • Observability: Prometheus, Grafana, Loki, Langfuse tracing
  • CI/CD: ArgoCD, GitHub Actions, Helmfile
  • Languages: Python, Go, Shell scripting
  • Security: Azure Guardrails, LlamaGuard, PII detection tooling 
Why Join Bright's AI Platform Team? 
  • Impact: Your work directly enables AI innovation across the entire organization 
  • Growth: Learn from experienced platform engineers in a supportive environment 
  • Cutting Edge: Work with the latest AI infrastructure and tooling 
  • Autonomy: Small team means you'll have significant ownership and influence 
  • Mission: Help accountants and finance professionals work more efficiently with AI 
 
