Lead Architect — Autonomous IT Agent Platform - Sanctum
Full Time
Job Overview
About Us
We’re not another IT vendor with shiny dashboards and offshore tickets. We’re building the world’s first Agentic MSP — where AI systems don’t “assist” humans; they run the stack.
Our agents see, reason, and act across entire IT environments — capturing context, fixing problems, and learning as they go. Humans handle edge cases. Machines handle the rest.
We’re part of Infinity Constellation, a portfolio of AI-native service companies rethinking what “operations” even means in the post-human era.
If you want to manage a team of ticket chasers, scroll on. If you want to teach machines to run IT, read on.
The Role
We’re looking for a Lead Architect who lives for first principles, fast iteration, and occasional chaos. This is not a “sit in meetings” job. It’s hands on keyboard, whiteboard covered in arrows, deploy at 2AM and brag about it later.
You’ll design and build the agentic core — the autonomous systems that will make traditional MSPs obsolete.
You’ll:
• Architect, build, and deploy production-grade AI agents that reason, act, and self-correct.
• Connect those agents to the real world: patching, provisioning, monitoring, identity, access — the works.
• Turn operational pain into autonomous behavior loops that get smarter every day.
• Lead a small, lethal team of engineers who move fast and leave clean commits.
Why This Matters
Because the current IT industry is broken. Vendors profit when things break. We’re flipping that — agents that prevent problems before they exist.
You won’t be “optimizing workflows.” You’ll be erasing them.
Every line of code you write will replace hours of human toil. Every experiment will teach machines to manage complexity better than humans ever could.
Your Mission
1. Architect the Agentic Core
• Design multi-agent architectures with reasoning, memory, and human feedback loops.
• Build orchestration frameworks with LangGraph, CrewAI, AutoGen, or whatever you invent next.
• Make agents observable, debuggable, and auditable — like any teammate worth their seat.
2. Automate IT Ops (For Real)
• Deploy agents that handle helpdesk triage, patching, access control, monitoring, and remediation.
• Integrate deeply with JumpCloud, HaloITSM, Google Workspace, AWS/GCP, and other APIs of chaos.
• Bake in human-in-the-loop learning so the system evolves with every interaction.
3. Build, Scale, and Break (Gracefully)
• Own the infrastructure: AWS, EKS, Helm, Terraform, FastAPI, Pydantic — whatever gets it done.
• Instrument everything: telemetry, observability, self-healing, cost tracking.
• Fail fast, fix faster. No red tape. No PowerPoints.
4. Lead by Doing
• Grow and mentor a small team (5–10 engineers).
• Ship first, systematize later.
• Build a culture that worships velocity and despises bureaucracy.
Who You Are
• Builder > talker. You ship more than you debate.
• Systems thinker. You make complex systems look inevitable in hindsight.
• AI-fluent. You get agent orchestration, not just API calls.
• Comfortable with chaos. You thrive when the edges aren’t defined.
• Technically fearless. Python, FastAPI, distributed systems — all second nature.
• Operationally aware. You get why IT matters and how to automate it without breaking the world.
Qualifications
Required:
• 5+ years building distributed systems or full-stack platforms.
• 2+ years shipping LLM or agentic workloads in production (not just notebooks).
• Deep experience with Python and FastAPI, Django, or Pydantic.
• Familiarity with frameworks like LangGraph, CrewAI, AutoGen, LlamaIndex, or equivalents.
• Proven track record deploying real workloads on AWS (EKS/Helm/Terraform).
• Strong grounding in MLOps, telemetry, and observability (MLflow, W&B, Ray, Prometheus, Grafana).
• Leadership experience in small, high-output engineering teams.
Nice to Have:
• IT automation, endpoint management, or RMM background.
• Experience fine-tuning or evaluating LLMs (OpenAI, Anthropic, HuggingFace).
• RAG, retrieval, or agent evaluation pipelines.
• Startup/founder experience — you know how to go from chaos to customer.
Location
Remote. Global.
You’ll operate primarily on New York (Eastern Time) hours.
We care about impact, not time zones.
Why You’ll Love It
• You’ll build the tech brain behind a new category — autonomous IT.
• You’ll have real ownership — architecture, code, direction, everything.
• No legacy. No “best practices.” Just the next practices.
• You’ll help define how AI and humans work together for the next decade.
If you’ve ever wanted to build something truly new — something that outlasts hype cycles and PowerPoint decks — this is your moment.
Come teach machines to run IT.