We are seeking a skilled and forward-looking ML Engineer with experience in Large Language Models (LLMs), generative AI, and agentic architectures to join our growing R&D and Applied AI team. This role is critical in helping Oversight deliver the next generation of agentic AI systems for enterprise spend management and risk controls.
The ideal candidate has a strong foundation in machine learning, modern deep learning frameworks, and data pipelines, coupled with hands-on experience experimenting with LLMs, small language models (SLMs), multi-agent frameworks, and retrieval-augmented generation (RAG).
You will work closely with AI/ML researchers, data engineers, and product teams to design, implement, and optimize models that power autonomous exception resolution, anomaly detection, and explainable insights. This is a hands-on engineering role where you will not only build and scale ML systems but also actively contribute to cutting-edge applied research in agentic AI.
Core ML/LLM Engineering
- Contribute to the design, training, fine-tuning, and deployment of ML/LLM models for production.
- Implement RAG pipelines using vector databases.
- Work with frameworks and protocols such as LangChain, LangGraph, and MCP to prototype and optimize multi-agent workflows.
- Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.
- Integrate memory, evidence packs, and explainability modules into agentic pipelines.
- Work hands-on with multiple LLM ecosystems:
  - OpenAI GPT models (GPT-4, GPT-4o, fine-tuned GPTs).
  - Anthropic Claude (Claude 2/3 for reasoning and safety-aligned workflows).
  - Google Gemini (multimodal reasoning, advanced RAG integration).
  - Meta LLaMA (fine-tuned/custom models for domain-specific tasks).
Data & Infrastructure
- Collaborate with Data Engineering to build and maintain real-time and batch data pipelines that serve ML/LLM workloads.
- Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.
- Implement model monitoring, drift detection, and retraining pipelines.
- Leverage cloud ML platforms (Amazon SageMaker, Databricks ML) for experimentation and scaling.
Research & Applied Innovation
- Explore and evaluate emerging LLM/SLM architectures and agent orchestration patterns.
- Experiment with generative AI and multimodal models to extend capabilities beyond text (images, structured financial data).
- Collaborate with R&D to prototype autonomous resolution agents, anomaly detection models, and reasoning engines.
- Translate research prototypes into production-ready components.
Collaboration & Delivery
- Work cross-functionally with R&D, Data Science, Product, and Engineering to deliver business-aligned AI features.
- Participate in design reviews, architecture discussions, and model evaluations.
- Document processes, experiments, and results effectively for knowledge sharing.
- Mentor junior engineers and contribute to ML engineering best practices.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field.
- 3+ years of experience building and deploying ML systems.
- Proficiency in Python and libraries such as PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
- Hands-on experience with LLMs/SLMs (fine-tuning, prompt design, inference optimization).
- Demonstrated experience with at least two of the following ecosystems:
  - OpenAI GPT models (chat, assistants, fine-tuning).
  - Anthropic Claude (safety-first AI for reasoning and summarization).
  - Google Gemini (multimodal reasoning, enterprise-scale APIs).
  - Meta LLaMA (open-source, fine-tuned models).
- Familiarity with vector databases, embeddings, and RAG pipelines.
- Ability to work with structured and unstructured data at scale.
- Knowledge of SQL and distributed data frameworks (Spark, Ray).
- Strong understanding of the ML lifecycle: data prep, training, evaluation, deployment, and monitoring.
Preferred Qualifications
- Experience with agentic frameworks (LangChain, LangGraph, MCP, AutoGen).
- Knowledge of AI safety, guardrails, and explainability techniques.
- Hands-on experience deploying ML/LLM solutions in cloud environments (AWS, GCP, Azure).
- Experience with CI/CD for ML (MLOps), monitoring, and observability.
- Familiarity with anomaly detection, fraud/risk modeling, or behavioral analytics.
- Contributions to open-source AI/ML projects or publications in applied ML research.
US Army:
17D - Cyber Capability Developer
17C - Cyber Operations Specialist (Advanced Track)
35Q - Cryptologic Network Warfare Specialist
35N / 35P / 35S (Intel Analysts w/ coding exposure)
US Air Force:
17X - Cyberspace Warfare Operations
1B4X1 - Cyber Warfare Operations
9S100 - Scientific Applications Specialist
3D0X4 / 1D7X1 (Software / Data Ops variants)
US Navy:
CTN - Cryptologic Technician (Networks)
CTI / CTR (with analytics focus)
Information Warfare Officers (1810)
US Marine Corps:
1721 - Cyberspace Warfare Operator
26XX Intel (with data/automation focus)
US Space Force:
Cyber Operations (DCO/OCO) Guardians