While technology is the heart of our business, a global and diverse culture is the heart of our success. We love our people and take pride in fostering a culture built on transparency, diversity, integrity, learning and growth.
If working in an environment that encourages you to innovate and excel, not just in your professional life but in your personal life too, interests you, you would enjoy your career with Quantiphi!
About Quantiphi:
Quantiphi is an award-winning, AI-First global digital engineering company that helps the world's leading Fortune 1000 organizations transform bold ideas into measurable business impact. We go beyond building innovative AI technologies: we solve the problems that matter most to our clients.
Since our founding in 2013, Quantiphi has built a proven track record of turning complex challenges into meaningful outcomes across industries.
Headquartered in Boston, with more than 4,000 professionals worldwide, we partner with global enterprises to deliver large-scale digital, cloud, and AI-driven transformation. #SolvingWhatMatters
We are an Elite and Premier partner to Google Cloud, AWS, NVIDIA, Snowflake, and other leading technology platforms, and our work has been recognized across the industry, including:
- 3 AWS AI/ML Partner of the Year awards
- 3 NVIDIA Partner of the Year awards
- 3 Snowflake Partner of the Year awards
- Rated Leaders by Gartner, Forrester, IDC, ISG, Everest Group and other leading analyst firms
Quantiphi delivers first-in-class AI solutions across Life Sciences, Healthcare, Banking, Financial Services, CPG, Manufacturing, Energy, High-Tech, Telecommunications, and other industries, powered by cutting-edge Generative AI and Agentic AI accelerators.
We are also proud to be certified as a Great Place to Work, reflecting our commitment to our people and our culture.
For more details, visit: Website or LinkedIn Page
Role: DevOps/Observability Engineer
Experience Level: 8+ years
Employment type: Full Time
Location: Remote - USA
What you will do:
We are seeking a Senior DevOps/Observability Engineer with over 8 years of experience to lead the design and implementation of our next-generation, unified observability platform. This pivotal role will focus on architecting a sophisticated observability pipeline from the ground up, leveraging a modern, open-source-centric stack on Amazon Web Services (AWS). The ideal candidate will have deep expertise in designing and deploying observability solutions, with a strong emphasis on OpenTelemetry (OTel) and Kubernetes observability. You will be responsible for deploying, configuring, and integrating a suite of tools including Prometheus, Grafana, and Splunk to provide comprehensive insights into our complex, distributed systems. This is a hands-on role for a technical leader who is passionate about building scalable, reliable, and efficient monitoring and logging systems.
Basic Qualifications (BQs):
- Unified Pipeline Architecture: Proven ability to design and implement end-to-end observability pipelines using OpenTelemetry, Prometheus, and Grafana on centralized infrastructure.
- Cross-Account AWS Observability: Deep expertise in centralizing AWS telemetry, including multi-account CloudTrail organization trails, cross-account CloudWatch metrics/logs, and VPC Flow Logs.
- Log Aggregation & Routing: Strong experience designing log aggregation strategies, implementing noise reduction/filtering at the collector level, and configuring Splunk HTTP Event Collector (HEC) integrations.
- Advanced Alerting & Dashboarding: Hands-on experience building comprehensive alerting frameworks using Alertmanager and CloudWatch Alarms, coupled with advanced dashboard engineering in Grafana (using PromQL).
- Infrastructure as Code (IaC): Advanced proficiency in writing Terraform modules specifically for deploying and managing observability stacks and EC2 infrastructure.
Other Qualifications (OQs):
- Enterprise Scale Log Management: Demonstrated experience managing, routing, and optimizing log pipelines at massive scale (TB/day).
- Kubernetes/Container Observability: Experience deploying Prometheus and OTel within Kubernetes (EKS) or containerized (ECS) environments.
- Cost Optimization: Proven track record of reducing observability spend through strategic metric dropping, log filtering, and efficient storage tiering.
What is in it for you:
- Join one of the world's fastest-growing AI-first digital engineering companies and make a real impact at scale.
- Lead and collaborate with a high-energy team of talented, driven individuals solving complex, meaningful challenges.
- Work with Fortune 500 companies and disruptive innovators in a research-driven environment with 60+ patents.
- Stay ahead of the curve by gaining hands-on experience with cutting-edge AI, ML, data, and cloud technologies while continuously upskilling.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!