Senior Systems Engineer - Email Reliability (Hybrid SRE)
New York, NY (Hybrid, 3 days in office)
Top-tier compensation package
Join an elite technology group where systems engineering is a competitive advantage. They treat infrastructure as code, prioritize reliability above all else, and are seeking a Site Reliability Engineer (SRE) or Senior Systems Engineer to architect the next generation of their critical messaging grid. You'll move beyond managing servers to building the CI/CD pipelines, Observability stacks, and Event-Driven APIs that power the firm's global communication.
The Role
This is not a standard administration role; it is a Production Engineering position. You will own the reliability and architecture of a complex hybrid stack (Linux/Postfix + Microsoft Exchange). You will partner directly with software developers to decouple legacy monolithic systems and replace them with modern, API-first microservices.
Key Responsibilities:
Engineer high-availability, stateless Postfix clusters on Linux. Implement modern Infrastructure as Code (IaC) practices using Terraform or Ansible to manage configuration drift and deployment.
Move beyond "up/down" monitoring. Build sophisticated dashboards (Prometheus, Grafana, ELK) to visualize queue latency, delivery rates, and SMTP performance in real time.
Lead the architectural shift from legacy EWS Polling to Event-Driven Graph API Webhooks. Architect the "glue" code in Python or Go that allows internal apps to send mail without touching the backend directly.
Programmatically manage Sender Reputation. Automate IP warming strategies and enforce DMARC/DKIM/SPF standards across thousands of domains to ensure 99.99% deliverability.
Manage the edge defense layer (Proofpoint/Mimecast) using policy-as-code where possible, focusing on threat detection and automated remediation pipelines.
Who you are:
7+ years of Engineering experience in high-volume messaging or complex infrastructure environments.
Deep comfort with the Linux kernel, TCP stack tuning, and Postfix/MTA configuration management.
Proven experience writing production code (Python, Go, or advanced PowerShell Core) to wrap APIs and automate workflows. You don't just run scripts; you build tools.
Experience managing infrastructure via Terraform, Ansible, or similar config management tools.
You understand SMTP, TLS, and DNS at the packet level and can debug complex handshake failures in a Wireshark trace.
Employment Type: Full-time