Principal DevOps Engineer

Tempe, AZ • Remote

Full-time

Posted 28 days ago

Job description

Iridium is an award-winning and innovative satellite communications company with bragging rights to the only network that offers voice and data connectivity anywhere in the world. For over 20 years, Iridium’s unique network and services have supported critical communications needs for individuals, businesses, and the evolving Internet of Things.

At Iridium, we understand the importance of staying connected and the limitations of traditional communications networks. People across the globe, including first responders, humanitarians, global militaries, scientific researchers, and lone workers, as well as ships, aircraft and remote operations all rely on Iridium to stay connected. We take our responsibility for providing these essential communications very seriously and pride ourselves on offering a reliable lifeline when needed. Likewise, Iridium is committed to providing an exciting and innovative workplace, where employees are challenged to think outside the box and collaborate on new, bold ideas and solutions. Our talented teams are passionate about their work and the impact our company makes around the world. Iridium fosters an empowering and inclusive culture that allows employees to genuinely be their best selves. We are looking for others who want to join this truly unique company that celebrates our employees and provides the opportunity to truly make a difference in the world.

What We’re Looking For:

We are seeking a highly skilled Principal DevOps Engineer to lead the strategy, design, and evolution of DevOps practices supporting our cloud-native Open RAN and 4G/5G Core network. In this role, you will set the technical direction for CI/CD, infrastructure-as-code, automation, and observability frameworks that enable reliable, scalable operations across Core, RAN, Transport, and Cloud domains. You will define and implement greenfield CI/CD pipelines, establish standardized automation and monitoring approaches, and create advanced telemetry, alerting, and automated remediation capabilities. Through close partnership with NOC Operations, Engineering, Cloud, Development, and Test teams, you will help drive operational excellence, reduce Mean Time to Repair (MTTR), and minimize alert fatigue. As a technical leader within the Gateway organization, you will provide governance, best practices, and hands-on expertise to teams across global time zones. The ideal candidate brings deep experience with cloud-native architectures, Kubernetes, CI/CD, telemetry pipelines, and infrastructure-as-code, along with familiarity with telecom network environments and Agile practices.

What You’ll Do:

Cloud & CI/CD Enablement

Lead the design and implementation of CI/CD pipelines supporting cloud-native and G-RAN deployments
Manage Kubernetes environments (EKS and on-prem) by:
- Monitoring CNF health
- Automating scaling policies
- Optimizing resource allocation
Implement Infrastructure-as-Code solutions using Terraform and Ansible to deploy and maintain monitoring and observability stacks
Integrate observability platforms and tools into operational workflows to strengthen visibility and diagnostic capabilities

Observability & Monitoring Architecture

Design and enhance observability frameworks using:
- Grafana dashboards and alert correlation
- Health checks, backups, etc.
- Core CDR dashboards (IMS & Packet Core)
- Viavi probe integrations
- SolarWinds telemetry feeds
Build unified dashboards that provide national-level visibility and real-time health insights
Optimize alarm thresholds and event correlation to reduce false positives and alert storms
Implement structured logging, metrics, and distributed tracing for cloud-native network functions

Automation & Self-Healing Engineering

Develop automation using Python, Bash, or Go to:
- Auto-triage common alarms
- Perform health validations
- Trigger corrective actions and workflows
Build event-driven automation using Kafka feeds from Mavenir and Gatehouse OSS systems
Implement closed-loop automated remediation for common failure scenarios (e.g., pod restarts, resource exhaustion, signaling retries) to reduce manual NOC intervention
Implement Infrastructure as Code (Terraform/Ansible) for monitoring stack deployments
Integrate observability tools into DevSecOps workflows

Incident & Reliability Engineering

Support Major Incident Management by providing telemetry insights, automated diagnostics, and post-incident analyses
Perform post-incident analysis using logs, traces, and performance metrics
Drive improvements that reduce MTTD and MTTR
Partner with Core, RAN, Transport, and Cloud engineering teams to prevent recurring issues through root-cause analysis

Leadership & Continuous Improvement

Mentor junior DevOps and NOC engineers in automation, observability, and DevOps best practices
Develop reusable automation frameworks and operational standards
Document playbooks, reference architectures, and best-practice patterns to mature operations from reactive to predictive

What You’ll Need to Succeed:

Bachelor’s degree in Engineering, Computer Science, Telecommunications, or related field
10+ years of experience in DevOps, Site Reliability Engineering, or network automation roles supporting cloud-native environments
Strong proficiency with CI/CD pipeline management, Infrastructure-as-Code frameworks, and containerized deployments
Hands-on experience with Kubernetes (EKS and on-prem K8s) and Docker-based cloud-native network functions (CNFs)
Proficiency with AWS cloud services
Advanced Python scripting skills, with additional experience in Bash or Go
Experience building Grafana dashboards, alerting logic, and observability workflows
Familiarity with Kafka-based event streaming architectures
Strong Linux system administration skills
Strong understanding of telecom architecture, including 4G EPC, 5G Core, IMS, and Open RAN
Experience integrating and operationalizing probe-based observability solutions (e.g., Viavi)
Deep understanding of monitoring concepts, including metrics, logs, traces, and APM
Excellent communication skills, with the ability to convey products, deliverables, analyses, and/or issues clearly and confidently, and recognize and adapt to different communication techniques
Ability to analyze a situation or problem, generate effective solutions, and see those solutions through to completion
Creativity and resourcefulness to make reliable decisions and determine methods on new assignments
Ability to thrive in a dynamic environment by handling multiple tasks and managing shifting priorities
Proactive approach to sharing what you have learned with others

Things That Would be Great if You Brought to the Table:

Experience supporting Mavenir 4G/5G Core in production
Knowledge of SIP, Diameter, GTP, HTTP/2, and PFCP protocols
Experience with Prometheus, ELK stack, or OpenTelemetry
CI/CD experience (GitLab, Jenkins, ArgoCD)
Kubernetes certification (CKA/CKAD)
AWS certifications
Experience building closed-loop automation for telecom NOCs

We’ll also need you to:

Participate in on-call rotations for automation platform support
Support major incidents requiring automation troubleshooting
Travel up to 10% if needed

Work Environment:

This position is primarily office-based and largely sedentary, with most of the work performed at a computer. The role typically requires the use of basic office equipment such as a phone, video, computer, keyboard, mouse, and printer.

We believe in-person connection drives innovation, strengthens mentorship, and builds culture, while flexibility enables employees to do their best work. Under Iridium’s Hybrid Work Policy, employees are expected to work at least three days per week (approximately 60%) in an Iridium office to support collaboration, relationship-building, and professional growth.

Additional Information

This job description outlines the general nature and level of work for this role and is not a comprehensive list of duties, responsibilities, or qualifications. Employees may be assigned additional responsibilities as needed.

Iridium is an Equal Opportunity Employer, including individuals with disabilities and protected veterans.

Iridium Satellite, LLC job posting for a Principal Dev Ops Engineer in Tempe, AZ, with a salary of $117,400 to $171,200 annually.