Job Summary:
SpaceX is actively developing technologies to enable human life on Mars and is seeking a Sr. Site Reliability Engineer for the Starlink program. The role involves solving challenges to improve satellite internet services, with a focus on system upgrades, infrastructure management, and collaboration with engineering teams.
Responsibilities:
• Upgrade existing distributed systems to become sharded and geo-redundant in multiple data centers
• Advance existing deployment, monitoring, and alerting infrastructure to support a multi-region environment
• Manage petabyte-scale bare-metal compute clusters
• Closely collaborate with engineers across all programs to create highly operable, scalable, and maintainable products
• Engage throughout the whole software development lifecycle of services, from inception through design, deployment, operation, and iterative refinement
• Focus on performance bottlenecks and performance improvement techniques
Qualifications:
Required:
• Bachelor's degree in computer science, engineering, math, or a scientific discipline and 5 years of software development experience; OR 7+ years of professional experience building software with site reliability or DevOps in lieu of a degree
• Experience with Linux operating systems
• Willing to work extended hours and weekends when needed
• Active Top Secret or TS/SCI clearance
• To conform to U.S. Government export regulations, applicant must be a (i) U.S. citizen or national, (ii) U.S. lawful, permanent resident (aka green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain the required authorizations from the U.S. Department of State.
Preferred:
• 5+ years of rigorous experience with site reliability or DevOps
• Experience with Kubernetes and Istio for on-premises deployment
• Experience with in-stream data processing and analytics using open-source platforms such as Apache Kafka, Spark, HBase, HDFS, Flink
• Experience troubleshooting hardware and network-layer issues
• Programming experience in Python, C#, Java, Scala, Go or similar languages
• Good understanding of version control, testing, continuous integration, build, deployment and monitoring
Company:
SpaceX designs, manufactures, and launches rockets and spacecraft to facilitate space exploration. Founded in 2002, the company is headquartered in Hawthorne, USA, and has 1001-5000 employees. It is currently a late-stage company.