Reliability and Observability Lead Engineer
Cognosos Atlanta, GA

  • Vision, Medical, Dental, Paid Time Off, Life Insurance, Retirement
  • Full-Time
Job Description
Description:

Who we are:

Cognosos leads the market in delivering real-time asset location and tracking intelligence solutions. Our lightweight, flexible, and scalable platform deploys quickly both indoors and outdoors, delivering an unparalleled combination of price and performance across a variety of industries, including automotive, logistics, and healthcare. Join our dynamic team as we accelerate our momentum in our current markets and prepare to launch into new ones. We have been named one of Atlanta’s Top Places to Work by the Atlanta Business Journal. Learn more about Cognosos's mission to equip enterprises with instant asset intelligence that unlocks operational potential and optimizes performance at www.cognosos.com.

Cognosos is seeking a Reliability and Observability Lead Engineer responsible for the overall monitoring and observability of our products. The right candidate is obsessed with product reliability and quality improvements and has experience identifying critical product metrics, defining processes, building information-dense dashboards, and translating them into actionable alerts.

If you’re looking for a highly challenging position with the opportunity to advance your technology career in the areas of cloud IoT, machine learning, cloud computing, and security, take a close look at this role.

Responsibilities
  • Work with the executive team to define SLOs for the Cognosos platform and service
  • Define relevant SLIs that support those SLOs
  • Build processes and tools to monitor uptime and measure SLA compliance of the overall service, as well as of individual components (hardware, network, software)
  • Provide dashboards and periodic reports on system performance and availability
  • Work with the hardware and platform engineering teams, and the field services team to define and implement notifications, alarms, and escalation procedures
Qualifications:
  • 5+ years of site reliability engineering experience, preferably at a hardware/software company
  • Background in statistical quality control techniques
  • Experience defining, implementing, and advocating for platform observability objectives
  • Experience with AWS, Prometheus, Grafana, Python, and MySQL (required)
  • Familiarity with Elasticsearch and Kibana
Preferred Skills
  • Experience with New Relic or similar APM platform
  • Experience with Opsgenie or a similar tool
  • Experience with hardware and/or mobile app monitoring
  • Knowledge of Software Development Life Cycle and Agile methodologies
  • Strong leadership skills and experience leading and mentoring a team

Benefits and Perks:

We are pleased to offer:

  • Competitive salaries
  • Unlimited vacation so you can rest and recharge
  • Full benefits program (Health, Dental, Vision, 401(k), life and disability insurance)
  • Paid parking at our Atlanta office
  • Opportunity for equity participation
  • Volunteer opportunities
  • Weekly catered lunches

Whether it’s virtual happy hours, company-wide contests or quarterly cultural outings, we are always looking for ways to keep our employees happy and engaged with their teammates.


Cognosos job posting for a Reliability and Observability Lead Engineer in Atlanta, GA, with a salary of $55 to $73 hourly.