Site Reliability Engineer / Platform Operations Engineer

Targeted Talent

Winnipeg, MB

Full-time

Posted 26 days ago


Job description

We are looking for an experienced Site Reliability Engineer or Platform Operations Engineer for our client. This is a permanent position that starts remote, with later relocation to Calgary or Winnipeg. Our client is a global enterprise company with a product that you've likely used.

You Will:
  • Own development projects, providing technical guidance and delivering against the Platform & Service Operations Engineering roadmap.
  • Design and implement wargames to test our operational response and identify areas of weakness in our platforms.
  • Act as the technical and management escalation point for Service Operations Centre (SOC) engineers and during major incidents.
  • Troubleshoot, reproduce and mitigate issues in our production environments.
  • Mentor other team members.
  • Operate global AWS platforms at scale.
You Have:
  • Evidence of strong troubleshooting, problem-solving and investigative skills
  • Experience with AWS or other cloud providers
  • Experience developing in Java
  • Experience with major incident management and operating production platforms at scale
  • Experience working with distributed web applications
  • Experience automating operational tasks and processes using other languages
  • Understanding of relational and/or NoSQL data structures
  • Experience mentoring/influencing peers
  • Ability to identify improvements, weigh risks against benefits, and translate them into technical requirements
Bonus:
  • Experience with Ansible, Terraform, Python
  • Experience working with Serverless / Containers
  • Experience with ELK and/or Graphite, Prometheus, Grafana
  • Experience using tracing tools in production
  • Experience in Chaos Engineering / Failure Injection Testing
  • Experience of working in an Agile Environment
  • Experience working in a similar site reliability role
This role offers great perks and a competitive salary. Please apply to the job posting if it matches your career path!

About Targeted Talent

Sourced by ZipRecruiter

Your single source for HR professional services, we offer job seekers specialized employment services, spanning contract, permanent positions, and project solutions for highly specialized and managerial level talent needs. The abilities of our team of specialized recruiters and consultants extend far beyond resume or career counseling. With hundreds of collaborators strategically located throughout the country, our organization possesses the local market knowledge and industry relationships that make successful geography-specific reach possible.

Industry

Recruiting and staffing services

Company size

11 - 50 Employees

Headquarters location

Vancouver, BC, CA



Frequently asked questions

Q: What skills or qualities help someone succeed as a Site Reliability Engineer?

A: To succeed as a Site Reliability Engineer (SRE), one should possess strong technical skills in areas such as programming languages (e.g., Python, Go), cloud computing platforms (e.g., AWS, GCP), and operating systems (e.g., Linux). Additionally, soft skills like effective communication, problem-solving, and collaboration are crucial, as SREs often work closely with cross-functional teams to identify and resolve complex technical issues. By combining these technical and soft skills, SREs can ensure high system reliability, efficiency, and scalability, ultimately driving business growth and career advancement opportunities.

Q: What is the career path for a Site Reliability Engineer?

A: A Site Reliability Engineer's typical career progression starts in a junior SRE role focused on incident response, monitoring, and troubleshooting, then advances to a mid-level role as an SRE Lead or Team Lead, overseeing team operations and implementing reliability best practices. At the senior level, SREs often become Technical Leads or Engineering Managers, driving strategic decisions and technical direction for the organization. Throughout their career, SREs can develop skills in areas like cloud computing, containerization, and automation, as well as soft skills like communication, collaboration, and problem-solving, ultimately leading to opportunities in leadership, architecture, or specialized roles like DevOps or Cloud Engineering.


