Recruiting Guy

60 Recruiting Guy Senior Reliability Engineer Jobs Hiring Near You

As a Senior Reliability Engineer, you'll work within a focused team, developing groundbreaking ... NVIDIA uses AI tools in its recruiting processes. NVIDIA is committed to fostering a diverse work ...

Sr. Reliability Engineer

Albany, NY · On-site

$100K - $110K/yr

Optimize Performance as a Sr. Reliability Engineer -- Strengthen Asset Reliability Across Advanced Manufacturing Operations Are you an experienced engineer focused on improving asset performance ...

Reliability Engineer

Kermit, TX · On-site

$97K - $122K/yr

We have a great opportunity for a Sr. Reliability Engineer in Kermit, TX. Who We Are: We are a leading solutions provider to the energy industry. Our portfolio of offerings includes oilfield logistics ...

The Senior Reliability Engineer is responsible for stability, performance, and scalability of the PCS Data Warehouse platform. This role combines production support, engineering, and automation to ...

Job Purpose and Impact The Senior Reliability Engineer will be a key leader on the maintenance and reliability management team for the discipline of reliability engineering. In this role, you will ...

The Senior Reliability Engineer position is responsible for improving equipment reliability, asset performance, and long-term equipment health across the facility. This role integrates engineering ...

Sr. Reliability Engineer | Primient. About Primient: Primient is a century-old company with an ... in which it recruits and hires employees. We collect the following categories of personal ...

GRVTY (Northstrat) is seeking a Senior Reliability Engineer to own the reliability of our fielded and in-production radio-frequency (RF) electronic warfare (EW) systems. These systems are already ...

Recruiting Guy Jobs Information

Senior Reliability Engineer

RIT Solutions

Atlanta, GA

Contractor

Posted 25 days ago


Job description

Senior Reliability Engineer
Hybrid - Malvern, PA
Job Description
As a Senior Reliability Engineer, you will play a critical role in solving impactful operational problems. You are curious and take a proactive approach to identifying problems and making improvements. You balance innovative thinking with pragmatism and understand the long-term impacts of technical decisions. You communicate complex ideas clearly and collaborate effectively to deliver scalable solutions.
Core Responsibilities
The team is focused on automating incident response and infrastructure management. While Java and Python receive a stronger emphasis, candidates with solid programming fundamentals in any language and the ability to adapt will be considered. Experience with AWS and event-driven architectures is also valuable.
From a technical standpoint, familiarity with observability concepts (e.g., distributed tracing) and tools like Prometheus or Grafana is beneficial, though not mandatory. More important is an understanding of the underlying principles, such as instrumentation and monitoring strategies.
* Improve resiliency engineering practices across platforms and applications, including resilient application design patterns, system observability, and deployment strategies.
* Detect, troubleshoot, and resolve incidents.
* Develop automation for incident response and infrastructure management.
* Develop and support OpenTelemetry integrations for multiple application platforms (browser, ECS, Lambda, etc.) and languages (JavaScript, Java).
* Contribute to architectural decisions and support implementation of solutions.
Skills and Qualifications
* Deep knowledge of Java or JavaScript. Practical experience developing and operating software in distributed systems environments.
* Problem-solving and analytical thinking: ability to diagnose complex issues and propose efficient solutions. Strong debugging and optimization skills for performance and scalability.
* Cloud platforms: hands-on experience with AWS services and cloud infrastructure.
* System architecture and design: ability to design scalable, secure, and maintainable systems.
* Working knowledge of Python (or similar scripting language).
* Strong knowledge of resiliency engineering techniques for both platforms and applications.
* Experience troubleshooting complex production issues and implementing effective mitigations.
* Familiarity with OpenTelemetry specification and core APIs.
From a screening perspective, we recommend focusing on:
  • How candidates approach software releases and validate functionality
  • Their understanding of system dependencies and fault tolerance
  • Experience with diagnosing and resolving production issues
  • Their ability to reflect on past incidents and identify improvements
  • Evidence of systems thinking and architectural awareness