
Remote Observability Jobs in Tennessee (NOW HIRING)

Improve observability by suggesting and implementing better logging practices and metric coverage. 4. ... Provide software-side support during integration testing, mainly remote, with occasional on-site work. 5. ...

Senior Software Engineer

Nashville, TN · Remote

$118K - $156K/yr

Remote TRAVEL: Rarely for conferences, strategic planning sessions, and key stakeholder meetings ... Proficiency with DevOps tooling (Git, CI/CD, Config, Observability), containerization, and ...

Senior Software Engineer

Nashville, TN · On-site +1

$118K - $156K/yr

Remote TRAVEL: Rarely for conferences, strategic planning sessions, and key stakeholder meetings ... Proficiency with DevOps tooling (Git, CI/CD, Config, Observability), containerization, and ...

Remote-USA We are a healthcare revenue cycle company delivering high-impact financial outcomes for ... Establish and enforce engineering standards for code quality, API design, security, observability ...

Google Edge Tech Lead

Nashville, TN · On-site +1

$86K - $203K/yr

Drive infrastructure observability and logging using Cloud Monitoring, Logging, and Operations ... Deep, hands-on expertise with Terraform (advanced level), including reusable modules, remote state ...

Senior Software Engineer

Memphis, TN · On-site +1

$109K - $144K/yr

We are located in Memphis, TN, however we will consider candidates for remote work that are located ... Integrate with external third-party systems and improve reliability and observability. * Mentor ...

Senior Site Reliability Engineer

Nashville, TN · Remote

$55 - $73.25/hr

... Observability & Reliability * Build and maintain monitoring, alerting, and dashboards using ... Fully remote: work from anywhere in the world * A team where it's safe to be honest, learn from ...


Remote Observability information

What are some common challenges faced by professionals in a Remote Observability role, and how can they be addressed?

Professionals in Remote Observability often face challenges such as monitoring complex, distributed systems, ensuring reliable data collection, and quickly identifying the root causes of issues without physical access to infrastructure. To address these challenges, it's essential to implement robust monitoring tools, establish clear alerting thresholds, and maintain strong communication with development and operations teams. Regular knowledge-sharing sessions and continuous learning about new observability platforms can also help remote teams stay effective and proactive.

What is the difference between Remote Observability and Remote Monitoring?

Aspect | Remote Observability | Remote Monitoring
Focus | Comprehensive system insights, including logs, metrics, and traces | Tracking specific system metrics and alerts
Tools | OpenTelemetry, Grafana, Jaeger | Nagios, Zabbix, Datadog
Work Environment | DevOps, SRE teams managing complex distributed systems | IT operations teams overseeing system health
Credentials | Knowledge of cloud platforms, scripting, and monitoring tools | Basic networking, system administration skills

Remote Observability provides a holistic view of system health through logs, metrics, and traces, enabling proactive troubleshooting. Remote Monitoring focuses on tracking specific metrics and alerts to detect issues. While both roles involve system oversight, observability offers deeper insights for complex environments, whereas monitoring emphasizes real-time alerts for system stability.

What are the key skills and qualifications needed to thrive as a Remote Observability Engineer, and why are they important?

To thrive as a Remote Observability Engineer, you need expertise in monitoring, logging, and tracing, typically supported by experience in systems administration or DevOps and a relevant technical degree. Familiarity with observability tools like Prometheus, Grafana, Datadog, ELK Stack, and cloud monitoring platforms, as well as certifications such as AWS Certified Cloud Practitioner or Google Professional Cloud DevOps Engineer, is highly valued. Strong analytical thinking, problem-solving, and effective communication are vital soft skills for diagnosing issues and collaborating with distributed teams. These skills and qualifications ensure reliable system performance, rapid incident response, and seamless user experiences in complex, cloud-based environments.

What is remote observability?

Remote observability refers to the ability to monitor, measure, and understand the state and performance of systems, applications, or infrastructure from a distance, typically using specialized tools and platforms. It is crucial for organizations that operate distributed or cloud-based environments, as it allows teams to detect issues, analyze metrics, and ensure reliability without needing physical access to the hardware. Remote observability often involves collecting logs, metrics, traces, and other telemetry data to provide a comprehensive view of system health and performance.
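As a rough illustration of the three telemetry types named above (logs, metrics, traces), here is a minimal Python sketch. It uses only the standard library and keeps everything in memory. The names `log_event`, `count`, `span`, and `handle_request` are hypothetical and not taken from any listing on this page. A real deployment would send this data to an observability backend instead.

```python
import json
import time
import uuid
from collections import Counter
from contextlib import contextmanager

# Hypothetical in-memory sinks; a real setup would ship these to a backend.
LOGS = []
METRICS = Counter()
SPANS = []


def log_event(level, message, **fields):
    """Record a structured (JSON) log line."""
    record = {"level": level, "message": message, **fields}
    LOGS.append(json.dumps(record, sort_keys=True))


def count(metric_name, amount=1):
    """Increment a named counter metric."""
    METRICS[metric_name] += amount


@contextmanager
def span(name, trace_id):
    """Time a unit of work and record it as a trace span."""
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append({
            "trace_id": trace_id,
            "name": name,
            "duration_s": time.perf_counter() - start,
        })


def handle_request(order_id):
    """Illustrative handler that emits a log, a metric, and a span."""
    trace_id = uuid.uuid4().hex
    with span("handle_request", trace_id):
        count("requests_total")
        log_event("info", "order processed", order_id=order_id, trace_id=trace_id)
    return trace_id
```

Sharing one `trace_id` between the log line and the span is what lets a remote engineer connect a single request across all three signals.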
Software Reliability Engineer - Warehouse Management Systems


Lineage Logistics Holding, LLC

On-site, Remote

Full-time

Medical, Dental, Life, Retirement, PTO

Posted 12 days ago


Job description

The Software Reliability Engineer (SRE) will play a critical role in ensuring that our Warehouse Management Software (WMS) runs seamlessly across both automated and manual facilities. This role focuses on investigating, diagnosing, and resolving operational software issues that impact warehouse performance, freeing developers to focus on new features and ensuring WMS never disrupts day-to-day operations.
Please note: We are unable to sponsor work authorization now or in the future for this role.
Roles and Responsibilities
1. Operational Issue Investigation and Quick Resolution
  • Monitor and respond to operational issues affecting WMS functions (e.g., receiving, shipping, inventory).

  • Analyze system logs, error reports, and transaction flows to identify anomalies or failures.

  • Work closely with Level 1 support and warehouse operation teams to understand incident symptoms and timelines.

  • Execute quick resolutions by using extended user rights, database interventions, or WMS configuration changes.

2. Code-Level Debugging
  • Debug application code, workflows, customizations, and interfaces to identify bugs or performance bottlenecks.

  • Collaborate with WMS QA team to reproduce issues in test environments and trace through application workflows to isolate root causes.

  • Collaborate with Product/Development teams to propose, implement, and test code fixes.

3. Real-Time System Monitoring
  • Use tools like Datadog or internal diagnostics to monitor WMS behavior.

  • Proactively set up or refine alerts for failure patterns (e.g., inventory mismatches, interface timeouts, RF disconnects).

  • Improve observability by suggesting and implementing better logging practices and metric coverage.
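For candidates wondering what an alert on a failure pattern might look like, the following Python sketch shows one common approach: a sliding-window threshold. It is not the tooling used for this role. The class name `FailureRateAlert` and the rule of 3 timeouts in 60 seconds are made-up examples, and the sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class FailureRateAlert:
    """Fire when at least `threshold` failures fall within `window_s` seconds.

    Assumes timestamps are passed in non-decreasing order.
    """

    def __init__(self, threshold, window_s):
        self.threshold = threshold
        self.window_s = window_s
        self._timestamps = deque()

    def record_failure(self, timestamp):
        """Register one failure; return True if the alert should fire."""
        self._timestamps.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while timestamp - self._timestamps[0] > self.window_s:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Hypothetical rule: 3 interface timeouts within 60 seconds.
alert = FailureRateAlert(threshold=3, window_s=60)
fired = [alert.record_failure(t) for t in (0, 10, 100, 110, 120)]
# fired == [False, False, False, False, True]
```

The first two failures age out before the third arrives, so the alert fires only once three failures cluster within the window.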

4. Interface Troubleshooting
  • Investigate communication failures between WMS and other Products (e.g., LinOS, Link, EDI, Easymetrics).

  • Troubleshoot integration issues between the WMS and external systems (e.g., DevOps, DCOps).

  • Provide software-side support during integration testing, mainly remotely, with occasional on-site work.

5. Incident Management & Escalation
  • Participate in on-call rotations or site support shifts for time-sensitive incidents.

  • Coordinate with operations, IT, and engineering during critical events to ensure fast resolution.

  • Document incidents thoroughly, including root causes, fixes, and follow-up actions.

6. Post-Incident Review & Continuous Improvement
  • Contribute to postmortem analysis for high-impact incidents.

  • Recommend and implement configuration changes or process improvements to prevent repeated issues.

  • Update or create playbooks and troubleshooting guides for known WMS issues.

7. Internal Tooling and Automation
  • Develop scripts or queries (e.g., SQL) to streamline log analysis, system diagnostics, or data validation.

  • Propose internal utilities to detect edge-case failures or performance degradations early.

  • Support development of internal test tooling and simulations for recurring business scenarios.
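A script of the kind described above could be as small as the following Python sketch, which counts error lines per component. The log format, the component names, and the sample lines are all invented for illustration. They do not come from the posting or from any real WMS.

```python
import re
from collections import Counter

# Assumed log format: "<date> <time> <LEVEL> <component>: <message>"
LINE_RE = re.compile(
    r"^\S+ \S+ (?P<level>[A-Z]+) (?P<component>[\w-]+): (?P<message>.*)$"
)


def summarize_errors(lines):
    """Count ERROR lines per component, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[match.group("component")] += 1
    return counts


# Made-up sample lines in the assumed format.
sample = [
    "2024-01-01 08:00:01 INFO receiving: pallet scanned",
    "2024-01-01 08:00:02 ERROR interface: timeout waiting for response",
    "2024-01-01 08:00:05 ERROR interface: timeout waiting for response",
    "2024-01-01 08:00:09 ERROR inventory: quantity mismatch",
    "not a log line",
]
summary = summarize_errors(sample)
# summary["interface"] == 2 and summary["inventory"] == 1
```

Lines that do not match the expected format are skipped instead of raising an error, which keeps the script usable on messy production logs.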

8. Cross-Functional Collaboration
  • Work with Product/Development teams to escalate and fix production bugs.

  • Collaborate with QA teams to validate fixes or reproduce intermittent issues.

  • Partner with implementation teams to train staff on WMS behavior and provide escalation support.

#LI-Remote
Why Lineage?
This is an excellent position to begin your career path within Lineage! Success in this role enables greater responsibilities and promotions! A career at Lineage starts with learning about our business and how each team member plays a part each and every day to satisfy our customers' requirements. Beyond that, you'll help us grow and learn on our journey to be the very best employer in our industry. We'll ask you for your opinion and ensure we do our part to keep you developing and engaged as we grow our business. Working at Lineage is energizing and enjoyable. We value respect and care about our team members.
Lineage is an Equal Employment Opportunity Employer and is committed to compliance with all federal, state, and local laws that prohibit workplace discrimination and unlawful harassment and retaliation. Lineage will not discriminate against any applicant on the basis of race, color, age, national origin, religion, physical or mental disability or any other protected status under federal, state and local law.
Benefits
Lineage provides safe, stable, reliable work environments, medical, dental, and basic life and disability insurance benefits, 401k retirement plan, paid time off, annual bonus eligibility, and a minimum of 7 holidays throughout the calendar year.


About Lineage Logistics

Sourced by ZipRecruiter

At Lineage, we have a shared purpose: We are transforming the food supply chain to eliminate waste and help feed the world. Lineage Logistics is the industry's leading innovator in temperature-controlled supply chain and logistics. Lineage's expertise in end-to-end logistical solutions, its unrivaled real estate network, and its use of technology combine to promote food safety, increase distribution efficiency, advance sustainability, lessen environmental impact, and minimize supply chain waste. As a result, Lineage helps customers ranging from Fortune 500 companies to small family-owned businesses increase the efficiency and protect the integrity of their temperature-controlled supply chain. In pursuit of this shared purpose, we are working to build a world class Solutions Design team.

Industry

Trucking

Company size

10,000+ Employees

Headquarters location

Novi, MI, US

Year founded

2012