Senior Technology Engineer (Operations / AI & Observability)
Location: 3 Days Onsite In Deerfield Beach, FL
Duration: Long-term Contract
Overview:
We are seeking a Senior Technology Engineer with a strong developer mindset and experience across AI, observability, and infrastructure operations. This role is responsible for supporting and enhancing monitoring and event management platforms while helping drive the integration of AI-driven automation across operational systems.
This position blends hands-on systems administration, observability platform ownership, and emerging AI capabilities to improve signal quality, reduce operational friction, and accelerate incident response. The ideal candidate is comfortable working across traditional infrastructure and modern AI-enabled tooling.
Key Responsibilities
AI & Automation (Core Focus)
- Design, develop, and support AI-driven solutions to enhance monitoring, alerting, and incident response workflows
- Leverage tools such as Copilot, ChatGPT, and Anthropic (Claude) to build and optimize automation and operational efficiencies
- Partner with internal AI governance teams (AICOE) to align with standards and best practices
- Contribute to long-term initiatives enabling cross-platform AI integration and automation
Observability & Monitoring Platforms
- Support and maintain platforms such as SolarWinds, Datadog, Dynatrace, and Opsgenie (or similar)
- Ensure platform health, upgrades, patching, and onboarding of new systems/devices
- Improve alert quality, signal-to-noise ratio, and event routing accuracy
- Collaborate with platform owners to enhance integrations and monitoring coverage
Systems Administration (Required)
- Provide hands-on support for Linux and Windows servers
- Perform agent deployments, upgrades, and troubleshooting
- Support monitoring/logging enablement across hybrid environments (on-prem + cloud)
Cloud & Azure Monitoring
- Work within Microsoft Azure environments, supporting monitoring and logging capabilities
- Utilize tools such as Azure Monitor and Log Analytics
- Ensure visibility and observability for cloud-hosted workloads
SIEM & Security (Nice to Have)
- Exposure to SIEM platforms such as Azure Sentinel and Cribl
- Support log ingestion, troubleshooting, and collaboration with security teams
- Help align monitoring data with security detection requirements
Continuous Improvement
- Contribute to documentation, runbooks, and operational best practices
- Identify incremental improvements to reduce repeat incidents and manual effort
- Participate in knowledge transfer and platform lifecycle activities
Qualifications
- 4-5+ years of experience in infrastructure operations, observability, or platform engineering
- Strong developer mindset with experience in scripting and automation (PowerShell, Python, Bash, etc.)
- Hands-on experience supporting Linux and Windows environments
- Experience with monitoring/observability tools (SolarWinds, Datadog, Dynatrace, or similar)
- Familiarity with AI tools and platforms (e.g., Copilot, ChatGPT, Anthropic/Claude) and their application in operational workflows
- Experience with Microsoft Azure and cloud-based monitoring concepts
- Foundational networking knowledge (SNMP, telemetry, network monitoring concepts)
Preferred / Nice to Have:
- Exposure to SIEM/security platforms (Azure Sentinel, Cribl, or similar)
- Experience working in Dev workflows (GitHub, CI/CD, code reviews)
- Familiarity with Agile/SAFe methodologies
- Experience with hybrid cloud environments and Azure AD