Senior Site Reliability Engineer * Largely Remote | W2 ONLY *

Remote · USA Full-time New today

Overview

Our client, a Global Fortune 50 organization and one of the world's largest distributors of healthcare systems, medical supplies & pharmaceutical products, seeks an accomplished Senior Site Reliability Engineer.

Candidate must be authorized to work in the USA without requiring sponsorship.

Details

Location: San Francisco, CA or Irving, TX 75039 (Largely Remote)
Duration: 6-week contract with possibility of extension or conversion to an FTE role
Work Hours: Mon – Fri, 8:00am – 5:00pm Pacific Time

Notes

While this position is primarily remote, occasional onsite presence may be required in the future in either San Francisco, CA or Irving, TX, depending on the candidate's location.

Description

Lynx is transforming from a traditional production support model to an automation-first, AI-assisted reliability platform following its migration to Azure cloud.

This senior/staff-level Site Reliability Engineer role focuses on operating, stabilizing, and improving highly available systems while driving reliability automation using agentic AI across services.

You will design, operate, and support scalable, observable production systems in Azure, while participating in and leading on-call rotations and high-severity incident response. Responsibilities include root cause analysis, blameless post-incident reviews, and implementing corrective actions.

You will own and enhance observability using Dynatrace (dashboards, alerts, SLIs/SLOs), troubleshoot production issues across Java-based services, Kubernetes, and cloud infrastructure, and collaborate with cross-functional teams to reduce risk and operational toil.

A key focus of this role is designing and building AI-driven automation for incident ingestion, triage, investigation, and remediation using multi-agent patterns, with appropriate guardrails and human-in-the-loop controls. You will also develop automation for incident communication, reporting, and continuous improvement, while remaining accountable for system reliability and AI-driven operations in production.

The role combines software engineering and systems expertise to automate workflows, improve performance, and enhance system resilience using tools such as Azure, Kubernetes, Docker, GitHub Actions, Dynatrace, Python, Bash, and Ansible. Additional responsibilities include developing CI/CD pipelines, managing infrastructure, improving monitoring and observability, and supporting Java applications in production environments.

As a senior leader, you will define reliability standards, influence system architecture, lead incident response efforts, and serve as an escalation point for critical issues. You will drive proactive reliability improvements through monitoring and reporting, communicate system health and risks to leadership, and mentor team members while supporting hiring and onboarding efforts.

You will also ensure compliance with HIPAA and organizational security and regulatory requirements.

Qualifications

Participation in a scheduled on-call rotation is required.
7 years of Site Reliability Engineering or Production Engineering experience.
Strong experience with Azure cloud infrastructure, Kubernetes, Docker, Java production systems, CI/CD (GitHub Actions), and observability platforms (Dynatrace preferred).
Demonstrated experience automating infrastructure and operational workflows.
Deep understanding of SRE principles (SLIs, SLOs, error budgets).
Experience with Ansible.
Solid understanding of Linux and Windows system administration.
Experience working with onsite and offshore teams.
Strong communication skills (written and verbal).
Strong organizational skills and attention to detail.
Strong analytical and problem-solving skills.
Experience in healthcare software or compliance solutions is a plus.

Preferred / Differentiating Qualifications

Experience designing automation that replaces or materially reduces on-call toil.
Experience building or orchestrating AI agents applied to operational workflows.
Familiarity with multi-agent architectures or distributed automation systems.
Strong judgment around risk management, safety boundaries, and human-in-the-loop design.
Experience working in healthcare or regulated environments.

Amerit Consulting provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.

Applicants with criminal histories are considered in a manner that is consistent with local, state and federal laws.
