Site Reliability Engineer

Remote · USA · Full-time
CANDIDATES MUST BE LOCATED IN THE UK

About Intermedia

Are you looking for a company where YOUR VOICE is heard? Where you can MAKE A DIFFERENCE? Do you thrive in a FAST-PACED work environment? Do you wake every morning EXCITED to work with GREAT PEOPLE and create SUCCESS TOGETHER? Then Intermedia is the place for you.

Intermedia has established itself as a leading provider of cloud communications and collaboration tech that allows companies to connect. We have a strong track record of growth, profitability, and creating an environment where everyone matters. Everyone. While we are fast-paced and admittedly a bit intense, we promise that you won’t be bored. You will find Intermedia is a place where you can indulge your passion for creating and supporting great cloud technology. What’s more, we always look to promote from within and have many employees who have been with us 10, 15, and 20+ years! Culture at Intermedia is built on teamwork and transparency. We hold each other accountable and always have each other’s back! Are you ready to make your mark?

About the role

We are looking for an SRE to improve reliability and operational readiness with a strong focus on metrics, alerting, and event management. You will build and maintain monitoring using Prometheus/VictoriaMetrics, integrate alerts and events with an event management platform, and participate in on-call rotations to drive fast incident response and continuous improvement across Windows and Linux environments.

Key Responsibilities

  • Build and operate metrics/monitoring platforms: Prometheus and/or VictoriaMetrics (scrape configs, exporters, recording rules)
  • Design and maintain alerting strategy: thresholds, anomaly detection where applicable, alert routing, deduplication, and noise reduction
  • Integrate monitoring/alerting and events with an event management platform (correlation, enrichment, routing, incident workflows)
  • Create and maintain dashboards and operational visibility (Grafana or equivalent)
  • Develop and maintain runbooks, operational playbooks, and incident response procedures
  • Participate in on-call shifts: triage alerts, manage incidents, coordinate response, and maintain communication during outages
  • Conduct root-cause analysis and postmortems, and implement corrective/preventive actions
  • Improve service reliability through SLOs/SLIs, capacity planning, and automation to reduce toil
  • Support monitoring for core infrastructure and services on Windows and Linux, including HA components and clusters
  • Collaborate with DevOps/Engineering to instrument applications and standardize telemetry (metrics, logs, traces where applicable)

Skills, Knowledge and Expertise

  • Experience in SRE / Operations / DevOps with production incident ownership
  • Hands-on experience with Prometheus and/or VictoriaMetrics (exporters, alert rules, recording rules, troubleshooting)
  • Experience integrating alerting/event pipelines with an event management platform or similar event correlation tools
  • Strong troubleshooting skills across Linux and Windows systems (networking, OS, services)
  • Ability to build reliable alerting with minimal noise (correlation, grouping, suppression, maintenance windows)
  • Experience with Git-based workflows for monitoring-as-code and configuration management

Nice to have

  • Grafana administration and dashboard design standards
  • Log management (ELK/EFK, Loki) and/or tracing (OpenTelemetry)
  • Automation skills (Python, PowerShell, Bash) and configuration tools (Ansible)
  • Messaging/cache/proxy operations: RabbitMQ, reputed company, Nginx
  • Experience with Windows clustering or HA environments
  • Experience defining SLOs/SLIs and operational KPIs
  • Experience managing VoIP components and protocols (SIP, FreeSWITCH, OpenSIPS, session border controllers)
  • Experience with load balancing components ( reputed company reputed company, reputed company GTM)
  • Experience with virtualization platforms such as VMware or Hyper-V
  • Experience administering AWS or Azure tenants

On-call expectations

  • Participation in a rotating on-call schedule (including nights/weekends as needed)
  • Ownership of incident response: rapid triage, escalation, mitigation, and follow-up improvements
  • Commitment to improving monitoring quality to reduce alert fatigue and improve MTTR

Diversity, Inclusion, and Equal Opportunity

We hire, promote, and compensate employees based on their ability to perform their job responsibilities, without regard to race, color, creed, religion, sex, gender, marital status, national origin, ancestry, age, citizenship, physical or mental disability, sexual orientation, or any other basis protected by applicable law (collectively referred to in our Code of Conduct as “Protected Classes”). We do not tolerate employment discrimination in the workplace, and we are committed to making reasonable accommodations for identified disabilities or other limitations as required by applicable laws. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
