Site Reliability Engineer 3

Remote · USA Full-time New today

About the position

The Site Reliability Engineering team designs and builds the global infrastructure on which MongoDB deploys its services, focusing on the flagship MongoDB Atlas platform. As customers grow and globalize, services must satisfy demands for low-latency requests around the globe and comply with various data sovereignty requirements. The SRE Team’s mission is to build this increasingly complex infrastructure, while continually lowering the operational burden associated with it and increasing internal visibility into the health of the system. They are strong believers in infrastructure-as-code and self-healing systems. The SRE Team is fully integrated with all other engineering teams, working closely together with a soft and traversable boundary between their areas of responsibility. Candidates based in New York City are sought for a hybrid working model.

MongoDB is built for change, empowering customers and people to innovate at the speed of the market. They have redefined the database for the AI era, enabling innovators to create, transform, and disrupt industries with software. MongoDB’s unified database platform (the most widely available, globally distributed database on the market) helps organizations modernize legacy workloads, embrace innovation, and unleash AI. Their cloud-native platform, MongoDB Atlas, is the only globally distributed, multi-cloud database and is available across AWS, Google Cloud, and Microsoft Azure. With offices worldwide and nearly 60,000 customers (including 75% of the Fortune 100 and AI-native startups) relying on MongoDB for their most important applications, they are powering the next era of software.

MongoDB's Leadership Commitment guides decisions, interactions, and success. To drive personal growth and business impact, they are committed to developing a supportive and enriching culture for everyone, offering employee affinity groups, fertility assistance, and a generous parental leave policy to support employee wellbeing and professional and personal journeys. MongoDB is committed to providing necessary accommodations for individuals with disabilities within their application and interview process and provides equal employment opportunities to all employees and applicants.

Responsibilities

  • Design and build the infrastructure for a global cloud service that comprises hundreds of thousands of MongoDB clusters, processes a billion metrics per day, and replicates tens of billions of database writes to our backup service
  • Design, implement, and troubleshoot the automation and monitoring of services that seamlessly span the globe, including several cloud providers
  • Become an expert in infrastructure performance, helping us optimize from the application level all the way through the firmware
  • Build for resilience. Our goal is that nobody’s pager goes off, ever. Are we there yet? No. Are we really close? Very. While we work on that, participate in a weekly on-call rotation
  • Improve our infrastructure capabilities, optimizing for cost, simplicity, and maintainability

Requirements

  • 3+ years of experience running a mission-critical service at scale in a Linux environment
  • Firm grasp of at least one modern programming language, beyond basic scripting
  • Familiarity with web and network protocols and standards (HTTP, TLS, DNS, etc.)
  • Bachelor’s degree in Computer Science or equivalent experience
  • Experience writing automation tools & eagerness to "automate all the things"

Nice-to-haves

  • Experience building large applications from scratch, complete with CI/CD infrastructure
  • Experience in networking, security, hardware or OS performance tuning
  • Experience with at least one of the major cloud providers (Amazon Web Services, Google Cloud, Microsoft Azure)
  • Experience managing Kubernetes clusters or some other container orchestration infrastructure
  • Experience with observability of large-scale distributed systems

Benefits

  • fertility assistance
  • generous parental leave policy
  • employee affinity groups
