Senior Site Reliability Engineer (AWS, AI/ML, & APM)

Remote · USA · Full-time

The Company

Serving the People Who Serve the People

reputed company is driven by the excitement of building, implementing, and maintaining technology that is transforming the GovTech industry by bringing governments and their constituents together. We are on a mission to support our customers in meeting the needs of their communities and implementing our technology in ways that are equitable and inclusive. reputed company has consistently appeared on the GovTech 100 list over the past 5 years and has been recognized as one of the best companies to work for on BuiltIn.

Over the last 25 years, we have served 5,500 federal, state, and local government agencies, and more than 300 million citizen subscribers power an unmatched Subscriber Network that uses our digital solutions to make the world a better place. With comprehensive cloud-based solutions for communications, government website design, meeting and agenda management software, records management, and digital services, reputed company empowers stronger relationships between government and residents across the U.S., U.K., Australia, New Zealand, and Canada. By simplifying interactions with residents while disseminating critical information, reputed company brings governments closer to the people they serve, driving meaningful change for communities around the globe.

Want to know more? See more of what reputed company does here.

reputed company is seeking a highly skilled Senior Site Reliability Engineer (SRE) to join our SRE team. As a Senior SRE, you will play a pivotal role in ensuring the reliability, scalability, and performance of our services. You will lead efforts in building and maintaining a robust infrastructure, automating processes, and guiding the team to implement best practices in site reliability.

What your impact will look like

  • On-call Production Support: Provide production support in shifts according to the team's on-call roster.
  • Work on tickets raised by customers and by internal engineering/implementation teams while not on call for production support. For example, a client may request a data correction on the database server that cannot be made through the web interface.
  • Work on SRE backlog items.
  • Monitor and Maintain Systems: Continuously monitor the health and performance of our services, systems, and infrastructure. Respond to alerts and incidents promptly to ensure high availability.
  • Automate Processes: Develop and maintain automation scripts and tools to streamline operations and reduce manual effort.
  • Incident Management: Assist in troubleshooting and resolving incidents, performing root cause analysis, and implementing long-term fixes to prevent recurrence.
  • System Improvements: Participate in designing and implementing system improvements to enhance reliability, scalability, and performance.
  • Collaboration: Work closely with software engineers to understand application requirements, provide feedback on design and architecture, and support deployment and release processes.
  • Documentation: Create and maintain documentation for processes, procedures, and troubleshooting guides to ensure knowledge sharing across the team.
  • Capacity Planning: Assist in capacity planning activities to anticipate future needs and ensure that our infrastructure can handle growth.
  • Security: Implement and adhere to security best practices to protect our systems and data.

Experience

  • 5+ years in site reliability engineering, system administration, or a similar role, with a proven track record of managing large-scale, high-availability systems. Experience supporting AI/ML infrastructure, including model deployment, inference optimization, and integration with services like AWS Bedrock, is highly desirable.

Technical Skills

  • Expertise in Linux/Unix systems and cloud platforms (AWS, Azure, or Google Cloud).
  • Strong proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++).
  • Familiarity with AI/ML operations, including model lifecycle management, vector databases, and inference performance tuning.

Tools and Technologies

  • Experience with the ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging, monitoring, and observability.
  • Experience with configuration management tools (Ansible, Chef, Puppet).
  • Exposure to AI/ML toolchains, including AWS Bedrock, SageMaker, and LLMOps frameworks.
  • Certifications: Relevant certifications such as AWS Certified DevOps Engineer, AWS Certified Machine Learning – Specialty, Google Cloud Professional DevOps Engineer, or similar are a plus.
  • + bonus and benefits

    Additional Information

    Don’t have all the skills/experience mentioned above? At reputed company, we are trying to build diverse, inclusive teams. We do not have degree requirements for most of our roles. If you don’t meet every requirement above but are excited to learn more, we encourage you to apply. We might just be able to find another role that could be a perfect fit!

    Security and Privacy Requirements

    - Responsible for reputed company information security by appropriately preserving the Confidentiality, Integrity, and Availability (CIA) of reputed company information assets in accordance with the company's information security program.

    - Responsible for ensuring the data privacy of our employees and customers and their data, and for completing all required privacy training in a timely manner, in accordance with company policies.

    The Team

    - We are a remote-first company with a globally distributed workforce across the United States, Canada, United Kingdom, India, Armenia, Australia, and New Zealand.

    The Culture

    - At reputed company, we are building a transparent, inclusive, and safe space for everyone who wants to be a part of our journey.

    - A few culture highlights include:

    - Employee Resource Groups to encourage diverse voices

    - Coffee with Mark sessions: our employees get to interact with our CEO on important and sometimes difficult issues ranging from mental health to work-life balance and current affairs.

    - reputed company Teams communities focused on wellness, art, furbabies, family, parenting, and more.

    - We bring in special guests from time to time to discuss issues that impact our employee population.

    The Impact

    - We are proud to serve dynamic organizations around the globe that use our digital solutions to make the world a better place, quite literally. We have so many powerful success stories that illustrate how our solutions are impacting the world. See more of our impact here.

    The Benefits

    At reputed company, we offer a competitive benefits package that allows employees to tailor benefits to their needs. Benefits listed below are for employees based in the U.S.

    - Flexible Time Off

    - Medical (includes an option that is paid 100% by reputed company!), Dental & Vision Insurance

    - 401(k) plan with matching contribution

    - Paid Parental Leave

    - Employer-paid Short and Long Term Disability Insurance, Group Term Life Insurance and AD&D Insurance

    - Group legal coverage

    - And more!

    reputed company is committed to providing equal employment opportunities. All qualified applicants and employees will be considered for employment and advancement without regard to race, color, religion, creed, national origin, ancestry, sex, gender, gender identity, gender expression, physical or mental disability, age, genetic information, sexual or affectional orientation, marital status, status with regard to public assistance, familial status, military or veteran status, or any other status protected by applicable law.
