Site Reliability Engineer

AnaVation LLC

Austin (TX)

Remote

USD 140,000 - 175,000

Full time

16 days ago

Job summary

A tech company specializing in public safety solutions is seeking a Site Reliability Engineer to ensure high reliability and performance of their production infrastructure. This fully remote role requires at least 6 years of DevOps experience, expertise in AWS and Kubernetes, and a strong engineering background. Competitive salary and generous benefits offered.

Benefits

3 weeks of paid vacation
Generous medical, dental, and vision plans
Casual work environment
Kitchen stocked with snacks and drinks

Qualifications

  • 6+ years of experience in DevOps or site reliability engineering.
  • Ability to work independently and make decisions with minimal supervision.

Responsibilities

  • Ensure high availability and performance of production infrastructure.
  • Collaborate with engineering teams on planning, design, and deployment.

Skills

DevOps experience
Distributed systems knowledge
Containerization (Docker, Kubernetes)
Infrastructure as code (Terraform, Ansible)
Monitoring and alerting (Grafana, Prometheus)
Cloud infrastructure proficiency
Programming languages (Python, Go)
Database management (PostgreSQL, Elasticsearch)

Education

Bachelor's degree in relevant field (Computer Science, Software Engineering, Mathematics)

Tools

AWS Cloud
Azure
GitHub Actions
Job description

Overview

At LeoTech, we are passionate about building software that solves real-world problems in the Public Safety sector. Our software has been used to help fight continuing criminal enterprises and drug trafficking organizations, identify financial fraud, disrupt sex and human trafficking rings, and address mental health matters, to name a few.

Role

We are looking for a seasoned Site Reliability Engineer (SRE) to join our distributed team. This is a fully remote, work-from-home opportunity.

As a key member of our DevOps team, you will be responsible for designing, implementing, and maintaining mission-critical monitoring, alerting, and incident response systems. Your work will ensure high availability, reliability, and performance of our infrastructure, supporting scalable services in production environments.

You will partner closely with engineering teams throughout the full development lifecycle, contributing to planning, design, deployment, and reliability goals.

Responsibilities
  • Design, implement, and maintain mission-critical monitoring, alerting, and incident response systems to ensure high availability and performance of production infrastructure.
  • Collaborate with engineering teams across the full development lifecycle for planning, design, deployment, and reliability goals.
Qualifications
  • Strong engineering background in Computer Science, Software Engineering or Mathematics.
  • At least 6 years of DevOps or site reliability experience.
  • Deep understanding of distributed systems, containerization (Docker, Kubernetes), and modern infrastructure design patterns.
  • Experience authoring infrastructure as code using Terraform and/or Ansible.
  • Experience with monitoring, logging, and alerting using Grafana, Prometheus, ELK stack, or equivalents.
  • Deep understanding of public cloud infrastructure.
  • Proficiency with programming languages such as Python, Go, or similar.
  • Experience with PostgreSQL, Elasticsearch and KV stores.
  • Ability to work in a fast-paced environment with dynamic objectives and quick iterations.
  • Demonstrated ability to learn continuously, work independently, and make decisions with minimal supervision.
Technologies We Use
  • Primarily hosted on AWS Cloud with some infrastructure in Azure.
  • Extensive monitoring and alerting footprint in Grafana Cloud.
  • Backend services are dockerized, deployed in Kubernetes, and managed by ArgoCD.
  • Backend languages include Elixir, NodeJS, and Python.
  • TypeScript and React are central to frontend development.
  • Terraform, CloudFormation, and Ansible for infrastructure deployment and automation.
  • Industry-standard build tooling and CI/CD using GitHub Actions.
  • A mix of open-source and proprietary technologies tailored to the problems at hand.
What You Can Expect
  • Enjoy great team camaraderie whether at our Irvine office or working remotely.
  • Thrive on the fast pace and challenging problems to solve.
  • Modern technologies and tools.
  • Continuous learning environment.
  • Opportunity to communicate and work with people of all technical levels in a team environment.
  • Grow as you are given feedback and incorporate it into your work.
  • Be part of a self-managing team that enjoys support and direction when required.
  • 3 weeks of paid vacation, right out of the gate!
  • Competitive Salary.
  • Generous medical, dental, and vision plans.
  • Sick leave and paid holidays are offered.
  • Sit/stand workstations.
  • Kitchen stocked with snacks and drinks on site.
  • Work with talented and collaborative co-workers.
  • Casual environment.

$140,000 - $175,000 a year

Please note the national salary range listed in the job posting reflects the new hire salary range across levels and U.S. locations that would be applicable to the position. The final salary will be commensurate with the candidate's accepted hiring level and work location. Also, this range represents base salary only and does not include equity or benefits, if applicable.

LeoTech is an equal opportunity employer and does not discriminate on the basis of any legally protected status.
