SRE Engineer

Applied Intuition Inc.

Cambridge

On-site

GBP 65,000 - 85,000

Full time

Today

Job summary

A leading technology provider in Cambridge is seeking a Site Reliability Engineer to ensure the reliability and performance of their cloud platforms. The role involves automating operational tasks and collaborating closely with engineering teams. Ideal candidates will have 5+ years in an SRE or DevOps role, strong software development skills, and experience with Kubernetes and observability tools. This position offers competitive benefits and opportunities for professional growth.

Benefits

Pension scheme with company match up to 9%
Mental health and wellbeing support
Professional development support
28 days holiday + public holidays
Flexible working arrangements
Corporate gym membership rates
Free lunch, snacks, and drinks
Electric car charging stations

Qualifications

  • 5+ years in SRE, DevOps or related role in a large-scale environment.
  • Software development experience, ideally as a .NET developer.
  • Strong understanding of SDLC, microservice, and HA architecture.
  • Experience with Kubernetes in production environments.

Responsibilities

  • Ensure reliability and performance of Altium Cloud Platforms.
  • Automate operational tasks and improve observability.
  • Contribute to incident response and management.
  • Partner closely with engineering teams on reliability best practices.

Skills

5+ years in SRE, DevOps or related role
Software development experience (.NET)
Strong understanding of SDLC
Experience with Kubernetes clusters
Knowledge of CI-CD tooling
Basic knowledge of networking fundamentals
Experience with relational databases

Tools

NewRelic
Grafana
Terraform
Ansible
Jenkins
Gitlab
GitHub

Job description

Altium is transforming the way electronics are designed and built. From startups to the world’s technology giants, our digital platforms give more power to PCB designers, supply chain, and manufacturing, letting them collaborate as never before.

  • Constant innovation has created a transformative technology, unique in its space
  • More than 30,000 companies and 100,000 electronics engineers worldwide use Altium
  • We are growing, debt‑free, and financially strong, with the resources to become #1 in the EDA industry

🏢 Form of work: We work in the office 5 days a week, so you must be located close to our office in Cambridge 🏢

🏠 Role Overview

As a Site Reliability Engineer, you will ensure the reliability, availability, and performance of large-scale software systems through a blend of software engineering and systems administration. Key responsibilities include automating operational tasks, improving observability, and contributing to incident management, while collaborating with development and technology teams to build more reliable and scalable applications.

Join Altium as a Site Reliability Engineer to ensure the reliability and performance of the Altium Cloud Platforms.

Key Responsibilities
  • Understand how the Altium Cloud Platform works
  • Pioneer improvements in observability, including logging, monitoring, and application performance management (APM), ensuring system reliability and proactive issue detection.
  • Develop and implement reliability frameworks and patterns that standardize and elevate the resilience of our SaaS products across multiple regions and environments.
  • Cultivate a shared responsibility model where the SRE team collaborates with and educates engineering teams on reliability best practices.
  • Contribute to incident response and management, ensuring rapid resolution, clear stakeholder communication, and focused analysis for continuous improvement.
  • Participate in system design consulting, platform management, infrastructure upgrades and capacity planning.
  • Partner closely with engineering and development teams to enhance product stability, observability, and manageability through best practices in reliability engineering.
  • Partner closely with DevOps/Operations, drive automation initiatives, promote Infrastructure as Code (IaC), and streamline deployment processes to improve operational efficiency and scalability.
  • Champion Service‑Oriented Organization (SOO) principles to ensure accountability and clarity in service ownership.

Who you are and what you’ll need for this position
  • 5+ years in SRE, DevOps or related role in a large‑scale environment
  • Software development experience (ideally as a .NET developer)
  • Strong understanding of SDLC, microservice and HA architecture
  • Observability – NewRelic, ELK, Grafana, PagerDuty, OTEL or similar
  • Experience with Kubernetes clusters in a production setting, AWS, IaC
  • Experience with operational tasks
  • Knowledge of CI/CD tooling: Jenkins, Gitlab, GitHub, ArgoCD or similar
  • Knowledge of IaC tooling: Terraform, Ansible
  • Basic knowledge of networking fundamentals
  • Experience with relational databases (MySQL, Postgres) a plus

Benefits

🌅 Pension scheme with company match up to 9%

🧘 nilo.health, mental health and wellbeing support

📚 Professional development support and resources

🏖 28 days holiday + public holidays and special leave

🏡 Flexible working arrangements available based on role and location

💪 Corporate membership rates with national gyms

🥪 Free lunch, snacks, and drinks in the office

🚗 Electric car charging stations, free office parking, bicycle, and scooter storage

Equal Employment Opportunity

We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender, gender identity or expression, or veteran status. We are proud to be an equal opportunity workplace.
