Senior DevOps Engineer

Digital Science

Remote

GBP 65,000 - 85,000

Full time

26 days ago

Job summary

A leading technology company in the UK is seeking a Senior DevOps Engineer to enhance the Overleaf infrastructure. This role involves maintaining system reliability on Google Cloud, optimizing CI/CD pipelines, and ensuring cost efficiency. Ideal candidates will have significant experience with GCP, Kubernetes, and Terraform, alongside a solid understanding of Linux systems. Join us in advancing the research ecosystem and delivering innovative cloud solutions.

Qualifications

  • Significant working knowledge of GCP or AWS.
  • Strong hands-on expertise in Kubernetes and Docker.
  • Solid Linux/Unix systems knowledge and scripting skills in Bash/Python.
  • Experience with CI/CD tools and monitoring platforms.
  • Proven track record of handling production incidents.

Responsibilities

  • Own infrastructure on Google Cloud Platform.
  • Build and optimize CI/CD pipelines.
  • Implement and manage monitoring and alerting systems.
  • Ensure performance and reliability of databases.
  • Oversee cost management on GCP.

Skills

GCP or AWS knowledge
Kubernetes expertise
Terraform experience
Linux/Unix systems knowledge
CI/CD tools familiarity
Database expertise
Agile methodology experience
Security-focused mindset

Tools

Jenkins
Grafana
PostgreSQL
MongoDB
Redis

Job description

Senior DevOps Engineer

Department: Technology

Employment Type: Full Time

Location: UK (Home based)

Description

We are Digital Science and we are advancing the research ecosystem. We are a pioneering technology company, and our vision is of a future where a trusted and collaborative research ecosystem drives progress for all. Want to join us?

Your New Role: Senior DevOps Engineer, focusing on Overleaf Infrastructure

We are recruiting for a Senior DevOps Engineer within the wider Digital Science Product organization, where you will directly support one of our most critical and high‑profile products: Overleaf. You will be responsible for maintaining the reliability, scalability, and performance of the systems that power Overleaf’s most critical platforms, operating primarily on Google Cloud (GCP). You will use your knowledge of distributed systems and architecture to ensure smooth, global operations and improve overall system health. You'll work closely with cross‑functional teams to identify and mitigate risks, supporting platforms that require world‑class reliability and automation.

What you’ll be doing

This role requires a blend of hands‑on infrastructure ownership, automation, and a strong focus on system reliability and cost efficiency:

  • GCP Infrastructure Ownership: Own our infrastructure on Google Cloud Platform and the Terraform codebase, managing critical components including VPCs, Compute Engine, Kubernetes clusters, Cloud SQL/Redis, load balancers, Cloud Armor, logging/monitoring pipelines, and IAM.
  • Automation & CI/CD: Build and optimize CI/CD pipelines using Jenkins or similar tools, and automate routine operations with shell scripts where appropriate.
  • Reliability & Monitoring: Implement and manage monitoring, alerting, and incident response systems using Google Cloud Monitoring and similar tools. Participate in a rotating on‑call schedule for critical infrastructure issues outside normal business hours.
  • Database Management: Ensure the performance, reliability, and uptime of PostgreSQL and MongoDB databases with proactive monitoring and tuning.
  • Cost Management: Oversee resource usage on GCP to ensure efficient cost management.
  • Collaboration & Knowledge Sharing: Share knowledge with engineers, build consensus for change, and write excellent documentation.

What you’ll bring to the role

Essential Experience

  • Cloud & Containers: Significant working knowledge of GCP or AWS, strong hands‑on expertise in Kubernetes and Docker.
  • Infrastructure as Code (IaC): Strong hands‑on expertise in Terraform.
  • Operating Systems & Scripting: Solid Linux/Unix systems knowledge and scripting skills in Bash/Python.
  • DevOps Tooling: Experience with CI/CD tools such as Jenkins and monitoring platforms such as Grafana or Google Cloud Monitoring.
  • Database Expertise: Experience with MongoDB, PostgreSQL, and Redis.
  • SRE Practice: Ability to implement best‑practice alerting, monitoring, and observability for high‑load applications.
  • Incident Management: Proven track record of handling production incidents and post‑incident analysis.
  • Agile: Significant experience working in an Agile methodology and implementing best practices in version control and code review.
  • Security‑first mindset: A focus on confidentiality, integrity, and availability.
  • Continuous learning: Commitment to staying up‑to‑date with emerging technologies and implementing innovative cloud solutions.
  • Nice to have: Understand error budgets, SLIs, and SLOs.
  • Understand how to manage cloud computing costs effectively.
  • Experience coding in a language such as JavaScript.

Don't worry if you don't meet every qualification—let us be the judge! Many qualified candidates from under‑represented groups hesitate to apply unless they meet every requirement. We encourage you to submit your application.

Living our Values

We invest in, nurture, and support innovative businesses and technologies that make all parts of the research process more open, efficient, and effective. The values we live by are: brave in the pursuit of better, collaborative and inclusive, always open‑minded, and from and for the community.

Equal Opportunity Statement

As an equal opportunity employer, we are committed to building and nurturing a workplace where every individual feels valued and belongs. All applicants will be considered for employment without attention to race, colour, religion, age, sex, sexual orientation, gender identity, national origin, veteran status, or disability status.
