Senior Site Reliability Engineer

Methodfi

Remote

GBP 70,000 - 90,000

Full time

30+ days ago

Job summary

A leading technology firm is seeking a Senior Site Reliability Engineer to take ownership of reliability across their platform. This remote role focuses on enhancing observability, leading incident management, and collaborating with engineering teams. The ideal candidate will have significant experience in Site Reliability, DevOps, and AWS. Strong communication skills and the ability to debug across systems are crucial. Join a world-class team to shape the SRE culture in a fast-growing SaaS environment.

Benefits

Competitive salary
Full benefits package
Remote work opportunity

Qualifications

5+ years of experience in Site Reliability, DevOps, or Infrastructure Engineering roles.
Deep understanding of distributed systems and debugging.
Hands-on experience with AWS and container orchestration.
Ability to trace and debug Rust and TypeScript code.
Strong collaboration and communication skills.

Responsibilities

Own reliability end-to-end and enhance observability.
Lead incident management and strengthen infrastructure.
Collaborate across teams on reliability considerations.
Drive continuous improvement and champion SRE practices.

Skills

Site Reliability Engineering
DevOps
Infrastructure Engineering
AWS
Kubernetes
Rust
TypeScript
Cassandra
ClickHouse

Tools

Pulumi
GitLab CI
Docker
Grafana
Prometheus
Loki
Jaeger

About Cryptio

We’re Cryptio. We build infrastructure to bring financial integrity to the crypto economy. Our enterprise-grade back-office and data platform powers mission-critical accounting, reporting, and operational workflows for institutions, corporates, and crypto-native organisations.

We’re trusted by leaders like Circle, Societe Generale, Uniswap, Gemini, and the Government of El Salvador. We’ve raised $26m from top investors including Point Nine, 1kx, Tim Draper, and Ledger Cathay.

The opportunity

We’re hiring a Senior Site Reliability Engineer (SRE) to take full ownership of Cryptio’s reliability, observability, and incident response. You’ll work across our stack—from AWS infrastructure to Rust microservices, TypeScript indexers, and data-heavy backends—to ensure our platform remains fast, stable, and resilient as we scale.

This is a role for a hands-on builder who can see across systems, trace complex issues, and design reliability into everything we ship. You’ll collaborate closely with engineering and product teams to define SLAs/SLOs, strengthen monitoring and alerting, improve incident management, and build the processes and tooling that make reliability a shared culture at Cryptio.

Key technologies

AWS (EKS, S3, GuardDuty, Route 53, IAM, and more)
Rust, TypeScript (NestJS, React, OpenAPI)
PostgreSQL, Cassandra, ClickHouse
Pulumi, GitLab CI, Docker, Kubernetes
Grafana, Prometheus, Loki, Jaeger

What you’ll do

Own reliability end-to-end: design, measure, and improve service availability, latency, and performance across Cryptio’s platform
Enhance observability: expand and refine metrics, logs, and traces to provide deep insight into our Rust and TypeScript services
Lead incident management: define playbooks, improve response workflows, and foster a blameless postmortem culture
Strengthen infrastructure: optimise AWS configurations, CI/CD pipelines, autoscaling, and networking for reliability and cost efficiency
Collaborate across teams: work with product and engineering leads to ensure reliability is considered at every design stage
Drive continuous improvement: identify systemic weaknesses, automate recovery where possible, and reduce MTTR across the stack
Champion SRE best practices: guide teams on capacity planning, runbooks, and resilience testing

What we’re looking for

5+ years of experience in Site Reliability, DevOps, or Infrastructure Engineering roles
Deep understanding of distributed systems and debugging at the network, application, and database layers
Hands-on experience with AWS, container orchestration (Kubernetes, ECS), and Infrastructure-as-Code tools (Pulumi or similar)
Comfortable tracing through Rust and TypeScript code to diagnose complex performance or reliability issues
Experience with (or willingness to learn) Cassandra and ClickHouse in production
Strong collaborator with excellent communication skills
Systematic, analytical, and passionate about building reliable systems at scale
Interest in (or curiosity about) crypto, finance, or large-scale data systems

Why you’ll love this role

True ownership of reliability and uptime across a critical, fast-growing SaaS platform
Opportunity to shape SRE culture and processes from the ground up
Work with a world-class engineering team at the intersection of crypto, accounting, and data infrastructure
Freedom to experiment and improve observability, alerting, and recovery pipelines end-to-end
100% remote (UK only), with opportunities to visit our Paris or London hubs
Competitive salary and full benefits package

Interview process

Talent Screen (15–30 min): Initial call to discuss your background, Cryptio, and the role
Technical Interview (60 min): Deep dive into reliability, AWS, and debugging scenarios
Team Interview (45 min): Meet an engineer and product manager to explore cross-team collaboration
CTO Interview (45 min): Discussion about technical strategy, ownership, and your vision for reliability at Cryptio

If this sounds like you, we would love to hear from you 🙌

At Cryptio, we move fast and take ownership of outcomes. We learn from failures, celebrate wins, and let humility, curiosity, and a passion for crypto guide how we work. If you value collaboration and want to build with purpose, you’ll feel right at home here.
