Member of Technical Staff & Safety Lead

Methodfi

Greater London

On-site

GBP 75,000 - GBP 95,000

Full time

Today

Job summary

A leading AI research firm in Greater London is seeking a talented professional to own the red-teaming and adversarial evaluation pipeline for its models. The role involves ensuring model safety and compliance with deployment policies, both of which are crucial for successful releases. Ideal candidates will possess a graduate degree in Computer Science or a related field, a deep understanding of LLM safety, and solid software engineering skills. The organization offers competitive compensation, extensive health benefits, and an impactful work environment.

Benefits

Top-tier compensation

Comprehensive health insurance

Paid parental leave

Daily lunches and dinners

Qualifications

Deep technical understanding of LLM safety and adversarial attacks.
Experience with automated evaluation pipelines and large-scale ML systems.
Ability to make high-stakes decisions regarding model safety.

Responsibilities

Own the red-teaming and adversarial evaluation pipeline for models.
Work with Alignment team to implement safety findings.
Validate releases meet risk thresholds before shipping.

Skills

LLM safety understanding

Software engineering

Automated evaluation pipelines

Reinforcement Learning

Education

Graduate degree in Computer Science or related discipline

Our Mission

Reflection’s mission is to build open superintelligence and make it accessible to all.

We’re developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic, and beyond.

About the Role

Own the red-teaming and adversarial evaluation pipeline for Reflection’s models, continuously probing for failure modes across security, misuse, and alignment gaps.
Work hand-in-hand with the Alignment team to translate safety findings into concrete guardrails, ensuring models behave reliably under stress and adhere to deployment policies.
Validate that every release meets the lab’s risk thresholds before it ships, serving as a critical gatekeeper for our open-weight releases.
Develop scalable, automated safety benchmarks that evolve alongside our model capabilities, moving beyond static datasets to dynamic adversarial testing.
Research and implement state-of-the-art jailbreaking techniques and defenses to stay ahead of potential vulnerabilities in the wild.

About You

Graduate degree (MS or PhD) in Computer Science, Machine Learning, or related discipline, or equivalent practical experience in AI Safety.
Deep technical understanding of LLM safety, including adversarial attacks, red-teaming methodologies, and interpretability.
Strong software engineering capabilities with experience building automated evaluation pipelines or large-scale ML systems.
Experience with Reinforcement Learning (RLHF/RLAIF) and how it impacts model safety and alignment is a strong plus.
Thrive in a fast-paced, high-agency startup environment with a bias toward action.
Willing to make high-stakes decisions regarding model release and safety thresholds.
Passionate about advancing the frontier of intelligence.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small talent-dense team. You will help define our future as a company, and help define the frontier of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.
Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.
Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.
Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.
Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.

