Machine Learning Researcher (Brahma)

DNEG Group

Remote

GBP 60,000 - 85,000

Full time

30+ days ago

Job summary

A pioneering technology firm in London is seeking a Machine Learning Researcher to advance generative video systems. This role focuses on improving expression control and lip synchronization through innovative models. Candidates should have a strong background in machine learning, particularly in video synthesis, and experience in Python and PyTorch. The position offers collaboration with a world-class team in a remote-first environment.

Qualifications

  • Strong background in machine learning and deep learning for generative models.
  • Experience with video synthesis tasks like lip sync and avatar animation.
  • Familiarity with frameworks for image and video generation.

Responsibilities

  • Research and develop advanced generative video models.
  • Fine-tune video diffusion models for realism and alignment.
  • Design training pipelines for talking-head synthesis.

Skills

Machine learning
Deep learning
Python
PyTorch
Video synthesis
Problem-solving

Education

PhD in Computer Vision, Machine Learning, or a related field (preferred)

Tools

MMPose
MediaPipe
DLIB

Job description

Overview

Brahma is a pioneering company developing AI-native products built to help enterprises and creators innovate at scale. Brahma enables teams to break creative bottlenecks, accelerate storytelling, and deliver standout content with speed and efficiency. Part of the DNEG Group, Brahma brings together Hollywood’s leading creative technologists, innovators in AI and Generative AI, and thought leaders in the ethical creation of AI content.

Role

We are seeking a Machine Learning Researcher to join our team and help advance the state of the art in human-centric generative video models. Your work will focus on improving expression control, lip synchronisation, and overall realism in models such as WAN and Hunyuan. You’ll collaborate with a world-class team of researchers and engineers to build systems that can generate lifelike talking-head videos from text, audio, or motion signals—pushing the boundaries of neural rendering and avatar animation.

Responsibilities

  • Research and develop cutting-edge generative video models, with a focus on controllable facial expression, head motion, and audio-driven lip synchronisation.
  • Fine-tune and extend video diffusion models such as WAN and Hunyuan for better visual realism and audio-visual alignment (see the sketch after this list).
  • Design robust training pipelines and large-scale video/audio datasets tailored for talking-head synthesis.
  • Explore techniques for controllable expression editing, multi-view consistency, and high-fidelity lip sync from speech or text prompts.
  • Work closely with product and creative teams to ensure models meet quality and production constraints.
  • Stay current with the latest research in video generation, speech-driven animation, and 3D-aware neural rendering.
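
For illustration only, here is a minimal PyTorch sketch of the kind of work the diffusion bullet above describes: a single DDPM-style training step for a toy audio-conditioned denoiser. Every name, dimension, and schedule below is a hypothetical simplification, not the internals of WAN, Hunyuan, or any production system.

```python
# Illustrative toy example: one DDPM-style training step with audio
# conditioning. All shapes, names, and schedules are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAudioConditionedDenoiser(nn.Module):
    """Predicts the noise added to a video latent, given a normalised
    timestep and a pooled audio feature (hypothetical architecture)."""
    def __init__(self, latent_dim=64, audio_dim=32, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + audio_dim + 1, hidden),
            nn.SiLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, noisy_latent, t, audio_feat):
        t_emb = t.view(-1, 1).float()  # timestep as a scalar channel
        x = torch.cat([noisy_latent, audio_feat, t_emb], dim=-1)
        return self.net(x)

def training_step(model, video_latent, audio_feat, optimizer, T=1000):
    """Add noise at a random timestep, then train the model to predict it
    (the standard epsilon-prediction objective)."""
    b = video_latent.size(0)
    t = torch.randint(0, T, (b,))
    # Simple linear beta schedule; real systems use carefully tuned ones.
    betas = torch.linspace(1e-4, 0.02, T)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t].view(-1, 1)
    noise = torch.randn_like(video_latent)
    noisy = alpha_bar.sqrt() * video_latent + (1 - alpha_bar).sqrt() * noise
    pred = model(noisy, t / T, audio_feat)
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

model = ToyAudioConditionedDenoiser()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Fake batch: 8 clips, each a 64-d video latent with a 32-d audio embedding.
loss = training_step(model, torch.randn(8, 64), torch.randn(8, 32), opt)
print(f"loss: {loss:.4f}")
```

Real talking-head systems operate on spatio-temporal latents with cross-attention to frame-aligned audio features rather than pooled vectors, but the noise-prediction training loop follows this same shape.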

Qualifications

Must Haves

  • Strong background in machine learning and deep learning, especially in generative models for video, vision, or speech.
  • Hands-on experience with video synthesis tasks such as face reenactment, lip sync, audio-to-video generation, or avatar animation.
  • Proficient in Python and PyTorch; familiar with libraries like MMPose, MediaPipe, DLIB, or image/video generation frameworks (see the landmark sketch after this section).
  • Experience training large models and working with high-resolution audio/video datasets.
  • Deep understanding of architectures such as transformers, diffusion models, GANs, and motion representation techniques.
  • Proven ability to work independently and drive research from idea to implementation.
  • Strong problem-solving skills and the ability to work autonomously in a remote-first environment.
Nice to Have

  • PhD in Computer Vision, Machine Learning, or a related field, with publications in top-tier conferences (CVPR, ICCV, ICLR, NeurIPS, etc.).
  • Familiarity with or contributions to open-source projects in lip sync, video generation, or 3D face modelling.
  • Experience with real-time inference, model optimisation, or deployment for production applications.
  • Knowledge of adjacent areas such as emotion modelling, multimodal learning, or audio-driven animation.
  • Experience working with or adapting models like WAN, Hunyuan, or similar.
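
For illustration only, a minimal sketch of the facial-landmark extraction that libraries like MediaPipe enable, a common preprocessing step when building lip-sync supervision. The frame path is hypothetical, and the mouth-openness heuristic is a deliberately crude stand-in for the richer signals real pipelines use.

```python
# Illustrative only: per-frame face landmarks via MediaPipe FaceMesh.
# Requires: pip install mediapipe opencv-python
import cv2
import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=True,   # treat each frame independently
    max_num_faces=1,
    refine_landmarks=True,    # adds refined lip/iris landmarks
)

frame = cv2.imread("frame_0001.png")  # hypothetical path to a video frame
assert frame is not None, "frame not found"
results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

if results.multi_face_landmarks:
    landmarks = results.multi_face_landmarks[0].landmark
    # Indices 13 and 14 are the upper/lower inner-lip points in the
    # 468-point FaceMesh topology; their gap is a crude mouth-openness cue.
    mouth_gap = abs(landmarks[13].y - landmarks[14].y)
    print(f"normalised mouth openness: {mouth_gap:.4f}")
```

In practice, per-frame landmark signals like this are aligned with audio features to supervise or evaluate lip synchronisation.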