
A tech company focused on generative media is seeking a Machine Learning Engineer to design and scale infrastructure for generative video models. Responsibilities include building data workflows, fine-tuning generative models, and integrating computer vision techniques. Candidates should have strong ML engineering skills, particularly in generative media and computer vision. The role offers a competitive salary, equity, and flexible working conditions.
We’re looking for a Machine Learning Engineer to design and scale the infrastructure powering our generative video and visual models, particularly for our latest product, verv.fm. This role combines deep ML engineering expertise with hands-on experience in modern generative systems like Stable Diffusion, Comfy, and Flux.
You’ll build data ingestion pipelines, develop visual preprocessing workflows, and fine-tune generative models with production-level reliability. You’ll also help drive prompt-aware systems and intelligent feedback loops that elevate both the quality and control of our media generation stack.
Design and implement data workflows for ingestion, cleaning, validation, filtering, and quality scoring
Fine-tune and deploy generative models such as Stable Diffusion and Flux LoRAs, and build large-scale workflows using tools such as Comfy
Experience training and working with video models such as WAN and VACE
Integrate computer vision techniques (segmentation, mask ops, object tracking) into generation pipelines
Build high-throughput pipelines for frame extraction, captioning, and visual data processing at scale
Develop and experiment with context- and prompt-aware model orchestration strategies
Contribute to observability and monitoring across the ML data lifecycle
Collaborate with infra teams to scale across GPU-backed infrastructure and serverless environments (e.g., Fal.ai)
Work across a fast-paced, evolving product environment with creative and technical inputs
(Bonus) Explore self-supervising or agentic workflows for automated pipeline feedback and improvement
Strong experience in ML engineering with a focus on computer vision, generative media, or multimodal systems
Hands-on experience fine-tuning or deploying generative models (e.g., SD, Flux, Comfy)
Proficiency in Python and asynchronous API development
Familiarity with image/video-specific challenges: frame alignment, codec handling, perceptual quality scoring
Experience with scalable data systems using tools like Airflow, Spark, or Ray
Solid understanding of GPU infrastructure and model deployment best practices
Knowledge of prompt engineering and context-driven model behavior
Comfortable working with ambiguity and bridging infrastructure and modeling challenges
Bonus: Experience with foundation model training pipelines or agentic systems
Availability for at least 4 hours of overlap with US Eastern Time
Competitive salary and equity
Private health coverage
Unlimited paid vacation
Fully-distributed, async-first culture
Hardware setup of your choice
Stipends for phone, internet, and meals
Moonvalley is an equal opportunity employer and we are committed to providing accommodations. If you require accommodation, we will work with you to meet your needs.
Please be assured that we'll treat any information you share with us with the utmost care, use it only for recruitment purposes, and never sell it to other companies for marketing purposes.