Job description

Senior Research Engineer (Data)
$175,000 - $250,000 + Equity + Benefits + PTO
Palo Alto, CA - On-site

Are you passionate about scaling data systems that fuel state-of-the-art AI? Want to play a mission-critical role in training cutting-edge generative models by designing the data infrastructure they rely on?

This is a rare opportunity to join a top-tier AI startup as they continue to push the boundaries of what's possible in multimodal generative AI. You'll be joining a high-performing, research-driven team with significant funding and strong momentum, in a high-impact position at the intersection of research and infrastructure.

I'm working with a well-funded AI startup in Palo Alto that's scaling its Research Engineering division. They're looking for a Senior Research Engineer focused on data systems: someone who understands how critical clean, diverse, and scalable data pipelines are to generative model performance. If you're excited about building high-quality datasets and architecting systems that impact billions of tokens, this is your chance to make a huge impact.

In this role, you'll partner closely with researchers to build end-to-end data acquisition and processing pipelines. You'll source novel data types, design filtering and deduplication systems, integrate active learning techniques, and help steer research directions based on model gaps. It's a role that combines engineering, research, and strategy, all at serious scale.

You'll have direct technical impact in a fast-paced, research-driven environment alongside some of the brightest minds in AI, while continuing to develop both your technical skills and your career.

The Role

Architect and maintain scalable pipelines for sourcing, deduplicating, filtering, and preparing massive datasets for training.
Partner with research scientists to identify model gaps and improve dataset relevance and diversity.
Collaborate with annotation ops to enhance dataset quality through smart filtering strategies.
Integrate self-supervised active learning and other advanced data techniques to scale systems efficiently.
Contribute directly to the performance of cutting-edge video generation models and other generative systems.
On-site in Palo Alto, CA

Ideal Candidate

Experience building large-scale data pipelines in domains like computer vision, NLP, robotics, or autonomous systems.
Strong Python skills and familiarity with deep learning frameworks such as PyTorch.
Experience working with large-scale data processing tools (e.g., SQL, Spark).
Solid understanding of distributed systems and performance-aware data infrastructure.
Proven track record of delivering robust data solutions in fast-paced, research-heavy environments.
Bonus: experience in data-centric AI, self-supervised learning, or active learning methods.

Consultant

Luca Browning

Recruitment Consultant
