Research Engineer, RL Scaling Science

Requirements

  • Strong empirical research skills in Reinforcement Learning, large-scale ML training, or a closely adjacent area
  • Demonstrated ability to own large experiments end-to-end, from design through interpretation
  • Proficiency in Python and experience working with large-scale or distributed ML systems
  • Comfort operating at the research/systems boundary, including debugging where the two meet
  • Care about the societal impacts of AI and responsible scaling

Nice to Haves

  • Published or shipped work in long-horizon RL or RL fundamentals
  • Experience translating research findings into production training recipes
  • Demonstrated large-scale industry impact via RL interventions
  • Experience working on frontier-scale training runs with long trajectories

What You'll Be Doing

  • Design, run, and interpret large-scale RL experiments, reasoning rigorously about what the data does and doesn't show
  • Investigate how RL improves as horizon, compute, and model size grow
  • Build and maintain benchmarks for long-horizon RL so progress is measurable and reproducible
  • Translate validated findings into production training recipes, exercising judgment about when a result is robust enough to ship
  • Debug complex issues at the seam where research meets infrastructure, including failures that only appear at scale
  • Partner closely with adjacent RL teams across research and engineering to advance our overall RL stack

Perks and Benefits

  • Generous vacation and parental leave
  • Flexible working hours
  • Competitive compensation and benefits
  • Optional equity donation matching
  • Lovely office space in which to collaborate with colleagues

Anthropic

London, UK

Experience: Senior
Posted: June 22, 2026
Last seen: 2 hours ago
Python
machinelearning

Why we track Anthropic

Anthropic is an AI safety company building Claude, one of the most capable large language models. They have engineering teams in London, Dublin, and Zurich working on core model development, infrastructure, and safety research. It is one of the highest-paying companies in AI.
