Senior Machine Learning Engineer, Ads Training Platform

Requirements

  • 5+ years in infrastructure/platform engineering or large-scale distributed systems.
  • 2+ years of hands-on experience with the Ray platform.
  • Strong understanding of distributed computing principles (task scheduling, fault tolerance, state management).
  • Experience with distributed storage systems and large-scale data processing.
  • Proven ability to debug and profile distributed jobs.

What You'll Be Doing

  • Design, build, and maintain large-scale distributed training infrastructure for Ads ML models.
  • Develop tools and frameworks on top of the Ray platform.
  • Build tools to debug, profile, and tune distributed training jobs for performance and reliability.
  • Integrate with object storage systems and improve data access patterns.
  • Collaborate with ML engineers to reduce model training time and GPU training costs and to improve training efficiency.
  • Drive improvements in scheduling, state management, and fault tolerance within the training platform to enhance overall performance.

Nice to Haves

  • Experience with deep learning frameworks (PyTorch, TensorFlow) is a big plus.
  • Bonus: experience with model optimization for distributed training or with Ads ML.

Perks and Benefits

  • Private pension plan with employer matching
  • 100% employer-sponsored group medical plan
  • Income Replacement Programs
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Reddit Global Days Off

Reddit

Location: Netherlands (Remote)

Experience: Senior
Posted: August 19, 2025
machinelearning
