Staff Software Engineer, Machine Learning Platform

Discord

Discord is used by over 200 million people every month for many different reasons, but there’s one thing that nearly everyone does on our platform: play video games. Over 90% of our users play games, spending a combined 1.5 billion hours playing thousands of unique titles on Discord each month. Discord plays a uniquely important role in the future of gaming. We are focused on making it easier and more fun for people to talk and hang out before, during, and after playing games.

This position is US based only.

The Machine Learning Platform (MLP) at Discord is responsible for the end-to-end model lifecycle across all ML applications. We sit at the intersection of machine learning engineers (MLEs), core infrastructure, and ML consumers to provide tools, capabilities, and services that make machine learning easy, safe, and widely accessible. In this role, you will work on everything from feature stores and real-time data processing to LLM tooling and model serving at scale. You will lead projects and work directly with ML practitioners as well as other staff+ engineers to shape the landscape of Discord’s backend data systems. You will report to the Senior Engineering Manager of the ML Platform team.

What you’ll be doing

  • Design and build the platform ML engineers and data scientists use to understand and delight Discord’s users and keep them safe
  • Evaluate and integrate new machine learning frameworks and tools to ensure that Discord keeps up with the fast-moving world of ML, including LLMs and generative AI
  • Collaborate with model builders to ensure we have a smooth path from idea to production
  • Set best practices in machine learning at Discord
  • Create foundational datasets and models

What you should have

  • 8+ years of experience working as a software engineer in data or backend with exposure to large datasets or distributed systems
  • 4+ years working on platforms or infrastructure
  • 2+ years working on machine learning platforms
  • Know-how with orchestration systems (such as Airflow, Dagster, or Argo)
  • You’ve put machine learning models into production

Bonus points

  • Experience with real-time data processing (Spark, Flink, Dataflow, Kafka, Pulsar, etc.)
  • Experience debugging and maintaining live production systems on Kubernetes
  • Experience building ML models using modern frameworks
