
Roblox
As a Senior / Principal Inference Engineer on ML Platform, you will build the next generation of ML ecosystem tooling, focused on model inference. ML Platform today serves billions of requests per day across our homepage, marketplace, economy, and more. We are looking for accomplished engineers to help build the next generation of tooling for high-scale inference in a rapidly evolving space.
You Will:
- Set technical strategy and oversee development of reliable, high-scale infrastructure systems for inference, especially as we grow both inference QPS and model size.
- Dig into performance bottlenecks across the entire inference stack, from model-level optimizations to infrastructure-level optimizations.
- Stay abreast of industry trends in machine learning and infrastructure to ensure the adoption of leading-edge technologies and practices.
- Bootstrap and maintain infrastructure for ML Platform components—Serving Layer, Metadata Store, Model Registry, and Pipeline Orchestrator.
- Partner across organizations to build tooling, interfaces, and visualizations that make ML@Roblox a delight to use.
You Have:
- 4+ years of professional experience and a well-stocked tool chest of system design expertise to draw on when building scalable, reliable platforms for all of Roblox.
- Experience building complex distributed systems for real-time ML inference serving, ideally recommendation systems handling millions of QPS.
- Experience debugging complicated infrastructure-level performance issues to enable low-latency, high-throughput inference.
- Bachelor’s degree or higher in Computer Science, Computer Engineering, Data Science, or a similar technical field.
You Are:
- Passionate about supporting and working cross-functionally with internal partners (Data Scientists and ML Engineers) to understand and meet their needs.
- A reliability nut: you love digging into tricky postmortems and identifying and fixing weaknesses in complicated systems.
- Ideally familiar with ML model inference frameworks such as Triton Inference Server, TensorRT, and KServe.
For roles that are based at our headquarters in San Mateo, CA: The starting base pay for this position is shown below. Actual base pay depends on a variety of job-related factors such as professional background, training, work experience, location, business needs, and market demand; in some circumstances, the actual salary may fall outside of this expected range. This pay range is subject to change and may be modified in the future. All full-time employees are also eligible for equity compensation and benefits.
Annual Salary Range
$273,070–$338,270 USD