Machine Learning - Model Serving Job at Alexander Chapman, San Francisco, CA

  • Alexander Chapman
  • San Francisco, CA

Job Description

We are working with a company building intuitive, voice-first AI systems that blend natural interaction with powerful model performance. Founded by leaders from Meta, Oculus, and Google, they’re creating a new class of consumer devices powered by speech, vision, and LLMs.

The Role

You’ll help optimize and scale the inference stack, working across model serving, performance tuning, and deployment to support real-time, multimodal AI.

What You’ll Do

  • Improve serving systems for LLMs, speech, and vision models.
  • Optimize throughput, latency, and cost using techniques such as batching, caching, and kernel tuning.
  • Extend frameworks like vLLM or SGLang to push the limits of performance.
  • Collaborate with training teams to deploy faster, lighter models.
  • Experiment with compilers and hardware backends to boost efficiency.

What We’re Looking For

  • Strong experience with PyTorch or similar ML frameworks.
  • Deep knowledge of model serving and systems performance.
  • Skilled in low-level debugging, bottleneck analysis, and server optimization.
  • Familiar with vLLM or Ray, or experienced in deploying inference workloads at scale.
  • Comfortable owning complex infrastructure projects end to end.
  • Background in computer science or a related field from a top-tier university (e.g., Stanford, MIT, Ivy League).
  • Experience at a top tech company (e.g., FAANG) or a successful, high-growth startup.

They’re looking for curious, impact-driven engineers ready to push what’s possible with real-time AI.
