ML Runtime Engineer (Mid-Level and Senior)
This role involves developing and integrating a high-performance, Rust-based runtime for AI accelerators, with a focus on inference-server compatibility with open-source ML frameworks such as PyTorch, vLLM, and SGLang. The engineer will work closely with hardware and ML teams, using a co-design approach to optimize performance for large language models. The position calls for collaboration and ownership in a fast-paced, innovative environment.
London, United Kingdom