Staff Machine Learning Architect - Assembly Coding and Performance Engineer
Job Overview: High-performance ML workloads on Arm CPUs require the co-development of algorithms and highly optimized CPU kernels. In CT-ML (Central Technology, Machine Learning), rapid kernel prototyping is crucial for exploring algorithms and assessing trade-offs between model accuracy and performance. Successful prototypes drive future CPU architecture development and serve as deliverables...