[Remote] Software Co-Design AI HPC Systems

Worldwide Salaried Open

Note: This is a remote position open to candidates in the USA. Microsoft is a leading technology company dedicated to empowering individuals and organizations. The Software Co-Design AI HPC Systems role focuses on architecting and optimizing next-generation AI systems, collaborating across hardware and software to improve performance and efficiency.

Responsibilities

  • Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks
  • Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements
  • Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems
  • Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps
  • Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations
  • Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs
  • Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams
  • Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization

Skills

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Strong background in one or more of the following areas: AI accelerator or GPU architectures; distributed systems and large-scale AI training/inference; high-performance computing (HPC) and collective communications; ML systems, runtimes, or compilers; performance modeling, benchmarking, and systems analysis; hardware–software co-design for AI workloads
  • Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development
  • Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders
  • Experience designing or operating large-scale AI clusters for training or inference
  • Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications
  • Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)
  • Background in performance modeling and capacity planning for future hardware generations
  • Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews
  • Publications, patents, or open-source contributions in systems, architecture, or ML systems are a plus

Benefits

  • Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Company Overview

  • Microsoft is a software corporation that develops, manufactures, licenses, supports, and sells a range of software products and services. It was founded in 1975 and is headquartered in Redmond, Washington, USA, with more than 10,000 employees. Its website is https://www.microsoft.com.

Company H1B Sponsorship

  • Microsoft has a track record of offering H1B sponsorships, with 1317 in 2026, 9192 in 2025, 9343 in 2024, 7677 in 2023, 11403 in 2022, 7210 in 2021, 7852 in 2020. Please note that this does not guarantee sponsorship for this specific role.