Lead AI Software Architect

Microsoft

Redmond, Washington, United States
Full-time
Posted Aug 26, 2025
Up to 50% work from home

About the role

Do you want to be at the forefront of innovating the latest inference systems to propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?

Responsibilities

  • Lead the software architectural design, development, and deployment of the future AI inference infrastructure optimized for Microsoft’s AI cloud.
  • Collaborate closely with hardware architecture, compiler, systems, and simulation/performance optimization teams to ensure seamless integration and optimized performance.
  • Define and execute strategies for inference, cost optimization, workload balancing, and memory optimization.
  • Mentor and guide the software engineering team, setting clear technical directions and providing architectural oversight.
  • Evaluate, select, and integrate third-party libraries and open-source frameworks (e.g., TensorRT, TVM, PyTorch, ONNX) for optimized inference performance.
  • Act as a technical liaison between hardware engineers and software teams to communicate requirements, constraints, and opportunities for co-design.
  • Identify performance bottlenecks and opportunities that inform future hardware and system roadmap planning, influencing strategic direction.
  • Ensure robust software quality and implement best practices for software engineering, testing, and continuous integration.

Requirements

  • Bachelor's degree in Computer Science or a related technical field AND 8+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 7+ years of industry experience, with at least 5 years in AI inference software stack development and architecture.
  • 5+ years of experience in designing and optimizing software stacks for specialized AI hardware, including accelerators, GPUs, or custom ASICs.
  • 3+ years of experience building infrastructure and identifying opportunities for end-to-end performance/TCO optimization for business-critical AI workloads.
  • 3+ years of experience with AI inference frameworks and compiler toolchains such as TensorRT, ONNX Runtime, MLIR, or similar.

Benefits

  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect

About the Company

Microsoft’s mission is to empower every person and every organization on the planet to achieve more.

Job Details

Salary Range

$163,000 - $296,400 per year

Location

Redmond, Washington, United States

Employment Type

Full-time
