Lead AI Software Architect

Microsoft

Redmond, Washington, U.S.

Full-time

Posted Aug 26, 2025

Up to 50% work from home

About the role

Do you want to be at the forefront of building the latest inference systems that propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capability and cross-team collaboration with business insight and strategy?

Responsibilities

Lead the software architectural design, development, and deployment of the future AI inference infrastructure optimized for Microsoft’s AI cloud.
Collaborate closely with the hardware architecture, compiler, systems, and simulation/performance optimization teams to ensure seamless integration and optimized performance.
Define and execute strategies for inference, cost optimizations, workload balancing, and memory optimization.
Mentor and guide the software engineering team, setting clear technical directions and providing architectural oversight.
Evaluate, select, and integrate third-party libraries and open-source frameworks (e.g., TensorRT, TVM, PyTorch, ONNX) for optimized inference performance.
Act as a technical liaison between hardware engineers and software teams to communicate requirements, constraints, and opportunities for co-design.
Identify performance bottlenecks and opportunities that intersect with future hardware and system roadmap planning, influencing strategic direction.
Ensure robust software quality and implement best practices for software engineering, testing, and continuous integration.

Requirements

Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
7+ years of industry experience, with at least 5 years in AI inference software stack development and architecture.
5+ years of experience in designing and optimizing software stacks for specialized AI hardware, including accelerators, GPUs, or custom ASICs.
3+ years of experience building infrastructure and identifying opportunities for end-to-end performance/TCO optimization for business-critical AI workloads.
3+ years of experience with AI inference frameworks and compiler toolchains such as TensorRT, ONNX Runtime, MLIR, or similar.

Benefits

Industry leading healthcare
Educational resources
Discounts on products and services
Savings and investments
Maternity and paternity leave
Generous time away
Giving programs
Opportunities to network and connect

About the Company

Microsoft’s mission is to empower every person and every organization on the planet to achieve more.

Job Details

Salary Range

$163,000 - $296,400 per year

Location

Redmond, Washington, U.S.

Employment Type

Full-time
