Lead AI Software Architect
Microsoft
Redmond, Washington, United States
Full-time
Posted Aug 26, 2025
Up to 50% work from home
About the role
Do you want to be at the forefront of innovation in the inference systems that propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy?
Responsibilities
- Lead the software architectural design, development, and deployment of the future AI inference infrastructure optimized for Microsoft’s AI cloud.
- Collaborate closely with the hardware architecture, compiler, systems, and simulation/performance optimization teams to ensure seamless integration and optimized performance.
- Define and execute strategies for inference, cost optimizations, workload balancing, and memory optimization.
- Mentor and guide the software engineering team, setting clear technical directions and providing architectural oversight.
- Evaluate, select, and integrate third-party libraries and open-source frameworks (e.g., TensorRT, TVM, PyTorch, ONNX) for optimized inference performance.
- Act as a technical liaison between hardware engineers and software teams to communicate requirements, constraints, and opportunities for co-design.
- Identify performance bottlenecks and opportunities that bear on future hardware and system roadmap planning, influencing strategic direction.
- Ensure robust software quality and implement best practices for software engineering, testing, and continuous integration.
Requirements
- Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- 7+ years of industry experience, with at least 5 years in AI inference software stack development and architecture.
- 5+ years of experience in designing and optimizing software stacks for specialized AI hardware, including accelerators, GPUs, or custom ASICs.
- 3+ years of experience building infrastructure and identifying opportunities for end-to-end performance/TCO optimization for business-critical AI workloads.
- 3+ years of experience with AI inference frameworks and compiler toolchains such as TensorRT, ONNX Runtime, MLIR, or similar.
Benefits
- Industry-leading healthcare
- Educational resources
- Discounts on products and services
- Savings and investments
- Maternity and paternity leave
- Generous time away
- Giving programs
- Opportunities to network and connect
About the Company
Microsoft’s mission is to empower every person and every organization on the planet to achieve more.
Job Details
Salary Range
$163,000 - $296,400/yearly
Location
Redmond, Washington, United States
Employment Type
Full-time