GPU Software Architecture Engineer
Apple
Cupertino, California, U.S.
Full-time, Regular
Posted Oct 28, 2025
Onsite
About the role
Design and implement tensor/data/expert parallelism strategies for large language model inference across distributed server cluster environments
Responsibilities
- Design and implement tensor/data/expert parallelism strategies for large language model inference across distributed server cluster environments
- Drive hardware and software roadmap decisions for ML acceleration
- Design architectures that achieve peak compute utilization and optimal memory throughput
- Develop and optimize distributed inference systems with a focus on latency, throughput, and resource efficiency across multiple nodes
- Collaborate with hardware teams on next-generation accelerator requirements and software teams on framework integration
- Lead performance analysis and optimization of ML workloads, identifying bottlenecks in compute, memory, and network subsystems
- Drive adoption of advanced parallelization techniques, including pipeline parallelism, expert parallelism, and other emerging approaches
Requirements
- Strong knowledge of GPU programming (CUDA, ROCm) and high-performance computing
- Excellent systems programming skills in C/C++
- Deep understanding of distributed systems and parallel computing architectures
- Experience with inter-node communication technologies (InfiniBand, RDMA, NCCL) in the context of ML training/inference
- Understanding of how tensor frameworks (PyTorch, JAX, TensorFlow) are used in distributed training/inference
Benefits
- Comprehensive medical and dental coverage
- Retirement benefits
- Range of discounted products and free services
- Reimbursement for certain educational expenses
- Discretionary bonuses or commission payments
- Relocation assistance
About the Company
Apple is a technology company that combines hardware, software, and internet services to make products like the iPhone, Mac, iPad, Apple Watch, and Apple TV.
Job Details
Salary Range
Salary not disclosed