Site Reliability Engineer II
Microsoft
Reston, Virginia, U.S.
Full-Time
Posted Oct 15, 2025
3 days / week in-office
About the role
Do you have a passion for high-scale services and working with some of Microsoft’s most critical customers? We’re looking for a Site Reliability Engineer II with the right mix of software development experience, online services experience, and passion for quality to envision, design, and deliver Office 365 government cloud service offerings.
Responsibilities
- Demonstrates working knowledge of distributed systems design, interactions between cloud technology layers and components, common dependencies at scale, and the code that defines infrastructures.
- Develops an understanding of the code, features, and operations of specific products at scale as required to contribute to incremental improvements in product availability, reliability, efficiency, observability, and/or performance; participates in onboarding, code/design reviews, and regular meetings with the engineering teams that develop and/or manage those products.
- Researches and maintains an awareness of industry trends, advances in distributed systems and cloud technologies, new tools, and/or processes for maintaining and improving product availability, reliability, efficiency, observability, and/or performance.
- Contributes to the implementation of new solutions within their team by identifying ways they can be applied to solve persistent problems.
- Leverages experience with large-scale distributed systems and specific products, as well as objective insights drawn from analyses of production telemetry data, to suggest changes or add-ons to product features or code to improve the availability, reliability, efficiency, observability, and performance of product components or features supported by their team.
- Develops and tests basic changes to optimize code and improve the observability, reliability and operability of a defined range of platform, system, or product components or features with direction from other engineers.
- Engages with product engineering teams by participating in code/design reviews, regular meetings, on-call rotations, and incident responses throughout product development and operations cycles.
- Leverages knowledge of underlying systems/platforms and insights drawn from engagements with product engineering teams and telemetry analyses to propose potential improvements in the code base and designs across components and features of one or more products.
- Designs, develops, and maintains telemetry pipelines and monitoring tools that detail operations metrics (e.g., availability, reliability, performance, efficiency) of product components and features operating at scale.
- Independently performs analyses using existing tools and/or models to identify insights and shares them with product engineering teams to directly contribute to improvements in product development and/or operations; monitors the impact of changes on operations metrics (e.g., Time-to-X).
Requirements
- Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ year(s) technical experience in software engineering, network engineering, or systems administration
- OR equivalent experience
Benefits
- 401k matching
- Health insurance
- Discounts on products and services
- Savings and investments
- Maternity and paternity leave
- Generous time away
- Giving programs
- Opportunities to network and connect
About the Company
At Microsoft, we can offer you a collaborative team, exciting challenges, and a fun place to work. The work environment empowers you to have a positive impact on millions of end users.
Job Details
Salary Range
Salary not disclosed