Senior Software Engineering Manager - Azure Fleet Resource Lifecycle Mgmt Team
Microsoft
About the role
Azure is a rapidly expanding and evolving cloud platform. The Azure Fleet Resource Lifecycle Management Team (formerly known as Capacity Infrastructure Service Team) is tasked with constructing the foundation and core of this platform using cutting-edge hardware and providing the infrastructure necessary for hosting worldwide customer applications.
Responsibilities
- Coaches the team’s partnership with stakeholders to clarify user requirements within the team’s owned components/features; raises cross‑team gaps through appropriate channels.
- Reviews debugging tools, tests, logs, telemetry, and other methods, and coaches others to proactively verify assumptions during development, before issues occur in production.
- Coaches others on and conducts incident retrospectives to identify root causes of problems, implements repair actions, and identifies mechanisms to prevent incident recurrence.
- Tracks and attempts to minimize cost of debugging for a set of scenarios.
- Coaches team on applying least-access principles, using logging, telemetry, and other appropriate mechanisms to investigate issues while retaining privacy and security, and driving those practices.
- Identifies and drives best practices and shares information with other engineers for building code that is based on well-established methods and secure design principles while also applying best practices for new code development and formal validation of security invariants.
- Monitors the product development and scaling to customer requirements, and applies best practices for meeting scaling needs and performance expectations and security promises.
- Advocates for use of GenAI tooling (e.g., GitHub Copilot) to enhance engineering productivity.
- Leads the team to identify dependencies and produce design docs for their owned area; coordinates with adjacent teams as needed with support from senior leaders for broader alignment.
- Sets engineering standards and reviews work so engineers optimize, debug, refactor, and reuse code to improve performance, maintainability, and ROI.
- Acts as (or assigns) DRI for the team’s services/components and runs an effective on‑call, ensuring incident response, triage, and post‑incident learning for the team’s scope; escalates systemic issues appropriately.
- Owns the team’s project and release plans, driving backlog health, delivery cadence, and quality; coordinates with PM and partner teams to land dependencies and dates.
- Ensures the team’s services meet defined scale and performance expectations for their scope, applies best practices, and surfaces risks early when goals/SLOs are at risk.
Requirements
- Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
- Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- 4+ years of software development experience.
- 1 year of complex problem-solving and design skills.
Benefits
- Health insurance
- 401k matching
- Flight privileges
- Discounts on products and services
- Savings and investments
- Maternity and paternity leave
- Generous time away
- Giving programs
- Opportunities to network and connect
Job Details
Salary Range
$122,200 - $135,800 per year
Location
Multiple Locations, Australia
Employment Type
Full-Time