Microsoft Corporation Principal Site Reliability Engineering Manager in Sunnyvale, California

The Office 365 SRE team is responsible for maintaining and engineering some of the largest collaboration systems in the world. With hundreds of millions of users and tens of thousands of servers, our systems must be reliable, secure, and performant at all times. We are seeking a highly talented Site Reliability Engineering Manager who has experience operating Internet services at massive scale, a background in software engineering, and familiarity with data science principles.

Our service engineers are tenacious problem solvers who take the initiative to investigate and own problems through to completion. The team requires a manager who can provide leadership, strategic insight, technical guidance, and prioritization. This expansive role offers diverse opportunities such as:

  • Improving the performance, availability and security of the live site by identifying, creating and implementing data signals for alerting on user issues

  • Performing in-depth data analysis to understand service behavior and trends for anomaly detection

  • Creating solutions to improve the operation of the service, such as tools to visualize data, manage servers and automate processes

The successful candidate will work collaboratively with many teams and forge productive partnerships across the organization. We need an individual with exceptional technical and communication skills.

Required Skills & Experience

  • 7+ years programming in a high-level, object-oriented language (C# preferred).

  • 7+ years scripting (PowerShell preferred).

  • 5+ years running online services at scale.

  • Data science / analysis fundamentals.

  • Extensive knowledge of Windows Server and Internet Information Services (IIS).

  • Experience automating repetitive tasks with scripting or applications.

Preferred Skills & Experience

  • BA/BS degree in Computer Science or Information Systems.

  • Networking fundamentals, design and troubleshooting (routing, switching, data flow, etc.).

  • Good understanding of network protocols and data formats such as HTTP, SMTP, POP, SOAP and JSON.

  • Experience administering and deploying applications to Windows Azure.

  • Understanding of the software development life cycle and experience with the testing processes, build process, configuration management, release management, and server deployments.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to Services (engineering)