Identity and Access Management Site Reliability Engineer (SRE)

Morgan Stanley
permanent
Alpharetta, Georgia, United States of America
Posted February 14, 2026

Job Description

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is a Lead Software Production Management & Reliability Engineering position at Director level. It is part of the job family responsible for overseeing the production environment, ensuring the operational reliability of deployed software, and implementing strategies to optimize performance and minimize downtime.

Since 1935, Morgan Stanley has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

Morgan Stanley Wealth Management (MSWM) Technology is the global technology department responsible for the design, development, delivery and support of the technical solutions behind the products and services used by the Morgan Stanley Wealth Management (MSWM) business. The department is comprised of 10 organizations: Sales, Banking & Corporate-Client Technology, Investment Products & Markets Technology, Client Reporting, Core Processing, Private and International Wealth Management Technology, Technology Integration Office, Enterprise Infrastructure & Production Management, Capital Markets Application & Data Services, Deployment Planning & Release Management, and the Chief Operating Office.

Position Description

This position is for an experienced Identity and Access Management SRE (Senior Associate/Director) in Morgan Stanley Wealth Management, working on the Core Platform Services Identity and Access Management team at Morgan Stanley’s Alpharetta office.

The Identity and Access Management team is part of the Wealth Management Organization at Morgan Stanley. The SRE Senior Associate/Director position is a highly visible, critical role on a team of technical SMEs managing the stability and optimization of the Wealth Management entitlement platform. Scope includes, but is not limited to, day-to-day support of the organization’s technology-related outages and collaboration on technology projects focused on stability, optimization, business impact analysis, and associated risk-related methodologies. This role is responsible for the overall stability of the Wealth Management entitlement platform across distributed and mainframe environments, and for participation in key initiatives and transformation.

We are looking for colleagues with a strong sense of ownership and the ability to drive solutions. The role is primarily responsible for continually maturing SRE practices by implementing observability standards and ensuring that the WM entitlement platform is stable.

The candidate is expected to participate in squads, handle incidents and outages, support release management, and continue enhancing monitoring and alerting for the entitlement platform used for data security administration and entitlement management.

The ideal candidate will be a self-motivated team player committed to delivering on time and able to work with minimal or no supervision.

Responsibilities

Engage with the Wealth Management Access Management SRE and Development teams throughout the life cycle to support building applications for reliability on the entitlement platform, and work with other stakeholders in WM Infrastructure Risk and Technical Risk Central Risk Service

Understand access management and data security administration controls and ensure they are followed

Develop software to automate manual operational work and support continual transformation

Develop an observability dashboard on Prometheus and Grafana

Run, maintain and improve the service against established Service Level Objectives by applying software engineering principles

Take responsibility for the availability, performance, change management, monitoring, and capacity management of the services supported

Troubleshoot priority incidents, conduct blameless post-mortems and ensure permanent closure of the incidents

Analyze patterns of production incidents, develop permanent remediation plans, and implement automation to prevent future incidents from occurring through software engineering

Facilitate maximum speed of delivery by objectively adhering to the error budgets of the service

Manage the split of effort between manual operational work and engineering work

Interface with Product Engineering from a development and testing perspective on current and proposed production solutions

The workload for the position would comprise 50% supporting software development and 50% Technology Operations (the proportionate allocation might vary)

Provide back-up coverage as a team player and share rotating weekend support coverage with other IAM Managers

Identify opportunities to adopt new technologies that address and transform existing needs, and design for future challenges

Requirements & Skills

Required Skills

Bachelor’s degree and/or extensive relevant experience

Experience with the Site Reliability Engineering (SRE) discipline

5-9 years’ experience in a technology or technology troubleshooting environment

Good understanding of tools: Splunk, Ansible, AppDynamics, Nagios, Grafana, Prometheus

Understanding of the JVM

Advanced hands-on MS SQL experience

Unix or Linux experience

Working experience with automation

Knowledge of Docker, containers and Kubernetes

Exposure to public or private cloud concepts (AWS, Azure, Google Cloud, OpenStack)

Prior experience with Identity and Access Management

Proficiency with the Microsoft Office suite of tools (Excel/Word/PowerPoint)

Solid customer service and interpersonal skills

Strong organizational and problem-solving skills

Excellent verbal and written communication skills

Desired Skills

Experience in Wealth Management or a similar financial environment

Prior mainframe RACF/DB2 experience and/or RACF certification, with a basic understanding of JCL/COBOL

How to Apply

Apply directly through the application link above, or