Lead SRE - AWS, Terraform
As a Lead Site Reliability Engineer within the CIB Markets Sales, Research and Data organization at JPMorgan Chase, you will play a pivotal leadership role in your team. You will leverage your strong knowledge across multiple technical domains to advise and support software engineering teams globally. This hands‑on role involves migrating and managing applications in the public cloud while promoting SRE principles and practices. You will work on initiatives such as unified telemetry, application and infrastructure modernization, SLO and SLI onboarding, advanced deployment strategies, and performance and scalability improvements, all aimed at reducing operational risks and enhancing our products.
Job Responsibilities
- Design, code, test, and deliver software to automate manual operational work, including self‑healing and resiliency patterns for public cloud services and engineering teams.
- Define and implement a telemetry strategy, including rollout of APM and cloud telemetry.
- Act as a culture carrier and champion of site reliability adoption for your team by demonstrating site reliability principles and practices daily and mentoring technologists within the organization.
- Troubleshoot priority and escalation incidents, facilitate blameless post‑mortems, and ensure permanent closure of incidents and subsequent problem tasks.
- Engage with development teams throughout their SDLC and advocate for designing software for reliability and scale from the start, minimizing later refactoring or changes.
- Identify application patterns and analytics to support better service level objectives.
- Design automated software and product upgrades, change management, and release management solutions.
- Provide comprehensive and ongoing guidance, tools, and solutions to support the firm’s growth.
- Work towards becoming an expert on the applications and platforms in your remit by understanding interdependencies and limitations and driving evolution and debugging of critical components.
Required Qualifications, Capabilities and Skills
- Bachelor’s degree or equivalent experience in a software engineering discipline.
- Demonstrated experience working with a major public cloud provider (Amazon Web Services) and infrastructure as code (Terraform).
- Advanced understanding of site reliability culture and principles and a track record of implementing site reliability within an application or platform, including key SRE concepts such as SLOs and error budgets.
- Advanced knowledge of and experience with observability capabilities across applications (metrics, tracing, SLOs), alerting, and telemetry collection, and the ability to design monitoring and dashboards for critical and golden signals.
- Strong communication skills and a desire to mentor and educate others on site reliability engineering principles and practices.
Preferred Qualifications, Capabilities and Skills
- Experience defining non‑functional standards and blueprints related to supportability – logging, alerting, resiliency patterns, etc.
- Working knowledge of infrastructure components (e.g., routers, load balancers, cloud products, container systems, compute, storage, and networks).
- Ability to partner with and influence architecture teams in defining non‑functional application supportability standards.
- Proven leadership skills with a drive for continuous improvement.
- AWS Cloud Certification, Linux Foundation CKA/CKAD, Terraform Associate and other relevant certifications are a plus.
About Us
J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world’s most prominent corporations, governments, wealthy individuals and institutional investors. We strive to build trusted, long‑term partnerships to help our clients achieve their business objectives. We are an equal‑opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute and we also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.
About The Team
J.P. Morgan’s Commercial & Investment Bank is a global leader across banking, markets, securities services and payments. Corporations, governments and institutions throughout the world entrust us with their business in more than 100 countries.
Seniority level
Not Applicable
Employment type
Full‑time
Job function
Engineering and Information Technology