
Head of Site Reliability Engineering (JR1026353)

3+ months ago | Newark, NJ

Company Description
Broadridge Financial Solutions, Inc. (BR), a $4 billion global Fintech leader and part of the S&P 500® Index, is a leading provider of investor communications and technology-driven solutions to banks, broker-dealers, asset and wealth managers and corporate issuers. At Broadridge, we do well by doing good. Our unique culture is guided by the Service-Profit Chain—the idea that success is mutual, directly connecting employee engagement, client satisfaction, and the creation of stockholder value. We enable better financial lives by powering investing, governance, and communications for our clients, their customers, and the financial services industry.

Job Description
Are you seeking a position within a growing company? Broadridge is hiring! Our mission is to attract, develop and retain outstanding talent. Being a place where exceptionally driven and hardworking people want to work is how we deliver award-winning services to our customers and ultimately build customer value.

The Head of Site Reliability Engineering will be responsible for leading, growing and motivating a robust team of hardworking engineers. In addition to possessing excellent leadership skills, the ideal candidate will be a highly skilled systems engineer with knowledge of code and automation. They will also have a strong drive to improve existing systems and processes to advance the way our hardware, software, and network solutions are designed and deployed.

- Drive GTO SRE strategy across the Portfolio of products and associated SRE teams within the GTO business segment

Ensure that Portfolio SRE teams:
- Have effective capabilities for ensuring production uptime and stability as well as the observability, reliability, availability, performance, capacity planning and operational support for the products across GTO.
- Have effective processes for continuously improving Service Level Objectives (SLOs), mean time to identification (MTTI), mean time to resolution (MTTR), and mean time to failure (MTTF).
- Can effectively engage in incident management and have well-defined procedures for identifying the relationships between processes and events, and their root causes.
- Focus on automation in the context of self-healing, auto-remediation, removing manual toil, orchestration tooling and infrastructure-as-code patterns.
- Ensure that the systems can withstand chaos engineering practices and fail gracefully when services are degraded.
- Ensure the means exist to quickly recover a degraded service (instrumentation, runbooks, tooling, etc.).
- Ensure adequate instrumentation and alerting exist to spot leading indicators of an impending incident in the system, as well as in systems on which the platform depends.
1. Provide leadership in iteratively defining & refining engineering processes and adoption strategies as the SRE practice grows. Motivate, lead and develop a team of diverse Portfolio SRE managers.
2. Drive SRE education across the Portfolios to improve quality and reliability.
3. Directly collaborate with Portfolio SRE Managers and their Portfolio technology and product stakeholders to understand their strategies and needs and incorporate them into GTO SRE strategy.
4. Provide strategy, process and tooling so that Portfolio SRE Managers can effectively and consistently communicate service reliability, performance and superior customer experience metrics to their respective Portfolio technology and business partners.
5. Provide best-practice SRE considerations to the Enterprise Architecture and Portfolio Application Development Management teams so that stability and reliability are incorporated into new solutions.

- More than 15 years of relevant working experience with a strong technical background
- Strong experience driving transformation programmes and processes for continuous improvement
- Excellent knowledge of distributed architecture, Cloud, microservices, SOA, IaaS and PaaS as related to design patterns
- Ability to identify potential design issues and present valid solutions/options during the design phases
- Experience in Agile and Test-Driven Development (TDD) methodologies
- Experience in leading SRE teams and/or DevOps functions or similar
- Understanding of what it takes to support applications and their related infrastructure in a production environment (Service Level Agreements)
- Experience growing and building multiple highly effective teams.
- Experience collaborating across organizational boundaries, forming alliances with other members of the GTO management leadership team and building bridges that support functional as well as company goals.
- Ability to identify trends and promote solutions that solve challenges efficiently across the organization
- Highly collaborative team player who can build strong relationships at all levels of the technology and business organizations

- Experience working through the definition, design, release and run cycle of bringing software products to market
- Experience with DevOps, ITIL, Cloud Services, IT Infrastructure and Operations, including environment stand-up, server builds, firewalls, security and regulatory compliance.
- Experience with any object-oriented language
- Proficiency working in Unix/Linux environments.
- Experience with IBM MQ, Kafka, Postgres
- Familiarity with Amazon cloud solutions and architectures such as EC2, S3, CloudFormation, DynamoDB, Route 53, IAM, ELB, CloudWatch, Lambda, Kinesis, etc.
- Experience with logging, monitoring and alerting frameworks for hybrid cloud or third-party services using AppDynamics, Splunk, Datadog and CA APM.
- Experience with the Atlassian toolset (Jira/Confluence) and agile development practices
- Experience with tools such as Jenkins, Ansible, Git, Maven, Nexus, Chef, Docker, Terraform, Kubernetes, Pivotal Cloud Foundry and Concourse

Why join Broadridge?
- We offer a valuable benefits package, effective from the date of hire, including comprehensive healthcare (medical/prescription drug, dental, and vision), wellness benefits, fertility treatment, inclusive parental leave, life insurance, tuition reimbursement, disability benefits and more.
- We support a number of associate-led networks where associates with similar backgrounds and interests can find peer support, shape company policy and culture, receive mentorship from senior members, and develop their careers.
- We provide a competitive 401k plan with employer match and additional basic company contribution.
- We encourage our associates to rest and relax by taking advantage of our generous paid time off program including vacation, personal holidays, sick time, and paid company holidays.
- We strive to improve both our own environmental impact and that of our industry by partnering with our clients to decrease their carbon footprint through digitization, data, and innovation.

