Manager, Reliability Engineer

Plano 1 (31061), United States of America, Plano, Texas

At Capital One, we're building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.


Capital One is a diversified bank that offers a broad array of financial products and services to consumers, small businesses, and commercial clients. We nurture a work environment where people with a variety of thoughts, ideas, and backgrounds, guided by our shared values, come together to make Capital One a great company and a great place to work.

As a Reliability Engineer, you will be responsible for leading production support activities for all major systems and related subsystems. Your role is essential to ensuring the integrity and operation of critical business systems. As manager of the team, you will need a deep technical understanding of the architecture and environments the team supports.


  • Responsible for the day-to-day operations of the Capital One Home Loans application and infrastructure. Ensure system availability, capacity, and performance. Resolve incidents, events, problems, and issues.
  • Implement changes to applications and infrastructure. Configure and update appropriate monitors and alerts. Ensure systems meet Capital One standards for security and resiliency.
  • Code Chef recipes/cookbooks in an Amazon Web Services (AWS) Public Cloud environment
  • Code frameworks/APIs on AWS using Java/Python/Ruby/PHP SDKs
  • Program data ingestion/processing in any of the scripting languages
  • Deliver AWS-based infrastructure solutions using AWS CloudFormation (JSON) and Chef (Ruby) for configuration management
  • Migrate on-premises applications to AWS
  • Create models/diagrams/views to facilitate Infrastructure as a Service (IaaS), Software as a Service (SaaS), and Platform as a Service (PaaS) solutions, including JSON file creation
  • Develop procedures to automate various systems and tasks (e.g. automating code builds and deployments) including monitors and alerts as well as automated error detection using Splunk and Zabbix
  • Execute system administration of hosting platforms capable of running a variety of frameworks (Java, Node.js, Ruby, PHP, Python)
  • Assist in the code promotion process to production environments
  • Work with production SQL and NoSQL databases to optimize performance and resiliency, including disaster recovery

Basic Qualifications:

Bachelor's degree or military experience

At least 3 years of experience providing enterprise Linux-based system administration

At least 1 year of experience with Git or at least 2 years of experience with SVN or at least 1 year of experience working with Jenkins/Hudson

At least 1 year of experience with Ant or at least 1 year of experience with AnthillPro

At least 1 year of experience with Maven or at least 1 year of experience with Sonar or at least 1 year of experience with uDeploy

At least 2 years of experience with Shell or at least 1 year of experience with Ruby or at least 1 year of experience working with Python

At least 2 years of experience working with AWS cloud automation

At least 2 years of experience configuring monitoring alerts using tools like Splunk and Zabbix, as well as constructing system performance dashboards and reports

At least 2 years of experience working with ITIL foundations for Incident, Change, and Problem management

At least 2 years of experience with Chef or at least 2 years of experience with Puppet or at least 1 year of experience with Ansible

Preferred Qualifications:

1+ years of experience in an enterprise cloud environment

1+ years of experience working with Incident Management

1+ years of experience working with Release Management

2+ years of experience working with Unix shell scripting

At this time, Capital One will not sponsor a new applicant for employment authorization for this position.

No agencies please. Capital One is an Equal Opportunity Employer committed to diversity in the workplace. All qualified applicants will receive consideration for employment without regard to gender, race, color, age, national origin, religion, disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status protected by applicable national, federal, state, or local law. Capital One promotes a drug-free workplace. Capital One will consider for employment qualified applicants with a criminal history in a manner consistent with the requirements of applicable laws regarding criminal background inquiries, including, to the extent applicable, Article 23-A of the New York Correction Law; San Francisco, California Police Code Article 49, Sections 4901-4920; New York City's Fair Chance Act; Philadelphia's Fair Criminal Records Screening Act; and other applicable federal, state, and local laws and regulations regarding criminal background inquiries.

If you require an accommodation to apply for a job or to perform a job, please contact Capital One Recruiting at 1-800-304-9102 or
