Machine Learning Data Engineer

11 West 19th Street (22008), United States of America, New York, New York

At Capital One, we're building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.


Are you a high-performing data engineer or data scientist looking to take part in some of the most cutting-edge research and production projects? Do you enjoy reading about and investigating advancements in applied machine learning architectures and solution white papers? Would you like to take part in, or drive, the creation of publishable advancements in machine learning across various disciplines? You could be a great match for a Machine Learning Data Engineer role at Capital One's Center for Machine Learning (C4ML).

C4ML is a highly technical team focused on consulting, research, and building machine learning products for the enterprise. We have the highest executive support for acting as a catalyst for machine learning across Capital One. As a Machine Learning Data Engineer in C4ML, you will work on one of our cutting-edge products at the intersection of machine learning, distributed computing, and DevOps, leveraging technologies like AWS, Kubernetes, Docker, TensorFlow, Spark, and Kafka to build a containerized platform for deploying distributed frameworks. Our goal is to handle the machine learning and big data infrastructure needs of the entire company. You will also have opportunities to work in the consulting and research branches of the team.

What you will bring to the role:

  • Excellent communication skills, evidenced by multiple white papers (internal/proprietary or externally published).
  • Demonstrated ability to build full stack systems architected for speed and distributed computing.
  • Demonstrated ability to quickly learn new tools and paradigms to deploy cutting-edge solutions.
  • Experience mentoring junior engineers.
  • Adept at simultaneously working on multiple projects, meeting deadlines, and managing expectations.


What you will do in the role:
  • Act as an advisor to various lines of business to help create or improve projects.
  • Develop both deployment architecture and scripts for automated system deployment in AWS.
  • Code new machine learning paradigms, sometimes from first principles, for integration into production systems.
  • Learn and work with subject matter experts to create large scale deployments using newly researched methodologies.
  • Construct data staging layers and fast real-time systems to feed machine learning algorithms.
  • Create white papers, attend conferences, and contribute to open source software.


Basic Qualifications:
  • Bachelor's Degree or Military Experience.
  • At least 2 years of experience designing and building full stack solutions utilizing distributed computing.
  • At least 2 years of experience working with Python, Scala, or Java.
  • At least 2 years of experience with distributed file systems or multi-node database paradigms.


Preferred Qualifications:
  • Master's Degree or PhD.
  • At least 2 years of experience deploying production applications to a cloud services provider, such as AWS.
  • At least 2 years of experience with machine learning or deep learning frameworks, such as TensorFlow, PyTorch, or H2O.
  • At least 2 years of experience with distributed data movement frameworks, such as Spark, Kafka, or Dask.
  • At least 3 years of experience with a container orchestration platform, such as Kubernetes.
  • At least 5 years of experience with CI/CD technologies, such as Ansible, CloudFormation, or Jenkins.
  • At least 5 years of experience leading teams in code development.


At this time, Capital One will not sponsor a new applicant for employment authorization for this position.
