Machine Learning Data Engineer--Principal Associate

7900 Westpark Drive (12131), United States of America, Tysons, Virginia

At Capital One, we're building a leading information-based technology company. Still founder-led by Chairman and Chief Executive Officer Richard Fairbank, Capital One is on a mission to help our customers succeed by bringing ingenuity, simplicity, and humanity to banking. We measure our efforts by the success our customers enjoy and the advocacy they exhibit. We are succeeding because they are succeeding.

Guided by our shared values, we thrive in an environment where collaboration and openness are valued. We believe that innovation is powered by perspective and that teamwork and respect for each other lead to superior results. We elevate each other and obsess about doing the right thing. Our associates serve with humility and a deep respect for their responsibility in helping our customers achieve their goals and realize their dreams. Together, we are on a quest to change banking for good.

Investing in the right information security capabilities is essential to what we do for Capital One in protecting our customers and our employees. As part of that mission, the CyberML team collects and analyzes vast quantities of data to help detect malware, prevent fraud, and protect customers.

As a Principal Data Engineer on the CyberML team, you will drive the design and development of cutting-edge solutions for meeting Capital One's exponentially increasing data and machine learning needs. You will tackle problems that affect Capital One's entire enterprise and provide technical leadership to the other engineers on your team.

Who You Are

  • You are interested in working on challenging problems involving scalability and performance.
  • You can effectively collaborate with other teams to work on high-profile initiatives.
  • You enjoy learning new technologies and picking up new skills.


What The Role Is
  • Collaborating as part of a cross-functional Agile team to create and enhance software that enables state-of-the-art, next-generation Big Data and Fast Data applications.
  • Building efficient and scalable storage for structured and unstructured data.
  • Developing and deploying distributed computing Big Data applications using open source frameworks like Apache Spark, Apex, Flink, NiFi, Storm, and Kafka on AWS Cloud.
  • Building and running large-scale NoSQL databases like Elasticsearch and Cassandra.
  • Utilizing programming languages like Java, Scala, and Python.
  • Designing and building applications for the cloud (AWS, Azure, GCP, DO).
  • Leveraging DevOps techniques and practices like Continuous Integration, Continuous Deployment, Test Automation, Build Automation and Test Driven Development to enable the rapid delivery of working code utilizing tools like Jenkins, Maven, Nexus, Chef, Terraform, Ruby, Git and Docker.
  • Performing unit tests and conducting reviews with other team members to make sure your code is rigorously designed, elegantly coded, and effectively tuned for performance.


Basic Qualifications
  • Bachelor's Degree or Military Experience
  • At least 3 years of programming experience in Java, Scala, Python, C++, or Golang
  • At least 3 years of experience working on data streaming applications with Apache Spark, Flink, Storm, or Kafka, or on data warehousing applications with Snowflake Analytics, Presto, AWS Athena, or AWS Redshift
  • At least 2 years of experience working with Linux-based OSes
  • At least 2 years of experience with Shell, Python, or Perl
  • At least 1 year of experience working within cloud environments
  • At least 1 year of experience with streaming analytics, complex event processing, and probabilistic data structures
  • At least 1 year of experience with columnar data stores and massively parallel processing (MPP)


Preferred Qualifications
  • M.S. or Ph.D. in Computer Science or related technical discipline.
  • Experience working with machine learning libraries such as scikit-learn, TensorFlow, PyTorch, and/or H2O.
  • Experience working with Elasticsearch and Lucene-based search.
  • Experience working with Snowflake data warehouse.
  • Experience working with container runtimes (Docker, rkt, cri-o, etc.).
  • Experience working with container frameworks (Kubernetes, Mesosphere, etc.).
  • Experience with building Machine Learning (ML) applications or implementing ML algorithms.
  • AWS Certified Solutions Architect certification (Associate or Professional).


At this time, Capital One will not sponsor a new applicant for employment authorization for this position.

