Data Engineer, Retail Services

Description

The Retail Systems Business Intelligence team is looking for a talented, smart, and experienced Data Engineer to help us build an exceptional platform to support business reporting, data analysis, and machine learning applications.

Our data is consumed by key stakeholders across Amazon Retail business teams and Executive Management.

As an Amazon.com Data Engineer you will be working in one of the world's largest and most complex data warehouse environments. You should be skilled in the architecture of DW solutions for the Enterprise using multiple platforms (RDBMS, Columnar, NoSQL, Cloud). You should have extensive experience in the design, creation, management, and business use of extremely large datasets. You should have excellent business and communication skills to be able to work with business owners to develop and define key business questions, and to build data sets that answer those questions. Above all, you should be passionate about working with data and inventing new and elegant solutions to support our internal customers' needs.

As a Data Engineer on our team, you will develop new data pipelines that leverage cloud architecture and perform transformations on existing data to support new use cases. You will also help build out Redshift as our primary data warehouse solution, creating the curated data model for the enterprise to leverage. You will directly support scientists and BI engineers as they produce state-of-the-art machine learning models, and help us deploy Tableau as our presentation layer. This is an exciting opportunity to learn emerging AWS-built analytics services as an actual part of the Amazon organization.

Responsibilities

  • Interfacing with business customers, gathering requirements, and developing new datasets in the data warehouse
  • Building and migrating complex ETL pipelines from various upstream databases to Redshift and Elastic MapReduce so the system can scale elastically
  • Optimizing the performance of business-critical queries and dealing with ETL job-related issues
  • Tuning application and query performance using Unix profiling tools and SQL
  • Identifying data quality issues in Redshift and addressing them immediately to provide a great user experience
  • Extracting and combining data from various heterogeneous data sources
  • Designing, implementing and supporting a Tableau platform that can provide ad-hoc access to large datasets
  • Modeling data and metadata to support ad-hoc and pre-built reporting
  • Experience working with large data sets in order to extract business insights or build predictive models (data mining, machine learning, regression analysis)
  • Broad knowledge of applied mathematics (probability and statistics, linear algebra, mathematical optimization)
  • Expertise in statistical tools (SAS, R) for large-scale data processing and analysis

Basic Qualifications

The ideal candidate would have business acumen, strong technical skills, superior written and verbal communication skills, and the ability to influence and participate in cross-functional teams. S/he would be a self-starter, comfortable with ambiguity, able to think big (while paying careful attention to detail), and enjoy working in a fast-paced dynamic environment.

Bachelor's degree in Computer Science or related field

  • 5+ years of relevant experience
  • SQL proficiency
  • Experience with MySQL or MS SQL or Oracle replication, clustering, and scaling
  • Experience designing and developing large-scale data structures for business intelligence analytics using ETL/ELT processes, data modeling, SQL, and Oracle
  • Experience implementing data structures using best practices in data modeling
  • Experience dealing with large databases
  • Experience with managing production database servers
  • Administration of MS/Linux servers and storage
  • Experience in supporting internal and external customers

BS or MS in Computer Science, Mathematics, Statistics, Economics, or related field

  • Experience working with large data sets in order to extract business insights or build predictive models (data mining, machine learning, regression analysis)
  • Broad knowledge of applied mathematics (probability and statistics, linear algebra, mathematical optimization)
  • Expertise in statistical tools (SAS, R) for large-scale data processing and analysis

Preferred Qualifications

Experience with Big Data Technologies (Hadoop, Hive, HBase, Pig, Spark, etc.)

  • Coding proficiency in at least one modern programming language (Python, Ruby, Java, etc.)
  • Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
  • Experience building data products incrementally and integrating and managing datasets from multiple sources
  • Query performance tuning skills using Unix profiling tools and SQL
  • Experience leading large-scale data warehousing and analytics projects, including using AWS technologies (Redshift, S3, EC2, Data Pipeline) and other big data technologies
  • Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
  • Linux/UNIX command line (shell scripting)
  • Experience with AWS services
  • A desire to work in a collaborative, intellectually curious environment
  • Tableau report development experience
