- Seattle, WA
AWS Support is one of the largest and fastest-growing business units within AWS. We are a highly technical, innovative organization that is revolutionizing customer engagement processes and offers top-notch technical support across the AWS portfolio of products and features. We are determined to redefine the word "Support" and lead the industry with best-in-class technology.
We are looking for an exceptional Data Engineer who is passionate about data and the insights that large data sets can provide. The ideal candidate will possess both a data engineering background and strong business acumen that enables him/her to think strategically. He/she will experience a wide range of problem-solving situations requiring extensive use of data collection and analysis. The successful candidate will work in lock-step with BI Engineers, Data Scientists, ML Scientists, Business Analysts, Product Managers and other stakeholders across the organization. He/she will:
• Develop and improve the current data architecture, data quality, monitoring and data availability.
• Collaborate with Data Scientists to implement advanced analytics algorithms that exploit our rich data sets for statistical analysis, prediction, clustering and machine learning.
• Partner with Business Analysts across teams to build and verify hypotheses to improve the AWS Support business.
• Help continually improve ongoing reporting and analysis processes, simplifying self-service support for customers.
• Keep up to date with advances in big data technologies and run pilots to design a data architecture that scales with the growing volume of customer experience data on AWS.
Qualifications:
• Bachelor's degree in Computer Science or related technical field, or equivalent work experience.
• 5+ years of work experience with ETL, Data Modeling, and Data Architecture.
• 4+ years of work experience writing and optimizing SQL.
• Experience with AWS services including S3, Redshift, EMR, Kinesis and RDS.
• 1+ years of work experience with Big Data Technologies (Hadoop, Hive, HBase, Pig, Spark, etc.).
• Knowledge of distributed systems as they pertain to data storage and computing.
• Experience with ETL optimization, designing, coding, and tuning big data processes using Apache Spark or similar technologies.
• Experience with building data pipelines and applications to stream and process datasets at low latencies.
• Experience handling data: tracking data lineage, ensuring data quality, and improving discoverability of data.
• Knowledge of distributed systems and data architecture (for example, the lambda architecture): able to design and implement batch and stream data processing pipelines, and to optimize the distribution, partitioning, and massively parallel processing (MPP) of high-level data structures.
• Experience with native AWS technologies for data and analytics such as Redshift Spectrum, Athena, S3, Lambda, Glue, EMR, Kinesis, SNS, CloudWatch, etc.