Sr. Data Engineer

Sunnyvale, CA


The Alexa Echo Device Team is looking for a talented, highly motivated Senior Data Engineer to join our Business Intelligence team. Alexa is the groundbreaking cloud-based intelligent agent that powers Echo and other devices designed around your voice. We provide actionable business insights that inform future products and services that will power the next generation of Echo and Alexa devices.

As a Senior Data Engineer, you will work in one of the world's largest and most complex data warehouse environments. You will work closely with Product Management, Software Development, Data Science, and other Data Engineering teams to develop scalable and innovative analytical solutions, process and store terabytes of low-latency structured and unstructured data, and enable the Echo Device team to build successful, data-driven strategies.

You will be responsible for designing and implementing an analytical environment using third-party and in-house tools, and for using Python, Scala, or Java to automate the ETL, analytics, and data quality platform from the ground up. You will design and implement complex data models and metadata models, build reports and dashboards, and own the data presentation and dashboarding tools for the end users of our data products and systems. You will work with leading-edge technologies like Redshift, EMR, Hadoop/Hive/Pig, and more. You will write scalable, highly tuned SQL queries running over billions of rows of data, and will develop learning and training programs to drive adoption of data-driven decision making across the Echo and Alexa organization.

You should have deep expertise in the design, creation, management, and business use of large datasets across a variety of data platforms. You should have excellent business and interpersonal skills, enabling you to work with business owners to understand data requirements and to implement efficient and scalable ETL solutions. You should be an expert at designing, implementing, and operating stable, scalable, low-cost solutions to replicate data from production systems into the BI data store.

Key Responsibilities
• Design, implement, and improve the analytics platform
• Implement and simplify self-service data query and analysis capabilities of the BI platform
• Develop and improve the current BI architecture, emphasizing data security, data quality and timeliness, scalability, and extensibility
• Deploy and use various big data technologies and run pilots to design low latency data architectures at scale
• Collaborate with business analysts, data scientists, product managers, software development engineers, and other BI teams to develop, implement, and validate KPIs, statistical analyses, data profiling, prediction, forecasting, clustering, and machine learning algorithms
• Partner with other BI and analytics teams to build and verify hypotheses to improve the AWS customer financial experience


Basic Qualifications
• Bachelor's degree in Computer Science or a related field, or 5+ years of relevant experience
• Expert-level skills in writing and optimizing complex SQL
• Knowledge of data warehousing concepts
• Experience in data mining, profiling, and analysis
• Experience with complex data modeling, ETL design, and using large databases in a business environment
• Proficiency with the Linux command line and systems administration
• Experience with a programming language such as Python, Ruby, or Java
• Experience with big data technologies such as Hive or Spark
• Proven ability to develop unconventional solutions; sees opportunities to innovate and leads the way
• Excellent verbal and written communication; proven interpersonal skills and the ability to convey key insights from complex analyses in summarized business terms; ability to communicate effectively with technical teams
• Ability to work with shifting deadlines in a fast-paced environment


Preferred Qualifications
• Expertise in ETL optimization and in designing, coding, and tuning big data processes using Apache Spark or similar technologies
• Experience building data pipelines and applications to stream and process datasets at low latencies
• Demonstrated efficiency in handling data: tracking data lineage, ensuring data quality, and improving data discoverability
• Sound knowledge of distributed systems and data architectures (e.g., Lambda architecture): designing and implementing batch and stream data processing pipelines, and optimizing the distribution, partitioning, and parallel processing of high-level data structures
• Knowledge of engineering and operational excellence best practices

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. Individuals with disabilities who would like to request an accommodation, please visit