- Seattle, WA
Who we are
At AWS, we identify common customer pain points across services and tackle them. We raise the bar on internal practices and alleviate operational bottlenecks that hinder proactive improvements for customers.
"Nothing is beyond our reach"
We operate by the slogan that "nothing is beyond our reach". We think big. We sometimes find ourselves asking "how in the world is there no tool or solution for this?", or "how have we managed to operate like this for so long?". Our response in these moments is "well, let's fix it". If the idea of pioneering solutions to unsolved problems from the ground up in uncharted territory sounds exciting to you, come join us!
AWS produces an outrageous amount of data, and our org is seeking a data engineer to usher in a new generation of data management in which data is a first-class citizen in our operations. You will be the source of truth for data management best practices. We, as developers, TAMs, and scientists, need your unique expertise on everything from how to efficiently design schemas to how to optimally utilize AWS services to architect production data lakes, databases, and compute engines.
Here are some qualities that will make you successful in this role:
You have good instincts. You have good judgment about when to utilize or build upon existing solutions, starting from scratch only when appropriate. You always seek to disprove your own assumptions, yet have the backbone to give disruptive feedback to a room of principal engineers.
You have a 'can do' attitude and a helpful spirit. Encountering bad practices and constraints doesn't irritate you, but rather excites you. You enjoy getting occasionally interrupted by people dropping by your desk seeking your unique expertise.
You can be a generalist and are comfortable navigating unfamiliar domains, but you have deep expertise in the data domain. You are a veteran of AWS tooling. You have learned from experience what works and what doesn't at scale. You understand the tradeoffs and limitations between tools. You are curious and always looking for and exploring new tooling.
Roles and responsibilities:
• Work with systems engineers and data scientists to design and build an analytics platform and data lake to support compute-heavy data science, dashboarding, and web-facing production tooling. (You get to tell us what this should look like!)
• Write ETL to consolidate and relate petabytes of data owned by disparate teams
• Implement best practice data quality assurance mechanisms
Within the Compute Services organization you will:
• Participate in design reviews for new tooling and services
• Capture and share best practices with service teams
• Teach service team engineers how to test for data quality issues
Basic qualifications:
• Bachelor's degree in software engineering or a relevant quantitative discipline
• 2+ years of development experience in Python or a similar scripting language for automation
• 3+ years of work experience with ETL, Data Modeling, and Data Architecture with hundreds of terabytes of data
• 2+ years of experience working with core AWS data and analytics services. Understanding of the applicability, limitations, and tradeoffs between a wide set of AWS database and analytics technologies.
• 1+ years experience working with distributed computing and associated technologies such as Spark, EMR, etc.
• 2+ years experience with Redshift. Tangible experience working with Redshift Spectrum, AWS Glue, DynamoDB, and S3
Preferred qualifications:
• 5+ years of work experience with ETL, Data Modeling, and Data Architecture.
• Experience or familiarity with newer AWS data and analytics tools such as AWS Lake Formation and SageMaker
• Expert-level skills in writing and optimizing SQL