Data Engineering Intern
What you'll work on
- Spark: As data engineers, we want to move fast, and we want our code to move fast as well. We’ve recently transitioned from Hadoop to Spark, and we are continuing to increase the performance of our data pipelines while simultaneously increasing the complexity of the jobs that run on top of them. We use Spark in both streaming and batch mode. You'll be expected to improve the infrastructure while you implement on top of it.
- Machine learning: We use statistics to solve hard problems. Whether we’re running regression to better understand our business or clustering as part of a client-facing data pipeline, statistical modeling is key to our business.
- Data quality: A model is only good if it is correct and built on recent data. We put a strong emphasis on data quality. We write unit tests to test the functional correctness of each module and meta-tests to guard against common programming errors. Throughout our data pipelines we run automated sanity checks on live data, alerting if any data is stale or values fall outside of expected ranges.
What we're looking for
- Strong experience with at least one of Java, Hadoop, or a similar language or framework
- Experience with online data stores (preferably MySQL); experience with offline data stores (preferably the Hadoop stack) is a plus
- Experience with at least one scripting language
- Passion for agile, test-driven development, continuous integration, and automated testing
- Solid understanding of distributed systems and functional programming paradigms
- Pursuing a BS, MS, or PhD in computer science, math, physics, or related field