Data Engineer, Datasets Team
- New York, NY
THE EARNEST RESEARCH COMPANY
Earnest Research is a VC-backed data innovation startup driven to change the way professionals understand consumer and business behavior. Working with world-class data partners, we transform raw data into a source for business and investment professionals to ask better questions so they can make better decisions. We believe, in the right hands, data has the power to change the way we work.
We look for the following characteristics in all of our employees:
- Creative problem solving
- A high level of enthusiasm and proactivity
- Attention to detail
- The ability to succinctly communicate ideas
- The ability to produce high-quality work under tight timelines
- A willingness to take ownership of work
- The ability to work effectively with others as part of a team
The Earnest Research Company is headquartered in New York, NY, and fully supports remote work.
DATA ENGINEER, DATASETS TEAM
Earnest Research is seeking a Data Engineer to join our Datasets Team. The Datasets Team is responsible for the ingestion, transformation, and productization of dozens (and growing!) of disparate datasets. This position requires a highly motivated problem solver with strong communication skills, analytical ability, and attention to detail.
RESPONSIBILITIES
- Extract and process raw data at scale (including writing scripts, calling APIs, writing SQL/Spark, etc.)
- Process unstructured data into a form suitable for analysis
- Work closely with product owners and data analysts to gather and understand requirements
- Interface with Data Platform engineers and give valuable feedback that guides tooling
- Participate in code reviews and design discussions, giving and receiving constructive feedback
- Create, extend and own data pipelines that power the company’s products
- Ensure high data quality and pipeline stability
QUALIFICATIONS
Required skills:
- Experience processing large amounts of structured and semi-structured data
- Programming experience in Python, SQL and Bash
- 2+ years writing and maintaining ETL at terabyte-level scale
- 1+ years of experience working with Hadoop applications (Spark)
- Experience with version control systems (Git)
Preferred skills:
- Experience with a code-based data orchestrator such as Apache Airflow, Dagster, or Luigi
- Knowledge of AWS, EMR, and Redshift/Snowflake
- Spark-Scala / PySpark experience
- Experience with Docker containerization
- Strong knowledge of and experience with statistics
- Enthusiasm for Open Source
- Data Warehouse modeling experience
- Experience with or willingness to learn functional programming (Haskell)
- Experience using dbt (data build tool)
- Experience automating data quality checks through dbt, Great Expectations, or company tooling
BENEFITS & PERKS
- 100% company-paid medical plan options (additional medical, dental, and vision plans available too!)
- 401(k) retirement plans
- Flexible and generous time off
- Generous Parental Leave Policies
- Pre-tax savings plans for public transportation and parking expenses
- Regular company happy hours, lunches & events
Earnest Research is an equal opportunity employer, and we encourage people with a diverse range of backgrounds to apply.