Data Engineer
2 months ago • Pune, India
DESCRIPTION
Key Responsibilities:
- Implement and automate deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
- Continuously monitor and troubleshoot data quality and data integrity issues.
- Implement data governance processes and methods for managing metadata, access, and retention for internal and external users.
- Develop reliable, efficient, scalable, and high-quality data pipelines with monitoring and alerting mechanisms, using ETL/ELT tools or scripting languages (a minimal sketch of such a pipeline follows this list).
- Develop physical data models and implement data storage architectures as per design guidelines.
- Analyze complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, physical, and logical data models.
- Participate in testing and troubleshooting of data pipelines.
- Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB).
- Use agile development practices, such as DevOps, Scrum, Kanban, and continuous improvement cycles, for data-driven applications.
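Purely as an illustration of the kind of pipeline the responsibilities above describe (the posting does not prescribe a specific implementation), a minimal PySpark batch ETL sketch with a simple data-quality check might look like the following; the paths, column names, and threshold are hypothetical.

```python
# Hypothetical PySpark batch ETL: ingest raw CSV events, standardize types,
# run a basic data-quality check, and write partitioned Parquet output.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl_example").getOrCreate()

raw = (
    spark.read.option("header", True)
    .csv("/data/raw/orders/")  # hypothetical source path
)

transformed = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("order_id").isNotNull())
)

# Basic data-quality/monitoring hook: fail loudly if too many rows lose their
# amount during casting; a real pipeline would raise an alert instead.
total = transformed.count()
bad = transformed.filter(F.col("amount").isNull()).count()
if total and bad / total > 0.05:  # 5% threshold is an assumption
    raise ValueError(f"Data-quality check failed: {bad}/{total} null amounts")

(
    transformed.write.mode("overwrite")
    .partitionBy("order_date")
    .parquet("/data/curated/orders/")  # swap for .format("delta") on Databricks/Delta Lake
)
```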

RESPONSIBILITIES
Competencies:
- System Requirements Engineering: Translate stakeholder needs into verifiable requirements; establish acceptance criteria; track status throughout the system lifecycle; assess impact of changes.
- Collaborates: Build partnerships and work collaboratively with others to meet shared objectives.
- Communicates Effectively: Develop and deliver multi-mode communications that convey a clear understanding of the unique needs of different audiences.
- Customer Focus: Build strong customer relationships and deliver customer-centric solutions.
- Decision Quality: Make good and timely decisions that keep the organization moving forward.
- Data Extraction: Perform ETL activities to extract data from various sources and transform it for consumption by downstream applications and users.
- Programming: Create, write, and test computer code, test scripts, and build scripts using industry standards and tools.
- Quality Assurance Metrics: Apply measurement science to assess solution outcomes using ITOM, SDLC standards, tools, metrics, and KPIs.
- Solution Documentation: Document information and solutions based on knowledge gained during product development activities.
- Solution Validation Testing: Validate configuration item changes or solutions using SDLC standards and metrics.
- Data Quality: Identify, understand, and correct data flaws to support effective information governance.
- Problem Solving: Solve problems using systematic analysis processes and industry-standard methodologies.
- Values Differences: Recognize the value that different perspectives and cultures bring to an organization.
Education, Licenses, Certifications:
- College, university, or equivalent degree in a relevant technical discipline, or equivalent relevant experience, is required.
- This position may require licensing for compliance with export controls or sanctions regulations.
Nice to Have Experience:
- Understanding of the ML lifecycle.
- Exposure to open-source Big Data technologies.
- Familiarity with clustered compute cloud-based implementations.
- Experience developing applications that require large file movement in a cloud-based environment.
- Exposure to building analytical solutions and IoT technology.
Work Environment:
- Most work will be with stakeholders in the US, with a 2-3 hour overlap with US Eastern Time (EST) as needed.
- This role will be Hybrid.
QUALIFICATIONS
Experience:
- 3-5 years of experience in data engineering with a strong background in Azure Databricks and Scala/Python.
- Hands-on experience with Spark (Scala/PySpark) and SQL.
- Experience with Spark Streaming, Spark Internals, and Query Optimization.
- Proficiency in Azure Cloud Services.
- Experience in Agile development and unit testing of ETL pipelines (a sketch of such a test follows this list).
- Experience creating ETL pipelines with ML model integration.
- Knowledge of Big Data storage strategies (optimization and performance).
- Critical problem-solving skills.
- Basic understanding of Data Models (SQL/NoSQL) including Delta Lake or Lakehouse.
- Quick learner.
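As a rough sketch of the "unit testing of ETL pipelines" expectation (not part of the posting itself), a transformation can be factored into a pure DataFrame-in/DataFrame-out function and exercised against a local SparkSession with pytest; the function, column names, and values below are illustrative.

```python
# Hypothetical pytest-style unit test for a small ETL transformation.
from pyspark.sql import SparkSession, functions as F


def add_order_total(df):
    """Transformation under test: line total = unit price * quantity."""
    return df.withColumn("total", F.col("unit_price") * F.col("quantity"))


def test_add_order_total():
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("etl_unit_test")
        .getOrCreate()
    )
    input_df = spark.createDataFrame(
        [("A", 2.0, 3), ("B", 1.5, 4)],
        ["order_id", "unit_price", "quantity"],
    )
    result = {r["order_id"]: r["total"] for r in add_order_total(input_df).collect()}
    assert result == {"A": 6.0, "B": 6.0}
    spark.stop()
```

Keeping transformations as pure functions over DataFrames is one common way to make Spark ETL unit-testable without a cluster.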
Job: Systems/Information Technology
Organization: Cummins Inc.
Role Category: Remote
Job Type: Exempt - Experienced
ReqID: 2409183
Relocation Package: Yes
Client-provided location(s): Pune, India
Job ID: Cummins-R-B201EE74430A4524AE721C22BF83EDB4
Employment Type: OTHER
Posted: 2025-05-21T12:09:06
Perks and Benefits
Health and Wellness
- FSA With Employer Contribution
- Health Reimbursement Account
- On-Site Gym
- HSA With Employer Contribution
- Health Insurance
- Dental Insurance
- Vision Insurance
- Life Insurance
- Short-Term Disability
- Long-Term Disability
Parental Benefits
- Non-Birth Parent or Paternity Leave
- Birth Parent or Maternity Leave
Work Flexibility
- Flexible Work Hours
- Remote Work Opportunities
Office Life and Perks
- Company Outings
- Casual Dress
Vacation and Time Off
- Leave of Absence
- Personal/Sick Days
- Paid Holidays
Financial and Retirement
- Relocation Assistance
- Performance Bonus
- Stock Purchase Program
- Pension
- 401(K) With Company Matching
Professional Development
- Mentor Program
- Shadowing Opportunities
- Access to Online Courses
- Lunch and Learns
- Tuition Reimbursement
Diversity and Inclusion