Data Engineer

Pune, India

DESCRIPTION

GPP Database Link (https://cummins365.sharepoint.com/sites/CS38534/)

Job Summary:

Supports, develops, and maintains a data and analytics platform. Processes, stores, and makes data available to analysts and other consumers effectively and efficiently. Works with Business and IT teams to understand requirements and how best to leverage technologies to enable agile data delivery at scale.

Key Responsibilities:

  • Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
  • Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
  • Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
  • Develops reliable, efficient, scalable, and high-quality data pipelines, with monitoring and alert mechanisms, that combine a variety of sources using ETL/ELT tools or scripting languages.
  • Develops physical data models and implements data storage architectures as per design guidelines.
  • Analyzes complex data elements and systems, data flow, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models.
  • Participates in testing and troubleshooting of data pipelines.
  • Develops and operates large-scale data storage and processing solutions using different distributed and cloud-based platforms for storing data (e.g. Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, others).
  • Uses agile development practices, such as DevOps, Scrum, Kanban, and continuous improvement cycles, for data-driven applications.

RESPONSIBILITIES

Competencies:

System Requirements Engineering - Uses appropriate methods and tools to translate stakeholder needs into verifiable requirements to which designs are developed; establishes acceptance criteria for the system of interest through analysis, allocation and negotiation; tracks the status of requirements throughout the system lifecycle; assesses the impact of changes to system requirements on project scope, schedule, and resources; creates and maintains information linkages to related artifacts.

Collaborates - Building partnerships and working collaboratively with others to meet shared objectives.

Communicates effectively - Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences.

Customer focus - Building strong customer relationships and delivering customer-centric solutions.

Decision quality - Making good and timely decisions that keep the organization moving forward.

Data Extraction - Performs data extract-transform-load (ETL) activities from a variety of sources and transforms the data for consumption by various downstream applications and users, using appropriate tools and technologies.

Programming - Creates, writes and tests computer code, test scripts, and build scripts using algorithmic analysis and design, industry standards and tools, version control, and build and test automation to meet business, technical, security, governance and compliance requirements.

Quality Assurance Metrics - Applies the science of measurement to assess whether a solution meets its intended outcomes using the IT Operating Model (ITOM), including the SDLC standards, tools, metrics and key performance indicators, to deliver a quality product.

Solution Documentation - Documents information and solution based on knowledge gained as part of product development activities; communicates to stakeholders with the goal of enabling improved productivity and effective knowledge transfer to others who were not originally part of the initial learning.

Solution Validation Testing - Validates a configuration item change or solution using the Function's defined best practices, including the Systems Development Life Cycle (SDLC) standards, tools and metrics, to ensure that it works as designed and meets customer requirements.

Data Quality - Identifies, understands and corrects flaws in data that supports effective information governance across operational business processes and decision making.

Problem Solving - Solves problems, and may mentor others on effective problem solving, using a systematic analysis process that leverages industry-standard methodologies to create problem traceability and protect the customer; determines the assignable cause; implements robust, data-based solutions; identifies the systemic root causes and ensures that actions to prevent problem recurrence are implemented.

Values differences - Recognizing the value that different perspectives and cultures bring to an organization.

Education, Licenses, Certifications:

College, university, or equivalent degree in relevant technical discipline, or relevant equivalent experience required. This position may require licensing for compliance with export controls or sanctions regulations.

Experience:

Relevant experience preferred, such as temporary student employment, internships, co-ops, or other extracurricular team activities.

Knowledge of the latest technologies in data engineering is highly preferred and includes:

  • Exposure to open-source Big Data technologies
  • Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
  • SQL query language
  • Clustered compute cloud-based implementation experience
  • Familiarity with developing applications requiring large file movement in a cloud-based environment
  • Exposure to Agile software development
  • Exposure to building analytical solutions
  • Exposure to IoT technology

QUALIFICATIONS

This is a hybrid role with 2 days of work from the office in Pune.

Must-Have:

  • 3 to 5 years of experience in data engineering with expertise in Azure Databricks and Scala/Python.
  • Proven track record in developing efficient pipelines.
  • Hands-on experience with Spark (Scala/PySpark) and SQL.
  • Strong understanding of Spark Streaming, Spark internals, and query optimization.
  • Skilled in optimizing and troubleshooting batch/streaming data pipeline issues.
  • Proficient in Azure cloud services (Azure Databricks, ADLS, Event Hubs, Event Grid, etc.).
  • Experienced in unit testing of ETL/ELT pipelines.
  • Expertise with CI/CD tools for automating deployments.
  • Knowledgeable in big data storage strategies (optimization and performance).
  • Strong problem-solving skills.
  • Good understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse.
  • Exposure to Agile software development methodologies.
  • Quick learner with adaptability to new technologies.

Work Schedule:

Most of the work will be with stakeholders in the US, with an overlap of 2-3 hours with EST working hours as needed.

Job: Systems/Information Technology

Organization: Cummins Inc.

Role Category: Remote

Job Type: Exempt - Experienced

ReqID: 2418413

Relocation Package: No

Client-provided location(s): Pune, India
Job ID: Cummins-R-59C0010CCF5B40E1B9A58CBE59113442
Employment Type: OTHER
Posted: 2025-08-15T19:28:51

Perks and Benefits

  • Health and Wellness

    • FSA With Employer Contribution
    • Health Reimbursement Account
    • On-Site Gym
    • HSA With Employer Contribution
    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
  • Parental Benefits

    • Non-Birth Parent or Paternity Leave
    • Birth Parent or Maternity Leave
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
  • Office Life and Perks

    • Company Outings
    • Casual Dress
  • Vacation and Time Off

    • Leave of Absence
    • Personal/Sick Days
    • Paid Holidays
  • Financial and Retirement

    • Relocation Assistance
    • Performance Bonus
    • Stock Purchase Program
    • Pension
    • 401(K) With Company Matching
  • Professional Development

    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Lunch and Learns
    • Tuition Reimbursement
  • Diversity and Inclusion