Senior Software Engineer, Data Platform

Indigo is a company dedicated to harnessing nature to help farmers sustainably feed the planet. With a vision of creating a world where farming is an economically desirable and accessible profession, Indigo works alongside its growers to apply natural approaches, conserve resources for future generations, and grow healthy food for all. Utilizing beneficial plant microbes to improve crop health and productivity, Indigo’s portfolio is focused on cotton, wheat, barley, corn, soybeans, and rice. The company, founded by Flagship Pioneering, is headquartered in Boston, MA, with additional offices in Memphis, TN, Research Triangle Park, NC, Sydney, Australia, Buenos Aires, Argentina, and São Paulo, Brazil.

The Senior Software Engineer, Data Platform is a self-directed person passionate about database design and performance at scale.  This person can identify and work with data across a variety of formats and make that data available to an equally varied set of customers, from data scientists to web-based applications consuming APIs.  They fundamentally understand big data ingestion techniques and the engineering necessary to push those frameworks to do more.  This person is comfortable working with, deploying, and managing big data solutions using cloud-based technologies.

Responsibilities
  • Collaborate with Data Management to provide tools and processes to validate and repair data
  • Delight users across the organization by ensuring access to the data they need to make informed decisions
  • Identify critical gaps and build ETL processes to fill them
  • Serve as the domain expert in stored data and its use cases
  • Design and implement the IoT schema and data ingestion (with tests)
  • Finalize the v1.x design and implementation for all R&D data

Competencies
  • Understands implications of making various design decisions across a large dataset
  • Understands sparse data sets, storage mechanics, and the data needs for analytical applications
  • Understands trade-offs between multiple data formats (flat files, parquet, orc, etc.)
  • Understands data lifecycles (e.g. archiving, access, and cost of data management) and can manage and communicate these concerns to the Architects and stakeholders
  • Works closely with the Data Science Group to understand their needs, enabling them to do their jobs more effectively by helping them access the data they require
  • Understands the need for significant attention to detail in data management (e.g. data growth and monitoring, query performance, data use cases) in order to surface problems before they are realized
  • Strong sense of quality and attention to detail
  • Willing to interface with many areas of the organization to ensure our data is meaningful

Qualifications
  • Experience designing schemas (DDL)
  • Experience architecting database topologies across a wide array of formats
  • Comfortable working with complex schemas and managing views into data
  • Knowledge of why and when to use NoSQL, columnar, and data-sharding solutions
  • Experience with RDBMS (e.g. Postgres, MySQL, Oracle)
  • Working knowledge of at least one NoSQL database
  • Knowledgeable of in-memory database solutions
  • Experience in RESTful API design
  • Experience in Java / Python programming languages
  • Experience writing high-performance data queries and associated database designs
  • Experience with data quality and ability to test that processes are correct
  • Experience with benchmarking and performance tuning of data solutions
  • Knowledge of AWS data solutions (Spark / EMR / Redshift / Aurora / Athena)
  • Bachelor’s degree required, Master’s degree preferred
  • 7+ years of experience
