Software Engineer - Data Pipeline (C#, .NET, Golang)
Agoda is the largest and fastest growing online hotel booking platform in Asia and is part of Booking Holdings (BKNG), the world’s leading provider of brands that help people book great experiences through technology.
Why Join Agoda?
Our Bangkok team is looking for top-quality, passionate engineers to build our next-generation data platform products.
Our systems scale across a multitude of data centers, totaling a few million writes per second and managing petabytes of data. We deal with problems from real-time data-ingestion, replication, enrichment, storage and analytics. We are not just using Big Data technologies; we are pushing them to the edge.
Why Agoda Data Pipeline Team?
Join the team that runs the data pipeline infrastructure, the backbone for all data event logging within Agoda and crucial to real-time monitoring of all Agoda systems across geographically distributed data centers.
The always-on data pipeline feeds logs, events and metrics into Hadoop, ElasticSearch, Spark clusters and other distributed systems that drive key business processes such as data-driven business intelligence, near-real-time (NRT) monitoring, A/B testing, centralized application logging and stream processing, to name a few.
You will work on distributed systems spanning multiple data centers, thousands of servers and hundreds of billions of messages a day. Ensuring data quality, data integrity and data accuracy is a core part of our identity. You will be eager to solve the problems that come from managing and making sense of large amounts of data. Some of the things you'll get to work on include schema validation, real-time data ingestion, cross-data-center replication, enrichment, storage and analytics of the data flow.
We are a small, passionate team, and we are looking for exceptional individuals to help design, build, deploy (and probably debug) our data pipeline.
Technologies We Use:
Java, Scala, Golang, .NET, .NET Core, Python, Ruby, Kafka, ElasticSearch, Hadoop, Spark, Hive/Impala/Spark SQL, Avro, Parquet, Schema Registry, Sensu, MSSQL, Graphite, Grafana
Day to Day:
- You will build, administer and scale data pipelines that process hundreds of billions of messages a day across multiple data centers
- You will be comfortable navigating the following technology stack: Java/Scala, Golang, .NET, .NET Core, Kafka, scripting (Bash/Python), Hadoop, ElasticSearch
- You will develop and expand upon existing frameworks that are used by teams throughout Agoda to produce messages to the data pipeline
- You will build and manage data ingestion into multiple systems (Hadoop, ElasticSearch and other distributed systems)
- You will build tools that monitor the data pipeline against high data-accuracy SLAs
- You will fix production problems
- You will profile for performance, self-recovery and stability
- You will collaborate with other teams and departments
- You will automate system tasks via code as needed
- You will explore new technologies that improve the quality of our data, processes and data flow
- You will develop quality software through design reviews, code reviews and test-driven development
Requirements:
- B.Sc. in Computer Science, Information Systems, Computer Engineering or a related field
- You have two or more years of industry experience, preferably at a tech company
- You have a passion for big data (petabytes' worth)
- Good knowledge of data architecture principles
- You have operational experience debugging production issues
- You are an experienced coder who can stand your ground, with experience building purposeful systems that are flexible, well-tested, maintainable and scalable
- You’re detail-oriented and consider every outcome of a particular decision
- You have no issues being on call and working at odd hours as needed
- You can communicate fluently in technical English, both verbally and in writing
Nice to Haves:
- Good understanding of how Kafka works
- Kafka administration experience
- Understanding of concepts related to Schema Registry and schema evolution
- Experience working with serialization formats such as Protocol Buffers, Avro or Thrift
- Proficiency with ElasticSearch
- Experience with data ingestion from Kafka into Hadoop, ElasticSearch and other distributed systems
- Strong systems administration skills in Linux
- Experience working on or contributing to open source projects