Data Engineer - Video Ranking & Distribution
- Menlo Park, CA
Facebook's mission is to give people the power to build community and bring the world closer together. Through our family of apps and services, we're building a different kind of company that connects billions of people around the world, gives them ways to share what matters most to them, and helps bring people closer together. Whether we're creating new products or helping a small business expand its reach, people at Facebook are builders at heart. Our global teams are constantly iterating, solving problems, and working together to empower people around the world to build community and connect in meaningful ways. Together, we can help people build stronger communities - we're just getting started.
Our analytics team works very closely with Product Leaders of these apps to determine ways to acquire new users, retain existing users, and optimize user experience - all of this using massive amounts of data. As part of the analytics team, you will see a direct link between your work, company growth, and user happiness.
Our Data Engineers are clearly characterized by in-depth technical expertise and proven progression in leadership responsibility. If you have an interest in being responsible for the dynamics of a fast-paced environment, this is the right role for you. You will work on many projects at a time while staying focused on the details and finding creative ways to pursue big-picture challenges.
In this role, you will work closely with the Video Ranking and Distribution teams, which are responsible for helping users discover interesting video content through different surfaces (News Feed, Watch, Pages, etc.) and for creating a planned viewing habit for Facebook users.
The Video Ranking & Distribution team is responsible for effective distribution of videos across our family of apps through recommendations and ranking. This is the largest video ranking group, responsible for serving videos to over one billion users daily. The Video Ranking team builds the best video recommendations technology and algorithms, which provide a delightful discovery and consumption experience to users.
You will be responsible for building a strong data foundation and architecture that will allow us to understand and measure video distribution, ranking data for algorithms (including candidate generation), feature selection, relatedness, popularity bias and its impact on consumption and engagement, compute metrics for video ranking events, and reliability on different devices. In this role, you will have the opportunity to define technical specifications for logging, define and influence the right metrics, and build the core datasets that will be used by our Data Scientists, Machine Learning engineers, and product managers.
The core datasets will allow us to:
- Measure how effective video ranking and distribution are on our platforms
- Understand session metrics such as time spent watching a video and relatedness of videos
- Understand the relationship between recommendation and consumption
- Provide insight into video interactions, devices, markets, and media types
- Monitor generator algorithm health, relevance of recommendations, and reliability
- Reduce popularity bias and optimize evergreen content vs recent content
- Craft and own the optimal data processing architecture and systems for new data and ETL pipelines
- Build core datasets as well as scalable and fault-tolerant pipelines
- Build data anomaly detection, data quality checks, and optimize pipelines for ideal compute and storage
- Work with video machine learning teams to prepare datasets that provide a 360-degree understanding of the video and help in feature engineering
- Define and own the data engineering roadmap for Video Ranking and Distribution Data Engineering
- Collaborate with Software Engineers and Data Scientists to design technical specifications for logging and add logging to production code to generate metrics both online and offline
- Work with different cross-functional partners - Creator and Publisher experiences team, Video Understanding, Video Integrity, Computer Vision and Ranking Science
- Build visualizations to provide insights into the data & metrics
- Work with data infrastructure teams to suggest improvements and influence their roadmap
- Immerse yourself in all aspects of the product, understand the problems, and tie them back to data engineering solutions
- Recommend improvements and modifications to existing data and ETL pipelines
- Communicate and influence strategies and processes around data modeling and architecture to multi-functional groups and leadership
- Drive internal process improvements and automate manual processes for data quality and SLA management
- Provide ongoing proactive communication and collaboration throughout the organization
- 4+ years experience in the data warehouse space
- 4+ years experience working with either a MapReduce or an MPP system
- 7+ years experience in writing complex SQL and ETL processes
- 4+ years experience with object-oriented programming languages
- 7+ years experience with schema design and dimensional data modeling
- BS/BA in Technical Field, Computer Science or Mathematics
- Knowledge in Python or Java
- Experience analyzing data to identify deliverables, gaps, and inconsistencies
- Experience actively mentoring team members in their careers
- Ability to effectively collaborate and communicate complex technical concepts to a broad variety of audiences
- Knowledge of machine learning recommendation and ranking