Data Knowledge System Research Scientist - (Data Platform-Global Live) - Global Frontier Tech Recruitment Program - 2027 Start (PhD)
Responsibilities
The Data Platform Global Live team is dedicated to empowering the growth of the TikTok LIVE business through big data. We support our businesses in achieving their missions by building high-quality real-time and offline data warehouses, creating efficient, easy-to-use data assets, and exploring and implementing business-oriented data solutions. We provide stable and reliable data capabilities for the daily operations, analysis, and decision-making behind TikTok LIVE features, along with robust data support that helps streamers enhance their live performance.
We are looking for talented individuals to join our team in 2027. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite at our Company.
Successful candidates must be able to commit to an onboarding date by the end of 2027. Please state your availability and graduation date clearly in your resume.
We are building a next-generation enterprise knowledge system for the LLM era.
Our goal is to enable large language models to understand, access, and operate on enterprise data, including data warehouses, documents, logs, and real-time streams.
This role focuses on designing and researching a unified knowledge layer that supports query, reasoning, and execution, integrating RAG, knowledge graphs, and agent-based systems.
You will work at the intersection of data infrastructure, AI systems, and knowledge modeling, and help define how AI interacts with enterprise data.
Topic Introduction:
This project focuses on building a unified knowledge system for the era of large language models. It aims to enable LLMs to efficiently access and understand both structured and unstructured enterprise data, including data warehouses, documents, logs, and real-time information. By integrating Retrieval-Augmented Generation (RAG), knowledge graphs, and agent capabilities, the project seeks to develop an intelligent system that supports querying, reasoning, and execution across key business scenarios such as analytics, decision-making, and automation.
Challenges:
Deploying LLMs in enterprise scenarios presents several challenges. Heterogeneous data sources are fragmented and lack a unified modeling and access framework. Knowledge updates are often delayed, making it difficult to meet real-time requirements. In addition, LLMs may produce hallucinations due to weak grounding, requiring reliable citation and verification mechanisms. Balancing performance, cost, and latency, while designing a scalable and extensible knowledge integration and orchestration framework, remains a core challenge.
Value:
This project provides foundational infrastructure for scaling LLM applications in enterprise environments, significantly improving output accuracy, interpretability, and business usability. By establishing a unified knowledge operation layer, it helps consolidate core data assets and build sustainable competitive advantages. It also accelerates the evolution of AI from conversational tools to data-driven decision agents, laying the groundwork for next-generation data + AI platforms.
What You Will Do:
- Research and design unified knowledge representations for enterprise data
- Explore and build RAG-based knowledge systems with high accuracy and low latency
- Develop ontology / semantic layers to bridge data and LLM understanding
- Design knowledge ingestion and update mechanisms (batch + real-time)
- Improve LLM grounding, traceability, and reliability
- Explore agent-based reasoning and execution frameworks
- Prototype and validate new ideas, and bring them into production systems
Qualifications
Minimum Qualifications:
- Individuals who are completing or have recently completed a PhD in Software Development, Computer Science, Computer Engineering, or a related technical discipline.
- Strong programming skills in Python / Java / Scala
- Solid understanding of data systems, data modeling, or distributed systems
- Experience in at least one of the following:
  - Data engineering/backend systems
  - Machine learning/LLM systems
- Strong problem-solving skills and curiosity about new technologies
Preferred Qualifications:
- Experience with LLMs, RAG, or vector databases
- Knowledge of knowledge graphs or ontology modeling
- Experience with real-time data processing (Flink, Kafka, etc.)
- Understanding of AI agents or workflow orchestration
- Experience building data platforms or knowledge systems
Job Information
[For Pay Transparency] Compensation Description (annually)
The base salary range for this position in the selected city is $212,800 - $450,000 annually.
Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.
Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, and wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).
The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
For Los Angeles County (unincorporated) Candidates:
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
1. Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;
2. Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and
3. Exercising sound judgment.
Perks and Benefits
Health and Wellness
- Health Insurance
- Dental Insurance
- Vision Insurance
- HSA
- Life Insurance
- Fitness Subsidies
- Short-Term Disability
- Long-Term Disability
- On-Site Gym
- Mental Health Benefits
- Virtual Fitness Classes
Parental Benefits
- Fertility Benefits
- Adoption Assistance Program
- Family Support Resources
Work Flexibility
- Flexible Work Hours
- Hybrid Work Opportunities
Office Life and Perks
- Casual Dress
- Snacks
- Pet-friendly Office
- Happy Hours
- Some Meals Provided
- Company Outings
- On-Site Cafeteria
- Holiday Events
Vacation and Time Off
- Paid Vacation
- Paid Holidays
- Personal/Sick Days
- Leave of Absence
Financial and Retirement
- 401(K) With Company Matching
- Performance Bonus
- Company Equity
Professional Development
- Promote From Within
- Access to Online Courses
- Leadership Training Program
- Associate or Rotational Training Program
- Mentor Program
Diversity and Inclusion
- Diversity, Equity, and Inclusion Program
- Employee Resource Groups (ERG)