TikTok is one of the most demanding real-time video platforms, with a massive global user base. Designing a scaled-down version of TikTok is a common system design interview challenge at companies like ByteDance, Meta, Google, and Amazon. In this post, we'll walk through how to approach a TikTok system design interview, focusing on functional and non-functional requirements, scaling strategies, and metrics-driven architecture.
Functional Requirements
Core User Interactions
A simplified TikTok-like platform should allow users to:
- ✅ Upload short-form videos
- ✅ Watch a stream of short videos
- ✅ Like videos
- ✅ Share videos
These features should provide a seamless, real-time experience to users globally.
Reduced Scope for Interview
To manage time and complexity during a 45–60 minute system design interview, it's reasonable to defer or skip certain functionalities.
Features to Skip (with interviewer’s consent):
- ❌ Sign-Up / Login / Auth (assume it's done)
- ❌ Full-fledged UI/Frontend design
- ❌ AI-powered feed ranking
- ❌ Chat and comments system
This keeps the scope focused and allows you to go deeper into scalable backend services.
Non-Functional Requirements
Beyond user-facing functionality, TikTok must meet rigorous performance and reliability standards.
🌍 High Availability
The system should handle failures gracefully and ensure uptime across regions.
⚙️ Scalability
With 1B+ global users, the system should scale horizontally to manage massive read/write traffic.
⚡ Low Latency
- Video load times should be near-instant (under 200ms).
- Video upload should be fast and reliable, even on mobile networks.
📱 Multi-device Support
The system must support access from:
- Smartphones (iOS, Android)
- Tablets
- Desktops
- TVs (via casting)
System Metrics
These are essential for understanding system design scale:
| Metric | Value |
|---|---|
| Total users | 1 billion+ |
| Countries served | 150+ |
| Videos viewed/day | 1 billion |
| Videos uploaded/day | 10 million |
| Success metric | Time spent per user (~1 hour/day) |
Detailed Assumptions
To ground our system design in real-world constraints, let's make some storage and size assumptions.
📹 Video Details
- Resolution: 1080 × 1920 px
- Avg Duration: 10 seconds
- Avg Size: 1 MB
📦 Annual Storage Estimate
Let's calculate the total video storage required per year:

Video Size = 1 MB
Videos/Day = 10 million
Videos/Year = 10M × 365 = 3.65 billion
Storage Required = 1 MB × 3.65B
= 3.65 × 10^15 bytes
= 3.65 PB/year

Even with optimizations, we estimate ~10 PB/year once overhead, backups, and replication are factored in.
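As a sanity check, here is the same back-of-envelope arithmetic in Python, using the figures from the assumptions above; the 3x headroom factor is an assumption to account for replication, backups, and transcoded copies:

```python
# Back-of-envelope storage estimate using the assumed figures above
VIDEO_SIZE_BYTES = 1 * 10**6   # ~1 MB per 10-second 1080x1920 video
UPLOADS_PER_DAY = 10_000_000   # 10M uploads/day

videos_per_year = UPLOADS_PER_DAY * 365         # 3.65 billion videos
raw_bytes = videos_per_year * VIDEO_SIZE_BYTES  # 3.65 x 10^15 bytes
print(f"Raw storage: {raw_bytes / 10**15:.2f} PB/year")  # 3.65 PB

# Assumed ~3x headroom for replication, backups, and renditions
print(f"Provisioned: {raw_bytes * 3 / 10**15:.2f} PB/year")  # ~11 PB, i.e. the ~10 PB figure above
```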
System Architecture
High-Level Overview
```
      [Mobile/Web Clients]
               |
         [API Gateway]
               |
    ┌──────────┼──────────┐
    |          |          |
Upload Svc  Feed Svc  Video Svc
    |          |          |
 Object       CDN     Metadata
 Storage                 DB
```
The core services communicate via REST/gRPC and are containerized and orchestrated with Kubernetes.
Video Upload Pipeline
Steps:
1. Client Uploads Video: multipart form or resumable chunked upload
2. Upload Service Receives: stores the file in an Object Store (S3/GCS), generates a video ID and thumbnail
3. Metadata Service Writes to DB: stores metadata like uploader ID, video length, tags, timestamp
4. Transcoding Pipeline: async jobs using tools like FFmpeg to generate multiple resolutions
5. Caching Popular Content: Redis/Memcached layer for hot content
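A minimal sketch of steps 1 and 2, assuming S3 via boto3; the bucket name and the `start_upload` endpoint are hypothetical. The client uploads directly to object storage with a presigned URL, so large video payloads never flow through the app servers:

```python
import uuid
import boto3  # assumed: AWS SDK; GCS would use google-cloud-storage instead

s3 = boto3.client("s3")
BUCKET = "tiktok-raw-uploads"  # hypothetical bucket name

def start_upload(uploader_id: str) -> dict:
    """Issue a short-lived presigned URL so the client uploads directly
    to object storage instead of streaming bytes through app servers."""
    video_id = uuid.uuid4().hex
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": f"raw/{uploader_id}/{video_id}.mp4"},
        ExpiresIn=600,  # 10-minute upload window
    )
    # The metadata row and thumbnail job are created once the client
    # confirms the upload completed.
    return {"video_id": video_id, "upload_url": url}
```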
Video Playback & CDN
- Client requests video by ID
- Edge CDN (Cloudflare, Akamai) serves cached video
- Fallback to Origin Store for cold content
- Pre-fetching used to preload next videos
CDN Performance Goal:
⏱ <100ms video load from any region
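One way the pre-fetching step might look on the server side: a playback endpoint that returns the CDN URL for the current video plus the next few feed entries so the client can preload them. The CDN domain and HLS paths are placeholders:

```python
CDN_BASE = "https://cdn.example.com"  # placeholder edge domain

def playback_urls(video_id: str, feed: list[str], prefetch: int = 2) -> dict:
    """Return the CDN URL for the current video plus the next few feed
    entries so the client can preload them while the user watches."""
    def to_url(vid: str) -> str:
        return f"{CDN_BASE}/videos/{vid}/master.m3u8"

    idx = feed.index(video_id)  # assumes the video is in the user's feed
    return {
        "play": to_url(video_id),
        "prefetch": [to_url(v) for v in feed[idx + 1 : idx + 1 + prefetch]],
    }
```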
Like, Share, and Engagement Services
Likes
- Handled via a dedicated Like Service
- Writes to NoSQL (Cassandra / DynamoDB) for high write throughput
- Updates Redis Cache for fast like count fetch
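A sketch of that write path, assuming the redis-py and kafka-python clients; topic and key names are illustrative. The counter increments synchronously in Redis for the fast read path, while the durable NoSQL write happens asynchronously through a queue consumer:

```python
import json
import redis
from kafka import KafkaProducer  # assumed: kafka-python client

cache = redis.Redis()
producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def like_video(user_id: str, video_id: str) -> int:
    """Increment the cached counter for the fast read path; the durable
    Cassandra/DynamoDB write happens via a consumer on the topic."""
    count = cache.incr(f"likes:{video_id}")
    producer.send("likes", {"user_id": user_id, "video_id": video_id})
    return count
```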
Shares
- Triggers event to Analytics Service
- May include social sharing integration (optional scope)
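Shares can follow the same pattern as likes: a fire-and-forget event that the Analytics Service consumes downstream. A small sketch, with the same assumed kafka-python setup and an illustrative topic name:

```python
import json
from kafka import KafkaProducer  # assumed: kafka-python client

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def share_video(user_id: str, video_id: str, channel: str) -> None:
    """Fire-and-forget share event; the Analytics Service consumes the topic."""
    producer.send("shares", {
        "user_id": user_id, "video_id": video_id, "channel": channel,
    })
```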
Scaling Strategies
Horizontal Scaling
- Upload, feed, metadata, and like services scale independently
- Kubernetes + Auto Scaling Groups
Partitioning Strategy
- Partition video metadata by userID / region
- Shard video object storage by videoID prefix
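One way to implement both rules, with an assumed shard count of 64: a stable hash keeps all of a user's metadata on one shard, and a short videoID prefix spreads object keys across storage partitions to avoid hot spots:

```python
import hashlib

NUM_METADATA_SHARDS = 64  # assumed shard count

def metadata_shard(user_id: str) -> int:
    """Stable hash so a user's video metadata always lands on one shard."""
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_METADATA_SHARDS

def storage_key(video_id: str) -> str:
    """Prefix object keys with the first videoID characters to spread
    writes across object-storage partitions."""
    return f"{video_id[:2]}/{video_id}.mp4"
```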
Async Processing
- Use message queues (Kafka/PubSub) for:
- Transcoding
- Analytics
- Notifications
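A minimal transcoding worker sketch, assuming kafka-python and ffmpeg on the worker host; the topic, consumer group, and output naming are illustrative:

```python
import json
import subprocess
from kafka import KafkaConsumer  # assumed: kafka-python client

consumer = KafkaConsumer(
    "transcode-jobs",  # illustrative topic name
    bootstrap_servers="kafka:9092",
    group_id="transcoders",
    value_deserializer=lambda v: json.loads(v.decode()),
)

for msg in consumer:
    job = msg.value  # e.g. {"video_id": "...", "src": "raw/abc.mp4"}
    for height in (1080, 720, 480):
        # One ffmpeg run per rendition; a real pipeline parallelizes
        # these and uploads the outputs back to object storage.
        subprocess.run(
            ["ffmpeg", "-i", job["src"], "-vf", f"scale=-2:{height}",
             f"{job['video_id']}_{height}p.mp4"],
            check=True,
        )
```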
Storage Design
| Component | Storage Type | Technology |
|---|---|---|
| Video Files | Object Storage | Amazon S3, GCS, MinIO |
| Metadata | Relational DB | PostgreSQL, MySQL |
| Likes / Shares | NoSQL Key-Value | Redis, Cassandra |
| Feed Generation | Graph/Time-series | Neo4j, TimescaleDB |
| Caching Layer | In-Memory Cache | Redis, Memcached |
Caching & Optimization
Redis Use Cases:
- Caching feed results
- Like counts
- Hot video metadata
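All three use cases follow the same cache-aside pattern. A sketch for hot video metadata, with a hypothetical `db.fetch_video` accessor and an assumed 5-minute TTL:

```python
import json
import redis

cache = redis.Redis()

def get_video_metadata(video_id: str, db) -> dict:
    """Cache-aside read: try Redis first, fall back to the metadata DB."""
    key = f"video:{video_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    row = db.fetch_video(video_id)           # hypothetical DB accessor
    cache.set(key, json.dumps(row), ex=300)  # 5-minute TTL for hot content
    return row
```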
Content Delivery:
- Use CloudFront/Cloudflare for geographic edge delivery
- Enable pre-warming of cache during upload/transcode
Compression:
- Store videos using H.264 or AV1 codecs
- Apply adaptive bitrate streaming for slow networks
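A sketch of HLS packaging for adaptive bitrate streaming, assuming ffmpeg is available; the bitrate and segment length are illustrative, and a real bitrate ladder repeats this per rendition with a master playlist tying them together:

```python
import subprocess

def package_hls(src: str, out_dir: str) -> None:
    """Package one H.264 rendition into 4-second HLS segments for
    adaptive playback."""
    subprocess.run(
        ["ffmpeg", "-i", src,
         "-c:v", "libx264", "-b:v", "2500k",  # H.264 at ~2.5 Mbps
         "-c:a", "aac",
         "-hls_time", "4",                    # segment length in seconds
         "-hls_playlist_type", "vod",
         f"{out_dir}/index.m3u8"],
        check=True,
    )
```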
Monitoring and Metrics
Key Metrics to Track
| Metric | Description |
|---|---|
| Upload Success Rate | % of successfully uploaded videos |
| Video Play Latency | Time to first frame |
| Time Spent per User | Engagement metric |
| CDN Cache Hit Ratio | Edge performance |
| Transcoding Queue Lag | Bottleneck indicator |
| Feed Query Time | Backend performance |
Tools
- Prometheus + Grafana for backend metrics
- ELK stack / Datadog for logs
- Jaeger for distributed tracing
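A sketch of instrumenting two of the metrics above with the Python prometheus_client library; the metric names and scrape port are assumptions:

```python
from prometheus_client import Counter, Histogram, start_http_server

UPLOADS = Counter("video_uploads_total", "Upload attempts", ["status"])
PLAY_LATENCY = Histogram("video_play_latency_seconds", "Time to first frame")

start_http_server(9100)  # exposes /metrics for Prometheus to scrape

def record_upload(ok: bool) -> None:
    # Upload Success Rate = success / (success + failure), computed in PromQL
    UPLOADS.labels(status="success" if ok else "failure").inc()

@PLAY_LATENCY.time()  # observes handler duration into the histogram
def serve_first_frame(video_id: str) -> None:
    ...  # playback handler body
```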
Conclusion
Designing TikTok is a complex but exciting system design challenge. In interviews, clarity and prioritization are key. Start by defining a reduced but impactful feature set, then focus on high-level architecture, data flows, and scalability strategies.
Remember to validate your assumptions and bring metrics-based thinking into your answers.
Learn More
👉 For more such in-depth posts, system design breakdowns, and code samples, visit https://codeandalgo.com