What is the difference between QoE and QoS in video streaming?

QoS (Quality of Service) measures objective network and infrastructure performance — bandwidth, packet loss, latency — from a network engineering perspective. QoE (Quality of Experience) measures what users actually perceive: whether video starts quickly and plays without buffering. The same QoS can produce very different QoE depending on player implementation, ABR strategy, and encoding parameters, which is why video platforms need to track QoE directly.

What are the most important QoE metrics for video streaming?

Six core metrics: Video Startup Time (VST, target P50 under 1s for long-form VOD), Rebuffering Ratio (RBR, under 0.5% is excellent), Video Start Failure (VSF, under 1%), Exit Before Video Start (EBVS, under 3%), Video Playback Failure (VPF, under 0.5%), and Average Bitrate. Rebuffering directly impacts retention — Conviva research shows every 1% increase in RBR reduces watch time by 2–5%.

Should I buy Mux or Conviva, or build my own video QoE analytics?

Managed services like Mux (~$1.25 per 1K sessions, developer-friendly) or Conviva (enterprise-grade, most comprehensive, most expensive) integrate in hours with out-of-the-box dashboards and zero maintenance, but get costly at scale and limit customization. Self-built pipelines allow full customization and joining QoE with business metrics, at high development cost. The common evolution: start with Mux early, then build in-house at scale while keeping Mux as a benchmark.


VOD Deep Dive Part 10: QoE Metrics — How to Measure What Users Actually Feel

QoE vs QoS, six core metrics (VST, RBR, VSF, EBVS, VPF, Avg Bitrate), data pipelines, multi-dimensional drill-down, troubleshooting cases, and when to buy vs build.

zhuermu · May 10, 2026 · 20 min

Tags: vod, streaming, qoe, monitoring, mux, conviva

This is Part 10 of the VOD Streaming Deep Dive series.

QoE vs QoS: Two Often-Confused Terms

Abbreviation	Full name	Definition	Perspective
QoS	Quality of Service	Objective network/infrastructure performance (bandwidth, packet loss, latency)	Network engineering / Ops
QoE	Quality of Experience	User-perceived quality	Product / User research

QoS says “I’m delivering 10 Mbps.” QoE says “Can users actually watch without buffering?”

The same QoS can produce very different QoE depending on player implementation, ABR strategy, and encoding parameters.

We care about QoE. Running a video platform without QoE data is like operating a highway system without measuring traffic congestion.

Six Core QoE Metrics

1. Video Startup Time (VST)

Definition: Time from the user pressing play to the first frame being rendered.

Targets:

Short-form mobile video: P50 < 300ms, P95 < 800ms
Long-form VOD: P50 < 1s, P95 < 2s

VST is the metric users are most sensitive to: few will tolerate a 2-second black screen.
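Since the targets are stated as percentiles rather than averages, it helps to see how P50 and P95 fall out of raw samples. The sketch below uses the nearest-rank method, which is one of several percentile definitions; the sample values and the `percentile` helper are made up for illustration.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical startup times in milliseconds (play pressed -> first frame).
vst_ms = [220, 250, 280, 310, 330, 400, 450, 600, 900, 2100]

p50 = percentile(vst_ms, 50)
p95 = percentile(vst_ms, 95)
print(f"VST P50={p50}ms P95={p95}ms")
```

In this sample the median looks healthy while P95 exposes the one very slow start, which is why both numbers are tracked.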

2. Rebuffering Ratio (RBR)

Definition: rebuffer_time / (rebuffer_time + play_time)

Example: A user plays 60 seconds of video and spends 3 seconds buffering. RBR = 3 / (60 + 3) ≈ 4.8%.

Target: < 0.5% (excellent), < 1% (acceptable).
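The definition and targets above translate directly into code. This is a minimal sketch: the function names and the "needs attention" label for anything above 1% are assumptions, and the aggregate version sums time across sessions instead of averaging per-session ratios, so long sessions carry proportionally more weight.

```python
def rebuffering_ratio(rebuffer_sec, play_sec):
    """rebuffer_time / (rebuffer_time + play_time), as a fraction."""
    total = rebuffer_sec + play_sec
    if total <= 0:
        return 0.0
    return rebuffer_sec / total

def aggregate_rbr(sessions):
    """RBR over many sessions; each session is (rebuffer_sec, play_sec)."""
    sessions = list(sessions)
    rebuffer = sum(r for r, _ in sessions)
    play = sum(p for _, p in sessions)
    return rebuffering_ratio(rebuffer, play)

def rate_rbr(rbr):
    """Map a ratio onto the targets above (third label is hypothetical)."""
    if rbr < 0.005:
        return "excellent"
    if rbr < 0.01:
        return "acceptable"
    return "needs attention"

example = rebuffering_ratio(3, 60)
print(f"{example:.1%} -> {rate_rbr(example)}")
```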

RBR impacts retention directly: Conviva research shows that every 1% increase in RBR reduces watch time by 2–5%.

3. Video Start Failure (VSF)

Definition: User triggered playback but the first frame never rendered (due to 404, CORS, DRM error, etc.).

Target: < 1%

Conviva further subdivides this:

VSF-T (Technical): Failed due to technical reasons — counts as QoE failure
VSF-B (Business): Failed for business reasons (no subscription, geo-blocked) — excluded from QoE

4. Exit Before Video Start (EBVS)

Definition: User triggered playback but left voluntarily before the first frame — not an error, just impatience.

Target: < 3%

Strongly correlated with VST. Slow startup → high EBVS → poor retention.
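VSF and EBVS both describe sessions that never reached a first frame, so they are easy to mix up in a pipeline. The sketch below shows one way to separate them for finished sessions. The field names (`first_frame_rendered`, `error_code`), the error codes, and the choice to drop VSF-B sessions from the denominator are all assumptions for illustration, not Conviva's definitions.

```python
from collections import Counter

# Hypothetical error codes that count as business failures (VSF-B).
BUSINESS_ERRORS = {"NO_SUBSCRIPTION", "GEO_BLOCKED"}

def classify_start(session):
    """Classify a finished session by how its startup ended."""
    if session.get("first_frame_rendered"):
        return "started"
    error = session.get("error_code")
    if error in BUSINESS_ERRORS:
        return "vsf_b"
    if error is not None:
        return "vsf_t"
    return "ebvs"  # no frame and no error: the user left while waiting

def start_rates(sessions):
    """VSF and EBVS as fractions of attempts, excluding VSF-B sessions."""
    counts = Counter(classify_start(s) for s in sessions)
    attempts = sum(counts.values()) - counts["vsf_b"]
    if attempts == 0:
        return {"vsf": 0.0, "ebvs": 0.0}
    return {"vsf": counts["vsf_t"] / attempts,
            "ebvs": counts["ebvs"] / attempts}
```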

5. Video Playback Failure (VPF)

Definition: Playback crashed mid-stream (decode error, expired certificate, CDN stream cut).

Target: < 0.5%

6. Average Bitrate

Definition: Time-weighted average of bitrate tiers during actual playback.

Purpose: Measures whether users are actually seeing acceptable quality. If 70% of users average 480p, it could mean:

Network conditions are generally poor
ABR algorithm is too conservative
High tiers weren’t transcoded
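"Time-weighted" means each bitrate tier counts in proportion to how long it was actually on screen. A minimal sketch, assuming the player reports playback as hypothetical `(bitrate_bps, played_seconds)` pairs:

```python
def average_bitrate(segments):
    """Time-weighted average bitrate in bits per second.

    segments: iterable of (bitrate_bps, played_seconds) pairs, one per
    stretch of playback at a single quality tier.
    """
    segments = list(segments)
    total_time = sum(seconds for _, seconds in segments)
    if total_time <= 0:
        return 0.0
    weighted = sum(bps * seconds for bps, seconds in segments)
    return weighted / total_time

# 10 s at 800 kbps, 30 s at 2.5 Mbps, 20 s at 5 Mbps (made-up session)
session = [(800_000, 10), (2_500_000, 30), (5_000_000, 20)]
print(average_bitrate(session))
```

A plain average of the three tiers would give about 2.77 Mbps here; weighting by time gives 3.05 Mbps because the session spent most of its time on the upper tiers.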

Supporting Metrics

Metric	Description
Rebuffer Frequency	Rebuffer events per minute of playback (target: < 0.1/min)
Bitrate Switching	Number and magnitude of quality switches (stability preferred)
Video Complete Rate	Percentage of users who finish the video (business metric)
Time to Key Decode	Time to acquire DRM license
First Byte Time	Time until first byte arrives from CDN

Conviva’s SPI: A Composite Index

SPI (Streaming Performance Index): Conviva’s composite KPI representing the percentage of sessions with “good or very good” experience.

A session qualifies as “good” when it simultaneously meets:

No VSF-T or VPF-T errors (technical start or playback failures)
No or minimal rebuffering (CIRR, Connection Induced Rebuffering Ratio, below threshold)
Average bitrate meets the screen-size quality bar
Video Start Time within acceptable range
User didn’t wait excessively before exit

Single metrics can mislead (e.g., low RBR but extremely low bitrate). SPI provides a holistic view.
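To make the "all conditions at once" idea concrete, here is a rough sketch of a composite good-session ratio. It is not Conviva's SPI formula: the thresholds, field names, and the way an exit before start is folded into `startup_sec` are assumptions chosen only to show the shape of the calculation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SessionSummary:
    technical_failure: bool        # any VSF-T or VPF-T in the session
    startup_sec: Optional[float]   # None if the first frame never rendered
    rebuffering_ratio: float       # fraction, e.g. 0.004 for 0.4%
    avg_bitrate_bps: float
    min_bitrate_bps: float         # quality bar for this screen size

def is_good_session(s, max_rbr=0.005, max_startup_sec=2.0):
    """A session is good only if every condition holds (thresholds are
    placeholders, not Conviva's)."""
    return (not s.technical_failure
            and s.startup_sec is not None
            and s.startup_sec <= max_startup_sec
            and s.rebuffering_ratio <= max_rbr
            and s.avg_bitrate_bps >= s.min_bitrate_bps)

def good_session_share(sessions):
    """Fraction of sessions that pass every check."""
    sessions = list(sessions)
    if not sessions:
        return 0.0
    return sum(is_good_session(s) for s in sessions) / len(sessions)
```

A session with zero rebuffering but an average bitrate far below its quality bar fails the check, which is the case a single RBR number would miss.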

Multi-Dimensional Drill-Down

Never look at just “overall RBR.” Always slice by dimensions:

Dimension	Examples
Geography	Country / State / City / ISP
Device	OS version, model, chipset, screen size
Network	WiFi / 4G / 5G, throughput range
CDN	Provider, PoP, Shield
Content	Title, resolution tier, codec, duration
Time	Hour, day, week
User	New/returning, paid/free, region

Standard troubleshooting pattern:

Overall RBR spiked to 2% → cause unknown
  ↓ Drill by CDN → CDN-A RBR 5%, CDN-B RBR 0.3%
  ↓ Drill CDN-A by region → Mumbai RBR 12%
  ↓ Drill Mumbai by time → 19:00-22:00 peak spike
  → Conclusion: CDN-A Mumbai PoP degraded during evening peak
  → Action: Route India traffic to CDN-B
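The drill-down pattern above is a repeated group-by over the same session data with one more dimension each time. A small in-memory sketch, assuming session records with hypothetical `rebuffer_sec` and `play_sec` fields (a real deployment would run the equivalent query in the warehouse):

```python
from collections import defaultdict

def rbr_by(sessions, *dims):
    """Aggregate RBR per combination of the given dimension fields."""
    totals = defaultdict(lambda: [0.0, 0.0])  # key -> [rebuffer, play]
    for s in sessions:
        key = tuple(s[d] for d in dims)
        totals[key][0] += s["rebuffer_sec"]
        totals[key][1] += s["play_sec"]
    return {key: (r / (r + p) if r + p > 0 else 0.0)
            for key, (r, p) in totals.items()}

# Made-up sessions for illustration.
sessions = [
    {"cdn": "CDN-A", "city": "Mumbai", "rebuffer_sec": 12.0, "play_sec": 88.0},
    {"cdn": "CDN-A", "city": "Delhi", "rebuffer_sec": 1.0, "play_sec": 99.0},
    {"cdn": "CDN-B", "city": "Mumbai", "rebuffer_sec": 0.3, "play_sec": 99.7},
]

by_cdn = rbr_by(sessions, "cdn")               # first cut
by_cdn_city = rbr_by(sessions, "cdn", "city")  # drill one level deeper
```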

Data Pipeline: From Client to Dashboard

Typical Architecture

┌──────────────┐
│ Client SDK    │
│ (iOS/Android/ │── HTTPS batch POST every 5-10s
│  Web)         │   (events JSON)
└──────────────┘
       │
       ▼
┌──────────────┐
│ Ingestion     │   nginx / ALB / API Gateway / CloudFront
│ (Edge)        │   with rate limiting + auth
└──────────────┘
       │
       ▼
┌──────────────┐
│   Kafka       │   Persistent message queue
│  Topic: qoe   │   Partitioned by session_id
└──────────────┘
       │
       ├─────────────────────┐
       ▼                     ▼
┌──────────────┐      ┌──────────────┐
│ Flink / Spark │      │ ClickHouse / │   Real-time data warehouse
│ Streaming     │      │ BigQuery     │
│ (aggregation) │      │              │
└──────────────┘      └──────────────┘
       │                     │
       ▼                     ▼
┌──────────────┐      ┌──────────────┐
│  Alerting     │      │ BI Dashboard │   Grafana, Tableau, Looker
│ (PagerDuty)   │      │              │
└──────────────┘      └──────────────┘

Event Schema

Every event includes:

{
  "event": "video_rebuffer_start",
  "session_id": "uuid-...",
  "user_id": "u-12345",
  "video_id": "ep-789",
  "timestamp": 1715084800123,
  "player_version": "2.3.4",
  "device": {
    "os": "iOS",
    "os_version": "17.4",
    "model": "iPhone 15 Pro"
  },
  "network": {
    "type": "cellular",
    "carrier": "Verizon",
    "effective_type": "4g"
  },
  "cdn": "cloudfront",
  "bitrate": 2500000,
  "buffer_level_sec": 0.8,
  "position_sec": 45.2
}
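On the ingestion side it is worth rejecting malformed events before they reach Kafka. The sketch below checks a handful of top-level fields from the example above; treating exactly these four as mandatory is an assumption, since the example shows one event and not a formal schema.

```python
import json

# Assumed mandatory fields and their JSON types (hypothetical choice).
REQUIRED_FIELDS = {
    "event": str,
    "session_id": str,
    "video_id": str,
    "timestamp": int,  # epoch milliseconds, as in the example above
}

def validate_event(raw):
    """Return a list of problems; an empty list means the event is usable."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["event must be a JSON object"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in event:
            problems.append(f"missing field: {name}")
        elif isinstance(event[name], bool) or not isinstance(event[name], expected):
            # bool is rejected explicitly because it is a subclass of int
            problems.append(f"wrong type for {name}")
    return problems
```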

Batch vs Real-Time

Don’t send one HTTP request per event: 100K DAU × 100 events per user is 10M requests per day.

Batch strategy: Accumulate 10 seconds or 50 events, then send one POST.
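The "10 seconds or 50 events" rule can be sketched as a small buffer that flushes on whichever limit is hit first. `EventBatcher`, its `send` callback, and the `tick` method are hypothetical names; a real SDK would call `tick` from a timer and have `send` perform the HTTPS POST.

```python
import time

class EventBatcher:
    """Buffer events and hand them to `send` in batches."""

    def __init__(self, send, max_events=50, max_age_sec=10.0,
                 clock=time.monotonic):
        self._send = send
        self._max_events = max_events
        self._max_age_sec = max_age_sec
        self._clock = clock          # injectable so tests can fake time
        self._buffer = []
        self._oldest = None          # clock reading of the oldest buffered event

    def add(self, event):
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(event)
        self._flush_if_due()

    def tick(self):
        """Call periodically so a quiet session still flushes by age."""
        self._flush_if_due()

    def _flush_if_due(self):
        if not self._buffer:
            return
        too_many = len(self._buffer) >= self._max_events
        too_old = self._clock() - self._oldest >= self._max_age_sec
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send whatever is buffered, e.g. when the player is closed."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
```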

Real-World Troubleshooting Cases

Case 1: Overall VST Spike

Monday 9 AM: VST P50 jumped from 400ms to 1.2s
│
├── Drill by OS → Android VST spiked to 2s, iOS normal
│
├── Drill by app version → v3.4.5 all 2s, v3.4.4 normal
│
├── Check changelog → v3.4.5 introduced a new player library
│
└── Action: Emergency hotfix / rollback to v3.4.4

Case 2: Single Title Rebuffering

New show Episode 3: RBR anomalously high at 5%
│
├── Drill by CDN → All CDNs high (not a CDN issue)
│
├── Check segments → One segment is 20 MB (others are 2 MB)
│
├── Check encoding log → 10-second action scene caused bitrate spike
│
└── Fix: Re-transcode with a MaxBitrate cap to limit peak bitrate
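The "check segments" step in Case 2 can be automated as a post-transcode sanity check. A minimal sketch that flags segments much larger than the median; the `oversized_segments` helper and the factor of 3 are arbitrary assumptions, not a recommended threshold.

```python
import statistics

def oversized_segments(sizes_bytes, factor=3.0):
    """Return indexes of segments larger than factor x the median size."""
    if not sizes_bytes:
        return []
    median = statistics.median(sizes_bytes)
    return [i for i, size in enumerate(sizes_bytes) if size > factor * median]

# Made-up rendition: ten segments of 2 MB, one of which ballooned to 20 MB.
sizes = [2_000_000] * 5 + [20_000_000] + [2_000_000] * 4
print(oversized_segments(sizes))
```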

Case 3: Regional Conversion Drop

India new-user first-hour completion rate dropped from 30% to 15%
│
├── Drill by VST → India VST P50 rose from 0.8s to 3s
│
├── Drill by CDN → CDN-A edge node latency elevated in India
│
├── Ping test → CDN-A Mumbai PoP latency 400ms for 4 hours
│
└── Action: Route India traffic to CDN-B, escalate to CDN-A support

Build vs Buy: Mux, Conviva, or Self-Built?

Managed Services

Service	Strengths
Mux	Developer-friendly, simple integration, ~$1.25/1K sessions
Conviva	Enterprise-grade, most comprehensive, most expensive
Datadog RUM	Integrated APM in one platform
NPAW (YOUBORA)	Strong in European markets

Pros: Hours to integrate, dashboards out of the box, zero maintenance.
Cons: Expensive at scale, data lives with third party, limited customization.

Self-Built

Pros: Full customization, data can be joined with business metrics (orders, retention), cost advantage at scale.
Cons: High development/maintenance cost, multi-platform SDK consistency is hard.

Common Evolution

Early stage: Buy Mux — get usable dashboards fast
At scale: Self-built pipeline + keep Mux as a benchmark for comparison

Client SDK Best Practices

Don’t Slow Down Playback

The QoE SDK itself must not degrade the experience:

Report on a separate low-priority thread
Network failures: silent retry, never block UI
SDK crash must not bring down the app

Offline Compensation

Users may finish watching offline, so events must survive until connectivity returns:

While offline, write events to local SQLite or a file
When connectivity returns, batch-upload them in FIFO order
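A rough sketch of the offline queue using SQLite, in Python for brevity (a mobile SDK would use the platform's own SQLite bindings). The `OfflineQueue` class and the `upload` callback, which is assumed to return `True` on success, are hypothetical. Rows are deleted only after a successful upload, so a failed attempt loses nothing.

```python
import json
import sqlite3

class OfflineQueue:
    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)")
        self._db.commit()

    def enqueue(self, event):
        self._db.execute("INSERT INTO events (payload) VALUES (?)",
                         (json.dumps(event),))
        self._db.commit()

    def pending(self):
        return self._db.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    def drain(self, upload, batch_size=50):
        """Upload oldest-first; return False if an upload fails."""
        while True:
            rows = self._db.execute(
                "SELECT id, payload FROM events ORDER BY id LIMIT ?",
                (batch_size,)).fetchall()
            if not rows:
                return True
            if not upload([json.loads(payload) for _, payload in rows]):
                return False  # still offline: keep rows for the next attempt
            self._db.execute("DELETE FROM events WHERE id <= ?",
                             (rows[-1][0],))
            self._db.commit()
```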

Clock Alignment

Device clocks can be inaccurate:

Use server timestamps (HTTP Date header) as reference
Events carry relative time (delta_ms from session start)
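The two points combine as follows: estimate the device's clock offset once from a server `Date` header, then reconstruct each event's absolute time from the session start plus its `delta_ms`. A sketch under those assumptions; note that the `Date` header has one-second resolution and this ignores network latency, so the correction is coarse. The function names are hypothetical.

```python
from email.utils import parsedate_to_datetime

def clock_offset_ms(date_header, device_now_ms):
    """Server time minus device time, in milliseconds.

    date_header: value of an HTTP Date header from any server response.
    device_now_ms: device epoch time when that response arrived.
    """
    server_ms = parsedate_to_datetime(date_header).timestamp() * 1000
    return server_ms - device_now_ms

def corrected_event_ms(session_start_device_ms, delta_ms, offset_ms):
    """Absolute event time on the server's clock."""
    return session_start_device_ms + offset_ms + delta_ms

# Device clock runs 90 s behind the server in this made-up example.
device_now = 1715084800000 - 90_000
offset = clock_offset_ms("Tue, 07 May 2024 12:26:40 GMT", device_now)
event_time = corrected_event_ms(device_now, 45_200, offset)
```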

Sampling at Scale

At massive scale, 100% reporting is too expensive:

Critical error events: Always 100% reported
Normal events: Sample at 10–30%
Hash by user_id to ensure all-or-nothing per user (preserves session analysis)
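Hash-based sampling can be sketched as below. Hashing the `user_id` to a stable bucket means the same user is always in or always out, so sampled sessions stay complete. The salt, the `critical` flag on events, and the 20% default are assumptions for illustration.

```python
import hashlib

def is_sampled(user_id, rate, salt="qoe-v1"):
    """Deterministically decide whether this user is in the sample."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < rate

def should_report(event, rate=0.2):
    """Critical events always go out; others follow the user's sample bucket."""
    if event.get("critical"):
        return True
    return is_sampled(event["user_id"], rate)
```

Changing the salt reshuffles which users are sampled, which is useful if the same cohort should not carry the reporting cost indefinitely.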

Essential Dashboards

Dashboard 1: Global Overview

DAU, total play sessions
VST P50 / P95
RBR, VSF, VPF
SPI (composite score)
Top 10 countries drill-down

Dashboard 2: CDN Health

Per-CDN RBR, VST, error rate
CDN comparison panel (same time window)
CDN edge node map

Dashboard 3: Content Quality

New title first-24-hour quality metrics
Per-title completion rate and RBR
Anomalous title alerts

Dashboard 4: Device and Version

Error rate by app version
VST by OS version
RBR by device model

The QoE Optimization Loop

QoE data isn’t for passive observation — it drives engineering decisions:

        Data reveals problem
              │
              ▼
       Locate root cause
       (CDN? Encoding? ABR?)
              │
              ▼
       Try fix + A/B test
              │
              ▼
       Verify QoE improved
              │
              ▼
       Ship to 100% + keep monitoring

A weekly QoE review is standard practice on mature video teams.

Key Takeaways

QoE measures user-perceived experience, not network metrics.
Six core metrics: VST / RBR / VSF / EBVS / VPF / Average Bitrate.
Conviva’s SPI is a composite “good experience session ratio.”
Data must be sliced by multiple dimensions — a single global number can’t locate problems.
Standard pipeline: Client SDK -> Kafka -> Flink/ClickHouse -> BI.
Start with Mux/Conviva in early stages; build in-house when scale justifies it.
QoE data drives decisions — review weekly.

Previous: Part 9: Video Players

Next: Part 11: End-to-End Workflow
