Tech Deep Dive

Source-code-level analysis and system design explorations. Covers databases, search engines, streaming, video technology, and architecture internals.

29 articles

VOD Deep Dive Part 1: Video Fundamentals — What Is a Video, Really?

The first installment of our 12-part VOD streaming series. Learn what video actually is at the byte level — pixels, resolution, frame rates, bitrate, I/P/B frames, GOP, color spaces, and HDR.

20 min

VOD Deep Dive Part 10: QoE Metrics — How to Measure What Users Actually Feel

QoE vs QoS, six core metrics (VST, RBR, VSF, EBVS, VPF, Avg Bitrate), data pipelines, multi-dimensional drill-down, troubleshooting cases, and when to buy vs build.

20 min

VOD Deep Dive Part 11: End-to-End Workflow — From Upload to Playback

The complete 10-step VOD production pipeline, including upload, content moderation, probe, transcode, package, publish, and CDN pre-warm, plus orchestration with Step Functions and Temporal, and disaster recovery.

20 min

VOD Deep Dive Part 12: Building VOD on AWS — Services, Architecture, and Costs

Complete AWS VOD reference: MediaConvert, MediaPackage, CloudFront, S3, Step Functions, SPEKE DRM integration, Terraform IaC, real cost breakdowns, common pitfalls, and a production roadmap.

22 min

VOD Deep Dive Part 2: Video Codecs — Why a 4K Movie Fits in 5 GB

How video compression works, why H.264 still dominates, when to choose H.265 or AV1, per-title encoding, VMAF quality metrics, and hands-on ffmpeg examples.

25 min

VOD Deep Dive Part 3: Audio Fundamentals — Making Sound Small

How digital audio works: sampling rates, bit depth, channels, AAC vs Opus vs Dolby Atmos, multi-language tracks, loudness normalization, and practical ffmpeg recipes.

12 min

VOD Deep Dive Part 4: Container Formats — .mp4 Is Not a Codec

Containers vs codecs, MP4 internals (Box structure), the faststart trap, fragmented MP4, CMAF for unified HLS+DASH, segment length trade-offs, and subtitle formats.

18 min

VOD Deep Dive Part 5: Streaming Protocols — How HLS and DASH Actually Work

Why progressive download fails, how HLS two-level manifests and DASH MPD work, CMAF dual-manifest best practices, LL-HLS for low latency, and when to consider WebRTC.

25 min

VOD Deep Dive Part 6: Adaptive Bitrate — How Players Auto-Switch Quality

How ABR works under the hood: throughput-based, buffer-based (BBA), BOLA, MPC, and Pensieve algorithms. Plus practical engineering advice for bitrate ladders and short-form video.

20 min

VOD Deep Dive Part 7: CDN Distribution — Why It's Fast Everywhere

CDN architecture (Edge/Shield/Origin), caching strategies, request collapsing, signed URLs, pre-warming, JIT vs pre-packaging, multi-CDN strategies, HTTP/3, and cost estimation.

20 min

VOD Deep Dive Part 8: DRM Content Protection — Why Netflix Can't Be Screen-Recorded

Widevine, FairPlay, PlayReady explained. CENC/CBCS unified encryption, license flow, L1/L2/L3 security levels, HDCP, SPEKE integration, and lightweight protection for short-form video.

22 min

VOD Deep Dive Part 9: Video Players — From Manifest to First Frame

What happens inside a video player: Web (MSE/EME), iOS (AVPlayer), Android (ExoPlayer/Media3), TTFF optimization, buffering strategies, lip sync, and when to build vs buy.

20 min

Subtitle Position Detection with OpenCV and Amazon Nova

A hybrid CV + LLM pipeline for automatic subtitle detection — 6 iterations to reach 83% accuracy on multilingual video.

20 min

How AI Coding Agents Actually Work: A Source Code Deep Dive

We traced the source code of Amazon Q CLI and Claude Code to understand how AI coding agents really work under the hood.

25 min

OpenClaw vs Claude Code: Architecture and Strategy Compared

Two AI agent products, two radically different philosophies. A deep comparison of architecture, adoption, and what's next.

15 min

OpenClaw vs Claude Code Source Code: Two AI Agent Architectures

We compared 453K lines of OpenClaw TypeScript with Claude Code's 28K lines of Markdown. The architectures couldn't be more different.

20 min

Big Data on AWS Deep Dive (Part 10): Full Architecture Blueprint and Cost Breakdown

The complete end-to-end architecture for a social app's data warehouse and recommendation system on AWS — every service mapped, with real monthly cost estimates and optimization strategies.

12 min

Big Data on AWS Deep Dive (Part 9): SageMaker and the ML Platform — From Training to Production

A complete tour of SageMaker AI: Studio notebooks, Feature Store, Training Jobs, real-time Endpoints, Model Monitor, and how it all fits into the recommendation system MLOps workflow.

14 min

Big Data on AWS Deep Dive (Part 8): Online Feature Stores — DynamoDB, ElastiCache, and OpenSearch k-NN

How recommendation systems serve features at inference time: DynamoDB for user features, ElastiCache for hot caching, OpenSearch k-NN for vector recall, and Neptune for graph retrieval.

13 min

Big Data on AWS Deep Dive (Part 7): Recommendation System Fundamentals — Funnel, Two-Tower, and PIT

Understand the recommendation system funnel (recall → pre-rank → rank → re-rank), two-tower retrieval architecture, and why Point-in-Time correctness matters for training samples.

15 min

Big Data on AWS Deep Dive (Part 6): End-to-End Data Pipeline — From Source to Feature Store

Connect all the dots: trace a click event from client SDK through API Gateway, MSK, Firehose, S3, warehouse layers (ODS→DWD→DWS→ADS), to DynamoDB for real-time serving.

10 min

Big Data on AWS Deep Dive (Part 5): EMR, Glue ETL, Flink, and Pipeline Orchestration

Compare EMR Serverless, Glue ETL, and Managed Flink to choose the right compute engine, then orchestrate data pipelines with MWAA (Airflow) and Step Functions.

14 min

Big Data on AWS Deep Dive (Part 4): Glue Catalog, Athena, and Lake Formation

How AWS Glue Data Catalog acts as the central directory for your data lake, and how Athena queries Parquet and Iceberg tables on S3 with serverless SQL.

13 min

Big Data on AWS Deep Dive (Part 3): Data Ingestion — DMS, Zero-ETL, Firehose, and MSK

Four data sources, four ingestion pipelines — learn CDC with DMS, Aurora Zero-ETL, Kafka on MSK, and Firehose micro-batching to land data into your S3 data lake.

16 min

Big Data on AWS Deep Dive (Part 2): S3, Parquet, and Apache Iceberg Explained

Master the storage foundation of modern data lakes — S3 object storage, Parquet columnar format, and how Iceberg adds ACID transactions to files on S3.

14 min

Big Data on AWS Deep Dive (Part 1): Data Lakes, Warehouses, and the Lakehouse Revolution

Understand the core big data concepts — data lake vs. data warehouse vs. lakehouse, OLTP vs. OLAP, and why modern analytics architectures converge on S3.

12 min

How to Design a Full-Site Search Engine with Elasticsearch

Multi-source indexing, CDC sync, permission-aware search, hot keywords, and typeahead — a complete Elasticsearch architecture guide.

20 min

Building a Knowledge Base Search Engine with FSCrawler and Elasticsearch

Index PDFs, Word docs, and scanned files into Elasticsearch with FSCrawler. Covers OCR, custom mappings, and production setup.

16 min

Adding a Unique Index to a 15-Million-Row MySQL Table: A Production War Story

We added a unique index to a 15M-row live table and caused an outage. Here's what went wrong and the right way to do it.

15 min