Advanced Apache Flink Bootcamp
Deep dive into Flink internals, production deployment, and advanced patterns
Course Syllabus
Course Overview
This intensive 2-day bootcamp takes you deep into Apache Flink internals and production best practices. You'll learn how Flink really works by studying the source code, master both DataStream and Table APIs, and gain hands-on experience building custom operators and production-ready pipelines.
This is an advanced bootcamp. Most courses simply repeat what's already in the documentation; this one is different. You won't just learn what a sliding window is: you'll learn the core building blocks that let you design any windowing strategy from the ground up.
Level: Advanced
Format: Deep-dive training with hands-on workshops
Schedule: 9:00 AM - 4:00 PM PST each day
Next cohort: January 21-22, 2026
Learning Objectives
- Understand Flink internals by studying source code and execution flow
- Master DataStream API with state, timers, and custom low-level operators
- Know how SQL and Table API pipelines are planned and executed
- Design efficient end-to-end data flows
- Deploy, monitor, and tune Flink applications in production
Prerequisites
Technical Knowledge
- Good understanding of Apache Flink basics (DataStream or Table API)
- Proficiency with Java
- Understanding of distributed systems concepts
Ideal Experience
- 1+ years working with Flink or similar streaming systems
- Familiarity with Kubernetes is helpful but not required
Course Modules
Module 1: Anatomy of a Flink Pipeline
Understanding Flink internals through source code exploration
- Internals of a Flink application, from SQL query to bytes over the wire: studying the source code, inspecting code generation, looking at how data is exchanged over the network, and understanding stream tasks and processors (see the plan-inspection sketch after this list)
- End-to-end data flow from source to sink
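To give a flavor of where this module starts, here is a minimal sketch of inspecting a query plan before digging into what actually runs. It assumes a recent Flink 1.x Table API; the table definition and query are purely illustrative, not course material.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ExplainExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Illustrative source table backed by the built-in 'datagen' connector.
        tEnv.executeSql(
            "CREATE TABLE orders (order_id BIGINT, amount DOUBLE) "
                + "WITH ('connector' = 'datagen')");

        // Prints the AST, the optimized plan, and the physical execution plan.
        // The module goes further: generated code, stream tasks, and the bytes
        // that actually cross the network.
        System.out.println(
            tEnv.explainSql("SELECT order_id, SUM(amount) FROM orders GROUP BY order_id"));
    }
}
```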
Module 2: Mastering DataStream API
State, timers, watermarks, and custom joins
- Core primitive building blocks: state and timers. Combining them to build stateful operators like joins and aggregations (see the sketch after this list)
- Operator chaining, task assignment, and state evolution
- Complex watermark generators: going beyond choosing timestamp columns
- Building custom low-level operators
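For orientation, a minimal sketch of the state-plus-timers pattern mentioned above, assuming Flink 1.x DataStream APIs and that event timestamps and watermarks are already assigned. The event type, key, and 10-minute delay are illustrative, not course material.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

/** Counts events per key and emits the count 10 minutes (event time) after the key's first event. */
public class FirstSeenCount extends KeyedProcessFunction<String, String, String> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
            new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        Long current = count.value();
        if (current == null) {
            // First event for this key: start counting and arm a single event-time timer.
            count.update(1L);
            ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 10 * 60 * 1000L);
        } else {
            count.update(current + 1);
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // Fires once the watermark passes the registered timestamp.
        out.collect(ctx.getCurrentKey() + " saw " + count.value() + " events");
    }
}
```

It would be wired in with something like `events.keyBy(e -> e).process(new FirstSeenCount())`.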
Module 3: Mastering Table API & SQL
Dynamic tables, UDFs, and query optimization
- Dynamic tables, versioned tables, changelog semantics
- Understanding query plans: what EXPLAIN doesn't tell you
- Strategies for updating SQL jobs
- UDF best practices (see the sketch after this list)
- Process Table Functions (PTFs)
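As a small reference point for the UDF discussion, a minimal scalar function sketch, assuming the Flink 1.x Table API. The function name and logic are illustrative.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class UdfExample {

    /** Null-safe scalar UDF; the logic is illustrative. */
    public static class Normalize extends ScalarFunction {
        public String eval(String input) {
            return input == null ? null : input.trim().toLowerCase();
        }
    }

    public static void main(String[] args) throws Exception {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Register the function, then call it from SQL like any built-in.
        tEnv.createTemporarySystemFunction("NORMALIZE", Normalize.class);
        tEnv.executeSql(
                "SELECT NORMALIZE(name) FROM (VALUES ('  Mixed Case  ')) AS t(name)")
            .print();
    }
}
```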
Module 4: Flink Connector Design
Modern source and sink patterns
- Design principles for modern source and sink connectors
- Understanding popular connectors: Kafka, Iceberg, Flink CDC, etc. (see the Kafka source sketch after this list)
- Connectors and changelogs
- Best practices for connector implementation
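For context, a minimal sketch of the modern (FLIP-27 style) Kafka source. It assumes the flink-connector-kafka dependency is on the classpath; the broker address, topic, and group id are placeholders.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Modern source API; broker, topic, and group id are placeholders.
        KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("localhost:9092")
            .setTopics("events")
            .setGroupId("bootcamp-demo")
            .setStartingOffsets(OffsetsInitializer.earliest())
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

        DataStream<String> events =
            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-events");
        events.print();

        env.execute("kafka-source-demo");
    }
}
```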
Module 5: Hands-on Workshop - Building Custom Operators
Practical implementation with state and timers
- Build low-level DataStream API operators with state and timers
- Implement the same logic using custom Process Table Functions (PTFs)
Module 6: Architecting Efficient End-to-End Data Flows
Optimization patterns and best practices
- Changelog and upsert semantics, metadata propagation end-to-end
- Data reuse, scan sharing, and statement sets. Minimizing unnecessary work
- When to materialize intermediate data streams. One Flink pipeline vs. many
- Async I/O and broadcasting (see the enrichment sketch after this list)
- Serialization best practices
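As a reference for the async I/O topic, a minimal enrichment sketch assuming Flink 1.x DataStream APIs. The `lookupProfile()` method is a hypothetical stand-in for a real non-blocking client call.

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

/** Enriches user ids with profile data without blocking the operator thread. */
public class AsyncEnrichment extends RichAsyncFunction<String, String> {

    @Override
    public void asyncInvoke(String userId, ResultFuture<String> resultFuture) {
        lookupProfile(userId).whenComplete((profile, error) -> {
            if (error != null) {
                resultFuture.completeExceptionally(error);
            } else {
                resultFuture.complete(Collections.singleton(userId + ":" + profile));
            }
        });
    }

    // Hypothetical non-blocking lookup; replace with an async HTTP/DB client.
    private CompletableFuture<String> lookupProfile(String userId) {
        return CompletableFuture.completedFuture("profile-of-" + userId);
    }

    /** Wiring it in: at most 100 in-flight requests, 5-second timeout. */
    public static DataStream<String> enrich(DataStream<String> userIds) {
        return AsyncDataStream.unorderedWait(
            userIds, new AsyncEnrichment(), 5, TimeUnit.SECONDS, 100);
    }
}
```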
Module 7: Flink in Production - Deployment
Kubernetes deployment and infrastructure
- Using the Flink Kubernetes Operator
- Launching ad-hoc jobs with SQL Gateway
- Autoscaling
- Choosing parallelism, task slots, and resources
- Using the REST API (see the sketch after this list)
- Building a control plane: putting Flink on auto-pilot
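As a taste of the REST API topic, a minimal sketch that lists the cluster's jobs, the kind of call a control plane builds on. The endpoint address is a placeholder; in a Kubernetes deployment it would point at the JobManager REST service exposed by the operator.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ListJobs {
    public static void main(String[] args) throws Exception {
        // Placeholder address for the JobManager's REST endpoint.
        String restEndpoint = "http://localhost:8081";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(restEndpoint + "/jobs/overview"))
            .GET()
            .build();

        // Returns a JSON document describing every job known to the cluster.
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```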
Module 8: Flink in Production - Observability and Reliability
Monitoring, alerting, and failure recovery
- Checkpointing in depth (baseline settings are sketched after this list)
- Key metrics to track, alerting strategies, and holistic telemetry collection
- Minimizing restarts and improving failure recovery
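To anchor the checkpointing discussion, a sketch of baseline checkpoint settings, assuming a recent Flink 1.x release. The exact values are illustrative and depend on state size and recovery objectives.

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSettings {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60 seconds with exactly-once semantics.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        CheckpointConfig cfg = env.getCheckpointConfig();
        cfg.setMinPauseBetweenCheckpoints(30_000);   // breathing room between checkpoints
        cfg.setCheckpointTimeout(10 * 60_000);       // fail slow checkpoints after 10 minutes
        cfg.setTolerableCheckpointFailureNumber(3);  // don't restart on a single failed checkpoint
        // Keep the last checkpoint when the job is cancelled, so it can be restored later.
        cfg.setExternalizedCheckpointCleanup(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    }
}
```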
Module 9: Flink in Production - Performance and Tuning
Profiling, benchmarking, and optimization
- Profiling and benchmarking techniques. So many flamegraphs!
- Recommended settings to tune, common pitfalls to avoid
- RocksDB tuning and optimizations
Module 10: Hands-on Workshop - Production-Ready Pipeline
End-to-end implementation and deployment
- Take a Flink pipeline from development to production
- Apply optimization techniques learned on Day 2
- Configure deployment with the Kubernetes Operator
- Set up observability components
- Performance tuning and benchmarking
Please note: the course modules are subject to change
Certificate of Completion
All participants who complete the bootcamp and both hands-on workshops will receive a Certificate of Completion for the Advanced Apache Flink Bootcamp.
Instructor
Yaroslav Tkachenko, Lead Instructor 
- Yaroslav has been building software for more than fifteen years, focusing on data platform engineering and data streaming for the past eight years.
- Yaroslav was a tech lead at Activision and Shopify, driving major initiatives with technologies like Apache Kafka and Apache Flink.
- Later, Yaroslav spent several years as a founding engineer at Goldsky, building a self-managed data streaming platform based on Apache Flink.
- Yaroslav is an international speaker, the author of the Data Streaming Journey newsletter, a Confluent Catalyst, and the founder of Irontools.
Testimonials
"I had the privilege of working with Yaroslav Tkachenko at Shopify, and it's rare to find someone with his depth of expertise across the entire streaming stack combined with an exceptional ability to make complex topics accessible. If you're serious about mastering streaming technologies, learning from someone with Yaroslav's real-world, battle-tested experience is an opportunity you don't want to miss."
Ryan van Huuksloot
Staff Engineer @ Shopify
"I’ve been running large-scale streaming systems for years, and Yaroslav consistently stands out as someone who really understands Flink. His experience with stateful stream processing, best practices, and production patterns goes beyond what you typically see in documentation. The Advanced Apache Flink Bootcamp looks genuinely useful for anyone looking to level up."
Sujay Jain
Senior Software Engineer @ Netflix
"Yaroslav’s knowledge of Flink, Kafka, and ClickHouse is outstanding. His consulting sessions are packed with practical advice, best practices, and forward-looking insights. Highly recommended for anyone building real-time data platforms."
Hojjat Jafarpour
CEO @ DeltaStream & Creator of ksqlDB
Course Schedule
January 2026 Cohort: January 21-22, 2026 (Wed-Thu)
Day 1: Wednesday, January 21st
Day 2: Thursday, January 22nd
What's Included
- 14 hours of live instruction (2 days, 7 hours each)
- 2 hands-on workshops integrated into the curriculum
- All course materials and code examples
- 90-day access to course recordings
- Certificate of completion
- Private Discord community access
- Q&A with the instructor
- Real-world project templates and exercises