Clinic Records ⁴²

Digital Supplements for a Healthy Mind

YEAR_2026

              2026.03.30
            

The Midnight Job: Why Batch Processing Has an Expiry Date

Streaming, Batch Processing, Data Engineering

Note

DIAGNOSTIC BRIEF: Scheduled batch jobs are the drip-feed IV of the data world — …


              2026.03.28
            

Terraform Hardening Protocol: 5 Safeguards That Could Have Saved Everything

Terraform, Infrastructure as Code, Cloud Security

Important

PRE-OPERATIVE CHECKLIST INITIATED: Before any infrastructure surgery, verify all …


              2026.03.23
            

The State File Incident: One Missing File, One Destroyed Database

Terraform, Infrastructure as Code, Incident Report

Caution

CRITICAL SYSTEM FAILURE: Patient vitals flatlined. Root cause: state amnesia …


              2026.03.17
            

PyFlink II: Windows, Watermarks, and Late Events

PyFlink, Data Engineering, Apache Flink

Goal: Understand and apply the core mechanisms of Flink stream processing—Windows and …


              2026.03.16
            

PyFlink I: Architecture, Checkpoints, and Pass-Through Jobs

PyFlink, Data Engineering, Apache Flink

Goal: Understand Flink’s internal architecture (JobManager and TaskManager). Build …


              2026.03.14
            

Streaming Foundations II: Python Consumers and PostgreSQL

Kafka, Python, Data Engineering

Goal: Write a Python consumer to read and deserialize Kafka messages. Set up a target …


              2026.03.13
            

Streaming Foundations I: Redpanda and Python Producers

Kafka, Redpanda, Python

Goal: Understand the fundamentals of message brokers, use Redpanda to simplify Kafka …


              2026.03.07
            

Running Spark in the Cloud: GCS, Standalone Clusters, and Dataproc

Apache Spark, GCP, Dataproc

Goal: Move from local Spark development to cloud execution — connecting to Google Cloud …


              2026.03.06
            

Spark Internals: Clusters, Shuffles, Joins, and RDDs

Apache Spark, Distributed Systems, RDD

Goal: Understand how Spark executes jobs across a cluster, how operations like GROUP BY …


              2026.03.05
            

Spark SQL: Running SQL Queries on DataFrames

Apache Spark, Spark SQL, PySpark

Goal: Learn how to combine multiple datasets in Spark, register DataFrames as temporary …

