Airflow XCom Exclusive

In Airflow, tasks run in isolation, often on different worker nodes. To pass metadata, state, or small data sets between these isolated tasks, Airflow uses XComs (short for "cross-communications").

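    from airflow.decorators import task

    @task
    def transform(data: dict):
        # Append one element; the returned dict is pushed to XCom automatically.
        data['data'].append(4)
        return data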

Excessive XCom writes generate heavy concurrent I/O against the metadata database, leading to database locks and slower scheduler loops.

Designing "Exclusive" XCom Workflows

Historically, Airflow allowed XCom values to be serialized using Python's pickle module, which could lead to security vulnerabilities and version incompatibilities. Modern Airflow serializes XCom values as JSON by default, and pickling support is deprecated. Always ensure your XCom values are JSON-serializable unless you have a very good reason to do otherwise.
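A quick way to catch non-serializable values before they reach XCom is to run them through a JSON round trip first. The sketch below is only an illustration: ensure_json_serializable is a hypothetical helper, not part of Airflow, and it relies on the standard library's json module, which may be stricter than Airflow's own serializer.

```python
import json
from datetime import datetime, timezone


def ensure_json_serializable(value):
    """Return value unchanged if json.dumps accepts it, else raise TypeError."""
    try:
        json.dumps(value)
    except (TypeError, ValueError) as exc:
        raise TypeError(f"XCom value is not JSON-serializable: {exc}") from exc
    return value


run_started = datetime(2024, 1, 1, tzinfo=timezone.utc)

# A raw datetime is rejected by json.dumps; its ISO 8601 string form is accepted.
safe_payload = ensure_json_serializable(
    {"run_started": run_started.isoformat(), "rows": 3}
)
```

Calling the helper at the end of a task function, just before the return statement, keeps the check close to the value that would be pushed.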

Problem: Pushing a large JSON payload or a DataFrame can exceed the 48KB limit and may silently fail or corrupt your metadata database.

Solution: Store large objects in a shared file system or object store (e.g., S3, GCS) and pass only the file URI via XCom. Alternatively, use a custom XCom backend that offloads large payloads automatically.
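The "pass only the URI" idea can be sketched without any Airflow code at all. In the example below, to_xcom and from_xcom are hypothetical helper names, the threshold simply reuses the 48KB figure from the text, and a local temporary directory stands in for S3 or GCS. Treat it as an illustration of the pattern, not a production backend.

```python
import json
import tempfile
from pathlib import Path

# Threshold taken from the 48KB figure in the text; treat it as an assumption.
XCOM_THRESHOLD_BYTES = 48 * 1024


def to_xcom(value, storage_dir, name):
    """Return the value inline if it is small, else write it out and return a reference."""
    encoded = json.dumps(value).encode("utf-8")
    if len(encoded) <= XCOM_THRESHOLD_BYTES:
        return {"inline": value}
    path = Path(storage_dir) / f"{name}.json"
    path.write_bytes(encoded)
    # Only this short string would travel through XCom.
    # A local path stands in here for an s3:// or gs:// URI.
    return {"uri": str(path)}


def from_xcom(ref):
    """Resolve a reference produced by to_xcom back into the original value."""
    if "inline" in ref:
        return ref["inline"]
    return json.loads(Path(ref["uri"]).read_bytes())


storage = tempfile.mkdtemp()
small_ref = to_xcom({"data": [1, 2, 3]}, storage, "small")
large_ref = to_xcom({"data": list(range(20000))}, storage, "large")
```

The upstream task would return the result of to_xcom, and the downstream task would call from_xcom on whatever it receives, so the metadata database only ever stores a small dictionary.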

Use .output explicitly, or pull the value inside a Jinja template string: {{ ti.xcom_pull(task_ids='...') }}.

High database CPU usage on scheduler nodes.

By default, Airflow tasks push and pull XComs via the metadata database (usually PostgreSQL or MySQL).
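A minimal sketch of an explicit push and pull might look like the following. The function names, the DAG name, the key "row_count", and the value 42 are all made up for illustration. Airflow is imported lazily inside build_dag, which is an assumption made here so the two functions can be exercised on their own.

```python
from datetime import datetime


def push_count(ti=None):
    # Explicit push: stores one value in XCom under a custom key.
    ti.xcom_push(key="row_count", value=42)


def pull_count(ti=None):
    # Explicit pull: reads the value back by upstream task id and key.
    return ti.xcom_pull(task_ids="push_count", key="row_count")


def build_dag():
    # Lazy import so the functions above can be used without Airflow installed.
    from airflow.decorators import dag, task

    @dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
    def xcom_push_pull_demo():
        # task_id defaults to the function name, which pull_count relies on.
        task(push_count)() >> task(pull_count)()

    return xcom_push_pull_demo()
```

Each xcom_push call is a write to the metadata database and each xcom_pull is a read, which is why the volume and size of these calls matter.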

Any value returned by an operator’s execute() method or a TaskFlow API Python function is automatically pushed to XCom.
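The implicit form can be sketched the same way. Here the return value of extract is stored under the default key return_value, and passing it into transform makes Airflow pull it for the downstream task. The extract function, the DAG name, and the lazy import inside build_dag are illustrative assumptions; transform mirrors the task shown earlier.

```python
from datetime import datetime


def extract():
    # The returned dict is what Airflow would store under the key "return_value".
    return {"data": [1, 2, 3]}


def transform(data: dict):
    # Same logic as the transform task above: append one element and return.
    data["data"].append(4)
    return data


def build_dag():
    # Lazy import so the plain functions stay usable without Airflow installed.
    from airflow.decorators import dag, task

    @dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
    def xcom_return_value_demo():
        # Feeding one task's result into another creates the XCom pull implicitly.
        task(transform)(task(extract)())

    return xcom_return_value_demo()
```

Because every returned value becomes an XCom row, a task that returns nothing useful is better off returning None than a large intermediate object.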

Whether you are using PostgreSQL, MySQL, or SQLite, this architectural design introduces major bottlenecks if abused: