datafun-streaming

Shared Python utilities for streaming data analytics projects.

Provides a consistent foundation so projects can focus on streaming concepts rather than boilerplate.

What's Included

Subpackage                           Purpose
datafun_streaming.kafka              Producer, consumer, admin, and connection utilities
datafun_streaming.storage            DuckDB persistence with schema inference
datafun_streaming.data_validation    Field-level validators, types, and reference data helpers
datafun_streaming.io                 CSV and JSON file I/O utilities
datafun_streaming.stats              Incremental running statistics
datafun_streaming.visualization      Plotly chart utilities for live data
datafun_streaming.core               Shared type aliases used across subpackages

Installation

uv add datafun-streaming

Quick Start

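from pathlib import Path

from datafun_streaming.kafka.kafka_settings import KafkaSettings
from datafun_streaming.storage.duckdb_utils import init_db, upsert_row
from datafun_streaming.stats.stats_utils import RunningStats

# Read Kafka settings from the environment.
settings = KafkaSettings.from_env()
# Initialize the DuckDB database at the given path.
conn = init_db(Path("data/output/sales.duckdb"))
# Create an accumulator for running statistics.
stats = RunningStats()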
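
To illustrate what "incremental running statistics" means in practice, here is a minimal, self-contained sketch. It is not the library's RunningStats implementation, and its API may differ. The class name RunningStatsSketch, its update method, and its attributes are hypothetical, and Welford's online algorithm is used here only as one common way to update a mean and variance one value at a time without storing the full stream.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStatsSketch:
    """Hypothetical stand-in for illustration; not datafun_streaming's RunningStats."""

    count: int = 0
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        """Fold one new observation into the running totals (Welford's method)."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        # Uses the deviation from both the old and the updated mean.
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

    @property
    def stddev(self) -> float:
        """Sample standard deviation derived from the running variance."""
        return math.sqrt(self.variance)


# Simulate three messages arriving one at a time.
stats = RunningStatsSketch()
for amount in [10.0, 20.0, 30.0]:
    stats.update(amount)

print(stats.count, stats.mean, stats.variance)
```

Each update runs in constant time and memory, which is why this style of accumulator suits a consumer loop that processes one message at a time.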
