Project Instructions

WEDNESDAY: Complete Workflow Phases 1-3

Follow the instructions in Workflow: Apply Example.

Complete:

  1. Phase 1. Start & Run - copy the project and confirm it runs
  2. Phase 2. Change Authorship - update the project to your name and GitHub account
  3. Phase 3. Read & Understand - review the project structure and code

FRIDAY/SUNDAY: Complete Workflow Phases 4-5

Complete:

  1. Phase 4. Make a Technical Modification
  2. Phase 5. Apply the Skills to a New Problem

Topic

Streaming analytics using Kafka, validation, and derived fields.

This project focuses on validating streaming messages and computing analytics as messages are consumed.

The case project:

  • produces sales messages to a Kafka topic
  • consumes messages from Kafka
  • validates each message against a data contract
  • enriches valid messages with derived fields
  • writes consumed records to a local CSV file
  • logs running summary statistics
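The consume-validate-enrich path above can be sketched in plain Python. This is a minimal sketch only: the function names, field names (store, quantity, unit_price), and rules here are invented for illustration and are not the case project's actual API or data contract.

```python
# Hypothetical sketch of the validate -> enrich -> summarize loop.
# Field names and rules are illustrative, not the real data contract.

def is_valid(message: dict) -> bool:
    """Toy data-contract check: required fields present, quantity positive."""
    required = ("store", "quantity", "unit_price")
    return all(k in message for k in required) and message["quantity"] > 0

def enrich(message: dict) -> dict:
    """Add a derived field computed from existing fields."""
    out = dict(message)
    out["total_price"] = out["quantity"] * out["unit_price"]
    return out

def consume(messages):
    """Validate each message, enrich the valid ones, keep a running total."""
    accepted, running_total = [], 0.0
    for msg in messages:
        if not is_valid(msg):
            continue  # a real consumer would log the rejection here
        row = enrich(msg)
        running_total += row["total_price"]
        accepted.append(row)
    return accepted, running_total
```

The real consumer also writes accepted rows to a CSV file and logs summary statistics; this sketch only shows the accept/reject/enrich decision at the core of the loop.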

Example Files

Review these files before making your changes:

  • src/streaming/kafka_producer_case.py: Produces sales messages to Kafka
  • src/streaming/kafka_consumer_case.py: Consumes, validates, enriches, and summarizes messages
  • src/streaming/data_validation/data_contract_case.py: Defines the data contract
  • src/streaming/data_validation/data_validation_case.py: Provides validation helpers
  • src/streaming/data_engineering/derived_fields.py: Computes derived fields

The example data starts in:

data/sales.csv

Run commands are in README.md.

Phase 4: Make a Small Technical Modification

Copy the consumer case file:

src/streaming/kafka_consumer_case.py

Rename your copy:

src/streaming/kafka_consumer_yourname.py

Run your copied file to confirm it still works, then make one small change.

Good options include:

  • change the KAFKA_TOPIC name in .env
  • change the PRODUCER_MESSAGE_COUNT in .env
  • change the PRODUCER_MESSAGE_INTERVAL_SECONDS in .env
  • change which field is summarized
  • write an additional output CSV field
  • change what gets logged for each consumed message

Keep your change small enough that you can explain it clearly.
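The first three options are one-line edits to .env. A sketch of what that might look like — the variable names come from the options above, but the values here are examples only, not the project's defaults:

```shell
# .env - illustrative values only
KAFKA_TOPIC=sales_yourname
PRODUCER_MESSAGE_COUNT=50
PRODUCER_MESSAGE_INTERVAL_SECONDS=2
```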

Optional: Modify the Producer

You can leave the producer unchanged.

To customize the producer:

  1. Copy src/streaming/kafka_producer_case.py.
  2. Rename it src/streaming/kafka_producer_yourname.py.
  3. Change the message source or message fields.
  4. Run the full pipeline again.
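Step 3 (changing the message source or fields) often amounts to building one message dict per CSV row before the producer sends it. A hedged sketch, assuming a file shaped like data/sales.csv — the column names used here (store, quantity, unit_price) are hypothetical and may not match the real file:

```python
import csv

def messages_from_csv(path: str):
    """Yield one message dict per CSV row.

    Column names (store, quantity, unit_price) are illustrative;
    check the header of data/sales.csv for the real ones.
    """
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield {
                "store": row["store"],
                "quantity": int(row["quantity"]),
                "unit_price": float(row["unit_price"]),
            }
```

The producer would then serialize each dict (for example, to JSON) and send it to the Kafka topic; the sending itself is unchanged from the case file.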

Phase 5: Apply the Skills

Apply the same streaming analytics pattern to your own scenario.

You may:

  • extend the current sales example
  • add a new derived field
  • add a new validation rule
  • change how messages are accepted or rejected
  • use a different dataset
  • add domain-specific validation rules
  • compute different derived fields
  • summarize a different numeric field
  • compare accepted and rejected messages
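As one concrete shape for a Phase 5 extension, adding a new validation rule plus a new derived field might look like the sketch below. The field name (timestamp) and the rule are invented for illustration, not taken from the case project:

```python
from datetime import datetime

def valid_timestamp(message: dict) -> bool:
    """Hypothetical new rule: reject messages whose timestamp
    field is missing or not ISO-8601 parseable."""
    try:
        datetime.fromisoformat(message["timestamp"])
        return True
    except (KeyError, ValueError):
        return False

def add_day_of_week(message: dict) -> dict:
    """Hypothetical new derived field: day of week from the timestamp."""
    out = dict(message)
    out["day_of_week"] = datetime.fromisoformat(out["timestamp"]).strftime("%A")
    return out
```

Whatever you add, the pattern is the same as the case project: a rule decides accept/reject, and a derived-field function enriches accepted messages before they are written and summarized.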

Document your work in docs/index.md.

Explain:

  • what scenario or dataset you chose
  • what changed from the case example
  • what your consumer does to each message
  • what derived fields or statistics you computed
  • what you learned from the stream