API Reference
Auto-generated code documentation.
datakit_lite
datakit_lite: helpful utilities for Python and analytics education.
log_duration
log_duration(label: str)
Context manager that prints how long a block takes.
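For illustration only, a context manager with this behavior could be sketched as follows. The name `log_duration_sketch`, the use of `time.perf_counter`, and the printed format are assumptions; the actual code in src/datakit_lite/timer.py may differ.

```python
import time
from contextlib import contextmanager


@contextmanager
def log_duration_sketch(label: str):
    """Hypothetical stand-in for log_duration: time a block and print the result."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # The finally clause runs even if the block raises,
        # so the duration is reported either way.
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.4f}s")  # output format is an assumption


# Usage: wrap any block of code in a `with` statement.
with log_duration_sketch("sum of squares"):
    total = sum(i * i for i in range(100_000))
```

Timing with a context manager suits ad hoc blocks (a load step, a loop) that are not worth factoring into their own function.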
Source code in src/datakit_lite/timer.py
project_paths
project_paths(root: str | Path = '.') -> ProjectPaths
Return a set of standard project directories, creating them if needed.
Directories
- data/raw
- data/clean
- reports
- models
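A minimal sketch of how such a helper might work, assuming the directories are created with `Path.mkdir(parents=True, exist_ok=True)`. The names `ProjectPathsSketch` and `project_paths_sketch` are hypothetical stand-ins, not the library's code, and the real implementation in src/datakit_lite/paths.py may differ.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Union


@dataclass
class ProjectPathsSketch:
    """Hypothetical stand-in for ProjectPaths, using the documented attributes."""
    root: Path
    data_raw: Path
    data_clean: Path
    reports: Path
    models: Path


def project_paths_sketch(root: Union[str, Path] = ".") -> ProjectPathsSketch:
    """Build the standard directory layout under `root`, creating what is missing."""
    root = Path(root)
    paths = ProjectPathsSketch(
        root=root,
        data_raw=root / "data" / "raw",
        data_clean=root / "data" / "clean",
        reports=root / "reports",
        models=root / "models",
    )
    for directory in (paths.data_raw, paths.data_clean, paths.reports, paths.models):
        # exist_ok=True makes repeated calls safe; parents=True creates data/ as needed.
        directory.mkdir(parents=True, exist_ok=True)
    return paths


# Usage: a temporary directory keeps the demo from touching a real project.
with tempfile.TemporaryDirectory() as tmp:
    paths = project_paths_sketch(tmp)
    raw_exists = paths.data_raw.is_dir()
```

Because directory creation is idempotent in this sketch, calling the function at the top of every script or notebook is harmless.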
Source code in src/datakit_lite/paths.py
summarize_table
summarize_table(df: DataFrame) -> pd.DataFrame
Return a simple summary of a pandas DataFrame.
Columns
- name: column name
- dtype: pandas dtype
- non_null: count of non-null values
- total: total rows
- missing_pct: percent missing (0-100)
- unique: number of unique values
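The columns above could be computed roughly as in the sketch below. This is an illustration rather than the library's code: `summarize_table_sketch` is a hypothetical name, and counting `unique` without missing values (via `nunique(dropna=True)`) is an assumption the documentation does not confirm.

```python
import pandas as pd


def summarize_table_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for summarize_table: one summary row per column."""
    total = len(df)
    rows = []
    for col in df.columns:
        non_null = int(df[col].notna().sum())
        # Guard against division by zero for an empty DataFrame.
        missing_pct = 0.0 if total == 0 else 100.0 * (total - non_null) / total
        rows.append(
            {
                "name": col,
                "dtype": str(df[col].dtype),
                "non_null": non_null,
                "total": total,
                "missing_pct": missing_pct,
                # Assumption: missing values are not counted as a unique value.
                "unique": int(df[col].nunique(dropna=True)),
            }
        )
    columns = ["name", "dtype", "non_null", "total", "missing_pct", "unique"]
    return pd.DataFrame(rows, columns=columns)


# Usage: a small table with one missing value in column "a".
df = pd.DataFrame({"a": [1, 2, None, 2], "b": ["x", "y", "z", "x"]})
summary = summarize_table_sketch(df)
```

For column "a" in this example, one of four values is missing, so `missing_pct` is 25.0 and `non_null` is 3.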
Source code in src/datakit_lite/summary.py
timeit
timeit(fn)
Print how long a function takes.
Parameters:
fn : callable
    The function to be timed.
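A timing decorator of this kind is commonly written along the following lines. The sketch assumes `functools.wraps` and `time.perf_counter`; the name `timeit_sketch` and the message format are hypothetical, and the actual code in src/datakit_lite/timer.py may differ.

```python
import functools
import time


def timeit_sketch(fn):
    """Hypothetical stand-in for timeit: print how long each call to fn takes."""

    @functools.wraps(fn)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{fn.__name__} took {elapsed:.4f}s")  # format is an assumption

    return wrapper


# Usage: decorate a function, then call it as usual.
@timeit_sketch
def slow_add(a, b):
    time.sleep(0.01)
    return a + b


result = slow_add(2, 3)
```

The decorated function returns its normal value; the timing line is printed as a side effect on every call.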
Source code in src/datakit_lite/timer.py
paths
Project path management utilities.
This module provides:
- ProjectPaths: dataclass for organizing standard project directories
- project_paths: function to create and return project directory structure
ProjectPaths (dataclass)
Standard project directory paths.
Attributes:
root : Path
    The root directory of the project.
data_raw : Path
    Directory for raw data files.
data_clean : Path
    Directory for cleaned/processed data files.
reports : Path
    Directory for reports and output files.
models : Path
    Directory for model files and artifacts.
Source code in src/datakit_lite/paths.py
project_paths
project_paths(root: str | Path = '.') -> ProjectPaths
Return a set of standard project directories, creating them if needed.
Directories
- data/raw
- data/clean
- reports
- models
Source code in src/datakit_lite/paths.py
summary
Summary utilities for pandas DataFrames.
This module provides functions to generate summary statistics and metadata for pandas DataFrames, including information about column types, missing values, and unique value counts.
summarize_table
summarize_table(df: DataFrame) -> pd.DataFrame
Return a simple summary of a pandas DataFrame.
Columns
- name: column name
- dtype: pandas dtype
- non_null: count of non-null values
- total: total rows
- missing_pct: percent missing (0-100)
- unique: number of unique values
Source code in src/datakit_lite/summary.py
timer
Timer utilities for measuring function and code block execution time.
This module provides:
- timeit: decorator for timing function execution
- log_duration: context manager for timing code blocks
log_duration
log_duration(label: str)
Context manager that prints how long a block takes.
Source code in src/datakit_lite/timer.py
timeit
timeit(fn)
Print how long a function takes.
Parameters:
fn : callable
    The function to be timed.
Source code in src/datakit_lite/timer.py