Working with SQLite¶
Using SQLite with Python, Power BI, and Spark SQL
SQLite is lightweight, cross-platform, and already included with Python. It is a file-based relational data store so we don't need to install and configure services like would be needed with MySQL, SQLServer, or other enterprise solution.
It makes an accessible proxy to learn about working with data warehouses.
Python¶
Python includes SQLite support by default through the built-in sqlite3 module.
Just import the package from the Python Standard Library in your Python code file or notebook:
import sqlite3
conn = sqlite3.connect("data/warehouse/smart_sales.db")
cursor = conn.cursor()
cursor.execute("SELECT COUNT(*) FROM fact_sale")
Power BI Desktop (Windows)¶
We can connect PowerBI to SQLite using the SQLite ODBC driver.
- Download it from https://www.ch-werner.de/sqliteodbc.
- Install the SQLite3 ODBC Driver (64-bit) using default settings.
- Create a Windows DSN (Data Source Name):
- Open Windows Start menu and type ODBC Data Sources (64-bit)
- Click the app when it appears to launch it
- Select the "System DSN" tab.
- Click Add… and choose SQLite3 ODBC Driver
- Click Finish
- Name it SmartSalesDSN
- Click Browse… and select your file, for example:
data/warehouse/smart_sales.db
In PowerBI, select this DNS from Get Data / ODBC.
Apache Spark¶
Apache Spark can connect to SQLite using the SQLite JDBC Driver.
- Download the SQLite JDBC
.jarfrom https://github.com/xerial/sqlite-jdbc/releases - Save the
.jarfile into your repo at:lib/sqlite-jdbc.jar - Configure PySpark:
spark = (
SparkSession.builder
.config("spark.driver.extraClassPath", "lib/sqlite-jdbc.jar")
.getOrCreate()
)
- Load a table:
df = spark.read.format("jdbc").options(
url="jdbc:sqlite:data/warehouse/smart_sales.db",
dbtable="fact_sale"
).load()
OPTIONAL: SQLite CLI (Command Line Interface)¶
This is not required.
SQLite provides a simple, fast SQL shell for working with .db files from the terminal.
To use it:
- Download the CLI from https://sqlite.org/download.html (look for
sqlite-toolsfor your OS) - Unzip the package
- Add the unzipped folder to your PATH.
Once installed, we can open a database from the terminal:
sqlite3 data/warehouse/smart_sales.db
Once inside the shell:
.tables
SELECT COUNT(*) FROM fact_sale;
Learn more at: https://sqlite.org/cli.html