dbt Testing, Documentation, and Packages
Goal: ensure data quality with Tests, generate beautiful Documentation, and leverage the community with Packages.
1. Testing: Sleep Well at Night
dbt makes testing a core part of the development workflow, not an afterthought.
Generic Tests
You can add these directly to your schema.yml. dbt ships with 4 built-in tests:
- unique: No duplicates.
- not_null: No missing values.
- accepted_values: Ensure a column is one of ['A', 'B', 'C'].
- relationships: Referential integrity (Foreign Key check).
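As a rough sketch of how the four tests might be declared, the schema.yml below attaches them to columns of a hypothetical orders model. The model, column, and value names are illustrative assumptions, and the exact key names vary slightly between dbt versions.

```yaml
# models/schema.yml (hypothetical model, column, and value names)
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:              # dbt 1.8+ also accepts `data_tests:` for this key
          - unique          # no two rows share an order_id
          - not_null        # every row has an order_id
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
      - name: customer_id
        tests:
          - relationships:  # every customer_id must exist in customers
              to: ref('customers')
              field: customer_id
```

Running dbt test then executes every test declared this way.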
Singular Tests
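For specific business logic, write a SQL query in a .sql file under tests/. The query should select the rows that violate your rule: if it returns any rows, the test fails. For example, a query that selects orders with a negative total fails as soon as one such order exists.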
2. Documentation: Your Project Website
dbt parses your project and generates a static website with:
- Lineage Graphs: See dependencies between models.
- Column Descriptions: Pulled from schema.yml.
- SQL Code: View compiled SQL.
How to Generate
Run dbt docs generate to build the site, then dbt docs serve to browse it locally.
Doc Blocks
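For long descriptions, write a doc block in a Markdown (.md) file inside your project, wrapping the text between {% docs table_description %} and {% enddocs %} tags.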
Refer to it in YAML: description: '{{ doc("table_description") }}'
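In context, that reference might sit in a properties file like the sketch below. The orders model name is a hypothetical placeholder; only the doc block name table_description comes from the text above.

```yaml
# models/schema.yml (hypothetical model name)
version: 2

models:
  - name: orders
    # dbt substitutes the contents of the doc block named table_description
    description: '{{ doc("table_description") }}'
```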
3. Packages: Don’t Reinvent the Wheel
dbt has a large ecosystem of open-source packages, most of them listed on the dbt package hub (hub.getdbt.com). The most widely used is dbt-utils.
Installation (packages.yml)
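A minimal packages.yml for pulling in dbt-utils could look like the following sketch. The version range shown is an assumption; check the package hub for a release that matches your dbt version.

```yaml
# packages.yml, placed at the project root next to dbt_project.yml
packages:
  - package: dbt-labs/dbt_utils
    # Assumed version range; pick one compatible with your dbt version
    version: [">=1.0.0", "<2.0.0"]
```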
Run dbt deps to install.
Usage Example: Surrogate Keys
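Instead of manually concatenating strings to create a primary key, use the dbt_utils.generate_surrogate_key macro (named surrogate_key before dbt-utils 1.0). It takes a list of column names and returns a hash of their values, for example {{ dbt_utils.generate_surrogate_key(['order_id', 'line_number']) }} with two hypothetical columns.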
The macro casts each column to a string and replaces nulls with a placeholder before hashing, so it handles nulls and data types consistently across different warehouses (BigQuery/Snowflake/DuckDB).
In the final post of this series, we will look at Deploying dbt Projects.