Build a Modern Data Stack with dbt Core, BigQuery and Power BI: A Practical Guide for Data Teams

The modern data stack has fundamentally changed how organizations handle data transformation, storage and visualization. Instead of relying on rigid ETL workflows or opaque data processes, today’s data teams are embracing flexible, cloud-based tools that emphasize transparency, version control and scalability.

Among the growing ecosystem of data tools, dbt Core, Google BigQuery and Power BI have emerged as a powerful combination. Together, they allow engineers to model, transform and visualize data in a way that is reproducible, auditable and built for growth.

At Datum Labs, we often implement this stack for clients looking to modernize their data infrastructure. In this article, we’ll show how the three tools fit together, why this approach works so well and how to set it up from development to production.

Why dbt Core, BigQuery and Power BI Form a Strong Foundation

The concept of a modern data stack revolves around decoupling ingestion, transformation and visualization so each layer can evolve independently. Instead of relying on monolithic data platforms, organizations now combine best-in-class tools for each stage of the data pipeline.

In this architecture, dbt Core sits at the center. It allows data engineers to transform and test data directly inside the data warehouse using familiar SQL. Because dbt is version-controlled and testable, it introduces engineering discipline to analytics workflows.

BigQuery, meanwhile, provides the horsepower. As a fully managed cloud data warehouse, it executes dbt transformations at scale, handling everything from small datasets to terabytes of analytical workloads without performance bottlenecks.

Finally, Power BI brings everything to life. It connects directly to BigQuery, allowing teams to build interactive dashboards on top of curated, dbt-modeled data. The result is an end-to-end workflow where the entire data journey from ingestion to visualization is traceable, efficient and consistent.

This combination gives data teams clarity and control. Transformations happen transparently, data models are versioned, and dashboards pull directly from a single source of truth.

Setting Up Your Modern Data Stack Environment

Before you can model or visualize anything, you need a stable local development environment that mirrors your cloud setup. This ensures your transformations can be tested locally before being deployed to production.

Start with a clean workspace using Python 3.11 (64-bit) and Visual Studio Code or any SQL-friendly IDE. Once ready, initialize a new project folder and create a virtual environment to isolate your dependencies.

py -3.11 -m venv .venv
./.venv/Scripts/activate
python -m pip install --upgrade pip setuptools wheel

With your environment activated, install dbt Core and the BigQuery adapter:

python -m pip install "dbt-bigquery<1.9"

Matching dbt-core and adapter versions is essential for compatibility. This ensures your project remains stable as new releases roll out.
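
If you want to confirm exactly what was installed, dbt reports both the core and adapter versions:

dbt --version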

This foundational setup is the start of your modern data stack: lightweight, modular and ready to scale.

Connect dbt to BigQuery for Cloud-Native Transformations

The next step is to authenticate dbt with Google BigQuery so it can run transformations directly inside your warehouse. dbt connects through Google Cloud’s OAuth flow using the Cloud SDK.

Authenticate with:

gcloud init
gcloud auth application-default login

Once authenticated, define your connection profile in a YAML file located at ~/.dbt/profiles.yml. This file tells dbt which project, dataset and authentication method to use.

my_profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: your_project_id
      dataset: analytics_dev
      location: europe-west1
      threads: 4

Run dbt debug to test the connection. If you see “All checks passed!”, your environment is successfully linked to BigQuery.

From here, dbt can compile and execute your SQL models natively in the cloud, leveraging BigQuery’s massive parallel processing and cost-efficient compute model.
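
From this point the development loop comes down to a couple of commands; for example:

dbt run    # compile and execute every model in the warehouse
dbt test   # run the data tests defined in the project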

Structuring a dbt Project for Scalability

A dbt project isn’t just a folder of SQL scripts. It’s a structured, layered framework for building transformations that scale.

Projects typically follow a three-tier architecture:

  1. Staging Layer: Cleans and standardizes raw data from source tables.
  2. Intermediate Layer: Applies business logic, joins, and transformations.
  3. Marts Layer: Curates final tables optimized for reporting and analytics.

Each layer represents a stage of data maturity, making it easier to debug, test, and document.

A simple dbt_project.yml might define these layers like this:

name: analytics_project
version: 1.0.0
profile: my_profile
model-paths: ["models"]

models:
  analytics_project:
    staging:
      +materialized: view
    intermediate:
      +materialized: table
    marts:
      +materialized: table

This hierarchy encourages modularity and consistency. Teams can work independently across layers without creating data silos or redundant logic.
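
On disk, these layers map to subfolders under models/. A minimal illustrative layout (the file names are hypothetical) might look like this:

models/
  staging/
    stg_orders.sql
  intermediate/
    int_customer_orders.sql
  marts/
    fct_customer_revenue.sql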

Transforming Data with dbt Models

At the heart of dbt are models, SQL files that define transformations in a declarative way. You describe what the data should look like and dbt takes care of compiling and executing it inside your warehouse.

A simple staging model might look like this:

select
  id as order_id,
  customer_id,
  cast(order_total as numeric) as order_value,
  created_at
from {{ source('raw', 'orders') }}
where order_status = 'completed'
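
For the source('raw', 'orders') reference to resolve, the raw table also needs to be declared in a sources file. A minimal sketch, assuming the ingested data lives in a BigQuery dataset called raw, might be:

# models/staging/sources.yml
version: 2

sources:
  - name: raw
    schema: raw          # BigQuery dataset holding the ingested tables
    tables:
      - name: orders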

Intermediate models then reference previous transformations using dbt’s ref() function, ensuring that dependencies are automatically tracked:

select
  customer_id,
  sum(order_value) as total_spent,
  count(order_id) as total_orders
from {{ ref('stg_orders') }}
group by customer_id

By chaining models together, dbt automatically builds a dependency graph, a visual lineage of how raw data transforms into analytics-ready outputs.

This process enforces good engineering practices: transformations are versioned, testable, and easy to reproduce.
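
Those tests are configured declaratively as well. A minimal schema.yml for the staging model above, reusing column names from that example, might add uniqueness and not-null checks:

# models/staging/schema.yml
version: 2

models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - not_null

Running dbt test then validates these expectations on every build.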

Documenting and Visualizing Data Lineage

One of the standout features of dbt Core is its built-in documentation and lineage capabilities. By running a simple command, you can generate interactive documentation that reveals every relationship between your models.

dbt docs generate
dbt docs serve

This opens a web interface showing model metadata, column definitions and a dynamic lineage graph. Teams can visually trace how data moves through each transformation step from raw ingestion to final dashboards.

This level of transparency is invaluable in enterprise environments. It eliminates the guesswork of “where did this number come from?” and builds trust in the data pipeline.

Power BI Turns Data into Insight

Once your dbt transformations are live in BigQuery, the next step is visualization. Power BI connects seamlessly to BigQuery, pulling curated data directly from your marts layer.

Within Power BI Desktop, go to Get Data → Google BigQuery, authenticate with your Google account and select your dataset. You can then explore your tables and build dashboards powered entirely by dbt-modeled data.

The benefit of this setup is consistency. Business logic defined in dbt is executed in BigQuery and reflected in Power BI — eliminating discrepancies between SQL scripts, spreadsheets and BI reports.

This alignment allows analysts to focus on insights instead of debugging mismatched metrics.

Version Control and Collaboration with GitHub

A modern data stack isn’t complete without version control. Treating SQL transformations like application code ensures traceability and collaboration.

Initialize a Git repository inside your dbt project and commit your configuration, models and documentation. Once connected to GitHub, you can enable continuous integration workflows to automatically test transformations and generate updated documentation on every pull request.
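
A minimal sequence, assuming a GitHub repository already exists for the project (the remote URL below is a placeholder), looks like this:

git init
git add .
git commit -m "Initial dbt project"
git remote add origin https://github.com/your-org/analytics_project.git
git push -u origin main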

This makes your data stack truly engineering-grade: reproducible, reviewable and auditable.

At Datum Labs, we encourage teams to apply the same DevOps mindset to data, because reproducibility and governance are just as critical in analytics as they are in software.

Scaling the Modern Data Stack for Production

Once your stack is operational, you can begin scaling by automating execution and monitoring performance. Tools like dbt Cloud, Apache Airflow or Prefect can orchestrate daily builds, run tests and alert teams to anomalies.

We also recommend separating environments (development, staging and production), each with its own BigQuery dataset. This isolation allows safe iteration without affecting live dashboards.
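
In practice this usually means adding extra targets to the profiles.yml shown earlier, each pointing at its own dataset. A sketch, with illustrative dataset names, might look like:

my_profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: your_project_id
      dataset: analytics_dev
      location: europe-west1
      threads: 4
    prod:
      type: bigquery
      method: oauth          # production jobs often switch to service-account auth
      project: your_project_id
      dataset: analytics_prod
      location: europe-west1
      threads: 4

Running dbt run --target prod then builds into the production dataset while day-to-day development stays in analytics_dev.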

As your stack grows, invest in data quality tests, source freshness checks and CI/CD pipelines. These features transform a working prototype into a reliable, enterprise-ready data platform.
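
Source freshness checks, for instance, can be declared directly on the sources block sketched earlier; the thresholds below are illustrative:

version: 2

sources:
  - name: raw
    schema: raw
    tables:
      - name: orders
        loaded_at_field: created_at      # timestamp column used to measure freshness
        freshness:
          warn_after: {count: 24, period: hour}
          error_after: {count: 48, period: hour}

dbt source freshness then reports how stale each source table is, which is straightforward to wire into an orchestrator's alerting.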

The Future of Analytics Engineering

The combination of dbt Core, BigQuery and Power BI is more than a convenient stack. It represents a cultural shift in how data is built, tested, and consumed.

With dbt, SQL becomes the language of data engineering. With BigQuery, scalability becomes effortless. And with Power BI, business users can explore insights from a single, trusted source of truth.

At Datum Labs, we see this stack as the foundation for a new era of analytics engineering, one where transparency, reproducibility and automation drive every decision.

Building a modern data stack isn’t about tools alone; it’s about adopting a mindset that treats data as a product, not a byproduct.
