Apr 24, 2025

dbt in real-time

Tinybird is kind of like dbt, but for real-time use cases. Here's how and why you might migrate real-time API use cases from dbt to Tinybird.
Javier Santana
Co-founder

If you're in the data world, or you were 10 years ago, you know that dbt really was a "game changer" (I hate that phrase and 99% of the time it's not true, but with dbt it was). dbt gave data engineers and analysts a better way to organize and process data in the warehouse. It started as a consultancy, and now it's a billion-dollar startup because so many data engineers reach for dbt to build transformations in the warehouse.

Tinybird is a lot like dbt, but for a totally different use case. dbt is for building batch analytics in the data warehouse. Tinybird is for building real-time analytics for applications.

This blog post should be useful for people already familiar with dbt who are exploring real-time analytics and/or low-latency API use cases, but it's also worth a read if you're just looking for a better way to keep your data projects well organized.

Why bother migrating from dbt to tb? 

Tinybird isn't just "dbt but real-time." Tinybird has a different philosophy and is built around a different core engine optimized for speed and freshness. Tinybird is an engineered system, not just engineered parts assembled into a system.

Some specific reasons you might want to migrate…

Built for real-time processing

dbt was designed mostly for batch processing. You can indeed run real-time workloads in dbt if the database you use under the hood supports it, but in Tinybird everything is designed to work for real time. There is also batch processing in Tinybird if you need it, but, to be honest, it's not as complete as dbt's (and it isn't meant to be).

APIs are first-class citizens

dbt models data for something else – a BI tool or another process. Building an API usually means adding another layer: a Python service (Flask/FastAPI), maybe another database cache, all querying the warehouse where dbt ran. More moving parts, more latency, more code to manage.

In Tinybird, pipes are APIs. Any SQL query (pipe node) can be published as a secure, parameterized, monitored REST endpoint with a single command (tb deploy). This radically simplifies building data-intensive applications or features.

Simplifies the stack

dbt is master of the "T" in ELT. You still need separate tools for ingestion (E), loading (L), orchestration (Airflow, Dagster, Prefect), API serving, and often specialized monitoring.

And, if your goal is fresh data powering fast APIs, the typical dbt stack (Kafka -> Flink/Spark -> Warehouse -> dbt -> API Framework -> Monitoring) is complex and expensive. 

Tinybird offers a potentially much leaner alternative; it handles ingestion (Connectors, API), real-time transformation (SQL pipes, materialized views), API publishing, and observability (service data sources) in one workflow, managed via the tb CLI and git. For certain use cases, this dramatically simplifies the stack.

Raw speed

In dbt, performance depends entirely on your data warehouse (Snowflake, BigQuery, Redshift, etc.). These are powerful tools, but they're often optimized for broader analytical workloads, not necessarily p99 millisecond API responses.

Tinybird is built on ClickHouse. ClickHouse is fast for the types of analytical queries (filtering, aggregating, time-series) that power dashboards and APIs, especially when data is structured correctly (sorting keys!).

Mapping dbt concepts to Tinybird: A new way of thinking

Migrating from dbt to Tinybird requires a mental shift. Here's a rough translation guide:

| dbt Concept | Tinybird Equivalent | Notes |
| --- | --- | --- |
| dbt Project | Tinybird Data Project | Git-managed folder with configuration files. |
| sources.yml | .datasource file | Defines schema, ClickHouse engine, partition/sort keys. Crucial for performance. Can include ingestion config (Kafka, API schema). |
| Model (.sql file) | Pipe (.pipe file) node | A SQL transformation. Pipes chain nodes. Think stg_*.sql -> intermediate_*.sql -> fct_*.sql maps to nodes in one or more .pipe files. |
| ref('model_name') | FROM pipe_name | Referencing upstream dependencies. |
| source('src', 'tbl') | FROM datasource_name | Referencing a base table defined in datasources/. |
| Materialization (table, incremental) | Materialized view (TYPE materialized in pipe) | Key concept. Processes data incrementally on ingest. Typically targets an AggregatingMergeTree. |
| Materialization (view) | Standard pipe node | Just a query definition, run on demand. |
| Materialization (ephemeral) | Intermediate pipe node | A node used by others but not directly queryable/materialized. |
| Jinja ({{ }}, {% %}) | Tinybird template functions ({{ }}, {% %}) | Similar syntax, different functions. Primarily used for API endpoint parameters, less for dynamic SQL generation than in dbt. |
| dbt tests | Tinybird tests (tb test, .yml) | Primarily focus on testing API endpoint responses. Data quality is often built into pipes. |
| dbt run, dbt build | tb deploy, materialized views, copy pipes | tb deploy pushes definitions. MVs update automatically. Copy pipes (TYPE copy) for scheduled batch runs/snapshots. |
| dbt DAG | Implicit via FROM clauses & MVs | Tinybird manages dependencies based on references. |
| Seeds | Fixtures (fixtures/), tb datasource append | Load static data locally with fixtures, or append via CLI/API. |

The biggest shift from dbt to Tinybird? Thinking about materialized views for anything incremental or aggregated, and designing data source schemas (especially ENGINE_SORTING_KEY) for query performance from the start.

Step-by-step migration strategy

Assume you have the tb CLI installed and logged in (tb login), and you've initialized a project (tb create --folder my_tb_project && cd my_tb_project).

Make sure you have Tinybird local running for testing: tb local start

1. Migrate sources -> .datasource

For each dbt source table needed, create a file like datasources/my_source_table.datasource.

Some notes:

  • Schema: Translate data types carefully. Tinybird uses ClickHouse types (e.g., String not VARCHAR, DateTime64 not TIMESTAMP). See Tinybird Data Types.
  • Engine & keys: This is critical. MergeTree is common. ReplacingMergeTree if you need updates based on a key. AggregatingMergeTree for MV targets. Choose ENGINE_PARTITION_KEY (often time-based like toYYYYMM(timestamp_col)) and ENGINE_SORTING_KEY based on common query filters. Don't skip this. Poor sorting keys kill performance. 
  • Ingestion config: If Tinybird will ingest data from a connected source (e.g., via Kafka), add the connector settings here. If the data source is populated by another pipe (or via the Events API / Data Sources API), you only need the schema and engine.

An example:

dbt
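A typical dbt source definition might look like this (the web/pageviews names are hypothetical):

```yaml
# models/sources.yml -- hypothetical dbt source definition
version: 2

sources:
  - name: web
    schema: raw
    tables:
      - name: pageviews
```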

Tinybird
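And a rough Tinybird equivalent, as a minimal sketch (columns, keys, and jsonpaths are illustrative; the jsonpaths assume ingestion via the Events API):

```
# datasources/pageviews.datasource -- hypothetical equivalent
SCHEMA >
    `timestamp` DateTime64(3) `json:$.timestamp`,
    `session_id` String `json:$.session_id`,
    `pathname` String `json:$.pathname`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "pathname, timestamp"
```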

2. Migrate models -> .pipe

Convert dbt .sql files into .pipe files (e.g., pipes/stg_pageviews.pipe).

Notes:

  • Basic Transformations: A dbt model often becomes a node in a .pipe. Use FROM previous_node or FROM datasource_name or FROM other_pipe.
  • SQL Dialect: Common changes, depending on your current database provider:
    1. Date functions: toDate, toStartOfDay, addMinutes, etc.
    2. JSON: JSONExtractString, JSONExtractInt, etc.
    3. String functions might differ slightly.
    4. Check the SQL Reference. You will spend time here.
  • Materialized views (the incremental magic): If your dbt model is incremental, use a Tinybird materialized view.
    1. Define a target .datasource (e.g., datasources/user_daily_summary.datasource) with an appropriate engine (AggregatingMergeTree for sums/counts, ReplacingMergeTree for latest state). Schema should include aggregate state columns (e.g., AggregateFunction(sum), AggregateFunction(uniq)).
    2. Create a .pipe file (e.g., materializations/mv_user_daily_summary.pipe) containing the transformation SQL. Use aggregate state functions (sumState, uniqState, argMaxState).
    3. Add TYPE materialized and DATASOURCE target_datasource_name to the final node of this pipe.
  • Copies: If you use a pre-aggregated table in dbt (materialized='table'), you should use copy pipes in Tinybird.
    1. Define a target .datasource (e.g., datasources/user_daily_summary.datasource) with an appropriate engine (MergeTree, ReplacingMergeTree...) 
    2. Create a .pipe file (e.g., copies/daily_summary.pipe) containing the transformation SQL. 
    3. Add TYPE copy and TARGET_DATASOURCE target_datasource_name to the final node of this pipe (copy pipes use TARGET_DATASOURCE rather than the DATASOURCE keyword used by materialized pipes).
    4. Optionally set a schedule (COPY_SCHEDULE) and COPY_MODE (append or replace).

Example:

dbt
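A simple staging model (hypothetical names throughout):

```sql
-- models/stg_pageviews.sql (dbt)
select
    cast(timestamp as date) as day,
    session_id,
    pathname
from {{ source('web', 'pageviews') }}
where pathname != ''
```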

Tinybird
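A sketch of the equivalent pipe, assuming the pageviews data source from step 1:

```
# pipes/stg_pageviews.pipe
NODE stg_pageviews
SQL >
    SELECT
        toDate(timestamp) AS day,
        session_id,
        pathname
    FROM pageviews
    WHERE pathname != ''
```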

dbt (Incremental concept)
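An incremental dbt model computing daily pageviews per session (again, illustrative):

```sql
-- models/user_daily_summary.sql (dbt)
{{ config(materialized='incremental', unique_key=['day', 'session_id']) }}

select
    cast(timestamp as date) as day,
    session_id,
    count(*) as pageviews
from {{ source('web', 'pageviews') }}
{% if is_incremental() %}
where cast(timestamp as date) >= (select max(day) from {{ this }})
{% endif %}
group by 1, 2
```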

Tinybird (Materialized view approach)

Target datasource:
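A minimal sketch of the target, storing aggregate states:

```
# datasources/user_daily_summary.datasource
SCHEMA >
    `day` Date,
    `session_id` String,
    `pageviews` AggregateFunction(count)

ENGINE "AggregatingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(day)"
ENGINE_SORTING_KEY "day, session_id"
```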

Materializing pipe:
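The pipe that populates it on every ingest, using state functions:

```
# materializations/mv_user_daily_summary.pipe
NODE mv_user_daily_summary
SQL >
    SELECT
        toDate(timestamp) AS day,
        session_id,
        countState() AS pageviews
    FROM pageviews
    GROUP BY day, session_id

TYPE materialized
DATASOURCE user_daily_summary
```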

Querying the MV:
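Reading from the MV requires merging the aggregate states:

```sql
SELECT
    day,
    session_id,
    countMerge(pageviews) AS pageviews
FROM user_daily_summary
GROUP BY day, session_id
```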

3. Publish APIs -> TYPE endpoint

This is often the goal. Make the final node of your query pipe an endpoint:

  • Add TYPE endpoint.
  • Define URL parameters using template functions like {{ String(param_name, default_value) }}, {{ Int32(param_name, default_value) }}, or {{ Date(param_name, default_value) }}, as in the sketch below.
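A minimal endpoint pipe (pipe and parameter names are hypothetical; the % on the first line of the SQL enables templating):

```
# endpoints/top_pages.pipe
NODE top_pages
SQL >
    %
    SELECT pathname, count() AS views
    FROM pageviews
    WHERE toDate(timestamp) >= {{ Date(start_date, '2025-01-01') }}
    GROUP BY pathname
    ORDER BY views DESC
    LIMIT {{ Int32(limit, 10) }}

TYPE endpoint
```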

Deploy (tb --cloud deploy) and your API is live.
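Then you call it like any REST API (the host and token depend on your workspace and region):

```bash
curl "https://api.tinybird.co/v0/pipes/top_pages.json?start_date=2025-04-01&limit=5" \
  -H "Authorization: Bearer $TB_TOKEN"
```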

4. Migrate tests -> tb test

Translate dbt tests to Tinybird tests:

  • Endpoint tests (most common): If your pipe ends in TYPE endpoint, use tb test create <pipe_name> to create a .yml test file in tests/. Run the endpoint with parameters (e.g., via curl or tb endpoint) and use tb test update <pipe_name> to capture the output as the expected result (see the sketch after this list). See Test Files.
  • Data quality checks: Often embedded directly in the pipe logic. Use throwIf(count() > 0) in a node, or create specific nodes to filter/flag bad data. You can also create dedicated .pipe files that run checks and assert results in a test.
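Roughly what a test file looks like (this shape is a hand-written sketch; let tb test create and tb test update generate the real structure and expected output):

```yaml
# tests/top_pages.yml -- hypothetical, generated/updated via tb test
- name: top_pages_default
  description: Top pages for a fixed window
  parameters: start_date=2025-04-01&limit=5
  expected_result: |
    {"pathname":"/pricing","views":42}
    {"pathname":"/blog","views":17}
```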

5. Orchestration -> MVs, copy pipes, deployment

  • Deployment: tb deploy pushes the definitions to Tinybird.
  • Real-time: Materialized views handle incremental updates automatically. No external scheduler needed for this continuous flow.
  • Scheduled batch: For jobs that should run periodically (like dbt runs or snapshots), use copy pipes. Add TYPE copy and COPY_SCHEDULE 'cron syntax' (e.g., '0 * * * *' for hourly) to a pipe node. See Copy Pipes and the sketch after this list.
  • External triggers: Need more complex logic? Trigger a Tinybird job (an on-demand copy pipe) via its API from Airflow, GitHub Actions, Trigger.dev, etc.
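An hourly snapshot copy pipe might look like this (the target data source name is illustrative):

```
# copies/daily_summary.pipe
NODE daily_summary_copy
SQL >
    SELECT
        toDate(timestamp) AS day,
        count() AS pageviews
    FROM pageviews
    GROUP BY day

TYPE copy
TARGET_DATASOURCE daily_summary
COPY_SCHEDULE 0 * * * *
COPY_MODE replace
```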

Potential pitfalls 

  • SQL dialect hell: Budget time for translating functions, especially complex date logic, array/JSON manipulation, or window functions (ClickHouse support is good, but syntax differs). Test thoroughly.
  • Materialized view mindset: Thinking incrementally is key. Designing the MV target schema (AggregatingMergeTree, states) and the transformation logic takes practice. Debugging MVs can be trickier than batch jobs.
  • Sorting key design: Forgetting to define or choosing poor ENGINE_SORTING_KEY in your .datasource files will lead to slow queries, especially as data grows. This is more a database thing than a framework one, but it’s important to take it into account.
  • Complexity creep in pipes: While pipes allow chaining SQL nodes, overly complex, multi-hundred-line pipes become hard to debug and manage. Break things down logically.

Monitoring is a little bit different

Forget just checking if the dbt run succeeded. In Tinybird, you need to monitor the flow continuously:

  • datasources_ops_log: Monitor ingestion rates, errors for API/Kafka sources.
  • pipe_stats_rt: Check endpoint latency (p50, p95, p99), rows read, errors. Essential for API performance.
  • jobs_log: Monitor scheduled Copy Pipe runs.

Learn to query these service data sources (FROM tinybird.ds_name) and create endpoints (Prometheus format is especially useful here). They are your eyes and ears.
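For example, a quick latency check against pipe_stats_rt might look like this (a sketch; check the service data source schemas in the docs for the exact column names):

```sql
SELECT
    pipe_name,
    quantile(0.95)(duration) AS p95_duration_s,
    sum(read_rows) AS rows_read
FROM tinybird.pipe_stats_rt
WHERE start_datetime > now() - INTERVAL 1 HOUR
GROUP BY pipe_name
ORDER BY p95_duration_s DESC
```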

Final thoughts

Migrating from dbt to Tinybird isn't a simple lift-and-shift. It involves rethinking data flow for real-time and API-centric use cases, learning the SQL nuances, and embracing materialized views.

But if you have real-time needs, and you want to have everything in the same place, Tinybird is a good alternative/complement to dbt.
