Python is a powerful coding language for data analysis, thanks to libraries like numpy, pandas, and now polars.
There are also some great Python libraries for data visualization, for example, matplotlib, plotly, highcharts, and seaborn. As such, Python has been widely adopted in data science, machine learning, and data engineering as a go-to language for data analysis and data visualization.
While Python is good for data analysis and basic data visualizations thanks to its many libraries, you still might want to be able to build real-time dashboards in Python that refresh as new data is created while also handling a greater level of scale than a single machine can support.
In this tutorial, I'll show you how to build a real-time dashboard in Python using Tinybird, a serverless real-time data analytics platform, and Dash, an open source app-building framework for Python.
Here's what the final product looks like:
We'll use Tinybird to capture streaming data, transform it with SQL, and publish transformations as REST APIs. Utilizing the community-developed Verdin Tinybird SDK for Python, we'll consume data from Tinybird APIs in Python and visualize them in real time with Dash, adding interactive filtering to our real-time dashboard using Dash input components and Tinybird query parameters.
Step-by-step: Build a real-time Python dashboard
Follow this step-by-step tutorial to build a real-time dashboard in Python with Tinybird and Dash. If you get stuck along the way, all of the final code is in this GitHub repository, which you can use for reference. You can also chat with Tinybird engineers in our Slack community if you need help.
Prerequisites
First, let's get the prereqs out of the way. We're going to install dash and its dependencies, verdin, and the Tinybird CLI.
Go to tinybird.co/signup and get a free Tinybird account. You'll be taken to the Tinybird UI, where you can create your first Workspace. You can call the Workspace whatever you'd like, but I'll be calling it python_dashboard to align with my project directory.
Once you've created your Workspace, copy your user admin token (the one with your email address) from the "Tokens" section of your Workspace.
Then, create a subdirectory in your project called data-project and authenticate to your Tinybird Workspace:
Authenticating creates a .tinyb file in that directory (which you already added to your .gitignore).
Step 2: Ingest data into Tinybird
In a typical Dash app, you'd load the entire data set into memory as a pandas DataFrame, perhaps from a CSV file. Then, you'd build metrics for your graphs by querying the DataFrame using the pandas library to get the results you want to display. This can become computationally expensive, however, as the data set gets larger.
So, for our real-time dashboard, we're going to offload all of the real-time data processing to Tinybird, freeing up memory for our Dash app so it can be as fast as possible.
To start, we need to begin ingesting a data stream into Tinybird.
In this tutorial, I'm using the open source Mockingbird CLI to generate some mock data streams and send them to Tinybird using its HTTP streaming API.
If you want to follow along, you can install the Mockingbird CLI with:
Of course, there are tons of ways to get data into Tinybird. Check out the "Ingest" portion of the Tinybird documentation for more information on real-time data ingestion. Tinybird supports ingestion from streaming data sources like Kafka and Confluent, data warehouses like Snowflake and Google BigQuery, object storage like Amazon S3, and even good ol' fashioned CSV files.
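For instance, you could post events yourself over HTTP instead of using a generator. Here is a minimal sketch, assuming the Events API route /v0/events on the api.tinybird.co host (other regions use a different hostname) and newline-delimited JSON as the body. The event fields and values are hypothetical, apart from the airline and meal_choice column names used later in this tutorial. The sketch only builds the request; it doesn't send anything.

```python
import json
import urllib.parse
import urllib.request

# Assumption: EU region host. Adjust for your Workspace's region.
TINYBIRD_HOST = "https://api.tinybird.co"


def build_events_request(datasource, events, token, host=TINYBIRD_HOST):
    """Build (but do not send) a POST request carrying events as NDJSON."""
    query = urllib.parse.urlencode({"name": datasource})
    # NDJSON: one JSON object per line.
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/v0/events?{query}",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical events, for illustration only.
sample_events = [
    {"airline": "Example Air", "meal_choice": "vegetarian"},
    {"airline": "Sample Jet", "meal_choice": "vegan"},
]
request = build_events_request("flight_bookings", sample_events, token="demo_token")
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before any data leaves your machine.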
In this example, I'll be using the Flight Bookings template from Mockingbird, which has the following simplified schema:
To start sending data to Tinybird, run the following in your terminal:
You should see a successful response sending 10 rows of mock data to Tinybird every second. You can continue to let this run in the background as you develop your dashboard.
If you navigate to the Tinybird UI, you will see the flight_bookings Data Source has been automatically created with data streaming in.
You now have data in Tinybird! On to Step 3.
Step 3: Create dashboard metrics with SQL
In Tinybird, you use Pipes to transform your data in real time using SQL. With Pipes, you can split up complex SQL queries into chained, composable Nodes. Then, you can add dynamic query parameters using templating and publish your metrics as scalable, secured REST APIs.
In Tinybird, you create real-time data APIs with nothing but SQL.
We'll use these APIs to get the data we want to visualize.
Again, the benefit here is that we don't have to load our data set onto our machine as a DataFrame, and we don't even have to query it locally. Instead, we'll simply get the results from the Tinybird APIs as JSON, and load those (much smaller) results into pandas DataFrames, solely to visualize them with Dash.
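To make that concrete, here is a small sketch of pulling the result rows out of an API response. It assumes the JSON body carries column metadata under "meta" and result rows under "data"; the sample payload is hypothetical.

```python
import json

# Hypothetical response body, shaped like a Tinybird JSON endpoint response.
response_text = """
{
  "meta": [{"name": "airline", "type": "String"}],
  "data": [{"airline": "Example Air"}, {"airline": "Sample Jet"}],
  "rows": 2
}
"""


def rows_from_response(text):
    """Return the list of result rows (one dict per row) from a response body."""
    payload = json.loads(text)
    return payload.get("data", [])


def columns_from_response(text):
    """Return the column names listed in the response metadata."""
    payload = json.loads(text)
    return [column["name"] for column in payload.get("meta", [])]


rows = rows_from_response(response_text)
column_names = columns_from_response(response_text)
# With pandas installed, pd.DataFrame(rows) would produce the DataFrame to plot.
```

Only the aggregated rows cross the network, so the DataFrame you build from them stays small no matter how large the underlying Data Source grows.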
In the GitHub repo for this project, you'll find six (6) .pipe files that represent the SQL transformations we'll do to get the metrics we want to visualize. Two of these Pipes (airlines.pipe and meals.pipe) define helper APIs that we'll use to dynamically populate options for our dropdown filters, and the remaining four provide data for our visualizations.
Create your first API
Let's start by creating the Pipe that gets the full list of airlines in our data set. From the flight_bookings Data Source page, click "Create Pipe".
Name the Pipe airlines. Then, in the first Node, paste the following SQL:
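A query that returns unique airline names typically has the shape SELECT DISTINCT airline FROM flight_bookings. The sketch below is an assumption about that shape, not the repository's exact SQL. It runs the query against a hypothetical in-memory SQLite table purely to show the result; Tinybird itself runs on ClickHouse, so you would paste only the SQL string into the Node.

```python
import sqlite3

# Assumed query shape; the ORDER BY is added here for a stable result.
AIRLINES_SQL = "SELECT DISTINCT airline FROM flight_bookings ORDER BY airline"

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE flight_bookings (airline TEXT, meal_choice TEXT)")
# Hypothetical rows, for illustration only.
connection.executemany(
    "INSERT INTO flight_bookings VALUES (?, ?)",
    [
        ("Sample Jet", "vegan"),
        ("Example Air", "vegetarian"),
        ("Sample Jet", "vegetarian"),
    ],
)
# Each result row is a one-element tuple; keep just the airline name.
airlines = [row[0] for row in connection.execute(AIRLINES_SQL)]
connection.close()
```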
This will return a list of unique airline names in the data set. Now, let's publish it as an API. Name the Node endpoint, then click the Node Actions and select "Create API Endpoint":
You'll be showered in 🎉 confetti 🎉 as you revel in the automatic generation of your first real-time data API!
Create the remaining APIs
Now that you know how to create an SQL-based API in Tinybird, let's speed through things and generate the remaining five. From the GitHub repository, download the six .pipe files and add them to a pipes subfolder in the data-project subdirectory of your project.
Your project directory should look like this:
From the data-project directory, run the following command in your terminal:
This will instantly create five additional API Endpoints that we'll use for our real-time dashboard. As new data streams into Tinybird, these APIs will update in real time with the freshest analytics.
A note about query parameters
In your Pipe definitions, you may have noticed some curly braces, like in this first node from the flight_bookings_by_minute Pipe:
Tinybird's templating language allows you to add dynamic query parameters with advanced logic to your APIs. In the node above, which is the first node in the flight_bookings_by_minute Pipe, you'll find 3 query parameters:
- lookback: A UInt16 that filters out data that is more than <lookback> hours old
- airlines: An array of airline Strings that allows filtering by one or more airlines when defined
- meal_preference: A String that filters by selected meal_choice when defined
We'll use these query parameters in our Dash app to create dynamic filtering in the real-time dashboard.
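To see how these parameters reach the API, here is a minimal sketch that builds an endpoint URL by hand. It assumes the /v0/pipes/<pipe>.json route on the api.tinybird.co host and that array parameters are sent as a comma-separated list. The helper name build_pipe_url and the token value are hypothetical. In the app itself, the Verdin SDK handles this for you.

```python
from urllib.parse import urlencode

# Assumption: EU region host. Adjust for your Workspace's region.
TINYBIRD_HOST = "https://api.tinybird.co"


def build_pipe_url(pipe, token, lookback=None, airlines=None, meal_preference=None):
    """Build the endpoint URL, including only the parameters that are set."""
    params = {"token": token}
    if lookback is not None:
        params["lookback"] = int(lookback)
    if airlines:
        # Array parameters are sent as one comma-separated value.
        params["airlines"] = ",".join(airlines)
    if meal_preference:
        params["meal_preference"] = meal_preference
    return f"{TINYBIRD_HOST}/v0/pipes/{pipe}.json?{urlencode(params)}"


url = build_pipe_url(
    "flight_bookings_by_minute",
    token="demo_token",
    lookback=6,
    airlines=["Example Air", "Sample Jet"],
)
```

Leaving a parameter out of the URL entirely is what lets the "when defined" branches in the template skip that filter.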
With these API Endpoints created, you're ready to move on to Step 4.
Step 4: Create your Dash app
Now that you have your data APIs, you can start building your dashboard. Our dashboard is going to have a time series line chart, a donut chart, and two bar charts. To get started, let's lay out the structure for your Dash app.
You can find the complete app.py file in the project GitHub repo. If at any point you're confused about where to add code in this file, refer to the repository.
From your project root, create a file called app.py and add the following code:
This imports all the libraries you need, creates your Tinybird client, and initializes your Dash app layout. Now, let's add some charts!
Set up Tinybird API definitions using Verdin
Since we're using the Verdin SDK, it's pretty easy to create Tinybird API definitions. Let's start by defining the airlines and meals APIs that we'll use to populate our dropdown options.
Add the following code below where you initialized your Tinybird Client (outside of the app layout):
Then, go ahead and add the remaining API definitions (the ones you'll use for your charts) below:
With your APIs defined in code, you're ready to start building the real-time dashboard components with Dash.
Create a title and div for inputs
Let's add a title to the Dash app and a div to hold our input fields. Add the following to your app layout:
If you run python app.py and open your web browser to 127.0.0.1:8050, you should see something like this:
Let's add some dropdowns. Here's what the dropdown definition looks like to allow people to select an airline:
You can see we are using the airline column from our airline_options DataFrame to populate the dropdown options, multi-select is enabled, and we've added a placeholder and some CSS styling. You'll want to add this in the children list parameter for the encompassing html.Div.
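Dropdown options can also be given as a list of dictionaries with label and value keys, which is handy when the display text should differ from the value sent to the API. Here is a minimal sketch of that conversion, using hypothetical rows shaped like the "data" list of an API response; the helper name to_dropdown_options is an assumption.

```python
def to_dropdown_options(rows, column):
    """Turn API result rows into label/value pairs for a dropdown."""
    seen = set()
    options = []
    for row in rows:
        value = row[column]
        if value not in seen:  # keep the first occurrence, preserve order
            seen.add(value)
            options.append({"label": value, "value": value})
    return options


# Hypothetical rows, for illustration only.
airline_rows = [
    {"airline": "Example Air"},
    {"airline": "Sample Jet"},
    {"airline": "Example Air"},
]
airline_dropdown_options = to_dropdown_options(airline_rows, "airline")
```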
Let's go ahead and add another dropdown for meal selection:
If you save and run your app, this is what you'll see:
Let's add one more input to dynamically change the lookback period, along with some labels:
And now, your Dash app looks like this:
Add your first chart
The first chart we'll add will be a line chart, showing the total number of bookings per minute grouped by airline over the lookback period. This uses the flight_bookings_by_minute API we created in Tinybird.
Add the following code below the div containing your inputs:
This defines a Dash Graph component that we're calling bookings-chart. Now, we need to add a callback that updates this chart.
Below your app layout definition, add the following decorator:
This will call the function that follows and update the bookings-chart any time an input filter is updated. Below the callback decorator, add the function to update the line chart:
This function passes the selected airlines, meal preference, and lookback from the inputs to the Tinybird API, then creates a DataFrame from the data in the JSON response, and returns a Plotly Express line chart using that DataFrame.
Save your app.py and run it. You'll see that you have data!
As you make selections from the dropdowns or update the lookback period, your Dash app is making requests to your Tinybird API to get the freshest possible data at very low latency.
Add a refresh interval
Of course, we want this to be a real-time dashboard, so it should refresh periodically as new data arrives. How often it refreshes is up to you. Just keep in mind that every refresh makes additional Tinybird API requests, and you get 1,000 daily requests on the free plan, so I recommend something like a 60-second interval for starters.
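A quick back-of-the-envelope calculation helps when picking an interval. The sketch below assumes each chart makes one API request per refresh and counts a single open browser session. The four-chart figure matches the dashboard built in this tutorial; the session lengths are hypothetical.

```python
def daily_requests(interval_seconds, charts=1, hours_open=24):
    """Estimate API requests per day for one open dashboard session."""
    refreshes = (hours_open * 3600) // interval_seconds
    return refreshes * charts


one_chart_all_day = daily_requests(60)                             # 1440
four_charts_workday = daily_requests(60, charts=4, hours_open=8)   # 1920
four_charts_slower = daily_requests(300, charts=4, hours_open=8)   # 384
```

Depending on how long the dashboard stays open and how many charts it has, a longer interval may be needed to stay within a daily request limit.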
To add a refresh interval, first add an Interval component to your Dash app definition by adding the following code below your Graph definition:
Note that the interval is in milliseconds. Setting n_intervals=0 starts the component's counter of elapsed intervals at zero; by default, the Interval component keeps firing indefinitely.
Then, add the interval to your callback inputs:
Finally, add an additional parameter called n to your callback function so it doesn't get angry (you're not actually using this parameter, it's just needed so the function matches the decorator).
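The reason is that Dash calls the function with one positional argument per Input, in the order they appear in the decorator. The plain-Python sketch below mimics that call without Dash; the function names and input values are hypothetical.

```python
def update_chart_without_n(airlines, meal_preference, lookback):
    return "figure"


def update_chart(airlines, meal_preference, lookback, n):
    # n (the interval counter) is accepted but deliberately unused.
    return "figure"


# Four Inputs in the decorator means four positional arguments in the call.
input_values = (["Example Air"], None, 6, 3)

try:
    update_chart_without_n(*input_values)
    mismatch_error = None
except TypeError as error:
    # Raised because the function takes 3 parameters but received 4 values.
    mismatch_error = str(error)

result = update_chart(*input_values)
```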
Now, your dashboard will automatically refresh every minute.
Add the remaining components
Let's round out this dashboard with three additional charts. Add the following code below the Graph and before the Interval you defined previously in the app layout:
This will create a nice flex div that will lay out the remaining three charts below your time series graph.
Then, we need to add callbacks to update those charts using the same filters and interval that we set up for the time series chart. For example, the age distribution chart:
Go ahead and copy the remaining two from lines 147-200 in app.py in the repo. When that's done, save your app.py and run it.
Congratulations, you've just created a real-time dashboard in Python with Tinybird and Dash.
Next steps
Tinybird and Dash form a powerful duo, making it possible to create and deploy real-time Python data apps on the web with relatively little code.
Bear in mind that this is an elementary example, both in terms of data pipelines and data visualization. There are many ways to further optimize your Tinybird APIs, such as using proper indexing and creating Materialized Views that calculate important aggregates at ingestion time instead of at query time.
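The difference between aggregating at ingestion time and at query time can be illustrated with a small plain-Python analogy. This is not how Tinybird implements Materialized Views; it only shows the trade-off, using hypothetical events.

```python
from collections import Counter

raw_events = []                   # full event log
bookings_per_airline = Counter()  # aggregate maintained at ingestion time


def ingest(event):
    """Store the event and update the running aggregate in one step."""
    raw_events.append(event)
    bookings_per_airline[event["airline"]] += 1


def count_at_query_time(airline):
    """Query-time aggregation: scans every stored event on each call."""
    return sum(1 for event in raw_events if event["airline"] == airline)


for name in ["Example Air", "Sample Jet", "Example Air"]:
    ingest({"airline": name})

precomputed = bookings_per_airline["Example Air"]  # single lookup
scanned = count_at_query_time("Example Air")       # full scan, same answer
```

Both approaches return the same count, but the precomputed one does a little extra work per event so that each query stays cheap as the data grows.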
If you'd like to learn more about Dash, check out their documentation.
If you'd like to learn more about Tinybird, you can visit the website, check out the docs, or watch our YouTube screencasts. If you're ready to start building with Tinybird, go ahead and sign up for a free account if you haven't already. If you have questions or get stuck while building, then join our Slack community.
If you're interested in understanding more about Tinybird and the features and pricing of Pro or Enterprise accounts, please request a demo. We'd be happy to talk through your use case and show you how Tinybird can help you unify your data sources, build real-time data products with nothing but SQL, and empower your organization to build with data.
Frequently asked questions
Here are answers to some questions commonly asked about this tutorial and building real-time dashboards with Tinybird.
Why Tinybird instead of a library?
You might be wondering, "Why use Tinybird instead of a library like polars to process data?" While polars represents a big advancement for in-process data analysis using Python, Tinybird has some benefits that certain users might appreciate, namely:
- Simple API abstractions for data ingestion from a variety of sources, including streaming data platforms like Apache Kafka or Confluent, data warehouses like BigQuery and Snowflake, object storage systems like Amazon S3, or even CSV files. Regardless of the source, capturing real-time data in Tinybird is very simple.
- Full OLAP database. Tinybird supports complex queries at scale thanks to its underpinning real-time database (ClickHouse) and serverless model. Because of this, Tinybird can potentially handle larger volumes of data much more quickly, as it won't be constrained by the RAM available on a single machine. Note that polars does offer mechanisms for processing data larger than your RAM, though this comes with some performance implications.
- API flexibility. Tinybird allows you to publish real-time data pipelines as REST APIs. These can be visualized with Python libraries, but they can also support many other use cases such as real-time website personalization, real-time JavaScript-based web dashboards, operational intelligence dashboards, etc. The ability to access query results in real time as a REST API gives Tinybird some flexibility over alternatives.
- It's free for simple projects. Tinybird isn't an open source library, but it is free to sign up with no time restriction and generous free tier limits.
Certainly, tools like pandas, polars, or even DuckDB can be used for processing and visualizing data with Python, but in this tutorial, we use Tinybird thanks to its simplicity, scalability, and flexibility.
Why Dash instead of matplotlib, seaborn, Bokeh, Highcharts, or Streamlit?
Certainly, any of these libraries would work well for Python data visualization, but we like Dash for a few reasons:
- Easy to set up a simple web dashboard. Dash is probably the simplest tool for setting up and styling a web dashboard with Python. As Dash is based on Flask, a popular Python web framework, it includes functionality that makes deploying our dashboard to the web that much easier.
- It leverages the pandas library. It's very easy to structure data for visualization with Dash using pandas DataFrames. In our tutorial, we capture JSON data from Tinybird APIs and normalize it into a pandas DataFrame to be visualized with Dash. This all happens in just a few lines of code.
- It's free! Dash Core is a free, open source library provided by Plotly. If you want to scale up and access new features, you can upgrade to Dash Enterprise.
Of course, you can use just about any Python data visualization library with Tinybird! Choose what works best for you.
Does Tinybird cache query responses?
No, and for a good reason. While using a cache can certainly improve frontend performance and is useful in many cases, Tinybird is a real-time platform designed to create low-latency analytics APIs over the freshest possible data. Most Tinybird customers capture thousands to millions of new events per second, and they want to visualize the results of their Pipes with all of that new data as quickly as possible.
How fast do Tinybird APIs respond?
If you go to an API page in the Tinybird UI, you can see stats for average query latency. Of course, the latency of your queries depends on how much data those APIs process, but generally speaking most Tinybird APIs can be optimized to respond in 50ms or less.
How much does Tinybird cost?
Tinybird is free for small projects, with no credit card required and no time limit. Once you exceed the free limits of 10 GB of storage or 1,000 API requests per day, you can upgrade to Pro or Enterprise plans. To learn more about features and pricing for these plans, check out the Tinybird Pricing page.