Tableau releases new versions (major releases: e.g. 2021.3) of its products four times per year. For each version, they then create maintenance releases (minor releases: e.g. 2021.3.1) for 24 months – the maintenance window. Key new features are well documented, but with each upgrade also come slight adjustments and small improvements.
While working with a client we stumbled on one of those small changes:
Storage used per site previously only showed storage used for ‘Live’ content. It now encompasses all content, including revisions.
How did we discover this? Keep reading and we’ll explain:
- How to connect to the Tableau Server repository
- How to see exactly how much space content uses per site, including a Custom SQL query for re-use
- Where exactly is content stored on Tableau Server’s hard disk?
The challenge
We recently helped a client upgrade their Tableau Server from version 2018.3 to 2021.2. Following Tableau’s recommended preparation steps, we had recorded information from the different admin pages. When reviewing the Sites Overview (Server Admin page: Manage All Sites) after the upgrade, something strange showed up:
Before the upgrade the Business site used 2.0 GB of storage:
The same page shows 7.3 GB usage after the upgrade.
Without any changes to content, the space usage for the Business Site had gone up from 2.0 to 7.3 GB. How is that possible? The Site Admin pages can give us more detailed information about space usage. Let’s look at the view Stats for Space Usage (available to Site Admins):
So as a Server Administrator we see 7.3 GB in use, but a Site Admin sees 1.96 GB in use. So what is going on? How much space are we using for content?
Enter the repository
Tableau Server stores its own data in the Tableau Server Repository database. This PostgreSQL database can be accessed like any other database. The repository holds information about content, users, usage and much more.
So how to use the repository to answer our question? Full instructions can be found here. In summary:
- Enable repository access. The TSM user can enable database access by generating a password. Use the following command in the CLI: tsm data-access repository-access enable –repository-username readonly –repository-password <PASSWORD>
- Connect to the repository. The default port is 8060:
Use the following SQL query:
SELECT
sites.id AS id,
sites.name AS site,
CustomSQLQuery.location as location,
CustomSQLQuery.sum AS size,
FROM
(
SELECT
SUM(size),
r.site_id,
'flow versions' as location
FROM flow_versions r
WHERE r.content_key IS NOT NULL
GROUP BY r.site_id
UNION
SELECT
SUM(size),
r.site_id,
'workbook versions' as location
FROM workbook_versions r
WHERE r.content_key IS NOT NULL
GROUP BY r.site_id
UNION
SELECT
SUM(size),
r.site_id,
'datasource versions' as location
FROM datasource_versions r
WHERE r.content_key IS NOT NULL
GROUP BY r.site_id
UNION
SELECT
SUM(size),
r.site_id,
'flows' as location
FROM flows r
GROUP BY r.site_id
UNION
SELECT
SUM(size),
r.site_id,
'workbooks' as location
FROM workbooks r
GROUP BY r.site_id
UNION
SELECT
SUM(document_file_size + packaged_files_size),
r.site_id,
'flow drafts' as location
FROM flow_drafts r
GROUP BY r.site_id
UNION
SELECT
SUM(size),
d.site_id,
'datasources' as location
FROM datasources d
WHERE d.parent_workbook_id IS NULL
GROUP BY d.site_id
) CustomSQLQuery
INNER JOIN public.sites sites ON
CustomSQLQuery.site_id = sites.id
Running this query in a SQL client resulted in the following:
Looking at the Business Site:
- workbooks + datasource = 1.9 GB. This is what the Site Admin page measures.
- Site total = 7.3 GB. This is what the Server admin page measures.
We ran this query using a SQL client called dBeaver, but the same query can be used right inside Tableau.
Conclusion
- Stats for Space Usage shows only current content.
- In the Sites Overview, previous versions of Tableau Server counted only current content
- In the Sites Overview, newer versions of Tableau Server count current content as well as revisions
Finally, where does content get stored on the hard disk?
Tableau Server content takes up space in two folders in ProgramDataTableauTableau Serverdatatabsvc
- dataengine – actual content
- pgsql – repository
Files are stored in a secure format, so they are not accessible from outside of the Tableau Server application.
In August this year Tableau announced an early end-of-maintenance for versions 2019.3 to 2020.3. These versions will not receive minor releases after November 11th. This announcement triggered some of our clients to accelerate their upgrade schedule, upgrading all their Tableau products to a newer version. And that’s a good thing! This way they make sure to be up-to-date on all security updates and users get to use the latest features including Relationships, Ask Data for Viewers and multiple map layers, just to name a few.
The above is just a quick example of something that came up during a Tableau Server upgrade project. After years of experience using our different partner tools, we still learn little things almost every day. Do let us know if this was interesting and if you’d like to see more content like this. And of course, feel free to get in touch regarding Tableau Server installation, maintenance, upgrades or training.