Good morning, campers. Lets get right into the continuing story of the Tableau XML files. By now you should:
- Have a basic understanding around the use of these files
- Know how to request them
Now, we’ll dive into the information contained in each one. We’re only covering the “most popular”. There are many more - up to you to find them!
First, some expectation setting: In some cases, you’re going to see that a particular file might not contain everything you want it to.
Indeed, you might just be tempted to ask for some additional fields to make your life easier. Please keep in mind dear reader; these files were not designed for you nor are they intended to be consumed by you. So, you’ll eat what you’re served and like it.
data_connections.xml lists, you guessed it…the data connections inside all your workbooks. To be clear, these are NOT data sources themselves, but (pretty much) the connection strings inside your data sources.
As you can see, there’s some interesting stuff here. The file specifies our data source type (msolap = SQL Server Analysis Services), the server name (“sql”), as well as the internal and friendly name (“name” and “caption”) of the data source. If a username were specified for this data connection, we’d get that, too.
Here’s the same item directly in the application server:
Almost forgot to mention this…note the embedded <owner> element in the XML file definition of the connection. It is pointing to the Datasource that this connection is a part of.
That said, lets look at the Data Source:
All sorts of fun stuff here!
I’ve expanded the information for two data sources above. Pay close attention to the second expanded data source – it’s the parent of the connection we were just looking at – note the matching id (52).
- Id: internal id of the data source
- Name: Name of the data source
- Repository-url: The URL of the object if one were to retrieve it directly
- Owner_id: The site-specific user id of the person who published and owns this data source
- Tasks: Tasks / schedules associated with this data source
On to groups!
This file is pretty cut and dry. We’re going to get a group name, it’s domain and an ID. The domain will read “local” when we’re in standard mode, or will have an actual AD Domain if we’re running in AD mode. Nothing else to see here, move along.
Projects? Not super exciting, either.
Below you’ll see pretty much what you expect:
- The name and ID of the project
- When it was created and if/when it has been modified
- Owner information
Good stuff here.
You can get at pretty much everything you might need to know about your schedules, including the last time one ran (updated-at) and the next time it will run (run-next-at). The one thing you don’t get “out of the box” is whether or not the last execution succeeded or failed – that would be nice.
There is worthwhile information inside sites.xml, as well.
To me, the main standout fields are user_quota and storage quota. Note that the SkunkWorks site is limited to 10 users and 500 GB of storage.
What is the content_admin_mode element telling us about? It represents this part of the Tableau Server UI:
- 1 = Only system admin can add users
- 2 – Both system and site administrators can
Below, you can see the subscription id and subject, as well as information about the user who subscribed. The schedule associated with this subscription is also a part of the hierarchy.
Task information encompasses:
- The id of the task
- What it does (type)
- What object the task acts on (id 701, a workbook named “Google Analytics”
- When the action will occur (schedule)
Users.xml is another file which is content-rich.
- id: the site-specific user id of this person
- luid: ignore. A more unique way to express the user id not useful in the “corp” product
- name: the username in domain\user format
- login_at: last login
- email: duh.
- licensing_level: Interactor, Viewer, or Unlicensed
- administrator: true or false
- admin_type: <empty>, system, or content
- publisher: true or false
- raw_data_suppressor: used by other products, always returns false here.
If the specific user you’re dealing with owns any views, you’ll see them in the views collection of your user. In fact, if someone has added a tag to a view, that’ll show up, too (we won’t cover that yet, though).
This is one of my favorites. A query against /users/<username>.xml will return basic information about the user and a list of workbooks, views and data sources they own.
For example, grab the file for Barney Rubble (username=rubble):
…which results in:
Using this file is a great way to very quickly understand if this user is easily “delete-able” from Tableau Server, or if you need to unlicense them (since they own stuff) instead.
The output of views.xml is arguably the most complex text you’ll work with. There’s tons going on. First, request it:
Note at the top of the file you’re told how many views we’ll be dealing with:
Open one of these elements, and behold the goodness!
As you can see, if a view has tags associated with it, an extra <tags> collection element is added to the view node:
In the sample above you can see someone added a very boring tag (literally, it IS boring. the text is “tag”) in a specific site at a specific point in time. If this view had multiple tags, they’d show up as individual tag elements inside the tags collection.
Interested in the tags created by a user? Grab the /tags.xml file…but you’ll need to figure that one out on your own
views.xml ALSO will show you comments associated with a view in much the same way tags are documented.
And the cornucopia continues to overflow. Want to know if there are customized views floating around?
We gotcha covered…
Note that ALL customized views associated with this sheet are being enumerated, not just the ones I (I am user id 5) created. In the screenshot below I am logged in as “Russell” and I only see the single customized view I created…yet the list above shows two:
Sheesh, I’m glad this is mostly done. I’m getting tired of writing. Anyway, workbooks.xml is pretty much what you’d expect.
It contains elements we’ve already seen like the <view>, <owner>, <task>, and so on:
An interesting element above is <size>. It is used to record the size of the workbook including any associated extract.
Extra credit fun stuff
You can apply filters to many of these files right on the query string to amaze and impress your friends.
I’m not going to cover every single one of these since I’d like YOU to explore yourself. Maybe you’ll find something cool and post it back here.
Here’s a starter list:
Show me all users in a specific group:
(filter “fe_group” by a specific group, in this case “Big Group Three” in the local domain: local\Big Group Three)
Give me all workbooks modified after 1-Dec:
Which workbooks does user “Barney Rubble” (user 4408) own?:
Which data connections on my server utilizes Microsoft SQL Server?:
None of this is magic. Just click around inside the Tableau Server web application while applying filters you think are useful. Then, check out the resulting query string Tableau generates for you. Copy, paste, repeat.
In the final post (once my fingers stop cramping up), we’ll talk about different strategies you might use to parse all these files…
Football Tranfers across time by #Tableau!
…A continuation of Tableau Server XML Information Files
Before you can ask for one of these juicy XML files, you must authenticate against the server. There are three ways to do so:
- Quick and dirty: Login via the browser
- Scripted: Write a simple batch file or script to do the work
- Fancy: Write Ruby / Python / C# / whatever code to login
It’s important to note that the identity you login with (authentication) has a huge impact on what you can see and do (authorization).
For example, if I login as an Administrator I can hit all of the “admin-ish” pages and files to do things like:
- Get a list of sites on my server (/sites.xml)
- See the collection of data connections (/data_connections.xml) on the box
As a “regular” user, if I try to view some of this stuff, I’ll get the brush off:
If I ask for the XML file, it’ll simply be empty.
Here’s the really important bit: The identity you’ve logged in with will automatically modify a file’s content. If you login as Moe and ask for workbooks.xml, the file will list only the workbooks Moe has permissions to see. Login as Larry and ask for datasources.xml? It’ll contain the data sources that Larry has permissions to consume.
Exciting or surprising? No. Useful? Oh yes.
This behavior will allow you to determine per-user permissions on workbooks, data sources, and views.
How? Login as the user in question, ask for the file and you’re done.
Problem / Objection #1
Don’t have the password for each user account on your Server? (It actually would be a little scary if you did)
No worries. Drop a different “dummy user” with a password you DO know into each Tableau Group that your users are a part of. Login as the appropriate dummy user, get your file, and you should be looking at the same list that the “real” user would return.
Problem / Objection #2
You don’t use groups, you say? You’re a fan of the “willy-nilly, per-user permissioning, make-my-life-hard” technique? Well shame on you. Don’t do that.
Don’t worry, though. I still have a solution for you. It is more difficult to implement, but since you like doing things the hard way anyway, it’ll be more of the same. Details in a few paragraphs.
We won’t bother discussing how to login to Tableau Server. Let’s get right to “scripted logins”.
Using TabCmd, you can login as any user and then use the get command to grab the file of your choice. These steps can be added to a batch file or scripted, of course:
You can also get more fancy. This technique is more complex but is useful because it allows you to “impersonate” any user and get a file which is “modified” based on their identity. “Objection #2” people will want to read up.
- Configure Trusted Authentication on Tableau Server
- Configure Tableau Server to trust the IP address or hostname from where you’ll be running your scripts
- Configure Tableau Server to use Unrestricted Tickets. You must do this or you will be unable to request the XML files without a Permissions Denied error
After you’ve taken care of the configuration, you can use a tool like cURL to both request a trusted ticket from Tableau as any user, and then use the ticket to retrieve the file you want.
First, request a ticket, which authenticates the user “russch” on the Tableau Server “wintableau”:
Note the ticket which got returned (fplnW19….)
Then, use CURL to request the view that you want using the ticket.
- Parameter –L allows for redirects (necessary)
- Parameter –c allows cURL to save a cookie, which is necessary for authentication to work correctly
- Parameter –o stores the file to your file system
In case you can’t see the image above, the command-line is:
curl http://winTableau/trusted/<ticket>/views.xml -L -c c:\temp\cookies.txt -o c:\temp\views.xml
Here’s everything put together:
AND, just because I like you, how about a nice little batch file that does the same thing automatically (copy/paste into a notepad and save as “something.bat”:
FOR /f %%A IN (‘curl -d “username=russch” http://wintableau/trusted’) DO SET FOO=%%A
curl http://winTableau/trusted/%FOO%/views.xml -L -c c:\temp\cookies.txt -o c:\temp\views.xml
Let’s get fancy
You can also write code which handles authenticating against Tableau Sever, requesting the file you want, and writing it to your file system. You’ll need to understand how crypto works and how to use the crypto libraries supported by your development platform of choice, so this approach is not for the meek. Here are three GREAT examples of logging into Tableau Server and requesting a document:
Tamas Foldi (the father of Tableau hacking) does Ruby:
Github Gist: https://gist.github.com/tfoldi/5450418
Atilla Horvath does Java:
Ingvar Ljosland does C#:
And there you have it. You now know how to authenticate against Tableau Server using 3 different techniques and you also know how to ask for stuff.
In the next post, we’ll actually address what the “stuff” is.
This “replacement” #Tableau landing page and viz selection portal is pure awesome.
As of late, I’ve seen a lot of activity on the #Tableau Community Forums concerning the “informational” XML files that Tableau Server can generate for you, I thought it might be fun to write several long, rambling posts that speak to just about everything you can do with these suckers. Doing so helps me drown out the holiday noise that is Black Friday, Small Business Saturday, and Cyber Monday. Feh! Maybe I’ll just keep going until Jan 2.
As I’m sure you know, generating and using these files is not supported by Tableau. This is an internal mechanism which is undocumented and will stay that way as far as Tableau is concerned. This stuff could change at any time, like tomorrow. It could give you food poisoning and very bad breath. Your girlfriend or boyfriend might leave you. Seriously.
You’re an adult can choose to lean on this functionality if you wish, but there is risk involved. Eyes wide open, please.
What do these files do for me?
If you’re a Tableau Server administrator, the XML files offer an easy way to pull information about your users, workbooks, and other objects out of the server.
Generally, the same information is available via our (supported) database views. So, more often than not you don’t need and probably shouldn’t use the “XML File” approach. Got it?
There are some edge case scenarios where these can be quite useful, however:
- I need a list of data connections on the server
- I want a list of users in a specific group
- I want to know what workbooks or views a user or group has access to
- I want a list of workbooks owned by a specific user
- I’m bored and just feel like showing off
Here are “Big 12” files that are (easily) available. We’ll define what each does later:
Under the Covers Overview
Tableau Server is built on top of Ruby on Rails. RoR allows you to hit paths, like http://myServer/views. Ruby on Rails uses routes to forward various combinations of paths and HTTP Verbs (GET, POST, PUT, etc) to a controller which executes actions
We’ll be focusing on a subset of the most important controllers and actions, but there are tons of ‘em. All you have to do is watch the _http_requests view inside our PostgreSQL database, and you’ll see lots of fun stuff you can query.
For example, Tableau Server 8.1 will now “alert” you if one of your extracts didn’t refresh properly:
How is Tableau getting the information which is displayed in this dialog? It is hitting a controller. I can see this happening inside PostgreSQL:
If I know the URI of that path/action and I’m authenticated against Tableau Server I can often just hit the URI myself.
Most of these controllers have template handlers which allow someone to render as either HTML or XML I don’t know tons about RoR, but the templates which handle HTML generation have an .erb file extension, and the ones which “do” XML have a .builder extension. Here’s the template to build out the views.xml file and the output of the xml file itself:
Note the references to fields in the builder file which show up in the XML that Tableau Server generated for me
There are three steps you’ll take when leveraging these files:
- Authenticate: You must be logged into Tableau Server to request the files
- Request: Make a request for a file, sometimes providing additional filter parameters
- Consume/Parse: Reading raw XML isn’t the greatest experience, so while viewing this stuff directly in the browser is fine, you might want to parse it.
In the next post (part two) , we’ll talk about the start of this process — how you can authenticate against Tableau Server.
Part three will drill down into each XML file and what it does. We’ll discuss how to get the file you want, and how you can dynamically filter some of them.
The final post (part four) will be boring, but important. I’ll point out a couple ways you can parse the XML you get back so it is more human readable.
Happy holidays (whatever.)
Here’s a scenario that you may run into now and then using Tableau – you need to conditionally aggregate a dimension based on privacy or other concerns.
You’re in HR, and you need to build a report which displays managers and all of their respective reports — including each employee’s salary.
The employee names must be anonymized. If the manager has very few reports, you don’t want to show salary-per-anonymous-employee at all. It still would be pretty easy to guess who is who in a department of two people – we therefore need to aggregate.
So, for managers with “many” reports, we’ll show many individual salaries. For a manager with “few” reports, we only get a sum of payroll.
I approached this with table calcs and got pretty close. That said, I couldn’t get quite what I wanted and I asked for help.
Mark “the first Jedi” Rueter responded with a completely different beautifully simple solution using Sets. It’s his work, I just made it pretty.
Created a Calculated Set
Create a set based on a condition that counts the number of employee on a per-manager basis. In the screenshot below, EmpCount is a parameter in which hardcode the “cut off” number of employees-per-manager that we need to see before we aggregate. The [Category] field is a stand-in for Employee Name] – Pretend it says employees!:
Drop that set in a Calculated Field
Create a field which shows either the Employee Name ([Category]) OR the word “Restricted” if the user isn’t in the Set. Use this Calculated Field to drive color:
Create a Hierarchy
If you want, add the new Calculated Field to a Manager > Employee hierarchy. You can use this later to open up the bar chart and see employees underneath each manager.
With those items in place, you’re set.
Here’s my dashboard with a “cutoff” set to 2. In the dataset, each manager DOES have at least 2 employees, so we see everything:
And now, set the cutoff to 4. I only have one manager with at least four employees, so we aggregate on the other two. Could I have done a better job with the axes below? Maybe used an Average aggregation instead of Sum in the Box-and-Whisker? Sure. But I didn’t.
I love using cURL to hit #Tableau Server with requests. It’s especially useful for testing trusted tickets.
I finally started running SSL on my main external Tableau Server, and noted my standard techniques stopped working:
curl -d “username=someUser” https://myserver.com/trusted
returned this error:
curl performs SSL certificate verification by default, using a “bundle”
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn’t adequate, you can specify an alternate file
using the —cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you’d like to turn off curl’s verification of the certificate, use
the -k (or —insecure) option.
Fine, so I used -k:
curl -d “username=someUser” -k https://myserver.com/trusted
and cURL keeps complaining:
curl: (35) error:14077458:SSL routines:SSL23_GET_SERVER_HELLO:reason(1112)
Used The Google, and found that sometimes you have to specify which version of OpenSSL to use. You need to add -3 to the string:
curl -d “username=someUser” -k -3 https://myserver.com/trusted