"Tableau"

#Tableau PM Neelesh Kamkolkar shows you how to load test Tableau Server with LoadRunner @hploadrunner

Browser rendering and fixed-size dashboards: still good for performance?

Short answer: Yes. Move along if you don’t care to sweat the details. 

Many moons ago I posted this nugget, and a couple of months ago I saw an interesting thread wondering out loud whether the fact that we now do client-side rendering changes this rule of thumb. A lot of folks now take it for granted. 

Good news: it doesn’t. Even when doing client-side rendering in the browser, the lion’s share of the heavy lifting happens on the server. Querying the data source and creating the presentation model are server-based. The browser simply draws things at the correct X/Y coordinates based on the model, which was created and lives in-session on the server. Those models can be re-used, and they are more likely to be re-used if the dashboard is a fixed size. 

So keep setting exact height and width, campers.  

Where are my custom fonts on Tableau Server?

Whoops! You created a viz with a custom font, installed that custom font on Tableau Server, but Server is rendering the viz with a generic replacement font - not your wonderful update to Comic Sans. 

Why? One of these two reasons is correct. You figure out which: 

  • Tableau Server knows that the font that you chose is really, really ugly and we simply refuse to render it in order to spare you from embarrassment in front of the whole world. We look out for our customers like that.
  • Tableau Server generally defaults to client-side rendering these days. So unless you force server-side rendering, it doesn’t matter that the font is installed on the Tableau Server box - your browser will go hunting locally for that font, and it probably doesn’t exist on your machine. To force server-side rendering, use ?:render=false on your request for the viz (example below). More info here: http://onlinehelp.tableausoftware.com/v8.1/server/en-us/browser_rendering.htm
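
For example, the only difference is the parameter tacked onto the end of the viz URL - the server and workbook names below are made up:

  http://myserver/views/MyWorkbook/MyDashboard                 (default - usually client-side rendering, local fonts)
  http://myserver/views/MyWorkbook/MyDashboard?:render=false   (forced server-side rendering, server fonts)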

Lazy Man’s Guide to the Tableau Server REST API

When Tableau Server 8.2 releases, you’ll have the pleasure of a new API to work with. I’m sure there will be lots of chatter about the things you can do with it, so I’m not going to cover that here. The story I want to tell is a little different.

Over the weekend, evil Darth <redacted> created a guide that one can use to get a quick feel for how the API works without having to write lots of code to get there.

It is a combination of Tableau’s great help files on the API and a tool called Postman.
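
To give you a flavor of what ends up in Postman, here’s a rough sketch of the sign-in request - the server name and credentials are made up, and the version segment in the URI (2.0) is from memory, so trust the guide and the help files over me:

  POST http://myserver/api/2.0/auth/signin

  <tsRequest>
    <credentials name="someUser" password="somePassword">
      <site contentUrl="" />
    </credentials>
  </tsRequest>

The response hands back a token, which you then pass along on subsequent requests in the X-Tableau-Auth header.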

Lord <redacted> grudgingly gave his permission for me to share it with the little people.

[image: a peek at the guide]

You can download the guide here: 

http://1drv.ms/1kuf1Fp 

Of course, you’ll have to wait for 8.2 to release to use it. Sith’s revenge.

As always, this stuff is not supported by Tableau, you use the guidance at your own risk, and it may or may not be updated in the future. 

The document has only been lightly proofed, so if you find errors, feel free to post them here. In between slaying Jedi, the author may have time to apply corrections. 

Tableau TypeError: this._activeSheetImpl is null

Here’s some quick Google engine fodder. I suspect no one will ever need this because it is the result of me being the laziest coder on the face of the planet. (Yes, I’m proud).

When using the JavaScript API against an SSL-protected Tableau Server, one should reference the viz you want to instantiate and the tableau js file itself via https: to indicate the protocol that is in play. 

I generally don’t. I’ve just used http: to refer to the viz and relied on Tableau to redirect my request to https on port 443.

During the course of my career, this has saved me at least 15 keystrokes. 

Well, you can’t be lazy like this anymore. I’ve noticed that since I upgraded to 8.1.7, my JavaScript breaks (on the activateSheetAsync method, probably others) with the error in the title: TypeError: this._activeSheetImpl is null.

The “fix” is simple. Refer to the viz and the js file with https: like you should be doing in the first place. 
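
For completeness, here’s a minimal sketch of an embed done “correctly” - the server, workbook, and sheet names are made up, and the script path may vary by version, but note that both references use https:

  <script type="text/javascript" src="https://myserver/javascripts/api/tableau_v8.js"></script>
  <script type="text/javascript">
      // Both the API file above and the viz URL below use https: against the SSL-protected server
      var placeholderDiv = document.getElementById("tableauViz");
      var url = "https://myserver/views/MyWorkbook/MyDashboard";
      var viz = new tableau.Viz(placeholderDiv, url, {
          hideTabs: true,
          onFirstInteractive: function () {
              // Only switch sheets once the viz is fully interactive
              viz.getWorkbook().activateSheetAsync("SomeOtherSheet");
          }
      });
  </script>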

IP-based Geo-location in Tableau (New! Now with more R!)

@highvizability wrote a great piece last year about how to geocode a user’s address using R. You can read the article here and download a great sample workbook:

http://www.dataplusscience.com/Geocoding%20in%20Tableau%20using%20R.html 

Let’s do the same thing but with someone’s IP address, shall we?

Tableau has already documented one method to approach this, but technology moves on. FYI, I wouldn’t use the following technique if I were going to plot thousands of marks / IP addresses on a map - it would probably take too long. That said…

First, I found another great blog by a gentleman who figured out how to make IP-based geocoding go in R. You can read the article here. Everything I did is based on his work.

Essentially, I’ve dumbed down what Andrew put together and modified his function so that it only returns the latitude and longitude of the IP address being geocoded:

# Function name is illustrative - use whatever name your Tableau calculation will call
freegeoip <- function(ip)
{
     if (1 == length(ip))
     {
          # a single IP address
          require(rjson)
          # hit the freegeoip.net JSON endpoint for this address
          url <- paste(c("http://freegeoip.net/json/", ip), collapse="")
          ret <- fromJSON(readLines(url, warn=FALSE))
          ret <- data.frame(t(unlist(ret)))
          # return "latitude,longitude" as a single string
          latlon <- paste(ret$latitude, ret$longitude, sep=",")
          return(latlon)
     }
}

The function above requires the rjson package, which lives on CRAN.

The function must also be available to your Rserve server. This is generally accomplished by finding the Rserv.config file for Rserve and adding a reference to a file which holds the text for your function.

For example, I dropped the text above into a file called geo.txt, and then plugged that value into the config file like so:

[image: the Rserv.config entry pointing at geo.txt]
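
In plain text, the entry is just Rserve’s source directive pointing at the file that holds the function - the path below is made up:

  source C:/Rserv/geo.txt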

After that, all the work happens in Tableau.

You’ll create an expression which calls the function and passes it the IP address (in this case, named [ip] in my dataset):

[image: the calculated field that calls the R function with [ip]]

Then, create two additional expressions which parse out the latitude and longitude:

[image: the latitude and longitude parsing calculations]
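
If you’d rather type than squint at a screenshot, the three calculations look something like this - assuming the R function above was saved under the name freegeoip and the first calculated field is called [latlon] (both names are mine, adjust to taste):

  // [latlon] - ask R for "latitude,longitude" as a single string
  SCRIPT_STR("freegeoip(.arg1)", ATTR([ip]))

  // [Latitude]
  FLOAT(LEFT([latlon], FIND([latlon], ",") - 1))

  // [Longitude]
  FLOAT(MID([latlon], FIND([latlon], ",") + 1))

SCRIPT_STR ships the [ip] values over to Rserve, and the string functions simply split the result on the comma.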

…and you’re done!

You can download samples from here:  http://1drv.ms/PUO69Q

Testing Tableau with SQL Server 2014 (in memory!)

As an ex-blue badge, I have a love for SQL Server that’s baked into my genes. I. Love. SQL. Server. Clear?

So, I was excited to load up SQL Server 2014 for the first time since it RTM’d and take it for a spin.

I thought it would be fun to run a little test: namely, run the same dashboard against “on disk” SQL Server, a Tableau Extract, and In-Memory OLTP on SQL Server. Let’s see what comes back fastest, shall we? 

(What?! You say you don’t know what In-Memory OLTP is? Luddite! Read about it here).

The Dashboard and Data

Anyway, here is the blue dashboard I came up with for my ex-blue badge testing. No, it doesn’t follow dataviz best practices, but it’s blue and I like blue today:

[image: the blue test dashboard]

The key thing I wanted to do was introduce several levels of grouping into multiple vizzes to create a meat-and-potatoes dashboard representative of what a lot of people do with Tableau. 

The dataset I used was pretty small - just under 4M rows in a star schema. Here’s the ERD:

[image: ERD of the star schema]

Only basic indexing is in play with the on-disk database - I have clustered indexes created on each of the PK fields on the dimension tables. 

Using SQL Server 2014

Upgrading my SQL Server 2012 to 2014 was like buttah - very straightforward and pain-free. This made me happy. 

One must store in-memory tables in a new type of SQL Server filegroup used for this purpose, so I added one of these suckers to my database and then used the Table Memory Optimization Advisor to make in-memory copies of my original tables. The TMOA is pretty much like a wizard that tells you if you’re doing anything stupid, gives you some basic advice and then creates a copy of the table you’re working with (with or without data) inside the Memory Optimized Data filegroup. 
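
If you’ve never seen it, the T-SQL the wizard generates boils down to something like the sketch below - the database name, file path, bucket count, and column sizes are all made up for illustration:

  -- Add a filegroup (and a container) for memory-optimized data
  ALTER DATABASE BlueDemo ADD FILEGROUP BlueDemo_mod CONTAINS MEMORY_OPTIMIZED_DATA;
  ALTER DATABASE BlueDemo ADD FILE (NAME = 'BlueDemo_mod', FILENAME = 'C:\Data\BlueDemo_mod')
      TO FILEGROUP BlueDemo_mod;

  -- An in-memory copy of a dimension table, with a hash index on the PK
  CREATE TABLE dbo.DimClient_InMem
  (
      Client_Key  INT           NOT NULL
          PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
      Client_Name NVARCHAR(100) NOT NULL
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);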

I no longer consider myself a SQL Server pro - more like a weekend warrior…and the fact that I could turn this part of the project around so quickly should be seen as a feather in Microsoft’s cap. 

The Testing

I performed my testing as follows, timing each dashboard execution: 

  • Cycled SQL Server to clear cache (or for Tableau extract testing, I restarted the OS - this is the only way to know the extract file isn’t already in memory from a previous run). I called this an “un-cached run”.
  • Ran the blue dashboard and closed Tableau
  • Re-opened Tableau and ran the dashboard again (a “cached” run)
  • Cycled SQL Server and/or restarted the OS once more and ran my blue dashboard, this time using Tableau Performance Recorder (another “un-cached run”)

…and things don’t look so good for In-Memory OLTP, do they?

Let’s look at the Performance Recorder output for Tableau extract based runs:

[image: performance recording for the extract-based run]

Geocoding took the most time, followed by the 920-millisecond query execution that supports this process. 

Here’s the recorder when I ran the dashboard against “on disk” SQL Server: 

[image: performance recording for the on-disk SQL Server run]

The query which supports the running sum area chart at the top of my blue dashboard takes the longest here, followed by the same Geocoding task & geocode support queries.

And now, the surprisingly slow In-Memory OLTP source: 

[image: performance recording for the In-Memory OLTP run]

It took 29 seconds to run the query which powers the box and whisker plot viz. Insane! The others weren’t particularly light on their feet either, but 29 seconds is just ugly.

Here’s the SQL for the query in question:

SELECT
   DATEPART(month,[DimDate].[Full_SQL_Date]) AS [mn:Full_SQL_Date:ok],
  [DimClient].[Client_Name] AS [none:Client_Name:nk],
   DATEPART(year,[DimDate].[Full_SQL_Date]) AS [yr:Full_SQL_Date:ok],
   SUM(CAST(CAST([FactDailyResponse].[Impressions] as bigint) as BIGINT)) AS [sum:Impressions:ok]
FROM
   [dbo].[FactDailyResponse] [FactDailyResponse]
   INNER JOIN [dbo].[DimClient] [DimClient] ON ([FactDailyResponse].[Client_Key] = [DimClient].[Client_Key])
   INNER JOIN [dbo].[DimDate] [DimDate] ON ([FactDailyResponse].[Date_Key] = [DimDate].[Date_Key])
GROUP BY
   DATEPART(month,[DimDate].[Full_SQL_Date]),
  [DimClient].[Client_Name],
   DATEPART(year,[DimDate].[Full_SQL_Date])

I ran this sucker in Query Analyzer and got more or less the same result, then checked out the execution plan and saw several alerts that statistics on my table indexes were out of date. 

…and that was the problem. 

This help topic says that statistics are created automatically but not updated automatically on memory-optimized tables. When I updated the statistics manually on the dimDate table, query execution immediately dropped to ~3 seconds in Query Analyzer. 
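
For what it’s worth, the manual fix is a one-liner per table - and if I’m reading that help topic right, memory-optimized tables on 2014 want FULLSCAN plus NORECOMPUTE, since sampled, auto-recomputed statistics aren’t supported on them:

  UPDATE STATISTICS dbo.DimDate WITH FULLSCAN, NORECOMPUTE;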

I updated statistics on all of my tables and then tried again:

That’s better - but still slower than on-disk SQL. Surprising.

What I’m seeing thus far could very well have to do with the type of index I’m using on my tables (a NONCLUSTERED HASH on the PK field). Books Online indicates that this is what I should use for a field acting as my primary key, but I might try a NONCLUSTERED index instead. I might also add additional indexes on some of the fields that get used a bunch, like my date field.
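
One wrinkle to be aware of: as far as I can tell, 2014 won’t let you add an index to an existing memory-optimized table, so “trying a NONCLUSTERED index” really means re-creating the table with the extra index declared inline - roughly like this sketch (column sizes and bucket count made up):

  CREATE TABLE dbo.DimDate_InMem
  (
      Date_Key      INT  NOT NULL
          PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 131072),
      Full_SQL_Date DATE NOT NULL,
      -- a range (NONCLUSTERED) index on the date column itself
      INDEX IX_Full_SQL_Date NONCLUSTERED (Full_SQL_Date)
  )
  WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
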
Here’s the performance recorder running my dashboard against in-memory with updated statistics on my indexes:

[image: performance recording for the In-Memory OLTP run with updated statistics]

Take-aways thus far:
  • Don’t forget to update your statistics!
  • Since everything is “cached” in an in-memory OLTP scenario, you’ll get about the same perf each time you run a query
  • Using In-Memory OLTP isn’t “fire and forget” like I’ve gotten used to with SQL Server over the years - looks like I’m going to need to spend some time adding & tuning indexes for best effect. This will make free-form analysis clumsy as each user may focus on different fields (which may or may not be indexed)
  • Extracts still offer better performance, since all of the tables are “pre-joined” inside the extract.

More to come!

(Edit #1: It’s about 2 hours later)

I’ve started adding NONCLUSTERED indexes in an attempt to improve execution time: 

  • Index on the Date column in my Dimension Table: No Change
  • Index on the Client_Key field in the Fact table used to join to the Clients dimension - gained ~4 seconds. Average run time is now ~12.5 seconds, down from ~17. At least that’s equal to disk storage.
  • Index on the Date_Key field in the Fact table which joins to the Date dimension: No Change

(Edit #2)

If Tableau gets great performance by creating a single in-memory table as the result of joining other tables, why can’t SQL Server? Let’s try it!

I took the same query that joins my fact table to all of the dimensions and made a single materialized version of it as a table on disk. Good so far. However, when I attempted to turn this into an In-Memory OLTP table, no joy:

[image: the error thrown when trying to convert the materialized table to In-Memory OLTP]

Damn.

(Edit #3)

6:30 AM here, and I just added NONCLUSTERED indexes to each and every column on the Fact table that I join with. That got me slightly more than 1 second of additional improvement, so I’m now averaging about 11.1 seconds on this dashboard compared to 10.6 seconds against its on-disk cousin. Based on the fact that my stopwatch thumb can be fast or slow depending on how much caffeine is coursing through my veins, I’ll call this “even”. 

If I really wanted to go all out, I see an opportunity to create a multi-column covering index on three fields in my fact table (two of which are measures). I suspect that would improve perf on the visualization that is taking “longest” at about 3.5 seconds. However, that seems to be overkill to me, so I’ll leave well enough alone for now.

Tableau Server Scalability Explained →

#Tableau Server scalability whitepaper updated for 8.1 and 64-bit yumminess!

Making a good first impression with Tableau Desktop

You’ve invested the time to create a great viz for your team, boss, or client…now make sure you get fully paid - with money, adoration, or whatever gets you through the night.

And you only get one chance to make a good first impression, right?  

Before you deliver your workbook, put your best foot forward by choosing your best-looking dashboard and putting it into Presentation Mode before you save. 

Example -  which “on open” experience looks and feels better to you? 

This: 

[image: workbook opening on a worksheet, not in Presentation Mode]

…Or this:

[image: workbook opening on the dashboard in Presentation Mode]

We’re talking about a trivial, 1-second exercise that really makes a difference when it comes to “presentation time”.

  • Choose your favorite viz
  • Hit F7 to go into Presentation Mode
  • Hit CTRL-S to save

…and you’re done. Send your work out into the cold, hard world and wait for the kudos to come flooding in.

An AMAZING hack by Michael Kovner: TabChat for #Tableau Server

You probably already know “Kovner” (the man with no first name). Kovner co-presented at a number of TCC sessions with me, and is one of the scary-smart young guns at Tableau - he’s also the guy who created the Tables in the East game. 

Now he’s gone off and created TabChat, a proof of concept that allows two or more users to collaboratively explore, filter, and chat about a viz being displayed. It’s pretty self-explanatory, but you can click “How to use this” if you get lost. 

User 1 logs in:

[image: TabChat with User 1 logged in]

User 2 Logs in:

[image: TabChat with User 2 logged in]

So, jump over here and play with TabChat, then bookmark Kovner’s blog and return often:

http://tabcmd.com/tabchat-socketio-javascript-api