Upcoming changes to the blog

Until now, only one person (me) has posted on this blog. I have decided to take a new career opportunity, so starting March 8, 2021 I am no longer an employee of Tableau (Salesforce).

What will happen to the blog, and how it will be managed and updated, is still to be figured out by the team I used to be part of.

In any case, ownership of the blog will be transitioned to another person, and hopefully you will see new updates, tips, tricks and bits of hidden knowledge in the near future.

Thanks everyone!

TabPy Heroku Deployment with Authentication Enabled

Since Analytics Extensions for Tableau Online were announced (read Analytics Extensions are Available with Tableau Online! on this blog and https://www.tableau.com/about/blog/2021/1/developers-analytics-extensions-tableau-online on Developer Platform news) there has been a lot of interest and many questions from people who want to try the new feature. In many cases, those who want to investigate using Python or any other analytics extension don’t have it configured and running yet.

Previously, the TabPy repo on GitHub (https://github.com/tableau/TabPy) provided a Deploy to Heroku button that deploys TabPy in just a few clicks. However, the newly deployed instance did not have authentication configured, and both authentication and SSL are required for an analytics extension to be usable by Tableau Online.

With the latest improvement you can now specify a user name and password for the instance.

Now the instance requires credentials, which can be configured for a connection as shown in the Analytics Extensions are Available with Tableau Online! post.
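
To get a feel for what "requires credentials" means on the wire, here is a minimal sketch that prepares an authenticated request for the TabPy /info endpoint. It assumes the instance uses HTTP Basic authentication over HTTPS; the app URL, user name and password below are hypothetical placeholders, not values from this post.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_info_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the TabPy /info endpoint."""
    request = urllib.request.Request(base_url.rstrip("/") + "/info")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


# Hypothetical values: replace them with your own Heroku app URL and credentials.
request = build_info_request("https://my-tabpy-app.herokuapp.com", "tableau_user", "S3cret!")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the prepared request to urllib.request.urlopen(request) would actually send it; a deployment with authentication enabled should reject the same call when the Authorization header is missing.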

Another approach is described in the TabPy + Heroku = Tableau Online post and is useful when you need to create your own flavor of TabPy: not only can you specify multiple users, you can also configure any other parameters. This approach requires a bit more work but gives a lot of flexibility in configuring TabPy for your needs. And with your own modifications you can deploy your own version of TabPy on Heroku the same way, in just a few clicks.

How to run TabPy as a Windows Service

People ask this question in different ways:

  • How do I make TabPy start automatically?
  • How can I run TabPy in the background?
  • How can I run TabPy as if it was a service?

The answers to these questions differ depending on your OS and can even differ between versions of the same OS.

In this post I show the simplest way to make TabPy behave like a Windows service, which means it starts automatically and runs in the background. For the following steps, I assume you have a TabPy environment configured in Anaconda (read https://tabscifi.golovatyi.info/category/anaconda/ for more examples and how-tos on using Anaconda with analytics extensions).

In my Anaconda installation I have a tabpy_demo environment where the TabPy package is installed.

Next, I created the c:\demo\TabPy_as_a_service folder with a simple config file in it:

args=('tabpy_log.log', 'a', 1000000, 5)
format=%(asctime)s [%(levelname)s] (%(filename)s:%(module)s:%(lineno)d): %(message)s

For more information about custom TabPy configuration, read the posts at https://tabscifi.golovatyi.info/category/tabpy-configuration/.
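
The args= tuple in the config is easier to read once you see where the values go. The sketch below assumes the tuple is passed positionally to Python's logging.handlers.RotatingFileHandler (file name, mode, maxBytes, backupCount); the format string, logger name and temporary folder are illustrative assumptions, not part of my config.

```python
import logging
import logging.handlers
import os
import tempfile

# Same positional values as the args= line: filename, mode, maxBytes, backupCount.
HANDLER_ARGS = ("tabpy_log.log", "a", 1000000, 5)

# Assumption: a simplified format string for illustration; use the one from your config.
LOG_FORMAT = "%(asctime)s [%(levelname)s] (%(filename)s:%(lineno)d): %(message)s"

log_dir = tempfile.mkdtemp()  # stand-in for the folder TabPy is started in
filename, mode, max_bytes, backup_count = HANDLER_ARGS
handler = logging.handlers.RotatingFileHandler(
    os.path.join(log_dir, filename), mode, max_bytes, backup_count
)
handler.setFormatter(logging.Formatter(LOG_FORMAT))

logger = logging.getLogger("tabpy_config_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False
logger.info("rotating file handler configured")
handler.flush()

print("file mode:", handler.mode)                # 'a' appends to an existing log
print("rotate after bytes:", handler.maxBytes)   # roughly 1 MB per log file
print("old files kept:", handler.backupCount)    # tabpy_log.log.1 ... tabpy_log.log.5
```

In other words, with these values the log is appended to, rolled over at about 1 MB, and up to five older files are kept next to the current one.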

Now let’s create a task with Task Scheduler (Start->Task Scheduler). In Task Scheduler Library click Create Task… on the Actions pane on the right. In the task dialog specify the task name (I use “TabPy as a Service” in this example) and choose whether you want the task to run under a specific account.

On the Triggers tab click New… and specify At startup. Now the task runs automatically at system startup.

On the Actions tab add a new Start a program action and use the following parameters:

  • For Program/script use %windir%\System32\cmd.exe.
  • For arguments at Add arguments enter `"/K c:\ProgramData\Anaconda3\Scripts\activate.bat tabpy_demo && tabpy --config tabpy.conf"`. More about the string below.
  • For Start in specify the folder that contains your config file. In my case the same folder will also hold the TabPy logs.

The arguments string above means:

  • /K tells cmd to run the command and continue (documentation for cmd is here – https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/cmd).
  • c:\ProgramData\Anaconda3\Scripts\activate.bat tabpy_demo activates the Anaconda environment tabpy_demo. For you this part may be different: Anaconda can be installed in a different folder and your TabPy environment may have a different name.
  • tabpy --config tabpy.conf starts TabPy with the custom configuration file from the c:\demo\TabPy_as_a_service folder specified as the Start in folder.
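
Since the Anaconda path and the environment name will differ from machine to machine, the three parts above can be assembled programmatically. This is only a sketch: build_cmd_arguments is a hypothetical helper, not part of TabPy or Anaconda, and it returns the string without the surrounding quotes.

```python
def build_cmd_arguments(activate_bat: str, env_name: str, config_file: str) -> str:
    """Compose the text for the 'Add arguments' box of the scheduled task."""
    activate = f"{activate_bat} {env_name}"        # activate the Anaconda environment
    start_tabpy = f"tabpy --config {config_file}"  # start TabPy with a custom config
    # /K: cmd runs the command and stays open; && runs TabPy only if activation succeeded.
    return f"/K {activate} && {start_tabpy}"


# Values from this post; adjust the path and environment name for your machine.
arguments = build_cmd_arguments(
    r"c:\ProgramData\Anaconda3\Scripts\activate.bat", "tabpy_demo", "tabpy.conf"
)
print(arguments)
```

The config file name is relative on purpose: it is resolved against the Start in folder of the task.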

On the Conditions and Settings tabs change parameters as needed.

Now you can test the task: click the Run action on the Actions pane. If the task was configured properly you’ll see TabPy running.
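
Besides looking at the cmd window, you can check from Python that something answers on the TabPy port. The sketch below assumes TabPy listens on localhost with its default port 9004 and has no authentication enabled; if your config changes the port or enables authentication, adjust the values (an authenticated instance will make this check return None).

```python
import json
import urllib.error
import urllib.request


def tabpy_info_url(host: str = "localhost", port: int = 9004) -> str:
    """Build the URL of the TabPy /info endpoint."""
    return f"http://{host}:{port}/info"


def fetch_tabpy_info(host: str = "localhost", port: int = 9004, timeout: float = 5.0):
    """Return the parsed /info response, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(tabpy_info_url(host, port), timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error or a non-JSON reply.
        return None
```

Calling fetch_tabpy_info() after the task has started should return a dictionary describing the server; None means TabPy is not reachable with these settings.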

Log files for TabPy can be found in the same c:\demo\TabPy_as_a_service folder as the config file.

Now TabPy is scheduled to start at system startup.

One question remains: how can I make the cmd window go away? It is possible but requires a little bit of scripting, so I am saving it for the next post 😉

Analytics Extensions are Available with Tableau Online!

You requested it, we listened, it took some time to make it happen… and now you can use TabPy, Rserve or any other analytics extension with Tableau Online!

To enable the feature you need to be a site admin. Similar to how on-prem Tableau Server is configured (see Multiple Analytics Extensions Connections with Tableau Server 2020.2 for details), for Tableau Online go to Settings, Extensions, select the Enable analytics extensions for site checkbox and click the Create new connection link. In the popup window specify the connection type, host, port and credentials.

NOTE. Tableau Online requires the analytics extension connection to use SSL and authentication. Authentication-related posts are listed at https://tabscifi.golovatyi.info/category/authentication/, and additional information about SSL and certificates can be found at https://tabscifi.golovatyi.info/category/security/.

After configuring the connection click Save in the dialog window and then Save on the page. That’s it – now you can publish and use workbooks with Python/R/you-name-it scripts in them…

Actually, there are a couple more things to consider:

  • Your analytics extension instance should use a certificate that Tableau Online can trust. That means no self-signed certs and no certs issued by your own organization; the cert must be signed by a well-known CA (Certificate Authority), e.g. DigiCert.
  • It has to be an endpoint (host and port) that Tableau Online can reach via the internet. If the analytics extension you are planning to use can be reached from outside your organization's network, Tableau Online should be able to see it too.
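
Both points can be roughly checked from a machine outside your network before configuring the connection. The sketch below uses Python's default TLS settings as a stand-in for "a certificate a public client would trust"; this is an assumption, because the trust store on your machine is not necessarily the one Tableau Online uses. check_endpoint and make_strict_context are hypothetical helper names.

```python
import socket
import ssl


def make_strict_context() -> ssl.SSLContext:
    """TLS context that verifies the certificate chain and the host name."""
    return ssl.create_default_context()


def check_endpoint(host: str, port: int, timeout: float = 10.0):
    """Return (ok, detail) for a TLS handshake against host:port."""
    context = make_strict_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw_socket:
            with context.wrap_socket(raw_socket, server_hostname=host) as tls_socket:
                return True, tls_socket.version()
    except ssl.SSLCertVerificationError as error:
        # Typical for self-signed or privately issued certificates.
        return False, f"certificate not trusted: {error.verify_message}"
    except OSError as error:
        # DNS failure, refused connection, timeout or another TLS error.
        return False, f"connection failed: {error}"
```

For example, check_endpoint("my-tabpy.example.com", 443) with your own host and port (the host name here is a placeholder) returns True only when the endpoint is reachable and its certificate chains to a CA in the local trust store.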

Why are calculations for my Python/R/… SCRIPT_x field made per row?

A question the Advanced Analytics Team hears fairly regularly is exactly what the title of this post says:

I created a calculated field and see the calls for TabPy/Rserve/MatLab/Einstein/etc. are made for each row. Why is that and what can I do for the field to be calculated once for the whole column?

The short answer is that this behavior comes from how the table calculation is computed: with the current settings Tableau evaluates each distinct value separately. To change the behavior, set the calculation to compute using Specific Dimensions.

With Specific Dimensions the data is sent for calculation in one call.
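
The difference is easy to picture with a small simulation. This is not Tableau's actual calling code, only a sketch of the idea: the script body receives a list (_arg1) per partition, so when every row is its own partition the script is called once per row, and when the whole column is one partition it is called once. CountingScript, double_values and evaluate are hypothetical names.

```python
from typing import Callable, List, Sequence


class CountingScript:
    """Wraps a script body and counts how many times it is called."""

    def __init__(self, body: Callable[[List[float]], List[float]]) -> None:
        self.body = body
        self.calls = 0

    def __call__(self, _arg1: List[float]) -> List[float]:
        self.calls += 1
        return self.body(_arg1)


def double_values(_arg1: List[float]) -> List[float]:
    """Stand-in for the body of a SCRIPT_REAL calculation."""
    return [value * 2 for value in _arg1]


def evaluate(script: CountingScript, partitions: Sequence[List[float]]) -> List[float]:
    """Call the script once per partition and stitch the results together."""
    results: List[float] = []
    for partition in partitions:
        results.extend(script(partition))
    return results


column = [10.0, 20.0, 30.0, 40.0]

# Every row is its own partition: one call per row.
per_row_script = CountingScript(double_values)
per_row_result = evaluate(per_row_script, [[value] for value in column])

# The whole column is a single partition: one call in total.
whole_column_script = CountingScript(double_values)
whole_column_result = evaluate(whole_column_script, [column])

print(per_row_script.calls, whole_column_script.calls)
```

The results are identical in both cases; only the number of round trips to the analytics extension changes, which is what matters for performance.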

For technical details read Tableau documentation on how to transform values with table calculations – https://help.tableau.com/current/pro/desktop/en-us/calculations_tablecalculations.htm#specific-dimensions.

Also consider watching these video recordings from Tableau Conference, which provide a deeper dive into the topic and beyond:

  • Data science applications with TabPy/R:
  • Even More Data Science Applications in Tableau: