Until now, only one person (me) has posted to this blog. I have decided to take a new career opportunity, so starting March 8, 2021, I am no longer an employee of Tableau (Salesforce).
What will happen to the blog, and how it will be managed and updated, is still to be figured out by the team I used to be part of.
In any case, ownership of the blog will be transitioned to another person, and hopefully you will see new updates, tips, tricks and bits of hidden knowledge in the near future.
Previously, the TabPy repo on GitHub (https://github.com/tableau/TabPy) provided a Deploy to Heroku button which lets you deploy TabPy in just a few clicks. However, the newly deployed instance would not have authentication configured. And having both authentication and SSL is a requirement for an analytics extension to be usable by Tableau Online.
With the latest improvement you can now specify a user name and password for the instance:
Another approach is described in the TabPy + Heroku = Tableau Online post and is useful when you need to create your own flavor of TabPy: not only can you specify multiple users, you can also configure any other parameters. This approach requires a bit more work but gives a lot of flexibility in configuring TabPy for your needs. And with your own modifications you can deploy your own version of TabPy on Heroku the same way, in just a few clicks.
The answers to these questions differ depending on your OS and can even differ between versions of the same OS.
In this post I show the simplest way to make TabPy behave like a Windows service, which means it starts automatically and runs in the background. For the following steps, I assume you have a TabPy environment configured in Anaconda (read https://tabscifi.golovatyi.info/category/anaconda/ for more examples and how-tos for using Anaconda with analytics extensions).
In my Anaconda installation I have a tabpy_demo environment where the TabPy package is installed:
Next, I created the c:\demo\TabPy_as_a_service folder with a simple config file in it:
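As a rough illustration of what a simple config file might look like, here is a minimal sketch that generates one with Python's `configparser`. The `[TabPy]` section and the `TABPY_PORT` setting are standard TabPy configuration names; the helper name `build_tabpy_config` and the choice of setting only the port are assumptions for this example, not the exact file used in this post.

```python
import configparser
import io


def build_tabpy_config(port=9004):
    """Return the text of a minimal TabPy config file (hypothetical helper)."""
    config = configparser.ConfigParser()
    # Keep option names upper-case; configparser lower-cases them by default.
    config.optionxform = str
    # Assumption: only the port is set here; add other settings as needed.
    config["TabPy"] = {"TABPY_PORT": str(port)}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_tabpy_config())
```

Saving the returned text as tabpy.conf in the folder above would give TabPy a config file to start with.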
Now let's create a task with Task Scheduler (Start->Task Scheduler). In Task Scheduler Library click Create Task… in the Actions pane on the right. In the task dialog specify the task name (I use "TabPy as a Service" in this example) and choose whether you want the task to run under a specific account.
On the Triggers tab click New… and specify At startup. The task now runs automatically at system startup.
On the Actions tab add a new Start a program action and use the following parameters:
For Program/script use %windir%\System32\cmd.exe.
For Add arguments enter `/K c:\ProgramData\Anaconda3\Scripts\activate.bat tabpy_demo && tabpy --config tabpy.conf`. More about this string below.
For Start in specify the folder that contains your config file. In my case the same folder will also hold the TabPy logs.
c:\ProgramData\Anaconda3\Scripts\activate.bat tabpy_demo activates the Anaconda environment tabpy_demo. For you this part will be different: Anaconda can be installed in a different folder and your TabPy environment will have a different name.
tabpy --config tabpy.conf starts TabPy with the custom configuration file in the c:\demo\TabPy_as_a_service folder, which is specified as the Start in folder.
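Since the Anaconda path and environment name differ from machine to machine, the two parts of the argument string can be assembled as in the sketch below. The function name `build_task_arguments` is hypothetical and exists only for this illustration; the resulting string is what goes into the Add arguments field.

```python
def build_task_arguments(activate_bat, env_name, config_file="tabpy.conf"):
    """Compose the cmd.exe arguments for the scheduled task (hypothetical helper)."""
    # Part 1: activate the Anaconda environment that has TabPy installed.
    activate = f"{activate_bat} {env_name}"
    # Part 2: start TabPy with the config file found in the Start in folder.
    start = f"tabpy --config {config_file}"
    # /K tells cmd.exe to run the command and keep the window open.
    return f"/K {activate} && {start}"


if __name__ == "__main__":
    print(build_task_arguments(
        r"c:\ProgramData\Anaconda3\Scripts\activate.bat", "tabpy_demo"))
```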
On the Conditions and Settings tabs change parameters as needed.
Now you can test the task: click the Run action in the Actions pane. If the task is configured properly you'll see that TabPy is running.
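Besides watching the cmd window, one way to confirm TabPy is up is to query its `/info` endpoint. The sketch below assumes TabPy listens on the default port 9004 over plain HTTP on the local machine; adjust the URL if your config file says otherwise. The function name `tabpy_info` and its `opener` parameter (there only so the function can be exercised without a live server) are assumptions for this example.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def tabpy_info(base_url="http://localhost:9004", opener=urlopen, timeout=5):
    """Return the parsed /info response, or None if TabPy is not reachable."""
    try:
        with opener(f"{base_url}/info", timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply.
        return None


if __name__ == "__main__":
    info = tabpy_info()
    print("TabPy is running" if info is not None else "TabPy is not reachable")
```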
Log files for TabPy can be found in the same c:\demo\TabPy_as_a_service folder where the config file is:
Now we have TabPy scheduled to start at system startup.
One question remains: how can I make the cmd window go away? It is possible and requires a little bit of scripting, so I am saving it for the next post 😉
You requested it, we listened, it took some time to make it happen… and now you can use TabPy, Rserve or any other analytics extension with Tableau Online!
To enable the feature you need to be a site admin. Similar to how on-prem Tableau Server is configured (see Multiple Analytics Extensions Connections with Tableau Server 2020.2 for details), in Tableau Online go to Settings, Extensions, select the Enable analytics extensions for site checkbox and click the Create new connection link. In the popup window specify the connection type, host, port and credentials.
After configuring the connection click Save in the dialog window and then Save on the page. That’s it – now you can publish and use workbooks with Python/R/you-name-it scripts in them…
Actually there are a couple more things to consider:
Your analytics extension instance should use a certificate that Tableau Online can trust. That means no self-signed certs and no certs issued by your own organization. The cert must be signed by one of the well-known CAs (Certificate Authorities), e.g. DigiCert.
It has to be an endpoint (host and port) that Tableau Online can reach over the internet. If the analytics extension you are planning to use can be reached from outside your organization's network, Tableau Online should be able to see it too.
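Both points can be roughly pre-checked with a small script run from a machine outside your organization's network. The sketch below uses Python's default SSL context, which requires a certificate chain that validates against the local machine's trusted CA store and matches the host name. This only approximates what Tableau Online will accept, since its trust store may differ. The function names are hypothetical.

```python
import socket
import ssl


def make_strict_context():
    """Default context: certificate and host name verification are both on."""
    return ssl.create_default_context()


def endpoint_has_trusted_cert(host, port, timeout=10):
    """Return True if host:port is reachable and presents a trusted certificate."""
    context = make_strict_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # The TLS handshake fails for self-signed or untrusted certificates.
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        # Unreachable endpoint, timeout, or failed certificate validation.
        return False
```

A `False` result does not say which of the two requirements failed, only that at least one of them is not met from where the script runs.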
A question the Advanced Analytics Team hears fairly regularly is exactly what the title of this post says:
I created a calculated field and see the calls for TabPy/Rserve/MatLab/Einstein/etc. are made for each row. Why is that and what can I do for the field to be calculated once for the whole column?
The short answer is that this behavior comes from how the table calculation is computed: with the current settings Tableau evaluates each distinct value separately. To change the behavior, set the table calculation to compute using Specific Dimensions:
With Specific Dimensions the data is sent for calculation in one call.
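To see why the difference matters on the Python side, remember that Tableau passes each script argument (such as `_arg1`) as a list. The sketch below uses a hypothetical `z_scores` function to show what a script receives in each case: one call with the whole column versus many calls with a single value each.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of numbers; the result has one entry per input value."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A single value (or identical values) has no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    column = [10.0, 20.0, 30.0]
    # One call for the whole column: the statistics use all the rows.
    print(z_scores(column))
    # One call per row: each call sees a single value, so every result is 0.0.
    print([z_scores([v])[0] for v in column])
```

Any calculation that depends on the other rows only gives meaningful results when the whole column arrives in one call.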