TabPy v2.3.1 Released: Deployed Function Overriding Bug Fixed

PyPI package – https://pypi.org/project/tabpy/2.3.1/.

GitHub release – https://github.com/tableau/TabPy/releases/tag/2.3.1.

The most important improvement in this release is a bug fix for overriding deployed functions (models).

TabPy supports overriding a deployed function: each override increases the endpoint version.

For example, if you run TabPy locally and deploy a function as shown below, you get endpoint Fn with version 1:

from tabpy.tabpy_tools.client import Client

client = Client('http://localhost:9004/')

def fn(x, y):
  return x + y


client.deploy('Fn', fn, 'My awesome function')

If you later want to change the function so that its users automatically pick up the new version with no work on their side, pass the override parameter when deploying again:

def fn(x, y):
  return x * y + x * 2

client.deploy('Fn', fn, override=True)

Overriding the Fn endpoint increases its version, and from then on all user SCRIPT_... calculations call the new version.
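The versioning rule described above can be illustrated with a toy in-memory registry. This is only a sketch of the observable behavior (version 1 on first deploy, an error on redeploy without override, version plus one with override). `ToyRegistry` and its methods are hypothetical names made up for illustration and are not part of TabPy.

```python
class ToyRegistry:
    """Hypothetical stand-in that mimics endpoint versioning; not TabPy code."""

    def __init__(self):
        # Maps endpoint name -> (version, function).
        self._endpoints = {}

    def deploy(self, name, fn, override=False):
        if name in self._endpoints and not override:
            raise RuntimeError(f"Endpoint {name!r} already exists; pass override=True")
        # First deployment gets version 1; each override bumps it by one.
        version = self._endpoints[name][0] + 1 if name in self._endpoints else 1
        self._endpoints[name] = (version, fn)
        return version

    def query(self, name, *args):
        # Callers always reach the latest version under the same name.
        version, fn = self._endpoints[name]
        return {"version": version, "response": fn(*args)}


registry = ToyRegistry()
registry.deploy("Fn", lambda x, y: x + y)
first = registry.query("Fn", 2, 3)      # version 1, response 5
registry.deploy("Fn", lambda x, y: x * y + x * 2, override=True)
second = registry.query("Fn", 2, 3)     # version 2, response 10
```

Callers keep using the same name, Fn, and only the version and the result change. That is the property that lets existing calculations pick up the new code without edits.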


Running TabPy with Python Virtual Environment

In my post How to run TabPy with Anaconda on Linux I showed how to use Anaconda to create isolated Python environments so you can have different versions of Python with different packages installed all on the same machine.

If you cannot use Anaconda for any reason, there is another way to separate Python environments on a machine: Python virtual environments.

For the steps below I am using a virtual machine running CentOS:

[ogolovatyi@3250e1583c74401 ~]$ hostnamectl
...
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-693.5.2.el7.x86_64
      Architecture: x86-64

All these steps can be used on macOS or Windows with small modifications.

To make things more interesting, the machine has Python 2.7 on it:

[ogolovatyi@3250e1583c74401 ~]$ python -V
Python 2.7.5

Since TabPy requires at least Python 3.5, let’s install it first (on your Linux distribution you may need to use a package manager other than yum):

sudo yum install python3
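To confirm that an interpreter meets the version requirement, a small check like the following could be used. It is a minimal sketch: the `meets_minimum` helper and the `MINIMUM` constant are illustrative names, and the (3, 5) threshold is simply the one stated in this post.

```python
import sys

# Minimum version taken from the text above; adjust if your TabPy release needs newer.
MINIMUM = (3, 5)


def meets_minimum(version_info, minimum=MINIMUM):
    """Return True when the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this snippet.
print("Running Python", sys.version.split()[0])
print("New enough for TabPy:", meets_minimum(sys.version_info))
```

Run with python, this reports on the system Python 2.7. Run with python3, it reports on the newly installed interpreter.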

It is recommended to update pip to the latest version. Note that the python command will still run Python 2.7. You can change that, but doing so may cause some other commands and tools to stop working properly. Instead I use the python3 command, which runs the newly installed Python 3.6.8:

sudo python3 -m pip install --upgrade pip

Now install the virtualenv package:

python3 -m pip install virtualenv

The virtual environment is created in the folder where the next command is executed (note that this command uses the venv module from the Python 3 standard library):

python3 -m venv TabPy-venv

After the command above succeeds you can see that the TabPy-venv folder was created. You can create multiple virtual environments with the same command, just give each one a unique name.
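The same environment can also be created from Python with the standard library venv module, which may be handy in setup scripts. The sketch below is an illustration only: `create_env` is a hypothetical helper, and it builds the environment in a temporary folder so nothing is left behind.

```python
import os
import tempfile
import venv


def create_env(parent_dir, name="TabPy-venv"):
    """Create a virtual environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps this sketch fast; the command-line form installs pip.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Demonstrate in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    env_path = create_env(tmp)
    # Every venv-style environment has a pyvenv.cfg file in its root folder.
    created = os.path.isfile(os.path.join(env_path, "pyvenv.cfg"))
```

For a real environment, pass a permanent folder and with_pip=True so that pip is available after activation.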

To activate the environment run the following:

[ogolovatyi@3250e1583c74401 ~]$ source TabPy-venv/bin/activate

Note how the command line prompt changes after the command above succeeds: it now contains the virtual environment name. Additionally, the python command now runs the Python 3.6.8 I installed earlier:

(TabPy-venv) [ogolovatyi@3250e1583c74401 ~]$ python -V
Python 3.6.8

All the packages installed in the active virtual environment will be installed in the folder for the environment (TabPy-venv for the shown steps) and won’t affect “system” Python or any other Python versions or environments.
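Whether the current interpreter is running inside such an environment can be checked from Python. This is a minimal sketch for environments created with the venv module, and `in_virtual_environment` is an illustrative name.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```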

Now let’s update pip (remember it is a different pip this time: the one for the virtual environment) and install TabPy:

pip install --upgrade pip
...
pip install tabpy

If you check where TabPy is installed you’ll see something like this:

(TabPy-venv) [ogolovatyi@3250e1583c74401 ~]$ whereis tabpy
tabpy: /home/....../ogolovatyi/TabPy-venv/bin/tabpy

To start TabPy, run the usual tabpy command, optionally specifying a config file (more details at TabPy: modifying default configuration).

To exit the virtual environment, simply run the deactivate command.

Additional information about virtualenv can be found at https://virtualenv.pypa.io/en/stable/.


How to Get Online-Accessible TabPy Instance with Heroku

Although it is very easy to install TabPy on your laptop, desktop, VM, and so on with just the pip install --upgrade tabpy command, there are some limitations and additional work to be done. Some examples are:

  • Your environment has to be configured: a supported Python version (3.6 or newer at the time this post was written) needs to be installed.
  • To isolate TabPy from other Python applications you may want to use Anaconda (additional reading: How to run TabPy with Anaconda on Linux), a Python virtual environment, or a similar solution.
  • The machine with TabPy on it cannot be accessed from outside your work/home network.
  • You don’t want to expose the machine with TabPy on it to the whole internet.
  • and so on…

There is a way to create a TabPy instance that is available everywhere via the internet in just minutes. It is done with Heroku, described in their own words as follows:

Heroku is a cloud platform that lets companies build, deliver, monitor and scale apps — we’re the fastest way to go from idea to URL, bypassing all those infrastructure headaches.

https://www.heroku.com/what

In simple words, Heroku lets you run internet-accessible applications on their side, with a lifetime you control. Supported languages and other documentation are available at https://devcenter.heroku.com/.

For what I am demonstrating here a free Heroku account is sufficient; register at https://signup.heroku.com/login.

Once you have registered and logged in to your Heroku account, go to the TabPy GitHub page at https://github.com/tableau/TabPy and click the Deploy to Heroku button (the purple button on the screenshot below), or use this link: https://dashboard.heroku.com/new?button-url=https%3A%2F%2Fgithub.com%2Ftableau%2FTabPy&template=https%3A%2F%2Fgithub.com%2Ftableau%2FTabPy.

On the first screen provide your application name (it has to be unique across Heroku applications) and click Deploy app:

Wait for the application to be installed, configured and started (all steps should succeed as shown on the screenshot below):

To access the application click the View button or use the URL https://<your-app-name>.herokuapp.com/ (in the example above it is https://tabpy-heroku-demo.herokuapp.com/).

That is it! You have a TabPy instance accessible via the internet. The hostname for it is <your-app-name>.herokuapp.com and the port is 80:

For a secure connection use port 443 and check the Require SSL checkbox, as the screenshot below demonstrates:

The secure connection uses a certificate from Heroku issued by DigiCert:

There are some limitations to this solution:

  • You may use your own certificate, but it is not free.
  • The port is not configurable – it is always 80 or 443.
  • There is no authentication.

To resolve the limitations above you can configure and create your own Heroku application based on TabPy. What you get with the approach shown here is an easy and fast way to have TabPy up and running, available from anywhere when you need it.
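Once the application is running, its availability can also be checked from Python rather than from the browser. The sketch below uses only the standard library and assumes the instance exposes TabPy's /info REST endpoint. The helper names `tabpy_base_url` and `fetch_info` are illustrative and are not part of TabPy or Heroku.

```python
import json
import urllib.request


def tabpy_base_url(app_name, secure=True):
    """Build the base URL for a TabPy app hosted on Heroku."""
    scheme = "https" if secure else "http"
    # Default ports (443 for https, 80 for http) are implied by the scheme.
    return f"{scheme}://{app_name}.herokuapp.com"


def fetch_info(app_name, timeout=10):
    """Request the /info endpoint over HTTPS and return the decoded JSON."""
    url = tabpy_base_url(app_name) + "/info"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_info with your own application name (for example the tabpy-heroku-demo name used above) performs a real network request, so it only succeeds while the Heroku application is deployed and running.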


TabPy: Using Deployed Functions from Deployed Functions

If you don’t know what deployed functions in TabPy are consider reading the following:

In this post I’ll demonstrate, with extremely simplified code, how to use already deployed functions in other deployed functions.

Assuming TabPy is running on the local machine on port 9004, in a Python session I create and deploy a simple function which capitalizes the first letter of each string in the input parameter:

def my_func_1(s):
  return [x.capitalize() for x in s]

from tabpy.tabpy_tools.client import Client
client = Client('http://localhost:9004')
client.deploy('MyFunction1', my_func_1, 'Capitalize first letter of each string')

When the above is executed, the deployed function is accessible with the Tableau expression SCRIPT_...("return tabpy.query('MyFunction1', _arg1)['response']", ...). But what we are looking for is calling that function from another function.

When a function is deployed it is available in the TabPy context for user scripts, meaning you can use the function's name (the name the function was declared with, not the name given to it when deployed). Here is how the function can be called:

def my_func_2(s):
  c = my_func_1(s)
  return ['_' + x + '_' for x in c]

client.deploy('MyFunction2', my_func_2, 'Underscore and capitalize each string')

Things to pay attention to in the code sample above:

  1. The first deployed function is called by its Python name my_func_1 and not by the name it was deployed with.
  2. Deployed functions (both MyFunction1 and MyFunction2) are used in Tableau calculations with their deployed names:

One caution: how deployed functions are presented and accessible in the context of a script running in TabPy is not documented and not guaranteed to work in the future. The recommended way to reuse Python code is to create packages or use standalone modules, as explained in the post How to use Python modules for TabPy scripts in Tableau.
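Because both functions are plain Python, they can be exercised locally before they are deployed. The sketch below repeats the two definitions from this post and runs them on a sample list; the sample values are made up for illustration.

```python
def my_func_1(s):
    # Capitalize the first letter of each string in the input list.
    return [x.capitalize() for x in s]


def my_func_2(s):
    # Reuse my_func_1 by its Python name, then wrap each result in underscores.
    c = my_func_1(s)
    return ['_' + x + '_' for x in c]


sample = ['hello', 'tabPy']
step_1 = my_func_1(sample)   # ['Hello', 'Tabpy']
step_2 = my_func_2(sample)   # ['_Hello_', '_Tabpy_']
```

Note that str.capitalize() also lowercases the rest of each string, which is why 'tabPy' becomes 'Tabpy'.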


How to install certificates on Linux

Intro

When you run Tableau Server on Linux and need it to connect to a secure TabPy or secure Rserve instance (or any other analytics extension over a secured channel), Tableau Server has to trust the certificate the analytics extension is using. Some more details about Tableau and trusted certificates are in this post: Tableau and Trusted Certificates for Analytics Extensions.

In this post, I will show you how to install a trusted certificate (root or self-signed certificate) on Linux. Remember that Rserve sends Tableau the leaf certificate only, so you may need to install the whole chain as trusted certificates.

NOTE: The instructions below may not work for your specific Linux version; check the documentation for your exact system.

Certificate formats: PEM, DER, PFX, etc.

There are a few different formats a certificate file can be stored in. Only PEM and DER are used in the instructions below. Some details about specific formats and how they are related can be found at https://aboutssl.org/cer-vs-crt/.

PEM and DER are just different encodings of the same data: DER is binary, and PEM is Base64-encoded DER.

One format can be converted to another with OpenSSL. For example, to convert DER to PEM run:

openssl x509 -inform der -in cert.der -out cert.pem
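The relationship between the two encodings can also be seen from Python, whose standard ssl module has helpers for both directions. This is a small sketch with placeholder bytes instead of a real certificate; it only illustrates the encoding and does not validate anything.

```python
import base64
import ssl

# Placeholder bytes standing in for DER data; this is NOT a real certificate.
der_bytes = bytes(range(48))

# DER -> PEM: Base64 text wrapped in BEGIN/END CERTIFICATE lines.
pem_text = ssl.DER_cert_to_PEM_cert(der_bytes)

# PEM -> DER: strip the wrapper lines and decode the Base64 body.
round_trip = ssl.PEM_cert_to_DER_cert(pem_text)

# The PEM body is exactly the Base64 encoding of the DER bytes.
body = "".join(pem_text.strip().splitlines()[1:-1])
same_payload = base64.b64decode(body) == der_bytes
```

For a real certificate you would read the DER bytes from a file such as cert.der and write the resulting PEM text to cert.pem.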

More examples of certificate conversion commands are at https://aboutssl.org/ssl-tools/ssl-converter.php.

NOTE: you only need the certificate (the public part) and not its private key.

RPM-based Linux Steps

The following are the instructions for RPM-based Linux (CentOS, Fedora, Red Hat, etc.).

Copy the PEM certificate to /etc/pki/ca-trust/source/anchors:

sudo cp cert.pem /etc/pki/ca-trust/source/anchors/cert.pem

Run the following command:

sudo update-ca-trust

For the certificate to be picked up by Tableau Server it is recommended to restart the whole machine. Restarting just Tableau Server may work as well but is not guaranteed.

Debian-based Linux Steps

For Debian-based Linux (Debian, Ubuntu, Kubuntu, etc.) use a PEM certificate in a .crt file. That is, the certificate file format is PEM, but the file extension is required to be .crt.

First copy the certificate file to /usr/local/share/ca-certificates:

sudo cp cert.crt /usr/local/share/ca-certificates/cert.crt

Now run the following command:

sudo update-ca-certificates

For the certificate to be picked up by Tableau Server it is recommended to restart the whole machine. Restarting just Tableau Server may work as well but is not guaranteed.
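After installing a certificate with either set of steps, one way to sanity-check the operating system trust store is a TLS handshake from Python. This is a hedged sketch: it assumes Python's default SSL context reads the same system bundle that the commands above update, which is common on Linux but not guaranteed. It also says nothing about whether Tableau Server itself has picked up the change. `build_context` and `is_trusted` are illustrative names, and the host and port you pass are your own.

```python
import socket
import ssl


def build_context():
    """Create a client context that verifies against the default trust store."""
    # create_default_context() enables certificate and hostname verification.
    return ssl.create_default_context()


def is_trusted(host, port, timeout=10):
    """Return True if the TLS handshake with host:port passes verification."""
    context = build_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, ssl.CertificateError):
        # Raised for handshake failures, including an untrusted chain
        # or a hostname that does not match the certificate.
        return False
```

Connection problems unrelated to certificates (for example a refused connection) are deliberately not caught, so they surface as ordinary errors instead of being reported as an untrusted certificate.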
