Running TabPy with Python Virtual Environment

In my post How to run TabPy with Anaconda on Linux I showed how to use Anaconda to create isolated Python environments so you can have different versions of Python with different packages installed all on the same machine.

If you cannot use Anaconda for any reason, there is another way to separate Python environments on a machine: the Python virtual environment package.

For the steps below I am using a virtual machine running CentOS:

[ogolovatyi@3250e1583c74401 ~]$ hostnamectl
...
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-693.5.2.el7.x86_64
      Architecture: x86-64

With minor modifications, all these steps can be used on macOS or Windows.

To make things more interesting the machine has Python 2.7 on it:

[ogolovatyi@3250e1583c74401 ~]$ python -V
Python 2.7.5

Since TabPy requires at least Python 3.5, let’s install it first (on your Linux distribution you may need to use a package manager other than yum):

sudo yum install python3
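A minimal sketch for checking that the installed interpreter meets the minimum version before going further. The helper name python_at_least is hypothetical, and the 3.5 threshold is the one mentioned above; raise it if your TabPy release needs a newer Python.

```shell
# Hypothetical helper: succeeds if version "$1" (e.g. 3.6.8) is at least
# major "$2" and minor "$3".
python_at_least() {
    major=${1%%.*}          # text before the first dot
    rest=${1#*.}            # text after the first dot
    minor=${rest%%.*}       # text before the next dot
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# "python3 -V" prints something like "Python 3.6.8"; keep only the number.
if command -v python3 >/dev/null 2>&1; then
    version=$(python3 -V 2>&1 | awk '{print $2}')
    if python_at_least "$version" 3 5; then
        echo "python3 $version is new enough for TabPy"
    else
        echo "python3 $version is too old for TabPy"
    fi
else
    echo "python3 is not installed"
fi
```

The check only reads the version, so it is safe to run before and after the installation.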

Note that the python command will still run Python 2.7. You can change that, but doing so may cause some other commands and tools to stop working properly. Instead I use the python3 command, which runs the newly installed Python 3.6.8. It is recommended to update pip to the latest version:

sudo python3 -m pip install --upgrade pip

Now install the virtualenv package:

python3 -m pip install virtualenv

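The next command creates a virtual environment in the folder where it is executed. It uses venv, the virtual environment module built into Python 3; with the virtualenv package the equivalent command is python3 -m virtualenv TabPy-venv: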

python3 -m venv TabPy-venv

After the command above succeeds you can see that the TabPy-venv folder was created. You can create multiple virtual environments using the same command – just make the names unique.

To activate the environment run the following:

[ogolovatyi@3250e1583c74401 ~]$ source TabPy-venv/bin/activate

Note how the command line prompt changes after the command above succeeds – it now has the virtual environment name in it. Additionally, the python command now runs the Python 3.6.8 I installed before:

(TabPy-venv) [ogolovatyi@3250e1583c74401 ~]$ python -V
Python 3.6.8

All packages installed while the virtual environment is active go into the folder for the environment (TabPy-venv for the shown steps) and won’t affect the “system” Python or any other Python versions or environments.

Now let’s update pip (remember it is a different pip this time – the one for the virtual environment) and install TabPy:

pip install --upgrade pip
...
pip install tabpy

If you check where TabPy is installed you’ll see something like this:

(TabPy-venv) [ogolovatyi@3250e1583c74401 ~]$ whereis tabpy
tabpy: /home/....../ogolovatyi/TabPy-venv/bin/tabpy
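The same check can be scripted. This is a minimal sketch that relies on the VIRTUAL_ENV variable exported by the activate script; the helper name runs_from_venv is hypothetical.

```shell
# Hypothetical helper: succeeds if command "$1" resolves to a file inside
# the active virtual environment (activate exports VIRTUAL_ENV).
runs_from_venv() {
    [ -n "${VIRTUAL_ENV:-}" ] || return 1        # no environment is active
    path=$(command -v "$1") || return 1          # command not found at all
    case "$path" in
        "$VIRTUAL_ENV"/*) return 0 ;;            # lives under the environment
        *) return 1 ;;                           # comes from somewhere else
    esac
}

if runs_from_venv tabpy; then
    echo "tabpy runs from $VIRTUAL_ENV"
else
    echo "tabpy is not provided by an active virtual environment"
fi
```

A guard like this is handy at the top of a start script, so TabPy is never launched by accident from the system Python.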

To start TabPy run the usual tabpy command or specify config for TabPy (more details at TabPy: modifying default configuration).

To exit the virtual environment simply run the deactivate command.

Additional information about virtualenv can be found at https://virtualenv.pypa.io/en/stable/.

How to install certificates on Linux

Intro

When Tableau Server runs on Linux and needs to connect to a secure TabPy or Rserve instance (or any other analytics extension over a secured channel), it needs to trust the certificate the analytics extension is using. More details about Tableau and trusted certificates are in this post – Tableau and Trusted Certificates for Analytics Extensions.

In this post, I will show you how to install a trusted certificate (root or self-signed certificate) on Linux. Remember that Rserve sends only the leaf certificate to Tableau, so you may need to install the whole chain as trusted certificates.

NOTE: Instructions below may not work for your specific Linux version – check the documentation for your exact system.

Certificate formats: PEM, DER, PFX, etc.

There are a few different formats a certificate file can be stored in. For the instructions below only PEM and DER are used. Some details about specific formats and how they are related can be found at https://aboutssl.org/cer-vs-crt/.

PEM and DER are just different encodings of the same data. DER is binary and PEM is Base64-encoded DER.

One format can be converted to another with OpenSSL. For example, to convert DER to PEM run:

openssl x509 -inform der -in cert.der -out cert.pem
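Before converting, it helps to know which encoding a file already uses. The sketch below looks for the PEM certificate header and treats everything else as DER; that fallback is an assumption, not a real validation, and the helper name cert_encoding is hypothetical.

```shell
# Hypothetical helper: prints "pem" if file "$1" contains the PEM
# certificate header, otherwise "der" (assumed, not verified).
cert_encoding() {
    if grep -q -e '-----BEGIN CERTIFICATE-----' "$1"; then
        echo pem
    else
        echo der
    fi
}

# Convert only when needed; cert.der and cert.pem are the example file
# names from the command above.
if [ -f cert.der ] && [ "$(cert_encoding cert.der)" = der ]; then
    openssl x509 -inform der -in cert.der -out cert.pem
fi
```

Files with a .cer or .crt extension can hold either encoding, so checking the content is more reliable than trusting the extension.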

More examples of certificate conversion commands are at https://aboutssl.org/ssl-tools/ssl-converter.php.

NOTE: you only need the certificate (the public part), not its private key.

RPM-based Linux Steps

The following are the instructions for RPM-based Linux (CentOS, Fedora, Red Hat, etc.).

Copy the PEM certificate to /etc/pki/ca-trust/source/anchors:

sudo cp cert.pem /etc/pki/ca-trust/source/anchors/cert.pem

Run the following command:

sudo update-ca-trust

For the certificate to be picked up by Tableau Server it is recommended to restart the whole machine. Restarting just Tableau Server may work as well but is not guaranteed.

Debian-based Linux Steps

For Debian-based Linux (Debian, Ubuntu, Kubuntu, etc.) use a PEM certificate in a .crt file: the certificate format is PEM, but the file extension is required to be .crt.

First copy the certificate file to /usr/local/share/ca-certificates:

sudo cp cert.crt /usr/local/share/ca-certificates/cert.crt

Now run the following command:

sudo update-ca-certificates

For the certificate to be picked up by Tableau Server it is recommended to restart the whole machine. Restarting just Tableau Server may work as well but is not guaranteed.
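The two sets of steps differ only in the target folder, the refresh command and the file extension. The sketch below prints the commands for a given distribution family instead of running them, so they can be reviewed first. All helper names are hypothetical, and only the two families described above are covered.

```shell
# Hypothetical helper: prints the trusted-anchor folder for family "$1".
anchor_dir_for() {
    case "$1" in
        rpm)    echo /etc/pki/ca-trust/source/anchors ;;
        debian) echo /usr/local/share/ca-certificates ;;
        *)      return 1 ;;
    esac
}

# Hypothetical helper: prints the command that refreshes the trust store.
refresh_cmd_for() {
    case "$1" in
        rpm)    echo update-ca-trust ;;
        debian) echo update-ca-certificates ;;
        *)      return 1 ;;
    esac
}

# Prints (does not run) the steps to install PEM file "$2" on family "$1".
print_install_steps() {
    dir=$(anchor_dir_for "$1") || return 1
    tool=$(refresh_cmd_for "$1") || return 1
    name=$(basename "$2")
    if [ "$1" = debian ]; then
        name="${name%.*}.crt"      # Debian requires the .crt extension
    fi
    echo "sudo cp $2 $dir/$name"
    echo "sudo $tool"
}

print_install_steps rpm cert.pem
print_install_steps debian cert.pem
```

Printing rather than executing keeps the sketch harmless on a machine where you do not have (or do not want to use) root access.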


How to run TabPy with Anaconda on Linux

What Anaconda is can be explained in many ways depending on how and what you use it for. In the case of TabPy, Anaconda serves as a Python environment manager, which allows having different Python environments (different versions of Python with different sets of packages) on the same machine. To learn more about Anaconda and where it can be used, start at https://www.anaconda.com/.

The first question to ask is why you may need Python environments. There are many reasons. Some of the most important are:

  • Your OS may have an old version of “system wide” Python (e.g. Python 2.7) and it is not easy (or not possible at all) to update it without breaking something.
  • You may want to isolate the TabPy environment with the packages it needs from your other environments (development, training models, etc.).
  • You may want to have multiple independent TabPy instances, each with its own set of packages available.

Anaconda is not the only option for managing Python environments; you can use the Python virtualenv tools, for example – https://virtualenv.pypa.io/en/latest/. However, Anaconda has some advantages, such as a nice UI app to manage environments and packages, saving/restoring/sharing environments, support for R environments, etc.

While on Windows and Mac you can use the UI app, on Linux a GUI is in many cases not an option. In other words, you will need to use the command line, and the rest of this post shows how to do that.

After logging in to a Linux machine, you need to download the Anaconda installer from https://www.anaconda.com/distribution/. The following commands assume you are in your home folder (run cd ~ to navigate to your home folder). On the Anaconda distribution page find the link for the latest version for Linux and run the following command:

wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

Now that you have the installer in your home folder, run it:

bash Anaconda3-2020.02-Linux-x86_64.sh

When the installer finishes, add the path to Anaconda to the PATH environment variable (this assumes bash is your Linux shell):

echo 'export PATH="$HOME/anaconda3/bin:$PATH"' >> ~/.bashrc

In the command above we add the bin subfolder of anaconda3 in the user home folder to the PATH environment variable so Anaconda commands can be found regardless of the current folder. If you installed Anaconda in a different location, the command needs to be modified accordingly.

Now let’s reload the shell configuration (or you can log out and log in again):
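One caveat: every run of an echo ... >> ~/.bashrc command appends another copy of the line. A minimal sketch of an append that is safe to repeat follows; the helper name append_once is hypothetical, and a temporary file stands in for ~/.bashrc so the demonstration does not touch your real startup file.

```shell
# Hypothetical helper: appends line "$1" to file "$2" unless the file
# already contains exactly that line.
append_once() {
    grep -qxF -e "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

rc_file=$(mktemp)       # stand-in for ~/.bashrc in this demonstration
append_once 'export PATH="$HOME/anaconda3/bin:$PATH"' "$rc_file"
append_once 'export PATH="$HOME/anaconda3/bin:$PATH"' "$rc_file"
grep -c anaconda3 "$rc_file"    # prints 1: the second call added nothing
rm -f "$rc_file"
```

The line written here uses $HOME rather than ~ because a tilde inside double quotes is not expanded when the variable is assigned.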

source ~/.bashrc

Next let’s update Anaconda:

conda update conda

In the command above conda is the tool which will let us perform different Anaconda operations.

The next step is to create a new environment for TabPy:

conda create --name TabPy python=3.7

The command above creates a new environment named TabPy with Python 3.7 available in it.

You can have as many environments as you need with different or the same Python versions. As all of them are isolated and do not interfere with each other, they can hold different sets of packages or different versions of the same packages.

Now we can activate the environment:

conda activate TabPy

If the command succeeds you’ll see that the command prompt is updated to have the environment name in it.
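The prompt is not available to scripts, but conda activate also exports the CONDA_DEFAULT_ENV variable with the environment name. A minimal sketch of a check built on it; the helper name conda_env_active is hypothetical.

```shell
# Hypothetical helper: succeeds if the conda environment named "$1" is
# the active one ("conda activate" exports CONDA_DEFAULT_ENV).
conda_env_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if conda_env_active TabPy; then
    echo "TabPy environment is active; pip installs go into it"
else
    echo "TabPy environment is not active; run: conda activate TabPy"
fi
```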

And now you can install TabPy and other packages:

python -m pip install --upgrade pip
pip install tabpy
tabpy

To exit the environment run the deactivate command:

conda deactivate

For how to configure TabPy read https://tabscifi.golovatyi.info/2020/01/tabpy-modifying-default-configuration/.