TabPy v1.1.0 released

TabPy version 1.1.0 is released: package – https://pypi.org/project/tabpy/, release on GitHub – https://github.com/tableau/TabPy/releases/tag/1.1.0.

You can update your TabPy with the following command (you’ll need to stop all running TabPy instances first):

pip install --upgrade tabpy

New release main improvement is for /info method (https://tableau.github.io/TabPy/docs/server-rest.html#get-info) – now it checks for credentials to be provided if TabPy is configured for authentication. The improvement won’t affect any older Tableau Desktop or Tableau Server versions which already had support for TabPy authentication.

For how to configure authentication for TabPy read How to configure TabPy with authentication and use it in Tableau.

How to run TabPy with Anaconda on Linux

What Anaconda is can be explained in many ways depending on how and what you use is for. In the case of TabPy Anaconda serves as Python environments manager which allows having different Python environments (different versions of Python with a different set of packages) on the same machine. To learn more about Anaconda and where it can be used to start at https://www.anaconda.com/.

First question to ask is why you may need to use Python environments. There are many reasons. Couple of the most important are:

  • Your OS may have old version of “system wide” Python (e.g. Python 2.7) and it is not easy (or not possible at all) to update it without breaking something, and
  • You may want to isolate TabPy environment with packages it needs from your other environment (development, training models, etc.), or
  • You may want to have multiple independent TabPy instances each with their own set of packages awailable.

To manage Python environments Anaconda is not the only option, you can use Python virtualenv tools for example – https://virtualenv.pypa.io/en/latest/. However Anaconda has some advantages like nice UI app to manages environments and packages, saving/restoring/sharing enviroments, support for R environments, etc.

And when for Windows and Mac you can use UI app in many cases for Linux GUI is not an option. In other words you will need to use command line. And this is what rest of this post shows how to do.

After login to a Linux machine, you need to download Anaconda installer from https://www.anaconda.com/distribution/. The following commands assume you are in your home folder (run cd ~ to navigate to your home folder). On the Anaconda distribution page find the link for the latest version for Linux and run the following command:

wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh

Now when you have the installer in your home folder run it:

bash Anaconda3-2020.02-Linux-x86_64.sh

When the installer finishes add path to the Anaconda to the environment variable (this assumes bash is your Linux shell):

echo 'export PATH="~/anaconda3/bin:$PATH"' >> ~/.bashrc

In the command above we add anaconda3 subfolder bin from the user home folder to environment variable PATH so anaconda commands can be found regardless of what is the current folder. If you installed Anaconda in a different location the command need to be modified properly.

Now let’s restart shell (or you can unlog and log in again):

source ~/.bashrc

Next let’s update Anaconda:

conda update conda

In the command above conda is the tool which will let us perform different Anaconda operations.

Next step is to create new environment for TabPy:

conda create --name TabPy python=3.7

What happens in the command above is new environment with name TabPy is created and it has Python 3.7 available for it.

You can have as many environments as you need with different or the same Python versions. As all of them are isolated and do not interfere it is possible to have different sets of packages or the same packages of different versions in them.

Now we can activate the environment:

conda activate TabPy

If the command succeeds you’ll the command prompt is updated to have the environment name in it.

And now you can install TabPy and other packages:

python -m pip install --upgrade
pip install tabpy
tabpy

To exit from the environment run deactivate command:

conda deactivate

For how to configure TabPy read https://tabscifi.golovatyi.info/2020/01/tabpy-modifying-default-configuration/.

Add colors to TabPy console output

Let’s brighten the day, shall we?

In one of my previous posts, I explained how logging can be configured for TabPy: levels, the format of the message, where it is sent/stored, etc. Read here for more detail – How to Configure Logging in TabPy?

I also mentioned there is number of handlers from provided with Python (console, file, and others). But it is also possible to use third party or customized handlers. With this post, I am going to show how you can make console logging colorful with third-party formatter.

First, we need a formatter. You can choose any, I for the demonstration purposes picked up colorlog (https://pypi.org/project/colorlog/). The most important thing you need to pay attention to is how the formatter messages are formatted… Yes, you’ll need to define the format for the formatter. For the one I chose documentation for the format arguments is on the PyPi and GitHub (https://github.com/borntyping/python-colorlog) pages for the package.

Let’s install the package:

pip install colorlog

Next, we need a configuration file (additional reading – TabPy: modifying default configuration):

[loggers]
keys=root

[logger_root]
level=INFO
handlers=console

[handlers]
keys=console

[formatters]
keys=console

[handler_console]
class=StreamHandler
level=INFO
formatter=console
args=(sys.stderr,)

[formatter_console]
class=colorlog.ColoredFormatter
format=%(asctime)s [%(bold)s%(log_color)s%(levelname)-8s%(reset)s] %(log_color)s%(message)s%(reset)s
datefmt=%y/%m/%d %H:%M:%S

In the file everything till line 20 is standard: define loggers, handlers, and formatters. And now with the only formatter defined we want to use formatter from the package previously installed. The line class=colorlog.ColoredFormatter tells Python logger to use specified formatter (ColoredFormatter) from colorlog package.

For the format parameter which defines what will be in a logged message we use formatter specific arguments: %(bold) makes text bold, %(log_color) sets the color for following text, and %(reset) changes text attributes to default. Again – for any other formatter you choose to use the arguments list most likely will be different.

Running TabPy with the config and querying it will look something like this:

TabPy v1 released!

It took some time, but TabPy v1 finally happened. Comparing to the previous v0.9.0 there are not so many changes:

To install TabPy or update your TabPy instance to the latest release run the following command:

pip install --upgrade tabpy

Additional reading:

How to Configure Logging in TabPy?

TabPy logs are useful for investigating issues, learning about user activities, how TabPy is used and so on. In my previous post TabPy: modifying default configuration I showed how to change TabPy settings. Logging for TabPy is configured in a similar way and I am going to show and explain some details for how to configure logging.

For the most recent and complete documentation about Python logger configuration, read this documentation page – https://docs.python.org/3/howto/logging.html.

In TabPy config file logger (or rather loggers) are configured with a few sections. A logger itself associated with a logger handler which in its turn depends on a logger formatter:

  • Loggers are used in Python code to initiate logging a message. Based on the severity logger passes the message to corresponding handlers.
  • Handlers are dispatching log messages to the handler’s specified destination. There are handlers for the console, files, etc.
  • Formatters specify how a logged message should look like: order and format of the message properties.

For logger configuration file format read the documentation at https://docs.python.org/3/library/logging.config.html#configuration-file-format.

Following is the example of how to configure logging.

First, section to look at is [loggers] where names of loggers are listed (TabPy will log with all of them). The following example specifies 2 loggers:

[loggers]
keys=root,fileLogger

NOTE: there have to be a logger (it can be the only logger) named “root” in the list of loggers.

Next is the [handlers] section which lists handlers:

[handlers]
keys=consoleHandler,fileHandler

And finally formatters section [formatter] lists all the formatters (not all of them have to be used):

[formatters]
keys=consoleFormatter,fileFormatter

Now for each logger in [logger] section there is [logger_<name>] section with settings for a logger:

[logger_root]
level=WARNING
handlers=consoleHandler

[logger_fileLogger]
level=DEBUG
handlers=fileHandler
propgate=1
qualname=consoleLogger

As you can see from the example above for each logger severity level is specified which means only messages with specified severity of higher. The level can be one of DEBUG, INFO, WARNING, ERROR, CRITICAL or NOTSET (all messages will be logged).

Handlers for a logger are listed with handlers parameter. It is possible to have more than one handler for a logger.

For non-root loggers properties propgate and qualname need to be set. The first one specifies if messages need to be propagated to a handler higher up and is set to 1 or 0. And qualname sets the name for the logger so it can be referenced from Python code.

For each handler there is [handler_<name>] section:

[handler_consoleHandler]
class=StreamHandler
level=WARNING
formatter=consoleFormatter
args=(sys.stdout,)

[handler_fileHandler]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=fileFormatter
args=('tabpy_log.log', 'a', 1000000, 5)

For each formatter there is level parameter which is configured in the same way as for a logger.

With class parameter a Python class which implements a handler is specified (more details below). It is possible to use the same class for different handlers. As an example, you can have one large log file where entries are appended and the same messages logged to date specific log files.

Formatter for logged messages is set with formatter parameter. Again – the same formatter can be used with multiple handlers.

And args parameter provides logger specific parameters.

Handler classes available with Python are listed at https://docs.python.org/3/library/logging.handlers.html#module-logging.handlers, but it is possible to use any custom handlers (e.g. for colorful output in the console) with other packages installed in the Python environment. Some useful handlers are:

  • StreamHandler sends logging to a stream (e.g. console).
  • FileHandler appends messages to a file.
  • NullHandler does not log anything.
  • RotatingFileHandler logs to a file until log file size limit is reached and then creates a new file and logs to it. The handler is used in the example above and it will create a new log file when the current one reaches 1000000 bytes in size. The current file name is always tabpy_log.log, previous log file will be named tabpy_log.log.1 and so on till tabpy_log.log.5.
  • TimedRotatingFileHandler is similar to RotatingFileHandler but creates a new file after the specified time interval.
  • HTTPHandler sends messages to a web server with GET or POST command.

For each formatter printf-style string formatting string (documentation is here – https://docs.python.org/3/library/stdtypes.html#old-string-formatting) specifies how the format message is built. In the format string additional log objects can be used (list of the log objects is here – https://docs.python.org/3/library/logging.html#logging.LogRecord):

[formatter_consoleFormatter]
format=%(asctime)s: %(message)s
datefmt=%H:%M:%S

[formatter_fileFormatter]
format=%(asctime)s [%(levelname)s] (%(filename)s:%(module)s:%(lineno)d): %(message)s
datefmt=%Y-%m-%d,%H:%M:%S

In this example console formatter will output all the messages with preceding timestamp and for file logs there will be date-time stamp, severity of the message, where in the code it was logged from and the message itself.

Now the whole config file (I have it on my machine saved as c:\demo\tabpy\tabpy.conf:

[loggers]
keys=root,fileLogger

[formatters]
keys=consoleFormatter,fileFormatter

[handlers]
keys=consoleHandler,fileHandler

[logger_root]
level=WARNING
handlers=consoleHandler

[logger_fileLogger]
level=DEBUG
handlers=fileHandler
propgate=1
qualname=consoleLogger

[handler_consoleHandler]
class=StreamHandler
level=WARNING
formatter=consoleFormatter
args=(sys.stdout,)

[handler_fileHandler]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=fileFormatter
args=('tabpy_log.log', 'a', 1000000, 5)

[formatter_consoleFormatter]
format=%(asctime)s: %(message)s
datefmt=%H:%M:%S

[formatter_fileFormatter]
format=%(asctime)s [%(levelname)s] (%(filename)s:%(module)s:%(lineno)d): %(message)s
datefmt=%Y-%m-%d,%H:%M:%S

Starting TabPy with the config:

tabpy --config c:\demo\tabpy\tabpy.conf

After running a few requests against my local TabPy instance this is what I see in the console:

15:19:37: Responding with status=404, message="Unknown endpoint", info="Endpoint olek_add is not found"
15:19:37: 404 GET /endpoints/olek_add (::1) 5.03ms

And for the same TabPy there’s much much more information in tabpy_log.log file:

2019-12-04,15:14:54 [DEBUG] (app.py:app:215): Parameter port set to "9004" from default value
2019-12-04,15:14:54 [DEBUG] (app.py:app:215): Parameter server_version set to "0.8.9" from default value
2019-12-04,15:14:54 [DEBUG] (app.py:app:215): Parameter evaluate_timeout set to "30" from default value
...
2019-12-04,15:15:17 [INFO] (app.py:app:110): Initializing TabPy...
...
2019-12-04,15:15:17 [INFO] (app.py:app:93): Web service listening on port 9004