binfalse

Practical Challenges of Interdisciplinary Teamwork

October 4th, 2019

Working in an interdisciplinary field is my vocation. I am a Bioinformatician – my whole education was shaped by an interdisciplinary nature. I get paid by the University of Rostock, which is also anything but straightforward: Part of the time I work as a systems engineer for the institute of computer science, the rest of the time I work at the department of systems biology and bioinformatics (SBI). Thus, I have two desks at the university, which are approximately 2km apart. At these desks I am doing hardcore IT stuff and/or/xor applied computer science in various projects and roles. My colleagues are again purely non-pure: There are computer scientists from Italy, mathematicians from India, medical biotechnologists from Canada, neurobiologists from Germany, computational engineers from Pakistan, and so on. At one desk I am talking in English, at the other everyone’s expecting German (including complaints about anglicisms).

Here I’m just jotting down some experiences from the recent past to increase the awareness of the complexity on the meta-level of interdisciplinary and intercultural collaborations.

Different Cultures…

Pretty early I learned about the issues when working across different domains, languages, and cultures. For example, a “yes” from a German to “Do you understand what I just explained?” typically means (s)he understood what I just explained. However, in some cultures people would not publicly admit a lack of understanding. Thus, my colleague will say (s)he understood and then go away to actually do the opposite.

Some time ago, I offered a beer to an Asian friend. He declined and so he did not get one. Months later he confessed that he had been suffering, watching me enjoy the cold beer – as he desperately wanted a beer as well! But in his culture you wouldn’t immediately take something offered. Instead, you want to be persuaded into a beer…

Hence, a “no” does not necessarily mean no. And vice versa, if he offers me some tea or sweets, he would not take my “no” at face value, but would offer the sweets again and again ;-)

Different Languages…

Such contradictions are not necessarily a cultural issue, but are sometimes due to the language: If our mother tongues differ we typically fall back to English, which is then a foreign language for both dialogue partners. Consequently, everyone struggles to express and grasp thoughts. This entails a good potential for misunderstandings. In addition, there may be a clash of domain languages – experts from different fields think in orthogonal concepts or use the same words differently.

I recently had a kitchen-conversation with a biophysicist from Iran about PCR: The polymerase chain reaction. For quite some time I thought he must be drunk, because what he was talking about made absolutely no sense. Until I realised that he was actually talking about PCA: The Principal Component Analysis! Both PCR and PCA would have made sense to chat about with him, and the pronunciation of the German A (ʔaː) and the English R (ɑːr) is quite similar.

There are many similar confusions. At our department, for example, PSA is used for a public service announcement or for a prostate-specific antigen – that’s not always crystal clear. Similarly, APT is an abbreviation for apartment at one of my desks, while it may mean advanced persistent threat or that someone is talking about Debian’s package manager at the other desk.

Different Tools…

However, it is not only about languages, but also about best practices in different domains! People from diverse disciplines learnt to use diverging tools and workflows that sometimes seem crazy from the opposite perspective.

Not long ago, we built a website in an interdisciplinary project. The developer drafted some text for the webpage and asked the others to review the wording and correct typos – expecting to get a pull request, as the sources are shared on a common code platform. However, the response was an email with a .docx attachment: The whole text of the web page was copied to Microsoft Word and then corrected using track-changes! Which in turn caused further trouble for the developer, who’s not used to working with Microsoft Office… ;-)

Indeed, that happens all too often!

The other day, someone sent an email:

Please kindly find the attached file, the first draft of workflow.
I am looking forward to your feedback.

Attached was a file Präsentation2.pptx. That, of course, made the tech-guy’s hair stand on end! PowerPoint to draw figures? A meaningless file name with a German diaeresis (Umlaut)? And the 2 in the file’s name explains everything about how the documents are versioned.

And so goes the whole communication between the experts from different domains. While some always communicate through tickets on the coding platform, others will respond with attachments to emails or using some other channels (such as Twitter messages or whatever chat protocols). Consequently, you are spending a significant amount of time on searching, jigsawing, and puzzling messages.

Different Times…

The communication becomes even more difficult if the partners are located in different time zones. Obviously, there is then less overlap in working hours.

When I write an email to a collaborator in New Zealand, he will typically receive it around the middle of his night and answer in the middle of my night. For a call, we need to schedule a meeting which is out-of-office-time for both of us.

Consequently, decisions that would have been made in a few minutes during a f2f meeting can take several days of discussions.

Different Governments…

Working across different time zones typically also implies working across loyalties. My collaborators may need to comply with very different laws – or they may be affected by other absurd rules!

In a recent project we decided to use one of these big American platforms to facilitate our collaboration. Suddenly, it turned out that some people in the team cannot access the platform anymore. Even though they did nothing wrong, a wigged carrot on steroids violently banned them with embargos or sanctions. Including all consequences.

Such things are simply unpredictable, but have serious impact on the collaboration.

Ergo… Stop?

With all these difficulties, should you stop interdisciplinary teamwork? Certainly not!! Instead, be aware of these challenges and budget some extra time

to learn how to speak to one another without confusion,
to acknowledge the complexity on the meta-level of interactions, and
for unexpected interruptions.

Despite all the difficulties, it’s great to work in diverse teams! Even though it drove me crazy multiple times, I learnt to appreciate decelerations. Different skills, contradictory perspectives, and orthogonal peculiarities entail many discussions and cost a great deal of energy, but almost always improve the quality of the product.

In addition, and maybe more importantly, working in an interdisciplinary field expands your horizon and you will learn things you cannot imagine.

A recent visitor from Hong Kong shared insights about the current protests in his home country. I had a conversation with two colleagues from India and Pakistan about the Kashmir conflict. And I actually felt the effects of embargos - which are otherwise far away from Germans.

However, the outcome is absolutely worth the “trouble” ;-)

Dockerising Contao 4

July 17th, 2019

Last year, we moved the website of our department from Typo3 to Contao version 3. I wrote about that in Dockerising a Contao website and Dockerising a Contao website II. Now it was time to upgrade from Contao version 3 to 4. And as usual: Things have changed… So, how to jail a Contao 4 into a Docker container?

Similar to Contao 3, we use two images for our Contao 4 site. One is a general Contao 4 installation, the other one is our personalised version.

A general Contao 4 image

The general Contao 4 image is based on a PHP image that includes an Apache webserver. In addition, we need to

install a few dependencies,
enable some Apache modules,
install some extra PHP extensions,
install Composer,
and finally use Composer to install Contao.

This time, I outsourced the installation of Composer into a separate script install-composer.sh:

#!/bin/sh

EXPECTED_SIGNATURE="$(wget -q -O - https://composer.github.io/installer.sig)"
php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"
ACTUAL_SIGNATURE="$(php -r "echo hash_file('sha384', 'composer-setup.php');")"

if [ "$EXPECTED_SIGNATURE" != "$ACTUAL_SIGNATURE" ]
then
    >&2 echo 'ERROR: Invalid installer signature'
    rm composer-setup.php
    exit 1
fi

mkdir -p /composer/packages

php -d memory_limit=-1 composer-setup.php --install-dir=/composer
RESULT=$?
rm composer-setup.php
#chown -R www-data: /composer

exit $RESULT

Thus, you’ll find a current composer installation in /composer.

The Dockerfile for the general image then boils down to the following:

FROM php:7-apache
MAINTAINER martin scharm <https://binfalse.de/contact/>

RUN apt-get update \
 && apt-get install -y -q --no-install-recommends \
    wget \
    curl \
    unzip \
    zlib1g-dev \
    libpng-dev \
    libjpeg62-turbo \
    libjpeg62-turbo-dev \
    libcurl4-openssl-dev \
    libfreetype6-dev \
    libmcrypt-dev \
    libxml2-dev \
    libzip-dev \
    ssmtp \
 && apt-get clean \
 && rm -r /var/lib/apt/lists/* \
 && a2enmod expires headers rewrite

RUN docker-php-source extract \
 && docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/ \
 && docker-php-ext-install -j$(nproc) zip gd curl pdo pdo_mysql soap intl \
 && docker-php-source delete

ADD install-composer.sh /install-composer.sh

RUN bash /install-composer.sh \
 && chown -R www-data: /var/www

USER www-data

RUN php -d memory_limit=-1 /composer/composer.phar create-project contao/managed-edition /var/www/html '4.4.*'

PLEASE NOTE: sSMTP is not maintained anymore! Please switch to msmtp, for example, as I explained in Migrating from sSMTP to msmtp.

This image includes the package for sSMTP to enable support for mails. To learn how to configure sSMTP, have a look into my earlier article Mail support for Docker’s php:fpm.

Altogether, this gives us a proper recipe to get a dockerised Contao 4. It is also available from the Docker Hub as binfalse/contao.

A personalised Contao 4 image

Based on that general Docker image, you can now create your personalised Docker image. There is a template in the corresponding GitHub repository.

A few things worth mentioning:

After installing additional Contao modules, you should clear Contao’s cache using:

RUN php -q -d memory_limit=-1 /var/www/html/vendor/contao/manager-bundle/bin/contao-console cache:clear --env=prod --no-warmup

Contao still does not respect the HTTP_X_FORWARDED_PROTO header… Thus, if running behind a reverse proxy, Contao assumes it’s accessed through plain HTTP and won’t deliver HTTPS links. I explained that in Contao 3: HTTPS vs HTTP. However, the workaround for Contao 3 doesn’t work anymore - and there seems to be no proper solution for Contao 4. Therefore, we need to inject some code into the app.php… Yes, you read correctly… Ugly, but anyway, it can easily be done using:

RUN sed -i.bak 's%/\*%$_SERVER["HTTPS"] = 1;/*%' /var/www/html/web/app.php

The composer-based installation apparently fails to set the files’ links. Thus we need to do it manually:

RUN mkdir -p /var/www/html/files && ln -s /var/www/html/files /var/www/html/web/files

Everything else should be pretty self-explanatory…

Tying things together

Use Docker-Compose or whatever to spawn a container of your personalised image (similar to Contao 3: Docker-Compose).

Just make sure you mount a few things correctly into the container:

your files need to go to /var/www/html/files
Contao’s configuration belongs to /var/www/html/system/config/*.php, as usual
Symfony’s configuration belongs to /var/www/html/app/config/parameters.yml and /var/www/html/app/config/config.yml
For the mail configuration see Mail support for Docker’s php:fpm

Please note that the database connection must be configured in Symfony’s parameters.yml, and not in Contao’s localconfig.php as it used to be for Contao 3.

Puppet to deploy Matlab

June 27th, 2019

Merge of Puppet Logo [Puppet_Logo.svg, Public domain] and MathWorks, Inc. [Matlab_Logo.svg, CC0], via Wikimedia Commons

If you’re coming from a scientific environment you’ve almost certainly heard of Matlab, haven’t you? This brutally large software blob that can do basically all the math magic for people with minimal programming skills ;-)

However, in a scientific environment you may need to deploy that software to a large number of Windows PCs. And lazy admins being lazy… We have tools for that! For example Puppet.

Deployment

Here I assume that you have a network license server somewhere in your local infrastructure. And I further assume that you already know how to install Matlab manually by answering all the questions in the installer GUI - so that you’ll end up with a working Matlab installation.

0. What we need

To deploy Matlab we need to have a few things ready:

the Matlab binaries. They typically come in form of two DVD images in ISO format.
a license key, which typically looks like a large number of integers separated by dashes: 12345-67890-12343-....
a license file, that contains information on the license server etc
a puppet manifest - I’ll assume it’s called MODULE/manifests/matlab.pp
a directory that is shared through Puppet - I will assume it’s the /share/ directory. Configure that for example in /etc/puppetlabs/puppet/fileserver.conf using:

[share]
    path /share/
    allow *

1. Unpack the Matlab files

We need to extract the Matlab binaries from both ISO images. There are many ways to access the files, e.g.

open the files with an archive manager

xarchiver /path/to/matlab.iso

mount them using loop devices

mount -o loop /path/to/matlab.iso /mnt

or “uncompress” them using 7zip

7z x /path/to/matlab.iso

Whatever you’re using, you need to merge all the files of both images into a single directory, including the two hidden files .dvd1 and .dvd2! The target directory should be shared through Puppet. So move all files to /share/matlab/. If there is now a file called /share/matlab/.dvd1 and another file /share/matlab/.dvd2 on your system chances are good that you’re all set up :)

Afterwards, also put the license file into that directory (it’s typically called license.dat, save it as /share/matlab/license.dat).
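The merge step can also be scripted. The following is only a rough sketch in Python, assuming both ISO images have already been mounted or extracted into two separate directories; the function name and the example paths are made up for illustration:

```python
import shutil
from pathlib import Path


def merge_dvd_dirs(dvd_dirs, target):
    """Copy the contents of all extracted DVD directories into target.

    Returns True if both hidden marker files (.dvd1 and .dvd2) ended up
    in the target directory.
    """
    target = Path(target)
    for dvd_dir in dvd_dirs:
        # dirs_exist_ok merges into an existing tree (Python >= 3.8);
        # hidden files such as .dvd1 are copied as well
        shutil.copytree(dvd_dir, target, dirs_exist_ok=True)
    return all((target / marker).is_file() for marker in (".dvd1", ".dvd2"))
```

Calling merge_dvd_dirs(["/mnt/dvd1", "/mnt/dvd2"], "/share/matlab") (hypothetical mount points) should return True once both images are merged. Files that exist in both images are simply overwritten by the later one.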

2. Prepare an input file for the installer

Ever installed Matlab? It will ask a lot of questions.. But we can avoid those by giving the answers in a file called installer_input.txt! You will find a skeleton in /share/matlab/installer_input.txt. Just copy that file to your module’s templates directory and append .erb to its name; this will make it a template for our module. Go through that MODULE/templates/installer_input.txt.erb file and replace static settings with static strings, and variable settings with ERB syntax. You should have at least the following lines in that file:

## SPECIFY INSTALLATION FOLDER
destinationFolder=<%= @matlab_destination %>

## SPECIFY FILE INSTALLATION KEY 
fileInstallationKey=<%= @matlab_licensekey %>

## ACCEPT LICENSE AGREEMENT  
agreeToLicense=yes

## SPECIFY INSTALLER MODE 
mode=silent

## SPECIFY PATH TO LICENSE FILE (Required for network license types only)
licensePath=<%= @matlab_licensepath %>

We’ll fill the variables in the module’s manifest.

3. Prepare the installation

Go ahead and open MODULE/manifests/matlab.pp in your preferred editor.

First, we need to define a few variables (a) for the installer_input.txt.erb template and (b) for the rest of the manifest:

$matlabid = "2018b"
$matlab_installpath = "C:\\tmp\\install\\matlab${matlabid}"
$matlab_installer = "${matlab_installpath}\\setup.exe"
$matlab_licensepath = "${matlab_installpath}\\license.dat"
$matlab_licensekey = "12345-67890-12343-...."
$matlab_input = "C:\\tmp\\install\\matlab-installer_input.txt"
$matlab_destination = "C:\\Program Files\\MATLAB\\R${matlabid}"

I guess that is all self-explanatory? Here, we’re installing a Matlab version 2018b. We’ll download the shared Matlab files to C:\\tmp\\install\\matlab2018b. And we’ll expect the installed Matlab tool in C:\\Program Files\\MATLAB\\R${matlabid}

So let’s go and copy all the files from Puppet’s share:

file { "install files for matlab":
    ensure => present,
    path => $matlab_installpath,
    source => "puppet:///share/matlab",
    recurse => true,
    notify => Package["MATLAB R${matlabid}"],
    require => File["C:\\tmp\\install"],
}

So we’re downloading puppet:///share/matlab to $matlab_installpath (=C:\\tmp\\install\\matlab${matlabid}). This requires the directory C:\\tmp\\install to be created beforehand. So make sure you created it, eg using:

file { "C:\\tmp":
    ensure => directory,
}
file { "C:\\tmp\\install":
    ensure => directory,
    require => File["C:\\tmp"],
}

Next we’ll create the installer input file based on our template:

file { $matlab_input:
    content => template('MODULE/installer_input.txt.erb'),
    ensure => present,
    require => File["install files for matlab"],
    notify => Package["MATLAB R${matlabid}"],
}

This will basically read our installer_input.txt.erb, replace the variables with our settings above, and write it to $matlab_input (=C:\\tmp\\install\\matlab-installer_input.txt).

That’s it. We’re now ready to tell Puppet how to install Matlab!

4. Launch the installer

The installation instructions can be encoded by a final package block in the manifest:

package { "MATLAB R${matlabid}":
    ensure => installed,
    source => "$matlab_installer",
    require => [
        File[$matlab_input],
        File["install files for matlab"]
    ],
    install_options => ['-inputFile', $matlab_input],
}

Thus, if MATLAB R${matlabid} is not yet installed on the client machine, Puppet will run

$matlab_installer -inputFile $matlab_input

which will expand with our variable-setup above to

C:\tmp\install\matlab2018b\setup.exe -inputFile C:\tmp\install\matlab-installer_input.txt

All right, that’s it. Just assign this module to your clients and they will start installing Matlab automagically :)

Thunar's volatile default application

June 27th, 2019

Thunar's hammer in the Xfce project [GPL (http://www.gnu.org/licenses/gpl.html)], via Wikimedia Commons

Thunar (Xfce’s file manager) has a rather unintuitive way of selecting the default app: For some file types it seems that choosing a program from the context menu’s “Open With…” overwrites the default application for that file type… That means, once I open a PNG file with Gimp, Gimp becomes the default for PNGs and double clicking the next PNG will result in a >300 ms delay to launch Gimp. Strangely, that only happens for some file types. Others seem to be invariant to the open-with-selection…? Anyway, it bugged me enough to finally look into it..

It seems that this was a design decision within the Xfce project: If you actively selected a default application it will stay the default application, even if you temporarily open-with another application. If you did not actively select a default application, the last application will be used by default -> this is my annoying use case.

At least, I now know what needs to be done: Actively select a default application…

You can do it using the UI by right-clicking a file of the type and selecting Open With Other Application…. Then select the desired application and make sure you tick Use as default for this kind of file. From then on, this will be your default application, until you actively change it.

That may be a good solution for many of you, but it’s also pretty tedious to find and right-click all the different file types. And of course it’s not the way I’m working. There must be a nicer option - and there is! The configuration for Thunar’s mime type bindings is stored in ~/.config/mimeapps.list :)

This file contains two sections:

[Added Associations] contains a list of known file types and possible associations to applications
[Default Applications] is a list of file types and … their default application…

Thus, to add another default-application-association, you just need to append another line to the [Default Applications] section. You may just copy a line from the [Added Associations] and reduce the number of applications to one, e.g. for PNG images:

[Added Associations]
image/png=eom.desktop;eog.desktop;gimp.desktop
...

[Default Applications]
image/png=eog.desktop

If your desired application is not yet in the list of Added Associations, you may find it in /usr/share/applications/. If you still cannot find an application, you can generate a new one. Just create a file ~/.local/share/applications/YOURAPP.desktop containing something like this:

[Desktop Entry]
Encoding=UTF-8
Version=1.0
Type=Application
NoDisplay=true
Exec="/PATH/TO/YOUR/APPLICATION" %f
Name=YOURAPP
Comment=YOURAPP COMMENT

Afterwards, you can use YOURAPP.desktop in ~/.config/mimeapps.list.
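If you need to do that for many file types, editing the file by hand gets tedious as well. Here is a minimal sketch in Python, assuming the mimeapps.list is a plain INI-style file as shown above; the function name is made up, and note that configparser drops comments when it rewrites the file:

```python
import configparser
from pathlib import Path


def set_default_app(mimeapps, mime_type, desktop_file):
    """Make desktop_file the default application for mime_type."""
    mimeapps = Path(mimeapps)
    config = configparser.ConfigParser(
        delimiters=("=",),   # only "=" separates key and value
        interpolation=None,  # do not interpret "%" in values
        strict=False,        # tolerate duplicate entries
    )
    config.optionxform = str  # keep the keys exactly as written
    if mimeapps.exists():
        config.read(mimeapps, encoding="utf-8")
    if not config.has_section("Default Applications"):
        config.add_section("Default Applications")
    config.set("Default Applications", mime_type, desktop_file)
    with mimeapps.open("w", encoding="utf-8") as handle:
        # write "key=value" without spaces, as in the original file
        config.write(handle, space_around_delimiters=False)
```

For the PNG example above, that would be set_default_app(Path.home() / ".config/mimeapps.list", "image/png", "eog.desktop"). Better keep a backup of the file before trying it.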

Looks like I’m often in trouble with default applications…? Is it just me?
If you have problems with KDE applications, you may want to look into my article on KDE file type actions.

apt-cacher-ng versus apt-transport-https

May 13th, 2019

The headline sounds pretty technical, and so is the topic. Let’s quickly introduce both antagonists:

apt-cacher-ng is a tool to cache packages of the apt ecosystem. As an administrator, you may have multiple Debian-based systems. The overlap of packages that all the systems need is typically huge. That means, hundreds of your systems will require the latest security update for curl at around the same time. Running an apt-cacher-ng server in your local environment will take a bit of heat off Debian’s infrastructure and improve the download speed of packages. See also the Apt-Cacher NG project page.
apt-transport-https is an apt module to obtain packages over a secure https:// connection. Traditionally, packages are downloaded through plain HTTP or FTP, but as these are unencrypted a third party may observe what you’re doing at a repository (which packages you’re downloading etc.). Please note that apt-transport-https is already integrated in recent versions of apt - no need to install it separately.

So basically, both apt-cacher-ng and apt-transport-https do a good thing! But… They don’t really like each other.. At least by default. However, I’ll show you how to make them behave ;-)

The Problem

The issue is perfectly obvious: You want apt-cacher-ng to cache TLS encrypted traffic…? That won’t happen.

The Solution

You need to tell the client to create an unencrypted connection to the cache server, and then the cache server can connect to the repository through TLS.

Example

Let me explain that using Docker. To properly install Docker on a Debian based system, you would add a file /etc/apt/sources.list.d/docker.list containing a repository such as:

deb [arch=amd64] https://download.docker.com/linux/debian stretch stable

However, when apt is told to use a cache server, it would fail to download Docker’s packages:

# apt-get update
[...]
W: Failed to fetch https://download.docker.com/linux/debian/dists/stretch/InRelease  Invalid response from proxy: HTTP/1.0 403 CONNECT denied (ask the admin to allow HTTPS tunnels)     [IP: 1.2.3.4 3142]
W: Some index files failed to download. They have been ignored, or old ones used instead.

Let’s fix that using the following workaround:

0. Assumptions

There is an apt-cacher-ng running at http://apt.cache:3142.
apt.cache resolves to 1.2.3.4.
There is a client configured to use the cache server, e.g. /etc/apt/apt.conf.d/02proxy says:

Acquire::http { Proxy "http://apt.cache:3142"; }

1. Create a mock DNS for the cache server

You need to create a pseudo domain name that points to the cache server. This name will then tell the cache server which target repository to access. Let’s say we’re using docker.cache. You can either create a proper DNS record, or just add a line to the client’s /etc/hosts file:

1.2.3.4 apt.cache docker.cache

Now, both apt.cache and docker.cache will resolve to 1.2.3.4 at the client.

2. Update the client’s repository entry

Instead of contacting the repository directly, the client should now connect to the cache server. You need to change the contents of /etc/apt/sources.list.d/docker.list to:

deb http://docker.cache stretch stable

Thus, the client now treats the cache server as a proper repository!

3. Inform the cache server

The apt-cacher-ng of course needs to be told what to do, when clients want to access something from docker.cache: It should forward the request to the original repository!

This is called remapping. First create a file /etc/apt-cacher-ng/backends_docker_com at the server containing the link to the original repository:

https://download.docker.com/linux/debian

Then, populate the remapping rule in /etc/apt-cacher-ng/acng.conf. You will find a section of Remap entries (see default config of acng.conf). Just append your rule:

Remap-dockercom: http://docker.cache ; file:backends_docker_com

This line reads:

There is a remap rule called Remap-dockercom
which remaps requests for http://docker.cache
to whatever is written in file backends_docker_com

That’s it. Restart the cache server and give it a try :)

4. Add more Remaps

If you want to use more repositories through https://, just create further mock-DNS-entries and append corresponding remapping rules to the acng.conf. Pretty easy..
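With more than a handful of repositories, it may be handy to generate the pieces instead of typing them. The following Python sketch just assembles the lines from steps 1 to 3 for one repository; the function and its parameter names are hypothetical, and it does not write any files:

```python
def remap_snippets(rule, mock_host, upstream, backends_file, cache_ip,
                   suite, component):
    """Return the config lines needed to cache one HTTPS repository."""
    return {
        # client: mock DNS entry for /etc/hosts
        "hosts": f"{cache_ip} {mock_host}",
        # client: repository entry for /etc/apt/sources.list.d/
        "sources": f"deb http://{mock_host} {suite} {component}",
        # server: content of /etc/apt-cacher-ng/<backends_file>
        "backends": upstream,
        # server: rule to append to /etc/apt-cacher-ng/acng.conf
        "remap": f"Remap-{rule}: http://{mock_host} ; file:{backends_file}",
    }


# the Docker example from above
docker = remap_snippets(
    rule="dockercom",
    mock_host="docker.cache",
    upstream="https://download.docker.com/linux/debian",
    backends_file="backends_docker_com",
    cache_ip="1.2.3.4",
    suite="stretch",
    component="stable",
)
for target, line in docker.items():
    print(f"{target}: {line}")
```

Keeping the generation separate from writing the files means you can review the lines before touching the cache server's configuration.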

The Improvements

This setup of course strips the encryption off apt calls. Granted, it’s just the connections in your own environment, but still not really elegant.. So the goal is to also encrypt the traffic between client and cache server.

There is apparently no support for TLS in apt-cacher-ng, but you can still configure an Nginx proxy (or whatever proxy you find handy) at the cache server, which supports TLS and just forwards requests to the upstream apt-cacher-ng at the same machine. Or you could set up an stunnel.

Supplemental

There are a few other workarounds for this issue available. Most of them just show how to circumvent caching for HTTPS repositories (which somehow reduces the cache server to absurdity). Here, I just documented the (in my eyes) cleanest solution.

Martin Scharm

stuff. just for the records.
