Which country is the most stupid

Today I had a conversation with a scientist from Bulgaria who works with microarrays. He told me about some practical experiences from his work. It was very interesting and I learned a lot, in spite of the fact that I gave a lecture about microarrays myself some time ago.

In this talk he said a wonderful sentence:

Früher dachte ich immer die Russen wären dumm, bis ich die Amerikaner kennen gelernt habe!

English translation: I used to think the Russians were stupid, until I got to know the Americans!

The topic was how companies structure their websites. Whenever he has a question he has to search through the web, because everyone tells him the answer is somewhere in there! Affymetrix, for example, has thousands of user manuals; the overlap between all of these documents is very small, yet a single one can have hundreds of pages… And I think he is totally right. The way information is organized today is terrible, and finding what you are looking for is some kind of art! He certainly doesn't mince his words. I really like Eastern Europeans ;)

He invited me to his lab tomorrow so I can see how this Affymetrix machinery produces the data that I get to analyze.

Little quickie through Germany

Oh no, not the kind of quickie you might be thinking of! Rumpel and I decided more or less spontaneously to go to Bonn to visit Martin, one of our former co-workers, and additionally take a little look at SIGINT in Cologne.

So we rented a car at Sixt on Friday morning and met Martin at 5 pm at his flat. Of course our trip was very analog: we didn't have any navigation device, we just printed a route calculated by Google Maps and relied on male instinct on the way through Germany and the heavy traffic in the Ruhr Valley on a Friday afternoon before a holiday… What should I say, of course everything went totally well and we had a lot of fun in our little car! You can see some pictures at Picasa.

Quickie through Germany

Of course it was a great weekend! We saw a lot of fascinating places in Bonn and Cologne, like Cologne Cathedral, big ships on the Rhine and the Media Center in Cologne. The events at SIGINT were also interesting, although it cannot be compared with the Chaos Communication Congress in Berlin: in Cologne you'll always get a chair and the queues are very short. Nevertheless the topics are of high quality.

All in all it was an excellent trip, even though it was very expensive.

Git merging showcase

One of the people working with me on some crazy stuff always forgets to pull the newest revision of the repository before changing its content, so he very often has trouble with diverging versions when he decides to push his work to the master repository. His current workaround is to check out the complete repository into a new directory and merge his changes into that revision by hand… Here is a little how-to to maximize his productivity and minimize the network traffic.

Let's assume we have a repository, created like this:

/tmp % git init --bare root

And we have one user who clones this new repository and creates the initial commit:

/tmp % git clone root slave1
/tmp % cd slave1
/tmp/slave1 (git)-[master] % echo "line1\nline2" >> testfile
/tmp/slave1 (git)-[master] % cat testfile
line1
line2
/tmp/slave1 (git)-[master] % git add .
/tmp/slave1 (git)-[master] % git commit -m "init"
[master (root-commit) bc7e4da] init
 1 files changed, 2 insertions(+), 0 deletions(-)
 create mode 100644 testfile
/tmp/slave1 (git)-[master] % git push ../root master

So now we have some content in our root repo. Another user (our bad guy) clones that repository too:

/tmp % git clone root slave2

Now let a bit of time elapse while user one changes the root repository, so that the testfile looks like this:

/tmp/slave1 (git)-[master] % cat testfile | sed 's/line1/&\nline1a/' > testfile.tmp && mv testfile.tmp testfile
/tmp/slave1 (git)-[master] % cat testfile
line1
line1a
line2

And of course, user one commits and pushes his changes:

/tmp/slave1 (git)-[master] % git commit -a -m "haha, root has changed..."
[master e18f637] haha, root has changed...
 1 files changed, 1 insertions(+), 0 deletions(-)
/tmp/slave1 (git)-[master] % git push ../root master
Counting objects: 5, done.
Writing objects: 100% (3/3), 265 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To ../root
   bc7e4da..a04d363  master -> master

OK, nothing bad has happened so far, but now our special friend decides to do some work:

/tmp/slave2 (git)-[master] % cat testfile | sed 's/line1/&\nline1b/' > testfile.tmp && mv testfile.tmp testfile
/tmp/slave2 (git)-[master] % cat testfile
line1
line1b
line2
/tmp/slave2 (git)-[master] % git commit -a -m "oops, i am very stupid..."
[master d691ada] oops, i am very stupid...
 1 files changed, 1 insertions(+), 0 deletions(-)

What do you think will happen if he tries to push his changes to the master repo? You're right, nothing but an error:

/tmp/slave2 (git)-[master] % git push ../root master
To ../root
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to '../root'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again.  See the 'Note about
fast-forwards' section of 'git push --help' for details.

Mmmh, so let's try to pull from the root repo:

/tmp/slave2 (git)-[master] % git pull ../root master
remote: Counting objects: 5, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From ../root
 * branch            master     -> FETCH_HEAD
Auto-merging testfile
CONFLICT (content): Merge conflict in testfile
Automatic merge failed; fix conflicts and then commit the result.

Our friend would now check out the whole repository again and insert his changes by hand, but what's the better solution? Merging the file! Git has a command called mergetool that lets you resolve the conflicts with a program of your choice. Some examples are vimdiff, xxdiff, emerge or, for GUI lovers, kdiff3. In this post I'll use vimdiff:

/tmp/slave2 (git)-[master|merge] % git mergetool --tool=vimdiff testfile

Normal merge conflict for 'testfile':
  {local}: modified
  {remote}: modified
Hit return to start merge resolution tool (vimdiff): 
3 files to edit

Now edit the conflicting file(s); you will see the changes made in root's revision as well as those made in your local revision. When you're done, just save and commit your merge:

/tmp/slave2 (git)-[master|merge] % git commit -m "merged"
[master 6be1482] merged

Great, now there is nothing that prevents you from pushing your changes to the root repository:

/tmp/slave2 (git)-[master] % git push ../root master
Counting objects: 10, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (6/6), 555 bytes, done.
Total 6 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
To ../root
   a04d363..6be1482  master -> master

I think this way of solving such conflicts may be much more efficient than cloning the whole repository again and again and again ;)

First SUN Spot results

One week has passed since I got a package of Spots, and this weekend I found some time to hack a little bit with these fun components.

First of all I programmed a tool that visualizes the Spot's movement in an OpenGL frame that draws a virtual Spot. Nice for demonstrations, but nothing spectacular.

After that I developed a little mouse emulator that translates Spot movement into mouse motion on the screen. Here the Spot isn't doing anything intelligent: it only broadcasts its tilt status every 25 ms, as well as switch events. Another Spot, working as a basestation connected to my machine, listens to this talking Spot, and my host analyzes the received values. To move the mouse on the screen or to generate a click I use the Robot class of the Java AWT package. Long story short, a video may explain it more clearly (via YouTube):

I will continue working on these libraries before I publish them in another post. So look forward to the release ;-)

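Until the real libraries are released, here is a rough sketch of how the host side of such a mouse emulator could look. It is not my actual code: the class name, the dead zone, the gain and the assumption that the tilt arrives as a value in radians are all made up for illustration, and the radio part (receiving the values from the basestation) is left out completely. Only the Robot and MouseInfo calls are the real AWT API.

```java
import java.awt.MouseInfo;
import java.awt.Point;
import java.awt.Robot;
import java.awt.event.InputEvent;

public class TiltMouseSketch {

    // Hypothetical tuning values, not taken from the real project.
    static final double DEAD_ZONE = 0.05; // ignore tiny tilts (assumed radians)
    static final double GAIN = 40.0;      // pixels per radian and update

    /** Translates one tilt value into a pixel offset for this update. */
    static int tiltToDelta(double tilt) {
        if (Math.abs(tilt) < DEAD_ZONE) {
            return 0;
        }
        return (int) Math.round(tilt * GAIN);
    }

    /** Moves the pointer relative to its current position. */
    static void moveBy(Robot robot, double tiltX, double tiltY) {
        Point p = MouseInfo.getPointerInfo().getLocation();
        robot.mouseMove(p.x + tiltToDelta(tiltX), p.y + tiltToDelta(tiltY));
    }

    /** Generates a left click, e.g. for a switch event of the Spot. */
    static void click(Robot robot) {
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    public static void main(String[] args) throws Exception {
        Robot robot = new Robot(); // needs a graphical session
        // In the real setup these values would arrive every 25 ms via radio.
        moveBy(robot, 0.5, -0.25);
    }
}
```

Keeping the tilt-to-pixel mapping in its own small method makes it easy to play with the sensitivity without touching the Robot calls.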
Java network connection on Debian:SID

The unstable release of Debian is of course tricky in a lot of cases, and there is also a little stumbling block on your path to Java network programming. It annoys me on every new system.

Before I wrongfully blame my preferred Debian release, Sid, I have to admit that I don't know whether this feature also exists in other releases… Here is a small program to test/reproduce it:

import java.net.URL;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class WebReader
{
	public static void main (String[] args)
	throws Exception
	{
		// open the URL given as first argument and read it line by line
		BufferedReader reader = new BufferedReader(
			new InputStreamReader (new URL (args[0]).openStream ()));
		String line;
		// read inside the loop condition only, so the first line is not skipped
		while ((line = reader.readLine ()) != null)
			System.out.println (line);
		reader.close ();
	}
}

Compilation shouldn't fail, but if you try to launch it you'll get an exception like this:

Exception in thread "main" java.net.SocketException: Network is unreachable
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:525)
        at java.net.Socket.connect(Socket.java:475)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
        at sun.net.www.http.HttpClient.New(HttpClient.java:306)
        at sun.net.www.http.HttpClient.New(HttpClient.java:323)
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:860)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:801)
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
        at java.net.URL.openStream(URL.java:1010)
        at WebReader.main(WebReader.java:10)

This is caused by one little line in /etc/sysctl.d/bindv6only.conf that sets net.ipv6.bindv6only = 1, so IPv6 sockets are restricted to IPv6 traffic only. Java opens such a socket by default, but my connection (maybe yours too) still communicates over IPv4, so this way of networking of course fails. To change this behavior you can choose between two solutions.

Solution 1: Permanent modification (needs to be root)

You can change this behavior for the whole system by editing the file /etc/sysctl.d/bindv6only.conf :

# original only IPv6
# net.ipv6.bindv6only = 1
net.ipv6.bindv6only = 0

After that just run invoke-rc.d procps restart in your terminal to let your changes take effect. Your next run should work fine.

Solution 2: Change it for this single example

If you are not allowed to change system settings, you can add -Djava.net.preferIPv4Stack=true to your execution command:

java -Djava.net.preferIPv4Stack=true  WebReader http://localhost
# instead of `java WebReader http://localhost`

This forces the current JVM to connect to the network via IPv4, regardless of the system preferences. I hope this saves developers like me some time ;-)
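If you cannot even touch the launch command, the same property can presumably be set from inside the program. The following is only a sketch under that assumption: the class name WebReaderIPv4 and the helper preferIPv4 are made up for this example, and as far as I know the JVM reads the property only once, so it has to be set before the first networking class is used.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class WebReaderIPv4 {

    /** Same effect as -Djava.net.preferIPv4Stack=true, if called early enough. */
    static void preferIPv4() {
        System.setProperty("java.net.preferIPv4Stack", "true");
    }

    public static void main(String[] args) throws Exception {
        // Must run before any socket or URL connection is opened.
        preferIPv4();

        // Read the URL given as first argument and print it line by line.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new URL(args[0]).openStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Start it just like the original example, for instance with java WebReaderIPv4 http://localhost.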

You don't know the flash-trick?

Just sitting around with Micha at a SunRay (maybe meanwhile an OracleRay?). He was surfing the web until his session seemed to hang, and he said:

Fuck FLASH!! Need the flash-trick...

I hadn't heard about that trick before, so he explained it to me.

If Flash kills your SunRay session, press Ctrl+Alt+Moon, log in again, and your session will revive. With Flash still running!

As far as I know this happens to him very often while he is using his browser, because unfortunately the whole web is contaminated with this fucking Flash… The flash-trick is very nice, but wouldn't a Flashblock plugin be more user-friendly!?

Playing around with SUN Spots

My boss wants to present some cool things that can be done with SUN Spots in a lecture. I was selected to program these things, and now I have three of them to play with a little bit.

The installation was basically very easy. All you should know is that there is no chance on 64bit hosts, and VirtualBox guests don't work as expected either: virtual machines lose the connection to the Spot very often... So I had to install a 32bit system on my host machine (btw. I chose Sidux Μόρος).

Once you have a suitable system, the rest is simple. Just download the SPOTManager from sunspotworld.com; it helps you install the Sun SPOT Software Development Kit (SDK). When that is done, connect a Spot via USB, open the SPOTManager and upgrade the Spot's software (it has to be the same version as the one installed on your host). All important management tasks can be done with this tool, and it can also create virtual Spots.

In addition to the SDK you'll get some demos installed, which are interesting and helpful for seeing how things work. In these directories ant is configured to do all the crazy things that can also be done with the managing tool. Here are some key targets:

ant info		# get some info about the spot (version, installed application and so on)
ant deploy		# build and install to spot
ant host-run	# build a host application and launch it
ant help		# show info about existing targets
# to configure a spot to run as base station, OTA has to be disabled and the basestation must be started
ant disableota startbasestation

A basestation is able to administer other Spots, so you don't have to connect each of them to your machine.

OK, how do you do your own stuff?
There are some NetBeans plugins that make life easier, but I don't like big IDEs that are very slow and bring a lot of overhead to your system. To create an IDE-independent project that should run on a Spot you need an environment containing:

  • File: ./resources/META-INF/MANIFEST.MF
    MIDlet-Name: NAME
    MIDlet-Version: 1.0.0
    MIDlet-Vendor: Sun Microsystems Inc
    MIDlet-1: App Description, ,your.package.MainClassName
    MicroEdition-Profile: IMP-1.0
    MicroEdition-Configuration: CLDC-1.1
  • File: ./build.xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <project basedir="." default="deploy">
        <property file="${user.home}/.sunspot.properties"/>
        <import file="${sunspot.home}/build.xml"/>
    </project>
  • Directory: ./src
    Here you can place your source files

Now you can just type `ant` and the project will be deployed to the Spot.
A project that should run on your host and communicate with other Spots through the basestation needs a different environment:

  • File: ./build.xml
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <project basedir="." default="host-run">
        <property name="main.class" value="your.package.MainClassName"/>
        <property file="${user.home}/.sunspot.properties"/>
        <import file="${sunspot.home}/build.xml"/>
    </project>
  • Directory: ./src
    Here you can place your source files

OK, that's it for the moment. I'll report my results.

April fools month

About one month ago, on April 1st, I added two more lines to the .bashrc of Rumpel (he is a co-worker and had to work that day).

These two lines you can see here:

# little April fools' joke from your friend martin :P
export PROMPT_COMMAND='if [ $RANDOM -le 32000 ]; then printf "\\0337\\033[%d;%dH\\033[4%dm \\033[m\\0338" $((RANDOM%LINES+1)) $((RANDOM%COLUMNS+1)) $((RANDOM%8)); fi'

Each time the bash prompt appears, this command paints one character cell at a random position in the console with a random background color ($RANDOM never exceeds 32767, so the condition holds almost every time). It has no respect for important content underneath. That can be really annoying, and he kept wondering why this happened! For more than one month, until now!

Today I reveal the secret, so Rumpel, I'm very sorry ;)

Converting videos to images

I just wanted to split a video file into its individual frames and did not find a program that solves this problem. A colleague recommended videodub, but DLLs or a .exe drive me insane! I had worked a little bit with OpenCV before, so I coded my own solution, containing only a few lines.

The heart of my solution consists of the following 13 lines:

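CvCapture *capture = cvCaptureFromAVI(video.c_str());	// open the video file
if (capture)
{
	IplImage* frame;	// owned by the capture, must not be released here
	while ((frame = cvQueryFrame(capture)))	// NULL once the video has ended
	{
		stringstream file;
		file << prefix << iteration << ".png";	// build the file name of this frame
		cvSaveImage(file.str ().c_str (), frame);
		++iteration;	// next frame, next file (counter assumed to be declared earlier)
	}
}
cvReleaseCapture( &capture );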

It just queries each frame of the AVI and writes it to an image file. Thus, not a big deal.

The complete code can be downloaded here. All you need is OpenCV and a C++ compiler:

g++ -I /usr/local/include/opencv -lhighgui -o vidsplit.out vidsplit.cpp

Just start it with for example:

./vidsplit.out --input video.avi --prefix myframes_

If you prefer JPG images (or other types) just change the extension string from .png to .jpg .

Download: C++: vidsplit.cpp

From distance matrix to binary tree

In one of our current exercises we have to prove different properties of distance matrices as the basis of binary trees. Additionally I tried to develop an algorithm for creating such a tree, given a distance matrix.

A distance matrix $$D$$ represents the dissimilarity of samples (for example genes), so that the number in the i-th row and j-th column is the distance between elements i and j. To generate a tree from it, the distance $$d(x,y)$$ between two elements has to satisfy some properties so that it is a metric:

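  1. $$d(x,y) \ge 0$$ (distances are positive)
  2. $$d(x,y) = 0 \Leftrightarrow x = y$$ (elements with distance 0 are identical, dissimilar elements have distances greater than 0)
  3. $$d(x,y) = d(y,x)$$ (symmetry)
  4. $$d(x,z) \le d(x,y) + d(y,z)$$ (triangle inequality)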

Examples of valid metrics are the Euclidean distance $$d(x,y) = \sqrt{\sum_i (x_i - y_i)^2}$$ and the Manhattan distance $$d(x,y) = \sum_i |x_i - y_i|$$.

The following procedure is called hierarchical clustering: we try to combine single objects into clusters. At the beginning we start with $$n$$ clusters, each of them containing only one element; the clusters are pairwise disjoint and their union contains all elements that should be clustered.

The algorithm now searches for the smallest distance in the distance matrix $$D$$ that is not 0 and merges the associated clusters into a new one containing all elements of both. After that step the distance matrix has to be adjusted, because two clusters are removed and a new one is added. The distances of the new cluster to all others can be computed with the following formula:

$$d(K, X \cup Y) = \alpha \cdot d(K,X) + \beta \cdot d(K,Y) + \gamma \cdot d(X,Y) + \delta \cdot |d(K,X) - d(K,Y)|$$

$$X, Y$$ are the two clusters that should be merged, $$K$$ represents another cluster. The constants $$\alpha, \beta, \gamma, \delta$$ depend on the cluster method to use, shown in table 1.

Table 1: Different cluster methods
Method | $$\alpha$$ | $$\beta$$ | $$\gamma$$ | $$\delta$$
Single linkage | 0.5 | 0.5 | 0 | -0.5
Complete linkage | 0.5 | 0.5 | 0 | 0.5
Average linkage | 0.5 | 0.5 | 0 | 0
Average linkage (weighted) | $$\frac{|X|}{|X| + |Y|}$$ | $$\frac{|Y|}{|X| + |Y|}$$ | 0 | 0
Centroid | $$\frac{|X|}{|X| + |Y|}$$ | $$\frac{|Y|}{|X| + |Y|}$$ | $$-\frac{|X|\cdot|Y|}{(|X| + |Y|)^2}$$ | 0
Median | 0.5 | 0.5 | -0.25 | 0

Here $$|X|$$ denotes the number of elements in cluster $$X$$.

The algorithm continues by searching for the smallest distance in the new distance matrix and merges the next two most similar clusters, until just one cluster remains.
In the tree view, merging two clusters means constructing a parent node with both clusters as children. The initial clusters containing just one element are the leaves, and the last node is the root of the tree.

Small example

Let’s create a small example from the distance matrix containing 5 clusters, see table 2.

Table 2: Start distances
  | A | B | C | D | E
A | 0 | 5 | 2 | 1 | 6
B | 5 | 0 | 3 | 4 | 1.5
C | 2 | 3 | 0 | 1.5 | 4
D | 1 | 4 | 1.5 | 0 | 5
E | 6 | 1.5 | 4 | 5 | 0

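A and D are obviously the most similar elements in this matrix, so we merge them. To make the calculation easier we take the average linkage method to compute the new distances to the other clusters:

$$d(B,[A+D]) = 0.5 \cdot d(B,A) + 0.5 \cdot d(B,D) = 0.5 \cdot 5 + 0.5 \cdot 4 = 4.5$$
$$d(C,[A+D]) = 0.5 \cdot d(C,A) + 0.5 \cdot d(C,D) = 0.5 \cdot 2 + 0.5 \cdot 1.5 = 1.75$$
$$d(E,[A+D]) = 0.5 \cdot d(E,A) + 0.5 \cdot d(E,D) = 0.5 \cdot 6 + 0.5 \cdot 5 = 5.5$$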

With these values we are able to construct the new distance matrix of 4 remaining clusters, shown in table 3.

Table 3: Cluster after 1st iter.
  | A,D | B | C | E
A,D | 0 | 4.5 | 1.75 | 5.5
B | 4.5 | 0 | 3 | 1.5
C | 1.75 | 3 | 0 | 4
E | 5.5 | 1.5 | 4 | 0

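This matrix gives us the next candidates for clustering, B and E with a distance of 1.5:

$$d([A+D],[B+E]) = 0.5 \cdot 4.5 + 0.5 \cdot 5.5 = 5$$
$$d(C,[B+E]) = 0.5 \cdot 3 + 0.5 \cdot 4 = 3.5$$

The resulting distance matrix is shown in table 4.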

Table 4: After 2nd iter.
  | A,D | B,E | C
A,D | 0 | 5 | 1.75
B,E | 5 | 0 | 3.5
C | 1.75 | 3.5 | 0

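As is easy to see, we now cluster [A+D] with C:

$$d([B+E],[A+C+D]) = 0.5 \cdot d([B+E],[A+D]) + 0.5 \cdot d([B+E],C) = 0.5 \cdot 5 + 0.5 \cdot 3.5 = 4.25$$

and obtain the last distance matrix, shown in table 5.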

Table 5: Final matrix
  | A,C,D | B,E
A,C,D | 0 | 4.25
B,E | 4.25 | 0

Needless to say, further calculations are trivial. There are only two clusters left, and combining them gives us the final cluster containing all elements, which is the root of the desired tree.
The final tree is shown in figure 1. You see, it is not as difficult as expected and ends in a beautiful image!
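If you don't want to do the arithmetic by hand, the example can be replayed with a few lines of Java. This is only a minimal sketch, not a general clustering library: the class and method names are invented for this post, it hard-codes the average linkage constants from table 1 (0.5, 0.5, 0, 0), and it simply ignores the diagonal instead of testing for distances that are not 0.

```java
import java.util.Arrays;

public class AverageLinkageSketch {

    // Start distances of table 2 (order: A, B, C, D, E).
    static final double[][] EXAMPLE = {
        {0, 5, 2, 1, 6},
        {5, 0, 3, 4, 1.5},
        {2, 3, 0, 1.5, 4},
        {1, 4, 1.5, 0, 5},
        {6, 1.5, 4, 5, 0}
    };

    /** Returns the distance at which each merge happens, in merge order. */
    static double[] mergeHeights(double[][] input) {
        int n = input.length;
        double[][] d = new double[n][];
        for (int i = 0; i < n; i++) {
            d[i] = input[i].clone(); // work on a copy
        }
        boolean[] active = new boolean[n];
        Arrays.fill(active, true);
        double[] heights = new double[n - 1];

        for (int step = 0; step < n - 1; step++) {
            // find the closest pair of clusters that are still active
            int bi = -1, bj = -1;
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (!active[j]) continue;
                    if (bi < 0 || d[i][j] < d[bi][bj]) {
                        bi = i;
                        bj = j;
                    }
                }
            }
            heights[step] = d[bi][bj];

            // the merged cluster reuses slot bi; update its distances
            for (int k = 0; k < n; k++) {
                if (!active[k] || k == bi || k == bj) continue;
                double merged = 0.5 * d[bi][k] + 0.5 * d[bj][k];
                d[bi][k] = merged;
                d[k][bi] = merged;
            }
            active[bj] = false;
        }
        return heights;
    }

    public static void main(String[] args) {
        // prints [1.0, 1.5, 1.75, 4.25]
        System.out.println(Arrays.toString(mergeHeights(EXAMPLE)));
    }
}
```

The four printed values are the merge distances from the example: A with D at 1, B with E at 1.5, C joining [A+D] at 1.75 and the last two clusters at 4.25.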


If $$d$$ is defined as above there is no guarantee that edge weights reflect correct distances! When you calculate the weights in my little example you'll see what I mean. If this property is desired the distance function has to comply with the condition of ultrametric inequality: $$d(x,z) \le \max(d(x,y), d(y,z))$$.
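Whether a given distance matrix fulfills the ultrametric inequality, meaning that the distance between x and z is never larger than the maximum of the distances x to y and y to z, can be checked by brute force. The following sketch (class and method names are made up) tests all triples, which is fine for small matrices like the one in my example:

```java
public class UltrametricCheck {

    /** True if d[x][z] <= max(d[x][y], d[y][z]) holds for all x, y, z. */
    static boolean isUltrametric(double[][] d) {
        int n = d.length;
        for (int x = 0; x < n; x++) {
            for (int y = 0; y < n; y++) {
                for (int z = 0; z < n; z++) {
                    if (d[x][z] > Math.max(d[x][y], d[y][z])) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Start distances of the small example (order: A, B, C, D, E).
        double[][] example = {
            {0, 5, 2, 1, 6},
            {5, 0, 3, 4, 1.5},
            {2, 3, 0, 1.5, 4},
            {1, 4, 1.5, 0, 5},
            {6, 1.5, 4, 5, 0}
        };
        // prints false: d(A,B) = 5 is larger than max(d(A,C), d(C,B)) = 3
        System.out.println(isUltrametric(example));
    }
}
```

So the example matrix is not ultrametric, which is why the edge weights of its tree cannot reproduce all original distances.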

The method described above is formally known as agglomerative clustering, merging smaller clusters to a bigger one. There is another procedure that splits bigger clusters into smaller ones, starting with a cluster that contains all samples. This method is called divisive clustering.