Thoughts on Version Control Systems

I have experience with four version control systems. Let’s look at the pros and cons of each.

CVS

CVS may have been good 15 or 20 years ago. But today it is fragile and has a weak feature set:

  • It does not have atomic commits or even “commit sets”—commits consisting of changes to multiple files or directories—and so without meticulous coordination between committers it is easy to get a corrupted repository.
  • It does not handle binary files well. By default CVS messes with newlines in the files it stores, and performs substitution for things that look like keywords. These are problems for non-text files. CVS was designed for storing text files and because of this extra steps have to be taken to allow proper treatment of binary files. Also, delta compression is not used for binary files, so repository size may balloon rapidly when tracking binary files.
  • Administering a CVS repository is painful at best. pserver mode requires running a daemon or messing with xinetd, and one has to go through extra hoops to ensure security. One can also use ssh, but this requires new user accounts and a common group for the developers.
  • Using CVS is painful. One must either set an environment variable for the current repository being worked on, or must pass extra parameters to the CVS commands.
  • It doesn’t handle renames well. History is tracked on a per-name basis, so when you rename, history becomes very difficult to work with (you will probably have to poke around “attic”).
  • Directories cannot be entirely deleted from a CVS repository without mucking in its internals.
  • CVS does not offer any kind of merge tracking.

It’s been a very long time since I used CVS and it was also the very first version control system I used, so I don’t have more specific complaints. It is at least better than nothing at all.

Subversion

Subversion is referred to by its developers as “CVS done right”. It offers many improvements:

  • Atomic commits and “commit sets” (I avoid calling them changesets because this has a more precise meaning regarding storage model, which svn does not have). Changes to multiple paths can be bundled up as one commit. If for some reason the commit fails midway through (e.g. network troubles), the changes are rolled back—the repository has much less chance of becoming corrupted.
  • Revisions are stored compactly, even for binary files: instead of storing a complete copy of a versioned object for each revision, only a delta is stored (this might not be entirely accurate, e.g. for performance reasons each nth revision may be stored in its entirety, but it is mostly accurate).
  • Administration is slightly easier. There is an svn protocol which has most of the same issues as pserver for CVS. There is also HTTP(S) mode; svn can integrate with a webserver. It can also be set up to use SSH, but this is as much a pain in the ass as with CVS, and runs a greater risk of repository corruption than any of svn’s web-based modes.
  • Using Subversion is nicer than using CVS. Instead of having to set an environment variable or specify an extra repository parameter, most commands can figure out what to do when run within a working copy of a repository.
  • Renames are handled more elegantly than with CVS—`svn mv’ is implemented as a copy and delete.

Other advantages:

  • Good integration with IDEs. CVS has this too. I don’t use IDEs, so I don’t care about this very much.
  • Many people are familiar with Subversion already, so if you have to work with others who are not willing to learn new tools, it may be a reasonable choice.

Subversion has many issues, however:

  • No support for merge tracking. Merging is hokey and painful.
  • It’s very slow! When it was young it was at least an order of magnitude slower than CVS for most operations. That has surely improved by now, but it is still comparatively quite slow. Mercurial is much faster, and git is supposedly even faster than Mercurial for many operations. Perhaps I’ll benchmark version control systems one day.
  • It is difficult to identify the differences between branches. One has to resort to scripting this or performing a pretend merge.
  • It is more difficult than it should be to revert to older versions. One has to either merge with the old revision (which svn doesn’t make easy), or cat the changes to each file you want to roll back and then commit.
  • The supplied script for sending commit messages lacks desired features—like putting the first line of the commit message as the email subject.
  • It is not too difficult to get a working copy completely screwed up. If you manage to delete the .svn directory in a versioned directory, you are toast—probably the only way to recover is to do a brand new working-copy checkout. I hope you commit your changes before you make this mistake.
  • Did I mention that it is slow?
  • Many commands are not available from the main svn program, and instead one must execute svn-PROGNAME or svnPROGNAME. These commands also do not all accept arguments in a uniform, coherent way (I ran into this the other day but don’t remember the specific case).
  • The commands that operate on a repository (rather than a working copy) do not accept raw paths: for example, if the repository is at ~/repos/silly_svn_repo, one cannot use that as an argument. One must type instead file:///PATH-TO-HOME/repos/silly_svn_repo. This is stupid. Paths without “file://” should be treated by default as though they had it.
  • Shitty/nonexistent man pages. Instead one must type “COMMAND help”. It also bothers me that when one enters a command with bad parameters, one gets just a pithy error message followed by “Type ’svn help’ for usage”. I would rather have it print help by default when a bad command or bad parameters are entered.
  • It is difficult or impossible to split up an existing repository or combine multiple repositories using the supplied tools. For example, for splitting, the svndumpfilter tool has issues with renames/deletions. This is a known flaw in the tool. Combining repositories only works properly if the histories do not overlap in time at all. Otherwise the history in the resulting repo will be borked.
  • Tags and branches are not entities that Subversion knows about; instead they are merely convention of the users.
  • Its commands don’t pipe into a pager by default if they output more than one terminal screen. This is an annoyance.

I sympathize with Linus Torvalds’ opinion of Subversion—it is broken by design.

Rational ClearCase

This is a classic example of overengineering, feature accretion, and poor design. I can’t say anything good about ClearCase. Unfortunately it seems to be used in many large corporations (this is one reason not to work for such a company). I would rather use CVS or even no version control than ClearCase.

  • It is much more bloated and probably even slower than Subversion, especially if you use dynamic views.
  • Integration between operating systems is very painful.
  • There are different UIs for each OS.
  • Its reference manual is a book of over 1000 pages. That’s bigger than most Robert Jordan novels. And that is only the user’s manual; there is an administrator’s manual and others. WTF. And these aren’t even freely available. I’m not sure if they are even included with each user license of ClearCase.
  • It is absurdly expensive. Each license is over $4000 per year.
  • There seems to be no way from the command line to list all files that you have made changes to with a single command. Instead you must resort to writing a shell script or piping between programs and using backticks or shell loops. So instead of saying something like “ct status” and getting a list of all modified files, you must type something like “ct lsco -r -cview|xargs ct diff -pred -quiet”.
  • You have to explicitly list files when you check in using the command line tools.
  • Hardly anyone (or perhaps no one) fully understands it. It’s too massive and poorly designed.
  • The GUI tools on Unix/Linux are ancient—no tooltips, no mousewheel support, etc.
  • Did I mention that it is incredibly slow?
  • It is difficult to import a directory tree. On the first occasion I needed to do this, I didn’t know of a command to do it, so I ended up scripting `clearcase add’ for each file in a tree then waiting for a couple hours. When I had to do this again later, I tried to use the single command that adds a tree, but it required ClearCase administrator privileges. Go figure.
  • The command line program’s name is “cleartool”. That’s an awful lot of typing.
  • With dynamic views, there are a couple different syntaxes for specifying a window of time to look at. These have undocumented differing semantics (I scoured the Jordanesque documentation for this and saw no mention. So it’s probably a bug). For the curious, see my previous post.
  • Coders spend lots of time fighting with the VCS rather than coding.
  • Google “clearcase evil twin”.
  • It tends to require full-time administrators. That’s a lot of overhead.
  • Atomic commits and “commit sets” are not supported (?). Only single files can be checked in at a time. This is unacceptable in a modern version control system.
  • It’s not open source.

Common Limitations

There are also problems common to all three of the previously mentioned version control systems, due to their centralized nature.

  • One must have access to the repository server to view or diff older revisions or (probably) view commit messages.
  • Only privileged people have the ability to commit.
  • If the system uses a lock/unlock concurrency model rather than an edit/merge model, you cannot do any development unless you have access to the repository server. ClearCase is this way, or at least was set up in such a way when I used it.
  • They are slower than decentralized VCSs because most operations are non-local.
  • A centralized repository is a single point of failure: if the server dies, you better have good backups; if the server goes down for a day, development will come to a standstill.
  • Centralized development does not scale well. As the number of developers increases, so do lock contention (in a lock/unlock system) and merge necessity (in an edit/merge system).

Mercurial

I have been using Mercurial for my own projects for several months now. Because I have only been using it for my own stuff, I have not exercised its merge capabilities very well. Nevertheless, I still identify many advantages:

  • Being decentralized, every operation save pushing and pulling from other repositories is local. You have access to all the capabilities of the system even without a network connection.
  • Mercurial is fast (probably second fastest on most operations, with only git being faster). For example… need some benchmarks.
  • By its decentralized nature, history is nonlinear and merges are tracked.
  • It ships with tools to convert subversion, git, darcs, and CVS repositories.
  • A Mercurial repository is typically more compact than a Subversion repository of the same stuff.
  • It is easy to split or combine existing Mercurial repositories, and it ships with tools to do this (and they actually work).
  • It is easy to write extensions. Much of the program is written in Python.
  • It has good support for email. Revisions can be emailed directly and emailed revisions can be imported without much trouble.
  • By its distributed nature, one can do work as normal on one feature, and once that feature is ready to be merged into another repository, the changesets can be rewritten into one “final draft”. How often do you commit a bunch of changes only to realize immediately afterward that you forgot one file? Or that you made a typo in what you just checked in? With Mercurial you can merge those revisions into one (at least before you have merged with other repositories, at which point it gets messy) and make sure no “garbage” revisions appear in the project’s history.
  • Although it is decentralized, it can be used in a centralized fashion (one “master” repository that everyone checks into). This is how I use it with my own projects.
  • It ships with some graphical tools for viewing history and such.
  • It can be easily set up to use graphical merge tools if they are available.
  • It has built-in patch capabilities, a la quilt.
  • “Pulling” between repositories scales well. I understand the Linux kernel is developed in this way: Linus at the top, who pulls from a small number of committers he trusts, who pull from a small number of committers they trust… and so on. Merging changes gets distributed throughout this tree of contributors. The Linux kernel currently has over 1000 contributors.
  • All contributors have access to all the features of the version control system. There is no technical distinction between those who can commit and those who cannot. Instead, the distinction is social and just defines whose changes get incorporated into the “official” project.
  • By its decentralized nature, each working copy of a repository is in itself a repository. So there is not a single point of failure. Explicit backups are less important than with a centralized VCS.
  • It can supposedly interoperate with git repositories.

One slight disadvantage of Mercurial compared to Subversion is that in the former, one cannot check out just a piece of a repository—one must check out the entire thing.

Here’s how I rank these version control systems that I have experience with:

  1. Mercurial
  2. Subversion
  3. CVS
  4. ClearCase

Understand that the difference for me between ranks 1 and 2 is immense: I quite dislike the three systems besides Mercurial that I have used. Many of Mercurial’s benefits come from its decentralized nature.

I would almost always recommend Mercurial or probably any other free, distributed VCS (git and bazaar come to mind) over Subversion, ClearCase, or CVS, but it depends on whom you will be working with, the development platform, and how hard it would be for them to learn new concepts and new tools.

ClearCase: Version Control for Masochists

ClearCase is a terrible version control system that many companies unfortunately seem to use. A company where I once worked uses ClearCase for all their projects. Here is one example of how awful it is.

At this place I was working, they had ClearCase set up to use dynamic views—in that mode it integrates with the filesystem and uses a “config spec” to determine which versions of files to show. (I’m not sure why this is considered a good idea. Sure, only one copy of the repository is needed this way, but then a very fast network and repository server are needed. This “config spec” strategy also allows mixing and matching of versions, which really seems like a bad idea—your development process can become _very_ tightly coupled to the version control system, so you could switch to something else only with great difficulty.)

Anyways, I needed to branch off of an old version of “mainline”, or “trunk”, or whatever you will call it. Here is my first attempt at a config spec:

  element * CHECKEDOUT
  element /vobs/project/... /main/some_project_branch/LATEST
  element /vobs/project/... /main/LATEST -time 13-Jan-2006 -mkbranch some_project_branch
  element * /main/LATEST

Basically, first look at checked out versions of elements. Next, for everything under project, look at the version in some_project_branch. If there isn’t a version on some_project_branch, look at the version on mainline from January 13, 2006 and create a version on the branch. For everything else, look at the version on mainline.

After creating new directories on the branch using this config spec, I would see “[no version selected]” from “ct ls” and “No such file or directory” when doing a normal ls. Very odd. I scoured the documentation and came up with my second attempt:

  element * CHECKEDOUT
  element /vobs/project/... /main/some_project_branch/LATEST

  time 13-Jan-2006
  element /vobs/project/... /main/LATEST -mkbranch some_project_branch
  end time

  element * /main/LATEST

This should be exactly the same as the previous, just using different syntax. This config spec didn’t cause any of the aforementioned errors, but still didn’t work completely. New elements created in this view would not be created on the branch. Now why are the two syntaxes for time semantically different? No one at the company knew. Bug or feature, it’s definitely a ClearCase wart that doesn’t seem to be documented anywhere. My third attempt:

  element * CHECKEDOUT
  element /vobs/project/... /main/some_project_branch/LATEST

  time 13-Jan-2006
  element /vobs/project/... /main/LATEST -mkbranch some_project_branch
  end time

  element /vobs/project/... /main/LATEST -mkbranch some_project_branch

  element * /main/LATEST

This time it finally did what I wanted—branched off of an old version of main, and put new files and directories on the branch. This task took me about an hour to figure out. It should not have been so difficult.

Hello America

I have some requests.

Stop torturing people. Stop attacking foreign countries. Stop giving breaks to the wealthy. Realize that corporations are neither people nor citizens. Consider the environment. And stop spying on your people!

That is all for now.

Sincerely,
Brad Larsen

It is prudent never to trust completely those who have deceived us even once.

I spent nearly 3 days hunting for a bug in a queue data structure we are using in LVM. The queue is used to store a worklist of references to be marked during garbage collection. An LVM object reference (object_reference_t) is a 64-bit unsigned integer.

class TCircularArray{
    private:
 
    // Backing array.
    object_reference_t* array;
    // Size of the array.
    u_int32_t array_size;
 
    // Index of first used spot in the queue.
    u_int32_t start;
    // Index of first free spot in the queue.
    u_int32_t end;
 
    // ...snip...
 
    public:
 
    // Constructor.
    TCircularArray(u_int32_t start_size);
 
    // ...snip...
 
    /** Adds an item to the end of the queue. */
    void Append(object_reference_t a);
    /** Removes the item at the front of the queue. */
    object_reference_t GetAndRemove();
};

The queue is implemented as an expandable circular array—it has a backing array, a start index and an end index, and the backing array grows as needed. The value 0 is used to signify an unused spot in the queue. Pretty simple.

Here is the implementation of the interesting functions, sanitized for comprehension:

// We allocate enough space for the requested size
// using calloc, which zeroes the memory it provides.
TCircularArray::TCircularArray(u_int32_t start_size){
    array_size = start_size;
    array = (object_reference_t*) calloc(array_size,
        sizeof(object_reference_t));
    assert(array);  // Make sure we actually got memory.
    start = end = 0;
}
 
void TCircularArray::Append(object_reference_t a){
    // The queue should never be completely full---we grow
    // when needed to preserve this invariant.
    assert(array[end] == 0);
    // Make sure we're not inserting the sentinel value by
    // mistake.
    assert(a != 0);
 
    // ...snip code to enlarge the array if needed...
 
    // Insert to end of queue and increment end index.
    array[end] = a;
    end = (end + 1) % array_size;
 
    // Again, shouldn't ever be completely full.
    assert(array[end] == 0);
}
 
object_reference_t TCircularArray::GetAndRemove(){
    // Return sentinel if empty.
    if (is_empty()) return 0L;
 
    object_reference_t current = array[start];
    // Empty the front of the queue.
    array[start] = 0L;
    // Increment the start index.
    start = (start + 1) % array_size;
 
    return current;
}

Upon construction, the backing array is allocated using calloc. So the backing array should be initialized to all 0s (our sentinel value), and the queue starts empty. Append stores the item in the backing array at the end index, then increments the end index. GetAndRemove sets the array at the start index to 0, increments the start index, and returns the item that was at the old start index. Again, pretty simple.
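For readers who want to poke at the data structure, here is a minimal sketch of the same idea: a circular array with a 0 sentinel that grows when it fills up. It is written in Java rather than C++ and is not the LVM code. The class name CircularQueue, the grow-after-insert strategy, and the doubling factor are all assumptions made for illustration, since the real growth code is snipped above.

```java
// Illustrative sketch only: a growable circular-array queue of non-zero longs.
class CircularQueue {
    private long[] array;   // backing array; 0 marks an unused slot
    private int start = 0;  // index of the first used slot
    private int end = 0;    // index of the first free slot

    CircularQueue(int startSize) {
        // Java zero-fills new arrays, so every slot starts as the sentinel.
        // At least two slots are needed so one can always stay free.
        array = new long[Math.max(2, startSize)];
    }

    boolean isEmpty() {
        // When the queue is empty, start == end and that slot holds 0.
        return array[start] == 0;
    }

    void append(long a) {
        if (a == 0) {
            throw new IllegalArgumentException("0 is reserved as the sentinel");
        }
        array[end] = a;
        end = (end + 1) % array.length;
        // If end now points at a used slot, the array is completely full.
        // Grow immediately so that at least one slot is always free.
        if (array[end] != 0) {
            grow();
        }
    }

    long getAndRemove() {
        if (isEmpty()) {
            return 0L;  // sentinel signals "nothing to remove"
        }
        long current = array[start];
        array[start] = 0L;
        start = (start + 1) % array.length;
        return current;
    }

    private void grow() {
        int n = array.length;
        long[] bigger = new long[n * 2];
        // Only called when all n slots are used. Copy them in queue order,
        // beginning at start and wrapping around, so the new array is unwrapped.
        for (int i = 0; i < n; i++) {
            bigger[i] = array[(start + i) % n];
        }
        array = bigger;
        start = 0;
        end = n;
    }
}
```

The one subtle step is grow(): because the used region may wrap around the end of the old array, the elements have to be copied starting from start rather than from index 0, otherwise the queue order is scrambled after resizing.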

When running JCheck, a model checker we are using for benchmarks, I sometimes saw the assertion at the end of Append fail: the index to the end of the queue pointed to a non-empty spot in the array. Sometimes the entire array was non-empty. Not good.

This assertion failure was seemingly nondeterministic. Sometimes the program ran for a few minutes before aborting, and other times it happened right away, with the exact same input. I saw this behavior only on machines in our older cluster. “Data race!” was my first thought—LVM is heavily multithreaded. Improper synchronization is often the cause of nondeterminism in programs.

I looked very closely at the thread synchronization code, added many asserts, and found a few other bugs, but nothing that would explain the nondeterminism. It looks like a bug in calloc—it is supposed to return memory that has been zeroed, but sometimes on those certain machines, this wasn’t happening.

I switched to a normal malloc + memset, and have not seen any assertion failures from TCircularArray since. I searched briefly for other people experiencing calloc bugs, but only found out that many implementations of calloc do some complex things with the MMU (when I told Ronald about my suspected calloc bug, he said very seriously, “calloc does magic. Don’t you like magic?”). Very tricky. I’ve so far been unable to reproduce this suspected calloc bug outside of LVM.

The moral of this story: don’t trust the OS facilities to actually do what they say they do.

Sommerfesten.

A few weeks ago, I received the following invitation in my mailbox:

On wednesday the 20th of june 2007 at 6 p.m., the inhabitants of the dormitory Erwin-Rommel-Str. 51-59 have a huge open-air party.

All residents of the university’s guesthouse are invited to join the party. There will be a live band on the area next to your apartments about 2000 people are likely to be there. We beg your pardon because you might be incommoded by the party noise.

In order to maintain good relation to our neighbours you get for this letter a drink of your own choice free.

Hofmann Beer!

I’ll have you know, I was not incommoded—but I cashed in the letter for a liter of beer anyways. I’m not going to be the one who refuses an offer of good relation. You might be surprised how hard it is to hold that much beer in a huge glass for very long. But we don’t get beer to hold, do we?

Sundown

I have been to two other Sommerfesten since the huge one at Erwin Rommel Straße. They have both been smaller, but all have had live bands, who seem to play almost entirely songs with English lyrics.

There is another Sommerfest tonight at one of the other dorms. There has been one just about every week for the past month. I guess I came at the right time of year.

A handful of photographers were documenting the evening at Erwin Rommel’s Sommerfest. I did not take any photos of my own, but I picked out several of the photos they took and have posted them here.

I think not.

I’ve only had one year of study in German, so language difficulties have often frustrated me. It’s not too big a problem though, because almost everyone can speak at least a little bit of English here. My limited control of German has helped me many times. But it’s also caused some funny situations, like this afternoon.

I sit at a computer for about 6 hours each weekday. Because I like to think sometimes that I’m not completely sedentary, I’ve been running two or three times a week for about 25 minutes. It’s something. Anyways, conditions were right and I’ve ended up with a mild case of athlete’s foot. Wanting to nip this in the bud, today I went hunting for some anti-fungal creme. I didn’t see anything like that at the supermarket last time I was there, but I remembered seeing a Drogerie nearby—more or less like a drugstore in the US, sans any sort of medication (as I found out later). So when running errands downtown today, I stopped there. Not seeing what I was looking for, I tried to describe it to the teenage Turkish girl who was working there—who surprisingly didn’t speak any English.

“Hallo,” I said. “Ich weiß das Wort auf Deutsch nicht… auf Englisch es ist ‘Athlete’s Foot’.” I don’t know the word in German… in English it is ‘Athlete’s Foot’. Seeing that she didn’t know what I was talking about, I continued.

“Sprechen Sie Englisch?” I asked.

“Nein.”

This might be awkward, I thought. I figured I’d try explaining it to her. And here’s where things went downhill. “OK. Ich lache zu viel, und dann meine Fuße heiß und rote sind.” Okay. I run too much, and then my feet are hot and red. At least that’s what I thought I said. What I really told her was that I laugh too much and then my feet are hot and red. I can only imagine what she was thinking.

She led me to an aisle of foot deodorant. Yes, this drug store has devoted 10% of its floorspace to foot deodorant. The Germans take their pedal hygiene seriously.

“Diese sind Fußdeosprays. Suchen Sie nach Fußdeospray?” These are deodorant sprays for the feet. Are you looking for these?

I took a closer look. “Nein, ich denke nicht. Ich brauche Medizin.” This evoked a stream of giggles. I seem to be good at that. I should have said Ich denke nein, rather than telling her that I don’t think.

The girl told me I should go to an Apotheke—the place where the drugs you find at a U.S. drug store are sold. She pointed me in the (wrong) direction of the nearest one, and I eventually made it there and got some Lamisil creme.

So, lessons of the day:

  • German drug stores do not sell drugs. Mostly shampoos and foot deodorants.
  • There is a difference between ‘lachen’ and ‘laufen’. I don’t think.
  • My broken German can be used to make girls giggle.

At least I got a good story out of it.

I’m not just drinking in Germany.

I’ve been programming for almost 3 weeks, and today I was excited to see the following output in my terminal:

    thread 0: starting with d1.data = 0 d2.data = 0
    thread 1: starting with d1.data = 0 d2.data = 0
    thread 2: starting with d1.data = 0 d2.data = 0
    thread 3: starting with d1.data = 0 d2.data = 0
    thread 3: ending with d1.data = 4 d2.data = 4
    thread 1: ending with d1.data = 4 d2.data = 4
    thread 0: ending with d1.data = 4 d2.data = 4
    thread 2: ending with d1.data = 4 d2.data = 4

Doesn’t look significant, does it :) ? The program that creates this output is a simple Java program that creates several threads that each increment a pair of shared counters in tandem. Here’s the code for the threads:

public class SynchronizedCount extends Thread
{
    Object lock;
    Data d1, d2;        // Data is a wrapper class around an int.
    int id;
    Barrier b;          // A barrier synchronization class.
    int num_iters;
 
    // I have omitted the constructor, which just sets the
    // instance variables to the constructor parameters.
 
    public void run()
    {
        synchronized(lock)
        {
            System.out.println("thread " + id +
                    ": starting with d1.data = " +
                    d1.data + " d2.data = " + d2.data);
            assert(d1.data == d2.data);
        }
 
        for(int x = 0; x < num_iters; x++)
        {
            synchronized(lock)
            {
                d1.data++;
                d2.data++;
                assert(d1.data == d2.data);
            }
        }
 
        b.block();
        // No thread will reach this point until all threads
        // have called b.block().
 
        synchronized(lock)
        {
            System.out.println("thread " + id +
                    ": ending with d1.data = " +
                    d1.data + " d2.data = " + d2.data);
            assert(d1.data == d2.data);
        }
    }
}

Each thread should have exclusive access to the counters when it accesses them—each access of the counters is a critical section.

Notice the assertions sprinkled throughout the code. I expect the values of both counters (d1 and d2) to always be the same, and so I check this at several places. If that condition is ever false, the program will crash and tell me which assertions failed. During development, assertions are a HUGE help in debugging and in writing bug-free code in the first place, as I very quickly learned. When releasing a production version of a program, assertions can be compiled out (in the case of C/C++) or disabled at runtime (in Java), so it is perfectly fine to leave them in source code.
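To make the pattern concrete, here is a self-contained sketch of the same test that runs on a stock JVM. It is not the code above: java.util.concurrent.CyclicBarrier stands in for the Barrier class, the CounterDemo and runCounters names are made up for illustration, and the invariant is checked with an explicit exception so that it fires even when assertions are disabled (the java launcher only enables assert when run with -ea).

```java
import java.util.concurrent.CyclicBarrier;

// Illustrative sketch only: several threads increment two shared counters
// in tandem, guarding every access with one shared lock.
class CounterDemo {
    // Wrapper class around an int, as in the original listing.
    static class Data {
        int data;
    }

    // Starts numThreads threads that each do numIters paired increments,
    // waits for all of them, and returns the final values {d1, d2}.
    static int[] runCounters(int numThreads, int numIters) {
        final Object lock = new Object();
        final Data d1 = new Data();
        final Data d2 = new Data();
        // Assumption: CyclicBarrier replaces the post's own Barrier class.
        final CyclicBarrier barrier = new CyclicBarrier(numThreads);

        Thread[] threads = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int x = 0; x < numIters; x++) {
                    synchronized (lock) {
                        // Critical section: both counters change together.
                        d1.data++;
                        d2.data++;
                        if (d1.data != d2.data) {
                            throw new IllegalStateException("counters diverged");
                        }
                    }
                }
                try {
                    // No thread passes this point until all have arrived.
                    barrier.await();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();  // join() makes the threads' writes visible here
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        return new int[] { d1.data, d2.data };
    }

    public static void main(String[] args) {
        int[] result = runCounters(4, 1);
        System.out.println("d1.data = " + result[0] + " d2.data = " + result[1]);
    }
}
```

With 4 threads and 1 iteration each, both counters end at 4, which matches the output shown earlier. Removing the synchronized block is a quick way to see the invariant check start failing under contention, although a race like that is not guaranteed to show up on any particular run.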

So why did I write this asinine program? It’s a test to detect whether the optimizations I’m making to Ronald Veldema’s distributed shared memory virtual machine for Java are correct. The VM is designed in particular for programs that require massive amounts of memory (on the order of many terabytes) in a huge number of objects.

A Distributed Shared Memory (DSM) provides the application programmer with a single address space that is mapped onto the address space of multiple machines. This lets the application programmer write code as he (she?) would for a single computer, but the data and threads can be distributed automatically onto the available resources of the machines the DSM is running on. Using a DSM lets one write a program that uses the resources of a computer cluster without having to write message passing code or map shared memory buffers among machines. You just need to write multithreaded code.

Although I don’t like Java very much as a programming language (it has been hijacked by buzzword-loving enterprise weenies, and is kind of a dumbed-down language to begin with), it is an ideal target for a DSM for a few reasons:

  • The language is designed to run on a virtual machine (VM). All DSM stuff can be implemented in the VM. Java programmers/users are accustomed to running a VM, so running on our VM that implements the DSM doesn’t really add complexity—programmers don’t have to link with special DSM libraries, and users don’t have to start an extra process, both of which would be needed for a DSM in a language without a VM.
  • Threads and a memory model designed for concurrent programming are in the Java spec.

Issues arise when parallelizing an application. Unless the problem is embarrassingly parallel, there will be a possibly significant amount of data that must be shared. So, considering the shared counters program I wrote about above, when running on our DSM, the counters could be allocated on any machine running the DSM (even on different machines), but each thread needs access. What is typically done in distributed applications is to move the data where it is needed. In Ronald’s DSM, instead of moving data to threads, he moves threads to data. The fact that Java has threads as part of the language specification makes implementing thread migration simpler than it would be otherwise. Thread migration is something that is usually only done for redundancy and for compensating for system failure. Doing it for performance is, as far as we know, a largely unexplored question.

If the program being parallelized has more reads of shared data than writes, performance can sometimes (dramatically) be improved by caching the shared data on the machines where it is often accessed. Then, data/thread migration can be bypassed for the cached data. This is what I am implementing this summer—an object replication mechanism for Ronald’s DSM that will (hopefully) offer big performance gains for many programs run on it. But caching requires synchronization when writes are performed for program behavior to be intelligible. The cache coherency protocol gets tricky fast. THAT is why I was excited when I saw the output from the counter program—it means that my synchronization protocol and implementation were correct for that run.

There are still likely race conditions present, which can be very hard to debug. An often-used strategy for detecting race conditions in a concurrent program is to run it for long periods of time (days, possibly) with a high level of concurrency—lots and lots of threads running on lots and lots of physical machines. Then you wait for your program to do the wrong thing. Before I can do this kind of testing, I need to make my caching scheme interface with Ronald’s garbage collector (which may turn out to be as complicated as the cache coherency protocol and implementation, which has taken me a couple weeks, and probably still has bugs).
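On a single machine, the same “run it hard and wait for it to go wrong” strategy looks roughly like the sketch below. It is only an illustration with made-up names (`CounterStressTest`, `run`), not my actual test harness: many threads increment a deliberately unsynchronized counter and an `AtomicInteger`, and any difference between the two afterwards means updates were lost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress test: hammer two counters from many threads and compare.
class CounterStressTest {
    static int unsafeCount;                                      // incremented without synchronization (racy on purpose)
    static final AtomicInteger safeCount = new AtomicInteger();  // incremented atomically

    // Starts `threads` workers that each increment both counters `perThread` times,
    // then waits for all of them to finish.
    static void run(int threads, int perThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    unsafeCount++;               // read-modify-write, updates can be lost
                    safeCount.incrementAndGet();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
    }

    public static void main(String[] args) {
        run(8, 100_000);
        System.out.println("expected 800000, safe " + safeCount.get() + ", unsafe " + unsafeCount);
    }
}
```

The catch is that a clean run proves nothing. The unsafe counter may well come out right on a given run, which is why this kind of testing needs lots of threads and lots of time.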

After interfacing with the garbage collector and becoming confident that there aren’t race conditions in my code, I need to make sure my object replication mechanism uses a bounded amount of memory. Right now, there is no limit on the amount of space the cache uses. This is bad, especially for a VM that is supposed to be very memory-efficient! What we did in Operating Systems class might actually be useful to me now: I’ll need to worry about replacement policies.
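As a reminder of what a replacement policy looks like, here is a minimal least-recently-used (LRU) sketch in plain Java. It is not what the VM will use; `BoundedReplicaCache` is a made-up name, and it bounds the number of entries rather than bytes, which is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: once more than maxEntries replicas are held,
// the least recently used one is evicted.
class BoundedReplicaCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedReplicaCache(int maxEntries) {
        super(16, 0.75f, true);  // accessOrder = true: iteration order is least to most recently used
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of two, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, because `a` was touched more recently. LRU is only one candidate; which policy suits object replicas is something I would have to measure.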

After implementing bounded memory usage for object replicas, I need to experiment with different replication heuristics, which will determine which objects should be replicated to which machines. Currently, replication is only done via a special Java method implemented in the VM that forces replication of an object to every machine.
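One of the simplest heuristics to try would be a per-object read/write ratio: replicate an object once it has been read often enough relative to how often it is written. The sketch below is only a guess at what such a heuristic could look like. The class `ReplicationHeuristic` and both thresholds are hypothetical, and a real version would track accesses per machine rather than globally.

```java
// Hypothetical per-object access statistics used to decide whether to replicate.
class ReplicationHeuristic {
    private final double minReadsPerWrite;  // assumed tuning knob: required read/write ratio
    private final long minReads;            // assumed tuning knob: ignore objects that are rarely read
    private long reads;
    private long writes;

    ReplicationHeuristic(double minReadsPerWrite, long minReads) {
        this.minReadsPerWrite = minReadsPerWrite;
        this.minReads = minReads;
    }

    void recordRead() {
        reads++;
    }

    void recordWrite() {
        writes++;
    }

    // Replicate only objects that are read often and written comparatively rarely.
    boolean shouldReplicate() {
        if (reads < minReads) {
            return false;
        }
        return reads >= minReadsPerWrite * Math.max(1, writes);
    }
}
```

The idea is that every write to a replicated object costs a round of synchronization, so replication only pays off when reads dominate.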

Finally, I will need to do performance tuning and benchmarking, i.e., with my replication mechanism enabled and disabled, with real applications! Ronald has a few applications he has written or he uses to test, including a model checker (essentially, a nondeterministic Turing machine simulator), an n-body simulation, and a program that finds subgraphs in a larger graph.

Between all this work, I’ll be presenting to the Computer Science faculty here in about 3 weeks. Also, Ronald and I are planning on submitting a joint paper on his DSM with my object replication mechanism to VEE by September. If I get published it will be fantastic.

Sommerfest.

I’ve been in Germany for two weeks now. I’ve met a few people. The vast majority of younger people I have met speak English very well. It’s a shame language is not emphasized in US education.

I have been trying to meet people since I came here. I found out that in one of the Erwin-Rommel dormitories—of which the house I live in is a part—there is a student-run, nonprofit bar. That’s right, they have permission to run a bar in their dorm. Awesome! It’s usually open on Mondays and Thursdays. I went on Monday, met a few people, and had some local microbrews. One tasted like smoked meat! Different, although not bad.

One of the people I met at the bar is a guy named Norbert, who also studies computer science. He’s done, except for his Diplomarbeit (the closest thing to it is a master’s thesis). He asked me something like, “Is it the same in the U.S., where computer science students don’t go out at all?” I told him it is, at least for some of us. “You don’t look that nerdy—you don’t have a neckbeard or NERD glasses!” he said. I got a kick out of that. He has a somewhat heavy accent.

I saw Norbert there again on Thursday along with a couple other people I had seen before, and they invited me to come with them to a party. At first I thought they meant an apartment party or a house party, but it turns out that at this time of year, many students and departments hold ‘Sommerfest’, or as far as I can tell, awesome parties with loud music, lots of beer and food, and dancing. A bunch of physics students were putting on their Sommerfest at about 11 PM, in the forest near one of the physics buildings. It was great. WAY cooler than any party I’ve been to in the U.S.

On Wednesday, the Erwin-Rommel houses hold their Sommerfest. It is apparently a very big one, biggest of all the dorms. Later in July, the computer science students have their Sommerfest. I’m excited.

I went on Google Maps and marked out a few places of interest for me in Erlangen. You can see it here. Germany as a whole has high-resolution satellite imagery on Google Maps, so switch it to hybrid mode, and zoom in!

On Entry.

On June 1st, I arrived in Nürnberg. Ronald Veldema (the Dutch guy I’m working for, see his university page here) had arranged to meet me at the airport. Here’s how it worked out.

After collecting my bag, we take a bus from the airport to the train station in Nürnberg, take the train from Nürnberg to Erlangen, and then another bus from the train station to the guest house where I am staying.

While Ronald and I fight with the train ticket machines (many of them are out of tickets), a cute oriental girl walks up to me and quickly tells me that she needs money for the train, then asks if I have money to give her (something along those lines, I realize later). The exchange goes something like this:

(Cute girl says something quick in German).

“…Wie, bitte?” (…Pardon?)

(Cute girl repeats herself).

“Es tut mir Leid, ich spreche nur ein bisschen Deutsch.” (Sorry, I don’t speak much German.)

“Oh, don’t worry about it.” She walks away.

Hooray, language awkwardness! And I’ve only been on the ground about 40 minutes. Ronald chuckles.

It’s uneventful until the bus ride in Erlangen. Ronald buys two tickets from the driver (a 20-something German guy shaved completely bald), and enters the bus first. The driver glowers at me because of my huge bag that has all my stuff in it (thanks Sara!), and ushers me onto the bus with a quick snap of his arm. This amuses me: the angry German stereotype!

At the next stop, an older woman, probably in her 60s or 70s, enters the bus. She sits in a seat close to where I am standing, and then rattles off a question to me in German:

(Older woman quickly asks me a question in German).

Lacking the tenacity to attempt understanding, I respond, “Es tut mir Leid, ich spreche nur Englisch.” (Sorry, I speak only English.)

“Tja! Nur Englisch…” the woman scoffs.

Language awkwardness again! Ronald chuckles.

Apartments on Erwin-Rommel Straße

We make it to the guest house. Ronald picked up the key to my room in advance. We open it up, take a look around. “This is nice—it’s like a hotel!” he observes. And it is pretty nice—16.32 square meters, with its own kitchenette and bathroom. I set my bags down and take a look around. There’s an ethernet cable coming out from behind a chest of drawers. “Do you have a laptop?” Ronald asks. “Let’s see if you can get online.”

Living Room/Kitchen

I set up the laptop on the desk and plug in the ethernet cable. I get a connection, but no IP address. I look for wireless, and find an unsecured ad-hoc network. I connect to it. Jackpot! I can get online, but I am a bit leery of doing anything sensitive (the person who set up the network might be sniffing for passwords or personal information). This is the only internet connection I have for the next few days, and it frequently goes up and down.

Work Desk and Furniture

We next head to the computer science building where I will spend lots of time working. (Professor Hatcher wasn’t kidding when he said Ronald is intense! Maybe 15 minutes at my room, and then we get to work—by keeping me awake until later that night, he hopes to get me adjusted to German time as quickly as possible.) The building is huge, several stories high, and it is one of THREE buildings for computer science. Friedrich Alexander University has only around 25,000 students, by the way, and is not a polytechnic or technical institute.

We catch one of the sysadmins before he leaves for the weekend, and he sets my computer accounts up. This will let me do work on the mini cluster, a heterogeneous cluster of 16 nodes with x86, PowerPC, Itanium, and x86-64 processors. I am told I need a different account to use the cluster at the compute center, which houses a semi-large cluster of about 180 nodes. I quickly realize that this university has WAY more money than UNH, whose computer science department doesn’t have its own building (let alone 3), and as far as I know, has only a 4-node cluster. Ronald tells me that lots of the development work for the MP3 audio format was done by a doctorate student at Friedrich Alexander University. I wonder if any patent money has come to the university.

Computer Science Building A

Ronald explains some fundamental aspects of the software we will work on for an hour or two, then we take a bus to the center of Erlangen so that I can buy bedsheets and a couple other things. The beds here that I’ve seen are different than in the U.S. There is a downy pad that goes on the mattress, and there is another downy blanket that goes inside a sheet. The pillows are about twice as big, like two pillows from the U.S. joined at the long edge to make a large square pillow.

I get some kind of pastry at a bakery in the department store (bakeries everywhere!), then we head back, at which point I pass out for about 13 hours. It’s an adventure!  I just wish I spoke more German.

Pictures from the weekend.

The internet in my room is finally working, so I uploaded the pictures I took this weekend from downtown Erlangen.

Erlangen Downtown

I have arrived.

On Thursday I flew out of Boston to Frankfurt am Main. When I was checking in for the flight, I was worried that I would have trouble because my bag and carry-on were too heavy. The carry-on was supposed to be no more than 18 pounds, and the checked baggage no heavier than 50 pounds. Well, both were over that, but no one said anything. Strange.

The flight to Frankfurt was a little less than 7 hours in the air—not too bad. I slept through the movie, and that’s all the sleep I had that day. I had ordered a “HIGH FIBER MEAL” when I ordered my tickets because there was no “NORMAL MEAL” option. They gave me the same as everyone else. For airline tickets, I went directly through Lufthansa’s website. Many people recommended different sites that purportedly give discounts to students. All of those sites except Lufthansa were actually more expensive than, for example, Orbitz, probably because I decided to fly during a peak traveling time for summer. Lufthansa was a little bit cheaper than Orbitz.

On the flight, they had free alcohol: red wine, white wine, scotch, Irish cream, etc. I don’t know if that’s standard fare… but hey, buzzed passengers are happy passengers. I had some wine and read, but mostly just sat thinking. They had a classical music channel playing which I listened to for most of the flight. As we flew in to Frankfurt, the sun was rising and Wagner’s Valkyrie played on my headphones. It was epic. I guess it’s my Germany theme music.

With cheapness of airfare comes a price. I had a 7 hour layover in Frankfurt, and didn’t make it out to see any of the city. I had to switch terminals for my connecting flight to Nuremberg. I passed through a tunnel connecting the two terminals. It was very long, kind of dim, with the walls and ceiling lit with lights that changed colors, and underwater new-age music with sonar pings. After going through the tunnel, we had to go through security again.

Going through security for the terminal switch, someone flying first class started arguing with the guy that was administrating the lines, who was making everyone go into one line rather than run the separate lines for first class/business/coach. “Why are you letting economy passengers go through the line marked for first class! You will make me miss my flight!” he said. The administrator responded, “Sir, I am in charge here. Please, just step into the line.” The situation escalated for a few minutes, ending with the “troublemaker” being detained by German police, and a Scandinavian in the line shouting to the administrator, “You treat us like cows! We are people and you treat us like cows!”

After passing into the next terminal, I walked around, got coffee and a brezel (pretzel), took a quick nap, read some newspapers, searched for wifi, and read a book. I almost picked up “Harry Potter und der Stein der Weisen”, but figured it would be cheaper outside of an airport. Plus, I had a book I was in the middle of. Seven hour layovers are long. There were many free newspapers in the terminal (mostly in German, and all in color, even on a weekday), a couple of which I looked at. I was able to get the gist of some of the articles I looked at, even if I didn’t understand word-for-word each sentence. There was one article about organ donation in Europe. There were a couple tables of statistics. Austria was ranked very low both in terms of percentage of people who are signed up as organ donors, and in percentage of people who have spoken about such donation with their families. If I recall, Germany was also ranked low, although not as low as Austria. Sweden was number one for organ donors. Perhaps they are enlightened. I’ve also read that Sweden is the most secular nation on earth. Hmmm…

I was near a television playing Deutsche Welle much of the time I was waiting. At one point, a short clip of what looked like Tobias Isenberg’s research was played. He was (is?) a faculty candidate for UNH computer science, and specializes in graphics. I saw his presentation at UNH a few months ago. The spot that was on TV was for a collaboration device for working with photos: a big touch-sensitive table that doubles as a screen, with some fancy-pants UI so you can “slide” photos across the table to other people, among other things. It was neat to see something I had seen in person first, on TV.

I realized when I got here just how dependent I have been on my cellphone as a timepiece. I haven’t done the whole watch thing for about five years, so my cellphone at home started doubling as a watch. It’s not GSM like the rest of the world uses, so I have that cell phone disabled until I come back (saves me money). Jet lag and lack of a convenient timepiece have wreaked havoc on my body clock.

More later. I have pictures to go through, and no reliable internet until Tuesday probably—right now I am tenuously piggybacking onto someone’s unsecured ad-hoc network, which has been going up and down.

Overview

For those who don’t know, and those who don’t know the details, here goes.  I applied for an IROP grant back in October and was accepted.  It’s funding me to fly to Nuremberg, Germany on Thursday, where I will be working for Ronald Veldema and Michael Philippsen at Friedrich-Alexander University until August.  I’ll be implementing a distributed hash table specifically for computer clusters.  See the description page here.  (Over)simply, I’ll be writing software for a computer cluster, or a “supercomputer on a budget”—a handful, or hundreds, or thousands of off-the-shelf computers connected via a network and programmed to work together.

I set up this site so I can let people know how my summer is going and so that I can post pictures.  And I get Subversion hosting with this web hosting plan, which is a big plus.