Amanzi: Blog for some things Amanzi, by Craig Taverner.

Neo4j read-only transactions still need success() (2014-08-12)<div dir="ltr" style="text-align: left;" trbidi="on">
Since Neo4j 2.0 it has been necessary to wrap read-only database access in transactions, just as was required for write access in 1.x. This has necessitated refactoring of much legacy code, adding transactions around code that previously did not require them. It turns out that there is a subtlety to this: a case where it is easy to make a mistake that does not cause any immediate problems.<br />
<div>
<br /></div>
<div>
Let's first explain a common pattern that leads to this issue. Consider a method with multiple return statements:</div>
<div>
<pre class="brush: java; title: ; notranslate" title="">public String getName(Node node) {
    if (node.hasProperty("name")) {
        return (String) node.getProperty("name");
    } else {
        return null;
    }
}</pre>
</div>
<div>
This code is a bit contrived, but it represents a common pattern: multiple return statements in code that performs read-only access to the database. I've seen much more complex cases, but this one suffices to demonstrate the problem. So, what happens if we add a transaction? The easiest approach is to wrap all the code in the method:</div>
<div>
<pre class="brush: java; title: ; notranslate" title="">public String getName(Node node) {
    try (Transaction tx = node.getGraphDatabase().beginTx()) {
        if (node.hasProperty("name")) {
            return (String) node.getProperty("name");
        } else {
            return null;
        }
    }
}</pre>
<div>
You might immediately notice I left out the tx.success() call. I did this for three reasons:</div>
<div>
<ul style="text-align: left;">
<li>I would need to add multiple calls, one before each return.</li>
<li>If I follow the pattern of calling tx.success() after all database access, I should create a new local variable for the result of the getProperty() call, making the code more complex.</li>
<li>It turns out that this code runs fine, so why bother with the tx.success() call?</li>
</ul>
<div>
This last point is the subtle catch. While this code seems to run fine, there are cases where it really does not. Before explaining those cases, let's look at how the code would appear if we were strict with the tx.success() call.</div>
</div>
<pre class="brush: java; title: ; notranslate" title="">public String getName(Node node) {
    try (Transaction tx = node.getGraphDatabase().beginTx()) {
        if (node.hasProperty("name")) {
            String name = (String) node.getProperty("name");
            tx.success();
            return name;
        } else {
            tx.success();
            return null;
        }
    }
}</pre>
<div>
Clearly this is noticeably more verbose than the previous version. And since the previous version worked, it is extremely tempting to write code like that. And I did. And all hell broke loose, and it took a while to figure out why.</div>
<div>
<br /></div>
<div>
It turns out that while a single transaction like this does run fine without the tx.success() call, it does not work with nested transactions. If other code calls your method, that code also wraps the call in a transaction, and that outer transaction does call tx.success(), then an exception will be raised. The inner transaction, even though it is only a nested placeholder, marks the whole transaction as not having completed successfully; when the outer transaction then marks success, the database rejects this case and throws a:</div>
<div style="font-family: 'Courier New', Courier, monospace; font-size: 10pt; padding-left: 20px;">
<b>TransactionFailureException</b><br />
- Unable to commit transaction</div>
<div>
with cause:</div>
<div style="font-family: 'Courier New', Courier, monospace; font-size: 10pt; padding-left: 20px;">
<b>RollbackException</b><br />
- Failed to commit, transaction rolled back</div>
<div>
and no further information.</div>
<div>
<br /></div>
<div>
When I first saw this, I could not understand where it was coming from. I was in the process of updating legacy code by adding more read-only transactions, successfully fixing several unit tests. However, I found that after fixing some unit tests, other apparently unrelated unit tests started failing with this exception. It occurred because the first tests added transactions deeper in the call stack, in the same methods used by other tests that had already added transactions higher in the stack. Normally it does not matter if you nest transactions; this is a common pattern. So it was not obvious, at first, that the nesting would cause the problem, but it sure did.</div>
<div>
<br /></div>
<div>
To clarify the cases when this problem occurs I wrote a little test code:</div>
<div>
<pre class="brush: java; title: ; notranslate" title="">private void readNamesWithNestedTransaction(boolean outer, boolean inner) {
    try (Transaction tx_outer = graphDb.beginTx()) {
        try (Transaction tx_inner = graphDb.beginTx()) {
            readNames();
            if (inner) {
                tx_inner.success();
            }
        }
        if (outer) {
            tx_outer.success();
        }
    }
}
</pre>
</div>
<div>
This method takes two booleans, which control whether tx.success() is called in the inner transaction, the outer transaction, both or neither. It turns out only one of the four possible combinations causes an exception (tested against Neo4j 2.1.2):</div>
<div>
<table>
<tbody>
<tr><th></th><th></th><th colspan="2" style="color: #555555; padding: 20px;">Outer</th></tr>
<tr><th></th><th></th><th>success</th><th>no-success</th></tr>
<tr><th rowspan="2" style="color: #555555; padding: 20px;">Inner</th><th style="width: 70pt;">success</th><td style="background: #5f5; width: 70pt;"></td><td style="background: #5f5; width: 70pt;"></td></tr>
<tr><th>no-success</th><td style="background: #f55; color: #550000; text-align: center;">Exception</td><td style="background: #5f5;"></td></tr>
</tbody></table>
<br /></div>
<div>
In the cases where the inner and outer are the same, no exception is thrown because there is no inconsistency. If both are true (success called for both), the transaction succeeds cleanly. If both lack the success call, the entire transaction rolls back, but since it is a read-only transaction, nothing is actually undone. If the inner is true and the outer false, the outer ignores the success of the inner transaction and decides on a full rollback, which again does nothing. However, in the specific case where the inner lacks success() but the outer calls it, the database marks this as an invalid case and throws the exception above, despite the transaction being read-only.<br />
<br />
My conclusion? While it is not strictly wrong to skip success() on read-only transactions, it is certainly a very bad idea. Perhaps you can get away with it in the outermost main loop of an app, when you are really certain your code will not be called from elsewhere, but be very, very wary of forgetting the success call in any method you expect to be called from other code. Either have no transaction at all (rely on the calling code to provide one), or accept the extra verbosity that comes with writing the code right.<br />
<br /></div>
</div>
</div>
Using GoPro Time-Lapse Photos for Mapillary (2014-05-10)<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUKBnDZO_3RWBFG9FK31ytknZioQPz3ZsEbRxUHrtQ7duaIOcn3w_CQWYh-2SQysV7fKVgvglqg72Xn31w2jVEGbWnRO8fEX2fqqloy2NPKwo-8PZ6ztbrHM4mKJ7PHgFonI0V0A/s1600/IMAG1228.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhUKBnDZO_3RWBFG9FK31ytknZioQPz3ZsEbRxUHrtQ7duaIOcn3w_CQWYh-2SQysV7fKVgvglqg72Xn31w2jVEGbWnRO8fEX2fqqloy2NPKwo-8PZ6ztbrHM4mKJ7PHgFonI0V0A/s1600/IMAG1228.jpg" height="206" width="320" /></a></div>
<br />
Over the last half year Mapillary has grown in popularity, with over half a million photos uploaded, more than 100k in the last ten days alone! And this is despite the fact that, at first glance, it feels like you have to use a hand-held smartphone to do the job. What if you could use a mounted action camera, like a GoPro, to take photos for Mapillary? Mounted on a bicycle, a helmet, even a car? It turns out you can, and last week I did a little drive and collected about 1700 photos in half an hour using time-lapse mode, consuming only 3.4GB of my 32GB SD card. I could have driven for over four hours and taken around 15,000 photos before filling my card!<br />
<br />
However, it is not entirely trivial to do this, so I thought I'd explain a bit about how it is done. The main problem is that the GoPro has no GPS. How then can the map know where to place the photos? It cannot. We have to tell it, and we have to use a separate GPS to do so. In my case I used my HTC One smartphone, which I also previously used for normal Mapillary photos. In brief the procedure is:<br />
<ul style="text-align: left;">
<li>Mount the GoPro somewhere, on a bicycle or car using the various mounting accessories available.</li>
<li>Start a GPS tracking app. I used geopaparazzi, but there are many others that will work just as well.</li>
<li>Configure the GoPro to take time-lapse photos. I chose 5MP, medium, narrow, 1s.</li>
<li>Start the GoPro and go for a drive or cycle tour.</li>
<li>Upload the photos and GPX track to a computer.</li>
<li>Geolocate the photos using time-correlation to the GPX track using gpx2exif.</li>
<li>Upload the photos to Mapillary.</li>
</ul>
<div>
OK, that doesn't sound too difficult, right? So let's describe these steps in more detail, using the drive I made last week.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif0aSGv4bCRLEU27_aTH1zNWFUCIfJyufSi-mlvYPdcWscHZ0mPXF1PLJ1yGgF6afU5H9fB2sWhc7pWvWWu_Af8LL4hnoC-I1fyil_CqhSIfCeQyjtEgxpgwjwxjpFwCmTEtVMUA/s1600/IMG_9743.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEif0aSGv4bCRLEU27_aTH1zNWFUCIfJyufSi-mlvYPdcWscHZ0mPXF1PLJ1yGgF6afU5H9fB2sWhc7pWvWWu_Af8LL4hnoC-I1fyil_CqhSIfCeQyjtEgxpgwjwxjpFwCmTEtVMUA/s1600/IMG_9743.JPG" height="213" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
I mounted the GoPro onto the front left fender of my Mazda RX8 to get a nice view of the center of the road.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_gZsXWl_ey8WTpe_uxAmZlEpIMEWw1nauYgPqFpD0abZkg1DprohFZAdbqk6CnECjknxzlbjBcOLZIFtI9KClNxMNGHznehMgW-IV6wpMAr11RDglz1OZVOLhU1xX8mVlnhDoyg/s1600/IMG_9745.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_gZsXWl_ey8WTpe_uxAmZlEpIMEWw1nauYgPqFpD0abZkg1DprohFZAdbqk6CnECjknxzlbjBcOLZIFtI9KClNxMNGHznehMgW-IV6wpMAr11RDglz1OZVOLhU1xX8mVlnhDoyg/s1600/IMG_9745.JPG" height="213" width="320" /></a></div>
<br />
Then I mounted my HTC One into a dedicated HTC mount inside the car. Using the '<a href="https://play.google.com/store/apps/details?id=com.gopro.smarty">GoPro App</a>', I could connect to the camera and get a preview of what the GoPro could see. This is convenient, but not really necessary, as the preview disappears as soon as the recording starts. It is sufficient to just have the phone lying in the car somewhere, as long as it gets a good GPS signal. If you do use the GoPro app, take the opportunity to go to the settings and set the camera time, to minimize the difference between the two device clocks. As you'll see later, you will need to correct for any remaining difference in order to perform the correlation, and the smaller the correction the easier the job.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFBoLZLA6Tpm3xOQKc4ocz9p0nPdN7vq12oYy3_KCnJXj6G-XhH-Kx1Xy6ptl3uqJi_dJ87O8gBfMwGuxQydiuQfom4BKXGIY7OD5j8tV2cp31CyYuYGKPj79nPbTVtER4HqQBMw/s1600/IMG_9751.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgFBoLZLA6Tpm3xOQKc4ocz9p0nPdN7vq12oYy3_KCnJXj6G-XhH-Kx1Xy6ptl3uqJi_dJ87O8gBfMwGuxQydiuQfom4BKXGIY7OD5j8tV2cp31CyYuYGKPj79nPbTVtER4HqQBMw/s1600/IMG_9751.JPG" height="208" width="320" /></a></div>
<br />
Make sure the GoPro is configured for time-lapse photos with the photo settings you like. This is easy to achieve using the GoPro App, but can also be done on the GoPro itself. I chose 5MP, Medium, Narrow to get a view I felt was somewhat similar to what I would see with my phone, with a focal length of approximately 20mm compared to a 35mm camera. The GoPro defaults to very wide-angle views with a lot of distortion, so I avoided that for this drive. In another blog I plan to describe an alternative scenario where I got Mapillary photos from a wide-angle <b><i>4K video stream</i></b>. That was more complex, so we'll skip it for now. With time lapse I selected a 1s gap, giving about one photo every 10-20m at speeds between 40 and 80 km/h. The Mapillary default of one photo every 2s is more suitable for cycling, and I was certainly planning to drive faster than I cycle!<br />
<br />
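As a sanity check on the interval choice, the distance between photos is just speed times interval. A minimal Ruby sketch of that arithmetic (my own helper, not part of any tool mentioned here):

```ruby
# Distance travelled between time-lapse frames, in metres,
# given a speed in km/h and a capture interval in seconds.
def metres_per_photo(speed_kmh, interval_s)
  (speed_kmh / 3.6) * interval_s
end

metres_per_photo(40, 1).round(1)  # 11.1 m between photos at 40 km/h
metres_per_photo(80, 1).round(1)  # 22.2 m between photos at 80 km/h
```

At cycling speeds the same formula shows why a 2s interval is enough: 20 km/h with a 2s gap also gives roughly 11m per photo.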
Start the GPS tracking app. In my case I used <a href="http://geopaparazzi.github.io/geopaparazzi/">geopaparazzi</a>. In this app there is a button for starting a recording. I clicked that button and accepted the suggested file name for storing the track. OK, now we are all ready to go. Just start the camera recording and drive!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCc0AsSlmb9mzEwCJKQ0CHD5fcw7wHq2wKH_dL_62ZHpd0IWmeERinr121SVhOgnW9RnJGa-QPjtTPJieuq0U2qmv-C3-isL1cnr-dN_-M-myQLtmRX3heydwBugtocTJuBo9BrA/s1600/v%C3%A4la_to_billesholm.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhCc0AsSlmb9mzEwCJKQ0CHD5fcw7wHq2wKH_dL_62ZHpd0IWmeERinr121SVhOgnW9RnJGa-QPjtTPJieuq0U2qmv-C3-isL1cnr-dN_-M-myQLtmRX3heydwBugtocTJuBo9BrA/s1600/v%C3%A4la_to_billesholm.png" height="177" width="320" /></a></div>
<br />
When you have finished the drive, stop the recording on both the camera and the GPS. Now the real work starts. We need to perform a geolocation before we can upload to Mapillary. In Geopaparazzi, I exported to GPX, then used a file manager app to find my GPX file and email it to myself. For the GoPro I simply pulled out the SD card, plugged it into the side of my laptop and copied the photos over to a local directory.<br />
<br />
The first thing I wanted to do was see what the drive looked like, so I ran:<br />
<pre class="brush: bash; title: ; notranslate" title="">geotag -g väla_to_billesholm.gpx \
-o väla_to_billesholm.png \
-D 1 -s 2048x2048</pre>
<br />
<div>
This produced a nice high-res view of the entire map. Note the use of the -D option to put a big gap between the marker labels. This is necessary because the geotag default is tuned for much smaller tracks, like a quick bicycle ride. From the image produced, we can see the times and locations of key landmarks on the drive. We will want to zoom in on a few distinctive areas we can use to manually check that the camera and GPS clocks are synchronized, or determine the exact error between them. We can correct for this error when correlating, so it is important to get it right.</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigC0lMMWuWUYbJ9nIaWWrgfhhS5P73mvfgfHY7f9uwhpdLbzoOet90ucPwI3jnqNjwuM3okNqoLjiD7_rTk_vfK_leeid0uJs_hH7sTsmW3WJGKSS8E6Fvsc0nuzAkXKzuKQyaiQ/s1600/krop.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigC0lMMWuWUYbJ9nIaWWrgfhhS5P73mvfgfHY7f9uwhpdLbzoOet90ucPwI3jnqNjwuM3okNqoLjiD7_rTk_vfK_leeid0uJs_hH7sTsmW3WJGKSS8E6Fvsc0nuzAkXKzuKQyaiQ/s1600/krop.png" height="320" width="320" /></a></div>
<br />
I ran the command:<br />
<pre class="brush: bash; title: ; notranslate" title="">geotag -R 20140505T12:39:00+02-20140505T12:41:00+02 \
-g väla_to_billesholm.gpx \
-o krop.png -D 0.1</pre>
<br />
<div>
which produced the image above. I can look for a photo taken when going under the highway bridge and confirm the times.<br />
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYwj29ExS2vkFtd6Lvj2ludn8g9dmPXvMGn3wwNreHFQq3lInjcqeE4cvSzge6FxCmJfok6ssHBH7KGT_Bn-N5f17UYUqC5P0hMVg4D82pHr0dxWJ9SY3FDIhl3rSsJAhLnfa7zA/s1600/G0010910.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiYwj29ExS2vkFtd6Lvj2ludn8g9dmPXvMGn3wwNreHFQq3lInjcqeE4cvSzge6FxCmJfok6ssHBH7KGT_Bn-N5f17UYUqC5P0hMVg4D82pHr0dxWJ9SY3FDIhl3rSsJAhLnfa7zA/s1600/G0010910.JPG" height="240" width="320" /></a></div>
<div>
<br /></div>
<div>
The EXIF data for this photo shows it was taken at 12:39:11. When looking at the map, we can see that we passed under the tunnel at 12:39:14. So we have a 3s error. We can use this in the geolocation process, but let's double check with another photo.</div>
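The clock comparison itself is simple time arithmetic. A small Ruby sketch, with the timestamps hard-coded from the bridge example above (the date and timezone are assumptions for illustration):

```ruby
require 'time'

# Offset in whole seconds between the GPS track time and the camera's
# EXIF time for the same landmark: positive means the camera clock
# is behind the GPS clock.
def clock_offset(exif_time, gpx_time)
  (Time.parse(gpx_time) - Time.parse(exif_time)).round
end

clock_offset('2014-05-05 12:39:11 +02:00',
             '2014-05-05 12:39:14 +02:00')  # 3 seconds
```

This is the number we will later feed to the correlation step as the time offset.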
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiadadIwVh70j3_V_UtF-QhIjsB_jb6Es423FJWsAL-1viGMCoFGc-Dfq4fD-EU4vzNIq3zeAMJ_H4OCITqUn8btIiEVAMLY0E2OHQQ7Zn4yFRKGC16ojccunNJIxNq_25xWpBBzQ/s1600/m%C3%B6rarp.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiadadIwVh70j3_V_UtF-QhIjsB_jb6Es423FJWsAL-1viGMCoFGc-Dfq4fD-EU4vzNIq3zeAMJ_H4OCITqUn8btIiEVAMLY0E2OHQQ7Zn4yFRKGC16ojccunNJIxNq_25xWpBBzQ/s1600/m%C3%B6rarp.png" height="320" width="320" /></a></div>
<br />
I created a map of the track through Mörarp, because I could identify features like buildings and intersections. It is a bad idea to use any intersection that you might have paused at, like I did at the right turn above, but look for landmarks along the drive where you were sure to be moving. I looked at the first road to the right, near the top of the map, and found the following photo at 12:46:10.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOPOLSsY8JiKUtGDe2WYHu1G4_UPknexxO8bzqKomYP5UgRSFybJnFqyg7Yf_3NbrYo-ky2AenRp8FFLZ1hz1ReRoGG375I8HwViH7bJ_iNb1hUW9IPuWmwykPV8bjPTelRedoWA/s1600/G0011328.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjOPOLSsY8JiKUtGDe2WYHu1G4_UPknexxO8bzqKomYP5UgRSFybJnFqyg7Yf_3NbrYo-ky2AenRp8FFLZ1hz1ReRoGG375I8HwViH7bJ_iNb1hUW9IPuWmwykPV8bjPTelRedoWA/s1600/G0011328.JPG" height="240" width="320" /></a></div>
<br />
I've taken note of things I know to be at the specified location, the speed warning sign in the middle, the white marker on the right for the intersection, and the tree and lamp-post on the side.<br />
<br />
One interesting double-check you can also do if you happen to be in an area covered by google street view, is visually compare those images.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyB74MNN48S3cChxqis54n57VYGYfz75-vEIRBdcAzmoEjId75GnCKSQGj6gS2xIwerXpMgrCDeBjatGmJgyOEb_W8NA3V5HSB9WI-zgtU7-k8EH8V5oxn8PQHrmdj3gioBJKfCw/s1600/m%C3%B6rarp1-streetview.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyB74MNN48S3cChxqis54n57VYGYfz75-vEIRBdcAzmoEjId75GnCKSQGj6gS2xIwerXpMgrCDeBjatGmJgyOEb_W8NA3V5HSB9WI-zgtU7-k8EH8V5oxn8PQHrmdj3gioBJKfCw/s1600/m%C3%B6rarp1-streetview.png" height="155" width="320" /></a></div>
<br />
In google street view we can see the speed warning, the intersection marker, and the tree and lamp-post, but notice that the fence is not there and fir trees can be seen instead. Clearly someone has been building since the google car passed! Google said the image was captured in September 2011, which is about 2.5 years ago, so things can certainly change.<br />
<br />
On the GPX track of Mörarp, we see this intersection was passed at 12:46:13, which is 3s after the camera image timestamp. Again we have a 3s offset. This is good news. It appears the entire track was off by the same amount. We can go ahead and correlate all 1500 photos using the command:<br />
<br />
<pre class="brush: bash; title: ; notranslate" title="">geotag -g väla_to_billesholm.gpx \
20140505_TimeLapse/*JPG -t 3 -v</pre>
<br />
I set the time offset with the '-t 3' option, and used '-v' to watch the progress. Since the script actually wraps the command-line tool 'exif_file', a new process is started to edit each file, and this can take a while, but in the end your images will have the GPS positions from the GPX track written into their EXIF headers.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyuLNIlr28L9sDk3D9wUZzcQbhy0qURw-ztfGlgViq00Vum08-cxiviNR3lNjG-zn2KvqgIquiH2Y2wUsN6OJ-AVTeb7grtmZ2kqN8bqCWwzLh7e0OEOKm3GDCwu1u2vhtHdBtiA/s1600/exif2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyuLNIlr28L9sDk3D9wUZzcQbhy0qURw-ztfGlgViq00Vum08-cxiviNR3lNjG-zn2KvqgIquiH2Y2wUsN6OJ-AVTeb7grtmZ2kqN8bqCWwzLh7e0OEOKm3GDCwu1u2vhtHdBtiA/s1600/exif2.png" height="230" width="320" /></a></div>
<br />
Once the images are geolocated, then you can upload them to mapillary.com. Simply login, then click on your name, choose 'upload images', then click the 'choose files' green button. Once the files are all listed, scroll to the bottom and click the 'Start Uploading' button. It will go a slightly paler green, so it is not that obvious that the uploads have started. Just scroll to the top again and you should see thin red progress bars below each image as they are uploaded.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6q0_hceT6CDTyTUJwS2Y6DGCOYIc2hKEvVEEx22ct3Dc_xNZsGrLUclmIOh4tn9_1RknQdfwsrzIH0EtOWnWhBbwEp-IZsf5JVY7N_jfGxxc-D-leaTBIyL4Zt0YaqehJxTXiBA/s1600/Uploading.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6q0_hceT6CDTyTUJwS2Y6DGCOYIc2hKEvVEEx22ct3Dc_xNZsGrLUclmIOh4tn9_1RknQdfwsrzIH0EtOWnWhBbwEp-IZsf5JVY7N_jfGxxc-D-leaTBIyL4Zt0YaqehJxTXiBA/s1600/Uploading.png" height="186" width="200" /></a></div>
<br />
And finally, once the upload is completed, click your name and select 'my uploads', and you should see a new image for your track.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUmG1fL6JIbX61RLbQuR03GO047mNuJziXxORHK-_b0cEKEsWdZWn-nKbktQp4Fww8R6_T4uoNT3MifC1hRcGFldj1PG1-4eISdhS0sEDDE-IGug1042pQGDIhyphenhyphenje6_9Q7TqnZ0A/s1600/my_uploads.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUmG1fL6JIbX61RLbQuR03GO047mNuJziXxORHK-_b0cEKEsWdZWn-nKbktQp4Fww8R6_T4uoNT3MifC1hRcGFldj1PG1-4eISdhS0sEDDE-IGug1042pQGDIhyphenhyphenje6_9Q7TqnZ0A/s1600/my_uploads.png" height="273" width="320" /></a></div>
<br />
Click your most recent upload to browse the photos in mapillary!<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3yC71ixwOh8soodJjuXjMhisDikLWyw6McD-SC3ckNCFBVZmGwQJg3cr_iUsgHvu1jmBT807tRLaqLsl5ItO02Rop0jBzG3l9_G4x18TKdsiYswEL-5rZ2G5McxkAFvZ7my_7mg/s1600/photo_in_mapillary.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3yC71ixwOh8soodJjuXjMhisDikLWyw6McD-SC3ckNCFBVZmGwQJg3cr_iUsgHvu1jmBT807tRLaqLsl5ItO02Rop0jBzG3l9_G4x18TKdsiYswEL-5rZ2G5McxkAFvZ7my_7mg/s1600/photo_in_mapillary.png" height="156" width="320" /></a></div>
<br />
It might take time for the photos to be completely processed, so don't worry if they are not immediately available. Just come back and look a little later. In the meantime there is something else really fun to play with!<br />
<br />
<h3 style="text-align: left;">
Time Lapse Video</h3>
The GoPro settings for taking these photos were called 'Time Lapse' for a reason: they can be used to make a video. Since we recorded one frame per second, if we make a video at 25fps we will have a 25x speed-up. This is pretty cool! See what I made below...<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.youtube.com/embed/CGn7KmUiozk?feature=player_embedded' frameborder='0'></iframe></div>
<br />
This video was made using the following process:<br />
<ul style="text-align: left;">
<li>Rename all photos to have names with numbers starting at 0000. In my case I used foo-0000.jpeg. To make life easy, I wrote a Ruby script to create hard links with the appropriate names. Then you can use the ffmpeg command to make a video like this:</li>
</ul>
<pre class="brush: bash; title: ; notranslate" title="">ffmpeg -f image2 -i foo-%04d.jpeg \
-r 25 -s 1280x960 ../myvid.mp4</pre>
</div>
<ul style="text-align: left;">
<li>This command compresses the 7MP 4:3 images to 960p HD video.</li>
<li>Then I used OpenShot video editor to trim to 16:9, add audio, add the map, and compress to 720p medium quality for faster web uploads.</li>
</ul>
<div>
<br /></div>
<div>
<h3 style="text-align: left;">
Installing gpx2exif</h3>
In this blog we made extensive use of the geotag command, which is included in the 'gpx2exif' Ruby gem. This blog used features available in version 0.3.6. However, at the time of writing, the published gem was at version 0.3.1. So I will explain both the easy way to install (assuming 0.3.6 is published before you try this) and the slightly harder way (if 0.3.6 is not published, or you wish to make changes to the script yourself).<br />
<br />
<h3 style="text-align: left;">
Installing Ruby on Ubuntu</h3>
<div>
The first thing you need is Ruby. This depends on the OS. I use Ubuntu 14.04 and RVM, so I recommend you refer to <a href="http://ruby-lang.org/">ruby-lang.org</a> and <a href="http://rvm.io/">rvm.io</a> for advice on your own platform. Here I'll give instructions for my OS, so you will need to make some adjustments to suit your platform.</div>
<pre class="brush: bash; title: ; notranslate" title="">sudo apt-get install curl
curl -sSL https://get.rvm.io | sudo bash -s stable --ruby
sudo usermod -g rvm craig
# logout and login to get RVM group
source /etc/profile.d/rvm.sh
</pre>
<br />
The code that creates the png images above makes use of <a href="https://help.ubuntu.com/community/ImageMagick">ImageMagick</a>. On Ubuntu at least this means you need to install a few dependencies first:<br />
<pre class="brush: bash; title: ; notranslate" title="">sudo apt-get install imagemagick imagemagick-doc libmagickwand-dev
gem install rmagick # Requires libmagickwand-dev to compile
</pre>
<div>
<br /></div>
<h3 style="text-align: left;">
Installing from rubygems.org</h3>
Once ruby is installed, simply install the gem:<br />
<pre class="brush: bash; title: ; notranslate" title="">gem install gpx2exif
</pre>
<br />
Then list the installed gems with 'gem list' to see which version you got. If it is not 0.3.6 or later, then use the instructions below.<br />
<br />
<h3 style="text-align: left;">
Installing from github</h3>
Install git, and then run the command:<br />
<pre class="brush: bash; title: ; notranslate" title="">git clone https://github.com/craigtaverner/gpx2exif
cd gpx2exif
bundle install
rake build
gem install pkg/gpx2exif-0.3.6.gem
</pre>
<br />
If all goes well, you will have built and installed the latest Ruby Gem.<br />
<br />
Have fun!<br />
<br /></div>
</div>
Paging huge cypher requests (2014-04-08)<br />Recently I typed a simple cypher command into the Neo4j browser that effectively 'killed' the server. By this I mean the server process went to 100% CPU usage and remained there. It became unusable. In retrospect I should have expected this, since the command I typed was going to hit about 80% of a 4.5 million record database - all in <b>one transaction</b>!<br />
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (c:GeoptimaEvent)
SET c :ConfigCheck
RETURN count(c)
</pre>
This command finds all nodes labeled GeoptimaEvent and adds the label ConfigCheck. My goal was to change all node labels, first by adding the new one and then by removing the old one. But what happened instead was that the server started struggling to allocate memory to hold the entire 4M change transaction. No other requests to the server could be handled. Luckily Neo4j is resilient against failure. I simply needed to restart the server to get back to my previous state.<br />
<br />
So, how do we do this properly?<br />
<br />
I was given a great suggestion by Jacob Hansson to split the work into blocks, and by Ian Robinson, who pointed out that the SKIP and LIMIT statements can apply to the WITH clause. This led to the following command:<br />
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (c:GeoptimaEvent)
WHERE NOT (c:ConfigCheck)
WITH c LIMIT 1000
SET c :ConfigCheck
RETURN count(c)
</pre>
See how this command will only change 1000 nodes, and only those that have not already been changed. This is achieved by first streaming <span style="color: #674ea7; font-family: Courier New, Courier, monospace;"><b>(c:GeoptimaEvent)</b></span> nodes through the filter <span style="color: #674ea7; font-family: Courier New, Courier, monospace;"><b>WHERE NOT (c:ConfigCheck)</b></span>, taking only the first 1000 off the top of the stream using <span style="color: #674ea7; font-family: Courier New, Courier, monospace;"><b>WITH c LIMIT 1000</b></span>, and then applying the <span style="color: #674ea7; font-family: Courier New, Courier, monospace;"><b>SET c :ConfigCheck</b></span> label to the nodes. We return the number of nodes changed. By repeating the command until the result returned is zero, we can change all the nodes.<br />
<br />
This command took only a few hundred milliseconds on our standard server (16GB RAM Quad Core i5-4670K, Neo4j 2.0.1 default installation in Ubuntu 12.04 LTS). However, we would have to repeat this command about four thousand times to change the entire database, so let's figure out how big we can go with our transaction size before performance becomes a problem.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQw5pn6ihkTgp9RxHr0fdKm6b9_BOaoO7phU4VJ0OrGniy5UBYKdXLUzbfLB6-B5GZTms2YDniwTszguyUq37Soqy7MLvXELrOgcN8bXwT7XMPJ4iUDHZhn7oPPTI8QfBRI1CFvg/s1600/ChartCypherDurations.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQw5pn6ihkTgp9RxHr0fdKm6b9_BOaoO7phU4VJ0OrGniy5UBYKdXLUzbfLB6-B5GZTms2YDniwTszguyUq37Soqy7MLvXELrOgcN8bXwT7XMPJ4iUDHZhn7oPPTI8QfBRI1CFvg/s1600/ChartCypherDurations.png" height="226" width="400" /></a></div>
<br />
By trying the same command with different LIMIT settings, we can see that the performance scales nicely and linearly up to around 400,000 records. After this, things get noticeably slower, and after 600,000 nodes it gets really bad. What is happening here is that GC is kicking in. And if you go for large enough transactions you could exceed the maximum heap size.<br />
<br />
We only needed to repeat the command ten times with a transaction size of 400,000 in order to change all 4,000,000 nodes. I was happy to do this in the Neo4j browser. If I needed to repeat the command many more times, I would have written a Ruby script and used <a href="https://github.com/maxdemarzi/neography">Neography</a>.<br />
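If I had scripted it, the driver loop would have been trivial. The sketch below is illustrative rather than taken from a real script: the block stands in for the Neography call that would run the batched Cypher above and read back the count, and here it just simulates the server's responses.

```ruby
# Repeat-until-zero driver for batched updates (illustrative sketch).
# In a real script the block would run the batched Cypher via Neography
# (e.g. Neography::Rest.new.execute_query(...)) and return count(c);
# here it simulates a 4,000,000-node database updated in 400,000-node batches.

def repeat_until_zero
  total = 0
  loop do
    changed = yield          # run one batch, get the number of nodes changed
    total += changed
    break if changed.zero?   # server reports nothing left to relabel
  end
  total
end

remaining = 4_000_000
updated = repeat_until_zero do
  batch = [remaining, 400_000].min
  remaining -= batch
  batch
end
puts updated                 # 4000000
```

The same loop works unchanged for the label-removal pass below.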
<br />
Now that we've added the new labels, we can remove the old ones by repeating the following command until it returns 0 changes:<br />
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (c:GeoptimaEvent)
WITH c LIMIT 400000
REMOVE c :GeoptimaEvent
RETURN count(c)
</pre>
<h2>
A more complex example</h2>
The above case was pretty simple really. However, since then I've been faced with a much more challenging example, bulk copying properties from one part of the graph to another.<br />
<br />
Let's start with a little background to the problem. The database above is actually a tree structure, with the leaf nodes representing the requests made to a web service. We wanted to data mine the Apache2 logfiles, and in order to calculate high-performance statistics we built a tree structure with parent nodes representing the kinds of aggregations we would like to make. We imported the data using the csvtreeloader at https://github.com/craigtaverner/csvtreeloader, leading to a tree structure like:<br />
<pre class="brush: ruby; title: ; notranslate" title="">(d:Device)
-[:versions]-> (v:GeoptimaVersion)
-[:days]-> (x:EventDay)
-[:checks]-> (c:ConfigCheck)
</pre>
We can ask queries like "How many service checks were there per day during March?":<br />
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (x:EventDay)-[r:checks]->()
WHERE x.day >= "2014-03-01"
AND x.day < "2014-04-01"
RETURN x.day as Day, count(r) as Checks
ORDER BY x.day DESC
</pre>
This command returns very quickly and is used in dynamic websites providing instant statistics on the activity of the service.<br />
<br />
The problem I faced was that some critical information about the service check, the PLMN code of the device making the check, was saved at the (c:ConfigCheck) level. Considering that we had about 10,000 devices and 4,000,000 config checks, any query on PLMN would hit 400 times as many nodes as needed. We needed to move this critical information up the tree. However, this is not trivial, because the most obvious command to do this will read all 4,000,000 ConfigCheck nodes and repeatedly copy the same information:<br />
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (v:GeoptimaVersion)-->(x:EventDay)-->(c:ConfigCheck)
WHERE NOT has(v.plmn)
SET v.plmn = c.mcc+'-'+c.mnc
RETURN count(v)
</pre>
This command has two problems:<br />
<ul>
<li>It will read all 4,000,000 ConfigCheck nodes in one transaction (same problem as before)</li>
<li>It will set the same PLMN code over and over on the GeoptimaVersion node (wasted effort)</li>
</ul>
<div>
We can fix both problems with the following interesting command:
</div>
<pre class="brush: ruby; title: ; notranslate" title="">MATCH (d:Device)-->(v:GeoptimaVersion)
WHERE NOT has(v.plmn)
WITH d, v LIMIT 1000
MATCH (v)-[:days]->()-[:checks]->(c)
WITH d, v,
filter(
x in collect(
distinct replace(c.mcc+'-'+c.mnc,'"','')
)
WHERE NOT x =~ '.*null.*'
) as plmn
LIMIT 1000
SET v.plmn = plmn
RETURN v.version_name, v.plmn, plmn
</pre>
The key points in the above command are:<br />
<ul>
<li>We search first for the Device and Version, filtering for ones without the PLMN and using blocks of 1000. Since there are 10,000 devices, this command only needs to be run about 10 times. Each of these will hit about 10% of the database.</li>
<li>We search for all ConfigCheck events belonging to each Device, and apply the filter() expression to combine them into a single value for that device.</li>
<li>We finally set the value to the requisite parent node and return the results.</li>
</ul>
<div>
These commands each took about 20s to run on the same server. Considering how much is going on here, this is quite impressive performance, I think.</div>
<div>
<br /></div>
<div>
One part of the above command deserves a little more explanation: the construction of the PLMN. We used the following expression:</div>
<pre class="brush: ruby; title: ; notranslate" title="">filter(
x in collect(
distinct replace(c.mcc+'-'+c.mnc,'"','')
)
WHERE NOT x =~ '.*null.*'
) as plmn
</pre>
What this does is:<br />
<ul>
<li>Combine the properties <span style="color: #674ea7; font-family: Courier New, Courier, monospace;"><b>c.mcc + '-' +c.mnc</b></span></li>
<li>This string contained double quotes, for example my own device has '"240"-"06"' and I expect to see '240-06' for Telenor, Sweden. We use the replace() function to remove the double quotes.</li>
<li>Then collect() with the distinct keyword gathers the unique values into a single collection.</li>
<li>And finally filter() removes any entries containing the string 'null'.</li>
</ul>
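Since Cypher's collection functions can be tricky to reason about, it helps to mimic the pipeline in plain Ruby. This snippet is purely illustrative (the sample check data is invented), but each step lines up with one clause of the expression above:

```ruby
# Mimic collect(distinct replace(...)) + filter() from the Cypher above.
# The check data is invented for illustration.
checks = [
  { mcc: '"240"', mnc: '"06"' },  # quoted values, as stored on ConfigCheck
  { mcc: '"240"', mnc: '"06"' },  # duplicate from another check
  { mcc: 'null',  mnc: '"01"' }   # missing mcc; must be filtered out
]

plmn = checks
  .map { |c| "#{c[:mcc]}-#{c[:mnc]}".delete('"') }  # c.mcc+'-'+c.mnc with replace(x,'"','')
  .uniq                                             # distinct
  .reject { |x| x =~ /null/ }                       # WHERE NOT x =~ '.*null.*'

puts plmn.inspect                                   # ["240-06"]
```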
<div>
This was a complex example indeed, and it took a while to figure out. The fun part, though, was that this could all be done through a little trial and error in the Neo4j Browser. Once or twice I typed in a command that hit too much of the database, but a quick restart later and I was back on track. Once it worked, it worked well, and I was able to migrate the entire database quite quickly.<br />
<br /></div>
Craig Tavernerhttp://www.blogger.com/profile/05902010161313785589noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-37806388560301797772013-01-16T23:09:00.000+01:002013-02-01T10:13:52.064+01:00Unwanted side effects<link href="http://neo4j.org/highlighter/shAll.css" rel="stylesheet" type="text/css"></link>
<script src="http://neo4j.org/highlighter/shAll.js" type="text/javascript"></script>
<script type="text/javascript">
SyntaxHighlighter.config.bloggerMode = true;
SyntaxHighlighter.defaults['font-size'] = '90%';
SyntaxHighlighter.defaults.gutter = false;
SyntaxHighlighter.all();
</script>
Last week the performance of Neo4j Spatial improved by 100 times! What the..? Can this be real? Does it mean Neo4j Spatial was always 100 times too slow? Of course, the truth is more subtle than that. Firstly, it was only the performance of the 'add geometry' function that changed, and only for the newer IndexProvider API.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1tuPPo4qzRhNzN5oPDxF5i7vyYS29VPb-Rz_RjkwLyxAbf0Bx3hJpW0OApqAqBWqL5Q9Kns4RPZgrkHfWIymyZfMmHJPr-N_DtLPSYVgtMZocr-EwVEcTu3qM-KDyZziRErBlBA/s1600/addPerformance.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="346" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1tuPPo4qzRhNzN5oPDxF5i7vyYS29VPb-Rz_RjkwLyxAbf0Bx3hJpW0OApqAqBWqL5Q9Kns4RPZgrkHfWIymyZfMmHJPr-N_DtLPSYVgtMZocr-EwVEcTu3qM-KDyZziRErBlBA/s400/addPerformance.png" width="400" /></a></div>
<br />
Take a look at the chart above. The yellow and green show performance starting at around 300 geometries/second improving quickly to over 1000 nodes/second on my laptop. This was after applying <a href="https://github.com/neo4j/spatial/pull/83">Axel Morgner's bug-fix</a>. Before the fix, the blue and red lines show the performance dropping quickly to as low as 10 geometries/second once the index contains around 20000 geometries. That is way too slow. In fact the performance was only acceptable for the first 1000 geometries or so, only good for small cases, perhaps the test cases. And that is the first hint as to why this was not noticed until recently.<br />
<br />
The idea that a bug fix can break something else is not new. And code that works reliably in one place can fail miserably in another. Sometimes the results can be quite dramatic, as we see above. So what really happened here? It all started with a bit of old code that chained all geometry nodes together in a long chain. This code was not really needed, except by one obscure test case, and one unused API method. It was originally coded as a trivial example of a domain model, since Neo4j Spatial allows you to add an index to any graph structure you wish. But as time went by this model became the default case for whenever users did not have their own domain model, which is all of the time when using the higher level API's. And, of course, no-one bothered to remove it, because it had no obvious harmful side effects. Yet!<br />
<br />
And then two things happened that are related to this code. Firstly, in 2011 Peter Neubauer developed a new API on Neo4j Spatial enabling the use of the Spatial Index through the standard Neo4j index API. This new code wanted to check against adding the same node twice to the same index. The easy solution to this was to use the old code, the old long chain of connected geometry nodes. This was tried and tested code, a tried and tested data structure, so it should be good, right? Wrong!<br />
<br />
Then in September 2012 Axel Morgner noted that under some circumstances the old code caused NullPointerExceptions. This occurred if there was an error during the transaction that added the geometry. This error could occur in any application level code, and would simply roll back the transaction, which should be fine, but not with that old code. There was a static reference to the end of the geometry chain, a variable called previousGeomNode, that would end up pointing to a Node that had been rolled back. The simple fix was to not keep this reference at all. But that meant that each addition would take a little longer than the previous, because it would have to traverse the chain to find the end before adding to it. You can see quickly that this will escalate to a serious problem. After some upgrades Axel found he could not reproduce the bug, so it was never fixed because the obvious fix would cause a more serious bug.<br />
<br />
And then the real problem came. Not from the NPE, but from the shift of users from the old API to the new one. It turned out that the new API was using a version of the long-chain code that did not have the NPE bug. No static reference. Nice clean code. But that meant it should have the expected performance problem. And it sure did. As more users started loading larger datasets, the performance of Neo4j Spatial fell through the floor.<br />
<br />
In November 2012, Jonathan Winterflood noticed that <a href="https://groups.google.com/forum/?fromgroups=#!topic/neo4j/qgGI60taSmA">performance dropped dramatically</a> as more geometries were added through the IndexProvider API, and submitted <a href="https://github.com/neo4j/spatial/issues/72">bug report 72</a>. The problem was discussed on the forums and others noticed this problem too. Rune Sörensen said "<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: x-small;">this is a show stopper for </span><span class="il" style="background-color: #ffffcc; color: #222222; font-family: arial, sans-serif; font-size: x-small;">spatial</span><span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: x-small;"> data</span>". And then the community came to the rescue. Jonathan submitted a <a href="https://github.com/neo4j/spatial/pull/73">pull request</a> for a good test case demonstrating the problem. Then Axel Morgner submitted a series of <a href="https://github.com/neo4j/spatial/pull/83">pull requests</a> fixing the problem. Finally Craig merged in the fixes and did some tests on the old and new code, resulting in the dramatic chart above.<br />
<br />
But this was not all. I looked deeper at the problem and felt that the old 'long chain' code was the real problem here, and while Axel's fixes removed its use from the new API, it was still lying there tempting others to see what bugs they could write in future. So I removed this code almost completely. All standard code, test cases and formal API's no longer use this code. The consequence is that geometries are no longer ordered by insert order, something no-one has requested (yet). But just in case they do, or they already rely on this behavior in the old code, a 'new' class now exists called <a href="https://github.com/neo4j/spatial/blob/master/src/main/java/org/neo4j/gis/spatial/OrderedEditableLayer.java">OrderedEditableLayer</a> that uses this old code, in a small, self-contained way. Developers have to explicitly make use of that code if they want this behavior. And right now the only code I know that still uses this is the <a href="https://github.com/neo4j/spatial/blob/master/src/main/java/org/neo4j/gis/spatial/ShapefileImporter.java">ShapefileImporter</a>, and then only if you pass a special flag to enable geometry ordering.<br />
<br />
OK, so all is well in spatial again. But before ending, I wanted to give a quick overview of the various API's mentioned above. Let's move away from the abstract discussions on side effects and get into some actual code. Here are some examples, each using a different one of the various API's mentioned above.<br />
<br />
<h4>
IndexProvider: compatible with other Neo4j indexes</h4>
<br />
Let's start with the newer API because it was the one with the bug. This API does not give you access to the full feature set of Neo4j-Spatial, but is convenient because of its compatibility with other Neo4j indexes. In the example below we show how to create an index, add geometries to it and then query for whether they are inside a polygon:
<br />
<br />
<pre class="brush: java">// Create the index
Map<String, String> config = SpatialIndexProvider.SIMPLE_POINT_CONFIG;
Index<Node> index = db.index().forNodes("layer1", config);
// Add the geometry
Transaction tx = db.beginTx();
try {
    Node n1 = db.createNode();
    n1.setProperty("lat", (double) 56.2);
    n1.setProperty("lon", (double) 15.3);
    index.add(n1, "dummy", "value");
    tx.success();
} finally {
    tx.finish(); // always close the transaction, even if something above throws
}
// Query the index
IndexHits<Node> hits = index.query(
LayerNodeIndex.WITHIN_WKT_GEOMETRY_QUERY,
"POLYGON ((15 56, 15 57, 16 57, 16 56, 15 56))");
</pre>
<br />
This example was extracted from the code in <a href="https://github.com/neo4j/spatial/blob/master/src/test/java/org/neo4j/gis/spatial/IndexProviderTest.java">IndexProviderTest</a>, which contains many more examples for how to interact with the index.<br />
<br />
<h4>
SpatialDatabaseService: the original API</h4>
<br />
Here you can access the RTree in a very flexible way, and can also plugin your own GeometryEncoders and do all kinds of wild things with Neo4j Spatial. For the purpose of comparison, I'll focus only on that part in common with the other API, adding and querying simple point geometries.<br />
<br />
<pre class="brush: java">// Create SpatialDatabaseService on existing database
SpatialDatabaseService geo = new SpatialDatabaseService(db);
EditableLayer layer = (EditableLayer) geo.createSimplePointLayer("test", "lon", "lat");
// Add the geometry
Transaction tx = db.beginTx();
try {
    Node n1 = db.createNode();
    n1.setProperty("lat", (double) 56.2);
    n1.setProperty("lon", (double) 15.3);
    layer.add(n1); // Compare this to the index.add(node,key,value) method above
    tx.success();
} finally {
    tx.finish(); // always close the transaction, even if something above throws
}
// Use GeoPipeline to find geometries around point
List<SpatialDatabaseRecord> results = GeoPipeline
.startNearestNeighborLatLonSearch(layer, new Coordinate(15.3, 56.2), 1.0)
.toSpatialDatabaseRecordList();
</pre>
<br />
This code is using the GeoPipeline API to perform the search. This is a useful approach because GeoPipeline can chain many operations together, resulting in quite complex GIS queries performed in a streaming and efficient way. Look at TestSimplePointLayer.java (for point data) and GeoPipesTest.java for more complex examples.<br />
<br />Craig Tavernerhttp://www.blogger.com/profile/05902010161313785589noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-22690583004897377392009-10-29T12:09:00.000+01:002009-10-29T13:01:20.769+01:00One liner multi-image croppingA simple entry for a simple way to solve a <span style="font-style: italic;">common</span> problem.<br /><br />I often have a lot of scanned images, documents usually, that have unnecessary white-space, and other artifacts around the edges due to limitations in the scanner software or issues with the scanner itself. Normally I use <span style="font-style: italic;">the gimp</span>, and can correct orientation problems, colour issues and cropping very easily. But this is too much mouse work in my opinion when you have 50 images and all need a similar simple cropping. So I fiddled around in the shell a little and came up with this <i>one-liner</i>:<br /><pre style="border: 2px solid rgb(204, 204, 204); padding: 5px; background: rgb(255, 255, 255) none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">for file in *.jpg ; do echo "Cropping $file" ; \<br />convert -trim $file trim/$file ; \<br />convert -crop 1692x2190+4+4 trim/$file crop/$file ; done<br /></pre><br />OK. Not <span style="font-style: italic;">really</span> one line, but I swear I did write it on one line :-)<br /><br />What it does is three things:<br /><ul><li>loop over all image files, and for each:<br /></li><li>trim all white-space off the edges, bringing all images to the same basic size with the top left of the real data at the top left of the image</li><li>trim them down to a specified size, in this case a little smaller than US letter size. 
I cut 2 pixels off each edge just to remove some noise commonly found on the edges of scanned images.</li></ul>And here is a simple set of thumbnails of the results:<br /><p style="margin:0;padding:0;display:block;height:240px;"><br /><img style="border: 1px dotted rgb(255, 0, 0); margin: 2px; float: left; cursor: pointer; width: 125px; height: 221px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnhUocUjTmlTSNDbbkIkFuOD6MpGqcl1bGEEFsFSDnXexVMnS29aRYkJ5RaoBJGaBggdvMD2YV0WQEPkFZXrR9iO6pUNo_MGHIRxAXdInNkbwKYXX2xooVoMJDwrEug9U7WFI/s400/orig56_ss.jpg" alt="" id="BLOGGER_PHOTO_ID_5397985934214647538" border="0" /><br /><i>original image with many artifacts</i><br /><img style="border: 1px dotted rgb(255, 0, 0); margin: 2px; float: left; cursor: pointer; width: 108px; height: 177px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2iPEehzcmhArYIuR3ML4T_RDNxNz254ENzLWn-b8OGZR9SYk2JLOl0InyEGqJTYu7mqjFciPK4NG3ch5PbdNYybKNem_nn_r7TunAqNsK4-9EqDPSCHO8P19LtOYgjiP59G8/s400/trim56_ss.jpg" alt="" id="BLOGGER_PHOTO_ID_5397985944293048402" border="0" /><br /><i>first trim removes continuous whitespace</i><br /></p><p style="margin:0;padding:0;display:block;height:150px;"><br /><img style="border: 1px dotted rgb(255, 0, 0); margin: 2px; float: left; cursor: pointer; width: 106px; height: 137px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHHSz58tl0e7pGsYXiKM5VX6dbDQYKk-48GcT2fdPe_xXgUErMOFfjnUymdCwy70gE0OVqMjRLWq3GEuFCsiTC99cisMhFcs4i1XGYD9bs8a5PDlZUAaLuwNto1QIfPO6-Jqc/s400/crop56_ss.jpg" alt="" id="BLOGGER_PHOTO_ID_5397985942369939538" border="0" /><br /><i>final cropped image at letter size with edges removed</i><br /></p><br /><p><br />It took about 10 minutes to figure out this script, 20 seconds to run it on all 50 images, and then 20 minutes to write the blog!<br /></p>Craig 
Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com6tag:blogger.com,1999:blog-7497898.post-19066153335876598772009-10-15T11:14:00.001+02:002009-10-16T14:30:39.395+02:00Complexity thresholdI've been planning to write a proper article about this for years, but inspired today by <a href="http://www.ted.com/speakers/daniel_pink.html">Dan Pink</a>'s excellent presentation on the <a href="http://www.ted.com/talks/dan_pink_on_motivation.html">surprising science of motivation</a>, I thought I'd write a mini-version of this, as an introduction and a reminder to myself to write the real one.<br /><br />Back in 2006 I wrote a blog titled '<a href="http://amanzi.blogspot.com/2006/03/perception-of-control.html">The perception of control</a>', dealing with <span style="font-style: italic;">uninformed decision making</span> and the <span style="font-style: italic;">illusion of efficiency</span>, commonly found in control-based companies. <a href="http://www.allankelly.net/">Allan Kelly</a> gave a detailed and informative response, including the statement "The fundamental problem facing developers is <b>Complexity</b>. Attempting reuse simply increases complexity."<br /><br />This got me thinking, and I remembered a long series of interesting experiences I had regarding attempts to scale up development teams, and the very mixed results I got. These days most people know about '<a href="http://en.wikipedia.org/wiki/The_Mythical_Man-Month">The Mythical Man-Month</a>', but I had experienced many occasions where I needed to argue this point with people that did not. I'm embarrassed to say I was one such person for a while, and needed to learn the hard way myself. One advantage of learning the hard way is that you are forced to try to explain what happened, to yourself and others. 
So I thought about it, and noticed a number of interesting correlations:<br /><ul><li>Adding people to projects works to some extent for very small teams, but can quickly get out of hand. It seems as if a threshold is crossed after which the project becomes a death march.</li><li>Assigning tasks of varying levels of complexity to a specific developer has a remarkably similar threshold. Their performance is consistent until they pass some level of complexity, after which the task goes <span style="font-style: italic;">pear-shaped</span>.</li><li>New developers on old code can perform well if the code is relatively simple, and perform extremely badly if it is not. Most critically, code that is twice as complex does not halve the performance. The effects can be much worse. Again there seems to be a threshold.<br /></li></ul>How do we find, measure or manage this threshold? It is not simple, but there are a few factors to look at to help us analyse the situation. These involve the characteristics of the developer, the team and the project:<br /><ul><li>Each developer has their own threshold, after which the complexity becomes hard to manage. Obviously this threshold is not fixed, and depends on many factors, including their state of mind. <a href="http://www.ted.com/talks/dan_pink_on_motivation.html">Dan Pink's presentation</a> demonstrates an interesting effect where increasing motivation decreases performance for creative tasks. He says increased focus can decrease problem-solving skills. I agree, and see a correlation, since decreased problem-solving skills mean the developer will have a much lower <span style="font-style: italic;">complexity threshold</span>.</li><li>The team has a threshold, very strongly correlated to the level of communication in the team, and most of the <a href="http://en.wikipedia.org/wiki/Agile_software_development">Agile software development</a> approaches have a lot to say about how to deal with this problem. 
Usually they try to increase communication and couple that to catching problems early.<br /></li><li>The project can obviously range from simple to very complex. Many approaches to dealing with this complexity are to split the project into modules and build teams on each, but in many cases this merely replaces one type of complexity with another. The more modern solutions to this involve prototyping, re-factoring, TDD and BDD.</li></ul>In projects that I manage, I believe we need to deal with all three areas. However, I have a personal bias towards focusing on the first, largely because I'm particularly interested in individual behaviours. I think this helps me also deal with the second issue, because if you have a feeling for the individuals, you can use that to help improve communication in the team. On the third point, I have a solid track record of prototyping and refactoring. I am also a believer in TDD and BDD, but must confess I can do with some improvement in those areas.<br /><br />Since this blog is merely an introduction to this idea, I will end off with a short piece on how I think we can help individuals keep on the <span style="font-style: italic;">good side</span> of their personal complexity thresholds. I have normally focused on two approaches, but Dan Pink has shown me a really nice new take on this. Let's start with my classic view:<br /><ul><li>Identify the threshold and avoid it. Ok, this sounds like a cop-out, but what I'm talking about is assigning tasks to team members based on their natural skills, and so reducing the risk of crossing the threshold. Of course, one of the best ways to do this is to get the developers to pick the tasks themselves, as is common in several Agile approaches, like <a href="http://en.wikipedia.org/wiki/Scrum_%28development%29">Scrum</a>. This is a good start, but does not always work, because while some developers pick the easy tasks and perform well, others are suckers for a challenge. So sometimes you need to advise them. 
An external opinion can make all the difference. If you are the developer yourself, learning to identify when you're in trouble is a valuable skill, but you will also get help to do so in an Agile team, especially if you use pair programming.<br /></li><li>Adjust the developer's threshold. Ok, this one sounds hard. Not really, when you realize that this might be as simple as additional training. But one area that really matters is helping the developers identify the thresholds themselves and respond to them. It can be very hard, when stuck in a complex problem, to see outside the box. Again, working in an agile team really helps here. Getting a second opinion on something, even if you might not realize you need it, can make all the difference.<br /></li></ul>Dan Pink's presentation really builds on this second option. Instead of trying to improve performance with classic rewards, like bonuses, we focus instead on <span style="font-weight: bold; font-style: italic;">intrinsic motivators</span>. Dan describes three:<br /><ul><li><span style="font-weight: bold;">Autonomy</span> - the freedom to define your own life. In my world that can mean something as simple as allowing the developers to pick their own tasks. I also see this in respecting their opinions in design meetings. In fact anything that tries to break the control-based management I described in '<a href="http://amanzi.blogspot.com/2006/03/perception-of-control.html">the perception of control</a>' is a good start.<br /></li><li><span style="font-weight: bold;">Mastery</span> - the desire to get better and better. Obviously this can mean training, but coupled to the first point it can mean the option to develop one's career in the direction that one has a passion for. 
For more on the use of the word 'passion' in this context, I advise reading '<strong style="font-weight: normal;"><a href="http://www.amazon.ca/Hacker-Ethic-Spirit-Information-Age/dp/0375505660">The Hacker Ethic and the Spirit of the Information Age</a>'.</strong></li><li><span style="font-weight: bold;">Purpose</span> - the yearning to do what we do in the service of something larger than ourselves. Ok, being a fan of open source, this sounds great. Of course it is both more subtle and more applicable than that. People want to do something good, or something of note. In a software project this can be as simple as assigning key features to developers. Someone constantly getting the boring stuff is bound to perform worse.<br /></li></ul>I found Dan's presentation very interesting. But is this really related to the problem of complexity? I believe it is. If low performance can be attributed to crossing the complexity barrier, then these techniques can be used to move that barrier.<br /><br />Dan finishes with a introduction to <a href="http://en.wikipedia.org/wiki/ROWE">ROWE</a> - the 'results only work environment'. I've been playing with a few ideas in this area, and I'd love to report on them, but need to do that in another blog :-)Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com1tag:blogger.com,1999:blog-7497898.post-86846544500199249872009-07-31T17:00:00.000+02:002009-08-02T21:28:57.922+02:00Boolean behaviourI recently spent a good half hour debugging some grails unit test code, only to track the problem down to groovy's boolean behaviour. As a Ruby programmer I've become spoiled by the clean and simple predictability of Ruby booleans, and since groovy is visually so very 'Ruby'esque' I was only too easily deceived.<br /><br />So, I've made a little summary of different programming languages boolean behaviour. 
IMHO there are only two modern languages with simple boolean rules, Ruby and Java, and the rest are unreasonably complex:<br /><ul><li>Java - no coercion, only Boolean/true/false are valid</li><li>Ruby - no coercion, only nil and false are false, all else is true</li><li>Python - no coercion, a ton of rules for deciding what is true/false</li><li>Groovy - coercion to true/false, with a ton of rules for deciding which way to go</li><li>Perl - no coercion, a ton of rules for deciding what is true/false</li><li>C - no coercion, 0 is false, all else is true</li></ul>What a mess! Every single one is different. But this is my opinion; I judge these on the basis of a few criteria:<br /><ul><li>Simple rules, in which case Java, Ruby and C rank high</li><li>Simple syntax, in which Ruby and C rank high</li><li>Enables expressive code, in which Ruby ranks high, and Python, Groovy and Perl do quite well (and Java does very badly)</li></ul>OK, you guessed it, I'm a Ruby fan. But let me justify my opinion with one simple meaningful example. I'll show a series of statements that do the same thing:<br /><pre style="border: 1px dotted rgb(85, 85, 85); margin: 10px; padding: 5px; background: rgb(238, 238, 238) none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">a = (a == nil) ? 'default' : a<br />a = a || 'default'<br />a ||= 'default'<br /></pre>AFAIK, this is possible as a direct result of the simple boolean logic, and in particular the fact that everything is true except false and nil.<br /><br />The groovy book I read claims similar behaviour, but in fact this is not true. 
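To see the Ruby rule concretely, here is a quick check (my own snippet, not from the book):

```ruby
# In Ruby only nil and false are falsy; 0, '' and [] are all truthy.
truthy = [nil, false, 0, '', [], 'x'].select { |v| v ? true : false }
puts truthy.inspect   # [0, "", [], "x"]

# So ||= assigns only when the current value is nil or false:
a = ''
a ||= 'default'       # a stays '' because an empty string is truthy
b = nil
b ||= 'default'       # b becomes 'default'
```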
The book claimed the following equivalent statements:<br /><pre style="border: 1px dotted rgb(85, 85, 85); margin: 10px; padding: 5px; background: rgb(238, 238, 238) none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">a = (a == null) ? 'default' : a<br />a = a ?: 'default'<br /></pre>And if you have a=nil (a=null in Groovy) both the Ruby and Groovy code behave the same. But just try passing in a='' (empty string), or a=0. Ruby will keep the assignment passed in, while Groovy will re-assign to 'default', not what you expected, and not what the Groovy book claimed.<br /><br />The problem here is that Groovy, Perl and apparently Python decided to try to make developers' lives easier with some convenience rules for booleans, notably that integer 0, empty strings and empty collections are all seen as false. And honestly, in many scenarios that is useful, and I've used that fact for years in Perl. And when I first started Ruby coding, I balked at the idea that 0 and '' were true. But it did not take long to see the light. And then I began to remember the pain I had debugging Perl code where the bug was due to an unexpected false when an operation returned a numerical zero or an empty string.<br /><br />Sorry, I'm convinced. Ruby got it right!<br /><br />And the remaining question is: since Groovy clearly copied a lot of Ruby syntax, why did they not do it the Ruby way with booleans? Actually I think the answer is obvious once you think about it. Groovy is actually Java inside. Groovy tries to bring nice modern dynamic scripting capabilities to the Java language. Quite a paradox that Java, with the most rigid, predictable boolean behaviour, and the easiest debugging of the lot, should end up with this kind of scripted boolean. What I believe happened is that Groovy decided, quite naturally, to go with the coercion approach to scripting Java. 
Deep down it is all Java with strict types and strict booleans, but in between Groovy is coercing and converting all types automatically. This approach has been used all over the place.<br /><br />And once you're on the coercion band-wagon, I think the end result is exactly what Groovy has.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com2tag:blogger.com,1999:blog-7497898.post-42773805270407344702009-06-08T16:46:00.000+02:002009-06-08T17:05:27.791+02:00Recruitment a'la open sourceI just read <a href="http://allankelly.blogspot.com/2009/06/time-to-end-foie-gras-recruitment.html">Allan Kelly's blog criticising 'Foie Gras recruitment'</a>. Allan's point is that adding developers too quickly has the opposite effect to the one intended, slowing down a project. In fact, recruiting always slows things down before it speeds things up, due to the cost of training up and familiarizing developers with the company processes, product vision and code.<br /><br />However, there are many factors that affect this, and influence the severity of the problem as well as the team's ability to deal with the problem. Allan mentions one: the team's processes and practices. Another really important one is the character of the developers involved, especially the new ones.<br /><br />Imagine the hypothetical case where you magically recruit only developers who are actually capable of such a high level of self-training that the negative impact on the team is much less than average (obviously never zero). Imagine also that the answers to their questions are usually available without another team member having to spend time. For example, the answers lie in the code itself, and any associated well-written documentation, including feature specifications, project goals, etc.<br /><br />Obviously this is a hypothetical scenario essentially never achieved in corporate development, but it does exist in the real world, in many open source projects.
Often people enter open source projects because they did their own self-training, read the code, tried things out and made working contributions that were good enough to get the attention of the project owners, and as a result received admittance to the team. This scales much better, and faster, than normal recruitment. So why does this not happen in the corporate development world? Usually because it relies on statistical factors not often achieved, related to the percentage of available developers with sufficient and appropriate skills who are also sufficiently interested in the project to put in the time. This is a low number, especially when you consider that by the term 'interested' I also imply that the developer is able to make a living from this activity.<br /><br />So how do corporate development projects benefit from this? Or can they? The problem is that corporate projects are, almost by definition, not interesting enough to the potential developers.<br /><br />Personally I believe it is possible to find a middle ground, if you close the gap from both ends:<br /><ul><li>move the project goals closer to the developers' goals (make the project much more interesting to open source developers, make it open source, make it do things that interest a wider audience)</li><li>move the developers' goals closer to the project's goals (i.e. pay the developers)<br /></li></ul>Obviously the second option should not be undertaken using normal recruitment. You still need to use open source recruitment (statistical filtering as described above).<br /><br />Is this really hypothetical? No, I've actually been putting this into practice with my most recent recruitment drive. I recruited three new remote developers without reading a single CV or holding a single interview.
Instead I simulated the open source approach by using the following steps:<br /><ul><li>require a code contribution, which was evaluated (testing not only coding skills, but ability to read specs, work remotely, solve problems independently, do internet research, and perform self-training)</li><li>contract for a trial period, testing their ability to perform with other remote developers, double-checking their skills, notably an increasing understanding of the project itself<br /></li><li>contract for longer periods with tighter integration into the team<br /></li></ul>Now three months down the line, I have actually seen quite decent productivity. I count the approach a success, and I'll be sure to use the same technique for most future recruitment drives.<br /><br />One final point. This approach does not solve the <a href="http://allankelly.blogspot.com/2009/06/time-to-end-foie-gras-recruitment.html">problems identified by Allan Kelly</a>. It only serves to reduce their impact. And it does introduce another set of problems related to efficient project management of loosely coupled remote teams. That is a subject for a separate blog :-)Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com3tag:blogger.com,1999:blog-7497898.post-28113404000425345422009-05-25T17:16:00.000+02:002009-05-25T17:44:22.427+02:00The Secret of GooglenomicsI just read an amazing and insightful article in Wired about the '<a href="http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=all">Secret of googlenomics</a>', which was a riveting introduction to the auction-based principles that have become the core of almost everything at google.
And, even more importantly, they represent a possible future for many other elements of the modern economy.<br /><br />Most of the article references a presentation given by google's chief economist, Hal Varian, whose career was inspired by Isaac Asimov's books <a href="http://en.wikipedia.org/wiki/Foundation_series">The Foundation Series</a>: "<span style="color: rgb(0, 0, 153);">In Isaac Asimov's first </span><em style="color: rgb(0, 0, 153);">Foundation Trilogy</em><span style="color: rgb(0, 0, 153);">, there was a character who basically constructed mathematical models of society, and I thought this was a really exciting idea. When I went to college, I looked around for that subject. It turned out to be economics.</span>"<br /><br />I was also inspired by Asimov's theory of '<a href="http://en.wikipedia.org/wiki/Psychohistory_%28fictional%29">psychohistory</a>' when I read those books back in the early 90's, but unlike Hal, I thought the idea was entirely impossible, and so I stuck with reality and studied pure science. Perhaps I was wrong, as google's mathematicians now do take into account everything from the weather to people's fashions and buying habits, to predict the best adverts to use on search results.<br /><br />I strongly recommend reading the entire article at <a href="http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics">http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics</a>. For a taster, here is the concluding paragraph:<br /><p style="color: rgb(0, 0, 153); padding-left: 25px;">There's a wild contrast between this sparsely furnished residence and what it has spawned—dozens of millionaire geeks, billions of auctions, and new ground rules for businesses in a data-driven society that is far weirder than the one Asimov envisioned nearly 60 years ago.
What could be more baffling than a capitalist corporation that gives away its best services, doesn't set the prices for the ads that support it, and turns away customers because their ads don't measure up to its complex formulas? Varian, of course, knows that his employer's success is not the result of inspired craziness but of an early recognition that the Internet rewards fanatical focus on scale, speed, data analysis, and customer satisfaction. (A bit of auction theory doesn't hurt, either.) Today we have a name for those rules: Googlenomics. Learn them, or pay the price.</p>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-20488521740699724132009-05-11T17:16:00.000+02:002009-05-11T17:49:58.473+02:00Artistic Engineers<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjG6Opz0R6u3dJDkc0_BbYO26K8XBwFkald630mEZEcemzLHzOVKtjzpaN-wAUuKYY-xSp6uFLWH6w4tWflSTMO3NrNyXdk8-mOBeS34fgsMl9bdcAOubR7z92pUK9j8T-dn6k/s1600-h/artistic_engineers.jpg"><img style="text-align: center; cursor: pointer; width: 400px; height: 200px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjG6Opz0R6u3dJDkc0_BbYO26K8XBwFkald630mEZEcemzLHzOVKtjzpaN-wAUuKYY-xSp6uFLWH6w4tWflSTMO3NrNyXdk8-mOBeS34fgsMl9bdcAOubR7z92pUK9j8T-dn6k/s400/artistic_engineers.jpg" alt="" id="BLOGGER_PHOTO_ID_5334593348359470130" border="0" /></a><br /><br />I've always believed that artistic or creative talent was indispensable in technical fields like science, engineering and software development. But I never put together a coherent enough description to warrant a blog post, only the occasional soliloquy over a drink. 
But now I've just read <a href="http://www.loudthinking.com/about.html">DHH</a>'s blog entry "<a href="http://www.loudthinking.com/posts/42-we-need-both-engineers-and-artists-in-programming">We need both engineers and artists in programming</a>", and he described it so well, I just had to respond. His description focused on a developer's perspective:<br /><p style="margin-left: 2em; font-style: italic; color: rgb(0, 51, 51);">People waxing lyrically about beautiful code and its sensibilities. People willing to trade the hard scientific measurements such as memory footprint and runtime speed for something so ephemeral as programmer happiness.</p>Now I'm originally a pure science researcher. And there is no more extreme case of a <span style="font-style: italic;">non-artistic</span> image than that of a scientist. What do most people think: white lab-coats, thick-rimmed glasses, a rigorous systematic approach to everything in life and a total lack of artistic flair.<br /><br />And often that image is not entirely inaccurate. As Robert Martin indicated, professionalism is a very important quality for software development (and, I would add, science and engineering in general). But as DHH asserts: '<span style="font-style: italic; color: rgb(0, 51, 51);">the wonderful thing about this new age of programming is that we need and prosper from both types of programmers</span>'.<br /><br />I agree with David. You really do need both types. And if you look back at some of the most impressive discoveries in science in the 20th century, there were artistic people involved, usually with the key discovery. I love the biggest deviation from the boring stereotype - Einstein, with his wild hair and almost chaotic appearance.<br /><br />It's all about thinking outside the box. David says it's all about 'programmer happiness'.
Of course he's right too.<br /><br />Now what about the irony that DHH's profile shot is so much more professional looking than Einstein's?Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com1tag:blogger.com,1999:blog-7497898.post-7282092286244959232009-04-20T12:07:00.000+02:002009-04-20T13:42:46.086+02:00What's the point of github?<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKKgxmu6RoTrKQxF-wXjRYs5PyY8mht8QljyATrUDsiAEnIwTnORnqaQHuXM0Qls_MPPqAVv7DNT9iblPZvzmFgAAGxCnxLz9gQB1vCZyOUEJepPQwqbqx8hy9ob_-vdY4GTw/s1600-h/dscm.png"><img style="cursor: pointer; width: 400px; height: 218px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKKgxmu6RoTrKQxF-wXjRYs5PyY8mht8QljyATrUDsiAEnIwTnORnqaQHuXM0Qls_MPPqAVv7DNT9iblPZvzmFgAAGxCnxLz9gQB1vCZyOUEJepPQwqbqx8hy9ob_-vdY4GTw/s400/dscm.png" alt="" id="BLOGGER_PHOTO_ID_5326737583331142434" border="0" /></a><br />While driving to Malmö last Friday to attend a tech talk on <a href="http://git-scm.com/">git</a> by Sébastian Cevey and hosted by <a href="http://www.purplescout.se/">PurpleScout</a>, I was trying to explain distributed source code management systems (like <a href="http://git-scm.com/">git</a>) to a non-developer friend of mine. I very quickly found myself explaining much more about git than I realized I knew. And I found myself asking, and answering, what I think is a very interesting question: what is the point of github?<br /><br />The situation is that git, and other distributed source code management systems, like <a href="http://bazaar-vcs.org/">bazaar</a> and <a href="http://www.selenic.com/mercurial/wiki/">mercurial</a>, appear to start from the philosophical position of giving complete control to the end user (in this case the developer). 
They are not centrally controlled systems: there is no central server, no 'little' boss to ask permission from for access to files, branches or projects. When you clone the repository, you get it all, with all history and everything. Power to the people!<br /><br />This allows for highly flexible distributed teams, each working in their own way, as suits the developers themselves. It completely solves the usual problem found in central systems like CVS, SVN and, heaven forbid, Perforce: getting permission from a non-developer to do development.<br /><br />So then, why does a site like <a href="http://www.github.org/">www.github.org</a> exist? It seems to imply adding back a central server to the de-centralized system. With a little thought, I realised what was going on. The problem had never been about central control, it was all about <span style="font-weight: bold; font-style: italic;">who has the control</span>, and distributed systems actually do not remove the concept of central control at all. They just facilitate a situation where the <span style="font-weight: bold; font-style: italic;">right people are in control</span>.<br /><br />To explain this, I should re-describe what the original problem was. Consider CVS and SVN, arguably the industry standard(s). You have a central server with the code (and history and branches, etc.). Each user checks out a working copy of a branch of the code. After doing work, they commit back to that branch (dealing with conflicts and merges as needed). This implies a very particular workflow, and forces connectivity to the server for all major actions that require working with the code history (checkout, update, commit, branch, merge, etc.). And the mere existence of the central server implies the existence of IT and admin in the decision making loop, which can only hurt.
Perforce, being more susceptible to the influence of IT on purchasing, took this one step further and required connectivity to the central server for almost any development activity, and, can you believe it, even required developers to unlock each file they plan to work on! Can there be anything worse for developer productivity? Well, yes, anyone remember Microsoft's 'SourceSafe'?<br /><br />What was the main problem here? It was not actually the central server, but rather it was a few things implied by this architecture:<br /><ul><li>The involvement of non-development staff in the smaller details of what the developer actually needs to do, adding overhead to development activities, which means higher cost and less efficiency.</li><li>The implication of a specific workflow in the way the developers need to work with the code-base.</li><li>The need for regular or even continuous connectivity, which also has performance, efficiency and cost implications.</li></ul>Distributed systems completely avoid all of this. Each developer has the complete history, and all branches, right there on their computer. They can do absolutely everything they want without asking anyone, and especially not asking people who don't know about software development. Maximum performance!<br /><br />But at the end of the day, those developers need to get their code back to somebody in charge. There is always going to be one person or organization that actually sells the product, or distributes the product, or supports it. So, no matter how much power the developer thinks they have, the real world is still centrally controlled. But at least now the control is not micro-management. Now the control is closer to the real business, which is about getting good code to the right customers. Distributed source code management allows for this to be done most efficiently.
The developers have all the power to do their job most efficiently, but with power comes responsibility, and those same developers are now required to do all the merging back into the main code. How is this done without a central server? Easy, each developer simply publishes to their own public copy of the latest code-base. That public copy could even be a shared location on their own computer, accessible to the right people. Or, in the case of open source projects, it could be a world-readable resource like <a href="http://www.github.org/">github</a>.<br /><br />And that's the point of github! It is a convenient place for developers to publish their already merged work, for use by the central product distributor.<br /><br />Not only is this a developer's dream come true, but it is a software development company's dream come true. You don't have to manage the central server any more. You also don't have to do as much work merging other people's code into your own, because you can push that responsibility back out to the developers, where it belongs.<br /><br />I can't believe this was not done thirty years ago! Why is that? I have two theories:<br /><br /><span style="font-weight: bold;">Cobbler's children</span> - since both the customer and supplier are the same (the developer) for code management systems, perhaps it's a case of the cobbler's children having the worst shoes.
The developers simply work around bad code management systems, because they can.<br /><br /><span style="font-weight: bold;">Corporate control</span> - if we look back at what I've said about the key differences between central and distributed systems, there seems to be a repeating theme regarding the involvement of non-developers, or company IT processes, in the way the older systems worked.<br /><br />Having personally seen a lot of bad decision making by companies to increase their level of '<a href="http://amanzi.blogspot.com/2006_03_01_archive.html">perception of control</a>', I'm voting for the latter. (See my blog for more on this.)<br /><br />But those days are numbered! I think concepts like distributed SCM and open source itself are increasing the prevalence of businesses run on the principles of collaboration instead of control, with decision making by the people with the actual information.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-66650046329430973422009-03-20T15:55:00.000+01:002009-03-20T16:11:32.759+01:0015 million Africans are ready for work - Got Tasks?I followed a twitter comment by Tim O'Reilly that quoted <span class="fn">Nat Torkington saying </span><span class="status-body"><span class="entry-content">"<span style="font-style: italic; font-weight: bold;">first 5 minutes redlined my awe-ometer</span>."<br /><br /></span></span><a href="http://blip.tv/file/1868958"><img style="cursor: pointer; width: 400px; height: 308px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEikjqqakVBYe1e3ihnWPWJmdBBFx5uhA7VBHstTY8gxJEPGD-6aad9dKIDzKDcs3-d6vZpBqYitPtv_GhbIGsM7k_5Wc1kivezbqODrR5paTlz21g9s-ta_HYOupQnDf5yVMuU/s400/GotTasks.png" alt="" id="BLOGGER_PHOTO_ID_5315285462601334274" border="0" /></a><br /><br />So I just had to watch <a href="http://blip.tv/file/1868958">the video he was referring to</a>, and the above screen-shot is how it ends.
I know I spoiled the punch-line, but it's still worth watching so click the link and enjoy!<br /><br />The presenter, Nathan Eagle, has started a service in Kenya and East Africa, called TextEagle, which allows mobile phone users to complete small tasks by SMS and get rewarded for it, in airtime or in credit. For a workforce living on $5/day and eager for more airtime, this works like a charm. Tasks include simple text translation services, local news reporting, and even listening to advertising!<br /><br />This really is '<a href="http://en.wikipedia.org/wiki/Crowdsourcing">crowdsourcing</a>' in action.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-37384496621306024612009-03-17T14:52:00.000+01:002009-03-17T15:43:38.329+01:00'The network is the computer' and 'the client plus the cloud'I just read a very interesting article at computerworld, an <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=Networking+and+Internet&articleId=9129683&taxonomyId=16">interview with Craig Mundie</a> of Microsoft, where he talks about the future of computing, and references some presentations he recently made. The article is titled "<span style="font-style: italic;">Microsoft's next big thing</span>", which is a pity because it colours an otherwise interesting read with an overly self-congratulating attitude.<br /><br />Craig described the future of computing as being '<span style="font-weight: bold; font-style: italic;">the client plus the cloud</span>', which reminded me very strongly of Sun's slogan '<span style="font-weight: bold; font-style: italic;">the network is the computer</span>' (originally coined by <a href="http://en.wikipedia.org/wiki/John_Gage">John Gage</a>). 
Jonathan Schwartz blogged specifically about '<a href="http://blogs.sun.com/jonathan/entry/the_network_is_the_computer">the network is the computer</a>' back in 2006, and gave a nice realistic picture of cloud computing, grids, and how both the average end user and corporate IT environments view and interact with these systems. Sun was, of course, announcing the imminent launch of their own commodity grid, <a href="http://www.network.com/">www.network.com</a>. Later that year Amazon launched the public beta of its <a href="http://en.wikipedia.org/wiki/Amazon_EC2">EC2</a>, which made a final release in late 2008. While Sun has not yet made the final release of their grid, many others have. It is clear that visionaries from these various companies have been on the right track for quite some time.<br /><br />But, as Mr Mundie himself admitted, timing and market readiness are a very important aspect of the adoption of new computing paradigms. And according to him the future paradigm is all about the balance between the client (desktop OS) and the cloud (grids, the internet, etc.) He is absolutely right. And it is easy to be right when you are not predicting the future but observing the present. Aside from Jonathan's 2006 blog, we all know just how successful cloud computing, and commodity clouds like Amazon EC2 in particular, have become. Everyday internet services like google, yahoo, facebook and linkedin are all products of the success of the cloud. We are not about to undergo a paradigm shift, we have been in the transition for some time, and many, many vendors have jumped onto this particular train, including Microsoft, of course, with their '<a href="http://www.microsoft.com/azure/">Azure</a>' grid.<br /><br />While Mr Mundie may be a little off track about just how important Microsoft is to this new paradigm, one thing I must give him credit for is making the whole subject much more interesting and enjoyable to read about.
In particular his <a href="http://video.computerworld.com/services/link/bcpid1351827287/bctid16633258001">video presentation</a> had the 'cool' factor usually associated with a <a href="http://en.wikipedia.org/wiki/Steve_Jobs">Steve Jobs</a> presentation.<br /><br /><a href="http://video.computerworld.com/services/link/bcpid1351827287/bctid16633258001"><img style="cursor: pointer; width: 400px; height: 298px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6EVjv3_PIlV7rIzp8fS4k8VPldF2ibX1MnIUeBvwYlpLNhbvfgvXqmwvVdA64PuEXMQE0YozkH-E_g0PBRmgSW3uBdfHdCGslcKEJ2g6WIizkI29i07GcZXXjqAfdF-vT8Tk/s400/craig_mundie.png" alt="" id="BLOGGER_PHOTO_ID_5314165376804097906" border="0" /></a>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com1tag:blogger.com,1999:blog-7497898.post-43766804948461758072009-02-17T16:36:00.000+01:002009-02-17T16:56:10.747+01:00Convert images to greyscaleThere are at least a hundred ways of doing this, but I wanted a single-line way to make greyscale versions of a bunch of images on a website I was developing. 
The last thing I wanted was to load each one in turn into a graphics application to edit the colors.<br /><br />ImageMagick to the rescue:<br /><pre style="border: 1px solid rgb(170, 170, 170); padding: 10px; background: rgb(255, 255, 255) none repeat scroll 0% 50%; overflow: auto; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;">for img in *.gif ; do convert $img -colorspace Gray -colors 16 grey_$img ; done<br /></pre><br />Ok - so not really one line, but <em>almost</em> :-)<br /><br />And this is what it looks like afterwards in my file explorer:<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoYVN26_Ub7Pc325lF_txHeTM4Epc7AJNxNQzdI26XjjxWn8JYy-H8WWNIHzU2DaJ9pp9ette4G-eV3j2JG0b-o118cZFBGQeqhkvU8xZfPXtjX5o50Hk3Zz67q0A7WrNjrZc/s1600-h/greyscale.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 157px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhoYVN26_Ub7Pc325lF_txHeTM4Epc7AJNxNQzdI26XjjxWn8JYy-H8WWNIHzU2DaJ9pp9ette4G-eV3j2JG0b-o118cZFBGQeqhkvU8xZfPXtjX5o50Hk3Zz67q0A7WrNjrZc/s400/greyscale.png" alt="" id="BLOGGER_PHOTO_ID_5303795456959571170" border="0" /></a>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com3tag:blogger.com,1999:blog-7497898.post-45344994319364460762008-09-03T17:14:00.000+02:002008-09-03T18:01:41.858+02:00Install or upgrade to ubuntu 8.04 on linux with no mediaI had a problem: my ubuntu server was too old for automatic updates, the CD-ROM drive was broken, and I'm allergic to floppies.
A quick internet search led to three options:<br /><ul><li><a href="http://sourceforge.net/projects/instlux">instlux</a>, a nice graphical installer to run under windows</li><li><a href="http://unetbootin.sourceforge.net/">UNetbootin</a>, a really nice graphical installer for linux and windows</li><li>A grub trick for booting the installer from grub under windows described <a href="http://marc.herbert.free.fr/linux/win2linstall.html">in detail</a> for any linux, and in <a href="http://ubuntuforums.org/showthread.php?t=28948">less detail</a> but for ubuntu.</li></ul>The first option was no good, because it only ran on windows. The second looked really neat and easy, and is probably the best, but being the geek that I am, I wanted to try the ideas in the third option, but 'translated' to work on my old linux (in my case ubuntu 5.04). It turned out to be pretty easy. Here are the steps I used:<br /><ul><li>I opened my downloaded ubuntu 8.04.1 server ISO image in archive manager and extracted the 'install' directory to /boot/install on my old computer.
I did this with another 8.04 desktop, but could just as easily have done it with the old computer itself.<br /></li><li>I edited /boot/grub/menu.lst, adding the following lines at the bottom:<pre style="overflow: auto;">title Ubuntu 8.04.1 Installer (hd0,0)<br />kernel (hd0,0)/boot/install/netboot/ubuntu-installer/i386/linux vga=normal ramdisk_size=14972 root=/dev/rd/0 rw --<br />initrd (hd0,0)/boot/install/netboot/ubuntu-installer/i386/initrd.gz<br /></pre>(I actually first tried the vmlinuz and initrd.gz I found in the installer directory, but that insisted on a CD, and I did not want to try <a href="http://ubuntuforums.org/showthread.php?p=83151#post83151">faking that with a raw partition</a>, so I changed to the netboot option in the text above.)</li><li>I also commented out 'hiddenmenu' and added 'timeout 10' to menu.lst so that I would actually get to see the menu choice when I rebooted.<br /></li><li>Finally I rebooted and chose the new installer, and, after answering a bunch of questions, voilà, I had a new Ubuntu 8.04.1 server!<br /></li></ul>These instructions assume decently fast internet access, since everything installed is downloaded. If you have the CD already (as I had), and no internet, or slow internet, you can also copy the CD to a local hard-drive partition and install from there. That was too much trouble for me, so I did not try it :-)
<ol>
<li>Extract individual pages from PDF. None of the open source OCR software I read about or tried could run directly on PDF. The easiest way to extract from PDF is to run ghostscript and print to TIFF or PNM, for example: <pre class="brush: bash; title: ; notranslate" title="">gs -r300x300 -sDEVICE=tiffgray -sOutputFile=ocr_%02d.tif -dBATCH -dNOPAUSE inputfile.pdf</pre>
</li>
<li>Run OCR on individual pages. I tried <a href="http://www.gnu.org/software/ocrad/ocrad.html">ocrad</a> and <a href="http://code.google.com/p/tesseract-ocr/">tesseract</a> (versions 1.02 and 2.03). Ocrad supported the Swedish characters I had in my documents but otherwise had rather poor overall OCR performance. Tesseract did not support Swedish characters, but both versions were better than Ocrad, and version 2 was overall the best (and supports many other languages if you bother to <a href="http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract">train it</a>). Training it on Swedish was more work than manually fixing the results, so I did not take that step, but was certainly tempted.</li>
</ol>
Since I had several PDF documents to process, and each had many pages, the above process was still too manual, so I wrote a utility Ruby script to do the work for me:<br />
<pre class="brush: ruby; title: ; notranslate" title="">#!/usr/bin/env ruby
(ARGV.length > 0) || puts("usage: ./ocr.rb file1.pdf <file2.pdf> ...") || exit(0)
$basedir = Dir.getwd
ARGV.grep(/\.pdf/i).each do |pdf|
  # Work in a sibling directory named after the PDF, e.g. MyDoc.pdf -> MyDoc_OCR/
  dir = pdf.gsub(/\.pdf/i, '') + '_OCR'
  Dir.mkdir(dir) unless File.exist?(dir)
  Dir.chdir(dir)
  puts "Extracting pages from PDF: #{pdf}"
  system "gs -r300x300 -sDEVICE=tiffgray -sOutputFile=ocr_%02d.tif -dBATCH -dNOPAUSE \"#{$basedir}/#{pdf}\""
  tiff_pages = Dir.new('.').grep(/^ocr.*\.tif$/).sort
  puts "Running tesseract OCR on pages: #{tiff_pages.join(', ')}"
  tiff_pages.each do |page|
    page_base = page.gsub(/\.tif.*/, '')
    print "#{page_base} "
    system "/usr/local/bin/tesseract #{page} #{page_base}"
  end
  Dir.chdir($basedir)
  ocr_pages = Dir.new(dir).grep(/^ocr.*\.txt$/).sort
  if ocr_pages && ocr_pages.length > 0
    puts "Created OCR result pages: #{ocr_pages.join(', ')}"
    archive = "#{dir}.zip"
    puts "Creating archive of result pages: #{archive}"
    system "zip -r \"#{archive}\" #{ocr_pages.map{|p| "\"#{dir}/#{p}\""}.join(' ')}"
  else
    puts "No OCR result pages found"
  end
  puts ""
end</pre>
<br />
This script will extract the pages to TIFF, run Tesseract OCR on each page and finally build a ZIP file of the results with a filename similar to the original PDF. So MyDoc.PDF is converted to MyDoc_OCR.ZIP. Intermediate TIFF and TXT files are maintained in a subdirectory (MyDoc_OCR/*).<br />
<br />
If, on the other hand, I simply did not look far enough and there are better utilities and GUI applications for this on Linux, feel free to comment.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com12tag:blogger.com,1999:blog-7497898.post-881778363482349632008-04-11T15:47:00.000+02:002008-04-11T23:19:23.904+02:00Bongi's voice - and the size of the Kruger ParkMy brother's blog '<a href="http://other-things-amanzi.blogspot.com/">other-things-amanzi</a>', which is hugely more popular than mine, for obvious reasons if you read it, just got a boost as he was invited to be interviewed on <a href="http://www.blogtalkradio.com/doctoranonymous/2008/04/11/Dr-A-Show-30">BlogTalkRadio.com</a>. Sid Schwab of <a href="http://surgeonsblog.blogspot.com/">surgeonsblog</a> fame was a guest host, which worked well as he and 'bongi' got chatting about everything from the pleasures of living near the <a href="http://www.krugerpark.co.za/">Kruger National Park</a>, to regional issues facing surgeons, like the <a href="http://other-things-amanzi.blogspot.com/2007/04/powerful-horn.html">bad treatment</a> sometimes delivered by the local witchdoctors or '<a href="http://en.wikipedia.org/wiki/Sangoma">sangomas</a>'.<br /><br />Sid even mentioned my blog, but I'm guessing the ultra-geek content stopped him in his tracks :-). Of my blog, Bongi said: 'I don't understand a single word of it!' Perhaps this post will score better?<br /><br />One thing Bongi said, which I fear is misinformation I may be responsible for, was that the Kruger Park is the size of England. I used to claim that myself, but recently decided to double-check my facts and found I was <span style="font-weight: bold;">WRONG</span>! 
A quick google search reveals several sites claiming it is the size of Wales, and <a href="http://www.african-safari-journals.com/kruger-national-park.html">one claiming it is bigger than Ireland</a>:<br /><b style="color: rgb(0, 51, 0); font-style: italic;">(Size</b><span style="color: rgb(0, 51, 0); font-style: italic;">: The Kruger Park is huge. It stretches for 350km (217 miles) from north to south and averages 60 kilometres in width which makes it bigger than Ireland. Most of the park is fenced so it is a self contained ecosystem.)</span><br /><br />So, I stand corrected. I decided to investigate and figure out what is really going on. How does the park compare to England, Ireland and Wales? Right now these are the facts I could find:<br /><a href="http://en.wikipedia.org/wiki/Kruger_Park">Kruger Park</a>:<br />Area: 18 989 km2<br />Length: 350 km<br /><a href="http://en.wikipedia.org/wiki/England">England</a>:<br />Area: 130 395 km2<br />Length: 580 km<br /><a href="http://en.wikipedia.org/wiki/Wales">Wales</a>:<br />Area: 20 779 km2<br />Length: 215 km<br /><a href="http://en.wikipedia.org/wiki/Ireland">Ireland</a>:<br />Area: 84 412 km2<br />Length: 360 km<br /><a href="http://en.wikipedia.org/wiki/Sk%C3%A5ne">Skåne</a>:<br />Area: 10 939 km2<br />Length: 115 km<br /><a href="http://en.wikipedia.org/wiki/Great_Limpopo_Transfrontier_Park">Great Limpopo Transfrontier Park</a>:<br />Area: 35 000 km2 - 99 800 km2 (planned expansion)<br /><span style="font-size:78%;">(England, Ireland, Wales lengths were roughly north-south measured by me on google earth 'ruler'. Kruger length is from wikipedia.)<br /></span><br />So England is about 7 times the area, and about 65% longer. So the Kruger is comparable in terms of length (60% the length of England), but not by area, since the Park is so narrow. The website that claimed the park was bigger than Ireland is wrong. It's a bit shorter, and less than a quarter the area. 
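These comparisons are easy to verify; a quick Ruby sketch using the figures listed above (taking Wales's area as the standard figure of 20 779 km2):

```ruby
# Areas (km^2) and rough north-south lengths (km) from the figures above.
# :skane stands for Skåne.
areas   = { kruger: 18_989, england: 130_395, wales: 20_779, ireland: 84_412, skane: 10_939 }
lengths = { kruger: 350,    england: 580,     wales: 215,    ireland: 360,    skane: 115 }

[:england, :wales, :ireland, :skane].each do |place|
  area_ratio   = areas[place].to_f / areas[:kruger]
  length_ratio = lengths[place].to_f / lengths[:kruger]
  # e.g. England comes out at roughly 6.9x the area and 1.7x the length
  puts format('%-8s %4.1fx the area, %4.1fx the length of the Kruger', place, area_ratio, length_ratio)
end
```

Running it confirms the conclusions: England is about 7 times the area, Ireland well over 4 times, and Wales is the closest match by area.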
It is, however, 60% longer than Wales and nearly the same area, so that is the best match. If the <a href="http://www.greatlimpopopark.com/">full-size transfrontier park</a> materializes, it will close in on the size of England, which is really impressive.<br /><br />It is especially interesting to me that the Kruger Park is about three times the length and nearly twice the area of <a href="http://en.wikipedia.org/wiki/Sk%C3%A5ne">Skåne</a>, the province in Sweden in which I live.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com7tag:blogger.com,1999:blog-7497898.post-41428423158146850632008-04-04T17:14:00.001+02:002008-08-14T16:33:29.440+02:00Hardy Heron Beta (and release)I've had a generally good time with the Ubuntu 8.04 Hardy Heron Beta since it was released in late March, and have installed it on two different machines. I especially enjoyed getting compiz to work for the first time (possibly due to a new machine with better hardware, not something related to Hardy in particular). However, I have had a few issues I thought worth mentioning here:<br /><ol><li><span style="font-weight: bold;">admin-users does not work</span><br />I reported this as a <a href="https://bugs.launchpad.net/bugs/207804">bug</a> to ubuntu. Basically the problem is that no groups or users added with GNOME's user administration tool actually get added, and in one case the tool crashed. I've had to add users and groups on the command line with tools like 'addgroup' and 'adduser'.</li><li><span style="font-weight: bold;">eclipse crashes silently</span><br />This happened several times before I thought to run it in the console and catch the error, which turned out to be 'java.lang.OutOfMemoryError: PermGen space'. Normally eclipse reports this to the user in a dialog, and I do not know why that was not done, but the solution is the same: add '-vmargs -Xmx1280m -XX:MaxPermSize=1024m' or similar to the eclipse launcher. 
I first noticed the error after switching from the Ruby to the Java perspective, and the virtual memory requirements of eclipse jumped from 0.5GB to 1.2GB. Amazing. (update: the symptoms returned on a new java6 update, and the new fix was to add <b>-XX:CompileCommand=exclude,org/eclipse/core/internal/dtree/DataTreeNode,forwardDeltaWith</b> to vmargs - see comments below for more details)</li><li><span style="font-weight: bold;">unable to set HumanList theme for login window</span><br />Once a change to the login window settings was made, logging out waited indefinitely (or in one case about 30 minutes) before showing the login screen. No errors in the X log or any other log. Nasty. I had to kill gdm and hand edit gdm.conf-custom to remove the theme line.</li><li><span style="font-weight: bold;">sudo fails with: unable to resolve host</span><br />I found a lot of discussion about this on the internet, but in most cases it was due to people foolishly changing their hosts files. While sudo arguably should not be sensitive to something like that, that was not my situation. I simply ran the usual daily upgrade with the list of updates for Hardy, and after the reboot this issue happened. With a lot of investigation I found how to 'fix' it, by getting /etc/hosts and /etc/hostname to have the same entry. Interestingly enough they do not have the same entry if you enter the 'obvious' values in the network manager applet for host and domain. For example, putting 'foo' and 'bar.com' as host and domain will put 'foo' into /etc/hostname and 'foo.bar.com' on the 127.0.1.1 line in /etc/hosts. 
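The mismatch is easy to test for. A minimal Ruby sketch follows; the helper name `hostname_resolvable?` is my own invention, and it takes file contents as strings so it can be tried without touching a real system:

```ruby
# Sketch: sudo needs the name in /etc/hostname to appear as an alias
# on some line of /etc/hosts. Hypothetical helper for illustration only;
# pass in the file contents as strings.
def hostname_resolvable?(hostname_file, hosts_file)
  name = hostname_file.strip
  hosts_file.each_line.any? do |line|
    fields = line.sub(/#.*/, '').split   # strip comments, split on whitespace
    fields[1..-1].to_a.include?(name)    # skip the address field
  end
end

# The broken state described above: 'foo' in /etc/hostname,
# but only 'foo.bar.com' in /etc/hosts.
hostname_resolvable?("foo\n", "127.0.0.1 localhost\n127.0.1.1 foo.bar.com\n")      # => false
hostname_resolvable?("foo\n", "127.0.0.1 localhost\n127.0.1.1 foo.bar.com foo\n")  # => true
```

Adding the bare hostname as an extra alias on the 127.0.1.1 line, as in the second call, is exactly the state sudo is happy with.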
My sudo continued to work for a week before my reboot because my DNS settings, and my 'search' list in particular, allowed 'foo' to be resolved even without the 'bar.com', but after the reboot it failed due to the additional issue below:</li><li><span style="font-weight: bold;">static eth0 failed on reboot</span><br />Strangely enough eth0 appeared on the ifconfig output, but with no IP address, while my network configuration in the network applet, as well as in /etc/network/interfaces, looked just fine. I needed to add 'auto eth0' to the config file to get it to work correctly. I vaguely remember seeing this before on much older ubuntu versions, but have not seen it for a while, so it was quite a surprise. This issue caused the sudo issue above to appear suddenly after a reboot.</li></ol>The sudo and eth0 issues were tricky to deal with because a non-working sudo means you cannot access and/or edit the files you need to get things working. I found reports of people rebooting to single user or recovery mode, and others booting the live CD to access the hard drive, both of which seem like an over-sized hammer for this small nail. One mentioned using 'aptitude' and its menu to switch to root, but I could not get a shell from there. One mentioned using gksudo to run xterm as root, and that should work. I used the network admin tool to fiddle with settings until I got the hosts and hostname files to match. This was not easy because that tool did not allow simple hostnames (no domains) in the hosts file.Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com2tag:blogger.com,1999:blog-7497898.post-50450704738220162012008-03-10T13:55:00.000+01:002008-04-28T00:46:29.824+02:00Rails authentication: restful_authentication and acts_as_state_machineI have written a couple of rails apps with user authentication, but finally decided to start using some of the excellent plugins available for this. 
After a quick search I got the impression that restful_authentication was the current standard for rails (I'm using rails 2.0.2 at the moment), and especially with the link to the state machine plugin. However, my initial quick search did not yield a decent quick 'howto' for fast-tracking getting this all working. So I started writing notes on my findings here in my blog:<br /><ol><li>Create your rails application:<pre style="overflow: auto;">rails -d mysql myapp<br />cd myapp<br />rake db:create # you might need to edit config/database.yml first to match your db installation<br /></pre></li><li>Install the two plugins required:<pre style="overflow: auto;">script/plugin install http://elitists.textdriven.com/svn/plugins/acts_as_state_machine/trunk<br /># I needed to use trunk, as other versions have a missing const RailsStudio error<br />script/plugin source http://svn.techno-weenie.net/projects/plugins<br />script/plugin install restful_authentication<br /># obviously these last two lines can be combined</pre></li><br /><li>Create the users model and controllers:<br /><pre style="overflow: auto;">script/generate authenticated user sessions --include-activation --stateful<br /># This will create the users model and the users and sessions controllers.</pre><br />It also adds map.resource entries for these in the routes.rb file. 
'include-activation' is for email activation and 'stateful' is the tie to the state machine (for easily managing user activation and login status)</li><br /><li>Edit the routes.rb file to specify the user states and add some useful routes:<pre style="overflow: auto;">map.resources :users, :member => {<br />:suspend => :put,<br />:unsuspend => :put,<br />:purge => :delete<br />}<br />map.resource :session<br />map.activate '/activate/:activation_code', :controller => 'users', :action => 'activate'<br />map.signup '/signup', :controller => 'users', :action => 'new'<br />map.login '/login', :controller => 'sessions', :action => 'new'<br />map.logout '/logout', :controller => 'sessions', :action => 'destroy'<br />map.forgot_password '/forgot_password', :controller => 'users', :action => 'forgot_password'<br />map.reset_password '/reset_password/:code', :controller => 'users', :action => 'reset_password'<br />map.account '/account', :controller => 'users', :action => 'account'<br /></pre></li><li>Edit the environment.rb file to include the line:<pre style="overflow: auto;">config.active_record.observers = :user_observer</pre>This allows for activation emails to be sent.</li><li>Edit the migration, in this case db/migrate/001_create_users.rb, and add lines for the 'forgot password' feature and the option to have administrator users:<pre stype="overflow:auto;">t.column :password_reset_code, :string, :limit => 40<br />t.column :is_admin, :boolean, :default => false</pre></li><li>Update the database:<pre style="overflow: auto;">rake db:migrate<br /></pre></li><li>Remove the following line from sessions_controller and users_controller and add it to application_controller to enable authentication application wide:<pre style="overflow: auto;">include AuthenticatedSystem</pre></li><li>Remove or comment out these two lines from the UsersController.create method:<pre style="overflow: auto;">self.current_user = @user<br />redirect_back_or_default('/')</pre>This allows us to add 
further processing of the user registration request, by adding a create.html.erb view and email activation.</li><li>Add the view/users/create.html.erb file with content similar to:<pre style="overflow: auto;"><fieldset><br /><legend>New account</legend><br /><p>Instructions for activating your account<br />have been sent to <%=h @user.email %><br />If this address is incorrect, please<br /><%= link_to 'signup', signup_path %><br />again. If you do not receive the email<br />soon, please check your spam filter.</p><br /></fieldset></pre></li><li>Add the following helper methods to application_helper.rb:<pre style="overflow: auto;">def user_logged_in?<br />session[:user_id]<br />end<br />def user_is_admin?<br />session[:user_id] && (user = User.find(session[:user_id])) && user.is_admin<br />end</pre></li><li>Add a 'forgot password?' link to the views/sessions/new.html.erb form (usually near the 'submit tag'):<pre style="overflow: auto;"><%= link_to 'Forgot password?', forgot_password_url %></pre></li><li>Add links to login/out and signup to your main page or layout. For example, I used a fixed position div like this:<pre style="overflow: auto;"><div style="position: absolute; right: 0px; top: 0px; height: 20px;"><br /><% if user_logged_in? 
%><br /><%= link_to 'Logout', logout_url %><br /><% else %><br /><%= link_to 'Signup', signup_url %><br />| <%= link_to 'Login', login_url %><br /><% end %></pre></li><li>Support admin restrictions with the following method in the application controller:<pre style="overflow: auto;">protected<br /># Protect controllers with code like:<br /># before_filter :admin_required, :only => [:suspend, :unsuspend, :destroy, :purge]<br />def admin_required<br />current_user.respond_to?('is_admin') && current_user.send('is_admin')<br />end</pre></li><li>If you want admin control, add a before filter to the users_controller to restrict key actions to admin users only:<pre style="overflow: auto;">before_filter :admin_required, :only => [:suspend, :unsuspend, :destroy, :purge]</pre></li><li>Add actions in users_controller.rb for account, change_password, forgot_password and reset_password:<pre style="overflow: auto; height: 300px;"><br />def account<br />if logged_in?<br />@user = current_user<br />else<br />flash[:alert] = 'You are not logged in - please login first'<br />render :controller => 'session', :action => 'new'<br />end<br />end<br /><br /># action to perform when the user wants to change their password<br />def change_password<br />return unless request.post?<br />if User.authenticate(current_user.login, params[:old_password])<br /># if (params[:password] == params[:password_confirmation])<br />current_user.password_confirmation = params[:password_confirmation]<br />current_user.password = params[:password]<br />if current_user.save<br /> flash[:notice] = "Password updated successfully"<br /> redirect_to account_url<br />else<br /> flash[:alert] = "Password not changed"<br />end<br /># else<br /># flash[:alert] = "New password mismatch"<br /># @old_password = params[:old_password]<br /># end<br />else<br />flash[:alert] = "Old password incorrect"<br />end<br />end<br /><br /># action to perform when the users clicks forgot_password<br />def forgot_password<br />return 
unless request.post?<br />if @user = User.find_by_email(params[:user][:email])<br />@user.forgot_password<br />@user.save<br />redirect_back_or_default('/')<br />flash[:notice] = "A password reset link has been sent to your email address: #{params[:user][:email]}"<br />else<br />flash[:alert] = "Could not find a user with that email address: #{params[:user][:email]}"<br />end<br />end<br /><br /># action to perform when the user resets the password<br />def reset_password<br />@user = User.find_by_password_reset_code(params[:code])<br />return if @user && !params[:user] # just render the reset form until a new password is submitted<br /><br />if ((params[:user][:password] && params[:user][:password_confirmation]))<br />self.current_user = @user # for the next two lines to work<br />current_user.password_confirmation = params[:user][:password_confirmation]<br />current_user.password = params[:user][:password]<br />@user.reset_password<br />flash[:notice] = current_user.save ? "Password reset successfully" : "Unable to reset password"<br />redirect_back_or_default('/')<br />else<br />flash[:alert] = "Password mismatch"<br />end<br />end</pre><br /></li><li>Create html.erb forms in views/users for the change_password, forgot_password and reset_password actions.</li><li>Edit models/user_mailer.rb and replace YOURSITE and ADMINEMAIL with values appropriate for the new website. A good way of doing this is to define the variable SITE in the config/environments/*.rb files and then use that in the strings in the UserMailer with the "#{SITE}" format. Also add methods for forgot_password and reset_password (ie. 
send mails when those actions are invoked):<pre style="overflow: auto; height: 300px;">class UserMailer < ActionMailer::Base<br /> def signup_notification(user)<br /> setup_email(user,'Please activate your new account')<br /> @body[:url] = "#{SITE}/activate/#{user.activation_code}"<br /> end<br /><br /> def activation(user)<br /> setup_email(user,'Your account has been activated!')<br /> @body[:url] = "#{SITE}/"<br /> end<br /><br /> def forgot_password(user)<br /> setup_email(user,'You have requested to change your password')<br /> @body[:url] = "#{SITE}/reset_password/#{user.password_reset_code}"<br /> end<br /><br /> def reset_password(user)<br /> setup_email(user,'Your password has been reset.')<br /> end<br /><br />protected<br /><br /> def setup_email(user,subj=nil)<br /> recipients "#{user.email}"<br /> from %{"Your Admin" <bounce@yourdomain.com>}<br /> subject "[#{SITE}] #{subj}"<br /> sent_on Time.now<br /> body :user => user<br /> end<br />end<br /></pre></li><li>Add forgot_password.html.erb and reset_password.html.erb to the user_mailer view.</li><li>Add methods to models/user.rb: forgot_password, reset_password, recently_forgot_password, recently_reset_password and recently_activated. 
Also add protected method make_password_reset_code.<pre style="overflow: auto; height: 300px;"> def forgot_password<br /> @forgotten_password = true<br /> self.make_password_reset_code<br /> end<br /><br /> def reset_password<br /> # First update the password_reset_code before setting the<br /> # reset_password flag to avoid duplicate mail notifications.<br /> update_attributes(:password_reset_code => nil)<br /> @reset_password = nil<br /> end<br /><br /> # Used in user_observer<br /> def recently_forgot_password?<br /> @forgotten_password<br /> end<br /><br /> # Used in user_observer<br /> def recently_reset_password?<br /> @reset_password<br /> end<br /><br /> # Used in user_observer<br /> def recently_activated?<br /> @activated<br /> end<br /><br />protected<br /> def make_password_reset_code<br /> self.password_reset_code = Digest::SHA1.hexdigest( Time.now.to_s.split(//).sort_by {rand}.join )<br /> end</pre><br /></li><li>Modify UserObserver.after_save(user) to send activation only based on user.recently_activated? Also add to this method mail sending for forgot_password and reset_password events:<pre style="overflow: auto;"> def after_save(user)<br /> UserMailer.deliver_activation(user) if user.recently_activated?<br /> UserMailer.deliver_forgot_password(user) if user.recently_forgot_password?<br /> UserMailer.deliver_reset_password(user) if user.recently_reset_password?<br /> end<br /></pre></li><li>Make sure your mail subsystem is properly prepared to send mail. I installed postfix on my development and deployment machines (both ubuntu, so I used 'apt-get install postfix'). It is a good idea to test this with a command-line mail like:<pre style="overflow: auto;">sendmail -f admin@mydomain.com me@myaddress.com<br />Subject: test<br /><br /><br />Hello, world!<br />.</pre></li><li>Add appropriate administrative links. 
In my case I created an 'account' route to a new account action and view in the users controller, and in this displayed current user settings and provided a link to the 'change_password' action. Since this is very similar to many of the actions above, it is left as an exercise to the reader :-)</li><li>Test everything, sign up a user, login, logout, click 'forgot password', respond to all emails sent, change the password, etc.<br /></li><li>What's next? Well, in my case I continued by adding a boolean 'is_admin' flag to the users table and then adding extra capability to my site for admin users. I also created a cool layout and used it for all controllers in my site. This is rails after all, the sky is the limit :-)<br /></li></ol>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com27tag:blogger.com,1999:blog-7497898.post-59477215657471593222008-03-07T13:39:00.000+01:002008-03-07T13:54:54.593+01:00Even Microsoft is getting coolWhen you think of 'cool' modern companies, names like '<a href="http://gigaom.com/2005/03/26/how-yahoo-got-its-mojo-back/">yahoo</a>' and '<a href="http://www.alternet.org/story/18681/">google</a>' spring to mind. For many '<a href="http://www.wired.com/gadgets/mac/news/2003/09/60287">apple</a>' is also synonymous with cool. But Microsoft generally never gets that classification. 'Serious', 'business focused', even '<a href="http://boycottnovell.com/2008/01/21/colombia-minister-gplv3-program/">ruthless</a>'. But take a look at the <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9066784">photo gallery of research projects </a>from Microsoft's seventh annual techFest. 
Now that is cool!<br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9066784"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQDLx8F58ND9ygROYUUSjhfMHjlHDPi3dHAovLOGpgvVqwBG1BVWt_bI60rqbWJ0q4XJBjR6cPDROlEcOmoL2xZ8Ho-N_pZMkBNtyBmNcmiFdBoTUW36bH6lu4O3JnT4qoq-A/s400/gaudin1thumb.jpg" alt="" id="BLOGGER_PHOTO_ID_5174980123676503586" border="0" /></a>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-24177277874725020692008-02-19T17:15:00.000+01:002008-02-19T17:36:23.376+01:00IT back to businessA recent <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9062338">ComputerWorld article</a> describes a 'new trend' in IT towards having business-savvy 'IT' people working within business departments instead of centralized generic IT personnel. I think this is a trend that started a while ago with pragmatic companies focusing on operational efficiency. Unfortunately not all companies are pragmatic, but hopefully more will follow this trend.<br /><br />I had a gripe with the way IT was moving in two previous companies I worked for. In the first, the IT department was standardized across the very large international organization, and focused on the common low-tech user, which was completely unsuitable for our high-tech development site, dramatically restricting our efficiency. We had to do our own internal 'skunk-works' IT, and hide the costs, in order to operate efficiently. My worst case horror story was the time it took 6 months and 5 engineers in 4 countries to install a local printer! 
(previously we had one local IT tech who would do it in a couple of hours).<br /><br />The second company was a start-up which meant it had good pragmatic IT for a while, but as it grew, the new management tried to increase operational efficiency by 'centralizing' and 'standardizing' IT. Sounds good on paper, but simply does not work in reality. Costs increased and performance decreased due to the separation of IT from the people actually doing the business. People working towards true operational efficiency were marginalized and often left the company. By the time I left it was well on the way to the level of <span style="font-style: italic;">operational inefficiency</span> of the large multi-national I worked with before.<br /><br />Over the years I've developed a very strong feeling that IT must be integrated into the business. I'm thrilled to see a <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9062338">prominent article</a> claiming this as an industry trend! So I end this blog with a nice quote:<br /><br /><span style="font-style: italic; color: rgb(102, 102, 102);">"<span style="color: rgb(51, 51, 51);">I want them to think of themselves as people who work for this company, not people who work for this company's IT department," he says. "We have an energy supply business to manage. That's our business, and we want to do it as efficiently as possible. It doesn't really matter what the IT job is."</span></span>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0tag:blogger.com,1999:blog-7497898.post-57359990154799530472008-01-23T13:20:00.000+01:002008-01-23T15:36:09.679+01:00Jazz, software development and the Sun/MySQL dealJazz - I kept hearing that word over and over the last couple of weeks, culminating in me finally joining YouTube and putting together a <a href="http://www.youtube.com/view_play_list?p=D6E0B5D64EF5874B">play list of jazz music videos</a>. 
However, it all started with software development:<br /><ul><li>First IBM announced the partial open sourcing of their <a href="http://jazz.net/">jazz.net</a> service. This was interestingly relevant to my current search for on-line software project hosting services. The <a href="http://jazz.net/">jazz.net</a> site says: '<span style="font-style: italic;">Developing software in a team is much like playing an instrument in a band. Both require a balance of collaboration and virtuosity. Jazz defines a vision for the way products can integrate to support this kind of collaborative work, and a technology platform to deliver on this vision.</span>' Sounds great, but I'm not sure it is mature enough to replace my current top contender: <a href="http://launchpad.net/">launchpad.net</a>. However, considering the great job IBM did with eclipse, I'm certainly going to keep an eye on this new offering.</li><li>Then I read an article about the new trend in developer recruitment, calling top candidates '<a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9057899&source=rss_news50">rock star coders</a>'. This trend was countered by some well written blogs asserting that it is better to be a 'jazz musician programmer':</li><ul><li>'<a href="http://deadprogrammersociety.blogspot.com/2007/05/i-would-rather-be-jazz-programmer.html">I would rather be a jazz programmer</a>' does a lovely comparison of rock stars and jazz musicians w.r.t. programming, emphasising creativity. 
I tried to paraphrase the key points here, but I think it just has to be read, so go <a href="http://deadprogrammersociety.blogspot.com/2007/05/i-would-rather-be-jazz-programmer.html">take a look</a>.<br /></li><li>'<a href="http://haacked.com/archive/2007/05/30/id-rather-be-a-dj-than-a-rockstar-developer.aspx">I'd rather be a DJ than a rock star developer</a>' likes the jazz programmer idea, but prefers to be a DJ, emphasising code reuse and web-mashups.<br /></li><li>'<a href="http://chrismcmahonsblog.blogspot.com/2007/05/example-of-analogy-monks-vs-music.html">Monks versus music</a>' says that agile development teams are a lot like 'jazz bands'. This blog got me onto the YouTube jazz search, starting with Count Basie and culminating with Norah Jones! In many videos artists joined forces to create a blend of music. I really like Norah Jones singing with Ray Charles, what a combo!<object height="266" width="319"><param name="movie" value="http://www.youtube.com/v/KctNZTkCfCc&rel=1&autoplay=1"><param name="wmode" value="transparent"><embed src="http://www.youtube.com/v/KctNZTkCfCc&rel=1&autoplay=1" type="application/x-shockwave-flash" wmode="transparent" height="266" width="319"></embed></object><br /></li></ul><li><a href="http://weblog.infoworld.com/openresource/archives/2008/01/more_on_suns_ac.html?source=NLC-OPENENTERPRISE&cgd=2008-01-22">Sun's acquisition of MySQL</a> - OK, no jazz mentioned there, but with jazz on my mind I saw a fit. It is yet another interesting merger of talents, the mature Sun trying their hand at the new world of open source, partnering with the younger, hipper MySQL with a very solid open source presence. It reminded me so much of the YouTube video of Ray Charles being introduced by Johnny Cash and then singing Johnny's song '<a href="http://www.youtube.com/watch?v=IhGZdSkX6IM">Ring of Fire</a>' with a strong jazz slant. Let's see if MySQL will jazz up Sun's product offerings, or will Sun dull down MySQL? 
(considering Sun's recent moves towards open source, with Java and Solaris, I'm betting on the former, which is great).<br /></li><object height="266" width="319"><param name="movie" value="http://www.youtube.com/v/IhGZdSkX6IM&rel=1"><param name="wmode" value="transparent"><embed src="http://www.youtube.com/v/IhGZdSkX6IM&rel=1" type="application/x-shockwave-flash" wmode="transparent" height="266" width="319"></embed></object><br /></ul>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com1tag:blogger.com,1999:blog-7497898.post-27205000813239636862007-11-02T19:20:00.000+01:002009-06-11T17:13:33.331+02:00Quick Ruby on Rails on Ubuntu 7.10 (and 8.04 and 9.04)There are many howto's on this subject out there, but since I ended up blending a few of them to get exactly the environment I wanted setup (and took notes so I could repeat this), I thought I'd blog it for future reference, and hopefully this info is useful to others.<br /><br />So, what exactly do I want: a development and deployment machine with the following specifications:<br /><ul><li>A recent <span style="font-weight: bold;">Ubuntu</span> Linux, in this case 7.10 (Gutsy Gibbon) <span style="font-weight: bold; font-style: italic;">[update: I also installed on 8.04, hardy heron beta, with good success]</span><br /><span style="font-weight: bold; font-style: italic;">[update2: I have now installed on 9.04, with a few changes]</span><br /></li><li>Web-App server comprised of apache2, mongrel, <span style="font-weight: bold;">ruby on rails</span>, mysql5<br /></li><li>Development environment comprised of Aptana <span style="font-weight: bold;">RadRails </span>(based on <span style="font-weight: bold;">eclipse</span>), for both <span style="font-weight: bold;">Java and Ruby on Rails</span> development (I use Java for other projects, but also plan to do some JRuby work), and since eclipse rocks, I use it for everything I can, even rails. 
If you don't want java development, you can use the pure Aptana IDE, but then you also lose out on CVS support.<br /></li><li>Some additional libraries for graphics and charting support in the rails apps.</li></ul>The quickest setup procedure that worked for me, on four very different computers (a Dell Precision 380 desktop, an Acer Aspire 3100 laptop and most recently a Packard Bell quad core workstation and an Acer Aspire 7530G), was:<br /><ol><li><span style="font-weight: bold;">Install ubuntu</span>. I simply did a standard install from the live cd (downloaded from <a href="http://www.ubuntu.com/">www.ubuntu.com</a>)<br /></li><li><span style="font-weight: bold;">Update all packages</span>. I usually just click the update icon on the top right of the screen, but you can use synaptic ('mark all updates' and then 'apply'), or apt-get with the following commands:<br /><ul><li>sudo apt-get update</li><li>sudo apt-get upgrade<br /></li></ul></li><li><span style="font-weight: bold;">Install packages</span> required for this setup. 
I used synaptic, but you could just as easily use 'sudo apt-get install ...':<br /><ul><li>sun-java6-jdk / openjdk-6-jdk (includes a number of other required packages)<br /></li><li>sun-java6-source / openjdk-6-source (optional)<br /></li><li>joe (just my preferred old text editor, you choose what you like)</li><li>flashplugin-nonfree (since I will need that for some sites I view/develop)</li><li>ruby-full (bundles a bunch of ruby packages, including irb, rdoc and ri, but not rake and rubygems, see later for those - do not install them as ubuntu packages now)</li><li>apache2 (provides version 2.2)<br /></li><li>mysql-server (provides version 5.0)</li><li>libsqlite3-dev (required for the gem install of sqlite3-ruby, if you need that)<br /></li><li>build-essential (provides a c/c++ development environment required by some ruby gems which build on install)</li><li>libmysqlclient15-dev (required by the mysql ruby library)</li><li>eclipse (for 3.2 support in ubuntu 7.10 and 8.04) - to get eclipse 3.3 or 3.4 you need to download it from www.eclipse.org.</li><li>On recent versions of Ubuntu (like 9.04) you also need to install xulrunner as described at <a href="http://bearfruit.org/blog/2008/06/10/fixing-eclipse-after-the-firefox-3-rc1-update-in-ubuntu-8-04">this link</a>. This is to get around a library issue between SWT (in eclipse) and Firefox 3.0.<br /></li></ul></li><li><span style="font-weight: bold;">Install rubygems</span> from source.
This is a contested point, as ubuntu provides rubygems as a package also, but since it is a package management facility itself, it can <a href="https://help.ubuntu.com/community/RubyOnRails">conflict with the debian package management provided by ubuntu</a>, so it is easiest to keep it completely separate:<br /><ul><li>wget http://rubyforge.org/frs/download.php/57643/rubygems-1.3.4.tgz</li><li>tar xzvf rubygems-1.3.4.tgz</li><li>cd rubygems-1.3.4</li><li>sudo ruby setup.rb</li><li>sudo ln -s /usr/bin/gem1.8 /usr/bin/gem # This was not required for rubygems 0.9.4, but is required now on ubuntu with rubygems 1.0.1 and above<br /></li><li>sudo gem update --system # With 0.9.4 I needed to repeat this as the first time gave an error, but with 1.0.1 and 1.3.4 it worked first time<br /></li></ul></li><li>Use rubygems to <span style="font-weight: bold;">install rails </span>and some other useful gems using the command 'sudo gem install X' where X is any number of the following (I did all):<br /><ul><li>rails # this includes dependencies like rake</li><li>mongrel # for the deployment server<br /></li><li>mongrel_cluster # if you want to try out clustering</li><li>capistrano # if you want to do the easy deployment as described in the 'agile' book</li><li>mysql # for mysql database access from ruby</li><li>termios # well, this was mentioned in several blogs and the agile book, so I just did it :-)</li><li>sqlite3-ruby (sqlite3 is now the default database in rails2, so you might need this)<br /></li></ul></li><li>Add the Aptana <span style="font-weight: bold;">Radrails </span>plugins to the eclipse IDE:<br /><ul><li>Start eclipse from the applications menu</li><li>Go to menu 'Help->Software Updates->Find and Install'</li><li>Select 'Search for new features to install' and click 'Next'<br /></li><li>Click 'New Remote Site'</li><li>Enter name as 'Aptana' and URL 'http://update.aptana.com/install/3.2' and click OK<br /></li><li>Click 'New Remote Site' again<br /></li><li>Enter
name as 'Aptana Radrails' and URL 'http://update.aptana.com/install/<span style="font-weight: bold; font-style: italic;">rails</span>/3.2' and click OK<br /></li><li>Click 'finish' to start the search for updates</li><li>Once the search is complete, you should check both 'aptana' and 'aptana radrails' and click 'next' to install</li><li>Accept the license agreement and click 'next' and then 'finish' to start the actual download</li><li>When prompted click 'install all' to finish the install<br /></li><li>Restart eclipse<br /></li></ul></li><li>Modify Eclipse to use <span style="font-weight: bold;">Java6</span> (based on http://help.ubuntu.com/community/EclipseIDE). For Ubuntu 8.04 and 9.04, this was not necessary, but I did it for Ubuntu 7.10 and earlier.<br /><ul><li>edit /etc/eclipse/java_home and move java6 up in the list so that eclipse uses java6 for running itself (which is faster)<br /></li><li>To get projects inside eclipse to use java6, open eclipse and go to the menu 'Window->Preferences->Java->JREs' and select the Java6 JRE.<br /></li></ul></li><li>Add some extra libraries to rails (optional, depends on your apps):<br /><ul><li>I wanted ImageMagick for photo uploads and resizing in my rails apps, so I installed the ubuntu package for ImageMagick using synaptic, and then installed the rubygem 'mini-magick'.
I had tried 'RMagick', but the gem did not install, and online help indicates that you need to re-install ImageMagick from source to get RMagick to work, so I opted for the simpler mini-magick.<br /></li><li>For charting support in my web apps, the popular approach of using <a href="http://www.khanspot.com/2006/11/09/charts-in-ruby-using-gruff/">'gruff' wrapping ImageMagick</a> seemed to generate a lot of help requests, so I went for the lighter approach of using client side flash as described in the following <a href="http://blogs.tech-recipes.com/johnny/2006/08/29/ruby-on-rails-beautiful-charts-made-easy-using-flash-with-rails-rxml-templates/">blog</a>.</li><li>For nicer color control, I installed pdf-writer to get the color-tools gem (and in case I needed pdf-writer itself for future development).</li></ul></li><li>Once you have created a rails app, using the usual 'rails myapp', you should consider adding a number of cool rails plugins. There are a <a href="http://agilewebdevelopment.com/plugins/">huge number</a> out there. Currently I'm using attachment_fu, will_paginate, acts_as_state_machine, restful_authentication and ext_scaffold. See some of my more recent blogs for some more info on these.<br /></li><li>Miscellaneous. I also changed my ubuntu font sizes down to 9pt on 'system->preferences->appearance' to get a bit more screen real-estate on the laptop. Windows uses smaller fonts so it looks better (to me), especially when developing in an IDE like eclipse where it is nice to have many panels open together. And for some reason the Ubuntu install had visual effects disabled (Composite extension not available), so I needed to edit /etc/X11/xorg.conf and change the "0" to a "1" on the Composite line near the end of the file. And I changed the theme to 'glossy' but with 'human' icons and darker colors. 
Looks cool now!<br /></li></ol>Well, quite a few steps, but most of the time the computer is downloading packages, updates, plugins and gems from the internet, so you can just lounge around with a good latte :-)<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvWJKZibJJrF2cbP9spX0Vaj95n1UbEKi12cW8TUHY_6IbJNqjCxRtmi1_jaMonKDHegU7W9wK9ODH_LnVdaLtklEDOAgW1kONT2sGBWmjxlYWaBzLPjoAEeHET_uKmAu5VcU/s1600-h/Screenshot.png"><img style="cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvWJKZibJJrF2cbP9spX0Vaj95n1UbEKi12cW8TUHY_6IbJNqjCxRtmi1_jaMonKDHegU7W9wK9ODH_LnVdaLtklEDOAgW1kONT2sGBWmjxlYWaBzLPjoAEeHET_uKmAu5VcU/s400/Screenshot.png" alt="" id="BLOGGER_PHOTO_ID_5128352398536473314" border="0" /></a><br /><br />And of course the required screenshot, this one is of the Acer laptop with firefox showing this page and eclipse with a Ruby on Rails application in production (ubuntu 7.10).<br /><br />Next steps - there are lots of blogs out there on setting up production deployments for rails, and here's one I just read: <a href="http://www.urbanpuddle.com/articles/2008/01/09/install-ruby-on-rails-on-ubutu-gutsy-gibbon-apache-version">http://www.urbanpuddle.com/articles/2008/01/09/install-ruby-on-rails-on-ubutu-gutsy-gibbon-apache-version</a>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com15tag:blogger.com,1999:blog-7497898.post-26296647440749255762007-06-01T14:00:00.000+02:002007-06-01T18:27:26.632+02:00VMWare & Dual-Boot Ubuntu Feisty and Windows XPA long title I know, but I had to do some serious google searches to get this to work, and it was thanks to other long titles that I found the info I needed.
So let me start with the references:<br /><ol><li>VMWare's document: <a href="http://www.vmware.com/support/ws45/doc/disks_dualboot_ws.html">Configuring a Dual-Boot Computer for Use with a Virtual Machine</a> - good background info covering many (but not all) issues.</li><li><a href="http://www.motin.eu/www/mirror/physvmware/"> Running VMWare on a Physical Partition</a> by Scott Bronson is a really good howto-style article with most of the details you need to do the job step-by-step.</li><li><a href="http://www.ubuntugeek.com/how-to-install-vmware-server-from-canonical-commercial-repository-in-ubuntu-feisty.html">How to install Vmware server From Canonical commercial repository in Ubuntu Feisty</a> - that got me onto the easy install route</li><li><a href="http://gentoo-wiki.com/HOWTO_Configure_a_dual_boot_windows_linux_to_be_able_to_open_each_os_in_vmware">HOWTO Configure a dual boot windows linux to be able to open each os in vmware</a> - gentoo based article, but had an interesting suggestion for windows IDE drivers.</li></ol>So, as the first vmware document says, this is not necessarily a very easy thing to do, getting the same OS installation to boot on both the real physical machine and the virtual machine. Most issues relate to the different drivers and configurations required, or to the multitude of boot problems. However, the documents above, and the second one in particular, covered most of what I needed. So if you have a similar setup to me, you should be able to get this working OK by following that second article, with a few small changes suggested below.<br /><br />So why did I bother to write this article? 
Well, not one of the above articles was enough to cover my setup, and I did not find any real help (hints, but not explicit help) for some of the problems I had:<br /><ul><li>My SATA drives are seen by Ubuntu as SCSI drives, but by Windows as IDE drives, preventing windows from even booting in VMWare without installing the vmscsi drivers *before* booting windows!</li><li>Any change I made to the Ubuntu network configuration after setting up vmware completely broke the vmware installation (and made gnome run super-slow). This seems to be an edgy-feisty problem, and a google search found the solution (add the hostname to 127.0.0.1 in /etc/hosts - simple!)<br /></li></ul>So, let me outline the basic approach I took to get windows running inside ubuntu. The reverse is possible, and covered in the above articles, but I focused on this approach only.<br /><br />My setup:<br /><ul><li>Dell Precision 370 with SATA drives, 2GB RAM, Dell 20" flatscreen (1600x1200)<br /></li><li>Windows XP installed on C: (/dev/sda1)</li><li>Ubuntu Feisty (7.04) installed on /dev/sda3</li></ul>Installation procedure:<br /><ul><li>Install the OS's (I think it is best to install windows first, making the C: partition 50% of the space and leaving the rest for Ubuntu, but other routes usually work too).</li><li>Make sure dual boot works, disable the timeout in grub so you are always forced to choose the OS (for now) - see article 2 above.<br /></li><li>Under linux:</li><ul><li>Prepare the disk as in article 2 (see article 1 for more info)<br /></li><li>Install vmware-server from the Canonical commercial repository - see article 4 above, or just use: deb http://archive.canonical.com/ubuntu feisty-commercial main</li><li>Make a floppy with the windows SCSI drivers from <a href="http://www.vmware.com/download/server/drivers_tools.html">VMWare</a> (either write a floppy with dd, or mount the image and copy the files to the windows partition)<br /></li></ul><li>Under windows</li><ul><li>Prepare windows as in
article 2 above (again article 1 has more info)</li><ul><li>Make sure you make the two hardware profiles, and reboot to the new virtual profile for any driver changes.<br /></li><li>Also install the vmscsi drivers by running the 'add hardware' wizard of the control panel, and clicking 'no I have not connected the hardware yet' to get windows to allow you to install the scsi drivers even though it cannot see any scsi hardware.</li><li>Consider changing the IDE drive as in article 4, although I'm not sure that was really required</li></ul></ul><li>Back in Ubuntu create the virtual machine according to article 2 (and reference article 3 for Ubuntu specifics if you need).</li><ul><li>I recommend choosing the entire disk, not the partitions, as this seems to help a lot with windows, linux and vmware having different views of the partition table.</li></ul><li>Start the virtual machine as in article 2 and install VMWare tools (which will re-install the scsi drivers, but we needed to do that before this just to get it to boot in the first place).</li><li>Prepare the GRUB bootdisk for user-free reboots according to article 2. This is great because you can (mostly) avoid worrying about rebooting to the wrong OS, or having the same OS running on both the hardware and the vmware at the same time (which would be disastrous).
There is no solution to having to choose the windows hardware profile, but if you are like me and normally run windows inside linux, then having the virtual profile first, with a timeout, is great because even if you do boot the virtual profile on the real hardware by mistake and mess it up, it is much less serious than messing up the physical machine profile, and you can simply re-create the virtual profile by repeating some of the steps above.</li></ul>Finally, more info on the two problems I had:<br /><ul><li>The need to install the vmscsi drivers in the windows virtual profile, but with windows booted onto the physical machine, is covered in the discussion above.</li><li>The issue with slow-gnome and broken vmware-server is really strange and I do not know the underlying reason, but adding the line '127.0.0.1 localhost hostname' to /etc/hosts (with 'hostname' changed to your host) magically solves it. I wonder if the problem is related to the mysterious '127.0.1.1' line in the hosts file?</li></ul>And, last of all, here is the obligatory screenshot showing the ubuntu desktop, with gimp and a gnome game running above the VMWare server console with Windows XP running eclipse showing a Ruby on Rails application with InstantRails windows as well.
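The hosts-file fix described above amounts to a single line. Here is a minimal sketch of what it looks like, demonstrated on a scratch copy rather than the real file (editing /etc/hosts itself needs sudo); the filename `hosts.scratch` is just for illustration, and `uname -n` stands in for your hostname:

```shell
# Build a scratch stand-in for /etc/hosts, including the mysterious
# 127.0.1.1 line that Ubuntu installs by default.
printf '127.0.0.1\tlocalhost\n127.0.1.1\t%s\n' "$(uname -n)" > hosts.scratch

# The fix: make the machine's hostname also resolve via 127.0.0.1.
printf '127.0.0.1\tlocalhost %s\n' "$(uname -n)" >> hosts.scratch

# Show the loopback entries we now have.
grep '^127.0.0.1' hosts.scratch
```

To apply it for real, append the same `127.0.0.1 localhost <yourhost>` line to /etc/hosts with sudo and restart vmware-server.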
I know it's weird to be doing ruby and rails development in Windows, but I just happened to get started there, and I like the fact that eclipse on XP has nice compact fonts by default.<br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVBEedwnNjos23p4QpBDH0QLEizSbrsAq3bC_qqxj2hDD898ErEtp0tgYGDAVtICZoH-O7NlGMB0Tg-eO26B2xyNKPNUDsEIW35dUDlUDTHpNbpOt1JJ77Woh8k86JRiJrTjA/s1600-h/ubuntu_vmware.png"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVBEedwnNjos23p4QpBDH0QLEizSbrsAq3bC_qqxj2hDD898ErEtp0tgYGDAVtICZoH-O7NlGMB0Tg-eO26B2xyNKPNUDsEIW35dUDlUDTHpNbpOt1JJ77Woh8k86JRiJrTjA/s400/ubuntu_vmware.png" alt="" id="BLOGGER_PHOTO_ID_5071132935300577266" border="0" /></a>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com4tag:blogger.com,1999:blog-7497898.post-59025246220777684352007-05-04T17:23:00.000+02:002008-06-03T11:39:01.627+02:00Amanzi Snippets<a href="http://snippets.amanzi.org"><img id="BLOGGER_PHOTO_ID_5060726656345337234" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDG1ewnkb2vh9a9YDsSKntVEKx4nuhXRFSh9FcJwXwU__MA8OtNbYxpQGOyIyOn8fZYm_Dd7onbk0CstQ2d7jQOi2jgitcqIB7zDAgLOG3lzDnLhqDIDy6-JHnbNaEPnHkUmM/s320/dig_ruby.JPG" border="0" /></a>Recently I've begun giving a <a href="http://snippets.amanzi.org">weekly series of ultra-short tech talks</a> to the other developers in our office. The original problem we had was that, while we were all interested in learning about, or hearing about, technologies outside our working domain, it was generally not possible to spend office working hours on projects not directly of benefit to the company. The solution has been to give 15-minute talks during normal coffee-breaks.
One really positive aspect of this is that it forced me to focus on a single specific 'hot' topic, thereby blocking my usual tendency to lecture on for hours. While the resulting tech-talks were not comprehensive, they were easily digestible. Anyway, you can always go to the books referenced below if you want something comprehensive.<br />These ‘snippets’ follow my progression as I learned Ruby, JRuby and Rails from the books I read, internet articles, and trial-and-error with real scripting projects I did. Each week I thought of something that was of particular interest to me as a Java and sometimes Perl programmer, and presented it in a 2-4 slide ‘snippet’. Where possible I used examples directly relevant to the work we do, which is mostly Java programming of data models and numerical analysis in mobile phone networks (not mobile phones). Sometimes I used examples from the books I read along the way:<br /><ul><li><a href="http://www.oreilly.com/catalog/beyondjava/">Beyond Java, Bruce Tate, O’Reilly</a> – which got me interested in Ruby and Rails</li><li><a href="http://www.pragmaticprogrammer.com/titles/ruby/index.html">Programming Ruby, 2nd Edition, Dave Thomas, Pragmatic Programmers</a> – which taught me what I needed for my first few scripting projects</li><li><a href="http://www.pragmaticprogrammer.com/titles/rails/">Agile Development with Rails, 2nd Edition, Dave Thomas and David Heinemeier Hansson, Pragmatic Programmers</a> – which finally got me into Rails, after more than a year of hearing the buzz</li></ul><p>For a more specific Ruby-2-Java comparison, see the extract in <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9018460&pageNumber=1">ComputerWorld</a> of the book ‘Rails for Java Developers.’ This is a nice soft intro to Ruby for Java developers. However, while my snippets are not as complete, I think they are more interesting and relevant to me, of course.
And I hope they will be of interest to others too.</p>Craig Tavernerhttp://www.blogger.com/profile/18124952402128625958noreply@blogger.com0
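For reference, the package and gem steps from the Ubuntu/Rails setup post above can be condensed into a single script sketch. This is a non-authoritative summary of that post, not a tested installer: the package names are from the Ubuntu 7.10-9.04 era, the rubyforge download URL no longer exists, and the whole thing needs sudo and network access. For that reason the sketch only writes the script to a file and syntax-checks it, rather than running it:

```shell
# Write the condensed setup script (quoted heredoc: no expansion here).
cat > setup-rails.sh <<'EOF'
#!/bin/sh
set -e
sudo apt-get update && sudo apt-get -y upgrade
sudo apt-get install -y ruby-full apache2 mysql-server libsqlite3-dev \
    build-essential libmysqlclient15-dev
# rubygems from source, kept out of dpkg's way as discussed in the post
wget http://rubyforge.org/frs/download.php/57643/rubygems-1.3.4.tgz
tar xzvf rubygems-1.3.4.tgz
(cd rubygems-1.3.4 && sudo ruby setup.rb)
sudo ln -s /usr/bin/gem1.8 /usr/bin/gem
sudo gem update --system
sudo gem install rails mongrel mongrel_cluster capistrano mysql termios sqlite3-ruby
EOF

# Only check the syntax; do not execute the script blindly on a modern system.
sh -n setup-rails.sh && echo "syntax OK"
```

On a period-appropriate Ubuntu you would then run it with `sh setup-rails.sh`; on anything recent, treat it purely as an outline of the steps.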