Oct 29, 2009

One liner multi-image cropping

A simple entry for a simple way to solve a common problem.

I often have a lot of scanned images, usually documents, that have unnecessary white-space and other artifacts around the edges due to limitations in the scanner software or issues with the scanner itself. Normally I use GIMP, which makes it very easy to correct orientation problems, colour issues and cropping. But in my opinion this is too much mouse work when you have 50 images that all need the same simple cropping. So I fiddled around in the shell a little and came up with this one-liner:
mkdir -p trim crop ; for file in *.jpg ; do echo "Cropping $file" ; \
convert "$file" -trim +repage "trim/$file" ; \
convert "trim/$file" -crop 1692x2190+4+4 +repage "crop/$file" ; done

OK. Not really one line, but I swear I did write it on one line :-)

What it does is three things:
  • loop over all image files, and for each:
  • trim all white-space off the edges, bringing all images to the same basic size with the top left of the real data at the top left of the image
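  • crop them down to a specified size, in this case a little smaller than US letter size. The +4+4 offset skips a few pixels at the top and left, and the reduced width and height drop a few more at the right and bottom, just to remove some noise commonly found on the edges of scanned images.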
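If you would rather not hard-code the crop size, the same three steps can be sketched as a pair of small shell functions. This is only a sketch: the function names crop_geometry and crop_all are made up for illustration, and the 200 dpi resolution in the usage note is my assumption (it happens to give numbers close to the ones in the one-liner), so adjust both arguments for your own scanner.

```shell
# Sketch only: the helper names and the idea of passing in a scan resolution
# are assumptions for illustration, not part of the original one-liner.

# Print an ImageMagick crop geometry (WxH+X+Y) for a US letter page
# (8.5 x 11 inches) scanned at $1 dpi, with $2 pixels removed from every edge.
crop_geometry() {
    dpi=$1
    margin=$2
    # Work in tenths of an inch to stay within integer shell arithmetic.
    width=$(( 85 * dpi / 10 - 2 * margin ))
    height=$(( 110 * dpi / 10 - 2 * margin ))
    echo "${width}x${height}+${margin}+${margin}"
}

# Trim and crop every JPEG in the current directory into ./crop,
# using ./trim for the intermediate files.
crop_all() {
    geometry=$(crop_geometry "$1" "$2")
    mkdir -p trim crop
    for file in *.jpg ; do
        [ -e "$file" ] || continue   # no matches: the glob stays literal
        echo "Cropping $file to $geometry"
        convert "$file" -trim +repage "trim/$file"
        convert "trim/$file" -crop "$geometry" +repage "crop/$file"
    done
}
```

Running crop_all 200 4 in a directory of scans would then crop everything to 1692x2192+4+4, which is within a couple of pixels of the geometry I typed by hand.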
And here is a simple set of thumbnails of the results:

original image with many artifacts

first trim removes continuous whitespace

final cropped image at letter size with edges removed

It took about 10 minutes to figure out this script, 20 seconds to run it on all 50 images, and then 20 minutes to write the blog!

Oct 15, 2009

Complexity threshold

I've been planning to write a proper article about this for years, but inspired today by Dan Pink's excellent presentation on the surprising science of motivation, I thought I'd write a mini-version of this, as an introduction and a reminder to myself to write the real one.

Back in 2006 I wrote a blog titled 'The perception of control', dealing with uninformed decision making and the illusion of efficiency, commonly found in control-based companies. Allan Kelly gave a detailed and informative response, including the statement "The fundamental problem facing developers is Complexity. Attempting reuse simply increases complexity."

This got me thinking, and I remembered a long series of interesting experiences I had regarding attempts to scale up development teams, and the very mixed results I got. These days most people know about 'The Mythical Man-Month', but I had experienced many occasions where I needed to argue this point with people who did not. I'm embarrassed to say I was one such person for a while, and needed to learn the hard way myself. One advantage of learning the hard way is that you are forced to try to explain what happened, to yourself and others. So I thought about it, and noticed a number of interesting correlations:
  • Adding people to projects works to some extent for very small teams, but can quickly get out of hand. It seems as if a threshold is crossed after which the project becomes a death march.
  • Assigning tasks of varying levels of complexity to a specific developer has a remarkably similar threshold. Their performance is consistent until they pass some level of complexity, after which the task goes _pear shaped_.
  • New developers on old code can perform well if the code is relatively simple, and perform extremely badly if it is not. Most critically, code that is twice as complex does not simply halve the performance. The effects can be much worse. Again there seems to be a threshold.
How do we find, measure or manage this threshold? It is not simple, but there are a few factors to look at to help us analyse the situation. These involve the characteristics of the developer, the team and the project:
  • Each developer has their own threshold, after which the complexity becomes hard to manage. Obviously this threshold is not fixed, and depends on many factors, including their state of mind. Dan Pink's presentation demonstrates an interesting effect where increasing extrinsic rewards decreases performance on creative tasks. He says the narrowed focus that rewards produce can hamper problem solving. I agree, and see a correlation, since decreased problem solving skills mean the developer will have a much lower _complexity threshold_.
  • The team has a threshold, very strongly correlated to the level of communication in the team, and most of the Agile software development approaches have a lot to say about how to deal with this problem. Usually they try to increase communication and couple that to catching problems early.
  • The project can obviously range from simple to very complex. Many approaches to dealing with this complexity split the project into modules and build a team on each, but in many cases this merely replaces one type of complexity with another. The more modern solutions to this involve prototyping, refactoring, TDD and BDD.
In projects that I manage, I believe we need to deal with all three areas. However, I have a personal bias towards focusing on the first, largely because I'm particularly interested in individual behaviours. I think this also helps me deal with the second issue, because if you have a feeling for the individuals, you can use that to help improve communication in the team. On the third point, I have a solid track record of prototyping and refactoring. I am also a believer in TDD and BDD, but must confess I could do with some improvement in those areas.

Since this blog is merely an introduction to this idea, I will end off with a short piece on how I think we can help individuals keep on the _good side_ of their personal complexity thresholds. I have normally focused on two approaches, but Dan Pink has shown me a really nice new take on this. Let's start with my classic view:
  • Identify the threshold and avoid it. OK, this sounds like a cop-out, but what I'm talking about is assigning tasks to team members based on their natural skills, and so reducing the risk of crossing the threshold. Of course, one of the best ways to do this is to get the developers to pick the tasks themselves, as is common in several Agile approaches, like Scrum. This is a good start, but does not always work, because while some developers pick the easy tasks and perform well, others are suckers for a challenge. So sometimes you need to advise them. An external opinion can make all the difference. If you are the developer yourself, learning to identify when you're in trouble is a valuable skill, but you will also get help to do so in an Agile team, especially if you use pair programming.
  • Adjust the developer's threshold. OK, this one sounds hard. It is not really, when you realize that this might be as simple as additional training. But one area that really matters is helping the developers identify the thresholds themselves and respond to them. It can be very hard, when stuck in a complex problem, to see outside the box. Again, working in an Agile team really helps here. Getting a second opinion on something, even if you might not realize you need it, can make all the difference.
Dan Pink's presentation really builds on this second option. Instead of trying to improve performance with classic rewards, like bonuses, we focus instead on intrinsic motivators. Dan describes three:
  • Autonomy - the freedom to define your own life. In my world that can mean something as simple as allowing the developers to pick their own tasks. I also see this in respecting their opinions in design meetings. In fact anything that tries to break the control-based management I described in 'The perception of control' is a good start.
  • Mastery - the desire to get better and better. Obviously this can mean training, but coupled to the first point it can mean the option to develop one's career in the direction that one has a passion for. For more on the use of the word 'passion' in this context, I advise reading 'The Hacker Ethic and the Spirit of the Information Age'.
  • Purpose - the yearning to do what we do in the service of something larger than ourselves. OK, being a fan of open source, this sounds great. Of course it is both more subtle and more applicable than that. People want to do something good, or something of note. In a software project this can be as simple as assigning key features to developers. Someone constantly getting the boring stuff is bound to perform worse.
I found Dan's presentation very interesting. But is this really related to the problem of complexity? I believe it is. If low performance can be attributed to crossing the complexity threshold, then these techniques can be used to move that threshold.

Dan finishes with an introduction to ROWE - the 'results only work environment'. I've been playing with a few ideas in this area, and I'd love to report on them, but need to do that in another blog :-)