Pentaho problems

I am finding myself using a lot of Pentaho Data Integration at the moment.

It’s a good, powerful tool, but my god does it have some annoyances.

It’s a drag and drop tool that allows you to process massive amounts of data in parallel, without needing to be an almighty data analyst already. This means that you can bring up the configuration windows for each data processing step you’re working with at the same time, so you can check you’ve named all your variables correctly, and so on.

It has a help system built in, which pops up a window containing the wiki page for the step you’re working with. Except that the help window is modal. The only modal window in the whole application is the one which gives you a guide on what to type into which box, or which contains examples and values that you might want to copy/paste into your step. Except you can’t. Because modal.

As you run your data process, Pentaho marks each step as in progress, or successful. Except that if you have your process divided up into multiple data transformations then you can only check the status correctly if you close all but the first transformation in the process, run it, and then re-open the sub-transformations from there. Baffling.

When your transformations are running you get a nice real-time log of what’s happening at the bottom of your screen, which you can scroll through. Except that every time a new line is added to the log, the view jumps back to the bottom. Good luck finding the log message you were looking at before!

More complaining into the void next time! Hope you’re looking forward to it as much as I am!

Downloading a Udemy course

I have bought Go: The Complete Developer’s Guide on Udemy. It’s been a good intro so far, but there are more than 90 videos, and I’d quite like to do some each day at work, where streaming video isn’t always allowed.

The author of this course has allowed his videos to be downloaded, but that’s on a video-by-video basis. I’d like to grab all of them.

udemy-dl has me covered, downloading all the videos in mp4 format and grabbing the subtitles too. This is almost certainly against the Udemy terms and conditions, but in practice will be very useful. Who knows, maybe I’ll even end up buying more courses!

Writing for the future

About a year after my son was born, when my memory started working again, I started writing a blog about the things he was getting up to.

I started it on 19 September 2010, 2718 days ago. In that time I have made 733 posts, roughly one every four days. Obviously there are peaks and troughs, but that’s a nice average to have. Enough time can pass between each one for something new, nice or surprising to happen and warrant recording.

I don’t use pictures, only words, because I’m keen that moving blogging platforms or the vagaries of image resizing don’t destroy it over time. In 18 months my son will be ten, and I think that’ll probably be enough posts to get it printed out and bound. I’ve used many times with great success, but there are other services which do WordPress-specific imports so hopefully I’ll find something which lets me do it nice and easily as well as making something that looks good, and lets me preserve my digital record well past the ability of any digital records management.

Playing games

Growing up I played the board games that you might expect a kid growing up in the 80s to play: Scrabble, Monopoly, Frustration, Cluedo and so on. Although I mostly enjoyed them, they were all tedious in their own ways. The more interesting the game, the longer it took, and the more “adult” it was seen to be, so it was either out of reach of my younger sibling or took too long to play with my parents. Today is very different.

Sites like mean that it’s possible to find games for my kids which don’t take too long to play and are also accessible for their ages, meaning they’re much more fun!

As well as some of the classics like “Guess Who” and “Connect 4” we’ve acquired Kingdomino, Castle Panic and Labyrinth, all of which are good fun.

On our horizon I can definitely see Catan Junior and Ticket to Ride: First Journey (Europe). Hopefully my kids will be able to look back on the board games they played with pleasure rather than mild horror, and will be able to play better, more interesting games as they grow up.

Output a timestamp with each line in a Maven log

Maven is a powerful build tool for Java and it tends to spit out a large amount of log output, requiring you to scroll back in your output window or console to look at what’s happening. If you’re running it regularly, for example whilst building tests, then it’s easy to scroll back slightly too far and look at the results from a previous run by accident.

An easy way to avoid this is to configure Maven to output a timestamp on each log line. Just open up your MAVEN_HOME/conf/logging/ and change the dateTimeFormat like this:
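
As a rough sketch, assuming a Maven 3 installation that uses the bundled SLF4J simple logger (whose settings normally live in a file called simplelogger.properties in that directory), the change might look something like the following. The pattern itself is only an illustration; any java.text.SimpleDateFormat pattern should work.

```properties
# Assumed file: simplelogger.properties in the Maven logging config directory.
# Turn timestamps on (Maven ships with this set to false).
org.slf4j.simpleLogger.showDateTime=true
# Example pattern only: date, then time down to the millisecond.
org.slf4j.simpleLogger.dateTimeFormat=yyyy-MM-dd HH:mm:ss.SSS
```

If showDateTime is left at false, the dateTimeFormat setting has no visible effect, so both lines are needed.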


Not only will this make it easier to spot if you’re looking at the correct log lines but you’ll also be better able to see how long each stage is taking (although for real measurements here you’ll want a profiler).


I do not like debugging. I prefer good logging.

The log4j manual quotes Brian W. Kernighan and Rob Pike from their “truly excellent book” The Practice of Programming:

As personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places.

Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient.

There are times when a debugger can be really helpful, but in my experience they are normally used as a fallback for a poorly documented system with an unclear flow of logic, or overly large methods with poor test coverage.

Internet rabbit rebuild – step 1

Back in Christmas 2006, I was lucky enough to get everything I asked for, and one of those things was one of the first commercially available Internet of Things devices – a Nabaztag.

A picture of my Nabaztag with other Christmas presents

This was a beautifully moulded piece of plastic designed to look like a rabbit. It connected to your wifi and triggers could control its LEDs, its individually rotating ears or play sound through the speaker. I had great fun with it but, in what would be a salutary lesson, the fact that it was proprietary hardware, talking over a proprietary protocol to a proprietary server was soon a problem: not only was my model made obsolete by newer models, but then the company went bust. Suddenly, I had a great-looking paperweight.

Some keen nerds reverse-engineered the protocol and wrote their own servers (like NabAlive, NabaztagLives and OpenJabNab, and there are a whole host of libraries listed here), but they’re not all straightforward to set up and there isn’t as much support for the first version of the Nabaztag.

All this means that mine has been in the loft for the best part of a decade, but commodity hardware is now affordable enough, and low-effort enough (no soldering for me!) that I thought I’d try and bring my internet bunny back to life, in particular after being inspired by Roy Tanck’s attempt at doing the same thing by replacing the insides with a Raspberry Pi.

Step 1 was to take it apart. There are some triangular screws on the bottom which came out pretty easily using one end of some needle-nosed pliers and then the rest is standard Phillips-head. It’s impressive to see how far electronics manufacturing has come in the last decade – the wireless in the original rabbit was provided by a full-sized PCMCIA card!

A picture of the front of the Nabaztag's main PCB

Once the case was off, I removed the electronics and motors from a central plastic frame, and my next step is going to be to prototype replacing them using a Raspberry Pi Zero W with a Blinkt attachment.