
Hello Ubuntu, so long Microsoft (forever!)

Friday, August 1st, 2008

I recently installed Ubuntu on my home laptop (and subsequently on my work laptop also, but that’s a different story). The primary motivation for this was to get Vista off my home laptop (a Dell Vostro 1700). So why ditch Vista?

Well for starters, I never wanted it in the first place. When I went to buy a laptop last December, the best deal by far was for the Vostro, but there was no option to get it with XP; it had to come with Vista (damn you Dell!). Secondly, Vista really is as crap as everyone says. Here’s a summary of the problems I had with it:

  • blue screens: I thought blue screens were a thing of the past, but no, they happened about once a month or so, for no apparent reason.
  • explorer really unstable: this happened quite a bit, explorer just bombs out and restarts itself, quite annoying as you lose all your open explorer windows, etc.
  • no improvements: ok fair enough I thought, the first version is bound to have some teething problems, I’m sure a lot of things will be fixed in SP1. Nope, SP1 came and not a single noticeable difference. I have zero confidence that it will ever improve, and after my Vista experience zero confidence that Microsoft will ever do anything innovative on the desktop either. XP will be around forever.
  • slow, slow, slow: I bought a pretty kick ass machine (dual 2.4GHz processors, 4GB memory) and I was quite disappointed with its performance under Vista, it just felt sluggish.

Now that I’m up and running and have fully migrated all my apps/work over to Ubuntu, how has it worked out? Quite frankly, it couldn’t have worked out better! I did have some installation issues, which I’ll blog about again, nothing major, just a few things that cost me some time that really shouldn’t have. I should also point out that I have used Linux on the desktop in the past and had to take a 7 year break (as the last company I worked for developed software explicitly for Windows), so I’m not a complete Linux novice (although close enough at times!)

The most notable difference is a general increase in productivity, which is what I had hoped would happen. The Vostro performs so much better under Ubuntu, it’s sharp, it’s snappy. It’s also incredibly stable, and I’m finally feeling happy about the machine and seeing some bang for my buck in buying a slightly higher spec machine.

I should point out what tools I use to do my work and their equivalents on Ubuntu (as there isn’t that much difference between the two operating systems from a tools perspective):

  • Emacs, Eclipse and Firefox: I spend most of my days flicking between these 3, and all work on Ubuntu exactly as they did on Windows. My personal email is Gmail/Google Apps so all browser based.
  • lots of command shells: Gnome Terminal replaces both Putty on Windows (which is a great tool but sucks when you spend a lot of time in it) and the default Windows command prompt. It’s so superior to both it’s not funny. This coupled with the workspace switcher is where my biggest gain in productivity has come from.
  • the odd presentation & word/excel documents: Open Office has done the trick here so far. Just about though, some really screwed up Word formatted docs just don’t appear right. This hasn’t been a real issue so far, but it’s for this reason that I still have a Windows partition to boot to on the off chance that Open Office can’t handle some critical Word document at some point in the future.
  • on a slightly negative note, the one tool I’ve yet to adequately replace on Ubuntu is the Tortoise Windows shell for Subversion. I’ve tried kdesvn and rapidsvn but both fall well short of the simplicity of Tortoise (who knew Tortoise was a killer app, eh?). I find myself using Subversion from the command line on Ubuntu more and more (which feels like a step backwards). I use Subversion a lot (both at work and for personal projects, and I also use it as a backup of sorts for stuff I don’t want to lose) so I need to spend a bit of time figuring out what works best for me now on Ubuntu.

So now that I’ve moved, and am never going back, how do I get my ‘Vista tax’ back from Dell? (And Dell, if you’re listening, note that I’ll never buy a machine from you again that’s preconfigured with any flavour of Windows!)

Ubuntu Poster
(Creative commons, courtesy of Hannes Pasqualini)

The Economist kiss of death..

Thursday, July 17th, 2008

Interesting that only 5 days after the Economist gave her a glowing profile, Diane Greene of VMware got the boot, bad timing for all concerned. I also remember they did a glowing report on Enron (which I just dug up here) a few months before it went bankrupt, again bad timing for all concerned!

However, this story about Ahmad Batebi in last week’s edition is both tragic and amazing: having your picture on the front cover of the Economist could land you in so much shite. Will be keeping an eye out for translations of his blog, which may well be what is under construction here.

I know we have (and have had) our own troubles in this country but I think most of us would take recession over repression any day..

Light blogging of late..

Thursday, July 3rd, 2008

Just realised it’s been a month since my last confession (sorry, blog post) - in no particular order, this is what I’ve been up to of late:

Golf - it’s that magical 2 month window in the year when you can get 11 holes in teeing off at 7pm. The golf club in Tramore has recently opened up the long awaited 9 new holes, and they really are fantastic - it’s like having a brand new course in your old course (or something like that). Hasn’t helped my game unfortunately, the new holes are a bit hard!

Euro2008 - ’nuff said. Roll on the Olympics.

The new job - just over two months now in the new job and really enjoying it. It’s real startup stuff, every deadline is critical, securing the first big customer is critical, etc, but quite exciting and challenging to boot. I’ve mainly been looking at scalability, so lots of multi-threading, performance analysis, fixing bottlenecks, etc. Good fun so far. Here’s a video of the team on a recent night out - thanks Paul!


FeedHenry Team Dinner from Paul Watson on Vimeo.

YSlow and front end performance improvements..

Wednesday, June 4th, 2008

Only came across YSlow recently, it’s a neat little Firefox plugin developed by Yahoo! for analysing website performance. It provides help/advice for each ‘performance grade’ and the accompanying background articles make a very interesting read.

Obviously not all the advice applies to every site (we’re not all Yahoo) and perhaps some of the advice is of more benefit to huge sites (like having a CDN, again, we’re not all Yahoo). For me, the top three are:
1) Add an Expires Header
2) Configure ETags
3) GZip

However, with regards to Expires and ETags, you really need to know what you’re doing or it can be a disaster! IMO you shouldn’t really use Expires unless you’re using versioned content (e.g. instead of serving up http://…file.css serve http://…file-1.0.css) and can set the Expires date in the far far future (to never expire effectively). Then when your content changes on the server side, you change your version and a completely fresh copy is loaded into the cache, to expire again in the far far future. Non-trivial to implement. Setting Expires to be hours/days in the future is just guess work and should be avoided.
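As a rough sketch of the versioned-content idea (Python, nothing tied to a particular web server): the helper names `versioned_name` and `far_future_expires`, the `name-version.ext` naming scheme and the 10 year lifetime are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def versioned_name(filename, version):
    """Turn 'file.css' + '1.0' into 'file-1.0.css' (hypothetical naming scheme)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No extension at all: just append the version.
        return f"{filename}-{version}"
    return f"{stem}-{version}.{ext}"


def far_future_expires(now=None, years=10):
    """Return an HTTP-date roughly `years` years ahead, for the Expires header."""
    now = now or datetime.now(timezone.utc)
    # usegmt=True renders the zone as 'GMT', as HTTP dates require.
    return format_datetime(now + timedelta(days=365 * years), usegmt=True)


# When the stylesheet changes, bump the version instead of touching Expires.
print(versioned_name("file.css", "1.0"))
print("Expires:", far_future_expires())
```

The point of the design is that the URL, not the header, carries the freshness information: a changed file gets a new name, so nothing stale can be served from a cache.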

You also shouldn’t use ETags at the webserver level unless you’re sure it’s safe (as the default values are derived from file-level attributes that can differ from machine to machine, which causes problems on server farms). To effectively use ETags, the value of the ETag in the request header needs to mean something to you when you’re processing it on the server, e.g. a ‘last modified’ timestamp from the database or something. If you’re sure the content hasn’t changed, return a 304 response, otherwise return 200 and the full content.
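A minimal sketch of that check, assuming the ETag is simply the ‘last modified’ timestamp as a quoted string. The functions `make_etag` and `conditional_response` are hypothetical, not part of any framework, and a real handler would read the value from the `If-None-Match` request header.

```python
from datetime import datetime, timezone


def make_etag(last_modified):
    """Derive an ETag from a 'last modified' timestamp (e.g. a database column)."""
    return '"%d"' % int(last_modified.timestamp())


def conditional_response(if_none_match, last_modified, render_body):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(last_modified)
    if if_none_match == etag:
        # The client's copy is still current: no body, no rendering work.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, render_body()


modified = datetime(2008, 6, 4, 12, 0, tzinfo=timezone.utc)
status, headers, body = conditional_response(None, modified, lambda: b"full page")
print(status, headers)  # first visit: full content plus an ETag to remember
status, headers, body = conditional_response(headers["ETag"], modified, lambda: b"full page")
print(status)           # repeat visit with an unchanged timestamp
```

Because the tag comes from application data rather than from the file system, every server in a farm computes the same value for the same content.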

So tread carefully with Expires and ETags, but when they work, they do work very very well! GZip is a no-brainer, it’s safe, tried and trusted but still needs to be explicitly turned on in the majority of cases. Kudos to Yahoo though, YSlow is an excellent tool, just make sure you fully understand the advice being offered!

Screen scraping Google Blog Search for a large list of feeds..

Tuesday, May 20th, 2008

I needed to get a very large list (thousands) of feeds for testing our feed parser robustness, i.e. we want to be able to handle any feed, no matter in what format (or lack of format) the feed is in, etc.

Turns out getting a very large list is not that straightforward. Sure, you have ‘top 100’ lists, and lots of directory sites that categorise lots of feeds, but no one big list of a few thousand active blogs/news feeds! Google does have an ‘ajax feed search’ API but it’s server based and more importantly I wasn’t entirely sure whether I’d be in breach of their T&Cs, so better safe than sorry..

So, I took to doing some simple screen scraping. I chose Google Blog Search to gather the feed URLs from, as it returns feed links in a single page when you search for a term.

The scraping process is quite simple:

- enter a search term into Google Blog Search

- get the 10 URL links returned (all conveniently flagged with class=f1, you can tell this from Firebug or just view source)

- hit the next button to get the next 10 results

- repeat

One problem though, Google detects that you’re an automated script (which is correct!) and stops returning results if you misbehave. To get around this:

- use random sleeps between searches and ‘clicking’ next (I was in no hurry for this script to run so execution time didn’t matter, so I left very long pauses)

- make sure that the HTTP headers sent from your script match the headers that are sent from your browser when you search manually (in my case Firefox on Windows). Wireshark is good for this sort of stuff, makes it very easy to examine the HTTP headers.

The Ruby script itself (thanks to Hpricot - fantastic library) is very simple. I stopped it running when it had gathered me a list of ~25,000 unique feeds or so. Now, to start the real testing…
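The original script was Ruby with Hpricot and isn’t reproduced here. As a rough Python sketch of the same three ideas (pull out the links flagged with class f1, pause randomly, send browser-like headers), something like the following would do. It assumes the class sits on the link tag itself; the header values are placeholders to be copied from your own browser, and `extract_links`, `polite_pause` and `build_request` are made-up names.

```python
import random
import time
import urllib.request
from html.parser import HTMLParser

# Placeholder values: copy the real ones from your browser (e.g. via Wireshark).
BROWSER_HEADERS = {
    "User-Agent": "<copy from your browser>",
    "Accept": "<copy from your browser>",
    "Accept-Language": "<copy from your browser>",
}


class FeedLinkParser(HTMLParser):
    """Collects href values from <a> tags carrying a given CSS class."""

    def __init__(self, css_class="f1"):
        super().__init__()
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and self.css_class in classes and attr_map.get("href"):
            self.links.append(attr_map["href"])


def extract_links(html, css_class="f1"):
    """Return the URLs of all links on the page flagged with css_class."""
    parser = FeedLinkParser(css_class)
    parser.feed(html)
    return parser.links


def polite_pause(min_seconds=30.0, max_seconds=120.0, sleep=time.sleep):
    """Sleep for a random interval so requests don't arrive like clockwork."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


def build_request(url):
    """Prepare (but do not send) a request carrying the browser-like headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


page = '<a class="f1" href="http://example.com/feed.xml">feed</a> <a href="/next">Next</a>'
print(extract_links(page))
```

Collecting the results in a set across pages and searches is what gives a list of unique feeds at the end.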

New job..

Friday, April 25th, 2008

Good news, I have a new job - joined FeedHenry this week as their technical architect. FeedHenry is part of the TSSG organisation here, and will likely be spun out as a separate company in time. I’m working a 4 day week for the moment, so Tax123.ie is now a part time gig.

FeedHenry is a widget service delivery platform, and is a culmination of several innovative research projects carried out in TSSG. IMO it is far superior to other competing platforms, mainly because it moves widgets beyond the ‘toy’ status. How it does this I can’t say as yet; it’s not quite live at the moment, but I should be able to blog a bit more about it in a month or so.

So I’m very impressed after my first week, the team is great, the technology is very interesting, it’s got amazing potential to be incredibly successful and I’m really looking forward to getting stuck in over the next few weeks! (And it’s a good job we all studied ‘widgets’ so much in school/college, it’s great that everyone knows what a widget is… right? ;-)

Guinness floating widget

Image via Wikipedia

Google App Engine, now that the dust has settled..

Tuesday, April 15th, 2008

Having read quite a bit of commentary on Google App Engine (GAE), launched last week, in general the reaction has been quite positive, and I’m sure GAE will lower the bar for many developers looking for a cheap solution for storage and a potential solution to their scalability needs. What is striking, however, is the lack of concern over a few fundamental issues:

1) Lock in: if you go down the GAE route, you are totally locked in to staying with Google for the lifetime of your application. You have to develop using their choice of language and their APIs. If you want to leave, you can’t exactly take Google’s proprietary BigTable and GFS technology with you, so where exactly are you going to find a new home for your system? It’s interesting that new technologies like AppDrop have already started to appear to address this issue (AppDrop just addresses portability however, it doesn’t provide an alternative scaling mechanism to BigTable, but I’m sure that will come..)

You’d have the same problem if you go with Amazon, although not to the same degree: EC2 is open, the new persistent storage should also be portable, but if you use S3 for storage, SimpleDB, or SQS, you can’t exactly take them with you either (admittedly you can abstract the APIs for these services, but the alternative implementations may not match the desired functionality/scalability requirements exactly). In short, until there are Open Standards for the Clouds, tread carefully unless you don’t care about being at the mercy of a 6 ton gorilla! In this day and age, I can’t see how any business would lock themselves into a proprietary solution only available from one vendor. Didn’t ‘Open’ win some time ago?
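To make the ‘abstract the APIs’ point a bit more concrete, here’s a minimal sketch in Python. Everything in it is hypothetical (`KeyValueStore`, `InMemoryStore`, `save_profile`); a real S3 or BigTable backed class would implement the same two methods, and as noted above, nothing guarantees that two backends behave or scale alike.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """The only storage surface the application is allowed to see."""

    def put(self, key: str, value: bytes) -> None: ...

    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryStore:
    """Stand-in backend; a vendor-specific class would have the same methods."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._items.get(key)


def save_profile(store: KeyValueStore, user_id: str, profile: bytes) -> str:
    """Application code depends on the interface, never on a vendor SDK."""
    key = f"profiles/{user_id}"
    store.put(key, profile)
    return key


store = InMemoryStore()
key = save_profile(store, "42", b'{"name": "test user"}')
print(key, store.get(key))
```

Moving vendors then means writing one new class rather than touching every call site, though it does nothing about differences in consistency, latency or cost between the backends.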

2) Scalability: GAE primarily addresses the eternal detractor’s question of ‘does it scale?’. I can imagine many naive developers and VCs alike breathing a sigh of relief when the answer is ‘we run on GAE’. Idiots. There is no one size fits all solution for scaling any software system. BigTable and GFS are a solution for specific scalability requirements (most notably search!). You are effectively locked in to Google’s scaling solution, regardless of how appropriate it is for your application, with no leverage for even testing alternative scaling approaches (as your development is so tied into Google’s language/APIs). Scalability is complex, usually involving both hardware and software, but it’s solvable on an application by application basis - once you’ve answered the ‘to what degree doesn’t it scale?’ question yourself (nothing scales infinitely without cost), there are well known techniques/patterns to address most scalability requirements. So, unless you’re absolutely sure that BigTable and GFS suit your application’s scalability requirements, why would you use GAE?

3) Pigeon-holed hosting: the odd thing about GAE is that you can only develop applications from scratch on it. You can’t use many of the open source tools and technologies that can play an important part in your overall application/service. Want to use an open source CMS, like Drupal, for some basic content management on GAE? Nope. Want to use Wordpress as the blogging tool on your site? Nope. Want to use a non-Python open source library for PDF processing for example? Nope, just not an option. So, your overall application is restricted to whatever open source applications/libraries run in GAE’s sandbox, of which there aren’t many (how many useful open source Python projects don’t use the full standard library?). At the end of the day, I’m just more comfortable with a shell to a real server on which I can install anything that helps our business, regardless of what it was developed in. (None of this is an issue with Amazon obviously)

So in summary, unless you know all your requirements upfront (waterfall development model anyone?) think long and hard about committing to GAE.

Google App Engine and misc..

Tuesday, April 8th, 2008

After initial speculation that Google’s ‘storage’ announcement yesterday would be a ‘BigTable’ web service, it looks like they’ve changed the game in a much bigger fashion again with the release of Google App Engine. Huge win for Python and the Django framework in particular. It’s supposed to be language neutral, but I wonder if that includes Java.. Java would be the biggest win for Google, not only is it the natural backend for GWT client applications (GWT is a Java development environment, even though it generates JavaScript behind the scenes) but it’s the key to getting applications out of the enterprise and hosted on Google. The pricing will also be interesting, no announcements on that yet..

Also, on a completely different topic, I was tracking down an odd timestamp issue yesterday where timestamps were still on winter time. Given that our server is based in the US (we use Joyent) I assumed that the servers hadn’t adjusted to summer time. However, the servers were running on GMT, which is the timezone that I thought we lived in, but I was wrong - we live in IST (Irish Standard Time) - full explanation on wikipedia here. Google will tell you the GMT time if you type ‘time GMT’ into the search box (compare that to just typing ‘time’, which returns IST for me). File under ‘you learn something new everyday’.
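For what it’s worth, here’s a small Python sketch of the difference. It hard-codes the current EU rule (clocks go forward at 01:00 UTC on the last Sunday of March and back at 01:00 UTC on the last Sunday of October) rather than using a timezone database, so treat it as an illustration only. The function names are made up.

```python
from datetime import datetime, timedelta, timezone


def last_sunday_0100_utc(year, month):
    """01:00 UTC on the last Sunday of a 31-day month (March or October)."""
    last_day = datetime(year, month, 31, 1, 0, tzinfo=timezone.utc)
    # weekday(): Monday is 0 and Sunday is 6, so step back to the Sunday.
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


def irish_utc_offset(moment_utc):
    """Offset of Irish local time from UTC/GMT at the given UTC moment."""
    start = last_sunday_0100_utc(moment_utc.year, 3)
    end = last_sunday_0100_utc(moment_utc.year, 10)
    return timedelta(hours=1) if start <= moment_utc < end else timedelta(0)


def to_irish_local(moment_utc):
    """Convert a timezone-aware UTC datetime to Irish local time."""
    return moment_utc.astimezone(timezone(irish_utc_offset(moment_utc)))


server_time = datetime(2008, 4, 7, 12, 0, tzinfo=timezone.utc)
print("server (GMT):", server_time.isoformat())
print("Ireland     :", to_irish_local(server_time).isoformat())
```

So a server sitting on GMT all year round looks an hour behind from Ireland between late March and late October, which is what the ‘winter time’ timestamps were.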

Waterford Open Coffee in Dunhill

Monday, March 31st, 2008

This month’s Waterford Open Coffee meeting has the air of a school tour about it - we’re meeting in Dunhill in the heart of rural Waterford this Friday, more details here.

Given that I grew up only a few miles from Dunhill (in neighbouring Annestown, famous for being the only village in Ireland without a pub! Although given the way every second pub these days is up for sale, this may not be the case for too much longer unfortunately), I’m really looking forward to getting a tour of the Enterprise Park and hearing about the plans in motion for phase 3 of its development (I’ve been to the center for a few very enjoyable and enlightening evening lectures in the past few months given by renowned local historian Julian Walton, but I’ve never seen the place in daylight).

The DBFA community (Dunhill, Fenor, Boatstrand, Annestown) really has done an incredible job over the last 15 years or more in developing the local area, and it’s a real credit to everyone involved (even earning a national award back in 2005). The Enterprise Center and the Copper Coast initiative are probably the biggest projects, but as you can see from the websites, there’s a lot going on there - fair play to Donal and team!

Here are some pictures on flickr of the area: Dunhill Castle, Annestown, Boatstrand, Bunmahon, Copper Coast, Stradbally.

Also, for anyone who’s curious about the Waterford Open Coffee club and is a bit hesitant to come along as you may not know anybody, get in touch with me beforehand (dberesford@gmail.com), I’ll answer any questions you may have and keep an eye out for you on the day (I know I personally hate walking into a room not knowing anyone!).

IMAP and misc..

Tuesday, March 25th, 2008

Blogging has been light of late, mainly due to being immersed in writing our own mini-CRM tool for Tax123.ie (why write your own CRM in this day and age I hear you ask? For very good reasons, mainly to do with achieving 100% integration with our internal processes, and also a significant long term cost reduction - I’ll blog about this some more at some point in the future). Anyhow, a core feature of our mini-CRM is complete integration with our email (hosted by Google Apps, which I have nothing but praise for also), so that any emails to customers can be kept in sync regardless of where they’re sent from (i.e. GMail or our mini-CRM).

Thankfully timing wise, Google Mail added IMAP support back in October. Also thankfully, IMAP does exactly what it says on the tin - I love it when APIs just work! (Although there were a few hairy bits ensuring that everything is UTF-8, the Polish charset is ISO-8859-2 and was causing a few problems before I got the encoding right). So, we’re away on a hack with our mini-CRM, roll on tax processing nirvana!
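The encoding fix boils down to decoding each message with the charset it declares before storing it as UTF-8. A minimal Python sketch of that idea follows; it is not the mini-CRM code, and `decode_mail_text`, `to_utf8` and the fallback order are assumptions for illustration.

```python
def decode_mail_text(raw, declared_charset=None, fallbacks=("utf-8", "iso-8859-2")):
    """Decode message bytes using the declared charset, then some fallbacks."""
    candidates = ([declared_charset] if declared_charset else []) + list(fallbacks)
    for charset in candidates:
        try:
            return raw.decode(charset)
        except (LookupError, UnicodeDecodeError):
            # Unknown charset name, or bytes that are invalid in that charset.
            continue
    # Last resort: keep what can be read and mark the rest.
    return raw.decode("utf-8", errors="replace")


def to_utf8(raw, declared_charset=None):
    """Normalise message bytes to UTF-8 for storage."""
    return decode_mail_text(raw, declared_charset).encode("utf-8")


polish = "Zażółć gęślą jaźń"
incoming = polish.encode("iso-8859-2")  # what a Polish mail client might send
print(decode_mail_text(incoming, "iso-8859-2") == polish)
print(to_utf8(incoming, "iso-8859-2"))
```

UTF-8 is tried before ISO-8859-2 in the fallbacks because ISO-8859-2 will happily decode any byte sequence, so it has to come last or it would mask everything else.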

Also, just read the Economist’s interesting take on social networks and the AOL/Bebo acquisition - some good points in there, but to me however, it is a business - it’s all about the brand! I’m not saying it justifies an $850 million price tag, but Bebo is a household brand name (in Ireland and the UK anyhow) and it very much keeps AOL in the game. Couple that with AOL’s other less newsworthy acquisitions and it looks like somebody, somewhere, in AOL has a plan - good luck to them, it will be interesting to watch!