I came across a bug in Python’s zipfile module yesterday that I had to fix today. The problem occurs when you create a ZipFile object and pass it a corrupt zip file. It doesn’t handle it gracefully by returning None or throwing an exception. Instead, it heads into an infinite loop.

This is rather unfortunate for me. How would I get around this problem? The first thing I did was check for an updated Python, and there was a minor version upgrade available. I found the changelog (why do they hide these things?) and noticed a few zipfile bugs had been resolved. So I installed it. Unfortunately, this didn’t solve my problem.

I managed to find a bug number in the Python bug tracker from people having similar problems. There was a patch, but it hadn’t landed. I downloaded the latest stable version, but the patch wouldn’t apply. So I had to do a cvs checkout of trunk and apply it there. Once installed, I tried it and it worked! Success.

However, it broke another library I was using (PyXML). Unfortunately for me, the recent trunk build didn’t seem to fare any better.

At this point, I wasn’t in the mood for debugging. I had a few options at my disposal:

  1. Ignore this particular file
  2. Suck it up and debug it
  3. Find a wacky work-around

Option 1 isn’t an option. Option 2 I tried for a fair while, but nothing worked. So Option 3 was my only option!

I tried using a lower-level library (zlib) to see if I could fix the problem, but that didn’t work well at all.

I finally thought I had no choice but to start a thread to try and unzip the xpi and, if it took longer than 10 seconds, kill the thread somehow. While seriously looking into this, and fighting the temptation to take tequila shots at work, I came across signals (which I thought I could send to the thread. I’m so naive). It turns out you can schedule an alarm for a specific number of seconds in the future, and when it expires your process receives SIGALRM. This was exactly what I needed without the extra complexity. The example provided was almost exactly what I did too! Here is my solution to the problem:

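		# signal_handler (defined elsewhere) is assumed to raise an exception,
		# which is what breaks ZipFile out of its infinite loop.
		signal.signal(signal.SIGALRM, signal_handler)
		signal.alarm(10)  # ask for SIGALRM in 10 seconds
		try:
			zippy = zipfile.ZipFile(io, 'r')
		except Exception:
			print("\tZipFile Timeout")
			continue  # this runs inside the loop over files
		finally:
			signal.alarm(0)  # cancel the alarm whether or not we timed out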
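
For anyone who wants to try the same trick outside my loop, here is a minimal, self-contained sketch of the idea. The names `ZipTimeout`, `_raise_timeout` and `open_zip_with_timeout` are made up for illustration and are not part of zipfile or signal. It assumes a Unix-like system (SIGALRM doesn’t exist on Windows), a reasonably recent Python 3, and that it runs in the main thread, since that is the only place signal handlers can be installed.

```python
import io
import signal
import zipfile


class ZipTimeout(Exception):
    """Hypothetical exception raised when opening an archive takes too long."""


def _raise_timeout(signum, frame):
    # Raising here interrupts whatever the main thread is currently doing.
    raise ZipTimeout("ZipFile took too long")


def open_zip_with_timeout(source, seconds=10):
    """Return a ZipFile, or None if the archive is bad or opening times out."""
    previous = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)  # request SIGALRM after `seconds` seconds
    try:
        return zipfile.ZipFile(source, "r")
    except (ZipTimeout, zipfile.BadZipFile):
        return None
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


# Bytes that are not a zip archive are rejected instead of raising.
result = open_zip_with_timeout(io.BytesIO(b"not a zip archive"))
```

Restoring the previous handler and cancelling the alarm in `finally` keeps a stray SIGALRM from firing later in unrelated code, which is the main thing that can go wrong with this approach.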

Maybe python isn’t just for programming sissies after all.