Linux kernel developer Theodore “Ted” Ts’o has found what he has dubbed a “Lance Armstrong” inside the code.
The name is a reference to the US cyclist who for years avoided being caught doping. In Linux terms, it means code that never fails a test even though evidence shows it is not behaving as it should.
According to Ts’o, it all started when a user reported a problem that caused them to lose data.
The kernel developers found a fault in the ext4 implementation that was introduced with the release of Linux 3.6.2, just over a week ago.
But the data corruption bug was hard to track down because it only happens when a system is rebooted twice in a short period of time and when the file system’s starting block is zero.
The kernel apparently truncates the journal when the file system is unmounted so quickly that the journal does not get a chance to be written completely.
When this happens once, the ext4 driver can recover the journal at the next boot, which does not lead to any ill effects. But if it happens twice, data from the newer mount session gets written to the journal before data from the older one, which leads to a bloody mess.
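The ordering problem described above can be illustrated with a toy model. This is only a sketch and not ext4 or jbd2 code: the `replay` function, the record layout and the block number are all made up for illustration. It simply shows why replaying an older session's records after a newer session's leaves stale data on disk.

```python
# Toy model only: real ext4/jbd2 journaling works very differently.
# Each journal record is (session, block, value); all names are hypothetical.

def replay(journal, disk):
    """Apply journal records to a copy of the disk in stored order."""
    result = dict(disk)
    for _session, block, value in journal:
        result[block] = value  # later records overwrite earlier ones
    return result

disk = {7: "original"}
older = [(1, 7, "written in session 1")]
newer = [(2, 7, "written in session 2")]

# Correct order: older session first, so the newest data wins.
expected = replay(older + newer, disk)

# Faulty order: newer session's records come first, so stale data wins.
corrupted = replay(newer + older, disk)

print(expected[7])
print(corrupted[7])
```

In the faulty case the block ends up holding the value from the older session, which is the kind of silent corruption that a simple pass/fail test would not necessarily catch.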
So users who shut down and boot their laptops instead of hibernating them were likely to be hit by this bug more often.
Ts’o, who works for Google, was a little embarrassed that he did not spot the bug earlier and has released a patch. That patch did not appear to work, so Ts’o has now published a second, follow-up patch.
But the mess is not completely over – the code was also backported to the stable 3.5.x branch of the kernel. It is not yet known when a fix will land there, though Linux developers are apparently working on one.