Bespoke OS blip caused chaos in the air

A pilot proved that a human touch is always needed after a Qantas Airbus’s computer system suffered a major blip – monitors were littered with error messages while it failed to detect exactly what was happening.

The Airbus A380 is designed to be flown by computer with minimal human interaction, but the expertise of pilots was needed following an engine explosion on a flight earlier this month. Their cockpit displays filled with alerts, which were replaced by screen after screen of warnings that had to be clicked through individually.

Aviation computing experts told TechEye that the multiple failure indications were caused by wires being cut by flying debris from the broken engine. They added that some kind of monitoring of the wires themselves would have been needed in order to prevent this from happening, something the OS didn’t have.

The pilots of the plane received a total of 54 alerts on all eight of the Electronic Centralised Aircraft Monitor displays, which are designed to help pilots deal with emergencies.

One warned that a ram air turbine – a backup power supply – was about to deploy, although this didn’t happen.

The danger is that a system giving too much, and at times false, information after an incident leaves pilots spending their time dealing with a deluge of error messages. Thankfully for the passengers there were five experienced pilots on board the flight, but they had to spend 50 minutes clicking through warnings, prioritising the most serious.

We wanted to know which bespoke operating system Qantas and Airbus used as standard. It’s a simple question, you’d think, but both were reluctant to talk. Qantas refused to say anything at all while Airbus told us: “First of all, you need to know that all our on-board systems are tested to be certified. None of the aircraft flying controls are based on COTS (Commercial Off The Shelf) applications.”

The European Aviation Safety Agency wouldn’t talk either. The topic’s ‘private’ because the software choices are for the airlines to make. The CCA were a bit more helpful in saying: “We assume it’s a bespoke system, carried and updated across from older models, but we can’t say which one it is. We don’t know.”

We were given a nod at last. The bespoke OS used is Green Hills Software’s INTEGRITY-178B operating system, which is also used in military vehicles. A representative at Green Hills confirmed this: “We provide the INTEGRITY-178B operating system to our customers for the Airbus A380.”

“We supplied Honeywell, who provide the flight controls for the aircraft, with our software. Our software is the OS for the craft, it does what other OSs in PCs do. For example other companies provide the applications, and we’d be the Microsoft equivalent – the spanners and hammer in the software part of the plane.”

Because the OS is a real-time system, Green Hills is the main provider.

Francesco Cesarini, founder and chief strategy officer at Erlang Solutions, told TechEye: “Green Hills Separation Kernel offers security at a level that is simply not achievable with mainstream OSes. It should also be said that INTEGRITY-178B is a Real-Time Operating System. This means that it was designed to behave predictably in situations where it is absolutely vital that the system responds correctly and in a timely manner.”

He shed some light on Green Hills’ claims that its OS wasn’t entirely at fault for the failures: “As a general rule, it is not possible to exhaustively test sufficiently complex systems in order to completely rule out failures. Also, in order to verify a system exhaustively, you have to foresee every combination of events that it will be exposed to. In the case of an operating system, this includes all possible ways in which an application running on top of it may misbehave.”

“The software at fault is hardly the operating system itself, but probably more likely the fault management system – not just the software, but the constructs in place to deliver sensible indications to the software.”

“When monitoring a system, the software itself often cannot tell the difference between a dead subsystem and one to which the wires have been cut. In order for it to understand that the multiple failure indications were in fact caused by wires being cut by flying debris from the broken engine, some kind of monitoring of the wires themselves would have been needed, and a dependency model stating that broken wires would lead to those failure indications.”
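To make Cesarini's point concrete, here is a minimal sketch of the kind of dependency model he describes. It is purely illustrative and has nothing to do with the actual A380 software: the subsystem names, the harness IDs and the `classify` function are all invented for the example. The idea is simply that if the system knows which wire carries each subsystem's signal, and has some way of knowing a wire is broken, it can report one cause rather than many separate failures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str  # subsystem being monitored (hypothetical name)
    wire: str  # wire harness carrying its signal (hypothetical ID)


def classify(sensors, silent, broken_wires):
    """Group silent subsystems by likely root cause.

    sensors:      list of Sensor records (the dependency model)
    silent:       set of subsystem names that have stopped reporting
    broken_wires: set of harness IDs flagged by wire monitoring
    """
    report = {"wire_fault": {}, "subsystem_fault": []}
    for s in sensors:
        if s.name not in silent:
            continue  # still reporting, nothing to explain
        if s.wire in broken_wires:
            # The silence is explained by the damaged harness.
            report["wire_fault"].setdefault(s.wire, []).append(s.name)
        else:
            # Wire looks intact, so suspect the subsystem itself.
            report["subsystem_fault"].append(s.name)
    return report


# Invented example data: two subsystems share a harness that has been cut.
sensors = [
    Sensor("pump_1", "harness_A"),
    Sensor("valve_3", "harness_A"),
    Sensor("sensor_7", "harness_B"),
]
result = classify(
    sensors,
    silent={"pump_1", "valve_3", "sensor_7"},
    broken_wires={"harness_A"},
)
print(result)
```

In this toy case three silent subsystems collapse into two findings: one damaged harness accounting for two of them, and one subsystem that looks to have failed in its own right. Without the wire monitoring input, all three would look identical to the software.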

However, there are difficulties with using a bespoke OS. “One challenge is that Linux is grabbing much of the Real Time OS (RTOS) market share by offering increasing capability for applications that have low-latency requirements and modest demands on hard real-time.

“While it is not suitable for mission-critical applications, such as aircraft control systems, it removes much of the ‘easy pickings’ for the RTOS vendors, and thus much of the profitability.”

The engine explosion happened about six minutes after the plane took off from Singapore Changi Airport heading for Sydney. The pilots landed the plane safely in Singapore, with 433 passengers and 26 crew on board.

Rolls-Royce admitted earlier this month that dodgy parts played a major role in the engine failure. But flight controllers, components and other bits of kit under the bonnet have been criticised before. In March of this year, Motorola and Intel were accused of supplying defective components for another Airbus, which crashed with the loss of 216 passengers and 12 crew.