CAPE CANAVERAL, FLORIDA – After NASA’s Aerospace Safety Advisory Panel revealed on Thursday, Feb. 6 that Boeing’s CST-100 Starliner had coding errors that could have led to the catastrophic failure of the capsule during reentry, NASA and Boeing held a press availability on Feb. 7 to discuss the ongoing independent review team investigation into Starliner’s anomalies during the Orbital Flight Test. Taking part were NASA Administrator Jim Bridenstine, Boeing Space SVP Jim Chilton, NASA associate administrator Douglas Loverro, NASA Commercial Crew Program manager Kathy Lueders, and CST-100 Starliner VP and program manager John Mulholland. In addition to the already publicly known Mission Elapsed Timer (MET) error, which caused the craft to fail to reach the orbit needed to dock with the ISS, Starliner also had a coding error that was found “a few hours” before reentry. This coding error would have led to the loss of the craft if it had not been corrected.
Starliner had “several anomalies” according to Administrator Bridenstine:
- An error with the Mission Elapsed Timer (MET), which incorrectly polled time from the Atlas V booster nearly 11 hours prior to launch.
- A software issue within the Service Module (SM) Disposal Sequence, which incorrectly translated the SM disposal sequence into the SM Integrated Propulsion Controller (IPC).
- An Intermittent Space-to-Ground (S/G) forward link issue, which impeded the Flight Control team’s ability to command and control the vehicle.
The initial error with the MET was caused by the spacecraft polling the time from the Atlas rocket at the wrong moment. As a result, the MET was set eleven hours off. Once Starliner reached space, it continued following its preprogrammed commands based on the time it had on board. It believed it was eleven hours further along in the mission and in a different orbit. In trying to hold that orbit precisely, Starliner began using propellant to correct itself. The issue was compounded by the communication error. Once controllers recognized that the craft was rapidly firing thrusters to maintain an orbit it was not in, a delay in communications with the craft cost them the opportunity to react and reconfigure the craft to attain the orbit needed to dock.
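To make the failure mode concrete, here is a minimal sketch of how a timer latched from the wrong value can steer an onboard sequencer into the wrong mission phase. Everything in it is an assumption for illustration: the `MissionElapsedTimer` class, the `TIMELINE` phases and their times, and the choice to model the early poll as a booster reading that is eleven hours off. It is not Boeing's flight software, only a toy model of the behavior described above.

```python
from dataclasses import dataclass

HOUR = 3600

# Hypothetical timeline: (MET in seconds at which the phase starts, phase name).
TIMELINE = [
    (0, "ascent"),
    (15 * 60, "spacecraft separation"),
    (31 * 60, "orbital insertion burn"),
    (2 * HOUR, "on-orbit station keeping"),
]


@dataclass
class MissionElapsedTimer:
    """Toy MET: stores the offset between the MET and the spacecraft clock."""
    offset_s: int = 0

    def latch(self, booster_time_s: int, local_clock_s: int) -> None:
        # Latch once: whatever the booster reports is trusted as the MET.
        self.offset_s = booster_time_s - local_clock_s

    def now(self, local_clock_s: int) -> int:
        return local_clock_s + self.offset_s


def phase_for(met_s: int) -> str:
    """Return the latest phase whose start time has been reached."""
    phase = "prelaunch"
    for start_s, name in TIMELINE:
        if met_s >= start_s:
            phase = name
    return phase


LAUNCH = 100_000      # spacecraft clock at liftoff (arbitrary value)
POLL = LAUNCH - 240   # the timer is latched four minutes before liftoff

# Correct latch: the booster reports T-240 s.
good = MissionElapsedTimer()
good.latch(booster_time_s=-240, local_clock_s=POLL)

# Faulty latch (assumed model): the value read is eleven hours off.
bad = MissionElapsedTimer()
bad.latch(booster_time_s=-240 + 11 * HOUR, local_clock_s=POLL)

t = LAUNCH + 31 * 60  # 31 minutes after liftoff
print(phase_for(good.now(t)))  # orbital insertion burn
print(phase_for(bad.now(t)))   # on-orbit station keeping
```

In this toy model the two timers never disagree about how fast time passes, only about when the mission started, so the faulty one skips the insertion burn and goes straight to holding an orbit the craft has not reached.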
The communications error is an interesting case, as part of the blame is being placed on Earth-generated background noise. Specifically, investigators believe the issue was caused by signals from cell phone towers radiating into space. This is a known issue in spaceflight and is typically mitigated by using a larger band of frequencies for communications. It is not yet clear exactly what caused the interruption or what the fix will be.
Before Thursday, both the MET error and the communications issue were publicly known, but the coding issue with the SM was not. In essence, two lines of code in the sequence that fires thrusters to deorbit the SM after separation from the crew module (CM) were incorrect and, if they had been executed, would have caused the CM and SM to collide after separation. Such a collision would have been catastrophic.
While it was not revealed why this code was being looked at prior to execution, there has been speculation that the author of the code may have felt that something was wrong before the flight and went back over it, discovering the mistake just in time. While that may be true, it is more likely that after the initial anomaly, the entire team was on alert and all procedures were being evaluated thoroughly. This would already have been the case to some degree, as this was a test flight. NASA and Boeing have committed to a complete line-by-line inspection of the spacecraft's nearly one million lines of code, which will take quite a bit of time.
Both NASA and Boeing appear to agree that each of these anomalies was software-generated:
There was no simple cause of the two software defects making it into flight. Software defects, particularly in complex spacecraft code, are not unexpected. However, there were numerous instances where the Boeing software quality processes either should have or could have uncovered the defects. Due to these breakdowns found in design, code and test of the software, they will require systemic corrective actions. The team has already identified a robust set of 11 top-priority corrective actions. More will be identified after the team completes its additional work.
Marie Lewis on NASA’s official website
While much is being made of the several anomalies with this flight, this entire incident should serve as a reminder that spaceflight is not easy or routine. To quote Jim Bridenstine, “This is why we test.” Ground team intervention twice succeeded in maintaining the safety of the craft, and it was safely brought back to Earth. Both NASA and Boeing are taking this as a learning experience and moving forward. No date for the crewed flight test is being set, as this investigation is ongoing. However, NASA will be conducting a full Organizational Safety Assessment of Boeing's Commercial Crew program. This, along with the software review, will take some time, and I am cautiously optimistic that Boeing and NASA will slow down as necessary to make sure that once a crew is on Starliner, the craft will operate as intended, safely and effectively.