Facebook says system crash caused by error amid ex-employee hearing

The Facebook logo and stock graph are displayed through broken glass in this illustration, October 4, 2021. (Photo: Agencies)

A glitch during routine maintenance on Facebook's network of data centers caused Monday's collapse of its global system, which lasted more than six hours and set off a torrent of problems that delayed repairs, the company said on Tuesday.

The outage was the largest that Downdetector, a web monitoring firm, said it had ever seen. It blocked access to apps for billions of users of Facebook, Instagram and WhatsApp, further intensifying weeks of scrutiny for the nearly $1 trillion company.

In a blog post, Facebook Vice President of Engineering Santosh Janardhan explained that the company's engineers had issued a command that unintentionally disconnected Facebook's data centers from the rest of the world.

Facebook's systems are designed to audit commands to prevent mistakes, but the audit tool had a bug and failed to stop the command that caused the outage, the company said.

The outage was not caused by malicious activity, it said.

The timing of the system crash could not have been worse for the company, which has been under intense scrutiny from Congress since a former employee turned whistleblower accused it of putting profits before people's safety, an accusation Facebook has denied.

"The company's leadership knows how to make Facebook and Instagram safer but won't make the necessary changes because they have put their astronomical profits before people. Congressional action is needed," Frances Haugen, the former employee, said at a U.S. Senate hearing on Tuesday.

Former Facebook employee and whistleblower Frances Haugen testifies during a Senate Committee on Commerce, Science and Transportation hearing titled "Protecting Kids Online: Testimony from a Facebook Whistleblower" on Capitol Hill in Washington, U.S., October 5, 2021. /Reuters

"As long as Facebook is operating in the shadows, hiding its research from public scrutiny, it is unaccountable," she said.

In an era when bipartisanship is rare in Washington, lawmakers from both parties excoriated the company, illustrating Congress' rising anger with Facebook amid numerous demands for legislative reforms.

Republican Senator Dan Sullivan said he was concerned about how Facebook and subsidiaries like Instagram affect children's mental health. "I think we're going to look back 20 years from now, and all of us are going to be like 'what the hell were we thinking?'"

While the company strives to weather the regulatory pressure, Monday's system collapse wreaked havoc on all its services.

Users lost access to one of the world's most popular messaging apps, WhatsApp, which has more than 2 billion users, and employees were also blocked from internal tools.

The outage knocked out tools that engineers would normally use to investigate and repair such outages, making the task even more difficult, Facebook said.

The company said it sent a team of engineers to the location of its data centers to try to debug and restart the systems.

However, it took the company a long time to get engineers inside to work on the servers because of the strict physical and system security in place.

Even after network connectivity was restored to the data centers, Facebook said it worried a surge in traffic would cause its websites and apps to crash.

But because the company had run drills to prepare for such situations, access to its services returned relatively quickly.

"Every failure like this is an opportunity to learn and get better," Janardhan wrote. "From here on out, our job is to ... make sure events like this happen as rarely as possible."