[News] Facebook admits its engineers, not hackers, made mistake that caused $100m seven-hour outage



Facebook has shut down claims of ‘malicious activity’ behind the seven-hour blackout that cost the company an estimated $100 million in lost revenue.

 

“Configuration changes on the backbone routers that coordinate network traffic between our data centers caused issues that interrupted this communication. This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,” Facebook Vice President of Infrastructure Santosh Janardhan wrote in a blog post.

 

“The underlying cause of this outage also impacted many of the internal tools and systems we use in our day-to-day operations, complicating our attempts to quickly diagnose and resolve the problem,” Janardhan added.

 

Facebook said it had “no evidence that user data was compromised as a result of this downtime.”

 

The global outage, which hit Facebook, Instagram, WhatsApp and Messenger on Monday evening, was caused by a faulty configuration change that disconnected the company’s servers from the internet, meaning engineers had to travel to its Santa Clara data center to fix the glitch in person.

 

It is believed that a faulty update to Facebook’s Border Gateway Protocol (BGP) configuration was to blame. BGP is the protocol that large private networks use to exchange routing information with the public Internet, and the update is thought to have left apps and browsers unable to locate the company’s services.
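
To illustrate the general idea, the short Python sketch below models a routing table of the kind a network builds from BGP announcements. It is a simplified toy, not Facebook’s actual configuration: the `RoutingTable` class, the `example-edge-router` name and the `192.0.2.0/24` prefix (a range reserved for documentation) are all hypothetical. It shows that once a route is withdrawn, a lookup for an address inside it finds no path at all.

```python
import ipaddress


class RoutingTable:
    """Toy model of the routes a network has learned from its BGP peers."""

    def __init__(self):
        # Maps an announced prefix to the (hypothetical) next hop that serves it.
        self.routes = {}

    def announce(self, prefix, next_hop):
        """Record that `prefix` is reachable via `next_hop`."""
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        """Forget a previously announced prefix, if it is present."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the next hop for `address`, or None if no route covers it."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        # Prefer the most specific (longest) matching prefix.
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", "example-edge-router")  # hypothetical prefix
before = table.lookup("192.0.2.53")

table.withdraw("192.0.2.0/24")  # the route disappears from the table
after = table.lookup("192.0.2.53")

print(before, after)
```

In this simplified picture, the servers behind the withdrawn prefix may be running normally, but the rest of the Internet no longer has any route by which to reach them.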

 

But the repair was delayed, according to a purported insider, because of ‘lower staffing in data centers due to pandemic measures’, along with outages in physical access card systems and internal messaging services.

 

Before the outage was fixed, it was reported that the company had been hacked.

 

Speaking with DailyMail.com, Kieron Harding, an IT Infrastructure Engineer at GRC International Group, said: ‘The nature of the problem meant Facebook would have needed network engineers to physically access their BGP routers – and due to the pandemic, some of the data centers quite possibly don’t have an engineer based on site, or someone who could have immediately started to work on the problem.’

 

‘One of the reasons why the outage lasted for as long as it did was because the misconfiguration of the BGP also affected Facebook’s physical door access systems – which shut down; meaning engineers couldn’t get into the buildings, or secure rooms, to start fixing the issues straightaway,’ said Harding.

 


