Social life breakdown
And then suddenly the world stood still....
Exactly one week ago, an outage at Facebook caused Facebook, Messenger, Instagram, WhatsApp, Workplace and even the VR service Oculus to disappear from the network for hours. For social media channels outside the Facebook group, such as Twitter and TikTok, this meant a huge boost in their numbers!
And you can take that disappearance literally: although the sites themselves were not deleted, no device could find the servers behind them. According to Facebook's official announcement, the problem lay with "a configuration of the underlying servers that coordinate network traffic". From there, the problems spread to the communication between the data centers, and the internal tools became unusable as a result. The explanation is fairly vague, but it strengthens the suspicion that this was a failed update of the Border Gateway Protocol (BGP).
The signposts of the internet
BGP is a foundational protocol of the internet that ensures users and devices are routed to the correct location of, say, their chat messages or their uncle's status post. It works a bit like the Domain Name System (DNS), which guides devices to the correct IP address, but at a higher level. In short, it tells the world how to reach the network of a particular provider or, in this case, a tech giant.
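The DNS step mentioned here can be illustrated in a few lines of Python. This is only a sketch of the name-lookup half of the story: BGP itself operates a level lower, between networks, and is not visible from a single machine like this.

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Ask the system's DNS resolver which IPv4 addresses belong to a hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET)
    # Each entry ends in an (address, port) tuple; keep the unique addresses.
    return sorted({info[4][0] for info in infos})

# "localhost" maps to the loopback address, so this works even offline.
print(resolve("localhost"))  # usually prints ['127.0.0.1']
```

Once DNS has produced an IP address, it is BGP that tells the routers along the way which network that address lives in; during the outage, that second step was the one that failed.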
Because Facebook does not rely on a local ISP such as Telenet. The company has its own domain registry and DNS servers, and uses its own routing prefixes for its own network. So anyone who opens a Facebook app should be directed to the Facebook network. In an update on Monday night, however, that signage appears to have been erased, so that no device could find the Facebook network anymore. That immediately explains why so many sites and apps went down at the same time. Cloudflare, itself a specialist in managing web traffic for sites, reports in a blog post that an update removed all BGP routes to Facebook, making Facebook's own DNS servers unreachable as well.
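What such a route withdrawal means in practice can be sketched with a toy routing table in Python. This is a deliberate simplification: real BGP routers exchange prefixes between networks rather than consulting a flat dictionary, and the prefix and network names below are illustrative.

```python
# Toy model of a routing table: prefix -> destination network (illustrative values).
routes = {
    "157.240.0.0/16": "facebook-network",    # example prefix, stands in for Facebook's
    "1.1.1.0/24": "cloudflare-network",
}

def find_route(prefix: str) -> str:
    """Return the destination network for a prefix, as a router would."""
    try:
        return routes[prefix]
    except KeyError:
        raise LookupError(f"no route to {prefix}")

# Normal situation: traffic toward the Facebook prefix knows where to go.
print(find_route("157.240.0.0/16"))

# The faulty update: all routes to the Facebook network are withdrawn at once.
del routes["157.240.0.0/16"]

# Now even Facebook's own DNS servers, which live inside that prefix,
# can no longer be reached: routing fails before DNS is ever consulted.
try:
    find_route("157.240.0.0/16")
except LookupError as err:
    print(err)
```

This also shows why the failure cascaded: because the DNS servers sat inside the withdrawn prefixes, there was no fallback path left for devices to even ask where Facebook was.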
Everything on your own server
That it took so long to fix the problems, in turn, seems to be the result of those "internal tools" becoming inaccessible. According to New York Times reporter Sheera Frenkel and others, Facebook employees could not log into their own work servers, or even enter the physical buildings with their badges, because all of those systems run through Facebook's own servers. Those who had to fix the error were thus more or less locked out of their own infrastructure.
Currently, there is no indication of malicious intent. Moreover, this kind of error has one advantage: for once, no (additional) data was leaked, since only the routing servers, and not the data servers, appear to have been affected.