Single point of failure blamed for Verizon FiOS, DSL outage
Verizon officials are blaming a single stalled router for a service outage that affected customers of its high-speed Internet services, including fiber-optic FiOS, in New York and Massachusetts.
The outage began at approximately 3:15 pm EDT, according to a message Friday afternoon from the company's chief PR executive, Eric Rabe. He acknowledged that when a router fails, its traffic typically fails over to adjacent routers, but in this instance that did not happen.
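Failover of this kind generally depends on the rest of the network noticing that a router is unhealthy. The following is a minimal sketch, not a description of Verizon's or Juniper's actual mechanism, of how a router that stops forwarding traffic can still pass a check that only looks at link state, while a check that tests forwarding would catch it. The `Router` class, both check functions, and `pick_next_hop` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Toy model of a router (hypothetical, for illustration only)."""
    name: str
    link_up: bool = True     # what neighbors see: the interface is up
    forwarding: bool = True  # whether packets actually get forwarded


def passive_check(router: Router) -> bool:
    """Treat the router as healthy whenever its link is up."""
    return router.link_up


def active_check(router: Router) -> bool:
    """Treat the router as healthy only if a test packet would get through."""
    return router.link_up and router.forwarding


def pick_next_hop(primary: Router, backup: Router, check) -> Router:
    """Use the primary while the check passes; otherwise fail over."""
    return primary if check(primary) else backup


primary = Router("primary")
backup = Router("backup")

# Simulate a hang: the link stays up but nothing is forwarded.
primary.forwarding = False

passive_choice = pick_next_hop(primary, backup, passive_check).name
active_choice = pick_next_hop(primary, backup, active_check).name

print(passive_choice)  # the link-state check keeps sending traffic to the hung router
print(active_choice)   # the forwarding check fails over to the backup
```

In this toy model, the link-state check never fails over because the hung router still looks healthy from the outside. Real networks use more elaborate detection than this, and nothing here is drawn from Verizon's configuration.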
"The router went into a hung state and did not appear to the rest of the network as though it was having problems," Rabe wrote, being careful not to name the manufacturer.
According to reporting from Telephony Online's Ed Gubbins, Verizon's principal hardware provider for FiOS is Juniper Networks. In fact, Gubbins reports, Verizon contributes 13% of Juniper's total revenue, and may be the sole reason why that company found black ink again last year.
Juniper's E-series routers serve Verizon's broadband network. Last October, Juniper announced a major upgrade to its routers' operating system, adding features that included the capability for service providers to deploy deep packet inspection -- the ability to analyze Internet traffic based on its contents. The company marketed this feature as part of its "Intelligent Services Edge" portfolio, which it described as "leverag[ing] a single, consistent operating system and high-performance hardware to flexibly deliver many service types -- including broadband routing, voice, multimedia and integrated security, as well as application-level awareness."
The rollout schedule for these changes targeted the third quarter of 2009. The latest version of Juniper's router software for E-series routers, called JUNOSe, began rollout on August 13. While the evidence that Juniper's router software may have been involved is circumstantial, these facts do tell a curious tale.
As some customers confirmed on Verizon's support forum, Rabe's statement that the outage lasted about 40 minutes on Friday afternoon appears accurate. However, other customers, including some in Massachusetts, reported poor or no service even after the problem was said to be resolved at about 4:00 pm. What's more, support representatives, who diligently worked with customers to resolve issues as if the customers' own on-premises equipment were to blame, were apparently not informed of the outage themselves until after it was resolved.
One heavy-use customer complained to Verizon, "I have 30 years in networking designing service provider networks and I don't have a single design that has a single point of failure. It appears Verizon does."
That provoked another customer to come to Verizon's defense: "Even with redundancy, there is no way to guarantee 100% availability. That is an impossibility, I don't care who you are. And that percentage is including regularly scheduled failover testing."
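The second customer's point can be illustrated with simple arithmetic. The sketch below assumes that redundant units fail independently and that service stays up as long as any one unit works. That is a textbook simplification, not a model of Verizon's network, and the function names and the sample availability figures are illustrative assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent redundant units,
    where service is up as long as at least one unit is up."""
    return 1.0 - (1.0 - single) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Illustrative figures only: one unit that is up 99% of the time,
# then two and three such units in parallel.
for copies in (1, 2, 3):
    a = combined_availability(0.99, copies)
    print(copies, a, downtime_minutes_per_year(a))
```

Under these assumptions, each added unit shrinks expected downtime, but the result only approaches 100% and never reaches it. For scale, a single 40-minute outage with no other downtime in a year works out to roughly 99.992% availability for that year.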