
Incident report for outages from 16th to 18th of April 2013

Tuesday, April 23rd, 2013

Below is the full incident report for the network problems and outages
from the 16th to the 18th of April.

OUTAGE/PROBLEM START 2013-04-16 00:20
OUTAGE/PROBLEM END 2013-04-18 20:00

A core router at our network provider in Digiplex, Oslo suffered a
hardware fault. The fault forced the router to reboot every 5 minutes,
each time after it had successfully built neighborships with the rest of
the network. As a result, the router accepted traffic, forwarded some of
it and then rebooted while the rest of the network still «thought» it
was forwarding traffic, so some traffic was dropped.
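
As a rough illustration of why a router that keeps rebooting drops
traffic, the sketch below models one reboot cycle. Only the 5 minute
uptime comes from this report. The downtime per reboot (down_s) and the
time the neighbours need to notice the router is gone (detect_s) are
hypothetical values chosen for the example, not measurements from the
provider.

```python
# Minimal model of one reboot cycle of a flapping core router.
# All names and the example numbers for down_s and detect_s are
# assumptions made for illustration, not data from the incident.

def blackhole_seconds(down_s: float, detect_s: float) -> float:
    """Seconds per cycle in which neighbours still send traffic to a
    router that is already down. It cannot exceed the downtime itself."""
    return min(down_s, detect_s)


def dropped_fraction(up_s: float, down_s: float, detect_s: float) -> float:
    """Share of each reboot cycle during which traffic sent via this
    router is dropped."""
    cycle_s = up_s + down_s
    return blackhole_seconds(down_s, detect_s) / cycle_s


if __name__ == "__main__":
    up_s = 5 * 60      # from the report: reboot roughly every 5 minutes
    down_s = 120.0     # hypothetical: time to reboot and rejoin
    detect_s = 30.0    # hypothetical: neighbours' failure detection time
    share = dropped_fraction(up_s, down_s, detect_s)
    print(f"Dropped share of each cycle: {share:.1%}")
```

In this model the loss only affects traffic that is routed through the
faulty router, which is why re-routing traffic over other paths removes
the drops even while the router itself is still unstable.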

Our redundant network suffered some short drops in connectivity due to
the problems with the core router at the network provider, while the
provider replaced core equipment.

Hardware and software were replaced, and traffic was re-routed through
other paths in the network. Some changes have also been made to the
infrastructure to ensure higher uptime in case of a new outage on the
core router.

In addition, we have been assured that steps are being taken to improve
the resiliency of the network. Upgrades will be implemented next week.