DETAIL: We are currently experiencing another problem with our internet connectivity.

RESPONSE: We are working with our provider to restore services as soon as possible.

UPDATED 22:24: Our network provider has confirmed the problem and is working to restore service.

UPDATED 22:52: Services appear to have been restored.

UPDATED 11:03 30 Nov: Incident report from Telstra added.

FOLLOW UP: We have received the following incident report from our upstream provider:

UPDATE TO TELSTRA INCIDENT REPORT FOR OUTAGES ON 23 AND 24 NOV 2009

Having identified the root cause of the previous day’s incident, engineers planned an activity to configure the x.x.0.0/16 summary route on the appropriate Juniper core routers to allow a future decommissioning of the legacy routers. Configuring this summary route on the Juniper core routers resulted in Internet services for certain customers being affected again.

The engineers were unable to quickly isolate the cause of this issue and so reversed the change in order to restore service. However, once the change had been reversed, service was not restored for all customers as it should have been. The engineers identified a spurious route being received from the legacy routers, which appeared to be causing the problem. The engineers reset the BGP sessions to the legacy routers, which removed the spurious route and restored service to affected customers.

The engineers later identified a Juniper OS bug that had caused the reversal to be unsuccessful. Telstra has already been testing a later version of the Juniper OS in its labs, which is intended for network-wide deployment. Juniper has confirmed that this specific bug is resolved in the release under test, but Telstra will also include this bug in its test planning prior to deployment in production networks.

REMEDIAL ACTION

  1. An urgent cross-functional review of the current MPLAN process has been scheduled (including a detailed analysis of our planning and handling of this incident).
  2. All MPLANs now have an extended Director-level approval policy whilst we review the current planned works process.
  3. All MPLANs will be checked after completion to ensure that the works have been carried out in accordance with the plan.

Telstra would like to take this opportunity to sincerely apologise for the disruption and inconvenience that these incidents have caused. Please be assured that the immediate actions stated above have been given the highest priority within Telstra and will be implemented as quickly as possible, in order to avoid further incidents and to provide the highest levels of service to our customers.