micfo | network outage
Dear Renato,
On Thursday 11/11/2010 at 6:42pm, we became aware of a network issue involving the servers in operation across several facilities, with the exception of the new datacenter. Approximately half of these servers were experiencing connectivity problems ranging from packet loss to total loss of connectivity. The other servers were unaffected by this issue and were responding as normal.
Our network monitoring server was among those fully affected by this problem and therefore reported a total outage, including for servers hosted at other datacenters that were not affected at all.
The issue we detected affected both the primary and secondary Cisco 6500 network systems, which are configured as a VSS-1440 redundant cluster. We ran through our emergency procedures to identify the problem, but all tests returned results within normal parameters.
After finishing our emergency procedures without identifying a specific problem, we raised a case with Cisco TAC at 11:50pm. A Cisco engineer then logged into our routers to try to identify the problem. After 9 hours, the Cisco engineer had been unable to provide a resolution; our understanding was that the problem was either a software bug within the routers or a hardware fault.
On Friday 11/12/2010 at 10:30am, we took the matter into our own hands and, after nearly 2 hours of troubleshooting, decided to reboot both routers. It takes about 15-20 minutes for the routers to reload. During the reload, the primary router failed to boot up normally. The secondary router booted normally, and our monitoring showed that service was restored as a result.
Our conclusion is that the failure of the primary Cisco 6500 to boot indicates a hardware problem. We take full responsibility for all the infrastructure required to provide you with a reliable service, and therefore we'll continue to work with Cisco to identify the root cause and put the necessary measures in place to prevent such incidents from happening in the future.
Last but not least, as always, we stand behind our promises and will continue to honor our commitments. Despite the fact that this incident occurred as a result of a third party's hardware failure, every client affected by this network outage is eligible for one month's credit. To claim your credit, please send an email to [email protected].
I'd like to take the opportunity to apologize for this incident and assure you that we are working on further enhancing our network infrastructure to provide you with the reliable service you deserve.
Please feel free to contact me personally if there is anything else I can do for you.
regards,
amir golestan | executive director
micfo | llc.
divine hosting experience™
micfo.com
About Friday's outage on the GAVCA site
- 07_Phantom
SP!
Phantom