Widespread server outages - recovering
9/18/2014
An unusual network hardware problem isolated the eight servers that hold the majority of Engineering web sites (including this web site) and several services at 8:45am Tuesday. The web sites and servers hosted by these eight machines were unable to reliably access their network-based storage, which caused them to shut down or fail in unpredictable ways.
The network problem was found and repaired, but the slow process of restarting each server continues. Most outward-facing services have been restored, but the remaining restarts are expected to take until 5pm.
Note that this outage is unrelated to the outage of our ticketing system, which is hosted off-site. Because the two outages happened at nearly the same time, diagnosing and reporting problems was harder than usual.
Should an outage of this nature occur again, please keep in mind that the Engineering IT help desk is available by phone at 333-1313 during normal business hours. The help desk can help users even when multiple servers and services are down.