Data centers have been hit by outages with unusual frequency over the past few months. They have also become prime targets for allegations, often unfounded, of wasting power and other resources. The outages have affected the likes of Level3, Telecity, Colo4, Amazon, and Equinix to such an extent that hundreds of businesses went offline. They hurt not only the hosting companies but also the companies that relied on those data centers for hosting and connectivity. These are the latest in a line of outages that have disrupted services at hosting and colocation firms. Power failures caused the disruptions at Level3 and Telecity, Colo4 had problems with its electrical equipment, and NTT was hit by networking issues. In some cases the situation was bad enough that the physical equipment itself was damaged. All of this boils down to one big question: “Could this all have been prevented?”
With the expansion of data centers worldwide and an ever-growing global infrastructure, outages have become part and parcel of the industry. Statistical reports have shown that most of these outages result from human error, which can be prevented. Some of the recent high-profile outages were caused by failures on the infrastructure side, and these too were avoidable. Today a large number of systems are available that classify the infrastructure and resources of a facility and assign it a ranking. Unfortunately, these ranking systems do not place the same emphasis on every individual component. Most ranking and evaluation systems focus on the mechanical and electrical infrastructure of a facility. Operations and maintenance, which are just as important as the infrastructure itself, receive far less attention. Together with the human error factor, they determine whether a data center can operate continuously.
Beyond the infrastructure layer lies the information technology layer. An organization’s success depends on the two layers working well together, and both contribute to its business continuity and availability. Estimating and assessing failures of the mechanical and electrical infrastructure alone does not address all the variables; in fact, it adds to the woes of risk assessment. The first step is to understand the level of a data center, its purpose, its capacity, and the type of services it provides. Risk assessment procedures for a colocation data center are very different from those for a regular hosting facility, and a cloud enterprise requires different procedures again.
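To see why assessing only the mechanical and electrical infrastructure can be misleading, here is a rough sketch in Python. It assumes that the facility, operations, and IT layers fail independently and that a service is up only when all of them are up, so their availabilities multiply. The layer names and the availability figures are hypothetical values chosen for illustration, not measurements from any of the facilities mentioned above.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def combined_availability(layers):
    """Availability of layers in series: the service is up only if every layer is up.

    Assumes the layers fail independently, which is a simplification.
    """
    return prod(layers.values())


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for a given availability (0..1)."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical per-layer availabilities, for illustration only.
layers = {
    "facility": 0.9998,    # mechanical and electrical infrastructure
    "operations": 0.9990,  # maintenance procedures and human error
    "it": 0.9995,          # servers, network, and software
}

total = combined_availability(layers)

print(f"facility only: {annual_downtime_hours(layers['facility']):.1f} hours/year")
print(f"all layers:    {annual_downtime_hours(total):.1f} hours/year")
```

Under these assumptions the combined figure is always worse than the weakest single layer, which is why a ranking built on the facility layer alone can overstate how resilient a data center really is.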
A better understanding of risk profiles, and of the fact that not all data centers are equal, could have prevented the earlier outages. This is where the four-Tier classification of data centers needs to change. Its four levels do not seem to fit the present IT world, where so many IT services are delivered through the cloud. With only four levels available to frame even the most stringent procedures for a completely fail-proof facility, a more comprehensive method is needed to better understand a data center’s resilience.
Data Center Talk updates its resources every day. Visit us to learn about the latest technology and standards from the data center world.