Tuesday, February 8, 2011

Expert advice: what it takes to keep a virtual environment running normally

When it developed weather satellites in the past, the U.S. government used a well-known framework called failure mode and effects analysis (FMEA). The framework is used to analyze a system's potential failure modes and to work out their expected consequences.

FMEA is a very complex task, but for a satellite worth billions of dollars it is a necessary one. Overlooking a relatively small fault in any subsystem could bring down the entire satellite.

But what does this have to do with virtualization and storage area networks (SANs)? The answer is that current virtualization technology depends unusually heavily on a centralized storage architecture. Live migration features such as vMotion need a centralized SAN to carry out virtual machine failover and load balancing.

SAN: the key to a virtual infrastructure

Because of this requirement, almost every virtualized environment has to deploy a SAN infrastructure, which in turn increases the possible side effects and the cost of a SAN failure. To illustrate the point, you first need to understand how the components of a virtual infrastructure relate to one another. Trace what each component depends on and how the components are connected. Once you have found the dependencies of every component, you can draw the entire dependency tree (the end result looks somewhat like an FMEA).
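The dependency tracing described above can be sketched in a few lines of Python. This is only an illustration: the component names and the edges between them are hypothetical and are not taken from any particular product. The sketch follows each "depends on" edge and then lists every component that ultimately relies on the SAN.

```python
# Hypothetical virtual infrastructure. Each key depends on the
# components in its list. Names and edges are illustrative assumptions.
DEPENDS_ON = {
    "application": ["virtual_machine"],
    "virtual_machine": ["hypervisor_host", "vm_disk_files"],
    "live_migration": ["hypervisor_host", "vm_disk_files"],
    "hypervisor_host": ["san"],
    "vm_disk_files": ["san"],
    "san": [],
}


def all_dependencies(component, graph):
    """Return every component reachable from `component` via dependency edges."""
    seen = set()
    stack = list(graph[component])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def dependents_of(target, graph):
    """Return, sorted by name, every component that directly or indirectly needs `target`."""
    return sorted(
        name for name in graph if target in all_dependencies(name, graph)
    )


if __name__ == "__main__":
    # In this example graph, every other component ends up depending on the SAN.
    print(dependents_of("san", DEPENDS_ON))
```

In a real environment the graph would be built from an inventory of hosts, datastores and virtual machines rather than typed in by hand.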

In such a tree, all the arrows end at the SAN infrastructure. SAN uptime must therefore be 100%, or very close to it. Downtime of any storage device, and extended downtime in particular, is a catastrophic failure for the entire virtual environment, because it forces every virtual machine, server and application offline.
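To put "very close to 100%" into numbers, here is a rough sketch that converts an availability figure into hours of downtime per year and shows that a chain of dependent components can be no more available than its weakest link. The availability values are made-up examples, and the multiplication assumes that components fail independently, which is a simplification.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability):
    """Expected hours of downtime per year for an availability between 0 and 1."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


def series_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual values multiply.
    """
    result = 1.0
    for value in availabilities:
        result *= value
    return result


if __name__ == "__main__":
    # Example figures only: "three nines" allows roughly 8.76 hours per year.
    print(round(annual_downtime_hours(0.999), 2))
    # Hypothetical chain: host, network, SAN. The result is below the SAN's 0.99.
    print(series_availability([0.9999, 0.999, 0.99]))
```

Since everything in the tree sits on top of the SAN, the SAN's figure caps the availability of the whole environment.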

This is a very serious situation, and restarting from the resulting chaos is very difficult. While these resources are unavailable, re-establishing the associations between servers and applications, which may be needed before anything can start again, is very time-consuming.

Storage equipment manufacturers are also aware of this problem. Last year, Hitachi announced its Hitachi High Availability Manager, intended to keep storage devices running full time. DataCore announced at its partner conference that its storage virtualization software can achieve full-time operation. The high-end solutions from EMC, HP and Dell also offer zero-downtime options, or guarantee zero downtime for particular SAN operations. Even StarWind Software, a vendor of software-based SANs, achieves zero downtime with storage replication through an active/active, two-node storage cluster.

A combination of technology and technique can achieve 100% storage availability, but it requires multiple levels of redundancy in the SAN's power supplies, disk drives, storage connectivity and storage processors, or even completely redundant storage nodes (for example, HP's modular storage solutions). In addition, on-site or off-site replication of SAN storage can be added to protect the data further.
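The effect of redundancy can be estimated with a simple model. The sketch below assumes that the redundant copies of a component fail independently and that any single surviving copy is enough to keep the service up. Real SANs have shared failure causes, so treat the numbers as an upper bound. The figure of 0.99 is an example, not a vendor specification.

```python
def redundant_availability(single, copies):
    """Availability of `copies` redundant components where one survivor suffices.

    Assumes independent failures: the service is down only if all copies
    are down at once, which has probability (1 - single) ** copies.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


if __name__ == "__main__":
    # Example: one storage node at 0.99, then two and three redundant nodes.
    for copies in (1, 2, 3):
        print(copies, redundant_availability(0.99, copies))
```

In this model each additional copy multiplies the remaining unavailability by the failure probability of one copy, so two nodes at 0.99 each give roughly 0.9999, and availability approaches 100% as more redundancy is added.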

The last question is how much storage downtime a virtual environment can accept. Even if some downtime is acceptable, it is certainly not much. Where possible, design in multiple levels of redundancy. In addition, before purchasing a SAN infrastructure, it is best to seek advice on the weak points of your infrastructure. You certainly do not want to see a SAN failure paralyze your entire computing infrastructure within a year.
