Data Center WAN Interconnection for Disaster Recovery Solutions
As the core of information and communications technology (ICT) in an enterprise, the data center carries key services and data essential to enterprise operations. This important role makes a disaster recovery system vital to enterprise business. With a large selection of disaster recovery technologies available, choosing an appropriate technology can be difficult. The high cost of a disaster recovery system presents another hurdle when evaluating disaster recovery solutions. A cost-efficient disaster recovery system that meets the business requirements of the enterprise is the ideal solution.
Disaster recovery type — software or hardware
A disaster recovery system consists of at least two IT systems with identical functions in separate locations that monitor each other's status. If an IT system breaks down from a disaster such as fire or an earthquake, the other IT systems take over all operations. A data center disaster recovery solution must protect both data and services. The service disaster recovery solution backs up application systems using technologies such as remote server clustering. Data disaster recovery is fundamental for service disaster recovery and the focus of this document.
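The mutual status monitoring described above can be illustrated with a heartbeat timeout. The following is a minimal sketch only: the class name `PeerMonitor`, its methods, and the timeout value are hypothetical and do not correspond to any product interface. Each site would run one monitor for its peer and take over when heartbeats stop arriving.

```python
from typing import Optional


class PeerMonitor:
    """Tracks heartbeats from the peer site (hypothetical helper, not a product API)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Optional[float] = None  # time of the last heartbeat, in seconds

    def heartbeat(self, now: float) -> None:
        """Record that the peer site reported itself healthy at time `now`."""
        self.last_seen = now

    def should_take_over(self, now: float) -> bool:
        """Return True when the peer has been silent for longer than the timeout."""
        if self.last_seen is None:
            # Assumption: never fail over before the peer has been seen at least once.
            return False
        return (now - self.last_seen) > self.timeout_s
```

In practice the timestamps would come from a monotonic clock such as `time.monotonic()`, and a real system would need additional safeguards so that a broken link between the sites is not mistaken for a site failure.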
Data disaster recovery is implemented by using a data system in another city as backup for the local data system. The backup data system provides a copy of key data when the local data system or even the entire application system breaks down.
Data disaster recovery is implemented in hardware or software.
1) Hardware-based data disaster recovery (disk array level)
In hardware-based data disaster recovery, data is transmitted between storage devices through storage array controllers. This mode uses dedicated array controllers to copy data at a rapid rate without affecting system stability. Hardware data copying does not occupy host hardware resources, such as the CPU, memory, and I/O module. It is also independent of the host operating system, so it can be used with Windows, Linux, and UNIX. This disaster recovery solution is most suited to key services and high-end applications.
Despite these strengths, hardware-based data disaster recovery has limitations. The production center and disaster recovery center must use the same or similar storage arrays, narrowing the scope of application. In addition, hardware-based data disaster recovery requires complicated technologies and considerable expense.
Hardware-based data disaster recovery is a well developed and widely used technology, and dominates in FC-SAN disaster recovery systems.
2) Software-based data disaster recovery (host level)
In software-based data disaster recovery, a data copy tool is installed on servers in the production center and the disaster recovery center for data backup between hosts. This mode allows the two centers to use different types of storage devices and servers. Enterprises only need to buy the data copy tool, not extra hardware.
A disadvantage of software-based data disaster recovery is that the copy tool uses server hardware resources, which may degrade service processing performance and make the system unstable. Most data copy applications can run on a Windows operating system but do not operate properly on Linux or UNIX operating systems.
Software-based data disaster recovery is widely used in IP-SAN disaster recovery systems but seldom used in FC-SAN disaster recovery systems, which require rapid data backup and high system stability.
Data copying mode—synchronous or asynchronous
Data copying is a synchronous or asynchronous process. Hardware-based data disaster recovery supports both synchronous and asynchronous data copying, whereas software-based data disaster recovery supports only asynchronous data copying. The data copy mode determines whether key data is backed up in real time or not.
1) Synchronous data copying (real-time backup)
In synchronous data copying mode, the production center performs an I/O operation only after confirming that the data from the previous I/O operation has been backed up to the disaster recovery system. Synchronous data copying ensures complete consistency between the production center and disaster recovery center and provides the highest data security. This mode works well for core services, such as those that require a near-zero Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
Synchronous data copying requires high bandwidth and low latency on the network. Latency is determined by the distance between the source and destination devices and by the time that intermediate devices spend on protocol packet conversion. For this reason, this data copying mode can only be implemented in FC-SAN disaster recovery systems, where the distance between the production center and disaster recovery center is less than 200 kilometers. In this scenario, hardware-based data disaster recovery technology must be used to implement synchronous data copying.
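The effect of distance on synchronous copying can be estimated with simple arithmetic. The sketch below assumes that light travels through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometer one way); the function name, the per-device delay parameter, and the number of round trips per write are illustrative assumptions, not measured values.

```python
# Assumption: propagation delay in optical fiber is about 5 microseconds per km.
FIBER_DELAY_US_PER_KM = 5.0


def sync_write_penalty_ms(distance_km: float,
                          device_delay_us: float = 0.0,
                          round_trips: int = 1) -> float:
    """Extra latency that synchronous copying adds to each write, in milliseconds.

    distance_km:     fiber distance between the two centers
    device_delay_us: one-way delay added by intermediate devices (assumed value)
    round_trips:     round trips needed per write (depends on the protocol)
    """
    one_way_us = distance_km * FIBER_DELAY_US_PER_KM + device_delay_us
    return round_trips * 2 * one_way_us / 1000.0


if __name__ == "__main__":
    for km in (10, 100, 200, 1000):
        print(f"{km:>5} km: +{sync_write_penalty_ms(km):.1f} ms per write")
```

Under these assumptions, 200 kilometers of fiber adds about 2 ms to every write from propagation alone, before any protocol conversion delay is counted, which is why the distance limit matters for synchronous mode.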
2) Asynchronous data copying (non-real-time backup)
In asynchronous data copying mode, the production center performs I/O operations without waiting for acknowledgement from the disaster recovery center. This mode copies data much more quickly but does not ensure data consistency between the production center and disaster recovery center.
Compared with synchronous mode, the asynchronous mode has lower requirements for bandwidth and latency. It also allows a longer distance between the production center and disaster recovery center. The asynchronous mode applies not only to IP-SAN disaster recovery systems, but also to FC-SAN disaster recovery systems where the distance between the production center and disaster recovery center exceeds 200 kilometers.
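The difference between the two modes can be shown with a toy in-memory model. This is a sketch under simplifying assumptions: the class `ReplicatedVolume` and its methods are hypothetical, the WAN is reduced to a queue, and no real storage or network is involved. It shows why synchronous mode loses no acknowledged writes, while asynchronous mode can lose whatever is still queued when a disaster strikes.

```python
from collections import deque


class ReplicatedVolume:
    """Toy model of a production volume with a remote copy (hypothetical names)."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.local = []         # writes committed at the production center
        self.remote = []        # writes committed at the disaster recovery center
        self.pending = deque()  # writes acknowledged locally but not yet copied

    def write(self, block: str) -> None:
        self.local.append(block)
        if self.synchronous:
            # Synchronous: the write completes only once the remote copy exists.
            self.remote.append(block)
        else:
            # Asynchronous: acknowledge immediately and copy later.
            self.pending.append(block)

    def drain(self, max_blocks: int) -> None:
        """Copy up to max_blocks queued writes to the disaster recovery center."""
        for _ in range(min(max_blocks, len(self.pending))):
            self.remote.append(self.pending.popleft())

    def blocks_lost_on_disaster(self) -> int:
        """Number of writes that exist only at the production center."""
        return len(self.local) - len(self.remote)
```

For example, after five asynchronous writes and a `drain(2)`, three writes exist only at the production center; the same five writes in synchronous mode leave none unprotected.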
WAN type — IP or optical
In most disaster recovery systems, the production center and disaster recovery center communicate through a WAN. The WAN must meet the following requirements:
Large data-storage capacity: A data center stores all key data of an enterprise, so the SAN must have a capacity of 10 Gbit/s to 1,000 Gbit/s, or even several Tbit/s.
High scalability: The capacity of the SAN must be increased every year to store greater amounts of data.
Short latency: Transmission latency is an important factor that determines whether a disaster recovery system can achieve the RPO and RTO needed by the enterprise. This requirement is especially important for disaster recovery systems using synchronous data copying.
High reliability: Loss of key service data is unacceptable for an enterprise and can cause huge losses.
Large variety of interfaces: Apart from the mainstream FC-SAN and IP-SAN storage techniques, other techniques such as IBM's ESCON and FICON can also be used; therefore, the WAN must provide various protocol interfaces to support different storage devices.
A WDM optical network or an IP WAN can be used as the WAN between the production center and disaster recovery center.
In an FC-SAN system, if the distance between the production center and disaster recovery center is less than 200 kilometers, synchronous data copying is typically used. In this scenario, a WDM optical network is recommended because it ensures the high bandwidth and low latency required for synchronous data copying, while also providing high scalability and reliability. However, a WDM optical network is more expensive than an IP WAN. If the distance between the production center and disaster recovery center is greater than 200 kilometers, the asynchronous data copying mode must be used. This mode does not require high performance from the WAN, so the more economical IP WAN or the Internet is usually the better choice. If the FC-SAN system also requires high reliability, a WDM optical network can still be used as the WAN.
An IP-SAN system typically uses the asynchronous data copying mode. Because data saved in an IP-SAN system is not as important as that saved in an FC-SAN system, an IP WAN or the Internet is recommended.
The following table summarizes the disaster recovery types, data copying modes, and wide area networking modes discussed so far.
SAN System | Distance | Recommended Disaster Recovery Type | Recommended Data Copying Mode | Recommended WAN Type |
FC-SAN | < 200 km | Hardware | Synchronous | WDM |
FC-SAN | > 200 km | Hardware or software | Asynchronous | IP WAN/Internet/WDM |
IP-SAN | – | Software | Asynchronous | IP WAN/Internet |
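The recommendations in the table can be expressed as a small lookup function. This is a sketch only: the function and type names are hypothetical, and the handling of a distance of exactly 200 kilometers (treated here as asynchronous) is an assumption, because the table covers only distances below and above that value.

```python
from typing import NamedTuple, Optional


class Recommendation(NamedTuple):
    recovery_type: str
    copy_mode: str
    wan_type: str


SYNC_DISTANCE_LIMIT_KM = 200  # threshold taken from the table


def recommend(san_type: str, distance_km: Optional[float] = None) -> Recommendation:
    """Return the table row that matches a SAN type and site distance."""
    san = san_type.strip().upper()
    if san == "IP-SAN":
        # Distance does not affect the recommendation for IP-SAN.
        return Recommendation("Software", "Asynchronous", "IP WAN/Internet")
    if san == "FC-SAN":
        if distance_km is None:
            raise ValueError("distance_km is required for FC-SAN")
        if distance_km < SYNC_DISTANCE_LIMIT_KM:
            return Recommendation("Hardware", "Synchronous", "WDM")
        # Assumption: exactly 200 km is treated like the > 200 km row.
        return Recommendation("Hardware or software", "Asynchronous",
                              "IP WAN/Internet/WDM")
    raise ValueError(f"unknown SAN type: {san_type!r}")
```

For instance, `recommend("FC-SAN", 80)` returns the hardware, synchronous, WDM row, while `recommend("IP-SAN")` returns the software, asynchronous, IP WAN/Internet row.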
With more than 20 years of experience in the telecommunications industry and 15 years of experience with IP networking, Huawei offers leading optical transmission and IP routing technologies in all enterprise disaster recovery solutions. These solutions use Huawei OTN devices and routers that customers can choose from to meet specific requirements.
Huawei’s IP & optical multi-layer Long-Distance Disaster Recovery solution is shown below:
Disaster recovery at the optical layer — OTN solution
Huawei OSN series OTN devices have industry-leading WAN transmission capabilities. These devices create a WDM optical network for high-speed, real-time transmission. The OSN series OTN devices are especially suited for disaster recovery systems that require large capacity and low transmission latency, such as FC-SAN systems.
Each OSN device carries 80 channels at 40 Gbit/s or 100 Gbit/s per channel, for a total capacity of 3.2 Tbit/s or 8 Tbit/s. The capacity can be easily expanded by adding channels (wavelengths). The OSN series devices support fourteen SAN protocol interfaces and have compatibility certificates from seven mainstream device vendors. The devices allow up to 3,000 kilometers between the production center and the disaster recovery center, the largest distance in the industry. In addition, the devices include carrier-class 50-ms protection switching for high reliability.
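The capacity figures above follow from multiplying the per-channel rate by the number of channels. The short sketch below reproduces that arithmetic; the function name is illustrative, and the default of 80 channels is taken from the figures quoted in the text.

```python
def system_capacity_tbps(per_channel_gbps: float, channels: int = 80) -> float:
    """Total WDM capacity in Tbit/s for a given per-channel rate in Gbit/s."""
    return per_channel_gbps * channels / 1000.0


if __name__ == "__main__":
    for rate in (40, 100):
        print(f"{rate} Gbit/s x 80 channels = {system_capacity_tbps(rate):.1f} Tbit/s")
```

The same function shows how capacity scales as wavelengths are added: at 100 Gbit/s per channel, each additional channel contributes another 0.1 Tbit/s.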
Huawei OTN and DWDM devices have the highest market share (25%) in the global optical network market, including over 65% of the market share in China.
Disaster recovery at the IP layer — router solution
When enterprises want to use IP routing for WAN transmission, Huawei NE40E series routers can be used as egress routers. Egress routers transmit both service traffic between data centers and backup traffic on an IP SAN, reducing the investment in equipment. Service traffic and backup traffic are separated by VPN instances on egress routers.
Huawei NE series routers feature an industry-leading 400G platform that allows the bandwidth in each slot to be increased to 400 Gbit/s. These routers isolate backup traffic from service traffic using VPNs and ensure service quality with QoS capabilities. In addition, Huawei NE series routers implement protection switching within 200 ms to ensure reliability on the IP WAN.
Huawei is one of the top 3 router suppliers in the world. Huawei NE40E series routers are widely used on carrier networks and industry networks, with sales growing 60% every year.
Data is critical to developing enterprise business and disaster recovery is key to preserving data. When building a data center disaster recovery system, enterprises should select the most suitable disaster recovery type, data copying mode, and WAN type based on their unique business requirements and system characteristics. In this way, customers can maximize expected benefits and minimize investment.