Requirements and Challenges
Human Resources and Social Security (HRSS) Disaster Recovery (DR) systems are essential to Golden Insurance Projects, which manage service information for social insurance programs such as unemployment and medical insurance. Because this data is critical to national stability and citizens' livelihoods, Golden Insurance Project information systems are of national importance. As a result, the State Council has issued multiple documents detailing specific requirements on data comprehensiveness, security, and core system availability, all of which DR systems can help HRSS agencies achieve.
HRSS Information Communications Technology (ICT) systems include a complicated set of application systems, such as social security, labor relationship, civil service, internal Office Automation (OA), and Operation and Maintenance (O&M) management systems.
Among these application systems, core service systems, such as social security service handling, public services, funds supervision, social insurance auditing, and statistical analysis and decision-making support systems, possess higher data security and reliability requirements.
Peripheral systems can tolerate data loss within a specified range. DR designs must therefore be customized to the characteristics of each peripheral system so that its Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements are met.
HRSS information system DR solutions face the following major challenges:
- DR systems must ensure high HRSS data security and reliability and zero data loss.
- Core application systems must be restored within minutes after a disaster, and peripheral systems must be restored within 1–2 hours after a disaster.
- During normal operation, the production system must keep running properly even when DR links or DR devices are faulty.
- DR systems must be easy to implement.
- Once deployed, DR systems must make it easy to verify data accuracy and to rehearse recovery on the backup systems.
- Given the customer's investment and the expected frequency of DR events, DR systems must be cost-efficient while ensuring unified, efficient O&M support.
- DR systems must support data and system migration from standby to active status.
- DR systems must feature high system scalability to be fully compatible with HRSS services.
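The recovery requirements in the list above can be expressed as a simple check against the results of a DR drill. The following is a minimal sketch, not part of the solution itself: the tier names, the `RecoveryTarget` type, and the `meets_target` function are hypothetical, and the numeric thresholds are illustrative. The text specifies only "within minutes" for core systems and "1–2 hours" for peripheral systems, so the 5-minute core RTO and the 15-minute peripheral RPO are assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTarget:
    """Recovery objectives for one class of system."""
    rpo: timedelta  # maximum tolerated data loss, measured as time
    rto: timedelta  # maximum tolerated time to restore service


# Illustrative values only. The text requires zero data loss and recovery
# "within minutes" for core systems and within 1 to 2 hours for peripheral
# systems; the 5-minute RTO and 15-minute peripheral RPO are assumptions.
TARGETS = {
    "core": RecoveryTarget(rpo=timedelta(0), rto=timedelta(minutes=5)),
    "peripheral": RecoveryTarget(rpo=timedelta(minutes=15), rto=timedelta(hours=2)),
}


def meets_target(tier: str, data_loss: timedelta, downtime: timedelta) -> bool:
    """Return True if a measured drill result satisfies the tier's targets."""
    target = TARGETS[tier]
    return data_loss <= target.rpo and downtime <= target.rto
```

For example, `meets_target("core", timedelta(0), timedelta(minutes=3))` passes, while any non-zero data loss on a core system fails the check regardless of how quickly service is restored.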
Huawei Solution
To meet customer requirements, Huawei offers an HRSS information system DR solution to enable application-level DR for core services and data-level DR for non-core services. Additionally, to address software architecture requirements for core HRSS services, Huawei's solution supports dual- or multi-center redundancy survivability capabilities. In other words, the DR center backs up data and processes data and services.
Solution Highlights
Two-Site, Three-Center DR Solution
To use resources efficiently, Huawei configures functions such as snapshot and clone on the storage arrays of the active provincial data center, the local provincial DR center, and the geographic provincial DR center. These functions accelerate the deployment of a mirrored test environment, allow the local DR center to take over part of the production center's services, and make fast DR tests possible.
Figure 2-1 Two-site, three-center DR solution overall architecture
As shown in Figure 2-1, the two-site, three-center DR solution consists of one production center (A), one local DR center (B), and one geographic DR center (C). Huawei deploys large-sized data storage systems at the local production center to store massive service data. To ensure data consistency, Huawei's solution uses synchronous data replication technology to write A's data into B's storage systems in real time.
Meanwhile, B uses asynchronous data replication technology to copy its data to C. If B is faulty, A instead replicates its data asynchronously to C, providing a geographic data backup and protection mechanism. If A is faulty, all applications are switched to the backup data servers in B or C to restore data access and ensure service continuity.
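The replication and failover rules described above can be sketched as a small routing function. This is an illustrative model only, assuming hypothetical names (`Site`, `replication_links`, `serving_site`); it is not the disk array management software's interface. Two points are assumptions rather than statements from the text: the sketch prefers B over C when A fails, and it returns no replication links once A is down, because the text does not describe replication in that state.

```python
from enum import Enum
from typing import Optional


class Site(Enum):
    A = "production center"
    B = "local DR center"
    C = "geographic DR center"


def replication_links(healthy: set) -> list:
    """Return the (source, target, mode) links active for the healthy sites."""
    links = []
    if Site.A in healthy and Site.B in healthy:
        # Normal operation: A writes synchronously to B, B copies to C.
        links.append((Site.A, Site.B, "sync"))
        if Site.C in healthy:
            links.append((Site.B, Site.C, "async"))
    elif Site.A in healthy and Site.C in healthy:
        # B is faulty: A replicates directly to C asynchronously.
        links.append((Site.A, Site.C, "async"))
    # A is down: the text does not specify replication, so return no links.
    return links


def serving_site(healthy: set) -> Optional[Site]:
    """Pick the site that serves applications.

    Preferring B over C after A fails is an assumption; the text says
    only that applications switch to "B or C".
    """
    for site in (Site.A, Site.B, Site.C):
        if site in healthy:
            return site
    return None
```

With all three sites healthy, `replication_links` returns the synchronous A-to-B link followed by the asynchronous B-to-C link. With B removed from the healthy set, it returns a single asynchronous A-to-C link.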
Data-level DR is available only for provincial service systems, not for service or office systems. Integrated disk array DR management software manages the three data centers uniformly and intuitively. Additionally, the provincial production center backs up service system data to backup media to guard against incorrect manual operations and Single Points of Failure (SPOFs) and to prepare the system for large-scale regional disasters.
Local Application-Level, Dual-Center Redundancy Solution
Core unified social security card service systems must keep all service units operating properly 24/7. To restore provincial and municipal social security services in the shortest possible time, the RTO and RPO must match service requirements. Where possible, a dual-center redundancy survivability system is ideal.
The local provincial DR center functions as the application-level DR center for the provincial production center, as well as the application- and data-level DR center for municipal production centers.
The active provincial data center and local DR center are online simultaneously. Both database and application servers use load balancing technologies, and core service systems use disk arrays, characterized by centralized storage, to improve system performance and data processing efficiency.
Geographic Data-Level DR Solution
Customers are advised to build the geographic DR center in a regional central city about 400 kilometers away from the active provincial data center. The geographic DR center must have mature infrastructure, reliable network connectivity, and a suitable environment (such as air cooling).
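The recommended distance helps explain why the geographic link relies on asynchronous replication. As a rough estimate, assuming light travels through optical fiber at about 200,000 km/s and ignoring equipment and routing overhead (real fiber routes are also longer than the straight-line distance), the minimum round-trip delay can be computed as follows. The function name and constant are illustrative.

```python
# Approximate speed of light in optical fiber (an assumed round figure).
FIBER_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km: float) -> float:
    """Estimate the minimum round-trip propagation delay over fiber, in ms."""
    return 2 * distance_km * 1000 / FIBER_SPEED_KM_PER_S


print(round_trip_ms(400))  # 4.0
```

Under these assumptions, a synchronous write to a site 400 kilometers away would wait at least 4 ms for the remote acknowledgement on every write, which is why synchronous replication is usually kept to the shorter local link.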
The geographic provincial DR solution at the data level provides a storage backup system for each city and a sharing and backup platform for local DR devices. With this platform, all municipal backup and storage devices are connected to the municipal local backup system, and copying policies are added to copy local data to the provincial sharing and backup platform.
Additionally, this solution copies backup data remotely to achieve geographic DR for provincial data. Two geographic DR modes are available:
- Each province is equipped with a geographic provincial DR center.
- All provinces use the shared DR center planned by the HRSS ministry.
The ministry-to-province data-level DR solution has two parts: one ministry-level shared backup platform and 32 provincial backup platforms. The ministry-level shared backup platform includes platform management servers, self-service servers, portal servers, and storage devices sized for the backup and performance requirements of the 32 provinces. Each provincial backup platform adds a backup and storage device that integrates seamlessly with the existing backup system, enabling data to be copied to the ministry-level shared backup platform.
Solution Highlights
Huawei's HRSS DR solution offers customers the following benefits:
- The DR design combines both synchronous and asynchronous modes to maximize customer investment while maintaining high availability.
- The local DR solution features large capacity, high bandwidth, and low delay, and uses a Dense Wavelength Division Multiplexing (DWDM) interconnection solution to enable real-time backups and ensure service continuity.
- The geographic DR solution features high efficiency, high bandwidth utilization, and long-distance transmission, and uses the MSTP/SDH interconnection solution to enable periodic backups and minimize natural disaster impacts.
- DR solutions at each data center level (such as the network, server, and Storage Area Network (SAN) levels) ensure comprehensive HRSS data backups.
- The SAN transmission solution optimizes HRSS service traffic and eases the load on Fibre Channel (FC) switches.
- The active data center and DR centers are uniformly managed to ensure a controllable DR system.
Customer Benefits
- High data security and service continuity ensure the proper operation of HRSS information systems.
- The local DR system ensures zero data loss and supports real-time service switchover.
- Fully functional backup networks enhance system reliability.
- Solution redundancy ensures low delay and provides reliable real-time backup.
- A heterogeneous network design for active and standby centers reduces exposure to network attacks.
Built on the two-site, three-center mode, Huawei's solution offers multiple deployment modes, such as active-and-standby, backup, and load balancing, to meet the dual-center or multi-center redundancy survivability requirements of HRSS customers and independent software vendors.