
In today's fast-paced, technology-driven market, businesses in general, and telecom operators in particular, have started to focus on ensuring High Availability (HA) of the services they offer. High availability means that end users (resellers or subscribers, in the case of a telecom operator) can access data and applications whenever needed, with an acceptable level of performance. It provides redundancy and downtime tolerance, and reduces the risk of losing revenue and customers in the event of an internet connectivity failure or a power outage.

With high availability, the user is able to perform maintenance without downtime or failed transactions.

Availability is usually expressed as a percentage of uptime; a highly available system falls between 99.9% and 100% on the spectrum of system availability. In this context, reliability (of hardware and software components) and performance (response time, throughput, etc.) are both parts of system availability.
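To make these percentages concrete, the short sketch below converts an availability target into the downtime it allows per year. It is illustrative arithmetic only: the function name and the choice of a 365-day year are assumptions made for this example, not part of any SDS product.

```python
# Allowed downtime for a given availability percentage (illustrative arithmetic only).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.1f} minutes of downtime per year")
```

At 99.9%, a service may be down for roughly 525.6 minutes (under nine hours) a year; each additional "nine" cuts that budget by a factor of ten.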

The performance of a system depends on the database servers that execute operations such as creating and fetching data. Two concepts are useful here: local redundancy and site redundancy. Local redundancy refers to the Primary site (PR), which provides read and write redundancy; data backups are also created from this site. Site redundancy refers to the Disaster Recovery (DR) site, where backups of the data are stored in real time. When the primary site is overloaded with traffic, users may suffer from service degradation: the servers slow down and users face downtime, posing a threat to business continuity. This service unavailability is a major concern for operators, who take measures to reduce the chance of a transaction failing.

Seamless Distribution Systems’ all-in-one high-availability and disaster-tolerant solution, the “Multi-Site Cluster solution”, helps service providers avoid service degradation and loss of data, and ensures business continuity at all times.

The Multi-Site Cluster Solution enables the configuration of servers in multiple locations, providing downtime tolerance and high availability across sites. It allows real-time duplication of data to redundant servers, which increases the reliability of the system and improves its actual performance.

Redundancy usually takes the form of a backup or fail-safe. SDS’ platform supports redundancy at two levels:

  • Transaction level: multiple transaction (TX) nodes are used, so if one TX node goes down, the system remains outage-free.
  • Database level: data is replicated asynchronously from master database servers to secondary database servers, and from Primary site DB servers to Secondary site DB servers.
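As a rough illustration of transaction-level redundancy, the sketch below tries a list of TX nodes in order and returns the first successful response. Everything in it is hypothetical (the node functions, the `send_with_failover` helper, the request string); it shows the general failover idea rather than how SDS' platform actually routes requests.

```python
from typing import Callable, Sequence


class AllNodesDown(Exception):
    """Raised when no transaction node could process the request."""


def send_with_failover(request: str, nodes: Sequence[Callable[[str], str]]) -> str:
    """Try each TX node in order and return the first successful response."""
    for node in nodes:
        try:
            return node(request)
        except ConnectionError:
            continue  # this node is down; move on to the next one
    raise AllNodesDown(f"no TX node could process {request!r}")


# Two hypothetical TX nodes: the first is down, the second is healthy.
def tx_node_1(request: str) -> str:
    raise ConnectionError("tx-node-1 is unreachable")


def tx_node_2(request: str) -> str:
    return f"tx-node-2 processed {request}"


print(send_with_failover("top-up #1", [tx_node_1, tx_node_2]))
```

Because the caller only sees the final response, the loss of one TX node is invisible to it as long as at least one other node is reachable.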

Transaction requests typically land on the primary site, which maintains two-way replication of data: between its own data servers and with the Disaster Recovery site.

However, instances may arise in which the connection between the DR and the PR site breaks, resulting in service degradation or service unavailability.

For maximum output and redundancy, SDS uses Galera replication to provide Active/Active synchronization of data. Galera replication is multi-master, (virtually) synchronous replication, which enables:

  • Read and write to any node
  • No master failover, no slave lag
  • Guaranteed write consistency
  • Cluster-wide conflict resolution

The solution consists of a minimum of 5 DB servers for the transactional databases, working in cluster mode: 2 at the primary site, 2 at the disaster recovery site and 1 arbitrator. With this in place, a telecom operator can load balance, directing 50% of the traffic to the primary site and 50% to the disaster recovery site (which now acts as a secondary site, since it accepts transaction requests in real time). Data is synchronized in real time between the primary site, the secondary site and the arbitrator.

Therefore, if any one site fails, for example the primary site goes down, the secondary site joins the arbitrator to form a majority (3 of 5) and continues processing transactions. If the arbitrator goes down, 4 servers are still available to continue processing. In short, the surviving servers remain in the majority after any single such failure, enabling business continuity in case of a power outage at any of the database servers. This takes redundancy to a whole new level, where the failure of a single site does not cause the user to experience service failure or unavailability.
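The majority reasoning can be checked with a few lines of code. The sketch below is a simplified model: it assumes every member (including the arbitrator) carries one equal vote and that a strict majority of the 5 members is needed to keep processing. The node names and the `has_quorum` helper are hypothetical, and a real Galera cluster's quorum calculation has more detail than this.

```python
from typing import Set

# Hypothetical node names for the 5-member layout: 2 PR, 2 DR, 1 arbitrator.
CLUSTER = {
    "pr-db-1": "primary",
    "pr-db-2": "primary",
    "dr-db-1": "dr",
    "dr-db-2": "dr",
    "arbitrator": "arbitrator",
}


def has_quorum(failed_locations: Set[str]) -> bool:
    """True if the members outside the failed locations form a strict majority."""
    surviving = [name for name, loc in CLUSTER.items() if loc not in failed_locations]
    return len(surviving) > len(CLUSTER) / 2


print(has_quorum({"primary"}))                # 3 of 5 members survive
print(has_quorum({"arbitrator"}))             # 4 of 5 members survive
print(has_quorum({"primary", "arbitrator"}))  # only 2 of 5 members survive
```

Under these assumptions, losing either site or the arbitrator alone leaves a majority, while losing a site and the arbitrator together does not. This is why the arbitrator is placed separately from both sites.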

Figure: Database traffic

Figure: Galera replication

Services on both sites are active at the same time, providing redundancy and high availability. The result is a high-availability solution that is both robust in terms of data integrity and high-performing, with instant failovers.

Benefits of the Multi-Site Cluster Solution:

  1. All-in-one solution: Workload mobility, availability and disaster recovery are effectively synchronized.
  2. Ensures business continuity: Users have ready access to your servers at all times.
  3. Multiple levels, multiple sites: The Multi-Site Cluster Solution provides full-scale redundancy over servers running in separate data centers (primary and DR site).
  4. Synchronous replication: The Galera cluster synchronization technique ensures the storage, replication and availability of data.

Author Mahwish Ilyas
