Using Replication

This chapter provides information about how to use the replication feature in Cisco Prime Access Registrar (Prime Access Registrar).


Note: When using replication, use the aregcmd command-line interface to make configuration changes to the Prime Access Registrar server. Replication is not supported when using the GUI.


Replication Overview

The Prime Access Registrar replication feature can maintain identical configurations on multiple machines simultaneously. When replication is properly configured, changes an administrator makes on the primary (master) machine are propagated by Prime Access Registrar to a secondary (slave) machine.

Replication eliminates the need for administrators with multiple Prime Access Registrar installations to make the same configuration changes at each installation. Instead, only the master's configuration needs to be changed; the slave is configured automatically, which avoids repetitive, error-prone configuration changes at each individual installation. In addition to enhancing server configuration management, using replication eliminates the need for a hot-standby machine.

Using a hot-standby machine is a common practice to provide more fault tolerance: a fully installed and configured system stands ready to take over should the primary machine fail. However, a system set up for hot standby is essentially an idle machine that is used only when the primary system fails. Hot-standby or secondary servers are expensive resources. With Prime Access Registrar's replication feature, both servers can perform RADIUS request processing simultaneously, eliminating wasted resources.

The replication feature focuses on configuration maintenance only, not session information or installation-specific information such as Administrator, Interface, Replication, or Advanced machine-specific configuration changes. These configuration items are not replicated because they are specific to each installation and are not likely to be identical between master and slave. While changes to Session Managers, Resource Managers, and Remote Servers are replicated to the slave and stored in the slave's configuration database, they are not hot-configured on the slave (see Understanding Hot-Configuration below for more information).

Changes should be made only on the master server. Changes made on a slave server are not replicated and might result in an unstable configuration on the slave. Changes made through replication are not reflected in existing aregcmd sessions, because aregcmd loads its configuration only at startup and is not dynamically updated. For example, if aregcmd is running on the slave, and on the master aregcmd is used to add a client, the new client, while correctly replicated and hot-configured, will not be visible in the slave's aregcmd until aregcmd is exited and restarted.

When there is a configuration change, the master server propagates the change set to all member servers over the network. All member servers have to update their configuration after receiving the change set notifications from the master server. Propagating the change set to a member server involves multiple packet transfers from the master server to the member, because the master server has to convey all the configuration changes to the member. The number of packets to be transferred depends on the size of the change set.

After receiving a change set notification, the member server goes offline before applying the change set received from the master server. This state is indicated by the log message Radius Server is Off-Line in the name_radius_1_log file. When the change set is successfully applied, the member server comes back online automatically. This is indicated by the log message Radius Server is On-Line in the name_radius_1_log file. While the member server is offline applying the change set, no incoming packets are processed.

Because of the number of packets to be transferred in the change set and the amount of time the member server will be offline updating its database, we recommend that you use multiple save commands rather than a large configuration change with one save command. You can also minimize the number of changes that occur in a replication interval by modifying the RepTransactionArchiveLimit property, the RepTransactionSyncInterval property, or both. For example, instead of using the default value of 100 for the RepTransactionArchiveLimit, you might change it to 20.

Note: The IP address format is enhanced to support both IPv4 and IPv6.


How Replication Works

The following sections describe the flow of a simple replication as it occurs under normal conditions:

Replication Data Flow

The following sections describe data flow on the master server and the slave server:

Master Server

The master server or primary server is the fully configured machine that is used to archive all the transactions that have taken place in Prime Access Registrar.

Performing the Data Flow for the Master Server

To perform data flow for the master server:


Step 1: The administrator makes a change to the master server’s configuration using the aregcmd command-line interface (CLI) and issues a save command.

Step 2: After the changes are successfully validated, the changes are stored in the Prime Access Registrar database.

Step 3: aregcmd then notifies the Prime Access Registrar server executing on the master of the configuration change.

Step 4: The Prime Access Registrar server then updates its version of the configuration stored in memory. (This is called hot-config because it happens while the server is running and processing requests.)

Step 5: The Prime Access Registrar server first copies the changes pertaining to the aregcmd save, also known as a transaction, to its replication archive, then transmits the transaction to the slave server for processing.

Step 6: In aregcmd, the prompt returns, indicating that the save has completed successfully, the transaction has been archived, and the transaction has been transmitted to the slaves.


 

Slave Server

The slave server or secondary server is a fully installed and configured system that stands ready to take over when the primary machine fails.

Performing Data Flow for the Slave Server

To perform data flow for the slave server:


Step 1: When the slave server receives the transaction, its contents are verified.

Step 2: After verification, the changes are applied to the slave server's database.

Step 3: The changes are then applied (hot-configured) in the slave server's in-memory configuration.

Step 4: The transaction is written to the slave server's replication archive.


 

Security

Replication has two primary security concerns:

1. Security of the transactions transmitted to the slave server

2. Storage of transactions in the replication archive

Both of these concerns are addressed with shared secret (MD5) encryption, using the shared secret specified in the replication configuration on both master and slave servers. Replication data transmitted between master and slave is encrypted at the source and decrypted at the destination in the same way as standard RADIUS packets exchanged between Prime Access Registrar's clients and the Prime Access Registrar server. Transactions written to the replication archive are encrypted in the same manner and decrypted when read from the replication archive.

Replication Archive

The replication archive serves two primary purposes:

  • To provide persistent, or saved, information regarding the last successful transaction
  • To persist transactions in case the slave server requires resynchronization (see Ensuring Data Integrity below for more information on resynchronization)

The replication archive is simply a directory located in ../CSCOar/data/archive. Each transaction replicated by the master is written to this directory as a single file. The name of each transaction file is of the form txn########## where ########## is the unique transaction number assigned by the master server. The replication archive size, that is, the number of transaction files it can contain, is configured in the RepTransactionArchiveLimit replication configuration setting. When the RepTransactionArchiveLimit is exceeded, the oldest transaction file is deleted.
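The pruning rule described above can be illustrated with a short Python sketch. This is not the product's implementation; it only models the stated behavior (one file per transaction, oldest file deleted once the limit is exceeded). The function name prune_archive and the regular expression are hypothetical, and the sketch assumes the transaction number can be read as a decimal integer from the file name.

```python
import os
import re
import tempfile

# Assumption: archive files are named "txn" followed by decimal digits.
TXN_NAME = re.compile(r"^txn(\d+)$")


def prune_archive(archive_dir, limit):
    """Delete the oldest transaction files until at most `limit` remain.

    Returns the names of the files that were removed, oldest first.
    """
    numbered = []
    for name in os.listdir(archive_dir):
        match = TXN_NAME.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    numbered.sort()  # lowest transaction number (oldest) first

    excess = max(0, len(numbered) - limit)
    removed = []
    for _, name in numbered[:excess]:
        os.remove(os.path.join(archive_dir, name))
        removed.append(name)
    return removed


if __name__ == "__main__":
    # Demonstration in a throwaway directory, never in a real archive.
    demo_dir = tempfile.mkdtemp()
    for number in range(1, 6):
        open(os.path.join(demo_dir, "txn%010d" % number), "w").close()
    print(prune_archive(demo_dir, 3))
```

Run only against a scratch directory; the real archive is managed by the server itself.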

Ensuring Data Integrity

Prime Access Registrar's configuration replication feature ensures data integrity through transaction data verification, transaction ordering, automatic resynchronization, and manual full resynchronization. With the single exception of manual full resynchronization, each of these techniques helps to automatically ensure that master and slave servers contain identical configurations. A detailed description of each technique follows.

Transaction Data Verification

When the master prepares a transaction for replication to a slave, the master calculates a 2's complement Cyclic Redundancy Check (CRC) for each element (individual configuration change) in the transaction and for the entire transaction, and includes these CRC values in the transmitted transaction. When the slave receives the transaction, the slave calculates a CRC for each transaction element and for the entire transaction and compares its own calculated values with those sent with the message. If the comparison reveals a discrepancy, the transaction element or the entire transaction is discarded, and the slave requests a retransmission of that particular transaction element or of the entire transaction from the master. This process is called automatic resynchronization (described in more detail below).
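The per-element and whole-transaction checks can be sketched in Python as follows. This is only an illustration of the verification idea: it uses zlib.crc32 from the Python standard library as a stand-in checksum, which is an assumption and not necessarily the CRC variant the product uses. The function names and the dictionary layout are hypothetical.

```python
import zlib


def pack_transaction(elements):
    """Sender side: attach a CRC per element and one for the whole transaction."""
    elements = list(elements)
    return {
        "elements": elements,
        "element_crcs": [zlib.crc32(e) for e in elements],
        "crc": zlib.crc32(b"".join(elements)),
    }


def verify_transaction(txn):
    """Receiver side: recompute every CRC and compare with the transmitted values.

    Returns (bad_indexes, whole_ok): the indexes of elements whose CRC does
    not match, and whether the whole-transaction CRC matches.
    """
    bad_indexes = [
        index
        for index, (element, crc) in enumerate(
            zip(txn["elements"], txn["element_crcs"])
        )
        if zlib.crc32(element) != crc
    ]
    whole_ok = zlib.crc32(b"".join(txn["elements"])) == txn["crc"]
    return bad_indexes, whole_ok
```

A receiver that gets a non-empty list of bad indexes, or a failed whole-transaction check, would discard the affected data and ask for it again, as the paragraph above describes.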

Transaction Order

When the master prepares a transaction for replication, it assigns the transaction a unique transaction number. This number is used to ensure the transactions are processed by the slave in exactly the same order as they were processed on the master. Transactions are order dependent. Because the purpose of Prime Access Registrar's configuration replication feature is to maintain identical configurations between master and slave, master and slave would not contain identical configurations if transaction order were not retained. Consider a case where two transactions modify the same thing (a defined client's IP address, for example). If the first transaction was a mistake and the second was the desired result, the client configuration on the master would contain the second setting; however, if the transactions were processed in the reverse order on the slave, the client configuration on the slave would contain the mistaken IP address. This example illustrates the critical need for transaction ordering to ensure data integrity.
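The client IP address example can be reproduced with a small Python sketch. The tuple layout (number, key, value), the key name, and the addresses are hypothetical and chosen only for illustration; the point is that applying transactions in ascending transaction-number order yields the same final value regardless of arrival order.

```python
def apply_in_order(config, transactions):
    """Apply (number, key, value) transactions in ascending number order."""
    for _, key, value in sorted(transactions, key=lambda txn: txn[0]):
        config[key] = value
    return config


if __name__ == "__main__":
    # Transaction 1 was the mistake, transaction 2 the correction.
    # They arrive here in reverse order on purpose.
    arrived = [
        (2, "client1.IPAddress", "10.0.0.2"),
        (1, "client1.IPAddress", "10.0.0.99"),
    ]
    print(apply_in_order({}, arrived))
```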

Automatic Resynchronization

Automatic Resynchronization is the most significant feature with respect to data integrity. This feature ensures the configurations on both the master and slave are identical. If they are not, this feature automatically corrects the problem.

When the master and slave start up, they determine the transaction number of the last replication transaction from their respective replication archives. The master immediately begins periodic transmission of a TransactionSync message to the slave. This message informs the slave of the transaction number of the transaction that the master last replicated.

If the transaction number in the TransactionSync message does not match the transaction number of the last received transaction in the slave's archive, then the slave will request resynchronization from the master. The resynchronization request sent by the slave will include the slave's last received transaction number.

The master will respond by retransmitting each transaction since the last transaction number indicated by the slave in the resynchronization request. The master obtains these transactions from its replication archive.

Should the slave's last received transaction number be less than the lowest transaction number in the master's replication archive, then automatic resynchronization cannot occur as the master's replication archive does not contain enough history to synchronize the slave. In this case, the slave must be resynchronized with a full-resynchronization.
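The decision described in the preceding paragraphs can be summarized in a Python sketch. It is a simplified model under stated assumptions, not the server's logic: the function name, the return labels, and the argument names are hypothetical, and the full-resynchronization condition follows the text literally (slave's last transaction number lower than the lowest number in the master's archive). Cases the text does not describe, such as a slave that is ahead of the master, are not modeled.

```python
def resync_action(slave_last, master_last, master_archive_numbers):
    """Decide what a slave needs after receiving a TransactionSync message.

    slave_last: last transaction number in the slave's archive
    master_last: transaction number carried by the TransactionSync message
    master_archive_numbers: transaction numbers still held in the master's archive
    """
    if slave_last == master_last:
        return ("in_sync", [])

    lowest = min(master_archive_numbers) if master_archive_numbers else None
    if lowest is None or slave_last < lowest:
        # The archive does not reach back far enough.
        return ("full_resync", [])

    # The master retransmits every transaction after the slave's last one.
    missing = sorted(n for n in master_archive_numbers if n > slave_last)
    return ("auto_resync", missing)
```

For example, a slave at transaction 3 with a master at transaction 4 would be sent transaction 4 only, while a slave at transaction 2 facing an archive that starts at 21 would need a full resynchronization.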

Full Resynchronization

Full Resynchronization means that the slave has missed more transactions than are stored in the master's replication archive and cannot be resynchronized automatically. There is no automatic full-resynchronization mechanism in Prime Access Registrar's configuration replication feature. To perform a full resynchronization, see Full Resynchronization.

Understanding Hot-Configuration

Hot-Configuration is the process of reflecting configuration changes made to Prime Access Registrar's internal configuration database in the in-memory configuration of the executing Prime Access Registrar server. Hot-Configuration is accomplished without interruption of RADIUS request processing. For example, if an administrator uses aregcmd to configure a new client and issues a save command, when the prompt returns, the newly configured client can send requests to Prime Access Registrar.

Hot-Configuration minimizes the downtime associated with having to restart a Prime Access Registrar server to put configuration changes into effect. With the Hot-Configuration feature, a restart is only necessary when a Session Manager, Resource Manager, or Remote Server configuration is modified. These configuration elements cannot be hot-configured because they maintain state (an active session, for example) and cannot be modified without losing the state information they maintain. Changes to these configuration elements require a restart of Prime Access Registrar to put them into effect.

Hot-Configuration’s only connection to the replication feature is that when a change is replicated to the slave, the slave is hot-configured to reflect the replicated change as if an administrator had used aregcmd to make the changes directly on the slave server.

Note: Any misconfiguration of replication will impact the performance of Prime Access Registrar. A manual resynchronization is required to rectify the misconfiguration.


Replication’s Impact on Request Processing

The replication feature was designed to perform replication of transactions with minimal impact on RADIUS request processing. When a transaction is received by a slave, RADIUS requests are queued while the transaction is applied to the slave. After the transaction is complete, RADIUS request processing resumes.

The impact on RADIUS request processing is a direct result of the size of a transaction: the smaller the transaction, the smaller the impact, and the larger the transaction, the greater the impact. In other words, when making changes to the master, frequent saves are better than making many changes and then saving. Each change is one transaction element, and all changes involved in a save comprise a single transaction with one element per change. Because the replication feature only impacts RADIUS request processing when changes are made, the impact under normal operation (when changes are not being made) is virtually unmeasurable.

Replication Configuration Settings

This section describes each replication configuration setting. In aregcmd, replication settings are found in //localhost/Radius/Replication.

RepType

RepType indicates the type of replication. The choices available are SMDBR and NONE.

When RepType is set to NONE, replication is disabled. To enable replication, set RepType to SMDBR for Single Master DataBase Replication. RepType must be set to SMDBR on both the master and slave servers.

RepTransactionSyncInterval

Master

On the master server, RepTransactionSyncInterval is the duration between periodic transmission of the TransactionSync message expressed in milliseconds. The default is 60000 or 1 minute.

The purpose of RepTransactionSyncInterval is to indicate how frequently to check for an out-of-sync condition between the master and slave servers. When the slave receives the TransactionSync message, it uses its contents to determine if it needs to resynchronize with the master.

The larger the setting for RepTransactionSyncInterval, the longer an out-of-sync condition can go undetected. However, if RepTransactionSyncInterval is set too small, the slave can frequently request resynchronization when it is not really out of sync: if the duration is too short, the slave might not have completely received a transaction before it receives the TransactionSync message. In this case, the servers will remain synchronized, but there will be unnecessary excess traffic that could affect performance.

Note: We recommend that you use smaller values for the RepTransactionSyncInterval to limit the time a slave server is offline applying change sets during automatic resynchronization.


Slave

On the slave, RepTransactionSyncInterval is used to determine if the slave has lost contact with the master and to alert administrators of a possible loss of connectivity between the master and slave. If the elapsed time since the last received TransactionSync message exceeds the setting of RepTransactionSyncInterval, the slave writes a log message indicating that it might have lost contact with the master. This log message is repeated each RepTransactionSyncInterval until a TransactionSync message is received.

RepTransactionArchiveLimit

On both master and slave, the RepTransactionArchiveLimit setting determines how many transactions can be stored in the archive. The default setting is 100. When the limit is exceeded, the oldest transaction file is deleted. If a slave requires resynchronization and the last transaction it received is no longer in the archive, a full resynchronization will be necessary to bring the slave back in sync with the master.

Note: The value set for RepTransactionArchiveLimit should be the same on the master and the slave.


An appropriate value for RepTransactionArchiveLimit depends upon how much hard disk space an administrator can provide for resynchronization. If this value is large, say 10,000, then the last 10,000 transactions will be stored in the archive. This is like saying the last 10,000 saves from aregcmd will be stored in the archive. Large values are best. The size of each transaction depends upon how many configuration changes were included in the transaction, so hard disk space usage is difficult to estimate.

Note: We recommend that you use smaller values for the RepTransactionArchiveLimit to limit the time a slave server is offline applying change sets during automatic resynchronization.


If the slave should go down or otherwise be taken offline, the value of RepTransactionArchiveLimit and the frequency of aregcmd saves will determine how long the slave can be offline before a full resynchronization will be required.
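As a rough worked example of this relationship, the following Python sketch estimates the offline window. It assumes saves arrive at a steady rate and that each save produces exactly one archived transaction; the function name is hypothetical and the result is only an approximation for planning purposes.

```python
def max_offline_hours(archive_limit, saves_per_hour):
    """Estimate how long a slave can be offline before the master's archive
    no longer holds every transaction the slave missed."""
    if saves_per_hour <= 0:
        raise ValueError("saves_per_hour must be positive")
    return archive_limit / saves_per_hour


if __name__ == "__main__":
    # Default limit of 100 with an assumed four saves per hour.
    print(max_offline_hours(100, 4))
```

With the default RepTransactionArchiveLimit of 100 and an assumed four saves per hour, the estimate is about 25 hours; lowering the limit to 20 shortens it to about 5 hours.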

There are two reasons why a slave server should have an archive:

1. The slave must save the last received transaction for resynchronization purposes (at a minimum).

2. Should the master go down, the slave can then be configured as the master and provide resynchronization services to other slaves.

RepIPAddress

The RepIPAddress value is set to the IP address of the machine containing the Prime Access Registrar installation.

Note: The IP address format is enhanced to support both IPv4 and IPv6.


RepPort

RepPort is the port used to receive replication messages. In most cases, the default value (1812) is sufficient. If another port is to be used, the interfaces must exist on the machine.

RepSecret

RepSecret is the replication secret shared between the master and slave. The value of this setting must be identical on both the master and the slave.

RepIsMaster

The RepIsMaster setting indicates whether the machine is a master or a slave. On the master, set RepIsMaster to TRUE. On the slave, set it to FALSE. Only the master can have this value set to TRUE, and there can be only one master.

RepMasterIPAddress

RepMasterIPAddress specifies the IP address of the master. On the master, set RepMasterIPAddress to the same value used in RepIPAddress above. On the slave, RepMasterIPAddress must be set to the IP address of the master.

Note: The IP address format is enhanced to support both IPv4 and IPv6.


RepMasterPort

RepMasterPort is the port used to send replication messages to the master. In most cases, the default value (1812) is sufficient; however, if another port is to be used, the interfaces must exist on the machine.

Rep Members Subdirectory

The Rep Members/ subdirectory contains the list of slaves to which the master will replicate transactions.

Rep Members/Slave1

Each slave is added much like a client is added. Each slave must have a configuration in the Rep Members directory to be considered part of the replication network by the master. The master will not transmit any messages or replications to servers not in this list, and any communication received from a server not in this list will be ignored.

Note: Although it is possible to configure multiple slaves with the same master, we have only considered a single-master/single-slave configuration. This is the recommended configuration.


Name

This is the name of the slave. The name must be unique.

IPAddress

This is the IP address of the slave.

Note: The IP address format is enhanced to support both IPv4 and IPv6.


Port

This is the port upon which the master will send replication messages to the slave.

Setting Up Replication

This section provides step-by-step instructions about how to configure replication on both the master and member servers. The “Replication Example” section shows an example of replication configuration.

If possible, open an xterm window on both the master and member. In each of these windows, change directory to $INSTALL/logs and run xtail to watch the logs. This allows you to watch replication log messages as they occur. If you are using a system that had a previous installation of Prime Access Registrar, delete all files located in the $INSTALL/data/archive directory if it is present on either the master or member systems.

Configuring The Master


Configuring the Master Server for Replication

To configure the master server for replication:


Step 1: On the machine that is to be the master, use aregcmd to navigate to //localhost/Radius/Replication.

Step 2: Set the RepType to SMDBR:

set RepType SMDBR

Step 3: Set the RepIPAddress to the IP address of the master:

set RepIPAddress 192.168.1.1

Step 4: Set the RepSecret to MySecret:

set RepSecret MySecret

Step 5: Set RepIsMaster to TRUE:

set RepIsMaster TRUE

Step 6: Set RepMasterIPAddress to the same value used in step 3:

set RepMasterIPAddress 192.168.1.1

Step 7: Change directory to /Radius/Advanced and set the MaximumNumberOfRadiusPackets property to 8192:

cd /Radius/Advanced

set MaximumNumberOfRadiusPackets 8192

Step 8: Change directory to Rep Members:

cd "rep members"

Note: You must enclose Rep Members in quotes due to the space in the name.


Step 9: Add member1:

add member1

Step 10: Change directory to member1:

cd member1

Step 11: Set the IPAddress to the IP address of the machine to be the member:

set IPAddress 192.168.1.2

Note: The RepPort and RepMasterPort properties on the Master must correspond to one of the ports configured in /Radius/Advanced/Ports, if one is configured. Otherwise, the default values for the RepPort and RepMasterPort properties are sufficient.


Step 12: Save the configuration:

save

Step 13: Reload the configuration:

reload


 

Configuring The Member


Configuring the Member Server for Replication

To configure the member server for replication:


Step 1: On the machine that is to be the member, use aregcmd to navigate to //localhost/Radius/Replication.

Step 2: Set the RepType to SMDBR.

set RepType SMDBR

Step 3: Set the RepIPAddress to the IP address of the member.

set RepIPAddress 192.168.1.2

Step 4: Set the RepSecret to MySecret.

set RepSecret MySecret

Step 5: Set RepMasterIPAddress to the IP address of the master (the same value used for RepIPAddress when configuring the master).

set RepMasterIPAddress 192.168.1.1

Step 6: Change directory to /Radius/Advanced and set the MaximumNumberOfRadiusPackets property to 8192.

cd /Radius/Advanced

set MaximumNumberOfRadiusPackets 8192

Step 7: If the Master has been configured to use a port other than the well-known (and default) RADIUS ports, configure each Member to use the same port.

Note: The RepPort and RepMasterPort properties on the Master must correspond to one of the ports configured in /Radius/Advanced/Ports, if one is configured. Otherwise, the default values for the RepPort and RepMasterPort properties are sufficient.


Step 8: Save the configuration:

save

Step 9: Reload the configuration:

reload


 

Verifying the Configuration

After both servers have successfully started, use aregcmd to make a small change to be replicated to the member server which you can easily verify. We recommend setting the description in //localhost/Radius to something like Test1. After you issue an aregcmd save and the prompt returns, run aregcmd on the member server and change directory to //localhost/Radius. Ensure that the description is set to Test1. If this was successful, then replication is properly configured and functional.

Replication Example

This section provides an example of replication and shows the actions that occur.

Adding a User

The Users object contains all of the information necessary to authenticate a user or authorize a user. Users in local UserLists can have multiple profiles. On the master server, use aregcmd to add a new user to the default user list.

Adding a New User

To add a new user:


Step 1: Change directory to //localhost/Radius/UserLists/Default.

Step 2: Enter the following:

add testuser

Step 3: Change directory to testuser.

cd testuser

Step 4: Set the password for testuser.

set password testuser

Step 5: Confirm the password by entering testuser again.

Step 6: Enter save to save the configuration.


 

Master Server’s Log

The log on the master shows the following:

***./name_radius_1_log ***
10/23/2013 23:17:07 name/radius/1 Info Server 0 Initiating Replication of Transaction 1 with 2 Elements.
10/23/2013 23:17:07 name/radius/1 Info Server 0 Replication Transaction #1 With 2 Elements Initiated
 

Member Server’s Log

The log on the member shows the following:

***./name_radius_1_log ***
10/23/2013 23:15:18 name/radius/1 Info Server 0 Radius Server is On-Line
10/23/2013 23:17:12 name/radius/1 Info Server 0 Committing Replication of Transaction 1 with 2 Elements.
10/23/2013 23:17:16 name/radius/1 Info Server 0 Replication Transaction #1 With 2 Elements Committed.
 

Verifying Replication

You can use one of two methods to verify that the new user testuser was properly replicated to the member:

  • Run aregcmd on the member and look at the default userlist to see if it is there.
  • Run radclient on the member and enter simple testuser testuser to create a simple access request packet (p001).

Enter p001 send to send it. When it returns with p002, enter p002 to see if it is an Access Accept packet or an Access Reject packet. If it is an Access Accept, the user was properly replicated to the member. Using radclient is the recommended method to validate that a user was properly replicated.

On the Master, use aregcmd to delete the user from the default user list and save the user list.

Master Server’s Log

The log on the master shows the following:

***./name_radius_1_log ***
10/23/2013 23:20:48 name/radius/1 Info Server 0 Initiating Replication of Transaction 2 with 1 Elements.
10/23/2013 23:20:48 name/radius/1 Info Server 0 Replication Transaction #2 With 1 Elements Initiated
 

Member Server’s Log

The log on the member shows the following:

***./name_radius_1_log ***
10/23/2013 23:20:53 name/radius/1 Info Server 0 Committing Replication of Transaction 2 with 1 Elements.
10/23/2013 23:20:57 name/radius/1 Info Server 0 Replication Transaction #2 With 1 Elements Committed.

Repeat the validation procedure above to ensure the user testuser is no longer present on the member.

Using aregcmd -pf Option

Prime Access Registrar's replication feature works well with aregcmd input files. An aregcmd input file contains a list of aregcmd commands. For example, if the initial configuration of Prime Access Registrar were constructed in an input file, the master and member could be configured for replication first; the input file could then be applied to the master, and its changes would be automatically replicated to the member.


To illustrate replication using an aregcmd input file:


Step 1: Create a text file called add5users with the following commands:

add /Radius/UserLists/Default/testuser1

cd /Radius/UserLists/Default/testuser1

set password testuser1

add /Radius/UserLists/Default/testuser2

cd /Radius/UserLists/Default/testuser2

set password testuser2

add /Radius/UserLists/Default/testuser3

cd /Radius/UserLists/Default/testuser3

set password testuser3

add /Radius/UserLists/Default/testuser4

cd /Radius/UserLists/Default/testuser4

set password testuser4

add /Radius/UserLists/Default/testuser5

cd /Radius/UserLists/Default/testuser5

set password testuser5

save

Step 2: On the master server, run the following command:

aregcmd -pf add5users
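An input file like add5users is repetitive enough that it can be generated rather than typed by hand. The following Python sketch produces the same sequence of commands shown in Step 1; it is only a convenience illustration. The function name is hypothetical, and setting each password equal to the user name simply mirrors the example above and is not a recommendation.

```python
def build_add_users_script(usernames, user_list="/Radius/UserLists/Default"):
    """Return aregcmd input-file text that adds each user and ends with one save."""
    lines = []
    for name in usernames:
        path = f"{user_list}/{name}"
        lines.append(f"add {path}")
        lines.append(f"cd {path}")
        lines.append(f"set password {name}")  # mirrors the example only
    lines.append("save")  # a single save means a single replicated transaction
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    users = [f"testuser{i}" for i in range(1, 6)]
    with open("add5users", "w") as handle:
        handle.write(build_add_users_script(users))
```

Because the generated file ends with one save, all of the changes are replicated as one transaction; splitting the users across several files, each with its own save, would produce several smaller transactions instead.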


 

Master Server’s Log

The log on the master shows the following:

***./name_radius_1_log ***
10/23/2013 23:27:08 name/radius/1 Info Server 0 Initiating Replication of Transaction 3 with 10 Elements.
10/23/2013 23:27:08 name/radius/1 Info Server 0 Replication Transaction #3 With 10 Elements Initiated
 

Member Server’s Log

The log on the member shows the following:

***./name_radius_1_log ***
10/23/2013 23:27:12 name/radius/1 Info Server 0 Committing Replication of Transaction 3 with 10 Elements.
10/23/2013 23:27:17 name/radius/1 Info Server 0 Replication Transaction #3 With 10 Elements Committed.
 

When the prompt returns, go to the member and use aregcmd to view /Radius/UserLists/Default. There should be five users there, named testuser1 through testuser5.

An Automatic Resynchronization Example

This example will illustrate resynchronization of the member. This will be accomplished by stopping the member, making changes on the master, then restarting the member forcing a resynchronization.

Performing Resynchronization of the Member

To perform resynchronization of the member:


Step 1: At the member, stop the Prime Access Registrar server:

/etc/init.d/arservagt stop

Step 2: At the master, run aregcmd and change directory to /Radius/UserLists/Default:

cd /Radius/UserLists/Default

Step 3: Enter the following:

add foouser

Step 4: Change directory to foouser.

cd foouser

Step 5: Set the password for foouser.

set password foouser

Step 6: Confirm the password by entering foouser again.

Step 7: Save the configuration:

save


 

Master Server’s Log

The log on the master shows the following:

***./name_radius_1_log ***
10/23/2013 23:31:02 name/radius/1 Info Server 0 Initiating Replication of Transaction 5 with 2 Elements.
10/23/2013 23:31:02 name/radius/1 Info Server 0 Replication Transaction #5 With 2 Elements Initiated
 

On the member, run /etc/init.d/arservagt start. The master's log then shows the following message:

***./name_radius_1_log ***
10/23/2013 23:33:19 name/radius/1 Info Server 0 Resynchronizing member1.
 

Member Server’s Log

The log on the member shows the following:

***./name_radius_1_log ***
11/07/2013 23:33:14 name/radius/1 Info Server 0 Radius Server is Off-Line
11/07/2013 23:33:14 name/radius/1 Info Server 0 Starting Replication Manager
11/07/2013 23:33:24 name/radius/1 Info Server 0 Master Selected As Partner (DEFAULT)
11/07/2013 23:33:24 name/radius/1 Info Server 0 Radius Server is Off-Line
11/07/2013 23:33:24 name/radius/1 Warning Server 0 Requesting resynchronization from Master: Last Txn#3
11/07/2013 23:33:24 name/radius/1 Info Server 0 Resynchronization from Master in progress.
11/07/2013 23:33:24 name/radius/1 Info Server 0 Committing Replication of Transaction 4 with 2 Elements.
11/07/2013 23:33:28 name/radius/1 Info Server 0 Replication Transaction #4 With 2 Elements Committed.
11/07/2013 23:33:28 name/radius/1 Info Server 0 Radius Server is On-Line
 

As the log above shows, when the member started up, it compared its last received transaction number (#3) with the master's last replicated transaction number (#4). The numbers did not match because the master had initiated a replication that the member, being stopped, never received. When the member detected this discrepancy, it sent a resynchronization request to the master. The master responded by transmitting the missed transaction (#4) to the member. After it received and processed the retransmitted transaction, the member determined that it was synchronized with the master and placed itself in online status.

Full Resynchronization

A full resynchronization is required when the member has missed more transactions than are stored in the master's replication archive and therefore cannot be resynchronized automatically. There is no automatic full-resynchronization mechanism in Prime Access Registrar's configuration replication feature. If a full resynchronization is required, you must export the master server’s database and update the member configuration.
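The difference between an automatic and a full resynchronization can be expressed as a comparison of transaction numbers. The following is a simplified model, not the server's actual logic: it assumes transactions are numbered consecutively and that the archive holds a fixed number of the most recent ones. The function name resync_kind and the archive_size parameter are hypothetical.

```sh
#!/bin/sh
# Classify a member's state from three numbers:
#   $1  last transaction number the member received
#   $2  last transaction number the master replicated
#   $3  number of transactions kept in the master's replication archive
#       (archive_size is a hypothetical name for this value)
resync_kind() {
  member_last=$1
  master_last=$2
  archive_size=$3
  missed=$((master_last - member_last))
  if [ "$missed" -le 0 ]; then
    echo "in-sync"      # nothing was missed
  elif [ "$missed" -le "$archive_size" ]; then
    echo "automatic"    # missed transactions are still in the archive
  else
    echo "full"         # archive no longer covers the gap
  fi
}
```

With the numbers from the previous example (member at #3, master at #4), any archive holding at least one transaction yields an automatic resynchronization.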

Note: Before beginning, ensure there are no aregcmd sessions logged into the master server.


Performing a Manual Full-resynchronization

To perform a manual full-resynchronization:


Step 1 On the master server, stop the Prime Access Registrar server agent using the following command:

/etc/init.d/arserver stop

Step 2 On the master server, change directory to $INSTALL/data/db.

Step 3 Create a tarfile made up of the three database files, mcddb.d01, mcddb.d02, and mcddb.d03.

tar cvf /tmp/db.tar mcddb.d0*

Step 4 Create a tarfile of the archive.

tar cvf /tmp/archive.tar $INSTALL/data/archive

Step 5 On the master server, start the Prime Access Registrar server agent using the following command:

/etc/init.d/arserver start

Step 6 On each member server requiring resynchronization, perform the following:

a. On the member server, stop the Prime Access Registrar server agent using the following command:

/etc/init.d/arserver stop

b. Copy the tarfiles (db.tar and archive.tar) to /tmp on the member server.

c. Change directory to $INSTALL/data/db, then extract the database files from the tarfile.

cd $INSTALL/data/db

tar xvf /tmp/db.tar

d. Rebuild the key files using the following command:

$INSTALL/bin/keybuild mcddb

Note: This step might take several minutes.


e. Untar the archive.

cd $INSTALL/data/archive

tar xvf /tmp/archive.tar

f. As a safety check, run the following UNIX command to verify the integrity of the database.

$INSTALL/bin/dbcheck mcddb

Note: You must be user root to run dbcheck.


No errors should be detected.

g. Start the Prime Access Registrar server agent using the following command:

/etc/init.d/arserver start

Note: After you start the member server with the master server’s database, you will probably see messages such as the following:

11/07/2013 23:21:23 name/radius/1 Error Server 0 TXN_SYNC: Failed to get master's socket handle.
11/07/2013 23:21:49 name/radius/1 Warning Server 0 TXN_SYNC Received by Master from unknown member 10.1.9.74. Validation Failed

These messages will likely continue until you complete steps h and i.


h. Using aregcmd, change directory to /Radius/Replication and change the following attributes:

  • Change the RepIPAddress to that of the member.
  • Change RepIsMaster to FALSE.
  • Remove any entries under Rep Members.

i. Save and reload the configuration.

save

Validating //localhost...
Saving //localhost...
 

reload

The member will start up and show online status in the log after it has verified it is synchronized with the master.
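Before extracting db.tar on a member (step c above), it can help to confirm that the tarfile actually contains the three database files named in Step 3. The following is a minimal sketch of such a check; the function name check_db_tar is hypothetical, and the check assumes the tarfile was created from inside $INSTALL/data/db as described, so the member names carry no directory prefix.

```sh
#!/bin/sh
# Return 0 if the tarfile lists all three database files, 1 otherwise.
# check_db_tar is a hypothetical helper name used for illustration.
check_db_tar() {
  listing=$(tar tf "$1") || return 1
  for f in mcddb.d01 mcddb.d02 mcddb.d03; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$listing" | grep -qxF "$f" || {
      echo "missing $f in $1" >&2
      return 1
    }
  done
  return 0
}

# Example use: check_db_tar /tmp/db.tar && echo "db.tar looks complete"
```

The check only inspects file names; it does not replace the dbcheck integrity test in step f.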


 

Replication Setup with More Than One Slave

When replication is set up with more than one slave, Prime Access Registrar’s replication feature ensures that all the servers maintain identical configurations by forming a communication mesh. Every server chooses two partners for itself from the replication setup, and each server sends configuration updates to, and receives them from, its partners. This ensures that all the servers maintain identical configurations in spite of minor communication failures.

When a replication setup is brought up, each Prime Access Registrar server comes up first and then sends a partner sync request to all the replication members visible in its configuration. A slave server therefore sends its partner sync only to its master, because the master is the only server visible in the slave's configuration. The master server then broadcasts the partner syncs that it has received to all its replication members (slaves). Each server evaluates the workload from the sync messages the master sends it and selects its partners based on that evaluation. Choosing partners based on workload ensures that the workload is distributed equally across the partner network.

The partners are selected based on the count of partner syncs received from the master:

  • If one partner sync has been received, choose the master as a partner.
  • If two partner syncs have been received, choose the master and the other replication server as partners.
  • If more than two partner syncs have been received, perform a workload evaluation on the partners: identify two servers that do not yet have two partners and choose them as partners.
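The three rules above can be sketched as a small selection function. This is an illustration under stated assumptions, not the server's implementation: each received partner sync is represented as NAME:COUNT, where COUNT is the number of partners that server already has, the master is listed first, and "workload evaluation" is reduced to picking the first two servers with fewer than two partners. The function name choose_partners and the input format are hypothetical.

```sh
#!/bin/sh
# choose_partners NAME:COUNT [NAME:COUNT ...]
# One argument per partner sync received, master first.
# Prints the chosen partner names separated by spaces.
choose_partners() {
  total=$#
  picked=""
  n=0
  for entry in "$@"; do
    name=${entry%%:*}     # text before the first colon
    count=${entry##*:}    # text after the last colon
    # With one or two syncs, take every known server.
    # With more than two, take only servers that have fewer than two partners.
    if [ "$n" -lt 2 ] && { [ "$total" -le 2 ] || [ "$count" -lt 2 ]; }; then
      picked="${picked:+$picked }$name"
      n=$((n + 1))
    fi
  done
  echo "$picked"
}

# Example use: choose_partners master:2 slave2:1 slave3:0 slave4:0
```

In the example call, the master already has two partners, so the function skips it and selects slave2 and slave3.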