Reference

1. Architecture Overview

The diagram below outlines the SVN MultiSite Plus architecture, in terms of how the application is split up and how its component parts communicate with each other and the outside world.

[Diagram: SVN MultiSite Plus architecture]

Key points

2. Install Directory Structure

SVN MultiSite Plus is installed to the following path by default:

/opt/wandisco/svn-multisite-plus/
It's possible to install the files somewhere else on your server, although this guide will assume the above location when discussing the installation.

Inside the installation directory you'll find the following files and directories:

[DIRECTORY] drwxr-xr-x 2 wandisco wandisco 4096 Dec  5 01:16 bin
        -r-xr-xr-x 1 root root  9130 Feb 17 17:35 backup
        -r-xr-xr-x 1 root root  1630 Feb 17 17:35 rollback
        -r-xr-xr-x 1 root root  1569 Feb 17 17:35 svn-multisite-plus
        -r-xr-xr-x 1 root root 12571 Feb 17 17:35 talkback
        -rwxr-xr-x 1 root root  3764 Feb 17 17:35 watchdog
[DIRECTORY] drwxrwxr-x 2 wandisco wandisco 4096 Dec  5 13:21 config
        -rw-r--r-- 1 wandisco wandisco 240 Feb 18 17:33 main.conf
[DIRECTORY] -rw-rw-r-- 1 wandisco wandisco 256 Dec  5 13:56 lib
        -rw-r--r-- 1 root root 7142 Feb 17 17:35 init-functions.sh
[DIRECTORY] -rw-rw-r-- 1 wandisco wandisco 256 Dec  5 13:56 local-ui
        -rw-r--r-- 1 wandisco wandisco 1049 Feb 18 17:36 ui.properties
[DIRECTORY] drwxrwxr-x 2 wandisco wandisco 4096 Dec  5 14:20 logs
        -rw-r--r-- 1 wandisco wandisco    88 Feb 18 17:35 multisite.log
        -rw-r--r-- 1 wandisco wandisco   220 Feb 18 17:35 replicator.20140218-173532.log
        -rw-r--r-- 1 wandisco wandisco 15822 Feb 18 17:35 ui.20140218-173305.log
        -rw-r--r-- 1 wandisco wandisco  1902 Feb 18 17:35 watchdog.log
[DIRECTORY] drwxr-xr-x 8 wandisco wandisco 4096 Dec  5 14:20 replicator
        [DIRECTORY] drwxr-xr-x 2 wandisco wandisco   4096 Feb 18 17:33 content
        [DIRECTORY] drwxr-xr-x 5 wandisco wandisco   4096 Feb 18 17:35 database
        [DIRECTORY] drwxr-xr-x 4 root     root       4096 Feb 18 17:33 docs
        [DIRECTORY] drwxr-xr-x 2 wandisco wandisco   4096 Feb 18 17:33 export
        [DIRECTORY] drwxr-xr-x 6 root     root       4096 Feb 18 17:33 gfr
        [DIRECTORY] drwxr-xr-x 2 root     root       4096 Feb 18 17:33 lib
        [DIRECTORY] drwxr-xr-x 5 wandisco wandisco   4096 Feb 21 10:12 logs
        [DIRECTORY] drwxr-xr-x 2 wandisco wandisco   4096 Feb 18 17:35 properties
        [DIRECTORY] drwxr-xr-x 2 root     root       4096 Feb 18 17:33 properties.dist
        -rwxr-xr-x 1 root     root       7679 Feb 17 17:35 resetSecurity.jar
        -rw-r--r-- 1 root     root     315294 Feb 17 15:49 svn-ms-replicator-fsfsrestore.jar
        -rw-r--r-- 1 root     root     315280 Feb 17 15:49 svn-ms-replicator.jar
        -rw-r--r-- 1 root     root     315292 Feb 17 15:49 svn-ms-replicator-updateinetaddress.jar
        -rwxr-xr-x 1 root     root      23731 Feb 17 17:35 transformer-tool.jar
        -rw-r--r-- 1 root     root        605 Feb 17 15:49 VERSION-TREE
[DIRECTORY] drwxr-xr-x 3 root root 4096 Oct 17 15:29 resources
        [DIRECTORY] drwxr-xr-x 2 root root 4096 Oct 17 15:28 svn

3. Properties Files

The following files store application settings and constants that may need to be referenced during troubleshooting. However, you shouldn't make any changes to these files without consulting WANdisco's support team.

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties
file contains settings for the replicator and affects how MultiSite performs. [view sample]

** Alert! ** Temporary requirement:
If you manually add either connectivity.check.interval or sideline.wait to the application.properties file (probably under instruction from WANdisco's support team), you must add an "L" (Long value) to the end of their values so that they are converted correctly. See our sample application.properties file for all the properties that are suffixed as "Long".
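
As a rough illustration of the rule above, the following Python sketch scans the text of a properties file and reports which of the two named properties lack the trailing "L". The function name and the sample values are hypothetical; only the two property names and the "L" suffix rule come from this guide.

```python
# Properties named in this guide as needing a trailing "L" (Long) suffix.
LONG_PROPERTIES = ("connectivity.check.interval", "sideline.wait")


def find_missing_long_suffix(properties_text):
    """Return the names of LONG_PROPERTIES whose value does not end in 'L'."""
    missing = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or '!' in properties files).
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in LONG_PROPERTIES and not value.endswith("L"):
            missing.append(key)
    return missing


# Hypothetical sample values, for illustration only.
sample = """
connectivity.check.interval=10000
sideline.wait=36000000L
"""
print(find_missing_long_suffix(sample))  # ['connectivity.check.interval']
```

A check like this only reads the file content; it does not replace consulting WANdisco's support team before changing these files.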

/opt/wandisco/svn-multisite-plus/replicator/properties/logger.properties
contains properties that control how logging is handled. [view sample]

/opt/wandisco/svn-multisite-plus/replicator/properties/replicator-api-authorization.properties
contains properties that are used for managing remote access through the API. [view sample]

/opt/wandisco/svn-multisite-plus/replicator/properties/users.properties
contains the admin account details, which will be required when installing second and subsequent nodes. [view sample]

/opt/wandisco/svn-multisite-plus/local-ui/ui.properties
contains settings concerning the graphical user interface, such as widget settings and timeout values. The UI port number is stored in this file, which is considered the de facto record of this value, superseding the version stored in the main config file /opt/wandisco/svn-multisite-plus/config/main.conf. [view sample]

4. Replication Strategy

SVN MultiSite provides a toolset for replicating SVN repository data in a manner that can maximise performance and efficiency whilst minimising network and hardware resource requirements. The following examples provide you with a starting point for deciding on the best means to enable replication across your development sites.

4.1 Replication Model

In contrast with earlier replication products, SVN MultiSite Plus is no longer based upon a network proxy that handles file replication between replicas. Replication is now handled at the filesystem level, via FSFS.

SVN MultiSite Plus differs from earlier WANdisco replication products on a number of levels.

Limitations of the old model


SVN MultiSite Plus differs from earlier versions in that it replicates at the file system level.

4.1.1 Per-Repository Replication

SVN MultiSite is able to replicate on a per-repository basis. Each site can therefore host a different set of repositories, and different repositories can replicate as part of different replication groups.

4.1.2 Dynamic Membership Evolution


No need for a synchronized stop - SVN MultiSite Plus allows replication groups to change their membership on-the-fly.

A repository can only replicate to a single replication group at any one time, although it can be moved between replication groups as required. This can now be done on-the-fly: nodes can be added or removed without the need to pause all replication with a synchronized stop.

SVN MultiSite Plus offers a great deal of flexibility in how repository data is replicated. Before you get started it's a good idea to map out which repositories at which locations you want to replicate.

4.2 Creating Resilient Replication Groups

SVN MultiSite Plus is able to maintain SVN repository replication (and availability) even after the loss of nodes from a replication group. However, there are some configuration rules that are worth considering:

Rule 1: Understand Learners and Acceptors

The unique Active-Active replication technology used by SVN MultiSite Plus is an evolution of the Paxos algorithm. As such, it uses some Paxos concepts which are useful to understand:

Rule 2: Replication groups should have a minimum membership of three learner nodes

Two-node replication groups are not fault tolerant. You should strive to replicate according to the following guideline:

Rule 3: Learner Population - resilience vs rightness

Rule 4: 2 nodes per site provides resilience and performance benefits

Running with two nodes per site provides two important advantages.

4.3 Content Distribution Strategy

Preamble

WANdisco's replication protocol separates replication traffic into two streams: the coordination stream, which handles agreement between voter nodes, and the content distribution stream, through which SVN repository changes are passed to all other nodes (that is, "learner" nodes that store repository replicas).


SVN MultiSite Plus provides a number of tunable settings that let you apply different policies to content distribution. By default, content distribution is set up to prioritize reliability. You can instead set it to prioritize speed of replication, or apply a policy that prioritizes delivery of content to voter nodes. These policies can be applied on a per-site, per-repository and per-replication-group basis, giving you a fine level of control, provided that you take care to set things up properly.

Changing Content Distribution Policy

In order to set the policy, you need to make a manual edit to SVN MultiSite Plus's Application properties file:

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

** Alert! **Changes require a restart:
Changing the current strategy requires the modification of properties files that the replicator only reads at start-up. As a result any changes to strategy require that the replicator be restarted before the change will be applied.

Editable Content Distribution Properties

content.push.policy=faster
content.min.learners.required=true
content.learners.count=1

Above is an example Content Distribution Policy. We'll break down what each of the settings does:

content.push.policy

This property sets the priority for Content Distribution. It can take one of three values, described below. Each value tells MultiSite to use a different calculation for relating replication agreement to the actual delivery of replicated data.

"reliable" Policy:
Replication can continue if content is available to a sufficient number of learners (the value of content.learner.count, not including the node itself).
Reliable is the default setting for the policy. For a proposal to be accepted, which allows replication to continue, content must first be delivered to a minimum number of nodes (the value of content.learner.count, enforced strictly when content.min.learners.required is true).
This policy is reliable because it greatly reduces the risk that a proposal will be accepted when the corresponding content cannot be delivered (due to a LAN outage, for example). It is less likely to provide good performance because replication is kept on hold until content has been successfully distributed to a greater number of nodes than would be the case under the "faster" policy.

Setting the corresponding "content.learner.count" value
  • This value is the number of learners (excluding the originating node) to which content is delivered before a proposal will be raised for agreement.

  • If content.min.learners.required is false then the system will automatically lower content.learner.count to ensure that replication can continue in the event of a loss of node(s).

  • If the value is higher than the number of available nodes then SVN MultiSite is notified of the failure and the value is dropped further (based on 'content.min.learners.required').

  • If content.min.learners.required is true then SVN MultiSite Plus is notified of a failure to summon enough voters for agreement. content.learner.count is automatically dropped again (to equal one less than the number of available learners) or, if that's no longer possible, a failure is reported.

Setting the corresponding "content.min.learners.required" value

For the "reliable" policy to offer the utmost reliability, set this to "true".

** Alert! **true enforces the requirement
When content.min.learners.required is set to "true" SVN MultiSite Plus will not lower the content.learner.count in light of insufficient learner nodes being available.

Example:

content.learner.count=5, content.min.learners.required=true

After an outage there are now only 4 learner nodes in the replication group - replication will be halted because there aren't enough available learners to validate a content distribution check.

content.learner.count=5, content.min.learners.required=false

After an outage there are now only 4 learner nodes in the replication group - replication will not be halted because SVN MultiSite Plus will automatically drop the count to ensure that it doesn't exceed the total number of learners in the group.
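
The two cases in this example can be sketched in Python as follows. This is a simplified model rather than the product's actual logic: the function name is hypothetical, and it assumes the count is simply lowered to the number of learners still available when the requirement is not enforced.

```python
def effective_learner_count(configured_count, available_learners, min_learners_required):
    """Return the learner count to enforce, or None if replication must halt.

    Assumption: when the requirement is relaxed, the count drops to the
    number of learners that are still available.
    """
    if available_learners >= configured_count:
        # Enough learners are available; the configured value stands.
        return configured_count
    if min_learners_required:
        # The requirement is enforced, so replication halts.
        return None
    # The requirement is relaxed, so the count is lowered automatically.
    return available_learners


print(effective_learner_count(5, 4, True))   # None: replication is halted
print(effective_learner_count(5, 4, False))  # 4: replication continues
```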

"acceptors" Policy:
Voting can commence if content is delivered to 50% of voters (acceptors), including the node itself if it is a voter.

Content push policy only deals with delivering content to voters. This policy is ideal if you have a small number of voters. You don't want replication to continue until you have confirmed that at least half the voters have the corresponding payload. This policy supports a "follow-the-sun" approach to replication where the voter status of nodes changes each day to correspond with the locations where most development is being done. This ensures that the sites with the most development activity will get the best commit performance.

Setting the corresponding "content.learner.count" value

For the "acceptors" policy this is ignored.

Setting the corresponding "content.min.learners.required" value

For the "acceptors" policy this is ignored. Learners do not factor into the policy, only voters (acceptors).

"faster" Policy:
Replication can continue if content is available to x learners (not including self)
OR [delivered to half the voters, including self if it's a voter]
where x = content.learner.count

Setting the policy to 'faster' lowers the requirement for successful content distribution so that replication can continue when fewer nodes (than under the reliable policy) have received the payload for the proposal. If the learner requirement is not met, 50% of voters (acceptors) must receive the content. It's faster because a slow or intermittent connection somewhere on the WAN won't delay agreement or slow down replication. It is less reliable because it increases the possibility that the ordering of a proposal can be agreed upon, only for the corresponding content to not get delivered. The result would be a break in replication.

Setting the corresponding "content.learner.count" value
  • For the 'faster' policy the node in question is always included in the count. In the event that this is not satisfied, a further check is made against acceptors. The check passes if half or more (rounded up) of the available voters took delivery of the content.

Setting the corresponding "content.min.learners.required" value

For the "faster" policy, set this to "false".

** Alert! ** All the acceptors (voters) must also be learners (carry replica data that can be changed). If not all of the acceptors are learners, the system switches to the 'reliable' policy and writes a log message. A node always includes itself in the count, in contrast with the "reliable" policy where a node never includes itself in the count.
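
The three policies described above can be compared side by side with a short Python sketch. It is an illustration under stated assumptions, not the replicator's implementation: the function and parameter names are hypothetical, and the caller is assumed to count learners the way each policy requires (for example, excluding the originating node for "reliable").

```python
import math


def content_check_passes(policy, learners_with_content, learner_count,
                         voters_with_content, available_voters):
    """Return True if replication may continue under the given policy.

    learners_with_content: learners that hold the content, counted as the
        policy requires.
    learner_count: the configured learner count value.
    voters_with_content / available_voters: voters (acceptors) that hold the
        content and voters currently available, including the node itself
        if it is a voter.
    """
    enough_learners = learners_with_content >= learner_count
    # "Half or more (rounded up)" of the available voters.
    enough_voters = (available_voters > 0 and
                     voters_with_content >= math.ceil(available_voters / 2))

    if policy == "reliable":
        return enough_learners
    if policy == "acceptors":
        return enough_voters
    if policy == "faster":
        return enough_learners or enough_voters
    raise ValueError(f"unknown policy: {policy}")


# Content has reached 2 of 3 voters but none of the required learners.
print(content_check_passes("reliable", 0, 1, 2, 3))   # False
print(content_check_passes("acceptors", 0, 1, 2, 3))  # True
print(content_check_passes("faster", 0, 1, 2, 3))     # True
```

The "faster" branch makes the trade-off visible: it passes as soon as either condition holds, which is why it can proceed before the configured learner count is met.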

Steps for a policy change

Use this procedure to change between the above Content Distribution policies.

  1. Make a backup and then edit the /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties file (read more about the properties files in section 3).

  2. Change the value of content.min.learners.required: make it "true" for reliability or "false" for speed (the default is true).

  3. Save the file and perform a restart of the node.
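
Steps 1 and 2 could be scripted along the following lines. This is a minimal sketch, assuming a plain key=value properties file; the function name and the ".bak" backup suffix are arbitrary choices made for the illustration, and the restart in step 3 still has to be performed separately.

```python
import shutil
from pathlib import Path


def set_property(properties_path, key, value):
    """Back up a properties file, then set key=value in it.

    Returns the path of the backup copy.
    """
    path = Path(properties_path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # step 1: back up before editing

    updated, found = [], False
    for line in path.read_text().splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        if not is_comment and stripped.partition("=")[0].strip() == key:
            updated.append(f"{key}={value}")  # step 2: change the value
            found = True
        else:
            updated.append(line)
    if not found:
        # The key was not present, so append it.
        updated.append(f"{key}={value}")
    path.write_text("\n".join(updated) + "\n")
    return backup
```

For example, `set_property("/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties", "content.min.learners.required", "false")` would switch the setting towards speed, leaving the original file alongside it with a ".bak" suffix.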

Set Policy on a per-state machine basis

When editing the property, add the state machine identity followed by '.content.push.policy'. For example:

<machine_identity_name>.content.push.policy=faster

The system assigns a policy by first looking up the state machine's own policy property and then the general 'content.push.policy' property. If neither is available, 'reliable' is chosen. The conditional switch between 'faster' and 'reliable' remains in effect regardless of the policy.
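
The lookup order can be sketched in Python as shown below. The function name and the "repoA" and "repoB" identities are hypothetical, and the sketch assumes the properties have already been read into a dictionary.

```python
def resolve_push_policy(properties, machine_identity):
    """Return the content push policy that applies to one state machine.

    properties: dict of key -> value read from the properties file.
    machine_identity: the state machine identity used as the key prefix.
    """
    specific_key = f"{machine_identity}.content.push.policy"
    if specific_key in properties:
        return properties[specific_key]
    # Fall back to the general setting, then to the 'reliable' default.
    return properties.get("content.push.policy", "reliable")


props = {
    "content.push.policy": "acceptors",
    "repoA.content.push.policy": "faster",
}
print(resolve_push_policy(props, "repoA"))  # faster
print(resolve_push_policy(props, "repoB"))  # acceptors
print(resolve_push_policy({}, "repoB"))     # reliable
```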

Example 1 - Faster policy on a 2-node replication group

Two-node Replication Group, NodeA (Active Voter) and NodeB (Active).

content.push.policy=faster
content.min.learners.required=true
content.learners.count=1

Example 2 - Acceptors policy on a 4-node replication group

Four nodes split between two sites. On Site 1 we have NodeA and NodeB, both are Active Voters. On site 2 we have NodeC (AV) and NodeD (A).

content.push.policy=acceptors
content.min.learners.required=true
content.learners.count=1

5. Guide to Node Types

Each replication group consists of a number of nodes and a selection of repositories that will be replicated.

There are several different types of node:

Active
An Active node has users who are actively committing to SVN repositories, which results in the generation of proposals that are replicated to the other nodes. However, it plays no part in getting agreement on the ordering of transactions. Active nodes support the use of the Consistency Checker tool.

Active Voter
An Active Voter is an Active node that also votes on the order in which transactions are played out. In a replication group with a single Active Voter, it alone decides on ordering. If there's an even number of Active Voters, a Tiebreaker will have to be specified. Active nodes support the use of the Consistency Checker tool.

Passive
A node on which repositories receive updates from other nodes, but which doesn't permit any changes to its replicas from SVN clients, effectively making its repositories read-only. Passive nodes are ideal for use in providing hot-backup. Passive nodes do not support the reliable use of the Consistency Checker tool.

Passive Voter
A passive node that also takes part in the vote for transaction ordering agreement.

Use for:
  • Dedicated Continuous Integration servers
  • Sharing code with partners or sites that won't be allowed to commit changes back
  • In addition, these nodes could help with HA as they add another voter to a site.
  • Passive nodes do not support the reliable use of the Consistency Checker tool.

Voter (only)
A Voter-only node doesn't store any repository data; its only purpose is to accept transactions and cast a vote on transaction ordering. Voter-only nodes add resilience to a replication group as they increase the likelihood that enough nodes are available to reach agreement on ordering.

The Voter-only node's lack of replication payload means that it can be disabled from a replication group without being removed. A disabled node can be re-enabled without the need to interrupt the replication group.

Tiebreaker
In the event of an even number of voters in the Replication Group, the Tiebreaker gets the casting vote. The Tiebreaker role can be applied to any type of voter: Active Voter, Passive Voter or Voter.

Helper
When adding a new site to an existing replication group you will select an existing site from which you will manually copy or rsync the applicable repository data. This existing site enters the 'helper' mode, in which the relevant repositories will be read-only until they have been synced with the new site. By relevant we mean that they are replicated in the replication group to which the new site is being added.

New
When a site is added to an existing replication group it enters an 'on-hold' state until repository data has been copied across. Until the process of adding the repository data is complete, New nodes will be read-only. Should you leave the Add a Node process before it has completed, you will need to manually remove the read-only state from the repository.

Acceptors, Proposers and Learners?

The table below shows which node roles are acceptors, proposers or learners.

Node Type            Acceptor  Proposer  Learner
Active (A)           N         Y         Y
Active Voter (AV)    Y         Y         Y
Passive (P)          N         N         Y
Passive Voter (PV)   Y         N         Y
Voter Only (V)       Y         N         N

Key

Learners are either Active or Passive nodes:
They learn proposals from other nodes and take action accordingly, updating repositories based on each proposal (replication).
Proposers are Active nodes:
To be able to commit to a node, the node must be able to make proposals.
Acceptors are Voters:
They accept proposals from other nodes and vote on whether or not to process them (ordering/achieving quorum).
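
For quick reference, the table above can also be expressed as a small Python lookup. The structure and names (Roles, NODE_ROLES, nodes_with) are hypothetical conveniences for this sketch; the Y/N values are taken directly from the table.

```python
from typing import NamedTuple


class Roles(NamedTuple):
    acceptor: bool
    proposer: bool
    learner: bool


# One entry per row of the table: (acceptor, proposer, learner).
NODE_ROLES = {
    "Active": Roles(False, True, True),
    "Active Voter": Roles(True, True, True),
    "Passive": Roles(False, False, True),
    "Passive Voter": Roles(True, False, True),
    "Voter Only": Roles(True, False, False),
}


def nodes_with(role):
    """Return the node types that hold the given role name."""
    return [name for name, roles in NODE_ROLES.items() if getattr(roles, role)]


print(nodes_with("acceptor"))  # ['Active Voter', 'Passive Voter', 'Voter Only']
print(nodes_with("proposer"))  # ['Active', 'Active Voter']
```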