WANdisco SVN MultiSite Plus® User Guide

1. Introduction

Welcome to the User Guide for WANdisco’s SVN MultiSite Plus 1.9.

To view User Guides for previous versions of SVN MultiSite Plus visit the Archive.

SVN MultiSite Plus (MSP) is the core of WANdisco’s enterprise SVN product line:

  • SVN replication, mirroring and clustering for enterprise performance and 24-by-7 availability.

  • With MSP, a central SVN server is no longer a single point of failure or a performance bottleneck, and the effects of WAN latency are greatly reduced.

  • By combining WANdisco’s patented replication technology and intelligent load balancing software, SVN can be deployed in an active-active WAN cluster that delivers optimum performance, scalability and availability, with built-in continuous hot backup.

Read more about MSP on the WANdisco website.

1.1. Documentation for 3rd party components

You can integrate MSP with open source software components that require user-level documentation. In these cases, we provide links to the open source project’s documentation.

1.2. Get support

See our online Knowledgebase which contains more in-depth information on specific topics.

We use terms like node and replication group, and define them in the Glossary. This contains some industry terms, as well as WANdisco product terms.

If you need more help raise a case on our support website.

If you find an error or if you think some information needs improving, raise a case or email docs@wandisco.com.

1.3. Symbols in the documentation

In this document we highlight types of information using the following boxes:

Alert
The alert symbol highlights important information.
Tip
Tips are principles or practices that you’ll benefit from knowing or using.
Stop
The STOP symbol cautions you against doing something.
Knowledgebase
The i symbol shows where you can find more information in our online Knowledgebase.

1.4. Release Notes

View the Release Notes. These provide the latest information about the current release, including lists of new functionality, fixes and known issues.

2. Installation Guide

This guide describes everything you need to deploy Subversion MultiSite Plus:

  • Installation requirements

  • A standard installation

  • Node configuration

2.1. Technical skill requirements

Before installing MSP, make sure that you have sufficient hardware and that all required software is configured appropriately.

2.1.1. System administration

  • Linux operating system installation

  • Disk management

  • Memory monitoring and management

  • Command line administration and manually editing configuration files

  • Service (init.d) configuration and management

2.1.2. Apache administration (if applicable)

  • Familiarity with Apache web server architecture

  • Management of httpd.conf / Apache2 configuration file settings

  • WebDAV protocol

  • User authentication options

  • Log setup and viewing

2.1.3. Networking

  • IP address assignment

  • TCP/IP ports, firewall setup, and server certificates (if SSL is to be used)

2.1.4. SVN and MultiSite Plus

If you’re not confident about meeting the requirements, you can request a supported installation by raising a case on our support website.

A single administrator can manage all the systems running MSP. However, it is a good idea to have someone at each site who is familiar with the MSP basics.

2.2. Deployment overview

We recommend that you follow a well-defined plan for your WANdisco MSP deployment.
This helps you keep control, understand the product, and find and fix any issues before production. We recommend that you include the following steps:

  1. Pre-deployment planning: Identify the requirements, people, and skills needed for deployment and operation. Agree on a schedule and milestones. Highlight any assumptions, constraints, dependencies, and risks to a successful deployment.

  2. Deployment preparation: Prepare and identify server specifications, locations, node configuration, port availability and assignments, repository set-up, replication architecture, and the server and software configurations.

  3. Testing phase: Actions for an initial installation and testing in a non-production environment, executing test cases, and verifying deployment readiness.

  4. Production deployment: Actions to install, configure, test, and deploy the production environment.

  5. Post-deployment operations and maintenance: Actions including environment monitoring, system maintenance, training, and in-life technical support.

2.3. System requirements

This section gives guidelines for preparing existing servers for replication. These are not a fixed set of requirements. Run your own performance tests during an evaluation period.

2.3.1. Hardware sizing guidelines

Size        #Users  Repository size (GB)  CPU speed (GHz)  #CPU  #Cores/CPU  RAM (GB)  HDD (GB)
Small       100     25                    2                1     2-4         8-16      100
Medium      500     100                   2                2     4           16-32     250
Large       1000    500                   2.66             4     4           32-64     750
Very Large  5000    1000                  2.66             4     4-6         128       1500

GB or GiB
This note describes how WANdisco measures memory and data. MSP uses the binary prefixes provided by the International Electrotechnical Commission. We therefore use Mebibyte (1,048,576 bytes) instead of Megabytes (1,000,000 bytes) within our products. However, we still refer to Megabytes and Gigabytes where these are more commonly understood, e.g. in the above table.

For more information about the binary prefixes see http://en.wikipedia.org/wiki/Mebibyte.

Memory requirements of DConE2 replication
Each state machine, or replicated object (repository, replication group, etc.), needs about 1MB of system memory to run. So for small to moderate deployments the memory requirement of the replication system itself is quite modest. For very large deployments, where you are replicating hundreds of repositories or more, you may need to consider the specific memory requirements of the DConE2 replication engine.
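
As a rough illustration of the figure above, the following Python sketch multiplies the number of replicated objects by the guideline of about 1MB each. The function name and the example counts are hypothetical, and the result is an estimate for the replication engine only, not a sizing guarantee.

```python
# Guideline from this section: each replicated object needs about 1MB.
MB_PER_REPLICATED_OBJECT = 1


def estimate_dcone_memory_mb(repositories, replication_groups=0):
    """Return an approximate memory figure (in MB) for the replication engine.

    Counts every repository and replication group as one replicated object.
    Other object types are not covered by this sketch.
    """
    return (repositories + replication_groups) * MB_PER_REPLICATED_OBJECT


if __name__ == "__main__":
    # Hypothetical deployment: 500 repositories in 10 replication groups.
    print(estimate_dcone_memory_mb(500, 10), "MB (approximate)")
```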

2.3.2. Storage

  • For SVN and MSP: Use separate physical disks for SVN and MSP. This ensures that heavy disk usage by either should not impact the other. If you are running with SAN storage we recommend using a fiber connection between the server and the SAN with a minimum dedicated bandwidth of 1GiB.

  • SVN repository storage requirement: Plan your requirements. Clarify what version control changes may be on the horizon so that you can account for any sudden leaps in your repository storage requirement. Consider that it’s usually a lot less costly to over-specify disk capacity than have to deal with running out of storage.

  • MSP storage requirement: Although the storage requirement for the installed files is fairly modest (800 MiB), MSP stores data that has not yet been replicated to all other replication group members. If a node is offline for an extended period, this can result in a buildup of temporary data.

    How much storage does SVN MultiSite Plus need?
    We provide a guideline for calculating storage for WANdisco’s replication products: Hardware Sizing Guide
  • Use the fastest possible disks for storage. Disk I/O is the critical path for improving repository responsiveness.

  • We recommend using RAID-1 or RAID-10 solutions. Do not use RAID-0: the performance benefits are not worth the drop in resilience (and increased risk of data loss). Where performance is a higher priority, RAID-10 can be used instead of RAID-1. This mirrors two or more striped segments, providing the high I/O performance of RAID-0 without the increased risk of failure.

  • Spinning vs Solid State: Solid state drives (SSDs) offer significant benefits for deployments that make big demands of disk I/O. SSDs are recommended if you have a large deployment or require extra capacity for future growth. However, if your concurrent SVN usage is not very high you may get acceptable performance from trusty old HDD technology.

Disk space

SVN: Match to projects and repositories.
MSP Transaction Journal: Equivalent of seven days of changes.

Estimating your disk requirements can be very difficult and there’s no perfect system for making an accurate estimation. Some organizations monitor their repository growth over a period of time and use an extrapolation as a guide. This method works best if your organization is unlikely to see the addition of large new projects that instantly introduce large amounts of extra repository data.
You need to quantify some elements of your deployment:

  • Overall size of all of your SVN repositories.

  • Frequency of commits in your environment.

  • Types of files being modified - text, binaries (SVN clients only send deltas for text).

  • Number and size of files being changed.

  • Rate that new files are being added to the repository.

Talk to those who know
There is absolutely nothing like having a solid communications path between those managing the SVN system resources and those who manage the development project. Actually talking to the people who are planning upcoming SCM efforts is better than trusting an abstract system for measuring requirements.
Disk space for recovery journal

Provision enough disk space for /opt/wandisco/multisite-plus/replicator/database to cover the expected number of commits for four hours of peak usage, or a busy 3-day weekend (whichever is larger).
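
The "whichever is larger" rule can be sketched in Python as follows. The guide expresses the requirement in commits, so the average size per commit used here to convert to bytes is an assumption you would need to measure yourself; the function name and the sample figures are hypothetical.

```python
PEAK_WINDOW_HOURS = 4          # four hours of peak usage
WEEKEND_WINDOW_HOURS = 3 * 24  # a busy 3-day weekend


def journal_space_bytes(peak_commits_per_hour,
                        weekend_commits_per_hour,
                        avg_commit_bytes):
    """Estimate recovery journal space as the larger of the two scenarios.

    avg_commit_bytes is an assumed average; it is not a figure from the guide.
    """
    peak_commits = peak_commits_per_hour * PEAK_WINDOW_HOURS
    weekend_commits = weekend_commits_per_hour * WEEKEND_WINDOW_HOURS
    return max(peak_commits, weekend_commits) * avg_commit_bytes


if __name__ == "__main__":
    # Hypothetical rates: 1000 commits/hour at peak, 10 commits/hour at weekends.
    needed = journal_space_bytes(1000, 10, 1_000_000)
    print(f"Provision at least {needed / 1_000_000_000:.1f} GB")
```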

2.3.3. Running in virtualization

Deploying on a virtual server platform provides lots of practical benefits. Costs, admin time, and flexibility can all see big improvements when running services from a small number of specialist servers. However, virtualization does not suit every application. Dedicated servers give you confidence in the available resources. Although well-designed virtual platforms can build in load balancing and failover, these are often bolt-ons that work against the whole drive to consolidate physical equipment. They may not offer separation of services or mitigate the risks of a single point of failure. Be particularly wary of over-subscription of VM servers and services, or insufficient monitoring of the same.

2.3.4. Processor tips

  • MSP can run on a single 2GHz CPU, but for production you should run fast multi-core CPUs and scale the number of physical processors based on your peak concurrent usage.

  • You should aim to have no more than 15 concurrent SVN users per single-core CPU or 7 concurrent users per core with multi-core CPUs:

    • Example 1: A server with 4 physical single core processors is expected to support (15x1x4) = 60 concurrent users.

    • Example 2: A server with 4 physical processors, each being a quad core, is expected to support (7x4x4) = 112 concurrent users.
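
The two examples above follow the same arithmetic, sketched here in Python. The function name is hypothetical; the per-core figures (15 and 7) are the ones given in this section.

```python
USERS_PER_SINGLE_CORE_CPU = 15   # guideline for single-core CPUs
USERS_PER_CORE_MULTI_CORE = 7    # guideline per core for multi-core CPUs


def max_concurrent_users(cpus, cores_per_cpu):
    """Return the guideline ceiling on concurrent SVN users for a server."""
    if cores_per_cpu == 1:
        per_core = USERS_PER_SINGLE_CORE_CPU
    else:
        per_core = USERS_PER_CORE_MULTI_CORE
    return per_core * cores_per_cpu * cpus


if __name__ == "__main__":
    print(max_concurrent_users(4, 1))  # Example 1: four single-core CPUs
    print(max_concurrent_users(4, 4))  # Example 2: four quad-core CPUs
```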

2.4. Setup requirements

This is a summary of requirements. You must also check the more detailed Installation checklist.

2.4.1. SVN MultiSite Plus servers

This section summarizes requirements:

  • The same operating system, including same architecture and patch versions

    Everything the same
    Keep the setup of nodes identical because subtle variations in software could result in non-deterministic behavior that might lead to a loss of sync.
  • Java and Python installed, with identical versions everywhere

  • A browser with network access to all servers

  • A command line compression utility

  • A unique license key file: This is provided by WANdisco. You need one for each node and you may need to provide the server IP addresses.

If originally installed as "root"
If you are following on from a previous installation/upgrade that was done using root, all subsequent upgrades also need to be run using root.

2.4.2. SVN installations

SVN installation requires:

  • We recommend that you install SVN during the installation of MSP. See the Release Notes for which version of SVN you need.

  • Matching file and directory-level permissions on repositories.

Tips for installation:

  • Make sure you don’t overwrite the WANdisco SVN binaries with system versions. The WANdisco versions are required for replication to work correctly.

  • You must run SVN and MSP on the same server.

  • A repository can belong to only one replication group at a time.

  • Repositories should start out as identical at all sites. A tool such as rsync can be used to guarantee this requirement. The exception is the hooks directory which can differ as variances in site policy may require different hooks, see hooks.

2.5. Installation checklist

Though you may have referred to the checklist while evaluating MSP, we strongly recommend that you re-read the checklist and confirm that your system meets all requirements.

2.5.1. System setup

Operating systems

See the Release Notes for which operating systems are supported for your MSP version.

SVN server
Required version:

MSP 1.9 has been written to support changes introduced in SVN 1.9. For this reason MSP 1.9 can only run with SVN 1.9.

Installing the version of SVN that is bundled with MSP is the only option, as this takes care of the requirement for running with WANdisco’s customized FSFSWD libraries. It also offers the benefit of being a version of SVN that has been extensively tested with MSP.

Optional Component Packages

MSP installation checks for the presence of a number of optional SVN components. If found, these components are upgraded from a collection of packages that are bundled with MSP. Components that are not already installed are not touched by the installer; if you need any of them, you must install them manually.
All SVN packages, including the optional packages, are located here (assuming you used the default installation location):

        /opt/wandisco/svn-multisite-plus/resources/svn
        -rwxr-xr-x 1 root root    80412 Dec  8 16:49 mod_dav_svn-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root    44632 Dec  8 16:49 serf-1.3.7-1.x86_64.rpm
        -rwxr-xr-x 1 root root  2565568 Dec  8 16:49 subversion-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root 13473588 Dec  8 16:49 subversion-debuginfo-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root  4328340 Dec  8 16:49 subversion-devel-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root    41120 Dec  8 16:49 subversion-fsfswd-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root   420860 Dec  8 16:49 subversion-javahl-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root  1027700 Dec  8 16:49 subversion-perl-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root   726396 Dec  8 16:49 subversion-python-1.9.2-8.x86_64.rpm
        -rwxr-xr-x 1 root root    50460 Dec  8 16:49 subversion-tools-1.9.2-8.x86_64.rpm
Missing third party libraries

Note that when manually installing MSP using the tarball files (available only on request), you may need to account for missing libraries:

One or more required third party libraries are missing:
  libapr-1.so.0
  libaprutil-1.so.0

If you encounter this problem you may need to set the locations of these files as environmental variables, e.g.:

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:BASE/usr/$LIBDIR

To find where these libraries are installed you can use, for example, yum, rpm or zypper. If you are unable to use these tools, please contact support.

Data Migration
  • If you are installing MSP 1.9 fresh with new repositories

    • No need to do anything

  • If you are upgrading to MSP 1.9 from MSP 1.6

    • You can use the established upgrade path. This is assuming that you did not upgrade to MSP 1.6 from a previous version.

  • If you are upgrading to MSP 1.9 from MSP 1.5 or if your repositories have ever been touched by MSP 1.5 or earlier

    • Your repositories must undergo migration using the SVN 1.9 Data Migration Procedure.

Existing Repositories: SVN 1.6 or later format required
MSP 1.9 requires that all replicated repositories are at least SVN FSFS format 4, as created by SVN 1.6.
Additionally, every customer who used MSP prior to version 1.6.0 should migrate their repositories in preparation for installing MSP 1.9.x with its support for SVN 1.9. Even if you do not plan to move to MSP 1.9.0 immediately, it is in your interest to do the data migration now, because the migration task tends to get bigger and potentially more disruptive to Subversion operation over time.
Please contact WANdisco support for more information.
Repository Creation:

Ensure that all the repositories you intend to replicate use the FSFS database (MSP cannot replicate repositories using the old Berkeley DB).

If you have any repositories using the old Berkeley DB then you will need to use svnadmin dump and then svnadmin load to put them into FSFS DB repositories prior to adding them to replication. See the Knowledgebase article How to Move a Subversion Repository for more details.

Write access for the operating account

The operating account to be used for the MSP application must be the same account that owns all directories and files of the repositories. If using Apache, see the Knowledgebase article on the best system accounts to use.

Manage repository file ownership if using SVN+SSH:// or file://

Accessing SVN repositories via Apache2+WEBDAV is simplified by the fact that all user access is handled via the same daemon user. SVN+SSH or file:// access is less straightforward.

When using SVN over SSH both processes should be run using the same system account as MSP. This account’s .ssh/authorized_keys entry must provide the necessary access and specify the appropriate account. However, when unifying control in this way you must lock down wider system access or SVN access will equate to full root access. Read more about controlling the invoked command.
file:// access should only be used with the same account as MSP.

Tips:
  • Certified SVN binaries are now available from WANdisco. They provide the latest builds without the risks associated with Open Source distribution.

  • Same location - All replicas must be in the same location (same absolute path) and in exactly the same state before replication can start.

  • Same UUID - If you start with new repositories, don’t create them individually at each site. Even though they may share the same repository data, each would have its own universally unique identifier (UUID), and unless the repositories have the same UUID they’re not replicas. For more information read Setting up Repositories for Replication.
    Conversely, two different repositories must not share the same UUID. See UUID Warning.

MSP 1.9 adds support for svnadmin pack

The move to SVN 1.9 brings a number of enhancements to Subversion’s file system. These include changes that allow svnadmin pack to run on repositories that are replicated by MSP. Packing can run "on-the-fly", without taking the repository in question "offline" and without undue impact on repository traffic.

Important: on-the-fly packing will only work for SVN repositories that are in Format 7 (the native format for Subversion 1.9). Repositories in earlier formats can undergo a format upgrade in order to gain the capability.

Read more about SVN 1.9’s FSFS changes: SVN 1.9 Release Notes
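
The repository requirements in this checklist (no Berkeley DB, at least format 4, and format 7 for on-the-fly packing) can be checked before adding a repository to replication. The Python sketch below reads the db/fs-type and db/format files found in a standard Subversion repository layout. The function names and the example path are hypothetical, and this is an illustration rather than a WANdisco-supplied tool.

```python
from pathlib import Path

MIN_REPLICATION_FORMAT = 4   # format created by SVN 1.6, per this guide
ON_THE_FLY_PACK_FORMAT = 7   # native format for SVN 1.9, per this guide


def read_fs_info(repo_path):
    """Return (fs_type, fs_format) read from a repository's db directory."""
    db = Path(repo_path) / "db"
    fs_type = (db / "fs-type").read_text().strip()
    # The first token of db/format is the filesystem format number.
    fs_format = int((db / "format").read_text().split()[0])
    return fs_type, fs_format


def check_repo(repo_path):
    """Summarize whether a repository meets the requirements listed above."""
    fs_type, fs_format = read_fs_info(repo_path)
    return {
        "berkeley_db": fs_type == "bdb",
        "meets_minimum_format": fs_format >= MIN_REPLICATION_FORMAT,
        "on_the_fly_pack": fs_format >= ON_THE_FLY_PACK_FORMAT,
    }


if __name__ == "__main__":
    # Hypothetical repository location; substitute your own path.
    print(check_repo("/opt/svn/repos/example"))
```

A repository reported as Berkeley DB would need the svnadmin dump and svnadmin load procedure described above before it can be replicated.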

Linux Standard Base (LSB)

LSB provides developers with a degree of confidence about their applications being able to run on a range of distributions. The package is widely included by default, but not always.

Run the following command to verify the version of LSB your server is running:

[root@redhat6 wandisco]# lsb_release -a
LSB Version:   :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:
graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description:   Red Hat Enterprise Linux Server release 6.4 (Santiago)
Release:       6.4
Codename:      Santiago

MSP’s init.d scripts depend on the LSB package. If the package isn’t present when you run the installer script, it should be downloaded before the installation continues.

SVN client

Any that are compatible with local SVN servers.

Hooks

Normally we recommend that all hook scripts be duplicated exactly on all repository replicas. However, in some circumstances this is not possible. See hooks for more information.

File descriptor/User process limits

Ensure hard and soft limits are set to 64000 or higher. Check with the ulimit or limit command.
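
As an alternative to reading ulimit output by eye, the limits of the current process can be inspected with Python's standard resource module (Unix only). This is a minimal sketch: the function names are hypothetical, and it must be run as the account that MSP will use for the result to be meaningful.

```python
import resource

MIN_LIMIT = 64000  # minimum recommended in this checklist


def limit_ok(value, minimum=MIN_LIMIT):
    """Return True if a limit is unlimited or at least the minimum."""
    return value == resource.RLIM_INFINITY or value >= minimum


def check_limits(minimum=MIN_LIMIT):
    """Check soft and hard limits for open files and user processes."""
    results = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        results[name] = limit_ok(soft, minimum) and limit_ok(hard, minimum)
    return results


if __name__ == "__main__":
    for name, ok in check_limits().items():
        print(f"{name}: {'OK' if ok else 'too low'}")
```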

Running lots of repositories
Since the replicator must not be run as the root account, the max user processes needs to be set to a high value otherwise your system will not be able to create the threads required to deploy all your repositories.
User process limits:

Maximum processes and open files are low by default on some systems. We recommend that process numbers, file sizes, and number of open files are set to unlimited.

Permanent changes:

Make the changes in both /etc/security/limits.conf and /etc/security/limits.d/90-nproc.conf. Add the following lines, changing "svnmsp" to the username the software will run as:

    svnmsp soft nproc 65000
    svnmsp hard nproc 65000
    svnmsp soft nofile 65000
    svnmsp hard nofile 65000
If you do not see these increased limits, you may need to edit more files.

If you are logging in as the MSP user, add the following to /etc/pam.d/login:

session  required  pam_limits.so

If you su to the MSP user, add the following to /etc/pam.d/su:

session  required  pam_limits.so

If you run commands through sudo you need to make the same edit to /etc/pam.d/sudo.

Systemd default limit of concurrent processes

Some distributions of Linux, including RHEL7, Ubuntu 16, etc, now install with tighter defaults concerning the maximum number of concurrent processes handled by systemd. For up to date information see the GitHub page for systemd news.

In the context of MSP - which can need very high thread counts - the value should be the same as that assigned for nproc above, for example:

  • In system.conf, set TasksMax=64000

  • In logind.conf, set UserTasksMax=64000

This is necessary only if the "pids" cgroup controller is enabled in the kernel.

Browser compatibility

Setup and configuration requires access through a browser. The browsers listed in the Release Notes are known to work.

File systems

Supported file systems include:

  • ext4

  • VXFS from Veritas

  • XFS on RHEL/CentOS 7

    • XFS version 2.8.10 (or newer) combined with Kernel version 2.6.33 (or newer) - this requirement is met by RHEL7.2 and above.

Write barriers should always be enabled.

Journaling file system

Replicator logs should be on a journaling file system, for example, ext3 on Linux or VXFS from Veritas.

ext4 can be used as your journaling file system, although it must be configured appropriately. See Using Ext4 filesystem for journaling.

Avoiding Data Loss
We have an article in our Knowledge Base that looks at a number of implementation strategies that will militate against potential data loss as a result of power outages - Data Loss and Linux.
Java

Install the JRE / JDK version shown in the Release Notes for your MSP version.

  1. Install JDK/JRE (from Oracle) and define the JAVA_HOME environment variable to point to the directory where the JDK/JRE is installed.

  2. Add $JAVA_HOME/bin to the path and ensure that no other java (JDK or JRE) is on the path.

              $ which java
              /usr/bin/java
              $ export JAVA_HOME="/usr"
  3. It is possible to run with the JRE package instead of the full JDK. You can check this by running java -server -version. If it generates a not found error, repeat Steps 1 and 2.
    If you find package management problems or conflicts with the JDK version you are downloading (for example, rpm download for Linux), you may want to use the self-extracting download file instead of the rpm (on Linux) package. The self-extracting download easily installs in any directory without any dependency checks.

Python

See the MSP Release Notes for which version is needed.

Kerberos SSO

We support the implementation of Kerberos for single sign-on. Kerberos requires stronger encryption algorithms than Java provides by default. Oracle ships Java with limited-strength encryption to avoid the complications that arise from countries that place import restrictions on encryption technology.

The stronger encryption algorithms are available as an optional download where the user takes responsibility for compliance with the local laws.
For Java 7: JCE Unlimited Strength Jurisdiction Policy
For Java 8: JCE Unlimited Strength Jurisdiction Policy
When downloaded, extract the contents to (and overwrite the existing contents of) the Java security library directory on all nodes, e.g.:

$JAVA_HOME/lib/security/

2.5.2. Network settings

Reserved ports

Several ports are reserved by MSP. Do a port survey of all of your servers to determine which ports are already in use, then select ports for MSP that will not conflict. You can change these ports after installation, but it is not straightforward, so it is best to choose them during installation. To change the ports after installation follow the instructions in Update a node’s properties. The default values suggested during the installation are the following:

Required ports:
DConE.port= An integer between 1 - 65535, Default: 6444
  • The DConE port handles agreement traffic between nodes.

content.server.port= An integer between 1 - 65535, Default: 4321
  • The content server port is used for the replicator’s payload data: repository changes etc.

delegate.port= An integer between 1 - 65535, Default: 7777
  • The delegate port is used by SVN to delegate write operations to the WANdisco Replicator (via the above content.server.port).

jetty.http.port= An integer between 1 - 65535, Default: 8080
  • The jetty HTTP port is used for the MSP management interface.

jetty.https.port= An integer between 1 - 65535, Default: 8445
  • The jetty HTTPS port is used for the MSP management interface when SSL encryption is enabled.

Make each port different
In contrast with earlier versions of MSP, which used the same port for both the UI and replication traffic, MSP doesn’t multiplex different traffic on a single port. You will need to assign a different port to each type of traffic.
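
Before installation, a planned port assignment can be checked against the two rules above: every port must fall within 1 - 65535 and no two traffic types may share a port. The Python sketch below uses the default values listed in this section; the function name is hypothetical, and the check does not detect ports already in use by other software on the server.

```python
# Default values suggested by the installer, as listed in this section.
DEFAULT_PORTS = {
    "DConE.port": 6444,
    "content.server.port": 4321,
    "delegate.port": 7777,
    "jetty.http.port": 8080,
    "jetty.https.port": 8445,
}


def validate_ports(ports):
    """Return a list of problems found in a {property: port} mapping."""
    problems = []
    seen = {}
    for name, port in ports.items():
        if not 1 <= port <= 65535:
            problems.append(f"{name}={port} is outside 1-65535")
        if port in seen:
            problems.append(f"{name} and {seen[port]} both use port {port}")
        else:
            seen[port] = name
    return problems


if __name__ == "__main__":
    issues = validate_ports(DEFAULT_PORTS)
    print("No conflicts" if not issues else "\n".join(issues))
```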
Firewall or AV software

If you have a virus scanner running on the system housing your repositories and replicator you should:

  • Ensure that you make frequent backups of your repository data

  • If possible, configure your AV system to Notify Only. Otherwise you should prepare for the possibility that a virus infection or for that matter a false-positive could result in potentially catastrophic corruption of either repository or system data.

In general, virus scanners don’t filter ports: firewalls do that. However, some Anti-Virus products contain firewall-like filtering capabilities - if this is the case in your platform, you should make sure that you understand what impact it could have on your MSP deployment.

Full connectivity

MSP requires full network connectivity between all nodes. Ensure that each node’s server is able to communicate with all other servers that will host nodes in your MSP installation on all ports assigned (see above).
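
One way to confirm connectivity is to attempt a TCP connection from each node to every other node on each assigned port. The Python sketch below does this with the standard socket module. The host names are placeholders, the function names are hypothetical, and a port only answers once something is listening on it, so the check is most useful after the nodes are installed and running.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def connectivity_report(nodes, ports, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a connection."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in nodes
        for port in ports
    }


if __name__ == "__main__":
    # Placeholder host names; replace with the FQDNs of your other nodes.
    nodes = ["node2.example.com", "node3.example.com"]
    ports = [6444, 4321]  # default DConE and content server ports
    for (host, port), ok in connectivity_report(nodes, ports).items():
        print(f"{host}:{port} {'reachable' if ok else 'NOT reachable'}")
```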

VPN

Set up an IPsec tunnel, and ensure WAN connectivity.

VPN persistent connections

Ensure that your VPN doesn’t reset persistent connections for MSP.

Bandwidth

Put your WAN through realistic load testing before going into production. You can then identify and fix potential problems before they impact productivity.

DNS setup

We strongly recommend using Fully Qualified Domain Names (FQDNs) rather than IP addresses. IP addresses cannot be used at all if SSL is going to be implemented. Changing from IP addresses to FQDNs after the product is installed is complex, so use FQDNs wherever the product or installer requests IP addresses.

To prevent outages caused by a failure to resolve FQDNs to IP addresses, make sure your DNS service is robust in the face of failures (it should be replicated).
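
If you keep a list of planned node addresses, a quick check can flag any entry that is an IP address rather than a host name. The Python sketch below uses the standard ipaddress module; the function names and sample addresses are hypothetical, and it does not verify that a host name is fully qualified or that it resolves.

```python
import ipaddress


def is_ip_literal(address):
    """Return True if the string is an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def ip_literals(addresses):
    """Return the entries that are IP addresses and should become FQDNs."""
    return [a for a in addresses if is_ip_literal(a)]


if __name__ == "__main__":
    planned = ["svn1.example.com", "10.0.0.5"]  # sample values only
    for address in ip_literals(planned):
        print(f"{address}: use an FQDN instead")
```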

Monitoring

MSP provides a limited system for monitoring system disk space available. This monitor is intended only to provide a deployment with a last line of defense against running out of storage space. We recommend that you deploy a system-wide monitor that ensures that you quickly identify potential problems that could impact services.

Monitor Recommendation
Load balancing

The use of a correctly configured load balancer can greatly benefit performance in situations where there could be large numbers of concurrent SVN users.
The load-balancer should direct session requests to the same server based solely on the source IP address of a packet. Once the choice of server has been made, the load-balancer should change to a different server only if the originally chosen server is no longer reachable.

Not on DConE or Content Delivery ports
Load balancers should never be used on the DConE or Content Delivery ports. They should only be used for client traffic to the repository service port(s) (Apache/SSHD).

Therefore, MSP requires that any load balancing solution has the following features:

  • Stateless session persistence - Any potential SVN load-balancer needs the ability to handle stateless session persistence within its load balancing algorithm. This is because each Subversion commit needs to go to the same backend node in its entirety or the commit will fail. We achieve this by ensuring the client is bound to a particular back-end node in some way.

    • Client’s IP Address - Not always an option, but this IP-based persistence is easy to manage when the network is stable with static IPs.

    • Cookie-based persistence - SVN command line clients can’t read cookies so for a load balancer to use cookies for the binding they would need to be able to use sticky cookies that are not reliant on the client honoring them.

  • Node health-checking - Another vital requirement is the support for a health check mechanism - whereby the load-balancer makes periodic checks on the connected nodes to make sure that it isn’t passing traffic to an off-line or overloaded server. Any prospective load-balancer should support HTTP status code (application-layer) checks.

  • The load-balancer sends HTTP GET or HEAD requests to back-end nodes. Watching for 'unhealthy' response codes offers greater reliability and flexibility than doing your checks at the network layer.
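
The combination of source-IP persistence and health checking described above can be illustrated with a short Python sketch. It is a simplified model of the selection logic and not the algorithm of any particular load balancer product: the function name, the use of SHA-256 for hashing, and the health callback are all assumptions made for illustration.

```python
import hashlib


def pick_backend(client_ip, backends, healthy):
    """Choose a back-end node for a client based only on its source IP.

    The same IP always maps to the same preferred node. Another node is
    chosen only when the preferred node fails the health check. Returns
    None if no node is healthy.
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(backends)
    for offset in range(len(backends)):
        candidate = backends[(start + offset) % len(backends)]
        if healthy(candidate):
            return candidate
    return None


if __name__ == "__main__":
    nodes = ["node1.example.com", "node2.example.com"]  # placeholder names
    down = set()  # in practice, populated from HTTP status code checks
    print(pick_backend("192.0.2.10", nodes, lambda n: n not in down))
```

In a real deployment the health callback would reflect the result of the periodic HTTP GET or HEAD checks, so that a client is moved only when its node returns an unhealthy response.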

Time synchronization with NTP

You should deploy a robust implementation of NTP, including monitoring, because NTP will not auto-correct if the clock is too far offset from the current time. This is an important requirement: a number of problems can occur when node clocks are out of sync.

2.5.3. SVN MultiSite Plus setup

System User Account

Take careful note of this requirement as many installation problems are caused by running applications with unsuitable or incompatible system accounts.
In most cases you must install MSP with Apache’s username, e.g. apache.

Read a detailed explanation of why this is required: System accounts for running MultiSite.

Replication Configuration

Read our Replication Strategy Guide for information on how to set up and optimize your replication - Replication Strategy.

Voters follow the sun

To ensure best performance, make sure that MSP can deliver the content of a commit to another local node. MSP normally requires that content reach at least one other node for data integrity purposes. As the content normally represents the bulk of the data in a commit, having a second local node available will improve performance. Furthermore, you may wish to use our scheduling system to modify the voter roles so a proposal may be accepted by local voter nodes during regular working hours. If you need more help with setting up the most efficient deployment please get in touch with our support team.

License model

MSP is supplied through a licensing model based on the number of nodes and users. WANdisco generates a license file matched to your agreed usage model.

Evaluation license

To simplify the process of pre-deployment testing, MSP is supplied with an evaluation license. This type of license imposes no restrictions on use but is time-limited to an agreed period.

Production license

Customers entering production need a production license file for each node. These license files are tied to the node’s IP address. In the event that a node needs to be moved to a new server with a different IP address customers should contact WANdisco’s support team and request that a new license be generated. Production licenses can be set to expire or they can be perpetual.

Special node types

MSP offers additional node types to provide limited sets of functionality:

  • Passive Nodes (Learner only): A passive node operates like a slave in a master-slave model of distribution. Changes to its repository replicas only occur through inbound proposals; it never generates any proposals itself.

  • Voter-only nodes (Acceptor only): A voter-only node does not contain repositories. It casts votes only on the basis of replication history, without knowing the actual contents of the proposal data.

These limited-function nodes are licensed differently from active nodes. The IP addresses are a fixed list but the node count and special node count may move between sets of nodes, as long as the number of each type of node is within the limit specified in the license. Speak to WANdisco’s sales team for more details.

2.5.4. Migrate from SVN MultiSite 4.x

MSP uses a new version of WANdisco’s DConE replication engine and has a different architecture compared with earlier versions of MultiSite. As a result there are some special considerations when migrating from SVN MultiSite 4.x.

Byte-for-byte replicas

Repository replicas must be byte-for-byte mirrors of each other. This stringent requirement did not apply to SVN MultiSite 4.x: the previous tests for whether replicas are identical are not sufficient for FSFSWD replication (see 4.2 vs Plus, below). As a result, you need to recreate your replica repositories using a nominated master repository:

  • Identify which of your current replicas is to be the master repository.

  • Then remove or back up all other replicas.

  • Rsync from master to remote servers using checksumming recursively (-r).

You may need to plan the exact process of copying repositories so that it is practical and achievable. Many production repositories take a long time to checksum. If you are in any doubt about handling the process, talk to your WANdisco account manager.
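The copy step might look like the following sketch, which prints the rsync command rather than running it so that you can review it first. The helper name, host name and paths are placeholders. The flags are standard rsync options: -r recurses into directories, --checksum compares file contents rather than size and modification time, and --dry-run lists what would change without copying.

```shell
#!/bin/sh
# Hypothetical helper: print the rsync command that would copy a master
# repository to a remote node, comparing files by checksum.
print_rsync_cmd() {
    src=$1      # e.g. /opt/Subversion/repo1/ (trailing slash copies contents)
    dest=$2     # e.g. node2.example.com:/opt/Subversion/repo1/
    mode=$3     # "dry-run" to preview, anything else to copy
    if [ "$mode" = "dry-run" ]; then
        printf 'rsync -r --checksum --dry-run %s %s\n' "$src" "$dest"
    else
        printf 'rsync -r --checksum %s %s\n' "$src" "$dest"
    fi
}
```

Depending on your environment you may also want rsync's -a option, which implies -r and additionally preserves permissions, ownership and timestamps.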

4.2 vs Plus
SVN MultiSite 4.2 replicates through a proxy that sits between SVN and its clients. The proxy replays users’ commit operations on the repository via Apache, and so constructs a new transaction at every node. In contrast, MSP applies the same FSFS db/transactions at each node. This transaction is constructed based on the contents of the rev files, so with FSFSWD the repositories need to be identical at the revision (and revprop) file level.

Authentication and Apache

MSP opens up more options because MultiSite no longer runs as a proxy, so options that were previously incompatible with MultiSite are now compatible.

Access Control Plus (ACP) is the newer version of the Access Control product. It provides team management and generates authorization files, and authentication files if needed, for use with Apache or svnserve. ACP uses MSP to deliver these files. Because ACP is no longer a proxy, consider the following:

  • If you were using Access Control then be aware that there is a way to migrate from 4.2 AC to ACP+MSP. Please contact our support team for help with this.

  • All traffic goes through either Apache or svnserve (via SSH).

  • ACP does not support Perl regular expressions for defining sub-repository path rules. Subversion sub-repository path wildcards can be used instead (see the ACP User Guide for more information on wildcards).

  • There is no concept of pre-replication authentication.

  • ACP will need to be integrated with LDAP (if AC was integrated with LDAP).

  • Hook scripts - these no longer need to all run on all nodes. See Hook Scripts.

Configure Apache

This section gives an example Apache configuration. In Apache’s config file, httpd.conf:

  1. Set the listen port. There’s more information about the Listen directive in the Binding chapter of the Apache documentation.

  2. Change the Apache KeepAlive settings to allow long-lived HTTP connections.

  3. Make sure that the SVN DAV settings in Apache’s configuration files are exactly the same at all nodes. The top-level location URI prefix should be the same.

    # Needed to do Subversion Apache server.
    LoadModule dav_svn_module modules/mod_dav_svn.so
    
    # Only needed if you decide to do "per-directory" access control.
    LoadModule authz_svn_module modules/mod_authz_svn.so
    
    Listen 80
    MaxKeepAliveRequests 0
    KeepAlive On
    KeepAliveTimeout 30000
    Timeout 7200
    
      <Location /svn>
          DAV svn
          SVNParentPath /opt/Subversion
          AuthType Basic
          AuthName "SVN Repo"
          AuthUserFile /opt/Subversion/svn.passwd
          #AuthzSVNAccessFile /home/user/svnauthfiles/authz.authz
          Require valid-user
      </Location>
  4. Make sure that the Apache usernames and passwords match at all nodes. If you are using Apache passwords the best practice is to use an LDAP authority for Authentication purposes.

For all DAV commands, a valid username must be passed to MSP inside the HTTP Authorization header.
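To check that the Apache settings really are the same at all nodes, you could compare a fingerprint of each node's configuration file. The sketch below is one way to do that. The function name is hypothetical, and ignoring comments, blank lines and indentation is an assumption about which differences are harmless.

```shell
#!/bin/sh
# Hypothetical helper: print a checksum of a config file, ignoring blank
# lines, comment lines and leading/trailing whitespace, so that cosmetic
# differences between nodes do not count.
config_fingerprint() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^#/d' \
        -e '/^$/d' "$1" | cksum
}

# Example (not run here): run on each node and compare the printed values.
#   config_fingerprint /etc/httpd/conf/httpd.conf
```

Two nodes whose files print the same value have the same effective directives in the same order. A different value means the files need a closer look.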

2.5.5. Upgrading from Apache 2.2 to 2.4

Please read this section for critical details about upgrading Apache configurations from Apache 2.2 to Apache 2.4.

A number of critical changes have been made between Apache 2.2 and 2.4. One change that could impact MSP is the consolidation of AcceptMutex, LockFile, RewriteLock, SSLMutex, SSLStaplingMutex, and WatchdogMutexPath directives with a single Mutex directive.

You must ensure that any uses of the AcceptMutex directive are changed to the Mutex directive.

See the Apache documentation - https://httpd.apache.org/docs/2.4/upgrading.html
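A quick way to find directives that need attention is to search your configuration files for the names listed above. The sketch below is a hypothetical helper. It prints matching lines with their line numbers and skips commented-out lines.

```shell
#!/bin/sh
# Hypothetical helper: list lines in an Apache config file that still use
# one of the directives Apache 2.4 replaced with Mutex. Directive names
# are matched case-insensitively; commented-out lines are not reported.
find_legacy_mutex() {
    grep -n -i -E '^[[:space:]]*(AcceptMutex|LockFile|RewriteLock|SSLMutex|SSLStaplingMutex|WatchdogMutexPath)([[:space:]]|$)' "$1"
}

# Example (not run here):
#   find_legacy_mutex /etc/httpd/conf/httpd.conf
```

The helper returns a non-zero status when nothing is found, so it can be used in a pre-upgrade check. Consult the Mutex directive documentation for the replacement syntax that suits your build.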

2.6. Installation

The installation guide describes setting up MSP for the first time. If you are upgrading from an earlier version of MSP you should also follow this procedure. MSP is a completely new class of product so it’s not possible to follow a shortcut upgrade procedure.

2.6.1. Installation overview

This is an overview of the process:

  1. Double-check the Installation checklist. Take time to make sure that you have everything set up and ready. This avoids problems during installation. In particular, check:

    • SVN authentication: SVN installed, and using authentication. If you require an SVN access control solution see our Access Control Plus product.

    • JDK: See the release notes for which version of JDK to use. It may be possible to run MSP with other versions of Java but support will be reduced. Please contact support if you wish to use a different version.

    • Java memory settings: The Java process on which MSP runs is assigned a minimum and maximum amount of system memory. By default it gets 128MB at startup and 4GB maximum.

    • System resources: Ensure that your system meets the hardware recommendations.

  2. Ensure that your repositories are copied into place on all nodes.

  3. Download and copy the MSP files into place.

  4. Run the setup (as root user), then complete the installation from a web browser.

2.6.2. Before you start

  • Read through the Installation checklist thoroughly.

  • Back up Apache Config: Because the installation could modify your Apache configuration, we recommend that, if you have an existing config, you back it up before the installation. Then do a reconciliation when the installation has completed to check any changes are not going to adversely affect your operation.

Previous SVN versions
If you are installing MSP for the first time we recommend removing all previous versions of SVN from your box prior to MSP installation. Previous SVN versions can interfere with installation. Even if you already have SVN 1.9 installed, MSP requires the WANdisco modified version supplied with the product.
If you have any queries regarding this, please contact support.

Setting the LOG_FILE environment variable

If you need to capture a complete record of installer messages, warnings and errors, set the LOG_FILE environment variable before running the installer. Run:

 export LOG_FILE="/opt/wandisco/log/installLog.txt"

This file’s permissions must allow being appended to by the installer. Ideally, the file should not already exist (or it should exist and be empty) and its directory should enable the account running the installer to create the file.
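Before running the installer you could verify the log path with a small pre-flight check. The sketch below is a hypothetical helper, not part of the installer. It succeeds when the file can be appended to, or when the file does not exist yet but its directory is writable.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the LOG_FILE path.
log_file_usable() {
    f=$1
    if [ -e "$f" ]; then
        # Existing path: must be a regular file that we can write to.
        [ -f "$f" ] && [ -w "$f" ]
    else
        # Missing file: its directory must exist and be writable.
        dir=$(dirname "$f")
        [ -d "$dir" ] && [ -w "$dir" ]
    fi
}

# Example (not run here):
#   export LOG_FILE="/opt/wandisco/log/installLog.txt"
#   log_file_usable "$LOG_FILE" || echo "fix LOG_FILE before installing"
```

Run the check as the same account that will run the installer, because the result depends on that account's permissions.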

Install with ACP auditing functionality

If you are installing MSP where the account access auditing functionality for ACP is required then the following information will be required during installation:

  • Flume Receiver Hostname or IP address

  • Flume Receiver Port

For more information about installing Account Access Auditing, see the ACP installation instructions and How to do a manual set up for audit logging.

For information on how to upgrade the ACP Flume sender delivered with ACP1.9 and how to set up SSL, see How to upgrade the ACP sender delivered with ACP1.9 (and above) and how to set up SSL.

2.6.3. Start the installation

These steps describe how to do an interactive installation. If you would like to use a non-interactive installation see the next section.

Run MSP installer as root
The installation requires full system access so you must run the installer as root or a user with equivalent permissions.
  1. Extract the setup file.

  2. Save the svn-multisite-plus.sh installer file to your Installation site.

  3. Make the script executable, e.g. enter the command:

    chmod a+x svn-multisite-plus.sh
  4. Run the setup script.

    Running with Apache?
    If using Apache, see the Knowledgebase article on the best system accounts to use.
    Workaround if /tmp directory is "noexec"

    Running the installer script will write files to the system’s /tmp directory. If the system’s /tmp directory is mounted with the noexec option then you will need to use the following argument when running the installer:
    --target <someDirectoryWhichCanBeWrittenAndExecuted>
    E.g.

    ./svn-multisite-plus.sh --target /opt/wandisco/installation/

    [root@redhat6 wandisco]# chmod a+x svn-multisite-plus.sh
    [root@redhat6 wandisco]# ./svn-multisite-plus.sh
    Verifying archive integrity... All good.
    Uncompressing WANdisco SVN MultiSite Plus....................
        ::   ::  ::     #     #   ##    ####  ######   #   #####   #####   #####
       :::: :::: :::    #     #  #  #  ##  ## #     #  #  #     # #     # #     #
      ::::::::::: :::   #  #  # #    # #    # #     #  #  #       #       #     #
     ::::::::::::: :::  # # # # #    # #    # #     #  #   #####  #       #     #
      ::::::::::: :::   # # # # #    # #    # #     #  #        # #       #     #
       :::: :::: :::    ##   ##  #  ## #    # #     #  #  #     # #     # #     #
        ::   ::  ::     #     #   ## # #    # ######   #   #####   #####   #####
  5. If the installer detects that you do not have a compatible version of SVN on your server, this needs to be installed. Select Y.

    Welcome to the WANdisco SVN MultiSite Plus installation
    
    Checking prerequisites:
    
    Checking for perl: OK
    Checking for svn: SVN MultiSite Plus requires a compatible version of SVN to be installed.
    
    Install SVN? [Y] > Y
    Installing SVN 1.9.5-2
  6. Select Y if you are using Apache or both Apache and svnserve, and N if only using svnserve.

    Install mod_dav_svn? (Y/n) Y
    
    Stopping httpd: [  OK  ]
    Starting httpd: [  OK  ]
    OK
  7. The next test looks at the Java heap settings. It lists the maximum and minimum allocations for both the replicator component of MSP as well as the admin console UI:

    INFO: Using the following Memory settings:
    
    INFO: UI:         -Xms128m -Xmx1024m
    INFO: Replicator: -Xms1024m -Xmx4096m
    
    Do you want to use these settings for the installation? (Y/n) Y

    Enter Y if these heap settings will suit the needs of your deployment. If you have any doubts, discuss the heap requirements with WANdisco’s support team before going into production.

  8. You’ll now be asked to enter a TCP port number for accessing the browser part of the installation process.

    Which port should the MultiSite UI listen on? [8080]:

    The default port is 8080. Check with your network administrator about which ports are available. You can change the port during the next part of the installation.

  9. The installer now checks to see which system user and system group should be used to run MSP.

    Run MSP with the same user that runs Apache
    When deploying MSP and using Apache as an access mechanism, ensure that they are both run by the same system user. Their operations are so entwined that attempting to run the services with separate users will introduce the risk of permission problems that would halt replication. If using Apache, see the Knowledgebase article on the best system accounts to use.
    We strongly advise against running SVN MultiSite Plus as the root user.
    
    Which user should SVN MultiSite Plus run as? <apache (or httpd)>
    Do you want to continue? (Y/n) Y
    
    Which group should SVN MultiSite Plus run as? <the same primary group as the account owning the repositories>
  10. The installer now asks you to set the umask value for MSP:

    What umask should SVN MultiSite Plus use? [022]:

    You can go with the default of 022, which results in permissions of 755. If the owner permission is set to less than 7, the replicator won’t have sufficient permission to start up. Group/Other permissions are not so critical.

    Testing your umask setting
    To check what umask value is being applied, create a repository via the Admin UI, then check the new repository’s permissions on the file system to ensure they match your umask value.
  11. Confirm auditing.

    Do you wish to install auditing components for use with Access Control Plus (Y/n)
  12. If you answered Y, the steps below follow. If not, the installation skips to step 20.
    Confirm the maximum memory size for Flume

    Please enter the maximum memory size for flume process in megabytes [256]:
  13. Enter Flume install information

    Please enter Flume installation location. We recommend the use of a separate file system with sufficient disk space for several days of auditing events. [/opt/wandisco/flume-svn-multisite-plus]:
  14. Confirm if you want to monitor the log

    Do you want to monitor a SVN Multisite Plus log? (Y/n)
  15. Confirm the log file location, hit return to accept the default

    Location of SVN MultiSite Plus log. [/opt/wandisco/svn-multisite-plus/replicator/logs/fsfswd.log]:
  16. Enter Flume details.
    Note - In most deployments we recommend you use a Fully Qualified Domain Name (FQDN) not an IP address and if SSL will be enabled this is a necessity.

    Please enter Flume Receiver connection details.
    Flume Receiver Hostname or IP address [localhost]: <FQDN>
    A port must be set
    Flume Receiver Port [8441]: <custom flume receiver port or just hit return to accept default 8441>
  17. Confirm if you are using SSL

    Is SSL enabled (Y/n) Y
  18. If you are using SSL then you will need to give the following information. Enter the passwords as clear text, not in encrypted form.

    Location of keystore: <Directory Path to your keystore file>
    Keystore password:
    Location of truststore: <Directory Path to your truststore file>
    Truststore password:
  19. A settings summary is shown. Confirm the configuration settings and enter Y to finish the install.

    Installing with the following settings:
    
    MultiSite user:    <your username>
    MultiSite group:   <your groupname>
    MultiSite umask: 0022
    MultiSite UI Port: 8080
    MultiSite UI Minimum memory: 128
    MultiSite UI Maximum memory: 1024
    MultiSite Replicator Minimum memory: 1024
    MultiSite Replicator Maximum memory: 4096
    SVN Multisite Plus will be installed to : /opt/wandisco/svn-multisite-plus
    
    Do you want to continue with the installation? (Y/n)

    The default install location is /opt/wandisco. You can install to a non-default location if needed but that complicates installation. To simplify things, if you cannot install into /opt/wandisco then create your wandisco installation directory, for example /var/data/wandisco, and place a symbolic link at /opt/wandisco pointing to your location.

    Error message if using SUSE

    If you are using SUSE or SLES and the following error message occurs, please ignore it.

    The following package is not supported by its vendor:

    This message originates from SUSE/SLES. WANdisco fully supports our products for our customers.

  20. Open a browser and go to the provided URL to finish the installation. If your server’s DNS isn’t running you can go to the next step at the following address:

    http://<IP_Address>:<admin port>/

    e.g. http://10.0.100.252:8080/

    • Flush your browser cache
      If you are reinstalling and using SSL, then you should clear your browser cache before you continue. Previous SSL details are stored in the cache and will cause SSL errors if they are not flushed.

  21. The web installer begins with the Welcome screen:

    Welcome
  22. The next screen contains the WANdisco Master Subscription Agreement.
    Please read the terms & conditions and then click I Agree to continue the installation.

  23. On the next (License Upload) screen you are prompted to browse for your product license key file. Click the Browse button and locate your file. The WANdisco sales team will have sent you this file. Contact them if you have any problems locating or using it.

    License Upload
  24. On the Administrator Setup screen you indicate whether you have a users.properties file from a previous installation or not. If this is the first node you are installing select No.

    • If this is the first node you need to enter the username plus an associated password which you will use to log in to the MSP UI. Admin account details are only entered when installing the first node.

      Set up Admin account
      Username

      The administrator’s username.

      Password

      The administrator’s password.

      Confirm Password

      Enter your password again to confirm that it’s been typed in correctly.

      Full Name

      Enter your full name.

      Email address

      Enter the email address that you wish to associate with your MSP admin account.

    • If this is the second or subsequent node you will instead be prompted for the users.properties file.
      You can get this file from your first node; its default location is /opt/wandisco/svn-multisite-plus/replicator/properties/users.properties. Use this file for all subsequent nodes you install.

      Can I just enter the same details?
      No. You could enter exactly the same details for each node, but the encrypted passwords would not match. You MUST copy the users.properties file. There is no shortcut. If you have already entered the details manually, you can match up the necessary details using the procedure for Matching a node’s admin settings.
      If you are providing a users.properties file, take extra care to select the correct file. You are not warned if the file is invalid. If you select the wrong file you will not be able to connect the node to the replication ecosystem.
  25. The last screen in the setup process shows Server Settings.

    Server Settings
    Node Name

    The default name for this node. It is used to identify the node within the application and will not be used as a host name.

    Temporary limitation
    Node names cannot contain spaces or ".".

    Node IP/Host

    The node’s IP or hostname. If the server is multi-homed, you can select the IP to which you want MSP to be associated (if you are using SSL this must be an FQDN, not an IP address).

    Dropdown selector

    The IP/Hostname entry field provides a dropdown list of available IP addresses. The dropdown cue is not visible if the browser window’s width is limited.

    For multiple instances of MSP on one node, you must use unique hostnames tied explicitly to unique fully qualified domain names.
    For example, each of the following FQDNs must be tied to a unique IP address:
    msp1.somewhere.company.com
    msp2.somewhere.company.com
    msp3.somewhere.company.com
    This assumes either multiple NICs (one per MSP instance), or a single NIC that responds to multiple IP addresses (using technology implemented to enable High Availability).
    Replication Port

    Select the port to use for WANdisco’s DConE agreement engine. Default=6444.

    Content Server Port

    Select the port to use to transfer replicated content (data for repository changes). Default=4321. This is different from the port used by WANdisco’s DConE2 agreement engine.

    Content Node Count

    This setting gives you the ability to choose the degree of resilience. The value represents the number of nodes within a membership that must receive the content before a proposal is submitted for agreement. If the value is greater than the total Active, Active Voter or Passive nodes in the current membership, then the value is adjusted to equal the total number of Active, Active Voter or Passive nodes in the current membership. The initial Active or Active Voter is not considered in the calculation.

    Minimum Content Nodes Required

    Ticking this checkbox will enforce the Content Node Count as a prerequisite for replication.

    REST API Port

    The port to be used for MSP’s REST-based API. (Default:8082)

    REST API & UI Using SSL

    Check box for enabling the use of SSL for all REST API and UI traffic. If this box is checked more options appear.

    SSL Set up
    REST API SSL Port

    The port to be used for MSP’s REST-based API when traffic is secured using SSL encryption. Default=8445.

    UI Port

    The port for HTTP access to the MSP administrative interface. Default=8080.

    UI SSL Port

    The port for HTTPS encrypted access to the MSP administrative interface. Default=8443.

    SSL Certificate Alias

    The name of your SSL Certificate file.

    SSL Key Store

    The name of the keystore file. The keystore contains the public keys of authorized users.

    SSL Key Store Password

    The password for your HTTPS service.

    SSL Trust Store

    The location of your truststore file. The truststore contains CA certificates to trust. If your server’s certificate is signed by a recognized Certification Authority (CA), the default truststore that ships with the JRE will already trust it because it already trusts trustworthy CAs. Therefore, you don’t need to build your own, or to add anything to the one from the JRE.

    SSL Trust Store Password

    The password for your truststore.

    A word about trust stores and key stores
    You might be familiar with the Public-key system that allows two parties to use encryption to keep their communication with each other private (incomprehensible to an intercepting third-party). The keystore is used to store the public and private keys that are used in this system. However, in isolation, the system remains susceptible to the hijacking of the public key file, where an end user may receive a fake public key and be unaware that it will enable communication with an impostor. Enter Certificate Authorities (CAs). These trusted third parties issue digital certificates that verify that a given public key matches with the expected owner. These digital certificates are kept in the trust store. An SSL implementation that uses both keystore and trust store files offers a more secure SSL solution.

    If you need help getting your SSL keys set up, read our guide to Setting up SSL.

  26. Click Finish when you have entered everything. The installer now completes the configuration. When completed, you see a Start Using MultiSite Plus button. Click the button to log in for the first time.

    Finish!
  27. Log in: enter the username and password set above. Then click Let’s Do This!.

    Log in
  28. Next, read the WANdisco Subscription Agreement. Click I Agree to continue.

    Temporary duplication of license agreement
    Currently the license agreement is presented twice, once during installation and then here when the first end user logs in. This will not appear in future.
  29. The first time you view the dashboard, it contains mostly blank areas. You can view the reference section to learn what all the buttons and options mean. You can now set up some of your settings, such as SSL. However, we recommend that you wait to perform advanced admin account management until you have completed induction.

2.6.4. Non-interactive installation

You can also install MSP with an unattended (scripted) install. Set the following environment variables:

MSP_USER

The system account that runs MSP.

MSP_GROUP

The system group that MSP runs in.

MSP_UMASK

Set your required Umask settings. We validate your entry so that it must be a 3-digit number that begins with a zero, e.g. 077.
Note: The first digit signifies the base of the number (octal) so 0777 is a 3-digit number. The product installs using 0022 or 022, but always shows 4-digits when installing.

MSP_UI_PORT

The TCP port that the browser UI initially uses. You can change this during the browser-based setup. Default is 8080.
Following installation, the configurator loads on this port.

Auditing environment variables
If you are installing or upgrading and will be using the ACP auditing functionality, please make sure to set the ACP auditing environment variables outlined below.

For a scripted start to the installation run:

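export TERM=xterm
export MSP_USER=apache         # example value: the system account that runs MSP
export MSP_GROUP=apache        # example value: the system group that MSP runs in
export MSP_UMASK=022           # umask to apply; default 022
export MSP_UI_PORT=8080        # port to host the UI; default 8080
export ENABLE_AUDITING=false   # true to install the ACP auditing components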
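Before starting a scripted install you may want to sanity-check the MSP_UMASK value. The sketch below is one reading of the validation rule described for MSP_UMASK (a leading zero followed by two or three octal digits). It also shows how a umask maps to the permissions of a newly created directory. Both function names are hypothetical and the installer's own validation may differ.

```shell
#!/bin/sh
# Hypothetical helpers for checking an MSP_UMASK value.

# Succeeds for a leading zero followed by two or three octal digits,
# e.g. 022, 077 or 0022.
valid_umask() {
    case "$1" in
        0[0-7][0-7]|0[0-7][0-7][0-7]) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the permissions a new directory gets under the given umask:
# the umask bits are cleared from the full mode 777.
umask_to_dir_mode() {
    printf '%03o\n' $(( 0777 & ~0$1 ))
}

# Example (not run here):
#   valid_umask "$MSP_UMASK" && umask_to_dir_mode "$MSP_UMASK"
```

With the default of 022 the second helper prints 755, which keeps the owner permission at 7.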

If you are installing MSP where the account access auditing functionality for ACP is required (ENABLE_AUDITING=true), make sure that you set the following variables:

  • ENABLE_AUDITING=true/false: True to install auditing

  • FLUME_INSTALL_DIR=/opt/wandisco/flume-svn-multisite-plus: Full path where Flume is to be installed, the default is shown.

    • Make sure that you do not set the Flume install variable to a directory that is a parent directory to any other product, or a parent directory where repositories are stored (or above).

  • ACP_AVRO_HOST=(ACP_Flume_Address): Flume sender IP

  • ACP_AVRO_PORT=(ACP_Flume_Port): Flume sender port

  • FLUME_MAX_MEMORY=256

  • FLUME_AVRO_SSL=true/false: true to enable SSL

If using svnserve you also need to set:

  • SVN_MONITOR_ACCESS=true/false: true if using svnserve

  • SVN_ACCESS_LOG=/path/to/svnservelog: Full Path to svnserve log

If using apache server you also need to set:

  • SVN_WEBDAV_LOG=/path/to/httpd/access.log: Path to HTTPD access.log

    • This is usually under /var/log/httpd/access.log or /var/log/apache2/access.log

  • SVN_MONITOR_WEBDAV=true/false: true to monitor the httpd access log

If FLUME_AVRO_SSL=true you also need to set:

  • FLUME_AVRO_KEYSTORE_LOC: Full Path to Flume Keystore

  • FLUME_AVRO_KEYSTORE_PASS: FlumeKeyStorePass

  • FLUME_AVRO_TRUSTSTORE_LOC: Full Path to TrustStoreFile

  • FLUME_AVRO_TRUSTSTORE_PASS: FlumeTrustStorePass

Note - The Keystore and Truststore passwords need to be given as clear text not as encrypted passwords. If you do not want to provide environment variables with clear text passwords then you can configure the Auditing components after installation.
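Because the required variables depend on several switches, a pre-flight check can catch omissions before the unattended install starts. The sketch below is a hypothetical helper. Treating ACP_AVRO_HOST and ACP_AVRO_PORT as mandatory, and FLUME_INSTALL_DIR and FLUME_MAX_MEMORY as optional, is an assumption based on the defaults listed above.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of every auditing variable
# that the current switches require but that is unset or empty.
missing_audit_vars() {
    # Nothing is required when auditing is not being installed.
    [ "${ENABLE_AUDITING:-false}" = "true" ] || return 0

    required="ACP_AVRO_HOST ACP_AVRO_PORT"
    if [ "${FLUME_AVRO_SSL:-false}" = "true" ]; then
        required="$required FLUME_AVRO_KEYSTORE_LOC FLUME_AVRO_KEYSTORE_PASS"
        required="$required FLUME_AVRO_TRUSTSTORE_LOC FLUME_AVRO_TRUSTSTORE_PASS"
    fi
    if [ "${SVN_MONITOR_ACCESS:-false}" = "true" ]; then
        required="$required SVN_ACCESS_LOG"
    fi
    if [ "${SVN_MONITOR_WEBDAV:-false}" = "true" ]; then
        required="$required SVN_WEBDAV_LOG"
    fi

    for name in $required; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

An empty result means every variable implied by the current switches has a value. The check does not verify that paths exist or that ports are reachable.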

For more information about installing Account Access Auditing, see the ACP installation instructions.

The installation then runs without user interaction. When installation is complete, the browser-based UI starts. You then need to complete the node set up from step 20.

Installing with tarball installer

If you wish to run the tarball installer, run the same script as above with the following extra variables:

export MSP_PREFIX=/opt/wandisco/svn-multisite-plus   # path for the tarball to install under (default shown)
export MSP_INIT=1

2.6.5. Manual setup for audit logging

Use this procedure to account for some configuration relating to the audit feature that is currently missing from the installer.

Sender configuration

Setting sources

This value sets the sources that flume will monitor: acpSender.sources =

  • Example: To monitor all three set: acpSender.sources = svnServeSource svnWebdavSource gitmsSource

  • Example: To monitor just Webdav: acpSender.sources = svnWebdavSource

Setting log locations

Settings that apply to SVNServe and Webdav:

acpSender.sources.svnServeSource.type = exec
acpSender.sources.svnServeSource.command = tail -F /var/log/svnserve.log
acpSender.sources.svnServeSource.restart = true
acpSender.sources.svnServeSource.channels = memChannel

acpSender.sources.svnWebdavSource.type = exec
acpSender.sources.svnWebdavSource.command = tail -F /var/log/httpd/access_log
acpSender.sources.svnWebdavSource.restart = true
acpSender.sources.svnWebdavSource.channels = memChannel

The system user that runs MSP MUST have permissions to read all the files that you configure to monitor.
This can be particularly tricky since, for example, /var/log/httpd is normally set to 0700. One way to work around this is to set the group of the /var/log/httpd directory to the group that MSP runs as, and then change its permissions to 0750. This enables proper monitoring with minimal security impact.
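The group-based workaround could be scripted along these lines. The helper name is hypothetical, and changing the group of /var/log/httpd normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: give a group read access to a log directory by
# changing the directory's group and setting its mode to 0750.
grant_group_read() {
    dir=$1      # e.g. /var/log/httpd
    group=$2    # the group that MSP runs as
    chgrp "$group" "$dir" && chmod 0750 "$dir"
}

# Example (not run here; "apache" is a placeholder group name):
#   grant_group_read /var/log/httpd apache
```

The log files inside the directory must also be readable by that group, so check their permissions after making the change.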

For more information see the ACP manual’s section on configuring the Flume Receiver.

2.6.6. Repeat the installation process at all nodes

Now repeat the installation process for every node that you want to share your SVN repositories.

  • To ensure a successful induction, you will take the configuration files from the first node and use them during the installation of all additional nodes to ensure that all nodes are started with the same administrator account.
    You may benefit from creating an image of your initial server, with the repositories in place and using this as a starting point on your other nodes. This helps ensure that your replicas are in exactly the same state. For example capture a tar-ball image that can be copied to each machine and extracted, or alternatively you can use rsync.

  • Same location - All replicas must be in the same location (same absolute path) and in exactly the same state before replication can start.

  • Same UUID - If you start with new repositories, don’t create them individually at each node. Even though they may share the same repository data, each will have its own universally unique identifier (UUID), and unless they have the same UUID they’re not replicas.
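You can confirm the UUID requirement before induction by comparing the values that svnlook reports on each node. The sketch below contains only the comparison. The helper name, host name and repository path are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two repository UUID strings are
# non-empty and identical.
uuids_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Example (not run here): "svnlook uuid" prints a repository's UUID.
#   local_uuid=$(svnlook uuid /opt/Subversion/repo1)
#   remote_uuid=$(ssh node2.example.com svnlook uuid /opt/Subversion/repo1)
#   uuids_match "$local_uuid" "$remote_uuid" || echo "repo1: not replicas"
```

If the values differ, recreate the replica by copying the repository from one node rather than creating it separately.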

Ensure that all nodes have matching configuration before completing the inductions

  • Copy configuration (e.g. admin account property file, SSL certs) to all other servers on which you intend to install MSP.

  • Run the installer on those servers and continue to the induction. The installer lets you select the copied-over admin property file instead of manually entering details for the admin account.

  • If you do not provide the admin account property file during installation, or the admin accounts use LDAP, or the admin accounts change before induction, then you have to use the regular export-import process.

  • If you have conflicts in the admin accounts then you need to delete or rename accounts on the to-be-inducted node to remove the conflicts.

  • Make certain that induction is complete by looking at the UI of every inducted node and verifying that they all show zero pending transactions. Be patient: induction can take time, and you must not start the next induction prematurely. More on this below.

2.7. Node induction

After installing MSP at all sites, you need to make the nodes aware of each other through the node induction process. Follow the steps in this section, in the order that they are given.

2.7.1. Membership induction

It’s important that nodes are connected together in a specific sequence. Run through the following steps to ensure that your nodes can communicate with each other:

  1. When MSP is installed on all your sites, select one node to be your Inductor. This node accepts requests for membership and shares its existing membership information. It doesn’t matter which node you select.

    Inductor Node schematic
  2. Go to http://<Inductor’s IP>:8080/multisite-local/ to gather the necessary information. Most of it is available from the Settings tab.
    You will need to know the:

    Node ID

    The UUID of the node.

    Node Location ID

    The reference code that is used to define the inductor node’s location.

    Node IP Address

    The IP address of the inductor node server.

    Node Port No

    The DConE Port number (6444 by default).

  3. All your remaining nodes are now classed as Inductees. Select one of your Inductee nodes. Connect to its web admin console, http://<Inductee1>:8080/multisite-local/, and click the Nodes tab.

  4. Click the Connect to Node button and enter the details that you collected from your Inductor node.

    Connect to Node
    Consistency check revisions
    When inducting a new node, make sure it has the same number of consistency check revisions configured as the current nodes or the induction will fail.

    When these details are entered, click the Send Connection Request button. The inductor node will accept the request and add the inductee to its membership. You will need to refresh your browser to see that this has happened.

    Inducted node visible after refresh
  5. Check that all of the inducted nodes in your current ecosystem agree that this node is completely inducted and that there are no pending transactions at any site. Do not continue with any other induction until you have checked the UI of all of the current nodes in the ecosystem. Be patient, as induction can take a while.

  6. After they all agree, go back to step 3 and select one of your remaining inductees. Repeat this process until all the nodes that you want to be included in the current membership have been connected to the inductor.

2.7.2. If induction fails

If the induction process fails, you may be left with the inductee in a pending state:

  1. From the Nodes tab, review the state of your prospective node. During the induction process a prospect will display a Connectivity Status of Pending Induction. The process should move forward within a few seconds, providing that there isn’t a network connection problem.

    If the prospect appears to be stuck in the pending state then click the Cancel Induction link.

    Pending Nodes can be cancelled
  2. A growl message confirms that the induction was cancelled successfully. Click the Reload button to clear the cancelled induction.

    Growl confirms cancellation
  3. Repeat the induction procedure after confirming:

    • You are entering the correct details for the inductee node.

    • There isn’t a network outage between nodes.

    • There isn’t a network configuration problem, such as a firewall blocking the necessary ports.

    • There isn’t an admin account mismatch between nodes. This occurs if you don’t use the correct procedure for installing a second or subsequent node. If the admin account doesn’t match because nodes were not installed using the first node’s user.properties file, follow Match a node’s admin settings.

    • There isn’t a product license problem. If the license file clashes between two nodes, or is missing from a node, induction can fail. License problems are noted in the Application Logs.
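
To help rule out a network or firewall problem before retrying, the sketch below tests whether a TCP port on the inductor is reachable from the inductee. It is an illustration only: the function names are hypothetical, and port_open assumes that bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the host.

```shell
#!/bin/sh
# Succeed if $1 is a whole number in the valid TCP port range.
is_valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] 2>/dev/null && [ "$1" -le 65535 ] 2>/dev/null
}

# Hypothetical check: can we open a TCP connection to host $1 on port $2?
# Assumes bash and coreutils `timeout` are installed; gives up after 5 seconds.
port_open() {
    is_valid_port "$2" || return 2
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, port_open <Inductor’s IP> 6444 tests the default DConE port, and the same call with 8080 tests the admin UI port used in the URLs above. A successful connection only shows that the port is reachable, not that induction will succeed.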

2.7.3. Match a node’s admin settings

Ensure that all nodes start with a common admin account by importing the admin settings from the first installed node during the installation of all subsequent nodes. If a node is accidentally installed without this match you can use the following procedure to resync them. You’ll need to follow this if you wish to induct the mismatched node into a replication network that includes the other nodes.

  1. Log in to your first node, click on the Security tab and click Export Security Settings to perform a security (user) settings export.

    Security tab
  2. Access the same node using a terminal window. Copy the exported settings file (/opt/wandisco/svn-multisite-plus/replicator/export/security-export.xml) to a location on the node you’re fixing. You may need to create a directory. For example:

    /opt/wandisco/svn-multisite-plus/replicator/import/security-export.xml
  3. Log in to the admin UI of the node that you’re fixing to enable induction. Click on the Security tab then click the Import Security Settings button.
    Enter the path to the copied security-export.xml file, then click Check.

    Import Security Settings
  4. You’ll be presented with a Diff report that shows you what differences exist between the current user settings and those in the exported file.

    Enter Security Settings

    Click Overwrite. The admin user settings will now match those used in the other nodes.

  5. Now that the admin user account details are matching again you’ll be able to complete an induction of the corrected node into a replication network.

2.8. Create a replication group

MSP lets you share specific repositories between selected nodes. This is done by creating Replication Groups that contain a list of nodes and the specific repositories they will share.

This illustration shows a collection of four nodes that are running two replication groups. Replication Group 1 replicates Repo1 across all four nodes, while Replication Group 2 replicates Repo2 across a subset of nodes.

Follow this procedure to create a Replication Group.

Replication Rules:
  • A node can belong to any number of replication groups.

  • A repository can only be part of a single active replication group at any particular time.

  • You can easily move a repository between replication groups.

  1. When you have nodes defined, click on the Replication Groups tab. Then click on the Create Replication Group button.

    Create Replication Group
  2. Enter a name for your Replication Group in the Group Name field. Then select an existing Node from the drop-down list.
    You can select any number of available nodes.

    Enter a name and add some nodes
    Local node automatically made the first member
    You cannot create a replication group remotely - the node on which you are creating the group must itself be a member. For this reason, when creating a replication group, the first node is added automatically.

    Those nodes that you select will appear as clickable buttons.

  3. New nodes are added as Active Voters (denoted with AV). You can change the type of a node by clicking on its label. For an explanation of what each node type does, view the Guide to Node Types.

    Change node type
    Replication Group Validation
    The admin UI won’t let you create a replication group that doesn’t meet the requirements set by DConE. For example, the proposed replication group must not have an even number of voter nodes unless it also has a tiebreaker. When the selected member nodes don’t make a valid replication group, the Create Replication Group button is disabled (greyed out).
    Advice on creating effective replication groups

    For a description of rules for replication, read Creating resilient Replication Groups. Nodes are automatically added to a group as Active Voters.

    Tiebreaker availability
    If you add or remove nodes so that a replication group goes from having an even to an odd number of voter nodes, any node that is assigned as a tiebreaker will lose this designation. Tiebreakers are only applicable/available when there is an even number of voter nodes and the corresponding risk of a "split brain".
    To understand the differences between the types of nodes, read the Guide to Node Types.

    When you have added all nodes and configured their type, click Create Replication Group.

  4. Replication Groups that you create will be listed on the Replication Groups tab.

    Group boxes: click View to see your options
Don’t cancel replication group creation tasks
If you create a new replication group, then find that the task is stuck in pending because one of your nodes is down, do not use the Cancel Tasks option on the Dashboard’s Pending Tasks table.
If, when all nodes are up and running, the replication group creation tasks are still not progressing, please contact the WANdisco support team for assistance.
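
The voter-count rule described under Replication Group Validation can be sketched as a small check. This models only the single rule stated in this guide (an even number of voters needs a tiebreaker); DConE may apply further requirements, and the function name voters_valid is hypothetical.

```shell
#!/bin/sh
# $1 = number of voter nodes in the proposed group
# $2 = "yes" if one of the voters is designated as tiebreaker, otherwise "no"
voters_valid() {
    voters=$1
    tiebreaker=$2
    # Reject anything that is not a positive whole number.
    [ "$voters" -ge 1 ] 2>/dev/null || return 1
    if [ $((voters % 2)) -eq 1 ]; then
        # Odd number of voters: a vote cannot be evenly split.
        return 0
    fi
    # Even number of voters: only valid with a tiebreaker.
    [ "$tiebreaker" = "yes" ]
}
```

For example, voters_valid 4 no fails, which corresponds to the case where the Create Replication Group button is disabled, while voters_valid 4 yes and voters_valid 3 no both succeed.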

2.9. Add repositories

When you have added at least one replication group you can add repositories to your node.

Unique UUIDs

The Subversion repository UUID should be unique across all of the repositories.
It is possible to create multiple repositories with the same UUID (for example using the svnadmin load --force-uuid command, or simply copying the repository into a different name).
If you are using Apache or a long-lived svnserve process, and you have multiple repositories with the same UUID, then you will eventually end up with repository corruption unless your repositories are updated into the Subversion 1.9 native format (7). Only in repository format 7 is it safe to serve multiple repositories with the same repository UUID, however it is unwise to do so as eventually this will cause confusion for your Subversion users.
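
To look for repositories that share a Subversion UUID, something like the following sketch may help. It assumes all repositories sit directly under one parent directory (an assumption about your layout), uses svnlook uuid to read each UUID, and the function names are hypothetical.

```shell
#!/bin/sh
# Read "UUID path" lines on stdin and print every UUID that occurs more than once.
duplicate_uuids() {
    awk '{ print $1 }' | sort | uniq -d
}

# Hypothetical wrapper: print "UUID path" for each repository directly under $1.
# A directory is treated as a repository if it contains a top-level "format" file.
# Requires the Subversion command-line tools to be installed.
list_repo_uuids() {
    for repo in "$1"/*/; do
        [ -f "${repo}format" ] || continue
        printf '%s %s\n' "$(svnlook uuid "$repo")" "$repo"
    done
}
```

Running list_repo_uuids /opt/Subversion | duplicate_uuids prints nothing when every repository under that directory has a distinct UUID.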

MSP Administrator note: MSP has its own UUID to track a Subversion repository. This is NOT to be confused with the Subversion repository UUID. It is normally only interesting to WANdisco support personnel.

  1. Click on the Repositories tab. Click on the Add button.

    Repositories > Add
  2. Enter the following information, then click Add Repo:

    Repo name

    Choose a descriptive name. This doesn’t need to be the folder name, it can be anything you like.

    FS Path

    The local file system path to the repository. This needs to be the same across all nodes.

    Replication Group

    The replication group in which the repository is replicated. It is the replication group that determines which nodes host repository replicas, and what role each replica plays.

    Global Read-only

    Check box that lets you add a repository that will be globally read-only. You can deselect this later. In this state MSP continues to communicate system changes, such as repository roles and scheduling, however, no repository changes will be accepted, either locally or through proposals that might come in from other nodes.

    Create New Repository

    If the repository already exists it must be tested before you place it under the control of MSP. If it doesn’t already exist then tick the Create New Repository box to create it at the same time as adding.

    Repositories > Enter details then click Add Repo
    Repository stuck in Pending state
    If a repository that you added gets stuck in the deploying state, you see this on the Dashboard, in the Replicator Tasks area. You can cancel the deployment and try adding the repository again. To cancel a deployment, go to the Dashboard Replicator Tasks area and click the Cancel Task link.
    If you are creating a new repository and this gets stuck in Pending state, see Fix pending repository creation.
  3. Click the Repositories tab to see a list of the repositories added.

    Repositories listed

    Information in the repositories list describes the master branch, not the whole repository.
    See the Reference section for more details on the Repository list.

3. Upgrade Guide

This upgrade procedure describes how to upgrade and keep most, if not all, of your existing configuration.

The version number of each MSP component is tracked. You need to know the svn-ms-replicator version. You can see this:

  • Listed at the bottom of the admin UI

  • Listed in System Data at the bottom of the settings tab of the admin UI, along with other component versions

  • In this file, accessible without starting up:

    /opt/wandisco/svn-multisite-plus/VERSION

3.1. From MSP 1.5.3.2 to later

3.1.1. Before you upgrade

Before starting an upgrade:

  • Recheck the installation checklist.

  • Back up existing Apache config files. When the upgrade is completed you should verify that it hasn’t made Apache configuration changes that will adversely affect your operation.

  • All repositories must have unique names - Duplicate repository names are not allowed.

  • Do not make any system changes, like adding new repositories, until the upgrade has been completed.

  • If you are also installing Access Control Plus with auditing functionality, make sure that you set the following variables:

    • FLUME_INSTALL_DIR: Flume install location, default is /opt/wandisco/flume-svn-multisite-plus
      Do not set the Flume install variable to a directory that is inaccessible, i.e. one that is not writable by anyone, including root. Also, do not set it to a directory that exists and contains any critical data (such as repositories).

    • ENABLE_AUDITING: Set to true to install auditing and false to not install auditing.
      For detailed information see ACP installation instructions.

  • In MSP 1.6 the functionality for tiebreaker nodes changed slightly. After upgrading from an earlier version of MSP, if you make a change to a replication group that has an odd number of voter nodes, including a change to the schedule, any tiebreaker nodes will lose their special tiebreaker designation. Tiebreaker nodes are now only available when there is an even number of voter nodes, along with the corresponding risk that a vote could be evenly split between voters. Note that this change is not applied on upgrading to MSP 1.6, but only when you then alter an applicable replication group.
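
If you are setting the auditing variables, a pre-flight check along the following lines may catch mistakes before the installer runs. This is a sketch under assumptions: the function name check_audit_vars is hypothetical, and the check that an existing directory should be empty is a cautious reading of the warning above rather than an installer requirement.

```shell
#!/bin/sh
# Validate ENABLE_AUDITING and FLUME_INSTALL_DIR before running the installer.
check_audit_vars() {
    case ${ENABLE_AUDITING:-} in
        true|false) ;;
        *) echo "ENABLE_AUDITING must be true or false" >&2; return 1 ;;
    esac
    # Fall back to the documented default location when the variable is unset.
    dir=${FLUME_INSTALL_DIR:-/opt/wandisco/flume-svn-multisite-plus}
    if [ -d "$dir" ]; then
        [ -w "$dir" ] || { echo "$dir is not writable" >&2; return 1; }
        # Warn (but do not fail) if the directory already holds files.
        [ -z "$(ls -A "$dir")" ] || echo "warning: $dir is not empty" >&2
    fi
    return 0
}
```

A typical use would be to export both variables in the shell that will run the installer, call check_audit_vars, and continue only if it succeeds.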

3.1.2. Upgrade procedure

  • Ensure that no corresponding repositories are stuck in Local Read-only mode.

  • Ensure that there’s at least one defined replication group on the nodes, which is required for a synchronized stop.

  • Upgrade one node first, then upgrade the rest in parallel
    Select one node and run the upgrade on this node to completion. Type y to the question:

    Is this the first node? y/N:

    When this is complete you can then upgrade all the other nodes in parallel by typing n to answer this question.
    Each upgrade will leave MSP "down" on that node. Do not bring up any node until ALL of the nodes have been upgraded.

If your upgrade of the initial node fails for any reason, then you must contact WANdisco Support immediately, without trying to upgrade any other nodes.
  1. Open a terminal window and log in to the server.

  2. Get the latest installer file and make sure it is executable:

    chmod a+x svn-multisite-plus.sh
  3. Run the installer:

    ./svn-multisite-plus.sh
    Verifying archive integrity... All good.
    Uncompressing WANdisco SVN MultiSite Plus........................
    Running in non-interactive mode, installing with user 'wandisco' and group 'wandisco'. Output will be logged
    to the daemon.info syslog facility
    
    Please enter the username of an administrative user: admin
    Please enter password for the 'admin' administrative user:xxxxxxxxxx
    
    This process must be run on ALL nodes individually, in sequence. NOT AT THE SAME TIME
    If this is the first node we will use this node to co-ordinate the upgrade.
    Is this the first node? y/N: y
    
    Stopping ui:..[  OK  ]
    Stopping replicator:..[  OK  ]
    
    Backup process logged at /opt/wandisco/svn-multisite-plus/logs/backup.log
    
    Version 1.2.2-SNAPSHOT Build: 29412 backup found in /opt/wandisco/svn-multisite-plus/replicator/database
    /backup/2014-06-17T16:58:54Z_DConE_Backup
    
    ......................................
    
    Transformation complete
    Jun 17, 2014 4:59:22 PM com.wandisco.fsfs.backup.FsfsRestore main
    INFO: main:[About to restore database from /opt/wandisco/svn-multisite-plus/replicator/database/
    backup/2014-06-17T16:58:54Z_DConE_Backup to /opt/wandisco/svn-multisite-plus/replicator/database]
    
    Checking SVN
    SVN version 1.7.9-2 is already installed, will not install SVN
    
    Beginning WANdisco SVN MultiSite Plus Installation...
    Attempting to run backup on existing install of SVN MultiSite Plus...
    Ignore the installation script warning that you should upgrade the nodes individually, in sequence!

    As MSP is already installed and running, the installer determines that you wish to perform an upgrade instead of an initial installation:

    SVN MultiSite Plus Pre-Upgrade Backup Script
    
    Please enter password for the 'admin' administrative user:
  4. Enter the MSP admin credentials:

    Initiating synchronised stop...
    Node has achieved synchronised stop
    Initiating database dump...
    Database dump complete
    Stopping ui:.                                              [  OK  ]
    Stopping replicator:.                                      [  OK  ]
    Performing backup...
    Backup has been stored in /opt/wandisco/svn-multisite-plus/var/backups
    
    Starting ui:[  OK  ]
    Starting replicator:[  OK  ]
    
    Backup process logged at /opt/wandisco/svn-multisite-plus/logs/backup.log
    
    Version 1.2.2-SNAPSHOT Build: 29412 backup found in
    /opt/wandisco/svn-multisite-plus/replicator/database/backup/2014-06-17T16:58:54Z_DConE_Backup
    ......................................
    
    Transformation complete
    Jun 17, 2014 4:59:22 PM com.wandisco.fsfs.backup.FsfsRestore main
    INFO: main:[About to restore database from /opt/wandisco/svn-multisite-plus/replicator/
    database/backup/2014-06-17T16:58:54Z_DConE_Backup to /opt/wandisco/svn-multisite-plus/replicator/database]
    Jun 17, 2014 4:59:22 PM com.wandisco.database.prevayler.DatabaseInstance createLocation
    INFO: main:[Writing to database location /opt/wandisco/svn-multisite-plus/replicator/database]
    Jun 17, 2014 4:59:22 PM com.wandisco.database.prevayler.Manager create

    The installer stops your nodes, then backs up the application data. The repository data is not touched.

  5. After all nodes have been upgraded, choose 1 node to bring up first.

    Do not bring up any node until ALL of the nodes have been upgraded.

    With the local replicator restarted, log in to the admin UI and click the Settings tab. You can confirm that MSP has been updated by looking at the Module versions.

  6. Bring up all other nodes. Verify that all nodes are communicating with one another via the Nodes page. If you have any issues, please contact WANdisco support team.

3.2. Migration from SVN MultiSite 4.2

This section gives important information if you’re upgrading from an earlier version of SVN MultiSite.

3.2.1. Replica consistency

Summary: MSP uses a stronger definition of identical

SVN MultiSite 4.x and earlier versions require that repository replicas are identical, in that all replicas contain identical revisions. However, cosmetic differences may exist, for example in revision time stamps. MSP uses a stronger definition of identical: the repositories must now be byte-for-byte the same (identical in terms of properties, time stamps, etc.).

A migration from SVN MultiSite to MSP therefore requires that you get rid of all but one copy of each repository. This remaining copy becomes your master and must be copied to each applicable node. This ensures that you start from a position where the replicas are identical.

3.2.2. Migration strategy

Per-repository
  • Repositories are migrated one at a time.

  • There is less risk.

  • This is the preferred approach if installing on existing servers.

  • May not be able to preserve the repository URL.

Full migration
  • All repositories are migrated in a single operation.

  • There is more risk, because all your eggs are in one basket.

  • The preferred approach when installing onto new servers.

  • Biggest obstacle is managing the migration of repository data.

3.2.3. Copying strategy

In production, repositories can be extremely large, so large that making copies of them can be challenging. There are a number of strategies you can use to manage the distribution.

Strategy 1

Make a physical copy of each repository for real-world transfer. This option makes most sense when the scale of repository data is so great that a WAN network transfer is not practical (e.g. copying terabytes of repository data over a low/moderate bandwidth connection).

Strategy 2

Full rsync (without checksum). This option is a direct solution that works best if you have the opportunity to copy all repository data during a maintenance window. You won’t be able to continue committing changes to the original repositories once done.

Strategy 3

Incremental rsync (with checksumming). This option takes a copy of the repository from each location’s local server, and then uses an incremental rsync with checksums to make the new copy identical to the master copy. WANdisco has a script that can incrementally "top up" changes made to your 4.2 repositories to an MSP repository storage location to enable fast cutover from 4.2 AC/MS to MSP. Please contact WANdisco support for more information.

This approach updates the MSP copies of the repositories with the commits made since the last rsync. However, these repositories are not guaranteed to be correct.

To guarantee a correct repository, you need to run a final incremental rsync after the repository has been made read-only. In addition, repositories that are being incrementally updated should not be added to MSP. Addition to MSP should only be done after all incremental updates are finished.
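
Strategies 2 and 3 can be sketched with a single helper. This is an illustration, not the WANdisco top-up script mentioned above: the function name sync_repo is hypothetical, the options used (-a, --delete, --checksum) are standard rsync options, and the choice of --delete (so that files removed from the master copy are also removed from the destination) is an assumption about what "identical" requires.

```shell
#!/bin/sh
# $1 = source repository path
# $2 = destination path (local, or host:path for a remote copy over SSH)
# $3 = "checksum" to compare file contents rather than size and timestamp
sync_repo() {
    src=${1%/}
    dest=${2%/}
    if [ "${3:-}" = "checksum" ]; then
        # Strategy 3: incremental pass that verifies content with checksums.
        rsync -a --delete --checksum "$src/" "$dest/"
    else
        # Strategy 2: full copy without checksums.
        rsync -a --delete "$src/" "$dest/"
    fi
}
```

Repeated calls with the checksum argument top up the copy while the source is still in use. The final call, made once the source repository is read-only, is the one that matters for correctness, as explained above.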

3.3. Roll back to an earlier version

Follow these steps to roll back MSP to an earlier version. Complete each step on all nodes before continuing to the next step.

  1. Stop the service on all nodes

    service svn-multisite-plus stop

    or if using SLES 12/systemd:

    systemctl stop wdmsp.target
  2. Back up your existing installation on all nodes

    cp -rp /opt/wandisco/svn-multisite-plus /opt/wandisco/svn-multisite-plus-bak
  3. Uninstall MSP on all nodes

    yum -y erase svn-multisite-plus
  4. Remove product files on all nodes

    rm -rf /opt/wandisco/svn-multisite-plus
  5. Fully complete the installer for the prior version.
    Note: The existing product license will be at /opt/wandisco/svn-multisite-plus-bak/replicator/properties/license.key

  6. Copy the MSP backup that was generated during upgrade:

    mkdir -p /opt/wandisco/svn-multisite-plus/var/backups
    cp -rp /opt/wandisco/svn-multisite-plus-bak/var/backups/* /opt/wandisco/svn-multisite-plus/var/backups/
  7. Run the rollback script:

    cd /opt/wandisco/svn-multisite-plus/bin
    ./rollback

    This stops the service.

  8. Start the service:

    service svn-multisite-plus start

    or if using SLES 12/systemd:

    systemctl start wdmsp.target
  9. Check that the version number on the Settings page is correct.
    Note: You may have to clear your browser’s cache.

  10. Also check that core functionality is correct, i.e. checkout (CO), commit (CI), replication, etc.

In the earlier version you will see that recent features are no longer present. All inducted nodes, replication groups, and repositories, etc. are still present and correct.

3.4. Upgrade or downgrade SVN binaries

The versions of MSP are tied one-to-one with specific versions of Subversion. You must not try to install, upgrade or downgrade the Subversion components independently of MSP. This can and will cause significant damage to your installation.

3.5. svnadmin Upgrade Procedure

3.5.1. SVN 1.9 FSFS changes

SVN 1.9 uses the new format 7 file system. The key changes to FSFS are chiefly aimed at improving performance, through the following enhancements:

  • Logical addressing

  • Improved organization of revision data storage when using svnadmin pack

  • Block-level reads are now cached

Feature support by FSFS format

The following table clearly indicates which FSFS formats support each of the new SVN 1.9 features:

Each feature is listed with its support in the four formats: Format 6, Format 7 Upgraded, Format 7 Native and Format 7 Native (packed).

  • Reduction in dynamic memory usage. Where feasible, temporary buffers have a fixed maximum size now. Other temporary containers have been reduced in memory consumption.
    Format 6: Yes | Format 7 Upgraded: Yes | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • Saturate 10Gb networks from SVN caches. If almost all requests can be served from SVN fulltext caches etc., an 8-core server running Apache can saturate a 10Gb network with uncompressed data. It will take 20+ concurrent checkout or export requests to generate that load.
    Format 6: Yes | Format 7 Upgraded: Yes | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • Saturate 1Gb networks from OS caches. If virtually all requests can be served from the OS file cache, a 4-core server running Apache can saturate a 1Gb network with uncompressed data. It will take 2 or more concurrent checkout or export requests to generate that load.
    Format 6: Yes | Format 7 Upgraded: Yes | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • svnadmin pack does not block commits.
    Format 6: No | Format 7 Upgraded: Yes | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • Full checksum coverage of revision data. Not only user file contents, directories and properties are protected by checksums but also the meta-data tying them together. This only detects external corruption caused by rogue scripts, hard disk failure etc. and will not help against internal corruption caused by faulty SVN logic.
    Format 6: No | Format 7 Upgraded: No | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • Quick verification to find external corruption. Verifies a repository at several 100MB/s and does not slow down with increasing number of revisions. This allows for a much faster health check after system failure.
    Format 6: No | Format 7 Upgraded: No | Format 7 Native: Yes | Format 7 Native (packed): Yes

  • Fast access to cold data on disk. Core feature of format 7. Revision data is read about twice as fast as with older formats. Assuming reading data from disk being 10x slower than from OS caches and a mere 10% OS cache misses, this translates into 30% higher overall throughput with format 7 over previous formats.
    Format 6: No | Format 7 Upgraded: Yes | Format 7 Native: Yes | Format 7 Native (packed): Yes
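
The throughput estimate given for fast access to cold data can be checked with a little arithmetic. The sketch below uses one simple cost model, which is an assumption and not necessarily the model behind the quoted figure: a cache hit costs 1 unit, a disk read costs 10 units, 10% of reads miss the cache, and format 7 halves the cost of a disk read. The function name throughput_gain is hypothetical.

```shell
#!/bin/sh
# $1 = cache miss ratio (0.1 means 10% of reads go to disk)
# $2 = cost of a disk read relative to a cache hit (10 means 10x slower)
# $3 = factor by which the newer format speeds up disk reads (2 means twice as fast)
# Prints the ratio of the old average read cost to the new one.
throughput_gain() {
    awk -v miss="$1" -v disk="$2" -v speedup="$3" 'BEGIN {
        before = (1 - miss) + miss * disk
        after  = (1 - miss) + miss * disk / speedup
        printf "%.2f\n", before / after
    }'
}
```

With the figures above, the average cost per read falls from 0.9 + 1.0 = 1.9 units to 0.9 + 0.5 = 1.4 units, so throughput_gain 0.1 10 2 prints 1.36. That is a gain of roughly a third, in the same range as the approximately 30% quoted.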

3.5.2. Get Packing

Regular packing of repository data is now recommended, since commits are now accepted during the packing, doing away with the old requirement for downtime during packing.

Packing FSFS filesystems

Subversion 1.9 FSFS-backed repositories create, by default, a new on-disk file for each revision added to the repository. Having thousands of these files present on your Subversion server, even when housed in separate shard directories, can be inefficient. Firstly, the OS has to reference many different files over a short period of time, leading to inefficient use of disk caches and, as a result, more time spent seeking across large disks. Because of this, Subversion pays a performance penalty when accessing your versioned data.

A second problem is that, because of the way filesystems allocate disk space, each file claims more space on the disk (from 2 to 16 kilobytes per file) than it actually uses. This gives FSFS-backed repositories a per-revision disk usage penalty, and the penalty can be noticeable on repositories that have extremely large numbers of small files.

Benefits of packing

Packing was introduced in Subversion 1.6. It works by concatenating all the files of a completed shard into a single "pack" file and then removing the original per-revision files. This reduces the file count within a given shard down to just a single file. In doing so, it aids filesystem caches and reduces (to one) the number of times a file storage overhead penalty is paid.
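
As a rough worked example of the allocation overhead, the sketch below multiplies a number of unpacked revision files by an assumed slack per file. The function name slack_kib is hypothetical, and the calculation ignores everything except the 2 to 16 kilobyte range quoted above.

```shell
#!/bin/sh
# $1 = number of unpacked per-revision files
# $2 = assumed wasted space per file, in kilobytes
# Prints the estimated total overhead in kilobytes.
slack_kib() {
    echo $(( $1 * $2 ))
}
```

For a repository with 100,000 unpacked revisions, slack_kib 100000 2 gives 200000 kilobytes (about 195 MB) and slack_kib 100000 16 gives 1600000 kilobytes (about 1.5 GB). Packing reduces this overhead to one file per shard.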

Subversion can pack existing sharded repositories which have been upgraded to the 1.6 filesystem format or later (see svnadmin upgrade).

Example packing operation

To pack a repository, run svnadmin pack on it:

$ svnadmin pack /opt/Subversion/repo1
Packing shard 0...done.
Packing shard 1...done.
Packing shard 2...done.
...
Packing shard 34...done.
Packing shard 35...done.
Packing shard 36...done.
$

Running under FSFS format, you can run svnadmin pack on repositories that are in use, or even as part of a post-commit hook. Repacking packed shards is legal, but will have no effect on the disk usage of the repository.

3.5.3. svnadmin upgrade

MSP 1.9.0 introduces support for svnadmin upgrade for the first time.

This operation is done via a proposal, in other words, it is a coordinated action. While the operation is usually very fast (it can appear to happen instantly), there are some scenarios where an svnadmin upgrade can take an unexpectedly long time, during which time the repository will be locked.

Potentially very slow svnadmin upgrades with Format 4

Back in Subversion 1.7 (using FSFS format 4), there was support for revision packing, but not for revprop packing and so a packed shard had unpacked revprops. Later formats expect revprops to be packed if revisions are packed. Upgrading format 4 involves packing all the revprops corresponding to the packed revisions, reading those revprops, one file per revision, and writing the revprops, a few files per shard. So if a large number of revisions are packed then the runtime is likely to be proportional to the number of revisions.

Another factor that can delay the process is repository write-lock, although upgrade is no different to commit in this respect.

Note that the upgrade of an unpacked repository is fast as no revprops need to be packed.
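
Before scheduling an upgrade, it may help to know whether a repository is in an older format and whether it has packed revisions. The sketch below reads two files inside the repository. It assumes the usual FSFS layout, where the first line of db/format holds the format number and db/min-unpacked-rev records the lowest revision that has not yet been packed. Treat both as assumptions to verify against your own repositories; the function names are hypothetical.

```shell
#!/bin/sh
# Print the FSFS format number, read from the first line of db/format.
# $1 = repository path
fsfs_format() {
    head -n 1 "$1/db/format"
}

# Succeed if the repository appears to have packed revisions,
# i.e. db/min-unpacked-rev exists and holds a value greater than zero.
has_packed_revs() {
    f=$1/db/min-unpacked-rev
    [ -f "$f" ] && [ "$(head -n 1 "$f")" -gt 0 ] 2>/dev/null
}
```

A repository that reports format 4 and has packed revisions is the case described above, where the upgrade has to pack revprops and may take a long time.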

Upgrade changes the file db/format. With Subversion 1.9 you can check the format before and after the upgrade using svnadmin info:

# svnadmin create --compatible-version 1.6 myrepo
# svnadmin info myrepo | grep 'Filesystem Format'
Filesystem Format: 4
# svnadmin upgrade myrepo
Repository lock acquired.
Please wait; upgrading the repository may take some time...
Bumped repository format to 7

Upgrade completed.
# svnadmin info myrepo | grep "Filesystem Format"
Filesystem Format: 7

One other thing that can cause upgrade to take a long time is some other process holding a write lock:

$ svnadmin freeze repo sleep 60 & # background
$ svnadmin upgrade repo # blocks for freeze to complete

4. Administration Guide

4.1. Housekeeping

4.1.1. Starting up

To start the MSP replicator:

  1. Open a terminal window on the server and log in with suitable file permissions.

  2. Locate the svn-multisite-plus service script in the /etc/init.d folder:

    lrwxrwxrwx 1 root root 55 Jul 11  2016 svn-multisite-plus -> /opt/wandisco/svn-multisite-plus/bin/svn-multisite-plus
  3. Run the start script using the service command to make certain that the MSP execution environment is the same on every run. Do not execute the script by hand when starting up MSP:

    root# service svn-multisite-plus start
    Starting ui:                                               [  OK  ]
    Starting replicator:                                       [  OK  ]

    or if using SLES 12/systemd:

    root# systemctl start wdmsp.target
    Platform dependent commands
    See here for more information on platform specific start and stop commands.
  4. The two components of MSP, the replicator and the UI, will start up. Read more about the svn-multisite-plus init.d script.

4.1.2. Shutting down

To shut down:

  1. Open a terminal window on the server and log in with suitable file permissions.

  2. Locate the svn-multisite-plus service script in the /etc/init.d folder:

    lrwxrwxrwx 1 root root 55 Jul 11  2016 svn-multisite-plus -> /opt/wandisco/svn-multisite-plus/bin/svn-multisite-plus
  3. Run the shutdown script using the service command.

    The service command is not required here but it is a good habit to use it when executing scripts from /etc/init.d. Note that it is critical to use the command when starting things up to ensure consistency.
    root# service svn-multisite-plus stop
    Stopping ui:.                                              [  OK  ]
    Stopping replicator:.                                      [  OK  ]

    or if using SLES 12/systemd:

    root# systemctl stop wdmsp.target
    Platform dependent commands
    See here for more information on platform specific start and stop commands.
  4. Both the replicator and the UI processes shut down.

4.1.3. Startup Script Commands

As of MSP 1.9.5 systemd commands are used for platforms that support only systemd (without compatibility mode). On all other platforms the chkconfig commands are used. See the sections below for these different commands, and here for more information on platform specific commands.

chkconfig commands

Service Command    Behavior
start              Start the application
stop               Stop the application
restart            Restart the application
status             Show whether the application is running or not
version            Display the application version

Example: service svn-multisite-plus restart

systemd commands

Systemctl Command  Behavior
start              Start the application
stop               Stop the application
restart            Restart the application
status             Show whether the application is running or not

Example: systemctl start wdmsp.target

Note: To obtain version information for MSP on a systemd governed system, please execute /opt/wandisco/svn-multisite-plus/bin/svn-multisite-plus version.
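
The two command families above can be summarized in a small helper. This is an illustrative sketch only: the function name msp_command and the use_systemd flag are assumptions, and working out whether a platform is systemd-only is left to the caller.

```python
MSP_SCRIPT = "/opt/wandisco/svn-multisite-plus/bin/svn-multisite-plus"
ACTIONS = {"start", "stop", "restart", "status", "version"}


def msp_command(action, use_systemd):
    """Build the command line for an MSP action as an argument list."""
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if not use_systemd:
        return ["service", "svn-multisite-plus", action]
    if action == "version":
        # systemctl has no version command for MSP, so the script is
        # called directly, as described in the note above.
        return [MSP_SCRIPT, "version"]
    return ["systemctl", action, "wdmsp.target"]
```

For example, msp_command("restart", use_systemd=False) gives the same command as the chkconfig example, and the returned list can be passed to subprocess.run.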

4.1.4. Change a locally defined administration account password

You can change a locally defined MSP account’s password at any time by following this procedure:

  1. Log in to the MSP admin console.

    Figure: Login
  2. Click the Security tab.

    Figure: Security
  3. At the top of the Security screen is the password change form. Enter the current password, along with a new password.

    Figure: Changed password
  4. Click the Save button to store the new password. A growl message will appear.

    Figure: Growl
    Changing Account Name
    You cannot currently change the Administration account name. To change the account name you would need to add a new administrative account with the desired name and then remove the original account name.

4.1.5. Update your license.key file

Follow this procedure if you need to change your product license. You would need to do this if, for example, you needed to increase the number of SVN users or the number of replication nodes.

  1. Log in to your server’s command line, navigate to the properties directory /opt/wandisco/svn-multisite-plus/replicator/properties, and rename license.key to a date-stamped backup name, for example license.20170606.
    Before the rename, the directory looks like this:

        total 16
        -rw-r--r-- 1 wandisco wandisco 1183 Dec  5 15:58 application.properties
        -rw-r--r-- 1 wandisco wandisco  512 Dec  5 15:05 license.key
        -rw-r--r-- 1 wandisco wandisco  630 Dec 17 15:43 logger.properties
        -rw-r--r-- 1 wandisco wandisco  747 Dec  4 10:31 svnok.catalog
  2. Get your new license.key and drop it into the /opt/wandisco/svn-multisite-plus/replicator/properties directory.

  3. Restart the replicator by running the MSP script with the following argument:

    service svn-multisite-plus restart

    or if using SLES 12/systemd:

    systemctl restart wdmsp.target
    Platform dependent commands
    See here for more information on platform specific start and stop commands.

    This will trigger an MSP replicator restart, which will force MSP to pick up the new license file and apply any changes to permitted usage.

    If you don’t restart
    If you follow the above instructions but don’t restart, MSP will continue to run with the old license until it performs its daily license validation (which runs at midnight). Provided that your new license key file is valid and has been put in the right place, MSP will then update its license properties without the need to restart. However, if the license file is corrupt, or belongs to a different WANdisco product, then MSP will shut down. We therefore recommend restarting MSP during working hours to avoid needing to come in after midnight if there is an issue.

    If you run into problems, check the replicator log (/opt/wandisco/svn-multisite-plus/replicator/log) for more information. No message will appear on the dashboard as the system will not start with a bad license.
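
Steps 1 and 2 of this procedure can be scripted. The sketch below is a hedged illustration, not a WANdisco tool: the function name install_license is hypothetical, it does not set file ownership or permissions, and the replicator restart in step 3 is still needed afterwards.

```python
import datetime
import shutil
from pathlib import Path

PROPERTIES_DIR = Path("/opt/wandisco/svn-multisite-plus/replicator/properties")


def install_license(new_license, properties_dir=PROPERTIES_DIR, today=None):
    """Rename license.key to license.YYYYMMDD, then copy in the new key.

    Returns the path of the date-stamped backup file.
    """
    properties_dir = Path(properties_dir)
    today = today or datetime.date.today()
    current = properties_dir / "license.key"
    backup = properties_dir / f"license.{today:%Y%m%d}"
    if backup.exists():
        # Refuse to overwrite an earlier backup made on the same day.
        raise FileExistsError(f"backup already exists: {backup}")
    current.rename(backup)
    shutil.copyfile(new_license, current)
    return backup
```

Keeping the old key under a dated name means it can be restored quickly if the new file turns out to be invalid.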

4.1.6. Update a node’s properties

In the System Data section of the Settings tab there’s a bank of editable properties that can be quickly updated by re-entering, saving and allowing the MSP replicator to restart - although this may cause brief disruption to users whose in-flight commits may fail.

Figure: Node properties that you can change - subject to a restart of the replicator
Node Name

This is the human-readable form of the node’s ID. Unlike the Node ID you can change the value of Node Name and reuse it (after it has been removed from the replication network). You can’t have two nodes with the same name, but you can reuse a previously removed node name.

Location Longitude

The Node’s geographical location is no longer recorded during installation. Instead you enter the details here.

Location Latitude

Along with Longitude, this value places the node on the internal map and helps the application determine the local time for the node based on the timezone in which it falls.

Hostname / IP Address

The hostname or underlying IP address can be updated.
Changing this property initiates a Replicator restart and requires a manual UI restart.

If SSL is configured then after an IP address change all nodes must be manually restarted.
Only change one node
The UI can only be used to change the IP address of a single node at one time. If you need to change the address of multiple nodes see the KB article on How to use updateinetaddress.jar to change IP address. Please contact WANdisco support for assistance if you want to use this procedure.
DConE Port

The TCP port used for DConE agreement traffic - not to be confused with the Content Distribution port which carries the payload repository data.
Changing this property initiates a Replicator restart and requires a manual UI restart.

Dashboard Polling Interval (Minutes)

Sets how often the dashboard messaging is updated. The messaging is populated by Warnings and Errors that appear in the replicator log files. The default frequency is every 10 minutes.

Dashboard Item Age Threshold (Hours)

The number of hours a logged event is displayed on the Dashboard for. We recommend not setting this value lower than 96 hours so you don’t miss an important issue over a 3-day weekend.

After entering a new value, click the Save button. A growl message will appear to confirm that the change is being replicated - this will result in a restart of the replicator which may cause brief disruption to SVN users.

Other property changes

You can also modify other properties in the application.properties configuration file. By default it is located in /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties.

Take care when making changes to "hidden" properties
An error can affect product behavior and be difficult to trace. In most situations, you should only make changes with the assistance of WANdisco’s support team.
Content Delivery Port

To change the Content Delivery Port follow these steps. In this example we change the port for node1 to 4322:

  1. On the node you want to change open the application.properties file and change:
    content.server.port=4322

  2. On all remaining nodes create a new file in the replicator directory called update.properties. This file should have the following content (<node1nodeId> is a UUID):

    port.content.<node1nodeId>=4322

    Then run the following command from the replicator directory:

    java -jar svn-ms-replicator-updateinetaddress.jar -c /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

    This will update the route for the Content Delivery port on node1.

  3. Restart all nodes.

  4. To check your change has occurred, log in to the updated node and check its System Data.

    Also check the API on each node, which will show that node1 has the following entry after the change:

    <route>
        <routeType>ContentDistributionType</routeType>
        <hostname>node1.company.com</hostname>
        <port>4322</port>
    </route>
  5. Finally, you should do some test commits to ensure that replication continues successfully.
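
The check in step 4 can be automated once you have the API response as text. This is a minimal sketch that assumes the response contains route elements shaped like the fragment shown in step 4. How the XML is fetched, and the function name content_port, are not taken from the product and are illustrative only.

```python
import xml.etree.ElementTree as ET


def content_port(xml_text, hostname):
    """Return the Content Distribution port listed for hostname, or None."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root element, so a bare <route> fragment works.
    for route in root.iter("route"):
        if (route.findtext("routeType") == "ContentDistributionType"
                and route.findtext("hostname") == hostname):
            return int(route.findtext("port"))
    return None


sample = """
<route>
    <routeType>ContentDistributionType</routeType>
    <hostname>node1.company.com</hostname>
    <port>4322</port>
</route>
"""
assert content_port(sample, "node1.company.com") == 4322
```

Running the same check against every node confirms that each one has picked up the new route for node1.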

Task garbage collection

There are two configurable properties that control how often the task garbage collection process should run. These properties are set during the installation. If you need to modify their values you need to add them to the application.properties file.

task.removal.interval

This setting controls how often the task garbage collection process should run. The default is 96 hours, noted in milliseconds (i.e. 345600000 milliseconds for 96 hours).

task.expired.interval

This setting controls how old a successfully run task must be before it is made available for garbage collection. The default is 96 hours, noted in milliseconds (i.e. 345600000 milliseconds for 96 hours).

Summary: For large deployments reduce the time from 96 to 24 hours
The recommended settings are suitable for most deployments. However, for deployments with very large numbers (thousands) of repositories and where repository consistency checks are automated then we recommend that you reduce the setting times, initially to 24 hours (86400000 ms).

Shorter periods will result in a corresponding reduction in your ability to troubleshoot problems that involve replicator task history. If you notice large numbers of failed tasks accumulating over time or have any concerns about what settings are right for your specific deployment, contact WANdisco’s support team.

Example:
For a deployment that replicates several thousand repositories and schedules daily consistency checks it’s decided to reduce the task expiry to 48 hours and the garbage collection frequency to 24 hours. The settings would therefore be:

task.removal.interval   86400000L
task.expired.interval   172800000L

Note: you must add an "L" to the end of your value.
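
The millisecond values are easy to get wrong by a factor of ten, so it can help to compute them. The following sketch converts hours to the value format used in the example above. The helper names are ours.

```python
def hours_to_ms_value(hours):
    """Convert hours to a millisecond value with the required 'L' suffix."""
    return f"{int(hours * 60 * 60 * 1000)}L"


def task_gc_lines(removal_hours, expired_hours):
    """Build the two task garbage collection lines for application.properties."""
    return [
        f"task.removal.interval   {hours_to_ms_value(removal_hours)}",
        f"task.expired.interval   {hours_to_ms_value(expired_hours)}",
    ]


# The defaults and the example deployment described above.
assert hours_to_ms_value(96) == "345600000L"
assert task_gc_lines(24, 48) == [
    "task.removal.interval   86400000L",
    "task.expired.interval   172800000L",
]
```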

Node content distribution timeouts

There are three configurable properties that you can modify as part of fine-tuning an MSP deployment. They are provided to allow you to balance best possible performance against tolerance of poor WAN connectivity.

socket.timeout

socket.timeout=900000

This is the time, in milliseconds, that a read() call on a socket will wait before timing out. Default value is 15 minutes (900,000 milliseconds).

Not less than 10 minutes!
DO NOT set socket.timeout to less than 10 minutes (600,000 milliseconds) or you may encounter problems.

content.pull.timeout

content.pull.timeout=30000

The content pull timeout sets how long the Content Distribution system will wait for new content to be pulled fully over from a remote node. The default value is 30 seconds (30,000 milliseconds). This default is set on the assumption that there are no problems with the deployment’s WAN connectivity.

You should not increase the timeout value, and decreasing the value is not generally recommended.
Decreasing is not intended as a method for boosting performance - although this may occur in some situations. We recommend that you don’t drop the timeout value below 5000 (5 seconds) without consulting with our support team.

content.max.idle.time

content.max.idle.time=2147483647 (never)

This should be set to the amount of time, in milliseconds, that your networking (router) infrastructure will close an idle connection.

See the content distribution policy section for more information.
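
The limits described in this section can be checked before a replicator restart. The sketch below takes 10 minutes as 600000 ms and 15 minutes as 900000 ms. The function name check_timeouts and the warning wording are illustrative assumptions, and the check is not something MSP itself performs.

```python
MIN_SOCKET_TIMEOUT_MS = 10 * 60 * 1000       # 10 minutes = 600000 ms
DEFAULT_SOCKET_TIMEOUT_MS = 15 * 60 * 1000   # 15 minutes = 900000 ms
DEFAULT_PULL_TIMEOUT_MS = 30 * 1000          # 30 seconds
MIN_PULL_TIMEOUT_MS = 5 * 1000               # 5 seconds


def check_timeouts(props):
    """Return warnings for timeout values outside the advice in this section.

    props maps property names to values as read from application.properties.
    Missing properties fall back to the documented defaults.
    """
    warnings = []
    socket_timeout = int(props.get("socket.timeout", DEFAULT_SOCKET_TIMEOUT_MS))
    pull_timeout = int(props.get("content.pull.timeout", DEFAULT_PULL_TIMEOUT_MS))

    if socket_timeout < MIN_SOCKET_TIMEOUT_MS:
        warnings.append("socket.timeout is below 10 minutes")
    if pull_timeout > DEFAULT_PULL_TIMEOUT_MS:
        warnings.append("content.pull.timeout is above the 30 second default")
    elif pull_timeout < MIN_PULL_TIMEOUT_MS:
        warnings.append("content.pull.timeout is below 5 seconds; consult support")
    return warnings
```

An empty list means the values sit within the ranges recommended here. It says nothing about whether they suit a particular WAN.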

4.1.7. Set up data monitoring

The Monitoring Data tool monitors the disk usage of MSP’s database directory, providing a basic level of protection against MSP consuming all disk space. The tool also lets you set up your own monitors for user-selected resources.

Monitoring Data - not intended as a final word in system protection
Monitoring Data is no substitute for dedicated, system-wide monitoring tools. Instead, it is intended to be a 'last stand' against possible disk space exhaustion that could lead to data loss or corruption.
Read our Recommendations for system-wide monitoring tools.
Default settings
Figure: Click the View link to go to a monitor’s settings

By default MSP’s database directory (/opt/wandisco/svn-multisite-plus/replicator/database) is monitored - this is the location of MSP’s Prevayler database, where all data and transaction files for replication are stored.

This built-in monitor runs on all nodes. To add a new monitor click Add. Any additional monitors that you set up will monitor on a per-node basis. Monitors are not replicated so a monitor set up on one node is not applied to any other node.

Additional monitors

As well as MSP’s own database folder, there are several directories that might grow very large and potentially consume all available file space.

Consider monitoring the following MSP directories:

/opt/wandisco/svn-multisite-plus/replicator/content
/opt/wandisco/svn-multisite-plus/logs
/opt/wandisco/svn-multisite-plus/replicator/logs

Also monitor /path/to/authz. If you are using Authz to manage authorization and your Authz file is situated on a different file system from MSP, we recommend that you set up monitoring of the Authz file.

For most deployments all these directories reside on the same file system, so that your default monitor would catch if any of them were consuming the available space. However, there are two scenarios where we’d recommend that you set up your own monitor for the content directory:

  • You want to set a higher trigger amount than the default monitor (1GiB for warning, 0.09GiB for emergency shutdown).

  • You have placed the content directory on a different filesystem with its own capacity that wouldn’t be tracked by the default monitor.

In both cases, after setting up a monitor you should set up a corresponding email notification that is sent if some or all of your monitor’s trigger conditions are met.

Create additional resource monitors using the following procedure:

  1. Log in to the Administrator user interface.

  2. Click the Settings tab.

  3. Monitoring Data is situated below the Administrator Settings. Enter the full path to the resource that you wish to monitor. For example, you might wish to monitor the replicator logs: /opt/wandisco/svn-multisite-plus/replicator/logs. Enter the path and click Add.

    Figure: Add
  4. The new resource monitor appears as a new box - it will display No records found, indicating that it doesn’t yet have any monitoring rules set. Click Configure.

    Figure: Configure
  5. The screen updates to show the Resource Monitoring screen for your selected resource.

    Figure: Settings
    File Path

    The full path for your selected resource (must be a directory)

    Monitor Identity

    The unique string that will identify the monitor

    Edit Condition and Event List

    Lists current resource monitors, initially this will state No records found

  6. Add a Condition and Event to the list.

    Storage amount entry field

    Enter an amount of disk space in Gigabytes. e.g. 0.2 would be equal to 200 Megabytes of storage.

    Select an Event from the dropdown:

    SEVERE

    Initiates a shutdown of MSP and also writes a message to the log at the SEVERE logging level. See next section for more information.

    WARNING

    Writes a message to the log at the WARNING level of severity.

    INFO

    Writes a message to the log at the INFO level of severity.

    DEBUG

    Writes a message to the log at the DEBUG level of severity. Messages at this level will not be seen in the log file unless the logging configuration is changed from the default (INFO and more critical).

    For more information on levels view logging levels.

  7. When you have added all the trigger points and events that you require for the resource, click Update. You can then navigate away: Click Resource Monitoring on the breadcrumb trail to return to the settings screen.
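
To see how a set of conditions behaves before entering it in the UI, the comparison can be sketched in Python. This is an illustration of the idea only, not MSP's implementation. It assumes thresholds are given in GiB and that an event fires when free space falls below its threshold. The function names are ours.

```python
import shutil

GIB = 1024 ** 3
SEVERITY_ORDER = ["DEBUG", "INFO", "WARNING", "SEVERE"]


def triggered_events(free_bytes, conditions):
    """Return the events whose threshold is above the current free space.

    conditions is a list of (threshold_in_gib, event) pairs.
    The result is ordered from least to most severe.
    """
    hits = [event for threshold_gib, event in conditions
            if free_bytes < threshold_gib * GIB]
    return sorted(hits, key=SEVERITY_ORDER.index)


def check_path(path, conditions):
    """Evaluate the conditions against the file system that holds path."""
    return triggered_events(shutil.disk_usage(path).free, conditions)


conditions = [(1.0, "WARNING"), (0.2, "SEVERE")]
assert triggered_events(int(0.5 * GIB), conditions) == ["WARNING"]
```

With half a GiB free only the WARNING condition is met. Once free space drops under 0.2 GiB the SEVERE condition is met as well, which in MSP leads to a replicator shutdown.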

When a shutdown is triggered

If the disk space available to a monitored resource is less than the value you have set for a SEVERE event, then the event is logged and the MSP replicator shuts down immediately. By default, the disk space is checked every 10 minutes. You can configure this interval in the application.properties file:

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties
monitor.period.min=10L

Value in minutes.

Edits to property files require a replicator restart
Any change that you make to the application.properties file will require that you restart MSP’s replicator.

When the MSP replicator is shut down, no modifications to the repositories at that site can be made. Read-only access can continue, depending on how severe the disk shortage becomes. You should immediately take action to make more disk space available. You can then start the replicator up again as soon as the resource that triggered the shutdown has enough available disk space not to shut down again.
To restart use the commands:

service svn-multisite-plus restart

or if using SLES 12/systemd:

systemctl restart wdmsp.target
Platform dependent commands
See here for more information on platform specific start and stop commands.
Cannot use replicator UI to start after a shutdown.
If the replicator is in a shutdown state due to an error (e.g. running out of disk space) then the MSP UI cannot be used to restart it. You will need to log in to the command line and run the service command as above. See Starting up.
Tunable settings

Change the thresholds used to trigger low disk space warnings. You can set the following values in the application.properties file, then restart the replicator:

monitor.threshold.severe

Sets the level at which the replicator will immediately shut down when monitoring the database directory (see above). The value cannot be below 100MiB. Give the storage amount in bytes, remembering to add an "L" to the end to signify a long value, e.g. 1000000000L = 953.67MiB.

monitor.threshold.warning

Sets the threshold for a notification warning if the free disk space drops below the specified value. Given in bytes, as noted above.
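
Because the thresholds are given in bytes, a small conversion helper avoids arithmetic slips. The sketch below is illustrative only. The helper names are ours, and the 100MiB floor is the one stated for monitor.threshold.severe above.

```python
MIB = 1024 * 1024
MIN_SEVERE_BYTES = 100 * MIB  # stated floor for monitor.threshold.severe


def threshold_value(mib):
    """Format a threshold given in MiB as a byte count with the 'L' suffix."""
    return f"{int(mib * MIB)}L"


def bytes_to_mib(value):
    """Convert a property value such as '1000000000L' back to MiB."""
    return int(value.rstrip("L")) / MIB


def severe_threshold_line(mib):
    """Build the monitor.threshold.severe line, enforcing the 100MiB floor."""
    if mib * MIB < MIN_SEVERE_BYTES:
        raise ValueError("monitor.threshold.severe cannot be below 100MiB")
    return f"monitor.threshold.severe={threshold_value(mib)}"


assert round(bytes_to_mib("1000000000L"), 2) == 953.67
assert severe_threshold_line(512) == "monitor.threshold.severe=536870912L"
```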

Overriding the forced shutdown

You may need to override the forced shutdown if you can’t start a node to resolve the cause of the forced shutdown. For example, you might have mistakenly created a data monitor that triggers a severe log message if there’s less disk space than the disk’s actual capacity. You then cannot free up space, apart from swapping for a bigger disk.

To unlock the forced shutdown:

  1. Log in to the locked node using a terminal.

  2. Navigate to the properties folder and locate the application.properties file. By default it is here:

    /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties
  3. Create a backup, then edit the file, changing the line monitor.ignore.severe=false to monitor.ignore.severe=true.

  4. Save the change to the file.

  5. Restart the replicator (see Starting up). During the restart the replicator ignores the severe warning, which is still written to the log file, allowing you to delete the monitor.
    You cannot use this procedure to override the default monitor. Its emergency shutdown limit of <100MiB will ALWAYS shut down the replicator.
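
Steps 3 and 4 of this procedure (back up, edit, save) can be sketched as a short script. This is an illustration under stated assumptions rather than a supported tool: the function name and the .bak suffix are ours, the property is appended if the file does not already contain it, and the replicator restart in step 5 is still a manual step.

```python
import shutil
from pathlib import Path


def set_ignore_severe(properties_file, enabled=True):
    """Back up the properties file, then set monitor.ignore.severe.

    Returns the path of the backup copy.
    """
    path = Path(properties_file)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)

    wanted = f"monitor.ignore.severe={'true' if enabled else 'false'}"
    lines = path.read_text().splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith("monitor.ignore.severe"):
            lines[i] = wanted
            replaced = True
    if not replaced:
        lines.append(wanted)  # assumption: add the line if it is absent
    path.write_text("\n".join(lines) + "\n")
    return backup
```

Once the faulty monitor has been deleted, calling set_ignore_severe(path, enabled=False) and restarting again puts the safeguard back in place.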

4.1.8. Changing UI properties

You can edit UI properties through ui.properties.

/opt/wandisco/svn-multisite-plus/local-ui/samples/ui.properties

This file contains settings for the graphical user interface, such as timeout values. It also stores the UI port number. The value here is treated as the de facto record, superseding the one stored in the main config file /opt/wandisco/svn-multisite-plus/config/main.conf. You can view a sample.

The following properties are configurable:
Note - this is from MSP 1.9.4 onwards. Please check what version you are using before updating the ui.properties file.

  • task.check.max.interval.milliseconds=200
    This setting changes the delay used by the UI when checking for completed tasks. It must be added to the ui.properties file; without it the default is 200 milliseconds. The value must be a number, in milliseconds.

    Not too small
    Do not set this value too small as it will put a heavy load on the MSP replicator (including lots of log entries).

For more information please contact WANdisco support.

4.1.9. Set up email notifications

The email notification is a rules-based system to deliver alerts based on user-defined templates over one or more channels to destinations based on triggers that are activated by arbitrary system events. Put simply, email notification sends out emails when something happens within the MSP environment. The message content, trigger rules and destinations are all user-definable.

Figure: Automated alert emails

The different aspects of setting up notifications are:

  • Gateways - the store for email server details.

  • Destinations - the email addresses of recipients.

  • Templates - the email messages. These can contain different variables depending on the rules applied.

  • Rules - defines which event triggers a notification, which template is used and who the recipients are.

Set up a Gateway

The Gateways section stores your email (SMTP) server details. You can set up multiple gateways to ensure that the loss of the server doesn’t prevent alert notifications from being delivered.

  1. Log into the admin UI, then click Settings.

  2. Click the Gateway section of the Notifications area.

  3. Enter your email gateway’s settings:

    Figure: Enter settings
    IP/Hostname of SMTP Server

    Your email server’s address.

    SMTP Server Port

    The port assigned for SMTP traffic (Port 25 etc).

    Encryption Type

    Indicate your server’s encryption type - None, SSL (Secure Socket Layer) or TLS (Transport Layer Security). SSL is commonly used. For tips on setting up suitable keystore and truststore files see Setting up SSL Key pair.

    Keystores?
    If you’re not familiar with the finer points of setting up SSL keystores and truststores it is recommended that you read the following articles: Using Java Keytool to manage keystores and How to create self signed certificates and use them in test environments.
    Authentication Required

    Indicate whether you need a username and password to connect to the server using true or false.

    User Name

    If authentication is required, enter the authentication username here.

    Password

    If authentication is required, enter the authentication password here.

    Sender Address

    Provide an email address that your notifications will appear to come from. If you want to be able to receive replies from notifications you’ll need to make sure this is a valid and monitored address.

    Number of Tries Before Failing

    Set the number of attempts that MSP makes to send out notifications.

    Interval Between Tries (Seconds)

    Set the time, in seconds, between your server’s attempts to send notifications.

  4. Click +Add. Your gateway appears in the table.
    You can add any number of gateways. MSP exhausts the Number of Tries Before Failing for each registered gateway before moving on down the list to the next.
    You can use the Test button to verify that your entered details connect to a mail gateway server.
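
The retry behavior described in step 4 (use up all tries on one gateway, then move to the next) can be pictured with the following sketch. It illustrates the ordering only and is not MSP's code. The gateway dictionaries and the send callable are hypothetical stand-ins for the settings and delivery mechanism.

```python
import time


def send_with_failover(gateways, send, message, sleep=time.sleep):
    """Try each gateway in order; return the name of the one that delivered.

    gateways: list of dicts with 'name', 'tries' and 'interval_seconds'.
    send: callable(gateway, message) returning True on success (hypothetical).
    Returns None if every gateway used up its tries.
    """
    for gateway in gateways:
        for attempt in range(gateway["tries"]):
            if send(gateway, message):
                return gateway["name"]
            if attempt < gateway["tries"] - 1:
                # Wait between tries on the same gateway only.
                sleep(gateway["interval_seconds"])
    return None
```

With a Number of Tries Before Failing of 3 on the first gateway, the second gateway is contacted only after three failed attempts on the first, so a high try count or long interval delays failover.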

Set up a destination

The destinations section stores the email address for your notification recipients.

  1. Click + on the Destinations line.

  2. Enter an email address for a notification recipient. Click +Add.

    Figure: Destination
  3. The destination appears in a table. Click Edit or Remove to change the address or remove it from the system.

Set up a template

The template section stores email messages. You can create any number of templates, each with its own notification message, triggered by one of a number of trigger scenarios that are set up in the Rule section.

  1. Click + on the Template line.

  2. Enter a Template Subject line which will be the subject of the notification email.

  3. Enter some Body Text which will be the message that is sent out when the notification is triggered. The message has a 1024 character limit. You can track the available number of characters at the bottom of the text box.

  4. When you’ve entered the message, click + Add to save the message template.

    Figure: Template

For example, if an Admin wanted to receive an email when a new repository is deployed, the body text could be:

Hi Admin,
RepositoryDeployedEvent occurred {timestamp}.
The repository deployed was {event.repository.name} and its path is {event.repository.fSPath}
Regards,
Replicator

In the rules, the template needs to be triggered by the event Deploy Repository Succeeded. The event selected determines which variables are available in the message body - for more information see events and variables.

Set up a rule

The Rule section defines which system event should trigger a notification, what message template should be used and which recipients should be sent the notification.

  1. Click + on the Rule line.

  2. Choose an event from the Event drop-down list:

    Figure: Rules
  3. Choose a Template from the drop-down list. These are the templates that you have already set up under the Templates section.

  4. Choose destinations for your notification from the available destination email addresses. You can make multiple selections so that a message is sent to more than one recipient address.

  5. Click + Add to save your rule.

Events and variables

When writing email notification templates, you can insert variables into the template that will be interpolated when the notification is delivered. The variables available depend on the event type selected. The following variables are available for all event types:

{node} - This returns the node name.

{timestamp} - This returns the time at which the event is received (not the time at which the notification is delivered).

{event} - This returns the raw dump of the event.
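
To preview how a template will read once its variables are filled in, the interpolation can be approximated in a few lines of Python. This is a sketch of the general idea, not the product's template engine. The nested-dictionary context, the sample values and the choice to leave unknown variables untouched are all assumptions.

```python
import re

MAX_BODY_CHARS = 1024  # body text limit stated in the template section

_VARIABLE = re.compile(r"\{([A-Za-z_][\w.]*)\}")


def render(template, context):
    """Replace {dotted.name} placeholders with values from nested dicts.

    Variables that cannot be resolved are left in place unchanged.
    """
    if len(template) > MAX_BODY_CHARS:
        raise ValueError("template body exceeds 1024 characters")

    def lookup(match):
        value = context
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                return match.group(0)
            value = value[part]
        return str(value)

    return _VARIABLE.sub(lookup, template)


context = {
    "node": "node1",
    "event": {"repository": {"name": "Repo1", "fSPath": "/opt/svn/Repo1"}},
}
body = "{node}: {event.repository.name} at {event.repository.fSPath}"
assert render(body, context) == "node1: Repo1 at /opt/svn/Repo1"
```

A preview like this makes mistyped variable names easy to spot, because they remain in the output as literal text.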

All events are now listed along with a brief description and the additional variables available for each specific event:

Disk Monitor Info

Disk Storage has dropped below the Info level. This will trigger if any data monitor message is written to the logs at the INFO level.

{event.message} - This returns information about the disk monitoring threshold that was exceeded.
{event.resource} - This returns the resource on disk that is being monitored.

Disk Monitor Severe

Disk Storage has hit the Severe level. This will trigger if any Severe level data monitor message is written to the logs. At this level, MSP will have shutdown to ensure that disk space exhaustion doesn’t corrupt your system and potentially your SVN repositories. For more information about disk warning messages, see the Setting up data monitoring section.

{event.message} - This returns information about the disk monitoring threshold that was exceeded.
{event.resource} - This returns the resource on disk that is being monitored.

Disk Monitor Warning

Disk Storage has dropped below the Warning level. This will trigger if any data monitor message is written to the logs at the WARNING level. For more information about disk warning messages, see the Setting up data monitoring section.

{event.message} - This returns information about the disk monitoring threshold that was exceeded.
{event.resource} - This returns the resource on disk that is being monitored.

Generic file replication error occurred

An error occurred in the system that handles the replication of system files.

{event.message} - This returns information on the triggering event.

Repository exited Global Read-Only

A repository that was flagged as Global Read-Only has now returned to replication.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.globalReadOnly} - This will be True if the repository is Global Read-Only.

Repository entered Global Read-Only

A repository that has been replicating successfully is now flagged as Global Read-only.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.globalReadOnly} - This will be True if the repository is Global Read-Only.

License is about to expire

The user license for the node is about to expire.

{event.message} - This returns information on the triggering event.

License has expired

The user license for the node has now expired.

{event.message} - This returns information on the triggering event.

License is nearing the maximum number of users

The number of active users is close to the license limit.

{event.message} - This returns information on the triggering event.

User License Limit Reached

The number of active users has now reached the licensed limit.

{event.message} - This returns information on the triggering event.

Repository Local Read-Only Event

A repository has entered Local Read-Only mode.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.localReadOnly} - This will be True if the repository is Local Read only.

Repository exited Local Read-Only

A repository that was in Local Read-Only mode has now left the mode.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.

Repository entered Local Read-Only

A repository has entered Local Read-Only mode.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.

Replicator is Started and Ready

The replicator component of a node is up-and-running and ready to replicate data.

{event.message} - This returns information on the triggering event.

Deploy Repository Checks Failed

A repository added to MSP has failed to deploy, in which case the repository will not be replicated.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.state} - This returns the repository state.

Deploy Repository Checks Succeeded

A repository added to MSP has successfully deployed. Such an event might be sent to a mail group received by SVN users, telling them that their repository is now accessible.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.

Deploy Repository Succeeded

A repository has been successfully added to MSP.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.

Global Read-Only Due To Admin Action

A repository has entered Global Read-Only mode as a result of administrator action in the admin UI.

{event.repositoryIdentity} - This returns the repository ID.
{event.globalReadOnly} - This will be True if the repository is Global Read-Only.
{event.reason} - This returns a message indicating why the repository went read only.

Any Repository Global Read-Only Event

Any repository has entered Global Read-Only mode.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.globalReadOnly} - This will be True if the repository is Global Read-Only.
{event.repository.globalReadOnlyReason} - This returns a message indicating why the repository went read only.
{event.message} - This returns information on the triggering event.

Global Read-Only Due To Consistency Check Failure

A repository has entered Global Read-Only mode as a result of failing a consistency check with its replicas.

{event.repositoryIdentity} - This returns the repository ID.
{event.globalReadOnly} - This will be True if the repository is Global Read-Only.
{event.reason} - This returns a message indicating why the repository went read only.

scheduledConsistencyCheckSkippedNotificationEvent

A scheduled consistency check has been skipped.

{event.message} - This returns information on the triggering event.

Repository is Sidelined

A repository has entered the sidelined mode and has been dropped from the replication system.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.globalReadOnly} - This will be True if the repository is Global Read-Only.
{event.repo.globalReadOnlyReason} - This returns a message indicating why the repository went read only.

Repository is Unsidelined

A repository has left the sidelined mode and can be recovered using the standard repair procedure.

{event.repository} - This returns the repository object.
{event.repository.name} - This returns the user-specified name of the repository to which the event pertains.
{event.repository.fSPath} - This returns the location on-disk of the repository that the event pertains to.
{event.repository.dsmId} - This returns the deterministic state machine ID in which the event occurred.
{event.repository.state} - This returns the repository state.
{event.repository.globalReadOnly} - This will be True if the repository is Global Read-Only.
{event.repository.globalReadOnlyReason} - This returns a message indicating why the repository went read only.

4.1.10. Back up SVN MultiSite Plus data

It is best practice to back up MSP’s database, although it is extremely rare to need that backup. Contact WANdisco support before restoring a node from the backup, because an incorrect restore could corrupt other nodes.

Only MSP settings are backed up
This procedure backs up MSP’s internal Prevayler database. It does not touch your SVN repository data or any other system files (such as Apache configuration, authz files etc.) that you should also be backing up.

Back up while shut down

Run this from within /opt/wandisco/svn-multisite-plus/:

java -cp ./svn-ms-replicator-fsfsrestore.jar com.wandisco.fsfs.backup.FsfsBackup -c ./properties/application.properties

Use this to back up the current state of all prevaylers when MSP is shut down. You therefore do not need to start the replicator in order to create a backup of the database.
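
The command above can be wrapped in a small script so that it always runs from the installation directory. This is a minimal sketch, not a WANdisco-supplied tool: the run_backup function and the DRY_RUN switch are hypothetical names introduced for illustration, and MSP_HOME assumes the default path quoted above. Only the java command line itself comes from this guide.

```shell
#!/bin/sh
# Sketch: run the offline Prevayler backup from the MSP install directory.
# MSP_HOME defaults to the path given in this guide; override it if yours differs.
MSP_HOME="${MSP_HOME:-/opt/wandisco/svn-multisite-plus}"
JAR=./svn-ms-replicator-fsfsrestore.jar
MAIN=com.wandisco.fsfs.backup.FsfsBackup
PROPS=./properties/application.properties

# run_backup and DRY_RUN are hypothetical helpers for illustration only.
run_backup() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print what would be executed without touching anything.
        echo "cd $MSP_HOME && java -cp $JAR $MAIN -c $PROPS"
        return 0
    fi
    # The command uses relative paths, so it has to run from MSP_HOME.
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$MSP_HOME" && java -cp "$JAR" "$MAIN" -c "$PROPS" )
}
```

Setting DRY_RUN=1 before calling run_backup prints the command instead of running it, which is a cheap way to confirm the path before a maintenance window.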

4.1.11. Restore SVN MultiSite Plus data

Contact WANdisco Support if you want to restore MSP data.

4.1.12. Back up and restore other data

Restore data as follows:

Repository content

See Repair an out-of-sync repository.

Repository config and hooks

MSP manages the synchronization of the revision information. It does not synchronize the hooks or other repository configuration data, except for the files fsfs.conf and fs-type. Hooks therefore exist only where you install them: you decide which version of each hook is installed on which nodes and for which repositories. The other configuration files should not be touched. However, you may need to copy some of them, for example conf/svnserve.conf, to all of the repositories at the same time so that the behavior is uniform.

Apache config

Your Apache configuration is part of your system configuration and should be backed up and restored according to your own IT policy. Depending on the number and configuration of your replication groups, you may be able to copy the Apache configuration files from another node.

A number of significant changes were made between Apache 2.2 and 2.4. One change that could affect MSP is the consolidation of the AcceptMutex, LockFile, RewriteLock, SSLMutex, SSLStaplingMutex and WatchdogMutexPath directives into a single Mutex directive.

You must ensure that any use of the AcceptMutex directive is changed to the Mutex directive.

See the Apache documentation - https://httpd.apache.org/docs/2.4/upgrading.html
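
One way to find configuration that still needs converting is to search the Apache configuration files for the retired directive names. The following is a sketch only: find_legacy_mutex is a hypothetical helper, and the location of your configuration files (for example /etc/httpd/conf) is an assumption that varies by distribution.

```shell
# Sketch: list Apache 2.2 mutex directives that 2.4 replaces with Mutex.
# find_legacy_mutex is a hypothetical helper; pass it your own config files.
find_legacy_mutex() {
    # -n prints line numbers, -i ignores case (Apache directive names are
    # case-insensitive), -E enables the alternation pattern. Lines that are
    # commented out with # do not match because of the ^ anchor.
    grep -niE '^[[:space:]]*(AcceptMutex|LockFile|RewriteLock|SSLMutex|SSLStaplingMutex|WatchdogMutexPath)[[:space:]]' "$@"
}
```

For example, find_legacy_mutex /etc/httpd/conf/httpd.conf prints each matching line with its line number and returns a non-zero status when nothing is found. In Apache 2.4 the replacement takes the general form Mutex mechanism [default|mutex-name]; the upgrading notes linked above describe the details.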

SSL config

Include your SSL configuration in your system backup strategy, because the configuration may differ between nodes.

4.1.13. Manage access to SVN MultiSite Plus

MSP supports three different mechanisms for managing access to its admin UI: internally managed users, LDAP authorities and Kerberos.

You can set up multiple administrator accounts for accessing the MSP admin console. Accounts are set up from within the admin UI (via the Security tab). These accounts can then log in to any node’s admin UI by providing their account name and password.

The following sections explain how to set up multiple accounts, manage LDAP authorities and export/import the resulting data.

Adding additional users

  1. Log in to the Admin UI using an existing admin account.

    Figure: Login

  2. Click Security, then click Add User.

    Figure: Add User

  3. Enter details for the new administrator, then click Add User at the end of the entry bar.

    Figure: Click Add User to save their details

  4. A growl message confirms that the user has been added. The new user is listed in the Internally Managed Users table after you click the Reload button.

    Figure: New user appears

Removing or editing user details

You can modify all account details other than the account name itself. If you need to change the account name then add the new account name and then remove the old one.

  1. Click the corresponding Edit button on the Internally Managed Users table.

    Figure: Remove or Edit users

  2. Update the settings and click Save.

    Figure: Edit users

LDAP authorities

MSP supports the use of LDAP authorities for managing admin accounts.

When connecting MSP to available LDAP authorities you can classify each authority as Local, that is, specific to the node in question. If an authority is not local, its details are replicated to the other nodes within the replication network.

You can run multiple LDAP authorities of mixed type, that is, some local authorities along with other authorities that are shared by all nodes. When multiple authorities are used, you can set the order in which they are checked for accounts. This lets you control which LDAP authority an account name is taken from if your organization has account name collisions between LDAP authorities.

The standard settings are supported for each configured LDAP authority: URL, search base, search filter and bind account credentials. Note that the bind account’s password cannot be one-way encrypted using a hash function because it must be sent to the LDAP server in plain text. For this reason the bind account should be a low-privilege user with just enough permissions to search the directory for the account being authenticated. Anonymous binding is permitted for those LDAP servers that support it.

Add authority

Use the Add Authority feature to add one or more LDAP authorities, either local to the node or connected via WAN. Local LDAP authorities are treated as having precedence. When internally managed users (accounts) are enabled, they are checked first when authenticating users. See Admin Account Precedence.

To add an authority:

  1. Log in to the admin UI, click the Security tab.

  2. Click Add Authority.

    Figure: Add Authority

    The Authority entry form appears.

  3. Enter the following details:

    Figure: Add Authority

    URL

    Enter your authority’s URL. You need to include the protocol, ldap:// or ldaps://
    Example (Active Directory)

    ldap://<server IP>:389

    Bind User DN

    Enter an LDAP admin user account that will be used to query the authority.
    Example (Active Directory)

    cn=Administrator,cn=Users,dc=windows,dc=AD

    Search Base

    Enter the Base DN, that is, the location of the users that you wish to retrieve.
    Example (Active Directory)

    CN=Users,DC=sr,DC=wandisco,DC=com

    Search Filter

    Optionally add a query filter that selects a user based on relevant LDAP attributes. Write the filter so that the LDAP query returns a single, unique result. A filter might look something like:
    Example (Active Directory)

    (&(memberOf=[DN of user group])(sAMAccountName={0}))

    This dynamically swaps the {0} for the ID of the active user.
    To authenticate the current user you need to include uid={0} or a similar user search against {0}. For example, if you want to restrict access to MSP to the LDAP user ldapuser1 and you are using OpenLDAP, then the search filter should be:

    (&(uid={0})(uid=ldapuser1))

    For more information about query filter syntax, consult the documentation for your LDAP server.

    Is Local?

    Tick this checkbox if you want the authority to only apply to the current node and not be replicated to other nodes (which is otherwise done by default).

  4. Optionally click Test to verify that the details successfully connect to the authority without adding it yet. When you are ready, click Add Authority to save the settings that you have just entered.

  5. When running with multiple authorities, you need to set the order in which MSP polls the authorities. Use the "+-" symbols at the end of each authority entry to push it up (+) or down (-) the list.

    Figure: Order authorities
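
The {0} substitution described for the Search Filter field can be reproduced outside MSP when you want to try a filter against your directory first. This is a minimal sketch, assuming account names that contain no characters special to sed (such as / or &); expand_filter is a hypothetical helper, not an MSP command.

```shell
# Sketch: expand the {0} placeholder in a search filter the way the guide
# describes, so that the resulting filter can be tried against the directory.
# expand_filter is a hypothetical helper, not part of MSP.
expand_filter() {
    filter="$1"   # filter text exactly as entered in the Search Filter field
    user="$2"     # account name of the user who is logging in
    # Replace every literal {0} with the account name.
    printf '%s\n' "$filter" | sed "s/{0}/$user/g"
}
```

If the OpenLDAP client tools are installed, the expanded filter can be passed to ldapsearch together with the same URL, bind DN and search base that you entered in the form, along the lines of ldapsearch -x -H URL -D BIND_DN -W -b SEARCH_BASE 'FILTER'. A filter that returns exactly one entry for the user is the result to aim for.
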
Edit authority

Modify an existing authority’s settings:

  1. Log in to the admin UI, click the Security tab.

  2. Click the edit link on the line that corresponds with the authority that you wish to edit.

    Figure: Edit authorities link

  3. Update the settings in the popup box, then click Save.

    Figure: Edit authorities box

Kerberos security

This section describes the basic requirements for integrating MSP with your existing Kerberos systems. The procedure requires the following:

Time, ladies and gentlemen, please
Ensure that time synchronization and DNS are functioning correctly on all nodes before configuring Kerberos. A time difference between a client and the master Kerberos server that exceeds the Kerberos clock skew setting (5 minutes by default) automatically causes authentication to fail.
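
The tolerance check itself is simple arithmetic. The sketch below compares two clock readings, in seconds since the epoch, against a maximum skew. skew_ok is a hypothetical helper, the 300 second default mirrors the 5 minute figure above, and how you obtain the Kerberos server's time (for example from your NTP tooling) is left as an assumption.

```shell
# Sketch: succeed when two clock readings are within the allowed skew.
# skew_ok is a hypothetical helper. Arguments: local epoch seconds, remote
# epoch seconds, optional maximum skew in seconds (default 300).
skew_ok() {
    local_ts="$1"
    remote_ts="$2"
    max_skew="${3:-300}"
    diff=$((local_ts - remote_ts))
    # Take the absolute value so the direction of the drift does not matter.
    if [ "$diff" -lt 0 ]; then
        diff=$((-diff))
    fi
    # Exit status 0 when within tolerance, 1 when the skew is exceeded.
    [ "$diff" -le "$max_skew" ]
}
```

A call such as skew_ok "$(date +%s)" "$kdc_epoch" returns success when the two clocks are close enough, where kdc_epoch is a variable that you would fill in yourself.
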
Configuration

This procedure assumes that you have already set up your DNS service and master Key Distribution Center.

  1. On each node, add the service principal, substituting that node’s own hostname for node1.example.com:

            # kadmin -p root/admin -q "addprinc -randkey HTTP/node1.example.com"
            # kadmin -p root/admin -q "ktadd -k /opt/krb5.keytab HTTP/node1.example.com"
            # chmod 777 /opt/krb5.keytab
  2. Each node should have the add-on JCE Unlimited Strength Jurisdiction Policy Files for Java 7 or Java 8 installed. These can be downloaded from Oracle, subject to your local import rules concerning encryption technology. Once downloaded, extract them to the Java security library, i.e.

    $JAVA_HOME/lib/security/
  3. At this point you can install MSP on each node. If that’s already done, then configure the Kerberos settings under the Security tab.

    Figure: Edit Kerberos box

    Service Principal

    The unique name for an instance of a service, such as HTTP/node1.example.com

    Keytab Location

    This is the location of the keytab, a file containing pairs of Kerberos principals and encrypted keys (often derived from the Kerberos password). It’s used for logging into Kerberos without being prompted for a password.

    Kerberos Config Location

    The krb5.conf file contains Kerberos configuration information, including the locations of KDCs and admin servers for the Kerberos realms of interest, defaults for the current realm and for Kerberos applications, and mappings of hostnames onto Kerberos realms. Normally, you should install your krb5.conf file in the /etc directory, that is, /etc/krb5.conf

  4. Save the settings.

  5. Log out.

  6. Return to the node in your browser. This time you should log in automatically (in this case as user sally@EXAMPLE.COM).

4.2. Nodes

4.2.1. Add a node

To replicate SVN repository data between nodes, you first tie the nodes together in the form of a replication network. This starts with adding (connecting) nodes, a process we call induction.

Before you try to induct nodes please make sure that there are no firewalls blocking traffic to or from the DConE port (defaults to 6444) or the Content Delivery port (defaults to 4321) from any of the existing nodes in the currently running ecosystem. All MSP nodes must be able to contact all other nodes on both of these ports.
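
Reachability of both ports can be checked from the command line before you start induction. This is a minimal sketch, assuming bash and the coreutils timeout command are available on the node; check_port and check_msp_ports are hypothetical helper names, and the 3 second timeout is an arbitrary choice.

```shell
# Sketch: test whether a TCP port on a peer node accepts connections.
# check_port is a hypothetical helper. Arguments: host, port.
check_port() {
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout stops the
    # attempt from hanging when a firewall silently drops the packets.
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Check the two default MSP ports named in this guide on one peer:
# 6444 (DConE) and 4321 (Content Delivery). Adjust if you changed them.
check_msp_ports() {
    peer="$1"
    status=0
    for p in 6444 4321; do
        if check_port "$peer" "$p"; then
            echo "$peer:$p reachable"
        else
            echo "$peer:$p blocked or closed"
            status=1
        fi
    done
    return "$status"
}
```

Run check_msp_ports with the hostname of an existing node from the new node, then repeat in the other direction, since every node must be able to reach every other node on both ports.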

Before starting to add a node see information in the Installation chapter on repeating installation and node induction. Generally you will also want to import the security settings from a current node to ensure settings match - see Match a node’s admin settings for more information.

  1. Log in to the MSP admin console of the new node that you are connecting to your existing servers.

  2. Click the Nodes tab.

  3. Click the Connect to node button.

    Figure: Connect to node
  4. Enter the details of an existing node, from the Settings tab of the existing node. For more information on these details see System Data.

    Figure: Enter the details from an existing, connected node

    Node ID

    This is the Node UUID.

    Node Location ID

    A unique string that a node creates for use as an identifier.

    Node IP Address

    The IP Address of the node’s server.

    Node Port No

    The TCP port that the node uses for DConE, which handles agreement traffic. The default is 6444. See Reserved Ports.

  5. Click Send Connection Request. The new node appears on the active list of Nodes.
    You may find that the new node gets stuck in a pending state. If this happens, see If Induction fails. A common cause is a firewall blocking communication.

Consistency check revisions
When inducting a new node, make sure it has the same number of consistency check revisions configured as the current nodes or the induction will fail.

4.2.2. Remove a Node

The removal of a node from the MSP replication group is useful if you will no longer be replicating repository data to its location and wish to tidy up your replication group settings.

No ties allowed

You only have the option to remove a node if it is not a member of a replication group. Therefore, you will need to remove a node from all replication groups to make it eligible for removal.

Known issue:
If a node is inducted but not in a replication group then it is possible (from that node) to remove other inducted nodes that are in a replication group. There’s currently an issue in that a node isn’t aware of the membership of replication groups of which it is not itself a member. This means that it is possible to remove a node that is a member of a replication group, if done from another node that doesn’t have knowledge of the replication group.

Until we block this capability you should manually check any nodes that you plan to remove to make absolutely sure that they are not members of a replication group.

You cannot retrieve a removed node
Take care when removing nodes. To ensure that the replication network is kept in sync, removed nodes are barred from being re-inducted. The only way that you can bring back a node is to perform a reinstallation of MSP using a new Node ID.
  1. Log in to the MSP admin console of any connected node.

  2. Click the Nodes tab.

  3. Nodes that are eligible for removal will have the Remove Node option available under the Action column.

  4. Click Remove Node.

    Removing a node is not reversible

    You must be absolutely sure that you want to permanently remove the node.
    Before removing a node from MSP you should shut down the node and remove the MSP installation completely. This could already be the case if the machine on which that node was running was involved in a disaster.

    Never restore a removed node from backup.

    Please contact WANdisco support if you have any questions.

    Figure: Ready to remove Node

  5. After you refresh the admin UI, the removed node is still displayed if you click Display Removed Nodes. Removed nodes have the status Removed in the Connectivity Status column.

    Figure: Node removed

4.2.3. Stop all nodes

You can stop all synchronization between nodes in the ecosystem with a single button click (if all associated repositories are replicating/writable).
The replicator processes will still be up and the replicators talking to each other but no new agreements will be allowed to occur (for example no repository updates or changes to MSP metadata).

A stop can’t be synchronized if associated repositories are Local Read-only
Before starting a Sync Stop All, make sure that none of your nodes have repositories in a local read-only state. This may mean looking at the UI of multiple nodes if your ecosystem contains replication groups that are not visible from all nodes.
  1. Log in to the admin UI and click the Nodes tab.

  2. Click the Sync Stop All button.

    Figure: Stop all nodes

    You get a growl message confirming the stop has been triggered. You see the results when you refresh your browser session.

    Figure: Stopped

  3. On the Nodes table all nodes are shown as Stopped. In this state you can do maintenance or repairs without risking your replication getting out-of-sync.

    Figure: Node removed
  4. The Sync Stop All button has changed to Sync Start all. You can start selected nodes by logging in to the admin console of each node that you want to start. Use the Start Node link in the Action column of the Nodes table.

We strongly recommend that you watch the log messages and confirm that all nodes report as stopped. If you suspect that one or more nodes are not going to stop you should investigate immediately. The Dashboard messages should report the stop, for example:

Aborted tasksType PREPARE_COORDINATE_STOP_TASK_TYPE
Delete Task
Originating Node: Ld5UYU
tasksPropertyTASK_ABORTING_NODE: Ld5UYU
tasksPropertyTASK_ABORT_REASON: One or more replicas is already stopped.
The replica was: [[[Ld5UYU][bf0c6395-77b6-11e3-9990-0a1eeced110e]]]

Look for the message:

Aborted tasksType PREPARE_COORDINATE_STOP_TASK_TYPE

In the replicator.log file you might see the error type:

"DiscardTaskProposal <task id etc> message: One or more replicas is already stopped."

If you see any of these messages please contact WANdisco support.
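
When several nodes are involved, searching each node's log for the "One or more replicas is already stopped" text is quicker than reading the log by eye. The sketch below counts matching lines. count_stop_aborts is a hypothetical helper and the path to replicator.log is an assumption, so pass the location used on your own installation.

```shell
# Sketch: count aborted coordinated-stop messages in a replicator log.
# count_stop_aborts is a hypothetical helper. Argument: path to the log file.
count_stop_aborts() {
    logfile="$1"
    # -c prints the number of matching lines; -F treats the pattern as a
    # fixed string rather than a regular expression.
    grep -cF 'One or more replicas is already stopped' "$logfile"
}
```

A result of 0 means the message was not found in that file. Note that grep also returns a non-zero exit status in that case, which matters if you call the function from a script that stops on errors.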

4.2.4. Start all nodes

  1. If all nodes have been stopped, click Start All to start them replicating again.

    Figure: Start
  2. After a browser refresh, all nodes will now show as running.
    You should check the MSP UI on all nodes to verify that they are all back in operation.

4.2.5. Disconnected/offline nodes

If a node is disconnected you can see this from the UI:

  • When you click the Replication Groups tab, you can see the groups with nodes that are offline:

    Figure: View Replication Group
  • When you click View to see a replication group, you are warned that functionality is reduced when a node is disconnected, plus you see which nodes are connected/disconnected:

    Figure: Replication Group details
  • In general, you should wait until all nodes that will be members of a new replication group are online/available before trying to create the new replication group. However if you do try to create a new replication group that includes a disconnected node you are warned that the node is unavailable and the group is pending:

    Figure: New Replication Group
  • You cannot add a new node to an existing replication group while a member node is disconnected. You get this message in the UI log on the dashboard:

    Replication group schedule cannot be updated with a new node whilst a member is disconnected.

4.3. Replication groups

To replicate an SVN repository between a set of nodes, you first need to associate those nodes by adding them to a replication group.

The Replication Groups tab will also show you if there are any disconnected nodes. See Disconnected/offline nodes.

4.3.1. Create a replication group

You need to create a new replication group if you need to replicate one or more repositories to a combination of nodes that does not already exist as a replication group. Before you create a new group, review your current replication groups to make sure the desired combination doesn’t already exist. If it does, simply add the new repository to that replication group. If you do need to create a new replication group, follow the steps here.

Single-node Replication Group

It is possible to create single-node replication groups. No replication takes place when you have a single node replication group. In a single node replication group the node must be an Active Voter. All other features work normally.

You may have a single-node configuration in the following situations:

  • If all other nodes in the replication group were removed properly. This could happen naturally during a transition from a 2 node replication group to a different 2 node replication group.

  • If a single node replication group was desired for any reason. For example, if you want a simple place to test your understanding of the interaction between WANdisco’s Access Control Plus product and MSP (e.g. AuthZ file delivery, authorized_keys file delivery, etc.).

  • If the other nodes in a replication group were inappropriately removed: contact WANdisco Support.

Note: When changing a single node replication group into a multi-node replication group the only helper node will be the original single node. During the synchronization period the repositories in the new replication group will be read-only.

A single-node replication group has no repository data redundancy.

To create a single-node replication group follow the same steps as here. You can ignore the warnings, particularly the one that states A replication group should contain at least 2 nodes, which is applicable if you intend to replicate your repository data.

4.3.2. Delete a replication group

You can remove replication groups from MSP, although only if they have been emptied of repositories. The following procedure is an example:

  1. Replication group Group0 needs to be removed from MSP but it has repositories associated with it. Click View to see which ones.

    Figure: View

  2. The Replication Group configuration screen shows which repositories are associated with the group. Follow the link to the repositories page to remove the associations and enable group deletion.

    Figure: Repositories

  3. On the Repositories screen, click an associated repository, in this example Repo0, then click Edit.

    Figure: Select and Edit

  4. In the Edit Repository box, use the Replication Group drop-down to move the repository to a different Replication Group. Then click Save.

    Figure: Edit

  5. Repeat this process until there are no more repositories associated with the Replication Group that you wish to delete. Click on View.

    Figure: Move it

  6. Now that replication group Group0 no longer contains any repositories, the Delete link is enabled. Click Delete Replication Group (Group0) to remove the replication group. Note that there is no undo.
    However, no data is removed when a replication group is deleted, so you can recreate the replication group if you know which nodes were included, what the node types were, and whether there was a membership rotation and what it was. Most of these details are normally obvious, but if a rotation schedule is involved, record the details before you remove the original replication group, because all of that data is lost when the group is deleted.

    Figure: Click the Delete link button
  7. A growl will appear confirming that the replication group has been deleted.

4.3.3. Add a node to a replication group

Don’t add a node during a period of high replication load
When adding nodes to a replication group that already contains three or more nodes, ensure that there isn’t currently a large number of commits being replicated.
Adding a node during a period of high traffic (heavy level of commits) going to the repositories may cause the process to take a long time/appear to stall.
Stop scheduled membership rotation before adding or removing any node to/from a replication group. After the operation is complete you should review the membership rotation details before re-enabling the rotation.
Once the "Add Node" process starts

Once the "Add Node" process starts, don’t navigate away from the screen or make any further changes on the Admin UI until the process completes. Moving away or making other changes is likely to disconnect the UI from the state of the operation, so you won’t get a confirmation that it has completed.

What to do if you do get stuck
If for whatever reason the process gets stuck, you can get unstuck by restarting the Admin UI process using the MSP service, for example:

[root@redhat6]# service svn-multisite-plus uistop
[root@redhat6]# service svn-multisite-plus uistart

This will not impact the replicator’s operation so end users' access to Subversion is not affected. Once the Admin UI process has restarted the operation to add the node should be confirmed as complete.

You can add additional nodes to an existing replication group, so that there’s minimal disruption to users:

  1. Log in to a node, click the Replication Groups tab.

  2. Click View on the replication group you want to add a new node to.

    Figure: View Replication Group

    The replication group screen appears.

  3. Click Add Nodes.

    Figure: Add Node

  4. Select the node to add from the Select a Node dropdown list. Also read the additional on-screen instructions.

    Figure: Select Node
    The Add Nodes button is disabled

    The Add Nodes button is grayed out if the current replication group configuration will not support the addition of a new voter node.
    E.g. an even number of Voters is not supported. Changing an existing node to a tie-breaker would allow another node to be added.

    Also, a configuration that is scheduled in the future may block the addition of a new node. Check the schedule if you think that you should be able to add a new node to the replication group.

  5. When you have selected the nodes, click Add Nodes.

    Figure: Add Nodes

  6. Select a Helper node from which you will sync repository data. Then click Start Sync.
    Note the on-screen warning: do not close the browser or log out during this process, otherwise you will need to perform a long repair procedure.

    Figure: Choose Helper Node and Start Sync

  7. A growl message appears to confirm the new node is being added.

    Figure: Start Sync
  8. You now need to manually synchronize the repositories from the helper node (which is temporarily offline for users until this process is finished).

    Do not immediately click on the Complete buttons
    Do not immediately click on the Complete selected or Complete all buttons. You should rsync the repositories in the replication group from the helper to the new node(s) even if you think they are already completely synchronized. Repository meta-data may need to be fixed up even if the repository data itself may not need changing. For example it is likely that fs-type files will need to be changed from fsfs to fsfswd, and other similar changes may be necessary. Forgetting to rsync (or believing that the rsync is not necessary) is one of the largest generators of WANdisco support calls.
    Avoid time consuming repair process!
    In the next steps, when you click Complete all a consistency check is invoked automatically on all repositories in the replication group. Make sure that you copy and sync correctly before clicking Complete all. If the consistency check finds any "inconsistent" repository, it is set into a global read-only state. You will then need to take each repository, one by one, through the repair process. This will be extremely time consuming for multiple repositories.
    If this happens, it is best to drop the node out of the replication group and start over. However, you will then need to write a script to remove the global read-only state for the inconsistent repositories.
    Therefore, we highly recommend that you take care not to make a mistake in copying and rsyncing. Take time to check and double check so that the consistency checks run smoothly.

    The process lets you do a complete sync or select specific repositories that you wish to sync. Assuming that you have synced all repositories you would click Complete All. The helper node is then released from the process, allowing it to catch up with any transactions that were held off while it was taking part in the procedure. The newly added node will, in parallel, catch up with any transactions that were held off while being synchronized from the helper node.

    addnode7 1.9
    Complete adding node
  9. A growl message appears confirming that the new node has been added to the replication group.

    addnode8 1.9
    Growl - Node added
  10. Returning to the Replication Group screen, you can see the new node count.

    addnode9 1.9
    New node appears on Replication Group screen
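Before clicking Complete all in step 8, it can help to confirm that the rsync'd repositories on the new node carry the expected fs-type value. The following Python sketch reads each repository's db/fs-type file and reports any that differ. The function names and the default expected value of fsfswd are illustrative assumptions based on the note in step 8; compare against the helper node's repositories to confirm the correct value for your deployment.

```python
from pathlib import Path


def read_fs_type(repo_path):
    """Return the contents of <repo>/db/fs-type, or None if the file is missing."""
    fs_type_file = Path(repo_path) / "db" / "fs-type"
    if not fs_type_file.is_file():
        return None
    return fs_type_file.read_text().strip()


def repos_needing_attention(repo_paths, expected="fsfswd"):
    """Map each repository whose fs-type differs from `expected` to the value found."""
    problems = {}
    for repo in repo_paths:
        found = read_fs_type(repo)
        if found != expected:
            problems[str(repo)] = found
    return problems
```

An empty result means every listed repository reports the expected value. This check does not replace the rsync itself or the consistency check that runs when you click Complete all.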

4.3.4. Remove a node from a replication group

You can remove a node from a replication group. You might need to do this, for example, if the developers at one of your nodes are no longer going to contribute to the repositories handled by a replication group. Removing a node from a replication group stops updates to its repository replicas.

Remove stray repositories
If you remove a node from a replication group, you should delete its copy of the repositories managed by the replication group. Having an out-of-date stray copy can result in confusion or users working from old data. This may occur if for example the Apache/SSHD access mechanisms have not been changed to block further access at that node.

You cannot remove a node that is currently assigned as the Managing Node. To change the managing node, go to the Configure Schedule page and assign a different node as the Managing Node. See Changing the managing node for more information.

To remove a node follow these steps:

  1. Log in to the admin console of one of your nodes. The node needs to be a member of the relevant Replication Group, otherwise it is not listed.

  2. Click Replication Groups then the View button for the Replication Group from which you want to remove a node.

    removenode1 1.9
    View Replication Group
  3. Disable the schedule rotation. Click on the Disable Schedule button to prevent schedule rotation while removing the node. While not strictly necessary this will prevent certain race conditions that could cause confusion. Wait for this operation to complete.

  4. Click the node that you want to remove from the group. If the removal of the node does not invalidate the remaining configuration, then you see a Remove node link. Click the link.

    removenode2 1.9
    Remove node from group
  5. A dialog opens which asks you to confirm the removal of the selected node from the Replication Group. Click Remove.

    removenode3 1.9
    Confirm removal
  6. A growl message confirms that the removal is in progress. You may need to click Reload to ensure that the action has been completed on all nodes.

    removenode4 1.9
    Reload page to see removal

    The node is now removed from the replication group. On the Replication Groups panel you now see that the constituent number of nodes has reduced by one.

    removenode5 1.9
    Removed node seen on Replication Group screen

    The repositories that were on the node that was removed from the replication group are now no longer marked as replicated and should be removed as soon as possible.

4.3.5. Schedule node changes: follow the sun

You can schedule the member nodes of a replication group to change type according to when and where it is most beneficial to have active voters. To understand why you may want to change your nodes read about Node types.

The following steps show how to do this through the UI. Node changes can also be scheduled through the API, see here for information.

  1. Log in to a node, then click the Replication Groups tab. Click the View link for the replication group for which you want to create a schedule.

    schedule01
    Scheduling is done through replication group settings
  2. Click the Configure Schedule button.

    schedule02
    Configure
    Membership views show what is scheduled, not necessarily what is currently active
    The roles and membership displayed are based upon the agreed schedule. This is the setup that should be in place if everything is running smoothly. It may not accurately represent the state of the replication group, for example because of a processing delay on a node or a hung process. This is not a cause for concern, but you must be aware that the displayed membership is an approximation based on the information currently available to the local node.
  3. The replication group's Schedule screen appears. The main feature of the screen is a table that lists all the nodes in the replication group, set against a generic day (midnight to midnight) that is divided into hourly blocks. Each hourly block is color-coded to indicate the specific node’s type. To change the schedule, click a block.

    schedule03
    Role Schedule
  4. The New Configuration form lets you modify any hours for any available node.

    schedule04
    New Schedule form
    Frequency

    Select from the available frequency patterns: Daily, Weekly, Monday-Friday or Saturday to Sunday.

    From

    The starting hour for the new schedule, e.g. 00 for the start of the day.

    To

    The hour at which the scheduled changes end, e.g. 24 would effectively end the scheduled change at midnight.

  5. Click the node icon to change its type.

    schedule05
    Swapping roles

    When all node changes have been made, click the Save button to continue, or the Cancel button if you change your mind.

  6. The schedule view now shows the changes that you made. Review the schedule table, then click the Save Schedule button to apply the changes.
    Any invalid combination of node roles is detected at this point and, if one is found, the save fails.

MSP does not provide detailed feedback as to what the mistake is in configuration. If you see the Growl message "Unable to update replication group" then re-check your selected roles to verify that they meet role requirements.

Use the Clear Schedule button to blank out settings that you have changed, returning to the default schedule.

Changing the managing node

There may be circumstances in which you need to change the schedule managing node, for example if the current node will be undergoing maintenance and therefore will be unable to rotate the voting population at the normal time.
To change which node is the managing node (schedule manager), click on the node you want to be the new managing node and select Make Schedule Manager.

schedulemanager
Change Schedule Manager
Changing role of the managing node
You can change the managing node to Active, Active Voter or Active Voter Tiebreaker, but not to any passive role. The manager must be able to propose schedule changes and therefore must be active. If you want to make the current managing node passive, first make a different active node (A, AV, AVT) the schedule manager.
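Because MSP does not say which role requirement a rejected schedule violates, a quick offline check of each block can narrow things down. The Python sketch below encodes only two rules stated in this guide: an even number of voters is not supported unless one node acts as tiebreaker, and the managing node must hold an active role. The function name, the dict layout and the role code "P" used in the usage note for a passive node are illustrative assumptions; MSP may enforce further requirements that are not modeled here.

```python
VOTER_ROLES = {"AV", "AVT"}         # Active Voter, Active Voter Tiebreaker
MANAGER_ROLES = {"A", "AV", "AVT"}  # roles the managing node may hold


def check_roles(roles, manager):
    """Return a list of problems found in one schedule block.

    roles   -- dict mapping node name to role code
    manager -- name of the managing node
    """
    problems = []
    voters = [node for node, role in roles.items() if role in VOTER_ROLES]
    tiebreakers = [node for node, role in roles.items() if role == "AVT"]
    # An even voter count is only accepted when exactly one node is the tiebreaker.
    if len(voters) % 2 == 0 and len(tiebreakers) != 1:
        problems.append("even number of voters without a single tiebreaker")
    if roles.get(manager) not in MANAGER_ROLES:
        problems.append("managing node is not active")
    return problems
```

For example, check_roles({"n1": "AVT", "n2": "AV", "n3": "P"}, "n3") reports that the managing node is not active, while the same roles with "n1" as manager return an empty list.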
Schedule error

The schedule, as it is displayed by the UI, is an ideal representation of where the schedule would place the roles, assuming that the nodes are all properly connected to each other, and that you have not stopped the scheduling. If one or more nodes that are members of the replication group are down or unable to communicate (e.g. a network partition event has occurred), then the schedule could be in the process of moving, or be unable to move, if there are a sufficient number of outages.
Therefore, before making operational decisions based on the information displayed on the scheduling page, it is very important to verify that all nodes that are members of this replication group are properly connected and functioning, and that the schedule has not been manually stopped.

We will improve this function in future product versions.

Disable the schedule

If you need to stop any and all scheduled rotations, e.g. in an emergency to prevent losing quorum:

  1. Click the Replication Groups tab, then select your group.

  2. Click Disable Schedule.

    disableschedule 1.9
    Disable schedule

    Note the warning message:

    warning schedule
    Warning

4.4. Repositories

4.4.1. Add a repository

When you have added at least one Replication Group to your ecosystem you may then begin adding repositories.

For detailed instructions see Add Repositories.

The move to SVN 1.9 brings a number of enhancements to Subversion’s file system. This includes changes that allow svnadmin pack to run on repositories that are replicated by MSP, as long as they are format 7 (the native SVN 1.9 repository format). It is possible for packing to run "on-the-fly", without undue impact on repository traffic. See more about svnadmin pack.

4.4.2. Fix pending repository creation

A repository may fail to create because of incorrect permissions on one of the nodes. For example, from Node 1 you might click Create new repository and complete all the details. You see the message about the repository being added and then refresh the page. However, the repository is not listed if, for example, Node 2 has incorrect permissions. In this example:

  • The dashboard of Node 1 shows a pending task and a failed task.

  • The dashboard of Node 2 shows two SEVERE messages:

    [Can't create folder [/opt/deny/test1_TMP]. Abandoning deployment.]
    [Cannot unzip new repo: can't create new repo folder!]

To fix this problem:

  1. Go to the repository folder on Node 2 and make the permissions read/write.

  2. If you now try to create the same repository again, using the same name and file path, it is not added. This is because the folder permissions on Node 1 were read/write during the first attempt, so the repository directory was already created there.
    Therefore, to create the same repository you must first remove the repository folders that were created on Node 1 in the first creation attempt and cancel the pending task on Node 1.
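To catch this kind of permission problem before creating a repository, you could run a small check on every node in the replication group. This is a minimal Python sketch, not an MSP tool: the function name is hypothetical, and it assumes you run it as the system account that MSP uses, because the result reflects the permissions of whoever runs it.

```python
import os


def can_create_repo(parent_dir):
    """Return True if the current user can create a folder inside parent_dir."""
    # Creating an entry in a directory needs write and search (execute) permission on it.
    return os.path.isdir(parent_dir) and os.access(parent_dir, os.W_OK | os.X_OK)
```

In the example above, can_create_repo("/opt/deny") would be expected to return False on Node 2 until the permissions are corrected.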

4.4.3. Remove a repository

To remove repositories from replication by MSP follow this procedure. These steps do not remove the repository from the disk. It is up to an administrator to login to each node in the replication group of the repository and, after the repository has been successfully removed by the following procedure, remove the repository from the disk (or move it to an archive location).

If this implementation of MSP is integrated with Access Control Plus then start the removal process with the following 2 steps (executed on ACP) and then continue on with the normal process below.

  1. Remove all references to the repository as a resource from ACP

  2. Generate AuthN/AuthZ files from ACP and verify distribution to all MSP nodes

If this implementation of MSP is not integrated with ACP then please start by disabling access by denying Authorization to the repository that is to be removed. Do this by editing the authorization file on the nodes in the replication group and removing references to the repository being removed.

How to remove a repository from replication:

  1. Log in to the admin console of one of your nodes. The node must be a member of a replication group in which the repository is replicated, otherwise it is not listed. Click the Repositories tab to see it.

    repotab 1.9
    Repository tab
  2. On the Repositories tab, click the line of the repository that you want to remove. When the repository is highlighted (in yellow), the Remove button becomes available. Click it.

    removerepo1 1.9
    Select Repository
  3. A dialog box appears. This confirms that removing a repository from a replication group stops any changes that are made to it from being replicated. No repository data is deleted but the repository is internally marked as no longer replicated (changes are made to internal repository files enabling them to be served by a non-replicated Apache/svnserve).

    removerepo2 1.9
    Remove

You can also remove a repository using the button on the Repository Information screen:

removerepo3 1.9
Remove
Important requirement when performing a repository dump / load
If a repository is removed from MSP, has its history edited (e.g. an SVN dumpfilter operation to remove sensitive content or revisions) and is then re-introduced to MSP with the same UUID, commits may fail because the existing fulltext cache references no longer apply.
Workaround: Restart Apache and MSP before you re-add/make use of the repository.

4.4.4. Edit a Repository

You can edit a repository’s properties after it has been set up in MSP:

  1. Log in to the admin console of one of your nodes. The node needs to be a member of the replication group in which the repository is replicated, otherwise it won’t appear on the tab. Click the Repositories tab.

    repotab 1.9
    Login
  2. Click the line for the repository that you want to edit, then click Edit.

    editrepository 1.9
    Repositories

    The Edit Repository window opens.

    editrepos 1.9
    Edit Repository
    Local Read-Only

    Enable or disable the repository local read-only setting. When enabled, the repository is not writable, either for local users or for the replication system (that would push changes made to the repository on other nodes). However, changes that come from the other nodes are stored to be rolled out as soon as the read-only state is removed.

    Global Read-Only

    Enable or disable the repository global read-only setting. When enabled, the repository is not writable, either locally or globally. This setting locks a repository, stopping any changes.

    Replication Group

    Use the drop-down selector to change the replication group to which the repository belongs. See information on moving repositories.

  3. Make your changes and click Save.
    Depending on your change extra steps may be required. For example if you moved a repository see the next section for more information.

4.4.5. Move a repository to another replication group

You can move a repository to another replication group by using the repository edit box.

Changing the nodes

If you are moving to a replication group with additional nodes then performing the 2 rsync steps below is crucial.

If you are moving to a replication group with fewer nodes then be careful to edit the Authorization file on nodes being removed to eliminate possible access. Make certain to remove the repositories shortly thereafter from the filesystem on the removed nodes.

If you are both adding and removing nodes you will need to do both of these steps.

  1. Use rsync to ensure the repositories are at all new nodes.

  2. Move the repository to another replication group by following the steps in the Edit a repository section.

  3. Now follow the repository repair procedure which includes a final rsync. This allows the repository to be continually available for write operations on one of the other two nodes in the replication group, while the other is used for the source of the rsync. This final rsync is crucial!

4.4.6. Repository synchronized stop

The Repository Synchronized Stop is used to stop replication between repository replicas. You can use it on a per-repository basis or on a replication group basis (where replication is stopped for all associated repositories). If you want to stop replication on all nodes, please see the section on how to use the Sync Stop All command from the Nodes tab.

You should verify that the repository to be sync stop’d does not have a repository replica in the LRO state anywhere by visiting the MSP UI on all nodes in the replication group and looking at the repository information.
Do not use this process if the repository needs to be repaired!

Repository Stops are synchronized between nodes using a 'stop' proposal to which all nodes need to agree. As a result, the repository replicas in the replication group may not all stop at the same time, but they do all stop at the same point.

  1. Log in to a node’s browser-based UI and click on the Repositories tab. Click on the repository that you wish to stop replicating. Click the Sync Stop button.

    syncstop1 1.9
    Select repository
  2. A growl message will appear to confirm that a synchronized stop has been requested. Note that the process may not be completed immediately, especially if there are large proposals transferring over a WAN link.

    syncstop2 1.9
    Sync Stop repository
  3. On refreshing the screen you will see that a successfully sync stopped repository has a status of Stopped and is Local RO (Locally Read-only) at all nodes. For all nodes, check the status in the repository information section of the UI.

4.4.7. Repository synchronized start

Restarting replication after performing a Synchronized Stop requires that the stopped replication be started in a synchronized manner.

  1. Click on a stopped repository and click on the Sync Start button.

    syncstop3 1.9
    Sync Start repository
  2. The repository will stop being Local Read-only on all nodes and will resume replicating.

4.4.8. Stop repositories on a node

In some situations you may need to stop writes to the local repository replica. There are a couple of methods for doing so:

  • Stop the node - this takes the node offline and it will therefore no longer be able to process any changes. It is an Action on the Nodes tab on the UI.

  • Stop Node via API call - use the API to stop all repositories on a node.

5. Troubleshooting Guide

5.1. Logs

MSP logs SVN and replication events in several places:

Admin UI: Growl messages

The growl messages provide immediate feedback in response to a user’s interactions with the Admin UI. Growls are triggered only by local events and only display on the node (and in the individual browser session) where the event was triggered. Growl messages appear in the top right-hand corner of the screen and persist for a brief period (15 seconds in most cases) or until the screen is refreshed or changed.

System Status
sysstatus
Dashboard - Status
Always check the dashboard
If you are troubleshooting a problem we strongly recommend that you check the Dashboard as well as the log files. While the growl messages give you an immediate alert for events as they happen, they are not the main method of tracking failures or important system events.
Dashboard: Replicator Tasks

Events that are more complex and are not bound by user interactions may appear in the Dashboard’s Replicator Tasks area. This area only displays if there are relevant records. Tasks may consist of a simple statement or, with a click on the Task name, a multi-line report:

dashlog01
Dashboard - Tasks
Application Logs

Read more about Application logs

Replicator Logs

Read more about Replicator logs

5.1.1. Application logs

/opt/wandisco/svn-multisite-plus/

The main logs are produced by various agents and contain messaging that is mostly related to getting MSP started up and running.

-rw-r--r-- 1 wandisco wandisco   88 Jan 15 16:53 multisite.log
-rw-r--r-- 1 wandisco wandisco  220 Jan 15 16:53 replicator.20140115-165324.log
-rw-r--r-- 1 wandisco wandisco 4082 Jan 15 16:53 ui.20140115-164517.log
-rw-r--r-- 1 wandisco wandisco 1902 Jan 15 16:53 watchdog.log
flume.YYYYMMDD-hhmmss.log

A new one is created every time the Flume sender component is started by its watchdog. This file is expected to be small unless the Flume sender component is not properly installed.

ui.YYYYMMDD-hhmmss.log

A new one is created every time the UI component is started by the watchdog. It contains all messages generated by the UI. Normally this log file is small.

multisite.log

This log contains entries based on the execution of the /sbin/init.d/svn-multisite-plus system startup/shutdown script.

    2014-01-15 16:45:17: [3442] Starting ui
    2014-01-15 16:53:24: [3571] Starting replicator
replicator.YYYYMMDD-hhmmss.log

A new one is created every time the replicator is started by the watchdog. It includes events generated by the Java Virtual Machine itself and not caught by normal MSP logging mechanisms. For the most part this file should be empty.

watchdog.log

Log entries from all of the 3 possible watchdog processes that keep the UI, replicator and Flume sender running if they should fail unexpectedly.
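When several restarts have left many time-stamped files in this directory, it can be useful to sort them by component and start time. The following Python sketch parses the file name patterns listed above. The function name is hypothetical, and the regular expression covers only the names documented in this section.

```python
import re
from datetime import datetime

# Matches flume.YYYYMMDD-hhmmss.log, ui.YYYYMMDD-hhmmss.log and replicator.YYYYMMDD-hhmmss.log
STAMPED = re.compile(r"^(flume|ui|replicator)\.(\d{8}-\d{6})\.log$")


def describe_log(filename):
    """Return (component, start_time), or None if the name is not a known application log.

    start_time is None for the logs that are not time-stamped.
    """
    match = STAMPED.match(filename)
    if match:
        return match.group(1), datetime.strptime(match.group(2), "%Y%m%d-%H%M%S")
    if filename in ("multisite.log", "watchdog.log"):
        return filename[:-len(".log")], None
    return None
```

For example, describe_log("replicator.20140115-165324.log") identifies a replicator log for a start at 16:53:24 on 15 January 2014.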

5.1.2. Replicator logs

The logging for replication activity is stored within the replicator directory in the MSP installation, i.e. /opt/wandisco/svn-multisite-plus/replicator/logs. These logs take the following form:

-rw-r--r-- 1 wandisco wandisco 296785 Jan  6 14:36 fsfswd.log
-rw-r--r-- 1 wandisco wandisco     54 Jan  6 07:34 logrotation.ser
drwxr-xr-x 2 wandisco wandisco   4096 Jan  6 07:30 recovery-details
drwxr-xr-x 2 wandisco wandisco   4096 Jan  6 14:34 thread-dump

The logging system has been implemented using Simple Logging Facade for Java (SLF4J) over the log4j Java-based logging library. This change from java.util.logging has brought some benefits.

This change lets us collate data into specific package-based logs, such as a security log, application log, DConE messages etc.

Logging behavior is mostly set from the log4j properties file: /opt/wandisco/svn-multisite-plus/replicator/properties/log4j.properties

# Direct log messages to a file
log4j.appender.file=com.wandisco.vcs.logging.VCSRollingFileAppender
log4j.appender.file.File=fsfswd.log
log4j.appender.file.MaxFileSize=100MB
log4j.appender.file.MaxBackupIndex=10
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
log4j.appender.file.append=true

# Root logger option
log4j.rootLogger=INFO, file

This configuration controls how log files are created and managed. A change to the log4j configuration currently requires a replicator restart to take effect.

  • The log file name is fsfswd.log.

  • The maximum size of a log file is set at 100MB.

  • The maximum number of logs is limited to 10.

  • The VCSRollingFileAppender offers some benefits over Log4j’s default RollingFileAppender. It has a modified rollover behavior so that the log file fsfswd.log is saved out with a permanent file name (rather than being rotated). When fsfswd.log reaches its maximum size it is saved away with the name fsfswd.log.<Date>. The date/time stamp is in ISO-8601 format.

  • When the maximum number of log files is reached, the oldest log file is deleted.
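When searching fsfswd.log it can help to split each entry into its fields. The Python sketch below assumes lines laid out by the ConversionPattern shown above (timestamp, level padded to five characters, short logger name, line number, then the message). The function name is hypothetical and the sample line in the usage note is invented for illustration; entries written with a different layout will not split into meaningful fields.

```python
import re

# Mirrors: %d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>\S+)\s+"
    r"(?P<logger>[^:\s]+):(?P<line>\S+) - "
    r"(?P<message>.*)$"
)


def parse_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE.match(text.rstrip("\n"))
    return match.groupdict() if match else None
```

For the invented line "2014-01-15 16:53:24 INFO  Example:42 - replicator started", parse_line returns the level INFO, the logger Example, the line 42 and the message text as separate fields.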

If you enable the debug mode, we recommend that you adjust your log file limits, increasing the maximum file size and possibly the maximum number of files.

If possible, put the log files on a separate file system.

Additional log destinations (appenders)

Apache log4j provides Appenders which are primarily responsible for printing logging messages to different destinations such as consoles, files, sockets, NT event logs, etc.

Appenders always have a name so that they can be referenced from Loggers.

For more information about setting up appenders, see the Apache documentation.

We strongly recommend that you work with WANdisco Support team before making any significant changes to your logging.

Log appenders can have a significantly negative performance implication.

5.1.3. Logging levels

ALL

Provides a level of logging including trace information for troubleshooting hard to identify problems (This can be very noisy).

DEBUG

Provides a standard level of debug information, for identifying problems.

INFO

Interesting runtime events and system interactions. Expect these to be immediately visible on a console.

WARNING

A logging level indicating potential problems.

SEVERE

A logging level indicating serious failures.

5.1.4. Logger settings tool

You can change the logging levels, either temporarily to help in a current investigation, or permanently if you want to change your ongoing logging.

The logging settings tool enables you to change the levels through the UI but it is also possible to modify log settings directly by editing the logger properties file:

/opt/wandisco/svn-multisite-plus/replicator/properties/logger.properties

Once you’ve made a change, you will need to restart the replicator in order for the change to take effect.

Log changes are not replicated between nodes. This allows each node to have its own logging setup but you will have to manually replicate any changes if needed.

The Logging settings tool is on the Settings tab. Loggers are usually attached to packages. Here, the level for each package is specified. The global level is used by default, so levels specified here act as an override that takes effect in memory only, unless saved to the logger properties file.

logging2 1.9
Logging Settings
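The relationship between the global level and per-package overrides can be pictured as a lookup with a fallback. The Python sketch below is illustrative only: it assumes that an override on a package also applies to loggers beneath that package, which is how hierarchical Java loggers generally behave, and the package name com.wandisco.fsfs used in the usage note is a hypothetical example.

```python
def effective_level(logger_name, overrides, global_level):
    """Return the level that applies to logger_name.

    overrides    -- dict mapping a package or logger name to its level
    global_level -- level used when no override applies
    """
    name = logger_name
    while name:
        if name in overrides:
            return overrides[name]
        # Drop the last dotted component and try the parent package.
        name = name.rpartition(".")[0]
    return global_level
```

With overrides of {"com.wandisco.fsfs": "DEBUG"} and a global level of INFO, a logger under com.wandisco.fsfs resolves to DEBUG and every other logger resolves to INFO.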

5.1.5. Edit global logger settings

The global level is the default for all packages.

  1. Log in to the admin console, click the Settings tab.

  2. Scroll down the settings till you reach the Logging Settings block.

    svnmsp logging01
    Configure
  3. Click the Configure button.

  4. The Logging Settings Config page opens. Click the drop-down menu to change the current global logger setting. This change is applied to all loggers that have not been specified in the edited Logger settings. Changing this value takes effect in memory immediately. To make the change permanent, click Save All Settings To File. Loggers that you Add or Edit (specify) always override this global setting.

    logging1 1.9
    Edit Global Logger Setting
  5. Click Save to apply changes to the logger.properties file, then restart the replicator for changes to take effect.

5.1.6. Add or edit logger settings

  1. Log in to the admin console, click the Settings tab.

  2. Scroll down the settings till you reach the Logging Settings block.

    svnmsp logging01
    Configure
  3. Click the Configure button.

  4. The Logging Settings Config page opens. It has the following sections:

    logging2 1.9
    Configuration page
    Add New Logger Settings

    Enter the name of the logger, assign its level then click the Add button.

    Edit Existing Logger Settings

    Use the corresponding drop-down list to change the level of any of the existing loggers or click the Delete button to remove the logger. The default logging package com.wandisco.fsfs.logger.FSFSFileHandler cannot be deleted.

    All changes thus far are immediate in effect and in-memory only. Changes are not persisted after replicator restart unless you use the save or reload button:

    Reload Logging Settings

    Click the Reload All Settings From File button to discard all changes by reloading the logger settings from the <install-dir>/replicator/properties/logger.properties file.

    Save Logging Settings

    Click Save All Settings To File to apply your changes to the above logger.properties file.

  5. Once you have saved or reloaded the Logger Settings, appropriate growl messages will appear in the top right.

    logging3 1.9
    Growl - Saved

5.2. Consistency check

Consistency check is done on a per repository basis. It enables you to check whether a selected repository remains in the same state across the nodes of the replication group in which the repository resides.

The consistency checker looks at the last N common revision(s). N is specified in the request and can be set by the user.

More revisions cost resources
Don’t set N too large. The more revisions you compare, the greater the cost on the available Java heap space.

In MSP 1.9.0 to 1.9.2 it also compares the fsfswd-txn-sequence file.

fsfswd-txn-sequence

In MSP 1.9.0 to 1.9.2 this file exists in the db folder of the Repo and is only present in repos that have commits (this can also include commits that failed).
Note: In versions after MSP 1.9.2 this file no longer exists.

Each time a commit is made to the repo, regardless of whether or not it fails, the file is updated. The file format is in base36.

If the values in those files do not match the repo will appear as inconsistent and the value will be listed for each node so the user can easily spot where the inconsistency lies.

There will be more information present in the logs as to what the problem is and where it is. For example an inconsistent repo will log the following:

2015-11-09 16:39:54 INFO  [ConsistencyCheckTask:handle] - OutputProposalSequence:[[abb21772-5544-43e2-9cb9-ff1node1][8ccb4272-8496-11e5-ae63-0800278ef1dd]]-0::[fsfs-txn-sequence values did not match for Repo12,8ccb4271-8496-11e5-ae63-0800278ef1dd, between, node1, abb21772-5544-43e2-9cb9-ff1node1, and, node2, abb21772-5544-43e2-9cb9-ff1node2. Values were : k and u respectively.]
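For MSP 1.9.0 to 1.9.2, the values reported in a log entry like the one above can be decoded to see how far apart the nodes are. This Python sketch assumes that each value is a single base36 number, as the description of the file format suggests; the function names are hypothetical.

```python
def decode_sequence(text):
    """Convert a base36 string such as 'k' to an integer."""
    return int(text.strip(), 36)


def compare_sequences(values_by_node):
    """Return the decoded values per node if they disagree, or an empty dict if all match."""
    decoded = {node: decode_sequence(value) for node, value in values_by_node.items()}
    return decoded if len(set(decoded.values())) > 1 else {}
```

For the entry above, compare_sequences({"node1": "k", "node2": "u"}) returns {"node1": 20, "node2": 30}.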
Packed repositories / checksums

The consistency checker makes a call to SVN to check both the revprops and the revisions and we wrap these up in a CheckSumWrapper.

In the Checksum we compare both the Digest and the Kind. The Kind can be either MD5 or SHA1.

  • If the total checksum size is different across the nodes we will mark the repo as inconsistent.

  • If the Checksum, Digest or Kind is not equal across the nodes we will mark the repo as inconsistent.

There will be more information present in the logs as to what the problem is and where it is.
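The two rules above can be sketched as a comparison of per-node checksum lists. This Python example is an illustration, not MSP's CheckSumWrapper: the Checksum tuple and function name are hypothetical, and it reads "total checksum size" as the number of checksums each node reports, which is an assumption.

```python
from collections import namedtuple

Checksum = namedtuple("Checksum", ["kind", "digest"])


def is_consistent(checksums_by_node):
    """True if every node reports the same number of checksums with equal kind and digest."""
    reports = list(checksums_by_node.values())
    if not reports:
        return True
    first = reports[0]
    for other in reports[1:]:
        if len(other) != len(first):
            return False  # the number of checksums differs between nodes
        if any(a != b for a, b in zip(first, other)):
            return False  # a kind or a digest differs between nodes
    return True
```

Each node's list must be in the same revision order for the element-by-element comparison to be meaningful.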

Improved check of lockfiles

From MSP 1.9 the consistency checker no longer tracks all individual lockfiles, given that there could be tens of thousands (or more) locks to collate, resulting in a consistency check consuming multiple megabytes.

Now a SHA1 checksum is taken of all the lock file names and their sizes. The contents are not checksummed as the contents can be different across nodes based on a checksum created when the lock file is created. The timestamp is a fixed size so the lock file should be the same size across nodes.

The number of lock files is included in the Consistency Check result.

The lock file names are also logged to a file in the log directory of the replicator. The file name is based on the repository Id and also contains the name of the repository and the GSN at which the consistency check was done. Each time the consistency check is done for this repository the file is replaced with a new version. The file should be the same on all nodes. If the lock files are inconsistent between nodes, diffing the files should show which lock files differ.
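The idea of fingerprinting lock files by name and size, rather than by content, can be sketched in a few lines of Python. The function name and the way each entry is serialized before hashing are assumptions made for illustration; MSP's actual encoding is not documented here, so the digest will not match the value that MSP computes.

```python
import hashlib


def lock_fingerprint(lock_files):
    """Return (count, sha1_hex) for an iterable of (file_name, size_in_bytes) pairs."""
    entries = sorted(lock_files)  # sort so every node hashes the entries in the same order
    sha1 = hashlib.sha1()
    for name, size in entries:
        sha1.update(f"{name}:{size}\n".encode("utf-8"))
    return len(entries), sha1.hexdigest()
```

Two nodes holding the same lock file names with the same sizes produce the same pair, regardless of the order in which the files are listed, while a missing, extra or differently sized lock file changes the digest.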

Clean up stale lock files
The consistency checker verifies all lock files, so having a large number of stale or abandoned lock files lying around can impact the consistency checker’s performance. For this reason it makes sense to do a regular cleanup of "stale/abandoned" lock files.
Cleaning up stale lock files
  1. Obtain a list of the locks in a repository. Run:

    svnadmin lslocks {REPOSITORY_PATH}
  2. The output will contain the following information:

    Path: /<pathInVersionTree>
    UUID Token: opaquelocktoken:8e2f486c-632c-4bad-bcb1-cf942ae71893
    Owner: <:accountName>
    Created: 2015-11-24 15:35:10 -0500 (Tue, 24 Nov 2015)
    Expires:
    Comment (1 line):
    some comment

    Each record is followed by at least 1 blank line. To parse the output reliably, read the number of lines given in the Comment header and skip that many lines before looking for the blank line and then the next "Path:" record.

  3. You can now determine if a lock file is stale by considering factors such as:

    • File owner no longer works for the company.

    • File created many months or even years ago.

If there’s doubt, contact the file owner and ask about the file’s status. When you have identified a stale lock, it can be removed via the svnadmin rmlocks command. Note: the actions taken by the svnadmin rmlocks command are replicated, so specified locks will be removed from all repository replicas.
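The parsing approach described in step 2 can be sketched in Python as follows. It works on saved output of svnadmin lslocks and skips the number of comment lines announced in the Comment header, so that comment text is never mistaken for a new record. The function names and the 180-day default are illustrative assumptions; age alone does not prove that a lock is stale, so treat the result as a list of candidates to review.

```python
import re
from datetime import datetime, timedelta, timezone

COMMENT_HEADER = re.compile(r"^Comment \((\d+) lines?\):$")


def parse_lslocks(output):
    """Parse `svnadmin lslocks` output into a list of dicts, one per lock."""
    lines = [line.strip() for line in output.splitlines()]
    locks, current, i = [], None, 0
    while i < len(lines):
        line = lines[i]
        i += 1
        header = COMMENT_HEADER.match(line)
        if line.startswith("Path: "):
            current = {"Path": line[len("Path: "):]}
            locks.append(current)
        elif current is not None and header:
            count = int(header.group(1))
            current["Comment"] = "\n".join(lines[i:i + count])
            i += count  # skip the comment lines so they are not read as fields
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key] = value.strip()
    return locks


def stale_locks(locks, now, max_age_days=180):
    """Return the locks whose Created timestamp is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for lock in locks:
        # Created looks like "2015-11-24 15:35:10 -0500 (Tue, 24 Nov 2015)".
        created = datetime.strptime(lock["Created"][:25], "%Y-%m-%d %H:%M:%S %z")
        if created < cutoff:
            stale.append(lock)
    return stale
```

A typical call would be stale_locks(parse_lslocks(text), datetime.now(timezone.utc)), where text holds the captured command output. The paths of the locks returned can then be reviewed with their owners before being passed to svnadmin rmlocks.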

Limits of the Consistency Checker

The Consistency Check tells you the last common revision shared between repository replicas. Given the dynamic nature of a replication group, there may be in-flight proposals in the system that have not yet been agreed upon at all nodes. Therefore, a consistency check cannot be completely authoritative as it does not include any changes that occur after the consistency check is scheduled.

Consistency checks should be made on replication groups that contain only Active (including Active Voter) nodes. The presence of passive nodes causes consistency checks to fail.

If you run consistency checks, especially on a schedule, take care when changing node roles that you don’t make an affected node passive. Doing so will cause the consistency check to fail because, as noted above, Active or Active Voter roles are required for consistency checks.

If you use the REST API to run a consistency check on a repository that does not exist then the dashboard will display an error.

Prior to inducting a new node you must ensure that the consistency check settings of the new node match those of the existing nodes in the ecosystem.
Don’t run with LRO
Don’t run a consistency check if any of your replica repositories are in the Local Read-only state. In this case a consistency check will not complete until the LRO state is cleared.

You will receive a consistency error if you run a consistency check when there is no quorum. Consistency checks cannot verify consistency without a quorum so shouldn’t be run. Consistency checks will not complete until all nodes in a replication group have provided the requested data for the specified repository. Therefore, all nodes in the replication group should be up. If one or more nodes are down then the consistency check will complete when those nodes come back up.

Scheduled Consistency Checks not running?
If scheduled consistency checks are being skipped, possibly because the previous check failed, you can get the scheduled checks back into action by cancelling the previous task through the admin UI. Read how to set up Scheduled Consistency Checks.
Running a consistency check

You can trigger a consistency check at any time using the following procedure.

  1. Log in to a node, and click the Repositories tab.

    Go to the repository
  2. Click one of the listed repositories.

    Click a repository
  3. Click Consistency Check. A growl message appears: Invoking consistency check on repository <Repository Name>.

    Consistency check in action
    Known issue: Don’t run a consistency check if the repository has been removed from one of the nodes.
    There’s currently a problem with running a consistency check on a repository if the replica on one or more nodes has been deleted. In this situation a "Highest Common Revision" task appears on the dashboard and remains permanently in a Pending state. Until we resolve this problem you shouldn’t run the consistency checker on a repository if it has been removed from the file system of any of your nodes.
  4. If you click the dropdown, you can choose a different number of revisions to compare between nodes. Be careful: the more revisions you select, the more Java memory is needed to complete the consistency check.

  5. The results of the consistency check will be written to the log. Click on the Reload button to refresh the Repository screen and display the results of the check.

    Time Consistency Check Started

    The time that the check was started.

    Number of Revisions Requested

    The check can be set to limit the number of revisions that are checked. This field confirms the limit of the check. In this case it is the default of 10 revisions.

    Earliest Revision Checked

    The first revision that is compared between nodes.

    Latest Revision Checked

    The most recent revision that was compared in the check.

    Consistency Check Results

    The summary of the check may look something like this:

    node2 :
          endingRevision: 1
          startingRevision: 0
          format: formatNum: 4
          shardSize: 1000
          notPacked
          RevpropChecksums: 2
          RevisionChecksums: 2
          locks: RepoLocksChecksum{checksum='da39a3ee5e6b4b0d3255bfef95601890afd80709', lockCount=0}
          delegate Port OK: true
          fsType OK: true
          fsfswd-txn-sequence: dw
          Local Node Id: 714565f6-910f-4efc-8d58-493234f690fc
Consistency Check Key
endingRevision

Corresponds with the setting "Latest Revision Checked".

startingRevision

Corresponds with the setting "Earliest Revision Checked".

format

The FSFS format number. For example:

Format Name   Understood by          Features
Format 1      SVN 1.1+               Inception
Format 2      SVN 1.4+               Introduced support for svndiff version 1
Format 3      SVN 1.5+               Sharded layout and storing merge tracking information
Format 4      SVN 1.6+               Representation sharing and repository packing
Format 5      SVN 1.7 development    Removed prior to 1.7.0 release
Format 6      SVN 1.8+               Revision properties packing
Format 7      SVN 1.9+               Performance related changes

shardSize

The maximum number of files stored in each of the db/revs/N shard directories. The default and recommended value is 1000. You can read more about FSFS sharding in the following blog: Tree-structure FSFS repositories.

notPacked

Flags whether the repository is packed ("packed") or not packed ("notPacked"). Packing is a procedure that can greatly speed up repository performance by bundling all the files in a completed shard into a single unified revision file. Packing saves storage space and gives the operating system opportunities to benefit from caching.

RevpropChecksums

The revisionChecksums() and revpropChecksums() methods return arrays of org.apache.subversion.javahl.Checksum objects. These objects have member functions that return the kind of checksum and the checksum itself. At present the FSFSWD implementation always calculates MD5 checksums.

RevisionChecksums

See above.

locks

A check of the lock files. See Improved Locks.

delegate Port OK

"true" or "false"

fsType OK

A check that the file system type is okay. "true" or "false".

fsfswd-txn-sequence

See fsfswd-txn-sequence.
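
When comparing results between nodes, it can help to pull the lock checksum and lock count out of each node’s summary. The following is a minimal sketch that assumes the summary text has the same shape as the example above. The function name locks_summary is hypothetical and is not part of MSP.

```shell
#!/usr/bin/env bash
# Reads a consistency check summary on stdin and prints
# "<checksum> <lockCount>" taken from its "locks:" line.
# Assumes the line format shown in the example output above.
locks_summary() {
    sed -n "s/.*locks: RepoLocksChecksum{checksum='\([0-9a-f]*\)', lockCount=\([0-9]*\)}.*/\1 \2/p"
}
```

If the values printed for two nodes differ, the lock files on those nodes are not consistent.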

Known issue: You are not notified if a scheduled consistency check fails to run
Check the dashboard for the status of the consistency check.
Scheduled Consistency Checks

You can have consistency checks triggered automatically on a predefined schedule. Scheduled consistency checks are set up on a per-node basis (this is not a replicated feature), as follows:

  1. Log into the admin console, click Settings.

    Settings
  2. In the System Data section, you’ll see a Scheduled Consistency Check Enabled? checkbox. Tick the box to enable the schedule. The time between checks (if enabled) is set in the Scheduled Consistency Check Frequency (Hours) box.

    Enable and set the check frequency

    By default, the frequency is set to 24 hours, i.e. repositories are checked for consistency once per day. The entry field permits an integer value from 1 (an hour) to 999 (41 days, 15 hours).
    The starting time for the consistency checks is midnight, so if you set the value to 8, checks will occur at midnight, 8am and 4pm. If you set the value to 96, the check will occur at midnight every 4 days.
    Important - this is midnight in the timezone of the node the consistency check is being triggered on.

  3. Once your settings are in place, click Save.
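
The arithmetic behind the frequency setting can be checked with a short sketch. Both function names are hypothetical helpers for illustration. The second one only covers frequencies that divide evenly into 24, since those are the cases described above.

```shell
#!/usr/bin/env bash
# Converts a check frequency in hours into "D days, H hours".
hours_to_days() {
    local h="$1"
    echo "$(( h / 24 )) days, $(( h % 24 )) hours"
}

# Lists the local times of day at which checks start when the schedule
# is counted from midnight. Intended for frequencies that divide 24.
daily_check_times() {
    local freq="$1" t
    [ "$freq" -gt 0 ] || return 1
    for (( t = 0; t < 24; t += freq )); do
        printf '%02d:00\n' "$t"
    done
}
```

For example, `hours_to_days 999` prints "41 days, 15 hours", and `daily_check_times 8` prints 00:00, 08:00 and 16:00.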

Known issue
Currently, you need to ensure that the Number of Revisions in Default Consistency Check value is the same on all nodes before starting a check. This is important if you have edited the value in production and then add a new node, which will use the default value unless it is edited to match.
Deciding which nodes and when

In an ideal situation you would run a consistency check once a week on a node which has all repositories available. The number of repositories would be small enough for the consistency check to finish before the start of the working day. Obviously this ideal situation rarely happens.

There are several things to think about when deciding on which nodes and at what frequency to schedule consistency checks:

  • You need to use a combination of nodes which provides completeness i.e. all repositories will be checked once (at least).

  • You need to use a combination of nodes with the least repetition of repositories i.e. you want as many repositories as possible to only be checked once.

  • You want to schedule consistency checks to disrupt repository usage as little as possible - you can use timezone differences between your nodes and your majority of workers to your advantage.

  • By scheduling checks on different days of the week on different nodes you will reduce the impact of the check on repository usage compared to if you did them all at once.

Checking more often than hourly

Scheduled consistency checks are not replicated. There’d be no point, as all repository replicas across all nodes are checked anyway. You can use this to your advantage, for example if you want to perform checks more frequently than once per hour. If you have nodes in timezones that are not a whole number of hours apart (for example the UK and India, which are 4.5 hours apart during British Summer Time), you could run an hourly check on each node. The checks would then occur every 30 minutes. Such frequent checks are not recommended if you have more than a few repositories, as consistency checks are very resource intensive.
Do not set the frequency too high. Ideally you want to run a check once a week, and not more than once a day.
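
The half-hour interleaving can be worked out from each node’s UTC offset. The sketch below assumes that an hourly schedule anchored at local midnight starts on the hour in the node’s local time. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Given a node's UTC offset in minutes (e.g. 330 for India, UTC+5:30),
# prints the minute past each UTC hour at which an hourly check that is
# anchored at local midnight would start.
utc_minute_of_hourly_check() {
    local offset_min="$1"
    # Normalise the remainder into the range 0..59.
    echo $(( ((-offset_min % 60) + 60) % 60 ))
}
```

A node at UTC+1 gives 0 and a node at UTC+5:30 gives 30, so together the two nodes start a check every 30 minutes.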

Scheduling consistency checks through the API

Instead of using the built-in scheduler you can schedule consistency checks with the REST API. Scheduling checks programmatically allows more sophisticated schedules to be used.

For more information contact WANdisco support.

5.2.8. Inconsistency: causes and cures

WANdisco’s replication technology delivers active-active replication that, subject to some external factors, ensures that all replicas are consistent. However, some events can break consistency and result in a halt to replication:

  • Temporary removal of a repository from a node, then adding it back incorrectly.
    Fix: Ensure that an rsync is performed between your restored repository and the other replicas. Don’t assume that nothing has changed even if the repository has been off-line.

    Known Issue
    Don’t run a consistency check if the repository has been removed from one of the nodes. There’s currently a problem with running a consistency check on a repository if the replica on one or more nodes has been deleted. In this situation a "Highest Common Revision" task will appear on the dashboard and will remain permanently in a 'pending' state. Until we resolve this problem you shouldn’t run the consistency checker on a repository if it has been removed from the file system of any of your nodes.
  • The Consistency Check would not be expected to deal with consistency issues that pre-dated the revision at which replication was started.
    Fix: Ensure consistency between replicas before you start replicating a repository.

  • The Consistency Check would not be expected to pick up on inconsistencies that occur in very early revisions of a very large repository (revision 12 in a repository with 10,000 revisions, etc.).
    Fix: These sorts of issues should be managed through SVN admin best practices such as through regular, incremental backup of repositories and verifications using svnadmin.

  • Restoring a backup of a repository from a VM snapshot can introduce differences.
    Fix: Repeat the repository restoration, accounting for factors such as the use of Change Block Tracking (CBT). Make sure to rsync the restored repository from a single recovered copy.

    While you can restore a repository from a VM snapshot, never restore a replicated node from a VM snapshot. If restoring a repository from a VM snapshot, take care to boot the VM into single-user mode and prevent the MSP application from starting (e.g. mv /opt/wandisco/svn-multisite-plus /opt/wandisco/svn-multisite-plus.DONOTSTART).
  • Possible SVN/VCS bugs that lead to non-deterministic behavior, resulting in a loss of sync.
    Fix: Need to be handled on a case by case basis, subject to the nature of the problem. Please contact WANdisco support.

  • Manipulation of file/folder permissions outside of SVN’s control will lead to divergence that will force the affected replica to become read-only.
    Fix: This is the easiest problem to resolve, as it only requires correcting the file permission/ownership errors. This generally results in the replicas re-syncing and automatically coming out of Read-only mode.
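
For the last case in the list above, a quick way to spot ownership drift is to list everything in the repository that is not owned by the expected account. This is a minimal sketch. The function name find_wrong_owner is hypothetical, and the account to check against is whichever account your deployment uses to write to the repository.

```shell
#!/usr/bin/env bash
# Prints every file and directory under the repository ($1) that is
# NOT owned by the expected account ($2, a user name or numeric ID).
find_wrong_owner() {
    local repo="$1" account="$2"
    find "$repo" ! -user "$account" -print
}
```

No output means that ownership is uniform. Anything listed is a candidate for correction, for example with chown, once you have confirmed the intended owner.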

5.2.9. Log results

It’s also possible to check the results of a consistency check by viewing the replicator’s log file (e.g. fsfswd.log). See Logs

5.2.10. A note about replica size and consistency

It is possible for repository replicas that are consistent between nodes to report different on-disk sizes. This difference should not be a cause for concern and can be explained by a number of factors that mostly relate to housekeeping and actions that don’t need to be synchronized. These can include:

  • Aborted transactions, still waiting to be cleaned up.

  • The local use of various repository admin tools that create or change repository files.

  • Collection timing skew; different revision numbers.

  • From MSP 1.9 onwards, the effect of packed vs unpacked repositories.

5.3. Copying repositories

This section describes how to get your repository data distributed before replication.

Repositories should start out as identical at all sites. A tool such as rsync can be used to guarantee this requirement. The exception is the hooks directory which can differ as variances in site policy may require different hooks. For more information see hooks.

5.3.1. Copying existing repositories

It’s simple enough to make a copy of a small repository and transfer it to each of your nodes. However, remember that any changes made to the original repository will invalidate your copies unless you perform a synchronization prior to starting replication.

If a repository needs to remain available to users during the process, you should briefly halt access in order to make a copy. The copy can then be transferred to each node. Then, when you are ready to begin replication, you need to use rsync to update each of your replicas.

5.3.2. New repositories

There are 2 ways to create a new repository:

  1. Through the UI - see Add repositories

  2. Locally via svnadmin create

Don’t create a new repository at each node, instead create the repository once and then copy it to all nodes in the replication group to which it will be added. Copying can be done via rsync, scp or any other mechanism that guarantees a 100% identical repository copy. The hooks directory contents can be customized after the copy is made if required (see hooks).
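
The "create once, then copy" approach can be sketched as a plan that is printed for review rather than executed. The function name, host names and paths below are placeholders. The rsync options are the ones used elsewhere in this guide, without --exclude /hooks because the repository is new.

```shell
#!/usr/bin/env bash
# Prints (does not run) the commands to create a repository once and
# copy it to each remaining node. $1 is the repository path; the other
# arguments are the target hosts. Paths with spaces are not handled.
plan_new_repo() {
    local repo="$1" host
    shift
    echo "svnadmin create $repo"
    for host in "$@"; do
        echo "rsync -rvlHtogpc --delete $repo/ $host:$repo/"
    done
}
```

For example, `plan_new_repo /opt/repos/repo9 node2.example.com node3.example.com` prints one svnadmin create line followed by one rsync line per node.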

5.4. Repair an out-of-sync repository

There are several situations where a repository may be corrupted or lose sync with its other copies. For example, it could be the result of a temporary file system full state or a file system corruption causing lost data. If this happens, the node with this copy stops replicating data for that repository. Other repositories are unaffected and continue to replicate. You can use MSP’s repair process to quickly repair the repository and continue replicating. Make certain to fix the underlying file system problem before continuing or the corruption will re-occur.

No option to repair?

If an existing repository is added to a Replication Group that contains Passive nodes or a repository on a Passive node enters a Local Read-only state, the UI does not offer a repair option because it cannot coordinate with the repository copy on the Passive node. You must temporarily change the passive node into an active node:

  1. Log in to the Passive node, then click the Replication Group tab.

  2. Click the Configure button, then change the role of the passive node so that it becomes active.

  3. When the repair is completed successfully, reverse this change to return to your established replication model.

  1. Log in to the admin UI on all nodes and click the Repositories tab. Any repository that is out of sync is flagged as Local RO and Stopped. Other replicas may continue to update.
    Click the Repair button.

    Out of sync
  2. The Repair Repository window opens. Select a helper from the nodes still in replication. Make sure that the helper’s copy of the repository is the latest version.
    Click the Start Repair Process button. This briefly takes the selected node offline, to ensure that changes don’t occur to the repository while you conduct the repair. Log in to handle the repair manually.

    Start the repair
  3. Use the good copy of the repository on the helper node to overwrite the broken copy. We recommend that you use rsync with the following command:
    rsync -rvlHtogpc --delete --exclude /hooks /path/to/myRepo/ account@remoteHost.example.com:/path/to/myRepo/
    Note the trailing slashes on both the source and destination paths.

    Hooks can be overwritten
    If you run rsync without the --exclude /hooks option, you will also copy across the helper repository’s hooks, overwriting those on the destination node.
    Need to maintain existing hooks?
    Before doing the rsync, copy the hooks folder to somewhere safe. Then when you’ve completed the rsync, restore the backed-up hooks.
    Locks
    The locks directory-tree must be exactly the same on all replicas to avoid replica divergence. For this reason the "--delete" option is required.
    [root@localhost repos]#  rsync -rvlHtogpc --delete --exclude /hooks /opt/repos/repo2/ root@172.16.2.41:/opt/repos/repo2/
    
    The authenticity of host '172.16.2.41 (172.16.2.41)' can't be established.
    RSA key fingerprint is 9a:07:b2:bb:b6:85:fa:93:41:f0:01:d0:de:8f:e1:5d.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '172.16.2.41' (RSA) to the list of known hosts.
    root@172.16.2.41's password:
    sending incremental file list
    ./
    README.txt
    format
    conf/
    conf/authz
    conf/passwd
    conf/svnserve.conf
    db/
    db/current
    db/format
    db/fs-type
    db/fsfs.conf
    db/min-unpacked-rev
    db/rep-cache.db
    db/txn-current
    db/txn-current-lock
    db/uuid
    db/write-lock
    db/revprops/
    db/revprops/0/
    db/revprops/0/0
    db/revprops/0/1
    db/revprops/0/2
    db/revprops/0/3
    db/revs/
    db/revs/0/
    db/revs/0/0
    db/revs/0/1
    db/revs/0/2
    db/revs/0/3
    db/transactions/
    db/txn-protorevs/
    locks/
    locks/db-logs.lock
    locks/db.lock
    
    sent 1589074 bytes  received 701 bytes  167344.74 bytes/sec
    total size is 1585973  speedup is 1.00
    [root@localhost repos]#
  4. When the repository is updated, check that the fixed repository now matches the version on your helper node.

  5. Restart Apache. This frees up file handles that are holding the rep-cache.db file open, and clears any in-memory cache data that could point to references that don’t exist in the repaired repository.

  6. Complete the repair process. Click the Complete Repair Process button.

    Complete
  7. Now, restart the replicator. You can use the Restart Replicator button on the Admin UI’s Settings Tab or the svn-multisite-plus script described in the Admin section Starting up.

  8. A Growl message will confirm the repair is complete. Make sure that the re-synced repositories are Replicating again.

    Back in sync
  9. If a consistency check was in progress when you executed the repository repair process then the consistency check tasks may never complete. You should go to all of the nodes other than the repaired node and cancel any task associated with a consistency check on the repaired repository.
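
Step 4 asks you to check that the repaired repository matches the helper’s copy. One way to do this is to build a checksum manifest on each node and compare the two files with diff. The following is a minimal sketch with several assumptions: the function name repo_manifest is hypothetical, GNU find, sort, xargs and sha1sum are available, the hooks directory is skipped because it may legitimately differ, and the manifests are taken before replication resumes.

```shell
#!/usr/bin/env bash
# Prints "sha1  ./relative/path" for every file in the repository ($1),
# skipping hooks/, in a stable order so that manifests produced on two
# nodes can be compared with diff.
repo_manifest() {
    ( cd "$1" && find . -path ./hooks -prune -o -type f -print0 \
        | LC_ALL=C sort -z | xargs -0 -r sha1sum )
}
```

For example, run `repo_manifest /opt/repos/repo2 > /tmp/repo2.manifest` on each node, copy one manifest across and diff the two files. No diff output means the copies match outside hooks.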

5.4.1. Recovering Sidelined Repositories

The sidelining feature takes a repository offline. This tells the other nodes to press on, and not queue up subsequent proposals. When a repository has been taken offline, it can never catch up and will require a Repository Repair.

Why sideline?
Without the sidelining feature, any replica that remained offline could cause the remaining nodes to exhaust their storage. This is because they would attempt to cache all the continuing repository changes, so that they could automatically "heal" the offline repository, should it come back online.

Use the following procedure to free a repository from a sidelined state:

  1. View, then click on the sidelined repository.

    Select Repository
  2. Click Repair to open the repair dialog, which has sidelining-related options. Start by clicking the Prepare to Unsideline button.

    Prepare to Unsideline
  3. Choose the Helper Node from the Choose Helper Node dropdown and click Start Repair Process.

    Choose Helper Node
  4. Now use rsync to copy the repository from the helper to the broken node.

  5. When the rsync has completed, click on the Unsideline Repository button.

    Unsideline Repository
  6. Click on the Complete Repair Process button and trigger the Consistency Check.

    Complete Repair Process
  7. A Growl message will appear saying that the helper process has completed and that a consistency check will be carried out. If this check fails, the repo will go Global Read-only. You can check this by refreshing the page. The repository will show up as replicating again.

    Replicating again

5.5. Recover from node disconnection

MSP can recover from a brief disconnection of a member node. It should be able to automatically synchronize when the node is reconnected. The crucial requirement for MSP’s continued operation is that agreement over transaction ordering must be able to continue. Votes must be cast and those votes must always result in an agreement.

If, after a node disconnection, a replication group can no longer form agreements then replication is stopped. If the disconnected node was a voter and there aren’t enough remaining voters to form an agreement then either the disconnected node must be repaired and reconnected, or the replication group must undergo emergency reconfiguration (EMR).

5.5.1. EMR

EMR is only necessary if there is a lack of quorum in one or more replication groups after a node has been disconnected/lost and, in the case of simple disconnection, the node is not expected to resume operation for an unacceptable amount of time. If you use EMR then the disconnected/lost node will be permanently removed from your ecosystem - including all replication groups where it is a member - and it must never resume operation again. You must contact WANdisco’s support team for assistance before using EMR as the operation poses several risks to overall operation. We therefore recommend that you do not attempt the procedure without assistance from WANdisco support.

EMR is a final option for recovery

The EMR process cannot be undone, and it involves major changes to your replication system. Only consider an EMR if the disconnected node cannot be repaired or reconnected in an acceptable amount of time.

The EMR procedure needs to be co-ordinated between sites/nodes. You must not start an EMR if an EMR procedure has already started from another node. Running multiple EMR procedures at the same time can lead to unpredictable results or cause the processes to get permanently stuck requiring a complete re-deployment of your system. Again, please contact WANdisco support to prevent unnecessary damage to your replication ecosystem.

Any replication group which has its membership reduced to one node will continue to exist after the emergency reconfiguration as a non-replicating group. When you have set up a replacement node you should be able to add it back to the group to restart replication.

Note: If EMR is used to remove a node you may be left with a pending task of type tasksTypeREMOVE_STATE_MACHINE_TASK. If this is the case then:

  1. Cancel the active/pending task (type is tasksTypeREMOVE_STATE_MACHINE_TASK)

  2. Restart the node where the pending task existed.

5.6. Run Talkback

Talkback is a bash script that is provided in your MSP installation for use in the event that you need help from the WANdisco support team.

Manually run talkback using the following procedure. You can run talkback without the need for user interaction if you set up the variables noted in step 3, below:

  1. Log in to the server using either the root account or the account that MSP runs as. Using the root account enables additional data to be included in the talkback, which may be needed to analyze your issue and find its root cause. Navigate to MSP’s binary directory:

    cd /opt/wandisco/svn-multisite-plus/bin/
  2. Run talkback.

    [root@localhost bin]# ./talkback
  3. You’ll need to provide some information during the run - also note the environmental variables noted below which can be used to further modify how the talkback script runs:

    #
    # WANdisco talkback - Script for picking up system & replicator       #
    # information for support                                             #
    #
    
        To run this script non-interactively please set following environment vars:
    
        ENV-VAR:
        MSP_REP_UN                  Set username to login to MSP
        MSP_REP_PS                  Set password to login to MSP
        MSP_SUPPORT_TICKET          Set ticket number to give to WANdisco support team
        MSP_RUN_SVNADMIN            Run svnadmin verify, lstxns and lslocks commands - turned off by default (if enabled this can take a very long time depending on the size and quantity of your repositories - please contact support)
    
        By default, your talkback is not uploaded. If you wish to upload it, you may also specify
        the following variables:
    
        MSP_FTP_UN                  Set ftp username to upload to WANdisco support FTP server. Note that
                                    specifying this may cause SSH to prompt for a password, so don't set
                                    this variable if you wish to run this script non-interactively.
    
    
          ===================== INFO ========================
          The talkback agent will capture relevant configuration
          and log files to help WANdisco diagnose the problem
          you may be encountering.
    
    Please enter replicator admin username: adminUIusername
    Please enter replicator admin password: thepasswordhere
    
    retrieving details for repository "Repo1"
    retrieving details for repository "Repo3"
    retrieving details for repository "Repo4"
    retrieving details for repository "repo2"
    retrieving details for node "NodeSanFransisco"
    retrieving details for node "NodeAuckland"
    retrieving details for node "NodeParis"
    
    Please enter your WANdisco support FTP username (leave empty to skip auto-upload process):
    Skipping auto-FTP upload
    
    TALKBACK COMPLETE
    
    ---------------------------------------------------------------
     Please upload the file:
    
         /opt/wandisco/svn-multisite-plus/talkback-201312191119-redhat6.3-64bit.tar.gz
    
     to WANdisco support with a description of the issue.
    
     Note: do not email the talkback files, only upload them
     via ftp or attach them via the web ticket user interface.
    --------------------------------------------------------------

Note that the svnadmin checks are disabled by default because in some situations they can impede the rapid collection of system data. To switch the svnadmin checks back on, set the corresponding environment variable:

export MSP_RUN_SVNADMIN=true

and then run the talkback. You can check the status of the variable by entering:

echo "$MSP_RUN_SVNADMIN"
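
For unattended runs it is worth confirming that the variables are set before talkback starts, so that the script does not stop at a prompt. The sketch below is a hypothetical pre-flight check and is not part of MSP. It assumes that the three variables listed for non-interactive use in the talkback banner are the ones talkback would otherwise prompt for, and it leaves MSP_FTP_UN unset as the banner advises.

```shell
#!/usr/bin/env bash
# Returns 0 only if the variables assumed to be needed for a
# non-interactive talkback run are set and non-empty. Names of any
# missing variables are reported on stderr.
talkback_env_ready() {
    local var missing=0
    for var in MSP_REP_UN MSP_REP_PS MSP_SUPPORT_TICKET; do
        if [ -z "${!var:-}" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A wrapper could then run `talkback_env_ready && (cd /opt/wandisco/svn-multisite-plus/bin/ && ./talkback)`.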

5.6.1. Uploading talkback files

If you need help from WANdisco support you may need to send them your talkback output files.
DO NOT send these files by email. The best way to share your talkback files is via SFTP, but small files (<50MB) can also be uploaded directly at customer.wandisco.com.

For information on how to upload talkback files, see the Knowledgebase article How to upload talkback files for support.

Information can also be found at customer.wandisco.com but you will need a valid WANdisco License Key to access this information.

5.6.2. Talkback output example

talkback-201707040001-svnmsp.example.com/
├── flumeData
│   ├── receiver
│   │   └── conf
│   └── sender
│       └── conf
│           ├── acp_sender.conf
│           ├── flume-conf.properties.template
│           ├── flume-env.ps1.template
│           ├── flume-env.sh.template
│           ├── flumeFilterScript
│           └── log4j.properties
├── replicator
│   ├── agreementStatistics.xml
│   ├── application.xml
│   ├── config
│   │   ├── application.properties
│   │   ├── license.key
│   │   ├── log4j.properties
│   │   ├── log4j.properties.md5
│   │   ├── logger.properties
│   │   ├── main.conf
│   │   ├── svnok.catalog
│   │   ├── ui-logs
│   │   │   └── local-ui.log
│   │   ├── ui.properties
│   │   └── users.properties
│   ├── configuration.xml
│   ├── handlers.xml
│   ├── license.xml
│   ├── localNode.xml
│   ├── locations.xml
│   ├── md5s
│   ├── memberships.xml
│   ├── nodes
│   │   └── 93aa61d0-88e7-46fd-b23a-d2599fd13762
│   │       ├── connection-test
│   │       ├── location.xml
│   │       └── node.xml
│   ├── nodesInformation
│   ├── nodes.xml
│   ├── recent-logs
│   │   ├── fsfswd.log
│   │   ├── multisite.log
│   │   ├── node-details
│   │   ├── replicator.20170621-175447.log
│   │   ├── tasks-gc-2017-06-30T18:43:36+0200.xml
│   │   ├── ...
│   │   ├── thread-dump-2017-06-15T22-34-39+0200
│   │   ├── ...
│   │   ├── ui.20170621-175447.log
│   │   └── watchdog.log
│   ├── replicatedConfiguration.xml
│   ├── replicationGroups.xml
│   ├── replicator-file-list
│   ├── repositories
│   │   ├── repo1
│   │   │   ├── info
│   │   │   ├── membership.xml
│   │   │   ├── replicationGroup.xml
│   │   │   ├── repository.xml
│   │   │   ├── statemachine.xml
│   │   │   └── stats.xml
│   │   ├── ...
│   │   └── repoLast
│   │       ├── info
│   │       ├── membership.xml
│   │       ├── replicationGroup.xml
│   │       ├── repository.xml
│   │       ├── statemachine.xml
│   │       └── stats.xml
│   ├── repositories.xml
│   ├── statemachines.xml
│   ├── tasks.xml
│   └── VERSION
├── sysInfo
│   └── sysInfo_192.168.56.205-20170630-223153.tar.gz
└── system
    ├── file-max
    ├── file-nr
    ├── limits.conf
    ├── logs
    │   ├── error_log
    │   └── messages
    ├── netstat
    ├── processes
    ├── services
    ├── sysctl.conf
    ├── sys-status
    └── top

Note: Nodes that fall behind will eventually recover.

MSP runs with a smart commit strategy and ignores all read operations so activities such as checkouts never impact upon WAN traffic. This, along with network optimization can allow deployments to provide developers with LAN-speed-like performance over a WAN for write operations at every location, while keeping all of the repositories in sync. If a node is temporarily disconnected, or experiences extreme latency, low speeds or high lost packet rates, a node may become temporarily out of sync while transactions are queued up.

In this situation the node should eventually catch up in a self-healing manner without administrator intervention. It is worth monitoring the state of your WAN connectivity to help gain assurance that replication is going to be able to catch up. Clearly, if connectivity drops to almost zero for a prolonged period then this will inevitably result in the node becoming isolated and increasingly out-of-sync. If this happens after you have monitored the network traffic for a period of time, contact WANdisco’s support team and start considering contingencies such as making network changes or removing the isolated node from replication. If quorum has been lost, Emergency Reconfiguration may be necessary.

5.8. Disable external authentication

In the event that you need to disable LDAP or Kerberos authentication and return your deployment to the default internally managed users, use the following procedure.

  1. Open a terminal on your node. Navigate to the replicator directory:

    $  cd /opt/wandisco/svn-multisite-plus/replicator/
  2. Run the following command-line utility:

    $  java -jar resetSecurity.jar
    • Use resetSecurity.jar to reset an existing admin user’s password, or to create a new admin user.

  3. You’ll be asked for new administrator credentials then prompted to restart the replicator in order for the change to be applied. Make sure to provide an account name that has not already been used (along with its password).

  4. Now log in using the normal authentication form and the new administrative account name:

loginpassword 1.9
Log in

5.9. Create a new users.properties file

If you need to create a fresh users.properties file for your deployment:

  1. Shut down all nodes with the command: service svn-multisite-plus stop.
    If using SLES 12/systemd: systemctl stop wdmsp.target

    Platform dependent commands
    See here for more information on platform specific start and stop commands.
  2. Create an empty /opt/wandisco/svn-multisite-plus/replicator/properties/users.properties which is owned by the account that MSP is running as.

  3. Start the MSP service on that node.

  4. Use the resetSecurity.jar utility to create a new admin user.

  5. Restart svn-multisite-plus on the node. This adds the user to the users.properties file.

  6. Copy the newly created /opt/wandisco/svn-multisite-plus/replicator/properties/users.properties file to all other nodes.

  7. Restart the MSP services on all nodes.
    On most platforms use service svn-multisite-plus restart.
    If using SLES 12/systemd use systemctl restart wdmsp.target.

    Platform dependent commands
    See here for more information on platform specific start and stop commands.
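
Step 2 of the procedure above can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not part of the product: the function name create_empty_users_properties is hypothetical, and the example runs against a scratch directory instead of the real properties directory. Changing ownership to a different account normally requires root.

```shell
#!/bin/sh
# Create (or truncate to) an empty users.properties at the given path.
# $1 = target path
# $2 = optional account that MSP runs as; ownership is changed only if given
create_empty_users_properties() {
    target="$1"
    owner="${2:-}"
    : > "$target" || return 1            # create the file, or empty it if it exists
    if [ -n "$owner" ]; then
        chown "$owner" "$target" || return 1
    fi
}

# Example against a scratch directory rather than a live installation.
scratch=$(mktemp -d)
create_empty_users_properties "$scratch/users.properties"
ls -l "$scratch/users.properties"
```

On a real node the target would be the users.properties path given in step 2, and the second argument would be the account that MSP runs as.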

6. Reference Guide

This chapter describes MSP features and concepts in more depth.

6.1. UI tabs

6.1.1. Dashboard

The dashboard provides administrators with a service status for MSP and displays any urgent issues. Past issues will stay on the Dashboard for a maximum of 96 hours (or as defined by the Dashboard Item Age Threshold).

dashboard 1.9
System status and log messages
System Status

A single line status message that indicates whether replication is running successfully or not.

In addition to System Status, one or more of the following sections may appear depending on the status of your MSP eco-system.

Log Messages
Replication Groups

The status of each running replication group is listed. Click on the dropdown button to see which nodes are at fault.

Pending Tasks

Lists all tasks that are currently pending. It is possible to cancel tasks by clicking on the corresponding button.

Failed Tasks

Lists the replicator tasks that have failed, along with the task’s unique Id, which can be used to search the logs for more details.

Disconnected Nodes

Logs all nodes that have been disconnected, when they were disconnected and for how long. If the duration field is empty then the outage still exists. Outages that affected the node whose UI you are logged in to will not be listed.

6.1.2. Repositories

Click this tab to manage your replicated SVN Repositories: Add, Edit, Consistency Check, Repair, Sync Stop, Reset all Stats, Remove

repotable
Repositories tab
Repository table

The repository table lists all the repositories that you add to MSP.

Name

The Repository Name

Known issue: duplicate repository names allowed
It’s currently possible to add multiple repositories with the same name, although they need different paths. Ensure that you do not use the same name for multiple repositories. Doing so can easily cause confusion and will be prevented in future releases.
Path

The local path to the repository. This needs to be the same across all nodes

Replication Group

The replication group in which the repository is replicated

Youngest Rev

This is the latest revision number for the repository

Transactions

Lists any pending transactions. The transactions link to the last transactions played out for the repository:

transactions0
Click on a transaction box
transactions2
Transactions list revealed!
"-1 Pending" Transactions

The Transactions field may show the value "-1 pending". This represents "incomplete data" and can appear when the system is unable to confirm how many transactions are pending. If you see "-1 Pending" transactions anywhere, monitor the field to ensure that it clears. If it persists, contact WANdisco’s support team for assistance.

Last Modified

The date and time of the last modification to the repository

Global RO

Indicates if the repository is globally read-only

Stops any further commits from SVN users
The term Global Read-only doesn’t accurately reflect what happens at the repository-level. When a repository enters a Global Read-only state it will no longer accept any commits from SVN clients. However, proposals that are flying around within the state machine can still be written. It is this state that allows nodes to reach a synchronized stop.
Local RO

Indicates if the repository replica is in a local read-only state (local to this node). When a repository replica is in the "Local RO" state then clients connecting to the repository replica on this node will be prevented from making any changes but changes can continue to be made to repository replicas at other nodes (assuming sufficient redundancy enables continuity of quorum).

Status

Indicates whether the repository is replicating or has stopped. A stopped repository will be in a read-only state, either globally or locally

Under control
Remember that this table doesn’t automatically show all the repositories on the server, only those repositories that have been added to MSP for replication (non-replicated repositories will not show up). See Add Repository.
Filter Repositories

You can use this search box to filter the list of available repositories, useful if you’re running with a large number of repositories.

Repository Information

Click on the Repositories tab and then click on the repository name. The repository’s information screen will open.

concheck2 1.9
Repository Information
Consistency Check

The Consistency Check tool provides a quick method for confirming that the distributed copies (replicas) of a repository are all in an identical state - a requirement for replication. Clicking the button will trigger a series of checks, the results of which appear on this same page. One or more reload operations may be required, depending on the number of revisions being checked, the number of replicas and the latency between all of the nodes managing those replicas. For more information on its use, see Consistency check.

Sync Stop

Bring replication of the repository to a stop across all nodes.

Repair

Use the repair button to initiate a repository repair procedure.

Reload

Refresh the screen to pick up any changes.

Repository Information

Available information about the repository.

Size

The size of a repository is not cached; you need to trigger a calculation. The value returned is "the sum of the bytes of all of the files".
Size is not a complete measure of disk space. A repository’s size, in terms of its payload data, is not the same as the amount of disk space necessary to host the repository. The actual on-disk storage required will be larger than the Size displayed here.

Remove Repository

Use this tool to remove a selected repository from MSP’s control. The repository data will not be deleted but once removed from MSP, changes made to the repository locally will no longer be replicated to other nodes. See Removing Repositories.

Edit Repositories

It’s possible to make limited changes to a repository’s settings after it has been added.

Click on the Repositories tab and then click anywhere on the Repository’s bar, causing it to highlight in yellow. Click the Edit button on the repositories menu bar, which has now turned blue. The edit box will open.

editrepos 1.9
Click Edit
Add Repository

Click Add to add an existing repository to MSP or to create a new one. If the repository already exists, its integrity must be verified before you place it under the control of MSP. Each node in the Replication Group you add the repository to should have an identical copy in exactly the same directory path. If the repository doesn’t already exist, tick the Create New Repository box to create it at the same time as adding it. For more detail see Add Repositories.

addrepo2 1.9
Add Repository
Repo name

Choose a descriptive name. This doesn’t need to be the folder name. While it can be anything you like, a good naming convention is advised to prevent later confusion.

FS Path

The local file system path to the repository. This needs to be the same across all nodes.

Replication Group

The replication group in which the repository is replicated. It is the replication group that determines which nodes hold repository replicas, and what role each replica plays.

Global Read-only

Check box that lets you add a repository that will be globally read-only. In this state MSP continues to communicate system changes, such as repository roles and scheduling, however, no new repository changes will be accepted, either locally or through proposals that might come in from other nodes - which in most cases shouldn’t happen as by definition the repository should also be read-only at all other nodes.
Note: prior, "in flight" agreements will continue to play out until they are all accounted for.

  • You can think of the Global Read-only flag as a quick means of locking down a repository, so that no new commits will be accepted at any node.

Create New Repository

Tick this checkbox to tell MSP to create a brand new repository, which will be replicated to all nodes in the replication group that you place it in.

Add Repo

Click the Add Repo button when you have entered all the required fields for the repository that you are adding. You can cancel the addition of the repository by clicking on the circular cross icon that appears on the left-hand side of the entry fields.

Reset all stats
resetall 1.9
Reset all stats
Reset All Stats (on this node)

MSP captures basic repository stats for all the repositories placed under its control. The stats for selected repositories are displayed on the dashboard.

Click the Reset All Stats button to blank all the repository statistics on a node. The action is not replicated and the stats that are stored on the other nodes will not be affected.

Repair
emr1 1.9
Repair

The Repository Repair tool is used when a repository on one of your nodes has been corrupted or otherwise requires repair or replacement. When you select a repository to repair, the tool asks you to select a 'helper' node. The helper node briefly stops replicating because it is used to copy or rsync an up-to-date replica of the broken repository onto the current node.

Sync Stop

The Sync Stop tool lets you bring replication to a stop for a selected repository. The tool is required to ensure that when replication has stopped, all repository replicas remain in exactly the same state. This requirement is complicated within distributed systems, where proposals may be accepted on some nodes while still in flight to other nodes. See Performing a synchronized stop.

6.1.3. Replication Groups

Replication Groups are units of organization that we use to manage replication of specific repositories between a selected set of nodes. In order to replicate an SVN repository between a specific set of nodes you would need to combine that set of nodes into a Replication Group.

Example replication groups
Example 1

An organization has developers working in Chengdu and San Francisco who need to collaborate on projects stored in three SVN repositories, Repo0, Repo2 and Repo4. An administrator in the Chengdu office creates a replication group called ImportantGroup. The MSP nodes corresponding with each of the two sites are added to the group.

exampleRG1
Replication group example - 2 Nodes

The Chengdu office is the location of the largest development team, where most repository changes occur. For this reason the node is assigned the role of Tie-breaker. If there is disagreement between the nodes in the group over transaction ordering, NodeChengdu will carry the deciding vote.

The node in San Francisco hosts a standard active-voter node. Changes to the local repository are replicated to NodeChengdu, and changes made on the Chengdu node are replicated back to San Francisco.

Example 2

The organization also has developers in Sheffield and Belfast who collaborate on projects stored in two SVN repositories, Repo1 and Repo3. An administrator in Sheffield creates a replication group called NewGroup. The MSP nodes corresponding to the Sheffield and Belfast offices are added to the group along with NodeChengdu.
As there are an odd number of voter nodes in this Replication Group, a tiebreaker node is not needed.

exampleRG2
Replication group example - 3 Nodes

The Chengdu node is added to the group as a Voter (only) as it is a management node that plays no active part in development. This means that NodeChengdu takes part in the vote for transaction ordering, even though the payload of those transactions is not written to repository replicas stored at the Chengdu office. The purpose of NodeChengdu is simply to add resilience to the replication system: in the event of a short-term disruption to traffic from one of the other two nodes, agreement can still be reached and replication can continue.
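
The two examples suggest a rule of thumb: a group with an even number of voters (Example 1) needs a tie-breaker, while a group with an odd number (Example 2) does not. The shell sketch below encodes only that rule. The function name needs_tiebreaker is hypothetical, and the rule is inferred from the examples rather than taken from the product's own validation logic.

```shell
#!/bin/sh
# Print "yes" if a group with the given number of voters needs a
# tie-breaker (even count), otherwise "no" (odd count).
# $1 = number of voter nodes in the replication group
needs_tiebreaker() {
    if [ $(( $1 % 2 )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

echo "Example 1 (2 voters): $(needs_tiebreaker 2)"
echo "Example 2 (3 voters): $(needs_tiebreaker 3)"
```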

Cannot change voter nodes
Voter nodes must be added to a Replication Group during the creation of the group. They cannot be added later, and the role of a node cannot be changed to or from Voter (only).
In order to change a node role to or from Voter (only) you need to create a new replication group with the new roles for the nodes and then move the repositories from the old to the new replication group. This will require a repository repair process if a node has an Active role in the new replication group where it had been Voter (only).

The organization might choose to make the Chengdu node Passive instead. With NodeChengdu running a passive node, replicas of Repo1 and Repo3 would also be stored in Chengdu (Voter only nodes have no local repository). While Passive nodes cannot modify the repository, they can provide read-only access to their repositories to SVN clients.
Having Passive nodes in your system will affect, for example, Consistency Checking. The use of Passive nodes should therefore be verified with WANdisco support first.

Types of node

Another element controlled by replication groups is the role that each repository replica plays in the replication system. See more about Types of nodes.

Create a replication group

You can create a replication group providing that you have at least one node connected.

ui repgroup01
Create replication group

For more details, see How to create a replication group.

View

You can view and partly edit a replication group by clicking on the view button.

  1. Click the View button.

    msp view1 1.9
    View Replication Group
  2. The replication group’s screen will open showing the member nodes of the group.

    msp view2 1.9
    Replication Group details

    Each node is displayed as a color-coded circle. Click on the circle to see what other node types are available. Read more about node types.

    msp view3 1.9
    Change node type
  3. The Configuration screen provides access to each node’s type, along with a list of repositories and a link to the Configure Schedule screen.

    Add Nodes

    You can add additional nodes to a replication group. Click on the Add Nodes button to start the procedure, which is described in Adding a node to a replication group.

    Save Node Roles

    Use this button to save any changes that you make to the member nodes.

    Configure Schedule

    The Schedule screen lets you set the roles of nodes to change over time according to a schedule.

    Disable Schedule

    Stop any scheduled role changes that are not already in progress. This is normally done to prevent moving roles to nodes that are known to be down (e.g. during system maintenance).

    Reload

    Refresh the data on the page.

Why change a node’s role?

At the heart of WANdisco’s DConE2 replication technology is an agreement engine that ensures that SVN operations are performed in exactly the same order on each replica, on each node. Any node that has the role of voter becomes part of the agreement engine and, together with the other voters, determines the correct ordering. If there’s high latency between any voters this may adversely affect replication performance. Fortunately it isn’t a requirement that every node takes part in forming agreements. An Active-only node can still create proposals (i.e. initiate repository changes) but the agreement engine doesn’t need to wait for its vote. Read more about how replication works in the Replication Strategy Section.

Follow the Sun

To optimize replication performance it’s common for administrators to remove voter status from a node after their staff leave for the day - a practice commonly known as "Follow The Sun" where far-flung organizations transfer roles and privileges between locations so that those privileges are always held by nodes at actively staffed sites.

schedule03
Schedule
Role Schedule

The Role Schedule window shows all the nodes in the replication group, along with each node’s current role (denoted by colored circular buttons).

If you change any nodes you need to click the Save Schedule button. Any mistakes in node role combinations selected will be detected at this time, and if there are any then the save will fail.

Use the Clear Schedule button to blank out settings that you have changed, returning to the default schedule.

For how to setup a schedule, see How to configure a schedule.

6.1.4. Nodes

The Nodes tab provides information on the functions that manage repository data replication.

nodeinduct1 1.9
Nodes Tab
Connect to Node

The following information is needed to connect to a Node and can be found on the Settings tab:

Node ID

The UUID of the inductor node.

Node Location ID

The reference code that defines the inductor node’s location.

Node IP Address

The IP address of the inductor node server.

Node Port No

The DConE Port number, 6444 by default.

Sync Stop All

Brings the agreement engine on all nodes to a stop. This operation requires that all repositories are replicating/writable.

Sync Start All

Re-starts the agreement engine on all nodes. This button is only available if the agreement engines on all nodes are stopped.

Reload

Refreshes the information to pick up any changes that may have occurred since loading the screen.

Name

Name assigned to the node.

Connectivity Status

Displays the node’s status, for example connected, local or stopped.

Last Connectivity Change

Date and time of the most recent change.

Transactions

A clickable button showing any pending transactions.

Action

Displays actions available, for example start node or stop node.
Clicking Stop Node here takes the node offline and it will therefore no longer be able to process any changes. Once the node is stopped, Start Node will appear in the Action column.

6.1.5. Settings

The server’s internal settings are reported on the Settings tab, along with a number of important editable settings.

svnmsp settings 001
Settings tab
Administrator Settings
User Interface HTTP Ports

Change the ports that you want to use to access the User Interface. Enter valid port numbers and click Save.
Note - valid ports are ports that are not already in use by another application.
This is not a replicated operation and will therefore only change the UI ports on this node. For operational ease of use consider making certain that all nodes use the same ports.

Shutdown Replicator / Restart Replicator

If you want to shut down the replicator, click Shutdown Replicator. Note the message you receive:

This action will shutdown the Replicator process.
The Replicator can only be restarted from the command line and cannot be restarted through the User Interface.

If you shut down from this button, you cannot restart using the UI as the restart command only works if the replicator is currently running. However, you can stop and restart the replicator process by clicking Restart Replicator. You receive a message saying this takes a minimum of about 15 seconds (the larger the installation, the longer this time will be). Click Restart to confirm.

Restarting when already shutdown
If the replicator is not running then you can’t use the Restart Replicator button on the Admin UI. You will need to start it from the command line using the "Start" command. See Starting up.
Monitoring Data
svnmsp resourcemon01
Monitoring Data

The Resource Monitoring Data settings provide a basic tool for monitoring available disk storage for MSP’s resources.

Monitor Interval (mins)

If the disk space available to a monitored resource is less than the value you have configured, and the event is associated with a Severe severity, then the replicator will log the event and immediately exit. The polling period between sampling the available disk space defaults to 10 minutes and can be configured via a setting in the application.properties file:

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

monitor.period.min=10L

The value is in minutes. The check is only run through the UI; it is not handled directly by the replicator.
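
To check the interval a node is currently configured with, you could read the property directly from the file. The following is a minimal sketch: the helper name get_property is hypothetical, it handles only simple key=value lines, and it reads a scratch file so that it can be run anywhere. Treating the trailing L in the sample value as a Java long suffix that can be dropped is an assumption.

```shell
#!/bin/sh
# Print the value of a key from a Java-style properties file.
# $1 = properties file, $2 = key
# Handles plain key=value lines only (no escapes or line continuations).
get_property() {
    awk -F= -v key="$2" '$1 == key { print substr($0, length(key) + 2) }' "$1" | tail -n 1
}

# Scratch file holding the setting shown above. On a real node, point this at
# /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties
props=$(mktemp)
echo 'monitor.period.min=10L' > "$props"

raw=$(get_property "$props" monitor.period.min)
minutes=${raw%L}    # drop the trailing L (assumed to be a Java long suffix)
echo "Polling interval: $minutes minutes"
```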

Add New Resource Monitor

Enter the path to a resource that you wish to monitor, then click Add.

Resource Monitors

This section lists all resources currently being monitored. Click on Configure to change monitor settings, Delete to remove a monitor. The default monitor protects the replicator itself against running out of space and cannot be edited or deleted.
If you want to increase the minimum required disk space before the replicator shuts down this change is made in the application.properties file. See here for more information.

For more information about setting up monitors, read Setting up resource monitoring.

Notifications
notifications01
Notifications

The notifications system provides SVN administrators with the ability to create event-driven alert emails. Set up one or more gateways (mail servers), add destination emails to specify recipients, create email templates for the content of the alert emails, then set the rules for which event should trigger a specific email.

Gateways
svnmsp notifications01
Gateways

The Gateways section stores the details of those email relay servers that your organization uses for internal mail delivery. You can add any number of gateways. MSP will attempt to deliver notification emails using each gateway in their order on the list, #0, #1, #2 etc.

MSP will attempt delivery via the next gateway server when it has attempted delivery a number of times equal to the Tries number. It reattempts delivery after waiting a number of seconds equal to the Interval setting.

How MSP gives up on delivering to a gateway
Example: Gateway #0 is offline. With Tries set to 5 and Interval set to 600, MSP attempts delivery using the next gateway (#1) after 600s x 5 = 50 minutes. If you have more than one Gateway you would want to use a smaller interval. The last Gateway should be configured to try harder by using a larger number of Tries and/or a larger Interval.
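
The arithmetic in the example above can be checked with a short shell sketch. The function name failover_delay_seconds is hypothetical. It simply multiplies Tries by Interval, as the example does, and ignores the time each delivery attempt itself takes.

```shell
#!/bin/sh
# Approximate time before MSP moves on to the next gateway.
# $1 = Tries, $2 = Interval in seconds
failover_delay_seconds() {
    echo $(( $1 * $2 ))
}

delay=$(failover_delay_seconds 5 600)
echo "$delay seconds = $(( delay / 60 )) minutes"
```

Running the same calculation with candidate values for each gateway gives a quick view of how long a notification could be delayed before the next gateway on the list is tried.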
IP/Hostname of SMTP Server

Your email server’s address.

SMTP Server Port

The port assigned for SMTP traffic (Port 25 etc).

Encryption Type

Indicate your server’s encryption type - None, SSL (Secure Socket Layer) or TLS (Transport Layer Security). SSL is commonly used. For tips on setting up suitable keystore and truststore files see Setting up SSL Key pair.

Authentication Required

Indicate whether you need a username and password to connect to the server - requires either true or false.

User Name

If authentication is required, enter the authentication username here.

Password

If authentication is required, enter the authentication password here.

Sender Address

Provide an email address that your notifications will appear to come from. If you want to be able to receive replies from notifications you’ll need to make sure this is a valid and monitored address.

Number of Tries Before Failing

Set the number of attempts MSP makes in order to send out notifications for this Gateway.

Interval Between Tries (Seconds)

Set the time (in seconds) between MSP’s attempts to send notifications to this Gateway.

Destinations
svnmsp notifications02
Destinations

The Destinations panel is used to store email addresses for notification recipients. Add, edit or remove email addresses.

Templates
svnmsp notifications03
Templates

The templates panel is used to store email content. You create messaging to match those events for which you want to send user notifications.

Template Subject

Use this entry field to set the subject of the notification email. You’ll want this subject to be descriptive of the event for which the email will be triggered.

Body Text

Enter the actual message that you want to send for a particular situation/event. The body text can include keywords that will be expanded when the notification is sent. Which keywords are allowed depends on the type of notification event, see events and variables.

Rules
svnmsp notifications04
Rules

Use the Rules panel to set up your notification emails. Here you’ll associate email templates and destination emails with a particular system event. For example, you may create an email message to send to a particular group mailing list in the event that a repository goes into Read-only mode. Selecting descriptive subjects for your templates will help you to select the right templates here.

Event

Choose from the available list of trigger events.

Template

Choose from the available list of templates you have created.

Destination

Choose from the available list of email addresses.

Logging Setting

The Logging Setting lets you quickly add or modify Java loggers via the admin console, rather than making manual edits to the logger file:

<install-dir>/replicator/properties/logger.properties.
svnmsp logging01
Configure Logging Settings

Loggers are usually attached to packages. Here, the level for each package is available to modify or delete. The global level is used by default, so changes made here override the default values. Changes are applied instantly but in memory only, and are forgotten after a restart of the replicator (unless they are saved). For information about adding or changing loggers, see Logger Setting Tool.

logging2 1.9
Logging Settings Tool
System Data
systemdata 1.9
System Data

The System Data table provides a list of the editable and read-only settings.

Changing the editable settings causes a replicator restart:

Node Name

This is the human-readable form of the node’s ID. You can change the Node Name and reuse it after it has been removed from the replication network. You cannot have two nodes with the same name, but you can reuse a previously removed node name.

Location Latitude

The Node’s geographical location is no longer recorded during installation. Instead you enter the details here.

Location Longitude

Along with Latitude, this value places the node on the internal map and helps the application determine the local time for the node based on the timezone in which it falls.

Host Name/IP Address

The Hostname / IP address of the server hosting the node.

If SSL is configured then after an IP address change all nodes must be manually restarted.
Only change one node
The UI can only be used to change the IP address of a single node at once. If you need to change the address of multiple nodes please contact WANdisco support for assistance.
DConE port

The DConE port (strictly, DConE 2) handles agreement traffic between nodes. The default is 6444.

Dashboard Polling Interval (Minutes)

Sets how often the dashboard messaging is updated. The messaging is populated by Warnings and Errors that appear in the replicator log file. The default frequency is every 10 minutes.

Dashboard Item Age Threshold (Hours)

The number of hours a logged event is displayed on the Dashboard for. We recommend not setting this value lower than 96 hours so you don’t miss an important issue over a 3-day weekend.

Number of Revisions in Default Consistency Check

The number of revisions to compare between nodes.
If inducting a new node then the new node must have the same number of revisions configured or the induction will fail.

Scheduled Consistency Check Enabled?

Tick this box to schedule a consistency check for all repositories known to this node. The frequency is specified below.

Scheduled Consistency Check Frequency (Hours)

The default is 24 hours, so repositories are checked for consistency once a day. Allowed values are 1-999. For more information see Scheduled Consistency Checks.
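
If you script configuration changes, a value can be checked against the documented 1-999 range before it is entered. This is a minimal sketch; the function name is_valid_check_frequency is hypothetical and is not something MSP provides.

```shell
#!/bin/sh
# Succeed (exit status 0) if the argument is a whole number of hours
# within the documented range 1-999, otherwise fail.
is_valid_check_frequency() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # reject empty or non-numeric input
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 999 ]
}

for hours in 24 0 1000; do
    if is_valid_check_frequency "$hours"; then
        echo "$hours: within range"
    else
        echo "$hours: out of range"
    fi
done
```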

The read-only settings were either provided during setup or have since been applied:

Node ID

A unique string that is used to identify the node during an induction.

Location ID

A unique string that is used to identify the server during an induction.

Database Location

The full path to MSP’s database. By default this will be <install-dir>/replicator/database.

Delegate Port

The delegate port is used by SVN to delegate write operations to the WANdisco Replicator (via the contact.server.port, described below).

Jetty HTTP Port

This is the REST API port if SSL is not configured. For the REST API port if SSL is configured see the application.properties file.

Content Server Port

The port that will be used to transfer replicated content (repository changes). This is different from the port used by WANdisco’s DConE agreement engine.

Content Location

The directory in which replication data is stored (prior to it having been applied to the repository).

License information

Details concerning MSP product license - such as the date of expiry.

View REST API Documentation
This link takes you to your node’s local copy of the API documentation. This documentation is generated automatically and ties directly into your server’s local resources.
There is also a copy of the latest API documentation available in this user guide. Note that this has been lifted from an installation and will link to resources that will not be available on the website (resulting in dead links).

The Module Versions list shows the component parts of the MSP application. This is useful if you need to verify what version of a component you are using, such as when you contact WANdisco for support.

6.1.6. Security

The security tab is used to manage admin accounts, either entered manually into MSP or managed through an LDAP authority, or managed via a Kerberos Authority. On the tab is an entry form for adding administrative accounts, along with LDAP Settings for binding MSP to one or more LDAP services.

svnmsp ac2
Security
Add User

Enter the details of an additional administrator who will be able to login to the MSP Admin UI. See Adding additional users for more information.

Add Authority

Enter the details of one or more LDAP authorities for managing administrator access. See Adding LDAP authorities for more information.

Disable Managed Users

This feature lets you block access to the MSP Admin UI by non-LDAP users. This button does not become visible until you add an LDAP authority. See Disabling (Internally) Managed Users below.

Enable SSO

This button is only available if you have entered valid Kerberos settings. When enabled, it places MSP’s admin console into Single Sign-on mode, and access to the admin UI uses Kerberos instead of the username and password login form. In the enabled state the button changes to say Disable SSO.

Export Security Settings

The data entered into the Security tab can be backed up for later re-importing by clicking the Export Security Settings button. The data is stored in /opt/wandisco/svn-multisite-plus/replicator/export/security-export.xml which should be included in any backup procedures you are running. You will need access to the file from your desktop during a re-import.

Import Security Settings

Click the Import Security Settings button if you need to restore your Security settings, such as after a re-installation of MSP. The import will proceed providing that you can enter a file path to the security-export.xml file.

You will need to import the exported security settings to any newly installed node before attempting induction.

Reload

Click the Reload button to refresh the Admin UI screen. You will need to do this in order to view any changes that you make.

6.1.7. Admin Account Precedence

MSP uses the following order of precedence when checking for authentication of users:

  • First: Internally managed users (if they are enabled - see Disable Managed Users)

  • Second: Local LDAP authorities by order

  • Third: Global LDAP authorities by order

Explanation

This provider implementation tries to authenticate user credentials against either the list of internally managed users, or against any number of LDAP authorities, or both — depending on how the administrator has configured the application.

When authenticating against LDAP authorities, each one is tried in sequence until one either grants access or they all deny access. In the event that they all deny access, only the error from the last authority tried will be returned.
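The order of precedence described above can be sketched as a simple chain. This is a minimal illustration only, not MSP's implementation: the function name authenticate, the Authority type and the convention that an authority returns None on success or an error string on denial are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model: an authority is a callable that returns None when it
# grants access, or an error message (str) when it denies access.
Authority = Callable[[str, str], Optional[str]]


def authenticate(
    user: str,
    password: str,
    internal: Authority,
    local_ldap: List[Authority],
    global_ldap: List[Authority],
    internal_enabled: bool = True,
) -> Tuple[bool, Optional[str]]:
    """Try each authority in precedence order; stop at the first that grants access."""
    chain: List[Authority] = []
    if internal_enabled:
        chain.append(internal)      # First: internally managed users
    chain.extend(local_ldap)        # Second: local LDAP authorities, in order
    chain.extend(global_ldap)       # Third: global LDAP authorities, in order

    last_error: Optional[str] = None
    for authority in chain:
        error = authority(user, password)
        if error is None:
            return True, None
        last_error = error          # only the most recent error is kept
    return False, last_error
```

In this sketch, if every authority denies access, the caller sees only the error produced by the last authority in the chain, which mirrors the behavior described above.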

  • Admin account changes are replicated to all nodes.

  • Changes to admin accounts are handled as proposals that require agreement from a majority of the nodes in the replication network.

  • Admin account changes are reported into the replicator log.

Disable (Internally) Managed Users

Click the Disable Managed Users button if you want to control access to MSP exclusively through LDAP. Once clicked, any internally managed users will no longer be able to log into the Admin UI after they next log out. From that point only LDAP managed users will have access to the MSP Admin UI.

Re-enable Internally Managed users

If, after disabling Internally Managed Users, you need to enable them again (for example, because there is a problem with your LDAP authorities), log in to the node via a terminal window (with suitable permissions) and navigate to the following directory:

/opt/wandisco/svn-multisite-plus/replicator

and run the reset script:

java -jar wd_resetsecurity.jar

Any internally managed users who remain in MSP’s database will have their access restored.

Internally Managed Users

This table lists those admin users who have been entered through the Admin UI or imported using the Import Security Settings, along with the first admin account.

Admin Account #1

Note that the first admin account is the one set up during the installation of your first node. The credentials specified during this installation are stored to the users.properties file which is then used during the installation of all subsequent nodes.

Admin Account Mismatch
The users.properties file is used to ensure that exactly the same username/password is used on all nodes during installation. If there is a mismatch you will not be able to connect the nodes together (through the Induction process). Rather than cleaning up and reinstalling, you can fix this by manually syncing the password files.

Admin Account #1 can be removed, but the last admin account remaining on the system cannot be deleted. This ensures that an administrator cannot be completely locked out of the admin UI.

Kerberos

Support for the Kerberos protocol is included. When enabled, Kerberos handles authentication for access to the admin UI, where the administrator is automatically logged in if their browser can retrieve a valid Kerberos ticket from the operating system.

You can’t mix and match log-in types
When Kerberos SSO is enabled only users who are set up for Kerberos will be able to access the admin UI. The username and password login form will be disabled. If you ever need to disable Kerberos authentication, use the authentication reset script (wd_resetsecurity.jar), which returns your deployment to the default login type.
Kerberos settings entry form
Service Principal

A service principal name (SPN) is the name that a client uses to identify a specific instance of a service.
Example:

HTTP/host.example.com
Keytab File

The keytab is the encrypted file on disk where pairs of Kerberos principals and their keys are stored.
Example:

/tmp/krb5.keytab
Kerberos 5 Realm Configuration File

The location of the replicator host’s Kerberos 5 realm configuration file (krb5.conf).
Example:

/etc/krb5/krb5.conf
Never replicated, always configured 'per-node'

Kerberos configuration is not replicated around the replication network because each node in the network needs its own host-specific configuration, such as the host-specific service principal name noted in the settings above. This configuration is node-local only. On most systems the location of the host’s encrypted key table file will be something like:
/etc/krb5.keytab

The location of the host’s Kerberos 5 realm configuration may be something like:

/etc/krb5.conf
or
/etc/krb5/krb5.conf

LDAP Authorities
LDAP Authority entry forms
Node-Local LDAP Authorities

If chosen, then only the local node will use the LDAP authority for authentication.

Replicated LDAP Authorities

If replicated is chosen, all nodes in the replication network can use the LDAP authority for authentication.

Mixing Local and replicated authorities

Both kinds of authority are supported simultaneously, with the node-specific LDAP authorities taking precedence over replicated authorities in order to support the use-case where, for example, a particular node may prefer to use a geographically closer LDAP directory. Replicated LDAP authorities are replicated to other nodes and therefore are expected to be usable at all MSP nodes. Also, if multiple LDAP authorities of either type are configured then the order in which they are consulted is also configurable, using the +/- buttons at the end of each entry.

Order

LDAP authorities are listed in the order of execution that you set when defining each authority’s properties.

Url

The URL of the authority. The protocol prefix "ldap://" or "ldaps://" is required.

Bind User DN

Identify the LDAP admin user account that MSP will use to query the authority.

Search Base

This is the Base DN, that is the location of users that you wish to retrieve.

Search Filter

A query filter that will select users based on relevant LDAP attributes. For more information about query filter syntax, consult the documentation for your LDAP server.

Remove

Click to remove the authority from MSP.

Edit

Click to make changes to the authority’s settings.

The usual configuration options are supported for each configured LDAP authority: URL, search base and filter and bind user credentials.

Just enough permissions
The bind user’s password cannot be one-way encrypted using a hash function because it must be sent to the LDAP server in plain text. For this reason the bind user should only have enough privileges to search the directory for the user being authenticated. Anonymous binding is permitted for those LDAP servers that support anonymous binding.
LDAP Home or away

When adding an LDAP authority, the configuration can be selected to be either replicated or node-specific.

Replicated LDAP Authorities

If node-specific is chosen, then only the local node will use the LDAP authority for authentication. Both kinds of authority are supported simultaneously, with the node-specific LDAP authorities taking precedence over WAN-based authorities in order to support the use-case where a particular node may prefer to use a geographically closer LDAP directory, for example. Also, if multiple LDAP authorities of either type are configured then the order in which they are consulted is also configurable.


6.2. Architecture overview

The diagram below outlines the MSP architecture, in terms of how the application is split up and how those component parts communicate with each other and the outside world.

Product Architecture
Key points
  • Admin UI and Replicator are run in separate Java processes.

  • The Admin UI interacts with the application through the same API layer that is available for external interactions. This layer enforces separation of concerns and handles authentication and authorization of all user interactions.

  • The SVN server runs with a WANdisco version of the FSFS server libraries. Whilst read operations are passed to SVN’s regular FSFS, writes are delegated to the MSP replicator.

  • The DConE 2 Coordination protocol handles the agreement of transaction ordering between nodes via port 6444. The delivery of the actual replicated content (SVN commits etc) is handled by the Content Distribution layer on port 4321.

6.3. Install directory structure

MSP is installed to the following path by default:

/opt/wandisco/svn-multisite-plus/

It’s possible to install the files somewhere else on your server, although this guide will assume the above location when discussing the installation.

Inside the installation directory you’ll find the following files and directories:

[root     root    ]  ├── bin
[wandisco wandisco]  ├── config
[root     root    ]  ├── flume
[root     root    ]  ├── lib
[wandisco wandisco]  ├── local-ui
[wandisco wandisco]  │   └── ui-logs
[wandisco wandisco]  ├── logs
[root     root    ]  ├── replicator
[wandisco wandisco]  │   ├── content
[wandisco wandisco]  │   ├── content_delivery
[wandisco wandisco]  │   ├── database
[wandisco wandisco]  │   │   ├── application
[wandisco wandisco]  │   │   │   └── resettable.db
[wandisco wandisco]  │   │   ├── backup
[wandisco wandisco]  │   │   ├── DConE.application.db
[wandisco wandisco]  │   │   └── recovery
[wandisco wandisco]  │   │       ├── application.integration.db
[wandisco wandisco]  │   │       ├── DConE.system.db
[wandisco wandisco]  │   │       └── DConE.topology.db
[root     root    ]  │   ├── docs
[wandisco wandisco]  │   ├── export
[root     root    ]  │   ├── gfr
[root     root    ]  │   │   ├── bin
[root     root    ]  │   │   │   └── acp
[root     root    ]  │   │   ├── etc
[wandisco wandisco]  │   │   ├── lib
[wandisco wandisco]  │   │   ├── log
[wandisco wandisco]  │   │   ├── tmp
[wandisco wandisco]  │   │   └── var
[wandisco wandisco]  │   ├── hooks
[root     root    ]  │   ├── lib
[wandisco wandisco]  │   ├── logs
[wandisco wandisco]  │   │   ├── recovery-details
[wandisco wandisco]  │   │   ├── stats
[wandisco wandisco]  │   │   ├── tasks
[wandisco wandisco]  │   │   └── thread-dump
[wandisco wandisco]  │   ├── properties
[root     root    ]  │   └── properties.dist
[root     root    ]  ├── resources
[root     root    ]  │   └── svn
[wandisco wandisco]  ├── tmp
[root     root    ]  ├── ui
[wandisco wandisco]  └── var
[wandisco wandisco]      ├── backups
[wandisco wandisco]      └── watchdog

6.4. Properties files

The following files store application settings and constants that may need to be referenced during troubleshooting. However, you shouldn’t make any changes to these files without consulting WANdisco’s support team.

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

This file contains settings for the replicator and affects how MSP performs. View sample.

Temporary requirement
If you (probably under instruction from WANdisco’s support team) manually add either connectivity.check.interval or sideline.wait to the application.properties file then you must add an "L" (Long value) to the end of their values so they are converted correctly. View our sample application.properties file to see all the properties that are suffixed as "Long".
/opt/wandisco/svn-multisite-plus/replicator/properties/samples/logger.properties

This file handles properties that apply to how logging is handled. View sample.

/opt/wandisco/svn-multisite-plus/replicator/properties/samples/user.properties

This file contains the admin account details which will be required when installing second and subsequent nodes. View sample.

/opt/wandisco/svn-multisite-plus/local-ui/samples/ui.properties

This file contains settings concerning the graphical user interface, such as widget settings and timeout values. The UI Port number is also stored in this file. This copy is considered the de facto record of the value and supersedes the version stored in the main config file /opt/wandisco/svn-multisite-plus/config/main.conf. You can View a sample.
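As an illustration of the "Temporary requirement" note for application.properties above, the following minimal sketch scans properties text and reports which of the two named properties lack the "L" suffix. The function name missing_long_suffix and the sample values are hypothetical, and the parsing is deliberately simplified (plain key=value lines only).

```python
# The two properties named in the "Temporary requirement" note.
LONG_KEYS = ("connectivity.check.interval", "sideline.wait")


def missing_long_suffix(properties_text: str) -> list:
    """Return the keys from LONG_KEYS whose values do not end with 'L'."""
    problems = []
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments and anything that is not key=value.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in LONG_KEYS and not value.endswith("L"):
            problems.append(key)
    return problems


# Hypothetical sample values, for illustration only.
sample = "connectivity.check.interval=10000\nsideline.wait=5000L\n"
print(missing_long_suffix(sample))
```

A check like this only reads the text; any actual change to the properties files should still be made in consultation with WANdisco’s support team, as noted at the start of this section.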

6.5. Replication strategy

MSP provides a toolset for replicating SVN repository data in a manner that can maximize performance and efficiency whilst minimizing network and hardware resources requirements. The following examples provide you with a starting point for deciding on the best means to enable replication across your development nodes.

6.5.1. Replication Model

In contrast with earlier replication products, MSP is no longer based upon a network proxy that handles file replication between replicas. Now, replication is handled at the filesystem level, via FSFS.

MSP differs from earlier WANdisco replication products on a number of levels
Limitations of the old model
  • The replicator sits in front of Apache

  • All reads have to pass through the replicator

  • Requires some functionality to be added so that certain operations can be performed prior to scheduling a transaction for replication

  • Authentication against external data stores

  • Hooks firing

MSP differs from earlier versions - it replicates at the file system level
Per-Repository Replication

MSP replicates data on a per-repository basis. Each node can host a different set of replicated repositories, and these repositories are connected across nodes in replication groups. Each repository can only be a member of one replication group at any one time.

Per-Repository Replication

In this example Replication Group 1 contains two repositories which have replicas on all four nodes. In contrast, Replication Group 2 only contains one repository which has replicas on node 1 & node 2.

Dynamic membership evolution
MSP allows replication groups to change their membership

A repository can only replicate to the member nodes of a single replication group at any one time, although it is possible to move a repository between replication groups as required. This is done on-the-fly: nodes can be added or deleted without the need to pause all replication (with a synchronized stop).

MSP offers a great deal of flexibility in how repository data is replicated. Before you get started it’s a good idea to map out which repositories are needed at which locations.

WANdisco replication and compression

There are a number of WAN network management tools that offer performance benefits by using data compression. The following guide explains how data compression is already incorporated into WANdisco’s replication system, and what effect this built-in compression may have on various forms of secondary compression.

Network management tools may offer performance benefits by on-the-fly compression of network traffic, however it’s worth noting that WANdisco’s DConE replication protocol is already using compression for replicated data. Currently Zip compression is used before content is distributed using the Content Distribution component of DConE.

Traffic Management systems that provide WAN optimization or "WAN Acceleration" may not provide expected benefits as a result of WANdisco’s compression. The following list highlights where duplication or redundancy occurs.

Compression

Encoding data using more efficient storage techniques so that a given amount of data can be stored in a smaller file size.
WANdisco effect: As replicated data is already compressed, having a WAN accelerator appliance compress the data again is a waste of time. However, as long as the appliance can "fill the pipe", i.e. supply traffic at a faster rate than the network can consume it, it is not going to negatively impact data transfer.

Deduplication

Eliminating the transfer of redundant data by sending references instead of the actual data. By working at the byte level, benefits are achieved across IP applications. Data-deduplication offers the most benefit when there’s a lot of repetition in the data traffic.
WANdisco effect: Because the data is already compressed data-deduplication will not be effective. When using most compression algorithms there will be near zero duplicated blocks in the unit since duplicated blocks can be compressed. This means that WAN acceleration based on deduplication is going to find no duplicate blocks and therefore it will fail to accelerate the data transfer as well.

Latency optimization

Various refinements to the TCP implementation (such as window-size scaling, selective Acknowledgement etc).
WANdisco effect: DConE does not use TCP / network layer techniques. This form of optimization won’t have any impact on WANdisco Replication.

6.5.2. Creating resilient replication groups

Become familiar with the node roles
Make sure that you understand the WANdisco node types. See Guide to node types.

MSP can maintain SVN repository replication (and availability) even after the loss of nodes from a replication group. However, note these configuration rules:

Rule 1: Understand Paxos Node roles

The unique Active-Active replication technology used by MSP is an evolution of the Paxos algorithm. Different node roles are set by particular mechanics at play in the replication system:

  • Learners: Learners are the nodes that are involved in the actual replication of SVN repository data. When changes are requested to be made on a repository replica, that change is ordered by the WANdisco Paxos implementation and delivered to each learner in the agreed sequence. Learner Nodes are required for the actual storage and replication of repository data. You need a learner node at any location where SVN users are working or where you wish to store hot-backups of repositories

Types of Nodes that are Learners: Active

  • Acceptors: All changes being made on each repository in exactly the same order is a crucial requirement for maintaining synchronization. Acceptors are nodes that take part in the vote for the order in which proposals are played out. Acceptor Nodes are required for keeping replication going. You need enough Acceptors to ensure that agreement over proposal ordering can always be met, even accounting for possible node loss. For configurations where there is an even number of Acceptors it is possible that voting could become tied. For this reason it is required to make a voter node into a tiebreaker which has slightly more voting power so that it can outvote another single voter node.

Types of nodes that are Acceptors: Voter Only
Nodes that are both an Acceptor and Learner: Active Voter

  • Proposers: Proposers are nodes that can propose changes. They are also involved in conflict resolution as part of the voting process. Resolution comes from interactions between proposers and acceptors, and the losers are expected to re-propose.

Types of nodes that are Learners and Proposers: Active
Types of nodes that are Acceptors and Proposers: Active Voter

Rule 2: Replication groups should have a minimum membership of three Active/Passive nodes

Two-node replication groups are not fault tolerant. You should strive to replicate according to the following guideline:

  • The total number of nodes required in order to survive the failure of N nodes is 2N+1.

    So in order to survive the loss of a single node you need to have a minimum of 2x1+1= 3 nodes
    In order to keep on replicating after losing a second node you need 5 nodes.
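The 2N+1 guideline above can be checked with a few lines of arithmetic. The function names below are hypothetical and exist only for this illustration.

```python
def nodes_required(failures_to_survive: int) -> int:
    """Total nodes needed to survive the given number of node failures (2N+1)."""
    if failures_to_survive < 0:
        raise ValueError("failures_to_survive must not be negative")
    return 2 * failures_to_survive + 1


def failures_tolerated(total_nodes: int) -> int:
    """Inverse of the guideline: how many failures a group of this size can survive."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    return (total_nodes - 1) // 2


for n in (1, 2):
    print(f"survive {n} failure(s): {nodes_required(n)} nodes")
print(f"a 2-node group tolerates {failures_tolerated(2)} failures")
```

The inverse calculation shows why a two-node group is not fault tolerant: it cannot lose any node and still satisfy the guideline.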

Rule 3: Learner Population - resilience vs rightness
  • During the installation of each of your nodes you are asked to provide a Content Node Count number. This is the number of other active/passive nodes in the replication group that need to receive the content for a proposal before the proposal can be submitted for agreement.

    Setting this number to 1 ensures that at least one other node has the content before the change proposal is voted upon. This prevents a freeze in replication if the originating node goes down after the vote has been taken but before the data has been completely delivered to another node. Setting this number to more than 1 simply increases the resiliency of the system. The higher the number the slower the system will respond to requested changes, as the vote will not be taken until that number of learner nodes have the data for the proposal. For more details on this see the Content distribution policy section below.

Rule 4: 2 nodes per site provides resilience and performance benefits

Running with two nodes per site provides the following advantages.

  • It provides every site with a local hot-backup of the repository data.

  • It enables a site to load-balance repository access between the nodes, which can improve performance during times of heavy usage.

  • Providing the nodes are Voters, it increases the voter population and improves resilience for replication.

6.5.3. Content distribution policy

WANdisco’s replication protocol separates replication traffic into two streams, the coordination stream which handles agreement between voter nodes, and the content distribution stream through which SVN repository changes are passed to all other active/passive nodes that store repository replicas.

Content distribution

Content in this setting is the data required for an agreement to be delivered (i.e. the repository change to be made). The content distribution policy determines how many other nodes must have this content before the voting process is initiated. Without this, if an agreement is scheduled and the node(s) that have the content are lost before the other nodes can obtain that content then a disaster recovery will be required as the content is no longer available.

3 Paxos roles are used in content distribution:

  • Acceptor - votes on proposals

  • Proposer - creates proposals, resolves proposal conflicts (e.g. propose to make a change to a repository)

  • Learner - delivers/executes proposals (e.g. updates a repository) - node with a repository replica

Proposals turn into agreements when they have sufficient votes from acceptors. The agreements are in a fixed order relative to each other and are delivered at each node in that fixed order.

MSP lets you apply different policies to content distribution on a per-node basis.

Contact WANdisco support if you have any questions about the Content Distribution policy.

Changing content distribution policy

In MSP there are 2 settings that govern the behavior of the Content Distribution Policy. Their names and defaults are:

content.min.learners.required=true
content.learners.count=1

These settings are modified on a per-node basis via the application.properties file.

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

A restart of the application is required after any change is made to the application.properties file.

Reliable Policy

The "Reliable Policy" is the default setting.

content.push.policy=reliable

The content.learner.count represents the number of learner nodes excluding the originating node that must have the data before any repository change will be put to the vote.

  • If content.learner.count is larger than the number of non-originating replicas then it will automatically be reduced to the number of non-originating replicas.

If content.min.learners.required is true and there are an insufficient number of available replicas (based on the content.learner.count value) then the repository modification will fail without being put to a vote.

If content.min.learners.required is false then the value of content.learner.count will be adjusted to the number of non-originating available replicas. However, if the number of non-originating available replicas is zero and content.learner.count is non-zero then the repository modification will fail.

If content.learner.count=0 there is no requirement to deliver the content to any other node and disaster recovery could be needed as described above. We strongly suggest that you do not set this value to less than 1.

The number of simultaneous failures that can occur without requiring disaster recovery is strictly governed by the content.learner.count value. The default number is 1, so either the originating node OR the non-originating node that had the content delivered could be lost, but not both, before disaster recovery would be necessary. If both nodes in this case were down for maintenance then the other nodes would be stalled, at least for that repository family, until one of the 2 nodes that have the content is once again available.

Examples:

content.learner.count=5
content.min.learners.required=true

  • During an outage there are only 4 learner nodes available in the replication group - requests to modify the repository will fail because there aren’t enough available learner nodes to validate a content distribution policy check.

content.learner.count=5
content.min.learners.required=false

  • During an outage there are now only 4 learner nodes in the replication group - requests to modify the repository will be successful because MSP will automatically drop the required learner count to ensure that the required learner count doesn’t exceed the total number of learner nodes in the group.
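The rules and examples above can be modelled with a short function. This is a sketch of one reading of the text, not MSP's implementation: the function name required_learners and its parameters are hypothetical, and the assumption that the relaxed setting only ever lowers the count is an interpretation.

```python
from typing import Optional


def required_learners(
    configured_count: int,
    total_other_replicas: int,
    available_other_replicas: int,
    min_learners_required: bool,
) -> Optional[int]:
    """Return how many non-originating learners must hold the content before
    the vote, or None if the repository modification would fail."""
    if configured_count < 0:
        raise ValueError("configured_count must not be negative")

    # The count is capped at the number of non-originating replicas.
    needed = min(configured_count, total_other_replicas)
    if needed == 0:
        return 0  # no delivery requirement; disaster recovery risk applies

    if min_learners_required:
        # Strict: fail without a vote if too few replicas are available.
        return needed if available_other_replicas >= needed else None

    # Relaxed: fail only when no other replica is available at all.
    if available_other_replicas == 0:
        return None
    return min(needed, available_other_replicas)


# The two examples above, assuming the group has 6 other replicas in total.
print(required_learners(5, 6, 4, True))
print(required_learners(5, 6, 4, False))
```

With the strict setting the first call reports a failure (None); with the relaxed setting the second call lowers the requirement to the 4 learners that are still available.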

Steps for a policy change

Use this procedure to change between the above Content Distribution policies.

  1. Make a back up and then edit the /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties file (Read more about the properties files).

  2. Change the value of content.min.learners.required, make it "true" for reliability, "false" for speed (default is true).

  3. Save the file and perform a restart of the node.

content.thread.count

Content Distribution will attempt parallel file transfer if there are enough threads available. The number of threads is controlled by a configuration property content.thread.count which is written to the application.properties file.

content.thread.count=10

The default value is 10. This provides plenty of scope for parallel file transfer. However, as each thread consumes system overhead in the form of a file descriptor and some memory space, servers that are under regular heavy load should lower the count to something smaller. If your nodes are big enough, you have allocated sufficient Java heap and your networking links have sufficient bandwidth then you could increase the value to something larger. Please change this in small increments and then test whether performance improves or degrades.

Change the content maximum idle time
content.max.idle.time=2147483647

Set this in milliseconds. If a content connection, either push or pull, is idle for this time, it is considered unreliable and closed. A new connection is then opened when needed. TCP/IP itself does not time out the connections, however many network components (routers and firewalls) do. A connection that has been timed out in this way can then behave as a black hole, which blocks writes until they time out, typically after tens of seconds. This can lead to spikes in transmission (push or pull) times after a period of inactivity (or even during activity if the number of connections is large and under-utilized).

Set the content.max.idle.time to, for example, 10 minutes, or whatever expiration the network infrastructure uses. We recommend that you set this to whatever your routers/firewalls are set to. This can avoid delays. If you set the value too low, connections may be closed unnecessarily and cause delays on new connection creation (roughly 1 RTT, but more for ssh connections).

You should set, or lower, this value if you get a large number of "Failed to send" info logs from PrioritizingSender, occurring especially after some time of commit inactivity.
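Because the property is expressed in milliseconds, it is easy to be off by a factor of 1000. The sketch below converts a timeout in minutes into a property line and shows how long the value 2147483647 shown above actually is. The function names are hypothetical, and the 10-minute figure is just the example used in the text.

```python
from datetime import timedelta


def idle_time_ms(minutes: int) -> int:
    """Convert an idle timeout in minutes to milliseconds."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return minutes * 60 * 1000


def idle_time_property(minutes: int) -> str:
    """Build the application.properties line for the given timeout."""
    return f"content.max.idle.time={idle_time_ms(minutes)}"


print(idle_time_property(10))

# 2147483647 ms is a little under 25 days, so connections are
# effectively never closed for being idle at that value.
print(timedelta(milliseconds=2147483647))
```

Match the value to whatever idle expiry your routers and firewalls actually use, as recommended above.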

Set the memory chunk size
content.in.memory.chunk.size=16K

For file transfer, compression or zip, a chunk is read from the physical disk. This property defines the chunk size in bytes. The default is 16K. Note: If you make the size bigger, then you need a bigger heap space.

6.5.4. Replication lag

There are some time-sensitive activities where you need to work around replication lag. For example:

  1. You have a 2-node replication group, NodeA and NodeB, and Repository, Repo01, is replicated between them.

  2. A commit to NodeA puts Repo01 at Revision N. The proposal for this commit is agreed but NodeB is still waiting for the changes to arrive so lags slightly behind NodeA at revision N-1.

  3. A user on NodeB does an svn cp http://nodeB/repo01/trunk http://nodeB/repo01/tags/TAG_X.

  4. This tag does not include changes that occurred in the latest revision. WANdisco’s replication technology ensures that all nodes are in the same state in the short to medium term. However, at any moment changes may be in transit. A larger volume of traffic and less available network capacity increase the time that changes spend in transit.

This lag is unavoidable in a real-world application and all replicas should soon be back in sync. It is important to understand the effects of this lag and work around it whenever possible. For instance, in the example above, you need to validate that a specific revision is sufficient for the purposes of that copy. Then when using the svn cp command specify that revision in the from side, e.g. svn cp http://nodeB/repo01/trunk@112358 http://nodeB/repo01/tags/TAG_X.
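If tagging is scripted, the revision can be pinned when the command is built. The sketch below only assembles the argument list for the svn cp example above; it does not run svn. The function name pinned_copy_command is hypothetical, and the -m log message is an addition made here because a URL-to-URL copy commits immediately and needs a log message.

```python
def pinned_copy_command(source_url: str, target_url: str,
                        revision: int, message: str) -> list:
    """Build an 'svn cp' argument list that copies from an explicit revision."""
    if revision < 0:
        raise ValueError("revision must not be negative")
    return [
        "svn", "cp",
        f"{source_url}@{revision}",  # pin the source to the validated revision
        target_url,
        "-m", message,
    ]


command = pinned_copy_command(
    "http://nodeB/repo01/trunk",
    "http://nodeB/repo01/tags/TAG_X",
    112358,
    "Create TAG_X from trunk at r112358",
)
print(" ".join(command[:4]))
```

The resulting list could be passed to subprocess.run once you have confirmed that the chosen revision has arrived at the node you are working against.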

6.6. Guide to node types

Each replication group consists of a number of nodes and a selection of repositories that will be replicated.

The different node types are:

Active

An Active node has users who are actively committing to SVN repositories, which results in the generation of proposals that are replicated to the other nodes. However, it plays no part in getting agreement on the ordering of transactions.
Active nodes support the use of the Consistency Checker tool.

Active Voter

An Active Voter is an Active node that also votes on the order in which transactions are played out. In a replication group with a single Active Voter, it alone decides on ordering. If there’s an even number of Active Voters, a Tiebreaker will have to be specified.
Active nodes support the use of the Consistency Checker tool.

Passive

A node on which repositories receive updates from other nodes, but doesn’t permit any changes to its replicas from SVN clients - effectively making its repositories read-only. Passive nodes are ideal for use in providing hot-backup.
Passive nodes do not support use of the Consistency Checker tool.

Passive Voter

A passive node that also takes part in the vote for transaction ordering agreement.

Use for:

  • Dedicated Continuous Integration servers that do not update repositories

  • Sharing code with partners or sites that won’t be allowed to commit changes back

  • In addition, these nodes could help with HA as they add another voter to a site.

  • Passive nodes do not support use of the Consistency Checker tool.

Voter (only)

A Voter-only node doesn’t store any repository data. Its only purpose is to accept transactions and cast a vote on transaction ordering. Voter-only nodes add resilience to a replication group as they increase the likelihood that enough nodes are available to reach agreement on ordering.

Voter-only nodes can only be added during Replication Group creation. Nodes within an existing Replication Group cannot be changed to a Voter-only node, nor can nodes be added as Voter-only.

The Voter-only node’s lack of replication payload means that it can be disabled from a replication group, without being removed.

disablenode 1.9

A disabled node can be re-enabled without the need to interrupt the replication group.

Tiebreaker
node tb

If there is an even number of voters in the Replication Group, the Tiebreaker gets the casting vote. The Tiebreaker can be applied to any type of voter: Active Voter, Passive Voter or Voter. The Tiebreaker is only available for a replication group that has an even number of voter nodes. If a replication group that has a tiebreaker node subsequently changes so that it has an odd number of voter nodes, either by gaining or losing a node, then its tiebreaker node automatically loses the tiebreaker designation and gets the same voting power as any other voter node.
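
As a rough illustration of the even/odd rule, the following shell sketch reports whether a Tiebreaker would apply for a given number of voters. The function name tiebreaker_applies is hypothetical and is not an MSP command; it only encodes the rule described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of MSP): encodes the rule that a Tiebreaker
# only applies when a replication group has an even number of voter nodes.
tiebreaker_applies() {
    voters=$1
    # A group with no voters has nothing to break a tie between.
    [ "$voters" -gt 0 ] && [ $((voters % 2)) -eq 0 ]
}

for n in 2 3 4; do
    if tiebreaker_applies "$n"; then
        echo "$n voters: tiebreaker applies"
    else
        echo "$n voters: no tiebreaker"
    fi
done
```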

Helper
node helper

When adding a new node to an existing replication group, you select an existing node from which you manually copy or rsync the applicable repository data. This existing node enters 'helper' mode, in which the relevant repositories are read-only until they have been synced with the new node. By relevant we mean the repositories that are replicated in the replication group to which the new node is being added.

New
node new

When a node is added to an existing replication group it enters an 'on-hold' state until repository data has been copied across from a designated helper node. Until the process of adding the repository data is complete, New nodes will be read-only. Should you leave the Add a Node process before it has completed you will need to manually remove the read-only state from the repository.

7. Appendix

7.1. Setting up SSL Key pair

MSP supports the use of Secure Socket Layer encryption (SSL) for securing network traffic. Currently you need to run through the setup during the initial installation.
If you plan to use SSL you need to run through the following steps before starting the MSP installation.

Using stronger and faster encryption

Java’s default SSL implementation is intentionally weak to avoid various import/export regulations associated with stronger forms of encryption. However, stronger algorithms are available to install, placing the legal responsibility for compliance with local regulation on the user. See Oracle’s information on the Import limits of Cryptographic Algorithms for JDK7 and JDK8.

Use self signed certificates in test environments
In production environments, certificates purchased from commercial Certificate Authorities are normally required; in testing environments, however, you can use self-signed certificates. For more information see the Knowledgebase article How to create self signed certificates and use them in test environments.

If you need stronger algorithms, e.g. AES which supports 256-bit keys, then you can download Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files that can be installed with your JDK/JRE. These are available for download from the Oracle website.

  1. Create a new directory in which to store your key files. This directory can be anywhere, although in this example we store them in the svn-multisite-plus/replicator file structure: open a terminal and navigate to <INSTALL_DIR>/svn-multisite-plus/replicator/config.

  2. From within the /config folder make a new directory called ssl:

     [User@Fed11-2 config]$ mkdir ssl
  3. Go into the new directory:

    cd ssl
  4. Copy your private key into the directory. If you don’t have keys set up, you can use Java’s keytool utility, using the command:

    keytool -genkey -keyalg RSA -keystore wandisco.ks -alias server -validity 3650 -storepass <YOUR PASSWORD>

    Knowledgebase
    Read more about the Java keystore generation tool in the KB article Using Java Keytool to manage keystores.

    -genkey

    Switch for generating a key pair (a public key and associated private key). Wraps the public key into an X.509 v1 self-signed certificate, which is stored as a single-element certificate chain. This certificate chain and the private key are stored in a new keystore entry identified by alias.

    -keyalg RSA

    The key algorithm, in this case RSA is specified.

    -keystore wandisco.ks

    This is the file name for your keystore file, which will be stored in the current directory. You can choose any name but use it consistently.

    -alias server

    Assigns an alias "server" to the key pair. Aliases are case-insensitive.

    -validity 3650

    Makes the certificate valid for 3650 days (about 10 years). The default is 90 days.

    -storepass <YOUR PASSWORD>

    This provides the keystore with a password.

    Note: If no password is specified on the command, you are prompted for it. Your entry is not masked so you, and anyone else looking at your screen, can see what you type.

    Most commands that interrogate or change the keystore will need to use the store password. Some commands may need to use the private key password. Passwords can be specified on the command line (using the -storepass and -keypass options).
    However, do not specify a password on a command line or in a script unless it is for testing purposes, or you are on a secure system.

    The utility prompts you for the following information:

     What is your first and last name?  [Unknown]:
     What is the name of your organizational unit?  [Unknown]:
     What is the name of your organization?  [Unknown]:
     What is the name of your City or Locality?  [Unknown]:
     What is the name of your State or Province?  [Unknown]:
     What is the two-letter country code for this unit?  [Unknown]:
     Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?  [no]:  yes
    
     Enter key password for <server>
     (RETURN if same as keystore password):
  5. With the keystore now in place, the setup picks the file up if you provide the relevant details during the installation process:

    sslsettings01
    SSL Set up

    Changes to these values require a restart. If any value is invalid, the replicator restarts but no DConE traffic flows.
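
The keystore step above can also be scripted. The sketch below is one way to do it, assuming keytool is on the PATH. The distinguished name values and the KEYSTORE_PASS environment variable are placeholders of our own, not values that MSP requires. The -dname option answers the identity prompts in advance, and the :env modifier reads the password from the environment so that it doesn’t appear on the command line.

```shell
#!/bin/sh
# Sketch only: the DN fields and KEYSTORE_PASS are placeholders.

# Build the distinguished name that keytool would otherwise prompt for.
build_dname() {
    # $1 = host name (CN), $2 = organization (O), $3 = two-letter country code (C)
    printf 'CN=%s, O=%s, C=%s' "$1" "$2" "$3"
}

# Generate the key pair without prompts, then list the entry to confirm it.
# -genkeypair is the current name for -genkey; the key password is set to the
# same value as the keystore password.
make_keystore() {
    # $1 = keystore file, $2 = distinguished name
    keytool -genkeypair -keyalg RSA -keystore "$1" -alias server \
        -validity 3650 -dname "$2" \
        -storepass:env KEYSTORE_PASS -keypass:env KEYSTORE_PASS &&
    keytool -list -v -keystore "$1" -alias server -storepass:env KEYSTORE_PASS
}

# Example call (commented out so that loading this file changes nothing):
# export KEYSTORE_PASS='<YOUR PASSWORD>'
# make_keystore wandisco.ks "$(build_dname node1.example.com Example GB)"
```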

7.1.2. Setting the server key

In the keystore, the server certificate is associated with a key. By default, we look for a key named server to validate the certificate. If you use a key for the server with a different name, enter this in the SSL settings.

7.1.3. Enabling SSL post-installation

Once the keystore is in place, SSL can be enabled post-installation.

To do this you need to edit the application.properties file:

/opt/wandisco/svn-multisite-plus/replicator/properties/application.properties

ssl.debug=true
ssl.enabled=true
ssl.keystore=/opt/wandisco/svn-multisite-plus/ssl/keystore.jks
ssl.keystore.password=
ssl.key.alias=
ssl.key.password=
ssl.truststore=/opt/wandisco/svn-multisite-plus/ssl/cacerts.jks
ssl.truststore.password=wandisco

ssl.enabled

Switch for enabling SSL. Value: true

ssl.keystore

The absolute path to the keystore.

ssl.keystore.password

The password for the keystore - this password must be encrypted. See Encrypting passwords.

The ssl.keystore.password and the ssl.key.password must be identical. This is a Java requirement.

ssl.truststore

The absolute path to the truststore. This may be the same as the keystore.

ssl.truststore.password

The password for the truststore - this password must be encrypted (see Encrypting passwords). If the same file is being used for the keystore and the truststore then the password must be the same for both.
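
Before restarting, it can help to confirm that the SSL properties are filled in. The following is a minimal sketch rather than a supported tool: get_prop and check_ssl_props are hypothetical helper names, and the check only covers points stated above (SSL enabled, an absolute keystore path, and non-empty keystore and key passwords).

```shell
#!/bin/sh
# Hypothetical helpers for inspecting application.properties; not shipped with MSP.

# Print the value of key $2 from properties file $1 (the last assignment wins).
get_prop() {
    pattern=$(printf '%s' "$2" | sed 's/\./\\./g')   # match the dots literally
    sed -n "s/^${pattern}=//p" "$1" | tail -n 1
}

# Return 0 if the SSL settings in file $1 pass the basic checks.
check_ssl_props() {
    [ "$(get_prop "$1" ssl.enabled)" = "true" ] || { echo "ssl.enabled is not true"; return 1; }
    case "$(get_prop "$1" ssl.keystore)" in
        /*) ;;
        *) echo "ssl.keystore is not an absolute path"; return 1 ;;
    esac
    for name in ssl.keystore.password ssl.key.password; do
        [ -n "$(get_prop "$1" "$name")" ] || { echo "$name is empty"; return 1; }
    done
    echo "SSL properties look consistent"
}

# Example (path taken from the listing above):
# check_ssl_props /opt/wandisco/svn-multisite-plus/replicator/properties/application.properties
```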

Repeat for Flume files
You also need to repeat this process to update passwords in the Flume files. For more details on this see the KB article on How to upgrade the ACP sender delivered with ACP1.9.0 and how to set up SSL.

Encrypting passwords

When updating passwords in the application.properties file or the acp_sender.conf file, the passwords must be entered in encrypted form, not as clear text.

We’ve provided a tool to handle password encryption:

wd_cryptPassword.jar

Use the tool as follows:

cd <product-installation-directory>
java -jar wd_cryptPassword.jar <password-to-encrypt>

A restart is needed once these changes have been made.

7.1.4. SSL troubleshooting

Full SSL debug logging is required to diagnose problems.

To do this use the Logger Setting Tool on the Settings tab. There are 2 options:

  • Set the Global Logger Setting to Debug - this will be reset to Info following a replicator restart unless you click Save All Settings to File.

  • Add a new logger setting called javax.net and set to Debug.

7.2. Hook Scripts

Hooks are scripts that are triggered by specific repository events, such as the receipt of an update, or an update having been accepted into the repository. As such they’re very useful for SVN administrators who want to have more control over their repository environment.

For example, with the use of a post-commit hook it is possible to send an email to announce that a new revision has been created.
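
A minimal sketch of such a hook is shown below. Subversion passes the repository path and the new revision number to post-commit as its first two arguments. The recipient address and the use of the mail command are assumptions for illustration, so substitute whatever your site uses.

```shell
#!/bin/sh
# Sketch of a post-commit hook. Subversion calls it as: post-commit REPOS REV [TXN-NAME]
# RECIPIENT and the 'mail' command are assumptions; adjust for your site.
RECIPIENT="dev-team@example.com"

# Build the subject line from the repository path and revision number.
commit_subject() {
    printf '[%s] r%s committed' "$(basename "$1")" "$2"
}

# Mail the author, date, log message and changed paths for revision $2 of repository $1.
announce_commit() {
    {
        svnlook info "$1" -r "$2"
        svnlook changed "$1" -r "$2"
    } | mail -s "$(commit_subject "$1" "$2")" "$RECIPIENT"
}

# Only act when Subversion supplies both arguments.
if [ $# -ge 2 ]; then
    announce_commit "$1" "$2"
fi
```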

Tip
Generally we advise that hooks should be set up the same on all nodes, although this is not a requirement for replication and there are some situations where you may wish to be selective about where hooks fire.

Hooks need to be executable by the appropriate system account

Any hook that you intend to fire on a particular node needs to be executable by that account, e.g.

chmod +x post-commit

Types of hooks

Hook / How to integrate with WANdisco

start-commit

Standard Subversion implementation.

pre-commit

Standard Subversion implementation.

post-commit

Standard Subversion implementation. See the section on Replicated Post-commit Hooks below for cases where post-commit hooks need to fire on multiple nodes, rather than just on the initiating node.

pre-revprop-change

Standard Subversion implementation.

post-revprop-change

Standard Subversion implementation.

post-lock

Standard Subversion implementation.

pre-unlock

Standard Subversion implementation.

post-unlock

Standard Subversion implementation.

Running Hooks with SVN MultiSite Plus

Deploying MSP should have minimal impact on how hook scripts run on a deployment.

It is not a requirement to have absolute uniformity between nodes; however, it is best practice to avoid situations where results are nondeterministic. The hooks directory contents should therefore be evaluated carefully, taking into account the intended policies.

Pre-commit hooks

WANdisco’s modified version of the FSFS libraries intercepts commits after any pre-commit hooks have run. This means that the pre-commit hooks run on the initiating node (on the server, Apache, svnserve, etc.) rather than in the replicator. If a pre-commit hook fails, then the server returns an error to the client before the FSFSWD intercept call. As a result, the replicator is never involved with failed pre-commit hooks (with the possible exception of protorev/abort notifications). So, if a commit (on the originating node) is delegated for replication, any related pre-commit hook will already have succeeded.

Post-commit Hooks

The replicator completes the commit on the originating node by invoking a Java Native Interface (JNI) function, a low-level function that doesn’t run any hooks. When the replicator returns the commit status to the originating repository FSFSWD, a successful commit causes the post-commit hook to run on the server.
The net effect is that pre- and post- hooks run in the server on the originating repository and do not run at all for the replicated repositories, although the replicator can invoke the hooks for the replicated repositories if required.

Replicated Post-commit Hooks

Hooks that start with the prefix repl- are recognized and picked up by WANdisco’s replicator.

There are many scenarios in which a post-commit hook must run on other nodes, not just the node on which it is initially triggered; for example, when running continuous build servers that are triggered by post-commit hooks.

When a change is replicated from a remote node these hooks are triggered on every node in which the script exists, except the originating node. You can exclude nodes by not including a repl- version of the hook.

The following replicated hook names are currently supported:

  • repl-post-commit

  • repl-post-revprop-change

  • repl-post-lock

  • repl-post-unlock

Hook scripts that are replicated are run in the following temp directory which will be created on each applicable node:

/opt/wandisco/svn-multisite-plus/replicator/hooks/tmp

The usual requirements for running hook scripts still apply: the hook must be executable by the system account that MSP runs as.
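
For example, to have an existing post-commit script also fire on nodes that receive the change through replication, one approach is to place a copy under the repl- name. This sketch assumes that the repl- scripts live in the repository’s hooks directory alongside the standard hooks, which you should confirm for your deployment; install_repl_hook is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: copy an existing hook to its repl- counterpart and make it executable.
# Assumes repl- hooks sit in the same hooks directory as the standard hooks.
install_repl_hook() {
    hooks_dir=$1
    hook=$2
    # Only the hook names listed above have a replicated version.
    case "$hook" in
        post-commit|post-revprop-change|post-lock|post-unlock) ;;
        *) echo "no replicated version of $hook is supported" >&2; return 1 ;;
    esac
    cp "$hooks_dir/$hook" "$hooks_dir/repl-$hook" &&
    chmod +x "$hooks_dir/repl-$hook"
}

# Example (repository path is a placeholder):
# install_repl_hook /path/to/repo01/hooks post-commit
```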

Limitations
Replication of post-commit hooks is straightforward. However, other post-hooks, such as post-revprop-change, may carry arguments, such as the username, to which replicated hook scripts won’t have access (the replicator works below the authn layer). In situations where User is needed, we substitute the value Unknown to ensure that the hook doesn’t error.

7.3. Guide to Kerberos

7.3.1. Introduction

Kerberos is a network authentication system defined by RFC 4120. Further developments added negotiation capabilities (RFC 4537 and RFC 5021) and a new interface method, GSSAPI (Generic Security Services Application Program Interface), which allows suitably configured applications to make calls to the Kerberos service.

Number 5
The version of Kerberos supported by MSP is Kerberos 5 (krb5). Earlier versions, up to Kerberos 4, are significantly different from version 5 and are no longer under development. Krb5 is the leading implementation of Kerberos and is used as part of MIT Kerberos (Linux) and MS Active Directory (Windows).

Kerberos is now widely used throughout the world of enterprise-level LAN and WAN networking and, since Windows 2000, it has been the core technology within Microsoft’s own Single Sign-on authentication technology (don’t let that put you off).

Definitions
Authentication

The process used to verify that data or information asserting that it originates from a source can only have come from that source.
This implementation of Kerberos covers the authentication requirement.

Authorization

When an account has been authenticated it may or may not be authorized to access system or network resources such as files, applications, the ability to send email, etc. The authentication process typically provides access to a set of records in a security database that will contain specific access information and/or additional access information based on the account’s membership of one or more groups.
This implementation of Kerberos SSO is intended only as a replacement for password checks. In a future release Kerberos will be married to MSP’s internal or LDAP-driven admin user list.

Credentials

Any kind of password/key or security token. Your credentials are the objects that are accepted as proof of identity. Since you should be the only one who knows or has access to your credentials, presenting them to a system or network and having them match the credentials that were securely recorded on an earlier date proves that you are who you say you are. As noted above, once authenticated your account may still need to be authorized to be able to actually access specific resources.

What Kerberos brings to the table
  • Kerberos is generally distrustful of any underlying network security, although it does need to trust its own network elements - chiefly the parts of the Kerberos Key Distribution Center (we’ll refer to it as the KDC from this point).

  • As a result of its distrust, Kerberos never sends credentials across the network. It assumes that someone is packet-sniffing with the aim of stealing credentials. It therefore ensures that credentials are stored only in a single secure location (the Kerberos Key Distribution Center). So credentials are never stored on the user’s host. Once the initial authentication exchange takes place the password should be destroyed by that host.

  • Application hosts/servers must be able to prove their identity to anyone requesting such proof.

  • All communication between authenticated accounts (clients) and application services must be capable of being encrypted. Various bulk cipher algorithms (all-symmetric) are supported and may be negotiated.

Important Kerberos Terms
Principal

This is the string that fully identifies an account within the Kerberos service. The Principal can be the name of a service that runs on a host (a Service-Principal) or of a user (a User-Principal), and forms an index to the information stored about the entity in the Key Distribution Center (KDC). The format of the Principal differs for users and services.
Form: HTTP/node1.example.com

Realm

Those users, services and application servers that are covered by a particular Key Distribution Center (KDC). For a user to login to a realm the realm’s authentication server must have knowledge of the user’s credentials (and other information) which is maintained in some form of secure database. In Microsoft’s implementation this would be called a "Domain". Realms may trust other realms (in this case the peer realms will have cross-authenticated).
Form: <name>@REALM (case sensitive) e.g. BECKY@REALM (by convention it’s recommended that these be stated in upper case)

Ticket

This is a data structure with content that is known only to its issuer and any party or parties to which the ticket applies. Intermediate hosts (clients, etc.) treat the tickets as generic lumps of data and simply pass them on to their destination. Kerberos uses two types of ticket: Ticket Granting Tickets (TGT), which prove a successful authentication, and Service Tickets (ST), which are issued by a Ticket Granting Service (TGS) and enable the user to access a desired Application Service.

Configure browsers for Kerberos authentication

Use the following procedures to ensure that your browser will support Kerberos authentication:

Chrome

Start Chrome with the following switch:

google-chrome --auth-server-whitelist="*host.com"

Firefox
  • Start Firefox. In the Address line, enter "about:config"

  • Navigate to the property network.negotiate-auth.delegation-uris, double-click it and enter the Kerberos domain.

  • network.negotiate-auth.trusted-uris is updated in the same way.

network.negotiate-auth.trusted-uris - Sites that are permitted to engage in SPNEGO authentication with the browser.
network.negotiate-auth.delegation-uris - Sites for which the browser may delegate user authorization to the server.

Fall back

In the event that the Kerberos system fails, MSP will fall back to basic authentication using a manually entered username and password.

Known issue after a fall back

After falling back to basic authentication the system will automatically attempt to log in using Kerberos, generating an error message:

Error logging in: Unable to validate SSO/Kerberos ticket. Try logging in.

Ignore this message; you can now log in with suitable admin credentials.

Kerber-what?
Kerberos is a riff on Cerberus the 3-headed dog from ancient Greek mythology who guarded the entrance to the underworld. Rest assured that unlike its semi-namesake, Kerberos does not contain any exploits that allow entry through the use of drugged honeycakes.

Kerberos API Resources

The API now includes a number of Kerberos related end-points. You can review these on your node’s local copy of the API documentation. For convenience a copy of this document is available here: KerberosConfigResources.

Enable/Disable SSO via API

While there’s a UI toggle for enabling or disabling Kerberos SSO, you can also manage this via API calls:

Enable SSO
curl -X POST -H "Content-Type:application/x-www-form-urlencoded" -d "enable=true" 'http://192.168.56.190:8082/api/security/kerberos/enableSso' -u admin:pass

or

curl -X POST -H "Content-Type:application/x-www-form-urlencoded" 'http://192.168.56.190:8082/api/security/kerberos/enableSso?enable=true' -u admin:pass
Log Report

The following message will appear in the log:

[Single-sign-on with Kerberos was enabled by admin]
Disable SSO
curl -X POST -H "Content-Type:application/x-www-form-urlencoded" -d "enable=false" 'http://192.168.56.190:8082/api/security/kerberos/enableSso' -u admin:pass

or

curl -X POST -H "Content-Type:application/x-www-form-urlencoded" 'http://192.168.56.190:8082/api/security/kerberos/enableSso?enable=false' -u admin:pass

Log Report

The following message will appear in the log:

[Single-sign-on with Kerberos was disabled by admin]
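
If you toggle SSO from scripts, the two calls can be wrapped in one function. This is a sketch only: sso_url and set_sso are hypothetical names, and the host, port and credentials in the example are the values used above.

```shell
#!/bin/sh
# Hypothetical wrappers around the enableSso end-point shown above.

# Build the end-point URL for node address $1 (host:port).
sso_url() {
    printf 'http://%s/api/security/kerberos/enableSso' "$1"
}

# Enable ($2 = true) or disable ($2 = false) SSO on node $1, authenticating as $3 (user:password).
set_sso() {
    case "$2" in
        true|false) ;;
        *) echo "second argument must be true or false" >&2; return 1 ;;
    esac
    curl -X POST -H "Content-Type:application/x-www-form-urlencoded" \
        -d "enable=$2" "$(sso_url "$1")" -u "$3"
}

# Example:
# set_sso 192.168.56.190:8082 true admin:pass
```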
Warning message

The following warning message will appear in the log file as a result of enabling SSO via the Admin UI. You can ignore it.

'WARN [WebComponent:filterFormParameters] - A servlet request, to the URI http://node1.vagrant.wan:8082/api/security/kerberos/enableSso, contains form parameters in the request body but the request body has been consumed by the servlet or a servlet filter accessing the request parameters. Only resource methods using @FormParam will work as expected. Resource methods consuming the request body by other means will not work as expected.'

This message is meant to warn developers about the fact that the request entity body has been consumed, thus any other attempts to read the message body will fail.

7.4. Beginner’s Guide to SVN

7.4.1. So, what’s SVN?

SVN is a version control system, a software toolset that helps people to manage changes that are made to collections of shared files. Even when we work alone, most of us will make use of some form of version control, although often crude and inconsistent - such as when we use an application’s Save As and cook up a new file name to distinguish the new version from the old. Without version control systems, collaboration (especially in software development) quickly devolves into a horrible mess as different contributors make changes to the same files, overwriting or just mangling each other’s work.

VCS not SCM
There’s a specialist form of version control system (called Software Configuration Management) designed specifically for handling software development. Although SVN is most often used for software development it remains a mainstream version control system that is ready to handle files and documents of pretty much any type, and is occasionally put to novel use, such as managing backups, shared to-do lists and even in the writing of collaborative fiction.

Version Control Gives You:
Backup and restore

All changes are recorded and available for retrieval. If you make a mess when changing a file you can get back to the unspoilt version of the file with minimal time and effort.

Synchronization

When you work from a SVN repository, you can choose when to synchronize your working copy with the main repository. The architecture allows you to choose when you have periods of isolation and synchronization, although you must synchronize before a check-in.

Logging changes

Changes made through SVN can be documented with messaging (check-in comments) that help someone to understand why changes were made.

Playing in the sandbox

SVN’s ability to guard against damaging changes lets you try big risky changes, safe in the knowledge that should you cause a mess, it’s easy to go back to the good old stable trunk. This is referred to as isolation.

Branching and merging

Development need not take a linear path. With SVN it’s possible to manage multiple versions of files at the same time.
Merging is one form of synchronization.

SVN Gives You:

SVN was originally written by a group of CVS (Concurrent Versions System) users who were frustrated by CVS’s drawbacks. They designed SVN to build on CVS’s strengths, while avoiding its limitations. So when people talk about SVN’s key features, they are usually talking about the things it does that CVS can’t do.

Changes to repositories are true atomic operations

Don’t worry, there’s no radioactivity, that’s atomic in the original Greek meaning, as in 'indivisible'. When a change is committed to a SVN repository it is either fully completed or not done at all. This transactional approach to changes is important for maintaining consistency and protecting against corruption.

Files can’t hide from SVN

You can rename them, move them or even remove them and SVN will still track their history.

It’s the size of the change in the file, not the size of the file in the change

It’s a key property of the SVN repository model that the cost of an operation is proportional to the amount of change, not the size of the file that is changing.

Branching and tagging are cheap operations

Branches and tags are both handled using a lightweight copy operation. A copy takes up a small, fixed amount of space. Any copy is effectively a pointer to the original revision. Because of this both tags and branches are initially identical in nature, whether something is a tag or branch depends on where in the repository the pointer is created.
There are a small number of common conventions for this but any organization can design its own. One of the most popular has a branches directory and a tags directory in the root of the repository. In this case, if the pointer (copy) is created in /tags then it is a tag and the authorization should prevent further modification below that point. If it is created in /branches then it is a branch and the authorization should allow further modification below that point (to appropriate accounts). We discuss this more below. There are also alternative approaches to branching and tagging; Google may help with finding more details on these.

Efficient handling of binary files

SVN expresses file differences using a binary differencing algorithm that works identically on text and binary files. So, just as with text files, it can store successive revisions of a binary file without having to store a full copy of the file for each revision.

SVN supports and versions symbolic links

A versioned symbolic link appears as a true symbolic link. On platforms that don’t support them (pre-Vista Windows) they behave like normal files, but SVN treats the link target as editable.

7.4.2. How SVN works

SVN uses an approach to versioning called Copy-Modify-Merge, which has some big advantages over earlier systems that usually locked files when they were edited to ensure that two people couldn’t change a file at the same time. With Copy-Modify-Merge any number of people can make a change to a file at the same time without problem. Each person takes a copy of the file from the repository; this is called a Working Copy and is a snapshot of the file from the latest revision. Changes are always made to this working copy, and when the person modifying the file is ready to share their changes, the file is committed back to the repository where it is given a new revision number.

svnexplained01
A file is added to the repository and undergoes a series of changes

Each time the file is changed and committed to the repository it generates a new snapshot of the file. However, this snapshot is not a full copy of the file; instead it is a diff, which only contains a description of what has changed in the file. The above illustration shows how the changing state of snack.txt is recorded as a series of additions and subtractions. No matter what changes are made, or when they are made, it will be possible to recreate any revision by applying the appropriate diffs.

Revision numbers are global, not file specific

The above illustration may give the impression that the revision number is specific to the file, as in snack.txt. In SVN this is not the case as the revision number reflects any changes that are made within the entire file system. So it’s not really revision 5 of snack.txt, more precisely it is the version of snack.txt that appears in revision 5 of the repository, even if it is the only change that was made in revision 5.

Conflicting changes

So, what does happen when two people make a change to the same file? How does SVN handle conflicting changes? We’ll run through an example situation, illustrated below.

svn explained02
Using SVN means your food fights leave an audit trail…
  1. We check out a file, snack.txt, from the repository; the latest version is revision 4. The file contains a list of sandwich ingredients. We edit the working copy of the file with our own sandwich preferences.

  2. A colleague is already working on snack.txt and commits changes that turn the sandwich into a toastie in revision 5.

  3. Having completed our edit of the file, we try to commit our changes, but the commit is rejected because SVN identified that our revision is out of date, and if our changes are committed, the changes made in revision 5 will be overwritten.

    svn explained02error
    What an out of date error looks like on the TortoiseSVN client (on Windows)
  4. We attempt an update, which downloads revision 5 of snack.txt and attempts to merge it with our working copy. If the changes between the two versions are in different places within the file there’s a good chance that the update will successfully merge the version of the file with revision 5. Unfortunately, in this case the changes can’t be merged because the changes happen in the same place. Fear not! We’re using TortoiseSVN, the Windows SVN client which provides some useful tools for dealing with the conflict.

    Tip
    When SVN detects a conflict it creates 3 temporary files:
             file.mine (your current working copy)
             file.rOldRev (the file at the revision before your changes were made)
             file.rNewRev (the file as it is in the latest revision in the repository)

    SVN also annotates the original file to show the conflicts within the file (illustrated in the image below).

    svn explained02conflict
    How a conflicted file is tagged to aid editing
  5. The conflicted file can now be edited so that both sets of changes are included, or whatever solution is best. Either way, SVN helped stop the loss of work, breaking of files and potential fisticuffs at dawn. In this case, snack.txt is kept as a toastie but is given a mutually agreeable filling.

    Tip
    After a conflicted file has been fixed, you tell SVN that the conflict has been resolved. SVN will then delete the three temp files and allow the file to be updated or committed.
    Conflicts rarely occur if you remember to do an update of your local copy before making any changes.
    svn explained02log
    The log view of snack.txt showing the changes over
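
The same sequence can be followed with the command-line client instead of TortoiseSVN. The sketch below assumes it is run from a working copy that contains snack.txt; has_conflict_markers and resolve_snack are hypothetical helper names, and the commit message is only an example.

```shell
#!/bin/sh
# Return 0 if file $1 still contains the markers SVN writes into a conflicted file.
has_conflict_markers() {
    grep -E -q '^(<<<<<<<|>>>>>>>) ' "$1"
}

# Command-line equivalent of steps 3 to 5 (defined here, not run automatically).
resolve_snack() {
    # Pull in revision 5; postpone leaves the conflict markers in snack.txt.
    svn update --accept postpone snack.txt
    # Merge both sets of changes by hand.
    "${EDITOR:-vi}" snack.txt
    if has_conflict_markers snack.txt; then
        echo "snack.txt still contains conflict markers" >&2
        return 1
    fi
    # Tell SVN the conflict is resolved, keeping the edited working file.
    svn resolve --accept working snack.txt
    svn commit -m "Agree on a toastie filling" snack.txt
}
```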

7.4.3. Directory structure

SVN doesn’t force you to organize your files in any particular way, although there is a best practice for how to keep SVN repository files. This isn’t essential, but as the term 'best practice' suggests, everyone agrees this is a good way to work - especially those who started out by ignoring it and ended up in a mess.

svndirectories
A repository created with the recommended directory structure

.svn

Prior to SVN 1.7, every directory in a working copy contained an administrative directory called .svn. The files in each administrative directory help SVN recognize which files contain unpublished changes, and which files are out of date with respect to others' work. There’s never any good reason for entering the directory and making any manual changes - just leave it alone.

SVN 1.7 contains a rewritten Working Copy Library (called WC:NG). This does away with separate .svn directories, using instead a single .svn directory located in the working copy’s main directory.

Don’t delete or change anything in the .svn directory!
SVN depends on it to manage your working copy. If you accidentally remove the .svn subdirectory, the easiest way to fix the problem is to remove the entire directory (a normal system deletion, not svn delete), then do an svn update from a parent directory. The SVN client will pull in a fresh copy of the directory you’ve deleted, along with a new .svn folder.

Trunk

The trunk directory is for current development code. The name is a reference to a growing trunk of a tree, not a place to store your spare tyres. This is where your current release code should be stored. It’s best not to muddy the Trunk directory with revisions or release names.

http://10.2.5.2:9880/encom/trunk
Branches

Branches are created by copying specific revisions of the trunk into the branches directory under a specific name (the branch-name). You can think of branches as "offshoots" of the trunk, like a tree (although there are some discrepancies with the analogy!). The idea is to use branches to work on significant changes or variations of code without disrupting the current release code.

Bug fixing on a branch.

A major bug might be fixed on a branch created for this purpose. This allows for bug fixing changes to be worked on without the potential for disrupting other work going on in the trunk/development branches.

"Toe in the water" branches

It’s common to use a branch as a code "sandbox" when you want to try a new technology out. If everything gets broken, you can walk away, with no risk to the working code, but if the experiment works out, it can be easily merged back into the trunk.

http://10.2.5.2:9880/encom/branches/R1.02
http://10.2.5.2:9880/encom/branches/soapflax
Tags

Finally there are tags. Tags work like branches, but are not meant to be developed. Instead, they are code milestones, giving you a snapshot of the code at specific points in its history.

http://10.2.5.2:9880/encom/tags/version1.03
Tagging Bugfix / development branches

When you create a code or bug fix branch it’s useful to create a tag of the code before the changes are made (called the PRE tag) and a tag after the bugfix or code change has been made (called the POST tag).

http://10.2.5.2:9880/encom/tags/PRE_authchange_bug9343
http://10.2.5.2:9880/encom/tags/POST_authchange_bug9343
A Typical SVN project
An illustration of how a SVN Repository evolves using branching, tagging and a code trunk
SVN itself makes no distinction between tags and branches. It won’t stop you from committing changes to tags or fixing major bugs on the trunk, so it’s important that you are aware of this and guard against mistakes. A benefit of using a SVN client such as TortoiseSVN is that it adds a lot of useful functionality that helps you guard against errors.

8. API

8.1. RESTful API

Access Control offers increased control and flexibility through the support of a RESTful (Representational State Transfer) API for accessing a set of resources through a fixed set of operations.

Prerequisites

The examples provided use curl on the command line, but any other tool that communicates properly with a RESTful API will also work.

Authentication

Provided examples show the use of admin:password credentials for clarity. Clearly you shouldn’t use this approach for production. It can be beneficial to create a suitably permissioned account exclusively for API duties.

  • All calls use the base URI:

    http(s)://<server-host>:8082/api/<resource>

    You will need to replace 8082 with your chosen REST API port.

  • The Internet media type of the data supported by the web service is application/xml.

  • The API is hypertext driven, using the following HTTP methods:

    Type    Action
    POST    Create a resource on the server
    GET     Retrieve a resource from the server
    PUT     Modify the state of a resource
    DELETE  Remove a resource

Access point

protocol://node:port/api/replication-groups

  • protocol is either http (without SSL) or https (with SSL)

  • node is the IP address or hostname of the MSP node that will serve the request. If you have implemented SSL then you must use a hostname (or Fully Qualified Domain Name, e.g. "myhost.wandisco.com").

  • port is the port used for the REST API. MSP normally listens for API calls on port 8082.
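
As a minimal sketch of how the access point is assembled, the following Python helper joins the protocol, node, port and resource into a URL and appends any query parameters. The function name api_url and its keyword arguments are our own illustration and are not part of MSP; only the URL layout comes from the description above.

```python
from urllib.parse import urlencode


def api_url(node, resource, port=8082, use_ssl=False, **params):
    """Build protocol://node:port/api/<resource>, with optional query parameters.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    protocol = "https" if use_ssl else "http"
    url = f"{protocol}://{node}:{port}/api/{resource}"
    if params:
        # Send booleans as lowercase true/false, as in this guide's examples.
        query = {
            key: str(value).lower() if isinstance(value, bool) else value
            for key, value in params.items()
        }
        url += "?" + urlencode(query)
    return url
```

For example, api_url("192.168.56.190", "replication-groups") gives the replication-groups access point on the default port.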

Online documentation

You can review a copy of the bundled API documentation. This documentation is taken straight from a live installation. Given that the documentation is automatically generated, it may link to local files and resources that will not be available here. If you have difficulties here you should therefore try using the documentation linked to by your MSP UI.

8.2. Replication Groups

This page details how to manage your replication groups using the available REST API. Read more about the replication-groups end point.

8.2.1. GET

Retrieves a list of replication groups.

Parameters
  • withPendingTransactions

    • Default value: false

    • Set to true to include pending transactions in the response (slower).

    • Set to false to omit pending transactions in the response (faster).

Output

The HTTP response code is included in the section headings describing the various types of output.

200 - Success

Success returns an XML-formatted document that describes each replication group. The XML document is a tree rooted at replicationgrouplist. Each replicationgroup has the following elements:

  • dsmId - The replication group’s state machine ID

  • groupStarted - A boolean indicating whether the replication group’s state machine is active

  • replicationGroupIdentity - The replication group’s unique ID

  • pendingTransactions - If the withPendingTransactions parameter is true, this element will be included and will give a count of pending transactions for the replication group.

  • replicationGroupName - The replication group’s common (display) name, as seen in the user interface

  • rotationSuspended - A boolean indicating whether schedule rotation is suspended in the replication group

  • scheduleManagingNodeId - The unique ID of the node that manages the replication group’s schedule

Each replication group has one or more repositoryIds elements, the body of which lists the unique ID of a repository in the replication group.

Each replication group has a list of replicationGroupNodes, each of which has the following elements.

  • managingNode - A boolean that indicates if this node is the replication group’s managing node

  • role - The node’s role in the replication group.

Under the node tree, each node also has the following information.

  • nodeIdentity - The node’s unique ID

  • locationIdentity - The node’s unique location ID

  • isLocal - A boolean indicating whether the node is local to the node that is serving the request

  • isUp - A boolean indicating whether the replicator on the node is running

  • isStopped - A boolean indicating whether the replicator on the node is not handling requests

  • lastStatusChange - A UNIX epoch timestamp indicating when the node’s status last changed

Each node also has a list of attributes. Each attribute has a key and a value. Common and useful attributes include:

  • eco.system.membership - The unique ID of the node’s ecosystem membership

  • eco.system.dsm.identity - The unique ID of the node’s ecosystem state machine

  • node.name - The common (display) name for the node, as seen in the user interface

The replication group element also includes a schedule block that describes the replication group’s schedule. For each unique phase of the schedule, this block will contain:

  • dayOfWeek The day that this schedule phase becomes active.

  • hourOfDay The hour of the day (0-23) that this schedule phase becomes active.

  • membershipId The unique ID of the scheduled phase’s membership

  • quorum A boolean indicating whether the schedule will adjust quorum

  • scheduled A boolean indicating whether the phase is actively scheduled

  • A list of scheduledNodes, each of which contains:

    • managingNode A boolean indicating whether the node is the managing node during this phase of the schedule

    • A node element, containing information similar to that returned in nodes
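
The elements described above can be read with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample document is trimmed from the response shown in this section's usage examples, and the function name summarize_groups and the keys of the dictionaries it returns are our own choices, not part of the API.

```python
import xml.etree.ElementTree as ET

# Trimmed sample, based on the response format described in this section.
SAMPLE = """
<replicationgrouplist>
  <replicationgroup>
    <groupStarted>true</groupStarted>
    <nodes>
      <replicationGroupNodes>
        <managingNode>false</managingNode>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
          <isUp>true</isUp>
          <attributes>
            <attribute><key>node.name</key><value>node56191</value></attribute>
          </attributes>
        </node>
        <role>AV</role>
      </replicationGroupNodes>
    </nodes>
    <replicationGroupIdentity>3f04df72-0658-11e4-9d99-080027b651cd</replicationGroupIdentity>
    <replicationGroupName>3 Nodes Group</replicationGroupName>
    <repositoryIds>5dc5a543-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5d75b075-0658-11e4-9d99-080027b651cd</repositoryIds>
  </replicationgroup>
</replicationgrouplist>
"""


def summarize_groups(xml_text):
    """Return one dict per replication group (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    groups = []
    for group in root.findall("replicationgroup"):
        members = []
        for entry in group.findall("nodes/replicationGroupNodes"):
            node = entry.find("node")
            if node is None:
                continue  # skip entries that carry no node details
            attributes = {
                attr.findtext("key"): attr.findtext("value")
                for attr in node.findall("attributes/attribute")
            }
            members.append({
                "name": attributes.get("node.name"),
                "id": node.findtext("nodeIdentity"),
                "role": entry.findtext("role"),
                "managing": entry.findtext("managingNode") == "true",
                "up": node.findtext("isUp") == "true",
            })
        groups.append({
            "name": group.findtext("replicationGroupName"),
            "id": group.findtext("replicationGroupIdentity"),
            "repositories": [repo.text for repo in group.findall("repositoryIds")],
            "members": members,
        })
    return groups
```

Calling summarize_groups(SAMPLE) yields a single entry for "3 Nodes Group" with two repository IDs and one member node.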

400 - Processing error

The request was invalid or the node could not process it successfully. The output will contain an error message with more details.

401 - Invalid authentication

Invalid authentication returns a brief XML document that embeds an HTML-formatted error message.
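
If you call the API from a script rather than from curl, you need to supply the credentials yourself and decide how to treat the 400 and 401 responses. The following Python sketch assumes the API accepts HTTP Basic credentials, which is what curl -u sends by default. The helper names build_request and get_resource are hypothetical, and the Accept header is our own addition based on the supported media type.

```python
import base64
import urllib.error
import urllib.request


def build_request(url, user, password):
    """Prepare a GET request carrying HTTP Basic credentials (assumed scheme)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={
        "Authorization": f"Basic {token}",
        "Accept": "application/xml",
    })


def get_resource(url, user, password):
    """Return (status_code, body_text); error responses are returned, not raised."""
    request = build_request(url, user, password)
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 400 (processing error) and 401 (invalid authentication) arrive here.
        return err.code, err.read().decode("utf-8", errors="replace")
```

A caller can then branch on the returned status code, for example treating 401 as a credentials problem and 400 as a request to inspect the error message in the body.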

Usage examples

In the following examples we use:

  • admin as an administrator account name

  • pass as the credential for the admin account

  • 192.168.56.190 as the IP address of the MSP node

  • 8082 as the REST port

Default

curl -u admin:pass http://192.168.56.190:8082/api/replication-groups

Returns a list of replication groups.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<replicationgrouplist>
  <replicationgroup>
    <dsmId>3f049150-0658-11e4-9d99-080027b651cd</dsmId>
    <groupStarted>true</groupStarted>
    <nodes>
      <replicationGroupNodes>
        <managingNode>false</managingNode>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
          <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
          <isLocal>false</isLocal>
          <isUp>true</isUp>
          <isStopped>false</isStopped>
          <lastStatusChange>1411397169893</lastStatusChange>
          <attributes>
            <attribute>
              <key>eco.system.membership</key>
              <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
            </attribute>
            <attribute>
              <key>eco.system.dsm.identity</key>
              <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
            </attribute>
            <attribute>
              <key>node.name</key>
              <value>node56191</value>
            </attribute>
          </attributes>
        </node>
        <role>AV</role>
      </replicationGroupNodes>
      <replicationGroupNodes>
        ...
      </replicationGroupNodes>
      <replicationGroupNodes>
        ...
      </replicationGroupNodes>
    </nodes>
    <replicationGroupIdentity>3f04df72-0658-11e4-9d99-080027b651cd</replicationGroupIdentity>
    <replicationGroupName>3 Nodes Group</replicationGroupName>
    <repositoryIds>5dc5a543-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5d75b075-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5d0c674f-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5ddcd6cb-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5e006466-0658-11e4-9d99-080027b651cd</repositoryIds>
    <rotationSuspended>false</rotationSuspended>
    <schedule>
      <scheduledNodeLists>
        <dayOfWeek>7</dayOfWeek>
        <hourOfDay>0</hourOfDay>
        <membershipId>3f04b861-0658-11e4-9d99-080027b651cd</membershipId>
        <quorum>true</quorum>
        <scheduled>true</scheduled>
        <scheduledNodes>
          <managingNode>false</managingNode>
          <node>
            <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
            <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
            <isLocal>false</isLocal>
            <isUp>true</isUp>
            <isStopped>false</isStopped>
            <lastStatusChange>1411397169893</lastStatusChange>
            <attributes>
              <attribute>
                <key>eco.system.membership</key>
                <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
              </attribute>
              <attribute>
                <key>eco.system.dsm.identity</key>
                <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
              </attribute>
              <attribute>
                <key>node.name</key>
                <value>node56191</value>
              </attribute>
            </attributes>
          </node>
          <role>AV</role>
        </scheduledNodes>
        <scheduledNodes>
          ...
        </scheduledNodes>
        <scheduledNodes>
          ...
        </scheduledNodes>
      </scheduledNodeLists>
    </schedule>
    <scheduleManagingNodeId>abb21772-5544-43e2-9cb9-ff1node56190</scheduleManagingNodeId>
  </replicationgroup>
</replicationgrouplist>
Include pending transactions in output

curl -u admin:pass "http://192.168.56.190:8082/api/replication-groups?withPendingTransactions=true"

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<replicationgrouplist>
  <replicationgroup>
    <dsmId>3f049150-0658-11e4-9d99-080027b651cd</dsmId>
    <groupStarted>true</groupStarted>
    <nodes>
      <replicationGroupNodes>
        <managingNode>false</managingNode>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
          <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
          <isLocal>false</isLocal>
          <isUp>true</isUp>
          <isStopped>false</isStopped>
          <lastStatusChange>1411397169893</lastStatusChange>
          <attributes>
            <attribute>
              <key>eco.system.membership</key>
              <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
            </attribute>
            <attribute>
              <key>eco.system.dsm.identity</key>
              <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
            </attribute>
            <attribute>
              <key>node.name</key>
              <value>node56191</value>
            </attribute>
          </attributes>
        </node>
        <role>AV</role>
      </replicationGroupNodes>
      <replicationGroupNodes>
        ...
      </replicationGroupNodes>
      <replicationGroupNodes>
        ...
      </replicationGroupNodes>
    </nodes>
    <pendingTransactions>0</pendingTransactions>
    <replicationGroupIdentity>3f04df72-0658-11e4-9d99-080027b651cd</replicationGroupIdentity>
    <replicationGroupName>3 Nodes Group</replicationGroupName>
    <repositoryIds>5dc5a543-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5d75b075-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5d0c674f-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5ddcd6cb-0658-11e4-9d99-080027b651cd</repositoryIds>
    <repositoryIds>5e006466-0658-11e4-9d99-080027b651cd</repositoryIds>
    <rotationSuspended>false</rotationSuspended>
    <schedule>
      <scheduledNodeLists>
        <dayOfWeek>7</dayOfWeek>
        <hourOfDay>0</hourOfDay>
        <membershipId>3f04b861-0658-11e4-9d99-080027b651cd</membershipId>
        <quorum>true</quorum>
        <scheduled>true</scheduled>
        <scheduledNodes>
          <managingNode>false</managingNode>
          <node>
            <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
            <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
            <isLocal>false</isLocal>
            <isUp>true</isUp>
            <isStopped>false</isStopped>
            <lastStatusChange>1411397169893</lastStatusChange>
            <attributes>
              <attribute>
                <key>eco.system.membership</key>
                <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
              </attribute>
              <attribute>
                <key>eco.system.dsm.identity</key>
                <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
              </attribute>
              <attribute>
                <key>node.name</key>
                <value>node56191</value>
              </attribute>
            </attributes>
          </node>
          <role>AV</role>
        </scheduledNodes>
        <scheduledNodes>
          ...
        </scheduledNodes>
        <scheduledNodes>
          ...
        </scheduledNodes>
      </scheduledNodeLists>
    </schedule>
    <scheduleManagingNodeId>abb21772-5544-43e2-9cb9-ff1node56190</scheduleManagingNodeId>
  </replicationgroup>
</replicationgrouplist>
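
When withPendingTransactions=true is set, the count can be picked out of the response per replication group. The following Python sketch uses a trimmed sample of the output above. The function name pending_by_group is our own, and returning None when the element is absent is a design choice for the illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed sample: one group with the element present, one without.
SAMPLE = """
<replicationgrouplist>
  <replicationgroup>
    <pendingTransactions>0</pendingTransactions>
    <replicationGroupName>3 Nodes Group</replicationGroupName>
  </replicationgroup>
  <replicationgroup>
    <replicationGroupName>Other Group</replicationGroupName>
  </replicationgroup>
</replicationgrouplist>
"""


def pending_by_group(xml_text):
    """Map each group name to its pending transaction count (None if omitted)."""
    root = ET.fromstring(xml_text.strip())
    counts = {}
    for group in root.findall("replicationgroup"):
        raw = group.findtext("pendingTransactions")
        counts[group.findtext("replicationGroupName")] = int(raw) if raw is not None else None
    return counts
```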
Invalid authentication

curl -u admin:wrongpass http://192.168.56.190:8082/api/replication-groups

<?xml version="1.0"?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 401 No client with requested id: admin</title>
</head>
<body>
<h2>HTTP ERROR: 401</h2>
<p>Problem accessing /api/replication-groups. Reason:
<pre>    No client with requested id: admin</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>

8.2.2. POST

Create a new replication group.

Request body

An XML document that describes the new replication group. The root element is replicationGroup and it contains the following elements.

  • replicationGroupName The common (display) name for the new replication group

  • replicationGroupIdentity An empty element

  • schedule A block listing the desired schedule, including the nodes that will constitute the replication group. See the GET description for how to construct this block. Note:

    • The new replication group must satisfy the normal rules of quorum during all phases of the schedule.

    • One node must function as the managing node.
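
As a sketch of how such a request body could be generated rather than typed by hand, the following Python function builds the replicationGroup document from a list of nodes. The function name build_group_xml and the tuple layout are our own. The check for exactly one managing node is our reading of the rule above, the dayOfWeek value of 7 simply mirrors this guide's usage example, and quorum rules are not checked here.

```python
import xml.etree.ElementTree as ET


def build_group_xml(name, members):
    """Build a replicationGroup request body (hypothetical helper).

    members: list of (node_id, location_id, role, is_managing) tuples.
    """
    managing = [member for member in members if member[3]]
    if len(managing) != 1:
        # Assumption: "one node must function as the managing node" means exactly one.
        raise ValueError("exactly one node must be the managing node")

    root = ET.Element("replicationGroup")
    ET.SubElement(root, "replicationGroupName").text = name
    schedule = ET.SubElement(root, "schedule")
    node_list = ET.SubElement(schedule, "scheduledNodeLists")
    ET.SubElement(node_list, "dayOfWeek").text = "7"  # value taken from the guide's example
    for node_id, location_id, role, is_managing in members:
        entry = ET.SubElement(node_list, "scheduledNodes")
        node = ET.SubElement(entry, "node")
        ET.SubElement(node, "nodeIdentity").text = node_id
        ET.SubElement(node, "locationIdentity").text = location_id
        ET.SubElement(entry, "role").text = role
        ET.SubElement(entry, "managingNode").text = "true" if is_managing else "false"
    ET.SubElement(root, "replicationGroupIdentity")  # present but empty
    return ET.tostring(root, encoding="unicode")
```

The returned string can be passed to curl with -d, or sent by any other HTTP client, with the Content-Type header set to application/xml.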

Output

The HTTP response code is included in the section headings describing the various types of output.

202 - Success

Success returns the 202 response code with no further information.

400 - Processing error

The request was invalid or the node could not process it successfully. The output will contain an error message with more details.

401 - Invalid authentication

Invalid authentication returns a brief XML document that embeds an HTML-formatted error message.

Usage examples

In the following examples we use:

  • admin as an administrator account name

  • pass as the credential for the admin account

  • 192.168.56.190 as the IP address of the MSP node

  • 8082 as the REST port

  • A set of example node identities and location identities. These can be gleaned from the nodes end point.

Creating a new group

Using this data as the request body:

<?xml version="1.0"?>
<replicationGroup>
  <replicationGroupName>All Nodes Group</replicationGroupName>
  <schedule>
    <scheduledNodeLists>
      <dayOfWeek>7</dayOfWeek>
      <scheduledNodes>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
          <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
        </node>
        <role>AVT</role>
        <managingNode>true</managingNode>
      </scheduledNodes>
      <scheduledNodes>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56192</nodeIdentity>
          <locationIdentity>9ae09faf-0650-11e4-a747-0800279336f8</locationIdentity>
        </node>
        <role>AV</role>
        <managingNode>false</managingNode>
      </scheduledNodes>
      <scheduledNodes>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56190</nodeIdentity>
          <locationIdentity>d29772a0-0648-11e4-aaa1-080027b651cd</locationIdentity>
        </node>
        <role>AV</role>
        <managingNode>false</managingNode>
      </scheduledNodes>
    </scheduledNodeLists>
  </schedule>
  <replicationGroupIdentity/>
</replicationGroup>

curl -u admin:pass -X POST -d "<replicationGroup><replicationGroupName>All Nodes Group</replicationGroupName><schedule><scheduledNodeLists><dayOfWeek>7</dayOfWeek><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity><locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity></node><role>AVT</role><managingNode>true</managingNode></scheduledNodes><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56192</nodeIdentity><locationIdentity>9ae09faf-0650-11e4-a747-0800279336f8</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56190</nodeIdentity><locationIdentity>d29772a0-0648-11e4-aaa1-080027b651cd</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes></scheduledNodeLists></schedule><replicationGroupIdentity></replicationGroupIdentity></replicationGroup>" --header 'Content-Type: application/xml' http://192.168.56.191:8082/api/replication-groups

Aside from the 202 response code there is no further output.

Invalid request

Using this request body:

<?xml version="1.0"?>
<replicationGroup>
  <replicationGroupName>All Nodes Group</replicationGroupName>
  <schedule>
    <scheduledNodeLists>
      <dayOfWeek>7</dayOfWeek>
      <scheduledNodes>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56192</nodeIdentity>
          <locationIdentity>9ae09faf-0650-11e4-a747-0800279336f8</locationIdentity>
        </node>
        <role>AV</role>
        <managingNode>false</managingNode>
      </scheduledNodes>
      <scheduledNodes>
        <node>
          <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56190</nodeIdentity>
          <locationIdentity>d29772a0-0648-11e4-aaa1-080027b651cd</locationIdentity>
        </node>
        <role>AV</role>
        <managingNode>false</managingNode>
      </scheduledNodes>
    </scheduledNodeLists>
  </schedule>
  <replicationGroupIdentity/>
</replicationGroup>

curl -u admin:pass -X POST -d "<replicationGroup><replicationGroupName>All Nodes Group</replicationGroupName><schedule><scheduledNodeLists><dayOfWeek>7</dayOfWeek><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56192</nodeIdentity><locationIdentity>9ae09faf-0650-11e4-a747-0800279336f8</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56190</nodeIdentity><locationIdentity>d29772a0-0648-11e4-aaa1-080027b651cd</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes></scheduledNodeLists></schedule><replicationGroupIdentity></replicationGroupIdentity></replicationGroup>" --header 'Content-Type: application/xml' http://192.168.56.191:8082/api/replication-groups

This request is invalid as it includes two nodes with no tie-breaker, violating quorum requirements. The output is:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<exception>
  <class>com.wandisco.nodes.groups.exceptions.UnsupportedMembershipException</class>
  <message>Memberships with an even number of acceptors should have a DN defined</message>
  <stack-trace>com.wandisco.nodes.groups.exceptions.UnsupportedMembershipException: Memberships with an even number of acceptors should have a DN defined
        at com.wandisco.application.tasks.membership.MembershipUtils.checkRoles(MembershipUtils.java:386)
        at com.wandisco.application.tasks.membership.MembershipUtils.checkRoles(MembershipUtils.java:343)
        at com.wandisco.application.dao.ReplicationGroupDAO.checkMemberships(ReplicationGroupDAO.java:426)
        ... very long stack trace ...
  </stack-trace>
</exception>
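
Because the stack trace can be very long, a script will usually want only the exception class and message. The following is a minimal Python sketch, assuming the error body has the exception/class/message layout shown above. The function name parse_error is our own.

```python
import xml.etree.ElementTree as ET

# Sample error body, shortened from the output shown in this section.
SAMPLE_ERROR = """
<exception>
  <class>com.wandisco.nodes.groups.exceptions.UnsupportedMembershipException</class>
  <message>Memberships with an even number of acceptors should have a DN defined</message>
  <stack-trace>... very long stack trace ...</stack-trace>
</exception>
"""


def parse_error(xml_text):
    """Return (class, message) from an exception document, or None otherwise."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "exception":
        return None
    return root.findtext("class"), root.findtext("message")
```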
Invalid authentication

curl -u admin:wrongpass -X POST -d "<replicationGroup><replicationGroupName>All Nodes Group</replicationGroupName><schedule><scheduledNodeLists><dayOfWeek>7</dayOfWeek><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56192</nodeIdentity><locationIdentity>9ae09faf-0650-11e4-a747-0800279336f8</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes><scheduledNodes><node><nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56190</nodeIdentity><locationIdentity>d29772a0-0648-11e4-aaa1-080027b651cd</locationIdentity></node><role>AV</role><managingNode>false</managingNode></scheduledNodes></scheduledNodeLists></schedule><replicationGroupIdentity></replicationGroupIdentity></replicationGroup>" --header 'Content-Type: application/xml' http://192.168.56.191:8082/api/replication-groups

<?xml version="1.0"?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 401 No client with requested id: admin</title>
</head>
<body>
<h2>HTTP ERROR: 401</h2>
<p>Problem accessing /api/replication-groups. Reason:
<pre>    No client with requested id: admin</pre></p>
<hr/><i><small>Powered by Jetty://</small></i>
</body>
</html>
Schedule node type changes via the public API

Instead of manually setting up schedules through a node’s UI you can do it programmatically through calls to the public API.

Use the following API call:

http://<ip>:8082/public-api/replicationgroup/{repgroupID}/schedule

e.g.

http://10.0.100.135:8082/public-api/replicationgroup/97913c04-bbad-11e2-877a-028e03094f8d/schedule

PUT with ReplicationGroupAPIDTO XML as body:

To make Node N3 a tie-breaker 'T' from 10:00 to 16:00 (GMT) every day of the week, with Node N1 as tie-breaker 'T' afterwards:

Times are always in UTC (GMT)
When viewed on a node, times are shifted to the local timezone although internally they are always recorded in UTC.
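
Because schedule times are recorded in UTC, a time planned in local terms has to be converted before it goes into the schedule XML. The following Python sketch shows the arithmetic. It assumes dayOfWeek is a number from 1 to 7 that wraps around (we make no assumption about which day is 1) and that the local offset from UTC is a whole number of hours. The function name to_utc_slot is our own.

```python
def to_utc_slot(day_of_week, hour_of_day, utc_offset_hours):
    """Convert a local (day, hour) pair to the equivalent UTC (day, hour).

    Assumptions: day_of_week runs 1..7 and wraps; utc_offset_hours is the
    local zone's whole-hour offset from UTC (for example -5 or +2).
    """
    total = hour_of_day - utc_offset_hours
    day_shift, utc_hour = divmod(total, 24)  # day_shift is -1, 0 or +1
    utc_day = (day_of_week - 1 + day_shift) % 7 + 1
    return utc_day, utc_hour
```

For instance, 20:00 on day 1 in a zone five hours behind UTC becomes 01:00 on day 2 in UTC.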
Example curl command

Make a text file called schedule.xml that contains the ReplicationGroupAPIDTO XML (see the sample file in this section):

curl -u username:password -X PUT -d @schedule.xml http://[IP]:[PORT]/public-api/replicationgroup/97913c04-bbad-11e2-877a-028e03094f8d/schedule
Sample 'schedule.xml' file
<ReplicationGroupAPIDTO>
       <replicationGroupName>global</replicationGroupName>
       <replicationGroupIdentity>97913c04-bbad-11e2-877a-028e03094f8d</replicationGroupIdentity>
       <scheduledNodes>
           <dayOfWeek>1</dayOfWeek>
           <hourOfDay>14</hourOfDay>
           <schedulednode>
               <nodeIdentity>N1</nodeIdentity>
               <locationIdentity>c0e486a0-bbab-11e2-863b-028e03094f8e</locationIdentity>
               <isLocal>true</isLocal>
               <isUp>true</isUp>
               <lastStatusChange>0</lastStatusChange>
               <role>AV</role>
           </schedulednode>
           <schedulednode>
               ...
           </schedulednode>
           <schedulednode>
               ...
           </schedulednode>
       </scheduledNodes>
</ReplicationGroupAPIDTO>

Download the full sample schedule.xml file.

8.3. Nodes Endpoint

This page details how to manage your Nodes using the available REST API. Read more about the Nodes end point.

8.3.1. GET

Parameters
  • withRemoved

    • Default value: false

    • Set to true to include removed nodes in the response

    • Set to false to omit removed nodes in the response

Output

The HTTP response code is included in the section headings describing the various types of output.

200 - Success

Success returns an XML-formatted document that describes each node. The XML document is a tree of nodes. Each node has the following elements:

  • nodeIdentity - The node’s unique ID

  • locationIdentity - The node’s unique location ID

  • isLocal - A boolean indicating whether the node is local to the node that is serving the request

  • isUp - A boolean indicating whether the replicator on the node is running

  • isStopped - A boolean indicating whether the replicator on the node is not handling requests

  • lastStatusChange - A UNIX epoch timestamp indicating when the node’s status last changed

Each node also has a list of attributes. Each attribute has a key and a value. Common and useful attributes include:

  • eco.system.membership - The unique ID of the node’s ecosystem membership

  • eco.system.dsm.identity - The unique ID of the node’s ecosystem state machine

  • node.name - The common (display) name for the node, as seen in the user interface
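
The node and location identities returned here are needed by other calls, such as creating a replication group or inducting a node. The following Python sketch builds a lookup from each node's display name to its identities. The sample is trimmed from this guide's example output, and the function name index_nodes and the returned dictionary layout are our own.

```python
import xml.etree.ElementTree as ET

# Trimmed sample, based on the nodes output shown in this guide.
SAMPLE = """
<nodes>
  <node>
    <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
    <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
    <isUp>true</isUp>
    <attributes>
      <attribute>
        <key>node.name</key>
        <value>node56191</value>
      </attribute>
    </attributes>
  </node>
</nodes>
"""


def index_nodes(xml_text):
    """Map node.name to the node's identities (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    index = {}
    for node in root.findall("node"):
        attributes = {
            attr.findtext("key"): (attr.findtext("value") or "").strip()
            for attr in node.findall("attributes/attribute")
        }
        name = attributes.get("node.name")
        if name is None:
            continue  # no display name, nothing to index by
        index[name] = {
            "nodeIdentity": node.findtext("nodeIdentity"),
            "locationIdentity": node.findtext("locationIdentity"),
            "isUp": node.findtext("isUp") == "true",
        }
    return index
```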

401 - Invalid authentication

Invalid authentication returns a brief XML document that embeds an HTML-formatted error message.

Usage examples

In the following examples we use:

  • admin as an administrator account name

  • pass as the credential for the admin account

  • 192.168.56.190 as the IP address of the MSP node

  • 8082 as the REST port

Default

curl -u admin:pass http://192.168.56.190:8082/api/nodes

Returns a list of non-removed nodes.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodes>
  <node>
    <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
    <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
    <isLocal>false</isLocal>
    <isUp>true</isUp>
    <isStopped>false</isStopped>
    <lastStatusChange>1411397169893</lastStatusChange>
    <attributes>
      <attribute>
        <key>eco.system.membership</key>
        <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
      </attribute>
      <attribute>
        <key>eco.system.dsm.identity</key>
        <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
      </attribute>
      <attribute>
        <key>node.name</key>
        <value>node56191</value>
      </attribute>
    </attributes>
  </node>
  <node>
    ...
  </node>
  <node>
    ...
  </node>
</nodes>
Include removed nodes in output

curl -u admin:pass "http://192.168.56.190:8082/api/nodes?withRemoved=true"

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodes>
  <node>
    <nodeIdentity>abb21772-5544-43e2-9cb9-ff1node56191</nodeIdentity>
    <locationIdentity>3027fc8f-064e-11e4-b8d2-080027ec317a</locationIdentity>
    <isLocal>false</isLocal>
    <isUp>false</isUp>
    <isStopped>true</isStopped>
    <lastStatusChange>1411397169893</lastStatusChange>
    <attributes>
      <attribute>
        <key>eco.system.membership</key>
        <value>ECO-MEMBERSHIP-190823c6-0658-11e4-a747-0800279336f8</value>
      </attribute>
      <attribute>
        <key>eco.system.dsm.identity</key>
        <value>ECO-DSM-d33204f1-0648-11e4-aaa1-080027b651cd</value>
      </attribute>
      <attribute>
        <key>node.name</key>
        <value>node56191</value>
      </attribute>
    </attributes>
  </node>
  <node>
    ...
  </node>
  <node>
    ...
  </node>
</nodes>
Invalid authentication

curl -u admin:wrongpass http://192.168.56.190:8082/api/nodes

<?xml version="1.0"?>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
    <title>Error 401 No client with requested id: admin</title>
  </head>
  <body>
    <h2>HTTP ERROR: 401</h2>
    <p>Problem accessing /api/nodes. Reason:
<pre>    No client with requested id: admin</pre></p>
    <hr/>
    <i>
      <small>Powered by Jetty://</small>
    </i>
  </body>
</html>

8.4. Node Induction

8.4.1. Perform a node induction

To induct two or more nodes via the API, the following information needs to be included in the XML payload that will be passed to the cURL command:

INDUCTION_XML
<inductionTicket>
  <inductorLocationId>${LOCATION_ID}</inductorLocationId>
  <inductorNodeId>${NODE_ID}</inductorNodeId>
  <inductorHostName>${INDUCTOR_HOST}</inductorHostName>
  <inductorPort>${INDUCTOR_DCONE}</inductorPort>
</inductionTicket>
${LOCATION_ID}

The Location ID of the first node from which we are inducting. You can capture the Location ID from the REST API’s Nodes page. e.g.

http://10.0.2.0:8082/api/nodes

This returns the following information:

<nodes>
    <node>
        <nodeIdentity>156d2cd8-0929-4333-a84e-350e6be44e4b</nodeIdentity>
        <locationIdentity>7b212e2f-486b-11e4-90b8-22564bb81bc7</locationIdentity>
        <isLocal>true</isLocal>
        <isUp>true</isUp>
        <isStopped>false</isStopped>
        <lastStatusChange>1412058720321</lastStatusChange>
        <attributes>
            <attribute>
                <key>eco.system.membership</key>
                <value>
                    ECO-MEMBERSHIP-df34fdfc-486b-11e4-adb1-aa7004f22f33
                </value>
            </attribute>
            <attribute>
                <key>node.name</key>
                <value>node1</value>
            </attribute>
            <attribute>
                <key>eco.system.dsm.identity</key>
                <value>ECO-DSM-7b86a6c0-486b-11e4-90b8-22564bb81bc7</value>
            </attribute>
        </attributes>
    </node>
    ...
</nodes>

You can also find the LocationID on the Settings screen of the admin UI.

${NODE_ID}

The Node ID of the first node from which we are inducting. This can also be found on the API’s /api/nodes screen (see above). It’s also available on the Settings screen of the admin UI.

${INDUCTOR_HOST}

The IP/hostname of the first node from which we are inducting.

${INDUCTOR_DCONE}

The DConE port: this is chosen during installation and needs to be the same across all nodes. Default value is 6444.

Each node (apart from the first node from which we are inducting) will need the above XML.
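
As a minimal sketch, the XML payload described above can be assembled from the four values in a small shell function. The function name build_induction_xml is hypothetical and not part of MSP. The sample IDs are taken from the /api/nodes output shown earlier, and the port falls back to the documented default of 6444 when you do not pass one.

```shell
# Hypothetical helper: build the induction ticket XML on a single line.
# Arguments: location ID, node ID, host name, DConE port (defaults to 6444).
build_induction_xml() {
  printf '<inductionTicket><inductorLocationId>%s</inductorLocationId><inductorNodeId>%s</inductorNodeId><inductorHostName>%s</inductorHostName><inductorPort>%s</inductorPort></inductionTicket>' \
    "$1" "$2" "$3" "${4:-6444}"
}

# Sample values copied from the /api/nodes output in this guide.
INDUCTION_XML=$(build_induction_xml \
  "7b212e2f-486b-11e4-90b8-22564bb81bc7" \
  "156d2cd8-0929-4333-a84e-350e6be44e4b" \
  "10.0.2.0")
printf '%s\n' "$INDUCTION_XML"
```

You can then pass "$INDUCTION_XML" to the -d option of the cURL command.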

Induction

The cURL command looks as follows (change http to https for an SSL-enabled API):

curl -u <username>:<password> -X PUT -d "${ABOVE_XML}" --header 'Content-Type: application/xml' http://<nodeIP>:<apiPort>/api/node/${NODE_ID_TO_BE_INDUCTED}
${NODE_ID_TO_BE_INDUCTED}

The Node ID of the node to which we are inducting.

An example of inducting Node2 (999aacc5-af77-43e7-a8de-9a921aimz3k4) from Node1 (999aacc5-af77-43e7-a8de-9a921apg0sby) would be:

 curl -u admin:pass -X PUT -d '<inductionTicket><inductorLocationId>57b331ba-38f2-11e4-8958-3a2a7398d235</inductorLocationId><inductorNodeId>999aacc5-af77-43e7-a8de-9a921apg0sby</inductorNodeId><inductorHostName>172.16.2.50</inductorHostName><inductorPort>6444</inductorPort></inductionTicket>' --header 'Content-Type: application/xml' http://10.0.2.0:8082/api/node/999aacc5-af77-43e7-a8de-9a921aimz3k4

Note that the NODE_ID in the XML (the first Node we’re inducting from) is different from the NODE_ID of the URL (the Node we’re inducting to).

Repeat this cURL command, with the same XML and a different NODE_ID in the URL, for each node that needs to be inducted. Leave enough time for each induction to complete before attempting the next, otherwise the second induction will be aborted.

Timing
Leave enough time to complete an induction before attempting another. This approach is vulnerable to any delays that occur during an induction, so a cautious approach is to leave minutes between inductions to make sure there is no issue.
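
The following sketch shows one way to script repeated inductions with a pause between them. It rests on several assumptions: the function name induct_nodes and the variables INDUCT_DRY_RUN and INDUCT_PAUSE are hypothetical, the 120-second default pause is an illustrative value rather than a documented one, and each target is given as the full URL of the node being inducted. By default the function only prints what it would do.

```shell
# Hypothetical helper: send the same induction XML to several nodes in turn.
# Arguments: credentials (user:password), XML payload, then one full URL per
# node to be inducted, e.g. http://<nodeIP>:<apiPort>/api/node/<nodeId>.
# INDUCT_DRY_RUN=1 (default) prints the targets instead of calling cURL.
# INDUCT_PAUSE is the wait in seconds after each call (120 is an assumption).
induct_nodes() {
  credentials=$1
  xml=$2
  shift 2
  for target in "$@"; do
    if [ "${INDUCT_DRY_RUN:-1}" = "1" ]; then
      printf 'PUT %s\n' "$target"
    else
      curl -u "$credentials" -X PUT -d "$xml" \
        --header 'Content-Type: application/xml' "$target" || return 1
    fi
    # Give the induction time to complete before starting the next one.
    sleep "${INDUCT_PAUSE:-120}"
  done
}
```

Set INDUCT_DRY_RUN=0 only once the printed targets look correct. A fixed pause does not confirm that an induction has finished, so check the Nodes page before relying on the result.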

8.5. Stop and start nodes

Stop output from nodes

Using this method proposals will still be delivered to the node and the node can still participate in voting, but the proposals will not be executed until the output is restarted.

This is supported in the API with the RepositoryResource methods:

PUT <server>:<port>/api/repository/{repositoryId}/stopoutput

This command takes one argument: NodeListDTO nodeListDTO which is the list of nodes on which the repo output will be stopped. In this case that list will only include NodeX.

PUT: <server>:<port>/api/replicator/stopall

This command takes one argument: NodeListDTO nodeListDTO which is the list of nodes on which the repo output will be stopped. In this case that list will only include NodeX.

Note that whilst the output is stopped it will show in the UI as LocalReadOnly.

PUT <server>:<port>/api/repository/{repositoryId}/startoutput

8.5.2. Stop nodes

If you don’t need the stop to be coordinated, the following method is simpler. It immediately stops the output of proposals on all repositories on the node it is executed against:

PUT: <server>:<port>/api/replicator/stopall

This will stop all repositories on the node on which it is invoked (with no coordination).

8.5.3. Start nodes

To start them all again call:

PUT: <server>:<port>/api/replicator/startall
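
As a small sketch, the two replicator endpoints above can be wrapped so that a typo in the action name is caught before any request is sent. The function name replicator_url is hypothetical, and the use of -u credentials in the usage comment is an assumption based on the other cURL examples in this guide.

```shell
# Hypothetical helper: build the URL for the stopall or startall call.
# Arguments: base URL (http://<server>:<port>) and the action name.
replicator_url() {
  case $2 in
    stopall|startall)
      printf '%s/api/replicator/%s\n' "$1" "$2"
      ;;
    *)
      echo "unknown action: $2 (expected stopall or startall)" >&2
      return 1
      ;;
  esac
}

# Usage (this performs a real call against the node):
#   curl -u admin:pass -X PUT "$(replicator_url http://10.0.2.0:8082 stopall)"
#   curl -u admin:pass -X PUT "$(replicator_url http://10.0.2.0:8082 startall)"
```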

8.6. Remove a node from a replication group

This page describes how to remove a node from your MSP replication group. Remove a node gives more details.

Use the following XML for the cURL command:

<nodes>
  <node>
    <nodeIdentity>${NODE_ID}</nodeIdentity>
    <locationIdentity>${LOCATION_ID}</locationIdentity>
  </node>
</nodes>
${NODE_ID}

The node ID of the first node that you want to remove from the ecosystem. This is on the Settings screen and looks like 999aacc5-af77-43e7-a8de-9a921a45thuc. It is also in the /api/nodes page.

${LOCATION_ID}

The location ID of the first node that you want to remove from the ecosystem. This is on the Settings screen and looks like 0488a9be-38ec-11e4-aa49-3a2a7398d235. It is also in the /api/nodes page.

The cURL command looks like this:

curl -u <username>:<password> -X PUT -d "${ABOVE_XML}" --header 'Content-Type: application/xml' http://<nodeIP>:<apiPort>/api/node/${LOCAL_NODE_ID}/removenodes

For SSL API change http to https.

${LOCAL_NODE_ID}

The node ID of the node that will remain part of the ecosystem. This is on the Settings screen. It is also in the /api/nodes page.

A working example would be:

curl -u admin:pass -X PUT -d "<nodes><node><nodeIdentity>999aacc5-af77-43e7-a8de-9a921aimz3k4</nodeIdentity><locationIdentity>483fcf8d-38f2-11e4-be09-4a9206cdc4f9</locationIdentity></node></nodes>" --header 'Content-Type: application/xml' http://172.16.2.50:8082/api/node/999aacc5-af77-43e7-a8de-9a921apg0sby/removenodes

If the call is successful the removed node is displayed in the Nodes page as Removed when you click Display Removed Nodes.
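
The removal payload can be generated rather than typed by hand. This is a sketch only: the function name build_remove_xml is hypothetical, and passing more than one node per call is an assumption based on the plural nodes element and the removenodes endpoint name. Each argument is a nodeId:locationId pair.

```shell
# Hypothetical helper: build the <nodes> payload for the removenodes call.
# Each argument is "<nodeId>:<locationId>" (the IDs themselves contain no colon).
build_remove_xml() {
  printf '<nodes>'
  for pair in "$@"; do
    node_id=${pair%%:*}       # text before the first colon
    location_id=${pair#*:}    # text after the first colon
    printf '<node><nodeIdentity>%s</nodeIdentity><locationIdentity>%s</locationIdentity></node>' \
      "$node_id" "$location_id"
  done
  printf '</nodes>'
}

# Reproduces the payload of the working example above.
build_remove_xml "999aacc5-af77-43e7-a8de-9a921aimz3k4:483fcf8d-38f2-11e4-be09-4a9206cdc4f9"
printf '\n'
```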

8.7. Add an existing repository

Here is an example procedure for adding a repository that is already present on your node under the control of MSP:

curl -u admin:pass -X POST -d "<svn-repository><name>$REPO_NAME</name><fileSystemPath>$REPO_LOCATION</fileSystemPath><globalReadOnly>false</globalReadOnly><localReadOnly>false</localReadOnly></svn-repository>" -H "Content-Type:application/xml" http://nodeIP:8082/api/repository?replicationGroupId=<rgId>

In the XML, replace $REPO_NAME with the repository’s name and $REPO_LOCATION with the repository’s location (e.g. Repo1 and /opt/Subversion/Repo1). Ensure that you use the appropriate credentials and replication group ID.

The XML passed by -d can also be written to a file such as repoXML.xml, if preferred. The call then becomes:

curl -u admin:pass -X POST -d "@repoXML.xml" -H "Content-Type:application/xml" http://nodeIP:8082/api/repository?replicationGroupId=<rgId>
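
A minimal sketch of creating that file from the repository name and location is shown below. The function name write_repo_xml is hypothetical. The XML mirrors the payload used in the example above, with both read-only flags set to false.

```shell
# Hypothetical helper: write the add-repository XML to a file.
# Arguments: repository name, repository location on disk, output file.
write_repo_xml() {
  printf '<svn-repository><name>%s</name><fileSystemPath>%s</fileSystemPath><globalReadOnly>false</globalReadOnly><localReadOnly>false</localReadOnly></svn-repository>\n' \
    "$1" "$2" > "$3"
}

# Usage (creates repoXML.xml in the current directory):
#   write_repo_xml Repo1 /opt/Subversion/Repo1 repoXML.xml
```

Keeping the payload in a file avoids shell quoting problems when repository names or paths contain spaces.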

8.8. Invoke a consistency check

Example of a REST call to trigger a consistency check.

To perform a consistency check on a repository, use the repository ID in a cURL or equivalent command. You can find the repository ID by listing the repositories known to the target node on the /api/repositories page. It looks something like a6e0c5a3-47a2-11e4-8fe4-22564bb81bc7.

[root@redhat6 svn-multisite-plus]# curl -u admin:pass http://172.16.0.254:8082/api/repositories

8.8.1. Output

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<svn-repositories>
  <repository>
    <dsmId>a6e643e4-47a2-11e4-8fe4-22564bb81bc7</dsmId>
    <fileSystemPath>/opt/Subversion/Repo0</fileSystemPath>
    <globalReadOnly>false</globalReadOnly>
    <latestRevision>
      <revisionNum>0</revisionNum>
      <size>26318</size>
      <timestamp>1375431011000</timestamp>
    </latestRevision>
    <localReadOnly>false</localReadOnly>
    <name>Repo0</name>
    <readOnlyReason></readOnlyReason>
    <replicationGroupId>9a663117-47a2-11e4-8fe4-22564bb81bc7</replicationGroupId>
    <repositoryIdentity>a6e0c5a3-47a2-11e4-8fe4-22564bb81bc7</repositoryIdentity>
    <state xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="deployedStateDTO"/>
  </repository>
  <repository>
    ...
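
The repository IDs can be pulled out of this output with standard text tools. The sketch below assumes each repositoryIdentity element sits on a single line, as in the sample, and relies on the -o option of grep, which is available in GNU, BSD and BusyBox grep. The function name list_repository_ids is hypothetical.

```shell
# Hypothetical helper: read /api/repositories XML on stdin and print
# one repository ID per line.
list_repository_ids() {
  grep -o '<repositoryIdentity>[^<]*' | sed 's/<repositoryIdentity>//'
}

# Usage (performs a real call against the node):
#   curl -s -u admin:pass http://172.16.0.254:8082/api/repositories | list_repository_ids
```

For anything beyond a quick lookup, a proper XML parser is a safer choice than pattern matching.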

The cURL command looks like this (change http to https for an SSL-enabled API):

curl -u <username>:<password> -X POST http://<nodeIP>:<apiPort>/api/repository/${REPOSITORY_ID}/consistencyCheck

You can specify the number of revisions to be checked. Specify the smallest number necessary, as a very large number could exhaust the Java heap. This would cause the replicator to crash and be unable to restart until memory is added to the system and configured for use. We suggest you use at most 100 for normal checking, and recommend that no more than 1000 revisions be checked.
Append this parameter to the end of the URL as follows:

curl -u <username>:<password> -X POST http://<nodeIP>:<apiPort>/api/repository/${REPOSITORY_ID}/consistencyCheck?numberOfRevisions=10

A working example would be:

curl -u admin:pass -X POST http://10.0.0.50:8082/api/repository/a6e0c5a3-47a2-11e4-8fe4-22564bb81bc7/consistencyCheck?numberOfRevisions=3
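
To guard against accidentally requesting too many revisions, the URL can be built by a function that enforces the recommended ceiling of 1000. This is a sketch under stated assumptions: the function name consistency_check_url is hypothetical, and rejecting values above 1000 is a local safety choice, not something the API itself is documented to do.

```shell
# Hypothetical helper: build the consistencyCheck URL.
# Arguments: base URL, repository ID, optional number of revisions.
consistency_check_url() {
  base_url=$1
  repo_id=$2
  revisions=${3:-}
  url="$base_url/api/repository/$repo_id/consistencyCheck"
  if [ -z "$revisions" ]; then
    printf '%s\n' "$url"
    return 0
  fi
  case $revisions in
    *[!0-9]*)
      echo "numberOfRevisions must be a whole number" >&2
      return 1
      ;;
  esac
  # Enforce the recommended upper limit from this guide.
  if [ "$revisions" -lt 1 ] || [ "$revisions" -gt 1000 ]; then
    echo "numberOfRevisions must be between 1 and 1000" >&2
    return 1
  fi
  printf '%s?numberOfRevisions=%s\n' "$url" "$revisions"
}

# Usage (performs a real call against the node):
#   curl -u admin:pass -X POST \
#     "$(consistency_check_url http://10.0.0.50:8082 a6e0c5a3-47a2-11e4-8fe4-22564bb81bc7 3)"
```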

8.9. License endpoint

8.9.1. Perform a license status check

To get the license status via the API, use the following cURL command:

curl -u <username>:<password> http://<nodeIP>:<apiPort>/api/license

8.9.2. Example

  curl -u api:password http://172.16.2.24:8082/api/license

8.9.3. Output

An example of the XML returned by the cURL command:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<license>
  <allowedIps>192.168.56.209</allowedIps>
  <companyName>Your Company</companyName>
  <rawproperties>
    <licenseproperty>
      <key>nusers</key>
      <value>500</value>
    </licenseproperty>
    <licenseproperty>
      <key>product</key>
      <value>svnplus</value>
    </licenseproperty>
    <licenseproperty>
      <key>lm-version</key>
      <value>bc15-direct</value>
    </licenseproperty>
    <licenseproperty>
      <key>nsites</key>
      <value>1</value>
    </licenseproperty>
    <licenseproperty>
      <key>ip</key>
      <value>192.168.56.209</value>
    </licenseproperty>
    <licenseproperty>
      <key>eval-license</key>
      <value>false</value>
    </licenseproperty>
    <licenseproperty>
      <key>no_expiry</key>
      <value>false</value>
    </licenseproperty>
    <licenseproperty>
      <key>company</key>
      <value>YourCompany</value>
    </licenseproperty>
    <licenseproperty>
      <key>expiration</key>
      <value>12/31/2018</value>
    </licenseproperty>
    <licenseproperty>
      <key>scm</key>
      <value>svn</value>
    </licenseproperty>
    <licenseproperty>
      <key>maint_start</key>
      <value>1501792681</value>
    </licenseproperty>
    <licenseproperty>
      <key>fd-licver</key>
      <value>3614</value>
    </licenseproperty>
    <licenseproperty>
      <key>maint_end</key>
      <value>1546214400</value>
    </licenseproperty>
  </rawproperties>
  <currentUsers>0</currentUsers>
  <expiry>1546232400000</expiry>
  <licenseType>Production License</licenseType>
  <maxUsers>99999</maxUsers>
  <numberOfNodes>1</numberOfNodes>
</license>

Read more about REST API calls for LicenseResources
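
One use of this output is a simple expiry check. The sketch below assumes that the expiry element holds milliseconds since the Unix epoch, which is consistent with the 12/31/2018 expiration property in the sample but is not stated in this guide. The function name license_days_left is hypothetical, and the arithmetic needs a shell with 64-bit integers.

```shell
# Hypothetical helper: read license XML on stdin and print the number of
# whole days until expiry. Argument: current time in seconds since the epoch.
# Assumes <expiry> is in milliseconds since the Unix epoch.
license_days_left() {
  expiry_ms=$(sed -n 's:.*<expiry>\([0-9][0-9]*\)</expiry>.*:\1:p' | head -n 1)
  [ -n "$expiry_ms" ] || return 1
  echo $(( (expiry_ms / 1000 - $1) / 86400 ))
}

# Usage (performs a real call against the node):
#   curl -s -u api:password http://172.16.2.24:8082/api/license \
#     | license_days_left "$(date +%s)"
```

A negative result means the expiry time has already passed.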

8.10. Pending transactions

8.10.1. List all pending transactions

To get the pending transactions at a node via the API, the Node ID is required for the cURL command. The node ID can be found on the Settings screen and will look something like 999aacc5-af77-43e7-a8de-9a921a45thuc. It can also be found in the /api/nodes page.

The cURL command looks like this (change http to https for an SSL-enabled API):

curl -u <username>:<password> http://<nodeIP>:<apiPort>/api/node/${NODE_ID}/pendingTransactions

This will return the number of pending transactions for the specified node.

A working example would be:

curl -u admin:pass http://10.0.2.50:8082/api/node/999aacc5-af77-43e7-a8de-9a921afg6lei/pendingTransactions

Which gives the output:

 3[root@redhat6 svn-multisite-plus]#

The number of pending transactions is returned, in this case 3. The shell prompt follows on the same line because the response does not end with a newline.
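
Because the call returns a bare number, it is easy to poll until a node has no pending transactions, for example before maintenance. The following is a sketch only: the function names wait_until_drained and fetch_pending are hypothetical, and the default of 30 attempts at 10-second intervals is an illustrative assumption.

```shell
# Hypothetical helper: poll until the pending transaction count reaches 0.
# Arguments: name of a command that prints the count, maximum attempts
# (default 30), delay between attempts in seconds (default 10).
# Returns 0 once the count is 0, or 1 if the attempts run out.
wait_until_drained() {
  fetch_cmd=$1
  max_tries=${2:-30}
  delay=${3:-10}
  try=1
  while [ "$try" -le "$max_tries" ]; do
    # Keep digits only, so stray whitespace does not break the comparison.
    pending=$($fetch_cmd | tr -cd '0-9')
    # An empty response is treated as "not drained".
    if [ "${pending:-1}" -eq 0 ]; then
      return 0
    fi
    sleep "$delay"
    try=$((try + 1))
  done
  return 1
}

# Usage (performs real calls against the node):
#   fetch_pending() {
#     curl -s -u admin:pass \
#       http://10.0.2.50:8082/api/node/999aacc5-af77-43e7-a8de-9a921afg6lei/pendingTransactions
#   }
#   wait_until_drained fetch_pending 30 10 && echo "no pending transactions"
```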