WANdisco

Installation Guide

This guide describes how to install Git MultiSite (GitMS):

1. Before you install

Before installing GitMS, make sure that you have sufficient hardware and that all required software is configured appropriately.

1.1 Skills requirements

This section describes the knowledge and technical requirements for deployment and operation of the WANdisco software. Make sure that you can meet the requirements before you begin the deployment.

Technical skill requirements

System administration
  • Unix operating system installation
  • Disk management
  • Memory monitoring and management
  • Command line administration and manually editing configuration files

Apache administration, if applicable
  • Familiarity with Apache web server architecture
  • Management of httpd.conf / Apache2 configuration file settings
  • Start/stop/restart administration
  • User authentication options
  • Log setup and viewing

Networking
  • IP address assignment
  • TCP/IP ports and firewall setup

Git
  • Familiarity with Git administration in order to manage Git repositories via the command line
  • Repository creation and/or file system copying and synchronization
  • Familiarity with WANdisco's replication architecture
  • Understanding of the installation procedure relevant to your OS
  • Concept of Node types and Replication groups

Gerrit, if applicable

If you would like help assessing your requirements, request a supported installation from WANdisco.

One administrator can manage all the systems running MultiSite. However, we recommend that you have someone at each site who is familiar with MultiSite basics.

1.2 Deployment overview

We recommend that you follow a well-defined plan for your WANdisco GitMS deployment. This helps you keep control, understand the product, and find and fix any issues before production. We recommend that you include the following steps:

1. Pre-deployment planning: Identify the requirements, people, and skills needed for deployment and operation. Agree a schedule and milestones. Highlight any assumptions, constraints, dependencies, and risks to a successful deployment.

2. Deployment preparation: Prepare and identify server specifications, locations, node configuration, repository set-up, replication architecture, and the server and software configurations.

3. Testing phase: Actions for an initial installation and testing in a non-production environment, executing test cases, and verifying deployment readiness.

4. Production deployment: Actions to install, configure, test, and deploy the production environment.

5. Post-deployment operations and maintenance: Actions including environment monitoring, system maintenance, training, and in-life technical support.

1.3 System requirements

This section describes how to prepare your Git servers for replication. You need to ensure that you've got a suitable platform, with sufficient hardware and compatible versions of the required software that is configured appropriately. Use this information as a guide, not as a fixed set of requirements.

You can run GitMS nodes as virtual machines on the same physical hardware. Note that this will impact the ability to provide uptime if there is a hardware outage.

If you do want to use virtual machines, make sure your setup is configured to allow uninterrupted running if there is a hardware outage.

1.3.1 Hardware recommendations

Hardware sizing guidelines
Size         #Users   Repository size (GB)   CPU speed (GHz)   #CPU   #Cores   RAM (GB)   HDD (GB)
Small        100      25                     2                 1      2-4      8-16 *     100
Medium       500      100                    2                 2      4        16-32      250
Large        1000     500                    2.66              4      4        32-64      750
Very Large   5000     1000                   2.66              4      4-6      128        1500

* For small deployments it should be possible to run with 8GB of system memory. However, if you are going to run additional services such as Gerrit on the same server then 16GB is the minimum system memory requirement. For GitMS/Gerrit deployments with large numbers (i.e. more than hundreds) of users, you should increase the minimum memory requirement to 32GB or 48GB.

1.3.2 Storage tips

1.3.3 Processor tips

1.4 Setup requirements

This is a summary of requirements. You must also check the more detailed Installation checklist.

1.4.1 MultiSite servers

This section summarises requirements:

1.4.2 Git installations

Git installation requires:

Tips for installation:

1.4.3 Repository consistency

Repositories should start out as identical at all sites. A tool such as rsync can be used to guarantee this requirement. The exception is the hooks directory which can differ as variances in site policy may require different hooks.

Note: If using GerritMS then do not install Git hooks since Gerrit has its own hook mechanism.

In addition to the normal Git hooks, GitMS supports replicated hooks. See hooks for more information.
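
As a rough way to confirm that two copies of a repository are identical apart from their hooks directories, the comparison could be sketched as below. This assumes GNU diff and that both copies are reachable from one host (for example, through a mounted copy). The function name repos_match and the paths are illustrative only and are not part of GitMS.

```shell
# Hypothetical helper: succeeds when two repository directories have identical
# contents, ignoring anything named "hooks" (hooks may differ between sites).
repos_match() {
    # -r: recurse into subdirectories
    # -q: only report whether files differ
    # -x hooks: skip entries whose base name is "hooks"
    diff -r -q -x hooks "$1" "$2" >/dev/null
}

# Example with placeholder paths:
# repos_match /path/to/repo.git /mnt/othersite/repo.git && echo "identical"
#
# A copy that leaves the destination hooks untouched could look like this
# (host and paths are placeholders):
# rsync -a --delete --exclude 'hooks/' /path/to/repo.git/ user@host:/path/to/repo.git/
```

The check only compares file contents. It does not confirm ownership or permissions, which the checklist covers separately.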

2. Installation checklist

Though you may have referred to the checklist before evaluating GitMS, we strongly recommend that you reread it before deployment to confirm that your system still meets all requirements.

System setup
Operating systems We've tested the following operating systems:
  • Red Hat Enterprise Linux Server (64-bit): 6.6
    Important note:
    Red Hat 6 requires the RHEL Server Optional repository to be enabled in Red Hat Network.
  • CentOS: 6.6

    See Red Hat note above.

  • SUSE: 11

    Contact WANdisco Support for more information about running on this platform.

Go 64-bit
We don't support GitMS on 32-bit architecture because this would impose serious limits on scalability. You must deploy on a 64-bit OS.
During install you are asked which user and group you want to run MultiSite as. On Ubuntu this change does not apply system-wide, so some files have the default group set. This is not a problem, but something to consider when deciding on your OS.
Git server Required version
GitMS needs to use WANdisco's own Git distribution, version 2.7.0 or 2.7.4, which includes modifications necessary to deploy Git with replicated repositories.
Note that the GitMS installer does not update the Git version. You must do this before running the GitMS installer.

Write access for system user
The replicator user must have write permission for all repositories, because the replicator writes directly to the Git repository.

Manage repository file ownership if using Git+SSH:// or file://
Accessing Git repositories via Apache is simplified because all user access is handled via the same daemon user. There are potential permission problems with Git+SSH or file:// when multiple users access the same repository.

Additional Git technologies required
JGit, the Java library from Eclipse, and C-Git, the git implementation written in C, are both required by GitMS.
The necessary version of JGit is included in the GitMS install but C-Git binaries need to be installed additionally.
For GitMS 1.7.2 the JGit version included is version 4.1.1. The C-Git version required is version 2.7.0. See Git binaries for how to install.

Tips:
  • Simplify user management by putting SSH users into a single group. You can then ensure that the group has read/write permissions for the repositories.
  • Make repositories wholly owned by the group.
  • Ensure that the prevailing umask is set to provide suitable permissions (002 instead of the default 022).
    Information about setting umask and Gitolite integration
    For the replicator, a umask of 027 gives 750 permissions on the repositories that it creates. This means that only the system account that runs GitMS can write to them, so all pushes to repositories must come through this account, for example with suexec when using Apache.

    Accounts in the same primary group can read from the repository, but their pushes are rejected.

    For the GitMS user, the repository umask 027 works if Gitolite is controlled by the same system user, and Apache works if it uses suexec to run the backend as the GitMS user. However, without group write access, other members of the GitMS group cannot work on the repository.

    We recommend using umask 007 to give the group the ability to write to the repository.

    Running with the GitMS account in the same group as the repository owner is workable, but it breaks the ability to do garbage collection and causes issues later. A umask of 027 is not workable for group access but does not appear to cause issues if you use the GitMS user. We recommend that the Git MultiSite system user owns both processes. Members of the GitMS group can push to and use a repository with 007 as long as GitMS owns the repositories.
  • Use wrapper scripts for certain commands.
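
To see how the umask values discussed in the tips above translate into permissions, a small experiment along these lines may help. It is only a sketch: the function name mode_for_umask is hypothetical, and it assumes GNU stat and mktemp as found on typical Linux systems.

```shell
# Print the permission bits that a newly created directory receives under a
# given umask. Runs in a subshell so the caller's umask is left unchanged.
mode_for_umask() {
    (
        umask "$1"
        tmp=$(mktemp -d) || exit 1
        mkdir "$tmp/repo"
        stat -c '%a' "$tmp/repo"   # GNU stat: print the octal mode
        rm -rf "$tmp"
    )
}

mode_for_umask 027   # owner rwx, group r-x, others none
mode_for_umask 007   # owner and group rwx, others none
```

Files created under the same umask lose the execute bits as well, so the directory mode is the easier one to compare against the values quoted above.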
Git binaries
These are available from WANdisco. They provide the builds, including modifications required for GitMS. Make sure you use the correct binaries for your version of GitMS.
Same location
All replicated repositories must be in the same location, i.e. the same absolute path, and in exactly the same state before replication can start.
Git client Any Git client compatible with a Git 1.8 remote repository. This minimum requirement is for Git 1.7.
Hooks Normally hook scripts are replicated on all repositories but in some circumstances this is not possible. See hooks for more information.
System memory Minimum recommended: 8GB RAM, 16GB swapping container
Disk space Git: Match to projects and repositories.
MultiSite Transaction Journal: Equivalent of seven days of changes.

To estimate your disk requirements, you need to quantify some elements of your deployment:
  • Overall size of all of your Git repositories
  • Frequency of commits in your environment
  • Types of files being modified: text, binaries (Git clients only send deltas for text)
  • Number and size of files being changed
  • Rate that new files are being added to the repository
File descriptor/User process limits Ensure hard and soft limits are set to 64000 or higher. Check with the ulimit or limit command.
Running lots of repositories
When the replicator is not run as a root user, the max user processes needs to be set to a high value otherwise your system cannot create the threads required to deploy all your repositories.
User process limits:

Maximum processes and open files are low by default on some systems. We recommend that process numbers, file sizes, and number of open files are set to unlimited.

Temporary changes for current shell:

ulimit -u unlimited && ulimit -f unlimited && ulimit -n 64000

-f The maximum size of files created by the shell, default option
-u The maximum number of processes available to a single user
-n The maximum number of open files for a single user

Permanent changes:

RHEL6 and later:

Make the changes in both /etc/security/limits.conf and /etc/security/limits.d/90-nproc.conf:

# Default limit for number of user's processes to prevent
      # accidental fork bombs.
      # See rhbz #432903 for reasoning.
      * soft nproc 1024   <- Increase this limit or ulimit -u will be reset to 1024
      # The asterisk changes values for all users. If you want to change for a specific user, replace it with the username:
      gitms soft nproc 65000
      gitms hard nproc 65000
      gitms soft nofile 65000
      gitms hard nofile 65000
      gitms soft 

Ubuntu

Make the changes in /etc/security/limits.conf:


      gitms           soft    nofile  65000
      gitms           hard    nofile  65000
      gitms           soft    nproc   65000
      gitms           hard    nproc   65000

If you do not see these increased limits, you may need to edit more files.

If you are logging in as the MultiSite user, add the following to /etc/pam.d/login:

session  required  pam_limits.so

If you su to the MultiSite user, add the following to /etc/pam.d/su:

session  required  pam_limits.so

If you run commands through sudo you need to make the same edit to /etc/pam.d/sudo.
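
After changing the limits, you may want to confirm that a fresh session for the MultiSite user actually sees them. The following is a minimal sketch, assuming bash (for ulimit -u) and the 64000 guideline from this checklist. The helper names limit_ok and report are made up for illustration.

```shell
# Hypothetical helper: succeed when a reported limit meets a required minimum.
# "unlimited" always passes; empty or non-numeric values are treated as not met.
limit_ok() {
    case $1 in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge "$2" ]
}

# Print one line per limit, comparing it against the 64000 guideline.
report() {
    if limit_ok "$2" 64000; then
        echo "$1: $2 (ok)"
    else
        echo "$1: $2 (below 64000)"
    fi
}

# Soft limits for the current shell; run this as the MultiSite user.
report "open files" "$(ulimit -n)"
report "user processes" "$(ulimit -u 2>/dev/null)"
```

Run the check in a new login, su, or sudo session, whichever you normally use to start MultiSite, because each path reads the pam configuration described above.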

File systems Supported file systems include:
  • ext4
  • VxFS from Veritas
  • XFS on RHEL/CentOS 7
    • XFS version 2.8.10 (or newer) combined with Kernel version 2.6.33 (or newer) - this requirement is met by RHEL7.2 and above.

Write barriers should always be enabled.

Journaling file system Replicator logs should be on a journaling file system, for example, ext3, or VxFS from Veritas.
Avoid data loss
See our Knowledgebase article, Data Loss and Linux, which looks at several implementation strategies that mitigate potential data loss as a result of power outages.
Java Install JDK 1.7.
Use Oracle Java
Our development and testing is done using Oracle JDK 1.7. While you may be able to use other Java packages, we cannot support you unless you run with Oracle's JDK 7 or later versions.
  1. Install JDK/JRE 1.7 (from Oracle) and define the JAVA_HOME environment variable to point to the directory where the JDK/JRE is installed.
  2. Add $JAVA_HOME/bin to the path and ensure that no other java (JDK or JRE) is on the path:
      $ which java
      /usr/bin/java
      $ export JAVA_HOME="/usr"
      
  3. You can run with the JRE package instead of the full JDK. Check this by running java -server -version. If it generates a not found error, repeat Steps 1 and 2.
    If you have package management problems or conflicts with the JDK version you are downloading (for example, rpm download for Linux), you may want to use the self-extracting download file instead of the rpm (on Linux) package. The self-extracting download easily installs in any directory without any dependency checks.
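
The relationship between the java binary and JAVA_HOME shown in step 2 can be sketched as follows. The function name java_home_for is hypothetical. It simply strips the trailing /bin/java from a path and does not resolve symbolic links, so treat the result as a candidate to verify rather than a definitive value.

```shell
# Derive a JAVA_HOME candidate from the path of a java binary by removing
# the trailing /bin/java, e.g. /usr/bin/java -> /usr.
java_home_for() {
    dirname "$(dirname "$1")"
}

# If java is on the PATH, export the derived value for this shell.
if java_bin=$(command -v java); then
    JAVA_HOME=$(java_home_for "$java_bin")
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

You can then confirm the result with java -server -version, as described in step 3.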
Python Install version 2.3 or later.
Browser compatibility Setup and configuration requires access through a browser. These browsers are known to work:
  • Internet Explorer 8 & 9 or later
  • Firefox 4.0 or later
  • Google Chrome 10.0 or later
  • Safari 5.0.4 or later
  • Opera 10.60 or later
GitMS is not compatible with Internet Explorer 6 or 7
We know that some users are still tied to earlier versions of Internet Explorer. However, we cannot provide backward compatibility for all time.
Network settings
Reserved ports During installation a block of ports is reserved for use by MultiSite. You cannot edit these ports after installation. Make sure you get them right at the start.

Required ports

dcone.port= An integer between 1 - 65535, default=6444
DConE port handles agreement traffic between sites

content.server.port= An integer between 1 - 65535, default=4321
The content server port is used for the replicator's payload data: repository changes etc.

gitms.local.jetty.port= An integer between 1 - 65535, default=9999
The jetty port is used for the MultiSite management interface.

jetty.http.port= An integer between 1 - 65535, default=8082
The jetty port is used for the MultiSite management interface.

jetty.https.port= An integer between 1 - 65535, default=8445
The jetty port is used for the MultiSite management interface when SSL encryption is enabled.
Make each port different
In contrast with earlier versions of MultiSite, which used the same port for both the UI and replication traffic, Git MultiSite doesn't multiplex different traffic on a single port. You need to assign a different port to each type of traffic.
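
Because the ports cannot be edited after installation, it can be worth checking your planned values before you start. The sketch below is not part of the installer. The function name ports_valid is invented for illustration, and it only checks that each value is an integer between 1 and 65535 and that no value is used twice.

```shell
# Hypothetical pre-install check: every port must be an integer in 1-65535
# and no two ports may be equal.
ports_valid() {
    seen=" "
    for p in "$@"; do
        case $p in
            ''|*[!0-9]*) echo "not a number: $p"; return 1 ;;
        esac
        if [ "$p" -lt 1 ] || [ "$p" -gt 65535 ]; then
            echo "out of range: $p"
            return 1
        fi
        case $seen in
            *" $p "*) echo "duplicate port: $p"; return 1 ;;
        esac
        seen="$seen$p "
    done
    echo "ports ok"
}

# The defaults listed above: DConE, content server, local jetty,
# jetty HTTP, and jetty HTTPS.
ports_valid 6444 4321 9999 8082 8445
```

The check says nothing about whether another service, such as Gerrit, is already listening on a port. Confirm that separately on each node.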
Firewall or AV software If your network has a firewall, ensure that traffic is not blocked on the reserved ports noted above. Configure any AV software on your system so that it doesn't block or filter traffic on the reserved ports.
Full connectivity GitMS requires full network connectivity between all nodes. Ensure that each node's server can communicate with all other servers that will host nodes in your installation.
VPN
Set up IPsec tunnel, and ensure WAN connectivity.
VPN persistent connections Ensure that your VPN doesn't reset persistent connections for GitMS.
Bandwidth Put your WAN through realistic load testing before going into production. You can then identify and fix potential problems before they impact productivity.
DNS setup Use IP addresses instead of DNS hostnames. This ensures that DNS problems won't hinder performance. If you are required to use hostnames, test your DNS servers' performance and availability before going into production.
NTP You should deploy a robust implementation of NTP, including monitoring, because NTP will not auto-correct if the time is offset too far from the current time. This is an important requirement because a number of problems can occur when nodes are not in sync. For example, artifacts created through the REST API, such as when deploying with Gerrit MultiSite, will be improperly created, resulting in potential time reporting errors.
MultiSite setup
System User Account Take careful note of this requirement as many installation problems are caused by running applications with unsuitable or incompatible system accounts.
In most cases you can install Git MultiSite with any system user with suitable permissions, e.g. "wandgit". However, you must ensure that the user belongs to the group "apache".

Read a detailed explanation of why this is required: System accounts for running MultiSite.
Replication configuration Read our Replication Setup Guide for information on how to optimise your replication.
Voters follow the sun
Users get the best performance if GitMS gets agreement from the local node. For this reason you should schedule the voter node to correspond with the location in which developers are active (i.e. in office hours).

 Disk space for recovery journal

Provision large amounts of disk space for multisite-plus/replicator/database, enough to cover at least the number of commits made within a two to four hour period during your times of peak Git usage.
License Model

GitMS is supplied through a licensing model based on the numbers of both nodes and Git repository end-users. WANdisco generates a license.key file that is matched to your agreed usage requirements.

Evaluation license
To simplify the process of pre-deployment testing, GitMS is supplied with an evaluation license. This type of license imposes no restrictions on use but is time-limited to an agreed period.

Production license
Customers entering production need a production license file for each node. These license files are tied to the node server's IP address, so take care during deployment. If a node needs to be moved to a new server with a different IP address, contact WANdisco's support team and request that a new license be generated, ideally before you transfer the node. Production licenses can be set to expire or they can be perpetual.

Special node types
Git MultiSite offers additional node types that provide limited functionality for special cases where a node only needs to perform in a limited role:

Passive Nodes (Learner only): A passive node operates like a slave in a master-slave model of distribution. Changes to its repository replicas only occur through inbound proposals. It never generates any proposals itself.

Voter-only nodes (Acceptor only): A voter-only node does not need to know the content of proposals. It votes only on the basis of replication history: "Have I already voted yes to a Global Order Number equal to or larger than this one?".

These limited-function nodes are licensed differently from active nodes. Get more details from WANdisco's sales team. Briefly, the IP addresses are a fixed list. However, the node count and special node count may move between sets of nodes, as long as the number of each type of node is within the limit specified in the license.key.

Removing Git MultiSite In the event that you need to remove Git MultiSite, your replicated repositories can continue to be used in a normal, non-replicated setting. Furthermore, the repositories will not contain any WANdisco proprietary artifacts or formats. See Removing Git MultiSite.
Gerrit setup: applicable if you are planning to integrate GitMS with Gerrit code review.
Gerrit version Version 2.9.1 or later: GitMS for Gerrit requires that you are running version 2.9.1 or later of Gerrit.
You will need to upgrade to this or a later version before completing the installation of GitMS.
Database Percona XtraDB We have developed and tested GitMS for Gerrit using Percona XtraDB 5.6.22-25.8.
Configuration change: During installation of Gerrit's MultiSite components, you need to modify Gerrit's database settings to increase its maximum number of database connections.
Replication requirements You need to be aware of the following limits that apply to this version of Git MultiSite for Gerrit:
  • Gerrit currently integrates with a single Replication Group:
    Using multiple replication groups with Gerrit is an advanced operation. Before proceeding, contact WANdisco Support.
  • All nodes in your Gerrit replication group must be Active or Active-Voters:
    Any Gerrit node could also be a Tiebreaker. Passive and Voter-only nodes are not supported.
Authentication

OpenID not compatible: It's not possible to use Google's OpenID authentication. If you are planning to use HTTP then you need to ensure that you have an Apache web server running in front of Gerrit.
Caching

Gerrit caches are now replicated between nodes. When a cache becomes outdated on one node, it is also marked outdated on the other nodes, so you get the benefit of caching without the problems that stale caches could cause when something changes on a remote node. To make sure that the caches are enabled in Gerrit for Git MultiSite, you need to add the relevant properties to the application.properties file. See Enable Gerrit Caching.
System resources Protect the server against resource exhaustion:
The integration of Gerrit into a GitMS deployment will increase the demands on server resources. Take note of GitMS's requirements for setting high File descriptor / User process limits. While these requirements are not changed by the addition of Gerrit, it does make resource management even more important.

Gerrit garbage collection:
The system administrator should configure Gerrit to run a scheduled garbage collection. This can help ensure that the server doesn't experience errors or performance degradation as a result of system resources running out.

Gerrit garbage collection
For tips see Running Garbage Collection in the Admin Guide.

Plugins Gerrit supports a number of plugins for integrating additional applications. Currently we have successfully tested the plugins for Jenkins and JIRA.
General plugin information
  • Plugins need to be installed in exactly the same way on every node to ensure deterministic behavior or nodes can lose their sync.
  • Plugins that use global configuration of key-value pairs, stored in the gerrit.config will replicate without problem providing they are configured the same on all nodes.
  • Plugins with Project-level configuration (stored in project.config within refs/meta/config) should replicate without problem.
  • We're still investigating whether plugins that request data directories for storage can be supported with replication. See the next section.
Integration with third-party applications Many Gerrit deployments are integrated with one or more third-party applications. While there are no hard and fast rules for how these will be affected by moving to a replicated environment, the following information may be useful:
Git hooks

GitMS offers both standard and replicated hooks. The administrator must understand how these differ and which should be used for a given task.

Gerrit event stream

At the moment the event stream only publishes events that occur directly on a node. Integrations that rely on the event stream (like the Jenkins plugin for Gerrit) must connect to every Gerrit node in order to function normally.

3. Installation

This Installation Guide describes setting up GitMS for the first time. If you are upgrading from an earlier version of GitMS you should also follow this procedure. GitMS is a completely new class of product so it's not possible to follow a shortcut upgrade procedure.

If you need to upgrade Git:
  1. Stop GitMS on all nodes.
  2. Upgrade Git on all nodes.
  3. Start GitMS on all nodes.
  4. Follow the current GitMS upgrade instructions.

3.1 Installation overview

This is an overview of the process:

  1. Double-check the Installation checklist. Take time to make sure that you have everything set up and ready. This avoids problems during installation. In particular, check:
    • Git authentication: Git is installed, and using authentication.
    • JDK: You need to run an Oracle JDK. We recommend JDK 7, but 6 works, with warnings.
    • Java memory settings: The Java process on which GitMS runs is assigned a minimum and maximum amount of system memory. By default it gets 128MB at startup and 4GB maximum.
    • System resources: Ensure that your system is going to operate with a comfortable margin.
  2. Ensure that your repositories are copied into place on all nodes.
  3. Download and copy the MultiSite files into place.
  4. Run the setup, then complete the installation from a web browser.

3.2 Before you start

  1. Read through the Installation checklist thoroughly.

    If running with Access Control/Flume
    If you are following on from a previous installation/upgrade that was done using root, all subsequent upgrades also need to be run using root.

  2. Ensure that you have WANdisco's Git binaries pre-installed. GitMS edition requires changes that are built into WANdisco's version of Git.

    Git binary versions
    It is crucial that the Git binaries are the correct ones for your version of GitMS. For more information see the release notes for your version of GitMS. If you are adding a new node to an existing ecosystem, make sure that you install the same version of Git as is on the existing nodes.

  3. Ensure that the system user used for installing GitMS has access to Java, otherwise the installation fails.
  4. At the end of an installation (when including auditing functionality), you'll currently see an error message:
    
    Do you want to continue with the installation? (Y/n) y
    Installing Apache Flume to /opt/wandisco/flume-git-multisite
    WARNING: Cannot read /opt/wandisco/git-multisite/replicator/logs/gitms.log
    Stopping flume: [OK]
    Starting flume: [OK]
    Starting ui: [OK]
    Checking if the GUI is listening on port 8080: ............Done
    
    Please visit http://<thisHost>:8080/ to finish the installation
    
    Installation Complete
    You can ignore this error. The missing log file is not created by the system at this stage. It is created when the Git MultiSite replicator is first started, after the installation is complete.

Set the LOG_FILE environmental variable

If you need to capture a complete record of installer messages, warnings, and errors, then you need to set the LOG_FILE environment variable before running the installer. Run:
 export LOG_FILE="opt/wandiscoscp/log/file.file"
The installer must be able to append to this file. Ideally, the file should not already exist, or it should exist and be empty. If the file does not exist, the account running the installer must be able to create it in its directory.
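
Before running the installer, you could check that the chosen log location meets these conditions. The following is a sketch only. The function name log_file_usable and the example path are hypothetical, and the check reflects the permissions of the account that runs it, so run it as the account that will run the installer.

```shell
# Hypothetical check: succeed when the current account could append to the
# given log file, or create it if it does not exist yet.
log_file_usable() {
    if [ -e "$1" ]; then
        # Existing path: must be a regular, writable file.
        [ -f "$1" ] && [ -w "$1" ]
    else
        # Missing file: its directory must exist and be writable.
        dir=$(dirname "$1")
        [ -d "$dir" ] && [ -w "$dir" ]
    fi
}

# Usage with a placeholder path (not a GitMS default):
# log_file_usable /path/to/installer.log && export LOG_FILE=/path/to/installer.log
```

Note that the root account passes write checks on most file systems, so the result is most informative for a non-root account.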

3.2.1 Install with ACP auditing functionality

If you are installing GitMS where the account access auditing functionality for ACP is required, make sure that you set the following variables:

If FLUME_AVRO_SSL=true you also need to set:

For details see ACP installation instructions.

3.3 Install GitMS

  1. Extract the setup file.
  2. Save the wandisco-git-multisite.sh installer file to your Installation site.
  3. Make the script executable, e.g. enter the command:
    chmod a+x wandisco-git-multisite.sh
    Workaround if /tmp directory is "noexec"
    Running the installer script will write files to the system's /tmp directory. If the system's /tmp directory is mounted with the "noexec" option then you will need to use the following argument when running the installer:
    --target <someDirectoryWhichCanBeWrittenAndExecuted>
    E.g.
    ./git-multisite.sh --target /opt/wandisco/installation/
  4. Run the setup script:
    Don't sudo
    Instead, the administrator should log in (or sudo) to the "root" account and run the installation from there. This is because "sudo cmd" does not modify the PATH properly to include the "/sbin" directory, whereas using sudo to get to a shell's command prompt does.
      [root@redhat6 ~]# chmod a+x git-multisite.sh
      [root@redhat6 ~]# ./git-multisite.sh
      Verifying archive integrity... All good.
      Uncompressing WANdisco MultiSite .......
          ::   ::  ::     #     #   ##    ####  ######   #   #####   #####   #####
         :::: :::: :::    #     #  #  #  ##  ## #     #  #  #     # #     # #     #
        ::::::::::: :::   #  #  # #    # #    # #     #  #  #       #       #     #
       ::::::::::::: :::  # # # # #    # #    # #     #  #   #####  #       #     #
        ::::::::::: :::   # # # # #    # #    # #     #  #        # #       #     #
         :::: :::: :::    ##   ##  #  ## #    # #     #  #  #     # #     # #     #
          ::   ::  ::     #     #   ## # #    # ######   #   #####   #####   #####
    
    
      INFO: Using the following Memory settings:
    
      INFO: UI:         -Xms128m -Xmx1024m
      INFO: Replicator: -Xms1024m -Xmx4096m
    
      Do you want to use these settings for the installation? (Y/n)
      
  5. Enter Y and press Enter.

    Which port should the MultiSite UI listen on? [8080]:

    Running Gerrit?
    If you are going to integrate GitMS with Gerrit then make sure that you select a port that will not conflict. Gerrit also defaults to port 8080.

  6. Confirm the port that you want to run the admin interface on:

    We strongly advise against running Git MultiSite as the root user.
    
      Which user should Git MultiSite run as?
  7. Confirm the user who will run GitMS:

    This user will need to have read and write access to your git repos

  8. Which group should Git MultiSite run as?
  9. Confirm the group of the user running GitMS:

    Installing with the following settings:
    
    MultiSite user:    gitms
    MultiSite group:   gitms
    MultiSite UI Port: 8080
    MultiSite UI Minimum memory: 128
    MultiSite UI Maximum memory: 1024
    MultiSite Replicator Minimum memory: 1024
    MultiSite Replicator Maximum memory: 4096
    
    Do you want to continue with the installation? (Y/n)
  10. Confirm the configuration settings and enter Y to finish the install. In our example, our server runs as gitms with a group of gitms.
  11. Open a browser and go to the provided URL. If your server's DNS isn't running you can go to the next step at the following address:
    http://<IP_Address>:<admin port>/multisite-local
    e.g. http://10.0.100.252:8080/

    Flush your browser cache
    If you are reinstalling and using SSL, then you should clear your browser cache before you continue. Previous SSL details are stored in the cache and will cause SSL errors if they are not flushed.

  12. The web installer begins with the Welcome screen:

    Set up > Start

    Welcome to Git MultiSite.
    You're about to run through the installation, which should only take a couple of minutes.
    If you run into difficulties on the way, check our documentation or talk to our support team through the Customer Support Website.
    Before you click Next, make sure you read the Installation Checklist.
  13. Click Next to begin the installation.
  14. The next screen contains the WANdisco Master Subscription Agreement and Terms & Conditions. To continue the installation click the I Agree button.

    Set up > License agreement

  15. On the next screen, License Upload, you are prompted to browse for your product license key file. Click the Browse button and locate your file. You received this from the WANdisco sales team. Contact them if you have any problems locating or using your license file.

    Set up > license.key file

  16. On the Administrator Setup screen enter the username plus an associated password that you will use to log in to Git MultiSite's UI. This information is only added during the installation of the first "inductor" node.
    Setup 04

    Set up > Admin settings, entered or uploaded in the users.properties file

    Username
    The administrator's username.
    Password
    The administrator's password.
    Confirm Password
    Enter your password again to confirm correct entry.
    User Interface HTTP Port
    You entered the port during the first part of the installation. You can now choose an alternate port here.
    This port is sometimes referred to as the jetty port.
    For all subsequent node installations you should provide the users.properties file.
    Working with the users.properties file

    This properties file stores the unique information for the default admin user account. It is essential that this information exactly matches up between nodes. For this reason, it is only entered once during a deployment and then subsequently copied to all other nodes in the form of the users.properties file.

    The default location of the file is:

    /opt/wandisco/git-multisite/replicator/properties/users.properties

    If something goes wrong and you don't have a valid users.properties file in your deployment, Git MultiSite can automatically create a new one if you follow the procedure to Create a new users.properties file.

    Setup 05

    Set up > users.properties file for all nodes after the first node

  17. The last screen in the setup process covers Server Settings:
    Setup 01

    Set up > Server Settings

    Node ID

    The default name for this node.

    Temporary limitation
    Node names cannot contain spaces or periods.
    Node IP/Host
    The node's IP or hostname. If the server is multi-homed, you can select the IP to which you want Git MultiSite to be associated.
    Enter FQDN in this field
    We strongly recommend that you use fully qualified domain names instead of IP addresses. This can avoid SSL certification problems.
    Replication Port
    Select the port to use for replicated Git repository data. Default=6444.
    Content Server Port
    Select the port to use to transfer replicated content (repository changes). Default=4321. This is different from the port used by WANdisco's DConE2 agreement engine.
    Content Node Count
    This setting gives you the ability to enforce a degree of resilience. The value represents the number of nodes within a membership that must receive the content before a proposal is submitted for agreement. If the value is greater than the total number of learners in the current membership, it is adjusted to the total number of learners in the current membership. The proposing node is not considered in the calculation.
    Minimum Content Nodes Required
    Ticking this checkbox will enforce the Content Node Count as a prerequisite for replication.
    REST API Port
    The port to be used for Git MultiSite's REST-based API. Default=8082.
    REST API UI Using SSL
    Check box for enabling the use of SSL for all API traffic.
    REST API SSL Port
    The port to be used for Git MultiSite's REST-based API when traffic is secured using SSL encryption. Default=8445.
    UI Port
    The port for HTTP access to the MultiSite administrative interface. Default=8080.
    UI SSL Port
    The port for HTTPS encrypted access to the MultiSite administrative interface. Default=8443.
    SSL Certificate Alias
    The name of your SSL Certificate file.
    SSL Key Password
    The password for your HTTPS service.
    SSL Key Store
    The name of the keystore file. The keystore contains the public keys of authorized users.
    SSL Key Store Password
    The password associated with the keystore.
    SSL Trust Store
    The location of your truststore file. The truststore contains CA certificates to trust. If your server's certificate is signed by a recognized Certification Authority (CA), the default truststore that ships with the JRE will already trust it because it already trusts trustworthy CAs. Therefore, you don't need to build your own, or to add anything to the one from the JRE.
    SSL Trust Store Password
    The password for your truststore.
    Truststores and key stores

    You might be familiar with the Public key system that allows two parties to use encryption to keep their communications with each other private (incomprehensible to an intercepting third-party).

    The keystore is used to store the public and private keys that are used in this system. In isolation, however, the system remains susceptible to the hijacking of the public key file, where an end user may receive a fake public key and be unaware that it will enable communication with an impostor.

    Enter Certificate Authorities (CAs). These trusted third parties issue digital certificates that verify that a given public key matches with the expected owner. These digital certificates are kept in the truststore. An SSL implementation that uses both keystore and truststore files offers a more secure SSL solution.

  18. Click FINISH when you have entered everything. The installer now completes the configuration.
  19. Click the START USING MULTISITE button that appears to log in for the first time.
  20. Log in. Enter the username and password chosen earlier in the process, then click FINISHED - LET'S GO!.
  21. The first time you view the dashboard, it contains mostly blank areas. Read the Reference Guide to learn what the buttons and options mean.
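
Step 16 requires the users.properties file to match exactly between nodes. The following is a minimal sketch, not part of the product, of one way to confirm that a copy is identical to the original. The function names props_checksum and same_props are hypothetical. The path is the default location given in this guide.

```shell
# Default location of the file, as given in this guide.
USERS_PROPERTIES=/opt/wandisco/git-multisite/replicator/properties/users.properties

# Print a checksum of the file so that values from different nodes
# can be compared. Uses the default path when no argument is given.
props_checksum() {
  cksum < "${1:-$USERS_PROPERTIES}" | cut -d' ' -f1
}

# Succeed only if two local copies are byte-for-byte identical.
same_props() {
  cmp -s "$1" "$2"
}
```

Run props_checksum on each node and compare the printed values, or use same_props when both copies are available on one machine.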

3.4 Non-interactive installation

You can also install GitMS with an unattended (scripted) install. Set the following environment variables:

GITMS_USER
The system user that runs GitMS.
GITMS_GROUP
The system group that GitMS runs in.
GITMS_UMASK
Set your required Umask settings. We validate your entry so that it must be a 3-digit number that begins with a zero, e.g. 077. Note: The first digit signifies the base of the number (octal) so 0777 is a 3-digit number. The product installs using 0022 or 022, but always shows 4-digits when installing.
GITMS_UI_PORT
The TCP port that the browser UI initially uses. You can change this during the browser-based setup. Default is 8080. The configurator loads on this port after the install.
Auditing environment variables
If you are installing or upgrading and will be using the ACP auditing functionality, read this note.

For a scripted start to the installation run:

	export TERM=xterm
	export GITMS_USER=gitms          # user to run GitMS (example value)
	export GITMS_GROUP=gitms         # group to run GitMS (example value)
	export GITMS_UMASK=022           # umask to apply; default 022
	export GITMS_UI_PORT=8080        # port to host the UI; default 8080
	export ENABLE_AUDITING=false     # true or false
  

If ENABLE_AUDITING=true you also need to set all variables described in this section. For example:

export ACP_AVRO_HOST=(ACP_Flume_Address)

The installation then runs without user interaction. When installation is complete, the browser-based UI starts. You then need to complete the node set up from step 10.
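
Because an unattended install gives no chance to correct a mistyped value, it may help to check the variables before starting. The following is a rough sketch only. The function names are hypothetical, and the umask pattern (a leading zero followed by two or three octal digits) is an assumption inferred from the examples in this guide, not the installer's actual validation rule.

```shell
# Accept a leading zero followed by two or three octal digits, e.g. 022 or 0022.
# ASSUMPTION: inferred from the examples in this guide, not taken from the installer.
valid_umask() {
  case "$1" in
    0[0-7][0-7]|0[0-7][0-7][0-7]) return 0 ;;
    *) return 1 ;;
  esac
}

# Accept a whole number in the TCP port range 1-65535.
valid_port() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# Report every problem found and return non-zero if any check fails.
# Unset umask and port fall back to the documented defaults.
preflight() {
  status=0
  [ -n "${GITMS_USER:-}" ]  || { echo "GITMS_USER is not set" >&2; status=1; }
  [ -n "${GITMS_GROUP:-}" ] || { echo "GITMS_GROUP is not set" >&2; status=1; }
  valid_umask "${GITMS_UMASK:-022}"   || { echo "GITMS_UMASK is not valid" >&2; status=1; }
  valid_port "${GITMS_UI_PORT:-8080}" || { echo "GITMS_UI_PORT is not valid" >&2; status=1; }
  return "$status"
}
```

Call preflight after exporting the variables and start the installer only if it returns zero.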

3.4.1 Installing with tarball installer

To run the tarball installer, use the same script as above with the following extra parameters:

	export WAND_HOOK_PATH=/usr/bin                   # path to Git binaries; use /usr/bin for rpm, change only for tarball binaries
	export MSP_PREFIX=/opt/wandisco/git-multisite    # path for the tarball to install under; default shown
	export MSP_INIT=1

3.5 Repeat installation at all sites

Repeat the installation process for every node required to share your Git repositories.

You may benefit from creating an image of your initial server, with the repositories in place, and using this as a starting point on your other sites. This helps ensure that your replicas are in exactly the same state.

Same location
All replicas must be in the same location, i.e. the same absolute path, and in exactly the same state before replication can start.
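
One way to check that replicas are in the same state is to compare a fingerprint of every ref in each copy. The sketch below is illustrative and not a GitMS feature. The function name repo_fingerprint is hypothetical, and it uses only standard Git commands (for-each-ref and hash-object).

```shell
# Print a single hash that summarizes every ref (branches, tags) and the
# commit each one points to. Two repositories with the same refs at the
# same commits produce the same fingerprint.
repo_fingerprint() {
  git -C "$1" for-each-ref --format='%(objectname) %(refname)' |
    git -C "$1" hash-object --stdin
}
```

Run repo_fingerprint with the repository's absolute path on each node and compare the printed values. A difference means that the refs do not match and the replicas are not yet ready for replication.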

4. Node induction

After installing Git MultiSite at all sites, you need to make the sites aware of each other through the node induction process. Carefully follow the steps in this section.

4.1 Membership induction

You must connect nodes in a specific sequence. Follow these steps to ensure that your sites can talk to each other:

  1. When Git MultiSite is installed on all your sites, select one node to be your Inductor. This node will accept requests for membership and share its existing membership information. You can select any node.
    ** Induction overview **
  2. Log in to this Inductor's admin console, http://<Inductor's IP>:8080/multisite-local/, and get the following information, mainly from the SETTINGS tab.
    ** Induction overview **
    All your remaining sites are now classed as inductees.
  3. Select one of your remaining inductee sites. Connect to its web admin console, http://<Inductee1>:8080/multisite-local/, and go to the Nodes tab.
  4. Click the Connect to Node button and enter the details that you collected from your inductor node.
    ** Induction overview **
    Node ID*
    The name of the inductor node. You can verify this from the NODE ID entry on the inductor node's SETTINGS tab, see step 2.
    Node Location ID*
    The reference code that defines the inductor node's location. You can verify this from the NODE ID entry on the inductor node's SETTINGS tab, see step 2.
    Node IP Address*
    The IP address of the inductor node server.
    Node Port No*
    The DConE Port number, 6444 by default, defined on the inductor node's SETTINGS tab.

    When you have entered these details, click the Send Connection Request button. The inductor node accepts the request and adds the inductee to its membership. Refresh your browser to see that this has happened.

  5. Go back to step 3 and select one of your remaining inductees. Repeat this process until all the sites that you want to be included in the current membership have been connected to the inductor.
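
If a connection request does not go through, a quick check that the inductor's DConE port is reachable from the inductee can rule out firewall problems. This is a sketch under stated assumptions: it relies on the GNU timeout command and on bash's /dev/tcp redirection being available, and the function name port_open is hypothetical. The default port 6444 comes from the steps above.

```shell
# Succeed if a TCP connection to host:port can be opened within 3 seconds.
# The port defaults to 6444, the default DConE port in this guide.
# ASSUMPTION: GNU timeout and a bash built with /dev/tcp support are installed.
port_open() {
  host=$1
  port=${2:-6444}
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

For example, run the following on the inductee, replacing the placeholder with your inductor's address: port_open <Inductor's IP> 6444 && echo reachable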

4.2 Create a replication group

GitMS lets you share specific repositories between selected sites. Do this by creating Replication Groups that contain a list of sites and the specific repositories that they will share. For example, this figure shows four sites running two replication groups. Replication Group 1 replicates Repo1 across all four sites, while Replication Group 2 replicates Repo2 across a subset of sites.

Four sites running two replication groups

Follow this procedure to create a Replication Group. You can create as many replication groups as you like. However, each repository can only be part of one active replication group at a time:

  1. When you have sites defined, click the REPLICATION GROUPS tab. Then click on the Create Replication Group button.
    ** Replication Group Creation 1 **

    Create Replication Group

    Local node automatically made the first member
    You cannot create a replication group remotely. The node on which you are creating the group must itself be a member. For this reason, when creating a replication group, the first node is added automatically.
  2. Enter a name for your Replication Group in the Replication Group Name field. Then enter an existing Node name in the Add Sites field. All existing sites that match your entry will appear. Click to select them. Instead of typing in a name you can click the drop-down button and choose from a list of existing sites that are not already members of the new group.

    You can select any number of available sites. The sites that you select will appear as clickable buttons in the Add Node field.

    Enter a name and add some nodes

  3. New sites are added as Active Voters, denoted by "AV". You can change the type of a node by clicking on its label. For an explanation of what each node type does, see the Reference Guide section, Guide to node types.

    Change node type

    When you have added all sites and configured their type, click Create Replication Group to see a group's details.
  4. Replication Groups that you create are listed on the REPLICATION GROUPS tab.

    Groups boxes

    Click View to view your options.
Important: Don't cancel replication group creation tasks
If you create a new replication group, then find that the task is stuck in pending because one of your nodes is down, do not use the Cancel Tasks option on the Dashboard's Pending Tasks table.
If, when all nodes are up and running, the replication group creation tasks are still not progressing, please contact the WANdisco support team for assistance.

Create a resilient replication group

For a replication group to be resilient to node failures, make sure that its node count is at least twice the number of acceptable failures, plus one. That is, to tolerate F failures, make sure there are at least 2F+1 nodes.

For example:
1 failure requires 2x1+1=3 nodes to continue operation
3 failures require 2x3+1=7 nodes to continue operation
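
The 2F+1 rule, and its inverse for an existing group, can be worked out with a little shell arithmetic. This is an illustrative sketch and the function names are hypothetical.

```shell
# Nodes needed to tolerate F failures: 2F + 1.
nodes_required() {
  echo $(( 2 * $1 + 1 ))
}

# Failures that a group of N nodes can tolerate: (N - 1) / 2, rounded down.
failures_tolerated() {
  echo $(( ($1 - 1) / 2 ))
}

nodes_required 1       # prints 3
nodes_required 3       # prints 7
failures_tolerated 4   # prints 1
```

Note that a fourth node does not raise the number of tolerated failures above that of three nodes. Resilience only improves at odd node counts.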

4.3 Add repositories

Before adding a repo, you must run a git fsck to ensure its integrity.

You can also run a git gc before your git fsck to check performance.
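
As a sketch of the check described above, the following helper runs git gc and then git fsck against a repository path. The function name check_repo is hypothetical. The --quiet and --full options are standard Git options, and you might prefer different ones.

```shell
# Run housekeeping, then an integrity check, on the repository at $1.
# Returns non-zero if either command fails.
check_repo() {
  repo=$1
  git -C "$repo" gc --quiet || return 1
  git -C "$repo" fsck --full
}
```

Add the repository to GitMS only if check_repo returns zero for its full path.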

When you have added at least one Replication Group you can add repositories to your node:

  1. Click the REPOSITORIES tab, then click Add.
    ** Add repository 1 **

    REPOSITORIES > Add

  2. Enter the Repository's name, the file system path (full path to the repository), and use the drop-down to select the replication group. You can set the repository to be read-only by ticking the Global Read-only checkbox. You can deselect this later. Click ADD REPO.
    ** Add repository 1 **

    REPOSITORIES > Enter details > ADD REPO

  3. Alert
    If a repository that you added gets stuck in the deploying state, the Dashboard Replicator Tasks window notifies you. Cancel the deployment and try to add the repository again. To cancel a deployment, go to the Replicator Tasks window and click Cancel Task.
  4. Click the REPOSITORIES tab to see a list of the repositories added.
    ** Add repository 1 **

    Repositories listed

    The repositories list shows:

    Name
    The name of the repository, which is the same as the folder name in the Git directory.
    Take care when naming repositories.

    Follow any relevant best practices when naming repositories. For example, there's a known issue with Git running on macOS where repositories that have the hash "#" in their name will fail operations, such as git clone. (NV-5280)

    Path
    The full path to the repository.
    Replication Group
    The replication group in which the repository may be replicated.
    Size
    The file size of the repository.

    Table columns describe master branch, not the whole repository

    The following columns of information describe the master branch.

    Youngest Rev
    The youngest, most recent, revision in the repository. Comparing the youngest revisions between replicas is a quick test that a repository is in the same state on all sites.
    Last Modified
    The timestamp for the last revision, which provides a quick indicator for the last time a Git user made a change.
    Global RO
    Checkbox that indicates whether the repository is globally Read-only, that is Read-only at all sites.
    Local RO
    Checkbox that indicates whether the repository is locally Read-only, that is Read-only to users at this node. The repository receives updates from the replicas on other sites, but never instigates changes itself.

Using GitMS as a mirror destination?

If you're using GitMS as a mirror of an existing repo, data should only be sent from the original source repo using git push --mirror. Otherwise, the push fails because MultiSite does not accept non-fast-forward pushes. This is because the mirror option is a forced command and the receiving repository is overwritten with each push.
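
A minimal sketch of the mirror push described above. The function name mirror_to_gitms is hypothetical, and the destination is whatever URL or path your users normally push to on the GitMS node. Only the standard git push --mirror command is used.

```shell
# Push every ref from the source repository ($1) to the mirror destination ($2).
mirror_to_gitms() {
  src=$1
  dest=$2
  git -C "$src" push --quiet --mirror "$dest"
}
```

Run this only from the original source repository, so that the GitMS replica always receives its data from a single origin.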

Git configuration files for MultiSite repositories

GitMS sets the following variables in your repository's configuration file. Make sure the settings aren't changed or removed:

  • core.replicated
  • receive.denyNonFastForwards
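
To confirm that these settings are still present, you can read them back with git config. The helper below is a sketch with a hypothetical name. It uses Git's own spelling of the second key, receive.denyNonFastForwards, and it only reports values. The values that GitMS sets are not stated in this guide, so none are assumed.

```shell
# Print the two settings for the repository at $1, or "(not set)"
# when a key is missing from the configuration.
show_gitms_config() {
  repo=$1
  for key in core.replicated receive.denyNonFastForwards; do
    value=$(git -C "$repo" config --get "$key") || value="(not set)"
    printf '%s=%s\n' "$key" "$value"
  done
}
```

A line ending in "(not set)" suggests that the setting was removed and should be investigated.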

4.4 Using Git submodules

If you use submodules, they are typically defined using the full URL of the repository, for example:

git submodule add git@192.168.1.30:/home/wandisco/repos/subrepo.git test2

This adds the following into your .gitmodules file:

[submodule "test2"]
  path = test2
  url = git@192.168.1.30:/home/wandisco/repos/subrepo.git

In this way, submodule activity will occur against a specific Git server.

If the repository used as a submodule is being replicated through GitMS, you lose the benefits of using the repository on a local node. To maintain the benefits of the replicated environment, specify the relationship to the submodule using a relative path, such as:

git submodule add ../RELATIVE-PATH-TO-REPO REPONAME

For example:

git submodule add ../subrepo.git test2

This adds the following entry to your .gitmodules file:

[submodule "test2"]
  path = test2
  url = ../subrepo.git

Note: If you're using external submodules, you can continue to specify them using full URLs. This is only applicable to local submodules you want replicated.
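
If a submodule was already added with a full URL, its entry in .gitmodules can be switched to a relative path instead of being removed and re-added. This is a sketch with a hypothetical function name. It only edits .gitmodules through the standard git config -f option.

```shell
# In the superproject at $1, set the URL of submodule $2 to the
# relative path $3 by editing .gitmodules.
use_relative_submodule_url() {
  repo=$1
  name=$2
  rel_url=$3
  git -C "$repo" config -f .gitmodules "submodule.$name.url" "$rel_url"
}
```

For example, use_relative_submodule_url /path/to/superproject test2 ../subrepo.git. Afterwards, commit the changed .gitmodules file and run git submodule sync in existing clones so that they pick up the new URL.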