Installation Guide
This guide describes how to install Git MultiSite (GitMS):
- Pre-installation requirements
- A standard installation
- Node configuration
1. Before you install
Before installing GitMS, make sure that you have sufficient hardware and that all required software is configured appropriately.
1.1 Skills requirements
This section describes the knowledge and technical requirements for deployment and operation of the WANdisco software. Make sure that you can meet the requirements before you begin the deployment.
Technical skill requirements | |
---|---|
System administration |
|
Apache administration, if applicable |
|
Networking |
|
Git |
|
Gerrit, if applicable |
|
If you would like help assessing your requirements, request a supported installation from WANdisco.
One administrator can manage all the systems running MultiSite. However, we recommend that you have someone at each site who is familiar with MultiSite Basics.
1.2 Deployment overview
We recommend that you follow a well-defined plan for your WANdisco GitMS deployment. This helps you keep control, understand the product, and find and fix any issues before production. We recommend that you include the following steps:
1. Pre-deployment planning: Identify the requirements, people, and skills needed for deployment and operation. Agree a schedule and milestones. Highlight any assumptions, constraints, dependencies, and risks to a successful deployment.
2. Deployment preparation: Prepare and identify server specifications, locations, node configuration, repository set-up, replication architecture, and the server and software configurations.
3. Testing phase: Actions for an initial installation and testing in a non-production environment, executing test cases, and verifying deployment readiness.
4. Production deployment: Actions to install, configure, test, and deploy the production environment.
5. Post-deployment operations and maintenance: Actions including environment monitoring, system maintenance, training, and in-life technical support.
1.3 System requirements
This section describes how to prepare your Git servers for replication. You need to ensure that you've got a suitable platform, with sufficient hardware and compatible versions of the required software that is configured appropriately. Use this information as a guide, not as a fixed set of requirements.
You can run GitMS nodes as virtual machines on the same physical hardware. Note that this will impact the ability to provide uptime if there is a hardware outage.
If you do want to use virtual machines, make sure your setup is configured to allow uninterrupted running if there is a hardware outage.
1.3.1 Hardware recommendations
Hardware sizing guidelines | |||||||
---|---|---|---|---|---|---|---|
Size | #Users | Repository size (GB) | CPU speed (GHz) | #CPU | #Cores | RAM (GB) | HDD (GB) |
Small | 100 | 25 | 2 | 1 | 2-4 | 8-16 * | 100 |
Medium | 500 | 100 | 2 | 2 | 4 | 16-32 | 250 |
Large | 1000 | 500 | 2.66 | 4 | 4 | 32-64 | 750 |
Very Large | 5000 | 1000 | 2.66 | 4 | 4-6 | 128 | 1500 |
* For small deployments it should be possible to run with 8GB of system memory. However, if you are going to run additional services such as Gerrit on the same server then 16GB is the minimum system memory requirement. For GitMS/Gerrit deployments with large numbers (i.e. more than hundreds) of users, you should increase the minimum memory requirement to 32GB or 48GB.
1.3.2 Storage tips
- Use separate physical disks for Git and GitMS.
- Use the fastest possible disks for storage. Disk IO is the critical path to improve repository responsiveness.
- We recommend that you use RAID-1 or RAID-2 solutions. We do not recommend RAID-0.
Knowledgebase
For more information about calculating storage capacity requirements, read the Knowledgebase article, Hardware Sizing Guide.
1.3.3 Processor tips
- GitMS can run on a single 2GHz CPU, but for production you should run fast multi-core CPUs and scale the number of physical processors based on your peak concurrent usage.
- Aim to have no more than 15 concurrent Git users per CPU and 7 concurrent users per CPU core.
Example 1: A server with 4 physical single core processors is expected to support (15x1x4) = 60 concurrent users.
Example 2: A server with 4 physical processors, each being a quad core is expected to support (7x4x4) = 112 concurrent users.
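The arithmetic in the two examples can be sketched as a small shell helper. This is only an illustration of the rule of thumb above; the function name `estimate_concurrent_users` is hypothetical and is not part of GitMS.

```shell
# Rough capacity estimate from the rule of thumb above:
# 15 concurrent users per single-core CPU, 7 per core on multi-core CPUs.
# $1 = number of physical CPUs, $2 = cores per CPU
estimate_concurrent_users() {
  if [ "$2" -le 1 ]; then
    echo $(( 15 * $1 ))
  else
    echo $(( 7 * $2 * $1 ))
  fi
}

estimate_concurrent_users 4 1   # Example 1: prints 60
estimate_concurrent_users 4 4   # Example 2: prints 112
```

Treat the result as a starting point for sizing, not a guarantee. Real capacity depends on repository size and disk IO as well.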
1.3.4 Setup requirements
This is a summary of requirements. You must also check the more detailed Installation checklist.
MultiSite servers require:
- The same operating system
- Java and Python installed
- A browser with network access to all servers
- A command-line compression utility
- A unique license key file provided by WANdisco. You will need one for each node and you may need to provide the server IP addresses.
Git installations require:
- WANdisco's modified distribution of Git, 2.0.4 or later
Make sure you don't overwrite the WANdisco Git binaries with system versions. The WANdisco versions are required for replication to work correctly.
- Matching file and directory-level permissions on repositories
You must run Git and GitMS on the same server.
A repository can belong to only one replication group at a time.
2. Installation checklist
Though you may have referred to the checklist before evaluating GitMS, we strongly recommend that you reread it before deployment to confirm that your system still meets all requirements.
System setup | |
---|---|
Operating systems | We've tested the following operating systems:
Go 64-bit
We don't support GitMS on 32-bit architecture because this would impose serious limits on scalability. You must deploy on a 64-bit OS. During install you are asked which user and group you want to run MultiSite as. On Ubuntu this change does not apply system-wide, so some files have the default group set. This is not a problem, but something to consider when deciding on your OS.
|
Git server | Required version
GitMS needs to use WANdisco's own Git distribution, version 2.0.4 or later, which includes modifications necessary to deploy Git with replicated repositories. Note that the GitMS installer does not update the Git version. You must do this before running the GitMS installer.
Write access for system user
The replicator user must have write permission for all repositories, because the replicator writes directly to the Git repository.
Manage repository file ownership if using Git+SSH:// or file://
Accessing Git repositories via Apache is simplified because all user access is handled via the same daemon user. There are potential permission problems with Git+SSH or file:// when multiple users access the same repository. Tips:
Git binaries
These are now available from WANdisco. They provide the latest builds, including modifications required for GitMS.
Same location
All replicated repositories must be in the same location, i.e. the same absolute path, and in exactly the same state before replication can start. |
Git client | Any Git client compatible with a Git 1.8 remote repository. |
Hooks | Hook scripts need to be replicated on all repository replicas |
System memory | Minimum recommended: 8GB RAM, 16GB swapping container |
Disk space |
Git: Match to projects and repositories. MultiSite Transaction Journal: Equivalent of seven days of changes. To estimate your disk requirements, you need to quantify some elements of your deployment:
|
File descriptor/User process limits | Ensure hard and soft limits are set to 64000 or higher. Check with the ulimit or limit command.
Running lots of repositories
User process limits:
When the replicator is not run as a root user, the max user processes needs to be set to a high value, otherwise your system cannot create the threads required to deploy all your repositories.
Maximum processes and open files are low by default on some systems. We recommend that process numbers, file sizes, and number of open files are set to unlimited.
Temporary changes for current shell:
-f The maximum size of files created by the shell, default option
Permanent changes:
RHEL6 and later: Make the changes in both
Ubuntu: Make the changes in
If you do not see these increased limits, you may need to edit more files. If you are logging in as the MultiSite user, add the following to
If you
If you run commands through |
Journaling file system | Replicator logs should be on a journaling file system, for example, ext3 on Linux or VXFS from Veritas.
Avoid data loss
See our Knowledgebase article, Data Loss and Linux, that looks at several implementation strategies that militate against potential data loss as a result of power outages. |
Java | Install JDK 7.
Use Oracle Java
Our development and testing is done using Oracle JDK 7. While you may be able to use other Java packages, we cannot support you unless you run with Oracle's JDK 7 or later versions.
|
Python | Install version 2.3 or later. |
Browser compatibility |
Setup and configuration requires access through a browser. These browsers are known to work:
GitMS is not compatible with Internet Explorer 6 or 7
We know that some users are still tied to earlier versions of Internet Explorer. However, we cannot provide backward compatibility for all time. |
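The file descriptor check from the checklist above can be sketched as a short shell snippet. It assumes the 64000 threshold quoted in the checklist; `limit_ok` and `MIN_LIMIT` are hypothetical names, not part of GitMS. Only the open-files limit (`ulimit -n`) is checked here. In bash, the same helper could be applied to the value of `ulimit -u` for user processes.

```shell
# Minimum value the checklist recommends for hard and soft limits.
MIN_LIMIT=64000

# limit_ok VALUE: succeeds if VALUE is "unlimited" or a number >= MIN_LIMIT.
# Illustrative helper, not part of GitMS.
limit_ok() {
  case "$1" in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$MIN_LIMIT" ]
}

# Report the soft (S) and hard (H) open-file limits of the current shell.
for kind in S H; do
  value=$(ulimit -"$kind"n)
  if limit_ok "$value"; then
    echo "open files ($kind): $value OK"
  else
    echo "open files ($kind): $value is below $MIN_LIMIT"
  fi
done
```

Run it as the account that will run the replicator, because limits are applied per user and per session.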
Network settings | |
---|---|
Reserved ports | During installation a block of ports is reserved for use by MultiSite. You cannot edit these ports after installation. Make sure you get them right at the start.
Required ports
dcone.port= An integer between 1 - 65535, default=6444
DConE port handles agreement traffic between sites.
content.server.port= An integer between 1 - 65535, default=4321
The content server port is used for the replicator's payload data: repository changes etc.
gitms.local.jetty.port= An integer between 1 - 65535, default=9999
The jetty port is used for the MultiSite management interface.
jetty.http.port= An integer between 1 - 65535, default=8082
The jetty port is used for the MultiSite management interface.
jetty.https.port= An integer between 1 - 65535, default=8445
The jetty port is used for the MultiSite management interface when SSL encryption is enabled.
Make each port different
In contrast with earlier versions of MultiSite, which used the same port for both the UI and replication traffic, Git MultiSite doesn't multiplex different traffic on a single port. You will need to assign a different port to each type of traffic. |
Firewall or AV software | If your network has a firewall, ensure that traffic is not blocked on the reserved ports noted above. Configure any AV software on your system so that it doesn't block or filter traffic on the reserved ports. |
Full connectivity | GitMS requires full network connectivity between all nodes. Ensure that each node's server can communicate with all other servers that will host nodes in your installation. |
VPN |
Set up an IPsec tunnel, and ensure WAN connectivity. |
VPN persistent connections | Ensure that your VPN doesn't reset persistent connections for GitMS. |
Bandwidth | Put your WAN through realistic load testing before going into production. You can then identify and fix potential problems before they impact productivity. |
DNS setup | Use IP addresses instead of DNS hostnames. This ensures that DNS problems won't hinder performance. If you are required to use hostnames, test your DNS servers' performance and availability before going into production. |
NTP | You should deploy a robust implementation of NTP, including monitoring, because NTP will not auto-correct if the time is too far offset from the current time. This is an important requirement because a number of problems can occur when nodes are not in sync. For example, artifacts created through the REST API, such as when deploying with Gerrit MultiSite, can be improperly created, resulting in potential time reporting errors. |
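Because the reserved ports cannot be edited after installation, it can help to check the planned values before you run the installer. The following is a minimal sketch that checks that each port is an integer in the range 1 to 65535 and that no port is used twice. The helper name `ports_valid` is hypothetical and is not part of GitMS; the values passed to it are the defaults listed above.

```shell
# ports_valid PORT...: succeeds only if every argument is an integer in
# 1-65535 and no value is repeated. Illustrative helper, not part of GitMS.
# The body runs in a subshell so its variables do not leak to the caller.
ports_valid() (
  seen=" "
  for port in "$@"; do
    case "$port" in
      ''|*[!0-9]*|??????*) echo "not a valid port: $port" >&2; exit 1 ;;
    esac
    if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
      echo "out of range: $port" >&2; exit 1
    fi
    case "$seen" in
      *" $port "*) echo "duplicate port: $port" >&2; exit 1 ;;
    esac
    seen="$seen$port "
  done
)

# Defaults from the checklist: DConE, content server, local jetty,
# jetty HTTP, jetty HTTPS.
ports_valid 6444 4321 9999 8082 8445 && echo "ports OK"
```

The check only validates the numbers themselves. It does not test whether a port is already in use or blocked by a firewall.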
MultiSite setup | |
---|---|
Replication configuration | Read our Replication Setup Guide for information on how to optimise your replication. |
Voters follow the sun | Git users get the best performance if GitMS gets agreement from the local node. For this reason you should schedule the voter node to correspond with the location in which developers are active (i.e. in office hours). |
Disk space for recovery journal |
Provision large amounts of disk space for multisite-plus/replicator/database, enough to cover at least the number of commits made within a two to four hour period during your times of peak Git usage. |
License Model | GitMS is supplied through a licensing model based on the numbers of both nodes and Git repository end-users. WANdisco generates a license.key file that will be matched to your agreed usage requirements.
Evaluation license
Production license
Special node types
Passive Nodes (Learner only): A passive node operates like a slave in a master-slave model of distribution. Changes to its repository replicas only occur through inbound proposals; it never generates any proposals itself.
Voter-only nodes (Acceptor only): A voter-only node does not need to know the content of proposals. It votes only on the basis of replication history: "Have I already voted yes to a Global Order Number equal to or larger than this one?".
These limited-function nodes are licensed differently from active nodes. Get more details from WANdisco's sales team. Briefly, the IP addresses are a fixed list. However, the node count and special node count may move between sets of nodes, as long as the number of each type of node is within the limit specified in the license.key. |
Gerrit setup: applicable if you are planning to integrate GitMS with Gerrit code review. | |
---|---|
Gerrit version | Version 2.9.1 or later: GitMS for Gerrit requires that you are running version 2.9.1 or later of Gerrit. You will need to upgrade to this or a later version before completing the installation of GitMS. |
Database | Percona XtraDB We have developed and tested GitMS for Gerrit using Percona XtraDB 5.6.22-25.8.
Configuration change: During installation of Gerrit's MultiSite components, you need to modify Gerrit's database settings to increase its maximum number of database connections. |
Replication requirements | You need to be aware of the following limits that apply to this version of GitMS for Gerrit:
|
Authentication |
OpenID not compatible: It's not possible to use Google's OpenID authentication. If you are planning to use HTTP then you need to ensure that you have an Apache web server running in front of Gerrit. |
Caching |
Is disabled: Gerrit stores a lot of information from both the repositories and its database in memory. Placing Gerrit in a distributed environment immediately causes problems, because Gerrit and repository changes outside of each instance will happen as a matter of routine, so cached data will never be trustworthy. For this reason we set the maximum number of cache items to 0, disabling cache storage for all entities.
Use a local LDAP authority |
System resources | Protect the server against resource exhaustion: The integration of Gerrit into a GitMS deployment will increase the demands on server resources. Take note of GitMS's requirements for setting high File descriptor / User process limits. While these requirements are not changed by the addition of Gerrit, it does make resource management even more important.
Gerrit garbage collection: The system administrator should configure Gerrit to run a scheduled garbage collection. This can help ensure that the server doesn't experience errors or performance degradation as a result of system resources running out.
Gerrit garbage collection |
Plugins | Gerrit supports a number of plugins for integrating additional applications. Currently we have successfully tested the plugins for Jenkins and JIRA.
General plugin information
|
Integration with third-party applications | Many Gerrit deployments are integrated with one or more third-party applications. While there are no hard and fast rules for how these will be affected by moving to a replicated environment, the following information may be useful:
Git hooks
GitMS offers both standard and replicated hooks. The administrator must understand how these differ and which should be used for a given task.
Gerrit event stream
At the moment the event stream only publishes events that occur directly on a node. Integrations that rely on the event stream (like the Jenkins plugin for Gerrit) must connect to every Gerrit node in order to function normally. |
3. Installation
This Installation Guide describes setting up GitMS for the first time. If you are upgrading from an earlier version of GitMS you should also follow this procedure. GitMS is a completely new class of product so it's not possible to follow a shortcut upgrade procedure.
- Stop GitMS on all nodes.
- Upgrade Git on all nodes.
- Start GitMS on all nodes.
- Follow the current GitMS upgrade instructions.
3.1 Installation overview
This is an overview of the process:
- Double-check the Installation checklist. Take time to make sure that you have everything set up and ready. This avoids problems during installation. In particular, check:
- Git authentication: Git is installed, and using authentication.
- JDK: You need to run an Oracle JDK. We recommend JDK 7, but 6 works, with warnings.
- Java memory settings: The Java process on which GitMS runs is assigned a minimum and maximum amount of system memory. By default it gets 128MB at startup and 4GB maximum.
- System resources: Ensure that your system is going to operate with a comfortable margin.
- Ensure that your repositories are copied into place on all nodes.
- Download and copy the MultiSite files into place.
- Run the setup, then complete the installation from a web browser.
3.2 Before you start
Install with ACP 1.5 auditing functionality
If you are installing Access Control Plus 1.5 with auditing functionality, make sure that you set the following variables:
ENABLE_AUDITING=true/false
: Install auditing
FLUME_INSTALL_DIR=/opt/wandisco/git-ms-flume/
: Flume install location for acp-flume-sender. Make sure that you do not set the Flume install var to a directory that is unaccessible, i.e. one that is not writable by anyone, including root.
ACP_AVRO_HOST=<ACP IP>
: Flume sender IP
ACP_AVRO_PORT=<ACP AVRO PORT>
: Flume sender port
FLUME_GITMS_LOG=/opt/wandisco/git-multisite/replicator/logs/gitms.log
: Path to GitMS log
FLUME_AVRO_SSL=true/false
: true/false to enable/disable SSL
FLUME_AVRO_KEYSTORE_LOC
: Only required if FLUME_AVRO_SSL=true, keystore location
FLUME_AVRO_KEYSTORE_PASS
: Only required if FLUME_AVRO_SSL=true, keystore password
FLUME_AVRO_TRUSTSTORE_LOC
: Only required if FLUME_AVRO_SSL=true, truststore location
FLUME_AVRO_TRUSTSTORE_PASS
: Only required if FLUME_AVRO_SSL=true, truststore password
For details see ACP installation instructions.
- Check through the Installation checklist
- Ensure that you have WANdisco's latest Git binaries pre-installed. GitMS edition requires FSFSWD libraries that are built into WANdisco's version of Git.
- Repositories need to be created using the file system switch (--fs-type fsfswd).
- Ensure that the system user used for installing GitMS has access to Java, otherwise the installation fails.
- At the end of an installation (when including auditing functionality), you'll currently see an error message:
Do you want to continue with the installation? (Y/n) y
Installing Apache Flume to /opt/wandisco/flume-git-multisite
WARNING: Cannot read /opt/wandisco/git-multisite/replicator/logs/gitms.log
Stopping flume: [OK]
Starting flume: [OK]
Starting ui: [OK]
Checking if the GUI is listening on port 8080: ............Done
Please visit http://<thisHost>:8080/ to finish the installation
Installation Complete
You can ignore this error. The missing log file is not created by the system at this stage. It is created once the Git MultiSite replicator is first started, after the installation is complete.
Set the LOG_FILE environment variable
If you need to capture a complete record of installer messages, warnings, and errors, then you need to set the LOG_FILE environment variable before running the installer. Run:
export LOG_FILE="opt/wandiscoscp/log/file.file"
The installer must be able to append to this file. Ideally, the file should not already exist, or it should exist and be empty. Its directory must also allow the account running the installer to create the file.
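The permission requirements for the log file can be checked before the installer runs. The sketch below is one way to do that, under the assumption that "usable" means either an existing writable file or a missing file in a writable directory. The helper `log_file_usable` and the path `/tmp/gitms-install.log` are hypothetical and used only for illustration.

```shell
# log_file_usable PATH: succeeds if the current account could append to PATH.
# Either the file exists and is writable, or it does not exist yet and its
# directory is writable. Illustrative helper, not part of the installer.
log_file_usable() {
  if [ -e "$1" ]; then
    [ -f "$1" ] && [ -w "$1" ]
  else
    [ -d "$(dirname "$1")" ] && [ -w "$(dirname "$1")" ]
  fi
}

# Hypothetical path, used only for illustration.
candidate="/tmp/gitms-install.log"
if log_file_usable "$candidate"; then
  export LOG_FILE="$candidate"
else
  echo "Cannot append to $candidate; choose another location" >&2
fi
```

Run the check as the same account that will run the installer, otherwise the result does not reflect the permissions that matter.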
3.3 Install GitMS
- Extract the setup file.
- Save the wandisco-git-multisite.sh installer file to your Installation site.
- Make the script executable, e.g. enter the command:
chmod a+x wandisco-git-multisite.sh
- Run the setup script:
[root@redhat6 ~]# chmod a+x git-multisite.sh
[root@redhat6 ~]# ./git-multisite.sh
Verifying archive integrity... All good.
Uncompressing WANdisco MultiSite .......
INFO: Using the following Memory settings:
INFO: UI: -Xms128m -Xmx1024m
INFO: Replicator: -Xms1024m -Xmx4096m
Do you want to use these settings for the installation? (Y/n)
- Enter Y and press Enter.
Which port should the MultiSite UI listen on? [8080]:
Running Gerrit?
If you are going to integrate GitMS with Gerrit then make sure that you select a port that will not conflict. Gerrit also defaults to port 8080.
- Confirm the port that you want to run the admin interface on:
We strongly advise against running Git MultiSite as the root user.
Which user should Git MultiSite run as?
- Confirm the user who will run GitMS:
This user will need to have read and write access to your git repos
Which group should Git MultiSite run as?
- Confirm the group of the user running GitMS:
Installing with the following settings:
MultiSite user: gitms
MultiSite group: gitms
MultiSite UI Port: 8080
MultiSite UI Minimum memory: 128
MultiSite UI Maximum memory: 1024
MultiSite Replicator Minimum memory: 1024
MultiSite Replicator Maximum memory: 4096
Do you want to continue with the installation? (Y/n)
- Confirm the configuration settings and enter Y to finish the install. In our example, our server runs as gitms with a group of gitms.
- Open a browser and go to the provided URL. If your server's DNS isn't running you can go to the next step at the following address:
http://<IP_Address>:<admin port>/multisite-local
e.g. http://10.0.100.252:8080/
Flush your browser cache
If you are reinstalling and using SSL, then you should clear your browser cache before you continue. Previous SSL details are stored in the cache and will cause SSL errors if they are not flushed.
- The web installer begins with the Welcome screen:
Set up > Start
Welcome to Git MultiSite.
You're about to run through the installation, which should only take a couple of minutes.
If you run into difficulties on the way, check our documentation or talk to our support team through the Customer Support Website.
Before you click Next, make sure you read the Installation Checklist.
- Click Next to begin the installation.
- The next screen contains the WANdisco Master Subscription Agreement and Terms & Conditions. To continue the installation click the I Agree button.
Set up > License agreement
- On the next screen, License Upload, you are prompted to browse for your product license key file. Click the Browse button and locate your file. You received this from the WANdisco sales team. Contact them if you have any problems locating or using your license file.
Set up > license.key file
- On the Administrator Setup screen enter the username plus an associated password that you will use to log in to Git MultiSite's UI. This information is only added during the installation of the first "inductor" node.
Set up > Admin settings, entered or uploaded in the users.properties file
- Username
- The administrator's username.
- Password
- The administrator's password.
- Confirm Password
- Enter your password again to confirm correct entry.
- User Interface HTTP Port
- You entered the port during the first part of the installation. You can now choose an alternate port here.
This port is sometimes referred to as the jetty port.
Working with the users.properties file
This properties file stores the unique information for the default admin user account. It is essential that this information exactly matches up between nodes. For this reason, it is only entered once during a deployment and then subsequently copied to all other nodes in the form of the users.properties file.
The default location of the file is:
/opt/wandisco/git-multisite/replicator/properties/users.properties
If something goes wrong and you don't have a valid users.properties file in your deployment, Git MultiSite can automatically create a new one if you follow the procedure to Create a new users.properties file.
Set up > users.properties file for all nodes after the first node
- The last screen in the setup process covers Server Settings:
Set up > Server Settings
- Node ID
- The default name for this node.
Temporary limitation
Node names cannot contain spaces or periods.
- Node IP/Host
- The node's IP or hostname. If the server is multi-homed, you can select the IP to which you want Git MultiSite to be associated.
Enter FQDN in this field
We strongly recommend that you use fully qualified domain names rather than IP addresses. This can avoid SSL certification problems.
- Replication Port
- Select the port to use for replicated Git repository data. Default=6444.
- Content Server Port
- Select the port to use to transfer replicated content (repository changes). Default=4321. This is different from the port used by WANdisco's DConE2 agreement engine.
- Content Node Count
- This setting gives you the ability to enforce a degree of resilience. The value represents the number of nodes within a membership that must receive the content before a proposal is submitted for agreement. If the value is greater than the total learners in the current membership, it is adjusted to total learners in the current membership. The proposing node is not considered in the calculation.
- Minimum Content Nodes Required
- Ticking this checkbox will enforce the Content Node Count as a prerequisite for replication.
- REST API Port
- The port to be used for Git MultiSite's REST-based API. Default=8082.
- REST API UI Using SSL
- Check box for enabling the use of SSL for all API traffic.
- REST API SSL Port
- The port to be used for Git MultiSite's REST-based API when traffic is secured using SSL encryption. Default=8445.
- UI Port
- The port for HTTP access to the MultiSite administrative interface. Default=8080.
- UI SSL Port
- The port for HTTPS encrypted access to the MultiSite administrative interface. Default=8443.
- SSL Certificate Alias
- The name of your SSL Certificate file.
- SSL Key Password
- The password for your HTTPS service.
- SSL Key Store
- The name of the keystore file. The keystore contains the public keys of authorized users.
- SSL Key Store Password
- The password associated with the keystore.
- SSL Trust Store
- The location of your truststore file. The truststore contains CA certificates to trust. If your server's certificate is signed by a recognized Certification Authority (CA), the default truststore that ships with the JRE will already trust it because it already trusts trustworthy CAs. Therefore, you don't need to build your own, or to add anything to the one from the JRE.
- SSL Trust Store Password
- The password for your truststore.
Truststores and key stores
You might be familiar with the Public key system that allows two parties to use encryption to keep their communications with each other private (incomprehensible to an intercepting third-party).
The keystore is used to store the public and private keys that are used in this system. In isolation, however, the system remains susceptible to the hijacking of the public key file, where an end user may receive a fake public key and be unaware that it will enable communication with an impostor.
Enter Certificate Authorities (CAs). These trusted third parties issue digital certificates that verify that a given public key matches with the expected owner. These digital certificates are kept in the truststore. An SSL implementation that uses both keystore and truststore files offers a more secure SSL solution.
- Click FINISH when you have entered everything. The installer now completes the configuration.
- Click the START USING MULTISITE button that appears to log in for the first time.
- Log in. Enter the username and password chosen earlier in the process, then click FINISHED - LET'S GO!.
- The first time you view the dashboard, it contains mostly blank areas. Read the Reference Guide to learn what the buttons and options mean.
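The Server Settings step above notes that node names cannot contain spaces or periods. If you script your node naming, a quick pre-check along the following lines can catch a bad name before you type it into the installer. The helper `node_id_valid` and the example names are hypothetical, and the web installer may apply further rules that are not covered here.

```shell
# node_id_valid NAME: succeeds if NAME is non-empty and contains no spaces
# or periods. Illustrative helper, not part of GitMS.
node_id_valid() {
  case "$1" in
    ''|*' '*|*.*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical node names, used only for illustration.
for name in "node-london" "node london" "node.london"; do
  if node_id_valid "$name"; then
    echo "'$name' is acceptable"
  else
    echo "'$name' contains a space or period"
  fi
done
```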
3.4 Non-interactive installation
You can now install GitMS non-interactively. Set the following environment variables:
- GITMS_USER
- The system user that runs GitMS.
- GITMS_GROUP
- The system group that GitMS runs in.
- GITMS_UI_PORT
- The TCP port that the browser UI initially uses. You can change this during the browser-based setup.
- GITMS_UMASK
- Set your required Umask settings. We validate your entry so that it must be a 3-digit number that begins with a zero, e.g. 077. Note: The first digit signifies the base of the number (octal) so 0777 is a 3-digit number. The product installs using 0022 or 022, but always shows 0022 when installing.
Optional variables:
- GITMS_UI_MEM_LOW
- The minimum amount of UI memory.
- GITMS_UI_MEM_HIGH
- The maximum amount of UI memory.
- GITMS_REP_MEM_LOW
- The minimum amount of Replicator memory.
- GITMS_REP_MEM_HIGH
- The maximum amount of Replicator memory.
If you are installing or upgrading to v1.5 and will be using the ACP 1.5 auditing functionality, read this note.
For a scripted start to the installation run:
GITMS_USER=wandisco
GITMS_GROUP=wandisco
GITMS_UI_PORT=8181
GITMS_UMASK=0777
export GITMS_USER GITMS_GROUP GITMS_UI_PORT GITMS_UMASK
./git-multisite.sh
The installation then runs without user interaction. When installation is complete, the browser-based UI starts. You then need to complete the node set up from step 10.
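Since a scripted installation gives you no chance to correct a rejected value interactively, it may be worth validating GITMS_UMASK before exporting it. The sketch below assumes one reading of the rule described above: a leading zero followed by two or three octal digits, so that 022, 077, 0022, and 0777 are all accepted. The helper name `umask_valid` is hypothetical, and the installer's own validation may differ.

```shell
# umask_valid VALUE: succeeds if VALUE starts with a zero and is followed by
# two or three octal digits (for example 022 or 0022). This reading of the
# installer's validation rule is an assumption.
umask_valid() {
  case "$1" in
    0[0-7][0-7]|0[0-7][0-7][0-7]) return 0 ;;
    *) return 1 ;;
  esac
}

GITMS_UMASK=0022
if umask_valid "$GITMS_UMASK"; then
  export GITMS_UMASK
else
  echo "Invalid GITMS_UMASK: $GITMS_UMASK" >&2
fi
```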
3.5 Repeat installation at all sites
Repeat the installation process for every node required to share your Git repositories.
You may benefit from creating an image of your initial server, with the repositories in place and using this as a starting point on your other sites. This helps ensure that your replicas are in exactly the same state.
All replicas must be in the same location, i.e. the same absolute path, and in exactly the same state before replication can start.
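One rough way to compare the state of replicas before starting replication is to hash the list of refs in each copy and compare the results across nodes. The following is a sketch only: `repo_fingerprint` is a hypothetical helper, not a GitMS tool, and it compares refs and the objects they point to, not hooks, configuration, or file permissions. It uses the standard `git for-each-ref` and `git hash-object` commands.

```shell
# repo_fingerprint PATH: prints a hash over every ref name and the object it
# points to. Replicas with identical refs print the same value.
# Illustrative helper, not part of GitMS.
repo_fingerprint() {
  # Fail early if PATH is not a Git repository.
  git -C "$1" rev-parse --git-dir >/dev/null 2>&1 || return 1
  git -C "$1" for-each-ref --format='%(objectname) %(refname)' \
    | git hash-object --stdin
}

# Usage: run on every node against the same absolute path and compare the
# printed values, for example:
#   repo_fingerprint /path/to/repository.git
```

Matching fingerprints are a necessary check, not a sufficient one. Creating the other nodes from an image of the first server, as suggested above, remains the more reliable approach.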
4. Node induction
After installing Git MultiSite at all sites, you need to make the sites aware of each other through the node induction process. Carefully follow the steps in this section.
4.1 Membership induction
You must connect nodes in a specific sequence. Follow these steps to ensure that your sites can talk to each other:
- When Git MultiSite is installed on all your sites, select one node to be your Inductor. This node will accept requests for membership and share its existing membership information. You can select any node.
- Log in to this Inductor's admin console, http://<Inductor's IP>:8080/multisite-local/, and get the following information, mainly from the SETTINGS tab.
All your remaining sites are now classed as inductees.
- Select one of your remaining inductee sites. Connect to its web admin console, http://<Inductee1>:8080/multisite-local/, and go to the Nodes tab.
- Click the Connect to Node button and enter the details that you collected from your inductor node.
- Node ID*
- The name of the inductor node. You can verify this from the NODE ID entry on the inductor node's SETTINGS tab, see step 2.
- Node Location ID*
- The reference code that defines the inductor node's location. You can verify this from the NODE ID entry on the inductor node's SETTINGS tab, see step 2.
- Node IP Address*
- The IP address of the inductor node server.
- Node Port No*
- The DConE Port number, 6444 by default, defined on the inductor node's SETTINGS tab.
When you have entered these details, click the Send Connection Request button. The inductor node accepts the request and adds the inductee to its membership. Refresh your browser to see that this has happened.
- Go back to step 3 and select one of your remaining inductees. Repeat this process until all the sites that you want to be included in the current membership have been connected to the inductor.
4.2 Create a replication group
GitMS lets you share specific repositories between selected sites. Do this by creating Replication Groups that contain a list of sites and the specific repositories that they will share. For example, this figure shows four sites running two replication groups. Replication Group 1 replicates Repo1 across all four sites, while Replication Group 2 replicates Repo2 across a subset of sites.
Four sites running two replication groups
Follow this procedure to create a Replication Group. You can create as many replication groups as you like. However, each repository can only be part of one active replication group at a time:
- When you have sites defined, click the REPLICATION GROUPS tab. Then click on the Create Replication Group button.
Create Replication Group
Local node automatically made the first member
You cannot create a replication group remotely. The node on which you are creating the group must itself be a member. For this reason, when creating a replication group, the first node is added automatically.
- Enter a name for your Replication Group in the Replication Group Name field. Then enter an existing Node name in the Add Sites field. All existing sites that match your entry will appear. Click to select them. Instead of typing in a name you can click the drop-down button and choose from a list of existing sites that are not already members of the new group.
You can select any number of available sites. The sites that you select will appear as clickable buttons in the Add Node field.
Enter a name and add some nodes
- New sites are added as Active Voters, denoted by "AV". You can change the type of a node by clicking on its label. For an explanation of what each node type does, see the Reference Guide section, Guide to node types.
Change node type
When you have added all sites and configured their type, click Create Replication Group to see a group's details.
- Replication Groups that you create are listed on the REPLICATION GROUPS tab.
Groups boxes
Click View to view your options.
If you create a new replication group, then find that the task is stuck in pending because one of your nodes is down, do not use the Cancel Tasks option on the Dashboard's Pending Tasks table.
If, when all nodes are up and running, the replication group creation tasks are still not progressing, please contact the WANdisco support team for assistance.
Create a resilient replication group
For a replication group to be resilient to node failures, make sure that the number of nodes is at least twice the number of acceptable failures, plus one. That is, to tolerate F failures, make sure that there are 2F+1 nodes.
For example:
1 failure requires 2x1+1=3 nodes to continue operation
3 failures require 2x3+1=7 nodes to continue operation
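The 2F+1 rule can be worked through with a couple of shell functions. This is an illustrative sketch only: min_nodes and max_failures are hypothetical helper names, and the arithmetic counts nodes without regard to node type.

```shell
#!/bin/sh
# Minimum number of nodes needed to tolerate F node failures: 2F + 1.
min_nodes() {
    echo $(( 2 * $1 + 1 ))
}

# Inverse: failures tolerated by a group of N nodes: (N - 1) / 2, rounded down.
max_failures() {
    echo $(( ($1 - 1) / 2 ))
}

min_nodes 1      # prints 3
min_nodes 3      # prints 7
max_failures 5   # prints 2
```

The inverse is useful when you already know how many nodes you have: a five-node group tolerates two failures, and adding a sixth node does not raise that number.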
4.3 Add repositories
Before adding a repo, you must run a git fsck to ensure its integrity.
You can also run a git gc before your git fsck to check performance.
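The pre-add checks can be combined into one step. The sketch below is an assumption about how you might script them, not part of GitMS: check_repo is a hypothetical helper name, and the path in the usage note is illustrative.

```shell
#!/bin/sh
# check_repo (hypothetical helper): run git gc, then git fsck, against
# the repository at the given path. Returns non-zero if either step fails.
check_repo() {
    repo=$1
    git -C "$repo" gc --quiet || return 1
    git -C "$repo" fsck --full
}
```

For example, check_repo /home/wandisco/repos/subrepo.git && echo "ready to add" prints the message only when both commands succeed.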
When you have added at least one Replication Group you can add repositories to your node:
- Click the REPOSITORIES tab, then click Add.
REPOSITORIES > Add
- Enter the Repository's name, the file system path (full path to the repository), and use the drop-down to select the replication group. You can set the repository to be read-only by ticking the Global Read-only checkbox. You can deselect this later. Click ADD REPO.
REPOSITORIES > Enter details > ADD REPO
- Click the REPOSITORIES tab to see a list of the repositories added.
Repositories listed
The repositories list shows:
- Name
- The name of the repository, which is the same as the folder name in the Git directory.
- Path
- The full path to the repository.
- Replication Group
- The replication group in which the repository may be replicated.
- Size
- The file size of the repository.
- Youngest Rev
- The youngest, most recent, revision in the repository. Comparing the youngest revisions between replicas is a quick test that a repository is in the same state on all sites.
- Last Modified
- The timestamp for the last revision, which provides a quick indicator for the last time a Git user made a change.
- Global RO
- Checkbox that indicates whether the repository is globally Read-only, that is Read-only at all sites.
- Local RO
- Checkbox that indicates whether the repository is locally Read-only, that is Read-only to users at this node. The repository receives updates from the replicas on other sites, but never instigates changes itself.
Table columns describe master branch, not the whole repository
The following columns of information describe the master branch.
Using GitMS as a mirror destination?
If you're using GitMS as a mirror of an existing repo, data should only be sent from the original source repo using git push --mirror. Otherwise, the push fails because MultiSite does not accept non-fast-forward pushes. This is because the mirror option is a forced command and the receiving repository is overwritten with each push.
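As a rough illustration of the mirror push, the following sketch wraps the command in a function. mirror_to_gitms is a hypothetical helper name, and both arguments are placeholders for your own source repository path and GitMS-managed destination URL.

```shell
#!/bin/sh
# mirror_to_gitms (hypothetical helper): push every ref from the original
# source repository ($1) to the mirror destination ($2).
# --mirror also removes refs on the destination that no longer exist
# in the source, so run it only from the original source repository.
mirror_to_gitms() {
    git -C "$1" push --mirror "$2"
}
```

A call takes the form mirror_to_gitms /path/to/source-repo DESTINATION-URL.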
Git configuration files for MultiSite repositories
GitMS sets the following variables in your repository's configuration file. Make sure the settings aren't changed or removed:
- core.replicated
- receive.denyNonFastForwards
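To confirm that these settings are still present, you can read them back with git config. This is a minimal sketch: show_gitms_settings is a hypothetical helper name, and the values that GitMS writes are not stated here, so the function only reports what it finds.

```shell
#!/bin/sh
# show_gitms_settings (hypothetical helper): print the two variables that
# GitMS sets, as read from the repository at the given path.
# A variable that is missing is reported as "(not set)".
show_gitms_settings() {
    for key in core.replicated receive.denyNonFastForwards; do
        value=$(git -C "$1" config --get "$key") || value="(not set)"
        printf '%s = %s\n' "$key" "$value"
    done
}
```

Any line that reports "(not set)" for a replicated repository suggests that its configuration file has been edited.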
4.4 Using Git submodules
If you use submodules, they are typically defined using the full URL of the repository, for example:
git submodule add git@192.168.1.30:/home/wandisco/repos/subrepo.git test2
This adds the following into your .gitmodules file:
[submodule "test2"]
    path = test2
    url = git@192.168.1.30:/home/wandisco/repos/subrepo.git
In this way, submodule activity will occur against a specific Git server.
If the repository used as a submodule is being replicated through GitMS, you lose the benefits of using the repository on a local node. To maintain the benefits of the replicated environment, specify the relationship to the submodule using a relative path, such as:
git submodule add ../RELATIVE-PATH-TO-REPO REPONAME
For example:
git submodule add ../subrepo.git test2
This adds the following entry to your .gitmodules file:
[submodule "test2"]
    path = test2
    url = ../subrepo.git
Note: If you're using external submodules, you can continue to specify them using full URLs. Relative paths are only needed for local submodules that you want replicated.
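To find submodules that are still defined with a full URL, you can scan a .gitmodules file with git config. The following is a sketch under the assumption that any URL not starting with ./ or ../ is absolute. list_absolute_submodules is a hypothetical helper name.

```shell
#!/bin/sh
# list_absolute_submodules (hypothetical helper): print the key and URL of
# each submodule in the given .gitmodules file whose URL is not relative.
list_absolute_submodules() {
    git config --file "$1" --get-regexp '^submodule\..*\.url$' |
        while read -r key url; do
            case $url in
                ./*|../*) ;;                          # relative URL: nothing to report
                *) printf '%s %s\n' "$key" "$url" ;;  # full URL: tied to one server
            esac
        done
}
```

Run it as list_absolute_submodules .gitmodules from the top of a working copy. Entries that it prints for external submodules can stay as they are.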