Version: 3.2 (latest)

Upgrade Data Migrator

We recommend you regularly upgrade Data Migrator so you can take advantage of new functionality and other improvements. To upgrade, run through the prerequisites covered below and then run a newer version of the Data Migrator installer. The installer upgrades your Data Migrator instance to the new version.

Before you upgrade

Read through the following section before you begin a product upgrade.

When you upgrade, you'll probably need to make some configuration changes.

Files in the /etc/wandisco directory contain custom configuration changes. The files generally used for configuration are:

/etc/wandisco/livedata-migrator/application.properties
/etc/wandisco/livedata-migrator/vars.env
/etc/wandisco/ui/application-prod.properties
/etc/wandisco/ui/vars.env
/etc/wandisco/hivemigrator/application.properties
/etc/wandisco/hivemigrator/vars.sh
/etc/wandisco/hivemigrator/user-vars.sh
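
Because an upgrade can replace or supersede these files, it can help to keep a copy of the directory before you start. This is an optional precaution and not part of the documented procedure. A minimal sketch, assuming tar is available; the backup_config function name and the default archive location are hypothetical:

```shell
# Hypothetical helper: archive the configuration directory before an upgrade.
# The default archive path is an illustrative choice, not a product convention.
backup_config() {
    config_dir="${1:-/etc/wandisco}"
    archive="${2:-/var/tmp/wandisco-config-$(date +%Y%m%d-%H%M%S).tar.gz}"
    if [ ! -d "$config_dir" ]; then
        echo "No configuration directory at $config_dir" >&2
        return 1
    fi
    # -C keeps the paths inside the archive relative to the parent directory
    tar -czf "$archive" -C "$(dirname "$config_dir")" "$(basename "$config_dir")" &&
        echo "Saved $config_dir to $archive"
}
```

Calling backup_config with no arguments archives /etc/wandisco. Pass a directory and an archive path to override the defaults.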

Existing configuration

New versions can introduce additional configuration properties and improved default values. Compare your existing configuration with the configuration files supplied with the new version, and apply any new properties or applicable values. Check the release notes for changes to these files and make any changes before you restart services.

RPM

For RPM-based installations, your modified configuration is preserved when an RPM upgrade is applied. The new configuration is saved to the same folder with a .rpmnew extension. Compare it with your existing configuration and apply any new properties or applicable values.
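
One way to review the changes is to diff each active file against the .rpmnew copy saved beside it. The following is a sketch under that assumption, using standard find and diff; the show_rpmnew_diffs function name is hypothetical:

```shell
# Sketch: show differences between each active configuration file and the
# .rpmnew copy that an RPM upgrade saves next to it.
show_rpmnew_diffs() {
    config_dir="${1:-/etc/wandisco}"
    find "$config_dir" -type f -name '*.rpmnew' | while IFS= read -r new_file; do
        current_file="${new_file%.rpmnew}"
        echo "== $current_file"
        # diff exits with 1 when the files differ, so don't treat that as a failure
        diff -u "$current_file" "$new_file" || true
    done
}
```

Lines starting with + exist only in the .rpmnew file and are candidates to add to your active configuration.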

Debian based

If you’re on a Debian-based system, your current configuration is saved with the .dpkg-old extension and is no longer used. A new version of the configuration file, containing any new defaults and properties, is created and used. Compare the new configuration with the saved file and add your existing custom configuration to the new file before restarting services.

In most cases, it is recommended to keep your current configuration, and introduce any new properties as required. For /etc/wandisco/ui/application-prod.properties, it is essential to keep the existing configuration to ensure the UI starts. See Debian automatic handling of configuration files for more information.
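
To see which of your settings need carrying over, you can list the property keys that appear in the saved .dpkg-old file but not in the new file. This is a rough sketch that assumes simple key=value lines with # comments; the list_missing_keys function name is hypothetical:

```shell
# Sketch: print keys set in the saved .dpkg-old file that the newly
# installed file does not set. Assumes plain key=value lines.
list_missing_keys() {
    new_file="$1"
    old_file="${2:-$1.dpkg-old}"
    awk -F= '
        /^[ \t]*#/ { next }                            # skip comments
        /^[ \t]*$/ { next }                            # skip blank lines
        !/=/       { next }                            # skip lines without a value
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) } # trim whitespace around the key
        FILENAME == ARGV[1] { in_new[key] = 1; next }  # first file: remember its keys
        !(key in in_new) { print key }                 # second file: keys missing from the new file
    ' "$new_file" "$old_file"
}
```

For example, list_missing_keys /etc/wandisco/hivemigrator/application.properties compares that file with application.properties.dpkg-old in the same directory. Values that differ for the same key are not reported, so a diff is still worth reading.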

caution

If you use Hive Migrator, you must compact the H2 database before you upgrade. More information and the steps to perform the compaction can be found here.

Hotfix patch

Newer releases can include previously issued hotfixes. If your hotfix is included in the version you're upgrading to and is no longer required, fully remove the hotfix patch from your deployment.

If you've deployed a hotfix on your current version, see Hotfix patch removal important information to confirm if it's still required for your latest upgraded version.

info

Upgrading to Data Migrator 3.0 and later

See the following Known issue when upgrading remote agents with JDBC credentials: JDBC password overwritten on remote agent upgrade.

See the following Known issue article showing important changes in interpretation of the underscore character in different versions of Data Migrator metadata rule patterns.

caution

Upgrading to Data Migrator 2.5 and later

Location mapping properties

If you're upgrading to 2.5.4 and use Hive metastore tables with a path Serde property (created either by Spark or by custom Serdes) that indicates the data location, and your migrations need to transform this location to the location of the data on your target platform, review the location mapping properties information and contact Support so that these properties can be adjusted accordingly.

Databricks agents

If you're upgrading from any Data Migrator version prior to 2.5 and have Databricks agents, you must stop and delete all Databricks migrations and remove any Databricks agents before upgrading to Data Migrator 2.5 or later. This is required because of the significant improvements to this agent type in 2.5.

info

Upgrading with remote agents
If your current deployment uses remote agents, you must complete additional steps before proceeding with the upgrade. See the following knowledge base article - known issue.

Update Hive Migrator database

Data Migrator includes a script that performs a safe database schema update. This script runs automatically during installations or upgrades using RPM or Debian. No additional actions are required.

Manual database upgrade

danger

Only perform a manual database upgrade if instructed to do so by Support.

If the automatic database update is interrupted or fails for any reason, contact Support for assistance. If instructed to do so, you can manually perform the database upgrade using the following script.

Hive Migrator database upgrade script:

/opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh

Running the upgrade script performs the following:

1. Creates a temporary directory /opt/wandisco/hivemigrator/hvm-db-upgrade-tmp. You can change its location.
2. Copies the H2 database defined in /etc/wandisco/hivemigrator/application.properties to the temporary directory.
The default entry in application.properties is:
```
# H2 database location
hivemigrator.storagePath=/opt/wandisco/hivemigrator/hivemigrator.db
```
3. If an old H2 driver is present:
- Detects agent databases placed in /opt/wandisco/hivemigrator/agent/.
- Copies the agent database and runs H2 version transition for each agent database copy.
- Overwrites the existing agent database with the copy if the version transition was successful.
- Applies any missing schema updates up to version 1.14 to the main database.
- Runs H2 version transition and deletes the old H2 driver.
4. Applies the new schema to the database copy.
5. Overwrites the existing database with the copy if the schema update was successful.
6. Deletes the temporary directory.

Change the temporary database location

The script creates a temporary directory in the same folder as the existing database. To select a different temporary directory, use this command before running the script:

export CUSTOM_TMP_DIR="<Full-Path-To-Different-Directory>"
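
As an illustration, the variable can also be set for a single run of the script instead of exporting it for the whole shell session. This is a sketch only, and you should run it only if Support has instructed you to upgrade the database manually. The run_db_upgrade_with_tmp function name and the directory checks are hypothetical additions; the script itself may not require the directory to exist beforehand.

```shell
# Sketch only: run the upgrade script with a custom temporary directory.
# The checks below are illustrative and not part of the product.
run_db_upgrade_with_tmp() {
    tmp_dir="$1"
    script="${2:-/opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh}"
    case "$tmp_dir" in
        /*) ;;  # the variable expects a full path
        *) echo "Use a full path: $tmp_dir" >&2; return 1 ;;
    esac
    if [ ! -d "$tmp_dir" ] || [ ! -w "$tmp_dir" ]; then
        echo "Directory missing or not writable: $tmp_dir" >&2
        return 1
    fi
    # Set CUSTOM_TMP_DIR for this one invocation only
    CUSTOM_TMP_DIR="$tmp_dir" "$script"
}
```

Setting the variable on the command line keeps it from affecting later commands in the same shell.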

Obtain a new installer and upgrade Data Migrator

To upgrade to the latest version of Data Migrator, download and run a new Data Migrator installer in the same way as for a first-time installation.

Upgrading to a newer version won't affect your filesystems or migrations. Any migrations that are in progress simply continue transferring data as normal.

note

You can check the component versions of your current installation by running the command livedata-migrator --version on your Data Migrator host machine.

System and custom users for upgrades

If you want to run the installer using a default user, run the following command:

./livedata-migrator.sh

Alternative /tmp directory

The Data Migrator installer extracts its contents to a temporary directory and decompresses them. By default, the temporary directory is a sub-directory of /tmp.

In some situations, extracting and decompressing in the default temporary directory fails. For example, if there is not enough disk space remaining, or if /tmp is mounted as noexec.

To avoid these issues, extract the contents to a different temporary directory by adding the --target option when you run the installer:

Example

./livedata-1.21.0-4-full_rpm_installer.sh --target /opt/wandisco/alternate_tmp_dir

Do not use /opt/wandisco/tmp as the value for --target or the installation will fail.

You can delete your temporary directory and its contents after installation.
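
Before running the installer, you can check whether the default temporary directory is likely to cause either problem. A minimal sketch, assuming findmnt from util-linux and df are available; the has_noexec and check_tmp_dir function names are hypothetical:

```shell
# Sketch: decide whether the installer is likely to need --target.
has_noexec() {
    # Succeeds if the comma-separated mount options include noexec
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

check_tmp_dir() {
    dir="${1:-/tmp}"
    # Mount options of the filesystem that contains the directory
    options=$(findmnt -n -o OPTIONS --target "$dir")
    if has_noexec "$options"; then
        echo "$dir is mounted noexec; run the installer with --target"
    else
        echo "$dir allows execution"
    fi
    # Show free space so you can judge whether extraction will fit
    df -h "$dir"
}
```

Run check_tmp_dir with no arguments to inspect /tmp, or pass the directory you plan to use with --target.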

The default system user for the Data Migrator and the UI services is hdfs, and the default system user for the Hive Migrator service is hive.

If you want to upgrade the product using a custom user and custom user group, run the following commands:

Thin installer

./livedata-migrator.sh --user <custom user> --group <custom group>

Fat installer

./livedata-migrator.sh -- --user <custom user> --group <custom group>

This sets the custom user and custom user group for all services and their respective directories.

For more information about configuring custom users, go to Configure system users.

If you don’t enter a custom user and group, then the pre-existing user and group are used from the following files:

/etc/wandisco/hivemigrator/vars.sh
/etc/wandisco/livedata-migrator/vars.env
/etc/wandisco/ui/vars.env
/etc/wandisco/hivemigrator/user-vars.sh

If any of these files don’t exist, the default user for that component is used instead.
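
To see what the installer is likely to pick up, you can check which of these files exist and which user or group lines they contain. This is a sketch only: the show_service_users function name is hypothetical, and the grep pattern is a guess at the relevant lines, not a documented file format.

```shell
# Sketch: report which user/group files exist and show matching lines.
show_service_users() {
    for vars_file in "$@"; do
        if [ -f "$vars_file" ]; then
            echo "== $vars_file"
            # Case-insensitive match on anything that mentions a user or group
            grep -iE 'user|group' "$vars_file" || echo "(no user or group lines found)"
        else
            echo "== $vars_file missing: the component default user applies"
        fi
    done
}
```

For example: show_service_users /etc/wandisco/hivemigrator/vars.sh /etc/wandisco/livedata-migrator/vars.env /etc/wandisco/ui/vars.env /etc/wandisco/hivemigrator/user-vars.sh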

Upgrade a Hive Migrator remote agent

Use the following steps to upgrade a Hive Migrator remote agent, and refer to this known issue:

1. Run the hive agent show command and copy the installationCommand value.
2. Upload the new hivemigrator-remote-server-installer.sh file to the remote host.

note
You can find the hivemigrator-remote-server-installer.sh file under /opt/wandisco/hivemigrator.

3. Make the installer executable:

chmod +x hivemigrator-remote-server-installer.sh

4. Run the installation command copied in step 1:

Example

./hivemigrator-remote-server-installer.sh -- --silent --config 25ma-example-string-AbCdEfGhIjKADogCJpemxlbj==

5. Restart the hivemigrator-remote-server service:
```
systemctl restart hivemigrator-remote-server
```
6. Check the remote agent is healthy using the hive agent check command.

Install components using RPM/DEB

If you're installing our product components individually using RPM/DEB, you can enter a custom user or group by adding a properties file with the custom user and group.

Example

/opt/wandisco/tmp/ldm.properties:

USERNAME="custom"
GROUPNAME="custom"

/opt/wandisco/tmp/ui.properties:

USERNAME="custom"
GROUPNAME="custom"

/opt/wandisco/tmp/hvm.properties:

HIVE_MIGRATOR_SERVER_USER="custom"
HIVE_MIGRATOR_SERVER_GROUP="custom"

When you install using RPM/DEB, the properties files containing the custom user and group names are used to set the user and group of each service and its respective directories.
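
The three properties files shown above can be generated in one step. A minimal sketch, using the file names and keys from the example; the write_user_properties function name and its arguments are hypothetical:

```shell
# Sketch: write the three properties files with one custom user and group.
# The third argument overrides the target directory (default /opt/wandisco/tmp).
write_user_properties() {
    user="$1"
    group="$2"
    dir="${3:-/opt/wandisco/tmp}"
    mkdir -p "$dir" || return 1
    # Data Migrator and the UI use the same keys
    for name in ldm ui; do
        printf 'USERNAME="%s"\nGROUPNAME="%s"\n' "$user" "$group" > "$dir/$name.properties"
    done
    # Hive Migrator uses its own keys
    printf 'HIVE_MIGRATOR_SERVER_USER="%s"\nHIVE_MIGRATOR_SERVER_GROUP="%s"\n' \
        "$user" "$group" > "$dir/hvm.properties"
}
```

For example, write_user_properties custom custom reproduces the files in the example above.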

If you upgrade a single component without using a properties file, then the RPM/DEB checks for the pre-existing user and group in /opt/wandisco/hivemigrator/vars.sh, /opt/wandisco/livedata-migrator/vars.env, and /opt/wandisco/ui/vars.env. If any of these files don't exist, the installer uses the default user for that component.

note

This applies to the hivemigrator-remote-server installer.

If you don't enter a custom user or group to the installer when you upgrade, the existing vars.env/vars.sh for each component of the product is retained, and existing property values are inserted into the new vars.env/vars.sh provided by the component packaging.

We don't currently retain previous custom properties when you upgrade with a custom user or group.

Next steps

Continue migrating data as before. Learn how to get started.
