Version: 2.0

Upgrade Data Migrator

We recommend you regularly upgrade Data Migrator so you can take advantage of new functionality and other improvements. To upgrade, run through the prerequisites covered below and then run a newer version of the Data Migrator installer. The installer upgrades your Data Migrator instance to the new version.

If your existing deployment uses Lightweight Directory Access Protocol (LDAP/Active Directory) to manage user access, take note of the following known issue.

Before you upgrade

Read through the following section before you begin a product upgrade.

When you upgrade, you'll probably need to make some configuration changes.

Files in the /etc/wandisco directory contain custom configuration changes. The files generally used for configuration are:

Existing configuration

New versions can introduce additional configuration properties and improved default values. Compare your existing configuration and apply any new properties or applicable values. Check the release notes for changes to these files and make any changes before you restart services.

vars.env

When the RPM is removed or an upgraded RPM is applied, /etc/wandisco/livedata-migrator/vars.env is saved to /etc/wandisco/livedata-migrator/vars.env.rpmsave. Original values are preserved in /etc/wandisco/livedata-migrator/vars.env.

application.properties

If modified, /etc/wandisco/livedata-migrator/application.properties is preserved when an RPM upgrade is applied. The latest configuration is saved to the same folder with a .rpmnew extension for RPM-based installations or a .dpkg-dist extension for Debian-based installations. If you're on a Debian-based system, you're prompted to decide whether to keep the old application-prod.properties file or use the new one from the installer. To ensure the UI starts, choose to keep the existing file.
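To compare your existing configuration with the packaged one, you can diff each live file against its .rpmnew or .dpkg-dist sibling. The following is a minimal Bash sketch, not a product tool: the function name show_new_config_diffs is hypothetical, and it assumes the packaged files sit next to the live ones with exactly the suffixes named above.

```shell
# List packaged config files left next to the live ones and show what changed.
# Assumption: the suffixes are the ones named on this page (.rpmnew, .dpkg-dist).
show_new_config_diffs() {
  local dir="$1" new live
  for new in "$dir"/*.rpmnew "$dir"/*.dpkg-dist; do
    [ -e "$new" ] || continue        # the glob matched nothing
    live="${new%.rpmnew}"
    live="${live%.dpkg-dist}"
    echo "== $live vs $new"
    # diff exits with 1 when the files differ; that is expected here.
    diff -u "$live" "$new" || true
  done
}

# Example:
#   show_new_config_diffs /etc/wandisco/livedata-migrator
```

Lines prefixed with + in the output exist only in the packaged file and are candidates to copy into your live configuration.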

Hotfix patch

If you've deployed a hotfix patch to resolve a problem on your current version, contact Support to confirm if it's still required for your latest upgraded version.
Newer releases can include previously issued hotfixes. If it's included in your latest version, and not required, fully remove the hotfix patch from your deployment.


info

Currently, when you upgrade, if Hive Migrator starts before LiveData Migrator, metadata migrations show as failed. Hive Migrator tries to check the Data Migrator license and fails to connect. The lack of connection leads to failed migrations as Hive Migrator assumes the license is invalid.

Resume failed metadata migrations

  • To resume individual failed metadata migrations, go to the Metadata Migrations panel on the Overview page and filter by Failed.
  • To resume multiple failed metadata migrations, go to Metadata Migrations and under Bulk Actions, select Resume.

    See Bulk actions.

Upgrading from 1.15.1 or earlier

If you're currently running LiveData Migrator 1.15.1 or earlier, you must first upgrade to LiveData Migrator 1.16 before upgrading to the latest version. Use the following installation steps. Before you start, read the 1.16 Release Notes.
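The version rule above can be expressed as a small check. This is a Bash sketch under stated assumptions: the function names version_le and upgrade_path are hypothetical, you pass the installed version as an argument (the output format of the version command isn't described on this page), and GNU sort -V is available. Versions between 1.15.1 and 1.16 aren't covered by this page, so the sketch treats them as direct upgrades.

```shell
# Return 0 (true) when version $1 is less than or equal to version $2.
# Relies on GNU sort -V (version sort) from coreutils.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the upgrade path for the installed version given as $1.
upgrade_path() {
  if version_le "$1" "1.15.1"; then
    echo "upgrade to 1.16 first, then to the latest version"
  else
    echo "upgrade directly to the latest version"
  fi
}

# Example:
#   upgrade_path 1.14.2
```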

Upgrading from 1.16 or later

Read through the following upgrade notices before starting your upgrade to the latest version:

info

Upgrading to Data Migrator 1.21 if using a Databricks agent
Data Migrator 1.21 doesn't support Databricks JDBC driver version 2.6.22 or earlier. Upgrade to JDBC driver version 2.6.25 or higher to continue using Databricks agents with Data Migrator.

info

Upgrading to Data Migrator 1.20 or later if using remote agents
If your current deployment uses remote agents, you must complete additional steps before proceeding with the upgrade. See the following knowledge base article - known issue.

Your existing configuration files stay the same after upgrading. On an RPM installation, the configuration files from the new version are also added to the same folder with the extension .rpmnew, and Data Migrator ignores them by default. You can compare them with your existing files and copy changes across, or switch to the new files.

The upgrade automatically overwrites shell scripts (such as start.sh) with the newer versions.

info

Don't change the encrypted database password for the UI in application-prod.properties. If you change the key, WANdisco UI won't start. If you're on a Debian-based system, you're prompted to decide whether to keep the old application-prod.properties file or use the new one from the installer. To ensure the UI starts, choose to keep the existing file.

info

Upgrading to Data Migrator 1.21 - Critical steps

Data Migrator 1.20 introduced changes to the way the Hive Migrator user is configured. If you're upgrading from 1.19 or earlier, complete all steps in Hive Migrator and Hive Migrator remote server to maintain successful metadata migrations. See the related known issue for more information.

info

Upgrading to/through Data Migrator 1.19 - Critical steps for Hive Migrator

This issue applies when you upgrade from any version earlier than 1.19 to any later version.
For example: 1.18 to 1.20 or 1.18 to 1.21.

Large Hive Migrator databases may take up to 30 minutes to optimize. This process is automatic and occurs when you first start Hive Migrator after upgrading to or through Data Migrator 1.19. If the Hive Migrator service is interrupted during this optimization, the database may be irreversibly corrupted.

We strongly recommend that you:

  • Back up the Hive Migrator database before you run a reset (purge) of all the metadata migrations.
    The default location of the database is here:

    /opt/wandisco/hivemigrator/hivemigrator.db.mv.db
  • Reset all metadata migrations. You can do this through the Swagger-based REST API documentation for metadata migrations with the /migration/reset/all command. This command purges the Hive Migrator database and clears the statistics and checksums for all migrations.

    The API call for doing metadata migration resets:

    curl -X 'POST' \
    'http://myldmhost.exampleurl.com:6780/migration/reset/all' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "forceStop": true
    }'

    A successful reset will produce the following output, including a "Success." for each migration:

    [
      {
        "migrationName": "MetaMigration1",
        "status": "OK",
        "errorCode": 0,
        "message": "Success."
      },
      {
        "migrationName": "MetaMigration2",
        "status": "OK",
        "errorCode": 0,
        "message": "Success."
      }
    ]
  • Ensure that the Hive Migrator service is not interrupted when it is first started after the upgrade.
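The backup and the reset check from the list above can be scripted. The following Bash sketch is illustrative only: the function names backup_hvm_db and reset_all_ok are hypothetical, the backup directory is a placeholder, and it assumes Hive Migrator isn't writing to the database file while you copy it. The reply check relies only on the "migrationName" and "status" fields shown in the sample output.

```shell
# Copy the Hive Migrator database file to a timestamped backup and print
# the backup path.
# $1: database file (defaults to the location given on this page)
# $2: backup directory (defaults to the current directory)
backup_hvm_db() {
  local db="${1:-/opt/wandisco/hivemigrator/hivemigrator.db.mv.db}"
  local dest_dir="${2:-.}"
  local dest="$dest_dir/$(basename "$db").$(date +%Y%m%d-%H%M%S).bak"
  mkdir -p "$dest_dir" && cp -p "$db" "$dest" && echo "$dest"
}

# Read the JSON reply of /migration/reset/all on stdin and succeed only if
# there is at least one migration and every one reports "status": "OK".
reset_all_ok() {
  local reply total ok
  reply=$(cat)
  total=$(printf '%s\n' "$reply" | grep -o '"migrationName"' | wc -l)
  ok=$(printf '%s\n' "$reply" | grep -o '"status": *"OK"' | wc -l)
  [ "$total" -gt 0 ] && [ "$total" -eq "$ok" ]
}

# Example (the backup directory is a placeholder):
#   backup_hvm_db "" /var/backups/hivemigrator
#   curl -X POST 'http://myldmhost.exampleurl.com:6780/migration/reset/all' \
#     -H 'accept: application/json' -H 'Content-Type: application/json' \
#     -d '{"forceStop": true}' | reset_all_ok && echo "all migrations reset"
```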

Update Hive Migrator database

Data Migrator includes a script that performs a safe database schema update. This script runs automatically during installations or upgrades using RPM or Debian. No additional actions are required.

Manual database upgrade

danger

Only perform a manual database upgrade if instructed to do so by Support.

If the automatic database update is interrupted or fails for any reason, contact Support for assistance. If instructed to do so, you can manually perform the database upgrade using the following script.

note

There is a known issue running the update script on Data Migrator 1.9 or earlier if Hive Migrator uses a custom database path. See the following knowledge base article - known issue.

If the automatic database update fails during automatic upgrade, you may see errors like "Failed to update database" or "Failed to run liquibase". After a failed installation, you may see errors like "Database schema is out of sync" or "The write format 1 is smaller than the supported format 2" in /var/log/wandisco/hivemigrator/hivemigrator.log and Hive Migrator may not start up.
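To check quickly whether any of these messages are present, you can search the log for the quoted strings. This is a minimal Bash sketch; the function name find_db_upgrade_errors is hypothetical. It searches for all four messages in one file, although this page only states that the last two appear in hivemigrator.log.

```shell
# Print log lines, with line numbers, that contain any of the error
# messages quoted on this page. Exits non-zero when none are found.
# $1: log file (defaults to the Hive Migrator log path from this page)
find_db_upgrade_errors() {
  local log="${1:-/var/log/wandisco/hivemigrator/hivemigrator.log}"
  grep -n -F \
    -e 'Failed to update database' \
    -e 'Failed to run liquibase' \
    -e 'Database schema is out of sync' \
    -e 'The write format 1 is smaller than the supported format 2' \
    "$log"
}

# Example:
#   find_db_upgrade_errors || echo "no known database upgrade errors found"
```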

Hive Migrator database upgrade script:

/opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh

Running the upgrade script performs the following:

  • Creates a temporary directory /opt/wandisco/hivemigrator/hvm-db-upgrade-tmp. You can change its location.

  • Copies the H2 database defined in /etc/wandisco/hivemigrator/application.properties to the temporary directory.

    The default entry in application.properties is:
    # H2 database location
    hivemigrator.storagePath=/opt/wandisco/hivemigrator/hivemigrator.db
  • Applies the new schema to the database copy.

  • Overwrites the existing database with the copy if the schema update was successful.

  • Deletes the temporary directory.

Change the temporary database location

The script creates a temporary directory in the same folder as the existing database. To select a different temporary directory, use this command before running the script:

export CUSTOM_TMP_DIR="<Full-Path-To-Different-Directory>"
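Because the script copies the database into the temporary directory, it can help to confirm that the directory has enough free space first. The following Bash sketch shows one way to do that; the function name has_room_for_copy and the /data path are hypothetical, and whether the script creates the directory itself isn't stated on this page, so the example creates it up front.

```shell
# Succeed if the filesystem holding directory $2 has at least as much free
# space as file $1 currently occupies (both measured in KiB).
has_room_for_copy() {
  local db="$1" dir="$2" need_kb free_kb
  need_kb=$(du -k "$db" | awk '{print $1}')
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 {print $4}')
  [ "$free_kb" -ge "$need_kb" ]
}

# Example (the /data path is a placeholder):
#   db=/opt/wandisco/hivemigrator/hivemigrator.db.mv.db
#   export CUSTOM_TMP_DIR=/data/hvm-db-upgrade-tmp
#   mkdir -p "$CUSTOM_TMP_DIR"
#   has_room_for_copy "$db" "$CUSTOM_TMP_DIR" &&
#     /opt/wandisco/hivemigrator/bin/hivemigrator-db-upgrade.sh
```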

Obtain a new installer and upgrade Data Migrator

To upgrade to the latest version of Data Migrator, download and run a new Data Migrator installer in the same way you do to install for the first time.

Upgrading to a newer version won't affect your filesystems or migrations. Any migrations that are in progress simply continue transferring data as normal.

note

You can check the component versions of your current installation by running the command livedata-migrator --version on your Data Migrator host machine.

info

The hivemigrator-azure-hdi.noarch package is no longer included in versions after Data Migrator 1.18 and isn't automatically removed during upgrade. If you have upgraded from 1.18 or lower, remove the package manually using your package manager.

System and custom users for upgrades

To run the installer using the default user, run the following command:

./livedata-migrator.sh

Alternative /tmp directory

The Data Migrator installer extracts its contents to a temporary directory and decompresses them. By default, the temporary directory is a sub-directory of /tmp.

In some situations, extracting and decompressing in the default temporary directory fails. For example, if there is not enough disk space remaining, or if /tmp is mounted as noexec.

To avoid these issues, extract the contents to a different temporary directory by adding the --target option when you run the installer:

Example
./livedata-1.21.0-4-full_rpm_installer.sh --target /opt/wandisco/alternate_tmp_dir

Do not use /opt/wandisco/tmp as the value for --target or the installation will fail.

You can delete your temporary directory and its contents after installation.
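Before you choose a --target directory, you can check it against the two constraints described above. This Bash sketch is an illustration only: the function names has_noexec and check_target are hypothetical, and it expects you to supply the mount options yourself, for example from findmnt (part of util-linux).

```shell
# Succeed if the comma-separated mount options in $1 include noexec.
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Decide whether directory $1 is acceptable as a --target value.
# $2: mount options of the filesystem that holds $1.
check_target() {
  local dir="${1%/}" opts="$2"
  if [ "$dir" = "/opt/wandisco/tmp" ]; then
    echo "rejected: /opt/wandisco/tmp makes the installation fail"
    return 1
  fi
  if has_noexec "$opts"; then
    echo "rejected: filesystem is mounted noexec"
    return 1
  fi
  echo "ok: $dir"
}

# Example:
#   target=/opt/wandisco/alternate_tmp_dir
#   check_target "$target" "$(findmnt -n -o OPTIONS --target "$target")"
```

The sketch doesn't check free disk space, which is the other failure cause mentioned above.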

The default system user for the Data Migrator and the WANdisco UI services is hdfs, and the default system user for the Hive Migrator service is hive.

If you want to upgrade the product using a custom user and custom user group, run the following commands:

Thin installer
./livedata-migrator.sh --user <custom user> --group <custom group>
Fat installer
./livedata-migrator.sh -- --user <custom user> --group <custom group>

This sets the custom user and custom user group for all services and their respective directories.

For more information about configuring custom users, go to Configure system users.

If you don’t enter a custom user and group, then the pre-existing user and group are used from the following files:

  • /opt/wandisco/hivemigrator/vars.sh
  • /opt/wandisco/livedata-migrator/vars.env
  • /opt/wandisco/ui/vars.env

If any of these files don’t exist, the default user for that component is used instead.
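To see in advance which rule applies to each component, you can check whether the three files exist. The following Bash sketch only tests for the files and doesn't read their contents, because the variable names inside them aren't listed on this page. The function name report_user_source and its optional path prefix are hypothetical; the defaults (hdfs and hive) are the ones stated above.

```shell
# For each component, report whether its existing vars file is present
# (its user and group are reused) or the default user applies.
# $1: optional path prefix, only useful for trying this out on a copy.
report_user_source() {
  local root="${1:-}" entry file default
  for entry in \
    "/opt/wandisco/hivemigrator/vars.sh:hive" \
    "/opt/wandisco/livedata-migrator/vars.env:hdfs" \
    "/opt/wandisco/ui/vars.env:hdfs"; do
    file="${entry%%:*}"       # path before the colon
    default="${entry##*:}"    # default user after the colon
    if [ -f "$root$file" ]; then
      echo "$file: existing user and group are reused"
    else
      echo "$file: missing, default user $default is used"
    fi
  done
}

# Example:
#   report_user_source
```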

Install components using RPM/DEB

If you're installing our product components individually using RPM/DEB, you can enter a custom user or group by adding a properties file with the custom user and group.

Example

/opt/wandisco/tmp/ldm.properties:

USERNAME="custom"
GROUPNAME="custom"

/opt/wandisco/tmp/ui.properties:

USERNAME="custom"
GROUPNAME="custom"

/opt/wandisco/tmp/hvm.properties:

HIVE_MIGRATOR_SERVER_USER="custom"
HIVE_MIGRATOR_SERVER_GROUP="custom"

When you install using RPM/DEB, the properties files containing the custom user and group names are used to set the user and group of each service and its respective directories.
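If you need the same custom user and group for every component, the three example files can be generated in one step. This is a minimal Bash sketch, assuming the file names and keys shown in the example above; the function name write_user_properties is hypothetical.

```shell
# Write ldm.properties, ui.properties, and hvm.properties with the keys
# shown in the example on this page.
# $1: user, $2: group, $3: directory (defaults to /opt/wandisco/tmp)
write_user_properties() {
  local user="$1" group="$2" dir="${3:-/opt/wandisco/tmp}" name
  mkdir -p "$dir" || return 1
  for name in ldm ui; do
    printf 'USERNAME="%s"\nGROUPNAME="%s"\n' "$user" "$group" \
      > "$dir/$name.properties"
  done
  printf 'HIVE_MIGRATOR_SERVER_USER="%s"\nHIVE_MIGRATOR_SERVER_GROUP="%s"\n' \
    "$user" "$group" > "$dir/hvm.properties"
}

# Example:
#   write_user_properties custom custom
```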

If you upgrade a single component without using a properties file, then the RPM/DEB checks for the pre-existing user and group in /opt/wandisco/hivemigrator/vars.sh, /opt/wandisco/livedata-migrator/vars.env, and /opt/wandisco/ui/vars.env. If any of these files don't exist, the installer uses the default user for that component.

note

This applies to the hivemigrator-remote-server installer.

If you don't pass a custom user or group to the installer when you upgrade, the existing vars.env/vars.sh for each component is retained, and its existing property values are inserted into the new vars.env/vars.sh provided by the component packaging.

We don't currently retain previous custom properties when you upgrade with a custom user or group.

Next steps

Continue migrating data as before. Learn how to get started.