Frequently asked questions

Find answers to the most common questions asked about LiveData Migrator.

What are the supported operating systems for LiveData Migrator?

LiveData Migrator supports the following Linux-based operating systems:

  • Ubuntu 16 and 18
  • CentOS 6 and 7
  • Red Hat Enterprise Linux 6 and 7

Where should I install LiveData Migrator for production use?

LiveData Migrator must be installed on an edge node in your Hadoop cluster. The edge node should have Java 1.8 and the Hadoop clients (for example, the HDFS, Hive, and Kerberos clients) installed, but no co-located or competing services. We recommend dedicating the node's resources to running LiveData Migrator.

Does LiveData Migrator support Kerberos authentication?

Yes. If your Hadoop cluster has Kerberos enabled, ensure that the edge node has a valid keytab containing a suitable principal for the HDFS superuser.

Example of HDFS principal
hdfs@MYREALM.COM

If you want to migrate Hive metadata from your Hadoop cluster, the edge node must also have a keytab containing a suitable principal for the Hive service.

Example of Hive service principal
hive/myDataMigratorHost@MYREALM.COM
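A Kerberos principal has the form `primary[/instance]@REALM`. As an illustration only (this is not LiveData Migrator code), the following sketch parses a principal string into its components, which can help when checking that a keytab entry matches the service and realm you expect:

```python
# Illustrative only: split a Kerberos principal of the form
# primary[/instance]@REALM into its components.
def parse_principal(principal: str):
    if "@" not in principal:
        raise ValueError(f"missing realm in principal: {principal!r}")
    name, realm = principal.rsplit("@", 1)
    primary, _, instance = name.partition("/")
    return primary, instance or None, realm

# The HDFS superuser principal has no instance component ...
print(parse_principal("hdfs@MYREALM.COM"))
# ... while the Hive service principal carries the host as its instance.
print(parse_principal("hive/myDataMigratorHost@MYREALM.COM"))
```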

How can I test LiveData Migrator?

If you want to test LiveData Migrator before installing it in a production environment, use our HDFS Sandbox solution as the source filesystem. Your ADLS Gen2 storage account and container will be the target filesystem.

The Sandbox option can be selected when you create the LiveData Migrator resource through the Azure Portal.

How do I control what is or isn't migrated?

When you migrate data, you select a path on your source filesystem (HDFS) to migrate data from. LiveData Migrator for Azure will only migrate the files and subdirectories contained in this path to your target filesystem (ADLS Gen2 container).

Example source filesystem path
/my/migration/path
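To make the scoping rule concrete: a file is migrated only if it sits at or under the selected source path. A minimal sketch of that containment check (assuming POSIX-style HDFS paths; this is an illustration, not LiveData Migrator code):

```python
from pathlib import PurePosixPath

# Illustrative only: is candidate at or under the selected migration path?
def in_migration_scope(migration_path: str, candidate: str) -> bool:
    root = PurePosixPath(migration_path)
    path = PurePosixPath(candidate)
    return root == path or root in path.parents

print(in_migration_scope("/my/migration/path", "/my/migration/path/data/file.csv"))  # True
print(in_migration_scope("/my/migration/path", "/my/other/file.csv"))                # False
```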

You can also exclude certain files and directories within this path from being migrated by creating exclusion templates. Exclusion templates prevent files and directories from being migrated based on their size, last modified date, or name.
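Conceptually, an exclusion template is a predicate evaluated against each file before transfer. The sketch below is not the product's API, just an illustration of filtering on the three criteria the templates support (size, last modified date, and a name pattern); all names and thresholds are hypothetical:

```python
import re
from datetime import datetime

# Illustrative only: exclude a file if it matches any of the criteria
# that exclusion templates support (size, last modified date, name).
def is_excluded(name, size_bytes, modified,
                max_size=None, modified_before=None, name_pattern=None):
    if max_size is not None and size_bytes > max_size:
        return True
    if modified_before is not None and modified < modified_before:
        return True
    if name_pattern is not None and re.search(name_pattern, name):
        return True
    return False

# Hypothetical template: exclude temporary files and anything over 1 GiB.
print(is_excluded("part-0001.tmp", 1024, datetime(2021, 6, 1),
                  max_size=2**30, name_pattern=r"\.tmp$"))  # True
print(is_excluded("part-0001.csv", 1024, datetime(2021, 6, 1),
                  max_size=2**30, name_pattern=r"\.tmp$"))  # False
```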

Costs

What costs will LiveData Migrator incur on my Azure subscription during the trial?

The first 5 TB of data migration is free. We'll bill you for anything over this allowance.

How can costs be minimized during and after the trial period? Is there an option to turn down compute when data is not being replicated from on-premises into Azure Data Lake Storage?

Costs are calculated based on the number of transactions, so you won't incur charges when there is little or no migration activity.

Networking

What network requirements do I need for LiveData Migrator for Azure?

You need to set up your network before you install LiveData Migrator for Azure. See the Network Requirements guide to learn how to set up your virtual network, see the port requirements, and more.