Make sure your environment has the following prerequisites to successfully install and use LiveData Migrator for Azure.
The minimum speed for the bandwidth limit is 1024 KB/s.
LiveData Migrator for Azure supports the following operating systems:
- Ubuntu 16 and 18
- CentOS 7
- Red Hat Enterprise Linux 6 and 7
Filesystems supported as sources:
- Hortonworks Data Platform (HDP) 2.6.3, 2.6.5, 3.1.0, 3.1.5
- Cloudera Distribution Hadoop (CDH) 5.13, 5.14, 5.15, 5.16, 6.2, 6.3
- Cloudera Data Platform (CDP) 7.1.4, 7.1.6, 7.1.7
- Arenadata Hadoop (ADH) 2.14
HDFS on any Hadoop version later than 2.6 should work. However, if you need additional support, create a support ticket.
Filesystems supported as targets:
- ADLS Gen2
LiveData Migrator for Azure does not support ADLS Gen1 or Azure Blob Storage as target filesystems.
Supported metastores for metadata migrations:
- Azure HDI 4.0 Hive External Metastore
- Azure SQL Database
- Databricks cluster
- Snowflake warehouse
Required Azure CLI versions:
- 2.26.0 or higher
See the Microsoft documentation for instructions on how to install the Azure CLI.
We recommend that you always use the latest available version of the Azure CLI.
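The version requirement above can be checked from a script. The following is a minimal sketch, assuming Python 3.7 or later on the machine where the Azure CLI is installed. It relies on `az version`, which prints the installed versions as JSON. The function names are illustrative and are not part of LiveData Migrator for Azure.

```python
import json
import shutil
import subprocess

# Minimum Azure CLI version stated on this page.
MIN_AZ_VERSION = (2, 26, 0)


def parse_version(text):
    """Turn a dotted version such as '2.26.0' into a tuple of ints.

    Short versions are padded with zeros so that '2.26' compares equal
    to '2.26.0'.
    """
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(version_text, minimum=MIN_AZ_VERSION):
    """Return True if version_text is at least the minimum version."""
    return parse_version(version_text) >= minimum


def installed_az_version():
    """Return the installed azure-cli version string, or None.

    None means `az` is not on PATH or `az version` did not succeed.
    """
    az = shutil.which("az")
    if az is None:
        return None
    result = subprocess.run(
        [az, "version", "--output", "json"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return json.loads(result.stdout).get("azure-cli")
```

Comparing tuples of integers rather than raw strings matters here: as strings, "2.9.1" sorts after "2.26.0", although it is the older release.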
Azure HDInsight version requirements:
- HDI 4.0
Databricks JDBC driver requirements:
- Databricks JDBC driver 2.6.25 or higher
HDI Enterprise Security Package (ESP) is not supported.
Provide your own Hadoop cluster or use the HDFS Sandbox to test LiveData Migrator for Azure. You can also use a Hortonworks Data Platform (HDP) sandbox.
See Getting Started with the Sandbox to learn how to use the HDFS Sandbox with LiveData Migrator for Azure.
To use LiveData Migrator for Azure with your own Hadoop cluster, you need the following:
- Root access to the Hadoop cluster.
- A network that meets all of the network requirements.
  - For example, ports 5671 and 443 must be open for outbound traffic.
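One way to confirm that outbound traffic on these ports is allowed is to attempt a TCP connection from the edge node. The sketch below assumes Python 3 is available there. The host to test against is not specified on this page, so `host` is a placeholder that you must replace with an endpoint taken from the network requirements. The function names are illustrative only.

```python
import socket

# Outbound ports named in the prerequisites above.
REQUIRED_OUTBOUND_PORTS = (5671, 443)


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_outbound(host, ports=REQUIRED_OUTBOUND_PORTS, timeout=5.0):
    """Return a dict mapping each port to True (reachable) or False."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

A successful TCP connection only shows that a firewall does not block the port. It does not validate TLS, authentication, or any proxy configuration.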
The edge node on your on-premises cluster needs the following:
- Hadoop client libraries installed.
- Hadoop client available within the systemd environment.
- Java 1.8 or higher.
- If Kerberos is enabled on your Hadoop cluster, a valid keytab containing a suitable principal for the HDFS superuser must be available on the edge node.
- If you want to migrate Hive metadata from your Hadoop cluster, the edge node must also have a keytab containing a suitable principal for the Hive service.
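The Java requirement in the list above can be verified by parsing the output of `java -version`. This is a minimal sketch that reads the requirement literally as "major version 8 or later". Whether a particular newer Java release is supported is not stated on this page. The function names are illustrative.

```python
import re


def java_major_version(version_output):
    """Extract the major Java version from `java -version` output.

    Handles the legacy scheme ('1.8.0_292' means Java 8) and the
    current scheme ('11.0.2' means Java 11). Returns None if no
    version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    # In the legacy scheme the major version is the second component.
    return second if first == 1 else first


def java_is_supported(version_output, minimum_major=8):
    """Return True if the reported Java major version is at least minimum_major."""
    major = java_major_version(version_output)
    return major is not None and major >= minimum_major
```

Note that `java -version` writes to standard error rather than standard output, so capture `stderr` when you run it from a script, for example with `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.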
In Azure, you need the following. Make sure your selected region is supported and consistent across all your resources.
- An Azure subscription.
- A resource group to contain the Azure zone resources.
- Owner or Contributor level access to the resource group.
- An ADLS Gen2 storage account and container with hierarchical namespace enabled.
To migrate metadata, you need the following:
- An Azure SQL Database in the same Azure resource group as LiveData Migrator.
The following regions are supported:
- Australia East
- Canada Central*
- Canada East*
- East Asia
- Japan East
- East US 2
- East US
- Germany West Central*
- Korea Central*
- North Europe
- Southeast Asia
- South Central US
- UK South
- West Europe
- West Central US
- Japan West
- West US
- West US 2
- West US 3
Support for regions marked with an asterisk (*) is offered as a feature preview and will be improved in the future.
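The requirement that your region is both supported and consistent across all resources can be expressed as a small check. The sketch below is only an illustration: the region table is copied from the list above, the resource names in the usage note are hypothetical, and you would need to supply the resource-to-region mapping from your own inventory.

```python
# Supported regions from the list above.
# The value is True when support is offered as a feature preview.
SUPPORTED_REGIONS = {
    "Australia East": False,
    "Canada Central": True,
    "Canada East": True,
    "East Asia": False,
    # Azure's own display names for the two Japan regions.
    "Japan East": False,
    "Japan West": False,
    "East US 2": False,
    "East US": False,
    "Germany West Central": True,
    "Korea Central": True,
    "North Europe": False,
    "Southeast Asia": False,
    "South Central US": False,
    "UK South": False,
    "West Europe": False,
    "West Central US": False,
    "West US": False,
    "West US 2": False,
    "West US 3": False,
}


def is_preview(region):
    """Return True if the region is supported only as a feature preview."""
    return SUPPORTED_REGIONS.get(region, False)


def check_regions(resource_regions):
    """Check a mapping of resource name to region display name.

    Returns a list of problem descriptions. An empty list means every
    resource is in the same, supported region.
    """
    problems = []
    distinct = set(resource_regions.values())
    if len(distinct) > 1:
        problems.append(
            "resources span multiple regions: " + ", ".join(sorted(distinct))
        )
    for name, region in sorted(resource_regions.items()):
        if region not in SUPPORTED_REGIONS:
            problems.append(
                f"{name}: region '{region}' is not in the supported list"
            )
    return problems
```

For example, `check_regions({"storage": "UK South", "sql": "West Europe"})` returns a single problem that reports the two different regions, while a mapping in which every resource is in "UK South" returns an empty list.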
If you can't see or select some of the above regions in the Azure Portal, you're most likely using an old resource provider. Re-register the resource provider to fix this.
As an additional prerequisite to use LiveData Migrator for Azure, register the LiveData resource provider.