Create a Databricks metadata target

LiveData Migrator for Azure supports metadata migration to Azure Databricks.

Before you start#

To use a Databricks metadata target, you need to supply a SparkJDBC42.jar file to the cluster that LiveData Migrator for Azure is deployed on:

  1. Download version 2.6.22 of the Databricks JDBC driver to your LiveData Migrator host system.
  2. Unzip the package to gain access to the SparkJDBC42.jar file.
  3. Move the SparkJDBC42.jar file to the following directory:
    /opt/wandisco/hivemigrator/agent/databricks
  4. Change the owner of the .jar file to the system user and group that runs the Hive Migrator service. By default, these are "hive" and "hadoop" respectively:
    Example for hive:hadoop
    chown hive:hadoop /opt/wandisco/hivemigrator/agent/databricks/SparkJDBC42.jar

LiveData Migrator will detect the .jar file automatically in this location. You're now ready to create a Databricks metadata target.
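
For reference, steps 2-4 above can be run from a shell as follows. This is a minimal sketch: the zip file name is an assumption based on how Databricks packages the driver, so adjust it to match your actual download.

    # Unzip the downloaded driver package (file name assumed for version 2.6.22)
    unzip SimbaSparkJDBC42-2.6.22.zip
    # Move the driver to the directory LiveData Migrator scans for it
    mv SparkJDBC42.jar /opt/wandisco/hivemigrator/agent/databricks/
    # Make the Hive Migrator service user and group the owners (hive:hadoop by default)
    chown hive:hadoop /opt/wandisco/hivemigrator/agent/databricks/SparkJDBC42.jar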

important

Version 2.6.25 of the Databricks JDBC driver is incompatible with LiveData Migrator for Azure. Use version 2.6.22 instead.

Create a Databricks metadata target#

  1. In the Azure Portal, navigate to the LiveData Migrator resource page.

  2. From the LiveData Migrator menu on the left, select Metadata Targets.

  3. Select Create.

  4. Enter a Name for the metadata target as you want it to appear in your resource list.

  5. Under the Basics tab, select Databricks Target in the Type dropdown list.

  6. Complete the Databricks details:

    • JDBC Server Hostname: The domain name or IP address to use when connecting to the JDBC server. For example, hostname.
    • JDBC Port: The port to use when accessing the JDBC server. For example, 1433.
    • JDBC Http Path: The path to the Compute resource on Databricks. For example, sql/protocolv1/o/1010101010101010/1010-101010-eXaMpLe1.
    • Access Token: The access token to use when authenticating with the JDBC server. For example, s8Fjs823JdkeXaMpLeKeYWoSd82WjD23kSd8.

    To find your JDBC server details, see the Databricks documentation. To generate an access token, see the Microsoft documentation.
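
    For orientation, these four values combine into a single JDBC connection URL. The layout below is a sketch for version 2.6.22 of the driver, which uses the jdbc:spark subprotocol; the angle-bracket placeholders stand for the values you entered above:

      jdbc:spark://<jdbc-server-hostname>:<jdbc-port>/default;transportMode=http;ssl=1;httpPath=<jdbc-http-path>;AuthMech=3;UID=token;PWD=<access-token>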

  7. Select the Filesystem Details tab and complete the details:

    • Convert to delta format: Enable this to convert metadata sent to your Databricks cluster into the Delta Lake format.
    • FS Mount Point: The location in your Databricks cluster where you've mounted your cloud filesystem (such as an ADLS container). For example, /mnt/mybucketname.
    • Data Target: Set a previously created data target for this metadata target.
    • DefaultFS Override: Provide an override for the default filesystem URI in the format dbfs:<mount_point> instead of a filesystem name. If Convert to delta format is not enabled, this must be set to dbfs:<your_fs_mount_point>, even if the mount point is blank. If Convert to delta format is enabled, set this to the path where your Delta Lake tables are stored. For example, dbfs:<path/to/delta/tables>.
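
    Before saving, you may want to confirm that the mount point exists in your workspace. A quick check, assuming you have the Databricks CLI installed and configured against your workspace:

      # List the mounted filesystem; an error here suggests the mount point is wrong
      databricks fs ls dbfs:/mnt/mybucketname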
  8. Select Review and create.

  9. Select Create.

Next steps#

You can migrate metadata to your Databricks target.