Configure Snowflake as a target
Configure Snowflake as a target metastore using either the UI or the CLI.
We support the following table formats and types for migration:
- Apache ORC
- Apache Parquet
- Partitioned and non-partitioned tables
- Non-transactional tables only
Mark a table as non-transactional with the following Hive table property:
tblproperties('transactional'='false')
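For example, a minimal sketch of a Hive table that qualifies for migration (the table name and columns are illustrative):
CREATE TABLE example_sales (id INT, amount DOUBLE)
STORED AS PARQUET
TBLPROPERTIES ('transactional'='false');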
We support public and private Snowflake schemas.
Prerequisites
Ensure you have the following before you start:
- A Snowflake account identifier, warehouse, and stage. See the Snowflake documentation.
- A cloud storage service already configured as your target filesystem. See Configure target filesystems.
- If you choose the private key method of authentication, you need a private key file for Snowflake key pair authentication. The file must be stored on your Data Migrator cluster host, and the Hive user requires access permissions to the private key file (see the sketch after this list). See Configure system users.
- Appropriate user and role privileges to ensure Data Migrator accesses and modifies only those Snowflake target schemas required for migrations. See below for examples.
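If you use key pair authentication, the following is a minimal sketch of restricting the key file to the Hive system user, assuming the key is stored at /home/hive/snowflake_key.p8 and the service user is named hive (both are illustrative):
# Make the Hive system user the owner and sole reader of the key file
chown hive:hive /home/hive/snowflake_key.p8
chmod 600 /home/hive/snowflake_key.p8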
Additional information
- See the known issue for Snowflake.
Examples of how to set up the Hive user with limited permissions on Snowflake
This example shows how you can grant permissions to access and modify specific databases and schemas in Snowflake.
USE ROLE ACCOUNTADMIN;
//Skip if warehouse already exists
CREATE WAREHOUSE "HVM_WAREHOUSE";
CREATE DATABASE "HVM_DB";
CREATE SCHEMA "HVM_DB"."HVM_SCHEMA";
CREATE ROLE "HVM_INTEGRATION";
GRANT USAGE ON WAREHOUSE "HVM_WAREHOUSE" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON DATABASE "HVM_DB" TO ROLE "HVM_INTEGRATION";
GRANT USAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
//For internal stage
GRANT READ ON STAGE "HVM_DB"."HVM_SCHEMA"."HVM_STAGE" TO ROLE "HVM_INTEGRATION";
//For external stage
GRANT USAGE ON STAGE "HVM_DB"."HVM_SCHEMA"."HVM_STAGE" TO ROLE "HVM_INTEGRATION";
GRANT CREATE STAGE ON SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
GRANT CREATE INTEGRATION ON ACCOUNT TO ROLE "HVM_INTEGRATION";
//There is no FUTURE grants support for databases, so migration of databases requires the ACCOUNTADMIN role.
//If only a limited set of databases will be migrated with HVM, an admin can create those databases manually, or an admin can
//grant the ACCOUNTADMIN role to HVM_USER for a short period to migrate the databases only (use a specific HVM migration rule).
//Note: Changing the properties of a database, including comments, requires the OWNERSHIP privilege for the database.
//Invoke for all schemas to be migrated
GRANT SELECT,INSERT,TRUNCATE ON FUTURE TABLES IN SCHEMA <DB.SCHEMA> TO ROLE "HVM_INTEGRATION";
GRANT SELECT ON FUTURE VIEWS IN SCHEMA <DB.SCHEMA> TO ROLE "HVM_INTEGRATION";
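//For example, for the HVM_DB.HVM_SCHEMA schema created above:
GRANT SELECT,INSERT,TRUNCATE ON FUTURE TABLES IN SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";
GRANT SELECT ON FUTURE VIEWS IN SCHEMA "HVM_DB"."HVM_SCHEMA" TO ROLE "HVM_INTEGRATION";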
//Create new user with private key
CREATE USER "HVM_USER" RSA_PUBLIC_KEY="********" DEFAULT_ROLE = "HVM_INTEGRATION";
//Grant configured HVM_INTEGRATION role to the new user
GRANT ROLE "HVM_INTEGRATION" TO USER "HVM_USER";
This example shows how you can create a stored procedure and grant the LiveData Migrator user permission to run it. With this method, the user calls the stored procedure rather than running the CREATE DATABASE command directly.
USE ROLE ACCOUNTADMIN;
CREATE OR REPLACE PROCEDURE demo_db.public.pTestDatabaseCreation(DBNAME string)
RETURNS text
LANGUAGE javascript
COMMENT = 'Create a database without ACCOUNTADMIN privileges'
EXECUTE AS OWNER
as
    $$
    var stmt = snowflake.createStatement({
        sqlText: "create database " + DBNAME
    });
    var ret = stmt.execute();
    return ret;
    $$
;
GRANT USAGE ON PROCEDURE
demo_db.public.pTestDatabaseCreation(string) TO ROLE SYSADMIN;
USE ROLE SYSADMIN;
CALL demo_db.public.pTestDatabaseCreation('my_test_database');
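Because the procedure is defined with EXECUTE AS OWNER, it runs with the privileges of its owner (here, ACCOUNTADMIN), so a caller holding only the SYSADMIN role can create databases through it without ever being granted ACCOUNTADMIN.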
Snowflake agent timeout recommendations
Occasionally, migrations may fail due to brief network disruption or heavy data processing. Reduce the chance of these failures by setting the following timeout properties in the hive agent add snowflake CLI command:
| Property | Description | Recommended | 
|---|---|---|
| --network-timeout | Number of milliseconds to wait for a response when interacting with the Snowflake service before returning an error. 0 (zero) specifies that no network timeout is set. Default: 600000. | 600000 |
| --query-timeout | Number of seconds to wait for a query to complete before returning an error. 0 (zero) specifies that the driver should wait indefinitely. Default: 0. | 0 |
- The recommended 10-minute (600000 ms) --network-timeout ensures that brief disruption to the network won't cause migration failures, while migrations still won't hang indefinitely.
- Leaving --query-timeout at the default value of 0 ensures that long query times never result in a timeout.
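For example, a minimal sketch of the recommended settings on the add command (all other required properties, covered in the CLI section below, are omitted here):
hive agent add snowflake privateKey --network-timeout 600000 --query-timeout 0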
Configure Snowflake as a target with the UI
- From the Dashboard, select an instance under Instances. 
- Under Filesystems & Agents, select Metastore Agents. 
- Under Metastore Agents, select Connect to Metastore. 
- Under Add a Metadata Agent, select the target to which you want to migrate your data from Filesystem. 
- Under Metastore Type, select Snowflake. 
- Enter a Display Name for the agent. 
- Enter your User. This is the email address associated with the user that represents LiveData Migrator for migrations to Snowflake. For example, user@domain.com.
- Select one of the following authentication types:
  - Basic Authentication
    note: If you're using Basic Authentication, you'll need to reenter the password when updating this agent.
  - Private Key
    If you select private key authentication, the following entries are required:
    - Path to Private Key File. Enter the path to the file in which you store the private key on your Data Migrator cluster host. For example, /home/hive/snowflake_key.p8.
    - Private Key File Password. Enter the password you used when you created the private key file.
- Enter an Account Identifier. For example, <organization name>-<account name>. See Account Identifier and Organization Name.
- Enter a Warehouse Name. This is the name of the warehouse created in Snowflake. For example, <cirata_whs>. See Overview of Warehouses.
- Enter a Stage Name. This is the name of the stage created in Snowflake that links the filesystem attached to this agent to the Snowflake environment. For example, <cirata_stage>. See Create Stage.
- Enter a Stage Database. This is the Snowflake database that contains the stage. For example, <cirata_db>. See Database.
- Enter a Stage Schema Name. This is the schema that contains the stage. See Database and Schema. 
- Enter a Schema Name. This is the schema into which this agent migrates tables. For example, database and table testdb.table1, once migrated, display in Snowflake as "testdb".PUBLIC."table1", where PUBLIC is the default value for the schema name (a query example follows these steps).
- The following entries are optional:
  - Enter a Role. This is the role with which this agent acts. For example, a default Snowflake role or a custom role for your organization. You can enter a custom role for the JDBC connection used by Hive Migrator.
  - Enter a Default Filesystem Override to override the default filesystem URI. We recommend this for complex use cases only.
- Select Save. 
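Once a migration completes, you could query the example table above in Snowflake. This is a minimal sketch, assuming the agent created the table with the quoted, case-preserved names shown in the Schema Name step:
SELECT * FROM "testdb".PUBLIC."table1";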
Next steps
- Create a metadata migration using the Snowflake target agent you just configured. 
- Monitor the following from the Dashboard:
  - The progress of the migration.
  - The status of the migration.
  - The health of your agent connection. To view the connection status:
    - Select Check status from the ellipsis.
    - Select Settings.
    - Select View agent.
Configure Snowflake as a target with the CLI
- Run the following command to sign in to Data Migrator in the CLI:
  livedata-migrator
- Add Snowflake as a metastore agent by running one of the following commands, depending on the authentication method you want to use:
  - Add a Snowflake agent with basic authentication:
    hive agent add snowflake basic
  - Add a Snowflake agent with private key authentication:
    hive agent add snowflake privateKey
- Enter values for the following properties depending on which authentication method you selected in the previous step:

Basic authentication properties

| Property | Description |
|---|---|
| --password | Your password for basic authentication. |

Private key authentication properties

| Property | Description |
|---|---|
| --private-key-file | Path to your private key file for private key authentication. |
| --private-key-file-pwd | Password that corresponds to the above private key file (if not empty). |
- Enter values for the following properties to complete your Snowflake configuration:

| Property | Description |
|---|---|
| --user | Your Snowflake username. |
| --file-system-id | The target filesystem ID. In the UI, this is called Filesystem. |
| --account-identifier | A unique ID for your Snowflake account. See Account Identifier and Organization Name. |
| --warehouse | The name of the warehouse created in Snowflake. A Snowflake-based cluster of compute resources. |
| --default-fs-override | Optional. A filesystem URI to override the default target filesystem URI. We recommend this for complex use cases only. |
| --name | Optional. A name for the metastore agent. If you don't enter a name, Data Migrator generates one automatically. |
| --stage | Temporary storage for data being migrated to Snowflake. A stage is an external stage that uses a cloud storage service from Amazon S3, Azure, or Google Cloud. |
| --stage-database | The Snowflake database that contains the stage. For example, <cirata_db>. See Database. |
| --stage-schema | The schema that contains the stage. See Database and Schema. |
| --schema | The schema into which this agent migrates tables. For example, database and table testdb.table1, once migrated, display in Snowflake as "testdb".PUBLIC."table1", where PUBLIC is the default value for the schema name. |
| --role | Optional. A custom role for the JDBC connection used by Hive Migrator. |
| --network-timeout | Number of milliseconds to wait for a response when interacting with the Snowflake service before returning an error. |
| --query-timeout | Number of seconds to wait for a query to complete before returning an error. |
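Putting the properties together, a complete privateKey command might look like the following sketch. All values are illustrative placeholders; substitute your own names and enter the command as a single line:
hive agent add snowflake privateKey --name snowflakeAgent --file-system-id targetFs --user hvm_user@example.com --account-identifier myorg-myaccount --warehouse HVM_WAREHOUSE --stage HVM_STAGE --stage-database HVM_DB --stage-schema HVM_SCHEMA --schema PUBLIC --private-key-file /home/hive/snowflake_key.p8 --private-key-file-pwd mykeypassword --network-timeout 600000 --query-timeout 0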
- Run the following commands to check or update your configuration:

| Command | Description |
|---|---|
| hive agent check | Check whether the Hive agent can connect to the metastore. |
| hive agent delete | Delete a Hive agent. |
| hive agent list | List all configured Hive agents. |
| hive agent show | Show the configuration for a Hive agent. |
| hive agent types | List supported Hive agent types. |
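For example, to verify that the new agent can reach Snowflake (the --name option shown here is an assumption; confirm the exact syntax in your CLI's help):
hive agent check --name snowflakeAgent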
Next steps
Create a metadata migration with the CLI using the Snowflake target agent you just configured.