Version: 2.4.3 (latest)

Target Match

caution

Target Match removes files from target file systems if they don't exist at source. Read and understand the Target Match option fully to determine if it is applicable to your use case.

note

Run a verification report before enabling Target Match to identify extraneous files on the target and decide whether removal is required.

Use the Target Match data migration configuration option to remove files on the target file system that don't exist at the source.

With Target Match enabled, Data Migrator scans both file systems and migrates files from source to target. It also identifies files that exist on the target but not at the source, and removes them from the target during scanning.

info

Scanning occurs at the start and on reset of a live migration, at the start of a one-time migration, and at the start and on each recurrence of a recurring migration.

With Target Match disabled on a 'non-live' migration, Data Migrator scans only the source file system for files to replicate. Because the migration is in a non-live state, it doesn't replicate live events. These events are ignored, which can leave extra files on the target file system that no longer exist at the source.

Additional information

note

With Target Match enabled, Data Migrator can delete extraneous files that exist on the target while scanning. Once scanning has finished for a region of the file system, that region isn't re-evaluated until the next scan. Any extraneous files created manually on the target after that point aren't identified or removed until the next scan occurs.

caution

Exclusions apply to the migration of files; they don't limit or trigger removals from the target when using Target Match.

  • Excluding a file from a migration that exists on source and target won't result in the file being removed from target using Target Match.
  • Excluding a file from a migration that exists on target but not on source will still result in the file being removed from target using Target Match.
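
The two cases above can be illustrated with a small conceptual sketch. This is not Data Migrator's implementation: the function name `plan_with_exclusions`, the glob-style exclusion pattern, and the example paths are all hypothetical, and each file system is reduced to a set of paths.

```python
from fnmatch import fnmatchcase


def plan_with_exclusions(source_paths, target_paths, exclusion_patterns):
    """Return (to_migrate, to_remove) for a Target Match scan (conceptual model)."""

    def excluded(path):
        # Assumption: exclusions are modelled here as simple glob patterns.
        return any(fnmatchcase(path, pattern) for pattern in exclusion_patterns)

    source, target = set(source_paths), set(target_paths)
    # Exclusions filter what is migrated from the source...
    to_migrate = sorted(p for p in source - target if not excluded(p))
    # ...but removal depends only on whether the path exists at the source.
    to_remove = sorted(target - source)
    return to_migrate, to_remove


source = {"/d/keep.tmp", "/d/file"}
target = {"/d/keep.tmp", "/d/old.tmp"}
to_migrate, to_remove = plan_with_exclusions(source, target, ["*.tmp"])
print(to_migrate)  # ['/d/file']
print(to_remove)   # ['/d/old.tmp']
```

In this sketch, `/d/keep.tmp` is excluded and exists on both sides, so it is neither migrated nor removed. `/d/old.tmp` is also excluded, but it exists only on the target, so it is still removed.
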
caution

Internal path mappings and Target Match are incompatible and can't be used together. See this Known issue for conditions where this limitation isn't enforced.

Enable and disable Target Match on a migration

Target Match disabled

  • Scans the source file system to identify files to migrate.

  • Data Migrator will not be aware of or take any action on files that exist on target but not at source.

Target Match enabled

  • Scans both source and target file systems.

  • Identifies and migrates files from source.

  • Identifies and removes any files that exist on target but don't exist at source.
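
As a rough mental model of the difference between the two modes, each file system can be treated as a set of paths. The sketch below is illustrative only and is not how Data Migrator is implemented. The function name `plan_scan` and the example paths are hypothetical, and it ignores content comparison, metadata, and live events.

```python
def plan_scan(source_paths, target_paths, target_match):
    """Return (to_migrate, to_remove) for one scan (conceptual model only)."""
    source = set(source_paths)
    target = set(target_paths)
    # Both modes scan the source; paths missing from the target are migrated.
    to_migrate = sorted(source - target)
    # Only a Target Match scan also looks for paths that exist on the target
    # but not at the source, and marks them for removal.
    to_remove = sorted(target - source) if target_match else []
    return to_migrate, to_remove


source = {"/data/a", "/data/b"}
target = {"/data/b", "/data/stale"}
print(plan_scan(source, target, target_match=False))  # (['/data/a'], [])
print(plan_scan(source, target, target_match=True))   # (['/data/a'], ['/data/stale'])
```

With Target Match disabled, `/data/stale` is never examined and stays on the target. With Target Match enabled, it is identified and removed.
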

Enable Target Match when you create a migration, or when you reset a stopped migration.

UI

Data migrations created in the UI default to Target Match disabled, unless you're adding a live migration to an ADLS Gen2 live source, in which case, Target Match is enabled by default for all migrations created with this source type.

Create migration with Target Match in the UI

  1. Create a migration with the UI.
  2. Under Target Match, select Enable Target Match to enable Target Match on this migration.
info

Target Match is enabled by default for live ADLS Gen2 sources.

Enable Target Match on single migration reset

Enable or disable Target Match on an individual migration when stopped and reset. See Reset Migration.

note

Recurring migrations can't be reset. To enable Target Match on an existing recurring migration, stop and delete it, then create the migration again with Target Match enabled.

Enable Target Match on bulk migration reset

Enable Target Match on multiple migrations when stopped and reset with the Reset with Target Match bulk action. Migrations must be in a Stopped or Failed state to appear in the list of migrations available to reset.

note

Live ADLS Gen2 sources have Target Match enabled by default; the Reset with Target Match bulk action isn't available for live ADLS Gen2 sources.

  1. On the Dashboard page, select the relevant instance from the Instances panel.
  2. Select Data Migrations from the Migrations menu on the left side.
  3. Under Bulk Action, select Reset with Target Match.
  4. Select all migrations you want to update.
  5. Select Reset.
note

You can't disable Target Match with a bulk action. Disable Target Match on an individual migration when stopped and reset. See Reset Migration.

CLI

Create migration with Target Match in the CLI

Use --target-match with the migration add command to enable Target Match when creating a new migration.

Target Match is disabled when --target-match is not present. The exception is live migrations on an ADLS Gen2 live source, which use Target Match regardless of the presence of the --target-match option.

Example, create migration with Target Match

The following example shows a recurring migration with Target Match enabled.


migration add --name example1 --path /data/4 --source mysource --target mytarget --scan-only --recurring-migration --recurring-period 10m --target-match

tip

Target Match will remove files on target. The CLI will prompt you to confirm. You won't get a confirmation prompt when adding a live migration on live ADLS Gen2 sources.

tip

When adding a live migration to an ADLS Gen2 live source, Target Match is enabled by default regardless of the presence of the --target-match option.

Enable Target Match on reset in the CLI

Use the migration reset command with the --target-match option and a value of either 'ENABLE' or 'DISABLE' on a stopped migration to enable or disable Target Match.

Example, enable Target Match on reset.

migration reset --migration-id mig3 --target-match ENABLE

Example, disable Target Match on reset.

migration reset --migration-id mig3 --target-match DISABLE

Check if enabled

UI

Confirm if Target Match is enabled on a migration with the UI.

  1. From the Dashboard, select the migration you want to check.
  2. Select Settings.
  3. Under Migration Settings, check the value of Target Match as either 'Enabled' or 'Disabled'.

CLI

Use the migration show CLI command with the --detailed option to show a migration configuration.


migration show --name MyMigration1 --detailed

If Target Match is enabled, migrationScanType has a value of "TWO_WAY_SCAN":

..
"target": "hdfstarget",
"state": "COMPLETED",
"resumable": false,
"abortReason": null,
"migrationStrategy": "NO_EVENT_STREAM",
"migrationScanType": "TWO_WAY_SCAN",
"exclusions": [
..
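
If you want to script this check, a minimal sketch follows. It assumes you have captured the detailed output as a complete JSON object in a string; the excerpt above is only a fragment, so the exact output format is an assumption. The function name `target_match_enabled` and the sample string are hypothetical.

```python
import json


def target_match_enabled(migration_json: str) -> bool:
    """Return True if the migration config reports a two-way (Target Match) scan."""
    config = json.loads(migration_json)
    # "TWO_WAY_SCAN" is the value documented above for an enabled Target Match.
    return config.get("migrationScanType") == "TWO_WAY_SCAN"


# Hypothetical sample built from the fields shown in the excerpt above.
sample = (
    '{"target": "hdfstarget", "state": "COMPLETED", '
    '"migrationStrategy": "NO_EVENT_STREAM", "migrationScanType": "TWO_WAY_SCAN"}'
)
print(target_match_enabled(sample))  # True
```

The sketch only tests for the documented "TWO_WAY_SCAN" value and treats any other value, or a missing field, as not enabled.
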

Activity monitoring

View the number of removal actions by checking the status of a migration in the UI or with the CLI. Details of files and directories identified and removed using Target Match are contained in the migration-audit log for the specific migration.

note

Run a verification report before enabling Target Match to identify extraneous files on the target and decide whether removal is required. The number shown in 'Total paths removed by Target Match' in the UI and the value of filesRemovedTargetMatchScan from the migration stats CLI command show the number of files removed from the base migration path, not the total number of files removed including those contained in subfolders. See the following Knowledge base article to learn more.

Number of files and directories removed

UI

To view the number of files and directories removed while using Target Match for a migration in the UI:

  1. From the Dashboard, select an instance under Instances.
  2. Under Migrations, select Data Migrations.
  3. Under Data Migrations, select the migration you want to check.
  4. View the number of files and directories removed in the Total paths removed by Target Match field.

Learn more about the migration status and summary.

CLI

Use the migration stats CLI command to view the number of files and directories removed while using Target Match for a migration.

The value of filesRemovedTargetMatchScan shows the number of files removed from the base migration path. The value of dirsRemovedTargetMatchScan shows the number of directories removed.

migration stats --name MyMigration1

Logging

Extraneous files and directories identified for removal on the target by Target Match are logged in the migration-audit log for the specific migration with targetOnly=true.

info

The logs show the removal actions taken. When an action is taken to remove an extraneous directory, the log will reflect the removal of the directory but not the individual files contained in that directory.

..
2024-02-21 14:42:23.185: Path /STATIC/extra_dir_at_target returned from Iterator [sourceOnly=false, targetOnly=true]
2024-02-21 14:42:23.185: Path /STATIC/extra_file_at_target returned from Iterator [sourceOnly=false, targetOnly=true]
2024-02-21 14:42:23.185: Path /STATIC/static_DIR1 returned from Iterator [sourceOnly=false, targetOnly=false]
..

Files and directories on the replication path that Target Match removes are logged in the migration-audit log for the specific migration.

..
2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_dir_at_target] on target.
2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_file_at_target] on target.
..
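
To pull these entries out of a migration-audit log, something like the following sketch could be used. The regular expressions are based only on the log excerpts shown above, so the exact line format is an assumption and may differ between versions. The function name `summarize_audit_lines` is hypothetical.

```python
import re

# Patterns derived from the sample log lines above (assumed format).
ITERATOR_RE = re.compile(
    r"Path (?P<path>.+?) returned from Iterator "
    r"\[sourceOnly=(?:true|false), targetOnly=(?P<target_only>true|false)\]"
)
DELETE_RE = re.compile(r"Deleting Dangling Path \[(?P<path>[^\]]+)\] on target\.")


def summarize_audit_lines(lines):
    """Return (identified, deleted): target-only paths found, and paths removed."""
    identified, deleted = [], []
    for line in lines:
        match = ITERATOR_RE.search(line)
        if match:
            # Only targetOnly=true entries are candidates for removal.
            if match.group("target_only") == "true":
                identified.append(match.group("path"))
            continue
        match = DELETE_RE.search(line)
        if match:
            deleted.append(match.group("path"))
    return identified, deleted


sample_lines = [
    "2024-02-21 14:42:23.185: Path /STATIC/extra_dir_at_target returned from Iterator [sourceOnly=false, targetOnly=true]",
    "2024-02-21 14:42:23.185: Path /STATIC/static_DIR1 returned from Iterator [sourceOnly=false, targetOnly=false]",
    "2024-02-21 14:42:24.735: Deleting Dangling Path [/STATIC/extra_dir_at_target] on target.",
]
identified, deleted = summarize_audit_lines(sample_lines)
print(identified)  # ['/STATIC/extra_dir_at_target']
print(deleted)     # ['/STATIC/extra_dir_at_target']
```

Because a removed directory is logged once, without its contained files, the counts from a summary like this reflect removal actions rather than the total number of files deleted.
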

Considerations

caution

Disaster recovery scenarios

Target Match identifies files to remove from a target. The source and target file system selection becomes a more critical component of any migration when using Target Match. For instance, if using a migration to recover a primary file system from a target, any new, additional, or extra files on the primary file system may actually be required. In this scenario, a migration with Target Match would not be applicable.

Hive compaction

Target Match is not recommended if Hive compaction is enabled on the target file system.

Contact Support if you have any questions or concerns around the use of Target Match with your migrations.