Version: 2.3

Target Match

note

In the 2.3 release, Target Match can only be enabled when a migration is created with the CLI. It can't be switched on for a migration that already exists. To use Target Match with an existing migration, stop the migration and re-add it via the CLI with the --target-match option.

caution

Target Match removes files from target file systems if they don't exist at source. Read and understand the Target Match option fully to determine if it is applicable to your use case.

Use the Target Match data migration configuration option to remove files on the target file system that don't exist at the source.

With Target Match enabled, Data Migrator does two things during scanning: it identifies and migrates files from the source, and it identifies and removes files on the target that aren't present at the source.

Scanning occurs when a live migration starts or is reset, when a one-time migration starts, and when a recurring migration starts and each time it recurs.

With Target Match disabled on a 'live' migration, Data Migrator scans the source file system for files to replicate. Any 'delete' or 'rename' events are replicated in real-time when the migration becomes live.

With Target Match disabled on a 'non-live' migration, Data Migrator scans the source file system for files to replicate but can't replicate live 'rename' and 'delete' events. Data Migrator only adds to previously migrated content. Files that are later removed or renamed at source remain at target, resulting in 'extra' files at target that no longer exist on the source.

Using the Target Match option allows Data Migrator to match the source and target paths by scanning both file systems, migrating newly added files and removing any 'extra' files that don't exist at source.

Target Match disabled

  • Scans the source file system to identify files to migrate.

  • Data Migrator will not be aware of or take any action on files that exist on target but not at source.

Target Match enabled

  • Scans both source and target file systems.

  • Identifies and migrates files from source.

  • Identifies and removes any files that exist on target but don't exist at source.
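The difference between the two modes can be pictured as a set comparison. The following is a minimal conceptual sketch, not Data Migrator's implementation: the function name plan_scan and the sample paths are hypothetical, and it lists every source file as a migration candidate without modelling whether a file already on target would be copied again.

```python
def plan_scan(source_paths, target_paths, target_match):
    """Return (to_migrate, to_remove) for one scan in a simplified model."""
    source = set(source_paths)
    target = set(target_paths)
    # In both modes, files found at source are candidates for migration.
    to_migrate = sorted(source)
    # Only a Target Match scan inspects the target for files missing at source.
    to_remove = sorted(target - source) if target_match else []
    return to_migrate, to_remove


# Hypothetical listings: old.csv exists on target only.
source = ["/data/4/a.csv", "/data/4/b.csv"]
target = ["/data/4/a.csv", "/data/4/old.csv"]

print(plan_scan(source, target, target_match=False))
# (['/data/4/a.csv', '/data/4/b.csv'], [])
print(plan_scan(source, target, target_match=True))
# (['/data/4/a.csv', '/data/4/b.csv'], ['/data/4/old.csv'])
```

In this model, the file that exists only on target is left untouched when Target Match is disabled and is selected for removal when it is enabled.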

Additional information

note

Target Match gives Data Migrator the capability to delete extraneous files whilst scanning. If scanning has finished on a certain region of the file system, that region will not be re-evaluated until the next scan. Any extraneous files manually created on the target post scanning will not be identified or removed until the next scan.

caution

Exclusions apply only to the migration of files. They don't limit or trigger removals from the target when using Target Match.

  • Excluding a file from a migration that exists on source and target won't result in the file being removed from target using Target Match.
  • Excluding a file from a migration that exists on target but not on source will still result in the file being removed from target using Target Match.
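The two exclusion cases can be illustrated with a small conceptual sketch. It is an assumption-laden model rather than the product's logic: exclusions are represented here as a plain set of paths (real exclusions are configured differently), and plan_with_exclusions and the sample paths are hypothetical names used only for illustration.

```python
def plan_with_exclusions(source_paths, target_paths, excluded):
    """Simplified model of a Target Match scan with exclusions applied."""
    source = set(source_paths)
    target = set(target_paths)
    # Exclusions filter what gets migrated from source...
    to_migrate = sorted(source - set(excluded))
    # ...but play no part in deciding what is removed from target.
    to_remove = sorted(target - source)
    return to_migrate, to_remove


source = ["/data/4/a.csv", "/data/4/tmp.log"]
target = ["/data/4/a.csv", "/data/4/tmp.log", "/data/4/stale.log"]
excluded = {"/data/4/tmp.log", "/data/4/stale.log"}

to_migrate, to_remove = plan_with_exclusions(source, target, excluded)
print(to_migrate)  # ['/data/4/a.csv']
print(to_remove)   # ['/data/4/stale.log']
```

Here tmp.log is excluded and exists on both sides, so it is neither migrated nor removed. stale.log is also excluded but exists only on target, so it is still selected for removal.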

caution

Internal path mappings and Target Match are incompatible and can't be used together. See this Known issue for conditions where this limitation isn't enforced.

Enable Target Match on a migration

Enable Target Match during the creation of your migration.

UI

In this release Target Match can't be configured with the UI.

Data migrations created in the UI have Target Match disabled by default. The exception is a live migration added to an ADLS Gen2 live source: Target Match is enabled by default for all migrations created with this source type.

CLI

Use --target-match with the migration add command to enable Target Match when creating a new migration.

Target Match is disabled when --target-match is not present, with the exception of live migrations on an ADLS Gen2 live source which use Target Match regardless of the presence of the --target-match option.

Example: create a migration with Target Match

The following example shows a recurring migration with Target Match enabled.


migration add --name example1 --path /data/4 --source mysource --target mytarget --scan-only --recurring-migration --recurring-period 10m --target-match

tip

Target Match will remove files on target. The CLI will prompt you to confirm. You won't get a confirmation prompt when adding a live migration on live ADLS Gen2 sources.

tip

When adding a live migration to an ADLS Gen2 live source, Target Match is enabled by default regardless of the presence of the --target-match option.

Check if enabled

UI

In this release you can't check if Target Match is enabled on a migration with the UI.

CLI

Use the migration show CLI command with the --detailed option to show a migration configuration.


migration show --name MyMigration1 --detailed

If Target Match has been enabled, migrationScanType will have a value of "TWO_WAY_SCAN", as in this excerpt of the output:

..
"target": "hdfstarget",
"state": "COMPLETED",
"resumable": false,
"abortReason": null,
"migrationStrategy": "NO_EVENT_STREAM",
"migrationScanType": "TWO_WAY_SCAN",
"exclusions": [
..
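If you want to script this check, something like the following sketch may help. It assumes you have captured the complete detailed output and that it parses as a single JSON object, which the truncated excerpt above does not confirm. The helper name target_match_enabled and the sample string are hypothetical. Because this page only documents the "TWO_WAY_SCAN" value, the helper treats any other value, or a missing field, as "not enabled".

```python
import json


def target_match_enabled(detail_json: str) -> bool:
    """Return True if the migration config reports a two-way scan."""
    config = json.loads(detail_json)
    # Only "TWO_WAY_SCAN" is documented as indicating Target Match.
    return config.get("migrationScanType") == "TWO_WAY_SCAN"


# Hypothetical, shortened stand-in for the captured output.
sample = (
    '{"target": "hdfstarget", "state": "COMPLETED", '
    '"migrationStrategy": "NO_EVENT_STREAM", '
    '"migrationScanType": "TWO_WAY_SCAN"}'
)

print(target_match_enabled(sample))  # True
```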

Logging

Extraneous files identified on the target by Target Match are logged in the migration-audit log for the specific migration with targetOnly=true.

..
2023-10-17 14:40:36.596: Path /Data/1/menu.sh returned from Iterator [sourceOnly=false, targetOnly=false]
2023-10-17 14:40:36.596: Path /Data/1/oldfile1 returned from Iterator [sourceOnly=false, targetOnly=true]
2023-10-17 14:40:36.596: Path /Data/1/oldfile2 returned from Iterator [sourceOnly=false, targetOnly=true]
..
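To pull the extraneous paths out of a log in this format, a short parser along these lines could be used. It is a sketch that assumes every relevant line matches the layout of the excerpt above exactly. The function name target_only_paths is hypothetical, and the sample lines are copied from the excerpt.

```python
import re

# Pattern modelled on the audit log excerpt; other line shapes are ignored.
LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}: "
    r"Path (?P<path>.+) returned from Iterator "
    r"\[sourceOnly=(?:true|false), targetOnly=(?P<target_only>true|false)\]$"
)


def target_only_paths(lines):
    """Return paths that the log marks with targetOnly=true."""
    paths = []
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("target_only") == "true":
            paths.append(match.group("path"))
    return paths


sample = [
    "2023-10-17 14:40:36.596: Path /Data/1/menu.sh returned from Iterator [sourceOnly=false, targetOnly=false]",
    "2023-10-17 14:40:36.596: Path /Data/1/oldfile1 returned from Iterator [sourceOnly=false, targetOnly=true]",
    "2023-10-17 14:40:36.596: Path /Data/1/oldfile2 returned from Iterator [sourceOnly=false, targetOnly=true]",
]

print(target_only_paths(sample))  # ['/Data/1/oldfile1', '/Data/1/oldfile2']
```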

Considerations

caution

Disaster recovery scenarios

Target Match identifies files to remove from a target. The source and target file system selection becomes a more critical component of any migration when using Target Match. For instance, if using a migration to recover a primary file system from a target, any new, additional, or extra files on the primary file system may actually be required. In this scenario, a migration with Target Match would not be applicable.

Hive compaction

Target Match is not recommended if Hive compaction is enabled on the target file system.

Contact Support if you have any questions or concerns around the use of Target Match with your migrations.