Configure Parallel Scan
If you use the parallel scan feature and have feedback to share, contact us.
Data Migrator allows you to enable parallel scan mode for Target Match migrations.
How it works
Parallel Scan is an improved approach to scanning that traverses the directory tree in a different order. Previously, we scanned from each root node down to its lowest-level leaf node before moving on, an approach sometimes referred to as depth-first scanning. Parallel Scan takes a breadth-first approach instead: each level of the tree is scanned in full, and the traversal then proceeds recursively to the next level.
We have also introduced parallel processing to this scanning approach, along with a global scan pool that helps manage system resources.
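The breadth-first, parallel traversal described above can be illustrated with a minimal sketch. This is not Data Migrator's implementation: the function names `list_dir` and `breadth_first_scan` are hypothetical, it scans a local filesystem rather than a migration source, and the `workers` argument is only a stand-in for the share of the scan pool that a migration would be granted.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_dir(path):
    """Return (files, subdirs) found directly under path."""
    files, subdirs = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            else:
                files.append(entry.path)
    return files, subdirs


def breadth_first_scan(root, workers=4):
    """Scan the tree one level at a time (hypothetical sketch).

    All directories in the current level are listed in parallel;
    their subdirectories form the next level to scan.
    """
    found = []
    level = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while level:
            next_level = []
            # pool.map lists every directory of this level concurrently.
            for files, subdirs in pool.map(list_dir, level):
                found.extend(files)
                next_level.extend(subdirs)
            level = next_level
    return sorted(found)
```

Capping the number of workers bounds how many directory listings are in flight at once, which is the kind of load control a shared scan pool is intended to provide.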
What it is addressing
- Scanning performance: faster scan times
- System resource management: helps manage the scanning load on the system and gives enhanced control of resources on the NameNode
Configuration of Parallel Scan
The following properties are available for configuration of the parallel scan feature:
Name | Details | Configure via API | Configure via application.properties |
---|---|---|---|
global.scan.limit | Integer size of the global scan pool from which migrations request scan capacity | ✅ | ✅ |
global.scan.pool.max | Maximum value to which global.scan.limit can be set (default is 10,000) | ✅ | |
migration.parallel.scan.percentage | Integer percentage of the scan pool that a single migration will request; overrides the priority percentages set by migration.priority.pool.allocations | ✅ | |
migration.priority.pool.allocations | Percentages of the scan pool that migrations of each priority will request; default values are 2%, 4% and 8% for low, normal and high priorities respectively | ✅ | |
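To make the relationship between these properties concrete, the following sketch works through the arithmetic. It is an illustration only: the function `requested_scan_slots` is hypothetical, and rounding down to a whole number is an assumption, since the table does not state how fractional results are handled.

```python
# Defaults taken from the table above.
DEFAULT_PRIORITY_PERCENTAGES = {"low": 2, "normal": 4, "high": 8}
GLOBAL_SCAN_POOL_MAX = 10_000


def requested_scan_slots(global_scan_limit, priority="normal",
                         parallel_scan_percentage=None,
                         pool_max=GLOBAL_SCAN_POOL_MAX):
    """Estimate how much of the scan pool one migration requests (sketch)."""
    # global.scan.limit may not exceed global.scan.pool.max.
    if not 0 < global_scan_limit <= pool_max:
        raise ValueError("global.scan.limit must be between 1 and global.scan.pool.max")
    # A per-migration percentage overrides the priority-based percentage.
    if parallel_scan_percentage is not None:
        percentage = parallel_scan_percentage
    else:
        percentage = DEFAULT_PRIORITY_PERCENTAGES[priority]
    # Assumption: fractional results are rounded down.
    return global_scan_limit * percentage // 100
```

Under these assumptions, with global.scan.limit set to 1,000, a low-priority migration would request 20 from the pool and a high-priority migration 80, while setting migration.parallel.scan.percentage to 50 would request 500 regardless of priority.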
In certain circumstances, such as hundreds of concurrent migrations over extensive filesystem tree structures, we must ensure that sufficient system resources are allocated for parallel scan. In particular, increasing the maximum size of the heap available to the Java Virtual Machine (JVM) is key; see the JVM_MAX_MEM property.