Version: 1.22.0

Scaling

info

Before deploying this feature, please contact WANdisco Support to ensure your use case is suitable and to receive detailed deployment instructions.

The latest release of Data Migrator introduces data transfer agents as a preview feature. This functionality is under active development and isn't intended for use in most production environments.
See Preview features.

Data transfer agents let you scale beyond the limitations of a single host by sharing the workload of transferring data across additional hosts with access to your source. Spreading the work this way accelerates data transfer by removing the network, memory, and CPU bottlenecks of a single host.

This means Data Migrator can scale until it reaches the capacity of your wide area network or another limit, such as the data transfer capability of your storage environment.
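
As a rough illustration of this reasoning, the sketch below models overall transfer rate as the smallest of three limits: the combined rate of all hosts, the wide area network, and the storage environment. It assumes every host contributes the same rate and that rates add linearly, which is a simplification and not a documented property of Data Migrator. The function names and all figures are hypothetical and are not measurements.

```python
import math


def effective_throughput_gbps(hosts, per_host_gbps, wan_gbps, storage_gbps):
    """Estimate the overall transfer rate as the tightest of three limits.

    Assumption: hosts contribute equally and their rates add linearly.
    """
    return min(hosts * per_host_gbps, wan_gbps, storage_gbps)


def hosts_to_saturate(per_host_gbps, wan_gbps, storage_gbps):
    """Smallest host count at which adding more hosts no longer helps."""
    ceiling = min(wan_gbps, storage_gbps)
    return math.ceil(ceiling / per_host_gbps)


# Hypothetical figures: 2 Gb/s per host, 10 Gb/s WAN, 8 Gb/s storage.
for n in (1, 3, 5):
    print(n, effective_throughput_gbps(n, 2.0, 10.0, 8.0))
```

With these made-up figures, storage becomes the limiting factor once four hosts are transferring data, so a fifth host would add no further throughput.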

Prerequisites

  • You are the system administrator.

  • Additional hosts are deployed on your network that can access the source storage environment.

  • You have the default name hdfs for your system user and system group.
    See Configure system users.

  • Port 1433 is open between the host running Data Migrator and all hosts running data transfer agents.
    See Network requirements.
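
One way to check the port prerequisite before deployment is a plain TCP connection test run from the Data Migrator host. The sketch below uses only the Python standard library and is not part of Data Migrator. The helper names are hypothetical, and the port number is taken from the prerequisite above. A failed check can mean either that the port is blocked or that nothing is listening on it yet, so treat the result as a hint and not as proof.

```python
import socket


def port_is_open(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_agents(hosts, port=1433, timeout=3.0):
    """Map each agent hostname to the result of its connectivity check."""
    return {host: port_is_open(host, port, timeout) for host in hosts}
```

For example, calling check_agents(["agent1.example.com", "agent2.example.com"]) returns a dictionary keyed by hostname, where the hostnames shown are placeholders for your own agent hosts.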

Limitations and considerations

The following will be available in future releases:

  • Configure data transfer agents using the UI. (Currently, the REST API or the CLI must be used to deploy data transfer agents.)

  • Manage your bandwidth limit across all hosts that are part of a Data Migrator deployment.

Please note the following:

  • Data transfer agents help move data but don't scale metadata migrations, which are typically under much lower load and don't experience bottlenecks when deployed on a single host.

  • While there's no theoretical limit to the number of data transfer agents that can be deployed, testing has focussed on deployments with fewer than 8 nodes.