LiveData Migrator tracks events and transactions on your Hadoop Distributed File System (HDFS) NameNode to migrate data. Some default NameNode configuration settings can slow down migrations with heavier data loads, so we recommend configuring each of these properties in the hdfs-site.xml file on your HDFS cluster:
Set dfs.namenode.inotify.max.events.per.rpc to 100000 in your hdfs-site.xml file
This is the maximum number of events the NameNode can send to LiveData Migrator and other inotify clients in a single Remote Procedure Call (RPC) response. LiveData Migrator sends RPCs to read events on the filesystem, which it uses to detect data changes that need to be migrated. On filesystems with lots of activity, the default maximum of 1000 may mean the NameNode sends events more slowly than they happen to the filesystem, and migrations will therefore fall behind changes made.
This value is 1000 by default. Increasing it to 100000 lets migrations process more events with each RPC, increasing the rate of data migration.
Increasing the value to 100000 will only consume an additional 1MB of NameNode memory.
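As a minimal sketch, assuming you edit hdfs-site.xml by hand rather than through a cluster management tool, the entry for dfs.namenode.inotify.max.events.per.rpc might look like the following. Where the entry goes in your file, and whether the property is already defined there, depends on your cluster.

```xml
<!-- hdfs-site.xml: place inside the existing <configuration> element. -->
<!-- If the property is already defined, change its value instead of adding a duplicate. -->
<property>
  <name>dfs.namenode.inotify.max.events.per.rpc</name>
  <!-- Recommended value from this page; the default is 1000. -->
  <value>100000</value>
</property>
```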
Set dfs.namenode.num.extra.edits.retained to 25000000 in your hdfs-site.xml file
This is the maximum number of transactions the NameNode retains. Every time a new transaction is made past this number, the oldest one is deleted.
LiveData Migrator uses these transactions to detect and migrate changes to the filesystem. If the next queued transaction to read is deleted while a migration is underway or paused, the migration won't be able to continue. This will usually only happen on a very transaction-heavy filesystem if LiveData Migrator is manually turned off or undergoing an update, where it will restart and try to continue migrations from events that have already been deleted.
This value is 1000000 by default. Increasing it to 25000000 eliminates the risk of transaction loss with no impact on performance.
Increasing the value to 25000000 does not affect cluster performance. The extra transaction retention will only consume approximately several gigabytes of extra storage space.
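As a minimal sketch, again assuming hdfs-site.xml is edited by hand, the entry for dfs.namenode.num.extra.edits.retained might look like the following. Adapt it if your cluster manages this file through a management tool.

```xml
<!-- hdfs-site.xml: place inside the existing <configuration> element. -->
<!-- If the property is already defined, change its value instead of adding a duplicate. -->
<property>
  <name>dfs.namenode.num.extra.edits.retained</name>
  <!-- Recommended value from this page; the default is 1000000. -->
  <value>25000000</value>
</property>
```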
Below is a glossary of the NameNode configuration properties that are most relevant to LiveData Migrator:
| Property | Description | Default value | Recommended value |
| --- | --- | --- | --- |
| dfs.namenode.inotify.max.events.per.rpc | The maximum number of events the NameNode can send to inotify clients (including LiveData Migrator) in one Remote Procedure Call (RPC) response. | 1000 | 100000 |
| dfs.namenode.num.extra.edits.retained | The number of transactions the NameNode retains. LiveData Migrator reads these transactions to track filesystem activity. | 1000000 | 25000000 |
| dfs.namenode.checkpoint.txns | The number of transactions after which the NameNode will create a checkpoint. This spreads the load across multiple, smaller checkpoints instead of a single, oversized checkpoint that could harm performance. In most cases, no modification is necessary. | 1000000 | 1000000 |
| dfs.namenode.max.extra.edits.segments.retained | The maximum number of extra edit log segments that the NameNode will maintain. These segments contain the transactions retained by dfs.namenode.num.extra.edits.retained. In most cases, no modification is necessary. | 10000 | 10000 |
Restart all cluster services that rely on HDFS configuration (including the HDFS service) to apply your configuration changes.