WANDISCO FUSION®
USER GUIDE

1. Release Notes

1.1. Release 2.11.2.3 Build 1245

16 April 2018

WANdisco Fusion 2.11.2.3 is the first minor release following Fusion 2.11.1.5, and includes new features, issue resolutions, platform support and other improvements. These release notes include details on the specific improvements and enhancements to the product, and should be read in conjunction with the product documentation.


1.1.1. Installation

The release can be installed with updates of the IHC server RPM, the Fusion server RPM and the client stack or package. For example, the following packages should be updated for HDP 2.6.0:

fusion-hcfs-hdp-2.6.0-ihc-server-2.11.2.3.el6-xxxx.noarch.rpm
fusion-hcfs-hdp-2.6.0-server-2.11.2.3.el6-xxxx.noarch.rpm
fusion-hcfs-hdp-2.6.0-2.11.2.3.stack.tar.gz

Please contact WANdisco support for help with this process, and find detailed installation instructions in the user guide.
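The three file names listed above appear to share a common pattern. As a rough sketch, assuming that pattern also holds for other distributions and versions (an inference from this single example, not a documented naming rule), the names could be composed as follows. The function name and its parameters are hypothetical, and the build suffix is left as the xxxx placeholder used above.

```python
def update_packages(distro, distro_version, fusion_version, build="xxxx"):
    """Compose the three update package names for one distribution.

    The pattern is inferred from the HDP 2.6.0 example in these notes;
    'build' defaults to the same elided placeholder the notes use.
    """
    prefix = f"fusion-hcfs-{distro}-{distro_version}"
    return [
        f"{prefix}-ihc-server-{fusion_version}.el6-{build}.noarch.rpm",  # IHC server RPM
        f"{prefix}-server-{fusion_version}.el6-{build}.noarch.rpm",      # Fusion server RPM
        f"{prefix}-{fusion_version}.stack.tar.gz",                       # client stack
    ]


for name in update_packages("hdp", "2.6.0", "2.11.2.3"):
    print(name)
```

Always confirm the actual file names against the packages supplied for your platform before updating.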


1.1.2. Highlighted New Features

This release includes the following major new features.

WD-FUI-5823 - Ability to change memberships for rules

The priority zone for a replication rule can now be changed after the rule has been created.


1.1.3. Highlighted Improvements

WD-FUI-5610 - ADLS support for Azure

This release provides installation-time support for configuring ADLS as the underlying file system.

WD-FUS-5243 - Support for CDH 5.14

Cloudera 5.14 is a supported Hadoop distribution.


1.1.4. New Platform Support

WANdisco Fusion has added support for the following new platforms since Fusion 2.11.1.5:

  • CDH 5.14

Platform support for IBM Big Insights 4.0 has been removed from this release.


1.1.5. Available Packages

This release of WANdisco Fusion supports the following versions of Hadoop:

  • ASF Apache Hadoop 2.5.0 - 2.7.0

  • CDH 5.4.0 - CDH 5.14.0

  • HDP 2.1.0 - HDP 2.6.4

  • MapR 5.0.0 - MapR 5.2.0

  • IOP (IBM BigInsights) 4.2.5


1.1.6. System Requirements

Before installing, ensure that your systems, software and hardware meet the requirements found in our online user guide at http://docs.wandisco.com/bigdata/wdfusion.

Certified Third-Party Components

WANdisco certifies the interoperability of Fusion with a wide variety of systems, including Hadoop distributions, object storage platforms, cloud environments, and applications.

  • Amazon S3

  • Amazon EMR 5.3 - 5.4

  • Ambari 1.6, 1.7, 2.0, 2.1

  • CDH 5.4 - 5.14

  • EMC Isilon 7.2, 8.0

  • Google Cloud Storage

  • Google Cloud Dataproc

  • HDP 2.1.0 - 2.6.4

  • IBM BI 2.1.2 - 4.2.5

  • MapR M4.0.1 - M5.2.0

  • Microsoft Azure Blob Storage

  • Microsoft Azure HDInsights 3.5 - 3.6

  • MySQL, PostgreSQL (Hive Metastore)

  • Oracle BDA

Client Applications Supported

WANdisco Fusion is architected for maximum compatibility and interoperability with applications that use standard Hadoop File System APIs. All applications that use the standard Hadoop Distributed File System API or any Hadoop-Compatible File System API should be interoperable with WANdisco Fusion, and will be treated as supported applications. Additionally, Fusion supports the replication of content with Amazon S3 and S3-compatible object stores, locally-mounted file systems, and NetApp NFS devices, but does not require or provide application compatibility libraries for these storage services.


1.1.7. Known Issues

Fusion 2.11.2.3 includes a small set of known issues with workarounds. In each case, resolution for the known issues is underway.

  • Fusion does not support truncate command - WD-FUS-3022

The public boolean truncate(Path f, long newLength) operation in org.apache.hadoop.fs.FileSystem (> 2.7.0) is not yet supported. Files will be truncated only in the cluster where the operation is initiated. Consistency check and repair can be used to both detect and resolve any resulting inconsistencies.

  • Recursive parent directory creation with exclusions - WD-FUS-4847

When an exclusion rule prevents the replication of specific files, applications that perform a mkdir() operation that includes the creation of parent directories will not create those parent directories. This may be an unexpected outcome from the definition of that exclusion rule.

  • Failed deployment of DSM will block removal - WD-FUS-3781

A DSM that fails to deploy properly on a node is not capable of participating in its own removal, and thus blocks removal.

  • Non-recursive OnTap repair repairs recursively - WD-FUS-3932, WD-FUS-3640

All subdirectories for an OnTap snapdiff repair are repaired even when recursive is set to false.


1.1.8. Other Improvements

In addition to the highlighted features listed above, Fusion 2.11.2.3 includes the following improvements in general operation.

  • ADLS support for Azure - WD-FUI-5610

  • Fusion installer client detection - WD-FUI-5734

  • UI validation of secure transfer for Azure storage - WD-FUI-5897

  • Client installation step improvement - WD-FUI-5685

  • Ability to change memberships for rules - WD-FUI-5823

  • Optimized file transfer query from UI - WD-FUI-6001

  • Correct time shown for consistency status - WD-FUI-6009

  • Support for CDH 5.14 - WD-FUI-6011, WD-FUS-5243

  • Hive CLI startup with fs.hdfs.impl - WD-FUS-4876

  • Bypass and healthy directories omitted from cleanup - WD-FUS-5172

  • CompatibilityAdaptor uses ApiCompatibility classloader on proxy creation - WD-FUS-5173

  • Knox audit logging failure correction with Ranger plugin - WD-FUS-3373

  • Classpaths added for Atlas - WD-FUS-3845

  • Distribution upgrade compatibility with yum repository logic - WD-FUS-4186

  • Client upgrade or removal symlink correction - WD-FUS-4722, WD-FUS-5130

  • Correct FileNotFound exceptions when FS cache is disabled - WD-FUS-4579

  • Improved GSN synchronization following extended target zone downtime - WD-FUS-4936

  • Fixed behavior of consistency check task data availability - WD-FUS-5036

  • Talkback collects rpm_info for Debian installs - WD-FUS-5041

  • ListObjectV1 compatibility with ListObjectV2 - WD-FUS-5050

  • Consistency check completion time correction - WD-FUS-5057

  • LocalFS Debian client installation - WD-FUS-5065

  • renameWithOptions support for native Azure file system - WD-FUS-5069

  • Solr for HDP distributions - WD-FUS-5081

  • Compatibility with EsgynDB - WD-FUS-5084

  • LocalFS Fusion installer no longer uses outdated AWS Java SDK - WD-FUS-5098

  • wasbs:// underlying file system configuration - WD-FUS-5126

  • Update Kafka configuration setting for Ranger audit logging - WD-FUS-5143

  • Talkback correction for HDP and Azure - WD-FUS-5162, WD-FUS-5153

  • No retry of failed execution for malformed setfacl operations - WD-FUS-5155

  • Correct defaults for client underlyingFsClass - WD-FUS-5157

  • Improved stack checks on Oozie processes - WD-FUS-4129

1.2. Release 2.11.1.5 Build 1066

21 February 2018

WANdisco Fusion 2.11.1 is the first minor release following Fusion 2.11, and includes new features, issue resolutions, platform support, performance and usability improvements. These release notes include details on the specific improvements and enhancements to the product, and should be read in conjunction with the product documentation.


1.2.1. Installation

The release can be installed with updates of the IHC server RPM, the Fusion server RPM and the client stack or package. For example, the following packages should be updated for HDP 2.6.0:

fusion-hcfs-hdp-2.6.0-ihc-server-2.11.1.4.el6-xxxx.noarch.rpm
fusion-hcfs-hdp-2.6.0-server-2.11.1.4.el6-xxxx.noarch.rpm
fusion-hcfs-hdp-2.6.0-2.11.1.4.stack.tar.gz

Please contact WANdisco support for help with this process, and find detailed installation instructions in the user guide at https://docs.wandisco.com/bigdata/wdfusion/2.11/#install.


1.2.2. Highlighted New Features

This release includes the following major new features.

Fusion Kernel and Performance

The usage of Fusion Kernel has been updated to support higher rates of transaction throughput than previous releases. WANdisco testing that spans a variety of load types shows throughput improvements ranging from 20% to 35% compared to Fusion 2.11.


1.2.3. Highlighted Improvements

WD-FUS-4429 - Support for HDP 2.6.3

Hortonworks Data Platform 2.6.3 is a supported Hadoop distribution.

WD-FUS-4904 - Support for HDP 2.6.4

Hortonworks Data Platform 2.6.4 is a supported Hadoop distribution.


1.2.4. New Platform Support

WANdisco Fusion has added support for the following new platforms since Fusion 2.11:

  • CDH 5.13

  • HDP 2.6.3 - 2.6.4


1.2.5. Available Packages

This release of WANdisco Fusion supports the following versions of Hadoop:

  • ASF Apache Hadoop 2.5.0 - 2.7.0

  • CDH 5.4.0 - CDH 5.13.0

  • HDP 2.1.0 - HDP 2.6.4

  • MapR 5.0.0 - MapR 5.2.0

  • IOP (IBM BigInsights) 4.0 - 4.2.5

The trial download includes the installation packages for CDH and HDP distributions only.


1.2.6. System Requirements

Before installing, ensure that your systems, software and hardware meet the requirements found in our online user guide at http://docs.wandisco.com/bigdata/wdfusion.

Certified Third-Party Components

WANdisco certifies the interoperability of Fusion with a wide variety of systems, including Hadoop distributions, object storage platforms, cloud environments, and applications.

  • Amazon S3

  • Amazon EMR 4.0 - 5.4

  • Ambari 1.6, 1.7, 2.0, 2.1

  • CDH 5.4 - 5.13

  • EMC Isilon 7.2, 8.0

  • Google Cloud Storage

  • Google Cloud Dataproc

  • HDP 2.1.0 - 2.6.4

  • IBM BI 2.1.2 - 4.2.5

  • MapR M4.0.1 - M5.2.0

  • Microsoft Azure Blob Storage

  • Microsoft Azure HDInsights 3.2 - 3.6

  • MySQL, PostgreSQL (Hive Metastore)

  • Oracle BDA

Client Applications Supported

WANdisco Fusion is architected for maximum compatibility and interoperability with applications that use standard Hadoop File System APIs. All applications that use the standard Hadoop Distributed File System API or any Hadoop-Compatible File System API should be interoperable with WANdisco Fusion, and will be treated as supported applications. Additionally, Fusion supports the replication of content with Amazon S3 and S3-compatible object stores, locally-mounted file systems, and NetApp NFS devices, but does not require or provide application compatibility libraries for these storage services.


1.2.7. Known Issues

Fusion 2.11.1 includes a small set of known issues with workarounds. In each case, resolution for the known issues is underway.

  • Fusion does not support truncate command - WD-FUS-3022

The public boolean truncate(Path f, long newLength) operation in org.apache.hadoop.fs.FileSystem (> 2.7.0) is not yet supported. Files will be truncated only in the cluster where the operation is initiated. Consistency check and repair can be used to both detect and resolve any resulting inconsistencies.

  • Recursive parent directory creation with exclusions - WD-FUS-4847

When an exclusion rule prevents the replication of specific files, applications that perform a mkdir() operation that includes the creation of parent directories will not create those parent directories. This may be an unexpected outcome from the definition of that exclusion rule.

  • Failed deployment of DSM will block removal - WD-FUS-3781

A DSM that fails to deploy properly on a node is not capable of participating in its own removal, and thus blocks removal.

  • Non-recursive OnTap repair repairs recursively - WD-FUS-3932, WD-FUS-3640

All subdirectories for an OnTap snapdiff repair are repaired even when recursive is set to false.

  • [Azure deployments only] It’s not possible to set the replication exchange directory on the UI settings screen. Any attempt will result in an error - WD-FUI-5828


1.2.8. Other Improvements

In addition to the highlighted features listed above, Fusion 2.11.1 includes a wide set of improvements in performance, functionality, scale, interoperability and general operation.

  • Check ownership in Fusion handshake token process - WD-FUS-4006

  • Correct ownership for parent directories in Fusion handshake token process - WD-FUS-4079

  • S3 Plugin to not retry certificate errors - WD-FUS-3917, WD-FUS-4672, WD-FUS-4675

  • Ignore .fusion in chown and chmod -R of top-level directory - WD-FUS-1964

  • Report total transfer size correctly - WD-FUS-3218, WD-FUS-4681

  • Improve plugin loading error handling - WD-FUS-3244, WD-FUS-4933

  • Make move across encryption zones NonRetriable - WD-FUS-3630

  • Support replication between clusters with common fs.defaultFS configuration - WD-FUS-3683, WD-FUS-1185, WD-HIVE-110

  • Update WADL representation for global excluded properties - WD-FUS-3742, WD-FUS-3831

  • Workaround for HDFS-3545 - WD-FUS-3853

  • Make AuthenticationException non-retriable - WD-FUS-3865

  • Stalled transfer to report 0 bytes/s - WD-FUS-3903

  • Logging infrastructure change - WD-FUS-3973

  • Log Jersey exceptions - WD-FUS-4091

  • Support s3a:// as an HCFS - WD-FUS-4114

  • Add Fusion jars to Spark classpath in CDH parcel - WD-FUS-4130

  • Clean S3 buffer directory on startup - WD-FUS-4770, WD-FUS-4162

  • CDH Fusion Parcels updates - WD-FUS-4170 (WD-FUS-4130, WD-FUS-4134, WD-FUS-4142, WD-FUS-4152, WD-FUS-4157, WD-FUS-4245, WD-FUS-4246, WD-FUS-4490, WD-FUS-4507, WD-FUS-4508, WD-FUS-4633, WD-FUS-4757)

  • Improved consistency check metrics - WD-FUS-4241

  • Improve bulk S3 object deletion - WD-FUS-4293

  • UTC naming for thread-dump and task-gc files - WD-FUS-4295, DCO-748

  • Configurable S3 ListObjectsRequest maxKeys - WD-FUS-4300

  • Support multiple fs.s3.buffer.dir locations - WD-FUS-4303

  • Allow custom consistency check for locations that are not present in all zones - WD-FUS-4346

  • Log entry for clean shutdown - WD-FUS-4359

  • Ensure maximum of single HFlush in AgreedProposalStore - WD-FUS-4419

  • Logging for HFlush handling - WD-FUS-4444

  • TransferManager NPE correction - WD-FUS-4512

  • Prevent creation of replication rule that matches a default exclusion - WD-FUS-4563

  • Correct talkback feedback on failure - WD-FUS-4595

  • API for summary statistics of execution dependencies - WD-FUS-4647, WD-FUS-3470

  • IHC server should fail to start if ports unavailable - WD-FUS-4652

  • URISyntaxException should be non-retriable - WD-FUS-4674

  • Safeguard against zero chunkSize - WD-FUS-4677

  • Correct pull behavior for file systems that do not support appends - WD-FUS-4805, WD-FUS-4691

  • Improve client messaging on license expiry - WD-FUS-4483, WD-FUS-4693

  • File transfers to report whether total size is final - WD-FUS-4736

  • Improve history management with KMS configuration - WD-FUS-4763

  • Clean object storage buffer directories on startup - WD-FUS-4770

  • Services search for PID if PID file missing - WD-FUS-4796

  • Avoid potential Rename failure retry - WD-FUS-4813

  • Correct silent discards of replicated rename operations - WD-FUS-4814

  • Improved display for transfer of renamed files - WD-FUS-4820, WD-FUS-4824

  • Correct percent remaining display - WD-FUS-4821

  • Configurable early pull - WD-FUS-4835

  • Cache checks on existence of bypass directory - WD-FUS-4894

  • Minimize internal serialization - WD-FUS-4915

  • Correct transfer name handling on unrelated renames - WD-FUS-4928

  • Improve JAVA_HOME discovery - WD-FUS-4964

  • Talkbacks request support ticket number - WD-FUS-3838

  • Consolidate constants across components - WD-FUS-4181

  • Clean up stale repl.dir.exchange entries following server restart - WD-FUS-4413

  • Talkbacks include plugin information - WD-FUS-4639

  • Correct file size display on completed file transfers - WD-FUS-4668

  • Talkback consistency improvements - WD-FUS-4669

  • Improved error handling on replicated directory info - WD-FUS-4826

  • Support multiple URIs per DSM - WD-FUS-4885, WD-FUS-4886

  • RequestID in logs - WD-FUS-4844

  • Do not run scheduled consistency check after directory removal - WD-FUS-4877

  • service fusion-server-stop return code correction - WD-FUS-4760

  • Correction to FINEST logging target - WD-FUS-4779

  • Support for HDP 2.6.3 - WD-FUS-4429

  • Support for HDP 2.6.4 - WD-FUS-4904, WD-FUI-5706

  • Simplify request event lifecycle - WD-FUS-4766, WD-FUS-4791, WD-FUS-4792, WD-FUS-4919

  • Support appends for ADLS - WD-FUS-4878

  • S3 Plugin support for v1 and v2 object listing (fs.fusion.s3.listing.method) - WD-FUS-4859, WD-FUS-5061

  • Recover from MVStore interruption - WD-FUS-5023, WD-FUS-5030

  • Replication of Sentry and Ranger policies - WD-FUS-3956, WD-FUS-4450

  • Document removeAll for DELETE API - WD-FUS-4953

  • Improve package status information - WD-FUI-5779

  • Fusion client parcels for SLES 12 - WD-FUI-5638

  • Correct proxyuser configuration at installation - WD-FUI-5671

  • Fix to retry button for Hive installation - WD-FUI-5163

  • Handle plugin states - WD-FUI-5544

  • Minimize runtime exception logging - WD-FUI-5704

  • Represent manual fast bypass state in UI - WD-FUI-4557

  • Deploy client stack only to nodes with HDFS client - WD-FUI-5312

  • Improve rule removal - WD-FUI-5316

  • Allow throttle retry configuration for S3 plugin - WD-FUI-5456

  • Correct addition of replicated directory for Azure - WD-FUI-5564

  • Support fs.azure.enable.append.support - WD-FUI-5618

  • Better handle multiple replicated files in UI - WD-FUI-5629

  • Handle default Ranger replicated directory - WD-FUI-5639

  • Provide content type for .deb packages - WD-FUI-5699

  • Set fs.s3.consistent.throwExceptionOnInconsistency and fs.s3.consistent.retryCount for EMR - WD-FUI-5703

  • Correct installer client page - WD-FUI-5754

  • Allow transfer.chunk.size UI setting - WD-FUI-5773

  • Improve handshake token validation - WD-FUI-5803

  • Correct blank IHC network interface page - WD-FUI-5367

1.3. Release 2.11.0.3 Build 991

26 January 2018

WANdisco Fusion 2.11.0.3 is a minor release for customers using 2.11.0.x versions of the product. It adds support for new platforms and addresses a small number of minor issues.

We advise all customers using WANdisco Fusion to apply this minor update to their environment.


1.3.1. Installation

WANdisco Fusion 2.11.0.3 can be installed as a full release, or with updates of the IHC server RPM, the Fusion server RPM and the client stack or package. For example, the following packages should be updated for HDP 2.6.0:

  fusion-hcfs-hdp-2.6.0-ihc-server-2.11.0.3.el6-xxxx.noarch.rpm
  fusion-hcfs-hdp-2.6.0-server-2.11.0.3.el6-xxxx.noarch.rpm
  fusion-hcfs-hdp-2.6.0-2.11.0.3.stack.tar.gz

Please contact WANdisco support for assistance with this process.


1.3.2. Highlighted Improvements

FUS-4471 - Removing last replicated directory causes NPE

In previous versions, immediately adding a replication rule for a path that is a sub-directory of a removed replication rule could result in a null pointer exception causing the failure of a Fusion server. This is resolved in WANdisco Fusion 2.11.


1.3.3. Known Issues Resolved

Previous known issues that are resolved in this release are:

SetPermissionRequest should not retry on FileNotFoundException

Operations that attempt to set a permission on a non-existent file may be retried unnecessarily.

This issue is resolved in 2.11.0.3.


1.3.4. New Platform Support

WANdisco Fusion has added support for the following new platforms since Fusion 2.11:

  • CDH 5.13

  • HDP 2.6.3

Support for CDH 5.2 and CDH 5.3 has been removed.


1.3.5. Available Packages

This release of WANdisco Fusion supports the following versions of Hadoop:

  • ASF Apache Hadoop 2.5.0 - 2.7.0

  • CDH 5.4.0 - CDH 5.13.0

  • HDP 2.1.0 - HDP 2.6.3

  • MapR 5.0.0 - MapR 5.2.0

  • IOP (IBM BigInsights) 4.0 - 4.2.5

The trial download includes the installation packages for CDH and HDP distributions only.


1.3.6. System Requirements

Before installing, ensure that your systems, software and hardware meet the requirements found in our online user guide at http://docs.wandisco.com/bigdata/wdfusion.

Certified Third-Party Components

WANdisco certifies the interoperability of Fusion with a wide variety of systems, including Hadoop distributions, object storage platforms, cloud environments, and applications.

  • Amazon S3

  • Amazon EMR 4.0 - 5.4

  • Ambari 1.6, 1.7, 2.0, 2.1

  • CDH 4.4, 5.2 - 5.13

  • EMC Isilon 7.2, 8.0

  • Google Cloud Storage

  • Google Cloud Dataproc

  • HDP 2.1.0 - 2.6.3

  • IBM BI 2.1.2 - 4.2.5

  • MapR M4.0.1 - M5.2.0

  • Microsoft Azure Blob Storage

  • Microsoft Azure HDInsights 3.2 - 3.6

  • MySQL, PostgreSQL (Hive Metastore)

  • Oracle BDA

Client Applications Supported

WANdisco Fusion is architected for maximum compatibility and interoperability with applications that use standard Hadoop File System APIs. All applications that use the standard Hadoop Distributed File System API or any Hadoop-Compatible File System API should be interoperable with WANdisco Fusion, and will be treated as supported applications. Additionally, Fusion supports the replication of content with Amazon S3 and S3-compatible object stores, locally-mounted file systems, and NetApp NFS devices, but does not require or provide application compatibility libraries for these storage services.


1.3.7. Known Issues

WANdisco Fusion 2.11.0.3 includes a small set of known issues with workarounds. In each case, resolution for the known issues is underway.

  • WANdisco Fusion does not support truncate command - WD-FUS-3022

The public boolean truncate(Path f, long newLength) operation in org.apache.hadoop.fs.FileSystem (> 2.7.0) is not yet supported. Files will be truncated only in the cluster where the operation is initiated. Consistency check and repair can be used to both detect and resolve any resulting inconsistencies.

  • Recursive parent directory creation with exclusions - WD-FUS-4847

When an exclusion rule prevents the replication of specific files, applications that perform a mkdir() operation that includes the creation of parent directories will not create those parent directories. This may be an unexpected outcome from the definition of that exclusion rule.


1.3.8. Other Improvements

In addition to the highlighted features listed above, WANdisco Fusion 2.11.0.3 includes the following improvements.

  • Installer determination of Kerberos configuration - WD-FUI-5568

  • Signature verification during install on SLES 12 - WD-FUI-5571

  • Fusion installer hadoop.proxyuser.$fusionuser.hosts value - WD-FUI-5671

  • Improve browser cache handling to resolve installation steps - WD-FUI-5680

  • Bypass and replicated exchange directory configuration for HDInsight - WD-FUI-5556

  • Show priority zone when editing a rule - WD-FUI-5626

  • Correct plugin status display for plugins that fail to start - WD-FUI-5647

  • Improve YARN configuration changes for Big Replicate install - WD-FUI-5653

  • More robust display of replication rules - WD-FUI-5655

  • Correct Ambari stack download links - WD-FUI-5689

  • Zone name display in title - WD-FUI-5509

  • Correct Hive plugin status display - WD-HIVE-757

  • Replicated directory addition fix in Azure - WD-FUI-5564

  • fs.azure.enable.append.support for Azure zones - WD-FUI-5618

  • Correct repair tab warnings for replication rules - WD-FUI-5669

  • Content-Type for client downloads - WD-FUI-4272

  • Improved recovery from significant outages - WD-FUS-4851

  • Talkbacks should not duplicate fusion server logs - WD-FUS-4866

  • Support HDP 2.6.3 - WD-FUS-4429

  • Improve content replication on source zone process restarts - WD-FUS-4797

  • Correct license update instructions - WD-FUS-4869

  • Talkback corrections for Azure - WD-FUS-4343

  • Corrections to NetApp and LocalFs repair - WD-FUS-4840

  • Improve behavior of setPermissionRequest - WD-FUS-4845, WD-FUS-4846

  • LocalFs client packaging fix for Debian - WD-FUS-4879

  • Per-DSM manual bypass script - WD-FUS-4417, WD-FUS-3861

  • Improved retries for replicated Metastore operations - WD-HIVE-781

  • Hive replication corrections - WD-HIVE-786


1.4. Release 2.11.0.2 Build 778

6 December 2017

WANdisco inc. is pleased to present WANdisco Fusion 2.11 as the next major release of the Fusion platform, available now from the WANdisco inc. file distribution site. This release includes key new features, platform support, installation, scale, performance and usability improvements, and establishes the basis for further product extensibility.


1.4.1. Installation

Find detailed installation instructions in the Installation section of the user guide.

Upgrades from Earlier Versions

As a major release, Fusion 2.11 introduces incompatibilities with the network protocols and storage formats used by prior versions. Please contact WANdisco inc. support for information on the upgrade mechanism appropriate for your environment.


1.4.2. Security Fixes

Potential exposure to the following security issues is resolved with WANdisco Fusion 2.11.


1.4.3. Highlighted New Features

This release includes the following major new features.

Fusion Kernel and Performance

WANdisco Fusion leverages WANdisco’s Distributed Coordination Engine (DConE). This release is the first to take advantage of improvements made in the core engine that is now referred to as Fusion Kernel.

The most significant impact of the Fusion Kernel is improved overall product performance. WANdisco testing that spans a variety of load types shows improved throughput and reduced memory requirements. You can expect benefits ranging from 40% to 75% compared to previous releases.

Replication Memberships

Replication rule creation is simplified by the removal of the membership concept.

Memberships, which were used in previous versions of WANdisco Fusion, have been replaced by a simpler priority selection among zones and the ability to control specific Fusion server roles in each zone. Memberships no longer need to be created, and there is no need to remove memberships that are no longer in use by replication rules.

Non-Blocking Consistency Check

Consistency checks provide a mechanism to determine whether there are any differences in the state of content within the scope of a replication rule. In versions of Fusion prior to 2.11, no changes could be made via Fusion to the content being checked while a consistency check was running, so that the results of the check remained valid.

WANdisco Fusion 2.11 introduces an alternative, non-blocking consistency check that allows information on consistency state to be determined without blocking other activity while the check is underway. It takes advantage of tracking the state of changes to content under check during execution, and produces information for each item checked that covers the states: consistent, not-consistent, potentially inconsistent.

Bulk Replication Rules

Multiple replication rules can be created at the same time when they share attributes other than file system location.

Sidelining

WANdisco Fusion versions before 2.11 included a feature called "sidelining". This allowed Fusion nodes that had fallen behind, to a configurable degree, the agreement processing performed among the network of nodes to be sidelined, such that they would no longer participate in agreement processing. The benefit of this approach was to ensure the overall health of a network under memory-constrained conditions, where the slow processing speed of an individual node was prevented from halting progress of the entire network.

A sidelined node required an intrusive process ("unsidelining") to bring it back into the network to continue processing agreements.

WANdisco Fusion 2.11 supports operation in a manner that eliminates the potential for sidelining when nodes exceed memory constraints for agreement processing.

Logging

WANdisco Fusion logging has been changed. Where the Fusion server previously logged information to a set of rolling log files in /var/log/fusion/server named fusion-dcone.log.<number>, that information is now logged to files in the same location that are timestamped on creation, e.g. fusion-server.log.2017-10-06T12:22:53.
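Scripts that previously selected server logs by rolling number may need to read the creation time from the file name instead. The following is a minimal sketch, assuming every server log name uses the same prefix and timestamp layout as the single example above; the function name is hypothetical and the format string is an inference, not a documented specification.

```python
from datetime import datetime

# Prefix taken from the example file name in these release notes.
LOG_PREFIX = "fusion-server.log."


def log_created_at(filename):
    """Return the creation timestamp embedded in a server log file name."""
    if not filename.startswith(LOG_PREFIX):
        raise ValueError(f"not a timestamped Fusion server log: {filename}")
    stamp = filename[len(LOG_PREFIX):]
    # Assumed layout, inferred from the example: YYYY-MM-DDTHH:MM:SS
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")


created = log_created_at("fusion-server.log.2017-10-06T12:22:53")
print(created.isoformat())  # prints 2017-10-06T12:22:53
```

Sorting file names by the parsed value gives the logs in creation order.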

Fusion client logging is disabled by default and can be re-enabled through the Settings > Log Settings view.

Non-Coordinated Notification of File Content

This version of WANdisco Fusion does not use coordinated activities to communicate information among zones about the availability of new file content. This removes a significant portion of communication through the Fusion Kernel related to progress in writing file content and reduces the overall load on the coordination engine as a result.

Broader HDFS API Support

The set of HDFS API methods coordinated by Fusion is extended to include the following, which were not previously coordinated:

  • public void concat(Path trg, Path[] psrcs)

  • public boolean mkdir(Path f, FsPermission permission)

  • public FSDataOutputStream append(Path f, final EnumSet<CreateFlag> flag, final int bufferSize, final Progressable progress)

  • public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress)

  • public FSDataOutputStream create(Path f, FsPermission permission, EnumSet<CreateFlag> flags, int bufferSize, short replication, long blockSize, Progressable progress, final Options.ChecksumOpt checksumOpt)

  • public HdfsDataOutputStream create(final Path f, final FsPermission permission, final boolean overwrite, final int bufferSize, final short replication, final long blockSize, final Progressable progress, final InetSocketAddress[] favoredNodes)

  • public void rename(Path src, Path dst, Rename... options)


1.4.4. Highlighted Improvements

FUS-4471 - Removing last replicated directory causes NPE

In previous versions, immediately adding a replication rule for a path that is a sub-directory of a removed replication rule could result in a null pointer exception causing the failure of a Fusion server. This is resolved in WANdisco Fusion 2.11.

FUS-3719 - User-Agent field in S3 requests

Object upload requests made to an AWS S3 endpoint now include a User-Agent value that identifies the source as WANdisco Fusion 2.11.

FUS-3897 - Repair task improvements

Information about repair tasks in progress includes details on the repair’s source of truth, type and the timestamp of task completion.

FUS-3901 - New default exclusions

HDFS/HCFS replication rules now exclude .tmp and .hive-staging locations by default.

FUS-3968 - Move fusion.username.translation out of core-site.xml

The fusion.username.translation configuration property is now specified in application.properties rather than in core-site.xml, allowing it to be changed without impacting other cluster services.

FUS-4000 - UTC Timestamps

Log entries now include UTC-based time information rather than local timezone values.

FUS-4002 - Fusion to IHC connections use SO_REUSEADDR

The Fusion server can cope with a faster rate of connection recycling independently of the kernel settings.

FUS-3999 - Scheduled Consistency Check

Consistency checks can be scheduled in cron format. The consistencyCheckPeriod that is specified for a given replication rule is now defined as a string with the form of a cron expression, e.g. 0 0 0/6 * * ?
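The example expression has six fields and uses the ? placeholder, which resembles Quartz-style cron syntax (seconds, minutes, hours, day of month, month, day of week). Assuming that interpretation, which these notes do not state explicitly, the sketch below expands the hours field to show when such a check would start. The helper is illustrative only and handles just the '*', single-value and 'start/step' forms.

```python
def expand_field(field, low, high):
    """Expand one numeric cron field: '*', a single value, or 'start/step'."""
    if field == "*":
        return list(range(low, high + 1))
    if "/" in field:
        start, step = field.split("/")
        first = low if start == "*" else int(start)
        return list(range(first, high + 1, int(step)))
    return [int(field)]


expression = "0 0 0/6 * * ?"
# Assumed field order (Quartz-style); not confirmed by the release notes.
seconds, minutes, hours, day_of_month, month, day_of_week = expression.split()

fire_hours = expand_field(hours, 0, 23)
print(fire_hours)  # prints [0, 6, 12, 18]
```

Under this reading, the example schedules a consistency check every six hours, on the hour, starting at midnight.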

FUS-4076 - Consistency Check results across nodes

The results of a consistency check are now available from any Fusion node, not only from the node that initiated the check or from the writer node.

FUS-4202 - Visibility of ongoing transfers

The IHC server now exposes an endpoint to report on the status of ongoing transfers, improving visibility of transfer status across a deployment.

FUS-4224 - Memory use improvements

Memory usage associated with large file writes is improved.

FUS-4314 - Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3)

Replication to S3 can take advantage of SSE-S3 through the fs.fusion.s3.sse.enabled configuration property, which defaults to false.

FUS-4407 - Expose default exclusion rules via API

Default exclusion rules for a replication folder can be found via the API endpoint at /fusion/fs/properties/global/excluded/default
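
A request for this endpoint might be built as in the Python sketch below. The host name and port are placeholders chosen for the example, not values taken from these notes, and the response format is not shown because the notes do not describe it. The request is constructed but not sent.

```python
from urllib.parse import urlunsplit
from urllib.request import Request

DEFAULT_EXCLUSIONS_PATH = "/fusion/fs/properties/global/excluded/default"


def default_exclusions_request(host: str, port: int, scheme: str = "http") -> Request:
    """Build a GET request for the default exclusions endpoint on a Fusion node."""
    url = urlunsplit((scheme, f"{host}:{port}", DEFAULT_EXCLUSIONS_PATH, "", ""))
    return Request(url, method="GET")


# Placeholder host and port; substitute the address of a real Fusion node.
request = default_exclusions_request("fusion-node.example.com", 8082)
# To send it: urllib.request.urlopen(request)
```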


1.4.5. Known Issues Resolved

Previous known issues that are resolved in this release are:

Consistency repair tool fails for files in Swift storage

Previous versions of WANdisco Fusion could not perform a consistency repair to content that was stored in an OpenStack Swift zone. This issue is resolved in WANdisco Fusion 2.11.

Renamed directory with incomplete file will never receive these files

In some circumstances, in previous versions of Fusion, modifying the metadata of a parent directory within a replicated location could prevent the completion of a content transfer that was underway for files beneath that directory. Fusion’s metadata consistency was unaffected, but file content might not have been available in full.

This issue is resolved in Fusion 2.11.

Fusion does not support concat() operation

In previous versions of Fusion, the public void concat(Path trg, Path[] psrcs) operation in org.apache.hadoop.fs.FileSystem was not supported, and its use would result in filesystem inconsistency.

This issue is resolved in WANdisco Fusion 2.11.


1.4.6. New Platform Support

WANdisco Fusion has added support for the following new platforms since Fusion 2.10:

  • ASF Apache Hadoop 2.5.0 - 2.7.0

  • CDH 5.12

  • HDP 2.6.2

Additionally, the Pivotal Hadoop Distribution is no longer a supported platform.


1.4.7. Available Packages

This release of WANdisco Fusion supports the following versions of Hadoop:

  • ASF Apache Hadoop 2.5.0 - 2.7.0

  • CDH 5.4.0 - CDH 5.12.0

  • HDP 2.1.0 - HDP 2.6.2

  • MapR 5.0.0 - MapR 5.2.0

  • IOP (IBM BigInsights) 4.0 - 4.2.5

The trial download includes the installation packages for CDH and HDP distributions only.


1.4.8. System Requirements

Before installing, ensure that your systems, software, and hardware meet the requirements found in our online user guide at http://docs.wandisco.com/bigdata/wdfusion.

Certified Third-Party Components

WANdisco certifies the interoperability of Fusion with a wide variety of systems, including Hadoop distributions, object storage platforms, cloud environments, and applications.

  • Amazon S3

  • Amazon EMR 4.0 - 5.4

  • Ambari 1.6, 1.7, 2.0, 2.1

  • CDH 4.4, 5.2 - 5.12

  • EMC Isilon 7.2, 8.0

  • Google Cloud Storage

  • Google Cloud Dataproc

  • HDP 2.1.0 - 2.6.2

  • IBM BI 2.1.2 - 4.2.5

  • MapR M4.0.1 - M5.2.0

  • Microsoft Azure Blob Storage

  • Microsoft Azure HDInsights 3.2 - 3.6

  • MySQL, PostgreSQL (Hive Metastore)

  • Oracle BDA

Client Applications Supported

WANdisco Fusion is architected for maximum compatibility and interoperability with applications that use standard Hadoop File System APIs. All applications that use the standard Hadoop Distributed File System API or any Hadoop-Compatible File System API should be interoperable with WANdisco Fusion, and will be treated as supported applications. Additionally, Fusion supports the replication of content with Amazon S3 and S3-compatible object stores, locally-mounted file systems, and NetApp NFS devices, but does not require or provide application compatibility libraries for these storage services.


1.4.9. Known Issues

Fusion 2.11 includes a small set of known issues with workarounds. In each case, resolution of the known issues is underway.

  • Fusion does not support truncate command - WD-FUS-3022

The public boolean truncate(Path f, long newLength) operation in org.apache.hadoop.fs.FileSystem (> 2.7.0) is not yet supported. Files will be truncated only in the cluster where the operation is initiated. Consistency check and repair can be used to both detect and resolve any resulting inconsistencies.

  • Recursive parent directory creation with exclusions - WD-FUS-4847

When an exclusion rule prevents the replication of specific files, applications that perform a mkdir() operation that includes the creation of parent directories will not create those parent directories. This may be an unexpected outcome of the definition of that exclusion rule.

  • SetPermissionRequest should not retry on FileNotFoundException - WD-FUS-4846

Operations that attempt to set a permission on a non-existent file may be retried unnecessarily.


1.4.10. Other Improvements

In addition to the highlighted features listed above, Fusion 2.11 includes a wide set of improvements in performance, functionality, scale, interoperability and general operation.

  • Create multiple replication rules at once - WD-FUI-4443

  • Display current, failed and pending rules in one table - WD-FUI-4470

  • FIX - Zone shown twice in UI after induction - WD-FUI-4657

  • Remove/move detected IPs messages at the top of UI installer Server step - WD-FUI-4681

  • Change json REST API output (do not encode URIs) - WD-FUI-4741

  • Support multiple replication rule type in UI - WD-FUI-4859

  • FIX - CC button for non-writer does nothing - WD-FUI-4866

  • Log whenever an email alert is sent - WD-FUI-4893

  • FIX - JAVA_HOME unused at console install - WD-FUI-4894

  • PID file for Fusion UI - WD-FUI-4906

  • Automatically strip protocol prefixes (and trailing paths) from domain inputs - WD-FUI-5078

  • FIX - TypeError on replicated folder creation - WD-FUI-5197

  • FIX - Typo: "relevant" - WD-FUI-5220

  • FIX - Provide zone name in title - WD-FUI-5509

  • FIX - MapR install fails to write core-site.xml - WD-FUI-5388

  • FIX - Installer redirects to IP no host - WD-FUI-5329

  • FIX - UI client sends wrong path name during replicated path deletion - WD-FUI-5250

  • FIX - Consume core API to show default exclusions - WD-FUI-5246

  • FIX - Root replication directory is marked consistent when only subdir is checked - WD-FUS-3145

  • FIX - Repair fails to repair the files to Swift zone - WD-FUS-3642

  • FIX - We should throw an error if a snapdiff does not exist - WD-FUS-3645

  • Improve speed for listing written keys - WD-FUS-3677

  • Add storage type as a Zone property - WD-FUS-3708

  • FIX - mv 10000 files from non-replicated directory to replicated directory fails - WD-FUS-3809

  • FIX - CC hanging when there is 10,000+ files on Cleversafe - WD-FUS-3816

  • FIX - Talkback not able to customize setting for TALKBACKNAME, FUSION_MARKER variables - WD-FUS-3866

  • HDP fusion-client RPM should remove symlinks for Oozie server when the RPM is uninstalled - WD-FUS-3872

  • Better diagnostics for IHC SSL configuration problems - WD-FUS-3894

  • Repair task improvements - WD-FUS-3897

  • FIX - Can’t remove replication rule if created with special characters - WD-FUS-3975

  • C118641: File size on target zone is larger than on source zone after replication finished on HDFS-S3 - WD-FUS-3979

  • Add output of hdfs dfs -count to talkback - WD-FUS-4004

  • Schedule consistency check at specific time(s) - WD-FUS-4036

  • Repair API call defaults are most aggressive - WD-FUS-4037

  • Distribute CC results across nodes - WD-FUS-4076

  • Add support for CDH 5.12 - WD-FUS-4078

  • FIX - fusion-server doesn’t like MB suffix for swift.segmentSize - WD-FUS-4092

  • Return appropriate response for the /fs/repair endpoint if given task is not a repair - WD-FUS-4105

  • FIX - RepairResource can associate the wrong RepairDetails with a task - WD-FUS-4106

  • FIX - Rename of non-repl to repl for files with 0 bytes - WD-FUS-4109

  • FIX - IHC throws UnsupportedOperationException when initializing S3Plugin - WD-FUS-4117

  • FIX - Higher than expected heap with G1 GC - WD-FUS-4122

  • FIX - Fusion parcel breaks CDH client config updates - WD-FUS-4133

  • FIX - Fusion gsn directory FileNotFoundException upon PeriodicWriterProposal - WD-FUS-4138

  • FIX - Move from non-replicated folder to replicated one produces inconsistent results - WD-FUS-4146

  • DOC - Disable and remove DES, 3DES, and RC4 ciphers - WD-FUS-4154

  • FIX - NetApp: setowner throws NPE can cause fusion to lock up - WD-FUS-4155

  • FIX - Parcel hotfix for earlier versions of cloudera - WD-FUS-4157

  • FIX - RPM upgrade is looking for htrace-core4.jar which should be htrace-core.jar - WD-FUS-4168

  • FIX - Swift consistency check doesn’t notice sub-folders - WD-FUS-4182

  • Provide IHC API for ongoing transfers - WD-FUS-4202

  • [Talkback] Store Temporary Files within TMPDIR - WD-FUS-4203

  • Make NPE nonretriable - WD-FUS-4222

  • FIX - cloudera-scm-agent port is in-use because of ssh tunneling script - WD-FUS-4225

  • Add new learners to an existing zone - WD-FUS-4250

  • Provide a mechanism to clean completed, non-contiguous GSN ranges - WD-FUS-4279

  • FIX - Talkback Doesn’t Correctly Grab Hive Configurations - WD-FUS-4296

  • UnsupportedOperationException should be non-retriable - WD-FUS-4317

  • FIX - CC check state set twice on triggering node - WD-FUS-4331

  • FIX - Solr symlinks must be careful to only reference activated fusion parcel - WD-FUS-4341

  • Expose default replication exclusions via the rest API - WD-FUS-4407

  • [fusion-utility-script] per-DSM manual bypass script - WD-FUS-4417

  • FIX - ACLs displaying as inconsistent due to reported order (on Sentry managed paths) - WD-FUS-4453

  • FIX - null Fusion authority causes us to use string, 'null', as authority - WD-FUS-4489

  • FIX - FusionUriUtils#normalize breaks with null scheme - WD-FUS-4503

  • FIX - Talkback should use -p switch to netstat - WD-FUS-4523

  • FIX - Client heap usage on large put - WD-FUS-4767

  • FIX - Non writer knowledge of rename completion - WD-FUS-4723

  • FIX - Rename file with space failed for LocalFS - WD-FUS-4706

  • FIX - EMR sending 0 length request - WD-FUS-4700

  • FIX - S3 Throttle Retries - WD-FUS-4684

  • FIX - Some *.RENAME files not removed - WD-FUS-4682

  • FIX - CDH parcel alternatives incorrectly sharing dictionary keys - WD-FUS-4656

  • Expose additional repair parameters - WD-FUS-4654

  • Support for ListStatusIterator() in FileSystem API - WD-FUS-4541

  • FIX - null Fusion authority - WD-FUS-4538

1.5. Earlier releases

Notes from WANdisco Fusion 2.10.x releases can be found here:

Table 1. 2.10.x Releases

  • October 2017 - Release 2.10.5

  • October 2017 - Release 2.10.4

  • September 2017 - Release 2.10.3.2

  • August 2017 - Release 2.10.3.1

  • May 2017 - Release 2.10.2

  • April 2017 - Release 2.10