Glossary

Technical guide and glossary for Hadoop and WANdisco Fusion terms.

Introducing WD Fusion

WD Fusion provides consistent, continuous data replication between file systems in Hadoop clusters. Client applications that use Fusion interact with a virtual file system that integrates the underlying storage across multiple clusters. When changes are made to files in one cluster, they are replicated immediately and consistently to the other clusters that WD Fusion spans.

(DIAGRAM HERE)

WD Fusion Terms

To help you understand how WD Fusion operates, this documentation uses the terms Zone, Membership and Replication Rule. Each plays a critical role in how you configure and use WD Fusion, so you should understand this terminology before installing WD Fusion.

Zones

A Zone represents the file system used in a standalone Hadoop cluster. Multiple Zones can belong to separate clusters in the same data center, or to distinct clusters operating in geographically separate data centers that span the globe. WD Fusion operates as a distributed collection of servers. Each WD Fusion server belongs to exactly one Zone, but a Zone can have multiple WD Fusion servers (for load balancing and high availability). When you install WD Fusion, you should create a Zone for each cluster's file system.

DConE Terms

Memberships

A Membership is a defined group of WD Fusion servers that replicate data between their Zones. You can have as many WD Fusion servers in a Membership as you like, and each WD Fusion server can participate in multiple Memberships. WD Fusion allows you to define multiple Memberships, and WD Fusion servers can fulfil different roles in each Membership they participate in. This allows you to control exactly how your WD Fusion environment operates normally, and how it behaves when faced with network failures, host failures and other types of issues.

Replication Rules

File system content is replicated selectively by defining Replication Rules, which specify: the directory in the file system that will be replicated, the Zones that will participate in that replication, and the Membership associated with those Zones. Without any Replication Rules defined, each Zone's file system operates independently of the others. With the combination of Zones, Memberships and Replication Rules, WD Fusion gives you complete control over how data are replicated between the file systems of your Hadoop clusters.
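
The relationship between the three terms can be sketched as a small data model. This is an illustration only, not WD Fusion's API: the class names, field names, host names and paths below are all hypothetical, and the rule for matching a path to a replicated directory (the path is the directory itself or lies beneath it) is an assumption made for the example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Zone:
    """One cluster's file system (hypothetical model, not a Fusion class)."""
    name: str


@dataclass(frozen=True)
class Server:
    """A WD Fusion server; it belongs to exactly one Zone."""
    host: str
    zone: Zone


@dataclass(frozen=True)
class Membership:
    """A named group of servers that replicate data between their Zones."""
    name: str
    servers: frozenset


@dataclass(frozen=True)
class ReplicationRule:
    """A replicated directory, the participating Zones, and their Membership."""
    directory: str
    zones: frozenset
    membership: Membership

    def covers(self, path: str) -> bool:
        # Assumption: a rule applies to the directory itself and everything below it.
        rule_dir = PurePosixPath(self.directory)
        candidate = PurePosixPath(path)
        return candidate == rule_dir or rule_dir in candidate.parents


def replicated_zones(path, rules):
    """Return the Zones that take part in replicating `path`.

    An empty set means no rule covers the path, so each Zone's copy of it
    is independent of the others.
    """
    zones = set()
    for rule in rules:
        if rule.covers(path):
            zones |= rule.zones
    return zones


# Two clusters, one server each, joined in a single Membership.
east = Zone("east")
west = Zone("west")
membership = Membership(
    "east-west",
    frozenset({Server("fusion1.east.example", east),
               Server("fusion1.west.example", west)}),
)
rules = [ReplicationRule("/repl/sales", frozenset({east, west}), membership)]

print(sorted(z.name for z in replicated_zones("/repl/sales/2015/q1.csv", rules)))
print(replicated_zones("/tmp/scratch", rules))
```

The first call reports both Zones because the file sits under the replicated directory. The second returns an empty set, which corresponds to the case where no Replication Rule applies and each Zone's file system operates independently.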

Induction

Induction is the process of forming a Membership between a number of WD Fusion nodes.