
WANDISCO FUSION®
PLUGIN FOR LIVE RANGER

1. Welcome

Welcome to the User Guide for the Fusion Plugin for Live Ranger, version 4.0.

Apache Ranger is a framework to manage data security in Hadoop deployments. It provides centralized security administration, fine-grained authorization and centralized auditing within a single cluster. Use the Fusion Plugin for Live Ranger to extend the capabilities of WANdisco Fusion to Apache Ranger across multiple Hadoop environments, and keep your security policies consistent.

1.1. Product overview

WANdisco Fusion gives you LiveData: consistent data everywhere, spanning platforms and locations, even for changing data at petabyte scale. Business critical data is guaranteed consistent, always available, and accessible from anywhere.

The Fusion Plugin for Live Ranger extends WANdisco Fusion to information managed and used by Apache Ranger. Use it to keep your security policies consistent among Hadoop deployments with WANdisco Fusion. Key features include:

  • Apache Ranger policy replication

  • Coordination of activities that modify policy definitions, including those performed via the Apache Ranger REST API, or from its administrative interface in a browser

  • Integration with WANdisco Fusion

1.2. Documentation guide

This guide contains the following:

Welcome

This chapter introduces this user guide and provides help with how to use it.

Release Notes

Details the latest software release, covering new features, fixes and known issues to be aware of.

Concepts

Explains how the Fusion Plugin for Live Ranger uses WANdisco’s LiveData platform through WANdisco Fusion.

Installation

Covers the steps required to install and set up Fusion Plugin for Live Ranger into a WANdisco Fusion deployment.

Operation

Describes the steps required to run, reconfigure and troubleshoot Fusion Plugin for Live Ranger.

Reference

Additional Fusion Plugin for Live Ranger documentation.

1.2.1. Symbols in the documentation

In the guide we highlight types of information using the following call outs:

The alert symbol highlights important information.
The STOP symbol cautions you against doing something.
Tips are principles or practices that you’ll benefit from knowing or using.
The i symbol shows where you can find more information, such as in our online Knowledge base.

1.3. Contact support

See our online Knowledge base which contains updates and more information.

If you need more help, raise a case on our support website.

1.4. Give feedback

If you find an error or if you think some information needs improving, raise a case on our support website or email docs@wandisco.com.

2. Release Notes

The Fusion Plugin for Live Ranger extends WANdisco Fusion by replicating Apache Ranger. With it, WANdisco Fusion maintains a LiveData environment including Ranger content, so that applications can access, use and modify a consistent view of data everywhere, spanning platforms and locations, even at petabyte scale. WANdisco Fusion ensures the availability and accessibility of critical data everywhere.

2.1. Live Ranger 4.0 Build 333

7 August 2019

For the release notes and information on known issues, please visit the Knowledge base - WANdisco Fusion Plugin for Live Ranger 4.0.

3. Concepts

Familiarity with the following concepts will improve your use of the Fusion Plugin for Live Ranger.

WANdisco Fusion Plugin

A plugin is used by WANdisco Fusion to extend its functionality. Plugins are loaded by the WANdisco Fusion server on startup.

Apache Ranger

Apache Ranger offers a centralized security framework for fine-grained access control over Hadoop and related components (Apache Hive, HBase, Storm, Knox, Solr, Kafka and YARN). Use the Apache Ranger administration console to manage policies for accessing resources (file, folder, database, table, column, etc.) for a particular set of users and/or groups, and enforce those policies within Hadoop.

Ranger has a centralized web application that consists of policy, audit and administration modules. Authorized users can manage security policies via a web interface or the Apache Ranger REST API. Policies are enforced in Hadoop components by Ranger Plugins.

Apache Ranger Policy Server

The Policy Server maintains the policies defined by users, and responds to requests from Ranger Plugins to retrieve policy information.

Apache Ranger Audit Server

The Audit Server can be configured to send access audit logs generated by Apache Ranger Plugins to a range of destinations.

Apache Ranger Administration Portal

The Ranger Administration Portal provides a simple interface for security administrators to create and manage policies enforced by Apache Ranger.

Apache Ranger Plugin

Ranger Plugins are specific to the Hadoop component in which they enforce Ranger policies retrieved from the Ranger Policy Server. They are lightweight Java implementations that are embedded in the processes of other cluster components to intercept operations that would otherwise execute without security policy enforcement, and apply those policies to prevent unauthorized operations. Plugins also deliver information to the Ranger Audit Server.

3.1. Product concepts

The Fusion Plugin for Live Ranger implements LiveData for Apache Ranger policies. It intercepts operations that act on policy definitions in the Apache Ranger Policy Server and ensures that they are coordinated and replicated among multiple Ranger Policy Server instances.

Figure 1. Live Ranger Architecture

It consists of two key components:

Live Ranger Proxy

The Live Ranger Proxy is a server that sits between clients and the REST API and Web interface of the Ranger Policy Server. Prior to forwarding client requests to the Ranger Policy Server, the proxy first proposes them to the WANdisco Fusion server for coordination.

Live Ranger Plugin

The Live Ranger Plugin is a runtime extension for the WANdisco Fusion server. It accepts proposals for operation coordination from the Live Ranger Proxy, and leverages the LiveData capabilities of the WANdisco Fusion server to ensure that all operations are performed with guaranteed consistent outcomes among multiple Apache Ranger deployments.

This Plugin is also responsible for the execution of operations that originate from other Ranger deployments. It presents those requests to its local Apache Ranger Policy Server as though they originated locally so they can be executed.

3.2. Supported Functionality

The Fusion Plugin for Live Ranger:

  • provides functionality to replicate Ranger policy definitions between instances of the Apache Ranger Policy Administration Server using WANdisco Fusion

  • intercepts all means by which Ranger policies can be created, modified, deleted, etc. to coordinate those operations among multiple Apache Ranger instances

  • offers functionality for an administrator to check and report on the consistency between policy definitions across multiple Ranger instances

  • supports the ability to resolve inconsistencies among policies between Ranger instances

  • provides a selection of REST API endpoints by which its operation can be managed

Of note, the following capabilities are explicitly not performed by this product:

  • Synchronization of operations performed by Ranger Plugins that are specific to Hadoop components in each cluster. There is no dependency between the Fusion Plugin for Live Ranger and Ranger Plugins deployed in each cluster. Note that this means that although Ranger policies and their administration will be replicated with guaranteed consistency among Ranger instances, each cluster’s Ranger plugins will poll those policies independently, applying them independently also.

  • Replication of the Ranger Key Management Service. The Ranger KMS is a cryptographic key management service that supports "data at rest" encryption in HDFS.

  • Selective replication of Ranger policies. Ranger policy replication is enabled as a whole between clusters when using the Fusion Plugin for Live Ranger. Either all Ranger policies and repositories are replicated, or none are.

4. Installation

4.1. Pre-requisites

4.1.1. System Requirements

Along with the standard product requirements for WANdisco Fusion, you need to:

  • Ensure that your clusters use an Ambari-based deployment. See the release notes for your specific version for more information on which Hortonworks versions are supported.[1]

  • Configure the Hadoop environment for either Simple or Kerberos security.

  • Use Apache Ranger for policy enforcement.

Known Issue
The Ranger Admin must be active before installing the Fusion Plugin for Live Ranger proxy.

4.1.2. Replication Requirements

  • Ranger services must match
    Both zones must have the same set of Ranger services. See the Ranger service section for more information.

4.1.3. Security Requirements

Manual Kerberos only
This section applies only if you are using manual Kerberos mode.

For each cluster, ensure that the following security-related requirements are in place before you start installation.

Here wd-ranger-user is used as an example username. Please replace this with a name appropriate for your setup.
Add the system user wd-ranger-user on all nodes where the Fusion Server is installed, or where you are going to install the Live Ranger Proxy Server:
useradd wd-ranger-user

On the node where the KDC server is running:

Create the principal
kadmin.local# addprinc -randkey wd-ranger-user/@
Create the keytab
kadmin.local# xst -norandkey -kt wd-ranger-proxy.keytab wd-ranger-user/@
Copy the keytab into the Ranger Proxy Server node
scp wd-ranger-proxy.keytab root@:/etc/security/keytabs
Change ownership of the file on that host
chown wd-ranger-user:wd-ranger-user /etc/security/keytabs/wd-ranger-proxy.keytab
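
Before continuing, it can help to confirm that the keytab landed where expected and has the right owner. The following is a minimal sketch, not part of the product: `check_keytab` is a hypothetical helper name, and it assumes GNU `stat` (as found on CentOS-style systems).

```shell
# Hypothetical helper (not shipped with the product): confirm that a keytab
# exists, is readable, and is owned by the expected system user.
# ASSUMPTION: GNU stat is available (stat -c '%U' prints the owner name).
check_keytab() {
    keytab=$1
    owner=$2
    if [ ! -r "$keytab" ]; then
        echo "missing or unreadable: $keytab" >&2
        return 1
    fi
    actual=$(stat -c '%U' "$keytab") || return 1
    if [ "$actual" != "$owner" ]; then
        echo "owner is $actual, expected $owner" >&2
        return 1
    fi
    echo "ok: $keytab is owned by $owner"
}

# Example, run on the Ranger Proxy Server node (not executed here):
#   check_keytab /etc/security/keytabs/wd-ranger-proxy.keytab wd-ranger-user
#   klist -kt /etc/security/keytabs/wd-ranger-proxy.keytab   # list the principals stored in the keytab
```

Substitute your own username if you replaced wd-ranger-user.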

Add the wd-ranger-user and hdfs user to the underlying Ranger instance with admin roles.

Create appropriate users in Ranger

The Fusion user must have admin permissions.

  1. Login to the Ranger Admin UI

  2. Navigate to Settings >> Users/Groups tab

  3. Create wd-ranger-user user with admin role

  4. Create hdfs user with admin role

4.2. Installation

There are two installation methods for Fusion Plugin for Live Ranger. The method requiring the least user input is outlined in the next section - the Ambari Installation method. There is also an alternative method for installing the stacks which is described in the Manually extract the stack section.

If you are using Ranger HA, you will need to install Fusion Plugin for Live Ranger on all nodes running Ranger.

Before starting the installation, ensure your Fusion servers are inducted between zones.

4.2.1. Ambari Installation

CLI installation
  1. Obtain the Live Ranger Plugin installer from customer.wandisco.com and open a terminal session on your WANdisco Fusion node.

  2. Ensure the downloaded file is executable, e.g.

    chmod +x live-ranger-installer.sh
  3. Run the installer using an account with appropriate permissions:

    ./live-ranger-installer.sh

    The installer will now start.

    Verifying archive integrity... All good.
    Uncompressing WANdisco Live Ranger..................
    
        ::   ::  ::     #     #   ##    ####  ######   #   #####   #####   #####
       :::: :::: :::    #     #  #  #  ##  ## #     #  #  #     # #     # #     #
      ::::::::::: :::   #  #  # #    # #    # #     #  #  #       #       #     #
     ::::::::::::: :::  # # # # #    # #    # #     #  #   #####  #       #     #
      ::::::::::: :::   # # # # #    # #    # #     #  #        # #       #     #
       :::: :::: :::    ##   ##  #  ## #    # #     #  #  #     # #     # #     #
        ::   ::  ::     #     #   ## # #    # ######   #   #####   #####   #####
    
    You are about to install WANdisco Live Ranger version 4.0.0.0
    
    Full installation of this plugin currently requires that the appropriate
    'Management Pack' stack is installed through your Ambari server node.
    
    This installer package includes all the currently supported stacks for this.
    
    If you have not already done so, you should:
    
      1) copy this installer to the ambari-server node
      2) run the installer with the 'install-stack' sub-command.
      3) use the Ambari UI to Add the service
    
    For further guidance and clarifications, visit https://docs.wandisco.com/
    
    Do you want to continue with the installation? (Y/n) Y

    The installer will perform an integrity check and confirm the product version that will be installed.

    A prompt message is displayed warning of the prerequisite 'Management Pack' stack that should be installed through your Ambari server node prior to Live Ranger plugin installation.

    Enter "Y" to continue the installation.

    The instructions in this section document the install-stack option. Note this method stops and starts the Ambari server automatically. For the alternative method see Manually extract the stack.

  4. Copy the installer to the Ambari server node e.g.

    scp live-ranger-installer.sh :/tmp
  5. On your Ambari server node run:

    /tmp/live-ranger-installer.sh install-stack
  6. Now go to your Ambari UI and follow the steps below.

Installation via the Ambari UI
  1. Click on Actions > Add Service.

    Figure 2. Add Service
  2. Select Live Ranger Proxy and click Next.

    Figure 3. Select Fusion Live Ranger
  3. In Assign Masters, select the node where you want to deploy the Live Ranger Proxy Server. Click Next.

    If existing proxy server RPM files are found by Ambari the installation will fail. If this happens, remove the files and restart the installation. IMPORTANT: Once installation is complete, make sure the files are left in place.
    Figure 4. Select Masters
  4. In Assign Slaves and Clients, select the client checkbox on the node(s) where the WANdisco Fusion server is installed. Select every Fusion Server node, and do not select any node that is not a Fusion Server.

    Figure 5. Select Slaves
  5. Provide the necessary configuration values and then click Next.

    Figure 6. Customize Services
    Advanced proxy-plugin-site

    The Admin user’s credentials for Ranger Admin.

    Kerberos enabled

    Check this box if Kerberos is enabled on your cluster.

    Cluster name

    Automatically determined by Ambari.

    Ranger Admin URL

    Automatically determined by Ambari.

    Advanced proxy-server-site

    The Admin user’s credentials for Ranger Admin. This must match the credentials given for Advanced proxy-plugin-site above.

    Kerberos enabled

    Check this box if Kerberos is enabled on your cluster.

    Cluster name

    Automatically determined by Ambari.

    Kerberos Read Only Users

    Only required if your cluster is kerberized. See User levels for more information.

    Proxy keytab

    Automatically determined by Ambari

    Spnego keytab

    Automatically determined by Ambari

    Zone name

    The zone name in your Fusion Server

  6. Provide configuration details for the plugin.

    Figure 7. Configure Plugin
  7. Provide configuration details for the server.

    Figure 8. Configure Server
    Figure 9. Configure Server
  8. Review the configuration and click Deploy.

    Figure 10. Review Configuration and Deploy
  9. Confirm successful deployment and click Next.

    Figure 11. Install, Start, Test

    The installation of Fusion Plugin for Live Ranger is now complete.

  10. Click on Quick Links > Live Ranger UI.

    Figure 12. Quick link - Ranger UI
  11. Log in to the Ranger UI. Changes made using this proxy server will now be replicated between zones.

    Figure 13. Ranger UI - login
  12. After installation is complete, restart the Fusion server.

  13. If using Ranger HA, now repeat the installation on all nodes running Ranger.

  14. Now go to the validation section.

Manually extract the stack

This is an alternative method to the install-stack option described above. Once you have completed the steps here you will then need to complete the Installation via the Ambari UI section.

  1. Perform steps 1-3 above.

  2. Run:

    ./live-ranger-installer.sh extract-stack
  3. Transfer the stack file to the /tmp folder on the ambari-server node e.g.

    scp fusion-ranger-proxy--centos.stack.tar.gz :/tmp
  4. Connect to the Ambari server node and stop the Ambari server.

    ambari-server stop
  5. Install the Live Ranger stack for Ambari:

    ambari-server install-mpack --mpack=/tmp/fusion-ranger-proxy-hdp-.stack.tar.gz
  6. Start the Ambari-Server.

    ambari-server start
  7. Now follow the steps in the Installation via the Ambari UI section to complete your installation.
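
The stop, install and start steps above must run in that order, and a failed step should prevent the next one from running. A minimal sketch of a wrapper that enforces this follows. The function name `install_live_ranger_mpack` and the `DRY_RUN` variable are illustrative assumptions, not part of the installer; the `ambari-server` commands are the ones listed in the steps.

```shell
# Hypothetical wrapper: run the three Ambari steps in order and stop at the
# first failure. Set DRY_RUN=1 to print the commands instead of running them.
install_live_ranger_mpack() {
    mpack=$1    # path to the stack archive copied to the Ambari server node
    if [ -z "$mpack" ]; then
        echo "usage: install_live_ranger_mpack <stack.tar.gz>" >&2
        return 2
    fi
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    # Each step runs only if the previous one succeeded.
    $run ambari-server stop &&
    $run ambari-server install-mpack --mpack="$mpack" &&
    $run ambari-server start
}

# Example, run on the Ambari server node (not executed here); pass the path
# of the stack file you transferred in step 3:
#   DRY_RUN=1 install_live_ranger_mpack /tmp/<stack file>
```

Running with DRY_RUN=1 first lets you review the exact commands before the Ambari server is stopped.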

4.3. Validation

Once your installation is complete, you should verify that replication is working as expected before entering into a production phase. For example, create a User or Group through the Ranger UI and confirm that it is replicated.

  1. Go to your Ranger UI - http://<Host Name>:<Live Ranger Port> e.g. http://localhost:8072

  2. In Settings > Users/Groups, add a new User.

    Figure 14. Ranger UI - Add user/group
  3. Once the User is created, go to another zone in Ranger and confirm that the new User has been replicated.
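
The same check can be scripted rather than done through the UI. The sketch below is an illustration only: the function names and the `RANGER_AUTH` variable are hypothetical, and the `/service/xusers/users/userName/<name>` path is an assumption about the Ranger REST API that you should verify against your Ranger version. On a Kerberized cluster the `curl` authentication options would differ.

```shell
# Build the Ranger REST URL used to look up a user by name.
# ASSUMPTION: the /service/xusers/users/userName/<name> path; check it
# against the REST API documentation for your Ranger version.
ranger_user_url() {
    base=$1    # e.g. http://<Host Name>:<Live Ranger Port>
    user=$2
    printf '%s/service/xusers/users/userName/%s\n' "${base%/}" "$user"
}

# Return 0 if the given zone answers HTTP 200 for the user, non-zero otherwise.
# RANGER_AUTH is a hypothetical variable holding "user:password" for a
# Ranger admin account (simple security only).
user_exists_in_zone() {
    code=$(curl -s -o /dev/null -w '%{http_code}' -u "$RANGER_AUTH" \
        "$(ranger_user_url "$1" "$2")")
    [ "$code" = "200" ]
}

# Example (not executed here): create the user in one zone, then check another.
#   RANGER_AUTH='admin:<password>'
#   user_exists_in_zone http://<other zone host>:8072 <new user> && echo replicated
```

Anything other than HTTP 200 is treated as "not found", so an authentication failure also reports the user as missing.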

4.4. Upgrade

If you wish to perform a Fusion Plugin for Live Ranger upgrade, please contact WANdisco support.

4.5. Uninstallation

If you wish to uninstall Live Ranger, please contact WANdisco support. The uninstall procedure currently requires manual editing and should not be done without calling WANdisco’s support team for assistance. The process involves both service and package removal. If you have an HA setup, you will also need to edit your load balancer configuration.

5. Operation

Character encoding support

To use the standard Chinese coded character set GB18030, additional configuration is required in the underlying Ranger DBMS:

  1. Replace your /etc/my.cnf with my.cnf.

  2. Convert the Ranger assets within MySQL from UTF8 to UTF8MB4. See ranger_mysql_gb18030.sql.

5.1. Replication

5.1.1. View replication rule

Once Fusion Plugin for Live Ranger is installed, the All Ranger Rules replication rule is visible on the Replication tab of the WANdisco Fusion UI.

Figure 15. Replication rule list

Click on All Ranger Rules to see more details.

Figure 16. View All Ranger Rules
Type

The type of replication rule, in this case the type is Ranger.

Ranger Policies

All Ranger policies are included in this single rule so that HDP clusters replicate Apache Ranger policy definitions. The rule controls how the data is replicated between zones and does not have any impact on the policies themselves which you continue to manage through the Ranger UI.

Zones

Lists the zones between which this rule’s associated path is replicated. Note that the "local" label identifies the zone to which the currently viewed node belongs.

Go back to Rule list - click this button to return to the Replication Rules screen.

5.1.2. Consistency check

When to perform a consistency check?
  • After adding new data into a replication group

  • Periodically, as part of your platform monitoring

  • As part of a system repair/troubleshooting

To perform a consistency check follow the steps below.

  1. On the Replication tab, click on All Ranger Rules.

    Figure 17. Select All Ranger Rules
  2. On the Status tab you can see the results of the previous consistency check. Click Check now to trigger a new check.

    Figure 18. Trigger consistency check
  3. The results of the consistency check will now be displayed. Yellow bars indicate inconsistency. A more detailed report can also be downloaded.

    Figure 19. Consistency result

    If the result of the consistency check is inconsistent, see the make consistent section for what to do next.

Consistency check results

The consistency check lists the results of 6 items.

Groups

A group of users which can be used to set up policy permission access. Groups can be external or internal.

Permission Models

These define the access to specific Ranger UI tabs for users or groups. View your Permission Models at Settings > Permissions.

Policies

This gives users or groups permissions to access particular resources for a particular service.

Services Definitions

The mapping of a service to a Ranger configuration, for example HDFS, HBase, Hive, YARN and Ranger KMS.

Services

The component supported in Ranger, for example HDFS, HBase, Hive, YARN and Ranger KMS.

Users

These can be synced from an external source or created internally in the Ranger UI.

5.1.3. Make consistent

If you have performed a consistency check and the result is inconsistent, follow the steps below to make the zones become consistent.

  1. Select the zone which you want to be the Source of Truth by clicking on the relevant graph.

    Figure 20. Make consistent

    The differences between the zones will now be highlighted. This example shows that one group will be added to zone02 when making the zones consistent. A proportion of the bar will turn red if the source of truth contains fewer groups, for example, than the other zone. Yellow represents inconsistency, including if the overall number is the same but the content is different.

  2. Now click Make Consistent.

  3. The zones are now consistent. The state will remain as Unknown until a new consistency check is run.

Make consistent history

Make consistent history can be viewed in the lower section of the Status tab.

Figure 21. Make consistent history

5.1.4. Ranger service

Ranger services must match
For consistency check and make consistent to function properly, both zones must have the same set of Ranger services. This includes, for example, Ranger KMS.
Replication
  1. Operations on Ranger Services are not replicated. They must be created on a per cluster basis.

  2. Policy replication will only occur if the service name is the same between clusters. For example, with c1_hdfs in cluster 1 and c2_hdfs in cluster 2, replication will occur between the clusters. However, replication will not occur between c1_hdfs and c2_hdfs1. Note that replication will also occur if the cluster name is omitted, provided the service names match, i.e. it is just called hdfs in all clusters.

  3. Everything that is not a Ranger Service is replicated (users, groups etc).
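
The naming rule in item 2 can be illustrated with a small sketch. This is not the plugin's implementation. It assumes that the cluster prefix is everything up to and including the first underscore, which is an interpretation of the examples above, and the function names are hypothetical.

```shell
# Strip an optional "<cluster>_" prefix, leaving the component part of a
# Ranger service name.
# ASSUMPTION: the prefix ends at the first underscore (c1_hdfs -> hdfs).
service_component() {
    case $1 in
        *_*) printf '%s\n' "${1#*_}" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}

# Return 0 when two service names would pair up for replication under the
# rule sketched above, non-zero otherwise.
services_match() {
    [ "$(service_component "$1")" = "$(service_component "$2")" ]
}

# Examples taken from the text:
#   services_match c1_hdfs c2_hdfs     # match
#   services_match c1_hdfs c2_hdfs1    # no match
#   services_match hdfs hdfs           # match (cluster name omitted)
```

The guide does not say what happens when one cluster uses a prefixed name and the other does not. This sketch would treat them as matching, so confirm that case before relying on it.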

Consistency check
  1. The service consistency check does not consider configuration, and matches services by name relative to the cluster. For example, if cluster1 is named c1, its service is created as c1_<component1>, and it is treated the same as the service c2_<component1> in cluster2, which is named c2.

  2. Entities can also be used in consistency checks (API only, not via the UI), e.g.

    curl -v -s -X POST "http://<fusion-server-host-name>:8082/plugin/rangerproxy/cc?path=/rangerproxy"

    The above call will give the consistency check for 6 entities - servicedef, service, policy, user, group and permission.

    You can also limit the check to specific entities, giving more than one as comma-separated values, e.g.

    curl -v -s -X POST "http://<fusion-server-host-name>:8082/plugin/rangerproxy/cc?path=/rangerproxy&<entityName>"

    where <entityName> takes the format entityName=servicedef,policy and names one or more of the 6 entities listed above.
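
To avoid typing mistakes in the entity list, the request URL can be assembled by a small helper. The sketch below is illustrative: `cc_url` is a hypothetical function name, and the `entityName` query parameter is taken from the format described above, so confirm it against your installation before relying on it.

```shell
# Print the consistency-check URL for a Fusion server, optionally limited to
# a comma-separated list of entities.
cc_url() {
    host=$1
    entities=$2    # optional, e.g. servicedef,policy
    url="http://$host:8082/plugin/rangerproxy/cc?path=/rangerproxy"
    if [ -n "$entities" ]; then
        # Reject any name outside the six entities the guide lists.
        for e in $(printf '%s' "$entities" | tr ',' ' '); do
            case $e in
                servicedef|service|policy|user|group|permission) ;;
                *) echo "unknown entity: $e" >&2; return 1 ;;
            esac
        done
        url="$url&entityName=$entities"
    fi
    printf '%s\n' "$url"
}

# Example (not executed here):
#   curl -v -s -X POST "$(cc_url <fusion-server-host-name> servicedef,policy)"
```

Quoting the command substitution keeps the shell from interpreting the ampersand in the URL.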

Make consistent
  1. Making services consistent needs to be done manually.

  2. As with replication and consistency checks, making policies consistent will only work if the service exists on both clusters (see above).

Once configured, restart the WANdisco Fusion server to use the configuration applied:

service fusion-server restart

Then start each Ranger Proxy server:

service rangerproxy-server start

5.2. Administration

5.2.1. Login and login credentials

Log in information is replicated and persistent across all nodes.

5.2.2. User levels

There are 3 levels of user when using Kerberos.

Admin

Admin users can perform all operations. This is set in the Ranger UI.

Kerberos read only users

During installation you provide a list of read only users. These users can perform GET operations only.

Not defined

Users which are neither Admin nor read only can perform no operations.
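
The three levels amount to a simple decision rule, sketched below for illustration. The level labels `admin`, `readonly` and `none` and the function name are chosen for this sketch only; the proxy's actual enforcement logic is not shown in this guide.

```shell
# Return 0 if a user at the given level may issue the given HTTP method,
# following the three user levels described above.
is_allowed() {
    level=$1     # admin | readonly | anything else (not defined)
    method=$2    # HTTP method, e.g. GET, POST, PUT, DELETE
    case $level in
        admin)    return 0 ;;                # all operations
        readonly) [ "$method" = "GET" ] ;;   # GET operations only
        *)        return 1 ;;                # no operations
    esac
}

# Examples:
#   is_allowed readonly GET  && echo permitted
#   is_allowed readonly POST || echo refused
```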

5.2.3. Configuration

Configuration of the Fusion Plugin for Live Ranger proxy and server is performed with changes to the configuration files generated at installation time:

  • /etc/wandisco/fusion/plugins/live-ranger/proxy-plugin-site.xml

  • /etc/wandisco/live-ranger-proxy/proxy-server-site.xml

The Ranger Administration UI can be enabled for access via SSL. For full details of how to configure the Fusion Plugin for Live Ranger for interoperability with SSL-enabled Ranger installations, please contact WANdisco support.

6. Reference Guide

6.1. Setup Live Ranger HA

If your clusters are kerberized, errors will occur unless all Fusion Plugin for Live Ranger proxy hosts are added to spnego keytabs in order to be accessible by hdfs namenodes. This is applicable for all versions of Fusion Plugin for Live Ranger.
  1. Follow the steps detailed by Hortonworks on Configuring Ranger Admin HA.

  2. Ranger Admin will now be running in HA mode. You should now see the Ranger login page at e.g. http://rpxy01-vm2.bdfrem.wandisco.com:88. In this example, the load balancer is installed on vm2 and Ranger Admin on vm4, vm5.

  3. Install Live Ranger on more than one node. While configuring Live Ranger Proxy and Plugin, enter the policy manager URLs of the nodes as a comma-separated list, e.g.:

    http://rpxy01-vm4.bdfrem.wandisco.com:6080,http://rpxy01-vm5.bdfrem.wandisco.com:6080
  4. Change the load balancer configuration to point to Live Ranger Proxy rather than Ranger Admin. This needs to be done on the node on which the load balancer is installed, vm2 in this example.

    cd /usr/local/apache2/conf
    vi ranger-cluster.conf

    Update the following

    <Proxy balancer://rangercluster>
        BalancerMember http://rpxy01-vm4.bdfrem.wandisco.com:6080 loadfactor=1 route=1
        BalancerMember http://rpxy01-vm5.bdfrem.wandisco.com:6080 loadfactor=1 route=2

    To

    <Proxy balancer://rangercluster>
        BalancerMember http://rpxy01-vm0.bdfrem.wandisco.com:8072 loadfactor=1 route=1 sta retry=30
        BalancerMember http://rpxy01-vm1.bdfrem.wandisco.com:8072 loadfactor=1 status=+H retry=0 route=2
  5. Run the following commands to restart the httpd server:

    cd /usr/local/apache2/bin
    ./apachectl restart
  6. Live Ranger will now be running in HA mode. You should see the Ranger login page at e.g. http://rpxy01-vm2.bdfrem.wandisco.com:88.
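
After editing the load balancer configuration in step 4, a quick scan can confirm that no BalancerMember entry still points at Ranger Admin. The following is a minimal sketch: `check_balancer_members` is a hypothetical helper, and the default port 8072 is simply the Live Ranger Proxy port used in this guide's examples.

```shell
# Print every BalancerMember URL in an Apache balancer config that does NOT
# use the expected Live Ranger Proxy port. Returns 1 if any are found.
check_balancer_members() {
    conf=$1
    port=${2:-8072}    # ASSUMPTION: 8072, as in the examples in this guide
    bad=$(awk -v port="$port" '
        # Field 1 is the directive name, field 2 is the member URL.
        $1 == "BalancerMember" && $2 !~ (":" port "(/|$)") { print $2 }
    ' "$conf")
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
}

# Example, run on the load balancer node (not executed here):
#   check_balancer_members /usr/local/apache2/conf/ranger-cluster.conf
```

An empty result with exit status 0 means every member uses the proxy port. Any URLs printed are entries that still need updating.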


1. While operation is supported with Azure HDInsight 3.6, there is no automated installation process for it because its version of Ambari prevents the deployment of additional stacks.