Steps to install an HPE Ezmeral Data Fabric (formerly MapR) 7.x cluster on Linux

Contents

1. Pre-Installation Requirements

2. Download and Configure HPE Ezmeral Repositories

3. Install Core Data Fabric Packages

4. Configure ZooKeeper and CLDB

5. Cluster Initialization

6. Verify Cluster Status

7. Additional Configuration (Optional)

8. Test the Cluster

9. Set Up Monitoring and Logging

Setting up an HPE Ezmeral Data Fabric (formerly MapR) 7.x cluster on Linux involves several steps, including environment preparation, software installation, and cluster configuration. Here’s a detailed guide to install and configure a basic Ezmeral Data Fabric 7.x cluster on Linux:


1. Pre-Installation Requirements

  • Operating System: Ensure your Linux distribution is compatible. HPE Ezmeral 7.x supports various versions of RHEL, CentOS, and Ubuntu. Check the official compatibility matrix for version specifics.
  • Hardware Requirements: Verify that your hardware meets the minimum requirements:
    • CPU: At least 4 cores per node (adjust based on workload).
    • Memory: Minimum of 8 GB RAM (16 GB recommended).
    • Storage: SSD or high-performance disks for data storage; adequate storage space for data and logs.
  • Network: Ensure all cluster nodes can communicate over the network. Set up DNS or /etc/hosts entries so nodes can resolve each other by hostname.
  • Permissions: You will need root or sudo privileges on each node.
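
The hardware minimums above can be checked on each node before installing anything. The following is a minimal sketch, not an official HPE tool: the function names `meets_minimum` and `report` are hypothetical, and the thresholds are simply the CPU and memory minimums from the list above.

```shell
#!/usr/bin/env bash
# Pre-flight sketch: compare this node against the minimums listed above.
MIN_CORES=4      # "at least 4 cores per node"
MIN_MEM_GB=8     # "minimum of 8 GB RAM"

# meets_minimum ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED: prints a single PASS or FAIL line.
report() {
  if meets_minimum "$2" "$3"; then
    echo "PASS $1: $2 (minimum $3)"
  else
    echo "FAIL $1: $2 (minimum $3)"
  fi
}

# Probe the local machine only when the usual Linux sources exist.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  cores=$(nproc)
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  # Round to the nearest GB, since MemTotal is slightly below installed RAM.
  mem_gb=$(( (mem_kb + 524288) / 1048576 ))
  report "CPU cores" "$cores" "$MIN_CORES"
  report "Memory (GB)" "$mem_gb" "$MIN_MEM_GB"
fi
```

Running the script on every node before installation gives a quick list of machines that fall short; it does not check storage or network reachability.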

2. Download and Configure HPE Ezmeral Repositories

  • Add Repository and GPG Key: Set up the HPE Ezmeral Data Fabric repository on each node by adding the appropriate repository file and importing the GPG key.
    • For RHEL/CentOS:

sudo tee /etc/yum.repos.d/ezmeral-data-fabric.repo <<EOF
[maprtech]
name=MapR Technologies
baseurl=http://package.mapr.com/releases/v7.0.0/redhat/
enabled=1
gpgcheck=1
gpgkey=http://package.mapr.com/releases/pub/maprgpg.key
EOF

sudo rpm --import http://package.mapr.com/releases/pub/maprgpg.key

  • Update Package Manager (CentOS/RHEL):

sudo yum update


3. Install Core Data Fabric Packages

  • Install Core Packages:
    • Install the essential packages: core components, CLDB, FileServer, ZooKeeper, and the webserver.

# For CentOS/RHEL

sudo yum install mapr-core mapr-cldb mapr-fileserver mapr-zookeeper mapr-webserver

  • Install Additional Services:
    • Based on your needs, install additional services such as MapR NFS or the YARN ResourceManager and NodeManager.

sudo yum install mapr-nfs mapr-resourcemanager mapr-nodemanager
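
Because different nodes receive different packages, it can help to keep the mapping from role to package list in one place. This is a sketch under assumptions: the role names `control` and `data` and the function `packages_for_role` are hypothetical, and the split of packages between roles is only one possible layout. The package names themselves come from the commands above.

```shell
#!/usr/bin/env bash
# packages_for_role ROLE: prints the package list for a hypothetical role name.
# Package names come from the install commands above; the role names and the
# way packages are split between them are illustrative only.
packages_for_role() {
  case "$1" in
    control) echo "mapr-core mapr-cldb mapr-fileserver mapr-zookeeper mapr-webserver" ;;
    data)    echo "mapr-core mapr-fileserver mapr-nodemanager" ;;
    *)       echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the install command so it can be reviewed first.
for role in control data; do
  echo "sudo yum install -y $(packages_for_role "$role")"
done
```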


4. Configure ZooKeeper and CLDB

  • ZooKeeper Configuration:
    • Identify nodes to act as ZooKeeper servers (recommended at least 3 for high availability).
    • Add each ZooKeeper node to /opt/mapr/zookeeper/zookeeper-3.x.x/conf/zoo.cfg:

server.1=<zk1_hostname>:2888:3888

server.2=<zk2_hostname>:2888:3888

server.3=<zk3_hostname>:2888:3888

  • Start ZooKeeper on each ZooKeeper node:

sudo systemctl start mapr-zookeeper

  • CLDB Configuration:
    • Specify the nodes that will run the CLDB service.
    • Edit /opt/mapr/conf/cldb.conf and set the ZooKeeper servers that the CLDB connects to (port 5181):

cldb.zookeeper.servers=<zk1_hostname>:5181,<zk2_hostname>:5181,<zk3_hostname>:5181
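
Typing the ZooKeeper entries by hand on several nodes invites typos, so the lines can be generated from a single host list. The sketch below assumes hypothetical hostnames (zk1.example.com and so on) and two hypothetical helper functions; the ports 2888, 3888, and 5181 are the ones shown above.

```shell
#!/usr/bin/env bash
# zk_server_lines HOST...: prints one zoo.cfg "server.N" line per host,
# using the ports 2888 and 3888 shown above.
zk_server_lines() {
  n=1
  for host in "$@"; do
    echo "server.$n=$host:2888:3888"
    n=$((n + 1))
  done
}

# zk_connect_string HOST...: prints the comma-separated host:5181 list
# in the form used for cldb.zookeeper.servers.
zk_connect_string() {
  out=""
  for host in "$@"; do
    out="${out:+$out,}$host:5181"
  done
  echo "$out"
}

# Hypothetical hostnames for illustration.
zk_server_lines zk1.example.com zk2.example.com zk3.example.com
echo "cldb.zookeeper.servers=$(zk_connect_string zk1.example.com zk2.example.com zk3.example.com)"
```

The output is meant to be reviewed and pasted into the files named above, not written to them automatically.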


5. Cluster Initialization

  • Set Up the MapR License:
    • Copy the HPE Ezmeral Data Fabric license file to /opt/mapr/conf/mapr.license on the CLDB node.
  • Run Cluster Installer:
    • Use the configure.sh script to initialize the cluster. Run this script on each node:

sudo /opt/mapr/server/configure.sh -C <cldb1_ip>:7222,<cldb2_ip>:7222 -Z <zk1_hostname>,<zk2_hostname>,<zk3_hostname>

    • The -C flag specifies the CLDB nodes, and -Z specifies the ZooKeeper nodes.
  • Start Warden Services:
    • On each node, start the mapr-warden service to initiate the core services:

sudo systemctl start mapr-warden
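
Since the same configure.sh invocation has to be run on every node, one option is to build the command line once from the node lists and print it for review. This is a sketch only: the IP addresses, hostnames, and the helper `join_with_suffix` are hypothetical, while the -C and -Z flags and port 7222 are taken from the command above.

```shell
#!/usr/bin/env bash
# join_with_suffix SUFFIX ITEM...: joins items with commas, appending SUFFIX to each.
join_with_suffix() {
  suffix=$1; shift
  out=""
  for item in "$@"; do
    out="${out:+$out,}$item$suffix"
  done
  echo "$out"
}

# Hypothetical node lists; replace with your own CLDB IPs and ZooKeeper hostnames.
CLDB_NODES="10.0.0.11 10.0.0.12"
ZK_NODES="zk1.example.com zk2.example.com zk3.example.com"

# Word splitting of the unquoted lists is intentional here.
cldb_arg=$(join_with_suffix ":7222" $CLDB_NODES)
zk_arg=$(join_with_suffix "" $ZK_NODES)

# Print the command for review instead of executing it.
echo "sudo /opt/mapr/server/configure.sh -C $cldb_arg -Z $zk_arg"
```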


6. Verify Cluster Status

  • MapR Control System (MCS):
    • Access the MCS web UI to monitor the cluster. Open https://<cldb_node_ip>:8443 in a browser.
    • Log in with the cluster administrator account (commonly the mapr user) and verify the health and status of the cluster components.
  • CLI Verification:
    • Run the following command on the CLDB node to check cluster status:

maprcli node list -columns hostname,ip

  • Check the status of services on a given node using:

maprcli service list -node <hostname>
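
Services can take a while to come up after Warden starts, so a status command may fail on the first try. A small retry wrapper is one way to handle that. The sketch below is a generic helper, not part of the Data Fabric tooling; the name `retry` and the attempt and delay values in the example are assumptions.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY_SECONDS COMMAND...: re-runs COMMAND until it succeeds
# or ATTEMPTS is exhausted; returns 0 on success and 1 otherwise.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    # Sleep only when another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (not executed here): wait up to about five minutes for the CLI to respond.
# retry 30 10 maprcli node list -columns hostname,ip
```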


7. Additional Configuration (Optional)

  • NFS Gateway Setup:
    • Install and configure the MapR NFS gateway to expose cluster data as NFS shares.

sudo yum install mapr-nfs

sudo systemctl start mapr-nfs

  • High Availability (HA) Setup:
    • For high availability, consider adding redundant nodes for critical services (CLDB, ZooKeeper) and configuring failover settings.
  • Security Configuration:
    • Set up user roles and permissions using the maprcli command and configure Kerberos or TLS for secure authentication if needed.

8. Test the Cluster

  • Data Operations: Use the following commands to test basic operations:

# Create a new directory in the data fabric

hadoop fs -mkdir /test_directory

# Copy a file into the data fabric

hadoop fs -copyFromLocal localfile.txt /test_directory

# List files in the directory

hadoop fs -ls /test_directory

  • Service Health Check: Use the MCS or maprcli commands to ensure all services are running as expected.
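
The data operations above can be turned into a round-trip check that uploads a file, downloads it again, and compares the two copies. This is a sketch under assumptions: the `run` and `smoke_test` helpers and the `.roundtrip` file suffix are hypothetical, and the script defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# smoke_test DIR FILE: upload FILE, download it again, and compare the copies.
smoke_test() {
  dir=$1; file=$2
  copy="$file.roundtrip"
  run hadoop fs -mkdir -p "$dir" &&
  run hadoop fs -copyFromLocal "$file" "$dir/" &&
  run hadoop fs -copyToLocal "$dir/$(basename "$file")" "$copy" &&
  run cmp "$file" "$copy"
}

smoke_test /test_directory localfile.txt
```

Setting DRY_RUN=0 on a node with the hadoop client installed runs the commands for real; cmp exits non-zero if the downloaded copy differs from the original.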

9. Set Up Monitoring and Logging

  • MapR Monitoring:
    • Set up logging and monitoring for long-term maintenance. Configure mapr-metrics or integrate with external monitoring tools (e.g., Prometheus).
  • Backup and Recovery:
    • Enable volume snapshots and set up periodic backups for critical data.
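
For periodic snapshots, a dated naming scheme keeps the snapshot list readable. The sketch below only prints a snapshot command for review. The volume name `projects`, the `_snap_` naming pattern, and the `snapshot_name` helper are assumptions; check the maprcli volume snapshot reference for your release before scheduling it.

```shell
#!/usr/bin/env bash
# snapshot_name VOLUME [DATE]: builds a dated snapshot name; DATE defaults to today.
snapshot_name() {
  echo "${1}_snap_${2:-$(date +%Y%m%d)}"
}

# Print the command for review; the volume name is hypothetical.
volume="projects"
echo "maprcli volume snapshot create -volume $volume -snapshotname $(snapshot_name "$volume")"
```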

Following these steps will give you a functional HPE Ezmeral Data Fabric 7.x cluster on Linux, ready for production workloads. Customize configurations based on your specific needs, especially around security, high availability, and resource allocation to get optimal performance from your environment.

Disk encryption

In HPE Ezmeral Data Fabric (formerly MapR), disk encryption (not just volume-level encryption) can provide added security by encrypting the entire storage disk at a low level, ensuring that data is protected as it is written to and read from physical storage. This approach is commonly implemented using Linux-based disk encryption tools on the underlying operating system, as HPE Ezmeral does not natively provide disk encryption functionality.

Steps to Set Up Disk Encryption for HPE Ezmeral Data Fabric on Linux

To encrypt disks at the OS level, use encryption tools like dm-crypt/LUKS (Linux Unified Key Setup), which is widely supported, integrates well with Linux, and offers flexibility for encrypting storage disks used by HPE Ezmeral Data Fabric.

1. Prerequisites

  • Linux system with root access where HPE Ezmeral Data Fabric is installed.
  • Unformatted disk(s) or partitions that you plan to use for HPE Ezmeral storage.
  • Backup any important data, as disk encryption setups typically require formatting the disk.

2. Install Required Packages

Ensure cryptsetup is installed, as it provides the tools necessary for LUKS encryption.

sudo apt-get install cryptsetup   # For Debian/Ubuntu systems

sudo yum install cryptsetup       # For CentOS/RHEL systems

3. Encrypt the Disk with LUKS

  1. Set Up LUKS Encryption on the Disk:
    • Choose the target disk (e.g., /dev/sdb), and initialize it with LUKS encryption. This command will erase all data on the disk.

sudo cryptsetup luksFormat /dev/sdb

  2. Open and Map the Encrypted Disk:
    • Unlock the encrypted disk and assign it a name (e.g., encrypted_data).

sudo cryptsetup luksOpen /dev/sdb encrypted_data

  3. Format the Encrypted Disk:
    • Create a file system (such as ext4) on the encrypted disk mapping.

sudo mkfs.ext4 /dev/mapper/encrypted_data

  4. Mount the Encrypted Disk:
    • Create a mount point for the encrypted storage, and then mount it.

sudo mkdir -p /datafabric

sudo mount /dev/mapper/encrypted_data /datafabric

  5. Configure Automatic Unlocking on Reboot (Optional):
    • To automate unlocking on system boot, you can store the passphrase in a secure location or use a network-based key server, but this may affect security.
    • Alternatively, you can manually unlock the disk after each reboot using cryptsetup luksOpen.
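
Because luksFormat erases the target disk, it is worth confirming that the device is not mounted before running it. The following is a minimal sketch of such a guard; the function names `is_mounted` and `safe_to_format` are hypothetical, and the check reads the mount table only, so it does not detect other kinds of use such as LVM or swap.

```shell
#!/usr/bin/env bash
# is_mounted DEVICE [MOUNTS_FILE]: succeeds when DEVICE, or any device whose
# name starts with DEVICE (such as a partition), appears in the mount table.
# The prefix match is deliberately cautious, so /dev/sdb also matches /dev/sdb1.
is_mounted() {
  awk -v dev="$1" 'index($1, dev) == 1 { found = 1 } END { exit !found }' \
    "${2:-/proc/mounts}"
}

# safe_to_format DEVICE [MOUNTS_FILE]: prints a verdict and returns non-zero
# when the device looks busy or the mount table cannot be read.
safe_to_format() {
  mounts=${2:-/proc/mounts}
  if [ ! -r "$mounts" ]; then
    echo "refusing: cannot read $mounts"
    return 1
  fi
  if is_mounted "$1" "$mounts"; then
    echo "refusing: $1 (or one of its partitions) is mounted"
    return 1
  fi
  echo "ok: $1 is not mounted"
}

# Example (not executed here):
# safe_to_format /dev/sdb && sudo cryptsetup luksFormat /dev/sdb
```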

4. Update HPE Ezmeral to Use the Encrypted Disk

  1. Update HPE Ezmeral Configuration:
    • Point HPE Ezmeral Data Fabric’s configuration to use the new encrypted mount point (/datafabric) for its storage.
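    • Update the relevant configuration as needed to use the encrypted path. Note that disksetup is a utility rather than a configuration file (disks are normally added by running /opt/mapr/server/disksetup against a list of devices), and /etc/fstab matters only if the mount should persist across reboots.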
  2. Restart HPE Ezmeral Services:
    • Restart services to ensure that the system is using the encrypted disk for data operations.

5. Verify Disk Encryption

To confirm the encryption is working correctly:

  1. Check the encrypted device status:

sudo cryptsetup -v status encrypted_data

  2. Confirm that the mount point is in use by HPE Ezmeral and verify that data written to the directory is stored on the encrypted disk.

Summary

Using LUKS for disk encryption on the HPE Ezmeral Data Fabric platform provides robust data-at-rest security at the storage disk level. This setup ensures that any data written to physical disks is encrypted, protecting it from unauthorized access at a hardware level.

How to encrypt disks in HPE Ezmeral

In HPE Ezmeral Data Fabric (formerly MapR), disk encryption is a key component for securing data at rest. HPE Ezmeral supports data-at-rest encryption through encryption keys and policies that enable disk-level encryption, protecting data on disk transparently to applications.

Here’s a guide to setting up disk encryption in HPE Ezmeral:

1. Prerequisites

  • HPE Ezmeral Data Fabric 6.x or 7.x installed.
  • Access to MapR Control System (MCS) or command-line interface (CLI) to configure encryption settings.
  • MapR Core Security enabled. Data encryption requires core security to be enabled for HPE Ezmeral Data Fabric.
  • Access to the MapR Key Management System (KMS), or alternatively, an external KMS can also be used, depending on your setup and security requirements.

2. Configure MapR Security and KMS (Key Management System)

  1. Enable Core Security:
    • During HPE Ezmeral installation, make sure core security is enabled. If it’s not, you’ll need to enable it as encryption depends on core security services.
  2. Configure MapR KMS:
    • The MapR KMS service handles key management for encryption. Ensure that the KMS service is running, as it is essential for generating and managing encryption keys.
    • You can check the KMS status through the MCS or by using:

maprcli kms keys list

  3. Set Up an External KMS (Optional):
    • If you need to integrate with an external KMS (such as AWS KMS or other supported key management systems), configure it to work with HPE Ezmeral as per the system’s documentation.

3. Generate Encryption Keys

  1. Use the maprcli to Generate Keys:
    • You can create encryption keys using the maprcli command. These keys are necessary for encrypting and decrypting data on the disks.
    • To create an encryption key, use:

maprcli kms keys create -keyname <encryption_key_name>

  2. Store and Manage Keys:
    • After generating the key, you can use it in volume policies or for specific datasets. Key management can be handled directly within MapR KMS or through integrated KMS if you’re using an external provider.

4. Apply Encryption Policies to Volumes

Encryption in HPE Ezmeral is typically applied at the volume level:

  1. Create a Volume with Encryption:
    • When creating a new volume, specify that it should be encrypted and assign it the encryption key generated in the previous step.
    • For example:

maprcli volume create -name <volume_name> -path /<volume_path> -encryptiontype 1 -keyname <encryption_key_name>

    • -encryptiontype 1 specifies that the volume should be encrypted.
  2. Apply Encryption to Existing Volumes:
    • You can also apply encryption to existing volumes by updating them with an encryption key.
    • Run:

maprcli volume modify -name <volume_name> -encryptiontype 1 -keyname <encryption_key_name>

  3. Verify Volume Encryption:
    • You can check the encryption status of volumes in MCS or by using:

maprcli volume info -name <volume_name>

    • Look for the encryption field to confirm that it’s enabled.

5. Test Disk Encryption

  • Once encryption is configured, run tests to confirm data security. Write data to the encrypted volume and verify that unauthorized users or applications cannot access the data without appropriate permissions.

6. Monitor and Rotate Keys (Ongoing Maintenance)

  1. Monitor the Health of Encrypted Volumes:
    • Use the MCS or CLI to monitor the health of volumes, checking for any issues related to encryption or access.
  2. Rotate Encryption Keys:
    • To rotate keys, generate a new key and assign it to the volume. You may need to use the MapR KMS tool or an external KMS provider depending on your setup.
    • Regularly rotating keys is a best practice to enhance security.

7. (Optional) Use External KMS with HPE Ezmeral

If you prefer an external KMS, configure it as per your organization’s security requirements. External KMS options may provide additional features, such as compliance reporting and enhanced security controls.

Summary

By following these steps, you can set up and manage disk encryption within HPE Ezmeral Data Fabric, ensuring that data at rest remains protected. Regularly review encryption policies, monitor encrypted volumes, and rotate keys to maintain robust data security.

Can dm-crypt/LUKS encryption be used after the Linux OS is installed?

Yes, you can use dm-crypt/LUKS to encrypt disks or specific partitions after Linux is installed, although it may require some additional steps, especially if you’re encrypting a drive that already contains data. Here’s a general guide on how to use dm-crypt/LUKS for encryption post-installation:

Steps to Use dm-crypt/LUKS for Post-Installation Disk Encryption

Option 1: Encrypting a Non-System Partition or Additional Disk

If you want to encrypt a separate partition or disk that doesn’t contain the OS (e.g., a secondary data disk), this process is straightforward.

  1. Back Up Data:
    • If the disk or partition already contains data, make a backup, as this process will erase the data on the disk.
  2. Install Required Packages:
    • Ensure cryptsetup is installed.

sudo apt update

sudo apt install cryptsetup

  3. Initialize the LUKS Partition:
    • Replace /dev/sdX with the disk or partition you want to encrypt (e.g., /dev/sdb1).

sudo cryptsetup luksFormat /dev/sdX

    • Confirm and enter a passphrase when prompted. This passphrase will be required to unlock the partition.
  4. Open the Encrypted Partition:
    • This maps the encrypted partition to a device you can interact with.

sudo cryptsetup open /dev/sdX encrypted_data

  5. Format the Partition:
    • Format the encrypted partition to your preferred file system (e.g., ext4).

sudo mkfs.ext4 /dev/mapper/encrypted_data

  6. Mount the Partition:
    • Create a mount point and mount the partition.

sudo mkdir /mnt/encrypted_data

sudo mount /dev/mapper/encrypted_data /mnt/encrypted_data

  7. Configure Automatic Mounting (Optional):
    • To have the partition prompt for a passphrase at boot, edit /etc/crypttab and /etc/fstab.
    • Add an entry to /etc/crypttab:

encrypted_data /dev/sdX none luks

    • Then, add an entry to /etc/fstab to mount it at boot:

/dev/mapper/encrypted_data /mnt/encrypted_data ext4 defaults 0 2
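
Device names such as /dev/sdX can change between boots, so a common variation is to reference the LUKS container by UUID in /etc/crypttab. The sketch below generates both entries from a mapping name and a UUID. The helper names are hypothetical and the UUID shown is a placeholder, not a real value.

```shell
#!/usr/bin/env bash
# crypttab_line NAME UUID: prints an /etc/crypttab entry that prompts for the
# passphrase at boot ("none" means no key file).
crypttab_line() {
  echo "$1 UUID=$2 none luks"
}

# fstab_line NAME MOUNTPOINT: prints the matching /etc/fstab entry.
fstab_line() {
  echo "/dev/mapper/$1 $2 ext4 defaults 0 2"
}

# The UUID below is a placeholder; on a real system it could be read with:
#   sudo cryptsetup luksUUID /dev/sdX
crypttab_line encrypted_data 00000000-0000-0000-0000-000000000000
fstab_line encrypted_data /mnt/encrypted_data
```

The printed lines are intended to be reviewed and then appended to the respective files by hand.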

Option 2: Encrypting the Root or System Partition Post-Installation

Encrypting the root or system partition after installation is more complex, as it requires bootloader changes and adjustments to the initramfs. It typically involves the following high-level steps:

  1. Boot from a Live USB: Boot from a live Linux USB to avoid active usage of the root partition.
  2. Backup System: Make a complete backup of your system, as these steps involve substantial changes to the partition.
  3. Set Up LUKS on Root Partition:
    • Unmount the root partition and initialize it with luksFormat, then reopen it.
  4. Restore Data: Restore your data to the encrypted root partition.
  5. Update Initramfs: Update your initramfs configuration to include the LUKS configuration so it can prompt for a password at boot.
  6. Update Bootloader: Modify the bootloader (e.g., GRUB) to support LUKS so that it can unlock the root partition at boot.

Important Considerations

  • Performance: LUKS encryption can introduce some performance overhead, particularly on older hardware.
  • Password Management: Store your passphrase securely, as losing it will make the data irrecoverable.
  • Backup Regularly: Encrypted partitions make data recovery complex, so regular backups are essential.

Using dm-crypt/LUKS after installation is feasible, especially for non-system partitions, and provides strong encryption for securing sensitive data on Linux.

Example: installing HPE Ezmeral on a 12-server cluster

Setting up an HPE Ezmeral Data Fabric cluster on 12 servers involves several key steps, including planning, installation, configuration, and validation. Here’s a step-by-step guide to get you started:

1. Plan the Cluster Configuration

  • Determine Node Roles: Decide which servers will handle specific roles. For a 12-node setup, you could designate:
    • 3 nodes for core services (e.g., CLDB, ZooKeeper, Resource Manager).
    • 9 nodes for data and compute (e.g., Node Manager, FileServer services, Spark, HBase, etc.).
  • Network and Hostname Configuration:
    • Ensure each server has a static IP address, and configure hostnames consistently across nodes.
    • Set up DNS or /etc/hosts entries for name resolution.
  • Storage: Prepare storage volumes for the Data Fabric filesystem and other data services, ideally with high-throughput storage for each node.
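
The 3 + 9 role split and the name resolution entries can be drafted together before touching any server. This is a planning sketch only: the 10.0.0.x addresses, the nodeNN.example.com names, and the helper functions are all placeholders, and the rule that the first three nodes are control nodes is just one way to apply the split described above.

```shell
#!/usr/bin/env bash
# role_for_index N: the first three nodes are control nodes, the rest data nodes.
role_for_index() {
  if [ "$1" -le 3 ]; then echo control; else echo data; fi
}

# plan_cluster COUNT: prints "IP FQDN SHORTNAME # ROLE" lines in /etc/hosts style.
# The 10.0.0.x addresses and nodeNN names are placeholders.
plan_cluster() {
  i=1
  while [ "$i" -le "$1" ]; do
    printf '10.0.0.%d node%02d.example.com node%02d # %s\n' \
      "$((10 + i))" "$i" "$i" "$(role_for_index "$i")"
    i=$((i + 1))
  done
}

plan_cluster 12
```

The trailing comment on each line records the intended role and is ignored by the resolver, so the output can be pasted into /etc/hosts once the real addresses are filled in.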

2. Prepare the Servers

  • OS Requirements: Install a compatible Linux distribution on each server (e.g., RHEL, CentOS, or Ubuntu).
  • User and Security Settings:
    • Create a user for Ezmeral operations (typically mapr).
    • Disable SELinux or configure it to permissive mode.
    • Ensure firewall ports are open for required services (e.g., CLDB, ZooKeeper, Warden).
  • System Configuration:
    • Set kernel parameters according to Ezmeral requirements (e.g., adjust vm.swappiness and fs.file-max settings).
    • Synchronize time across all servers with NTP.
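
On RHEL or CentOS hosts that use firewalld (an assumption; other distributions use different tools), the firewall rules can be generated from a port list. The ports below are only the ones mentioned elsewhere in this guide and are not a complete list of Data Fabric ports. The `firewall_commands` helper is hypothetical, and the script prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Ports mentioned elsewhere in this guide: ZooKeeper (5181, 2888, 3888),
# CLDB (7222), and the MCS web UI (8443). This list is not exhaustive.
PORTS="5181 2888 3888 7222 8443"

# firewall_commands PORT...: prints firewalld commands that open each TCP port,
# followed by a reload so the permanent rules take effect.
firewall_commands() {
  for port in "$@"; do
    echo "sudo firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "sudo firewall-cmd --reload"
}

# Word splitting of the unquoted port list is intentional here.
firewall_commands $PORTS
```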

3. Install Prerequisite Packages

  • Install the packages that HPE Ezmeral Data Fabric depends on, such as a supported JDK (the 7.x releases list JDK 11 or later; confirm against the compatibility matrix), Python, and other utilities.
  • Ensure SSH key-based authentication is configured for the mapr user across all nodes, allowing passwordless SSH access.

4. Download and Install HPE Ezmeral Data Fabric Packages

  • Obtain the installation packages for HPE Ezmeral Data Fabric 7.x from HPE’s official site.
  • Install the required packages on each node, either manually or using a script. Required packages include mapr-core, mapr-cldb, mapr-zookeeper, mapr-fileserver, and mapr-webserver.

5. Install and Configure ZooKeeper

  • On the nodes designated to run ZooKeeper, install the ZooKeeper package (mapr-zookeeper) and configure it.
  • Update the ZooKeeper configuration file (zoo.cfg under /opt/mapr/zookeeper/zookeeper-3.x.x/conf/) to list all ZooKeeper nodes.
  • Start the ZooKeeper service on each of these nodes.

6. Install and Configure CLDB

  • Install the mapr-cldb package on the nodes you’ve chosen to run CLDB.
  • Configure CLDB settings in /opt/mapr/conf/cldb.conf and specify the cldb.nodes parameter to list the IPs or hostnames of your CLDB nodes.
  • Start the CLDB service on these nodes.

7. Install Other Core Services

  • Install the following packages on appropriate nodes:
    • mapr-webserver for the Control System (MCS).
    • mapr-resourcemanager and mapr-nodemanager if using YARN.
  • Start each of these services as needed.

8. Install Data and Compute Services

  • FileServer: Install mapr-fileserver on each node where you plan to store data. This typically includes all nodes except the dedicated service nodes.
  • NodeManager: Install mapr-nodemanager on the data and compute nodes if running YARN for resource management.
  • Database and Table Services: Install services like mapr-db and mapr-hbase on nodes where you need NoSQL capabilities or HBase tables.

9. Cluster Configuration

  • Use maprcli commands or the MCS web interface to configure and manage your cluster. For example:
    • Set up cluster-wide replication, fault tolerance, and volume configurations.
    • Define quotas and user permissions for data access.

10. Start and Verify the Cluster

  • Start all services using systemctl or service commands as needed.
  • Check the health of the cluster using:

maprcli node list -columns ip,svc,health
  • Access the MapR Control System (MCS) GUI by going to https://<cldb-node>:8443 and log in to confirm services are running correctly.

11. Post-Installation and Validation

  • Validate that all nodes are connected and the services are running.
  • Run health checks to ensure there are no configuration issues.
  • Monitor storage utilization and data replication to confirm that volumes and data distribution meet your needs.
  • Set up monitoring (e.g., Grafana, Prometheus) if you want real-time insight into cluster performance.

This setup provides a foundational Ezmeral Data Fabric cluster with core, data, and compute services distributed across 12 nodes for high availability and scalability. Each step includes installing necessary services, configuring core components, and verifying that the cluster operates as expected.
