VMware vSphere Big Data Extensions - 2.2 User’s Manual

VMware vSphere Big Data Extensions
Command-Line Interface Guide
vSphere Big Data Extensions 2.2
This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition. To check for more recent editions of this document, see http://www.vmware.com/support/pubs.
EN-001702-00
You can find the most up-to-date technical documentation on the VMware Web site at:
http://www.vmware.com/support/
The VMware Web site also provides the latest product updates.
If you have comments about this documentation, submit your feedback to:
docfeedback@vmware.com
Copyright © 2013 – 2015 VMware, Inc. All rights reserved. Copyright and trademark information. This work is licensed under a Creative Commons Attribution-NoDerivs 3.0 United States License
(http://creativecommons.org/licenses/by-nd/3.0/us/legalcode).
VMware, Inc.
3401 Hillview Ave. Palo Alto, CA 94304 www.vmware.com

Contents

About This Book
1 Using the Serengeti Remote Command-Line Interface Client
    Access the Serengeti CLI By Using the Remote CLI Client
    Log in to Hadoop Nodes with the Serengeti Command-Line Interface Client
2 Managing Application Managers
3 About Application Managers
4 Add an Application Manager by Using the Serengeti Command-Line Interface
5 View List of Application Managers by using the Serengeti Command-Line Interface
6 Modify an Application Manager by Using the Serengeti Command-Line Interface
7 View Supported Distributions for All Application Managers by Using the Serengeti Command-Line Interface
8 View Configurations or Roles for Application Manager and Distribution by Using the Serengeti Command-Line Interface
9 Delete an Application Manager by Using the Serengeti Command-Line Interface
10 Managing the Big Data Extensions Environment by Using the Serengeti Command-Line Interface
    About Application Managers
    Add a Resource Pool with the Serengeti Command-Line Interface
    Remove a Resource Pool with the Serengeti Command-Line Interface
    Add a Datastore with the Serengeti Command-Line Interface
    Remove a Datastore with the Serengeti Command-Line Interface
    Add a Network with the Serengeti Command-Line Interface
    Remove a Network with the Serengeti Command-Line Interface
    Reconfigure a Static IP Network with the Serengeti Command-Line Interface
    Reconfigure the DNS Type and Generate a Hostname with the Serengeti Command-Line Interface
    Increase Cloning Performance and Resource Usage of Virtual Machines
11 Managing Users and User Accounts
    Create an LDAP Service Configuration File Using the Serengeti Command-Line Interface
    Activate Centralized User Management Using the Serengeti Command-Line Interface
    Create a Cluster With LDAP User Authentication Using the Serengeti Command-Line Interface
    Change User Management Modes Using the Serengeti Command-Line Interface
    Modify LDAP Configuration Using the Serengeti Command-Line Interface
12 Creating Hadoop and HBase Clusters
    About Hadoop and HBase Cluster Deployment Types
    Default Hadoop Cluster Configuration for Serengeti
    Default HBase Cluster Configuration for Serengeti
    About Cluster Topology
    About HBase Clusters
    About MapReduce Clusters
    About Data Compute Clusters
    About Customized Clusters
13 Managing Hadoop and HBase Clusters
    Stop and Start a Cluster with the Serengeti Command-Line Interface
    Scale Out a Cluster with the Serengeti Command-Line Interface
    Scale CPU and RAM with the Serengeti Command-Line Interface
    Reconfigure a Cluster with the Serengeti Command-Line Interface
    About Resource Usage and Elastic Scaling
    Delete a Cluster by Using the Serengeti Command-Line Interface
    About vSphere High Availability and vSphere Fault Tolerance
    Reconfigure a Node Group with the Serengeti Command-Line Interface
    Recover from Disk Failure with the Serengeti Command-Line Interface Client
    Enter Maintenance Mode to Perform Backup and Restore with the Serengeti Command-Line Interface Client
14 Monitoring the Big Data Extensions Environment
    View List of Application Managers by using the Serengeti Command-Line Interface
    View Available Hadoop Distributions with the Serengeti Command-Line Interface
    View Supported Distributions for All Application Managers by Using the Serengeti Command-Line Interface
    View Configurations or Roles for Application Manager and Distribution by Using the Serengeti Command-Line Interface
    View Provisioned Clusters with the Serengeti Command-Line Interface
    View Datastores with the Serengeti Command-Line Interface
    View Networks with the Serengeti Command-Line Interface
    View Resource Pools with the Serengeti Command-Line Interface
15 Cluster Specification Reference
    Cluster Specification File Requirements
    Cluster Definition Requirements
    Annotated Cluster Specification File
    Cluster Specification Attribute Definitions
    White Listed and Black Listed Hadoop Attributes
    Convert Hadoop XML Files to Serengeti JSON Files
16 Serengeti CLI Command Reference
    appmanager Commands
    cluster Commands
    connect Command
    datastore Commands
    disconnect Command
    distro list Command
    mgmtvmcfg Commands
    network Commands
    resourcepool Commands
    topology Commands
    usermgmt Commands
Index

About This Book

VMware vSphere Big Data Extensions Command-Line Interface Guide describes how to use the Serengeti Command-Line Interface (CLI) to manage the vSphere resources that you use to create Hadoop and HBase clusters, and how to create, manage, and monitor Hadoop and HBase clusters with the VMware Serengeti™ CLI.
VMware vSphere Big Data Extensions Command-Line Interface Guide also describes how to perform Hadoop and HBase operations with the Serengeti CLI, and provides cluster specification and Serengeti CLI command references.
Intended Audience
This guide is for system administrators and developers who want to use Serengeti to deploy and manage Hadoop clusters. To successfully work with Serengeti, you should be familiar with Hadoop and VMware vSphere®.
VMware Technical Publications Glossary
VMware Technical Publications provides a glossary of terms that might be unfamiliar to you. For definitions of terms as they are used in VMware technical documentation, go to
http://www.vmware.com/support/pubs.
Using the Serengeti Remote Command-Line Interface Client 1
The Serengeti Remote Command-Line Interface Client lets you access the Serengeti Management Server to deploy, manage, and use Hadoop.
This chapter includes the following topics:
- “Access the Serengeti CLI By Using the Remote CLI Client”
- “Log in to Hadoop Nodes with the Serengeti Command-Line Interface Client”

Access the Serengeti CLI By Using the Remote CLI Client

You can access the Serengeti Command-Line Interface (CLI) to perform Serengeti administrative tasks with the Serengeti Remote CLI Client.
Prerequisites
- Use the VMware vSphere Web Client to log in to the VMware vCenter Server® on which you deployed the Serengeti vApp.
- Verify that the Serengeti vApp deployment was successful and that the Management Server is running.
- Verify that you have the correct password to log in to Serengeti CLI. See the VMware vSphere Big Data Extensions Administrator's and User's Guide.
The Serengeti CLI uses its vCenter Server credentials.
- Verify that the Java Runtime Environment (JRE) is installed in your environment and that its location is in your path environment variable.
Procedure
1 Download the Serengeti CLI package from the Serengeti Management Server.
Open a Web browser and navigate to the following URL: https://server_ip_address/cli/VMware-Serengeti-CLI.zip
2 Download the ZIP file.
The filename is in the format VMware-Serengeti-cli-version_number-build_number.ZIP.
3 Unzip the download.
The download includes the following components.
- The serengeti-cli-version_number JAR file, which includes the Serengeti Remote CLI Client.
- The samples directory, which includes sample cluster configurations.
- Libraries in the lib directory.
4 Open a command shell, and change to the directory where you unzipped the package.
5 Change to the cli directory, and run the following command to enter the Serengeti CLI.
- For any language other than French or German, run the following command.
java -jar serengeti-cli-version_number.jar
- For French or German languages, which use code page 850 (CP 850) language encoding when running the Serengeti CLI from a Windows command console, run the following command.
java -Dfile.encoding=cp850 -jar serengeti-cli-version_number.jar
6 Connect to the Serengeti service.
You must run the connect --host command every time you begin a CLI session, and again after the 30-minute session timeout. If you do not run this command, you cannot run any other commands.
a Run the connect command.
connect --host xx.xx.xx.xx:8443
b At the prompt, type your user name, which might be different from your login credentials for the Serengeti Management Server.
NOTE If you do not create a user name and password for the Serengeti Command-Line Interface Client, you can use the default vCenter Server administrator credentials. The Serengeti Command-Line Interface Client uses the vCenter Server login credentials with read permissions on the Serengeti Management Server.
c At the prompt, type your password.
A command shell opens, and the Serengeti CLI prompt appears. You can use the help command to get help with Serengeti commands and command syntax.
- To display a list of available commands, type help.
- To get help for a specific command, append the name of the command to the help command.
help cluster create
- Press Tab to complete a command.
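If you start the client from a script, the launch command in step 5 can be assembled programmatically. The following Python sketch only builds the java command line shown above. The function name build_cli_launch_command and the use_cp850 flag are hypothetical names introduced for illustration, and the version string in the example is a placeholder, not a value from this guide.

```python
from typing import List


def build_cli_launch_command(version_number: str, use_cp850: bool = False) -> List[str]:
    """Return the argument list that starts the Serengeti Remote CLI Client.

    Set use_cp850 to True only for French or German on a Windows
    command console, as described in step 5 of the procedure.
    """
    command = ["java"]
    if use_cp850:
        # JVM system properties must appear before the -jar option.
        command.append("-Dfile.encoding=cp850")
    command += ["-jar", f"serengeti-cli-{version_number}.jar"]
    return command


if __name__ == "__main__":
    # "2.2.0" is a placeholder version string (an assumption).
    print(" ".join(build_cli_launch_command("2.2.0")))
    print(" ".join(build_cli_launch_command("2.2.0", use_cp850=True)))
```

The returned list can be passed to subprocess.run from the cli directory. The connect step and the credential prompts remain interactive.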

Log in to Hadoop Nodes with the Serengeti Command-Line Interface Client

To troubleshoot or to run your management automation scripts, log in to Hadoop master, worker, and client nodes over SSH from the Serengeti Management Server. You can use SSH client tools such as SSH, PDSH, ClusterSSH, and Mussh, which do not require password authentication.
To connect to Hadoop cluster nodes over SSH, you can use a user name and password authenticated login. All deployed nodes are password-protected with either a random password or a user-specified password that was assigned when the cluster was created.
Prerequisites
Use the vSphere Web Client to log in to vCenter Server, and verify that the Serengeti Management Server virtual machine is running.
Procedure
1 Right-click the Serengeti Management Server virtual machine and select Open Console.
The password for the Serengeti Management Server appears.
NOTE If the password scrolls off the console screen, press Ctrl+D to return to the command prompt.
2 Use the vSphere Web Client to log in to the Hadoop node.
The password for the root user appears on the virtual machine console in the vSphere Web Client.
3 Change the password of the Hadoop node by running the set-password -u command.
sudo /opt/serengeti/sbin/set-password -u
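For automation scripts that run from the Serengeti Management Server, it can help to generate one SSH command per node. The following Python sketch is an illustration only: the function name build_ssh_commands, the user name, and the node addresses (taken from the reserved documentation range 192.0.2.0/24) are assumptions. It relies only on the standard `ssh user@host command` form and does not handle authentication.

```python
from typing import Dict, List


def build_ssh_commands(node_addresses: List[str], user: str,
                       remote_command: str) -> Dict[str, List[str]]:
    """Map each node address to an ssh argument list that runs remote_command."""
    return {
        address: ["ssh", f"{user}@{address}", remote_command]
        for address in node_addresses
    }


if __name__ == "__main__":
    # Hypothetical node addresses and user name, for illustration only.
    nodes = ["192.0.2.11", "192.0.2.12"]
    for address, argv in build_ssh_commands(nodes, "node_user", "hostname").items():
        print(address, "->", " ".join(argv))
```

Each argument list can be passed to subprocess.run once you have confirmed that you can log in to the node.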

Managing Application Managers 2

A key to managing your Hadoop clusters is understanding how to manage the different application managers that you use in your Big Data Extensions environment.

About Application Managers 3

You can use Cloudera Manager, Ambari, and the default application manager to provision and manage clusters with VMware vSphere Big Data Extensions.
After you add a new Cloudera Manager or Ambari application manager to Big Data Extensions, you can redirect your software management tasks, including monitoring and managing clusters, to that application manager.
You can use an application manager to perform the following tasks:
- List all available vendor instances, supported distributions, and configurations or roles for a specific application manager and distribution.
- Create clusters.
- Monitor and manage services from the application manager console.
Check the documentation for your application manager for tool-specific requirements.
Restrictions
The following restrictions apply to Cloudera Manager and Ambari application managers:
- To add an application manager with HTTPS, use the FQDN instead of the URL.
- You cannot rename a cluster that was created with a Cloudera Manager or Ambari application manager.
- You cannot change services for a big data cluster from Big Data Extensions if the cluster was created with an Ambari or Cloudera Manager application manager.
- To change services, configurations, or both, you must make the changes from the application manager on the nodes.
If you install new services, Big Data Extensions starts and stops the new services together with old services.
- If you use an application manager to change services and big data cluster configurations, those changes cannot be synced from Big Data Extensions. The nodes that you create with Big Data Extensions do not contain the new services or configurations.
Add an Application Manager by Using the Serengeti Command-Line Interface 4
To use either Cloudera Manager or Ambari application managers, you must add the application manager and add server information to Big Data Extensions.
NOTE If you want to add a Cloudera Manager or Ambari application manager with HTTPS, use the FQDN in place of the URL.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager add command.
appmanager add --name application_manager_name --type [ClouderaManager|Ambari] --url http[s]://server:port
Application manager names can include only alphanumeric characters ([0-9, a-z, A-Z]) and the following special characters: underscores, hyphens, and blank spaces.
You can use the optional description variable to include a description of the application manager instance.
3 Enter your username and password at the prompt.
4 If you specified SSL, enter the file path of the SSL certificate at the prompt.
What to do next
To verify that the application manager was added successfully, run the appmanager list command.
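Before you type the appmanager add command, you can check its arguments against the rules in this procedure. The following Python sketch validates the name and type, then assembles the command tokens. The function names, the example manager name, and the example URL are hypothetical. The guide does not say how to quote names that contain blank spaces, so the sketch returns a token list instead of a single string.

```python
import re
from typing import List

# Rule from the guide: alphanumerics, underscores, hyphens, and blank spaces.
NAME_PATTERN = re.compile(r"[0-9A-Za-z_\- ]+")
# The two values that the guide lists for the --type parameter.
SUPPORTED_TYPES = ("ClouderaManager", "Ambari")


def is_valid_appmanager_name(name: str) -> bool:
    """Return True if name uses only the characters that the guide allows."""
    return NAME_PATTERN.fullmatch(name) is not None


def build_appmanager_add(name: str, manager_type: str, url: str) -> List[str]:
    """Return the tokens of an appmanager add command, or raise ValueError."""
    if not is_valid_appmanager_name(name):
        raise ValueError(f"invalid application manager name: {name!r}")
    if manager_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported application manager type: {manager_type!r}")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"URL must start with http:// or https://: {url!r}")
    return ["appmanager", "add", "--name", name,
            "--type", manager_type, "--url", url]


if __name__ == "__main__":
    # Hypothetical manager name and server URL, for illustration only.
    tokens = build_appmanager_add("cm_server-1", "ClouderaManager",
                                  "http://cm.example.com:7180")
    print(" ".join(tokens))
```

The user name, password, and SSL certificate path are still entered at the prompts in steps 3 and 4.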
View List of Application Managers by using the Serengeti Command-Line Interface 5
You can use the appmanager list command to list the application managers that are installed on the Big Data Extensions environment.
Prerequisites
Verify that you are connected to an application manager.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager list command.
appmanager list
The command returns a list of all application managers that are installed on the Big Data Extensions environment.
Modify an Application Manager by Using the Serengeti Command-Line Interface 6
You can modify the information for an application manager with the Serengeti CLI. For example, you can change the manager server IP address if it is not a static IP, or you can upgrade the administrator account.
Prerequisites
Verify that you have at least one external application manager installed on your Big Data Extensions environment.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager modify command.
appmanager modify --name application_manager_name --url <http[s]://server:port>
Additional parameters are available for this command. For more information about this command, see “appmanager modify Command” in the Serengeti CLI Command Reference.
View Supported Distributions for All Application Managers by Using the Serengeti Command-Line Interface 7
Supported distributions are those distributions that are supported by Big Data Extensions. Available distributions are those distributions that have been added into your Big Data Extensions environment. You can view a list of the Hadoop distributions that are supported in the Big Data Extensions environment to determine if a particular distribution is available for a particular application manager.
Prerequisites
Verify that you are connected to an application manager.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager list command.
appmanager list --name application_manager_name [--distros]
The command returns a list of all distributions that are supported for the application manager that you specify.
If you do not include the --name parameter, the command returns a list of all the Hadoop distributions that are supported on each of the application managers in the Big Data Extensions environment.
View Configurations or Roles for Application Manager and Distribution by Using the Serengeti Command-Line Interface 8
You can use the appmanager list command to list the Hadoop configurations or roles for a specific application manager and distribution.
The configuration list includes those configurations that you can use to configure the cluster in the cluster specifications.
The role list contains the roles that you can use to create a cluster. You should not use unsupported roles to create clusters in the application manager.
Prerequisites
Verify that you are connected to an application manager.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager list command.
appmanager list --name application_manager_name [--distro distro_name (--configurations | --roles) ]
The command returns a list of the Hadoop configurations or roles for a specific application manager and distribution.
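The appmanager list command takes several option combinations in this guide, which are easy to mix up. The following Python sketch assembles the command tokens and rejects combinations that the syntax lines do not show. The function and parameter names are hypothetical. Treating --distros and --distro as mutually exclusive, and requiring --name with --distro, is this sketch's reading of the syntax lines, not a documented rule.

```python
from typing import List, Optional


def build_appmanager_list(name: Optional[str] = None,
                          distros: bool = False,
                          distro: Optional[str] = None,
                          detail: Optional[str] = None) -> List[str]:
    """Return the tokens of an appmanager list command.

    detail is "configurations" or "roles" and is used only with distro.
    """
    tokens = ["appmanager", "list"]
    if distro is not None or detail is not None:
        # Form: --name NAME --distro DISTRO (--configurations | --roles)
        if name is None or distro is None:
            raise ValueError("--distro requires both a name and a distribution")
        if detail not in ("configurations", "roles"):
            raise ValueError("detail must be 'configurations' or 'roles'")
        if distros:
            raise ValueError("--distros is not combined with --distro here")
        return tokens + ["--name", name, "--distro", distro, f"--{detail}"]
    if name is not None:
        tokens += ["--name", name]
    if distros:
        tokens.append("--distros")
    return tokens


if __name__ == "__main__":
    # "cm1" and "CDH5" are placeholder names, for illustration only.
    print(" ".join(build_appmanager_list()))
    print(" ".join(build_appmanager_list(name="cm1", distros=True)))
    print(" ".join(build_appmanager_list(name="cm1", distro="CDH5", detail="roles")))
```

Returning tokens instead of a formatted string keeps the sketch independent of how the CLI handles quoting.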
Delete an Application Manager by Using the Serengeti Command-Line Interface 9
You can use the Serengeti CLI to delete an application manager when you no longer need it.
Prerequisites
- Verify that you have at least one external application manager installed on your Big Data Extensions environment.
- Verify that the application manager that you want to delete does not contain any clusters. Otherwise, the deletion process fails.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager delete command.
appmanager delete --name application_manager_name
Managing the Big Data Extensions Environment by Using the Serengeti Command-Line Interface 10
You must manage your Big Data Extensions environment. If you choose not to add the resource pool, datastore, and network when you deploy the Serengeti vApp, you must add the vSphere resources before you create a Hadoop or HBase cluster. You must also add additional application managers if you want to use either Ambari or Cloudera Manager to manage your Hadoop clusters. You can remove resources that you no longer need.
This chapter includes the following topics:
- “About Application Managers”
- “Add a Resource Pool with the Serengeti Command-Line Interface”
- “Remove a Resource Pool with the Serengeti Command-Line Interface”
- “Add a Datastore with the Serengeti Command-Line Interface”
- “Remove a Datastore with the Serengeti Command-Line Interface”
- “Add a Network with the Serengeti Command-Line Interface”
- “Remove a Network with the Serengeti Command-Line Interface”
- “Reconfigure a Static IP Network with the Serengeti Command-Line Interface”
- “Reconfigure the DNS Type and Generate a Hostname with the Serengeti Command-Line Interface”
- “Increase Cloning Performance and Resource Usage of Virtual Machines”

About Application Managers

You can use Cloudera Manager, Ambari, and the default application manager to provision and manage clusters with VMware vSphere Big Data Extensions.
After you add a new Cloudera Manager or Ambari application manager to Big Data Extensions, you can redirect your software management tasks, including monitoring and managing clusters, to that application manager.
You can use an application manager to perform the following tasks:
- List all available vendor instances, supported distributions, and configurations or roles for a specific application manager and distribution.
- Create clusters.
- Monitor and manage services from the application manager console.
Check the documentation for your application manager for tool-specific requirements.
Restrictions
The following restrictions apply to Cloudera Manager and Ambari application managers:
- To add an application manager with HTTPS, use the FQDN instead of the URL.
- You cannot rename a cluster that was created with a Cloudera Manager or Ambari application manager.
- You cannot change services for a big data cluster from Big Data Extensions if the cluster was created with an Ambari or Cloudera Manager application manager.
- To change services, configurations, or both, you must make the changes from the application manager on the nodes.
If you install new services, Big Data Extensions starts and stops the new services together with old services.
- If you use an application manager to change services and big data cluster configurations, those changes cannot be synced from Big Data Extensions. The nodes that you create with Big Data Extensions do not contain the new services or configurations.

Add an Application Manager by Using the Serengeti Command-Line Interface

To use either Cloudera Manager or Ambari application managers, you must add the application manager and add server information to Big Data Extensions.
NOTE If you want to add a Cloudera Manager or Ambari application manager with HTTPS, use the FQDN in place of the URL.
Procedure
1 Access the Serengeti CLI.
2 Run the appmanager add command.
appmanager add --name application_manager_name --type [ClouderaManager|Ambari] --url http[s]://server:port
Application manager names can include only alphanumeric characters ([0-9, a-z, A-Z]) and the following special characters: underscores, hyphens, and blank spaces.
You can use the optional description variable to include a description of the application manager instance.
3 Enter your username and password at the prompt.
4 If you specified SSL, enter the file path of the SSL certificate at the prompt.
What to do next
To verify that the application manager was added successfully, run the appmanager list command.

Modify an Application Manager by Using the Serengeti Command-Line Interface

You can modify the information for an application manager with the Serengeti CLI. For example, you can change the manager server IP address if it is not a static IP, or you can upgrade the administrator account.
Prerequisites
Verify that you have at least one external application manager installed on your Big Data Extensions environment.