Apple MAC OS X SERVER 10.5 LEOPARD XGRID ADMINISTRATION

Mac OS X Server
Xgrid Administration and High Performance Computing
For Version 10.5 Leopard
Apple Inc.
© 2007 Apple Inc. All rights reserved.
Every effort has been made to ensure that the information in this manual is accurate. Apple Inc. is not responsible for printing or clerical errors.
Apple 1 Infinite Loop Cupertino, CA 95014-2084 408-996-1010 www.apple.com
Use of the “keyboard” Apple logo (Option-Shift-K) for commercial purposes without the prior written consent of Apple may constitute trademark infringement and unfair competition in violation of federal and state laws.
AirPort, Apple, the Apple logo, Bonjour, FireWire, iPod, Mac, Macintosh, Mac OS, Xgrid, Xsan, and Xserve are trademarks of Apple Inc., registered in the U.S. and other countries. Apple Remote Desktop and Finder are trademarks of Apple Inc.
Intel, Intel Core, and Xeon are trademarks of Intel Corp. in the U.S. and other countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
UNIX is a registered trademark of The Open Group.
Other company and product names mentioned herein are trademarks of their respective companies. Mention of third-party products is for informational purposes only and constitutes neither an endorsement nor a recommendation. Apple assumes no responsibility with regard to the performance or use of these products.
019-0946/2007-09-01

Contents

Preface: About This Guide
  What’s New in Xgrid Administration
  What’s in This Guide
  Using This Guide
  Using Onscreen Help
  Advanced Server Administration Guides
  Viewing PDF Guides on Screen
  Printing PDF Guides
  Getting Documentation Updates
  Getting Additional Information

Part I: Xgrid Administration

Chapter 1: Introducing Xgrid Service
  About Xgrid and Computational Grids
  How Xgrid Works
  Common Types of Grids and Grid Computing Styles
  Xgrid Clusters
  Local Grids
  Distributed Grids
  Xgrid Components
  Agent
  Client
  Controller
  Jobs
  Requirements and Capacities

Chapter 2: Setting Up and Configuring Xgrid Service
  Setup Overview
  Before Setting Up Xgrid Service
  Authentication Methods for Xgrid
  Single Sign-On (SSO)
  Password-Based Authentication
  No Authentication
  Hosting the Grid Controller
  Turning Xgrid Service On
  Configuring Xgrid with the Xgrid Service Configuration Assistant
  Configuring Xgrid to Host a Grid Using the Xgrid Service Configuration Assistant
  Configuring Xgrid to Join a Grid Using Xgrid Service Configuration Assistant
  Setting Up Xgrid Service
  Xgrid and Multiple Network Interfaces
  Configuring Controller Settings
  Starting Xgrid Service
  Configuring an Xgrid Agent (Mac OS X Server)
  Configuring an Xgrid Agent (Mac OS X)
  Setting Up Grid Authentication
  Setting Up Kerberos for Xgrid
  Setting Passwords for Xgrid
  Managing Client Access
  Setting SACL Permissions for Users and Groups
  Setting SACL Permissions for Administrators
  Managing Xgrid Service
  Viewing Xgrid Service Status
  Viewing Xgrid Service Logs
  Stopping Xgrid Service

Chapter 3: Managing a Grid
  Using Xgrid Admin
  Status Indicators in Xgrid Admin
  Managing the Xgrid Controller
  Connecting to an Xgrid Controller
  Disconnecting from an Xgrid Controller
  Adding an Xgrid Controller
  Removing an Xgrid Controller
  Managing Agents
  Viewing a List of Agents
  Adding an Agent
  Deleting an Agent
  Managing Jobs
  Viewing a List of Jobs
  Stopping a Job
  Repeating or Restarting a Job
  Deleting a Job
  Adding a Grid
  Deleting a Grid
  Monitoring Grid Activity

Chapter 4: Planning and Submitting Xgrid Jobs
  Structuring Jobs for Xgrid
  About Job Styles
  About Job Failure
  Submitting a Job
  Examples of Xgrid Job Submission and Results Retrieval
  Viewing Job Status
  Retrieving Job Results

Chapter 5: Solving Xgrid Problems
  If Your Agents Can’t Connect to the Xgrid Controller
  If You Use Xgrid over SSH
  If You Run Tasks on Multi-CPU Machines
  If You Submit a Large Number of Jobs
  If You Want to Use Xgrid on Other Platforms
  If the Xgrid Controller Must Be Restarted
  If Xgrid Has Crashed
  If You Are Trying to Submit Jobs over 2 GB
  If You Want to Enable Kerberos/SSO for Xgrid
  For More Information

Part II: Configuring High Performance Computing

Chapter 6: Introducing High Performance Computing
  Understanding HPC
  Apple and HPC
  Mac OS X Server
  Xserve Clusters
  Xserve 64-Bit Architecture
  Support of Loosely Coupled Computations

Chapter 7: Reviewing the Cluster Setup Process
  Cluster Setup Overview

Chapter 8: Identifying Prerequisites and System Requirements
  Prerequisites
  Expertise
  Xserve Configuration
  System Requirements
  Infrastructure Requirements
  Software Requirements
  Private Network Requirements
  Static IP Address and Hostname Requirements

Chapter 9: Preparing the Cluster for Configuration
  Preparing the Cluster Nodes for Software Configuration
  (Optional) Setting Up the Management Computer

Chapter 10: Setting Up the Cluster Controller
  Setting Up Server Software on the Cluster Controller
  Configuring DNS Service
  Verifying DNS Settings
  Configuring Open Directory Service
  Configuring the Cluster Controller as an Open Directory Master
  Configuring DHCP Service
  Configuring Firewall Settings on the Cluster Controller
  Configuring NAT Settings on the Cluster Controller
  Configuring NFS
  Configuring VPN Service
  Configuring Xgrid Service
  Preparing the Data Drive as a Mirrored RAID set
  Creating a Home Directory Automount Share Point
  Creating User Accounts

Chapter 11: Setting Up Compute Nodes
  Creating an Auto Server Setup Record for Compute Nodes
  Verifying LDAP Record Creation
  Setting Up Compute Nodes
  Configuring Cluster Nodes
  Creating and Verifying a VPN Connection
  Joining a Remote Client to the Kerberos Realm
  Verifying Remote Client Access to the Kerberos Realm

Chapter 12: Testing Your Cluster
  Checking Your Cluster Using Xgrid Admin
  Testing Your Xgrid Cluster
  Verifying Your Xgrid Configuration
  Verifying Your SSH Connection

Appendix A: Cluster Setup Checklist

Appendix B: Automating Compute Node Configuration
  Naming Multiple Cluster Nodes
  Joining Multiple Cluster Nodes to the Kerberos Realm
  Configuring Xgrid Agent Settings Using Apple Remote Desktop
  Using SSH Without Passwords

Glossary
Index

About This Guide

This guide describes the Xgrid components included in Mac OS X Server and tells you how to configure and use them in computational grids.
Xgrid in Mac OS X Server version 10.5 includes a controller for computational grids and an agent that allows the server’s processor to work on jobs submitted to a grid. The agent is also available in computers using Mac OS X v10.3 or v10.4.

What’s New in Xgrid Administration

Xgrid service, Xgrid Admin, and high performance computing (HPC) in Mac OS X Server v10.5 Leopard include the following valuable new features.
 Improved security with Xgrid superuser access controls  New Xgrid service configuration assistant  Logging improvements

What’s in This Guide

This guide is organized as follows:
• Part I: Xgrid Administration. The chapters in this part of the guide introduce you to Xgrid service and the applications and tools available for administering Xgrid.
• Part II: Configuring High Performance Computing. The chapters in this part of the guide introduce you to HPC and the applications and tools available for administering HPC.
Note: Because Apple frequently releases new versions and updates to its software, images shown in this book may be different from what you see on your screen.

Using This Guide

The following list contains suggestions for using this guide:
• Read the guide in its entirety. Subsequent sections might build on information and recommendations discussed in prior sections.
• The instructions in this guide should always be tested in a nonoperational environment before deployment. This nonoperational environment should simulate, as much as possible, the environment where the computer will be deployed.

Using Onscreen Help

You can get task instructions on screen in Help Viewer while you’re managing Leopard Server. You can view help on a server or an administrator computer. (An administrator computer is a Mac OS X computer with Leopard Server administration software installed on it.)
To get help for an advanced configuration of Leopard Server:
• Open Server Admin or Workgroup Manager and then:
  • Use the Help menu to search for a task you want to perform.
  • Choose Help > Server Admin or Help > Workgroup Manager to browse and search the help topics.
The help for Server Admin and Workgroup Manager contains instructions taken from Server Administration and other advanced administration guides described in “Advanced Server Administration Guides,” next.
To see the latest server help topics:
• Make sure the server or administrator computer is connected to the Internet while you’re getting help.
Help Viewer automatically retrieves and caches the latest server help topics from the Internet. When not connected to the Internet, Help Viewer displays cached help topics.

Advanced Server Administration Guides

Getting Started covers basic installation and initial setup methods for a standard, workgroup, or advanced configuration of Leopard Server. An advanced guide, Server Administration, covers advanced planning, installation, setup, and more. A suite of additional guides, listed below, covers advanced planning, setup, and management of individual services. You can get these guides in PDF format from the Mac OS X Server documentation website at www.apple.com/server/documentation.
This guide ... tells you how to:
Getting Started and Mac OS X Server Worksheet: Install Mac OS X Server and set it up for the first time.
Command-Line Administration: Install, set up, and manage Mac OS X Server using UNIX command-line tools and configuration files.
File Services Administration: Share selected server volumes or folders among server clients using the AFP, NFS, FTP, and SMB/CIFS protocols.
iCal Service Administration: Set up and manage iCal shared calendar service.
iChat Service Administration: Set up and manage iChat instant messaging service.
Mac OS X Security Configuration: Make Mac OS X computers (clients) more secure, as required by enterprise and government customers.
Mac OS X Server Security Configuration: Make Mac OS X Server and the computer it’s installed on more secure, as required by enterprise and government customers.
Mail Service Administration: Set up and manage IMAP, POP, and SMTP mail services on the server.
Network Services Administration: Set up, configure, and administer DHCP, DNS, VPN, NTP, IP firewall, NAT, and RADIUS services on the server.
Open Directory Administration: Set up and manage directory and authentication services, and configure clients to access directory services.
Podcast Producer Administration: Set up and manage Podcast Producer service to record, process, and distribute podcasts.
Print Service Administration: Host shared printers and manage their associated queues and print jobs.
QuickTime Streaming and Broadcasting Administration: Capture and encode QuickTime content. Set up and manage QuickTime streaming service to deliver media streams live or on demand.
Server Administration: Perform advanced installation and setup of server software, and manage options that apply to multiple services or to the server as a whole.
System Imaging and Software Update Administration: Use NetBoot, NetInstall, and Software Update to automate the management of operating system and other software used by client computers.
Upgrading and Migrating: Use data and service settings from an earlier version of Mac OS X Server or Windows NT.
User Management: Create and manage user accounts, groups, and computers. Set up managed preferences for Mac OS X clients.
Web Technologies Administration: Set up and manage web technologies, including web, blog, webmail, wiki, MySQL, PHP, Ruby on Rails, and WebDAV.
Xgrid Administration and High Performance Computing: Set up and manage computational clusters of Xserve systems and Mac computers.
Mac OS X Server Glossary: Learn about terms used for server and storage products.

Viewing PDF Guides on Screen

While reading the PDF version of a guide on screen:
• Show bookmarks to see the guide’s outline, and click a bookmark to jump to the corresponding section.
• Search for a word or phrase to see a list of places where it appears in the document. Click a listed place to see the page where it occurs.
• Click a cross-reference to jump to the referenced section. Click a web link to visit the website in your browser.

Printing PDF Guides

If you want to print a guide:
 Save ink or toner by not printing the cover page.  Save color ink on a color printer by looking in the panes of the Print dialog for an
option to print in grays or black and white.
 Maximize the printed page image by changing the Scale setting in the Page Setup
dialog. Try 122% with Paper Size set to US Letter. (PDF pages are 7.5 by 9 inches except Getting Started, which is CD size, 125 by 125 mm.)
 Reduce the bulk of the printed document and save paper by printing more than one
page per sheet of paper. In the Print dialog, choose Layout from the untitled pop-up menu. If your printer supports two-sided (duplex) printing, select one of the Two­Sided options. Otherwise, choose 2 from the Pages per Sheet pop-up menu, and optionally choose Single Hairline from the Border menu.

Getting Documentation Updates

Periodically, Apple posts revised help pages and new editions of guides. Some revised help pages update the latest editions of the guides.
 To view new onscreen help topics for a server application, make sure your server or
administrator computer is connected to the Internet and click “Latest help topics” or “Staying current” in the main help page for the application.
 To download the latest guides in PDF format, go to the Mac OS X Server
documentation website:
www.apple.com/server/documentation

Getting Additional Information

For more information, consult these resources:
• Read Me documents: important updates and special information. Look for them on the server discs.
• Mac OS X Server website (www.apple.com/macosx/server): gateway to extensive product and technology information.
• Apple Service & Support website (www.apple.com/support): access to hundreds of articles from Apple’s support organization.
• Apple customer training (train.apple.com): instructor-led and self-paced courses for honing your server administration skills.
• Apple discussion groups (discussions.info.apple.com): a way to share questions, knowledge, and advice with other administrators.
• Apple mailing list directory (www.lists.apple.com): subscribe to mailing lists so you can communicate with other administrators using email.
• Open Source website (developer.apple.com/darwin/): access to Darwin open source code, developer information, and FAQs.
Part I: Xgrid Administration
Use the chapters in this part of the guide to learn about Xgrid service and the applications and tools available for administering Xgrid.
Chapter 1 Introducing Xgrid Service
Chapter 2 Setting Up and Configuring Xgrid Service
Chapter 3 Managing a Grid
Chapter 4 Planning and Submitting Xgrid Jobs
Chapter 5 Solving Xgrid Problems

1 Introducing Xgrid Service

Use this chapter to learn about what Xgrid is and how it can help you.
You use Xgrid to create grids of multiple computers and distribute complex jobs among them for high-throughput computing.
Xgrid, a technology in Mac OS X Server and Mac OS X, simplifies deployment and management of computational grids. Xgrid enables administrators to group computers in grids or clusters, and enables users to easily submit complex computations to groups of computers (local, remote, or both), as either an ad hoc grid or a centrally managed cluster.

About Xgrid and Computational Grids

Xgrid makes it easy to turn an ad hoc group of Mac systems into a low-cost supercomputer. Xgrid is ideal for individual researchers, specialized collaborators, and application developers. For example:
 Scientists can search biological databases on a cluster of Xserve systems.  Engineers can perform finite element analyses on their workgroup’s desktops.  Animators can render images using Mac systems across multiple corporate locations.  Research teams can enlist colleagues and interested laypeople in Internet-scale
volunteer grids to perform long-running scientific calculations.
 Anyone needing to perform CPU-intensive calculations can simultaneously run a
single job across multiple computers, dramatically improving throughput and responsiveness.
With Xgrid functionality integrated into Mac OS X Server, system administrators can quickly enable Xgrid on Mac systems throughout their company, turning idle CPU cycles into a productive cluster at no incremental cost.

How Xgrid Works

Xgrid creates multiple tasks for each job and distributes those tasks among multiple nodes. These nodes can be desktop computers running Mac OS X v10.3 or later, or server computers running Mac OS X Server v10.4 or later.
Many desktop computers sit idle during the day, in evenings, and on weekends. The assembly of these systems into a computational grid is known as desktop recovery. This method of grid construction enables you to vastly improve your computational capacity without purchasing extra hardware, and Xgrid makes the software configuration a straightforward task.
For a server to function as a controller, Xgrid requires Mac OS X Server v10.4 or later, with a minimum of 256 MB of RAM. To operate as an agent in a grid, Xgrid requires Mac OS X v10.3 or later, with a minimum of 128 MB of RAM (256 MB advisable). All Xgrid participants must have a network connection. As always, the more RAM a system has, the better it performs, particularly for high-performance computing applications.
A grid is a group of computers working together to solve a single problem. The systems in a grid can be loosely coupled, geographically dispersed and, to some extent, heterogeneous. In contrast, systems in a cluster are often homogeneous, collocated, and strictly managed.
Highly dispersed grids, such as SETI@Home, enable individuals to donate their spare processor cycles to a cause. In office environments, large rendering or simulation jobs can be distributed across all the systems left idle overnight. These can even be used to augment a dedicated computational cluster, which is available to Xgrid clients at all times.
These distinct grid configurations are explained in “Common Types of Grids and Grid Computing Styles” on page 20.
The following steps, taken from the original illustration, show how a grid handles a job:
1. Client submits job to Controller.
2. Controller splits job into tasks, then submits tasks to Agents.
3. Agents execute tasks.
4. Agents return tasks to Controller.
5. Controller collects tasks and returns job results to Client.
The agents in the illustration are distributed agents: a dedicated desktop, a dedicated server, and a part-time desktop.
Xgrid places no limit on the amount of computational power it can support. The performance of a grid depends on the participating systems, the software they run, and the network, among other factors. In particular, the characteristics of the individual application strongly influence grid performance.
It is up to you to determine whether an application benefits from being deployed on a computational grid. In the best case, application performance scales linearly with the size of the grid. In the worst case, adding agents to a grid can cause a job to take longer than it would with fewer agents. (In such a situation, tasks become so small that the overhead of distributing the increased number of tasks outweighs the performance gain from using more agents.) Keep these considerations in mind when you size a grid and structure jobs.
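The trade-off between grid size and distribution overhead can be illustrated with a toy model. The Python sketch below is not part of Xgrid. The function name, the cost model (one task per agent, with the controller spending a fixed overhead on each task in turn), and the numbers are all assumptions chosen only to show the shape of the curve.

```python
def completion_time(total_work_s, agents, overhead_per_task_s):
    """Estimate wall-clock time for a job split into one task per agent.

    Assumption (hypothetical model, not Xgrid's actual scheduler): the
    compute work divides evenly among agents, and the controller spends a
    fixed overhead distributing each task, one task after another.
    """
    if agents < 1:
        raise ValueError("at least one agent is required")
    compute = total_work_s / agents               # shrinks as agents are added
    distribution = overhead_per_task_s * agents   # grows as agents are added
    return compute + distribution


if __name__ == "__main__":
    # 600 seconds of work, 0.5 seconds of overhead per task (made-up values).
    for n in (1, 4, 16, 64, 256):
        print(n, completion_time(600.0, n, 0.5))
```

Under these made-up numbers the estimate falls until roughly 35 agents and then rises again, which is the worst case described above: past that point each added agent costs more in distribution overhead than it saves in compute time.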
Many proprietary projects enable you to participate in a large computational grid. Often these projects, such as SETI@Home and FightAIDS@Home, are tied to a specific scientific purpose. They usually have easy-to-install software that enables any volunteer to participate in that particular project, and they frequently take the form of a screen saver or background process.
You don’t need to think in terms of thousands or millions of seldom-used computers to see the significance of a computational grid. For example, computers used by university students or corporate employees often work fewer hours than the hours they sit idle at night or on weekends. These computers could contribute productively to the work of a grid without diminishing their usefulness to the students or employees.
Other grid projects are designed for large-scale computational grids, such as the Globus Alliance (a group founded by universities and researchers), with flexible resource management tools and more intelligent grid deployment methods. Instead of developing neatly packaged applications for a specific grid, such projects provide comprehensive frameworks for application deployment.
Xgrid enables users to participate in a computational grid of their choice while still providing the flexibility of a more generic framework for grid developers when deploying grid applications. Xgrid provides the primary benefits of both.
The advantages of the Xgrid technology include:
 Easy grid configuration and deployment  Straightforward yet flexible job submission  Automatic controller discovery by agents and clients  Flexible architecture based on open standards  Support for the UNIX security model, including Kerberos single sign-on or regular
password authentication
 Choice between a command-line interface or an API-based model for grid interaction

Common Types of Grids and Grid Computing Styles

Xgrid can be used in tightly coupled clusters, worldwide grids, and everything in between. This immense flexibility enables you to deploy grids of almost any nature. Three main topologies are commonly used for Xgrid deployments, discussed as follows:
 “Xgrid Clusters” on page 20  “Local Grids” on page 21  “Distributed Grids” on page 21

Xgrid Clusters

Computational clusters are sets of systems dedicated to computation. In a cluster, systems are typically co-located in a rack, connected using gigabit Ethernet or another high-performance network, and strictly managed for maximum performance.
Cluster systems are often entirely homogeneous: their operating systems are the same versions, they have the same software installed, and they generally have the same processor, disk, and RAM configurations.
Xgrid enables administrators to easily configure the distributed resource management functionality of the cluster. Each server in the system runs the agent software, and the head node in the cluster runs the controller software.
Xgrid distributes tasks across the cluster. In clusters, failure rates are generally very low. Systems are rarely, if ever, offline, and their resources are not shared with general user tasks. Clusters are the most efficient but most expensive model of distributed computing.

Local Grids

Systems that are under common administration in a company, university computer lab, or other managed environment can often be easily assembled into a grid for desktop recovery. These systems are often on a local area network (LAN) and they are generally managed by a single organization. As a result, they provide good network performance and offer substantial manageability.
Because these systems are often also used as day-to-day workstations, users can easily interrupt grid tasks by moving the mouse, resetting the system, or even accidentally disconnecting the system from the network. In such cases, a task that is part of an Xgrid job might fail. The Xgrid controller eventually reassigns the failed task to another agent, and the job completes successfully.
In local grids, performance is limited by such situations and by the varying performance of any given agent on the grid.
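The reassignment behavior described above can be sketched as a small simulation. This is an illustrative model only, not Xgrid's scheduler: the function name, the round-robin assignment, and the idea that an "unreliable" agent fails every task it receives are all assumptions made for the example.

```python
from collections import deque


def run_with_reassignment(tasks, agents, unreliable):
    """Return a dict mapping each task to the agent that completed it.

    Hypothetical model: agents are tried in round-robin order. An agent in
    `unreliable` fails every task it is given (for example, its user
    disconnected it), so the controller puts the task back in the queue
    and a later agent picks it up.
    """
    if all(agent in unreliable for agent in agents):
        raise RuntimeError("no agent can complete any task")
    queue = deque(tasks)
    completed = {}
    turn = 0
    while queue:
        task = queue.popleft()
        agent = agents[turn % len(agents)]
        turn += 1
        if agent in unreliable:
            queue.append(task)        # task failed: requeue for reassignment
        else:
            completed[task] = agent   # task finished on this agent
    return completed


if __name__ == "__main__":
    print(run_with_reassignment(["t1", "t2", "t3"], ["a", "b"], {"b"}))
```

Every task still completes, but each failed attempt costs an extra scheduling round, which is one way to picture why interruptions limit the performance of a local grid.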

Distributed Grids

When a system is permitted to donate its time, a distributed grid is formed.
The Xgrid agent enables a user to specify any IP address or host name for its controller. By specifying a grid, a user can dedicate his or her CPU time to that grid no matter where the controller is located.
The manager of the controller has no direct management control or knowledge of the agent system but is nonetheless able to harness its CPU time.
Distributed grids have very high failure rates for jobs but place a very low burden on the grid administrator. With very large jobs, high task failure rates may not substantially affect the performance of the grid if failed tasks can be rapidly reassigned to other available agents.
Network performance can also be a consideration because data is sent over the Internet, rather than over a local network, to agents connected to a grid. The monetary cost of such distributed grids is extremely low.

Xgrid Components

The Xgrid three-tier architecture simplifies the distribution of complicated tasks. Its user clients, grid controllers, and computational agents work together to streamline the process of assembling nodes, submitting jobs, and retrieving results.
The following steps, taken from the original illustration, show the Xgrid components and the process of automatic configuration for a grid:
1. Controller advertises via mDNS.
2. Agents locate Controller using mDNS, DNS, or name/address.
3. Agents and Controller mutually authenticate using passwords or single sign-on.
4. Client submits using mDNS, DNS, or name/address.
5. Clients and Controller mutually authenticate using passwords or single sign-on.
The agents in the illustration are distributed agents: a dedicated desktop, a dedicated server, and a part-time desktop.
The primary components of a computational grid perform the following functions:
• An agent runs one task at a time per CPU; a dual-processor computer can run two tasks simultaneously.
• A controller queues tasks, distributes those tasks to agents, and handles task reassignment.
• A client submits jobs to the Xgrid controller in the form of multiple tasks. (A client can be any computer running Mac OS X v10.4 or later or Mac OS X Server v10.4 or later.)
In principle, the agent, controller, and client can run on the same server, but it is often more efficient to have a dedicated controller node.

Agent

Xgrid agents run the computational tasks of a job. In Mac OS X Server, the agent is turned off by default. When an agent is turned on and becomes active at startup, it registers with a controller. (An agent can be connected to only one controller at a time.) The controller sends instructions and data to the agent as needed for the controller’s jobs. After it receives instructions from the controller, the agent performs its assigned tasks and sends the results back to the controller.
By default, an agent seeks to bind to the first available controller on the LAN. Alternatively, you can specify that it bind to a specific controller.
You can also specify whether an agent is always available or is available only when the computer is idle. A computer is considered idle when it has received no mouse or keyboard input; CPU and network activity are ignored. If a user returns to a computer that is running a grid task, the computer continues to run the task until it is finished.
By default, the agent on Mac OS X Server is dedicated and the agent on a Mac OS X computer (not a server) is configured to accept tasks only when the computer has had no user input for 15 minutes.
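The availability rules above can be expressed as a short decision function. This is a sketch of the policy as described in the text, not the agent's actual implementation; the function and parameter names are hypothetical. CPU and network activity are deliberately not inputs, because the text says they are ignored.

```python
FIFTEEN_MINUTES_S = 15 * 60  # default idle period stated for Mac OS X agents


def agent_accepts_tasks(always_available, seconds_since_user_input,
                        idle_threshold_s=FIFTEEN_MINUTES_S):
    """Decide whether an agent should accept a new task.

    Hypothetical sketch: a dedicated (always available) agent accepts tasks
    unconditionally; otherwise the computer must have had no mouse or
    keyboard input for at least `idle_threshold_s` seconds.
    """
    if always_available:
        return True
    return seconds_since_user_input >= idle_threshold_s


if __name__ == "__main__":
    print(agent_accepts_tasks(True, 0))        # dedicated server agent
    print(agent_accepts_tasks(False, 60))      # desktop used a minute ago
    print(agent_accepts_tasks(False, 20 * 60)) # desktop idle for 20 minutes
```

The function only governs whether a new task is accepted. A task that is already running continues until it finishes, even if the user returns.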
For details about configuring an agent, see “Configuring an Xgrid Agent (Mac OS X Server)” on page 32.
For information about managing agents, see “Managing Agents” on page 42.

Client

Any system can be an Xgrid client if it is running Mac OS X v10.4 or later and has a network connection to the Xgrid controller system. In general, the client can connect to only a single controller.
Depending on how a controller is configured, the client must supply a password or be authenticated by Kerberos (single sign-on) before submitting a job to the grid.
A user submits a job to the controller from a system running the Xgrid client software, usually a command-line tool accessed with the Terminal application. The job can specify the controller or use multicast DNS (mDNS) to dynamically discover the first available controller. When the job is complete, the controller notifies the client and the client can retrieve the results of the job.
For information about client authentication to the controller, see “Setting Up Grid Authentication” on page 34.

Controller

The Xgrid controller manages the communications among the computational resources of a grid. The controller requires Mac OS X Server v10.4 or later. The controller accepts network connections from clients and agents. It receives job submissions from clients, divides the jobs into tasks, dispatches tasks to agents, and returns results to the clients.
Although more than one Xgrid controller can run on a subnet, there can be only one controller per logical grid. Each controller can have any number of agents connected, but Apple has tested configurations of 128 agents per controller.
There is no software limit on the number of agents, and users of Xgrid can choose to exceed 128 agents on a controller at their own risk, with a theoretical maximum equal to the number of available sockets on the controller system.
For details about setting up an Xgrid controller, see “Configuring Controller Settings” on page 30.
For information about managing controllers and grids, see “Managing the Xgrid Controller” on page 40.

Jobs

A job is a collection of execution instructions that can include data and executables. Xgrid can run scripts, utilities, and custom software (anything that doesn’t require user interaction).
A client submits a job to the grid. The controller accepts the job and its associated files, divides the job into tasks, and then distributes the tasks to agents. Agents accept the tasks, perform the calculations, and return the results to the controller, which aggregates them and returns them to the client.
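The job flow above (split, distribute, compute, aggregate) can be mimicked on a single machine. The sketch below is an analogy only: it uses Python's standard thread pool to stand in for agents and does not use any Xgrid API. The function names and the example workload (a sum of squares) are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(items, task_size):
    """Controller role: divide the job's input into fixed-size tasks."""
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]


def run_task(chunk):
    """Agent role: perform the calculation for one task."""
    return sum(x * x for x in chunk)


def run_job(items, task_size, agent_count):
    """Distribute tasks to `agent_count` workers and aggregate the results."""
    tasks = split_into_tasks(items, task_size)
    with ThreadPoolExecutor(max_workers=agent_count) as agents:
        partial_results = list(agents.map(run_task, tasks))
    return sum(partial_results)  # controller aggregates before returning


if __name__ == "__main__":
    print(run_job(list(range(10)), task_size=3, agent_count=2))
```

The example works because each task is independent of the others, so the partial results can be combined in any order. Jobs that need no communication between tasks are the kind of workload this flow suits.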
For more information about jobs, see “Structuring Jobs for Xgrid” on page 47 and “Submitting a Job” on page 48.

Requirements and Capacities

Xgrid is designed to scale from small clusters of a few computers up to large organization-wide grids. Xgrid supports up to 128 agents, any number of jobs comprising up to 100,000 queued tasks, up to 128 MB of submitted data per job, and up to 128 MB of results per job. These are recommended limits and are not enforced by the software. You may choose to exceed these limits at your own risk.
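Because the software does not enforce these limits, a planning script can flag a grid design that exceeds them. The sketch below is a hypothetical helper, not an Xgrid tool. The dictionary keys and function name are invented for the example, and the limit values are copied from the paragraph above.

```python
# Recommended (not enforced) limits as stated in this section.
RECOMMENDED_LIMITS = {
    "agents": 128,
    "queued_tasks": 100_000,
    "submitted_mb_per_job": 128,
    "results_mb_per_job": 128,
}


def exceeded_limits(plan):
    """Return the sorted names of recommended limits that `plan` exceeds.

    `plan` is a dict using the same (hypothetical) keys as
    RECOMMENDED_LIMITS; missing keys are treated as zero.
    """
    return sorted(
        name for name, limit in RECOMMENDED_LIMITS.items()
        if plan.get(name, 0) > limit
    )


if __name__ == "__main__":
    print(exceeded_limits({"agents": 200, "queued_tasks": 5_000}))
```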
2 Setting Up and Configuring Xgrid Service
Use this chapter to plan your grid and set up the Xgrid agent and controller.
Xgrid simplifies deployment and management of computational grids. Using Server Admin you can configure Xgrid to set up computer groups (grids or clusters) and allow users to easily submit complex computations to these grids (local, remote, or both), as either an ad hoc grid or a centrally managed cluster.

Setup Overview

Here is an overview of the steps for setting up Xgrid service:
Step 1: Before you begin
See “Before Setting Up Xgrid Service” on page 26. Identify the Xgrid environment you need. Before configuring Xgrid, you must make some decisions about the grid.
Step 2: Turn Xgrid service on
Prior to configuring, turn on Xgrid service. See “Turning Xgrid Service On” on page 28.
Step 3: (Optional) Use the Xgrid service configuration assistant to configure Xgrid
If you choose to, you can configure Xgrid using the Xgrid service configuration assistant. This assistant helps with Xgrid configuration by automating many of the settings you make. See “Configuring Xgrid with the Xgrid Service Configuration Assistant” on page 28.
Step 4: Configure Xgrid controller settings
Configure your server as an Xgrid controller using Server Admin. See “Configuring Controller Settings” on page 30.
Step 5: Start Xgrid service
Start Xgrid service on the server using Server Admin. See “Starting Xgrid Service” on page 31.
Step 6: Configure Xgrid agent settings (Mac OS X Server)
Configure your server as an Xgrid agent using Server Admin. See “Configuring an Xgrid Agent (Mac OS X Server)” on page 32.
Step 7: Configure Xgrid agent settings (Mac OS X)
Configure computers as Xgrid agents by using Sharing Preferences. See “Configuring an Xgrid Agent (Mac OS X)” on page 33.

Before Setting Up Xgrid Service

Before configuring Xgrid service, you must define the grid environment you’ll create. In particular, you must decide the following:
 The kind of authentication to use. See “Authentication Methods for Xgrid” on
page 26.
 Where to host your controller. See “Hosting the Grid Controller” on page 28.  How you will manage the controller. See “Managing Xgrid Service” on page 37 and
“Monitoring Grid Activity” on page 46.

Authentication Methods for Xgrid

You can configure Xgrid with or without authentication. If you require controllers to mutually authenticate with clients and agents, you can choose Single Sign-On or Password-Based Authentication. The following authentication options are available:
 Single Sign-On  Password-Based Authentication  No Authentication
You set up an Xgrid controller using Server Admin. You can specify the type of authentication for agents and clients. The passwords entered in Server Admin for the controller must match those entered for each agent and client.
Consider these points when establishing passwords for agents and clients:
• Kerberos authentication (single sign-on or SSO). If you use Kerberos authentication for agents or clients, the server that’s the Xgrid controller must be configured for Kerberos, in the same realm as the server running the Kerberos Key Distribution Center (KDC), and bound to the Open Directory master.
The agent uses the host principal found in the /etc/krb5.keytab file. The controller uses the Xgrid service principal found in the /etc/krb5.keytab file.
• Agents. The agent determines the authentication method. The controller must conform to that method and password (if a password is used). When an agent is configured with a standard password (not SSO), you must use the same password for agents when you configure the controller. If the agent has specified SSO, the correct service principal and host principals must be available.
• Clients. If your server is the controller for a grid, be sure that Mac OS X and Mac OS X Server clients use the correct authentication method for the controller.
A client cannot submit a job to the controller unless the user chooses the correct authentication method and enters their password correctly, or has the correct ticket-granting ticket from Kerberos.
For more information, see “Setting Up Grid Authentication” on page 34.

Single Sign-On (SSO)

SSO is the most powerful and flexible form of authentication. It leverages the Open Directory and Kerberos infrastructures in Mac OS X Server to manage authentication behind the scenes, without user intervention.
Each Xgrid participant must have a Kerberos principal. Each client and agent obtains a ticket-granting ticket for its principal, which it uses to obtain a service ticket for the controller service principal. The controller looks at the ticket granted to the client to determine the user’s principal and verifies it with the relevant service access control lists (SACLs) and groups to determine privileges.
Generally, you should use this option if any of the following conditions are true:
 You already have SSO in your environment.  You have administrator control over all agents and clients in use.  Jobs must run with special privileges (such as for local, network, or SAN file system
access).

Password-Based Authentication

When you can’t use SSO, you can require password authentication. You may not be able to use SSO if:
 Potential Xgrid clients are not trusted by your SSO domain (or you don’t have one)  You want to use agents across the Internet or that are outside your control  It is an ad hoc grid, without the ability to prearrange a web of trust
In these situations, your best option is to specify a password. You have two distinct password settings: one for controller-client and one for controller-agent. For security reasons these should be different passwords.
Note: You can also create hybrid environments, such as with client-controller authentication done using passwords but controller-agent authentication done using SSO (or vice versa).

No Authentication

This option is suitable only for testing a private network in a home or a lab that is inaccessible from any untrusted computer, or when none of the jobs or the computers contain sensitive or important information.
Otherwise, do not use this option. It creates a potential security hole (because anyone can connect or run a job) and should never be used on a system exposed to the Internet, especially when potentially sensitive data is involved.
If you choose to use no authentication, agents can join the grid and clients can submit jobs to the grid without authenticating.
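The guidance in the preceding authentication sections can be summarized as a small configuration check. This is a hedged sketch only: the Link class, the method names "sso", "password", and "none", and the review_auth function are hypothetical and do not correspond to any Xgrid or Server Admin interface. The check reflects three points made above: hybrid combinations are allowed, the controller-client and controller-agent passwords should differ, and no authentication is suitable only for isolated test networks.

```python
from dataclasses import dataclass
from typing import Optional

METHODS = {"sso", "password", "none"}  # hypothetical labels for the three options


@dataclass
class Link:
    """Authentication settings for one side of the controller (clients or agents)."""
    method: str
    password: Optional[str] = None


def review_auth(client_link, agent_link):
    """Return a list of warnings for a proposed controller authentication setup."""
    warnings = []
    for label, link in (("controller-client", client_link),
                        ("controller-agent", agent_link)):
        if link.method not in METHODS:
            raise ValueError(f"{label}: unknown method {link.method!r}")
        if link.method == "password" and not link.password:
            warnings.append(f"{label}: password authentication selected but no password set")
        if link.method == "none":
            warnings.append(f"{label}: no authentication; suitable only for isolated test networks")
    # The two password settings are distinct and should not share a value.
    if (client_link.method == agent_link.method == "password"
            and client_link.password
            and client_link.password == agent_link.password):
        warnings.append("client and agent passwords should be different")
    return warnings


# Hybrid setup: passwords for clients, SSO for agents. No warnings are expected.
print(review_auth(Link("password", "client-secret"), Link("sso")))  # []
```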

Hosting the Grid Controller

The primary requirement for a controller is that it must be network-accessible to clients and agents. In some cases this may mean the controller must be placed outside an organizational firewall (or inside a buffer zone); otherwise, you would need to open up port 4111 so the controller can be contacted.
It is much simpler (though not essential) for the controller to be on the same subnet as the agents and usual clients, so they can discover each other using Bonjour. If that’s not feasible, host the controller on a server with a fixed IP address and fully qualified DNS name (or alternatively, using Dynamic DNS and a service lookup entry) so that agents and clients know where to find it.
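Because clients and agents must be able to reach the controller on port 4111, a quick TCP connection test from a prospective agent or client can confirm that firewalls allow the traffic. The following is a minimal sketch using Python's standard socket module. The function name is hypothetical, and a successful connection shows only that the port is open, not that the listener is an Xgrid controller or that authentication will succeed.

```python
import socket

XGRID_PORT = 4111  # controller port named in the text above


def controller_reachable(host, port=XGRID_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, calling controller_reachable("controller.example.com") from an agent computer (the hostname here is a placeholder) returns False if a firewall blocks port 4111 or if the name cannot be resolved.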

Turning Xgrid Service On

Before you can configure Xgrid settings, you must turn Xgrid service on in Server Admin.
To turn Xgrid service on:
1 Open Server Admin and connect to the server.
2 Click Settings.
3 Click Services.
4 Select the Xgrid checkbox.
5 Click Save.

Configuring Xgrid with the Xgrid Service Configuration Assistant

You can set up Xgrid service by configuring the controller and agent using the Xgrid service configuration assistant. This optional configuration assistant guides you through setting up a server to host a grid or join an existing grid.
Before this assistant proceeds, your server must have access to a directory server that provides Kerberos services.

Configuring Xgrid to Host a Grid Using the Xgrid Service Configuration Assistant

Use the Xgrid service configuration assistant to configure the Xgrid agent and controller to run on this server. This also configures a network file system.
To set up Xgrid to host a grid using the Xgrid service configuration assistant:
1 Open Server Admin and connect to the server.
2 Click the triangle to the left of the server.
The list of services appears.
3 In the expanded Servers list, click Xgrid.
4 Click Overview.
5 Click Configure Xgrid Service (at the lower right).
This opens the Xgrid service configuration assistant.
6 Click Continue.
7 Choose “Host a grid,” then click Continue.
8 Enter the username and password for the directory administrator to authenticate with
the directory domain displayed, then click Continue.
9 Review and confirm your configuration settings, then click Continue.
This restarts Xgrid service using your settings.
10 Click Close.

Configuring Xgrid to Join a Grid Using the Xgrid Service Configuration Assistant

Use the Xgrid service configuration assistant to configure the Xgrid agent to run on this server. Joining a grid means that an agent is set up on this server and is bound to an existing controller.
To join a grid using the Xgrid service configuration assistant:
1 Open Server Admin and connect to the server.
2 Click the triangle to the left of the server.
The list of services appears.
3 In the expanded Servers list, click Xgrid.
4 Click Overview.
5 Click Configure Xgrid Service (at the lower right).
This opens the Xgrid service configuration assistant.
6 Click Continue.
7 Choose “Join a grid,” then click Continue.
8 Specify the controller you want to bind your agent to.
Select “Browse Bonjour-discoverable controllers” to view and select from available controllers.
Select “Use controller with hostname” to enter the hostname of a specific controller.
9 Click Continue.
10 Review and confirm your configuration settings, then click Continue.
This restarts Xgrid service using your settings.
11 Click Close.

Setting Up Xgrid Service

You set up Xgrid service by configuring two groups of settings on the Settings pane for Xgrid service in Server Admin:
 Controller. Use to configure your server as an Xgrid controller and set client and
agent authentication.
 Agent. Use to configure your server as an Xgrid agent, to specify the controller, and
to set controller authentication.
The following section describes how to configure these settings. An additional section tells you how to start Xgrid service when you finish. (By default, the Xgrid controller and agent are disabled.)
Important: If you specify a password, the agent and controller must use the same
password or must authenticate using Kerberos (SSO). For information about authentication options, see “Setting Passwords for Xgrid” on page 34.

Xgrid and Multiple Network Interfaces

On a server with multiple network interfaces, Mac OS X Server makes Xgrid service available over all interfaces. You can’t configure Xgrid service separately for each interface.

Configuring Controller Settings

You use Server Admin to configure an Xgrid controller. When configuring the controller, you can also set a password for any agent using the grid and for any client that submits a job to the grid.
To configure an Xgrid controller:
1 Open Server Admin and connect to the server.
2 Click the triangle to the left of the server.
The list of services appears.
3 In the expanded Servers list, click Xgrid.