Product specifications are subject to change without notice and do not represent a commitment on the part of Avid Technology, Inc.
This product is subject to the terms and conditions of a software license agreement provided with the software. The product may
only be used in accordance with the license agreement.
This product may be protected by one or more U.S. and non-U.S. patents. Details are available at www.avid.com/patents.
This document is protected under copyright law. An authorized licensee of Interplay Central may reproduce this publication for the
licensee’s own use in learning how to use the software. This document may not be reproduced or distributed, in whole or in part, for
commercial purposes, such as selling copies of this document or providing support or educational services to others. This document
is supplied as a guide for Interplay Central. Reasonable care has been taken in preparing the information it contains. However, this
document may contain omissions, technical inaccuracies, or typographical errors. Avid Technology, Inc. does not accept
responsibility of any kind for customers’ losses due to the use of this document. Product specifications are subject to change without
notice.
The following disclaimer is required by Apple Computer, Inc.:
APPLE COMPUTER, INC. MAKES NO WARRANTIES WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THIS
PRODUCT, INCLUDING WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR ITS FITNESS FOR ANY PARTICULAR
PURPOSE. THE EXCLUSION OF IMPLIED WARRANTIES IS NOT PERMITTED BY SOME STATES. THE ABOVE EXCLUSION
MAY NOT APPLY TO YOU. THIS WARRANTY PROVIDES YOU WITH SPECIFIC LEGAL RIGHTS. THERE MAY BE OTHER
RIGHTS THAT YOU MAY HAVE WHICH VARY FROM STATE TO STATE.
The following disclaimer is required by Sam Leffler and Silicon Graphics, Inc. for the use of their TIFF library:
Permission to use, copy, modify, distribute, and sell this software [i.e., the TIFF library] and its documentation for any purpose is
hereby granted without fee, provided that (i) the above copyright notices and this permission notice appear in all copies of the
software and related documentation, and (ii) the names of Sam Leffler and Silicon Graphics may not be used in any advertising or
publicity relating to the software without the specific, prior written permission of Sam Leffler and Silicon Graphics.
THE SOFTWARE IS PROVIDED “AS-IS” AND WITHOUT WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR OTHERWISE,
INCLUDING WITHOUT LIMITATION, ANY WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
IN NO EVENT SHALL SAM LEFFLER OR SILICON GRAPHICS BE LIABLE FOR ANY SPECIAL, INCIDENTAL, INDIRECT OR
CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY THEORY OF LIABILITY, ARISING
OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
The following disclaimer is required by the Independent JPEG Group:
This software is based in part on the work of the Independent JPEG Group.
This Software may contain components licensed under the following conditions:
Copyright (c) 1989 The Regents of the University of California. All rights reserved.
Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and
use acknowledge that the software was developed by the University of California, Berkeley. The name of the University may not be
used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS
PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
Copyright (C) 1989, 1991 by Jef Poskanzer.
Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice
appear in supporting documentation. This software is provided "as is" without express or implied warranty.
Copyright 1995, Trinity College Computing Center. Written by David Chappell.
Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice
appear in supporting documentation. This software is provided "as is" without express or implied warranty.
Copyright 1996 Daniel Dardailler.
Permission to use, copy, modify, distribute, and sell this software for any purpose is hereby granted without fee, provided that the
above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting
documentation, and that the name of Daniel Dardailler not be used in advertising or publicity pertaining to distribution of the software
without specific, written prior permission. Daniel Dardailler makes no representations about the suitability of this software for any
purpose. It is provided "as is" without express or implied warranty.
Modifications Copyright 1999 Matt Koss, under the same license as above.
Copyright (c) 1991 by AT&T.
Permission to use, copy, modify, and distribute this software for any purpose without fee is hereby granted, provided that this entire
notice is included in all copies of any software which is or includes a copy or modification of this software and in all copies of the
supporting documentation for such software.
THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT ANY EXPRESS OR IMPLIED WARRANTY. IN PARTICULAR,
NEITHER THE AUTHOR NOR AT&T MAKES ANY REPRESENTATION OR WARRANTY OF ANY KIND CONCERNING THE
MERCHANTABILITY OF THIS SOFTWARE OR ITS FITNESS FOR ANY PARTICULAR PURPOSE.
This product includes software developed by the University of California, Berkeley and its contributors.
The following disclaimer is required by Paradigm Matrix:
Portions of this software licensed from Paradigm Matrix.
The following disclaimer is required by Ray Sauers Associates, Inc.:
“Install-It” is licensed from Ray Sauers Associates, Inc. End-User is prohibited from taking any action to derive a source code
equivalent of “Install-It,” including by reverse assembly or reverse compilation, Ray Sauers Associates, Inc. shall in no event be liable
for any damages resulting from reseller’s failure to perform reseller’s obligation; or any damages arising from use or operation of
reseller’s products or the software; or any other damages, including but not limited to, incidental, direct, indirect, special or
consequential Damages including lost profits, or damages resulting from loss of use or inability to use reseller’s products or the
software for any reason including copyright or patent infringement, or lost data, even if Ray Sauers Associates has been advised,
knew or should have known of the possibility of such damages.
The following disclaimer is required by Videomedia, Inc.:
“Videomedia, Inc. makes no warranties whatsoever, either express or implied, regarding this product, including warranties with
respect to its merchantability or its fitness for any particular purpose.”
“This software contains V-LAN ver. 3.0 Command Protocols which communicate with V-LAN ver. 3.0 products developed by
Videomedia, Inc. and V-LAN ver. 3.0 compatible products developed by third parties under license from Videomedia, Inc. Use of this
software will allow “frame accurate” editing control of applicable videotape recorder decks, videodisc recorders/players and the like.”
The following disclaimer is required by Altura Software, Inc. for the use of its Mac2Win software and Sample Source
Code:
Portions relating to gdttf.c copyright 1999, 2000, 2001, 2002 John Ellson (ellson@lucent.com).
Portions relating to gdft.c copyright 2001, 2002 John Ellson (ellson@lucent.com).
Portions relating to JPEG and to color quantization copyright 2000, 2001, 2002, Doug Becker and copyright (C) 1994, 1995, 1996,
1997, 1998, 1999, 2000, 2001, 2002, Thomas G. Lane. This software is based in part on the work of the Independent JPEG Group.
See the file README-JPEG.TXT for more information. Portions relating to WBMP copyright 2000, 2001, 2002 Maurice Szmurlo and
Johan Van den Brande.
Permission has been granted to copy, distribute and modify gd in any context without fee, including a commercial application,
provided that this notice is present in user-accessible supporting documentation.
This does not affect your ownership of the derived work itself, and the intent is to assure proper credit for the authors of gd, not to
interfere with your productive use of gd. If you have questions, ask. "Derived works" includes all programs that utilize the library.
Credit must be given in user-accessible documentation.
This software is provided "AS IS." The copyright holders disclaim all warranties, either express or implied, including but not limited to
implied warranties of merchantability and fitness for a particular purpose, with respect to this code and accompanying
documentation.
Although their code does not appear in gd, the authors wish to thank David Koblas, David Rowley, and Hutchison Avenue Software
Corporation for their prior contributions.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (http://www.openssl.org/)
Interplay Central may use OpenLDAP. Copyright 1999-2003 The OpenLDAP Foundation, Redwood City, California, USA. All Rights
Reserved. OpenLDAP is a registered trademark of the OpenLDAP Foundation.
Avid Interplay Pulse enables its users to access certain YouTube functionality, as a result of Avid's licensed use of YouTube's API.
The charges levied by Avid for use of Avid Interplay Pulse are imposed by Avid, not YouTube. YouTube does not charge users for
accessing YouTube site functionality through the YouTube APIs.
Avid Interplay Pulse uses the bitly API, but is neither developed nor endorsed by bitly.
Attn. Government User(s). Restricted Rights Legend
U.S. GOVERNMENT RESTRICTED RIGHTS. This Software and its documentation are “commercial computer software” or
“commercial computer software documentation.” In the event that such Software or documentation is acquired by or on behalf of a
unit or agency of the U.S. Government, all rights with respect to this Software and documentation are subject to the terms of the
License Agreement, pursuant to FAR §12.212(a) and/or DFARS §227.7202-1(a), as applicable.
Trademarks
003, 192 Digital I/O, 192 I/O, 96 I/O, 96i I/O, Adrenaline, AirSpeed, ALEX, Alienbrain, AME, AniMatte, Archive, Archive II, Assistant
Station, AudioPages, AudioStation, AutoLoop, AutoSync, Avid, Avid Active, Avid Advanced Response, Avid DNA, Avid DNxcel, Avid
DNxHD, Avid DS Assist Station, Avid Ignite, Avid Liquid, Avid Media Engine, Avid Media Processor, Avid MEDIArray, Avid Mojo, Avid
Remote Response, Avid Unity, Avid Unity ISIS, Avid VideoRAID, AvidRAID, AvidShare, AVIDstripe, AVX, Beat Detective, Beauty
Without The Bandwidth, Beyond Reality, BF Essentials, Bomb Factory, Bruno, C|24, CaptureManager, ChromaCurve,
ChromaWheel, Cineractive Engine, Cineractive Player, Cineractive Viewer, Color Conductor, Command|24, Command|8,
Control|24, Cosmonaut Voice, CountDown, d2, d3, DAE, D-Command, D-Control, Deko, DekoCast, D-Fi, D-fx, Digi 002, Digi 003,
DigiBase, Digidesign, Digidesign Audio Engine, Digidesign Development Partners, Digidesign Intelligent Noise Reduction,
Digidesign TDM Bus, DigiLink, DigiMeter, DigiPanner, DigiProNet, DigiRack, DigiSerial, DigiSnake, DigiSystem, Digital
Choreography, Digital Nonlinear Accelerator, DigiTest, DigiTranslator, DigiWear, DINR, DNxchange, Do More, DPP-1, D-Show, DSP
Manager, DS-StorageCalc, DV Toolkit, DVD Complete, D-Verb, Eleven, EM, Euphonix, EUCON, EveryPhase, Expander,
ExpertRender, Fader Pack, Fairchild, FastBreak, Fast Track, Film Cutter, FilmScribe, Flexevent, FluidMotion, Frame Chase, FXDeko,
HD Core, HD Process, HDpack, Home-to-Hollywood, HYBRID, HyperSPACE, HyperSPACE HDCAM, iKnowledge, Image
Independence, Impact, Improv, iNEWS, iNEWS Assign, iNEWS ControlAir, InGame, Instantwrite, Instinct, Intelligent Content
Management, Intelligent Digital Actor Technology, IntelliRender, Intelli-Sat, Intelli-sat Broadcasting Recording Manager, InterFX,
Interplay, inTONE, Intraframe, iS Expander, iS9, iS18, iS23, iS36, ISIS, IsoSync, LaunchPad, LeaderPlus, LFX, Lightning, Link &
Sync, ListSync, LKT-200, Lo-Fi, MachineControl, Magic Mask, Make Anything Hollywood, make manage move | media, Marquee,
MassivePack, Massive Pack Pro, Maxim, Mbox, Media Composer, MediaFlow, MediaLog, MediaMix, Media Reader, Media
Recorder, MEDIArray, MediaServer, MediaShare, MetaFuze, MetaSync, MIDI I/O, Mix Rack, Moviestar, MultiShell, NaturalMatch,
NewsCutter, NewsView, NewsVision, Nitris, NL3D, NLP, NSDOS, NSWIN, OMF, OMF Interchange, OMM, OnDVD, Open Media
Framework, Open Media Management, Painterly Effects, Palladium, Personal Q, PET, Podcast Factory, PowerSwap, PRE,
ProControl, ProEncode, Profiler, Pro Tools, Pro Tools|HD, Pro Tools LE, Pro Tools M-Powered, Pro Transfer, QuickPunch,
QuietDrive, Realtime Motion Synthesis, Recti-Fi, Reel Tape Delay, Reel Tape Flanger, Reel Tape Saturation, Reprise, Res Rocket
Surfer, Reso, RetroLoop, Reverb One, ReVibe, Revolution, rS9, rS18, RTAS, Salesview, Sci-Fi, Scorch, ScriptSync,
SecureProductionEnvironment, Serv|GT, Serv|LT, Shape-to-Shape, ShuttleCase, Sibelius, SimulPlay, SimulRecord, Slightly Rude
Compressor, Smack!, Soft SampleCell, Soft-Clip Limiter, SoundReplacer, SPACE, SPACEShift, SpectraGraph, SpectraMatte,
SteadyGlide, Streamfactory, Streamgenie, StreamRAID, SubCap, Sundance, Sundance Digital, SurroundScope, Symphony, SYNC
HD, SYNC I/O, Synchronic, SynchroScope, Syntax, TDM FlexCable, TechFlix, Tel-Ray, Thunder, TimeLiner, Titansync, Titan, TL
Aggro, TL AutoPan, TL Drum Rehab, TL Everyphase, TL Fauxlder, TL In Tune, TL MasterMeter, TL Metro, TL Space, TL Utilities,
tools for storytellers, Transit, TransJammer, Trillium Lane Labs, TruTouch, UnityRAID, Vari-Fi, Video the Web Way, VideoRAID,
VideoSPACE, VTEM, Work-N-Play, Xdeck, X-Form, Xmon and XPAND! are either registered trademarks or trademarks of Avid
Technology, Inc. in the United States and/or other countries.
Adobe and Photoshop are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or
other countries. Apple and Macintosh are trademarks of Apple Computer, Inc., registered in the U.S. and other countries. Windows
is either a registered trademark or trademark of Microsoft Corporation in the United States and/or other countries. All other
trademarks contained herein are the property of their respective owners.
Avid Interplay Central Services — Service and Server Clustering Overview • XXXX-XXXXXX-XX Rev A • May 2014
• Created 5/27/14 • This document is distributed by Avid in online (electronic) form only, and is not available for
purchase in printed form.
This guide is intended for the person responsible for installing, maintaining, or administering a cluster of Avid Interplay Central Services (ICS) servers. It provides background and technical information on clustering in ICS, along with an inventory of ICS services and instructions on how to interact with them for maintenance purposes. It also explains the specifics of an ICS cluster, describes how each service operates in a cluster, and provides guidance on best practices for cluster administration. Its aim is to provide a level of technical proficiency to the person charged with installing, maintaining, or troubleshooting an ICS cluster.
For a general introduction to Interplay Central Services, including ICS installation and clustering
steps, see the Avid Interplay Central Services Installation and Configuration Guide. For
administrative information for Interplay Central, see the Avid Interplay Central Administration Guide.
1 Overview
Interplay Central Services (ICS) is a collection of software services running on a server,
supplying interfaces, video playback and other services to Avid Solutions including Interplay
Central, Interplay Sphere, Interplay MAM and mobile applications. A cluster is a collection of
servers that have ICS and additional clustering infrastructure installed. The cluster is configured
to appear to the outside world as a single server. The primary advantages of a cluster are high availability and additional playback capacity.
High availability is obtained through automatic failover of services from one node to another.
This can be achieved with a cluster of just two servers, a primary (master) and secondary (slave).
All ICS services run on the primary. Key ICS services also run on the secondary node. In the
event that a service fails on the primary node, the secondary node automatically takes over,
without the need for human intervention.
When additional capacity is the aim, multiple additional servers can be added to the master-slave
cluster. In this case, playback requests (as always, fielded by the master) are automatically
distributed to the available servers, which perform the tasks of transcoding and serving the
transcoded media. This is referred to as load-balancing. A load-balanced cluster provides better
performance for a deployment supporting multiple, simultaneous users or connections.
An additional benefit of a load-balanced cluster is cache replication, in which media transcoded
by one server is immediately distributed to all the other nodes in the cluster. If another node
receives the same playback request, the material is immediately available without the need for
further effort.
In summary, an ICS server cluster provides the following:
•Redundancy/High-availability. If any node in the cluster fails, connections to that node are
automatically redirected to another node.
•Scale/Load balancing. All incoming playback connections are routed to a single cluster IP
address, and are subsequently distributed evenly to the nodes in the cluster.
•Replicated Cache. The media transcoded by one node in the cluster is automatically
replicated on the other nodes. If another node receives the same playback request, the media
is immediately available without the need to re-transcode.
•Cluster monitoring. A cluster resource monitor lets you actively monitor the status of the cluster. In addition, if a node fails (or any other serious problem is detected), e-mail is automatically sent to one or more e-mail addresses (see the monitoring sketch below).
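As a minimal sketch of command-line monitoring (assuming the Pacemaker/Corosync stack described later in this guide, run as root on any cluster node; e-mail notification itself is configured separately):

# Show the current cluster status once and exit (nodes, resources, failed actions)
crm_mon -1
# Monitor continuously, including per-resource fail counts
crm_mon -f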
Single Server Deployment
In a single server deployment, all ICS services (including the playback service) run on the same
server. This server also holds the ICS database and the RAID 5 file cache. Since there is only one
server, all tasks, including transcoding, are performed by the same machine. The single server
has a host name and IP address. This is used, for example, by Interplay Central users to connect
directly to it using a web browser.
The following diagram illustrates a typical single-server deployment.
Cluster Deployment
In a basic deployment, a cluster consists of one master-slave pair of nodes configured for high-availability. Typically, other nodes are also present, in support of load-balanced transcoding and playback. As in a single node deployment, all ICS traffic is routed through a single node: the master, which runs all ICS services. Key ICS services and databases are replicated on the slave node (some actively running, some in "standby" mode), which is ready to assume the role of master at any time.
Playback requests, handled by the ICPS playback service, are distributed by the master to all
available nodes. The load-balancing nodes perform transcoding, but do not participate in
failovers; that is, without human intervention, they can never take on the role of master or slave.
An interesting difference in a cluster deployment is at the level of IP address. In a single server
deployment, each server owns its host name and IP address, which is used, for example, by
Interplay Central users to connect using a web browser. In contrast, a cluster has a virtual IP
address (and a corresponding host name) defined at the DNS level. Interplay Central users enter
the cluster IP address or host name in the browser's address bar, not the name of an individual
server. The cluster redirects the connection request to one of the servers in the cluster, which
remains hidden from view.
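For illustration only (the interface name and addresses below are hypothetical), you can see which node currently owns the cluster's virtual IP address by listing the addresses on each node; the master carries the cluster address in addition to its own:

# Run on each node; the master shows the virtual cluster IP as an extra address
ip addr show eth0
#   inet 192.168.10.51/24 ...   <- the node's own address (example)
#   inet 192.168.10.50/24 ...   <- the cluster's virtual address (example)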
The following diagram illustrates a typical cluster deployment.
How a Failover Works
Failovers in ICS operate at two distinct levels: service, and node. A cluster monitor oversees both
levels. A service that fails is swiftly restarted by the cluster monitor, which also tracks the
service's fail count. If the service fails too often (or cannot be restarted), the cluster monitor gives
responsibility for the service to another node in the cluster, in a process referred to as a failover.
A service restart in itself is not enough to trigger a failover. A failover occurs when the fail count
for the service reaches the threshold value.
The node on which the service failed remains in the cluster, but no longer performs the duties
that have failed. Until you manually reset the fail count, the failed service will not be restarted.
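A minimal sketch of inspecting and clearing a fail count with the crm shell, assuming a Pacemaker resource named AvidIPC and a node named ics-node2 (both names are examples; your resource and node names may differ):

# Show the fail count for the AvidIPC resource on node ics-node2
crm resource failcount AvidIPC show ics-node2
# Clean up the resource, which also resets its fail count so it can run again
crm resource cleanup AvidIPC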
In order to achieve this state of high-availability, one node in the cluster is assigned the role of
master node. It runs all the key ICS services. The master node also owns the cluster IP address.
Thus all incoming requests come directly to this node and are serviced by it. This is shown in the
following illustration:
Should any of the key ICS services running on the master node fail without recovery (or reach the failure threshold), the node is automatically taken out of the cluster and another node takes on the role of master node. The node that takes over inherits the cluster IP address, and its own ICS services (previously in standby) become fully active. From this point, the new master receives all incoming requests. You must intervene manually to determine the cause of the fault on the failed node and restore it to service promptly.
In a correctly sized cluster, a single node can fail and the cluster will properly service its users.
Note: However, if two nodes fail, the resulting cluster is likely under-provisioned for expected use and will be oversubscribed.
This failover from master to slave is shown in the following illustration.
How Load Balancing Works
In ICS the video playback service is load-balanced, meaning incoming video playback requests
are distributed evenly across all nodes in the cluster. This can be done because the Interplay
Central Playback Service (ICPS) is actively running on all nodes in the cluster concurrently. A
load-balancing algorithm controlled by the master node monitors the clustered nodes, and
determines which node gets the job.
The exception is the master node, which is treated differently. A portion of its CPU capacity is reserved for the duties performed by the master node alone, which include serving the UI, handling logins and user session information, and so on. When the system is under heavy usage, the master node does not take on additional playback jobs.
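As a sketch (the node names are examples), you can confirm that the load-balanced playback services are running on every node in the cluster:

# Check the playback back-end and the load-balancing service on each node
for node in ics-node1 ics-node2 ics-node3; do
    echo "--- $node ---"
    ssh root@$node "service avid-all status; service avid-icps-manager status"
done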
The following illustration shows a typical load-balanced cluster. The colored lines indicate that
playback jobs are sent to different nodes in the cluster. They are not meant to indicate a particular
client is bound to a particular node for its entire session, which may not be the case. Notice the
master node’s bandwidth preservation.
The next illustration shows a cluster under heavy usage. As illustrated, CPU usage on the master
node will not exceed a certain amount, even when the other nodes approach saturation.
2 System Architecture
ICS features messaging systems, cluster management infrastructure, user management services,
and so on. Many are interdependent, but they are nevertheless best initially understood as
operating at logically distinct layers of the architecture, as shown in the following illustration.
The following table explains the role of each layer:
Web-Enabled Applications: At the top of the food chain are the web-enabled client applications that take advantage of the ICS cluster. These include Interplay Central, Interplay MAM, Sphere and the iOS apps.

Cluster IP Address: All client applications gain access to ICS via the cluster IP address. This is a virtual address, established at the network level in DNS and owned by the node that is currently master. In the event of a failover, ownership of the cluster IP address is transferred to the slave node. The dotted line in the illustration indicates that it is Corosync that manages ownership of the cluster IP address. For example, during a failover, it is Corosync that transfers ownership of the cluster IP address from the master node to the slave node.

Node IP Addresses: While the cluster is seen from the outside as a single machine with one IP address and host name, it is important to note that all the nodes within the cluster retain individual host names and IP addresses. Network level firewalls and switches must allow the nodes to communicate with one another.

Top-Level Services: At the top level of the service layer are the ICS services running on the master node only. These include:
•IPC - Interplay Central core services (aka "middleware")
•ACS - Avid Common Service bus (aka "the bus") for configuration and messaging; uses RabbitMQ
The dotted line in the illustration indicates that the top-level services communicate with one another via ACS, which, in turn, uses RabbitMQ.

Load Balancing Services: The mid-level service layer includes the ICS services that are load-balanced. These services run on all nodes in the cluster.
•ICPS - Interplay Central Playback Services: transcodes and serves transcoded media.
•AvidAll - Encapsulates all other ICPS back-end services.

Databases: The mid-level service layer also includes two databases:
•PostgreSQL: Stores data for several ICS services (UMS, ACS, ICPS, Pulse).
•MongoDB: Stores data related to ICS messaging.
Both these databases are synchronized from master to slave for failover readiness.

RabbitMQ Message Queue: RabbitMQ is the message broker ("task queue") used by the ICS top-level services. RabbitMQ maintains its own independent clustering system; that is, RabbitMQ is not managed by Pacemaker. This allows RabbitMQ to continue delivering service requests to underlying services in the event of a failure.

Filesystem: The standard Linux filesystem. This layer also conceptually includes GlusterFS, the Gluster "network filesystem" used for cache replication. GlusterFS performs its replication at the file level. Unlike the Linux filesystem, GlusterFS operates in "user space", the advantage being that a GlusterFS malfunction does not bring down the system.

DRBD: Distributed Replicated Block Device (DRBD) is responsible for volume mirroring. DRBD replicates and synchronizes the system disk's logical volume containing the PostgreSQL and MongoDB databases across the master and slave, for failover readiness. DRBD carries out replication at the block level.

Pacemaker: The cluster resource manager. Resources are collections of services grouped together for oversight by Pacemaker. Pacemaker sees and manages resources, not individual services.

Corosync and Heartbeat: Corosync and Heartbeat are the clustering infrastructure. Corosync uses a multicast address to communicate with the other nodes in the cluster. Heartbeat contains Open Cluster Framework (OCF) compliant scripts used by Corosync for communication within the cluster.

Hardware: At the lowest layer is the server hardware. It is at the hardware level that the system disk is established in a RAID 1 (mirror) configuration. Note that this is distinct from the replication of a particular volume by DRBD. The RAID 1 mirror protects against disk failure. The DRBD mirror protects against node failure.
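A quick, hedged way to look at the Corosync and Pacemaker layers from the command line (output details vary by version, and the resource names shown by the second command are site-specific):

# Verify the Corosync ring used for communication between the nodes
corosync-cfgtool -s
# List the resources Pacemaker is managing, including the cluster IP resource
crm configure show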
Disk and File System Layout
It is helpful to have an understanding of a node's disk and filesystem layout. The following
illustration represents the layout of a typical node:
Note: In ICS 1.5, a RAID 5 cache was required for multi-cam, iOS, and MAM non-H.264 systems only. As of ICS 1.6, a separate cache is required, but it does not always need to be RAID 5.
The following list presents the contents of each volume:
•sda1 (physical volume): mounted at /boot; RHEL boot partition.
•sda2 (physical volume): exposed as /dev/drbd1; ICS databases.
•sda3 (physical volume), volume group icps: logical volume swap (swap space) and logical volume root (/dev/dm-0), mounted at /; RHEL system partition.
•sdb1 (physical volume), volume group icscache: mounted at /cache; ICS file cache.
Note the following:
•sda1 is a standard Linux partition created by RHEL during installation of the operating
system
•sda2 is a dedicated volume created for the PostgreSQL (UMS, ACS, ICPS, Pulse) and MongoDB (ICS messaging) databases. The sda2 partition is replicated and synchronized between master and slave by DRBD.
•sda3 contains the system swap disk and the root partition.
•sdb1 is the RAID 5 cache volume used to store transcoded media and various other
temporary files.
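To relate the layout above to a running node, the standard Linux tools can be used (a sketch; device and volume group names may differ on your system):

# Partitions on the system disk and the cache disk
cat /proc/partitions
# LVM volume groups and logical volumes (for example, the swap and root volumes)
vgs
lvs
# Mounted filesystems, including the /cache volume and the DRBD-backed databases
df -h
grep -E 'drbd|cache' /proc/mounts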
ICS Services and Databases in a Cluster
The following table lists the most important ICS services that take advantage of clustering, and
where they run:
[Table: ICS services and the nodes on which they run. For each service, the table indicates its state on ics-node 1 (Master), ics-node 2 (Slave), ics-node 3 and ics-node n, using the legend ON (running), OFF (standby) and OFF (does not run). The services listed are: IPC Core Services (“the middleware”, avid-interplay-central); User Management Service (avid-um); Avid Common Services bus (“the bus”, acs-ctrl-core); AAF Generator (avid-aaf-gen); ICS Messaging (acs-ctrl-messenger); Playback Services (“the back-end”, avid-all); Interplay Pulse (avid-mpd); and Load Balancing (avid-icps-manager).]
Note the following:
•All ICS services run on the Master node in the cluster.
•Most ICS services are off on the Slave node but start automatically during a failover.
•On all other nodes, the ICS services never run.
•Some services spawned by the Avid Common Service bus run on the master node only (in
standby on the slave node); others are always running on both nodes.
•The Playback service (ICPS) runs on all nodes for Performance Scalability (load balancing
supports many concurrent clients and/or large media requests) and High Availability (service
is always available).
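A sketch of checking the key ICS services on the master node, using the Linux service names from the table above:

# Key ICS services (master node)
service avid-interplay-central status
service avid-um status
service acs-ctrl-core status
service acs-ctrl-messenger status
service avid-all status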
The following table lists the ICS databases, and where they run:
•ICS Database (PostgreSQL): ON (running) on ICS-node 1 (Master); OFF (standby) on ICS-node 2 (Slave); does not run on ICS-node 3 through ICS-node n.
•Service Bus Messaging Database (MongoDB): ON (running) on ICS-node 1 (Master); OFF (standby) on ICS-node 2 (Slave); does not run on ICS-node 3 through ICS-node n.
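As a rough sketch (the exact client commands depend on the installed PostgreSQL and MongoDB versions), you can confirm that both databases are serving on the master node:

# List the PostgreSQL databases (run as the postgres user)
su - postgres -c "psql -l"
# Confirm MongoDB is answering queries on the master node
mongo --eval "db.stats()"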
Clustering Infrastructure Services
The ICS services and databases presented in the previous section depend upon the correct functioning of a clustering infrastructure. The infrastructure is supplied by a small number of open-source software components designed specifically for clustering (or very well suited to it). For example, Pacemaker and Corosync work in tandem to restart failed services, maintain a fail count, and fail over from the master node to the slave node when failover criteria are met.
The following table presents the services pertaining to the infrastructure of the cluster:
•RabbitMQ (Cluster Message Broker/Queue): runs (ON) on all nodes, from Node 1 (Master) and Node 2 (Slave) through Node n.
•GlusterFS (File Cache Mirroring): runs on all nodes.
•DRBD (Database Volume Mirroring): runs on Node 1 (Master) and Node 2 (Slave) only; does not run on the other nodes.
•Pacemaker (Cluster Management & Service Failover): runs on all nodes.
•Corosync (Cluster Engine Data Bus): runs on all nodes.
•Heartbeat (Cluster Message Queue): runs on all nodes.
Note the following:
•RabbitMQ, the message broker/queue used by ACS, maintains its own clustering system. It
is not managed by Pacemaker.
•GlusterFS mirrors media cached on an individual RAID 5 drive to all other RAID 5 drives in
the cluster.
•DRBD mirrors the ICS databases across the two servers that are in a master-slave
configuration.
•Pacemaker: The cluster resource manager. Resources are collections of services
participating in high-availability and failover.
•Corosync and Heartbeat: The fundamental clustering infrastructure.
•Corosync and Pacemaker work in tandem to detect server and application failures, and
allocate resources for failover scenarios.
DRBD and Database Replication
Recall the filesystem layout of a typical node. The system drive (in RAID 1) consists of three partitions: sda1, sda2 and sda3. As noted earlier, sda2 is the partition used to store the ICS databases (PostgreSQL and MongoDB).
The following table details the contents of the databases stored on the sda2 partition:
•PostgreSQL (directory /mnt/drbd/postgres_data): UMS - User Management Services; ACS - Avid Common Service bus; ICPS - Interplay Central Playback Services; MPD - Multi-platform distribution (Pulse).
•MongoDB (directory /mnt/drbd/mongo_data): ICS Messaging.
In a clustered configuration, ICS uses the open source Distributed Replicated Block Device
(DRBD) storage system software to replicate the sda2 partition across the Master-Slave cluster
node pair. DRBD runs on the master node and slave node only, even in a cluster with more than
two nodes. PostgreSQL maintains the databases on sda2. DRBD mirrors them.
The following illustration shows DRBD volume mirroring of the sda2 partition across the master
and slave.
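A sketch of verifying DRBD replication of the database partition, run on the master or slave node (the exact output format depends on the DRBD version installed):

# Connection state and roles (Primary/Secondary) of the DRBD resource
cat /proc/drbd
# Higher-level summary, where the drbd-utils package provides it
drbd-overview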
GlusterFS and Cache Replication
Recall that the ICS server transcodes media from the format in which it is stored on the ISIS (or
standard filesystem storage) into an alternate delivery format, such as FLV, an MPEG-2 Transport Stream, or JPEG image files. In a deployment with a single ICS server, the ICS server
maintains a cache where it keeps recently-transcoded media. In the event that the same media is
requested again, the ICS server can deliver the cached media, without the need to re-transcode it.
In an ICS cluster, caching is taken one step further: the contents of the RAID 5
volumes are replicated across all the nodes, giving each server access to all the transcoded
media. The result is that each ICS server sees and has access to all the media transcoded by the
others. When one ICS server transcodes media, the other ICS servers can also make use of it,
without re-transcoding.
The replication process is set up and maintained by GlusterFS, an open source software solution
for creating shared filesystems. In ICS, Gluster manages data replication using its own highly
efficient network protocol. In this respect, it can be helpful to think of Gluster as a "network
filesystem" or even a "network RAID" system.
GlusterFS operates independently of other clustering services. You do not have to worry about
starting or stopping GlusterFS when interacting with ICS services or cluster management
utilities. For example, if you remove a node from the cluster, GlusterFS itself continues to run
and continues to replicate its cache against other nodes in the Gluster group. If you power down
the node for maintenance reasons, it will re-synchronize and 'catch up' with cache replication
when it is rebooted.
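To see the state of cache replication, the standard Gluster commands can be used (a sketch; run as root on any node in the Gluster group):

# Nodes participating in the Gluster group
gluster peer status
# Replicated volumes and the per-node bricks behind them
gluster volume info
# Health and brick status of the replicated volumes
gluster volume status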
Note: The correct functioning of the cluster cache requires that the clocks on each server in the cluster are set to the same time. See “Verifying Clock Synchronization” on page 67.
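A quick way to spot-check clock agreement across the nodes (a sketch that assumes NTP is in use; the node names are examples):

# Compare the current time reported by each node
for node in ics-node1 ics-node2 ics-node3; do
    ssh root@$node "hostname; date"
done
# On any one node, confirm it is synchronized against an NTP server
ntpq -p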
Clustering and RabbitMQ
RabbitMQ is the message broker ("task queue") used by the ICS top level services. ICS makes
use of RabbitMQ in an active/active configuration, with all queues mirrored to exactly two
nodes, and partition handling set to ignore. The RabbitMQ cluster operates independently of the
ICS master/slave failover cluster, but is often co-located on the same two nodes. The ICS
installation scripts create the RabbitMQ cluster without the need for human intervention.
Note the following:
•All RabbitMQ servers in the cluster are active and can accept connections
•Any client can connect to any RabbitMQ server in the cluster and access all data
•Each queue and its data exists on two nodes in the cluster (for failover & redundancy)
•In the event of a failover, clients should automatically reconnect to another node
•If a network partition / split brain occurs (very rare), manual intervention will be required
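A sketch of confirming that the RabbitMQ cluster is healthy (run as root on any node):

# Show the nodes in the RabbitMQ cluster and which of them are currently running
rabbitmqctl cluster_status
# Confirm the local broker itself is up
service rabbitmq-server status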
The RabbitMQ Cookie
A notable aspect of the RabbitMQ cluster is the special cookie it requires, which allows
RabbitMQ on the different nodes to communicate with each other. The RabbitMQ cookie must
be identical on each machine, and is set, by default, to a predetermined hardcoded string.
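For illustration (the path below is the RabbitMQ default on Linux; it may differ on a customized installation), you can verify that the cookie matches on every node:

# The Erlang cookie used by RabbitMQ; the value must be identical on every node
# (the file has no trailing newline, hence the extra echo)
cat /var/lib/rabbitmq/.erlang.cookie; echo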
Powering Down and Rebooting
With regard to RabbitMQ when powering down and rebooting nodes:
•If you take down the entire cluster, the last node down must always be the first node up. For
example, if "ics-serv3" is the last node you stop, it must be the first node you start.
•Because of the guideline above, it is not advised to power down all nodes at exactly the same
time. There must always be one node that was clearly powered down last.
•If you don't take the whole cluster down at once then the order of stopping/starting servers
doesn't matter.
For details, see “Shutting Down / Rebooting an Entire Cluster” on page 70.
Handling Network Disruptions
•RabbitMQ does not handle network partitions well. If the network is disrupted on only some of the machines and then restored, you should shut down the machines that lost the network and then power them back on. This ensures they rejoin the cluster correctly. This happens rarely, and mainly if the cluster is split between two different switches and only one of them fails.
•On the other hand, if the network is disrupted to all nodes in the cluster simultaneously (as in a single-switch setup), no special handling should be required.
Services and resources are key to the correct operation and health of a cluster. As noted in
“System Architecture” on page 15, services are responsible for all aspects of ICS activity, from
the ACS bus, to end-user management and transcoding. Additional services supply the clustering
infrastructure. Some ICS services are managed by Pacemaker, for the purposes of
high-availability and failover readiness. Services overseen by Pacemaker are called resources.
All services produce logs that are stored in the standard Linux log directories (under /var/log), as
detailed later in this chapter.
Services vs Resources
A typical cluster features both Linux services and Pacemaker cluster resources. Thus, it is
important to understand the difference between the two. In the context of clustering, resources
are simply one or more Linux services under management by Pacemaker. Managing services in
this way allows Pacemaker to monitor the services, automatically restart them when they fail,
and shut them down on one node and start them on another when they fail too many times.
It can be helpful to regard a cluster resource as a Linux service inside a Pacemaker “wrapper”. The
wrapper includes the actions defined for it (start, stop, restart, etc.), timeout values, failover
conditions and instructions, and so on. In short, Pacemaker sees and manages resources, not
services.
For example, the Interplay Central service (avid-interplay-central) is the core service of the platform. Since the platform cannot function without it, this service is overseen and managed by Pacemaker as the AvidIPC resource.
The status of a Linux service can be verified by entering a command of the following form at the command line:
service <servicename> status
In contrast, the state of a cluster resource is verified via the Pacemaker Cluster Resource
Manager, crm, as follows:
crm status <resource>
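For example, a sketch that contrasts the two views, using the AvidIPC resource discussed above (other resource names are site-specific):

# Status of the underlying Linux service
service avid-interplay-central status
# Status of the corresponding Pacemaker-managed resource, via the crm shell
crm resource status AvidIPC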
For details see: