The HP VAN SDN Controller is a Java-based OpenFlow controller enabling SDN solutions such as network
controllers for the data center, public cloud, private cloud, and campus edge networks. This includes
providing an open platform for developing experimental and special-purpose network control protocols using
a built-in OpenFlow controller. This document provides detailed documentation for writing applications to run
on the HP VAN SDN Controller platform.
No part of this documentation may be reproduced or transmitted in any form or by any means without prior
written consent of Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice.
HEWLETT-PACKARD COMPANY MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS
MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE. Hewlett-Packard shall not be liable for errors contained herein or for
incidental or consequential damages in connection with the furnishing, performance, or use of this material.
The only warranties for HP products and services are set forth in the express warranty statements
accompanying such products and services. Nothing herein should be construed as constituting an additional
warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
This document describes the process of developing applications to run on the HP VAN SDN
Controller platform.
The base SDN Controller serves as a delivery vehicle for SDN solutions. It provides a platform for
developing various types of network controllers, e.g. data-center, public cloud, private cloud,
campus edge networks, etc. This includes being an open platform for development of experimental
and special-purpose network control protocols using a built-in OpenFlow controller.
The SDN Controller meets certain minimum scalability requirements and provides the ability to achieve higher scaling and high-availability requirements via a scale-out teaming model. In this model, the same set of policies is applied to a region of network infrastructure by a team of such appliances, which coordinate and divide their control responsibilities into separate partitions of the control domain for scaling, load-balancing and fail-over purposes.
Overview
Regardless of the specific personality of the controller, the software stack consists of two major tiers. The upper Administrator tier hosts functionality related to policy deployment, management, personae interactions and external application interactions, for example slow-path, deliberative operations. The lower Controller tier, on the other hand, hosts policy enforcement, sensing, device interactions and flow interactions, for example fast-path, reflex, muscle-memory-like operations. The interface(s) between the two tiers provide a design firewall and are elastic in that they can change along with the personality of the overall controller. They are also governed by a rule that no enforcement-related synchronous interaction will cross from the Controller tier to the Administrator tier.
Figure 1 Controller Tiers
The Administration tier of the controller will host a web-layer through which software modules
installed on the appliance can expose REST APIs [1] [2] (or RESTful web services) to other external
entities. Similarly, modules can extend the available web-based GUI to allow network
administrators and other personae to directly interact with the features of the software running on
the SDN Controller.
A web application is an application that is accessed by users over a network such as the Internet
or an intranet. The HP VAN SDN Controller runs on a web server as illustrated in Figure 2.
Figure 2 Web Application Architecture
Servlets [3] [4] are the technology used for extending the functionality of the web server and for accessing business systems. Servlets provide a component-based, platform-independent method for building Web-based applications.
SDN applications do not implement Servlets directly; instead, they implement RESTful web services [1] [2], which are based on Servlets. However, RESTful web services also act as controllers, as described in the pattern from Figure 3.
Figure 3 Web Application Model View Controller Pattern
Basic Architecture
The principal software stack of the appliance uses the OSGi framework (Equinox) [5] [6] and a
container (Virgo) [7] as a basis for modular software deployment and to enforce service
provider/consumer separation. The software running in the principal OSGi container can interact
with other components running as other processes on the appliance. Preferably, such IPC
interactions will occur using a standard off-the-shelf mechanism, for instance RabbitMQ, but they
can exploit any means of IPC best suited to the external component at hand. Virgo, based on
Tomcat [8], is a module-based Java application server that is designed to run enterprise Java
applications with a high degree of flexibility and reliability. Figure 4 illustrates the HP VAN SDN
Controller software stack.
Figure 4 HP VAN SDN Controller Software Stack
Jersey [2] is a JAX-RS (JSR 311) reference implementation for building RESTful web services. In the Representational State Transfer (REST) architectural style, data and functionality are considered resources, and these resources are accessed using Uniform Resource Identifiers (URIs), typically
links on the web. REST-style architectures conventionally consist of clients and servers and they are
designed to use a stateless communication protocol, typically HTTP. Clients initiate requests to
servers; servers process requests and return appropriate responses. Requests and responses are
built around the transfer of representations of resources. Clients and servers exchange
representations of resources using a standardized interface and protocol. These principles
encourage RESTful applications to be simple, lightweight, and have high performance.
The HP VAN SDN Controller also offers a framework to develop Web User Interfaces - HP SKI. The
SKI Framework provides a foundation on which developers can create a browser-based web
application.
The HP VAN SDN Controller relies on external services that provide APIs which SDN applications can use.
Keystone [9] is an external service that provides authentication and high-level authorization services. It supports a token-based authentication scheme, which is used to secure the RESTful web services (or REST APIs) and the web user interfaces.
Hazelcast [10] is in-memory data grid management software that enables scale-out computing, resilience, and fast processing of big data.
Apache Cassandra [10] is a high performance, extremely scalable, fault tolerant (no single point
of failure), distributed post-relational database solution. Cassandra combines all the benefits of
Google Bigtable and Amazon Dynamo to handle the types of database management needs that
traditional RDBMS vendors cannot support.
Figure 5 illustrates with more detail the tiers that compose the HP VAN SDN Controller. It shows
the principal interfaces and their roles in connecting components within each tier, the tiers to each
other and the entire system to the external world.
The approach aims to achieve connectivity in a controlled manner and without creating undue
dependencies on specifics of component implementations. The separate tiers are expected to
interact over well-defined mutual interfaces, with decreasing coarseness from top to bottom. This
means that on the way down, high-level policy communicated as part of the deployment
interaction over the external APIs is broken down by the upper tier into something similar to a
specific plan, which gets in turn communicated over the inter-tier API to the lower controller tier.
The controller then turns this plan into detailed instructions which are either pre-emptively
disseminated to the network infrastructure or are used to prime the RADIUS or OpenFlow [11] [12] controllers so that they are able to answer future switch (or other network infrastructure device) queries.
Similarly, on the way up, the various data sensed by the controller from the network infrastructure,
regarding its state, health and performance, gets aggregated at the administrator tier. Only the
administrator tier interfaces with the user or other external applications. Conversely, only the
controller tier interfaces with the network infrastructure devices and other supporting controller
entities, such as RADIUS, OpenFlow [11] [12], MSM controller software, and so on.
Figure 5 HP VAN SDN Controller Tiers
Internal Applications vs. External Applications
Internal applications (“Native” Applications / Modules) are ideal to exert relatively fine-grained, frequent and low-latency control interactions with the environment, for example, handling packet-in events. Some key points to consider when developing internal applications:
• Authored in Java or a byte-code compatible language, e.g. Scala, or a Scala DSL.
• Deployed on the SDN Controller platform as collections of OSGi bundles.
• Built atop services (Java APIs) exported and advertised by the platform and by other applications.
• Export and advertise services (Java APIs) to allow interactions with other applications.
• Dynamically extend the SDN Controller REST API surface.
• Dynamically extend the SDN Controller GUI by adding navigation categories, items, views, and so on.
• Integrate with the SDN Controller authentication & authorization framework.
• Integrate with the SDN Controller Persistency & Distributed Coordination API.
Internal applications are deployed on the HP VAN SDN Controller and they interact with it by
consuming business services (Java APIs) published by the controller in the SDK.
External applications are suitable to exert relatively coarse-grained, infrequent, and high-latency control interactions with the environment, such as path provisioning and flow inspections. External applications can have these characteristics:
• They can be written in any language capable of establishing a secure HTTP connection, for example Java, C, C++, Python, Ruby, C#, bash, and so on.
• They can be deployed on a platform of choice outside of the SDN Controller platform.
• They use REST API services exported and advertised by the platform and by other applications.
• They do not extend the Java APIs, REST APIs, or GUI of the controller.
This guide describes writing and deploying internal applications. For information about the REST APIs you can use for external applications, see the HP VAN SDN Controller REST API Reference Guide.
Acronyms and Abbreviations
There are many acronyms and abbreviations that are used in this document. Table 1 contains some of the more commonly used acronyms and abbreviations.
Table 1 Commonly Used Acronyms and Abbreviations
CLI—Command Line Interface
DTO—Data Transfer Object
HP—Hewlett-Packard
HTTP—Hypertext Transfer Protocol
HTTPS—Hypertext Transfer Protocol Secure
HW—Hardware
LAN—Local Area Network
OF—OpenFlow
OSGi—Open Services Gateway initiative
OWASP—Open Web Application Security Project
SNMP—Simple Network Management Protocol
VLAN—Virtual LAN
2 Establishing Your Test and Development Environments
The suggested development environment comprises two separate environments, a Test Environment and a Development Environment. It is recommended to use a different machine for each of these environments. The Test Environment is where the HP VAN SDN Controller and all the dependency systems are installed; it is very similar to a real deployment, although virtual machines [13] are useful during the development phase. The Development Environment consists of the tools needed to create, build and package the application. Once the application is ready for deployment, the Test Environment is used to install it.
One reason to keep these environments separate is that distributed applications may need a team setup (a cluster of controllers) to test the application. Another reason is that some unit tests and/or integration tests (RESTful Web Services [1] [2] for example) might open ports that are reserved for services offered or consumed by the controller.
Test Environment
Installing HP VAN SDN Controller
To install the SDN controller follow the instructions from the HP VAN SDN Controller Installation
Guide [14].
Authentication Configuration
The HP VAN SDN Controller uses Keystone [9] for identity management. When it is installed, two
users are created, "sdn" and "rsdoc", both with a default password of "skyline". This password
can be changed using the keystone command-line interface from a shell on the system where the controller was installed; follow the instructions from the HP VAN SDN Controller Installation Guide [14].
Development Environment
Pre-requisites
The development environment requirements are relatively minimal. They comprise the following:
Operating System
Supported operating systems include:
• Windows 7 or later with MKS 9.4p1
• Ubuntu 10.10 or later
• OS X Snow Leopard or later
Java
The Software Development Language used is Java SE SDK 1.6 or later. To install Java, go to [15] and follow the download and installation instructions.
Maven
Apache Maven is a software project management and comprehension tool. Based on the concept
of a project object model (POM), Maven can manage a project's build, reporting and
documentation from a central piece of information [16].
To install Maven, go to [16] and follow the download and installation instructions. Note that if you are behind a firewall, you may need to configure your ~/.m2/settings.xml appropriately to access the Internet-based Maven repositories via a proxy; for more information see Maven Cannot Download Required Libraries on page 251.
Maven 3.0.4 or newer is needed. To verify the installed version of Maven, execute the following command:
$ mvn --version
Curl
Curl (or cURL) is a command line tool for transferring data with URL syntax. This tool is optional. Follow the instructions from [17] to install Curl, or if you use Linux Ubuntu as the development environment you may use the Ubuntu Software Center to install it, as illustrated in Figure 6.
Figure 6 Installing Curl via Ubuntu Software Center
IDE
An IDE, or Integrated Development Environment, is a software application that provides a programmer with many different tools useful for developing. Tools bundled with an IDE may include an editor, a debugger, a compiler, and more. Eclipse is a popular IDE that can be used to program in Java and for developing applications. Eclipse might be referenced in this guide.
HP VAN SDN Controller SDK
Download the HP VAN SDN Controller SDK from [18]. The SDK is contained in the hp-sdn-sdk-*.zip
file (for example: hp-sdn-sdk-2.0.0.zip). Unzip its contents in any location. To install the SDN
Controller SDK jar files into the local Maven repository, execute the SDK install tool from the
directory where the SDK was unzipped, as follows (Note: Java SDK and Maven must already be installed and properly configured):
$ bin/install-sdk
To verify that the SDK has been properly installed, look for the HP SDN libraries installed in the local Maven repository at:
~/.m2/repository/com/hp
Javadoc
The controller Java APIs are documented in Javadoc format in the hp-sdn-apidoc-*.jar file.
Download the file and unzip its contents. To view the Java API documentation, open the index.html
file. Figure 7 illustrates an example of the HP VAN SDN Controller documentation.
Figure 7 HP VAN SDN Controller Javadoc
3 Developing Applications
Internal applications (“Native” Applications / Modules) are ideal to exert relatively fine-grained, frequent and low-latency control interactions with the environment, for example, handling packet-in events. Some key points to consider when developing internal applications:
• Authored in Java or a byte-code compatible language, e.g. Scala, or a Scala DSL.
• Deployed on the SDN Controller platform as collections of OSGi bundles.
• Built atop services (Java APIs) exported and advertised by the platform and by other applications.
• Export and advertise services (Java APIs) to allow interactions with other applications.
• Dynamically extend the SDN Controller REST API surface.
• Dynamically extend the SDN Controller GUI by adding navigation categories, items, views, and so on.
• Integrate with the SDN Controller authentication & authorization framework.
• Integrate with the SDN Controller Persistency & Distributed Coordination API.
Internal applications are deployed on the HP VAN SDN Controller and they interact with it by
consuming business services (Java APIs) published by the controller in the SDK.
Introduction
Figure 8 illustrates the various classes of software modules categorized by the nature of their
responsibilities and capabilities and the categories of the software layers to which they belong.
Also shown are the permitted dependencies among the classes of such modules. Note the explicit
separation of the implementations from interfaces (APIs). This separation principle is strictly
enforced in order to maintain modularity and elasticity of the application. Also note that these
represent categories, not necessarily the actual modules or components. This diagram only aims to
highlight the classes of software modules.
Figure 8 HP Application Modules
Web Layer
Components in this layer are responsible for receiving and consuming appropriate external
representations (XML, JSON, binary...) suitable for communicating with various external entities
and, if applicable, for utilizing the APIs from the business logic layer to appropriately interact with
the business logic services to achieve the desired tasks and/or to obtain or process the desired
information.
User Interface End-Point (REST API)—End-point resources for handling inbound requests, providing control and data access capabilities to the administrative GUI.
External Interface End-Point (REST API)—End-point resources for handling inbound requests, providing control and data access capabilities to external applications, including other orchestration and administrative tools (for example IMC, OpenStack, etc.).
Business Logic Layer
Components in this layer fall into two fundamental categories: model control services and
outbound communications services, and each of these is further subdivided into public APIs and
private implementations.
The public APIs are composed of interfaces and passive POJOs [19], which provide the domain
model and services, while the private implementations contain the modules that implement the
various domain model and service interfaces. All interactions between different components must
occur solely using the public API mechanisms.
Model API—Interfaces & objects comprising the domain model. For example: the devices, ports,
network topology and related information about the discovered network environment.
Control API—Interfaces to access the modeled entities, control their life-cycles and in general to
provide the basis for the product features to interact with each other.
Communications API—Interfaces which define the outbound forms of interactions to control,
monitor and discover the network environment.
Control Implementations—Implementations of the control API services and domain model.
Communications Implementations—Implementations of the outbound communications API
services. They are responsible for encoding / transmitting requests and receiving / decoding
responses.
Health Service API—Allows an application to report its health to the controller (via the HealthMonitorable interface, or by proactively submitting health information to the HealthService directly via the updateHealth method) and/or listen to health events from the controller and other applications (via the HealthListener interface). There are three types of health statuses:
• OK – A healthy status to denote that an application is functioning as expected.
• WARN – An unhealthy status to denote that an application is not functioning as expected
and needs attention. This status is usually accompanied by a reason as to why the
application reports this status to provide clues to remedy the situation.
• CRITICAL – An unhealthy status to denote that some catastrophic event has happened to the application that affects the controller’s functionality. When the controller receives a CRITICAL event, it will assume that its functionality has been affected, and will proceed to shut down the OpenFlow port to stop processing OpenFlow events. If in a teaming environment, the controller will remove itself from the team.
Persistence Layer
Data Access API—Interfaces, which prescribe how to persist and retrieve the domain model
information, such as locations, devices, topology, etc. This can also include any prescribed routing
and flow control policies.
Data Access Implementations—Implementations of the persistence services to store and
retrieve the SDN-related information in a database or other non-volatile form.
Authentication
Controller REST APIs are secured via a token-based authentication scheme. OpenStack Keystone
[9] is used to provide the token-based authentication.
This security mechanism:
• Provides user authentication functionality with RBAC support.
• Completely isolates the security mechanism from the underlying REST API.
• Works with OpenStack Keystone.
• Exposes a REST API to allow any authentication server that implements this REST API to be hosted elsewhere (outside the SDN appliance).
This security mechanism does not:
• Provide authorization. Authorization needs to be provided by the application based on the authenticated subject's roles.
• Support filtering functionality such as black-listing or rate-limiting.
To achieve isolation of security aspects from the API, authentication information is encapsulated by
a token that a user receives by presenting his/her credentials to an Authentication Server. The user
then uses this token (via header X-Auth-Token) in any API call that requires authentication. The
token is validated by an Authentication Filter that fronts the requested API resource. Upon
successful authentication, requests are forwarded to the RESTful APIs with the principal's
information such as:
• User ID
• User name
• User roles
• Expiration Date
Upon unsuccessful authentication (either no token or invalid token), it is up to the application to
deny or allow access to its resource. This flexibility allows the application to implement its own
authorization mechanism, such as ACL-based or even allow anonymous operations on certain
resources.
The flow of token-based authentication in the HP VAN SDN Controller can be summarized as
illustrated in Figure 9.
Figure 9 Token-based Authentication Flow
1) API Client presents credentials (username/password) to the AuthToken REST API.
2) Authentication is performed by the backing Authentication Server. The SDN Appliance
includes a local Keystone-based Authentication Server, but the Authentication Server may also
be hosted elsewhere by the customer (and maybe integrated with an enterprise directory such
as LDAP for example), as long as it implements the AuthToken REST API (described elsewhere).
The external Authentication Server use-case is shown by the dotted-line interactions. If the user
is authenticated, the Authentication Server will return a token.
3) The token is returned back to the API client.
4) The API client includes this token in the X-Auth-Token header when making a request to the HP
VAN SDN Controller’s RESTful API.
5) The token is intercepted by the Authentication Filter (Servlet Filter).
6) The Authentication Filter validates the token with the Authentication Server via another
AuthToken REST API.
7) The validation status is returned back to the REST API.
8) If the validation is unsuccessful (no token or invalid token), the HP VAN SDN Controller will
return a 401 (Unauthorized) status back to the caller.
9) If the validation is successful, the actual HP VAN SDN Controller REST API will be invoked and business logic ensues.
In order to isolate services and applications from Keystone specifics, two APIs in charge of
providing authentication services (AuthToken REST API's) are published:
Public API:
1) Create token. This accepts username/password credentials and returns a unique token with some expiration.
Service API:
1) Revoke token. This revokes a given token.
2) Validate token. This validates a given token and returns the appropriate principal's information.
Authentication services have been split into these two APIs to limit sensitive services (Service API) to
only authorized clients.
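For illustration, the following sketch shows how an external client could obtain a token and present it via the X-Auth-Token header using plain Java. The AuthToken endpoint path (/sdn/v2.0/auth) and the resource path used afterwards are assumptions to be verified against the HP VAN SDN Controller REST API Reference Guide; handling of the controller's self-signed certificate and JSON parsing of the response are omitted.
Token-based Authentication Client Example (sketch):
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class AuthTokenClient {
    public static void main(String[] args) throws Exception {
        String controller = "https://SDN_CONTROLLER_ADDRESS:8443";
        // 1) Present credentials to the AuthToken REST API.
        //    (/sdn/v2.0/auth is an assumed path; check the REST API Reference Guide.)
        String login =
            "{\"login\":{\"user\":\"sdn\",\"password\":\"skyline\",\"domain\":\"sdn\"}}";
        HttpURLConnection auth = (HttpURLConnection)
                new URL(controller + "/sdn/v2.0/auth").openConnection();
        auth.setRequestMethod("POST");
        auth.setRequestProperty("Content-Type", "application/json");
        auth.setDoOutput(true);
        OutputStream out = auth.getOutputStream();
        out.write(login.getBytes("UTF-8"));
        out.close();
        System.out.println("Auth status: " + auth.getResponseCode());
        // 2) Extract the token from the JSON response body (parsing omitted).
        String token = "...token from response...";
        // 3) Include the token in the X-Auth-Token header of subsequent requests.
        //    (/sdn/v2.0/systems is an assumed example resource.)
        HttpURLConnection api = (HttpURLConnection)
                new URL(controller + "/sdn/v2.0/systems").openConnection();
        api.setRequestProperty("X-Auth-Token", token);
        System.out.println("API status: " + api.getResponseCode());
    }
}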
REST API
Internal applications do not make use of the HP VAN SDN Controller’s REST API; instead, they extend it by defining their own RESTful Web Services. Internal applications make use of the business services (Java APIs) published by the controller. For external applications, consult the RESTful API documentation (or Rsdoc) as described at Rsdoc Live Reference on page 17.
Representational State Transfer (REST) defines a set of architectural principles by which Web
services are designed focusing on a system's resources, including how resource states are
addressed and transferred over HTTP by a wide range of clients written in different languages
[20].
Concrete implementation of a REST Web service follows four basic design principles:
• Use HTTP methods explicitly.
• Be stateless.
• Expose directory structure-like URIs.
• Transfer XML, JavaScript Object Notation (JSON), or both.
One of the key characteristics of a RESTful Web service is the explicit use of HTTP. HTTP GET, for
instance, is defined as a data-producing method that's intended to be used by a client application
to retrieve a resource, to fetch data from a Web server, or to execute a query with the expectation
that the Web server will look for and respond with a set of matching resources [20].
REST asks developers to use HTTP methods explicitly and in a way that's consistent with the
protocol definition. This basic REST design principle establishes a one-to-one mapping between
create, read, update, and delete (CRUD) operations and HTTP methods. According to this
mapping:
• To create a resource on the server, use POST.
• To retrieve a resource, use GET.
• To change the state of a resource or to update it, use PUT.
• To remove or delete a resource, use DELETE.
See [1] for guidelines to design REST APIs or RESTful Web Services and Creating a REST API on page 169 for an example.
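Because internal applications extend the controller's REST surface with their own JAX-RS resources (see Creating a REST API on page 169), the following sketch illustrates the CRUD-to-HTTP-method mapping above using standard JAX-RS annotations; the resource path and class names are hypothetical.
CRUD Mapping Example (sketch):
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("widgets")   // hypothetical resource
public class WidgetResource {

    @POST                          // create a resource
    @Consumes(MediaType.APPLICATION_JSON)
    public Response create(String widgetJson) {
        // ... delegate to a business service (Java API) ...
        return Response.status(Response.Status.CREATED).build();
    }

    @GET                           // retrieve a resource
    @Path("{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response get(@PathParam("id") String id) {
        return Response.ok("{\"id\":\"" + id + "\"}").build();
    }

    @PUT                           // update a resource
    @Path("{id}")
    @Consumes(MediaType.APPLICATION_JSON)
    public Response update(@PathParam("id") String id, String widgetJson) {
        return Response.ok().build();
    }

    @DELETE                        // remove a resource
    @Path("{id}")
    public Response delete(@PathParam("id") String id) {
        return Response.noContent().build();
    }
}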
REST API Documentation
In addition to the Rsdoc, the HP VAN SDN Controller REST API Reference Guide provides information for interacting with the controller’s REST API.
Rsdoc
Rsdoc is a semi-automated interactive RESTful API documentation. It offers a useful way to interact
with REST APIs.
Figure 10 RSdoc
It is called RSdoc because it is a combination of JAX-RS annotations [2] and Javadoc [21] (illustrated in Figure 11).
Figure 11 RSdoc, JAX-RS and Javadoc
NOTE
Use the correct password if it was changed following the instructions from Authentication Configuration on page 7.
JAX-RS annotations and Javadoc are already written when implementing RESTful Web Services, and
they are re-used to generate an interactive API documentation.
Rsdoc Extension
The HP VAN SDN Controller SDK offers a method to extend the Rsdoc to include application-specific RESTful Web Services (as in the example illustrated in Figure 11). Since JAX-RS annotations and Javadoc are already written when implementing RESTful Web Services, enabling an application to extend the RSdoc is relatively easy and automatic: a few configuration files need to be updated. See Creating RSdoc on page 193 for an example.
Rsdoc Live Reference
To access the HP VAN SDN Controller’s Rsdoc (including extensions by applications):
1. Open a browser at https://SDN_CONTROLLER_ADDRESS:8443/api (as illustrated in Figure 10).
2. Get an authentication token by entering the following authentication JSON document: {"login":{"user":"sdn","password":"skyline","domain":"sdn"}} (as illustrated in Figure 12).
Figure 12 Authenticating via RSdoc Step 1
3. Set the authentication token as the X-AUTH-TOKEN in the RSdoc and then click “Explore,” as illustrated in Figure 13. From this point all requests done via RSdoc will be authenticated as long as the token is valid.
Figure 13 Authenticating via RSdoc Step 2
Audit Logging
The Audit Log retains information concerning activities, operations and configuration changes that
have been performed by an authorized end user. The purpose of this subsystem is to allow tracking
of significant system changes. This subsystem provides an API which various components can use to
record the fact that some important operation occurred, when and who triggered the operation and
potentially why. The subsystem also provides means to track and retrieve the recorded information
via an internal API as well as via external REST API. An audit log entry, once created, may not be
modified. Audit log entries, once created, may not be selectively deleted. Audit log entries are only
removed based on the age out policy defined by the administrator.
Audit Log data is maintained in persistent storage (default retention period is one year) and is
presented to the end user via both the UI and the REST API layers.
The audit log framework provides a cleanup task that is executed daily (by default) that ages out
audit log entries from persistent storage based on the policy set by the administrator.
An audit log entry consists of the following:
• User—a string representation of the user that performed the operation which triggered the audit log entry.
• Time-stamp—the time that the audit log entry was created. The time information is persisted in UTC format.
• Activity—a string representation of the activity the user was doing that triggered this audit log entry.
• Data—a string description for the audit log entry. Typically, this contains the data associated with the operation.
• Origin—a string representation of the application or component that originated this audit log entry.
• Controller ID—the unique identification of the controller that originated the audit log entry.
Applications may contribute to the Audit Log via the Audit Log service. When creating an audit log
entry the user, activity, origin and data must be provided. The time-stamp and controller
identification are populated by the audit log framework. To contribute an audit log entry, use the
post(String user, String origin, String activity, String description)
method provided by the AuditLogService API. This method will return the object that was created.
The strings associated with the user, origin and activity are restricted to a maximum of 255
characters, whereas the description string is restricted to a maximum of 4096 characters.
An example of an application consuming the Audit Log service is described at Auditing with Logs on
page 215.
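As a quick illustration of the post method above, the following sketch contributes an audit log entry from a component; the AuditLogService package and the @Reference injection are assumptions, so rely on the SDK Javadoc and the example referenced above for the exact details.
Audit Log Entry Example (sketch):
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Reference;
// Assumed package; verify against the SDK Javadoc.
import com.hp.sdn.adm.auditlog.AuditLogService;

@Component
public class AuditingComponent {

    @Reference
    private volatile AuditLogService auditLogService;

    void recordPolicyChange(String user) {
        // user, origin and activity are limited to 255 characters;
        // the description is limited to 4096 characters.
        auditLogService.post(user, "my-app", "policy-update",
                "Updated the flow policy for region East");
    }
}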
Alert Logging
The purpose of this subsystem is to allow for management of alert data. The subsystem comprises an API which various components can use to generate alert data. The subsystem also provides means to track and retrieve the recorded information via an internal API as well as via an external REST API. Once an alert entry has been created, the state of the alert (active or not) is the only modification that is allowed.
Alert data is maintained in persistent storage (default retention period is 14 days) and is presented
to the end user via both the UI and REST API layers. The alert framework provides a cleanup task
that is executed daily (by default) that ages out alert data from persistent storage based on the
policy set by the administrator.
An alert consists of the following:
• Severity—one of Informational, Warning or Critical.
• Time-stamp—the time the alert was created. The time information is persisted in UTC format.
• Description—a string description for the alert.
• Origin—a string representation of the application or component that originated the alert.
• Topic—the topic related to the alert. Users can register for notification when alerts related to a given topic or set of topics occur.
• Controller ID—the unique identification of the controller that originated the alert.
Applications may contribute alerts via the Alert service. When creating an alert, the severity, topic, origin and data must be provided. The time-stamp and controller identification are populated by the alert framework. To contribute an alert, use the
post(Severity severity, AlertTopic topic, String origin, String data)
method provided by the AlertService API. This method returns the Alert DTO object that was created. The string associated with the origin is restricted to a maximum of 255 characters, as is the data string.
An example of an application consuming the Alert service is described at Posting Alerts on page 212.
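Similarly, the following sketch posts an alert using the method above; the AlertService, AlertTopic and Severity packages, the Severity constant name, and the way the topic is obtained are assumptions—see the SDK Javadoc and the Posting Alerts example referenced above.
Alert Posting Example (sketch):
// Assumed packages; verify against the SDK Javadoc.
import com.hp.sdn.adm.alert.AlertService;
import com.hp.sdn.adm.alert.AlertTopic;
import com.hp.sdn.alert.Severity;

public class AlertingHelper {

    private final AlertService alertService;
    private final AlertTopic linkTopic;   // topic registration is application-specific

    public AlertingHelper(AlertService alertService, AlertTopic linkTopic) {
        this.alertService = alertService;
        this.linkTopic = linkTopic;
    }

    void warnOnCongestedLink(String details) {
        // Origin and data strings are limited to 255 characters each.
        // Severity.WARNING is assumed from the severities listed above.
        alertService.post(Severity.WARNING, linkTopic, "my-app", details);
    }
}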
Configuration
The SDN controller presents configurable properties and allows the end user to modify
configurations via both the UI and REST API layers. The HP VAN SDN Controller uses the OSGi
Configuration Admin [22] [23] and MetaType [24] [25] services to present the configuration data.
For an application to provide configuration properties that are automatically presented by the SDN
controller, it must provide the MetaType information for the configurable properties. The metatype
information is contained in a “metatype.xml” file that must be present in the OSGI-INF/metatype
folder of the application bundle.
The necessary metatype.xml can be automatically generated via the use of the Maven SCR
annotations [26] and Maven SCR [27] plugin in a Maven pom.xml file for the application (See Root
POM File on page 139). The SCR annotations must be included as a dependency, and the SCR
plug-in is a build plugin.
Application pom.xml Example:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
    ...
</project>
The component can then use Annotations to define the configuration properties as illustrated in the
following listing.
Configurable Property Key Definition Example:
package com.hp.hm.impl;
import org.apache.felix.scr.annotations.*;
...
@Component(metatype = true)
public class SwitchComponent implements SwitchService {
    @Property(intValue = 100, description = "Some Configuration")
    protected static final String CONFIG_KEY = "cfg.key";
    ...
}
The component is provided the configuration data by the OSGi framework as a Java Dictionary
object, which can be referenced as a basic Map of key -> value pairs. The key will always be a
Java String object, and the value will be a Java Object. A component will be provided the
configuration data at component initialization via an annotated “activate” method. Live updates to
a component's configuration will be provided via an annotated “modified” method. Both of these
annotated methods should define a Map<String, Object> as an input parameter. The following
listing shows an example.
Configurable Property Example:
...
import com.hp.sdn.misc.ConfigUtils;
@Component(metatype = true)
public class SwitchComponent implements SwitchService {
    @Property(intValue = 100, description = "Some Configuration")
    protected static final String CONFIG_KEY = "cfg.key";
As the configuration property value can be one of several different kinds of Java object (Integer, Long, String, etc.), a utility class is provided to read the appropriate Java object type from the configuration map. The ConfigUtils.java class provides methods to read integers, longs, strings, Booleans and ports from the configuration map of key -> value pairs. The caller must provide the following information:
• The configuration map
• The key (string) for the desired property in the configuration map
• A data Validator object (can be null)
• A default value. The default value is returned if the provided key is not found in the configuration map, if the key does not map to an Object of the desired type, or if a provided data validator object rejects the value.
A Validator is a typed class which performs custom validation on a given configuration value. For
example, a data validator which only allows integer values between 10 and 20 is illustrated in the
following listing.
Configurable Property Validator Example:
...
import com.hp.sdn.misc.Validator;
public class MyValidator implements Validator<Integer> {
    @Override
    public boolean isValid(Integer value) {
        return ((10 <= value) && (value <= 20));
    }
}
To use this validator with the ConfigUtils class to obtain the configuration value from the
configuration map, just include it in the method call:
MyValidator myValidator = new MyValidator();
ConfigUtils.readInt(config, CONFIG_KEY, myValidator, 15);
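Putting the pieces together, the following sketch shows the annotated “activate” and “modified” methods described earlier reading the property with ConfigUtils and the validator above. Apart from ConfigUtils, MyValidator and CONFIG_KEY, the names are illustrative; the @Activate and @Modified annotations come from the Felix SCR annotations already used above, and readInt is assumed to return the validated value (or the default).
Configuration Activate/Modified Example (sketch):
import java.util.Map;

import org.apache.felix.scr.annotations.Activate;
import org.apache.felix.scr.annotations.Component;
import org.apache.felix.scr.annotations.Modified;
import org.apache.felix.scr.annotations.Property;

import com.hp.sdn.misc.ConfigUtils;

@Component(metatype = true)
public class SwitchComponent implements SwitchService {

    @Property(intValue = 100, description = "Some Configuration")
    protected static final String CONFIG_KEY = "cfg.key";

    private int cfgValue = 100;

    @Activate
    protected void activate(Map<String, Object> config) {
        // Initial configuration supplied by the OSGi framework at
        // component initialization.
        cfgValue = ConfigUtils.readInt(config, CONFIG_KEY, new MyValidator(), 100);
    }

    @Modified
    protected void modified(Map<String, Object> config) {
        // Live update to the component's configuration.
        cfgValue = ConfigUtils.readInt(config, CONFIG_KEY, new MyValidator(), 100);
    }
}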
High Availability
Role orchestration
Role Orchestration Service provides a federated mechanism to define the role of teamed controllers with respect to the network elements in the controlled domain. The role that a controller assumes in relation to a network element determines whether it can write and modify the configuration on the network element, or has only read-only access to it.
As a preparation to exercise the Role Orchestration Service (ROS) in the HP VAN SDN Controller, there are two pre-requisite operations that need to be carried out beforehand:
1) Create controller team: Using the teaming interfaces, a team of controllers needs to be defined for leveraging High Availability features.
2) Create Region: the network devices for which the given controller has been identified as a master are grouped into “regions”. This grouping is defined in the HP VAN SDN Controller using the Region interface detailed in subsequent sections.
Once the region definition(s) are in place, the ROS would take care of ensuring that a master
controller is always available to the respective network element(s) even when the configured master
experiences a failure or there is effectively a disruption of the communication channel between the
controller and the network device(s).
Failover: ROS would trigger the failover operation in two situations:
1) Controller failure: The ROS detects the failure of a controller in a team via notifications from
the teaming subsystem. If the ROS determines that the failed controller instance was master to
any region, it would immediately elect one of the backup (slave) controllers to assume the
mastership over the affected region.
2) Device disconnect: The ROS instance in a controller would get notified of a communication
failure with network device(s) via the Controller Service notifications. It would instantly federate
with all ROS instances in the team to determine if the network device(s) in question are still
connected to any of the backup (slave) controllers within the team. If that is the case, it would
elect one of the slaves to assume mastership over the affected network device(s).
Failback: When the configured master recovers from a failure and joins the team again, or when the connection from the disconnected device(s) with the original master is resumed, ROS initiates a failback operation, i.e. mastership is restored to the configured master as defined in the region definition.
ROS exposes APIs through which interested applications can:
1) Create, delete or update a region definition
2) Determine the current master for a given device identified by a datapathId or IP address
3) Determine the slave(s) for a given device identified by a datapathId or IP address
4) Determine if the local controller is a master to a given device identified by a datapathId
5) Determine the set of devices for which a given controller is playing the master or slave role.
6) Register for region and role change notifications.
Details of the RegionService and RoleService APIs may be found at the Javadocs provided with the
SDK. See Javadoc on page 9 for details.
Illustrative usages of Role Service APIs
- To determine the controller which is currently playing the role of Master to a given datapath, applications can use the following APIs depending on the specific need:
import com.hp.sdn.adm.role.RoleService;
import com.hp.sdn.adm.system.SystemInformationService;
…
public class SampleService {
    // Mandatory dependency.
    private final SystemInformationService sysInfoService;
    // Mandatory dependency.
    private final RoleService roleService;

    // dpid (the DataPathId of interest) and log (a logger) are assumed
    // to be defined elsewhere in the class.
    public void doAct() {
        IpAddress masterIp = roleService.getMaster(dpid).ip();
        if (masterIp.equals(sysInfoService.getSystem().getAddress())) {
            log.debug("this controller is the master to {}", dpid);
            // now that we know this controller has master privileges
            // we could for example initiate write operations on the
            // datapath – like sending flow-mods
        }
    }
}
- To determine the role that a controller is playing with respect to a given datapath:
import com.hp.of.lib.msg.ControllerRole;
import com.hp.sdn.adm.role.RoleService;
import com.hp.sdn.region.ControllerNode;
import com.hp.sdn.region.ControllerNodeModel;
…
public class SampleService {
    // Mandatory dependency.
    private final RoleService roleService;

    public void doAct() {
        ...
        ControllerNode controller = new ControllerNodeModel("10.1.1.1");
        ControllerRole role = roleService.getCurrentRole(controller, deviceIp);
        switch (role) {
            case MASTER:
                // the given controller has master privileges
                // we can trigger write-operations from that controller
                ...
                break;
            case SLAVE:
                // we have only read privileges
                ...
                break;
            default:
                // indicates the controller and device are not associated
                // to any region.
                break;
        }
    }
}
Notification on Region and Role changes
Applications can express interest in region change notifications using the addListener(...) API in
RegionService and providing an implementation of the RegionListener. A sample listener
implementation is illustrated in the following listing:
Region Listener Example:
import com.hp.sdn.adm.region.RegionListener;
import com.hp.sdn.region.Region;
...
public class RegionListenerImpl implements RegionListener {
    ...
    @Override
    public void added(Region region) {
        log.debug("Master of new region: {}", region.master());
    }

    @Override
    public void removed(Region region) {
        log.debug("Master of removed region: {}", region.master());
    }
}
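To receive the callbacks, the listener must be registered with the RegionService using the addListener(...) API mentioned above; how the RegionService reference is obtained (for example via OSGi injection) is left out of this sketch.
Registering the Region Listener (sketch):
// regionService is assumed to be an injected RegionService reference.
regionService.addListener(new RegionListenerImpl());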
Similarly, applications can express interest in role change notifications using the addListener(...) API
in RoleService and providing an implementation of the RoleListener. A sample listener
implementation is illustrated in the following listing:
Role Listener Example:
import com.hp.sdn.adm.role.RoleEvent;
import com.hp.sdn.adm.role.RoleListener;
...
public class RoleListenerImpl implements RoleListener {
    ...
    @Override
    public void rolesAsserted(RoleEvent roleEvent) {
        log.debug("Previous master: {}", roleEvent.oldMaster());
        log.debug("New master: {}", roleEvent.newMaster());
        log.debug("Affected datapaths: {}", roleEvent.datapaths());
    }
}
OpenFlow
OpenFlow messages are sent and received between the controller and the switches (datapaths) it
manages. These messages are byte streams, the structure of which is documented in the OpenFlow
Protocol Specification documents published by the Open Networking Foundation (ONF) [28].
The Message Library is a Java implementation of the OpenFlow specification, providing facilities for encoding and decoding OpenFlow messages from and to Java rich data types.
The controller handles the connections from OpenFlow switches and provides the means for upper layers of software to interact with those switches via the ControllerService API.
The following figure illustrates this:
Figure 14 OpenFlow Controller
Message Library
The Message Library is a Java implementation of the OpenFlow specification, providing facilities for
encoding and decoding OpenFlow messages from and to Java rich data types.
Design Goals
The following are the overall design goals of the library:
• To span all OpenFlow protocol versions
  – However, actively supporting just 1.0.0 and 1.3.2
• To be extensible
  – Easily accommodating future versions
• To provide an elegant, yet simple, API for handling OpenFlow messages
• To reduce the burden on application developers
  – Insulating developers from differences across protocol versions, as much as possible
• To expose the semantics but hide the syntax details
  – Developers will not be required to encode and decode bitmasks, calculate message lengths, insert padding, etc.
• To be robust and type-safe
  – Working with Java enumerations and types
Design Choices
Some specific design choices were made to establish the underlying principles of the implementation, to help meet the goals specified above.
• All OpenFlow messages are fully creatable/encodable/decodable, making the library completely symmetrical in this respect.
  – The controller (or app) never creates certain messages (such as PortStatus, FlowRemoved, MultipartReply, etc.) as these are only ever generated by the switch. Technically, we would only need to decode those messages, never encode them.
  – However, providing a complete solution allows us to emulate OpenFlow switches in Java code. This facilitates the writing of automated tests to verify switch/controller interactions in a deterministic manner.
• Message instances, for the most part, are immutable.
  – This means a single instance can be shared safely across multiple applications (and multiple threads) without synchronization.
  – This implies that the structures that make up the message (ports, instructions, actions, etc.) must also be immutable.
• Where possible, “Data Types” will be used to encourage API type-safety – see the Javadocs for com.hp.util.ip and com.hp.of.lib.dt.
• Where bitmasks are defined in the protocol, Java enumerations are defined with a constant for each bit.
  – A specific bitmask value is represented by a Set of the appropriate enumeration constants.
  – For example: Set<PortConfig>
• A message instance is mutable only while the message is under construction (for example, an application composing a FlowMod message). To be sent through the system it must be converted to its immutable form first.
• To create and send a message, an application will:
  – Use the Message Factory to create a mutable message of the required type
  – Set the state (payload) of the message
  – Make the message immutable
  – Send the message via the ControllerService API.
• The Core Controller will use the Message Factory to encode the message into its byte-stream form, for transmitting to the switch.
• The Core Controller will use the Message Factory to decode incoming messages from their byte-stream form into their (immutable) rich data type form.
Figure 15 Message Factory Role
Message Composition and Type Hierarchy
All OpenFlow message instances are subclasses of the OpenflowMessage abstract class. Every
message includes an internal Header instance that encapsulates:
• The protocol version
• The message type
• The message length (in bytes)
• The transaction ID (XID)
In addition to the header, specific messages may include:
• Data values, such as “port number”, “# bytes processed”, “metadata mask”, “h/w address”, etc. These values are represented by Java primitives, enumeration constants, or data types.
• Other common structures, such as Ports, Matches, Instructions, Actions, etc. These structure instances are all subclasses of the OpenflowStructure abstract class.
For each defined OpenFlow message type (see com.hp.of.lib.msg.MessageType) there are
corresponding concrete classes representing the immutable and mutable versions of the message.
For a given message type (denoted below as “Foo”) the following class relationships exist:
Figure 16 OpenFlow Message Class Diagram
Each mutable subclass includes a private Mutable object that determines whether the instance is still
“writable”. While writable, the “payload” of the mutable message can be set. Once the message
has been made immutable, the mutable instance is marked as “no longer writable”; any attempt to
change its state will result in an InvalidMutableException being thrown.
Note that messages are passive in nature as they are simply data carriers.
Note also that structures (e.g. a Match) have a very similar class relationship.
Factories
Messages and structures are parsed or created by factories. Since the factories are all about
processing, but contain no state, the APIs consist entirely of static methods. Openflow messages are
created, encoded, or parsed by the MessageFactory class. Supporting structures are created,
encoded, or parsed by supporting factories, e.g. MatchFactory, FieldFactory, PortFactory, etc.
The main factory that application developers will deal with is the MessageFactory:
Figure 17 Message Factory Class Diagram
The other factories that a developer might use are:
• MatchFactory—creates matches, used in FlowMods
• FieldFactory—creates match fields, used in Matches
• InstructionFactory—creates instructions for FlowMods
• ActionFactory—creates actions for instructions, (1.0 flowmods), and group buckets
• PortFactory—creates port descriptions
  – Note that there are “reserved” values (special port numbers) defined on the Port class
• QueueFactory—creates queue descriptions
• MeterBandFactory—creates meter bands, used in MeterMod messages
• BucketFactory—creates buckets, used in GroupMod messages
• TableFeatureFactory—creates table feature descriptions
Note that application developers should not ever need to invoke “parse” or “encode” methods on
any of the factories; those methods are reserved for use by the Core Controller.
An example: creating a FlowMod message
The following listing shows an example of how to create a flowmod message:
Flowmod Message Example:
public class SampleFlowModMessageCreation {
    private static final ProtocolVersion PV = ProtocolVersion.V_1_3;
    private static final long COOKIE = 0x00002468;
    private static final TableId TABLE_ID = TableId.valueOf(200);
    private static final int FLOW_IDLE_TIMEOUT = 300;
    private static final int FLOW_HARD_TIMEOUT = 600;
    private static final int FLOW_PRIORITY = 50;
    private static final Set<FlowModFlag> FLAGS = EnumSet.of(
            FlowModFlag.SEND_FLOW_REM,
            FlowModFlag.CHECK_OVERLAP,
            FlowModFlag.NO_BYTE_COUNTS
    );
    private static final MacAddress MAC =
            MacAddress.valueOf("00001e:000000");
    private static final MacAddress MAC_MASK =
            MacAddress.valueOf("ffffff:000000");
    private static final PortNumber SMTP_PORT = PortNumber.valueOf(25);
    private static final MacAddress MAC_DEST = MacAddress.BROADCAST;
    private static final IpAddress IP_DEST = IpAddress.LOOPBACK_IPv4;

    private OfmFlowMod sampleFlowModCreation() {
        // Create a 1.3 FlowMod ADD message...
        OfmMutableFlowMod fm = (OfmMutableFlowMod)
                MessageFactory.create(PV, MessageType.FLOW_MOD,
                        FlowModCommand.ADD);
        // NOTE: outPort = ANY and outGroup = ANY by default so we don't have
        // to explicitly set them.
        // Also, bufferId defaults to BufferId.NO_BUFFER.
        // Set the payload (timeouts, priority, cookie, flags, match and the
        // instructions from createInstructions()) on fm here, then convert
        // the message to its immutable form before sending it:
        return (OfmFlowMod) fm.toImmutable();
    }

    // Static imports assumed:
    //   com.hp.of.lib.instr.ActionFactory.createAction
    //   com.hp.of.lib.instr.ActionFactory.createActionSetField
    //   com.hp.of.lib.instr.InstructionFactory.createInstruction
    //   com.hp.of.lib.instr.InstructionFactory.createMutableInstruction
    // INS_META_DATA, INS_META_MASK, ETH_DST and IPV4_DST are assumed to be
    // defined elsewhere (metadata values and match-field type constants).
    private List<Instruction> createInstructions() {
        List<Instruction> result = new ArrayList<Instruction>();
        result.add(createInstruction(PV, InstructionType.WRITE_METADATA,
                INS_META_DATA, INS_META_MASK));
        InstrMutableAction apply = createMutableInstruction(PV,
                InstructionType.APPLY_ACTIONS);
        apply.addAction(createAction(PV, ActionType.DEC_NW_TTL))
             .addAction(createActionSetField(PV, ETH_DST, MAC_DEST))
             .addAction(createActionSetField(PV, IPV4_DST, IP_DEST));
        result.add((Instruction) apply.toImmutable());
        return result;
    }
}
Core Controller
The Core Controller handles the connections from OpenFlow switches and provides the means for
upper layers of software to interact with those switches via the ControllerService API.
Design Goals
The following are the overall design goals of the core controller:
• To support OpenFlow 1.0.0 and 1.3.2 switches.
• To provide the base platform for higher-level OpenFlow Controller functionality.
• To implement the services of:
  − Accepting and maintaining connections from OpenFlow-capable switches
  − Maintaining information about the state of all OpenFlow ports on connected switches
  − Conforming to protocol rules for sending messages back to switches
• To provide a modular framework for controller sub-components, facilitating extensibility of the core controller.
• To provide an elegant, yet simple, API for Network Service components and SDN Applications to access the core services.
• To provide a certain degree of “sandboxing” of applications to protect them (and the controller itself) from ill-performing applications.
Design Choices
Some specific design choices were made to establish the underlying principles of the
implementation, to help meet the goals specified above.
• The controller will use the OpenFlow Message Library to encode / decode OpenFlow messages; all APIs will be defined in terms of OpenFlow Java rich data-types.
• All OpenFlow messages and structures passed into and out of the controller must be immutable.
• Services and Applications may register as listeners to be notified of events such as:
  − Datapaths connecting or disconnecting
  − Messages received from datapaths
  − Packets received from datapaths (packet-in processing)
  − Flows being added to or removed from datapaths
• The controller will decouple incoming connection events and message events from the consumption of those events by listeners, using bounded event queues.
  − This will provide some level of protection for the controller and for the listeners, from an ill-performing listener implementation.
  − It is up to each listener to consume events fast enough to keep pace with the rate of arrival.
  − In the event that the listener is unable to do so, an out-of-band “queue-full” event will be posted, and event queuing for that listener will be suspended.
• Services and Applications will interact with the controller via the ControllerService API.
• The controller will be divided into several modules, each responsible for specific tasks:
  − Core Controller—listens for connections from, and maintains state information about, OpenFlow switches (datapaths).
  − Packet Sequencer—listens for Packet-In messages, orchestrates the processing and subsequent transmission of Packet-Out replies.
  − Flow Tracker—provides basic management of flow rules, meters, and groups.
Controller Service
The ControllerService API provides a common façade for consumers to interact with the controller.
The implementing class (ControllerManager) delegates to the appropriate sub-component or to the
core controller. The following sections briefly describe the API methods, with some code examples –
see the Javadocs for more details.
In the following code examples, it is assumed that a reference to the controller service implementation has been stored in the field cs:
private ControllerService cs = ...;
Datapath Information
Information about datapaths that have connected to the controller is available; either all connected
datapaths, or a datapath with a given ID:
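The relevant ControllerService accessors look roughly like the following sketch. The getDataPathInfo(...) method is used in the example below; getAllDataPathInfo() is an assumed name for the "all datapaths" accessor, so consult the ControllerService Javadocs for the authoritative signatures.

// Sketch only: getAllDataPathInfo() is an assumed method name.
Set<DataPathInfo> getAllDataPathInfo();
DataPathInfo getDataPathInfo(DataPathId dpid);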
The DataPathInfo API provides information about a datapath:
• the datapath ID
• the negotiated protocol version
• the time at which the datapath connected to the controller
• the time at which the last message was received from the datapath
• the list of OpenFlow-enabled ports
• the reported number of buffers
• the reported number of tables
• the set of capabilities
• the remote (IP) address of the connection
• the remote (TCP) port of the connection
• a textual description
• the manufacturer
• the hardware version
• the software version
• the serial number
• a device type identifier
The following listing shows an example of how to use Datapath information:
Datapath Information Example:
DataPathId dpid = DataPathId.valueOf("00:0 0:00:00:00:00:00:01");
DataPathInfo dpi;
try {
dpi = cs.getDataPa thInfo(dpid);
log.info("Datapath with ID {} is connected", dpid);
log.info("Nego tiated protocol ve rsion is {}", dpi.negotiated());
for (Port p: dpi.ports()) {
...
}
} catch (NotFo undException e) {
log.warn("Datapath w ith I D {} is not connected", dpid);
}
Listeners
Application code may wish to be notified of events via a callback mechanism. A number of
methods allow the consumer to register as a listener for certain types of event:
• Message Listeners – notified when OpenFlow messages arrive from a datapath. At registration, the listener specifies the message types of interest. Note that one exception to this is PACKET_IN messages; to hear about these, one must register as a SequencedPacketListener.
• Sequenced Packet Listeners – notified when PACKET_IN messages arrive from a datapath. This mechanism is described in more detail in a following section.
• Flow Listeners – notified when FLOW_MOD messages are pushed out to datapaths, or when flow rules are removed from datapaths (either explicitly, or by timeout).
• Group Listeners – notified when GROUP_MOD messages are pushed out to datapaths.
• Meter Listeners – notified when METER_MOD messages are pushed out to datapaths.
The following listing shows an example that listens for ECHO_REPLY messages (presumably we
have some other code that is sending ECHO_REQUEST messages), and PORT_STATUS messages.
ECHO_REPLY and PORT_STATUS Example:
private static final Set<MessageType> INTEREST = EnumSet.of(
        MessageType.ECHO_REPLY,
        MessageType.PORT_STATUS
);

private class MyListener implements MessageListener {
    @Override
    public void queueEvent(QueueEvent event) {
        log.warn("Message Listener Queue event: {}", event);
    }

    @Override
    public void event(MessageEvent event) {
        if (event.type() == OpenflowEventType.MESSAGE_RX) {
            OpenflowMessage msg = event.msg();
            DataPathId dpid = event.dpid();
            switch (msg.getType()) {
                case ECHO_REPLY:
                    handleEchoReply((OfmEchoReply) msg, dpid);
                    break;
                case PORT_STATUS:
                    handlePortStatus((OfmPortStatus) msg, dpid);
                    break;
                default:
                    break;
            }
        }
    }

    private void handleEchoReply(OfmEchoReply msg, DataPathId dpid) {
        ...
    }

    private void handlePortStatus(OfmPortStatus msg, DataPathId dpid) {
        ...
    }
}
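To activate the listener it must be registered with the controller service. The following one-liner is a sketch; the method name addMessageListener is an assumption, so check the ControllerService Javadocs for the exact registration call.

// Register for the message types declared in INTEREST (assumed method name).
cs.addMessageListener(new MyListener(), INTEREST);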
Statistics
The ControllerService API has a number of methods for retrieving various “statistics” about the
controller, or about datapaths in the network.
• getStats()—returns statistics on byte and packet counts, from the controller’s perspective.
• getPortStats(...)—queries the specified datapath for statistics on its ports.
• getFlowStats(...)—queries the specified datapath for statistics on installed flows.
• getGroupDescription(...)—queries the specified datapath for its group descriptions.
• getGroupStats(...)—queries the specified datapath for statistics on its groups.
• getGroupFeatures(...)—queries the specified datapath for the group features it supports.
• getMeterConfig(...)—queries the specified datapath for its meter configurations.
• getMeterStats(...)—queries the specified datapath for statistics on its meters.
• getMeterFeatures(...)—queries the specified datapath for the meter features it supports.
• getExperimenter(...)—queries the specified datapath for meter configuration or statistics for OpenFlow 1.0 datapaths.
As an example, a method to print all the flows on a given datapath could be written as follows:
Flows Example:
private void printFlowStats(DataPathId dpid) {
    List<MBodyFlowStats> stats = cs.getFlowStats(dpid, TableId.ALL);
    // Note: the above is a blocking call, which will wait for the
    // controller to send the request to the datapath and retrieve the
    // response, before returning.
    print("All flows installed on datapath {} ...", dpid);
    for (MBodyFlowStats fs: stats)
        printFlow(fs);
}

private void printFlow(MBodyFlowStats fs) {
    print("Table ID     : {}", fs.getTableId());
    print("Duration     : {} secs", fs.getDurationSec());
    print("Idle Timeout : {} secs", fs.getIdleTimeout());
    print("Hard Timeout : {} secs", fs.getHardTimeout());
    print("Match        : {}", fs.getMatch());
    // Note: this is one area where we need to be cognizant of the version:
    if (fs.getVersion() == ProtocolVersion.V_1_0)
        print("Actions      : {}", fs.getActions());
    else
        print("Instructions : {}", fs.getInstructions());
}
Sending Messages
Applications may construct and send messages to datapaths via the “send” methods:
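The listing of the send method signatures is not reproduced in this excerpt; they look roughly like the sketch below. The shapes shown are assumptions for illustration only, so consult the ControllerService Javadocs for the authoritative signatures.

// Assumed shapes of the ControllerService send methods (illustrative only).
MessageFuture send(OpenflowMessage msg, DataPathId dpid);
List<MessageFuture> send(List<OpenflowMessage> msgs, DataPathId dpid);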
The returned MessageFuture(s) allow the caller to choose whether to wait synchronously (block
until the outcome of the request is known), or whether to do some other work and then check on
the result of the request later.
When a message is sent to a datapath, the corresponding MessageFuture encapsulates the state
of that request. Initially the future’s result is UNSATISFIED. Once the outcome is determined, the
future is “satisfied” with one of the following results:
• SUCCESS—the request was a success; the reply message is available via reply().
• SUCCESS_NO_REPLY—the request was a success; there is no associated reply.
• OFM_ERROR—the request failed; the datapath issued an error, available via reply().
• EXCEPTION—the request failed due to an exception; available via cause().
• TIMEOUT—the request timed-out waiting for a response from the datapath.
The following listing shows a code example that attaches a timestamp payload to an
ECHO_REQUEST message, then retrieves the timestamp payload from the ECHO_REPLY sent back
by the datapath:
ECHO_REQUEST and ECHO_REPLY Example:
private static final ProtocolVersion PV = ProtocolVersion.V_1_3;
private static final int SIZE_OF_LONG = 8;
private static final String E_ECHO_FAILED =
        "Failed to send Echo Request: {}";
private static final long REQUEST_TIMEOUT_MS = 5000;

// NOTE: the opening of this method spans a page break in the original listing
//       and is reconstructed here; the send/await/result calls are assumptions
//       shown for illustration only (consult the MessageFuture Javadocs).
private void sendEchoRequest(DataPathId dpid) {
    // Pack the current time into the echo payload (java.nio.ByteBuffer).
    byte[] timestamp = ByteBuffer.allocate(SIZE_OF_LONG)
            .putLong(System.currentTimeMillis()).array();
    try {
        MessageFuture future = cs.send(createEchoRequest(timestamp), dpid);
        future.awaitUninterruptibly(REQUEST_TIMEOUT_MS);   // assumed wait call
        if (future.result() == MessageFuture.Result.SUCCESS) {
            long now = System.currentTimeMillis();
            long then = retrieveTimestamp(future.reply());
            long duration = now - then;
            log.info("ECHO Latency to {} is {} ms", dpid, duration);
        } else {
            log.warn(E_ECHO_FAILED, future.result());
        }
    } catch (Exception e) {
        log.warn(E_ECHO_FAILED, e.toString());
    }
}

private OpenflowMessage createEchoRequest(byte[] timestamp) {
    OfmMutableEchoRequest echo = (OfmMutableEchoRequest)
            MessageFactory.create(PV, MessageType.ECHO_REQUEST);
    echo.data(timestamp);
    return echo.toImmutable();
}

private long retrieveTimestamp(OpenflowMessage reply) {
    OfmEchoReply echo = (OfmEchoReply) reply;
    return ByteUtils.getLong(echo.getData(), 0);
}
Packet Sequencer
PACKET_IN messages are handled by the controller with the Packet Sequencer module. The design
of this module provides an orderly, deterministic, yet flexible, scheme for allowing code running on
the controller to register for participation in the handling of PACKET_IN messages. An application
wishing to participate will implement the SequencedPacketListener (SPL) interface.
The following figure illustrates the relationship between the Sequencer and the SPLs participating
in the processing chain:
Figure 18 Packet-In Processing
The Roles provide three broad bands of participation with the processing of PACKET_IN messages:
• An ADVISOR may analyze and provide additional metadata about the packet (attached as “hints” for listeners further downstream), but does not contribute directly to the formation of the PACKET_OUT message.
• A DIRECTOR may contribute to the formation of the associated PACKET_OUT message by adding actions to it; DIRECTORs may also determine that the PACKET_OUT message is ready to be sent back to the datapath, and can instruct the Sequencer to send it on its way.
• An OBSERVER passively monitors the PACKET_IN/PACKET_OUT interactions.
Within each role, SPLs are processed in order of decreasing “altitude”. The altitude is specified
when the SPL registers with the controller. Between them, the role and altitude provide a
deterministic ordering of the “processing chain”.
When a PACKET_IN message event occurs, the PACKET_IN is wrapped in a MessageContext
which provides the context for the packet being processed. The packet is also decoded to the
extent where the network protocols present in the packet are identified; this information is attached
to the context.
The message context is passed from SPL to SPL (via the event() callback) in the predetermined
order, but only to those SPLs where at least one of the network protocols present in the packet is
also defined in the SPL’s “interest” set:
• During an ADVISOR’s event() callback, hints might be attached to the context with a call to addHint(Hint).
• During a DIRECTOR’s event() callback, the PacketOut API may be utilized to:
  − Add an action to the PACKET_OUT message under construction.
  − Clear all the actions from the PACKET_OUT message under construction.
  − Indicate to the sequencer that the packet should be blocked (i.e. not sent back to the source datapath).
  − Indicate to the sequencer that the packet should be sent (i.e. the PACKET_OUT should be transmitted back to the source datapath).
• During an OBSERVER’s event() callback, the context can be examined to determine the outcome of the packet processing.
Once a DIRECTOR invokes the PacketOut.send() method from their callback, the sequencer will
convert the mutable PACKET_OUT message to its immutable form and attempt to send it back to
the datapath. If an error occurs during the send, this fact is recorded in the message context, and
the DIRECTOR’s errorEvent() callback is invoked.
Note that every SPL that registers with the sequencer is guaranteed to see every MessageContext
(subject to their ProtocolId “interest” set).
Here is some sample code that shows how to register as an observer of DNS packets sent to the
controller in PACKET_IN messages:
private static final int OBS_ALTITUDE = 25;
private static final Set<ProtocolId> OBS_INTEREST = EnumSet.of(ProtocolId.DNS);

private final MyObserver myObserver = new MyObserver();

private static class MyObserver extends SequencedPacketAdapter {
    @Override
    public void event(MessageContext context) {
        Dns dns = context.decodedPacket().get(ProtocolId.DNS);
        reportOnDnsPacket(dns, context.srcEvent().dpid());
    }

    private void reportOnDnsPacket(Dns dns, DataPathId dpid) {
        // Since packet processing (this thread) is fast-path,
        // queue the report task onto a separate thread, then return.
        // ...
    }
}
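To place the observer in the processing chain it must be registered with the controller service. The following call is a sketch; the method name addPacketListener and the role enum PacketListenerRole are assumptions made for illustration, so check the ControllerService Javadocs.

// Register as an OBSERVER at the chosen altitude, for DNS packets only
// (assumed method and enum names, shown for illustration).
cs.addPacketListener(myObserver, PacketListenerRole.OBSERVER,
        OBS_ALTITUDE, OBS_INTEREST);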
Note that event processing should happen as fast as possible, since this is key to the performance
of the controller. In the example above, it is suggested that the task of reporting on the DNS
packet is submitted to a queue to be processed in a separate thread, so as not to hold up the
main IO-Loop.
Message Context
The MessageContext is the object which maintains the state of processing a PACKET_IN message,
and the formulation of the PACKET_OUT message to be returned to the source datapath. When a
PACKET_IN message is received by the controller, several things happen:
• A new MessageContext is created
• The PACKET_IN message event is attached
• The packet data (if there is any) is decoded and the Packet model attached
• A mutable PACKET_OUT message is created and attached (with appropriate fields set)
• The MessageContext is passed from listener to listener down the processing chain
The MessageContext provides the following methods:
• srcEvent() – returns the message event (immutable) containing the PACKET_IN message received from the datapath.
• getVersion() – returns the protocol version of the datapath / OpenFlow message.
• getPacketIn() – returns the PACKET_IN message from the message event.
• decodedPacket() – returns the network packet model (immutable) of the decoded packet data.
• getProtocols() – returns an ordered list of protocol IDs for the protocol layers in the decoded packet.
• packetOut() – returns the PacketOut API, through which actions may be applied to the PACKET_OUT message under construction.
• getCompletedPacketOut() – returns the PACKET_OUT message (immutable) that was sent back to the datapath.
• addHint(Hint) – adds a hint to the message context.
• getHints() – returns the list of hints attached to the context.
• isHandled() – returns true if a DIRECTOR has already instructed the sequencer to send or block the PACKET_OUT message.
• isBlocked() – returns true if a DIRECTOR has already instructed the sequencer to block the PACKET_OUT message.
• isSent() – returns true if a DIRECTOR has already instructed the sequencer to send the PACKET_OUT message.
• isTestPacket() – returns true if the associated packet has been determined to be a diagnostic test packet.
• requiresProcessing() – returns true if the associated packet is not a test packet, and has not yet been blocked or sent.
• failedToSend() – returns true if the attempt to send the PACKET_OUT message failed.
• toDebugString() – returns a detailed, multi-line string representation of the message context.
Flow Tracker and Pipeline Manager
The Flow Tracker is a sub-component of the core controller that facilitates management of flow
rules, meters and groups across all datapaths managed by the controller. Its functionality is
accessed through the ControllerService API.
The Pipeline Manager is a sub-component that maintains an in-memory model of the flow table
capabilities of (1.3) datapaths. When an application attempts to install a flow, the flow tracker will
consult the pipeline manager to choose a suitable table in which to install the flow, if no explicit
table ID has been provided by the caller.
Flow Management
Flow management includes:
• Getting flow statistics from a specified datapath, for one or all flow tables
• Adding or modifying flows on a specified datapath
• Deleting flows from a specified datapath
See the earlier Message Library section for an example of how to create a FLOW_MOD message.
Group Management
Group management includes:
• Getting group descriptions from a datapath, for one or all groups.
• Getting group statistics from a datapath, for one or all groups.
• Sending group configuration to a datapath.
Note that groups are only supported for OpenFlow 1.3 datapaths.
Meter Management
Meter management includes:
• Getting meter configurations from a datapath, for one or all meters.
• Getting meter statistics from a datapath, for one or all meters.
• Sending meter configuration to a datapath.
Note that meters are only supported for OpenFlow 1.3 datapaths. However, some 1.0 datapaths
can support metering through the use of EXPERIMENTER messages.
Flow Rules
The primary mechanism used in the implementation of SDN applications is the installation of flow
rules (aka “FlowMods”) on datapaths (aka switches).
Flow Classes
Before a FlowMod can be constructed and sent via the controller service, a corresponding “Flow
Class” must be registered. The flow class explicitly defines the match fields that will be present in the
flow, and the types of actions that will be taken when the flow rule is matched. The registration of
flow classes also enables the controller to arbitrate flow priorities and therefore minimize conflicts
amongst co-resident SDN applications.
A flow class can be registered with code similar to the following:
private static final String L2_PATH_FWD = "com.foo.app.l2.path";
private static final String PASSWORD = "aPjk57";
private static final String L2_DESC = "Reactive path forwarding flows";

private volatile ControllerService controller = ...; // injected reference
private FlowClass l2Class;

private void init() {
    l2Class = new FlowClassRegistrator(L2_PATH_FWD, PASSWORD, L2_DESC)
            .fields(ETH_SRC, ETH_DST, ETH_TYPE, IN_PORT)
            .actions(FORWARD).register(controller);
}
On creating the Registrator, the first parameter is a logical name for the flow class, the second
parameter is a password used to verify ownership of the flow class (typically via the REST API), and
the third parameter is a short text description of the class (that is displayed in the UI).
“fields” should specify the list of match fields that will be set in the match; “actions” is the class of
actions that will be employed in the actions/instructions of the FlowMod.
Note the use of static imports, which makes the code more concise and easier to read.
The flow class instance created by the controller service is needed to inject both the controller-assigned priority and the controller-assigned base cookie for the class. On creating the flow mod
message, code such as the following might be used:
private static final long MY_COOKIE = 0x00beef00;
private static final ProtocolVersion pv = ProtocolVersion.V_1_3;
// ... set match fields and actions ...
// ... send flow ...
The flow class is assigned a unique “base cookie” (top 16 bits of the 64 bit field) which must be
“OR”ed with any cookie value that you wish to include in the flow (bottom 48 bits of the 64 bit
field).
The flow class “priority” is a private, logical key to be stored in the FlowMod “priority” field. It is
used by the controller to look up the pre-registered flow class record, so that the match fields and
actions of the FlowMod can be validated against the list of intended matches/actions.
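A minimal sketch of how these controller-assigned values might be applied when building the FlowMod follows. The FlowClass accessor names (cookie(), priority()), the sendFlowMod(...) call, and the dpid variable are assumptions made for illustration only; consult the Javadocs for the actual API.

OfmMutableFlowMod fm = (OfmMutableFlowMod)
        MessageFactory.create(pv, MessageType.FLOW_MOD, FlowModCommand.ADD);
// OR the controller-assigned base cookie (top 16 bits) with our own value
// (bottom 48 bits), and store the controller-assigned priority for the class.
fm.cookie(l2Class.cookie() | MY_COOKIE);   // assumed accessor name
fm.priority(l2Class.priority());           // assumed accessor name
// ... set match fields and actions ...
// dpid: the DataPathId of the target switch (assumed to be in scope)
controller.sendFlowMod((OfmFlowMod) fm.toImmutable(), dpid);  // assumed send call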
When your application gets uninstalled, be sure to unregister any flow classes you created:
private void cleanup() {
    controller.unregisterFlowClass(l2Class, PASSWORD);
}
Flow Contributors
When a datapath first connects to the controller, an initial handshaking sequence is employed.
A component may implement InitialFlowContributor and register itself with the controller service.
During the initial-flow step of that handshake (step 7), the provideInitialFlows(...) callback method will be invoked on every
registered contributor, requesting any flows to be included in the set of initial flows to be laid down
on the newly-connected datapath.
A possible implementation might look like this:
@Override
public List<OfmFlowMod> provideInitialFlows(DataPathInfo info,
                                            boolean isHybrid) {
List<OfmFlowMod> result = new ArrayList<>(1);
if (isHybrid)
result.add(buildFlowMod(info));
return result;
}
Note that the info parameter provides information about the newly-connected datapath, and the
isHybrid parameter indicates whether the controller is configured for hybrid mode or not.
Such a component must register with the controller service to have its callback invoked at the
appropriate times:
controller.registerInitialFlowContributor(this);
Metrics Framework
The fundamental objectives to be addressed by the metering framework are as follows.
• Support components that are part of the HP VAN SDN Controller Framework and applications that are not.
• Make metrics simple to use.
• Support the creation and updating of metrics within the controller and from outside, to accommodate apps that have external components but want to keep all of their metric data in one repository.
• Support several metric types:
  − Counter
  − Gauge
  − Rolling counter
  − Ratio gauge
  − Histogram
  − Meter
  − Timer
• Designed to be robust:
  − Maintains functionality when the controller stops and restarts
  − Maintains functionality when the metering framework stops and restarts, but the controller does not
• Support persistence of data over time on different time scales.
• Support display of specified metrics via JMX.
• Support authorization-based REST access to persisted data over time.
External View
The overarching purpose of metering support is to provide a centralized facility that application
developers can use to track metric values over time, and to provide access to the resulting time
stamped values thereafter via REST. The use of this facility, as shown in the following conceptual
diagram, should demand relatively little effort from a developer beyond creating and updating the metrics they wish to utilize.
Figure 19 Metrics Architecture
Essentially, a component or application must contact the MetricService to create a new
TimeStampedMetric on its behalf; it will be returned a reference to the resulting (new)
TimeStampedMetric object. The developer can then manipulate the returned TimeStampedMetric
object as appropriate for their own needs, updating its value at their own cadence, on a regular
or irregular basis, to reflect changes in whatever is being measured.
Behind the scenes, the MetricService API is backed by a MetricManagerComponent OSGi
component. This component delegates almost all of its work to a MetricManager singleton, which
(conceptually) contains a centralized Collection of the TimeStampedMetric references doled out at
the request of other components and applications. This Collection of TimeStampedMetric
references allows the metering framework to process the TimeStampedMetrics en masse,
irrespective of which application or component requested them, in a fashion that is completely
decoupled from the requesting application's or component's use of the TimeStampedMetrics.
The most essential processing done by the metering framework is to periodically persist
TimeStampedMetric values to disk, and to expose "live" TimeStampedMetric values through JMX.
Other processing is also done, such as aging out old TimeStampedMetric values. Decoupled from
this ongoing persistence of TimeStampedMetric values that are still being used, values that have
already been persisted from TimeStampedMetrics over time may be read via the REST API and
exported for further analysis or processing outside the controller.
TimeStampedMetric Types
There are seven types of TimeStampedMetric. They are listed below, with an example of how each
type might be used.
TimeStampedCounter
A cumulative measurement that is incremented or decremented when some event occurs.
Example application: the number of OpenFlow devices discovered by the controller.
TimeStampedGauge
An instantaneous measure.
Example application: the amount of disk space consumed by metric data.
TimeStampedHistogram
A distribution of values from a stream of data for which mean, minimum, maximum, and
various quartile values are tracked.
Example application: distribution of OpenFlow flow sizes.
TimeStampedMeter
Aggregates event durations to measure event throughput.
Example application: the frequency with which OpenFlow flow requests are sent to the
controller by a specific switch.
TimeStampedRatioGauge
A ratio between two non-cumulative instantaneous numbers.
Example application: the amount of disk space consumed by a specific application's
metric data compared to all metric data.
TimeStampedRollingCounter
A cumulative measurement that is asymptotically increased when some event occurs, and
may eventually roll over to zero and begin anew.
Example application: a MIB counter that represents the number of octets observed in a
specific subnet.
TimeStampedTimer (combines the functionality of TimeStampedHistogram and
TimeStampedMeter)
Aggregates event durations to provide statistics about the event duration and throughput.
Example application: the rate at which entries are placed on a queue and a histogram
of the time they spent on the queue.
TimeStampedMetric Life Cycle
Creating a TimeStampedMetric
It is possible to create a TimeStampedMetric and track its value from a component or application
that is running within the controller.
To request that the MetricService create a new TimeStampedMetric, a component or application
must provide a MetricDescriptor object that specifies the characteristics of the desired
TimeStampedMetric. A MetricDescriptor contains four fields that, when combined, produce a
combination (four-tuple) that is unique to that MetricDescriptor and the resulting
TimeStampedMetric: an application ID, a primary tag, a secondary tag, and a metric name. The
MetricDescriptor also contains other fields, as follows.
Required Field(s)
• A name that is unique among TimeStampedMetrics of the same application ID, primary tag, and secondary tag combination (String).
Optional Field(s)
• The ID of the application creating the TimeStampedMetric instance (String, defaulted to the application ID).
• A primary tag (String, no default).
• A secondary tag (String, no default).
• A description (String, no default).
• The summary interval in minutes (enumerated value, defaulted to 1 minute).
• Whether values for the resulting TimeStampedMetric should be visible to the controller's JMX server (boolean, defaulted to false).
• Whether values for the resulting TimeStampedMetric should be persisted (boolean, defaulted to true).
The summary interval uses an enumerated data type to restrict the possible values to 1, 5, or 15 minutes. Also, note that while the values of most TimeStampedMetrics will likely be persisted over time, there may be cases, for example troubleshooting metrics, in which it is not desired to persist the values as a time series but just to view them in real time via JMX.
The primary and secondary tags are provided as a means of grouping metrics for a specific application. For example, consider an application that is to monitor router port statistics; it might have collected a metric called TxFrames from every port of every router. The primary and secondary tags would then be used to segment the occurrences of the TxFrames metric from each router port. For some router A, port X, the four-tuple that identifies the specific instance of TimeStampedMetric corresponding to that port might be: the application's ID, primary tag "RouterA", secondary tag "PortX", and metric name "TxFrames".
There is a MetricDescriptor subclass that corresponds to each type of TimeStampedMetric. These MetricDescriptor subtypes can only be created by using the corresponding MetricDescriptorBuilder subclasses. The relationships between the desired TimeStampedMetric type, the corresponding MetricDescriptor subtype, and the MetricDescriptorBuilder subclass used to produce an instance of that MetricDescriptor subtype are summarized below.
Table 2 Metric Descriptor Subtypes
TimeStampedMetric            Corresponding MetricDescriptor Subtype   Required MetricDescriptorBuilder Subtype
TimeStampedCounter           CounterDescriptor                        CounterDescriptorBuilder
TimeStampedGauge             GaugeDescriptor                          GaugeDescriptorBuilder
TimeStampedHistogram         HistogramDescriptor                      HistogramDescriptorBuilder
TimeStampedMeter             MeterDescriptor                          MeterDescriptorBuilder
TimeStampedRatioGauge        RatioGaugeDescriptor                     RatioGaugeDescriptorBuilder
TimeStampedRollingCounter    RollingCounterDescriptor                 RollingCounterDescriptorBuilder
TimeStampedTimer             TimerDescriptor                          TimerDescriptorBuilder
Using MetricDescriptorBuilders represents the application of a well-known design pattern that
allows most of the fields of each MetricDescriptor subtype instance that is produced to be
defaulted to commonly-used values. Thus, for a typical use case in which the defaults are
applicable, the component or application that is using a MetricDescriptorBuilder to produce a
MetricDescriptor subtype instance can specify values only for the fields of the
MetricDescriptorBuilder subtype that are to differ from the default values.
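As a concrete illustration, a counter descriptor might be built roughly as follows. This is a sketch only: the builder method names (name, primaryTag, secondaryTag, description, build) are assumptions made for illustration, so check the MetricDescriptorBuilder Javadocs for the actual API.

// Hypothetical builder usage; method names are assumed for illustration.
CounterDescriptor descriptor = new CounterDescriptorBuilder()
        .name("TxFrames")                  // required unique name
        .primaryTag("RouterA")             // optional grouping tag
        .secondaryTag("PortX")             // optional grouping tag
        .description("Frames transmitted by port X of router A")
        .build();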
Call MetricService
Once a MetricDescriptor has been created, the component or application creating a
TimeStampedMetric can invoke the appropriate MetricService method for the metric type they wish
to create. The MetricService methods that pertain to TimeStampedMetric creation are listed below.
Note that the creation of one TimeStampedMetric type, TimeStampedRollingCounter, offers the
option to specify an extra parameter above and beyond the properties conveyed by
the MetricDescriptor object.
MetricService:
public interface MetricService {
    public TimeStampedCounter createCounter(CounterDescriptor descriptor);
    public TimeStampedGauge createGauge(GaugeDescriptor descriptor);
    public TimeStampedHistogram createHistogram(
            HistogramDescriptor descriptor);
    public TimeStampedMeter createMeter(MeterDescriptor descriptor);
    public TimeStampedRatioGauge createRatioGauge(
            RatioGaugeDescriptor descriptor);
    public TimeStampedRollingCounter createRollingCounter(
            RollingCounterDescriptor descriptor);
    public TimeStampedRollingCounter createRollingCounter(
            RollingCounterDescriptor descriptor, long primingValue);
    public TimeStampedTimer createTimer(TimerDescriptor descriptor);
}
The optional extra parameter for the TimeStampedRollingCounter is an initial priming value for the rolling counter, from which subsequent delta values will be taken. If it is not supplied, the value of the TimeStampedRollingCounter instance at the time of its first persistence will be used to prime the rolling counter, and no delta value will be observed until its second persistence occurs.
Upon acquiring a TimeStampedMetric instance from the MetricService, the component or
application that requested the creation has a reference to the resulting TimeStampedMetric. The
value of the TimeStampedMetric may be updated whenever the component or application wishes,
as frequently or infrequently as desired, on a schedule or completely asynchronously; the
framework's interaction with the TimeStampedMetric is unaffected by these factors. The method(s)
that may be used to update the value of a TimeStampedMetric will depend upon the type of
TimeStampedMetric. Each time the value of a TimeStampedMetric is updated, a time stamp in the
TimeStampedMetric is updated, relative to the controller's system clock, to indicate when the update occurred; this time stamp is used by the framework in processing the resultant values.
The following methods may be used to update the value of each TimeStampedMetric type.
• TimeStampedCounter
  − dec()—Decrements the current count by one.
  − dec(long)—Decrements the current count by the specified number.
  − inc()—Increments the current count by one.
  − inc(long)—Increments the current count by the specified number.
• TimeStampedGauge
  − setValue(long)—Stores the latest snapshot of the gauge value.
• TimeStampedHistogram
  − update(int)—Adds the specified value to the sample set stored by the histogram.
  − update(long)—Adds the specified value to the sample set stored by the histogram.
• TimeStampedMeter
  − mark()—Marks the occurrence of one event.
  − mark(long)—Marks the occurrence of the specified number of events.
• TimeStampedRatioGauge
  − updateNumerator(double)—Stores the latest snapshot of the numerator value.
  − updateDenominator(double)—Stores the latest snapshot of the denominator value.
  − update(double, double)—Stores the latest snapshot of both numerator and denominator values.
• TimeStampedRollingCounter
  − setLatestSnapshot(long)—Stores the latest snapshot of the rolling counter.
• TimeStampedTimer
  − time(Callable<T>)—Measures the duration of execution for the provided Callable and incorporates it into duration and throughput statistics.
  − update(int)—Adds an externally-recorded duration in milliseconds.
  − update(long)—Adds an externally-recorded duration in milliseconds.
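Putting creation and update together, the following is a minimal sketch that reuses the hypothetical CounterDescriptor built earlier and assumes a MetricService reference (here called metricService) has been injected.

// Assumed injected reference to the MetricService.
private MetricService metricService = ...;
private TimeStampedCounter txFrames;

private void init() {
    // Create the counter from the descriptor built earlier.
    txFrames = metricService.createCounter(descriptor);
}

private void frameTransmitted() {
    // Update the metric whenever the measured event occurs.
    txFrames.inc();
}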
Unregistering a TimeStampedMetric
Depending upon where its creation was initiated, from within or from outside the controller, the
collection of values from a TimeStampedMetric may be halted by a component or an application
that is running within the controller or from outside of the controller via the southbound metering
REST interface.
When the component or application that requested the creation of a TimeStampedMetric wishes to
stop the metering framework from processing a TimeStampedMetric, presumably in preparation for
destroying it, it must do so via the following MetricService method.
Metric Removal API:
public interface MetricService {
    public void removeMetric(TimeStampedMetric toRemove);
}
This method effectively unregisters the TimeStampedMetric from the metering framework so that the
framework no longer holds any references to it and thus no longer exposes it via JMX, summarizes
and persists its values, or does any other sort of processing on the TimeStampedMetric. Whether
or not the TimeStampedMetric is subsequently destroyed by the component or application that requested
its creation, it has disappeared from the framework's viewpoint.
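For example, reusing the hypothetical counter from the earlier sketches:

// Stop the framework from processing the counter before discarding it.
metricService.removeMetric(txFrames);
txFrames = null;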
Reregistering a TimeStampedMetric
If the controller bounces (goes down and then comes back up), all components and applications that are using TimeStampedMetrics within the controller will be impacted, as will the metering framework. Presumably they will initialize themselves in a predictable fashion; if they register their TimeStampedMetrics following the bounce using the same MetricDescriptor information they used before the bounce, metering should recover cleanly: the same UIDs will be assigned to their various TimeStampedMetrics that were assigned before the bounce, and the net effect will be a gap in the data on disk for TimeStampedMetrics whose values are persisted. But for application components outside the controller that created and are updating TimeStampedMetrics, there may be no indication that the controller has bounced (or gone down and stayed down) until the next time they try to update TimeStampedMetric values.
Another possible, albeit unlikely, failure scenario arises should the metering service bounce while
other components and applications do not; this could happen if someone killed and restarted the
metering OSGi bundle. If this occurred, any components or applications that are using
TimeStampedMetrics within the controller might be oblivious to the bounce as their references to
the TimeStampedMetrics they requested will still be present, but they will be effectively unregistered
from the metering framework when it reinitializes. The UIDs and MetricDescriptor data will be
preserved by the framework for TimeStampedMetrics that have their data persisted, but they will
appear to be TimeStampedMetrics that are no longer in use and just have persisted data that is
waiting to be aged out. Again, for application components outside the controller that created and
are updating TimeStampedMetrics there may be no indication that the metering service has
bounced until the next time they try to update TimeStampedMetric values.
In order to be notified that the MetricService has gone down and/or come up, the OSGi
component that corresponds to a component or application using TimeStampedMetrics should
bind to the MetricService; then a method will be invoked when either occurrence happens to the
MetricService and the component or application can react accordingly. There is no change to
normal TimeStampedMetric creation required to handle the first failure scenario outlined above, as
all OSGi components within the controller will recover after a bounce just as they do whenever the
controller is initialized. But for the second failure scenario above, there is a way that a component
or application can react when notified that the metering service has initialized following a bounce
in which the component or application that owns TimeStampedMetrics has not bounced.
To handle such a scenario, components or applications should keep a Collection of the
TimeStampedMetrics that they allocate; each TimeStampedMetric that is created on their behalf
should be added to the Collection. When the entire controller is initializing and the component or
application is notified that the MetricService is available this Collection will be empty or perhaps
not even exist yet, but in the second failure scenario above the Collection should contain
references to the pertinent TimeStampedMetrics when the MetricService becomes available. The
component or application can then iterate through the Collection, calling the following
MetricService method for each TimeStampedMetric.
Metric Registration API:
public interface MetricService {
public void registerMetric(TimeStampedMetric toRegister);
}
This will re-register the existing TimeStampedMetric reference with the metering framework.
Depending upon how long the bounce took there may be a gap in the resulting data on disk for
TimeStampedMetrics that are to be persisted. It is also possible, depending on the type of
TimeStampedMetric, that the value produced by the first interval summary following the bounce is
affected by the bounce. For example, since TimeStampedRollingCounters take the delta of the last
value reported and the previous value reported, there could be a spike in value that spans the
entire time of the bounce in the first value persisted for a TimeStampedRollingCounter.
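A minimal sketch of the collection-based recovery pattern described above follows; the collection, the method names of the owning component, and the OSGi binding hook are illustrative assumptions.

// Keep a reference to every TimeStampedMetric created on our behalf.
private final List<TimeStampedMetric> myMetrics = new ArrayList<>();

// Invoked (for example, via OSGi binding) when the MetricService becomes available.
private void metricServiceAvailable(MetricService service) {
    // After a full controller restart this collection is empty; after a
    // metering-framework bounce it holds the metrics to be re-registered.
    for (TimeStampedMetric metric : myMetrics)
        service.registerMetric(metric);
}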
Time Series Data
As noted for the preceding northbound REST API for data retrieval, time series values returned from
the REST API for TimeStampedMetrics may be returned in "raw" form or may be further
summarized to span specified time intervals. In "raw" form TimeStampedMetric values will be
returned at the finest granularity possible; if the values for the TimeStampedMetric specified were
summarized and persisted every minute then "raw" data will be returned such that each value
spans a one-minute interval, and if the values for a particular Metric were summarized and
persisted every five minutes then "raw" data will be returned such that each value spans a fiveminute interval. If time series data is requested for a TimeStampedMetric at a granularity that is
finer than that with which the TimeStampedMetric values were persisted, for example data is
requested at one-minute intervals for a TimeStampedMetric whose values were persisted every
fifteen minutes, an error will be returned to alert the user that their request cannot be fulfilled.
It is important to note that while the persisted time series data for a given corresponding
TimeStampedMetric is computed from values that the TimeStampedMetric is updated with, the
resulting persisted data will typically not have the same form as the values that the
TimeStampedMetric is updated with. For example, consider the case of the
TimeStampedRollingCounter metric type; while TimeStampedRollingCounters are updated with 64-bit rolling counter values, the only value persisted for such a metric is the delta between two such
64-bit values (the 64-bit values themselves are not persisted). Generally speaking, the value
persisted for a TimeStampedMetric is the change in its value since the last time the
TimeStampedMetric's value was persisted. This approach focuses the resulting data on what each
TimeStampedMetric was measuring during a persistence interval, rather than the mechanics used
to convey the measurements.
Returned Data
The content returned for each data point, whether "raw" or summarized, differs somewhat
depending upon the type of TimeStampedMetric the data resulted from. For "raw" data this
content is essentially just a JSON representation of the data persisted for each data point being
retrieved. For summarized data values that are computed from "raw" values, the content takes the
same form as that of a "raw" data point except that the values represent the combination of all
"raw" data points from the summarized interval. The content provided for each data point includes
the following.
• When the value of the TimeStampedMetric that the data point was formulated from was last updated
• How many milliseconds (prior to the last update time) are encompassed by the reported value
• The value measured over the milliseconds spanned by the data point
Sufficient information is thus provided should the data recipient wish to normalize the data to a standard interval length, to smooth fluctuations in value that may be introduced by variations in the milliseconds spanned by time series values.
Summarized Values
Time series values may also be requested from the REST API in a form that is not "raw", such that
each value returned represents a longer interval than the "raw" values persisted for a
TimeStampedMetric. In this case the necessary data must be read in "raw" form from the data
store and further summarized to produce values that span the requested interval before being
returned. For example, if a particular TimeStampedMetric's values were persisted every five
minutes and the REST API was invoked to retrieve hourly time series values for that
TimeStampedMetric, twelve "raw" values that each span five minutes would be read from the data
store and combined to produce a single resulting data point that spans the same hour
encompassed by the twelve "raw" data points.
There may be gaps in the "raw" data points that span a specific interval when summarized values
are returned. Continuing the preceding example of returning values that each represent an hour
interval with "raw" data points that each represent five minutes, one would typically expect that
twelve such "raw" data values would be summarized to produce one returned value. But in some
cases there could be gaps in the "raw" data for a given hour, for example for one hour span there
may be only ten "raw" data points persisted. Such gaps should be relatively infrequent and may
be caused by various situations; the source of the metric's data, perhaps a device on the network,
might be inaccessible, or perhaps the controller rebooted. The effect of any such gaps will be
accounted for in the summarized values that are returned; the information provided by each
resulting value is sufficient for the recipient to normalize the data to smooth any inconsistencies
introduced by gaps if so desired.
When summarized values are returned each resulting value represents the summary of a set of
"raw" data points. These sets must be anchored somehow in the total time span encompassed by
the REST request. For example, the time series data requested could be for a week of hourly data
ending at the current time. Suppose that the "raw" data points for the specified metric were
persisted at one-minute intervals, but that they started only four days ago; the first hour of data
returned will span a time interval that starts at the time of the oldest data point within the time
span encompassed by the REST request, in this case beginning four days ago. Each summarized
value will be produced from "raw" data points that are offset from the starting time of the first data
point returned. Continuing our example every hour value returned will be produced by "raw"
minute data points that are offset by some multiple of 60 minutes from the starting time of the first
returned data point, four days ago in this case.
The technique used to summarize "raw" TimeStampedMetric values to produce summarized values
is contingent upon the type of TimeStampedMetric the data resulted from. For all
TimeStampedMetric types, the milliseconds spanned by each "raw" value are simply summed over
the specified interval and the latest update time stamp among the "raw" values is reported as the
last updated time stamp of the resulting value.
TimeStampedCounter
Counts from each "raw" data point are summed, producing a long value for the total
count during the summarized interval.
TimeStampedGauge
Values from each "raw" data point are averaged, producing a double value for the
gauge reading during the summarized interval.
TimeStampedHistogram
Sample counts from the "raw" data points are summed and the minimum and maximum
for the interval are computed by finding the lowest minimum and highest maximum
among the "raw" data points, producing three long values for the total sample count and
minimum and maximum sample values during the summarized interval. The means of the
"raw" data points are averaged and their standard deviations combined, producing two
double values for the mean and standard deviation of the sample values during the
summarized interval.
TimeStampedMeter
Sample counts from the "raw" data points are summed and rates from the "raw" data
points are averaged, producing a long value for the total sample count and a double
value for the average rate during the summarized interval.
TimeStampedRatioGauge
Ratio values from each "raw" data point are averaged, producing double values for the
numerator and denominator readings during the summarized interval.
TimeStampedRollingCounter
Delta values from each "raw" data point are summed, producing a long value for the
total delta during the summarized interval.
TimeStampedTimer
Sample counts from the "raw" data points are summed and the minimum and maximum
for the interval are computed by finding the lowest minimum and highest maximum
among the "raw" data points, producing three long values for the total sample count and
minimum and maximum sample values during the summarized interval. The means and
rates of the "raw" data points are averaged and their standard deviations combined,
producing three double values for the mean, average rate, and standard deviation of the
sample values during the summarized interval.
JMX Clients
JConsole or another JMX client may be used to connect to the HP VAN SDN Controller's JMX
server to view selected metric values "live". Access is only permitted for local JMX clients, so any
such clients must be installed on the controller system. No JMX clients are delivered with the
controller or are among the prerequisites for installing it; they must be installed separately. For
example, the openjdk-7-jdk package must be installed on the controller system to use JConsole.
Which TimeStampedMetrics are exposed via JMX is determined at the time of their creation, by a
field in the MetricDescriptor used to create each TimeStampedMetric. Once the controller has been
properly configured to permit local JMX access the user can inspect the exposed
TimeStampedMetrics as they are updated "live" by the components or applications within the
controller or external application components that created them.
The content exposed for each TimeStampedMetric is contingent on the type of TimeStampedMetric,
but generally speaking the "live" values used by the TimeStampedMetric are visible as they are
updated by the creator of the TimeStampedMetric. Using JConsole as an example, one will see a
screen somewhat like Figure 20 (the exact appearance will depend upon what JVMs are running
on the system):
Figure 20 JConsole – New Connection
Choose a local connection to the JMX server instance that looks like the one highlighted in the
preceding screenshot and click the Connect button. Upon successfully connecting to that JMX
server instance, one should see a screen that looks something like Figure 21.
Figure 21 JConsole
In the list of nodes shown on the left, note the one that says HP VAN SDN Controller; this is the
node under which all metrics exposed via JMX will be nested. Each application installed on the HP
VAN SDN Controller will have a similar node under which all of the metrics exposed by that
application are nested. Expanding the node will reveal all of the exposed metrics, which will look
something like Figure 22 (note that this is just an example; real metrics will have different names).
Figure 22 JConsole – HP VAN SDN Controller Metrics
The name displayed for each TimeStampedMetric is a combination of the primary and secondary
tags and metric name specified in its MetricDescriptor during its creation; this combination will be
unique among all TimeStampedMetrics monitored for a specific application. If the optional
primary and/or secondary tags are not specified then only the fields provided will be used to
formulate the displayed name for the TimeStampedMetric. One may select a listed metric to
expand the node on the left. Selecting the Attributes subnode displays properties of the
TimeStampedMetric that are exposed via JMX.
Figure 23 JConsole – Metric Example
The metric UID, value field(s), and time spanned by the reported value (in seconds) are among the
attributes that will be displayed.
For those TimeStampedMetrics that are persisted as well as exposed via JMX, it is possible to see
the seconds get reset when the value is stored; otherwise they grow forever.
GUI
SKI Framework - Overview
The SKI Framework provides a foundation on which developers can create a browser-based web
application. It is a toolkit providing assets that developers can use to construct a web-based
Graphical User Interface, as shown in Figure 24.
Third Party Libraries (Client Side):
• jQuery—A popular, powerful, general purpose, cross-browser DOM manipulation engine
• jQuery UI—An extension to jQuery, providing UI elements (widgets, controls, ...)
• HTML Templates—providing alternate layouts for the UI
• Core SKI Framework—providing navigation, search, and basic view functionality
• Reference Documentation—documenting the core framework and library APIs
• Reference Implementation—providing an example of how application code might be written
SKI Assets (Server Side):
• Java Classes—providing assistance in formulating RESTful Responses
Figure 24 SDN Controller main UI
SKI Framework - Navigation Tree
The SKI framework implements a navigation model consisting of a list of top-level categories in
which each category consists of a list of navigation items. Each navigation item consists of a list of
views in which one of the views is considered the default View. The default View is selected when
the navigation item is selected. The other views associated with the navigation item can be
navigated to using the selector buttons located on the view toolbar. Figure 25 shows the SKI UI
view diagram.
Figure 25 SKI UI view diagram
SKI Framework - Hash Navigation
The SKI Framework encodes context and navigation information in the URL hash. For example,
consider the URL:
http://appserver.rose.hp.com/webapp/ui/app/#hash
The #hash portion of the URL is encoded as #vid,ctx,sub, where:
• vid—is the view ID, used to determine which view to display
• ctx—is the context, used to determine what data to retrieve from the server
• sub—is the sub-context, used to specify any additional context information with respect to the view (that is, select a specific row in a table)
The following diagrams show the sequence of events on how SKI selects a view and loads the
data if a URL is pasted into the browser. The #hash is decoded into #vid,ctx,sub, as shown in
Figure 26. The vid (view ID) is used to determine the view, navigation item and category to be
selected.
Figure 26 SKI UI view hash diagram
Next, the ctx (context), shown in Figure 27, can be used to help determine what data to retrieve
from the Server RESTlet.
Figure 27 SKI UI view and context hash diagram
When the Asynchronous HTTP request returns, the data (likely in JSON form), as shown in Figure
28, can be used to populate the view’s DOM (grids, widgets, etc.).
Figure 28 SKI UI view data retrieval diagram
Finally, the sub (sub-context) can be used to supply additional context information to the view. In
this case, the second item is selected, as shown in Figure 29.
Figure 29 SKI UI view sub-context hash diagram
SKI Framework - View Life-Cycle
All views are event driven and can react to the following life-cycle events:
• Create—called a single time when the view needs to be created (that is, navigation item is clicked for the first time). At this time, a view will return its created DOM structure (that is, an empty table).
• Preload—called only once, after the view is in the DOM. At this time, a view can perform any initialization that can only be done after the DOM structure has been realized.
• Reset—may be called multiple times, allows the view to clear any stale data.
• Load—may be called multiple times, allows the view to load its data. This is where a view can make any Ajax calls needed to obtain server-side data.
• Resize—may be called multiple times, allows the view to handle resize events caused by the browser or main layout.
• Error—may be used to define an application specific error handler for the view.
• Unload—called to allow a view to perform any cleanup as it is about to be replaced by another view.
SKI Framework - Live Reference Application
The SKI reference application hp-util-ski-ui-X.XX.X.war is distributed with the SDK in the lib/util/
directory. You need to install the Apache Tomcat web server to run the reference application.
Simply copy this war file into your Tomcat webapps directory as the file ski-ui.war. You can then
launch the reference application in your browser at the URL localhost:8080/ski-ui/ref/index.html.
Figure 30 shows the SKI UI reference application.
Figure 30 SKI UI reference application
From these pages, you have access to the most up-to-date documentation and reference code. The reference application includes examples on how to:
• Add categories, navigation items, and views.
• Create a jQuery UI layout in your view.
• Create various widgets (buttons, radios, and so on) in your view.
UI Extension
The SDN UI Extension framework allows third-party applications to inject UI content seamlessly into the main SDN UI. There are a number of files a developer needs to be aware of to make use of the UI Extension framework. For more information, see 5 Sample Application.
Distributed Coordination Primitives
Introduction
In a network managed by a controller, the controller itself is a single point of failure; controller failures can disrupt the functionality of the entire network. The HP VAN SDN Controller Distributed Coordination infrastructure provides mechanisms that controller applications can use to achieve active-active and active-standby Distributed Coordination paradigms and internode communication. The Distributed Coordination infrastructure provides two services for applications to develop Distributed Coordination aware controller modules:
• Controller Teaming
• Distributed Coordination Service
The following figure describes the communication between the controller applications and the HP VAN SDN Controller Distributed Coordination sub-systems. “App1 – 1” indicates the instance of application 1 on controller instance 1. The distributed services ensure data synchronization across the controller cluster nodes.
Figure 31 Application view of Coordination Services
Controller Teaming
Teaming Configuration Service
The Teaming Configuration Service provides REST interfaces (/team) that can be used to set up a
team of controllers. Without team configuration, controller nodes bootstrap in standalone mode. Once teaming is configured, the identified nodes form a cluster and controller applications can communicate across the cluster using the Coordination Service interfaces.
The following curl command gets the current team configuration; 192.168.66.1 is the IP address of one of the teamed controllers.
curl --noproxy 192.168.66.1 --header "X-Auth-Token:
19a4b8a048ef4965882eb8c570292bcd" --request GET --url
https://192.168.66.1:8443/sdn/v2.0/team -ksS
For team creation help and other configuration commands, refer to the HP VAN SDN Controller Administrator Guide [29].
Distributed Coordination Service
Distributed Coordination Service provides the building blocks to achieve high availability in the HP
VAN SDN Controller environment. This service can be retrieved from the Teaming Service. An
example Java application that makes use of different functionalities of the Coordination Service is described in the subsequent sections.
The Distributed Coordination Service includes:
• Publish Subscribe Service
• Distributed Maps
• Distributed Locks
Serialization
It is required to register a Serializer for each distributable object because of the multiple class
loaders approach followed by OSGi. No serializer is required for the following types: Byte,
Boolean, Character, Short, Integer, Long, Float, Double, byte[], char[], short[], int[], long[], float[],
double[], String.
If a distributable object implements Serializable, Distributable must be found before Serializable in the class hierarchy going from the distributable object to its super classes. Unfortunately, the order matters: the class hierarchy is analyzed when the serializer is registered, and if Serializable is found before Distributable an exception is thrown with a message describing this restriction.
Example of distributable object declarations:
import com.hp.api.Distributable;
import java.io.Serializable;

// Valid: Distributable is found before (or without) Serializable.
class ValidDistributableType implements Distributable {
}

class ValidDistributableType implements Distributable, Serializable {
}

class ValidDistributableType extends SerializableType implements Distributable {
}

// Invalid: Serializable is found before Distributable, so serializer registration throws an exception.
class InvalidDistributableType implements Serializable, Distributable {
}
Example of serializer registration:
@Component
public class Consumer {

    @Reference(cardinality = ReferenceCardinality.MANDATORY_UNARY,
            policy = ReferencePolicy.DYNAMIC)
    private volatile CoordinationService coordinationService;

    @Activate
    public void activate() {
        // Register the serializer before the distributable object is used by any
        // coordination service (publish-subscribe, distributed map, and so on).
        coordinationService.registerSerializer(new
                MyDistributableObjectSerializer(), MyDistributableObject.class);
    }
}
Publish Subscribe Service
In a distributed environment, applications tend to communicate with each other. Applications might
be co-located on the same controller node or they may exist on different nodes of the same
controller cluster. The Publish Subscribe Service provides a way to accomplish this kind of
distributed communication mechanism. Note that communication can occur between the nodes of
a controller cluster and not across controller clusters. The Publish Subscribe Service provides a
mechanism where several applications on different controller nodes can register for various types
of bus messages, send and receive messages without worrying about delivery failures or out of
order delivery. When an application pushes a message, all the subscribers to that message type
for active members of the team are notified irrespective of their location in the controller cluster.
Publish Subscribe service is provided by the Distributed Coordination Service which is in turn
provided by the Teaming service. Please refer to the Javadoc for a detailed explanation of
methods provided by publish-subscribe service.
Publish Subscribe service also provides mechanisms to enable global ordering for specific
message types. Global ordering is disabled by default. With global ordering enabled, all
receivers will receive all messages from all sources with the same order. If global order is disabled
two different receivers could receive messages from different sources in different orders. It is
important to note - since global ordering degrades performance - that messages from the same
source will still be ordered even with global ordering disabled.
Example:
Let A and B be message publishers (sources), and let R and W be message subscribers (receivers).
Assume A sends messages a1, a2, a3 in that order, and B sends messages b1, b2, b3 in that order.
With or without global ordering the following holds:
· a1 arrives before a2
· a2 arrives before a3
· b1 arrives before b2
· b2 arrives before b3
With global ordering:
· Let a1 b1 a2 a3 b2 b3 be the sequence of messages received by R
· Then W receives messages in the same order
Without global ordering:
· Let a1 b1 a2 a3 b2 b3 be the sequence of messages received by R
· Then W may (or may not) receive messages in the same order.
The globally ordered sequence does not necessarily represent the sequence in which the events were actually generated, but the sequence in which they were received by a node automatically designated as a reference by the Distributed Coordination service. This reference node propagates the events in the order received; this is how global ordering is commonly implemented. Thus, global ordering is from the receiving point of view and not from the sending point of view: it is not possible to determine the actual order in which events were generated, a common problem in distributed systems, where a global state of the system cannot be obtained.
The example below presents a common usage of the publish subscribe service.
Publish-Subscribe Example:
PubSubExample.java
import com.hp.sdn.teaming.TeamingService;
import com.hp.util.dcord.CoordinationService;
import com.hp.sdn.demo.example.SampleMessage;

@Component
public class PubSubExample {

    private CoordinationService coordinationService;
    private PublishSubscribeService pubSubService;

    @Reference(cardinality = ReferenceCardinality.MANDATORY_UNARY,
            policy = ReferencePolicy.DYNAMIC)
    protected volatile TeamingService teamingSvc;

    // Listener for SampleMessage events; assumed here to be directly instantiable.
    private SampleMessageListener<SampleMessage> listener =
            new SampleMessageListener<SampleMessage>();
Distributed Map
A Distributed Map is a decentralized, distributed data structure that provides a lookup service similar to a hash table: (key, value) pairs are stored in a Distributed Map, and any participating node can efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a Distributed Map to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
The distributed map is an extension of the Java Map interface, so applications can perform any operation that can be performed on a regular Java map. The data structure internally distributes data across the nodes in the cluster. The data is distributed almost evenly among the members, and backups can be configured so that the data is also replicated. Backups can be configured as synchronous or asynchronous; with synchronous backups, when map.put(key, value) returns it is guaranteed that the entry has been replicated to one other node. Each distributed map is distinguished by its namespace, which is set upon creation of the distributed map.
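As a minimal sketch of the idea, once an application has obtained a distributed map for its namespace (the accessor used to obtain it from the Coordination Service is not shown here; consult the Javadoc), the map can be used exactly like a regular java.util.Map:

import java.util.Map;

public class DistributedMapUsage {

    // The map passed in is assumed to be a namespaced distributed map obtained
    // from the Coordination Service; how it is obtained is not shown here.
    void recordAlert(Map<String, String> alerts) {
        alerts.put("alert-1", "CRITICAL");        // the entry is replicated to the configured backups
        String severity = alerts.get("alert-1");  // readable from any node in the team
        System.out.println(severity);
        alerts.remove("alert-1");                 // removal is distributed as well
    }
}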
The Distributed Coordination Service provides a mechanism for applications running on multiple controllers to register for notifications on specific distributed maps. Notifications for a distributed map are received when entries in the distributed map are added, updated, or removed. Notifications are received per entry.
public class SimpleEntryListener implements EntryListener<String, String> {

    @Override
    public void added(EntryEvent<String, String> entry) {
        // Any action to be taken on receipt of a notification.
        // In this example, there is a simple print.
        String string = "Added notification received";
        System.out.println(string);
    }

    @Override
    public void updated(EntryEvent<String, String> entry) {
        // Any action to be taken on receipt of a notification.
        // In this example, there is a simple print.
        String string = "Updated notification received";
        System.out.println(string);
    }

    @Override
    public void removed(EntryEvent<String, String> entry) {
        // Any action to be taken on receipt of a notification.
        // In this example, there is a simple print.
        String string = "Removed notification received";
        System.out.println(string);
    }
}
Performance Considerations
Keep in mind the following when using the distributed coordination services:
1. Java objects can be written directly to distributed coordination services.
   - There is no need to serialize the data before it is written to these structures. The
     coordination service will serialize/deserialize the data as it is distributed in the team
     using the serializer you have registered.
2. Minimize other in-memory local caches for distributed map data.
   - The distributed map is already in memory and serves this purpose. If your application
     needs this data to be available if and when the coordination service is down, then a
     local cache could be appropriate, as well as reading from persistence any previously
     saved records to start up the cache in those scenarios.
3. Minimize tying map entry listeners to persistence.
   - Consider how important it is for your data to be persisted before automatically tying a
     distributed map entry listener for the purpose of writing to the database.
Distributed Lock
Protecting the access to shared resources becomes increasingly important in a distributed
environment. A lock is a synchronization primitive that ensures only a single thread is able to
access a critical section. Distributed Locks offered by the Coordination Service provide an implementation of locks for distributed environments where threads can run either in the same JVM or in different JVMs.
Applications need to define a namespace that is used as the lock identity to make sure application instances running on different JVMs acquire the right lock. Applications on different controller nodes should agree upon the namespace and acquire the necessary lock on it before accessing the shared resource.
A distributed lock extends the functionality of java.util.concurrent.locks.Lock and thus it can be used as a regular Java lock, with the following difference:
Locks are automatically released when a member (node) has acquired a lock and this member
goes down. This prevents threads that are waiting for a lock from waiting indefinitely. This is
needed for failover to work in a distributed system. The downside however is that if a member
goes down that acquired the lock and started making changes, other members could start to see
partial changes.
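As a minimal sketch (the accessor used to obtain the lock from the Coordination Service by namespace is not shown; consult the Javadoc), a distributed lock is used with the standard java.util.concurrent.locks.Lock idiom:

import java.util.concurrent.locks.Lock;

public class DistributedLockUsage {

    // The lock passed in is assumed to be a distributed lock obtained from the
    // Coordination Service for an agreed-upon namespace.
    void updateSharedResource(Lock distributedLock) {
        distributedLock.lock();
        try {
            // Critical section: access the shared resource guarded by this namespace.
        } finally {
            distributedLock.unlock();   // always release, even on failure paths
        }
    }
}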
Data Versioning with Google Protocol Buffers (GPB)
For the long term maintainability, interoperability, and extensibility of application data it is
recommended that applications version the data they write using the different coordination
services. Google Protocol Buffers (GPB) is the recommended versioning mechanism for these
services and is supported by the SDK. The sections below introduce GPB and its use for message versioning with an application's model objects. It is recommended that the reader consult the official GPB documentation to understand the complete syntax and all the features available for your application's programming language of choice. [50]
GPB is a strongly-typed Interface Definition Language (IDL) with many primitive data types. It also
allows for composite types and namespaces through packages. Users define the type of data they
wish to send/store by defining a protocol file (.proto) that defines the field names, types, default
values, requirements, and other metadata that specifies the content of a given record. [50, 51]
Versioning is controlled in the .proto IDL file through a combination of field numbers and tags
(REQUIRED/OPTIONAL/REPEATED). These tags designate which of the named fields must be
present in a message to be considered valid. There are well-known rules of how to design a .proto
file definition to allow for compatible versions of the data to be sent and received without errors
(see Versioning Rules section that follows).
From the protocol file, the Java GPB compiler (protoc) then generates the data access classes for the user's language of choice. In the generated GPB class, field access and builder methods are provided for the application to interact with the data. The compiler also enforces the general versioning rules of messages to help flag not only syntax and semantic errors, but also errors related to incompatibility between versions of a message.
The application will ultimately use the Model Object it defines and maps to the GPB class that will
be distributed. The conversion from Model Object to GPB object takes place in the custom
serializer the programmer will have to write and register with the Coordination Service to bridge
the object usage in the application and its distribution over the Coordination Services (See
Application GPB Usage section that follows for more details).
Below is an example of a GPB .proto file that defines a Person by their contact information and an
AddressBook by a list of Persons. This example demonstrates the features and syntax of a GPB
message. String and int32 are just two of the 15 definable data types (including enumerated
types) which are similar to existing Java primitive types. Each field requires a tag, type, name, and
number to be valid. Default values are optional. Message structures can be composed of other
messages. In this example we see that a name, id and number are the minimum fields required to
make up a valid Person record. If this were version 1 of the message then, for example, version 2
could include an “optional string website = 5;” field to expand the record further without breaking
compatibility with version 1 of the Person record. The AddressBook message defines a
composition of this Person message to hold a list of people using the repeated tag. [51]
message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}
The protocol file above would be run through GPB’s Java compiler (see “.proto Compilation Process” below) to generate the data access classes that represent these messages. Message
builders would allow new instances of the message to be created for distribution by the
Coordination Services. Normal set/get accessor methods will also be provided for each field.
Below are examples of creating a new instance of the message in Java. Reading the record out
will return this GPB generated object for the application to interact with as usual.
public class AddPerson {
    // Creates a simple instance of a GPB Person object that can then be written
    // to one of the Coordination Services (field values here are illustrative only).
    public static Person createTestPerson() {
        return Person.newBuilder().setName("John Doe").setId(1234).build();
    }
}
Versioning Rules
A message version is a function of the field numbering and tags provided by GPB and how those are changed between different iterations of the data structure. The following are general rules about how .proto fields should be updated to ensure compatible GPB versioned data:
· Do not change the numeric tags for any existing (previous version) fields.
· New fields should be tagged OPTIONAL/REPEATED (never REQUIRED). New fields should
also be assigned a new, unique field ID.
· Removal of OPTIONAL/REPEATED tagged fields is allowed and will not affect compatibility.
· Changing a default value for a field is allowed. (Default values are sent only if the field is
not provided.)
· There are specific rules for changing the field types. Some type conversions are compatible
while others are not (see GPB documentation for specific details).
Note: It is generally advised that the minimal number of fields be marked with a REQUIRED tag as
these fields become fixed in the schema and will always have to be present in future versions of
the message.
.proto Compilation Process
The following is a description of the process by which .proto files should be defined for an
application, compiled with the Java GPB compiler, and how the derived data classes should
be imported and used in application code. Application developers who wish to make use of GPB in their designs will need to download and install Google Protocol Buffers (GPB) on their local development machine. The steps for GPB v2.5.0 are as follows:
Compiling and installing the protoc binary
The protoc binary is the tool used to compile your text-based .proto file into a source file
based on the language of your choice (Java in this example). You will need to follow these
steps if you plan on being able to compile GPB-related code.
1. Download the "full source" of Google's Protocol Buffers. For this example we are using v2.5.0 in the instructions below.
2. Extract it somewhere locally.
3. Run the following line:
cd protobuf-2.5.0
./configure && make && make check && sudo make install
4. Add the following to your shell profile and also run this command:
export LD_LIBRARY_PATH=/usr/local/lib
5. Try running protoc standalone to verify that it is in your path and LD_LIBRARY_PATH is set correctly. Running “protoc” on the command line should return “Missing input file.” if everything is set up correctly.
Compiling .proto Files
We recommend that, in the project in which you wish to define and use GPB, you place .proto files under the /src/main/proto directory. You can then make use of the GPB “option java_package” syntax to control the subdirectory/package structure that is created for the Java code generated from the .proto file.
The project's pom.xml file requires the following GPB-related additions in order to:
· Compile the generated Java file into class files
· Package up the class files into a jar in the target directory
· Install the compiled jar into your local Maven cache (~/.m2/repository)
To have the .proto file and the generated .java file displayed properly in your IDE, execute the following from your project's root directory (that is, where the project's pom.xml file is):
· mvn eclipse:clean
· mvn eclipse:eclipse
· Refresh the project in your IDE (optionally, clean the project as well).
As the resulting Java file is protoc generated code it is not recommended that it be checked
in to your local source code management repo but instead regenerated when the
application is built. The GPB Java Tutorial link on the official GPB website gives a more in-depth walkthrough of the resulting Java class.
Application GPB Usage
Generated GPB message classes are meant to serve as the versioned definition of data
distributed by the Coordination Service. They are not meant to be used directly by the
application to read/write to the various Coordination Services. It is recommended that a
Model Object be defined for this role. This scheme provides two notable benefits:
1) It allows the application to continue to evolve without concern for the data
versioning at the Coordination Service level.
2) It allows the Model Object to define fields for data it may want to store and use
locally for a version of the data but not have that data shared during distribution.
The recommended procedure for versioning Coordination Service data is shown below and
the sections that follow explain each of these steps with examples and best practices.
1) Define a POJO Model Object for the data that the application will want to operate
on and distribute via a Coordination Service.
2) Define a matching GPB .proto Message to specify which field(s) of the Model
Object are required/optional for a given version of message distributed by the
Coordination Services.
3) Implement and register a Custom Serializer with the Coordination Service that will
convert the Model Object the application uses to the GPB message class that will
be distributed.
Model Object
The application developer will define POJOs for his/her application. They will contain data
and methods necessary to the application's processing and may contain data that the application wishes to distribute to other members of the controller team. Not all fields may
need to be (or want to be) distributed. The only requirement for the Model Object’s
implementation is that the class being written to the different Coordination Services
implement com.hp.api.Distributable (a marker interface) to make it compatible with the
Coordination Service.
In terms of sharing these objects via the Coordination Service, the application developer
should consider which field(s) are required to constitute a version of the Model Object
versus which fields are optional. Commonly, the fields that are defined in the object's constructor arguments can be considered required fields for a version of the object. Later
versions may add additional optional fields to the object that are not set by a constructor.
New required fields may be added for new versions of the Model Object with their
presence as an argument in a new constructor. Note that adding new required fields will
require that field for future versions. Past versions of the application that receive a new
required field will just ignore it. Overall, thinking in terms of what fields are optional or
required will help with the next step in the definition of the GPB .proto message.
The following is an example of a Person Java class an application may want to define and distribute via a PubSub Message Bus. The name and id fields are the only required fields, as indicated by the constructor arguments. The application may use other ways to indicate which fields are required.
public class Person implements Distributable {

    private String name;
    private int id;
    private String email;
    private Date lastUpdated;

    Person(String name, int id) {
        this.name = name;
        this.id = id;
    }

    // Accessor and other methods.
}
GPB .proto Message
The GPB .proto message serves as the definition of a versioned message to be distributed
by the Coordination Service. The application developer should write the .proto messages
with the Model Object in mind when considering the data type of fields, whether they are
optional or required, and so on. The developer should consider all the GPB versioning rules and
best practices mentioned in the previous section. The programmer implements a message
per Model Object that will be distributed following the GPB rules and conventions
previously discussed.
Below is an example .proto message for the Person class. The field data types and
REQUIRED/OPTIONAL tags match the Model Object. Since email was not a field to be set
in the constructor it is marked optional while name and id are marked as required. Notice
that lastUpdated field of the Model Object is not included in the .proto message definition.
This is considered a transient field, in the serialization sense, for the Model Object and it is
not meant to be distributed in any version of the message. With this example the reader can
see not all fields in the Person Model Object must be defined and distributed with the .proto
message.
option java_outer_classname = "PersonProto"; // Wrapper class name.

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}
The application developer generates the matching wrapper and builder classes for the .proto message, using protoc as described in the .proto Compilation Process section above, to obtain a Java class that defines the message.
Custom Serializer
Finally, a custom serializer needs to be defined to translate between instances of the
Model Object being used in the Coordination Services and instances of the GPB message
that will ultimately be transported by that service. For example, we may wish to write the
Person Model Object on the PubSub Message Bus and have it received by another instance
of the application which has subscribed to Person messages through its local Coordination
Service.
In the custom serializer the developer will map the fields between these two objects on
transmit (serialization) and receive (deserialization). With data types and naming
conventions it should be clear what this 1:1 mapping is in the serializer. The Serializer must
implement the Serializer<Model Object> interface as shown in the example below. It is
recommended this serializer be kept in the <application>-bl project (if using the provided
application project generation script of the SDK). PersonProto is the java_outer_classname
we define in the GPB message above and will be the outer class from which inner GPB
message classes, and their builders, are defined.
import <your package>.PersonProto;

public class PersonSerializer implements Serializer<Person> {

    @Override
    public byte[] serialize(Person subject) {
        // Build the GPB message from the Model Object fields, then emit its byte form
        // (per the description that follows: set the fields, build, and convert to bytes).
        PersonProto.Person.Builder message = PersonProto.Person.newBuilder();
        message.setName(subject.getName());
        message.setId(subject.getId());
        return message.build().toByteArray();
    }

    @Override
    public Person deserialize(byte[] serialization) {
        PersonProto.Person message = null;
        try {
            message = PersonProto.Person.parseFrom(serialization);
        } catch (InvalidProtocolBufferException e) {
            // Handle the error
        }
        Person newPerson = new Person();
        if (message != null) {
            newPerson.setName(message.getName());
            newPerson.setId(message.getId());
            return newPerson;
        }
        return null;
    }
}
In the serialize() method the builder pattern of the generated GPB message class is used to
create a GPB version of the Person Model Object. After the proper fields are set the
message is built and converted to a byte array for transport. In the deserialize() method on
the receiver the byte array is converted back to the expected GPB message object. An
instance of the Model object is then created and returned to be placed into the
Coordination Service for which the serializer is registered.
The application must register this custom serializer with the Coordination Service with which it wishes to use this Model Object and GPB message combination. Below is an example of that registration process in an OSGi component of an example application.

@Reference(cardinality = ReferenceCardinality.MANDATORY_UNARY,
        policy = ReferencePolicy.DYNAMIC)
protected volatile CoordinationService coordinationSvc;

@Activate
public void activate() {
    // Register message serializers
    if (coordinationSvc != null) {
        coordinationSvc.registerSerializer(new PersonSerializer(), Person.class);
    }
}

System Status
The system status (which can be retrieved using SystemInformationService) depends on two properties of the controller: Reachability and Health. The following table depicts the status:
Table 3 System Status

System Status    Coordination Services                  Reason
Active           Available                              The controller is healthy and part of a cluster with a quorum.
Suspended        Unavailable                            The controller is unhealthy or part of a cluster with no quorum.
Unreachable      Depends whether active or suspended    The controller is unreachable because of failures or network partition.
Considerations:
• A system never sees itself as unreachable.
• The strategy followed in the event of a network partition is to suspend controllers that are part of a cluster with no quorum.
The following figure illustrates two examples of how each controller sees the status of the other
controllers that are part of the team. Examples show a 5-node cluster for simplicity; this does not mean
this release supports teams of such size. The behavior shown in the examples can easily be applied to
any cluster size.
Figure 32 Application Directory Structure
Persistence
Distributed Persistence Overview
The SDN Controller provides distributed persistence for applications in the form of a Cassandra [10] database node running on each controller instance. A team of controllers serves as a Cassandra cluster. Cassandra provides the following benefits as a distributed database:
• A distributed, peer-to-peer datastore with no single point of failure.
• Automatic replication of data for improved reliability and availability.
• An eventually-consistent view of the database from any node in the cluster.
• Incremental, scale-out growth model.
• Flexible schemas (column oriented keyspaces).
• Hadoop integration for large-scale data processing.
• SQL-like query support via Cassandra Query Language (CQL).
Distributed Persistence Use Case
The distributed persistence architecture is targeted at applications that have distributed active-active requirements. Specifically, applications should use the distributed persistence framework if they have one or more of the following requirements:
• Consumer applications have high scalability requirements, i.e., there are generally multiple instances of the app running on different controller nodes that need access to a common distributed database store.
• The distributed database should be available independent of whether individual nodes are present or not, e.g., if there are controller node crashes.
• The applications have high throughput requirements: a large number of I/O operations. Further, they have requirements wherein as the number of controller nodes increases, performance needs to scale linearly.
To address applications with such requirements, a distributed persistence layer is provided, with Cassandra as the underlying distributed database. The HP VAN SDN Controller provides a Data Access Object (DAO) layer on top of Cassandra for performing distributed persistence operations.
Persistence Data Model
Introduction to DAO Pattern
A data access object (DAO) is an object that provides an abstract interface to some type of database or persistence mechanism, providing specific operations without exposing details of the database. It provides a mapping from application calls to the persistence layer. This isolation separates the concerns of what data accesses the application needs, in terms of domain-specific objects and data types (the public interface of the DAO), from how these needs can be satisfied with a specific DBMS, database schema, and so on. Figure 33 and Figure 34 show the Data Access Object pattern [30].
Figure 33 Data Access Object Pattern
Figure 34 DAO pattern
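To make the pattern concrete, the following is a minimal, generic Java sketch of the DAO idea; it is illustrative only and is not the controller's DAO API (the controller's actual DAO classes appear later in this section). The business logic depends only on the interface, while the implementation hides how the data is stored; here an in-memory map stands in for a real data source.

import java.util.HashMap;
import java.util.Map;

interface PersonDao {
    void save(String id, String name);
    String findName(String id);
}

class InMemoryPersonDao implements PersonDao {
    private final Map<String, String> store = new HashMap<>();

    @Override
    public void save(String id, String name) {
        store.put(id, name);   // a real DAO would translate this to a database write
    }

    @Override
    public String findName(String id) {
        return store.get(id);  // a real DAO would translate this to a database read
    }
}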
Distributed Data Model Overview
Cassandra is a “column oriented” distributed database system that provides a structured key-value store. It is a NOSQL database, meaning it is completely non-relational in nature. A reference table that can be useful when migrating from MySQL (RDBMS) to a NOSQL database (Cassandra) is illustrated in Figure 35.
Figure 35 Mental Model Comparison between Relational Models and Cassandra
Although this table provides a mapping of the terms, a more accurate analogy is a nested sorted map. Cassandra stores data in the following format:
Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
That is, there is a sorted map of RowKeys to an internal sorted map of columns, sorted by the ColumnKey. The following figure illustrates a Cassandra row.
Figure 36 Cassandra Row
This is a simple row with columns. There are other variants, like Composite Columns and Super Columns, which allow more levels of nesting; these can be used if the design requires them.
One important characteristic of Cassandra is that it is schema-optional. This means the columns need not be defined upfront; they can be added dynamically as and when required, and all rows need not have the same number and type of columns.
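The nested-sorted-map analogy above can be made concrete with a short plain-Java sketch; it mimics only the shape of the data and is not the controller's persistence API.

import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class NestedMapModel {
    public static void main(String[] args) {
        // RowKey -> (ColumnKey -> ColumnValue), with columns kept sorted by key.
        Map<String, SortedMap<String, String>> columnFamily = new HashMap<>();

        SortedMap<String, String> alertRow = new TreeMap<>();
        alertRow.put("origin", "controller-1");
        alertRow.put("severity", "CRITICAL");

        columnFamily.put("alert-123", alertRow);   // one row, keyed by its RowKey
        System.out.println(columnFamily.get("alert-123").get("severity"));
    }
}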
Some important points to note during migration of data from RDBMS to NOSQL are as follows:
• Model data with nested sorted maps in mind, as mentioned above. This provides efficient and faster response times for queries.
• Model column families around queries.
• De-normalize data as needed. Too much de-normalization can have side effects; the right balance needs to be struck.
Modeling Data around Queries
Unlike with relational systems, where entities and relationships are modeled and then indexes are added to support whatever queries become necessary, with Cassandra the queries that need to be supported efficiently are thought of ahead of time.
Cassandra does not support joins at query time because of its highly scaled, distributed nature. This mandates duplication and de-normalization of data. Every column family in a Cassandra keyspace is self-contained, with all data necessary to satisfy a given query, thus moving towards a “column family per query” model.
In the HP VAN SDN Controller, define a column family for every entity. For each query on that entity, define a secondary column family. These secondary column families each serve exactly one query.
Reference Application using Distributed Persistence
Any application that needs to use the distributed persistence in the HP VAN SDN Controller needs
to include/define the following components:
• A Business Logic component as an OSGi service.
• A reference to the Distributed DataStoreService and Distributed QueryService.
• A DTO (transport object) per entity.
• A DAO (Data Access Object) to interact with the persistence layer.
A sample of each of these is presented in this section. For demonstration purposes, a demo application that persists Alerts in the distributed database (Cassandra) has been created.
Business Logic Reference
When the Cassandra demo application is installed, the OSGi service for business logic gets
activated. This service provides a northbound interface that any external entity or application can use via its API. In this case, it is an Alert service using Cassandra. The service provides an API for all northbound operations, such as posting an Alert into the database, deleting alerts, and updating the alert state. There is another interface that provides the READ operations and is mostly used by the GUI; this second northbound service is called CassandraAlertUIService.
The implementation of these services needs to interact with the underlying persistence layer. This is
done by using an OSGi @Reference as shown below.
CassandraAlertManager.java:
@Component
@Service
public class CassandraAlertManager implements
        CassandraAlertUIService, CassandraAlertService {

    @Reference(policy = ReferencePolicy.DYNAMIC,
            cardinality = ReferenceCardinality.MANDATORY_UNARY)
    private volatile DataStoreService<DataStoreContext> dataStoreService;

    // A similar @Reference is declared for the distributed query service (not shown).
The above snippet shows the usage of @Reference. The OSGi framework caches the dataStoreService and queryService objects in the CassandraAlertManager. Whenever the client or application issues a query to the database, these objects are used to get access to the persistence layer.
DTO (Transport Object)
Data that needs to be persisted can be divided into logical groups and these logical groups are
tables of the database. Every table has fixed columns and every row has a fixed type of Row Key
or Primary Key.
A DTO is a Java representation of a row of a table in the database. Any application that needs to write a row needs to fill data into a DTO and hand it over to the persistence layer. The persistence layer understands a DTO and converts it into the format required by the underlying database. The reverse holds too: when reading from the database, the data is converted into a DTO (for a single-row read), a list of DTOs (multi-row read), or a page of DTOs (paged read) and given back to the requestor.
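As a rough illustration only, a DTO for the demo application's alert rows might carry the fields shown below; the field names are inferred from the CassandraAlert constructor and column family used later in this section, and the real demo class may differ.

import java.util.Date;

// Illustrative sketch of a DTO; not the actual demo class.
public class AlertDto {
    private final String sysId;        // owning controller
    private final boolean state;       // active or cleared
    private final String topic;
    private final String origin;
    private final Date timestamp;
    private final String severity;
    private final String description;

    public AlertDto(String sysId, boolean state, String topic, String origin,
                    Date timestamp, String severity, String description) {
        this.sysId = sysId;
        this.state = state;
        this.topic = topic;
        this.origin = origin;
        this.timestamp = timestamp;
        this.severity = severity;
        this.description = description;
    }

    // Getters would be used by the persistence layer to map fields to columns.
}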
The persistence layer exposes generic queries that need to be qualified appropriately by the application. The following shows a Distributed Query Service interface that provides application-specific queries. Here is the interface code from the demo application.
WriteQuery<CassandraAlert, C> getUpdateAlertStateQuery(
        CassandraAlert alert);

WriteQuery<Long, C> getTrimAlertQuery(CassandraAlertFilter alertFilter);

WriteQuery<Long, C> getAddAlertListQuery(List<CassandraAlert> alerts);

WriteQuery<Long, C> getUpdateAlertListQuery(
        List<String> uids, String sysId, boolean state);

WriteQuery<Long, C> getDeleteAlertListQuery(
        List<String> uids, String sysId);

ReadQuery<Long, C> getCountAlertQuery();
}
This interface contains all the queries that are used by the demo application. Here is an implementation example of the interface shown above.
The DistQueryManager provides all queries required by the business logic without exposing the underlying generic queries directly. This also helps the application keep a check on the queries that can be issued to the database; arbitrary queries are not accepted. The business logic uses one of the APIs listed in the interface to perform persistence operations at a given point in time. An example is shown below. Earlier examples showed that the business logic references the distributed data store service and the distributed query service; the following example shows how these references are put to use.
CassandraAlertManager.java Posting Alert:
@Override
public CassandraAlert post(Severity severity, CassandraAlertTopic topic,
        String origin, String data) throws PersistenceException {
    if (topic == null) {
        throw new NullPointerException(...);
    }
    // 1. Create the transport object (DTO) for the incoming alert.
    CassandraAlert alert = new CassandraAlert(sysId, true, topic.id(),
            origin, new Date(), severity, data);
    // 2. Obtain the write query from the distributed query service (getAddAlertQuery).
    // 3. Execute it through the DataStoreService (execute), passing the query as an argument.
    // 4. Return the stored alert, or throw a PersistenceException on failure.
The method from the previous listing posts a new Alert into the database. It is a write query that
creates a new row for every alert posted. The post method is called from other components
whenever they want to log an alert message in the database. In this method, the call flow is as
follows:
1. Create a transport object (DTO) for the incoming alert
2. Call the Distributed Query Service API (getAddAlertQuery) to get an object of type
AddQuery. Please see the implementation above for details. The DTO is an input to this
method.
3. Call the Distributed DataStoreService API (execute) to execute the query and pass the
postAlertQuery as an argument.
4. Return the stored Alert on success or throw a PersistenceException on a failure.
This sequence is followed for every write query to the persistence layer from business logic.
The following listing illustrates another example of business logic using persistence layer services
using a query service. This is a read operation and the example code is as follows.
CassandraAlertManager.java Reading from the Database:
The two methods shown read from the database in different ways. The first one issues a find query
using a filter object. The filter specifies the pivot around which the query results are read. The
second method reads a page of alerts and is used when there is a need to paginate results. This is
mostly used by a GUI where pages of Alerts are displayed instead of a single long list of Alerts.
The following is an example of filter object as defined in the demo application.
CassandraAlertFilter.java:
package com.hp.hm.model;

import com.hp.util.filter.EqualityCondition;
import com.hp.util.filter.SetCondition;
import com.hp.util.filter.StringCondition;
...

public class CassandraAlertFilter {
    // Filter conditions, for example a SetCondition on alert severity.
    // Implement setters and getters for all conditions.
    // Good practice to override toString().
}
Every application needs to define its filter parameters as in the above code. In the demo application, there is a severity filter, for example to “find Alerts where Severity = CRITICAL, WARNING”; Severity is therefore a set condition. The find method returns a row if one of the values in the set condition matches. The other conditions in the demo follow similar principles.
They cater to various conditional queries that can be issued as read queries to the database. The caller that wants to read from the database needs to create a filter object and fill it with appropriate values before issuing a find query.
Data Access Object - DAO
In the preceding examples, the business logic called the DataStoreService API to perform persistence operations. The API performs the operation using a DAO. The DAO is a layer that acts as a single point of communication between the business logic and the database. The infrastructure provides generic abstractions of the DAO; however, each table (or column family) needs to have a specific DAO defined. For this Alerts demo application there is a CassandraAlertDao. The example code is illustrated in the following listing.
CassandraAlertDao.java:
package com.hp.demo.cassandra.dao.impl;
...
public class CassandraAlertDao extends
        CassAbstractDao<String, String, CassandraAlert,
        CassandraStorable<String, String>, CassandraAlertFilter,
        CassandraAlertSortAttribute> {

    public CassandraAlertDao() throws PersistenceConnException {
        cfList.add(new AlertsBySeverity());
        cfList.add(new AlertsByState());
        cfList.add(new AlertsByTopic());
        cfList.add(new AlertsByOrigin());
        cfList.add(new AlertsByTimeStamp());
        cfList.add(new AlertsCount());
        cfList.add(new AlertsByUidAndSysId());
    }

    private static class AlertColumnFamily {
        private static final ColumnName<String, String> SYS_ID_NAME =
                ColumnName.valueOf("sysId", BasicType.UTF8, false);
        private static final ColumnName<String, Severity> SEVERITY_COL_NAME =
                ColumnName.valueOf("severity", BasicType.UTF8, false);
        private static final ColumnName<String, Date> TIMESTAMP_COL_NAME =
                ColumnName.valueOf("timestamp", BasicType.DATE, false);
        private static final ColumnName<String, String> DESC_COL_NAME =
                ColumnName.valueOf("description", BasicType.UTF8, false);
        private static final ColumnName<String, Boolean> STATE_COL_NAME =
                ColumnName.valueOf("state", BasicType.BOOLEAN, false);
        private static final ColumnName<String, String> ORIGIN_COL_NAME =
                ColumnName.valueOf("origin", BasicType.UTF8, false);
        private static final ColumnName<String, String> TOPIC_COL_NAME =
                ColumnName.valueOf("topic", BasicType.UTF8, false);