Intel Xeon Phi Developer's Quick Start Manual

Intel® Xeon Phi™ Coprocessor
DEVELOPERS QUICK START GUIDE
White Paper
Version 1.7
Contents
Introduction
Goals
This document does:
This document does not:
Terminology
System Configuration
Intel® Xeon Phi™ Software
Intel® Many Integrated Core Architecture Overview
Administrative Tasks
Preparing Your System for First Use
Steps to install the driver and start the card
Steps to install the Software Development tools
Updating an Existing System
Updating a system that already has an Intel® Xeon Phi™ Coprocessor
Regaining Access to the Intel® Xeon Phi™ Coprocessor after Reboot
Restarting the Intel® Xeon Phi™ Coprocessor If It Hangs
Monitoring the Intel® Xeon Phi™ Coprocessor
Running an Intel® Xeon Phi™ Coprocessor program from the host system
Working directly with the uOS Environment Intel® Xeon Phi™ Coprocessor
Useful Administrative Tools
Getting Started/Developing Intel® Xeon Phi™ Software
Available Software Development Tools / Environments
Development Environment: Available Compilers and Libraries
Development Environment: Available Tools
General Development Information
Development Environment Setup
Documentation and Sample Code
Build-Related Information
Compiler Switches and Makefiles
Debugging During Runtime
Where to Get More Help
Using the Offload Compiler – Explicit Memory Copy Model
Reduction
Creating the Offload Version
Asynchronous Offload and Data Transfer
Using the Offload Compiler – Implicit Memory Copy Model
Native Compilation
Parallel Programming Options on the Intel® Xeon Phi™ Coprocessor
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: OpenMP*
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: OpenMP* + Intel® Cilk™ Plus Extended Array Notation
Parallel Programming on the Intel® Xeon Phi™ Coprocessor: Intel® Cilk™ Plus
Parallel Programming on Intel® Xeon Phi™ Coprocessor: Intel® Threading Building Blocks (Intel® TBB)
Using Intel® MKL
SGEMM Sample
Intel® MKL Automatic Offload Model
Debugging on the Intel® Xeon Phi™ Coprocessor
Performance Analysis on the Intel® Xeon Phi™ Coprocessor
About the Authors
Notices
Performance Notice
Optimization Notice
Introduction
This document will help you get started writing code and running applications on a system (host) that includes the Intel® Xeon Phi™ Coprocessor, which is based on the Intel® Many Integrated Core Architecture (Intel® MIC Architecture). It describes the available tools and includes simple examples that show how to get C/C++ and Fortran programs up and running. For now, you will need to copy and paste the examples from this document onto your system.
This document is available at http://software.intel.com/mic-developer under the Overview tab.
Goals
This document does:
1. Walk you through the Intel® Manycore Platform Software Stack (Intel® MPSS) installation.
2. Introduce the build environment for software enabled to run on Intel® Xeon Phi™ Coprocessor.
3. Give an example of how to write code for Intel® Xeon Phi™ Coprocessor and build using Intel® Composer XE 2013 SP1.
4. Demonstrate the use of Intel libraries like the Intel® Math Kernel Library (Intel® MKL).
5. Point you to information on how to debug and profile programs running on an Intel® Xeon Phi™ Coprocessor.
6. Share some best known methods (BKMs) developed by users at Intel.
This document does not:
1. Cover each tool in detail. Please refer to the user guides for the individual tools.
2. Provide in-depth training.
Terminology
Host – The Intel® Xeon® platform containing the Intel® Xeon Phi™ Coprocessor installed in a PCIe* slot. The operating systems (OS) supported on the host are Red Hat* Enterprise Linux* 6.0, 6.1, 6.2, 6.3, 6.4, and 6.5, and SUSE* Linux* Enterprise Server SLES 11 SP2 and SP3. You must install the OS yourself.
Target – The Intel® Xeon Phi™ Coprocessor and corresponding runtime environment installed inside the coprocessor.
uOS – Micro Operating System – the Linux*-based operating system and tools running on the Intel® Xeon Phi™ Coprocessor.
ISA – Instruction Set Architecture – part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O (Input/Output).1
VPU – Vector Processing Unit – the portion of a CPU responsible for the execution of SIMD (single instruction, multiple data) instructions.
1 Intel acronyms dictionary, 8/6/2009, http://library.intel.com/Dictionary/Details.aspx?id=5600
NAcc – Native Acceleration – a mode or form of Intel® MKL in which the data being processed and the MKL function processing the data reside on the Intel® Xeon Phi™ Coprocessor.
Offload Compilers – The Intel® C/C++ Compiler and Intel® Fortran Compiler, which can generate binaries for both the host system and the Intel® Xeon Phi™ Coprocessor. The offload compilers can generate binaries that will run only on the host, only on the Intel® Xeon Phi™ Coprocessor, or paired binaries that run on both the host and the Intel® Xeon Phi™ Coprocessor and communicate with each other.
Intel® MPSS – Intel® Manycore Platform Software Stack – the user- and system-level software that allows programs to run on and communicate with the Intel® Xeon Phi™ Coprocessor.
SCI – Symmetric Communications Interface – the mechanism for inter-node communication within a single platform, where a node is an Intel® Xeon Phi™ Coprocessor or an Intel Xeon processor-based host processor complex. In particular, SCI abstracts the details of communicating over the PCIe bus (and controlling related Intel® Xeon Phi™ Coprocessor hardware) while providing an API that is symmetric between all types of nodes.
System Configuration
The configuration assumed in this document is an Intel workstation containing two Intel® Xeon® processors, one or two Intel® Xeon Phi™ Coprocessors attached to a PCIe* x16 bus, and a GPU for graphics display.
Intel® Xeon Phi™ Software
Figure 1: Software Stack
The Intel® Xeon Phi™ Coprocessor software stack consists of a layered software architecture, as noted below and depicted in Figure 1.
Driver Stack:
The Linux software for the Intel® Xeon Phi™ Coprocessor consists of a number of components:
Device Driver: At the bottom of the software stack in kernel space is the Intel® Xeon Phi™ Coprocessor device driver. The device driver is responsible for managing device initialization and communication between the host and target devices.
Libraries: The libraries live on top of the device driver in user and system space. The libraries provide basic card management capabilities such as enumeration of cards in a system, buffer management, and host-to-card communication. The libraries also provide higher-level functionality such as loading and unloading executables onto the Intel® Xeon Phi™ Coprocessor, invoking functions from the executables on the card, and providing a two-way notification mechanism between host and card. The libraries are responsible for buffer management and communication over the PCIe* bus.
Tools: Various tools that help maintain the software stack. Examples include /usr/bin/micinfo for querying system information, /usr/bin/micflash for updating the card’s flash, /usr/sbin/micctrl to help administrators configure the card, etc.
Card OS (uOS): The Linux-based operating system running on the Intel® Xeon Phi™ Coprocessor.
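As a quick sanity check after installation, the tool paths named above can be probed from a shell. The following is only a minimal sketch: the function name report_tools and the variable MPSS_TOOLS are made up for illustration, and the paths are simply the ones quoted in this guide, so they may differ on your installation.

```shell
#!/bin/sh
# Tool paths as quoted in this guide (assumed; adjust for your installation).
MPSS_TOOLS="/usr/bin/micinfo /usr/bin/micflash /usr/sbin/micctrl"

# report_tools PATH...  (hypothetical helper)
# Prints one status line per path and fails if any path is not executable.
report_tools() {
    missing=0
    for tool in "$@"; do
        if [ -x "$tool" ]; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: probe the three tools mentioned in the text.
report_tools $MPSS_TOOLS || echo "Some Intel MPSS tools were not found at the expected paths."
```

The check only tests that each file exists and is executable; it does not run the tools or talk to the card.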
NOTE: Source for relatively recent versions of the uOS, the device driver, and the low-level SCI library interface can be found at http://software.intel.com/mic-developer .
Intel® Many Integrated Core Architecture Overview
The Intel® Xeon Phi™ Coprocessor has up to 61 in-order Intel® MIC Architecture processor cores running at 1GHz (up to 1.3GHz). The Intel® MIC Architecture is based on the x86 ISA, extended with 64-bit addressing and new 512-bit wide SIMD vector instructions and registers. Each core supports 4 hardware threads. In addition to the cores, there are multiple on-die memory controllers and other components.
Figure 2: Architecture overview of an Intel® MIC Architecture core
Each core includes a newly-designed Vector Processing Unit (VPU). Each vector unit contains 32 512-bit vector registers. To support the new vector processing model, a new 512-bit SIMD ISA was introduced. The VPU is a key feature of the Intel® MIC Architecture-based cores. Fully utilizing the vector unit is critical for best Intel® Xeon Phi™ Coprocessor performance. It is important to note that Intel® MIC Architecture cores do not support other SIMD ISAs (such as MMX, Intel® SSE, or Intel® AVX).
Each core has a 32KB L1 data cache, a 32KB L1 instruction cache, and a 512KB L2 cache. The L2 caches of all cores are interconnected with each other and the memory controllers via a bidirectional ring bus, effectively creating a shared last-level cache of up to 32MB. The design of each core includes a short in-order pipeline. There is no latency in executing scalar operations and low latency in executing vector operations. Due to the short in-order pipeline, the overhead for branch misprediction is low.
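The figures quoted in this overview can be combined into a few back-of-the-envelope numbers. The sketch below is plain shell arithmetic on the values stated above (61 cores, 4 threads per core, 512-bit vectors, 512KB of L2 per core); the helper names are illustrative only, and the results are upper bounds for a 61-core part rather than measured values.

```shell
#!/bin/sh
# Values taken from the architecture overview above.
CORES=61
THREADS_PER_CORE=4
VECTOR_BITS=512
L2_KB_PER_CORE=512

# Illustrative helpers (not part of any Intel tool).
hw_threads()  { echo $(( $1 * $2 )); }   # cores * threads per core
simd_lanes()  { echo $(( $1 / $2 )); }   # vector width / element width, in bits
total_l2_kb() { echo $(( $1 * $2 )); }   # cores * L2 size per core, in KB

echo "hardware threads:         $(hw_threads $CORES $THREADS_PER_CORE)"
echo "32-bit elements / vector: $(simd_lanes $VECTOR_BITS 32)"
echo "64-bit elements / vector: $(simd_lanes $VECTOR_BITS 64)"
echo "aggregate L2 cache (KB):  $(total_l2_kb $CORES $L2_KB_PER_CORE)"
```

For a 61-core coprocessor this gives 244 hardware threads, 16 single-precision or 8 double-precision elements per vector register, and 31232 KB (about 30.5 MB) of combined L2, which is in line with the rounded "up to 32MB" figure above.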
For more details on the machine architecture, please refer to the Intel® Xeon Phi™ Coprocessor Software Developers Guide posted at http://software.intel.com/mic-developer under the “TOOLS & DOWNLOADS” tab.
Administrative Tasks
If you purchased the Intel® Xeon Phi™ Coprocessor from an equipment manufacturer, please go to the Intel® Developer Zone page http://software.intel.com/mic-developer, click on the “TOOLS & DOWNLOADS” tab, and then select “Intel® Manycore Architecture Platform Software Stack (Intel® MPSS)” on that page. This brings you to a page where you can download the latest hardware drivers and release notes for the platform.
Preparing Your System for First Use
Steps to install the driver and start the card
1. From the Intel® Developer Zone page http://software.intel.com/mic-developer, click on the “TOOLS & DOWNLOADS” tab, then select “Intel® Manycore Platform Software Stack (Intel® MPSS)” on this page. Navigate to the latest version of the MPSS release for Linux and download the Readme file for Linux (English) (readme.txt). Also download the release notes (releaseNotes-linux.txt) and the User’s Guide for MPSS.
2. Install one of the following supported 64-bit operating systems on your system (Section 2.1 in readme.txt):
   o Red Hat* Enterprise Linux 6.0, kernel 2.6.32-71
   o Red Hat Enterprise Linux 6.1, kernel 2.6.32-131
   o Red Hat Enterprise Linux 6.2, kernel 2.6.32-220
   o Red Hat Enterprise Linux 6.3, kernel 2.6.32-279
   o Red Hat Enterprise Linux 6.4, kernel 2.6.32-358
   o Red Hat Enterprise Linux 6.5, kernel 2.6.32-431
   o SUSE Linux Enterprise Server SLES 11 SP2, kernel 3.0.13-0.27-default
   o SUSE Linux Enterprise Server SLES 11 SP3, kernel 3.0.76-0.11-default
   Be sure to install ssh, which is used to log in to the card’s uOS.
WARNING: When you install Red Hat, the installer may automatically update you to a newer version of the Linux kernel. If this happens, you will not be able to use the pre-built host driver and will need to rebuild it manually for the new kernel version. Please see section 2.1 in readme.txt for instructions on building an Intel® MPSS host driver for a specific Linux kernel.
3. Log in as root.
4. From the page in step 1, download the release driver appropriate for your operating system (<mpss-version>-rhel-6.0.tgz, <mpss-version>-rhel-6.1.tgz, <mpss-version>-rhel-6.2.tgz, <mpss-version>-rhel-6.3.tgz, <mpss-version>-rhel-6.4.tgz, <mpss-version>-rhel-6.5.tgz, <mpss-version>-suse-11.2.tgz or <mpss-version>-suse-11.3.tgz), where <mpss-version> is mpss-3.2 at the time this document was updated.
5. Install the host driver RPMs as detailed in section 2.2 of readme.txt. Don’t skip the creation of configuration files for your coprocessor.
6. Update the flash on your coprocessor(s) as detailed in section 2.4 of readme.txt.
7. Reboot the system.
8. Start the Intel® Xeon Phi™ Coprocessor (while you can set up the card to start with the host system, it will not do so by default), and then run micinfo to verify that it is set up properly:
   sudo service mpss start
   sudo micctrl -w
   sudo /usr/bin/micinfo
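The kernel requirement from step 2, and the warning about automatic kernel updates, can be checked before the host driver is installed. The following is a minimal sketch, not an Intel-provided tool: kernel_supported and SUPPORTED_KERNELS are hypothetical names, the version list is copied from step 2, and the handling of the ".el6" suffix that Red Hat appends to its kernel release string is an assumption.

```shell
#!/bin/sh
# Kernel versions listed in step 2 of this guide.
SUPPORTED_KERNELS="2.6.32-71 2.6.32-131 2.6.32-220 2.6.32-279 2.6.32-358 2.6.32-431 3.0.13-0.27-default 3.0.76-0.11-default"

# kernel_supported RELEASE  (hypothetical helper)
# Succeeds if RELEASE equals a listed version, or is a listed version
# followed by a Red Hat style suffix such as ".el6.x86_64" (assumed format).
kernel_supported() {
    release=$1
    for base in $SUPPORTED_KERNELS; do
        case "$release" in
            "$base"|"$base".el6*) return 0 ;;
        esac
    done
    return 1
}

# Usage: check the kernel that is currently running.
if kernel_supported "$(uname -r)"; then
    echo "kernel $(uname -r): listed in this guide"
else
    echo "kernel $(uname -r): not listed; the host driver may need to be rebuilt (see readme.txt)"
fi
```

A "not listed" result does not mean that Intel MPSS cannot work, only that the pre-built driver described here may not match the running kernel.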
Make sure that the Driver Version, MPSS Version, and Flash Version reported for your system match the values in the following table:

MPSS stack installed             Driver Version   MPSS Version   Flash Version
mpss-3.2                         3.2-xx           3.2            2.1.03.0386
mpss-3.1                         3.1-xx           3.1            2.1.03.0386
mpss_gold_update_3-2.1.6720-13   6720-13          2.1.6720-13    2.1.02.0386
KNC_gold_update_2-2.1.5889-16    5889-16          2.1.5889-16    2.1.05.0385
KNC_gold_update_1-2.1.4982-15    4982-15          2.1.4982-15    2.1.05.0375
KNC_gold-2.1.4346-xx             4346-xx          2.1.4346-xx    2.1.01.0375

Table 1: Corresponding Driver Version, MPSS Version and Flash Version found in each MPSS release.
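The comparison against Table 1 can be scripted once the micinfo output has been captured. The sketch below assumes that micinfo prints its fields as "Label : value" lines (for example "MPSS Version : 3.2"); that layout is an assumption and should be checked against the output on your system. The helper names micinfo_field, expected_flash and check_versions are hypothetical, and the expected values are copied from Table 1.

```shell
#!/bin/sh
# micinfo_field LABEL  (hypothetical helper)
# Prints the value of the first "LABEL : value" line read from stdin.
micinfo_field() {
    awk -v label="$1" '
        !found {
            pos = index($0, ":")
            if (pos == 0) next
            key = substr($0, 1, pos - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)      # trim the label
            if (key == label) {
                val = substr($0, pos + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", val)  # trim the value
                print val
                found = 1
            }
        }
    '
}

# expected_flash MPSS_VERSION: Flash Version listed in Table 1 for that release.
expected_flash() {
    case "$1" in
        3.2|3.1)      echo "2.1.03.0386" ;;
        2.1.6720-13)  echo "2.1.02.0386" ;;
        2.1.5889-16)  echo "2.1.05.0385" ;;
        2.1.4982-15)  echo "2.1.05.0375" ;;
        2.1.4346-*)   echo "2.1.01.0375" ;;
        *)            return 1 ;;
    esac
}

# check_versions: reads captured micinfo output on stdin and compares it with Table 1.
check_versions() {
    report=$(cat)
    mpss=$(printf '%s\n' "$report" | micinfo_field "MPSS Version")
    flash=$(printf '%s\n' "$report" | micinfo_field "Flash Version")
    want=$(expected_flash "$mpss") || { echo "MPSS Version '$mpss' is not in Table 1"; return 2; }
    if [ "$flash" = "$want" ]; then
        echo "OK: MPSS $mpss with flash $flash"
    else
        echo "MISMATCH: MPSS $mpss expects flash $want, found $flash"
        return 1
    fi
}
```

A possible way to use it is to run sudo /usr/bin/micinfo > micinfo.txt and then check_versions < micinfo.txt. Only the MPSS and Flash versions are compared here; the Driver Version column can be checked the same way.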
Steps to install the Software Development tools
You can purchase Software Development Tools at http://software.intel.com/en-us/linux-tool-suites. Select the tool(s) that fit your needs (e.g., “Intel® Cluster Studio XE 2013”, “Intel® C++ Composer XE for Linux*”, “Intel® Fortran Composer XE for Linux*”, etc.). After selecting the tool you need and completing the purchase, you will receive a serial number. Alternatively, visit http://software.intel.com/en-us/mic-developer and, under “Tools and Downloads”, select “Intel® Software Development Products” to find the latest list of supported tools for the Intel® Xeon Phi™ Coprocessor.
If you acquired a serial number for Intel tools, go to the Intel® Registration Center (IRC) at http://registrationcenter.intel.com to register and download the products. Clicking the “Register Product” button brings you to the download page of the tool(s) you purchased. The following example shows the case of a user who bought the Intel Cluster Studio XE for Linux: at http://software.intel.com/en-us/intel-cluster-studio-xe/, under the Documentation tab, you can get the Install Guide, Getting Started Guide, and Release Notes documents.
1. Follow the instructions in the Install Guide to install the Intel Cluster Studio XE for Linux*. If you bought only the Intel C++ Composer XE for Linux or the Intel Fortran Composer XE for Linux, read the corresponding Install Guide to install that package, and install Intel® VTune™ Amplifier XE 2013 for Linux* separately.
   For first-time installations, be sure to have the product license number described above, which is required to activate the product, and provide it during installation. Subsequent installations can select the “Use existing license” option.
   Read the release notes of the product carefully (icsxe2013sp1-update1-release-notes.pdf if you bought the Intel Cluster Studio XE for Linux, Release_Notes_C_2013SP1_L_EN_Update2.pdf if you bought the Intel C++ Composer XE for Linux, or Release-notes-f-2013sp1-l-en-u2.pdf if you bought the Intel Fortran Composer XE for Linux).
   Untar the product file:
   o tar -xvzf l_ics_2013.<update>.<package_num>.tgz, or
   o tar -xvf l_ccompxe_intel64_2013.<update>.<package_num>.tgz, or
   o tar -xvf l_fcompxe_intel64_2013.<update>.<package_num>.tgz
2. Install the software tools using the previously acquired serial number.
3. Verify that the card is working by running a sample program (located in /opt/intel/composerxe/Samples/en_US/C++/mic_sample for C/C++ code or in /opt/intel/composerxe/Samples/en_US/Fortran/mic_sample for Fortran code) after setting setenv H_TRACE 2 or export H_TRACE=2, which displays the dialog between the host and the Intel® Xeon Phi™ Coprocessor (messages from the coprocessor are prefixed with “MIC:”). If you see this dialog, everything is running fine and the system is ready for general use.
4. If you intend to collect performance data on this system using Intel VTune Amplifier XE 2013:
   a) After MPSS starts, it loads the data collection driver automatically. If it fails to load the data collection driver for some reason, you can load the driver manually by going to /opt/intel/vtune_amplifier_xe/bin64/k1om/ and running:
      sudo sep_micboot_install.sh
   b) Start (or restart) the Intel® Manycore Platform Software Stack service (this also starts the sampling driver once the files are copied in the previous step):
      sudo service mpss restart
      sudo micctrl -r
      sudo micctrl -w
      The coprocessor has successfully restarted when micctrl -w reports “micx: online”.
   c) The sampling driver will now start every time the coprocessor is restarted.
   d) If you ever need to reinstall the sampling driver, it can be done as follows:
      sudo service mpss stop
      sudo sep_micboot_uninstall.sh
      sudo service mpss restart
      sudo micctrl -w
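In a script, the “micx: online” condition from step 4 b) can be tested rather than read by eye. The sketch below is only an illustration: all_cards_online is a hypothetical helper, and it assumes that the status text contains one line per card of the form "mic0: online", based on the description above; the real micctrl output may contain additional text.

```shell
#!/bin/sh
# all_cards_online  (hypothetical helper)
# Reads status lines on stdin. Succeeds only if there is at least one
# non-blank line and every non-blank line starts with "mic<N>: online".
all_cards_online() {
    awk '
        NF == 0              { next }          # ignore blank lines
        /^mic[0-9]+: online/ { ok++; next }    # this card is online
                             { bad++ }         # anything else: not ready
        END { exit !(ok > 0 && bad == 0) }
    '
}

# Usage with captured status text (sample input, not real tool output):
printf 'mic0: online\nmic1: online\n' | all_cards_online && echo "all coprocessors online"
```

On a real system the input could come from the command shown in the steps above, for example sudo micctrl -w | all_cards_online, provided the output matches the assumed format.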
Updating an Existing System
Updating a system that already has an Intel® Xeon Phi™ Coprocessor
1. From the Intel® Developer Zone page http://software.intel.com/mic-developer, click on the “TOOLS & DOWNLOADS” tab, then select Software Drivers: Intel® Manycore Platform Software Stack (Intel®