Nvidia DGX Station User Manual

DGX STATION
DU-08255-001 _v2.1 | May 2018
User Guide
TABLE OF CONTENTS
About this Guide
Chapter 1. Introduction to the NVIDIA® DGX Station™
1.1. What's in the Box
1.2. DGX OS Desktop Software Summary
1.3. DGX Station Hardware Summary
Chapter 2. Setting Up the NVIDIA DGX Station
2.1. Siting the DGX Station
2.2. Removing or Replacing the Packing Inside the DGX Station
2.3. Connecting and Powering on the DGX Station
2.4. Completing the Initial Ubuntu OS Configuration
2.5. Adding Support for Additional Languages to the DGX Station
2.6. Registering Your DGX Station
2.7. Configuring the DGX Station To Use Multiple Displays
2.8. Enabling Multiple Users to Access the DGX Station Remotely
2.9. Preparing the DGX Station for Use with Docker
2.9.1. Enabling Users To Run Docker Containers
2.9.2. Preventing IP Address Conflicts Between Docker and the DGX Station
Chapter 3. Updating DGX Station Software
3.1. Updating DGX Station Software from the Details Window
3.2. Updating DGX Station Software from the Command Line
3.3. Available DGX Station Software Updates
3.3.1. Updates to Docker and Software Exclusive to the DGX Station
3.3.2. Updates to the Ubuntu Software on the DGX Station
3.4. Checking for Updates to DGX Station Software
3.5. Getting Release Information for DGX Station
3.6. Updating Software on an Air-Gapped DGX Station System
3.6.1. Providing DGX Station Software Updates from a Private Repository
3.6.2. Loading a Container Image onto an Air-Gapped DGX Station System
Chapter 4. Maintaining and Servicing the NVIDIA DGX Station
4.1. Problem Resolution and Customer Care
4.2. Cleaning the Mesh Filter Under the DGX Station
4.3. Collecting Information for Troubleshooting the DGX Station
4.4. Checking the Health of the DGX Station
4.5. Replacing the System and Components
4.5.1. Replacing the System
4.5.2. Repacking the DGX Station for Shipment
4.6. Maintaining the DGX Station Persistent Storage
4.6.1. Changing the RAID Level of the RAID Array
4.6.2. Checking the Status of the DGX Station RAID Array
4.6.3. Checking the Status of the DGX Station SSDs
4.6.4. Replacing an SSD
4.6.5. Rebuilding the DGX Station RAID Array
4.7. Restoring the DGX Station Software Image
4.7.1. Obtaining the DGX Station Software ISO Image and Checksum File
4.7.2. Creating a Bootable Installation Medium
4.7.2.1. Creating a Bootable USB Flash Drive by Using Startup Disk Creator
4.7.2.2. Creating a Bootable USB Flash Drive by Using Akeo Rufus
4.7.3. Verifying the Bootable Installation Medium
4.7.3.1. Verifying a Bootable USB Flash Drive
4.7.3.2. Verifying a Bootable DVD-ROM
4.7.4. Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM
4.8. Updating the DGX Station System BIOS
4.9. Maintaining the GPU Liquid Cooling System
4.9.1. Monitoring GPU Temperatures
4.9.2. Checking the Level of the Liquid in the GPU Cooling System
4.9.3. Replenishing the Liquid in the GPU Cooling System
Appendix A. Safety
A.1. Intended Application Uses
A.2. General Precautions
A.3. Electrical Precautions
A.4. Communications Cable Precautions
A.5. Other Hazards
Appendix B. Connections, Controls, and Indicators
B.1. Front-Panel Connections and Controls
B.2. Rear-Panel Connections and Controls
B.3. LAN Port Indicators
B.4. Audio I/O Connections
Appendix C. Compliance
C.1. DGX Station Model Number
C.2. Argentina
C.3. Australia/New Zealand
C.4. Brazil
C.5. Canada
C.6. China
C.7. European Union
C.8. India
C.9. Israel
C.10. Japan
C.11. Russia
C.12. South Africa
C.13. South Korea
C.14. Taiwan
C.15. United States
C.16.United States/Canada.................................................................................74
C.17. Vietnam..................................................................................................74
AppendixD.DGX Station Hardware Specifications...................................................... 75
D.1.Environmental Conditions..............................................................................75
D.2.Component Specifications............................................................................. 75
D.3.Mechanical Specifications..............................................................................76
D.4.Power Specifications....................................................................................76
ABOUT THIS GUIDE
DGX Station User Guide explains how to install, set up, and maintain the NVIDIA® DGX Station™.
This guide is aimed at users and administrators who are familiar with the Ubuntu Desktop Linux OS, including use of the command line and the sudo command. For information about how to use the Ubuntu Desktop Linux OS, refer to Ubuntu Desktop Guide (https://help.ubuntu.com/16.04/ubuntu-help/index.html).
For details about the DGX OS Desktop software for the DGX Station, refer to DGX OS Desktop Release Notes.
For information about how to use the DGX Station to download and run containers for deep learning frameworks, refer to DGX Container Registry User Guide.
Chapter1. INTRODUCTION TO THE NVIDIA® DGX
STATION
The NVIDIA DGX Station is a fast, multi-GPU workstation for deep learning and AI analytics. You can use the DGX Station to run neural networks and deploy deep learning models. Because the DGX Station is software compatible with the NVIDIA DGX-1 server, you can also use the DGX Station to optimize applications to run on a production DGX-1 cluster.
1.1. What's in the Box
DGX Station
Accessory boxes containing:
  Quick Start Guide
  AC power cable
  3 DisplayPort™ 1.2 to HDMI 2.0 adapters
  USB recovery flash drive containing a backup copy of the operating system image and CUDA toolkit
  DVD-ROM containing source code of open-source software installed on the DGX Station
  Toxic Substance Notice and Safety Instructions
  Declaration of Conformity
  Repacking Instructions/Intra-Transit
Inspect each piece of equipment in the packing box. If anything is missing or damaged, contact your supplier.
1.2. DGX OS Desktop Software Summary
The DGX OS Desktop software that is supplied with the DGX Station includes the software that you need for downloading and running containers for deep learning frameworks. The software is already installed on the DGX Station, except where licensing requirements mandate that the software be supplied separately. Any software that must be supplied separately is installed automatically when the DGX Station is first powered on.
For details about the DGX OS Desktop software, refer to DGX OS Desktop Release Notes.
1.3. DGX Station Hardware Summary
Processors
Component            Qty  Description
CPU                  1    Intel Xeon E5-2698 v4 2.2 GHz (20-Core)
GPU - current units  4    NVIDIA Tesla® V100-DGXS-32GB with 32 GB per GPU (128 GB total) of GPU memory
GPU - earlier units  4    NVIDIA Tesla V100-DGXS-16GB with 16 GB per GPU (64 GB total) of GPU memory
System Memory and Storage
Component      Qty  Unit Capacity  Total Capacity  Description
System memory  8    32 GB          256 GB          ECC Registered LRDIMM DDR4 SDRAM
Data storage   3    1.92 TB        5.76 TB         2.5" 6 Gb/s SATA III SSD in RAID 0 configuration
OS storage     1    1.92 TB        1.92 TB         2.5" 6 Gb/s SATA III SSD
Chapter2. SETTING UP THE NVIDIA DGX STATION
Before using the DGX Station, ensure that its initial set-up is complete.
2.1. Siting the DGX Station
Caution
The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, remove the DGX Station from its packaging and move it into position by rolling it on its fitted casters.
To prevent damage to components inside the DGX Station, do not subject the DGX Station to excessive vibration or mechanical shock. After moving or transporting the DGX Station, visually inspect the NVLINK bridge, which connects the GPUs, and the drive trays in the drive cage to see if they have shifted out of position. If any of these components has shifted, reseat the component before operating the DGX Station.
Site the DGX Station in a location that is clean, dust-free, well ventilated, and near an appropriately rated, grounded AC power outlet.
Leave approximately 5" (12.5 cm) of clearance behind and at the sides of the DGX Station to allow sufficient airflow for cooling the unit.
When operating the DGX Station, keep the ambient temperature and relative humidity within the following ranges:
Ambient temperature: 10°C to 30°C (50°F to 86°F)
Relative humidity: 10% to 80% (non-condensing)
Always keep the DGX Station upright. Do not lay the unit on its side.
2.2. Removing or Replacing the Packing Inside the DGX Station
To prevent damage to components inside the DGX Station during transit, a foam packing piece is packed inside the DGX Station. Before you connect and power on the DGX Station, you must remove this packing piece from inside the DGX Station. If you are returning the DGX Station to NVIDIA under a return merchandise authorization (RMA), replace this packing piece before repacking the DGX Station.
Before you begin, ensure that:
The DGX Station is shut down and powered off.
The power cable, all communications cables, and any peripheral devices such as displays and keyboards are disconnected from the DGX Station.
1. Push the button on the right side of the DGX Station back panel to release the side panel on the right of the DGX Station when viewed from the rear.
2. Lift the panel to remove it.
Caution
To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station.
3. Remove or replace the foam packing piece that surrounds the GPU cards inside the DGX Station.
To remove the foam packing piece, gently grasp it and pull it towards you.
If you are unpacking an advance-shipped replacement for a unit that you are returning to NVIDIA under an RMA, retain this foam packing piece with all other DGX Station packaging. You will need the packaging to repack your original DGX Station for shipment to NVIDIA.
To replace the foam packing piece, gently push it into position around the GPU cards inside the DGX Station.
4. Align the bottom edge of the side panel with the bottom edge of the DGX Station.
5. Firmly push the panel back into place to re-engage the latches.
2.3. Connecting and Powering on the DGX Station
To complete this task you need the following items, which are not supplied with the DGX Station:
Display with power cable and connector cable terminated in a DisplayPort connector or HDMI connector
If your display connector cable is terminated in an HDMI connector, you can use one of the supplied adapters to connect the cable to the DGX Station.
USB keyboard
USB mouse
Ethernet cable
1. Connect a display to any DisplayPort connector and a keyboard and mouse to any two USB ports.
For initial setup, connect only one display to the DGX Station. After you complete the initial Ubuntu OS configuration, you can configure the DGX Station to use multiple displays. For details, see Configuring the DGX Station To Use Multiple Displays.
2. Use either of the two Ethernet ports to connect the DGX Station to your LAN with Internet connectivity.
Connect only one Ethernet port on the DGX Station to the Internet unless you plan to configure the ports manually and disable DHCP on at least one of the ports.
By default, both Ethernet ports on the DGX Station are configured for DHCP. If both ports are connected simultaneously, each port will get its own IP address. The IP address that the Linux operating system (OS) uses will then alternate between these addresses, causing the OS and applications to malfunction.
3. Make sure that the power supply rocker switch is in the OFF position.
[Figures not reproduced: current units and earlier units]
4. Connect the supplied power cable from the power socket at the back of the unit to an appropriately rated, grounded AC outlet.
For details of the power consumption, input voltage, and current rating of the DGX Station, see Power Specifications.
[Figures not reproduced: current units and earlier units]
Caution
Use only the supplied power cable and do not use this power cable with any other products or for any other purpose. Not all power cables have the same current ratings.
Do not use household extension cables with your product. Household extension cables do not have overload protection and are not intended for use with computer systems.
5. Connect the display to a suitable AC outlet and power on the display.
6. Move the DGX Station power supply rocker switch to the ON position.
[Figures not reproduced: current units and earlier units]
7. Push the Power button on the front of the unit to power on the DGX Station.
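Once the DGX Station is running and you are logged in, you can check whether more than one Ethernet port has obtained a DHCP address, which is the condition that step 2 warns against. The following is a minimal sketch, not an NVIDIA-supplied tool. The function name count_dhcp_interfaces is hypothetical, and the sketch assumes that DHCP-assigned addresses carry the dynamic flag in the output of ip -4 -o addr show, as they normally do on Ubuntu.

```shell
# Count the network interfaces that hold a DHCP-assigned IPv4 address.
# Reads the one-line-per-address output of `ip -4 -o addr show` on
# standard input. Assumption: DHCP leases are flagged "dynamic".
count_dhcp_interfaces() {
    awk '/ scope global / && / dynamic / { seen[$2] = 1 }
         END { n = 0; for (i in seen) n++; print n }'
}
```

To use the function, pipe the output of the ip command into it:
$ ip -4 -o addr show | count_dhcp_interfaces
A result greater than 1 suggests that both Ethernet ports currently hold DHCP addresses.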
2.4. Completing the Initial Ubuntu OS Configuration
When you power on the DGX Station for the first time, you are prompted to accept end user license agreements for NVIDIA software. You are then guided through the process for completing the initial Ubuntu OS configuration. As part of this process, you are prompted to create your user name and password for logging in to the DGX Station.
To protect the DGX Station from unauthorized access, choose a strong password. The strength of the password you choose is indicated as you type it.
After the Ubuntu OS configuration is complete, you can log in to the DGX Station to access your Ubuntu desktop.
Updates to the DGX Station software might have been made available after your DGX Station was manufactured. To ensure that you have the latest DGX Station software, including security updates, check for updates and install any available updates before using your DGX Station. For more information, see Updating DGX Station Software.
2.5. Adding Support for Additional Languages to the DGX Station
During the initial Ubuntu OS configuration, you are prompted to select the default language on the DGX Station. If the language that you select is included in the DGX OS Desktop software image, it is installed in addition to English and you will see that language after you log in to access your desktop. If the language that you select is not included, you will still see English after logging in and you will need to install the language separately.
The following languages are included in the DGX OS Desktop software image:
English
Chinese (Simplified)
French
German
Italian
Portuguese
Russian
Spanish
For information about how to install languages, see Install languages (https://help.ubuntu.com/16.04/ubuntu-help/prefs-language-install.html) in the Ubuntu Official Documentation.
2.6. Registering Your DGX Station
Be sure to register your DGX Station with NVIDIA as soon as you receive your purchase confirmation e-mail. By registering your DGX Station, you will be entitled to receive technical support, warranty services, and software updates. You will also be able to set up an NVIDIA DGX Container Registry account.
To register your DGX Station, you will need information provided in your purchase confirmation e-mail. If you do not have the information, send an e-mail to NVIDIA Enterprise Support at enterprisesupport@nvidia.com.
1. From a browser, go to the NVIDIA DGX Product Registration (http://www.nvidia.com/object/dgx-product-registration) page.
2. Enter all required information and then click SUBMIT to complete the registration process and receive all warranty entitlements and, if applicable, DGX Station support services entitlements.
2.7. Configuring the DGX Station To Use Multiple Displays
One of the NVIDIA Tesla V100 GPU cards in the DGX Station provides three DisplayPort connectors, enabling you to connect up to three displays to the DGX Station. If you want to use more than one display with the DGX Station, configure it to use multiple displays after you complete the initial Ubuntu OS configuration.
1. Connect the displays that you want to use to the DisplayPort connectors at the rear of the DGX Station.
Each display is automatically detected as you connect it.
2. Optional: If necessary, adjust the display configuration, such as switching the primary display, or changing monitor positions or orientation.
a) From the Ubuntu system menu at the right of the desktop menu bar, choose System Settings and in the System Settings window that opens, click Displays.
b) In the Displays window that opens, make the changes to the display settings that you want and click Apply.
High-resolution displays consume a large quantity of GPU memory. If you have connected three 4K displays to the DGX Station, they may consume most of the GPU memory on the NVIDIA Tesla V100 GPU card to which they are connected, especially if you are running graphics-intensive applications.
If you are running memory-intensive compute workloads on the DGX Station and are experiencing performance issues, consider conserving GPU memory by reducing or minimizing the graphics workload.
To reduce the graphics workload, disconnect any additional displays you connected and use only one display with the DGX Station.
If you disconnect a display from the DGX Station, the disconnection is automatically detected and the display settings are automatically adjusted for the remaining displays.
To minimize the graphics workload, shut down the LightDM display manager and use secure shell (SSH) to log in to the DGX Station remotely.
To shut down the LightDM display manager, type the following command:
$ sudo service lightdm stop
To start the LightDM display manager, log in to the DGX Station remotely and type the following command:
$ sudo service lightdm start
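To see how much GPU memory is in use before and after you reduce the graphics workload, you can query the GPUs with nvidia-smi. The sketch below is illustrative only. The function name gpu_memory_report is hypothetical, and it simply formats the comma-separated values that nvidia-smi prints when it is queried for index, memory.used, and memory.total.

```shell
# Summarize per-GPU memory use.
# Reads lines of the form "index, used_MiB, total_MiB" on standard input,
# as printed by nvidia-smi with --format=csv,noheader,nounits.
gpu_memory_report() {
    awk -F', *' '{
        pct = ($3 > 0) ? int($2 * 100 / $3) : 0
        printf "GPU %s: %d of %d MiB used (%d%%)\n", $1, $2, $3, pct
    }'
}
```

To use the function, pipe the query output into it:
$ nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits | gpu_memory_report
The GPU to which the displays are connected should report lower memory use after LightDM is stopped.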
2.8. Enabling Multiple Users to Access the DGX Station Remotely
To enable multiple users to access the DGX Station remotely, a secure shell (SSH) server is installed and enabled on the DGX Station.
Add other Ubuntu OS users to the DGX Station to allow them to log in remotely to the DGX Station through SSH.
For information about how to add a user, see Add a new user account (https://help.ubuntu.com/16.04/ubuntu-help/user-add.html) in the Ubuntu Official Documentation. For information about how to log in remotely through SSH, see Connecting to an OpenSSH Server (https://help.ubuntu.com/community/SSH/OpenSSH/ConnectingTo) on the Ubuntu Community Help Wiki.
The DGX Station does not provide any additional isolation guarantees between users beyond the guarantees that the Ubuntu OS offers. For guidelines about how to secure access to the DGX Station over SSH, see Configuring an OpenSSH Server (https://help.ubuntu.com/community/SSH/OpenSSH/Configuring) on the Ubuntu Community Help Wiki.
2.9. Preparing the DGX Station for Use with Docker
Some initial setup of the DGX Station is required to ensure that users have the required privileges to run Docker containers and to prevent IP address conflicts between Docker and the DGX Station.
2.9.1. Enabling Users To Run Docker Containers
To prevent the docker daemon from running without protection against escalation of privileges, the Docker software requires sudo privileges to run containers. Meeting this requirement involves enabling users who will run Docker containers to run commands with sudo privileges. Therefore, you should ensure that only users whom you trust and who are aware of the potential risks to the DGX Station of running commands with sudo privileges are able to run Docker containers.
Before allowing multiple users to run commands with sudo privileges, consult your IT department to determine whether you would be violating your organization's security policies. For the security implications of enabling users to run Docker containers, see Docker daemon attack surface.
You can enable users to run the Docker containers in one of the following ways:
Add each user as an administrator user with sudo privileges.
Add each user as a standard user without sudo privileges and then add the user to the docker group. This approach is inherently insecure because any user who can send commands to the docker engine can escalate privilege and run root-user operations.
To add an existing user to the docker group, run this command:
$ sudo usermod -aG docker user-login-id
user-login-id
The user login ID of the existing user that you are adding to the docker group.
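After running usermod, you may want to confirm that the user now belongs to the docker group. The following sketch shows one way to do so. The function name has_group is hypothetical, and the check relies only on the space-separated list of group names that id -nG prints.

```shell
# Succeed if the space-separated group list in $1 contains the group
# named in $2. Padding both sides with spaces prevents partial matches
# (for example, "dockerusers" does not count as "docker").
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

To use the function, pass it the output of the id command for the user:
$ has_group "$(id -nG user-login-id)" docker && echo "user-login-id is in the docker group"
A user who was logged in when the group was added must log out and log in again before the new membership applies to that user's sessions.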
2.9.2. Preventing IP Address Conflicts Between Docker and the DGX Station
To ensure that the DGX Station can access the network interfaces for Docker containers, configure the containers to use a subnet distinct from other network resources used by the DGX Station. By default, Docker uses the 172.17.0.0/16 subnet. If addresses within this range are already used on the DGX Station network, change the Docker network to specify the bridge IP address range and container IP address range to be used by Docker containers.
This task requires sudo privileges.
1. Open the /etc/systemd/system/docker.service.d/docker-override.conf file in a plain-text editor, such as vi.
$ sudo vi /etc/systemd/system/docker.service.d/docker-override.conf
2. Append the following options to the line that begins ExecStart=/usr/bin/dockerd, which specifies the command to start the dockerd daemon:
--bip=bridge-ip-address-range
--fixed-cidr=container-ip-address-range
bridge-ip-address-range
The bridge IP address range to be used by Docker containers, for example, 192.168.127.1/24.
container-ip-address-range
The container IP address range to be used by Docker containers, for example, 192.168.127.128/25.
This example shows a complete /etc/systemd/system/docker.service.d/docker-override.conf file that has been edited to specify the bridge IP address range and container IP address range to be used by Docker containers.
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay2 --default-shm-size=1G --bip=192.168.127.1/24 --fixed-cidr=192.168.127.128/25
LimitMEMLOCK=infinity
LimitSTACK=67108864
Starting with DGX OS Desktop release 3.1.4, the option --disable-legacy-registry=false is removed from the Docker CE service configuration file docker-override.conf. The option is removed for compatibility with Docker CE 17.12 and later.
3. Save and close the /etc/systemd/system/docker.service.d/docker-override.conf file.
4. Reload the Docker settings for the systemd daemon.
$ sudo systemctl daemon-reload
5. Restart the docker service.
$ sudo systemctl restart docker
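The container IP address range given to --fixed-cidr must be a subset of the subnet defined by the bridge IP address range given to --bip. Before you edit the file, you can check a pair of values with a small helper such as the following. This is a sketch under stated assumptions, not part of the DGX OS Desktop software: the function names ip_to_int and cidr_contains are hypothetical, and the helper handles only IPv4 addresses written without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if the CIDR range in $2 lies inside the subnet of the
# CIDR address in $1 (for example, the --bip value).
cidr_contains() {
    local outer_ip="${1%/*}" outer_len="${1#*/}"
    local inner_ip="${2%/*}" inner_len="${2#*/}"
    # The inner range must be at least as specific as the outer subnet.
    [ "$inner_len" -ge "$outer_len" ] || return 1
    local mask=$(( (0xFFFFFFFF << (32 - outer_len)) & 0xFFFFFFFF ))
    # Both addresses must share the same network prefix under the outer mask.
    [ $(( $(ip_to_int "$outer_ip") & mask )) -eq $(( $(ip_to_int "$inner_ip") & mask )) ]
}
```

For the example values used in this section, the check succeeds:
$ cidr_contains 192.168.127.1/24 192.168.127.128/25 && echo "container range is inside the bridge subnet"
After the docker service restarts, you can confirm the address that the bridge actually received:
$ ip -4 addr show docker0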
Chapter3. UPDATING DGX STATION SOFTWARE
Updates to DGX Station software are available from several sources. These updates may contain important security vulnerability fixes. You are responsible for updating the software on the DGX Station from these sources. For details about the available updates, see Available DGX Station Software Updates.
You can use any of the standard means provided by the Ubuntu Desktop OS to update this software. For examples, see:
Updating DGX Station Software from the Details Window
Updating DGX Station Software from the Command Line
Caution
When you use these means to update software on the DGX Station, you update all software for which updates are available from your configured software sources, including applications that you installed yourself. If you want to prevent an application from being updated, you can instruct the Ubuntu package manager to keep the current version. For more information, see Introduction to Holding Packages (https://help.ubuntu.com/community/PinningHowto#Introduction_to_Holding_Packages) on the Ubuntu Community Help Wiki.
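One way to keep a package at its current version is the apt-mark command, which is part of the standard Ubuntu package tools. The sketch below is an illustration rather than an NVIDIA-recommended procedure. The placeholder package-name stands for the package that you want to hold, and the function name is_held is hypothetical.

```shell
# Hold a package so that updates skip it, and release it later:
#   sudo apt-mark hold package-name
#   sudo apt-mark unhold package-name

# Succeed if package $1 appears in a list of held packages read from
# standard input (one name per line, as printed by `apt-mark showhold`).
is_held() {
    grep -Fxq -- "$1"
}
```

To confirm that a hold is in place, pipe the list of held packages into the function:
$ apt-mark showhold | is_held package-name && echo "package-name is held"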
3.1. Updating DGX Station Software from the Details Window
When you open the Details window to get information about your DGX Station, the system checks for updates and, if any updates are available, gives you the option to install them.
Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user.
1. From the Ubuntu system menu at the top right of the desktop, choose About This Computer.
The Details window opens and the system checks for updates.
2. In the Details window, click Install Updates.
3. In the Software Updater window that opens, review the available updates and click Install Now.
If no updates are available, the Software Updater informs you that your software is up to date.
If an update requires the removal of obsolete packages, you will be warned that not all updates can be installed. To continue with the update, perform these steps:
a) Click Partial Upgrade.
b) Review the list of packages that will be removed.
To identify obsolete DGX OS Desktop packages, see the lists of obsolete packages in the DGX OS Desktop Release Notes for all releases after your current release.
c) If the list contains only packages that you want to remove, click Start Upgrade.
4. When prompted to authenticate, type your password into the Password field and click Authenticate.
5. If necessary, restart your DGX Station when prompted to complete the updates.
3.2. Updating DGX Station Software from the Command Line
Use the apt (http://manpages.ubuntu.com/manpages/xenial/en/man8/apt.8.html) command to update DGX Station software from the command line.
Ensure that you are logged in to your Ubuntu desktop on the DGX Station as an administrator user.
1. Download information from all configured sources about the latest versions of the packages.
$ sudo apt update
2. Review the available updates by simulating an upgrade of the packages.
$ sudo apt full-upgrade -s
3. Upgrade the packages to the latest version.
$ sudo apt full-upgrade
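The simulated upgrade in step 2 prints one line per planned action, which can be long to review by eye. The following Python sketch groups the package names by action. It assumes the conventional apt simulation format, in which action lines begin with Inst (install or upgrade) or Remv (remove). The function name and the sample package names are illustrative and are not part of the DGX OS Desktop software.

```python
def summarize_simulation(output):
    """Group package names from `apt full-upgrade -s` output by planned action.

    Assumes each action line has the form "<Action> <package> ...", where
    <Action> is Inst or Remv. All other lines (for example, Conf) are ignored.
    """
    summary = {"Inst": [], "Remv": []}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] in summary:
            summary[fields[0]].append(fields[1])
    return summary


# Illustrative sample only; real output depends on the installed packages.
SAMPLE = """\
Inst example-lib [1.0-1] (1.1-1 Ubuntu:16.04/xenial-updates [amd64])
Remv example-obsolete [0.9-2]
Conf example-lib (1.1-1 Ubuntu:16.04/xenial-updates [amd64])
"""

result = summarize_simulation(SAMPLE)
print("To upgrade:", result["Inst"])
print("To remove:", result["Remv"])
```

Any package listed under Remv is worth comparing against the lists of obsolete packages in the DGX OS Desktop Release Notes before you run the real upgrade.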
3.3.Available DGX Station Software Updates
Updates to DGX Station are made available through standard Ubuntu repositories.
The DGX Station is preconfigured to obtain updates to the following software from these repositories:
Docker
Software that is exclusive to the DGX Station, including the CUDA Toolkit and CUDA Drivers packages
Ubuntu software
For more information about repositories, see Repositories/Ubuntu (https://help.ubuntu.com/community/Repositories/Ubuntu) on the Ubuntu Community Help Wiki.
3.3.1.Updates to Docker and Software Exclusive to the DGX Station
Updates to Docker and to software that is exclusive to the DGX Station, including the CUDA Toolkit and CUDA Drivers packages, are available from a repository maintained by NVIDIA.
Caution
Do not obtain updates to the CUDA Toolkit and CUDA Drivers packages from the public CUDA package repository for Ubuntu. Updates from the public repository may be incompatible with the DGX Optimized Frameworks that are available from the NVIDIA® DGX™ Container Registry.
Do not obtain updates to Docker from Docker's repositories. NVIDIA Container Runtime for Docker has strict dependencies on the Docker CE version and updates from Docker's repository may cause NVIDIA Container Runtime for Docker to be removed.
The repository maintained by NVIDIA is enabled by default in Ubuntu Software & Updates, Other Software on the DGX Station, as shown in the following screen capture.
Although a Docker repository is also enabled, DGX Station no longer uses this repository to obtain updates to Docker because the repository maintained by NVIDIA takes precedence over the Docker repository.
3.3.2.Updates to the Ubuntu Software on the DGX Station
Updates to the Ubuntu software on the DGX Station are available from the Canonical repositories.
The repositories that are enabled by default in Ubuntu Software & Updates, Ubuntu Software on the DGX Station are shown in the following screen capture.
By default, the DGX Station does not notify you of available updates or automatically install any updates, including important security updates. To minimize the risk to your DGX Station from security vulnerabilities, you must ensure that it is kept up to date with the latest important security updates.
Updates to another LTS base OS version are blocked because they can disrupt the DGX Station software and disable the NVIDIA graphics drivers.
3.4.Checking for Updates to DGX Station Software
To check for software updates and to configure updates from the Ubuntu software repositories, use System Settings, Software & Updates. You can configure your DGX Station to notify you of important security updates more frequently than other updates.
In the following example, the DGX Station is configured to check for updates daily, to display important security updates immediately, and to display other updates every two weeks.
3.5.Getting Release Information for DGX Station
The file /etc/dgx-release provides release information for the DGX Station, such as the product name and serial number.
This file also tracks the history of DGX OS Desktop software updates by providing the following information:
The version number and installation date of the last version to be installed from an ISO image
The version number and update date of each over-the-network update applied since the software was last installed from an ISO image
You can use this information to determine if your DGX Station is running the current version of the DGX OS Desktop software.
To get release information for the DGX Station, view the content of the file /etc/dgx-release.
For example:
$ more /etc/dgx-release
DGX_NAME="DGX Station"
DGX_PRETTY_NAME="NVIDIA DGX Station"
DGX_SWBUILD_DATE="2017-09-18"
DGX_SWBUILD_VERSION="3.1.2"
DGX_COMMIT_ID="15cd1f473bb53d9b64503e06c5fee8d2e3738ece"
DGX_SERIAL_NUMBER=XXXXXXXXXXXXX
DGX_OTA_VERSION="3.1.3"
DGX_OTA_DATE="Wed Nov 15 15:35:25 PST 2017"
DGX_OTA_VERSION="3.1.4"
DGX_OTA_DATE="Fri Jan 19 13:49:06 PST 2018"
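Because the DGX_OTA_VERSION and DGX_OTA_DATE keys repeat once per over-the-network update, the file cannot be read as a simple key-to-value map. The following Python sketch shows one way to separate the ISO installation fields from the update history. It assumes one KEY=value entry per line, as in the example above, and that the last update entry in the file is the most recent. The function names are illustrative.

```python
def parse_dgx_release(text):
    """Return (base_fields, ota_history) parsed from /etc/dgx-release content.

    base_fields maps each non-OTA key to its value. ota_history is a list of
    (version, date) tuples in file order, because the DGX_OTA_* keys repeat.
    """
    base_fields = {}
    ota_history = []
    pending_version = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip().strip('"')
        if key == "DGX_OTA_VERSION":
            pending_version = value
        elif key == "DGX_OTA_DATE" and pending_version is not None:
            ota_history.append((pending_version, value))
            pending_version = None
        else:
            base_fields[key] = value
    return base_fields, ota_history


def current_version(text):
    """Latest OTA version if any update was applied, else the ISO version."""
    base_fields, ota_history = parse_dgx_release(text)
    if ota_history:
        return ota_history[-1][0]
    return base_fields.get("DGX_SWBUILD_VERSION")


# Abbreviated copy of the example file content from this section.
SAMPLE = """\
DGX_NAME="DGX Station"
DGX_SWBUILD_DATE="2017-09-18"
DGX_SWBUILD_VERSION="3.1.2"
DGX_OTA_VERSION="3.1.3"
DGX_OTA_DATE="Wed Nov 15 15:35:25 PST 2017"
DGX_OTA_VERSION="3.1.4"
DGX_OTA_DATE="Fri Jan 19 13:49:06 PST 2018"
"""

print(current_version(SAMPLE))
```

On a DGX Station, the same functions could be applied to the text read from /etc/dgx-release instead of the sample string.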
3.6.Updating Software on an Air-Gapped DGX Station System
For security purposes, some installations require that the DGX Station be an air-gapped system. An air-gapped system is not connected to any unsecured networks, such as the public Internet or an unsecured LAN, or to any other computers connected to an unsecured network. The default mechanisms for updating software on the DGX Station and loading container images from the NVIDIA DGX Container Registry require an Internet connection. On an air-gapped system, which is isolated from the Internet, you must provide alternative mechanisms for updating software and loading container images.
3.6.1.Providing DGX Station Software Updates from a Private Repository
The public NVIDIA and Canonical repositories that provide software updates to the DGX Station are Ubuntu repositories. Access to these repositories requires an Internet connection. On an air-gapped system, which is isolated from the Internet, you must provide these updates from a private repository that mirrors the public repositories.
1. Identify the sources corresponding to the public NVIDIA and Canonical repositories that provide updates to the DGX Station software.
You can identify these sources from the /etc/apt/sources.list file and the contents of the /etc/apt/sources.list.d/ directory, or by using System Settings, Software & Updates.
2. Create and maintain a private repository that mirrors the sources that you identified in the previous step.
For detailed instructions, refer to Debian Repository Setup (https://wiki.debian.org/DebianRepository/Setup) on the Debian wiki.
3. Update the sources that provide updates to the DGX Station to use your private repository instead of the public repositories.
You can update these sources by modifying the /etc/apt/sources.list file and the contents of the /etc/apt/sources.list.d/ directory, or by using System Settings, Software & Updates.
Future updates to the DGX Station software will be obtained from your private repository.
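The edit in step 3 amounts to replacing each public repository URL with the URL of its private mirror in every active deb or deb-src line. The following Python sketch illustrates that rewrite on the text of a sources file. The public URL shown is only an example of a Canonical archive entry, the mirror host mirror.example.internal is hypothetical, and the entries on your DGX Station may differ. Commented-out lines are left unchanged.

```python
def retarget_sources(text, url_map):
    """Rewrite active deb/deb-src lines to point at private mirror URLs.

    url_map maps a public repository URL to the private URL that mirrors it.
    Comment lines and blank lines are returned unchanged.
    """
    rewritten = []
    for line in text.splitlines():
        if line.strip().startswith(("deb ", "deb-src ")):
            for public_url, private_url in url_map.items():
                line = line.replace(public_url, private_url)
        rewritten.append(line)
    return "\n".join(rewritten)


# Illustrative input; the mirror host below is a hypothetical placeholder.
SOURCES = """\
deb http://archive.ubuntu.com/ubuntu xenial main
# deb http://archive.ubuntu.com/ubuntu xenial universe
"""
URL_MAP = {"http://archive.ubuntu.com/ubuntu": "http://mirror.example.internal/ubuntu"}

print(retarget_sources(SOURCES, URL_MAP))
```

Keep a backup copy of each original file before writing the rewritten text back, so that the public sources can be restored if needed.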
3.6.2.Loading a Container Image onto an Air-Gapped DGX Station System
Loading a container image from the NVIDIA DGX Container Registry requires an Internet connection. On an air-gapped system, which is isolated from the Internet, you must use a removable medium to copy the container image from a system with an Internet connection to the air-gapped system.
1. On a system with an Internet connection, log in to the NVIDIA DGX Container Registry and load the container image that you want.
For instructions, refer to the DGX Container Registry User Guide.
2. Save the container image as a tar archive.
$ docker save nvcr.io/registry-space/repository:tag > archive-file.tar
registry-space
The name of the space within the registry that contains the container image. For container images provided by NVIDIA, the registry space is nvidia.
repository
The repository that contains the container image. A repository is a collection of all versions of a container image with the same name. The repository name is the main container image name.
tag
A tag that identifies the version of the container image.
archive-file
Your choice of name for the archive file to which you are saving the container image.
3. Transfer the image to the air-gapped system by using a removable medium such as a USB flash drive or DVD-ROM.
4. On the air-gapped system, load the container image from the local copy of the archive file that contains the image.
$ docker load -i archive-file.tar
5. Confirm that the image is loaded on the air-gapped system.
$ docker images
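If you transfer images regularly, the save and load commands can be assembled from the same placeholders that are described in step 2. The following Python sketch only builds the argument lists and does not run Docker. It uses docker save -o, which writes the archive to a named file and is equivalent to the redirection shown in step 2. The repository name and tag in the usage lines are hypothetical.

```python
def image_reference(registry_space, repository, tag):
    """Build the full image name in the form used in step 2."""
    return "nvcr.io/{}/{}:{}".format(registry_space, repository, tag)


def save_command(registry_space, repository, tag, archive_file):
    """Arguments for saving the image on the Internet-connected system."""
    reference = image_reference(registry_space, repository, tag)
    return ["docker", "save", "-o", archive_file, reference]


def load_command(archive_file):
    """Arguments for loading the archive on the air-gapped system."""
    return ["docker", "load", "-i", archive_file]


# Hypothetical repository and tag, for illustration only.
print(" ".join(save_command("nvidia", "example-framework", "1.0", "archive-file.tar")))
print(" ".join(load_command("archive-file.tar")))
```

Each list can be passed to subprocess.run on a system where Docker is installed and the user is permitted to run it.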
Chapter4. MAINTAINING AND SERVICING THE NVIDIA
DGX STATION
Be sure to familiarize yourself with the NVIDIA Terms & Conditions documents before attempting to perform any modification or repair to the DGX Station. These Terms & Conditions for the DGX Station can be found through the NVIDIA DGX Systems
Support (http://www.nvidia.com/object/dgxsystems-support.html) page.
Caution The DGX Station is designed as an integrated system and does not support the installation of additional PCIe devices such as GPU cards. Any attempt to modify the DGX Station by installing additional PCIe devices is an unauthorized modification and will void the DGX Station hardware warranty. Any such modification will also impair the performance of the system, may overload the system’s electrical circuits, and may cause it to overheat.
4.1.Problem Resolution and Customer Care
Log on to the NVIDIA Enterprise Services (https://nvid.nvidia.com/dashboard/) site for assistance with troubleshooting and diagnostics, or to report problems with your DGX Station.
4.2.Cleaning the Mesh Filter Under the DGX Station
To prevent dust from entering the DGX Station through the ventilation holes under the unit, a mesh filter is fitted to the underside of the DGX Station. Clean this mesh filter periodically to prevent the accumulation of dust on the filter from impeding the flow of air through the DGX Station.
1. Reach under the front of the DGX Station and grasp the mesh filter by its handle.
2. Pull the mesh filter towards you to slide it out from the front of the unit.
3. Use compressed air to blow the dust from the mesh filter.
4. Line up the mesh filter with the runners under the DGX Station and slide it back into position under the unit.
4.3.Collecting Information for Troubleshooting the DGX Station
To help diagnose and resolve issues, the DGX Station provides a tool to collect troubleshooting information for NVIDIA Support Enterprise Services.
The tool verifies basic functionality and performance of the DGX Station and collects the following information:
Log files
Hardware inventory
Software inventory
To collect information for troubleshooting the DGX Station, run the following command:
$ sudo nvsysinfo [-o output-file]
For DGX OS Desktop releases 3.1.1 through 3.1.3, the command to run is as follows:
$ sudo nvidia-sysinfo [-o output-file]
output-file is the name and the path of the file in which the information is written. If you omit the output file, the information is written to the file /tmp/nvsysinfo-timestamp.random-number.out.
For DGX OS Desktop releases 3.1.1 through 3.1.3, the file name is /tmp/nvidia-sys-info-timestamp.random-number.out.
Use any method that is convenient for you to send the file to NVIDIA Support Enterprise Services. For example, send the file as an e-mail attachment.
4.4.Checking the Health of the DGX Station
The DGX Station provides the NVIDIA System Health Checker (nvhealth) tool to exercise the system and verify its health. The output of nvhealth is an itemized list of checks and their status, typically Healthy or Unhealthy. On a healthy system, all checks should return Healthy. You should investigate any checks that return Unhealthy to determine their root cause and resolve them.
To check the health of the DGX Station, run the following command:
$ sudo nvhealth [-k output-file]
output-file is the name and the path of the file in which the raw state of the system is written. If you omit the output file, the information is written to the file /tmp/nvhealth-log.random-string.jsonl, for example, /tmp/nvhealth-log.6wf3WriAC3.jsonl. The nvhealth command displays this file name at the end of the output from the command.
4.5.Replacing the System and Components
Be sure to familiarize yourself with the NVIDIA Terms & Conditions documents before attempting to perform any modification or repair to the DGX Station. These Terms & Conditions for the DGX Station can be found through the NVIDIA DGX Systems Support (http://www.nvidia.com/object/dgxsystems-support.html) page.
Contact NVIDIA Enterprise Customer support to obtain an RMA number for any system or component that needs to be returned for repair or replacement.
The only components that are customer-replaceable are the Solid State Drives (SSDs).
Return the failed components to NVIDIA.
4.5.1.Replacing the System
When returning a DGX Station under RMA, consider the following points.
Packaging
To prevent damage during shipping, repack the DGX Station in the packaging in which the replacement unit was shipped to you in advance, following the instructions in Repacking the DGX Station for Shipment.
SSDs
If necessary, you can remove and keep the SSDs prior to shipping the system back for replacement. If you already received a replacement system and you want to keep the original SSDs, install the new SSDs into the defective system when shipping it back.
AC Power Cable
Do not return the AC power cable when returning the DGX Station.
Accessories
Include all supplied accessories except the AC power cable when returning the DGX Station.
4.5.2.Repacking the DGX Station for Shipment
If you are returning the DGX Station to NVIDIA under an RMA, repack it in the packaging in which the replacement unit was shipped to you in advance to prevent damage during shipment.
Caution The DGX Station weighs 88 lbs (40 kg). Do not attempt to lift the DGX Station. Instead, move it into position by rolling it on its fitted casters.
Before you begin, ensure that the foam packing piece that surrounds the GPU cards inside the DGX Station has been replaced. For detailed instructions, see Removing or Replacing the Packing Inside the DGX Station.
1. Place the bottom tray of the DGX Station shipping carton on the floor and ensure that the flap at the front of the tray is pulled down to form a ramp.
2. Roll the DGX Station up the ramp into the bottom tray of its shipping carton.
Caution Ensure that you have a second person to help you roll the DGX Station into position.
3. Insert the front packing piece into the tray, ensuring that the lip of the packing piece is under the DGX Station.
4. Insert the side packing pieces into the tray, ensuring that the lip of each piece is under the DGX Station.
5. Pack all supplied accessories in the accessory boxes except the AC power cable.
Keep the AC power cable to use with your replacement DGX Station.
6. Place both accessory boxes in the slots in the tray on each side of the DGX Station.
Ensure that the lugs that protrude from the edges of each accessory box are facing away from the DGX Station.
The accessory boxes are required to help hold the DGX Station in place in its packaging during shipment. Be sure to place both accessory boxes in the slots in the tray, even if one or both boxes are empty.
7. Pull up the flap at the front of the bottom tray of the DGX Station shipping carton.
8. Lower the top cover of the shipping carton into position so that the holes in the top cover and the holes in the bottom tray are aligned.
9. Insert the packing clasps into the cutouts in the top cover of the shipping carton and engage the clasps to secure the top cover in place.
To prevent the packing clasps from becoming jammed inside the shipping carton, do not use excessive force when inserting them into the cutouts.
4.6.Maintaining the DGX Station Persistent Storage
The DGX Station persistent storage consists of SSDs for data storage and the operating system. As supplied from the factory, these SSDs are configured as described in System Memory and Storage.
4.6.1.Changing the RAID Level of the RAID Array
As supplied from the factory, the RAID level of the DGX Station RAID array is RAID 0. RAID 0 provides the maximum storage capacity, but does not provide any redundancy. If a single SSD in the array fails, all data stored on the array is lost. If you are willing to accept reduced capacity in return for some level of protection against failure of a single SSD, you can change the level of the RAID array to RAID 5. If you change the RAID level from RAID 0 to RAID 5, the total storage capacity of the RAID array is reduced from 5.76 TB to 3.84 TB.
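The two capacity figures follow from the standard RAID arithmetic for the three 1.92-terabyte SSDs in the array: RAID 0 stripes data across all drives, while RAID 5 gives up the equivalent of one drive to parity. The following Python sketch reproduces the calculation. The function name is illustrative, and the result is nominal capacity before any file system overhead.

```python
def usable_capacity_tb(num_drives, drive_tb, level):
    """Nominal usable capacity of an array of equally sized drives."""
    if level == "raid0":
        # Striping only: every drive contributes its full capacity.
        return num_drives * drive_tb
    if level == "raid5":
        if num_drives < 3:
            raise ValueError("RAID 5 requires at least three drives")
        # The equivalent of one drive is used for parity.
        return (num_drives - 1) * drive_tb
    raise ValueError("unsupported RAID level: {}".format(level))


print(round(usable_capacity_tb(3, 1.92, "raid0"), 2))  # RAID 0 capacity in TB
print(round(usable_capacity_tb(3, 1.92, "raid5"), 2))  # RAID 5 capacity in TB
```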
Before changing the RAID level of the DGX Station RAID array, back up all data on the array that you want to preserve. Changing the RAID level of the DGX Station RAID array erases all data stored on the array.
The DGX Station software includes the custom script configure_raid_array.py, which you can use to change the level of the RAID array without unmounting the RAID volume.
To change the RAID level to RAID 5, run the following command:
$ sudo configure_raid_array.py -m raid5
To change the RAID level to RAID 0, run the following command:
$ sudo configure_raid_array.py -m raid0
To confirm that the RAID level was changed as required, run the lsblk command. The entry in the TYPE column for each SSD in the RAID array indicates the RAID level of the array.
The following example shows that the RAID level of the array is RAID 0. The name of the RAID volume is md0 and the mount point of the volume is /raid.
~$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
|_sda1    8:1    0  487M  0 part  /boot/efi
|_sda2    8:2    0  1.8T  0 part  /
sdb       8:16   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
sdc       8:32   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
sdd       8:48   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
4.6.2.Checking the Status of the DGX Station RAID Array
Use the mdadm command to print details of the md0 device.
$ sudo mdadm -D /dev/md0
This example shows the status of a RAID array that is functioning properly.
$ sudo mdadm -D /dev/md0
Version : 1.2
Creation Time : Mon Jun 5 17:40:48 2017
Raid Level : raid0
Array Size : 374964224 (357.59 GiB 383.96 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Mon Jun 5 17:40:48 2017
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Chunk Size : 512K
Name : lab-VirtualBox:0 (local to host lab-VirtualBox)
UUID : c8ba911a:8634bd99:2ebeea3d:c9a7db4c
Events : 0
Number   Major   Minor   RaidDevice   State
0        8       16      0            active sync   /dev/sdb
1        8       32      1            active sync   /dev/sdc
2        8       48      2            active sync   /dev/sdd
This example shows the status of a RAID array in which one SSD has failed or is missing. The failed or missing SSD is identified by the empty RaidDevice State column.
$ sudo mdadm -D /dev/md0
...
Number   Major   Minor   RaidDevice   State
0        8       16      0            active sync   /dev/sdb
1        8       32      1            active sync
2        8       48      2            active sync   /dev/sdd
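For unattended checks, the summary fields printed by mdadm -D can be read programmatically. The following Python sketch parses the "Field : value" lines and applies a simple health test. The criteria it uses (State is clean, no failed devices, and as many active devices as RAID devices) are an assumption based on the healthy example in this section, not an NVIDIA-defined rule, and the function names are illustrative.

```python
def parse_mdadm_detail(text):
    """Collect the 'Field : value' lines of `mdadm -D` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def array_looks_healthy(text):
    """Heuristic check; returns False if any expected field is missing."""
    fields = parse_mdadm_detail(text)
    try:
        raid_devices = int(fields["Raid Devices"])
        active_devices = int(fields["Active Devices"])
        failed_devices = int(fields["Failed Devices"])
    except (KeyError, ValueError):
        return False
    return (
        fields.get("State") == "clean"
        and failed_devices == 0
        and active_devices == raid_devices
    )


# Abbreviated copy of the healthy example from this section.
HEALTHY = """\
Raid Level : raid0
Raid Devices : 3
State : clean
Active Devices : 3
Failed Devices : 0
"""

print(array_looks_healthy(HEALTHY))
```

The text to check could be captured from sudo mdadm -D /dev/md0 and passed to array_looks_healthy in place of the sample string.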
4.6.3.Checking the Status of the DGX Station SSDs
LEDs on the DGX Station SSDs indicate the status of the SSDs. The SSDs are mounted inside the DGX Station and are visible only when the side panel that covers the SSDs is removed.
1. Remove the side panel on the left of the DGX Station when viewed from the rear.
a) Push the button on the left side of the DGX Station back panel to release the panel.
b) Lift the panel to remove it.
Caution To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station other than any components that you are replacing or servicing.
2. Examine each SSD to determine its status from the state of the LED on the SSD.
On (steady)
The SSD is operational but is idle.
On (blinking)
The SSD is being read from or written to.
Off
The SSD has failed and must be replaced.
3. Replace the side panel of the DGX Station.
a) Align the bottom edge of the side panel with the bottom edge of the DGX Station.
b) Firmly push the panel back into place to re-engage the latches.
If an SSD has failed, you must replace it as explained in Replacing an SSD.
4.6.4.Replacing an SSD
If an SSD in the DGX Station fails, replace the SSD to return the system to operation.
Caution The default RAID level of the array in the DGX Station is RAID 0, which does not provide any redundancy. If a single SSD in the array fails, all data stored on the array is lost. To prevent the failure of an SSD from causing a loss of data, ensure that any data on the array that you want to preserve is backed up.
1. Remove the side panel on the left of the DGX Station when viewed from the rear.
a) Push the button on the left side of the DGX Station back panel to release the panel.
b) Lift the panel to remove it.
Caution To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station other than any components that you are replacing or servicing.
2. On the SSD that you want to replace, press the drive-tray eject button to loosen the drive-tray latch.
3. Pull the drive-tray latch upwards to unseat the drive tray.
4. Slide the drive tray upwards to completely remove it from the unit.
5. Using a Phillips screwdriver, remove the four screws attaching the SSD to the drive tray.
Save the screws for the replacement SSD.
6. Slide the SSD out of the drive tray.
7. Slide the replacement SSD into the drive tray.
Make sure that the connector is on the open edge side of the tray.
8. Secure the replacement SSD to the drive tray using the four screws.
9. With the drive-tray eject button at the right, insert the drive tray into the appropriate drive bay, then slide the drive tray all the way into the drive bay.
10. Press the drive-tray latch downwards until you hear a click to completely seat the drive tray.
11. Replace the side panel of the DGX Station.
a) Align the bottom edge of the side panel with the bottom edge of the DGX Station.
b) Firmly push the panel back into place to re-engage the latches.
What you need to do to return the DGX Station to service depends on whether you replaced an SSD in the RAID array or the OS SSD.
If you replaced an SSD in the RAID array, rebuild the RAID array as explained in Rebuilding the DGX Station RAID Array.
If you replaced the OS SSD, restore the software image as explained in Restoring the DGX Station Software Image.
4.6.5.Rebuilding the DGX Station RAID Array
If the DGX Station RAID array is degraded because an SSD failed, replace the SSD as explained in Replacing an SSD.
After replacing a failed SSD in the RAID array, you must rebuild the array to add the new SSD to a RAID 0 array or to regenerate the lost data on the new SSD in a RAID 5 array. The DGX Station software includes the custom script configure_raid_array.py for this purpose.
To rebuild the array, run the following command:
$ sudo configure_raid_array.py -r
The time required to rebuild a RAID 5 array depends on factors such as system load, SSD capacity, and the number of SSDs in the array. Rebuilding the array of three 1.92-terabyte SSDs in the DGX Station may require several hours.
You can monitor the progress of a long-running rebuild by examining the contents of the /proc/mdstat file:
$ cat /proc/mdstat
Personalities : [raid0] [linear] [multipath] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdb[0] sdd[3] sdc[1]
      3750486016 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [>....................] recovery = 4.0% (75580956/1875243008) finish=438.3min speed=68419K/sec
      bitmap: 2/14 pages [8KB], 65536KB chunk
unused devices: <none>
In this example, the rebuild is 4.0% complete and is estimated to finish in 438.3 minutes.
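To track a rebuild from a script instead of rereading the file by hand, the recovery line can be extracted with a regular expression. The following Python sketch assumes the recovery line has the same layout as in the example in this section. The function name and the keys of the returned dictionary are illustrative.

```python
import re

# Matches: recovery = 4.0% (75580956/1875243008) finish=438.3min
_RECOVERY = re.compile(
    r"recovery\s*=\s*(?P<percent>[\d.]+)%\s*"
    r"\((?P<done>\d+)/(?P<total>\d+)\)\s*"
    r"finish=(?P<minutes>[\d.]+)min"
)


def rebuild_progress(mdstat_text):
    """Return rebuild progress from /proc/mdstat text, or None if no
    recovery line is present (for example, when no rebuild is running)."""
    match = _RECOVERY.search(mdstat_text)
    if match is None:
        return None
    return {
        "percent": float(match.group("percent")),
        "blocks_done": int(match.group("done")),
        "blocks_total": int(match.group("total")),
        "minutes_left": float(match.group("minutes")),
    }


SAMPLE = "[>....................] recovery = 4.0% (75580956/1875243008) finish=438.3min speed=68419K/sec"
print(rebuild_progress(SAMPLE))
```

On the DGX Station, the text passed to rebuild_progress would be the content read from /proc/mdstat.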
The RAID array is rebuilt with its existing RAID level.
If the array is a RAID 0 array, all data that was on the array is erased after the array is rebuilt.
If the array is a RAID 5 array, the data on the array is preserved after the array is rebuilt.
If you have rebuilt a RAID 0 array and have a backup of data on the array that you want to preserve, restore the data from the backup.
4.7.Restoring the DGX Station Software Image
If the DGX Station software image becomes corrupted or the OS SSD was replaced after a failure, restore the DGX Station software image to its original factory condition from a pristine copy of the image.
A USB flash drive is supplied from which you can restore the DGX Station software image. Before using this USB drive to restore the DGX Station software image, contact NVIDIA Support Enterprise Services to see if a later version of the software image is available. If a later version of the image is available, prepare a bootable installation medium that contains the current software image as explained in the following topics:
Obtaining the DGX Station Software ISO Image and Checksum File
Creating a Bootable Installation Medium
When you have a bootable installation medium that contains the current software image, install the image as explained in Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM.
Updates to the DGX Station software might have been made available after the latest available ISO image file was created. To ensure that you have the latest DGX Station software, including security updates, check for updates and install any available updates after you restore the software image. For more information, see Updating DGX Station Software.
4.7.1.Obtaining the DGX Station Software ISO Image and Checksum File
To ensure that you restore the latest available version of the DGX Station software image, obtain the current ISO image file from NVIDIA Support Enterprise Services. A checksum file is provided for the image to enable you to verify the bootable installation medium that you create from the image file.
1. Log on to the NVIDIA Enterprise Services (https://nvid.nvidia.com/dashboard/) site.
2. Click the Announcements tab to locate the download links for the DGX Station software image.
3. Download the ISO image and its checksum file and save them to your local disk.
The ISO image is also available in an archive file. If you download the archive file, be sure to extract the ISO image before proceeding.
4.7.2.Creating a Bootable Installation Medium
After obtaining an ISO file that contains the software image from NVIDIA Support Enterprise Services, create a bootable installation medium, such as a USB flash drive or DVD-ROM, that contains the image.
If you are creating a bootable USB flash drive, follow the instructions for the platform that you are using:
On Ubuntu Desktop, see Creating a Bootable USB Flash Drive by Using Startup Disk Creator.
On Windows, see Creating a Bootable USB Flash Drive by Using Akeo Rufus.
If you are creating a bootable DVD-ROM, you can use any of the methods described in Burning the ISO on to a DVD (https://help.ubuntu.com/community/BurningIsoHowto#Burning_the_ISO_on_to_a_DVD) on the Ubuntu Community Help Wiki.
4.7.2.1.Creating a Bootable USB Flash Drive by Using Startup Disk Creator
On an Ubuntu Desktop system, you can use Startup Disk Creator to create a bootable USB flash drive that contains the DGX Station software image.
Ensure that the following prerequisites are met:
The correct DGX Station software image is saved to your local disk. For more information, see Obtaining the DGX Station Software ISO Image and Checksum File.
The USB flash drive has a capacity of at least 4 GB.
1. Plug the USB flash drive into one of the USB ports of your Ubuntu Desktop system.
2. Open the Dash, search for Startup Disk Creator, and click the Startup Disk Creator icon.
3. In the Make Startup Disk window that opens, from the Source disc image (.iso) list, select the DGX Station software image file.
If the DGX Station software image file is not listed, click Other and in the window that opens, navigate to the file, select the file, and click Open.
4. From the Disk to use list, select the USB flash drive and click Make Startup Disk.
4.7.2.2.Creating a Bootable USB Flash Drive by Using Akeo Rufus
On a Windows system, you can use the Akeo Reliable USB Formatting Utility (Rufus) (https://rufus.akeo.ie/) to create a bootable USB flash drive that contains the DGX Station software image.
Ensure that the following prerequisites are met:
The correct DGX Station software image is saved to your local disk. For more information, see Obtaining the DGX Station Software ISO Image and Checksum File.
The USB flash drive has a capacity of at least 4 GB.
1. Plug the USB flash drive into one of the USB ports of your Windows system.
2. Download and launch the Akeo Reliable USB Formatting Utility (Rufus) (https://rufus.akeo.ie/).
3. Under Partition scheme and target system type, select GPT partition scheme for UEFI.
4. Select the Create a bootable disk using option and from the dropdown menu, select ISO image.
5. Click the optical drive icon and open the DGX Station software ISO image.
6. Click Start. Because the image is a hybrid ISO file, you are prompted to select whether to write the image in ISO Image (file copy) mode or DD Image (disk image) mode.
7. Select Write in ISO Image mode and click OK.
4.7.3.Verifying the Bootable Installation Medium
On a Linux system, you can use the checksum file provided for the DGX Station software image to verify the installation medium that you created from the image.
Ensure that the following prerequisites are met:
The checksum file for the DGX Station software image is saved to your local disk. For more information, see Obtaining the DGX Station Software ISO Image and Checksum File.
You have created a bootable installation medium from the image. For more information, see Creating a Bootable Installation Medium.
How to verify a bootable installation medium depends on whether it is a USB flash drive or a DVD-ROM.
4.7.3.1.Verifying a Bootable USB Flash Drive
1. Plug the USB flash drive into one of the USB ports of your Linux system.
2. Obtain the device ID of the USB flash drive by running the lsblk (http://manpages.ubuntu.com/manpages/xenial/man8/lsblk.8.html) command.
$ lsblk
You can identify the USB flash drive from its size, which is much smaller than the size of the SSDs in the DGX Station, and from the mount points of any partitions on the drive, which are under /media.
In the following example, the device ID of the USB flash drive is sde1.
$ lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
sda       8:0    0  1.8T  0 disk
|_sda1    8:1    0  487M  0 part  /boot/efi
|_sda2    8:2    0  1.8T  0 part  /
sdb       8:16   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
sdc       8:32   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
sdd       8:48   0  1.8T  0 disk
|_md0     9:0    0  5.2T  0 raid0 /raid
sde       8:64   1  3.7G  0 disk
|_sde1    8:65   1  3.2G  0 part  /media/deepl/DGXSTATION
|_sde2    8:66   1  2.3M  0 part
$
3. Compute the checksum of the image on the USB flash drive.
$ sudo dd if=device-id bs=block-size | cksum
device-id
The device ID of the USB flash drive, for example, /dev/sde1.
block-size
The block size to be used by the dd command, for example, 1M.
This example computes the checksum of an image on the USB flash drive with device ID /dev/sde1 using a block size of 1 MiB (bs=1M).
$ sudo dd if=/dev/sde1 bs=1M | cksum
3299+1 records in
3299+1 records out
3459317760 bytes (3.5 GB, 3.2 GiB) copied, 164.369 s, 21.0 MB/s
3992706625 3459317760
4. Obtain the checksum value from the checksum file.
$ cat checksum-file
checksum-file
The path, including the file name, to the checksum file.
This example obtains the checksum value for the image DGXStation-3.1.2_56d4a9.iso from the checksum file DGXStation-3.1.2_56d4a9.crc in the current working directory.
$ cat DGXStation-3.1.2_56d4a9.crc
3992706625 3459317760 DGXStation-3.1.2_56d4a9.iso
If the value obtained from the checksum file matches the value computed from the image, the integrity of the installation medium has been successfully verified.
4.7.3.2.Verifying a Bootable DVD-ROM
1. Load the DVD-ROM into an optical drive connected to your Linux system.
2. Compute the checksum of the image on the DVD-ROM.
$ cksum < /dev/sr0
This example computes the checksum of an image on a DVD-ROM.
$ cksum < /dev/sr0
3992706625 3459317760
3. Obtain the checksum value from the checksum file.
$ cat checksum-file
checksum-file
The path, including the file name, to the checksum file.
This example obtains the checksum value for the image DGXStation-3.1.2_56d4a9.iso from the checksum file DGXStation-3.1.2_56d4a9.crc in the current working directory.
$ cat DGXStation-3.1.2_56d4a9.crc
3992706625 3459317760 DGXStation-3.1.2_56d4a9.iso
If the value obtained from the checksum file matches the value computed from the image, the integrity of the installation medium has been successfully verified.
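The comparison in either procedure can also be scripted. The following is a minimal sketch, assuming that GNU dd, cksum, and awk are available on the Linux system. The function names medium_checksum, expected_checksum, and checksums_match are hypothetical helpers and are not part of the DGX Station software.

```shell
#!/usr/bin/env bash
# Sketch only: compare the CRC and byte count computed from the medium
# with the values recorded in the checksum file.
# All function names here are hypothetical.

# Print "<crc> <bytes>" for the device or file given as $1.
# $2 is the dd block size (optional, defaults to 1M).
medium_checksum() {
    dd if="$1" bs="${2:-1M}" 2>/dev/null | cksum | awk '{print $1, $2}'
}

# Print "<crc> <bytes>" taken from the first line of the checksum file $1.
expected_checksum() {
    awk 'NR == 1 {print $1, $2}' "$1"
}

# Return 0 only if both "<crc> <bytes>" strings are non-empty and identical.
checksums_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}
```

For example, after sourcing these functions in a root shell (reading a block device usually requires root privileges), the command checksums_match "$(medium_checksum /dev/sde1)" "$(expected_checksum DGXStation-3.1.2_56d4a9.crc)" && echo verified prints verified only when both the CRC and the byte count agree. For a DVD-ROM, pass /dev/sr0 instead of /dev/sde1.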
4.7.4.Installing the DGX Station Software Image from a USB Flash Drive or DVD-ROM
Before installing the DGX Station software image, ensure that you have a bootable USB flash drive or DVD-ROM that contains the current DGX Station software image.
Caution: Installing the DGX Station software image erases all data stored on the OS SSD. The /home partition, where all users' documents, software settings, bookmarks, and other personal files are stored, resides on the OS SSD and will be erased. However, if you choose to install the DGX Station software and preserve the RAID array contents, persistent data stored in the RAID array is unaffected.
1. Shut down the DGX Station.
2. Load the USB flash drive or DVD-ROM into the DGX Station.
If you are using a USB flash drive, plug it into one of the USB ports of the DGX Station.
If you are using a DVD-ROM, connect an external optical drive to the DGX Station and load the DVD-ROM into the drive.
3. Power on the DGX Station.
4. At the first NVIDIA screen to appear, press F8 to select the boot device.
5. In the menu for selecting the boot device, use the arrow keys to select UEFI: usb-key-or-dvd-rom-name, Partition n (size) and press Enter.
6. When the GNU GRUB menu appears, select the option you want for installing the DGX Station software and press Enter.
To install the software while preserving persistent data stored in the RAID array, select Install DGX OS Desktop release and preserve RAID contents.
To install the software and re-initialize the RAID array, select Install DGX OS Desktop release and re-initialize RAID0 volume.
Caution: If you choose this option, all data stored in the RAID array will be erased.
The installation requires several minutes to complete.
Licensing requirements prevent some DGX Station software, such as the NVIDIA Graphics Drivers, from being supplied in the software image. The DGX Station automatically installs this software when installation from the software image is complete.
7. When the installation is complete, respond to the prompts to accept end user license agreements for NVIDIA software and to configure the Ubuntu OS, including creating your user name and password for logging in to the DGX Station.
8. After the Ubuntu OS configuration is complete, log in to the DGX Station to access your Ubuntu desktop.
9. Eject the USB flash drive or DVD-ROM.
10. Unplug the USB flash drive or optical drive from the DGX Station.
4.8.Updating the DGX Station System BIOS
If you need to update the DGX Station system BIOS, you can obtain the current version of it from NVIDIA Support Enterprise Services.
Caution
Update the system BIOS only if required to resolve an issue with the DGX Station. If your DGX Station is operating normally, do not update the system BIOS. An error during an attempt to update the system BIOS may leave your DGX Station unable to boot.
If you must update the system BIOS, be sure to obtain the BIOS file from NVIDIA Enterprise Services. Do not obtain a BIOS file from the motherboard manufacturer or any other source.
To complete this task, you need a USB flash drive formatted with a single FAT16 or FAT32 partition.
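On a Linux system, one way to check this requirement is with lsblk, which reports FAT16 and FAT32 file systems alike as vfat. The following is a minimal sketch under that assumption. The function names is_fat_fstype and count_partitions are hypothetical, and /dev/sdX is a placeholder for the device ID of your USB flash drive.

```shell
# Sketch only; function names are hypothetical.

# Return 0 if the file-system type in $1, as printed by
#   lsblk -no FSTYPE /dev/sdX1
# is FAT. lsblk reports both FAT16 and FAT32 as "vfat".
is_fat_fstype() {
    [ "$1" = "vfat" ]
}

# Count the partitions listed on stdin, where stdin is the output of
#   lsblk -nro TYPE /dev/sdX
# (one device type per line, for example "disk" or "part").
count_partitions() {
    awk '$1 == "part" {n++} END {print n + 0}'
}
```

With these functions defined, lsblk -nro TYPE /dev/sdX | count_partitions should print 1 for a drive with a single partition, and is_fat_fstype "$(lsblk -no FSTYPE /dev/sdX1)" succeeds when that partition is FAT formatted.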
1. Obtain the system BIOS file.
a) Log on to NVIDIA Enterprise Services (https://nvid.nvidia.com/dashboard/).
b) Click the Announcements tab to locate the download links for the archive file containing the DGX Station system BIOS file.
c) Download the archive file and extract the system BIOS file.
2. Copy the system BIOS file to the USB flash drive.
3. Shut down the DGX Station.
4. Plug the USB flash drive into one of the USB ports of the DGX Station.
5. Power on the DGX Station.
6. At the first NVIDIA screen to appear, press Delete or F2 to enter the UEFI BIOS setup.
7. In the UEFI BIOS Utility - EZ Mode screen, click Advanced Mode.
8. From the Tool menu, choose EZ 3 Flash Utility and press Enter.
9. In the EZ 3 Flash Update screen, select via Storage Device(s) as the BIOS update method and press Enter.
10. In the Drive list, use the up arrow and down arrow keys to select the USB flash drive that contains the BIOS file and press Enter.
11. In the Folder list, use the up arrow and down arrow keys to select the BIOS file.
12. Press Enter to start the BIOS update process.
Caution: To avoid the risk of leaving your DGX Station unable to boot, do not shut down or reset the DGX Station during the BIOS update process.
13. When the BIOS update process is complete, reboot the DGX Station.
4.9.Maintaining the GPU Liquid Cooling System
A liquid cooling system keeps the GPUs in the DGX Station within their required operating temperature range. To ensure reliable operation of the cooling system, you must maintain it periodically.
4.9.1.Monitoring GPU Temperatures
1. Open the Dash, search for NVIDIA X Server Settings, and click the NVIDIA X Server Settings icon.
2. Under each GPU in the list of GPUs in the NVIDIA X Server Settings window, click Thermal Settings.
Thermal sensor information for the GPU is displayed, including its current temperature and an indication of whether the temperature is within the GPU's operating range.
If the GPUs are running too hot, check the level of the liquid in the GPU cooling system as explained in Checking the Level of the Liquid in the GPU Cooling System.
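GPU temperatures can also be read from a terminal. The following is a minimal sketch, assuming that the nvidia-smi utility installed with the NVIDIA driver is available. The function name report_hot_gpus is hypothetical, and the temperature limit is a value that you supply, not a threshold specified in this guide.

```shell
# Sketch only; the function name is hypothetical.

# Read "index, temperature" rows on stdin, as printed by
#   nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader
# and print one line for each GPU at or above the limit in $1 (degrees C).
# The limit is chosen by the caller; it is not an NVIDIA-specified value.
report_hot_gpus() {
    awk -F', *' -v limit="$1" \
        '$2 + 0 >= limit + 0 {print "GPU " $1 ": " $2 " C"}'
}
```

For example, nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader | report_hot_gpus 85 lists only the GPUs reporting 85 or more degrees Celsius and prints nothing if all GPUs are below that value.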
4.9.2.Checking the Level of the Liquid in the GPU Cooling System
In normal operation, some coolant liquid may be lost from the system. Every 12 months, check the level of the liquid in the cooling system to ensure that it remains at the required level.
1. Remove the side panel on the right of the DGX Station when viewed from the rear.
a) Push the button on the right side of the DGX Station back panel to release the panel.
b) Lift the panel to remove it.
Caution: To prevent damage from electrostatic discharge, avoid touching any of the components inside the DGX Station other than any components that you are replacing or servicing.
2. Look at the gauge on the side of the cooling system pump to determine the level of the liquid in the cooling system.
If the level of the liquid in the cooling system is at or above the Minimum Level in the reservoir, go to the next step.
If the liquid has fallen below the Minimum Level in the reservoir, replenish it as explained in Replenishing the Liquid in the GPU Cooling System.
3. Replace the side panel of the DGX Station.
a) Align the bottom edge of the side panel with the bottom edge of the DGX Station.
b) Firmly push the panel back into place to re-engage the latch.
4.9.3.Replenishing the Liquid in the GPU Cooling System
Replenish the liquid in the GPU cooling system if the liquid is below the required level or to refill the cooling system after draining it to renew the cooling liquid.
To complete this task, you need the following tools and materials:
Torx T20 Allen wrench
1 bottle of EK-CryoFuel Clear Premix coolant
Caution Use only EK-CryoFuel Clear coolant. Do not use any other type of coolant. Use of other types of coolant will void the DGX Station hardware warranty and may cause damage to or impair the performance of the system.
Flexible plastic filling bottle with delivery tube
Before you begin, ensure that the DGX Station is powered off.
1. Fill the plastic filling bottle with the coolant.
2. Use the Torx T20 Allen wrench to loosen the filler cap at the top of the cooling system pump and, when the cap is loose, remove it.
3. Insert the delivery tube of the filling bottle into the open filler cap at the top of the pump.
4. Gently squeeze the filling bottle to dispense the coolant liquid into the pump until the liquid reaches the Maximum Level in the reservoir.
5. Replace the filler cap at the top of the pump and use the Torx T20 Allen wrench to tighten the cap until it is finger tight.
Do not overtighten the filler cap.
6. Power on the DGX Station and let it run for one minute.
If the pump makes a grinding noise, power off and power on the DGX Station four times.
7. Ensure that the level of the liquid in the cooling system is at the Maximum Level in the reservoir.
If the liquid has fallen below the Maximum Level in the reservoir, repeat the following sequence of steps until the level of the liquid in the cooling system remains at the Maximum Level.
a) Remove the filler cap at the top of the cooling system pump.
b) Dispense more coolant liquid into the pump until the liquid reaches the Maximum Level in the reservoir again.
c) Replace the filler cap at the top of the pump.
d) Power on the DGX Station and let it run for one minute.
e) Check the level of the liquid in the cooling system.
8. Power off the DGX Station.
9. Replace the side panel of the DGX Station.
a) Align the bottom edge of the side panel with the bottom edge of the DGX Station.
b) Firmly push the panel back into place to re-engage the latch.
AppendixA. SAFETY
To reduce the risk of bodily injury, electrical shock, fire, and equipment damage, read this document and observe all warnings and precautions in this guide before installing or maintaining your product. NVIDIA products are designed to operate safely when installed and used according to the product instructions and general safety practices. The guidelines included in this document explain the potential risks associated with computer operation and provide important safety practices designed to minimize these risks.
The product is designed and tested to meet IEC 60950-1, the Standard for the Safety of Information Technology Equipment. This also covers the national implementation of IEC 60950-1 based safety standards around the world, for example, UL 60950-1. These standards reduce the risk of injury from the following hazards:
Electric shock: Hazardous voltage levels contained in parts of the product
Fire: Overload, temperature, material flammability
Mechanical: Sharp edges, moving parts, instability
Energy: Circuits with high energy levels (240 volt amperes) or potential as burn hazards
Heat: Accessible parts of the product at high temperatures
Chemical: Chemical fumes and vapors
Radiation: Noise, ionizing, laser, ultrasonic waves
Retain and follow all product safety and operating instructions. Always refer to the documentation supplied with your equipment. Observe all warnings on the product and in the operating instructions.
WARNING: FAILURE TO FOLLOW THESE SAFETY INSTRUCTIONS COULD RESULT IN FIRE, ELECTRIC SHOCK OR OTHER INJURY OR DAMAGE. ELECTRICAL EQUIPMENT CAN BE HAZARDOUS IF MISUSED. OPERATION OF THIS PRODUCT, OR SIMILAR PRODUCTS, MUST ALWAYS BE SUPERVISED BY AN ADULT. DO NOT ALLOW CHILDREN ACCESS TO THE INTERIOR OF ANY ELECTRICAL PRODUCT AND DO NOT PERMIT THEM TO HANDLE ANY CABLES.
A.1.Intended Application Uses
This product was evaluated as Information Technology Equipment (ITE), which may be installed in offices, schools, computer rooms, and similar commercial type locations. The suitability of this product for other product categories and environments (such as medical, industrial, residential, alarm systems, and test equipment), other than an ITE application, may require further evaluation.
A.2.General Precautions
To reduce the risk of personal injury or damage to the equipment:
Shut down the product and disconnect all AC power cables before installation.
Do not connect or disconnect any cables when performing installation, maintenance, or reconfiguration of this product during an electrical storm.
Never turn on any equipment when there is evidence of fire, water, or structural damage.
Place the product away from radiators, heat registers, stoves, amplifiers, or other products that produce heat.
Never use the product in a wet location.
Avoid inserting foreign objects through openings in the product.
Do not use conductive tools that could bridge live parts.
Do not make mechanical or electrical modifications to the equipment.
Use the product only with approved equipment.
Follow all cautions and instructions marked on the equipment. Do not attempt to defeat safety interlocks (where provided).
Operate the DGX Station in a place where the temperature is always in the range 10°C to 30°C (50°F to 86°F).
A.3.Electrical Precautions
Power Cable
To reduce the risk of electric shock, fire, or damage to the equipment:
Use only the supplied power cable and do not use this power cable with any other products or for any other purpose.
Not all power cables have the same current ratings.
Do not use household extension cables with your product. Household extension cables do not have overload protection and are not intended for use with computer systems.
If you lose or damage the supplied power cable, or have to change the power cable for any reason, use a cable rated for your product and for the voltage and current marked on the electrical ratings label of the product. The voltage and current rating of the cable must be greater than the voltage and current rating marked on the product.
Plug the power cable into a grounded (earthed) electrical outlet that is easily accessible at all times. The product is equipped with a three-wire electrical grounding-type plug which has a third pin for ground. This plug fits only into a grounded electrical power outlet.
Do not disable the power cable grounding plug. The grounding plug is an important safety feature.
Do not place objects on power cables. Arrange them so that no one may accidentally step on or trip over them.
Do not pull on a cable. When unplugging the product from the electrical outlet, grasp the plug.
When possible, use one hand only to connect or disconnect cables.
Do not modify power cables or plugs. Consult a licensed electrician or your power company for site modifications.
Power Supply
Ensure that the voltage and frequency of your power source match the voltage and frequency inscribed on the equipment’s electrical rating label. If you have a question about the type of power source to use, contact your authorized service provider.
Connect the equipment to a properly wired and grounded electrical outlet and always follow your local or national wiring rules.
Ensure that the socket outlet is near the equipment and is readily accessible for disconnection.
To help protect your system from sudden, transient increases and decreases in electrical power, consider using a surge suppressor or line conditioner.
Never force a connector into a port. Check for obstructions on the port. If the connector and port don’t join with reasonable ease, they probably don’t match. Make sure that the connector matches the port and that you have positioned the connector correctly in relation to the port.
Do not open the power supply. Hazardous voltage, current and energy levels are present inside the power supply. The power supply in this product contains no user-serviceable parts. Return to manufacturer for servicing.
A.4.Communications Cable Precautions
To reduce the risk of exposure to electrical shock hazards from communications cables:
Do not connect communications cables during an electrical storm. There may be a risk of electric shock from lightning.
Do not connect or use communications cables in a wet location.
Disconnect the communications cables before opening a product enclosure, or touching or installing internal components.
A.5.Other Hazards
Proposition 65 Warning
This product contains chemicals known to the State of California to cause cancer and birth defects or other reproductive harm.
California Department of Toxic Substances Control
Perchlorate Material – special handling may apply. See www.dtsc.ca.gov/hazardouswaste/perchlorate.
Perchlorate Material: Lithium battery (CR2032) contains perchlorate. Please follow instructions for disposal.
Nickel
The decorative metal foam on the DGX Station casework contains some nickel. The metal foam is not intended for direct and prolonged skin contact. While nickel exposure is unlikely to be a problem, you should be aware of the possibility in case you’re susceptible to nickel-related reactions.
AppendixB. CONNECTIONS, CONTROLS, AND
INDICATORS
B.1.Front-Panel Connections and Controls
ID Type Qty Description
1 Power Button 1 Press to turn the DGX Station on or off
B.2.Rear-Panel Connections and Controls
Current Units
ID  Type                 Qty  Description
1   USB 3.1 Type-C       1    USB 3.1 Type-C port
2   Ethernet             2    10G LAN ports (see LAN Port Indicators): Lower port: LAN 1; Upper port: LAN 2
3   USB 3.0              4    USB 3.0 ports
4   S/PDIF Audio Output  1    Optical S/PDIF out port
5   eSATA                2    eSATA ports for connecting external storage devices, such as hard drives or optical drives, with an external power supply
6   AC Input             1    Power supply input
7   Reset Button         1    Press to reboot the system without turning off the system power
8   USB 3.1 Type-A       1    USB 3.1 Type-A port
9   Audio I/O            5    3.5 mm I/O ports for 2-, 4-, 6-, or 8-channel audio (see Audio I/O Connections)
10  DisplayPort          3    Ports for connecting up to 3 displays
11  Power Supply Switch  1    Turn the power supply on and off
Earlier Units
ID  Type                 Qty  Description
1   USB 3.1 Type-C       1    USB 3.1 Type-C port
2   Ethernet             2    10G LAN ports (see LAN Port Indicators): Lower port: LAN 1; Upper port: LAN 2
3   USB 3.0              4    USB 3.0 ports
4   S/PDIF Audio Output  1    Optical S/PDIF out port
5   eSATA                2    eSATA ports for connecting external storage devices, such as hard drives or optical drives, with an external power supply
6   Power Supply Switch  1    Turn the power supply on and off
7   Reset Button         1    Press to reboot the system without turning off the system power
8   USB 3.1 Type-A       1    USB 3.1 Type-A port
9   Audio I/O            5    3.5 mm I/O ports for 2-, 4-, 6-, or 8-channel audio (see Audio I/O Connections)
10  DisplayPort          3    Ports for connecting up to 3 displays
11  AC Input             1    Power supply input
B.3.LAN Port Indicators
LEDs on each Ethernet LAN port indicate the connection status as illustrated in the following figure and described in the following tables.
Speed LED
Status Description
Off 100 Mbps connection
Orange 1 Gbps connection
Green 10 Gbps connection
Activity/Link LED
Status Description
Off No link
Green Linked
Green (blinking) Data activity
B.4.Audio I/O Connections
ID  Port Color  2-Channel  4-Channel      6-Channel         8-Channel
1   Pink        Mic In     Mic In         Mic In            Mic In
2   Black       N/A        Rear Speaker   Rear Speaker      Rear Speaker
3   Orange      N/A        N/A            Center/Subwoofer  Center/Subwoofer
4   Light Blue  Line In    Line In        Line In           Side Speaker
5   Lime Green  Line Out   Front Speaker  Front Speaker     Front Speaker
AppendixC. COMPLIANCE
The NVIDIA DGX Station is compliant with the regulations listed in this section.
C.1.DGX Station Model Number
Model: P2587
C.2.Argentina
S-Mark
C.3.Australia/New Zealand
RCM
C.4.Brazil
INMETRO
C.5.Canada
Innovation, Science and Economic Development Canada (ISED)
CAN ICES-3(A)/NMB-3(A)
The Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulation.
Cet appareil numérique de la classe A respecte toutes les exigences du Règlement sur le matériel brouilleur du Canada.
C.6.China
RoHS Material Content
C.7.European Union
European Conformity; Conformité Européenne (CE)
This is a Class A product. In a domestic environment this product may cause radio frequency interference in which case the user may be required to take adequate measures.
The product has been marked with the CE Mark to illustrate its compliance.
This device complies with the following Directives:
EMC Directive (2014/30/EU) for Class A, I.T.E equipment.
Low Voltage Directive (2014/35/EU) for electrical safety.
RoHS Directive (2011/65/EU) for hazardous substances.
ErP Directive (2009/125/EC) for European Ecodesign.
A copy of the Declaration of Conformity to the essential requirements may be obtained directly from NVIDIA GmbH (Floessergasse 2, 81369 Munich, Germany).
C.8.India
BIS
Self Declaration - Conforming to IS13252:2010, R-41078743
C.9.Israel
C.10.Japan
VCCI
C.11.Russia
CU-TR
C.12.South Africa
LOA
Compliant with SANS IEC 60950
SABS
Compliant with SANS 222 CISPR 22
C.13.South Korea
KC
C.14.Taiwan
BSMI
C.15.United States
Federal Communications Commission (FCC)
FCC Marking (Class A)
This device complies with part 15 of the FCC Rules. Operation is subject to the following two conditions: (1) this device may not cause harmful interference, and (2) this device must accept any interference received, including any interference that may cause undesired operation of the device.
NOTE: This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference in which case the user will be required to correct the interference at his own expense.
C.16.United States/Canada
cULus Listing Mark
C.17.Vietnam
ICT
AppendixD. DGX STATION HARDWARE SPECIFICATIONS
D.1.Environmental Conditions
Condition Operating Range Nonoperating Range
Ambient temperature 10°C to 30°C (50°F to 86°F) 5°C to 40°C (41°F to 104°F)
Relative humidity 10% to 80% (non-condensing) 8% to 90% (non-condensing)
D.2.Component Specifications
Component            Qty  Description
CPU                  1    Intel Xeon E5-2698 v4 2.2 GHz (20-Core)
GPU - current units  4    NVIDIA Tesla V100-DGXS-32GB, featuring:
                          4×125 TeraFLOPS (500 TeraFLOPS total), FP16
                          4×32 GB (128 GB total) GPU memory
                          4×640 (2,560 total) NVIDIA Tensor Cores
                          4×5,120 (20,480 total) NVIDIA CUDA® cores
GPU - earlier units  4    NVIDIA Tesla V100-DGXS-16GB, featuring:
                          4×125 TeraFLOPS (500 TeraFLOPS total), FP16
                          4×16 GB (64 GB total) GPU memory
                          4×640 (2,560 total) NVIDIA Tensor Cores
                          4×5,120 (20,480 total) NVIDIA CUDA® cores
System memory        8    8×32 GB (256 GB total) ECC Registered LRDIMM DDR4 SDRAM
Data storage         3    3×1.92 TB (5.76 TB total) 2.5" 6 Gb/s SATA III SSD in RAID 0 configuration
OS storage           1    1.92 TB 2.5" 6 Gb/s SATA III SSD
D.3.Mechanical Specifications
Specification Value
Height 25” (639 mm)
Width 10” (256 mm)
Depth 20” (518 mm)
Gross weight 88 lbs (40 kg)
D.4.Power Specifications
Input: 115 - 240 VAC, 12 - 8 A, 50 - 60 Hz
Comments: The DGX Station power consumption can reach 1,500 W (ambient temperature 30°C) with all system resources under a heavy load. Be aware of your electrical source’s power capability to avoid overloading the circuit.
Notice
THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION
REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED,
STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY
DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A
PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever,
NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall
be limited in accordance with the NVIDIA terms and conditions of sale for the product.
THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED,
MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE,
AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A
SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE
(INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER
LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS
FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR
IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.
NVIDIA makes no representation or warranty that the product described in this guide will be suitable for
any specified use without further testing or modification. Testing of all parameters of each product is not
necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and
fit for the application planned by customer and to do the necessary testing for the application in order
to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect
the quality and reliability of the NVIDIA product and may result in additional or different conditions and/
or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any
default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA
product in any manner that is contrary to this guide, or (ii) customer product designs.
Other than the right for customer to use the information in this guide with the product, no other license,
either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information
in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without
alteration, and is accompanied by all associated conditions, limitations, and notices.
Trademarks
NVIDIA, the NVIDIA logo, DGX, DGX-1, and DGX Station are trademarks and/or registered trademarks of
NVIDIA Corporation in the United States and other countries. Other company and product names may be
trademarks of the respective companies with which they are associated.
Copyright
© 2018 NVIDIA Corporation. All rights reserved.