The Radeon™ Pro SSG software library enables peer-to-peer (P2P) data transfers between the GPU and
the on-board Radeon SSD devices. It provides a way to read OS file data from the SSDs into OpenCL™,
OpenGL® and DirectX® buffers using very low-latency P2P communication. The development kit version
of this library supports only the Microsoft® Windows® 10 operating system.
2 Requirements
To use the development kit version of the SSG library, the following are required:
1.) A Radeon™ Pro SSG Professional Graphics Card
2.) Microsoft® Windows® 10 (64-bit) or newer.
3 OpenCL™ Extension Specification
3.1 clCreateSsgFileObjectAMD
The clCreateSsgFileObjectAMD function creates a CL file object. Creating a file object follows the same
semantics as opening the file for reading or writing: the caller needs the corresponding read or write
privileges for that file.
context [in]
A CL context with which to associate the file.
flags [in]
Access-privilege flag for the file. Currently, only CL_FILE_READ_ONLY_AMD and
CL_FILE_WRITE_ONLY_AMD are supported. The corresponding read or write privilege is required for the
file, subject to the same access rights as when opening any file.
file_name [in]
In Windows: the UTF-16-encoded name of the file to be opened.
In Linux: the UTF-8-encoded name of the file to be opened.
errcode_ret [out]
Receives the return value of the call, one of the values listed below.
Return Value Description
CL_SUCCESS The function executed successfully.
CL_INVALID_FILE_OBJECT_AMD The file is invalid for OpenCL.
cl_file_flags_amd flags,
const wchar_t* file_name,
cl_int* errcode_ret)
Rev. 1.01
cl_file_info_amd          Return Type   Info Returned in param_value
CL_FILE_BLOCK_SIZE_AMD    cl_uint       Alignment restriction for the file object
CL_FILE_SIZE_AMD          cl_ulong      File size in bytes
C++
cl_int clGetSsgFileObjectInfoAMD(cl_file_amd file,
C++
cl_int clRetainSsgFileObjectAMD(cl_file_amd file)
3.2 clGetSsgFileObjectInfoAMD
The clGetSsgFileObjectInfoAMD function returns information about a file object.
file [in]
Specifies the file object to query.
param_name [in]
Specifies the information to query. The table below provides a list of supported param_name types and
the information that clGetSsgFileObjectInfoAMD will return in param_value.
Return Value Description
CL_SUCCESS The function executed successfully.
CL_INVALID_FILE_OBJECT_AMD The file is invalid for OpenCL.
3.4 clReleaseSsgFileObjectAMD
The clReleaseSsgFileObjectAMD function decrements the file-object reference count.
file [in]
Specifies the file object to be released.
Return Value Description
CL_SUCCESS The function executed successfully.
CL_INVALID_FILE_OBJECT_AMD The file is invalid for OpenCL.
3.5 clEnqueueReadSsgFileAMD
The clEnqueueReadSsgFileAMD function reads from a file object to a CL memory object.
command_queue [in]
Valid host command-queue in which the read command will be queued. Create the buffer and
command_queue using the same OpenCL context.
buffer [in]
A valid buffer object; buffer is the target memory object. Create the buffer using either
CL_MEM_ALLOC_HOST_PTR, CL_MEM_USE_HOST_PTR or CL_MEM_USE_PERSISTENT_MEM_AMD.
blocking_read [in]
Indicates whether the read operation is blocking or non-blocking. If blocking_read is CL_TRUE, the
function call won’t return until the operation has completed. If blocking_read is CL_FALSE, the OpenCL
implementation will perform a non-blocking read. Because the read is non-blocking, the function can
return immediately. The event argument causes the function to return an event object, which can be used
to query the read command’s execution status.
buffer_offset [in]
Offset (in bytes) in the buffer object to which the function is writing. It must be a multiple of
CL_FILE_BLOCK_SIZE_AMD.
size [in]
Size (in bytes) of data being read. It must be a multiple of CL_FILE_BLOCK_SIZE_AMD. If the file size isn’t
a multiple of the block size, read the end of the file by aligning the read size with the
next block multiple beyond the file size.
file [in]
File object from which to copy.
file_offset [in]
Offset (in bytes) for copying from the file; file_offset must be a multiple of CL_FILE_BLOCK_SIZE_AMD.
event_wait_list and num_events_in_wait_list [in, optional]
Specify events that must complete before the clEnqueueReadSsgFileAMD command can execute. If
event_wait_list is NULL, the command proceeds without waiting for any event to finish, and
num_events_in_wait_list must be 0. Otherwise, the list of events to which event_wait_list points must be
valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as
synchronization points, and the contexts associated with the events in event_wait_list and with
command_queue must be the same. The memory associated with event_wait_list can be reused or freed
after the function returns.
event [out, optional]
Returns an event object that identifies the clEnqueueReadSsgFileAMD command and can be used to query
the command status or to queue a wait for the command to finish executing. The event argument can be
NULL, in which case the application will be unable to query the command status or queue a wait for the
command to finish. If event_wait_list and event are both non-NULL, event should not refer to an
element of the event_wait_list array.
Return Value Description
CL_SUCCESS The function executed successfully.
CL_INVALID_FILE_OBJECT_AMD The file is invalid for OpenCL.
CL_INVALID_COMMAND_QUEUE command_queue is not a valid host command-queue.
CL_INVALID_CONTEXT The contexts associated with command_queue and buffer are
different, or the contexts associated with command_queue and the
events in event_wait_list are different.
CL_INVALID_MEM_OBJECT The buffer object is invalid.
CL_INVALID_VALUE The region specified by buffer_offset, file_offset or size is out of
bounds or is misaligned with CL_FILE_BLOCK_SIZE_AMD.
CL_INVALID_EVENT_WAIT_LIST Either event_wait_list is NULL and num_events_in_wait_list is greater
than 0, event_wait_list is not NULL and num_events_in_wait_list is 0,
or the event objects in event_wait_list are invalid.
CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST
The read operation is blocking, and the execution status of at least one
event in event_wait_list is a negative integer value.
CL_MEM_OBJECT_ALLOCATION_FAILURE
Memory failed to allocate for the data store associated with buffer.
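The alignment rule for size described above can be sketched as a small helper. This is only an illustration under stated assumptions: round_up_to_block and tail_read_size are hypothetical names, not part of the SSG library, and block_size stands for whatever value the application obtained for CL_FILE_BLOCK_SIZE_AMD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the SSG library): round `value` up to the
// next multiple of `block_size`. `block_size` is assumed to be the value the
// application queried for CL_FILE_BLOCK_SIZE_AMD.
inline std::uint64_t round_up_to_block(std::uint64_t value,
                                       std::uint64_t block_size) {
    assert(block_size != 0);
    const std::uint64_t remainder = value % block_size;
    return remainder == 0 ? value : value + (block_size - remainder);
}

// Hypothetical helper: the `size` to pass when reading from an aligned
// `file_offset` through the end of a file that is `file_size` bytes long.
// The result may extend past the end of the file, as the text allows.
inline std::uint64_t tail_read_size(std::uint64_t file_size,
                                    std::uint64_t file_offset,
                                    std::uint64_t block_size) {
    assert(block_size != 0);
    assert(file_offset % block_size == 0);  // file_offset must be aligned
    assert(file_offset <= file_size);
    return round_up_to_block(file_size - file_offset, block_size);
}
```

For example, with a block size of 4096, a 10000-byte file read from offset 8192 needs a size of 4096 rather than 1808. Since a region outside the buffer bounds is reported as CL_INVALID_VALUE, the target buffer presumably has to be large enough for the rounded-up size.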
3.6 clEnqueueWriteSsgFileAMD
The clEnqueueWriteSsgFileAMD function writes directly from a CL memory object to a file object.
command_queue [in]
Valid host command queue in which the write command will be queued. Create the
command_queue and buffer with the same OpenCL context.
buffer [in]
A valid buffer object; buffer is the source memory object. Create the buffer with either
CL_MEM_ALLOC_HOST_PTR, CL_MEM_USE_HOST_PTR or CL_MEM_USE_PERSISTENT_MEM_AMD.
blocking_write [in]
Indicates whether the write operation is blocking or non-blocking. If blocking_write is CL_TRUE, the
function call will not return until the operation is complete. If blocking_write is CL_FALSE, the OpenCL
implementation will perform a non-blocking write, and the function can return immediately. The event
argument returns an event object that can be used to query the execution status of the write command.
buffer_offset [in]
Offset (in bytes) in the buffer object being read. It must be a multiple of CL_FILE_BLOCK_SIZE_AMD.
size [in]
Size (in bytes) of the data being written. It must be a multiple of CL_FILE_BLOCK_SIZE_AMD. If the file
size is not a multiple of the block size, write to the end of the file by aligning the write size with the next
block multiple beyond the file size.
file [in]
File object to which the copy is initiated.
file_offset [in]
Offset (in bytes) for copying to the file. It must be a multiple of CL_FILE_BLOCK_SIZE_AMD.
event_wait_list and num_events_in_wait_list [in, optional]
Specify events that must complete before clEnqueueWriteSsgFileAMD can execute. If event_wait_list is
NULL, the command will proceed without waiting for events to finish executing. If event_wait_list is NULL,
num_events_in_wait_list must be 0; otherwise, the list of events to which event_wait_list points must be
valid and num_events_in_wait_list must be greater than 0. The events specified in event_wait_list act as
synchronization points, and the contexts associated with events in event_wait_list and command_queue
must be the same. Memory associated with event_wait_list can be reused or freed after the function
returns.
event [out, optional]
Returns an event object that identifies the clEnqueueWriteSsgFileAMD command and can be used to query
the command status or to queue a wait for this command to finish executing. The event argument can be
NULL, in which case the application will be unable to query the command status or queue a wait for the
command to finish. If the event_wait_list and event arguments are both non-NULL, the event argument
should not refer to an element of the event_wait_list array.
Return Value Description
CL_SUCCESS The function executed successfully.
CL_INVALID_FILE_OBJECT_AMD The file is invalid for OpenCL.
CL_INVALID_COMMAND_QUEUE command_queue is not a valid host command-queue.
CL_INVALID_CONTEXT The contexts associated with command_queue and buffer are
different, or the contexts associated with command_queue and the
events in event_wait_list are different.
CL_INVALID_MEM_OBJECT The buffer object is invalid.
CL_INVALID_VALUE The region specified by buffer_offset, file_offset or size is out of
bounds or misaligned with CL_FILE_BLOCK_SIZE_AMD.
CL_INVALID_EVENT_WAIT_LIST Either event_wait_list is NULL and num_events_in_wait_list is greater
than 0, event_wait_list is not NULL and num_events_in_wait_list is 0,
or event objects in event_wait_list are invalid.
CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST
The write operation is blocking, and the execution status of at least one
event in event_wait_list is a negative integer value.
CL_MEM_OBJECT_ALLOCATION_FAILURE
Memory failed to allocate for data store associated with buffer.
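The argument rules behind CL_INVALID_VALUE and CL_INVALID_EVENT_WAIT_LIST can be checked on the host before enqueueing a transfer. The sketch below is an assumption-laden illustration, not the library's own validation: TransferArgs, TransferCheck and check_transfer_args are hypothetical names, plain integer types stand in for the OpenCL types, and the order in which the checks run is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type mirroring two of the documented error conditions.
enum class TransferCheck { Ok, InvalidValue, InvalidEventWaitList };

// Hypothetical bundle of the arguments described above; plain integers stand
// in for the OpenCL types so the sketch compiles without OpenCL headers.
struct TransferArgs {
    std::uint64_t buffer_size;    // size of the OpenCL buffer in bytes
    std::uint64_t buffer_offset;
    std::uint64_t file_offset;
    std::uint64_t size;
    std::uint64_t block_size;     // value of CL_FILE_BLOCK_SIZE_AMD
    const void* event_wait_list;  // stand-in for const cl_event*
    std::uint32_t num_events_in_wait_list;
};

inline TransferCheck check_transfer_args(const TransferArgs& a) {
    // A NULL list requires a count of 0; a non-NULL list requires a count > 0.
    const bool list_is_null = (a.event_wait_list == nullptr);
    const bool count_is_zero = (a.num_events_in_wait_list == 0);
    if (list_is_null != count_is_zero) {
        return TransferCheck::InvalidEventWaitList;
    }
    // buffer_offset, file_offset and size must be multiples of the block size.
    if (a.block_size == 0 ||
        a.buffer_offset % a.block_size != 0 ||
        a.file_offset % a.block_size != 0 ||
        a.size % a.block_size != 0) {
        return TransferCheck::InvalidValue;
    }
    // The region [buffer_offset, buffer_offset + size) must fit in the buffer.
    // Written to avoid overflow in buffer_offset + size.
    if (a.buffer_offset > a.buffer_size ||
        a.size > a.buffer_size - a.buffer_offset) {
        return TransferCheck::InvalidValue;
    }
    return TransferCheck::Ok;
}
```

The file side is deliberately not bounds-checked here, because the text permits a transfer whose size is rounded up past the end of the file. Validity of the individual event objects is also left to the runtime.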
4 OpenCL Performance Guidelines and Caveats
The following guidelines and caveats help obtain the best performance from OpenCL.
• For best performance, create the target resources with the flag
CL_MEM_USE_PERSISTENT_MEM_AMD.
• Because of OS limitations, persistent memory must be referenced by the GPU once before it
truly becomes resident in GPU memory. A one-time performance drop may be experienced the
first time the buffer is used as a file-transfer target if the GPU has yet to access the resource.
Clear the buffer using clEnqueueFillBuffer to generate such a reference.
• The persistent-memory heap available to all applications is 128 MB. If the heap is entirely
consumed, the runtime will silently fall back to standard allocations. If the data set is larger than
100 MB, the application should employ the persistent-memory allocation as a staging buffer and