This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is
presently available at http://www.opencontent.org/openpub/).
Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder.
Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright
holder.
Trademarks
Red Hat, the Red Hat Shadow Man logo®, eCos™, RedBoot™, GNUPro®, and Insight™ are trademarks of Red Hat, Inc.
Sun Microsystems® and Solaris® are registered trademarks of Sun Microsystems, Inc.
SPARC® is a registered trademark of SPARC International, Inc., and is used under license by Sun Microsystems, Inc.
Intel® is a registered trademark of Intel Corporation.
Motorola™ is a trademark of Motorola, Inc.
ARM® is a registered trademark of Advanced RISC Machines, Ltd.
MIPS™ is a trademark of MIPS Technologies, Inc.
Toshiba® is a registered trademark of the Toshiba Corporation.
NEC® is a registered trademark of the NEC Corporation.
Cirrus Logic® is a registered trademark of Cirrus Logic, Inc.
Compaq® is a registered trademark of the Compaq Computer Corporation.
Matsushita™ is a trademark of the Matsushita Electric Corporation.
Samsung® and CalmRISC™ are trademarks or registered trademarks of Samsung, Inc.
Linux® is a registered trademark of Linus Torvalds.
UNIX® is a registered trademark of The Open Group.
Microsoft®, Windows®, and WindowsNT® are registered trademarks of Microsoft Corporation, Inc.
All other brand and product names, trademarks, and copyrights are the property of their respective owners.
Warranty
eCos and RedBoot are open source software, covered by a modified version of the GNU General Public Licence (http://www.gnu.org/copyleft/gpl.html), and
you are welcome to change it and/or distribute copies of it under certain conditions. See http://sources.redhat.com/ecos/license-overview.html for more
information about the license.
eCos and RedBoot software have NO WARRANTY.
Because this software is licensed free of charge, there are no warranties for it, to the extent permitted by applicable law. Except when otherwise stated in
writing, the copyright holders and/or other parties provide the software “as is” without warranty of any kind, either expressed or implied, including, but not
limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the software is
with you. Should the software prove defective, you assume the cost of all necessary servicing, repair or correction.
In no event, unless required by applicable law or agreed to in writing, will any copyright holder, or any other party who may modify and/or redistribute the
program as permitted above, be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or
inability to use the program (including but not limited to loss of data or data being rendered inaccurate or losses sustained by you or third parties or a failure
of the program to operate with any other programs), even if such holder or other party has been advised of the possibility of such damages.
Table of Contents
I. The eCos Kernel...............................................................................................................................................xxv
SMP Support ..................................................................................................................................................35
Thread information ........................................................................................................................................ 43
Thread control ................................................................................................................................................47
Per-thread data ...............................................................................................................................................53
Mail boxes......................................................................................................................................................77
Scheduler Control ..........................................................................................................................................85
II. RedBoot™ User’s Guide.................................................................................................................................ciii
1. Getting Started with RedBoot......................................................................................................................1
More information about RedBoot on the web........................................................................................1
User Interface .........................................................................................................................................2
Load and start a RedBoot RAM instance ................................................................................... 79
Update the primary RedBoot flash image...................................................................................80
Reboot; run the new RedBoot image.......................................................................................... 81
5. Installation and Testing..............................................................................................................................83
7. Architecture, Variant and Platform ..........................................................................................................173
8. General principles .................................................................................................................................... 175
9. HAL Interfaces.........................................................................................................................................177
Base Definitions..................................................................................................................................177
HAL I/O..............................................................................................................................................186
C library startup..................................................................................................................................256
V. I/O Package (Device Drivers).........................................................................................................................259
15. User API.................................................................................................................................................263
16. Serial driver details.................................................................................................................................265
Raw Serial Driver............................................................................................................................... 265
API details.................................................................................................................................272
17. How to Write a Driver............................................................................................................................ 275
How to Write a Serial Hardware Interface Driver..............................................................................276
The API ..............................................................................................................................................288
20. File System Table................................................................................................................................... 313
21. Mount Table...........................................................................................................................................315
25. Initialization and Mounting.................................................................................................................... 323
29. Writing a New Filesystem......................................................................................................................331
VII. PCI Library..................................................................................................................................................335
30. The eCos PCI Library............................................................................................................................ 337
References and Bibliography....................................................................................................................... 367
IX. µITRON .........................................................................................................................................................367
X. TCP/IP Stack Support for eCos.....................................................................................................................387
37. Support Features ....................................................................................................................................401
XIII. DNS for eCos and RedBoot .......................................................................................................................491
DNS API.............................................................................................................................................493
XIV. Ethernet Device Drivers............................................................................................................................. 495
XV. SNMP ............................................................................................................................................................ 509
47. SNMP for eCos...................................................................................................................................... 511
Test cases............................................................................................................................................ 515
SNMP clients and package use...........................................................................................................516
XVI. Embedded HTTP Server ...........................................................................................................................527
System Monitor .................................................................................................................................. 534
XVII. FTP Client for eCos TCP/IP Stack..........................................................................................................535
49. FTP Client Features ...............................................................................................................................537
XVIII. CRC Algorithms......................................................................................................................................539
XIX. CPU load measurements............................................................................................................................543
51. CPU Load Measurements ...................................................................................................................... 545
CPU Load API....................................................................................................................................545
XX. Application profiling....................................................................................................................................547
Power Management Information.................................................................................................................. 557
Changing Power Modes............................................................................................................................... 561
Support for Policy Modules......................................................................................................................... 563
Attached and Detached Controllers ............................................................................................................. 567
Implementing a Power Controller................................................................................................................569
XXII. eCos USB Slave Support ..........................................................................................................................573
USB Enumeration Data................................................................................................................................579
Starting up a USB Device............................................................................................................................585
Receiving Data from the Host......................................................................................................................591
Sending Data to the Host ............................................................................................................................. 595
Control Endpoints ........................................................................................................................................599
Data Endpoints............................................................................................................................................. 605
Writing a USB Device Driver ......................................................................................................................607
Initializing the USB-ethernet Package.........................................................................................................629
USB-ethernet Data Transfers....................................................................................................................... 631
USB-ethernet State Handling.......................................................................................................................633
Network Device for the eCos TCP/IP Stack................................................................................................635
Example Host-side Device Driver................................................................................................................637
Communication Protocol .............................................................................................................................639
Running a Synthetic Target Application...................................................................................................... 649
The I/O Auxiliary’s User Interface .............................................................................................................. 655
The Console Device..................................................................................................................................... 661
System Calls.................................................................................................................................................663
Writing New Devices - target....................................................................................................................... 665
Writing New Devices - host......................................................................................................................... 671
13-1. Behavior of math exception handling...........................................................................................................253
1-2. Sample /etc/named.conf for Red Hat Linux 7.x........................................................................................... 8
I. The eCos Kernel
Kernel Overview
Name
Kernel — Overview of the eCos Kernel
Description
The kernel is one of the key packages in all of eCos. It provides the core functionality needed for developing
multi-threaded applications:
1. The ability to create new threads in the system, either during startup or when the system is already running.
2. Control over the various threads in the system, for example manipulating their priorities.
3. A choice of schedulers, determining which thread should currently be running.
4. A range of synchronization primitives, allowing threads to interact and share data safely.
5. Integration with the system’s support for interrupts and exceptions.
In some other operating systems the kernel provides additional functionality. For example the kernel may also
provide memory allocation functionality, and device drivers may be part of the kernel as well. This is not the case
for eCos. Memory allocation is handled by a separate package. Similarly each device driver will typically be a
separate package. Various packages are combined and configured using the eCos configuration technology to meet
the requirements of the application.
The eCos kernel package is optional. It is possible to write single-threaded applications which do not use any kernel
functionality, for example RedBoot. Typically such applications are based around a central polling loop, continually
checking all devices and taking appropriate action when I/O occurs. A small amount of calculation is possible every
iteration, at the cost of an increased delay between an I/O event occurring and the polling loop detecting the event.
When the requirements are straightforward it may well be easier to develop the application using a polling loop,
avoiding the complexities of multiple threads and synchronization between threads. As requirements get more
complicated a multi-threaded solution becomes more appropriate, requiring the use of the kernel. In fact some of
the more advanced packages in eCos, for example the TCP/IP stack, use multi-threading internally. Therefore if
the application uses any of those packages then the kernel becomes a required package, not an optional one.
The kernel functionality can be used in one of two ways. The kernel provides its own C API, with functions like
cyg_thread_create and cyg_mutex_lock. These can be called directly from application code or from other
packages. Alternatively there are a number of packages which provide compatibility with existing APIs, for example POSIX threads or µITRON. These allow application code to call standard functions such as pthread_create,
and those functions are implemented using the basic functionality provided by the eCos kernel. Using compatibility
packages in an eCos application can make it much easier to reuse code developed in other environments, and to
share code.
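As an illustration, the following fragment sketches the same locking operation expressed first through the kernel's native C API and then through the POSIX compatibility package. It assumes both the kernel and the POSIX threads packages are included in the configuration; the variable and function names are invented for illustration.

#include <cyg/kernel/kapi.h>
#include <pthread.h>

static cyg_mutex_t     native_lock;    /* native kernel mutex */
static pthread_mutex_t posix_lock = PTHREAD_MUTEX_INITIALIZER;

void
native_example(void)
{
    cyg_mutex_init(&native_lock);
    cyg_mutex_lock(&native_lock);
    /* ... manipulate shared data ... */
    cyg_mutex_unlock(&native_lock);
}

void
posix_example(void)
{
    pthread_mutex_lock(&posix_lock);
    /* ... manipulate shared data ... */
    pthread_mutex_unlock(&posix_lock);
}

In both cases the underlying locking behaviour is provided by the eCos kernel; the POSIX package simply layers the standard interface on top of it.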
Although the different compatibility packages have similar requirements on the underlying kernel, for example the
ability to create a new thread, there are differences in the exact semantics. For example, strict µITRON compliance
requires that kernel timeslicing is disabled. This is achieved largely through the configuration technology. The
kernel provides a number of configuration options that control the exact semantics that are provided, and the
various compatibility packages require particular settings for those options. This has two important consequences.
First, it is not usually possible to have two different compatibility packages in one eCos configuration because they
will have conflicting requirements on the underlying kernel. Second, the semantics of the kernel’s own API are only
loosely defined because of the many configuration options. For example cyg_mutex_lock will always attempt to
lock a mutex, but various configuration options determine the behaviour when the mutex is already locked and
there is a possibility of priority inversion.
The optional nature of the kernel package presents some complications for other code, especially device drivers.
Wherever possible a device driver should work whether or not the kernel is present. However there are some
parts of the system, especially those related to interrupt handling, which should be implemented differently
in multi-threaded environments containing the eCos kernel and in single-threaded environments without the
kernel. To cope with both scenarios the common HAL package provides a driver API, with functions such as
cyg_drv_interrupt_attach. When the kernel package is present these driver API functions map directly on to
the equivalent kernel functions such as cyg_interrupt_attach, using macros to avoid any overheads. When the
kernel is absent the common HAL package implements the driver API directly, but this implementation is simpler
than the one in the kernel because it can assume a single-threaded environment.
Schedulers
When a system involves multiple threads, a scheduler is needed to determine which thread should currently be
running. The eCos kernel can be configured with one of two schedulers, the bitmap scheduler and the multi-level
queue (MLQ) scheduler. The bitmap scheduler is somewhat more efficient, but has a number of limitations. Most
systems will instead use the MLQ scheduler. Other schedulers may be added in the future, either as extensions to
the kernel package or in separate packages.
Both the bitmap and the MLQ scheduler use a simple numerical priority to determine which thread should be
running. The number of priority levels is configurable via the option CYGNUM_KERNEL_SCHED_PRIORITIES, but
a typical system will have up to 32 priority levels. Therefore thread priorities will be in the range 0 to 31, with 0
being the highest priority and 31 the lowest. Usually only the system’s idle thread will run at the lowest priority.
Thread priorities are absolute, so the kernel will only run a lower-priority thread if all higher-priority threads are
currently blocked.
The bitmap scheduler only allows one thread per priority level, so if the system is configured with 32 priority levels
then it is limited to only 32 threads — still enough for many applications. A simple bitmap can be used to keep
track of which threads are currently runnable. Bitmaps can also be used to keep track of threads waiting on a mutex
or other synchronization primitive. Identifying the highest-priority runnable or waiting thread involves a simple
operation on the bitmap, and an array index operation can then be used to get hold of the thread data structure
itself. This makes the bitmap scheduler fast and totally deterministic.
The MLQ scheduler allows multiple threads to run at the same priority. This means that there is no limit on the
number of threads in the system, other than the amount of memory available. However operations such as finding
the highest priority runnable thread are a little bit more expensive than for the bitmap scheduler.
Optionally the MLQ scheduler supports timeslicing, where the scheduler automatically switches from one runnable
thread to another when some number of clock ticks have occurred. Timeslicing only comes into play when there
are two runnable threads at the same priority and no higher priority runnable threads. If timeslicing is disabled
then a thread will not be preempted by another thread of the same priority, and will continue running until either it
explicitly yields the processor or until it blocks by, for example, waiting on a synchronization primitive. The configuration options CYGSEM_KERNEL_SCHED_TIMESLICE and CYGNUM_KERNEL_SCHED_TIMESLICE_TICKS control
timeslicing. The bitmap scheduler does not provide timeslicing support. It only allows one thread per priority level,
so it is not possible to preempt the current thread in favour of another one with the same priority.
Another configuration option that affects the MLQ scheduler is CYGIMP_KERNEL_SCHED_SORTED_QUEUES. This determines what happens when a thread blocks, for example by
waiting on a semaphore which has no pending events. The default behaviour of the system is last-in-first-out
queuing. For example if several threads are waiting on a semaphore and an event is posted, the thread that gets
woken up is the last one that called cyg_semaphore_wait. This allows for a simple and fast implementation of
both the queue and dequeue operations. However if there are several queued threads with different priorities, it
may not be the highest priority one that gets woken up. In practice this is rarely a problem: usually there will be at
most one thread waiting on a queue, or when there are several threads they will be of the same priority. However
if the application does require strict priority queueing then the option CYGIMP_KERNEL_SCHED_SORTED_QUEUES
should be enabled. There are disadvantages: more work is needed whenever a thread is queued, and the scheduler
needs to be locked for this operation so the system’s dispatch latency is worse. If the bitmap scheduler is used
then priority queueing is automatic and does not involve any penalties.
Some kernel functionality is currently only supported with the MLQ scheduler, not the bitmap scheduler. This
includes support for SMP systems, and protection against priority inversion using either mutex priority ceilings or
priority inheritance.
Synchronization Primitives
The eCos kernel provides a number of different synchronization primitives: mutexes, condition variables, counting
semaphores, mail boxes and event flags.
Mutexes serve a very different purpose from the other primitives. A mutex allows multiple threads to share a
resource safely: a thread locks a mutex, manipulates the shared resource, and then unlocks the mutex again. The
other primitives are used to communicate information between threads, or alternatively from a DSR associated
with an interrupt handler to a thread.
When a thread that has locked a mutex needs to wait for some condition to become true, it should use a condition
variable. A condition variable is essentially just a place for a thread to wait, and which another thread, or DSR, can
use to wake it up. When a thread waits on a condition variable it releases the mutex before waiting, and when it
wakes up it reacquires it before proceeding. These operations are atomic so that synchronization race conditions
cannot be introduced.
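The following sketch shows the usual pattern of a consumer thread waiting on a condition variable while a producer signals it. The shared flag, the function names and the initialization routine are invented for illustration; note that the condition variable is bound to its mutex when it is initialized.

#include <cyg/kernel/kapi.h>

static cyg_mutex_t lock;
static cyg_cond_t  cond;
static int         data_ready = 0;

void
init(void)
{
    cyg_mutex_init(&lock);
    cyg_cond_init(&cond, &lock);     /* condition variable bound to the mutex */
}

void
consumer(void)
{
    cyg_mutex_lock(&lock);
    while (!data_ready)              /* re-test the condition after waking */
        cyg_cond_wait(&cond);        /* atomically releases and reacquires the mutex */
    /* ... use the shared resource ... */
    cyg_mutex_unlock(&lock);
}

void
producer(void)
{
    cyg_mutex_lock(&lock);
    data_ready = 1;
    cyg_cond_signal(&cond);          /* wake up one waiting thread */
    cyg_mutex_unlock(&lock);
}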
A counting semaphore is used to indicate that a particular event has occurred. A consumer thread can wait for this
event to occur, and a producer thread or a DSR can post the event. There is a count associated with the semaphore
so if the event occurs multiple times in quick succession this information is not lost, and the appropriate number of
semaphore wait operations will succeed.
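A minimal counting semaphore sketch follows; the names are arbitrary. The posting side may be either a thread or a DSR, and each post allows exactly one wait operation to succeed.

#include <cyg/kernel/kapi.h>

static cyg_sem_t events;

void
init(void)
{
    cyg_semaphore_init(&events, 0);      /* initial count of zero */
}

void
producer_or_dsr(void)
{
    cyg_semaphore_post(&events);         /* record that one event has occurred */
}

void
consumer_thread(void)
{
    for (;;) {
        cyg_semaphore_wait(&events);     /* blocks until an event is pending */
        /* ... process one event ... */
    }
}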
Mail boxes are also used to indicate that a particular event has occurred, and allows for one item of data to be
exchanged per event. Typically this item of data would be a pointer to some data structure. Because of the need to
store this extra data, mail boxes have a finite capacity. If a producer thread generates mail box events faster than
they can be consumed then, to avoid overflow, it will be blocked until space is again available in the mail box. This
means that mail boxes usually cannot be used by a DSR to wake up a thread. Instead mail boxes are typically only
used between threads.
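A hedged sketch of mail box usage between two threads follows; the message structure and function names are invented for illustration, and the storage for the mail box itself is allocated statically by the application.

#include <cyg/kernel/kapi.h>

struct my_message {
    int code;
};

static cyg_handle_t mbox_handle;
static cyg_mbox     mbox;                /* storage for the mail box object */

void
init(void)
{
    cyg_mbox_create(&mbox_handle, &mbox);
}

void
producer_thread(struct my_message *msg)
{
    cyg_mbox_put(mbox_handle, msg);      /* may block if the mail box is full */
}

void
consumer_thread(void)
{
    struct my_message *msg = (struct my_message *) cyg_mbox_get(mbox_handle);
    /* ... process msg ... */
}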
Event flags can be used to wait on some number of different events, and to signal that one or several of these
events have occurred. This is achieved by associating bits in a bit mask with the different events. Unlike a counting
semaphore no attempt is made to keep track of the number of events that have occurred, only the fact that an event
has occurred at least once. Unlike a mail box it is not possible to send additional data with the event, but this does
mean that there is no possibility of an overflow and hence event flags can be used between a DSR and a thread as
well as between threads.
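The following sketch shows a thread waiting for either of two events while a DSR or another thread sets the corresponding bits; the event bit assignments and names are arbitrary, and the wait mode shown clears the bits once they have been collected.

#include <cyg/kernel/kapi.h>

#define EVENT_RX_DONE  0x01
#define EVENT_TX_DONE  0x02

static cyg_flag_t flags;

void
init(void)
{
    cyg_flag_init(&flags);
}

void
dsr_or_thread(void)
{
    cyg_flag_setbits(&flags, EVENT_RX_DONE);     /* safe from DSR context */
}

void
waiter_thread(void)
{
    cyg_flag_value_t which =
        cyg_flag_wait(&flags, EVENT_RX_DONE | EVENT_TX_DONE,
                      CYG_FLAG_WAITMODE_OR | CYG_FLAG_WAITMODE_CLR);
    /* which holds the bits that were actually set */
    (void) which;
}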
The eCos common HAL package provides its own device driver API which contains some of the above synchronization primitives. These allow the DSR for an interrupt handler to signal events to higher-level code. If the
configuration includes the eCos kernel package then the driver API routines map directly on to the equivalent
kernel routines, allowing interrupt handlers to interact with threads. If the kernel package is not included and the
application consists of just a single thread running in polled mode then the driver API is implemented entirely
within the common HAL, and with no need to worry about multiple threads the implementation can obviously be
rather simpler.
Threads and Interrupt Handling
During normal operation the processor will be running one of the threads in the system. This may be an application
thread, a system thread running inside say the TCP/IP stack, or the idle thread. From time to time a hardware
interrupt will occur, causing control to be transferred briefly to an interrupt handler. When the interrupt has been
completed the system’s scheduler will decide whether to return control to the interrupted thread or to some other
runnable thread.
Threads and interrupt handlers must be able to interact. If a thread is waiting for some I/O operation to complete,
the interrupt handler associated with that I/O must be able to inform the thread that the operation has completed.
This can be achieved in a number of ways. One very simple approach is for the interrupt handler to set a volatile
variable. A thread can then poll continuously until this flag is set, possibly sleeping for a clock tick in between.
Polling continuously means that the cpu time is not available for other activities, which may be acceptable for some
but not all applications. Polling once every clock tick imposes much less overhead, but means that the thread may
not detect that the I/O event has occurred until an entire clock tick has elapsed. In typical systems this could be as
long as 10 milliseconds. Such a delay might be acceptable for some applications, but not all.
A better solution would be to use one of the synchronization primitives. The interrupt handler could signal a
condition variable, post to a semaphore, or use one of the other primitives. The thread would perform a wait
operation on the same primitive. It would not consume any cpu cycles until the I/O event had occurred, and when
the event does occur the thread can start running again immediately (subject to any higher priority threads that
might also be runnable).
Synchronization primitives constitute shared data, so care must be taken to avoid problems with concurrent access.
If the thread that was interrupted was just performing some calculations then the interrupt handler could manipulate
the synchronization primitive quite safely. However if the interrupted thread happened to be inside some kernel call
then there is a real possibility that some kernel data structure will be corrupted.
One way of avoiding such problems would be for the kernel functions to disable interrupts when executing any
critical region. On most architectures this would be simple to implement and very fast, but it would mean that
interrupts would be disabled often and for quite a long time. For some applications that might not matter, but many
embedded applications require that the interrupt handler run as soon as possible after the hardware interrupt has
occurred. If the kernel relied on disabling interrupts then it would not be able to support such applications.
Instead the kernel uses a two-level approach to interrupt handling. Associated with every interrupt vector is an
Interrupt Service Routine or ISR, which will run as quickly as possible so that it can service the hardware. However
an ISR can make only a small number of kernel calls, mostly related to the interrupt subsystem, and it cannot make
any call that would cause a thread to wake up. If an ISR detects that an I/O operation has completed and hence
that a thread should be woken up, it can cause the associated Deferred Service Routine or DSR to run. A DSR is
allowed to make more kernel calls, for example it can signal a condition variable or post to a semaphore.
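As an illustration of this division of labour, the following sketch shows an ISR and DSR pair for a hypothetical device. MY_DEVICE_VECTOR and the interrupt priority are placeholder values, not something defined by eCos; the ISR only quietens the hardware and requests the DSR, while the DSR performs the semaphore post.

#include <cyg/kernel/kapi.h>

#define MY_DEVICE_VECTOR  10             /* hypothetical platform-specific vector */

static cyg_handle_t  int_handle;
static cyg_interrupt int_object;
static cyg_sem_t     data_ready;

static cyg_uint32
my_isr(cyg_vector_t vector, cyg_addrword_t data)
{
    cyg_interrupt_mask(vector);                    /* quieten the device */
    cyg_interrupt_acknowledge(vector);
    return CYG_ISR_HANDLED | CYG_ISR_CALL_DSR;     /* ask for the DSR to run */
}

static void
my_dsr(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
    cyg_semaphore_post(&data_ready);               /* legal here, not in the ISR */
    cyg_interrupt_unmask(vector);
}

void
init(void)
{
    cyg_semaphore_init(&data_ready, 0);
    cyg_interrupt_create(MY_DEVICE_VECTOR, 0, 0,
                         my_isr, my_dsr, &int_handle, &int_object);
    cyg_interrupt_attach(int_handle);
    cyg_interrupt_unmask(MY_DEVICE_VECTOR);
}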
Disabling interrupts prevents ISRs from running, but very few parts of the system disable interrupts and then only
for short periods of time. The main reason for a thread to disable interrupts is to manipulate some state that is
shared with an ISR. For example if a thread needs to add another buffer to a linked list of free buffers and the ISR
may remove a buffer from this list at any time, the thread would need to disable interrupts for the few instructions
needed to manipulate the list. If the hardware raises an interrupt at this time, it remains pending until interrupts are
reenabled.
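A minimal sketch of this pattern follows, assuming a singly linked free list shared between a thread and an ISR; the structure and function names are invented for illustration.

#include <cyg/kernel/kapi.h>

struct buffer {
    struct buffer *next;
};

static struct buffer *free_list;     /* also manipulated by an ISR */

void
release_buffer(struct buffer *buf)
{
    cyg_interrupt_disable();         /* keep the ISR out for a few instructions */
    buf->next = free_list;
    free_list = buf;
    cyg_interrupt_enable();          /* any pending interrupt is delivered now */
}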
Analogous to interrupts being disabled or enabled, the kernel has a scheduler lock. The various kernel functions
such as cyg_mutex_lock and cyg_semaphore_post will claim the scheduler lock, manipulate the kernel data
structures, and then release the scheduler lock. If an interrupt results in a DSR being requested and the scheduler
is currently locked, the DSR remains pending. When the scheduler lock is released any pending DSRs will run.
These may post events to synchronization primitives, causing other higher priority threads to be woken up.
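Application code can use the same mechanism: a thread that shares data with a DSR can briefly lock the scheduler around the critical region. The following minimal sketch assumes a simple counter updated by both a thread and a DSR; the names are illustrative.

#include <cyg/kernel/kapi.h>

static int counter;                  /* shared between a thread and a DSR */

void
update_counter(void)
{
    cyg_scheduler_lock();            /* DSRs are deferred while the lock is held */
    counter++;
    cyg_scheduler_unlock();          /* any pending DSRs run here */
}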
For an example, consider the following scenario. The system has a high priority thread A, responsible for processing
some data coming from an external device. This device will raise an interrupt when data is available. There are two
other threads B and C which spend their time performing calculations and occasionally writing results to a display
of some sort. This display is a shared resource so a mutex is used to control access.
At a particular moment in time thread A is likely to be blocked, waiting on a semaphore or another synchronization
primitive until data is available. Thread B might be running performing some calculations, and thread C is runnable
waiting for its next timeslice. Interrupts are enabled, and the scheduler is unlocked because none of the threads are
in the middle of a kernel operation. At this point the device raises an interrupt. The hardware transfers control
to a low-level interrupt handler provided by eCos which works out exactly which interrupt occurred, and then the
corresponding ISR is run. This ISR manipulates the hardware as appropriate, determines that there is now data
available, and wants to wake up thread A by posting to the semaphore. However ISRs are not allowed to call
cyg_semaphore_post directly, so instead the ISR requests that its associated DSR be run and returns. There are
no more interrupts to be processed, so the kernel next checks for DSRs. One DSR is pending and the scheduler is
currently unlocked, so the DSR can run immediately and post the semaphore. This will have the effect of making
thread A runnable again, so the scheduler’s data structures are adjusted accordingly. When the DSR returns thread
B is no longer the highest priority runnable thread so it will be suspended, and instead thread A gains control over
the cpu.
In the above example no kernel data structures were being manipulated at the exact moment that the interrupt
happened. However that cannot be assumed. Suppose that thread B had finished its current set of calculations and
wanted to write the results to the display. It would claim the appropriate mutex and manipulate the display. Now
suppose that thread B was timesliced in favour of thread C, and that thread C also finished its calculations and
wanted to write the results to the display. It would call cyg_mutex_lock. This kernel call locks the scheduler,
examines the current state of the mutex, discovers that the mutex is already owned by another thread, suspends
the current thread, and switches control to another runnable thread. Another interrupt happens in the middle of
this cyg_mutex_lock call, causing the ISR to run immediately. The ISR decides that thread A should be woken
up so it requests that its DSR be run and returns back to the kernel. At this point there is a pending DSR, but the
scheduler is still locked by the call to cyg_mutex_lock so the DSR cannot run immediately. Instead the call to
cyg_mutex_lock is allowed to continue, which at some point involves unlocking the scheduler. The pending DSR
can now run, safely post the semaphore, and thus wake up thread A.
If the ISR had called cyg_semaphore_post directly rather than leaving it to a DSR, it is likely that there would
have been some sort of corruption of a kernel data structure. For example the kernel might have completely lost
track of one of the threads, and that thread would never have run again. The two-level approach to interrupt handling, ISRs and DSRs, prevents such problems with no need to disable interrupts.
Calling Contexts
eCos defines a number of contexts. Only certain calls are allowed from inside each context, for example most
operations on threads or synchronization primitives are not allowed from ISR context. The different contexts are
initialization, thread, ISR and DSR.
When eCos starts up it goes through a number of phases, including setting up the hardware and invoking C++ static
constructors. During this time interrupts are disabled and the scheduler is locked. When a configuration includes
the kernel package the final operation is a call to cyg_scheduler_start. At this point interrupts are enabled, the
scheduler is unlocked, and control is transferred to the highest priority runnable thread. If the configuration also
includes the C library package then usually the C library startup package will have created a thread which will call
the application’s main entry point.
Some application code can also run before the scheduler is started, and this code runs in initialization context.
If the application is written partly or completely in C++ then the constructors for any static objects will be run.
Alternatively application code can define a function cyg_user_start which gets called after any C++ static
constructors. This allows applications to be written entirely in C.
void
cyg_user_start(void)
{
    /* Perform application-specific initialization here */
}
It is not necessary for applications to provide a cyg_user_start function since the system will provide a default
implementation which does nothing.
Typical operations that are performed from inside static constructors or cyg_user_start include creating threads,
synchronization primitives, setting up alarms, and registering application-specific interrupt handlers. In fact for
many applications all such creation operations happen at this time, using statically allocated data, avoiding any
need for dynamic memory allocation or other overheads.
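A minimal sketch of such an initialization routine follows; the thread priority, stack size and names are illustrative assumptions, and all of the data involved is allocated statically.

#include <cyg/kernel/kapi.h>

#define WORKER_STACK_SIZE 4096       /* illustrative size only */

static cyg_thread   worker_thread;
static cyg_handle_t worker_handle;
static char         worker_stack[WORKER_STACK_SIZE];
static cyg_sem_t    wakeup;

static void
worker(cyg_addrword_t data)
{
    for (;;)
        cyg_semaphore_wait(&wakeup);
}

void
cyg_user_start(void)
{
    /* Interrupts are disabled and the scheduler is locked at this point. */
    cyg_semaphore_init(&wakeup, 0);
    cyg_thread_create(10, worker, 0, "worker",
                      worker_stack, WORKER_STACK_SIZE,
                      &worker_handle, &worker_thread);
    cyg_thread_resume(worker_handle);    /* runs once the scheduler starts */
}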
Code running in initialization context runs with interrupts disabled and the scheduler locked. It is not permitted
to reenable interrupts or unlock the scheduler because the system is not guaranteed to be in a totally consistent state at this point. A consequence is that initialization code cannot use synchronization primitives such as
cyg_semaphore_wait to wait for an external event. It is permitted to lock and unlock a mutex: there are no other
threads running so it is guaranteed that the mutex is not yet locked, and therefore the lock operation will never
block; this is useful when making library calls that may use a mutex internally.
At the end of the startup sequence the system will call cyg_scheduler_start and the various threads will
start running. In thread context nearly all of the kernel functions are available. There may be some restrictions
on interrupt-related operations, depending on the target hardware. For example the hardware may require
that interrupts be acknowledged in the ISR or DSR before control returns to thread context, in which case
cyg_interrupt_acknowledge should not be called by a thread.
At any time the processor may receive an external interrupt, causing control to be transferred from the current
thread. Typically a VSR provided by eCos will run and determine exactly which interrupt occurred. Then the VSR
will switch to the appropriate ISR, which can be provided by a HAL package, a device driver, or by the application.
During this time the system is running at ISR context, and most of the kernel function calls are disallowed. This
includes the various synchronization primitives, so for example an ISR is not allowed to post to a semaphore to
indicate that an event has happened. Usually the only operations that should be performed from inside an ISR are
ones related to the interrupt subsystem itself, for example masking an interrupt or acknowledging that an interrupt
has been processed. On SMP systems it is also possible to use spinlocks from ISR context.
When an ISR returns it can request that the corresponding DSR be run as soon as it is safe to do so, and that
will run in DSR context. This context is also used for running alarm functions, and threads can switch temporarily to DSR context by locking the scheduler. Only certain kernel functions can be called from DSR context, although more than in ISR context. In particular it is possible to use any synchronization primitives which cannot
block. These include cyg_semaphore_post, cyg_cond_signal, cyg_cond_broadcast, cyg_flag_setbits,
and cyg_mbox_tryput. It is not possible to use any primitives that may block such as cyg_semaphore_wait,
cyg_mutex_lock, or cyg_mbox_put. Calling such functions from inside a DSR may cause the system to hang.
The specific documentation for the various kernel functions gives more details about valid contexts.
Error Handling and Assertions
In many APIs each function is expected to perform some validation of its parameters and possibly of the current
state of the system. This is supposed to ensure that each function is used correctly, and that application code is not
attempting to perform a semaphore operation on a mutex or anything like that. If an error is detected then a suitable
error code is returned, for example the POSIX function pthread_mutex_lock can return various error codes
including EINVAL and EDEADLK. There are a number of problems with this approach, especially in the context of
deeply embedded systems:
1. Performing these checks inside the mutex lock and all the other functions requires extra cpu cycles and adds
significantly to the code size. Even if the application is written correctly and only makes system function calls
with sensible arguments and under the right conditions, these overheads still exist.
2. Returning an error code is only useful if the calling code detects these error codes and takes appropriate action.
In practice the calling code will often ignore any errors because the programmer “knows” that the function is
being used correctly. If the programmer is mistaken then an error condition may be detected and reported, but
the application continues running anyway and is likely to fail some time later in mysterious ways.
3. If the calling code does always check for error codes, that adds yet more cpu cycles and code size overhead.
4. Usually there will be no way to recover from certain errors, so if the application code detected an error such
as EINVAL then all it could do is abort the application somehow.
The approach taken within the eCos kernel is different. Functions such as cyg_mutex_lock will not return an error
code. Instead they contain various assertions, which can be enabled or disabled. During the development process
assertions are normally left enabled, and the various kernel functions will perform parameter checks and other
system consistency checks. If a problem is detected then an assertion failure will be reported and the application
will be terminated. In a typical debug session a suitable breakpoint will have been installed and the developer can
now examine the state of the system and work out exactly what is going on. Towards the end of the development
cycle assertions will be disabled by manipulating configuration options within the eCos infrastructure package, and
all assertions will be eliminated at compile-time. The assumption is that by this time the application code has been
mostly debugged: the initial version of the code might have tried to perform a semaphore operation on a mutex, but
any problems like that will have been fixed some time ago. This approach has a number of advantages:
1. In the final application there will be no overheads for checking parameters and other conditions. All that code
will have been eliminated at compile-time.
2. Because the final application will not suffer any overheads, it is reasonable for the system to do more work
during the development process. In particular the various assertions can test for more error conditions and
more complicated errors. When an error is detected it is possible to give a text message describing the error
rather than just return an error code.
3. There is no need for application programmers to handle error codes returned by various kernel function calls.
This simplifies the application code.
4. If an error is detected then an assertion failure will be reported immediately and the application will be halted.
There is no possibility of an error condition being ignored because application code did not check for an error
code.
Although none of the kernel functions return an error code, many of them do return a status condition. For example
the function cyg_semaphore_timed_wait waits until either an event has been posted to a semaphore, or until a
certain number of clock ticks have occurred. Usually the calling code will need to know whether the wait operation
succeeded or whether a timeout occurred. cyg_semaphore_timed_wait returns a boolean: a return value of zero
or false indicates a timeout, a non-zero return value indicates that the wait succeeded.
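A brief sketch of checking such a status condition follows; the 100-tick timeout and the function names are arbitrary, and the example assumes the timeout argument is an absolute tick count, hence the use of cyg_current_time().

#include <cyg/kernel/kapi.h>

static cyg_sem_t io_done;

int
wait_for_io(void)
{
    /* Wait at most 100 clock ticks from now. */
    if (!cyg_semaphore_timed_wait(&io_done, cyg_current_time() + 100)) {
        /* zero/false return value: the wait timed out */
        return -1;
    }
    /* non-zero return value: the semaphore was posted in time */
    return 0;
}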
In conventional APIs one common error condition is lack of memory. For example the POSIX function
pthread_create usually has to allocate some memory dynamically for the thread stack and other per-thread
data. If the target hardware does not have enough memory to meet all demands, or more commonly if the
application contains a memory leak, then there may not be enough memory available and the function call would
fail. The eCos kernel avoids such problems by never performing any dynamic memory allocation. Instead it is the
responsibility of the application code to provide all the memory required for kernel data structures and other
needs. In the case of cyg_thread_create this means a cyg_thread data structure to hold the thread details, and a
char array for the thread stack.
In many applications this approach results in all data structures being allocated statically rather than dynamically.
This has several advantages. If the application is in fact too large for the target hardware’s memory then there will
be an error at link-time rather than at run-time, making the problem much easier to diagnose. Static allocation
does not involve any of the usual overheads associated with dynamic allocation, for example there is no need to
keep track of the various free blocks in the system, and it may be possible to eliminate malloc from the system
completely. Problems such as fragmentation and memory leaks cannot occur if all data is allocated statically.
However, some applications are sufficiently complicated that dynamic memory allocation is required, and the
various kernel functions do not distinguish between statically and dynamically allocated memory. It still remains
the responsibility of the calling code to ensure that sufficient memory is available, and passing null pointers to the
kernel will result in assertions or system failure.
SMP Support
Name
SMP — Support Symmetric Multiprocessing Systems
Description
eCos contains support for limited Symmetric Multi-Processing (SMP). This is only available on selected architectures and platforms. The implementation has a number of restrictions on the kind of hardware supported. These are
described in the Section called SMP Support in Chapter 9.
The following sections describe the changes that have been made to the eCos kernel to support SMP operation.
System Startup
The system startup sequence needs to be somewhat different on an SMP system, although this is largely transparent
to application code. The main startup takes place on only one CPU, called the primary CPU. All other CPUs, the
secondary CPUs, are either placed in suspended state at reset, or are captured by the HAL and put into a spin as
they start up. The primary CPU is responsible for copying the DATA segment and zeroing the BSS (if required),
calling HAL variant and platform initialization routines and invoking constructors. It then calls cyg_start to enter
the application. The application may then create extra threads and other objects.
It is only when the application calls cyg_scheduler_start that the secondary CPUs are initialized. This routine
scans the list of available secondary CPUs and invokes HAL_SMP_CPU_START to start each CPU. Finally it calls an
internal function Cyg_Scheduler::start_cpu to enter the scheduler for the primary CPU.
Each secondary CPU starts in the HAL, where it completes any per-CPU initialization before calling into the kernel
at cyg_kernel_cpu_startup. Here it claims the scheduler lock and calls Cyg_Scheduler::start_cpu.
Cyg_Scheduler::start_cpu is common to both the primary and secondary CPUs. The first thing this code does
is to install an interrupt object for this CPU’s inter-CPU interrupt. From this point on the code is the same as for
the single CPU case: an initial thread is chosen and entered.
From this point on the CPUs are all equal, eCos makes no further distinction between the primary and secondary
CPUs. However, the hardware may still distinguish between them as far as interrupt delivery is concerned.
Scheduling
To function correctly an operating system kernel must protect its vital data structures, such as the run queues,
from concurrent access. In a single CPU system the only concurrent activities to worry about are asynchronous
interrupts. The kernel can easily guard its data structures against these by disabling interrupts. However, in a multi-CPU system, this is inadequate since it does not block access by other CPUs.
The eCos kernel protects its vital data structures using the scheduler lock. In single CPU systems this is a simple
counter that is atomically incremented to acquire the lock and decremented to release it. If the lock is decremented
to zero then the scheduler may be invoked to choose a different thread to run. Because interrupts may continue to
be serviced while the scheduler lock is claimed, ISRs are not allowed to access kernel data structures, or call kernel
routines that can. Instead all such operations are deferred to an associated DSR routine that is run during the lock
release operation, when the data structures are in a consistent state.
By choosing a kernel locking mechanism that does not rely on interrupt manipulation to protect data structures,
it is easier to convert eCos to SMP than would otherwise be the case. The principal change needed to make eCos
SMP-safe is to convert the scheduler lock into a nestable spin lock. This is done by adding a spinlock and a CPU
id to the original counter.
The algorithm for acquiring the scheduler lock is very simple. If the scheduler lock’s CPU id matches the current
CPU then it can just increment the counter and continue. If it does not match, the CPU must spin on the spinlock,
after which it may increment the counter and store its own identity in the CPU id.
To release the lock, the counter is decremented. If it goes to zero the CPU id value must be set to NONE and the
spinlock cleared.
To protect these sequences against interrupts, they must be performed with interrupts disabled. However, since
these are very short code sequences, they will not have an adverse effect on the interrupt latency.
Beyond converting the scheduler lock, further preparing the kernel for SMP is a relatively minor matter. The main
changes are to convert various scalar housekeeping variables into arrays indexed by CPU id. These include the
current thread pointer, the need_reschedule flag and the timeslice counter.
At present only the Multi-Level Queue (MLQ) scheduler is capable of supporting SMP configurations. The main
change made to this scheduler is to cope with having several threads in execution at the same time. Running threads
are marked with the CPU that they are executing on. When scheduling a thread, the scheduler skips past any running
threads until it finds a thread that is pending. While not a constant-time algorithm, as in the single CPU case, this
is still deterministic, since the worst case time is bounded by the number of CPUs in the system.
A second change to the scheduler is in the code used to decide when the scheduler should be called to choose a
new thread. The scheduler attempts to keep the n CPUs running the n highest priority threads. Since an event or
interrupt on one CPU may require a reschedule on another CPU, there must be a mechanism for deciding this. The
algorithm currently implemented is very simple. Given a thread that has just been awakened (or had its priority
changed), the scheduler scans the CPUs, starting with the one it is currently running on, for a current thread that
is of lower priority than the new one. If one is found then a reschedule interrupt is sent to that CPU and the scan
continues, but now using the current thread of the rescheduled CPU as the candidate thread. In this way the new
thread gets to run as quickly as possible, hopefully on the current CPU, and the remaining CPUs will pick up the
remaining highest priority threads as a consequence of processing the reschedule interrupt.
The final change to the scheduler is in the handling of timeslicing. Only one CPU receives timer interrupts, although
all CPUs must handle timeslicing. To make this work, the CPU that receives the timer interrupt decrements the
timeslice counter for all CPUs, not just its own. If the counter for a CPU reaches zero, then it sends a timeslice
interrupt to that CPU. On receiving the interrupt the destination CPU enters the scheduler and looks for another
thread at the same priority to run. This is somewhat more efficient than distributing clock ticks to all CPUs, since
the interrupt is only needed when a timeslice occurs.
All existing synchronization mechanisms work as before in an SMP system. Additional synchronization mechanisms have been added to provide explicit synchronization for SMP, in the form of spinlocks.
SMP Interrupt Handling
The main area where the SMP nature of a system requires special attention is in device drivers and especially
interrupt handling. It is quite possible for the ISR, DSR and thread components of a device driver to execute on
different CPUs. For this reason it is much more important that SMP-capable device drivers use the interrupt-related
functions correctly. Typically a device driver would use the driver API rather than call the kernel directly, but it is
unlikely that anybody would attempt to use a multiprocessor system without the kernel package.
Two new functions have been added to the Kernel API to do interrupt routing: cyg_interrupt_set_cpu and
cyg_interrupt_get_cpu. Although not currently supported, special values for the cpu argument may be used in
future to indicate that the interrupt is being routed dynamically or is CPU-local. Once a vector has been routed to
a new CPU, all other interrupt masking and configuration operations are relative to that CPU, where relevant.
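A brief sketch of routing an interrupt to a particular CPU follows; the signatures shown are assumed from the function names given above, and the target CPU number is purely illustrative.

#include <cyg/kernel/kapi.h>

void
route_device_interrupt(cyg_vector_t vector)
{
    cyg_cpu_t current = cyg_interrupt_get_cpu(vector);   /* CPU currently handling it */

    if (current != 1)
        cyg_interrupt_set_cpu(vector, 1);    /* deliver this vector to CPU 1 */
}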
There are more details of how interrupts should be handled in SMP systems in the Section called SMP Support in Chapter 9.
Thread creation
The cyg_thread_create function allows application code and eCos packages to create new threads. In many
applications this only happens during system initialization and all required data is allocated statically. However
additional threads can be created at any time, if necessary. A newly created thread is always in suspended state
and will not start running until it has been resumed via a call to cyg_thread_resume. Also, if threads are created
during system initialization then they will not start running until the eCos scheduler has been started.
The name argument is used primarily for debugging purposes, making it easier to keep track of which
cyg_thread structure is associated with which application-level thread. The kernel configuration option
CYGVAR_KERNEL_THREADS_NAME controls whether or not this name is actually used.
On creation each thread is assigned a unique handle, and this will be stored in the location pointed at by the
handle argument. Subsequent operations on this thread including the required cyg_thread_resume should use
this handle to identify the thread.
The kernel requires a small amount of space for each thread, in the form of a cyg_thread data structure, to hold
information such as the current state of that thread. To avoid any need for dynamic memory allocation within the
kernel this space has to be provided by higher-level code, typically in the form of a static variable. The thread
argument provides this space.
Thread Entry Point
The entry point for a thread takes the form:
void
thread_entry_function(cyg_addrword_t data)
{
...
}
The second argument to cyg_thread_create is a pointer to such a function. The third argument entry_data
is used to pass additional data to the function. Typically this takes the form of a pointer to some static data, or a
small integer, or 0 if the thread does not require any additional data.
If the thread entry function ever returns then this is equivalent to the thread calling cyg_thread_exit. Even
though the thread will no longer run again, it remains registered with the scheduler. If the application needs to
re-use the cyg_thread data structure then a call to cyg_thread_delete is required first.
Thread Priorities
The sched_info argument provides additional information to the scheduler. The exact details depend on the
scheduler being used. For the bitmap and mlqueue schedulers it is a small integer, typically in the range 0 to 31,
with 0 being the highest priority. The lowest priority is normally used only by the system’s idle thread. The exact
number of priorities is controlled by the kernel configuration option CYGNUM_KERNEL_SCHED_PRIORITIES.
It is the responsibility of the application developer to be aware of the various threads in the system, including those
created by eCos packages, and to ensure that all threads run at suitable priorities. For threads created by other
packages the documentation provided by those packages should indicate any requirements.
The functions cyg_thread_set_priority, cyg_thread_get_priority and cyg_thread_get_current_priority can be used to manipulate a thread’s priority.
Stacks and Stack Sizes
Each thread needs its own stack for local variables and to keep track of function calls and returns. Again it is
expected that this stack is provided by the calling code, usually in the form of static data, so that the kernel does not
need any dynamic memory allocation facilities. cyg_thread_create takes two arguments related to the stack, a
pointer to the base of the stack and the total size of this stack. On many processors stacks actually descend from
the top down, so the kernel will add the stack size to the base address to determine the starting location.
The exact stack size requirements for any given thread depend on a number of factors. The most important is
of course the code that will be executed in the context of this thread: if this involves significant nesting of
function calls, recursion, or large local arrays, then the stack size needs to be set to a suitably high value.
There are some architectural issues, for example the number of cpu registers and the calling conventions
will have some effect on stack usage. Also, depending on the configuration, it is possible that some other
code such as interrupt handlers will occasionally run on the current thread’s stack. This depends in
part on configuration options such as CYGIMP_HAL_COMMON_INTERRUPTS_USE_INTERRUPT_STACK and
CYGSEM_HAL_COMMON_INTERRUPTS_ALLOW_NESTING.
Determining an application’s actual stack size requirements is the responsibility of the application developer,
since the kernel cannot know in advance what code a given thread will run. However, the system does provide
some hints about reasonable stack sizes in the form of two constants: CYGNUM_HAL_STACK_SIZE_MINIMUM and
CYGNUM_HAL_STACK_SIZE_TYPICAL. These are defined by the appropriate HAL package. The MINIMUM value is
appropriate for a thread that just runs a single function and makes very simple system calls. Trying to create a
thread with a smaller stack than this is illegal. The TYPICAL value is appropriate for applications where function
calls are nested no more than half a dozen or so levels, and there are no large arrays on the stack.
If the stack sizes are not estimated correctly and a stack overflow occurs, the probable result is some form of
memory corruption. This can be very hard to track down. The kernel does contain some code to help detect stack
overflows, controlled by the configuration option CYGFUN_KERNEL_THREADS_STACK_CHECKING: a small amount
of space is reserved at the stack limit and filled with a special signature: every time a thread context switch occurs
this signature is checked, and if invalid that is a good indication (but not absolute proof) that a stack overflow has
occurred. This form of stack checking is enabled by default when the system is built with debugging enabled. A
related configuration option is CYGFUN_KERNEL_THREADS_STACK_MEASUREMENT: enabling this option means that
a thread can call the function cyg_thread_measure_stack_usage to find out the maximum stack usage to date.
Note that this is not necessarily the true maximum because, for example, it is possible that in the current run no
interrupt occurred at the worst possible moment.
Valid contexts
cyg_thread_create may be called during initialization and from within thread context. It may not be called from
inside a DSR.
Example
A simple example of thread creation is shown below. This involves creating five threads, one producer and four
consumers or workers. The threads are created in the system’s cyg_user_start: depending on the configuration
it might be more appropriate to do this elsewhere, for example inside main.
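A cut-down sketch of this pattern is shown below, creating a single worker thread rather than the full producer/consumer set. The priority value and names are illustrative only, and CYGNUM_HAL_STACK_SIZE_TYPICAL is assumed to be visible via the HAL headers:
#include <cyg/kernel/kapi.h>

static unsigned char worker_stack[CYGNUM_HAL_STACK_SIZE_TYPICAL];
static cyg_thread    worker_thread_data;     /* space for the cyg_thread structure */
static cyg_handle_t  worker_handle;

static void worker(cyg_addrword_t data)
{
    for (;;) {
        /* a real thread would block on a synchronization primitive here */
        cyg_thread_delay(100);
    }
}

void cyg_user_start(void)
{
    cyg_thread_create(10,                    /* sched_info: priority               */
                      worker,                /* entry point                         */
                      0,                     /* entry_data: none required           */
                      "worker",              /* name, used for debugging            */
                      worker_stack,          /* stack base                          */
                      sizeof(worker_stack),  /* stack size                          */
                      &worker_handle,        /* returned handle                     */
                      &worker_thread_data);  /* space for the thread data structure */

    /* A newly created thread is suspended and must be resumed explicitly. */
    cyg_thread_resume(worker_handle);
}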
For code written in C++ the thread entry function must be either a static member function of a class or an ordinary
function outside any class. It cannot be a normal member function of a class because such member functions take
an implicit additional argument this, and the kernel has no way of knowing what value to use for this argument.
One way around this problem is to make use of a special static member function that takes the object pointer as its
entry data. Effectively this uses the entry_data argument to cyg_thread_create to hold the this pointer. Unfortunately
this approach does require the use of some C++ casts, so some of the type safety that can be achieved when
programming in C++ is lost.
These functions can be used to obtain some basic information about various threads in the system. Typically they
serve little or no purpose in real applications, but they can be useful during debugging.
cyg_thread_self returns a handle corresponding to the current thread. It will be the same as the value filled in
by cyg_thread_create when the current thread was created. This handle can then be passed to other functions
such as cyg_thread_get_priority.
cyg_thread_idle_thread returns the handle corresponding to the idle thread. This thread is created automati-
cally by the kernel, so application code has no other way of getting hold of this information.
cyg_thread_get_stack_base and cyg_thread_get_stack_size return information about a specific thread’s
stack. The values returned will match the values passed to cyg_thread_create when this thread was created.
cyg_thread_measure_stack_usage is only available if the configuration option
CYGFUN_KERNEL_THREADS_STACK_MEASUREMENT is enabled. The return value is the maximum number of bytes
of stack space used so far by the specified thread. Note that this should not be considered a true upper bound, for
example it is possible that in the current test run the specified thread has not yet been interrupted at the deepest
point in the function call graph. Nevertheless the value returned can give some useful indication of the thread’s
stack requirements.
cyg_thread_get_next is used to enumerate all the current threads in the system. It should be called initially with
the locations pointed to by thread and id set to zero. On return these will be set to the handle and ID of the first
thread. On subsequent calls, these parameters should be left set to the values returned by the previous call. The
handle and ID of the next thread in the system will be installed each time, until a false return value indicates the
end of the list.
cyg_thread_get_info fills in the cyg_thread_info structure with information about the thread described by the
thread and id arguments. The information returned includes the thread’s handle and id, its state and name,
priorities and stack parameters. If the thread does not exist the function returns false.
The cyg_thread_info structure is defined as follows by <cyg/kernel/kapi.h>, but may be extended in future
with additional members, and so its size should not be relied upon:
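A sketch of the structure, consistent with the description above, is shown here; the field names follow common usage but the authoritative definition in <cyg/kernel/kapi.h> should be consulted, and further members may be present:
typedef struct {
    cyg_handle_t    handle;       /* thread handle                           */
    cyg_uint16      id;           /* unique thread id                        */
    cyg_uint32      state;        /* current scheduler state                 */
    char*           name;         /* name given to cyg_thread_create         */
    cyg_priority_t  set_pri;      /* priority as set                         */
    cyg_priority_t  cur_pri;      /* current (possibly boosted) priority     */
    cyg_addrword_t  stack_base;   /* base address of the stack               */
    cyg_uint32      stack_size;   /* total stack size                        */
    cyg_uint32      stack_used;   /* maximum stack usage measured so far     */
} cyg_thread_info;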
cyg_thread_find returns a handle for the thread whose ID is id. If no such thread exists, a zero handle is
returned.
Valid contexts
cyg_thread_self may only be called from thread context. cyg_thread_idle_thread may be called from
thread or DSR context, but only after the system has been initialized. cyg_thread_get_stack_base,
cyg_thread_get_stack_size and cyg_thread_measure_stack_usage may be called any time after the
specified thread has been created, but measuring stack usage involves looping over at least part of the thread’s
stack so this should normally only be done from thread context.
Examples
A simple example of the use of cyg_thread_get_next and cyg_thread_get_info follows:
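(The following is a sketch rather than the original example; it assumes the cyg_thread_info fields described earlier and uses printf purely for illustration.)
#include <cyg/kernel/kapi.h>
#include <stdio.h>

void dump_threads(void)
{
    cyg_handle_t thread = 0;
    cyg_uint16   id     = 0;

    /* Start with thread and id set to zero; each call advances to the next thread. */
    while (cyg_thread_get_next(&thread, &id)) {
        cyg_thread_info info;

        if (!cyg_thread_get_info(thread, id, &info))
            break;                          /* thread no longer exists */

        printf("id %04x pri %d name %s\n",
               info.id, (int)info.set_pri,
               info.name ? info.name : "<unnamed>");
    }
}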
These functions provide some control over whether or not a particular thread can run. Apart from the required use of
cyg_thread_resume to start a newly-created thread, application code should normally use proper synchronization
primitives such as condition variables or mail boxes.
Yield
cyg_thread_yield allows a thread to relinquish control of the processor to some other runnable thread which has
the same priority. This can have no effect on any higher-priority thread since, if such a thread were runnable, the
current thread would have been preempted in its favour. Similarly it can have no effect on any lower-priority thread
because the current thread will always be run in preference to those. As a consequence this function is only useful
in configurations with a scheduler that allows multiple threads to run at the same priority, for example the mlqueue
scheduler. If instead the bitmap scheduler was being used then cyg_thread_yield() would serve no purpose.
Even if a suitable scheduler such as the mlqueue scheduler has been configured, cyg_thread_yield will still
rarely prove useful: instead timeslicing will be used to ensure that all threads of a given priority get a fair slice
of the available processor time. However it is possible to disable timeslicing via the configuration option
CYGSEM_KERNEL_SCHED_TIMESLICE, in which case cyg_thread_yield can be used to implement a form of
cooperative multitasking.
Delay
cyg_thread_delay allows a thread to suspend until the specified number of clock ticks have occurred. For ex-
ample, if a value of 1 is used and the system clock runs at a frequency of 100Hz then the thread will sleep for up
to 10 milliseconds. This functionality depends on the presence of a real-time system clock, as controlled by the
configuration option CYGVAR_KERNEL_COUNTERS_CLOCK.
If the application requires delays measured in milliseconds or similar units rather than in clock ticks, some calculations are needed to convert between these units as described in Clocks. Usually these calculations can be done by the
application developer, or at compile-time. Performing such calculations prior to every call to cyg_thread_delay
adds unnecessary overhead to the system.
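As an illustration, on a target with a 100Hz real-time clock the conversion can be folded into a compile-time constant rather than computed before every call. The 100Hz figure is an assumption about the target, not a guarantee:
#include <cyg/kernel/kapi.h>

/* Assuming a 100Hz real-time clock, i.e. 10ms per tick. */
#define MS_PER_TICK      10
#define MS_TO_TICKS(ms)  ((ms) / MS_PER_TICK)

void pause_half_second(void)
{
    cyg_thread_delay(MS_TO_TICKS(500));   /* sleep for up to 50 ticks */
}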
Suspend and Resume
Associated with each thread is a suspend counter. When a thread is first created this counter is initialized to 1.
cyg_thread_suspend can be used to increment the suspend counter, and cyg_thread_resume decrements it.
The scheduler will never run a thread with a non-zero suspend counter. Therefore a newly created thread will not
run until it has been resumed.
An occasional problem with the use of suspend and resume functionality is that a thread gets suspended
more times than it is resumed and hence never becomes runnable again. This can lead to very
confusing behaviour. To help with debugging such problems the kernel provides a configuration option
CYGNUM_KERNEL_MAX_SUSPEND_COUNT_ASSERT which imposes an upper bound on the number of suspend calls
without matching resumes, with a reasonable default value. This functionality depends on infrastructure assertions
being enabled.
Releasing a Blocked Thread
When a thread is blocked on a synchronization primitive such as a semaphore or a mutex, or when it is waiting
for an alarm to trigger, it can be forcibly woken up using cyg_thread_release. Typically this will call the
affected synchronization primitive to return false, indicating that the operation was not completed successfully.
This function has to be used with great care, and in particular it should only be used on threads that have been
designed appropriately and check all return codes. If instead it were to be used on, say, an arbitrary thread that is
attempting to claim a mutex then that thread might not bother to check the result of the mutex lock operation; usually there would be no reason to do so. Therefore the thread will now continue running in the false belief that it
has successfully claimed a mutex lock, and the resulting behaviour is undefined. If the system has been built with
assertions enabled then it is possible that an assertion will trigger when the thread tries to release the mutex it does
not actually own.
The main use of cyg_thread_release is in the POSIX compatibility layer, where it is used in the implementation
of per-thread signals and cancellation handlers.
Valid contexts
cyg_thread_yield can only be called from thread context. A DSR must always run to completion and cannot
yield the processor to some thread. cyg_thread_suspend, cyg_thread_resume, and cyg_thread_release
may be called from thread or DSR context.
Thread termination
Name
cyg_thread_exit, cyg_thread_kill, cyg_thread_delete — Allow threads to terminate
In many embedded systems the various threads are allocated statically, created during initialization, and never
need to terminate. This avoids any need for dynamic memory allocation or other resource management facilities.
However if a given application does have a requirement that some threads be created dynamically, must terminate,
and their resources such as the stack be reclaimed, then the kernel provides the functions cyg_thread_exit,
cyg_thread_kill, and cyg_thread_delete.
cyg_thread_exit allows a thread to terminate itself, thus ensuring that it will not be run again by the scheduler.
However the cyg_thread data structure passed to cyg_thread_create remains in use, and the handle returned
by cyg_thread_create remains valid. This allows other threads to perform certain operations on the terminated
thread, for example to determine its stack usage via cyg_thread_measure_stack_usage. When the handle and
cyg_thread structure are no longer required, cyg_thread_delete should be called to release these resources. If
the stack was dynamically allocated then this should not be freed until after the call to cyg_thread_delete.
Alternatively, one thread may use cyg_thread_kill on another. This has much the same effect as the affected
thread calling cyg_thread_exit. However killing a thread is generally rather dangerous because no attempt is
made to unlock any synchronization primitives currently owned by that thread or release any other resources that
thread may have claimed. Therefore use of this function should be avoided, and cyg_thread_exit is preferred.
cyg_thread_kill cannot be used by a thread to kill itself.
cyg_thread_delete should be used on a thread after it has exited and is no longer required. After this call the
thread handle is no longer valid, and both the cyg_thread structure and the thread stack can be re-used or freed. If
cyg_thread_delete is invoked on a thread that is still running then there is an implicit call to cyg_thread_kill.
Valid contexts
cyg_thread_exit, cyg_thread_kill and cyg_thread_delete can only be called from thread context.
Thread priorities
Name
cyg_thread_get_priority, cyg_thread_get_current_priority,
cyg_thread_set_priority — Examine and manipulate thread priorities
Typical schedulers use the concept of a thread priority to determine which thread should run next. Exactly
what this priority consists of will depend on the scheduler, but a typical implementation would be a small
integer in the range 0 to 31, with 0 being the highest priority. Usually only the idle thread will run at the
lowest priority. The exact number of priority levels available depends on the configuration, typically the option
CYGNUM_KERNEL_SCHED_PRIORITIES.
cyg_thread_get_priority can be used to determine the priority of a thread, or more correctly the value last used
in a cyg_thread_set_priority call or when the thread was first created. In some circumstances it is possible
that the thread is actually running at a higher priority. For example, if it owns a mutex and priority ceilings or
inheritance is being used to prevent priority inversion problems, then the thread’s priority may have been boosted
temporarily. cyg_thread_get_current_priority returns the real current priority.
In many applications appropriate thread priorities can be determined and allocated statically. However, if it is
necessary for a thread’s priority to change at run-time then the cyg_thread_set_priority function provides
this functionality.
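A small sketch of temporarily boosting a thread's priority at run-time and returning the old value so the caller can restore it later; the priority values used by a real application are of course configuration-dependent:
#include <cyg/kernel/kapi.h>

cyg_priority_t boost_thread(cyg_handle_t thread, cyg_priority_t new_pri)
{
    cyg_priority_t old_pri = cyg_thread_get_priority(thread);
    cyg_thread_set_priority(thread, new_pri);
    return old_pri;     /* restore later with cyg_thread_set_priority() */
}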
Valid contexts
cyg_thread_get_priority and cyg_thread_get_current_priority can be called from thread or DSR con-
text, although the latter is rarely useful. cyg_thread_set_priority should also only be called from thread
context.
Per-thread data
Name
cyg_thread_new_data_index, cyg_thread_free_data_index, cyg_thread_get_data,
cyg_thread_get_data_ptr, cyg_thread_set_data — Manipulate per-thread data
In some applications and libraries it is useful to have some data that is specific to each thread. For example, many
of the functions in the POSIX compatibility package return -1 to indicate an error and store additional information
in what appears to be a global variable errno. However, if multiple threads make concurrent calls into the POSIX
library and if errno were really a global variable then a thread would have no way of knowing whether the current
errno value really corresponded to the last POSIX call it made, or whether some other thread had run in the
meantime and made a different POSIX call which updated the variable. To avoid such confusion errno is instead
implemented as a per-thread variable, and each thread has its own instance.
The support for per-thread data can be disabled via the configuration option CYGVAR_KERNEL_THREADS_DATA.
If enabled, each cyg_thread data structure holds a small array of words. The size of this array is determined by
the configuration option CYGNUM_KERNEL_THREADS_DATA_MAX. When a thread is created the array is filled with
zeroes.
If an application needs to use per-thread data then it needs an index into this array which has not yet been allocated
to other code. This index can be obtained by calling cyg_thread_new_data_index, and then used in subsequent
calls to cyg_thread_get_data. Typically indices are allocated during system initialization and stored in static
variables. If for some reason a slot in the array is no longer required and can be re-used then it can be released by
calling cyg_thread_free_data_index.
The current per-thread data in a given slot can be obtained using cyg_thread_get_data. This implicitly operates
on the current thread, and its single argument should be an index as returned by cyg_thread_new_data_index.
The per-thread data can be updated using cyg_thread_set_data. If a particular item of per-thread data is needed
repeatedly then cyg_thread_get_data_ptr can be used to obtain the address of the data, and indirecting through
this pointer allows the data to be examined and updated efficiently.
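A sketch of the typical pattern follows: allocate an index once during initialization, then read and update the current thread's slot. The variable and function names are illustrative, and the index type cyg_ucount32 is an assumption consistent with the rest of the kernel API:
#include <cyg/kernel/kapi.h>

static cyg_ucount32 my_index;          /* slot allocated for this package */

void my_package_init(void)
{
    my_index = cyg_thread_new_data_index();
}

void remember_value(cyg_addrword_t value)
{
    cyg_thread_set_data(my_index, value);   /* operates on the current thread */
}

cyg_addrword_t recall_value(void)
{
    return cyg_thread_get_data(my_index);
}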
Some packages, for example the error and POSIX packages, have pre-allocated slots in the array of per-thread
data. These slots should not normally be used by application code, and instead slots should be allocated during
initialization by a call to cyg_thread_new_data_index. If it is known that, for example, the configuration will
never include the POSIX compatibility package then application code may instead decide to re-use the slot allocated to that package, CYGNUM_KERNEL_THREADS_DATA_POSIX, but obviously this does involve a risk of strange
and subtle bugs if the application’s requirements ever change.
Valid contexts
Typically cyg_thread_new_data_index is only called during initialization, but may also be called at any time
in thread context. cyg_thread_free_data_index, if used at all, can also be called during initialization or from
thread context. cyg_thread_get_data, cyg_thread_get_data_ptr, and cyg_thread_set_data may only be
called from thread context because they implicitly operate on the current thread.
Thread destructors
Name
cyg_thread_add_destructor, cyg_thread_rem_destructor — Call functions on thread
termination
These functions are provided for cases when an application requires a function to be automatically called when a
thread exits. This is often useful when, for example, freeing up resources allocated by the thread.
This support must be enabled with the configuration option CYGPKG_KERNEL_THREADS_DESTRUCTORS. When enabled, you may register a function of type cyg_thread_destructor_fn to be called on thread termination using
cyg_thread_add_destructor. You may also provide it with a piece of arbitrary information in the data argu-
ment which will be passed to the destructor function fn when the thread terminates. If you no longer wish to call a
function previously registered with cyg_thread_add_destructor, you may call cyg_thread_rem_destructor
with the same parameters used to register the destructor function. Both these functions return true on success and
false on failure.
By default, thread destructors are per-thread, which means that registering a destructor function only registers
that function for the current thread. In other words, each thread has its own list of destructors. Alternatively you
may disable the configuration option CYGSEM_KERNEL_THREADS_DESTRUCTORS_PER_THREAD in which case any
registered destructors will be run when any threads exit. In other words, the thread destructor list is global and all
threads have the same destructors.
There is a limit to the number of destructors which may be registered, which can be controlled with the
CYGNUM_KERNEL_THREADS_DESTRUCTORS configuration option. Increasing this value will very slightly
increase the amount of memory in use, and when CYGSEM_KERNEL_THREADS_DESTRUCTORS_PER_THREAD
is enabled, the amount of memory used per thread will increase. When the limit has been reached,
cyg_thread_add_destructor will return false.
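A sketch of registering a destructor that frees a per-thread buffer when the thread exits; the destructor is assumed to match the cyg_thread_destructor_fn type, taking the data value supplied at registration, and the buffer handling is purely illustrative:
#include <cyg/kernel/kapi.h>
#include <stdlib.h>

static void free_buffer(cyg_addrword_t data)
{
    free((void*)data);                  /* release the resource on thread exit */
}

cyg_bool_t attach_buffer_cleanup(void* buffer)
{
    /* Returns false if the destructor table is already full. */
    return cyg_thread_add_destructor(free_buffer, (cyg_addrword_t)buffer);
}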
Valid contexts
When CYGSEM_KERNEL_THREADS_DESTRUCTORS_PER_THREAD is enabled, these functions must
only be called from a thread context as they implicitly operate on the current thread. When
CYGSEM_KERNEL_THREADS_DESTRUCTORS_PER_THREAD is disabled, these functions may be called from thread
or DSR context, or during system initialization.
Sometimes code attempts operations that are not legal on the current hardware, for example dividing by zero, or accessing data through a pointer that is not properly aligned. When this happens the hardware will raise an exception.
This is very similar to an interrupt, but happens synchronously with code execution rather than asynchronously and
hence can be tied to the thread that is currently running.
The exceptions that can be raised depend very much on the hardware, especially the processor. The corresponding
documentation should be consulted for more details. Alternatively the architectural HAL header file hal_intr.h,
or one of the variant or platform header files it includes, will contain appropriate definitions. The details of how to
handle exceptions, including whether or not it is possible to recover from them, also depend on the hardware.
Kernel exception support is controlled by the configuration option CYGPKG_KERNEL_EXCEPTIONS, which is enabled by default. If an application has been exhaustively tested and is trusted never to raise
a hardware exception then this option can be disabled and code and data sizes will be reduced somewhat.
If exceptions are left enabled then the system will provide default handlers for the various exceptions, but
these do nothing. Even the specific type of exception is ignored, so there is no point in attempting to
decode this and distinguish between say a divide-by-zero and an unaligned access. If the application installs
its own handlers and wants details of the specific exception being raised then the configuration option
CYGSEM_KERNEL_EXCEPTIONS_DECODE has to be enabled.
An alternative handler can be installed using cyg_exception_set_handler. This requires a code for the exception, a function pointer for the new exception handler, and a parameter to be passed to this handler. Details of the
previously installed exception handler will be returned via the remaining two arguments, allowing that handler to
be reinstated, or null pointers can be used if this information is of no interest. An exception handling function
should take the following form:
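A sketch of such a handler is shown below; the argument order (registration data, exception code, hardware-specific info) follows the description in the next paragraph, and the type names are assumed to match those used elsewhere in the kernel API:
void
my_exception_handler(cyg_addrword_t data, cyg_code_t exception, cyg_addrword_t info)
{
    ...
}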
The data argument corresponds to the new_data parameter supplied to cyg_exception_set_handler. The
exception code is provided as well, in case a single handler is expected to support multiple exceptions. The info
argument will depend on the hardware and on the specific exception.
cyg_exception_clear_handler can be used to restore the default handler, if desired. It is also possible for
software to raise an exception and cause the current handler to be invoked, but generally this is useful only for
testing.
By default the system maintains a single set of global exception handlers. However, since exceptions
occur synchronously it is sometimes useful to handle them on a per-thread basis, and have a different
set of handlers for each thread. This behaviour can be obtained by disabling the configuration
option CYGSEM_KERNEL_EXCEPTIONS_GLOBAL. If per-thread exception handlers are being used then
cyg_exception_set_handler and cyg_exception_clear_handler apply to the current thread. Otherwise
they apply to the global set of handlers.
Caution
In the current implementation cyg_exception_call_handler can only be used on the current
thread. There is no support for delivering an exception to another thread.
Note: Exceptions at the eCos kernel level refer specifically to hardware-related events such as unaligned
accesses to memory or division by zero. There is no relation with other concepts that are also known as
exceptions, for example the throw and catch facilities associated with C++.
Valid contexts
If the system is configured with a single set of global exception handlers then cyg_exception_set_handler
and cyg_exception_clear_handler may be called during initialization or from thread context. If instead per-thread exception handlers are being used then it is not possible to install new handlers during initialization because the functions operate implicitly on the current thread, so they can only be called from thread context.
cyg_exception_call_handler should only be called from thread context.
Kernel counters can be used to keep track of how many times a particular event has occurred. Usually this event is
an external signal of some sort. The most common use of counters is in the implementation of clocks, but they can
be useful with other event sources as well. Application code can attach alarms to counters, causing a function to be
called when some number of events have occurred.
A new counter is initialized by a call to cyg_counter_create. The first argument is used to return a handle to the
new counter which can be used for subsequent operations. The second argument allows the application to provide
the memory needed for the object, thus eliminating any need for dynamic memory allocation within the kernel. If
a counter is no longer required and does not have any alarms attached then cyg_counter_delete can be used to
release the resources, allowing the cyg_counter data structure to be re-used.
Initializing a counter does not automatically attach it to any source of events. Instead some other code needs to
call cyg_counter_tick whenever a suitable event occurs, which will cause the counter to be incremented
and may cause alarms to trigger. The current value associated with the counter can be retrieved using
cyg_counter_current_value and modified with cyg_counter_set_value. Typically the latter function is
only used during initialization, for example to set a clock to wallclock time, but it can be used to reset a counter if
necessary. However cyg_counter_set_value will never trigger any alarms. A newly initialized counter has a
starting value of 0.
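A sketch of creating a counter and ticking it from device code follows. The DSR function shown is hypothetical and only illustrates where cyg_counter_tick would normally be called:
#include <cyg/kernel/kapi.h>

static cyg_counter  event_counter_obj;   /* storage for the counter object */
static cyg_handle_t event_counter;

void events_init(void)
{
    cyg_counter_create(&event_counter, &event_counter_obj);
}

/* Called from the device's DSR whenever the external event occurs. */
void event_dsr_body(void)
{
    cyg_counter_tick(event_counter);     /* may cause attached alarms to trigger */
}

cyg_tick_count_t events_so_far(void)
{
    return cyg_counter_current_value(event_counter);
}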
There are two alternative implementations of counters, selected by configuration options. The default, CYGIMP_KERNEL_COUNTERS_SINGLE_LIST, stores all alarms attached to the counter on a single list. This is
simple and usually efficient. However when a tick occurs the kernel code has to traverse this list, typically at DSR
level, so if there are a significant number of alarms attached to a single counter this will affect the system’s
dispatch latency. The alternative implementation, CYGIMP_KERNEL_COUNTERS_MULTI_LIST, stores each alarm in
one of an array of lists such that at most one of the lists needs to be searched per clock tick. This involves extra
code and data, but can improve real-time responsiveness in some circumstances. Another configuration option that
is relevant here is CYGIMP_KERNEL_COUNTERS_SORT_LIST, which is disabled by default. This provides a trade-off
between doing work whenever a new alarm is added to a counter and doing work whenever a tick occurs. It is
application-dependent which of these is more appropriate.
Valid contexts
cyg_counter_create is typically called during system initialization but may also be called in thread
context. Similarly cyg_counter_delete may be called during initialization or in thread context.
cyg_counter_current_value, cyg_counter_set_value and cyg_counter_tick may be called during
initialization or from thread or DSR context. In fact, cyg_counter_tick is usually called from inside a DSR in
response to an external event of some sort.
Clocks
Name
cyg_clock_create, cyg_clock_delete, cyg_clock_to_counter,
cyg_clock_set_resolution, cyg_clock_get_resolution, cyg_real_time_clock,
cyg_current_time — Provide system clocks
In the eCos kernel clock objects are a special form of counter objects. They are attached to a specific type of
hardware, clocks that generate ticks at very specific time intervals, whereas counters can be used with any event
source.
In a default configuration the kernel provides a single clock instance, the real-time clock. This gets used for timeslicing and for operations that involve a timeout, for example cyg_semaphore_timed_wait. If this functionality is
not required it can be removed from the system using the configuration option CYGVAR_KERNEL_COUNTERS_CLOCK.
Otherwise the real-time clock can be accessed by a call to cyg_real_time_clock, allowing applications to attach
alarms, and the current counter value can be obtained using cyg_current_time.
Applications can create and destroy additional clocks if desired, using cyg_clock_create and
cyg_clock_delete. The first argument to cyg_clock_create specifies the resolution this clock will run at. The
second argument is used to return a handle for this clock object, and the third argument provides the kernel with
the memory needed to hold this object. This clock will not actually tick by itself. Instead it is the responsibility of
application code to initialize a suitable hardware timer to generate interrupts at the appropriate frequency, install
an interrupt handler for this, and call cyg_counter_tick from inside the DSR. Associated with each clock is a
kernel counter, a handle for which can be obtained using cyg_clock_to_counter.
Clock Resolutions and Ticks
At the kernel level all clock-related operations including delays, timeouts and alarms work in units of clock ticks,
rather than in units of seconds or milliseconds. If the calling code, whether the application or some other package,
needs to operate using units such as milliseconds then it has to convert from these units to clock ticks.
The main reason for this is that it accurately reflects the hardware: calling something like nanosleep with a delay
of ten nanoseconds will not work as intended on any real hardware because timer interrupts simply will not happen
that frequently; instead calling cyg_thread_delay with the equivalent delay of 0 ticks gives a much clearer
indication that the application is attempting something inappropriate for the target hardware. Similarly, passing a
delay of five ticks to cyg_thread_delay makes it fairly obvious that the current thread will be suspended for
somewhere between four and five clock periods, as opposed to passing 50000000 to nanosleep which suggests a
granularity that is not actually provided.
A secondary reason is that conversion between clock ticks and units such as milliseconds can be somewhat expensive, and whenever possible should be done at compile-time or by the application developer rather than at run-time.
This saves code size and cpu cycles.
The information needed to perform these conversions is the clock resolution. This is a structure with two fields,
a dividend and a divisor, and specifies the number of nanoseconds between clock ticks. For example a clock
that runs at 100Hz will have 10 milliseconds between clock ticks, or 10000000 nanoseconds. The ratio between the
resolution’s dividend and divisor will therefore be 10000000 to 1, and typical values for these might be 1000000000
and 100. If the clock runs at a different frequency, say 60Hz, the numbers could be 1000000000 and 60 respectively.
Given a delay in nanoseconds, this can be converted to clock ticks by multiplying with the divisor and then
dividing by the dividend. For example a delay of 50 milliseconds corresponds to 50000000 nanoseconds, and with
a clock frequency of 100Hz this can be converted to ((50000000 * 100) / 1000000000) = 5 clock ticks. Given the
large numbers involved this arithmetic normally has to be done using 64-bit precision and the long long data type,
but allows code to run on hardware with unusual clock frequencies.
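A sketch of this conversion is shown below, using the resolution structure described above (dividend and divisor fields) and performing the arithmetic with 64-bit precision; the field names are assumed to match the structure returned by cyg_clock_get_resolution:
#include <cyg/kernel/kapi.h>

/* Convert a delay in nanoseconds to ticks of the given clock:
   ticks = (ns * divisor) / dividend, computed in 64 bits. */
cyg_tick_count_t ns_to_ticks(cyg_handle_t clock, unsigned long long ns)
{
    cyg_resolution_t res = cyg_clock_get_resolution(clock);
    return (cyg_tick_count_t)((ns * res.divisor) / res.dividend);
}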
The default frequency for the real-time clock on any platform is usually about 100Hz, but platform-specific documentation should be consulted for this information. Usually it is possible to override this default by configuration
options, but again this depends on the capabilities of the underlying hardware. The resolution for any clock can
be obtained using cyg_clock_get_resolution. For clocks created by application code, there is also a function
cyg_clock_set_resolution. This does not affect the underlying hardware timer in any way, it merely updates
the information that will be returned in subsequent calls to cyg_clock_get_resolution: changing the actual
underlying clock frequency will require appropriate manipulation of the timer hardware.
Valid contexts
cyg_clock_create is usually only called during system initialization (if at all), but may also be called from
thread context. The same applies to cyg_clock_delete. The remaining functions may be called during initialization, from thread context, or from DSR context, although it should be noted that there is no locking between
cyg_clock_get_resolution and cyg_clock_set_resolution so theoretically it is possible that the former
returns an inconsistent data structure.
Alarms
Name
cyg_alarm_create, cyg_alarm_delete, cyg_alarm_initialize, cyg_alarm_enable,
cyg_alarm_disable — Run an alarm function when a number of events have occurred
Kernel alarms are used together with counters and allow for action to be taken when a certain number of events
have occurred. If the counter is associated with a clock then the alarm action happens when the appropriate number
of clock ticks have occurred, in other words after a certain period of time.
Setting up an alarm involves a two-step process. First the alarm must be created with a call to cyg_alarm_create.
This takes five arguments. The first identifies the counter to which the alarm should be attached. If the alarm should
be attached to the system’s real-time clock then cyg_real_time_clock and cyg_clock_to_counter can be
used to get hold of the appropriate handle. The next two arguments specify the action to be taken when the alarm
is triggered, in the form of a function pointer and some data. This function should take the form:
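A sketch of such an alarm function is given here; the first argument is assumed to be the handle of the alarm that triggered, and data corresponds to the registration data described in the next paragraph:
void
my_alarm_handler(cyg_handle_t alarm, cyg_addrword_t data)
{
    ...
}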
The data argument passed to the alarm function corresponds to the third argument passed to cyg_alarm_create.
The fourth argument to cyg_alarm_create is used to return a handle to the newly-created alarm object, and the
final argument provides the memory needed for the alarm object and thus avoids any need for dynamic memory
allocation within the kernel.
Once an alarm has been created a further call to cyg_alarm_initialize is needed to activate it. The first argument specifies the alarm. The second argument indicates the number of events, for example clock ticks, that need
to occur before the alarm triggers. If the third argument is 0 then the alarm will only trigger once. A non-zero value
specifies that the alarm should trigger repeatedly, with an interval of the specified number of events.
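A sketch of attaching a periodic alarm to the real-time clock is shown below, triggering every 100 ticks (roughly once a second on a 100Hz clock). The handler is defined locally for completeness, and the trigger argument is given here as an absolute counter value 100 ticks in the future:
#include <cyg/kernel/kapi.h>

static cyg_alarm    alarm_obj;            /* storage for the alarm object */
static cyg_handle_t alarm_handle;

static void periodic_handler(cyg_handle_t alarm, cyg_addrword_t data)
{
    /* take whatever action the application requires */
}

void start_periodic_alarm(void)
{
    cyg_handle_t counter;

    /* The real-time clock's counter is the event source for this alarm. */
    cyg_clock_to_counter(cyg_real_time_clock(), &counter);

    cyg_alarm_create(counter,
                     periodic_handler,    /* function to run on trigger */
                     0,                   /* data passed to the handler */
                     &alarm_handle,
                     &alarm_obj);

    /* First trigger 100 ticks from now, then repeatedly every 100 ticks. */
    cyg_alarm_initialize(alarm_handle, cyg_current_time() + 100, 100);
}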
Alarms can be temporarily disabled and reenabled using cyg_alarm_disable and cyg_alarm_enable. Alternatively another call to cyg_alarm_initialize can be used to modify the behaviour of an existing alarm. If an
alarm is no longer required then the associated resources can be released using cyg_alarm_delete.
The alarm function is invoked when a counter tick occurs, in other words when there is a call to
cyg_counter_tick, and will happen in the same context. If the alarm is associated with the system’s real-time
clock then this will be DSR context, following a clock interrupt. If the alarm is associated with some other
application-specific counter then the details will depend on how that counter is updated.
If two or more alarms are registered for precisely the same countertick, theorder ofexecution of the alarm functions
is unspecified.
Valid contexts
cyg_alarm_create is typically called during system initialization but may
also be called in thread context. The same applies to cyg_alarm_delete. cyg_alarm_initialize,
cyg_alarm_disable and cyg_alarm_enable may be called during initialization or from thread or DSR
context, but cyg_alarm_enable and cyg_alarm_initialize may be expensive operations and should only be
called when necessary.
The purpose of mutexes is to let threads share resources safely. If two or more threads attempt to manipulate a data
structure with no locking between them then the system may run for quite some time without apparent problems,
but sooner or later the data structure will become inconsistent and the application will start behaving strangely and
is quite likely to crash. The same can apply even when manipulating a single variable or some other resource. For
example, consider:
static volatile int counter = 0;
void
process_event(void)
{
...
counter++;
}
Assume that after a certain period of time counter has a value of 42, and two threads A and B running at the
same priority call process_event. Typically thread A will read the value of counter into a register, increment
this register to 43, and write this updated value back to memory. Thread B will do the same, so usually counter
will end up with a value of 44. However if thread A is timesliced after reading the old value 42 but before writing
back 43, thread B will still read back the old value and will also write back 43. The net result is that the counter
only gets incremented once, not twice, which depending on the application may prove disastrous.
Sections of code like the above which involve manipulating shared data are generally known as critical regions.
Code should claim a lock before entering a critical region and release the lock when leaving. Mutexes provide an
appropriate synchronization primitive for this.
static volatile int counter = 0;
static cyg_mutex_t lock;
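/* A sketch (not part of the original listing) of the locked version of
   process_event: the critical region around the shared counter is now
   protected by the mutex, and the ... stands for the rest of the function. */
void
process_event(void)
{
    ...
    cyg_mutex_lock(&lock);       // enter the critical region
    counter++;
    cyg_mutex_unlock(&lock);     // leave the critical region
}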
A mutex must be initialized before it can be used, by calling cyg_mutex_init. This takes a pointer to a
cyg_mutex_t data structure which is typically statically allocated, and may be part of a larger data structure. If a
mutex is no longer required and there are no threads waiting on it then cyg_mutex_destroy can be used.
The main functions for using a mutex are cyg_mutex_lock and cyg_mutex_unlock. In normal operation
cyg_mutex_lock will return success after claiming the mutex lock, blocking if another thread currently owns the
mutex. However the lock operation may fail if other code calls cyg_mutex_release or cyg_thread_release,
so if these functions may get used then it is important to check the return value. The current owner of a mutex
should call cyg_mutex_unlock when a lock is no longer required. This operation must be performed by the
owner, not by another thread.
cyg_mutex_trylock is a variant of cyg_mutex_lock that will always return immediately, returning success or
failure as appropriate. This function is rarely useful. Typical code locks a mutex just before entering a critical
region, so if the lock cannot be claimed then there may be nothing else for the current thread to do. Use of this
function may also cause a form of priority inversion if the owner runs at a lower priority, because the
priority inheritance code will not be triggered. Instead the current thread continues running, preventing the owner
from getting any cpu time, completing the critical region, and releasing the mutex.
cyg_mutex_release can be used to wake up all threads that are currently blocked inside a call to
cyg_mutex_lock for a specific mutex. These lock calls will return failure. The current mutex owner is not
affected.
Priority Inversion
The use of mutexes gives rise to a problem known as priority inversion. In a typical scenario this requires three
threads A, B, and C, running at high, medium and low priority respectively. Thread A and thread B are temporarily
blocked waiting for some event, so thread C gets a chance to run, needs to enter a critical region, and locks a mutex.
At this point threads A and B are woken up - the exact order does not matter. Thread A needs to claim the same
mutex but has to wait until C has left the critical region and can release the mutex. Meanwhile thread B works on
something completely different and can continue running without problems. Because thread C is running at a lower
priority than B it will not get a chance to run until B blocks for some reason, and hence thread A cannot run either.
The overall effect is that a high-priority thread A cannot proceed because of a lower priority thread B, and priority
inversion has occurred.
In simple applications it may be possible to arrange the code such that priority inversion cannot occur, for example
by ensuring that a given mutex is never shared by threads running at different priority levels. However this may not
always be possible even at the application level. In addition mutexes may be used internally by underlying code,
for example the memory allocation package, so careful analysis of the whole system would be needed to be sure
that priority inversion cannot occur. Instead it is common practice to use one of two techniques: priority ceilings
and priority inheritance.
Priority ceilings involve associating a priority with each mutex. Usually this will match the highest priority thread
that will ever lock the mutex. When a thread running at a lower priority makes a successful call to cyg_mutex_lock
or cyg_mutex_trylock its priority will be boosted to that of the mutex. For example, given the previous example
the priority associated with the mutex would be that of thread A, so for as long as it owns the mutex thread C will
run in preference to thread B. When C releases the mutex its priority drops to the normal value again, allowing A
to run and claim the mutex. Setting the priority for a mutex involves a call to cyg_mutex_set_ceiling, which
is typically called during initialization. It is possible to change the ceiling dynamically but this will only affect
subsequent lock operations, not the current owner of the mutex.
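A small sketch of setting a ceiling during initialization follows; it assumes the ceiling protocol is enabled in the configuration and uses an illustrative priority value:
#include <cyg/kernel/kapi.h>

static cyg_mutex_t shared_lock;

void locks_init(void)
{
    cyg_mutex_init(&shared_lock);
    /* Ceiling matches the highest-priority thread that will ever lock it. */
    cyg_mutex_set_ceiling(&shared_lock, 5);
}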
Priority ceilings are very suitable for simple applications, where for every thread in the system it is possible to
work out which mutexes will be accessed. For more complicated applications this may prove difficult, especially
if thread priorities change at run-time. An additional problem occurs for any mutexes outside the application,
for example used internally within eCos packages. A typical eCos package will be unaware of the details of the
various threads in the system, so it will have no way of setting suitable ceilings for its internal mutexes. If those
mutexesare not exported to application code then using priority ceilings may not be viable. The kernel does provide
a configuration option CYGSEM_KERNEL_SYNCH_MUTEX_PRIORITY_INVERSION_PROTOCOL_DEFAULT_PRIORITY
that can be used to set the default priority ceiling for all mutexes, which may prove sufficient.
The alternative approach is to use priority inheritance: if a thread calls cyg_mutex_lock for a mutex that is currently owned by a lower-priority thread, then the owner will have its priority raised to that of the current thread.
Often this is more efficient than priority ceilings because priority boosting only happens when necessary, not for
every lock operation, and the required priority is determined at run-time rather than by static analysis. However
there are complications when multiple threads running at different priorities try to lock a single mutex, or when
the current owner of a mutex then tries to lock additional mutexes, and this makes the implementation significantly
more complicated than priority ceilings.
Usually only one of CYGSEM_KERNEL_SYNCH_MUTEX_PRIORITY_INVERSION_PROTOCOL_INHERIT or
CYGSEM_KERNEL_SYNCH_MUTEX_PRIORITY_INVERSION_PROTOCOL_CEILING will be selected, so that one of
the two protocols is available for all mutexes. It is possible to select multiple protocols, so that some mutexes can
have priority ceilings while others use priority inheritance or no priority inversion protection at all. Obviously
this flexibility will add to the code size and to the cost of mutex operations. The default for all mutexes will
be controlled by CYGSEM_KERNEL_SYNCH_MUTEX_PRIORITY_INVERSION_PROTOCOL_DEFAULT, and can be
changed at run-time using cyg_mutex_set_protocol.
Priority inversion problems can also occur with other synchronization primitives such as semaphores. For example
there could be a situation where a high-priority thread A is waiting on a semaphore, a low-priority thread C needs
to do just a little bit more work before posting the semaphore, but a medium priority thread B is running and
preventing C from making progress. However a semaphore does not have the concept of an owner, so there is no
way for the system to know that it is thread C which would next post to the semaphore. Hence there is no way for
the system to boost the priority of C automatically and prevent the priority inversion. Instead situations like this
have to be detected by application developers and appropriate precautions have to be taken, for example making
sure that all the threads run at suitable priorities at all times.
Warning
The current implementation of priority inheritance within the eCos kernel does not handle
certain exceptional circumstances completely correctly. Problems will only arise if a thread
owns one mutex, then attempts to claim another mutex, and there are other threads attempting to lock these same mutexes. Although the system will continue running, the current
owners of the various mutexes involved may not run at the priority they should. This situation
never arises in typical code because a mutex will only be locked for a small critical region,
and there is no need to manipulate other shared resources inside this region. A more complicated implementation of priority inheritance is possible but would add significant overhead
and certain operations would no longer be deterministic.
Warning
Support for priority ceilings and priority inheritance is not implemented for all schedulers. In
particular neither priority ceilings nor priority inheritance are currently available for the bitmap
scheduler.
Alternatives
In nearly all circumstances, if two or more threads need to share some data then protecting this data with a mutex
is the correct thing to do. Mutexes are the only primitive that combine a locking mechanism and protection against
priority inversion problems. However this functionality is achieved at a cost, and in exceptional circumstances such
as an application’s most critical inner loop it may be desirable to use some other means of locking.
When a critical region is very small it is possible to lock the scheduler, thus ensuring that no other
thread can run until the scheduler is unlocked again. This is achieved with calls to cyg_scheduler_lock
and cyg_scheduler_unlock. If the critical region is sufficiently small then this can actually improve both
performance and dispatch latency because cyg_mutex_lock also locks the scheduler for a brief period of time.
This approach will not work on SMP systems because another thread may already be running on a different
processor and accessing the critical region.
Another way of avoiding the use of mutexes is to make sure that all threads that access a particular critical region
run at the same priority and configure the system with timeslicing disabled (CYGSEM_KERNEL_SCHED_TIMESLICE).
Without timeslicing a thread can only be preempted by a higher-priority one, or if it performs some operation that
can block. This approach requires that none of the operations in the critical region can block, so for example it is
not legal to call cyg_semaphore_wait. It is also vulnerable to any changes in the configuration or to the various
thread priorities: any such changes may now have unexpected side effects. It will not work on SMP systems.
Recursive Mutexes
The implementation of mutexes within the eCos kernel does not support recursive locks. If a thread has locked a
mutex and then attempts to lock the mutex again, typically as a result of some recursive call in a complicated call
graph, then either an assertion failure will be reported or the thread will deadlock. This behaviour is deliberate.
When a thread has just locked a mutex associated with some data structure, it can assume that that data structure is
in a consistent state. Before unlocking the mutex again it must ensure that the data structure is again in a consistent
state. Recursive mutexes allow a thread to make arbitrary changes to a data structure, then in a recursive call lock
the mutex again while the data structure is still inconsistent. The net result is that code can no longer make any
assumptions about data structure consistency, which defeats the purpose of using mutexes.
Valid contexts
cyg_mutex_init, cyg_mutex_set_ceiling and cyg_mutex_set_protocol are normally called during ini-
tialization but may also be called from thread context. The remaining functions should only be called from thread
context. Mutexes serve as a mutual exclusion mechanism between threads, and cannot be used to synchronize
between threads and the interrupt handling subsystem. If a critical region is shared between a thread and a DSR
then it must be protected using cyg_scheduler_lock and cyg_scheduler_unlock. If a critical region is shared
between a thread and an ISR, it must be protected by disabling or masking interrupts. Obviously these operations
must be used with care because they can affect dispatch and interrupt latencies.
Condition variables are used in conjunction with mutexes to implement long-term waits for some condition to
become true. For example consider a set of functions that control access to a pool of resources:
cyg_mutex_t res_lock;
res_t res_pool[RES_MAX];
int res_count = RES_MAX;
void res_init(void)
{
cyg_mutex_init(&res_lock);
<fill pool with resources>
}
res_t res_allocate(void)
{
    res_t res;

    cyg_mutex_lock(&res_lock);          // lock the mutex

    if( res_count == 0 )                // check for free resource
        res = RES_NONE;                 // return RES_NONE if none
    else
    {
        res_count--;                    // allocate a resource
        res = res_pool[res_count];
    }

    cyg_mutex_unlock(&res_lock);        // unlock the mutex

    return res;
}
void res_free(res_t res)
{
    cyg_mutex_lock(&res_lock);          // lock the mutex

    res_pool[res_count] = res;          // free the resource
    res_count++;

    cyg_mutex_unlock(&res_lock);        // unlock the mutex
}
These routines use the variable res_count to keep track of the resources available. If there are none then
res_allocate returns RES_NONE, which the caller must check for and take appropriate error handling actions.
Now suppose that we do not want to return RES_NONE when there are no resources, but want to wait for one to
become available. This is where a condition variable can be used:
cyg_mutex_t res_lock;
cyg_cond_t res_wait;
res_t res_pool[RES_MAX];
int res_count = RES_MAX;
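/* The initialization code and the start of res_allocate are not reproduced
   in this excerpt; the following reconstruction is a sketch based on the
   description below: the condition variable is created against the mutex,
   and cyg_cond_wait is called inside a while loop. */

void res_init(void)
{
    cyg_mutex_init(&res_lock);
    cyg_cond_init(&res_wait, &res_lock);
    <fill pool with resources>
}

res_t res_allocate(void)
{
    res_t res;

    cyg_mutex_lock(&res_lock);          // lock the mutex

    while( res_count == 0 )             // wait until a resource is freed
        cyg_cond_wait(&res_wait);       // atomically unlock the mutex and sleep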
    res_count--;                        // allocate a resource
    res = res_pool[res_count];

    cyg_mutex_unlock(&res_lock);        // unlock the mutex

    return res;
}
void res_free(res_t res)
{
    cyg_mutex_lock(&res_lock);          // lock the mutex

    res_pool[res_count] = res;          // free the resource
    res_count++;

    cyg_cond_signal(&res_wait);         // wake up any waiting allocators

    cyg_mutex_unlock(&res_lock);        // unlock the mutex
}
In this version of the code, when res_allocate detects that there are no resources it calls cyg_cond_wait.
This does two things: it unlocks the mutex, and puts the calling thread to sleep on the condition variable. When
res_free is eventually called, it puts a resource back into the pool and calls cyg_cond_signal to wake up any
thread waiting on the condition variable. When the waiting thread eventually gets to run again, it will re-lock the
mutex before returning from cyg_cond_wait.
There are two important things to note about the way in which this code works. The first is that the mutex unlock
and wait in cyg_cond_wait are atomic: no other thread can run between the unlock and the wait. If this were not
the case then a call to res_free by that thread would release the resource but the call to cyg_cond_signal would
be lost, and the first thread would end up waiting when there were resources available.
The second feature is that the call to cyg_cond_wait is in a while loop and not a simple if statement. This is
because of the need to re-lock the mutex in cyg_cond_wait when the signalled thread reawakens. If there are
other threads already queued to claim the lock then this thread must wait. Depending on the scheduler and the
queue order, many other threads may have entered the critical section before this one gets to run. So the condition
that it was waiting for may have been rendered false. Using a loop around all condition variable wait operations is
the only way to guarantee that the condition being waited for is still true after waiting.
Before a condition variable can be used it must be initialized with a call to cyg_cond_init. This requires two
arguments, memory for the data structure and a pointer to an existing mutex. This mutex will not be initialized
by cyg_cond_init, instead a separate call to cyg_mutex_init is required. If a condition variable is no longer
required and there are no threads waiting on it then cyg_cond_destroy can be used.
When a thread needs to wait for a condition to be satisfied it can call cyg_cond_wait. The thread must have
already locked the mutex that was specified in the cyg_cond_init call. This mutex will be unlocked and the
current thread will be suspended in an atomic operation. When some other thread performs a signal or broadcast
operation the current thread will be woken up and automatically reclaim ownership of the mutex again, allowing it
to examine global state and determine whether or not the condition is now satisfied. The kernel supplies a variant of
this function, cyg_cond_timed_wait, which can be used to wait on the condition variable or until some number of
clock ticks have occurred. The mutex will always be reclaimed before cyg_cond_timed_wait returns, regardless
of whether it was a result of a signal operation or a timeout.
There is no cyg_cond_trywait function because this would not serve any purpose. If a thread has locked the
mutex and determined that the condition is satisfied, it can just release the mutex and return. There is no need to
perform any operation on the condition variable.
When a thread changes shared state that may affect some other thread blocked on a condition variable, it should
call either cyg_cond_signal or cyg_cond_broadcast. These calls do not require ownership of the mutex, but
usually the mutex will have been claimed before updating the shared state. A signal operation only wakes up the
first thread that is waiting on the condition variable, while a broadcast wakes up all the threads. If there are no
threads waiting on the condition variable at the time, then the signal or broadcast will have no effect: past signals
are not counted up or remembered in any way. Typically a signal should be used when all threads will check the
same condition and at most one thread can continue running. A broadcast should be used if threads check slightly
different conditions, or if the change to the global state might allow multiple threads to proceed.
Valid contexts
cyg_cond_init is typically called during system initialization but may also be called in thread context. The
same applies to cyg_cond_destroy. cyg_cond_wait and cyg_cond_timed_wait may only be called from thread
context since they may block. cyg_cond_signal and cyg_cond_broadcast may be called from thread or DSR
context.
void cyg_semaphore_init(cyg_sem_t* sem, cyg_count32 val);
void cyg_semaphore_destroy(cyg_sem_t* sem);
cyg_bool_t cyg_semaphore_wait(cyg_sem_t* sem);
cyg_bool_t cyg_semaphore_timed_wait(cyg_sem_t* sem, cyg_tick_count_t abstime);
cyg_bool_t cyg_semaphore_trywait(cyg_sem_t* sem);
void cyg_semaphore_post(cyg_sem_t* sem);
void cyg_semaphore_peek(cyg_sem_t* sem, cyg_count32* val);
Description
Counting semaphores are a synchronization primitive that allow threads to wait until an event has occurred. The
event may be generated by a producer thread, or by a DSR in response to a hardware interrupt. Associated with
each semaphore is an integer counter that keeps track of the number of events that have not yet been processed. If
this counter is zero, an attempt by a consumer thread to wait on the semaphore will block until some other thread
or a DSR posts a new event to the semaphore. If the counter is greater than zero then an attempt to wait on the
semaphore will consume one event, in other words decrement the counter, and return immediately. Posting to a
semaphore will wake up the first thread that is currently waiting, which will then resume inside the semaphore wait
operation and decrement the counter again.
Another use of semaphores is for certain forms of resource management. The counter would correspond to how
many of a certain type of resource are currently available, with threads waiting on the semaphore to claim a
resource and posting to release the resource again. In practice condition variables are usually much better suited
for operations like this.
cyg_semaphore_init is used to initialize a semaphore. It takes two arguments, a pointer to a cyg_sem_t structure
and an initial value for the counter. Note that semaphore operations, unlike some other parts of the kernel API, use
pointers to data structures rather than handles. This makes it easier to embed semaphores in a larger data structure.
The initial counter value can be any number, zero, positive or negative, but typically a value of zero is used to
indicate that no events have occurred yet.
cyg_semaphore_wait is used by a consumer thread to wait for an event. If the current counter is greater than 0,
in other words if the event has already occurred in the past, then the counter will be decremented and the call will
return immediately. Otherwise the current thread will be blocked until there is a cyg_semaphore_post call.
cyg_semaphore_post is called when an event has occurred. This increments the counter and wakes up
the first thread waiting on the semaphore (if any). Usually that thread will then continue running inside
cyg_semaphore_wait and decrement the counter again. However other scenarios are possible. For example the
thread calling cyg_semaphore_post may be running at high priority, some other thread running at medium
priority may be about to call cyg_semaphore_wait when it next gets a chance to run, and a low priority thread
may be waiting on the semaphore. What will happen is that the current high priority thread continues running until
it is descheduled for some reason, then the medium priority thread runs and its call to cyg_semaphore_wait
succeeds immediately, and later on the low priority thread runs again, discovers a counter value of 0, and blocks
until another event is posted. If there are multiple threads blocked on a semaphore then the configuration option
CYGIMP_KERNEL_SCHED_SORTED_QUEUES determines which one will be woken up by a post operation.
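As a rough illustration, a minimal sketch of the usual producer/consumer arrangement follows. The DSR name, the semaphore variable data_ready and the helper process_data() are hypothetical; only the cyg_semaphore_* calls are part of the kernel API.

#include <cyg/kernel/kapi.h>

static cyg_sem_t data_ready;

void app_init(void)
{
    cyg_semaphore_init(&data_ready, 0);     // no events have occurred yet
}

// Called from a DSR (or a producer thread) whenever new data arrives.
void data_dsr(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
    cyg_semaphore_post(&data_ready);        // record one more pending event
}

// Consumer thread entry point.
void consumer(cyg_addrword_t data)
{
    for (;;) {
        cyg_semaphore_wait(&data_ready);    // blocks until an event is pending
        process_data();                     // hypothetical helper
    }
}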
cyg_semaphore_wait returns a boolean. Normally it will block until it has successfully decremented the
counter, retrying as necessary, and return success. However the wait operation may be aborted by a call to
cyg_thread_release, and cyg_semaphore_wait will then return false.
cyg_semaphore_timed_wait is a variant of cyg_semaphore_wait. It can be used to wait until either an event
has occurred or a number of clock ticks have happened. The function returns success if the semaphore wait operation succeeded, or false if the operation timed out or was aborted by cyg_thread_release. If support for
the real-time clock has been removed from the current configuration then this function will not be available.
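Note that the second argument is an absolute tick count rather than a relative delay. A minimal sketch, assuming the data_ready semaphore from the sketch above and a timeout of 100 clock ticks from now, might be:

cyg_tick_count_t deadline = cyg_current_time() + 100;   // absolute deadline, 100 ticks from now
if (cyg_semaphore_timed_wait(&data_ready, deadline)) {
    process_data();                 // hypothetical helper: an event arrived in time
} else {
    handle_timeout();               // hypothetical helper: timed out or released
}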
cyg_semaphore_trywait is another variant which will always return immediately rather than block, again
returning success or failure.
cyg_semaphore_peek can be used to get hold of the current counter value. This function is rarely useful except
for debugging purposes since the counter value may change at any time if some other thread or a DSR performs a
semaphore operation.
Valid contexts
cyg_semaphore_init is normally called during initialization but may also be called from thread context.
cyg_semaphore_wait and cyg_semaphore_timed_wait may only be called from thread context because these
operations may block. cyg_semaphore_trywait, cyg_semaphore_post and cyg_semaphore_peek may be
called from thread or DSR context.
Mail boxes are a synchronization primitive. Like semaphores they can be used by a consumer thread to wait until a
certain event has occurred, but the producer also has the ability to transmit some data along with each event. This
data, the message, is normally a pointer to some data structure. It is stored in the mail box itself, so the producer
thread that generates the event and provides the data usually does not have to block until some consumer thread
is ready to receive the event. However a mail box will only have a finite capacity, typically ten slots. Even if the
system is balanced and events are typically consumed at least as fast as they are generated, a burst of events can
cause the mail box to fill up and the generating thread will block until space is available again. This behaviour is
very different from semaphores, where it is only necessary to maintain a counter and hence an overflow is unlikely.
Before a mail box can be used it must be created with a call to cyg_mbox_create. Each mail box has a unique
handle which will be returned via the first argument and which should be used for subsequent operations.
cyg_mbox_create also requires an area of memory for the kernel structure, which is provided by the cyg_mbox
second argument. If a mail box is no longer required then cyg_mbox_delete can be used. This will simply
discard any messages that remain posted.
The main function for waiting on a mail box is cyg_mbox_get. If there is a pending message because of a call
to cyg_mbox_put then cyg_mbox_get will return immediately with the message that was put into the mail box.
Otherwise this function will block until there is a put operation. Exceptionally the thread can instead be unblocked
by a call to cyg_thread_release, in which case cyg_mbox_get will return a null pointer. It is assumed that there
will never be a call to cyg_mbox_put with a null pointer, because it would not be possible to distinguish between
that and a release operation. Messages are always retrieved in the order in which they were put into the mail box,
and there is no support for messages with different priorities.
There are two variants of cyg_mbox_get. The first, cyg_mbox_timed_get will wait until either a message is
available or until a number of clock ticks have occurred. If no message is posted within the timeout then a null
pointer will be returned. cyg_mbox_tryget is a non-blocking operation which will either return a message if one
is available or a null pointer.
New messages are placed in the mail box by calling cyg_mbox_put or one of its variants. The main put function
takes two arguments, a handle to the mail box and a pointer for the message itself. If there is a spare slot in the
mail box then the new message can be placed there immediately, and if there is a waiting thread it will be woken
up so that it can receive the message. If the mail box is currently full then cyg_mbox_put will block until there
has been a get operation and a slot is available. The cyg_mbox_timed_put variant imposes a time limit on the
put operation, returning false if the operation cannot be completed within the specified number of clock ticks. The
cyg_mbox_tryput variant is non-blocking, returning false if there are no free slots available and the message
cannot be posted without blocking.
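A minimal sketch of a typical producer/consumer pair follows. The handle msg_mbox, the cyg_mbox storage, the struct request message type and the helpers are hypothetical; only the cyg_mbox_* calls come from the kernel API.

#include <cyg/kernel/kapi.h>

static cyg_handle_t msg_mbox;
static cyg_mbox     msg_mbox_data;

void app_init(void)
{
    cyg_mbox_create(&msg_mbox, &msg_mbox_data);
}

// Producer thread: may block if the mail box is full.
void send_request(struct request* req)
{
    cyg_mbox_put(msg_mbox, req);
}

// Consumer thread: blocks until a message is available.
void consumer(cyg_addrword_t data)
{
    for (;;) {
        struct request* req = (struct request*) cyg_mbox_get(msg_mbox);
        handle_request(req);                // hypothetical helper
    }
}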
There are a further four functions available for examining the current state of a mailbox. The results of these
functions must be used with care because usually the state can change at any time as a result of activity within
other threads, but they may prove occasionally useful during debugging or in special situations. cyg_mbox_peek
returns a count of the number of messages currently stored in the mail box. cyg_mbox_peek_item retrieves the
first message, but it remains in the mail box until a get operation is performed. cyg_mbox_waiting_to_get and
cyg_mbox_waiting_to_put indicate whether or not there are currently threads blocked in a get or a put operation
on a given mail box.
The number of slots in each mail box is controlled by a configuration option
CYGNUM_KERNEL_SYNCH_MBOX_QUEUE_SIZE, with a default value of 10. All mail boxes are the same size.
Valid contexts
cyg_mbox_create is typically called during system initialization but may also be called in thread context. The
remaining functions are normally called only during thread context. Of special note is cyg_mbox_put which can
be a blocking operation when the mail box is full, and which therefore must never be called from DSR context. It
is permitted to call cyg_mbox_tryput, cyg_mbox_tryget, and the information functions from DSR context but
this is rarely useful.
Event flags allow a consumer thread to wait for one of several different types of event to occur. Alternatively it is
possible to wait for some combination of events. The implementation is relatively straightforward. Each event flag
contains a 32-bit integer. Application code associates these bits with specific events, so for example bit 0 could
indicate that an I/O operation has completed and data is available, while bit 1 could indicate that the user has
pressed a start button. A producer thread or a DSR can cause one or more of the bits to be set, and a consumer
thread currently waiting for these bits will be woken up.
Unlike semaphores no attempt is made to keep track of event counts. It does not matter whether a given event
occurs once or multiple times before being consumed, the corresponding bit in the event flag will change only
once. However semaphores cannot easily be used to handle multiple event sources. Event flags can often be used
as an alternative to condition variables, although they cannot be used for completely arbitrary conditions and they
only support the equivalent of condition variable broadcasts, not signals.
Before an event flag can be used it must be initialized by a call to cyg_flag_init. This takes a pointer to a
cyg_flag_t data structure, which can be part of a larger structure. All 32 bits in the event flag will be set to 0,
indicating that no events have yet occurred. If an event flag is no longer required it can be cleaned up with a call to
cyg_flag_destroy, allowing the memory for the cyg_flag_t structure to be re-used.
A consumer thread can wait for one or more events by calling cyg_flag_wait. This takes three arguments. The
first identifies a particular event flag. The second is some combination of bits, indicating which events are of
interest. The final argument should be one of the following:
CYG_FLAG_WAITMODE_AND
The call to cyg_flag_wait will block until all the specified event bits are set. The event flag is not cleared
when the wait succeeds, in other words all the bits remain set.
CYG_FLAG_WAITMODE_OR
The call will block until at least one of the specified event bits is set. The event flag is not cleared on return.
CYG_FLAG_WAITMODE_AND | CYG_FLAG_WAITMODE_CLR
The call will block until all the specified event bits are set, and the entire event flag is cleared when the call
succeeds. Note that if this mode of operation is used then a single event flag cannot be used to store disjoint
sets of events, even though enough bits might be available. Instead each disjoint set of events requires its own
event flag.
CYG_FLAG_WAITMODE_OR | CYG_FLAG_WAITMODE_CLR
The call will block until at least one of the specified event bits is set, and the entire flag is cleared when the
call succeeds.
A call to cyg_flag_wait normally blocks until the required condition is satisfied. It will return the value of
the event flag at the point that the operation succeeded, which may be a superset of the requested events. If
cyg_thread_release is used to unblock a thread that is currently in a wait operation, the cyg_flag_wait call
will instead return 0.
cyg_flag_timed_wait is a variant of cyg_flag_wait which adds a timeout: the wait operation must succeed
within the specified number of ticks, or it will fail with a return value of 0. cyg_flag_poll is a non-blocking variant: if the wait operation can succeed immediately it acts like cyg_flag_wait, otherwise it returns immediately
with a value of 0.
cyg_flag_setbits is called by a producer thread or from inside a DSR when an event occurs. The specified bits
are or’d into the current event flag value. This may cause a waiting thread to be woken up, if its condition is now
satisfied.
cyg_flag_maskbits can be used to clear one or more bits in the event flag. This can be called from a producer
when a particular condition is no longer satisfied, for example when the user is no longer pressing a particular
button. It can also be used by a consumer thread if CYG_FLAG_WAITMODE_CLR was not used as part of the wait
operation, to indicate that some but not all of the active events have been consumed. If there are multiple consumer
threads performing wait operations without using CYG_FLAG_WAITMODE_CLR then typically some additional synchronization such as a mutex is needed to prevent multiple threads consuming the same event.
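A minimal sketch follows, assuming two hypothetical events: bit 0 for completed I/O and bit 1 for a button press. Only the cyg_flag_* calls and the CYG_FLAG_WAITMODE_* constants are part of the kernel API; the event bit assignments and helpers are illustrative.

#include <cyg/kernel/kapi.h>

#define EVENT_IO_DONE   0x01        // hypothetical event bits
#define EVENT_BUTTON    0x02

static cyg_flag_t events;

void app_init(void)
{
    cyg_flag_init(&events);
}

// Producer thread or DSR.
void io_complete_dsr(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
    cyg_flag_setbits(&events, EVENT_IO_DONE);
}

// Consumer thread: wait until either event occurs, clearing the flag on wakeup.
void consumer(cyg_addrword_t data)
{
    for (;;) {
        cyg_flag_value_t which =
            cyg_flag_wait(&events, EVENT_IO_DONE | EVENT_BUTTON,
                          CYG_FLAG_WAITMODE_OR | CYG_FLAG_WAITMODE_CLR);
        if (which & EVENT_IO_DONE)
            handle_io();            // hypothetical helper
        if (which & EVENT_BUTTON)
            handle_button();        // hypothetical helper
    }
}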
Two additional functions are provided to query the current state of an event flag. cyg_flag_peek returns the
current value of the event flag, and cyg_flag_waiting can be used to find out whether or not there are any
threads currently blocked on the event flag. Both of these functions must be used with care because other threads
may be operating on the event flag.
Valid contexts
cyg_flag_init is typically called during system initialization but may also be called in thread context. The same
applies to cyg_flag_destroy. cyg_flag_wait and cyg_flag_timed_wait may only be called from thread
context. The remaining functions may be called from thread or DSR context.
Spinlocks provide an additional synchronization primitive for applications running on SMP systems. They operate
at a lower level than the other primitives such as mutexes, and for most purposes the higher-level primitives should
be preferred. However there are some circumstances where a spinlock is appropriate, especially when interrupt
handlers and threads need to share access to hardware, and on SMP systems the kernel implementation itself
depends on spinlocks.
Essentially a spinlock is just a simple flag. When code tries to claim a spinlock it checks whether or not the flag
is already set. If not then the flag is set and the operation succeeds immediately. The exact implementation of this
is hardware-specific, for example it may use a test-and-set instruction to guarantee the desired behaviour even if
several processors try to access the spinlock at the exact same time. If it is not possible to claim a spinlock then
the current thread spins in a tight loop, repeatedly checking the flag until it is clear. This behaviour is very different
from other synchronization primitives such as mutexes, where contention would cause a thread to be suspended.
The assumption is that a spinlock will only be held for a very short time. If claiming a spinlock could cause the
current thread to be suspended then spinlocks could not be used inside interrupt handlers, which is not acceptable.
This does impose a constraint on any code which uses spinlocks. Specifically it is important that spinlocks are held
only for a short period of time, typically just some dozens of instructions. Otherwise another processor could be
blocked on the spinlock for a long time, unable to do any useful work. It is also important that a thread which
owns a spinlock does not get preempted because that might cause another processor to spin for a whole timeslice
period, or longer. One way of achieving this is to disable interrupts on the current processor, and the function
cyg_spinlock_spin_intsave is provided to facilitate this.
Spinlocks should not be used on single-processor systems. Consider a high priority thread which attempts to claim
a spinlock already held by a lower priority thread: it will just loop forever and the lower priority thread will never
get another chance to run and release the spinlock. Even if the two threads were running at the same priority, the
one attempting to claim the spinlock would spin until it was timesliced and a lot of cpu time would be wasted. If an
interrupt handler tried to claim a spinlock owned by a thread, the interrupt handler would loop forever. Therefore
spinlocks are only appropriate for SMP systems where the current owner of a spinlock can continue running on a
different processor.
Before a spinlock can be used it must be initialized by a call to cyg_spinlock_init. This takes two arguments,
a pointer to a cyg_spinlock_t data structure, and a flag to specify whether the spinlock starts off locked or
unlocked. If a spinlock is no longer required then it can be destroyed by a call to cyg_spinlock_destroy.
There are two routines for claiming a spinlock: cyg_spinlock_spin and cyg_spinlock_spin_intsave. The
former can be used when it is known the current code will not be preempted, for example because it is running in
an interrupt handler or because interrupts are disabled. The latter will disable interrupts in addition to claiming the
spinlock, so is safe to use in all circumstances. The previous interrupt state is returned via the second argument,
and should be used in a subsequent call to cyg_spinlock_clear_intsave.
There are two routines for releasing a spinlock: cyg_spinlock_clear and cyg_spinlock_clear_intsave. Typically the
former will be used if the spinlock was claimed by a call to cyg_spinlock_spin, and the latter when
cyg_spinlock_spin_intsave was used.
There are two additional routines. cyg_spinlock_try is a non-blocking version of cyg_spinlock_spin: if
possible the lock will be claimed and the function will return true; otherwise the function will return immediately
with failure. cyg_spinlock_test can be used to find out whether or not the spinlock is currently locked. This
function must be used with care because, especially on a multiprocessor system, the state of the spinlock can
change at any time.
Spinlocks should only be held for a short period of time, and attempting to claim a spinlock will never cause a
thread to be suspended. This means that there is no need to worry about priority inversion problems, and concepts
such as priority ceilings and inheritance do not apply.
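A minimal sketch of the thread-side usage follows, assuming a hypothetical spinlock hw_lock that protects some device registers shared with an ISR:

static cyg_spinlock_t hw_lock;          // initialized once with cyg_spinlock_init(&hw_lock, 0)

void update_device_registers(void)
{
    cyg_addrword_t istate;

    // Disable interrupts on this CPU and claim the spinlock.
    cyg_spinlock_spin_intsave(&hw_lock, &istate);
    // ... touch the shared hardware registers, keeping this very short ...
    cyg_spinlock_clear_intsave(&hw_lock, istate);
}

// Inside the ISR interrupts are already disabled, so the simpler calls suffice:
//     cyg_spinlock_spin(&hw_lock);
//     ...
//     cyg_spinlock_clear(&hw_lock);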
Valid contexts
All of the spinlock functions can be called from any context, including ISR and DSR context. Typically
cyg_spinlock_init is only called during system initialization.
Scheduler Control
Name
cyg_scheduler_start, cyg_scheduler_lock, cyg_scheduler_unlock,
cyg_scheduler_safe_lock, cyg_scheduler_read_lock — Control the state of the scheduler
cyg_scheduler_start should only be called once, to mark the end of system initialization. In typical
configurations it is called automatically by the system startup, but some applications may bypass the standard startup in
which case cyg_scheduler_start will have to be called explicitly. The call will enable system interrupts, allowing I/O operations to commence. Then the scheduler will be invoked and control will be transferred to the highest
priority runnable thread. The call will never return.
The various data structures inside the eCos kernel must be protected against concurrent updates. Consider a call
to cyg_semaphore_post which causes a thread to be woken up: the semaphore data structure must be updated to
remove the thread from its queue; the scheduler data structure must also be updated to mark the thread as runnable;
it is possible that the newly runnable thread has a higher priority than the current one, in which case preemption
is required. If in the middle of the semaphore post call an interrupt occurred and the interrupt handler tried to
manipulate the same data structures, for example by making another thread runnable, then it is likely that the
structures will be left in an inconsistent state and the system will fail.
To prevent such problems the kernel contains a special lock known as the scheduler lock. A typical kernel function
such as cyg_semaphore_post will claim the scheduler lock, do all its manipulation of kernel data structures, and
then release the scheduler lock. The current thread cannot be preempted while it holds the scheduler lock. If an
interrupt occurs and a DSR is supposed to run to signal that some event has occurred, that DSR is postponed until
the scheduler unlock operation. This prevents concurrent updates of kernel data structures.
The kernel exports three routines for manipulating the scheduler lock. cyg_scheduler_lock can be called to
claim the lock. On return it is guaranteed that the current thread will not be preempted, and that no other code
is manipulating any kernel data structures. cyg_scheduler_unlock can be used to release the lock, which may
cause the current thread to be preempted. cyg_scheduler_read_lock can be used to query the current state of
the scheduler lock. This function should never be needed because well-written code should always know whether
or not the scheduler is currently locked, but may prove useful during debugging.
The implementation of the scheduler lock involves a simple counter. Code can call cyg_scheduler_lock multiple
times, causing the counter to be incremented each time, as long as cyg_scheduler_unlock is called the same
number of times. This behaviour is different from mutexes where an attempt by a thread to lock a mutex multiple
times will result in deadlock or an assertion failure.
Typical application code should not use the scheduler lock. Instead other synchronization primitives such as mutexes and semaphores should be used. While the scheduler is locked the current thread cannot be preempted, so any
higher priority threads will not be able to run. Also no DSRs can run, so device drivers may not be able to service
I/O requests. However there is one situation where locking the scheduler is appropriate: if some data structure
needs to be shared between an application thread and a DSR associated with some interrupt source, the thread can
use the scheduler lock to prevent concurrent invocations of the DSR and then safely manipulate the structure. It is
desirable that the scheduler lock is held for only a short period of time, typically some tens of instructions. In exceptional cases there may also be some performance-critical code where it is more appropriate to use the scheduler
lock rather than a mutex, because the former is more efficient.
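A minimal sketch of that one legitimate use follows; the shared_state structure and the DSR that also updates it are hypothetical:

static struct {
    int counter;
    int flag;
} shared_state;                     // hypothetical data also updated by a DSR

void update_shared_state(void)
{
    cyg_scheduler_lock();           // DSRs are postponed while the lock is held
    shared_state.counter++;
    shared_state.flag = 1;
    cyg_scheduler_unlock();         // pending DSRs may now run, preemption is possible again
}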
Valid contexts
cyg_scheduler_start can only be called during system initialization, since it marks the end of that phase. The
remaining functions may be called from thread or DSR context. Locking the scheduler from inside the DSR has
no practical effect because the lock is claimed automatically by the interrupt subsystem before running DSRs, but
allows functions to be shared between normal thread code and DSRs.
The kernel provides an interface for installing interrupt handlers and controlling when interrupts occur. This functionality is used primarily by eCos device drivers and by any application code that interacts directly with hardware.
However in most cases it is better to avoid using this kernel functionality directly, and instead the device driver
API provided by the common HAL package should be used. Use of the kernel package is optional, and some applications such as RedBoot work with no need for multiple threads or synchronization primitives. Any code which
calls the kernel directly rather than the device driver API will not function in such a configuration. When the kernel
package is present the device driver API is implemented as #define’s to the equivalent kernel calls, otherwise it
is implemented inside the common HAL package. The latter implementation can be simpler than the kernel one
because there is no need to consider thread preemption and similar issues.
The exact details of interrupt handling vary widely between architectures. The functionality provided by the kernel
abstracts away from many of the details of the underlying hardware, thus simplifying application development.
However this is not always successful. For example, if some hardware does not provide any support at all for
masking specific interrupts then calling cyg_interrupt_mask may not behave as intended: instead of masking
just the one interrupt source it might disable all interrupts, because that is as close to the desired behaviour as is
possible given the hardware restrictions. Another possibility is that masking a given interrupt source also affects all
lower-priority interrupts, but still allows higher-priority ones. The documentation for the appropriate HAL packages
should be consulted for more information about exactly how interrupts are handled on any given hardware. The
HAL header files will also contain useful information.
Interrupt Handlers
Interrupt handlers are created by a call to cyg_interrupt_create. This takes the following arguments:
cyg_vector_t vector
The interrupt vector, a small integer, identifies the specific interrupt source. The appropriate hardware documentation or HAL header files should be consulted for details of which vector corresponds to which device.
cyg_priority_t priority
Some hardware may support interrupt priorities, where a low priority interrupt handler can in turn be interrupted by a higher priority one. Again hardware-specific documentation should be consulted for details about
what the valid interrupt priority levels are.
cyg_addrword_t data
When an interrupt occurs eCos will first call the associated interrupt service routine or ISR, then optionally
a deferred service routine or DSR. The data argument to cyg_interrupt_create will be passed to both
these functions. Typically it will be a pointer to some data structure.
cyg_ISR_t isr
When an interrupt occurs the hardware will transfer control to the appropriate vector service routine or VSR,
which is usually provided by eCos. This performs any appropriate processing, for example to work out exactly
which interrupt occurred, and then as quickly as possible transfers control to the installed ISR. An ISR is a C
function which takes the following form:
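cyg_uint32
isr_function(cyg_vector_t vector,
             cyg_addrword_t data)
{
}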
The first argument identifies the particular interrupt source, especially useful if there are multiple instances of
a given device and a single ISR can be used for several different interrupt vectors. The second argument
is the data field passed to cyg_interrupt_create, usually a pointer to some data structure. The exact
conditions under which an ISR runs will depend partly on the hardware and partly on configuration options.
Interrupts may currently be disabled globally, especially if the hardware does not support interrupt priorities.
Alternatively interrupts may be enabled such that higher priority interrupts are allowed through. The ISR may
be running on a separate interrupt stack, or on the stack of whichever thread was running at the time the
interrupt happened.
A typical ISR will do as little work as possible, just enough to meet the needs of the hardware and then
acknowledge the interrupt by calling cyg_interrupt_acknowledge. This ensures that interrupts will be
quickly reenabled, so higher priority devices can be serviced. For some applications there may be one device
which is especially important and whose ISR can take much longer than normal. However eCos device drivers
usually will not assume that they are especially important, so their ISRs will be as short as possible.
The return value of an ISR is normally one of CYG_ISR_CALL_DSR or CYG_ISR_HANDLED. The former indicates that further processing is required at DSR level, and the interrupt handler’s DSR will be run as soon as
possible. The latter indicates that the interrupt has been fully handled and no further effort is required.
An ISR is allowed to make very few kernel calls. It can manipulate the interrupt mask, and on SMP systems
it can use spinlocks. However an ISR must not make higher-level kernel calls such as posting to a semaphore,
instead any such calls must be made from the DSR. This avoids having to disable interrupts throughout the
kernel and thus improves interrupt latency.
cyg_DSR_t dsr
If an interrupt has occurred and the ISR has returned a value CYG_ISR_CALL_DSR, the system will call the
deferred service routine or DSR associated with this interrupt handler. If the scheduler is not currently locked
then the DSR will run immediately. However if the interrupted thread was in the middle of a kernel call and
had locked the scheduler, then the DSR will be deferred until the scheduler is again unlocked. This allows the
DSR to make certain kernel calls safely, for example posting to a semaphore or signalling a condition variable.
A DSR is a C function which takes the following form:
void
dsr_function(cyg_vector_t vector,
             cyg_ucount32 count,
             cyg_addrword_t data)
{
}
The first argument identifies the specific interrupt that has caused the DSR to run. The second argument
indicates the number of these interrupts that have occurred and for which the ISR requested a DSR. Usually
this will be 1, unless the system is suffering from a very heavy load. The third argument is the data field
passed to cyg_interrupt_create.
cyg_handle_t* handle
The kernel will return a handle to the newly created interrupt handler via this argument. Subsequent operations
on the interrupt handler such as attaching it to the interrupt source will use this handle.
cyg_interrupt* intr
This provides the kernel with an area of memory for holding this interrupt handler and associated data.
The call to cyg_interrupt_create simply fills in a kernel data structure. A typical next step is to call
cyg_interrupt_attach using the handle returned by the create operation. This makes it possible to have
several different interrupt handlers for a given vector, attaching whichever one is currently appropriate.
Replacing an interrupt handler requires a call to cyg_interrupt_detach, followed by another call to
cyg_interrupt_attach for the replacement handler. cyg_interrupt_delete can be used if an interrupt
handler is no longer required.
Some hardware may allow for further control over specific interrupts, for example whether an interrupt is level or
edge triggered. Any such hardware functionality can be accessed using cyg_interrupt_configure: the level
argument selects between level versus edge triggered; the up argument selects between high and low level, or
between rising and falling edges.
Usually interrupt handlers are created, attached and configured during system initialization, while global interrupts
are still disabled. On most hardware it will also be necessary to call cyg_interrupt_unmask, since the sensible
default for interrupt masking is to ignore any interrupts for which no handler is installed.
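Putting these pieces together, a minimal sketch of the usual initialization sequence follows. The vector number, priority, ISR and DSR names are hypothetical; the real values come from the HAL headers for the target hardware.

#include <cyg/kernel/kapi.h>

#define MY_VECTOR   5               // hypothetical vector number, see the HAL headers
#define MY_PRIORITY 1               // hypothetical interrupt priority

static cyg_handle_t  my_interrupt;
static cyg_interrupt my_interrupt_data;

static cyg_uint32 my_isr(cyg_vector_t vector, cyg_addrword_t data)
{
    cyg_interrupt_acknowledge(vector);      // keep the ISR as short as possible
    return CYG_ISR_CALL_DSR;                // request that the DSR runs
}

static void my_dsr(cyg_vector_t vector, cyg_ucount32 count, cyg_addrword_t data)
{
    // higher-level kernel calls such as cyg_semaphore_post() are safe here
}

void my_device_init(void)
{
    cyg_interrupt_create(MY_VECTOR, MY_PRIORITY, 0,
                         &my_isr, &my_dsr,
                         &my_interrupt, &my_interrupt_data);
    cyg_interrupt_attach(my_interrupt);
    cyg_interrupt_unmask(MY_VECTOR);
}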
Controlling Interrupts
eCos provides two ways of controlling whether or not interrupts happen. It is possible to disable and reenable all
interrupts globally, using cyg_interrupt_disable and cyg_interrupt_enable. Typically this works by manipulating state inside the cpu itself, for example setting a flag in a status register or executing special instructions.
Alternatively it may be possible to mask a specific interrupt source by writing to one or to several interrupt mask
registers. Hardware-specific documentation should be consulted for the exact details of how interrupt masking
works, because a full implementation is not possible on all hardware.
The primary use for these functions is to allow data to be shared between ISRs and other code such as DSRs or
threads. If both a thread and an ISR need to manipulate either a data structure or the hardware itself, there is a
possible conflict if an interrupt happens just when the thread is doing such manipulation. Problems can be avoided
by the thread either disabling or masking interrupts during the critical region. If this critical region requires only
a few instructions then usually it is more efficient to disable interrupts. For larger critical regions it may be more
appropriate to use interrupt masking, allowing other interrupts to occur. There are other uses for interrupt masking.
For example if a device is not currently being used by the application then it may be desirable to mask all interrupts
generated by that device.
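For a very short critical region the global functions are usually sufficient. A minimal sketch, with the shared counter being purely hypothetical, is:

cyg_interrupt_disable();            // no ISR can run on this processor now
shared_counter++;                   // hypothetical data also updated by an ISR
cyg_interrupt_enable();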
There are two functions for masking a specific interrupt source: cyg_interrupt_mask and
cyg_interrupt_mask_intunsafe. On typical hardware masking an interrupt is not an atomic operation,
so if two threads were to perform interrupt masking operations at the same time there could be problems.
cyg_interrupt_mask disables all interrupts while it manipulates the interrupt mask. In situations where
interrupts are already known to be disabled, cyg_interrupt_mask_intunsafe can be used instead. There are
matching functions cyg_interrupt_unmask and cyg_interrupt_unmask_intunsafe.
SMP Support
On SMP systems the kernel provides an additional two functions related to interrupt handling.
cyg_interrupt_set_cpu specifies that a particular hardware interrupt should always be handled on one specific
processor in the system. In other words when the interrupt triggers it is only that processor which detects it, and it
is only on that processor that the VSR and ISR will run. If a DSR is requested then it will also run on the same
CPU. The function cyg_interrupt_get_cpu can be used to find out which interrupts are handled on which
processor.
VSR Support
When an interrupt occurs the hardware will transfer control to a piece of code known as the VSR, or Vector Service
Routine. By default this code is provided by eCos. Usually it is written in assembler, but on some architectures it
may be possible to implement VSRs in C by specifying an interrupt attribute. Compiler documentation should be
consulted for more information on this. The default eCos VSR will work out which ISR function should process
the interrupt, and set up a C environment suitable for this ISR.
For some applications it may be desirable to replace the default eCos VSR and handle some interrupts directly. This
minimizes interrupt latency, but it requires application developers to program at a lower level. Usually the best way
to write a custom VSR is to copy the existing one supplied by eCos and then make appropriate modifications.
The function cyg_interrupt_get_vsr can be used to get hold of the current VSR for a given interrupt vector,
allowing it to be restored if the custom VSR is no longer required. cyg_interrupt_set_vsr can be used to install
a replacement VSR. Usually the vsr argument will correspond to an exported label in an assembler source file.
Valid contexts
In a typical configuration interrupt handlers are created and attached during system initialization, and never
detached or deleted. However it is possible to perform these operations at thread level, if desired. Similarly
cyg_interrupt_configure, cyg_interrupt_set_vsr, and cyg_interrupt_set_cpu are usually called
only during system initialization, but on typical hardware may be called at any time. cyg_interrupt_get_vsr
and cyg_interrupt_get_cpu may be called at any time.
The functions for enabling, disabling, masking and unmasking interrupts can be called in any context, when appropriate. It is the responsibility of application developers to determine when the use of these functions is appropriate.
Kernel Real-time Characterization
Name
tm_basic — Measure the performance of the eCos kernel
Description
When building a real-time system, care must be taken to ensure that the system will be able to perform properly
within the constraints of that system. One of these constraints may be how fast certain operations can be performed.
Another might be how deterministic the overall behavior of the system is. Lastly the memory footprint (size) and
unit cost may be important.
One of the major problems encountered while evaluating a system will be how to compare it with possible alternatives. Most manufacturers of real-time systems publish performance numbers, ostensibly so that users can compare
the different offerings. However, what these numbers mean and how they were gathered is often not clear. The
values are typically measured on a particular piece of hardware, so in order to truly compare, one must obtain
measurements for exactly the same set of hardware that were gathered in a similar fashion.
Two major items need to be present in any given set of measurements. First, the raw values for the various operations; these are typically quite easy to measure and will be available for most systems. Second, the determinacy of
the numbers; in other words how much the value might change depending on other factors within the system. This
value is affected by a number of factors: how long interrupts might be masked, whether or not the function can
be interrupted, even very hardware-specific effects such as cache locality and pipeline usage. It is very difficult to
measure the determinacy of any given operation, but that determinacy is fundamentally important to proper overall
characterization of a system.
In the discussion and numbers that follow, three key measurements are provided. The first measurement is an
estimate of the interrupt latency: this is the length of time from when a hardware interrupt occurs until its Interrupt Service Routine (ISR) is called. The second measurement is an estimate of overall interrupt overhead: this
is the length of time average interrupt processing takes, as measured by the real-time clock interrupt (other interrupt sources will certainly take a different amount of time, but this data cannot be easily gathered). The third
measurement consists of the timings for the various kernel primitives.
Methodology
Key operations in the kernel were measured by using a simple test program which exercises the various kernel
primitive operations. A hardware timer, normally the one used to drive the real-time clock, was used for these
measurements. In most cases this timer can be read with quite high resolution, typically in the range of a few
microseconds. For each measurement, the operation was repeated a number of times. Time stamps were obtained
directly before and after the operation was performed. The data gathered for the entire set of operations was then
analyzed, generating average (mean), maximum and minimum values. The sample variance (a measure of how
close most samples are to the mean) was also calculated. The cost of obtaining the real-time clock timer values was
also measured, and was subtracted from all other times.
Most kernel functions can be measured separately. In each case, a reasonable number of iterations are performed.
Where the test case involves a kernel object, for example creating a task, each iteration is performed on a different
object. There is also a set of tests which measures the interactions between multiple tasks and certain kernel
primitives. Most functions are tested in such a way as to determine the variations introduced by varying numbers
of objects in the system. For example, the mailbox tests measure the cost of a ’peek’ operation when the mailbox
is empty, has a single item, and has multiple items present. In this way, any effects of the state of the object or how
many items it contains can be determined.
There are a few things to consider about these measurements. Firstly, they are quite micro in scale and only measure
the operation in question. These measurements do not adequately describe how the timings would be perturbed in
a real system with multiple interrupting sources. Secondly, the possible aberration incurred by the real-time clock
(system heartbeat tick) is explicitly avoided. Virtually all kernel functions have been designed to be interruptible.
Thus the times presented are typical, but best case, since any particular function may be interrupted by the clock
tick processing. This number is explicitly calculated so that the value may be included in any deadline calculations
required by the end user. Lastly, the reported measurements were obtained from a system built with all options
at their default values. Kernel instrumentation and asserts are also disabled for these measurements. Any number
of configuration options can change the measured results, sometimes quite dramatically. For example, mutexes
are using priority inheritance in these measurements. The numbers will change if the system is built with priority
inheritance on mutex variables turned off.
The final value that is measured is an estimate of interrupt latency. This particular value is not explicitly calculated
in the test program used, but rather by instrumenting the kernel itself. The raw number of timer ticks that elapse
between the time the timer generates an interrupt and the start of the timer ISR is kept in the kernel. These values
are printed by the test program after all other operations have been tested. Thus this should be a reasonable estimate
of the interrupt latency over time.
Using these Measurements
These measurements can be used in a number of ways. The most typical use will be to compare different real-time kernel offerings on similar hardware; another will be to estimate the cost of implementing a task using eCos
(applications can be examined to see what effect the kernel operations will have on the total execution time).
Another use would be to observe how the tuning of the kernel affects overall operation.
Influences on Performance
A number of factors can affect real-time performance in a system. One of the most common factors, yet most
difficult to characterize, is the effect of device drivers and interrupts on system timings. Different device drivers
will have differing requirements as to how long interrupts are suppressed, for example. The eCos system has been
designed with this in mind, by separating the management of interrupts (ISR handlers) and the processing required
by the interrupt (DSR—Deferred Service Routine— handlers). However, since there is so much variability here,
and indeed most device drivers will come from the end users themselves, these effects cannot be reliably measured.
Attempts have been made to measure the overhead of the single interrupt that eCos relies on, the real-time clock
timer. This should give you a reasonable idea of the cost of executing interrupt handling for devices.
Measured Items
This section describes the various tests and the numbers presented. All tests use the C kernel API (available by way
of cyg/kernel/kapi.h). There is a single main thread in the system that performs the various tests. Additional
threads may be created as part of the testing, but these are short lived and are destroyed between tests unless
otherwise noted. The terminology “lower priority” means a priority that is less important, not necessarily lower in
numerical value. A higher priority thread will run in preference to a lower priority thread even though the priority
value of the higher priority thread may be numerically less than that of the lower priority thread.
Thread Primitives
Create thread
This test measures the cyg_thread_create() call. Each call creates a totally new thread. The set of threads
created by this test will be reused in the subsequent thread primitive tests.
Yield thread
This test measures the cyg_thread_yield() call. For this test, there are no other runnable threads, thus the
test should just measure the overhead of trying to give up the CPU.
Suspend [suspended] thread
This test measures the cyg_thread_suspend() call. A thread may be suspended multiple times; each thread
is already suspended from its initial creation, and is suspended again.
Resume thread
This test measures the cyg_thread_resume() call. All of the threads have a suspend count of 2, thus this
call does not make them runnable. This test just measures the overhead of resuming a thread.
Set priority
This test measures the cyg_thread_set_priority() call. Each thread, currently suspended, has its priority
set to a new value.
Get priority
This test measures the cyg_thread_get_priority() call.
Kill [suspended] thread
This test measures the cyg_thread_kill() call. Each thread in the set is killed. All threads are known to be
suspended before being killed.
Yield [no other] thread
This test measures the cyg_thread_yield() call again. This is to demonstrate that the
cyg_thread_yield() call has a fixed overhead, regardless of whether there are other threads in the system.
Resume [suspended low priority] thread
This test measures the cyg_thread_resume() call again. In this case, the thread being resumed is lower
priority than the main thread, thus it will simply become ready to run but not be granted the CPU. This test
measures the cost of making a thread ready to run.
Resume [runnable low priority] thread
This test measures the cyg_thread_resume() call again. In this case, the thread being resumed is lower
priority than the main thread and has already been made runnable, so in fact the resume call has no effect.
Suspend [runnable] thread
This test measures the cyg_thread_suspend() call again. In this case, each thread has already been made
runnable (by previous tests).
Yield [only low priority] thread
This test measures the cyg_thread_yield() call. In this case, there are many other runnable threads, but
they are all lower priority than the main thread, thus no thread switches will take place.
Suspend [runnable->not runnable] thread
This test measures the cyg_thread_suspend() call again. The thread being suspended will become non-runnable by this action.
Kill [runnable] thread
This test measures the cyg_thread_kill() call again. In this case, the thread being killed is currently
runnable, but lower priority than the main thread.
Resume [high priority] thread
This test measures the cyg_thread_resume() call. The thread being resumed is higher priority than the
main thread, thus a thread switch will take place on each call. In fact there will be two thread switches; one to
the new higher priority thread and a second back to the test thread. The test thread exits immediately.
Thread switch
This test attempts to measure the cost of switching from one thread to another. Two equal priority threads are
started and they will each yield to the other for a number of iterations. A time stamp is gathered in one thread
before the cyg_thread_yield() call and after the call in the other thread.
Scheduler Primitives
Scheduler lock
This test measures the cyg_scheduler_lock() call.
Scheduler unlock [0 threads]
This test measures the cyg_scheduler_unlock() call. There are no other threads in the system and the
unlock happens immediately after a lock so there will be no pending DSRs to run.
Scheduler unlock [1 suspended thread]
This test measures the cyg_scheduler_unlock() call. There is one other thread in the system which is
currently suspended.
Scheduler unlock [many suspended threads]
This test measures the cyg_scheduler_unlock() call. There are many other threads in the system which
are currently suspended. The purpose of this test is to determine the cost of having additional threads in the
system when the scheduler is activated by way of cyg_scheduler_unlock().
Scheduler unlock [many low priority threads]
This test measures the cyg_scheduler_unlock() call. There are many other threads in the system which are
runnable but are lower priority than the main thread. The purpose of this test is to determine the cost of having
additional threads in the system when the scheduler is activated by way of cyg_scheduler_unlock().
Mutex Primitives
Init mutex
This test measures the cyg_mutex_init() call. A number of separate mutex variables are created. The
purpose of this test is to measure the cost of creating a new mutex and introducing it to the system.
Lock [unlocked] mutex
This test measures the cyg_mutex_lock() call. The purpose of this test is to measure the cost of locking a
mutex which is currently unlocked. There are no other threads executing in the system while this test runs.
Unlock [locked] mutex
This test measures the cyg_mutex_unlock() call. The purpose of this test is to measure the cost of unlocking
a mutex which is currently locked. There are no other threads executing in the system while this test runs.
Trylock [unlocked] mutex
This test measures the cyg_mutex_trylock() call. The purpose of this test is to measure the cost of locking
a mutex which is currently unlocked. There are no other threads executing in the system while this test runs.
Trylock [locked] mutex
This test measures the cyg_mutex_trylock() call. The purpose of this test is to measure the cost of locking
a mutex which is currently locked. There are no other threads executing in the system while this test runs.
Destroy mutex
This test measures the cyg_mutex_destroy() call. The purpose of this test is to measure the cost of deleting
a mutex from the system. There are no other threads executing in the system while this test runs.
Unlock/Lock mutex
This test attempts to measure the cost of unlocking a mutex for which there is another higher priority thread
waiting. When the mutex is unlocked, the higher priority waiting thread will immediately take the lock. The
time from when the unlock is issued until after the lock succeeds in the second thread is measured, thus giving
the round-trip or circuit time for this type of synchronizer.
Mailbox Primitives
Create mbox
This test measures the cyg_mbox_create() call. A number of separate mailboxes are created. The purpose of
this test is to measure the cost of creating a new mailbox and introducing it to the system.
Peek [empty] mbox
This test measures the cyg_mbox_peek() call. An attempt is made to peek the value in each mailbox, which
is currently empty. The purpose of this test is to measure the cost of checking a mailbox for a value without
blocking.
Put [first] mbox
This test measures the cyg_mbox_put() call. One item is added to a currently empty mailbox. The purpose
of this test is to measure the cost of adding an item to a mailbox. There are no other threads currently waiting
for mailbox items to arrive.
Peek [1 msg] mbox
This test measures the cyg_mbox_peek() call. An attempt is made to peek the value in each mailbox, which
contains a single item. The purpose of this test is to measure the cost of checking a mailbox which has data to
deliver.
Put [second] mbox
This test measures the cyg_mbox_put() call. A second item is added to a mailbox. The purpose of this test
is to measure the cost of adding an additional item to a mailbox. There are no other threads currently waiting
for mailbox items to arrive.
Peek [2 msgs] mbox
This test measures the cyg_mbox_peek() call. An attempt is made to peek the value in each mailbox, which
contains two items. The purpose of this test is to measure the cost of checking a mailbox which has data to
deliver.
Get [first] mbox
This test measures the cyg_mbox_get() call. The first item is removed from a mailbox that currently contains
two items. The purpose of this test is to measure the cost of obtaining an item from a mailbox without blocking.
Get [second] mbox
This test measures the cyg_mbox_get() call. The last item is removed from a mailbox that currently contains
one item. The purpose of this test is to measure the cost of obtaining an item from a mailbox without blocking.
Tryput [first] mbox
This test measures the cyg_mbox_tryput() call. A single item is added to a currently empty mailbox. The
purpose of this test is to measure the cost of adding an item to a mailbox.
Peek item [non-empty] mbox
This test measures the cyg_mbox_peek_item() call. A single item is fetched from a mailbox that contains a
single item. The purpose of this test is to measure the cost of obtaining an item without disturbing the mailbox.
Tryget [non-empty] mbox
This test measures the cyg_mbox_tryget() call. A single item is removed from a mailbox that contains
exactly one item. The purpose of this test is to measure the cost of obtaining one item from a non-empty
mailbox.
Peek item [empty] mbox
This test measures the cyg_mbox_peek_item() call. An attempt is made to fetch an item from a mailbox
that is empty. The purpose of this test is to measure the cost of trying to obtain an item when the mailbox is
empty.
Tryget [empty] mbox
This test measures the cyg_mbox_tryget() call. An attempt is made to fetch an item from a mailbox that is
empty. The purpose of this test is to measure the cost of trying to obtain an item when the mailbox is empty.
Waiting to get mbox
This test measures the cyg_mbox_waiting_to_get() call. The purpose of this test is to measure the cost of
determining how many threads are waiting to obtain a message from this mailbox.
Waiting to put mbox
This test measures the cyg_mbox_waiting_to_put() call. The purpose of this test is to measure the cost of
determining how many threads are waiting to put a message into this mailbox.
Delete mbox
This test measures the cyg_mbox_delete() call. The purpose of this test is to measure the cost of destroying
a mailbox and removing it from the system.
Put/Get mbox
In this round-trip test, one thread is sending data to a mailbox that is being consumed by another thread. The
time from when the data is put into the mailbox until it has been delivered to the waiting thread is measured.
Note that this time will contain a thread switch.
Semaphore Primitives
Init semaphore
This test measures the cyg_semaphore_init() call. A number of separate semaphore objects are created
and introduced to the system. The purpose of this test is to measure the cost of creating a new semaphore.
Post [0] semaphore
This test measures the cyg_semaphore_post() call. Each semaphore currently has a value of 0 and there
are no other threads in the system. The purpose of this test is to measure the overhead cost of posting to a
semaphore. This cost will differ if there is a thread waiting for the semaphore.
Wait [1] semaphore
This test measures the cyg_semaphore_wait() call. The semaphore has a current value of 1 so the call is
non-blocking. The purpose of the test is to measure the overhead of “taking” a semaphore.
Trywait [0] semaphore
This test measures the cyg_semaphore_trywait() call. The semaphore has a value of 0 when the call is
made. The purpose of this test is to measure the cost of seeing if a semaphore can be “taken” without blocking.
In this case, the answer would be no.
Trywait [1] semaphore
This test measures the cyg_semaphore_trywait() call. The semaphore has a value of 1 when the call is
made. The purpose of this test is to measure the cost of seeing if a semaphore can be “taken” without blocking.
In this case, the answer would be yes.
Peek semaphore
This test measures the cyg_semaphore_peek() call. The purpose of this test is to measure the cost of obtaining the current semaphore count value.
Destroy semaphore
This test measures the cyg_semaphore_destroy() call. The purpose of this test is to measure the cost of
deleting a semaphore from the system.
Post/Wait semaphore
In this round-trip test, two threads are passing control back and forth by using a semaphore. The time from
when one thread calls cyg_semaphore_post() until the other thread completes its cyg_semaphore_wait()
is measured. Note that each iteration of this test will involve a thread switch.
Counters
Create counter
This test measures the cyg_counter_create() call. A number of separate counters are created. The purpose
of this test is to measure the cost of creating a new counter and introducing it to the system.
Get counter value
This test measures the cyg_counter_current_value() call. The current value of each counter is obtained.
Set counter value
This test measures the cyg_counter_set_value() call. Each counter is set to a new value.
Tick counter
This test measures the cyg_counter_tick() call. Each counter is “ticked” once.
Delete counter
This test measures the cyg_counter_delete() call. Each counter is deleted from the system. The purpose
of this test is to measure the cost of deleting a counter object.
Alarms
Create alarm
This test measures the cyg_alarm_create() call. A number of separate alarms are created, all attached to the
same counter object. The purpose of this test is to measure the cost of creating a new alarm and introducing
it to the system.