NXP S32K1 Application Notes

Optimizing the S32K1xx eDMA for Performance Demanding Applications
by: NXP Semiconductors
Direct memory access (DMA) transfers data between main memory and a peripheral device without passing it through the CPU. Because the transfer runs in parallel with the host processor, higher transfer rates are possible.
The eDMA controller can perform complex data transfers, but its transfer speed is sometimes limited by simultaneous accesses to the internal buses.
Another limiting factor is the response time of the connected peripheral (the frequency at which its engine runs), so the key is to find a suitable combination of peripheral, eDMA, and CPU clock frequencies.
This application note gives the reader hints and good practices for improving application performance. See the S32K1xx-RM for more details.
NXP Semiconductors
Document Number: AN12972
Application Notes
Rev. 0, 02/2021
Contents
1. Introduction
2. Integration of the eDMA module on S32K14x
2.1. Scatter Gather feature
2.2. Channel to channel linking
2.3. eDMA data transfer process
3. Use cases
3.1. eDMA for serial communications
3.2. ADC readings through eDMA
3.3. Other factors
4. Conclusions
5. Reference
Integration of the eDMA module on S32K14x
Optimizing the S32K1xx eDMA for Performance Demanding Applications, Rev. 0, 02/2021
2 NXP Semiconductors
NOTE
eDMA performance varies with application-dependent factors, and in some use cases those factors cannot be changed. This document analyzes those factors and shows how some cases can be managed.

2. Integration of the eDMA module on S32K14x

The following figure shows the S32K14x product integration. It is followed by a list of features and components that may affect data transfer performance.
Figure 1. S32K14x Block Diagram
1. eDMA options: number of channels, priority configuration, bandwidth control, transfer size, enabled features, and so on. These are discussed later in this document.
2. Code cache: the chip includes one 4 KB code cache to minimize the performance impact of memory access latencies. The LMEM controller provides the processor with tightly coupled processor-local memories and bus paths to all slave memory spaces.
3. Crossbar switch: the AXBS arbitrates among the bus masters when they access the same slave. A bottleneck arises when the eDMA tries to access SRAM (or any other slave port) while another bus master (for example, the CPU) tries to access the same slave. Depending on the use case, bandwidth control may be needed through the MCM_CPCR[CBRR] field, which selects round-robin or fixed-priority crossbar arbitration. Applications that need the eDMA to hold the bus as much as possible must select fixed priority, at the risk of starving the other masters.
4. Peripheral configuration: depending on the application, some eDMA-related peripheral settings can improve performance. Some use cases are discussed in the next chapter.
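The crossbar arbitration choice from item 3 can be sketched as a single-bit update. This is a minimal sketch, not the vendor's driver code: the helper name `cpcr_set_arbitration` and the enum are hypothetical, and the CBRR bit position (bit 9) and its polarity (1 selects round robin) are assumptions that should be checked against the S32K1xx reference manual or the device header.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the CBRR bit inside MCM_CPCR; verify it against
 * the reference manual or the device header before relying on it. */
#define CPCR_CBRR_MASK (1u << 9)

/* Hypothetical names for the two crossbar arbitration modes. */
typedef enum {
    XBAR_FIXED_PRIORITY = 0,
    XBAR_ROUND_ROBIN = 1
} xbar_arb_t;

/* Returns the new CPCR value with only the CBRR bit changed.
 * Assumes CBRR = 1 means round robin and CBRR = 0 means fixed priority. */
static uint32_t cpcr_set_arbitration(uint32_t cpcr, xbar_arb_t mode)
{
    assert(mode == XBAR_FIXED_PRIORITY || mode == XBAR_ROUND_ROBIN);
    if (mode == XBAR_ROUND_ROBIN) {
        return cpcr | CPCR_CBRR_MASK;
    }
    return cpcr & ~CPCR_CBRR_MASK;
}
```

On a target, the returned value would be written back to the register with a read-modify-write, for example `MCM->CPCR = cpcr_set_arbitration(MCM->CPCR, XBAR_ROUND_ROBIN);`, assuming the device header exposes the register under that name.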
The following figure shows the eDMA internal composition, followed by a list of configurations which might impact eDMA performance.
Figure 2. eDMA Block Diagram
1. Priority configuration: the eDMA can arbitrate between channels with fixed priority or round robin. Depending on the application, one channel may need higher priority than the others. Fixed priority, with the highest priorities assigned to the critical channels, is usually the better choice, but some applications cannot use it.
2. Number of channels: because there is only one eDMA engine, only one channel can be serviced at a time. With more active channels and constant eDMA requests, a channel is very likely to be delayed while another channel is being serviced.
3. Bandwidth control: some applications with large transfer sizes must avoid starving the other bus masters on the crossbar. To do this, the eDMA can stall its own engine after each R/W operation, which lets other masters (such as the CPU) take control of the slave port and work alongside the eDMA. This feature does not improve performance, but it must be taken into account in some implementations.
4. Transfer size: the eDMA supports programmable source, destination, and transfer sizes. When the source and destination sizes are equal, the eDMA engine performs a series of source-read/destination-write operations. When the sizes differ, multiple accesses of the smaller size are required for each access of the larger size. For example, with an 8-bit source and a 32-bit destination, four reads are performed, followed by a single 32-bit write.
5. Enabled eDMA features: some eDMA features (such as scatter gather, channel linking, or minor loop offset) ease the implementation of specific applications. Some of them increase transfer times, so the software designer must weigh the potential benefits against the cost in eDMA performance.
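The access counts described under "Transfer size" can be worked out with simple arithmetic. The following is a minimal host-side sketch, not an eDMA driver: the function `minor_loop_accesses` and its struct are hypothetical names, and it assumes the byte count of a minor loop is a multiple of both access widths.

```c
#include <assert.h>
#include <stdint.h>

/* Bus accesses the engine would issue for one minor loop. */
typedef struct {
    uint32_t reads;   /* source read accesses */
    uint32_t writes;  /* destination write accesses */
} access_count_t;

/* nbytes: bytes moved per minor loop.
 * ssize, dsize: source and destination access widths, in bytes. */
static access_count_t minor_loop_accesses(uint32_t nbytes,
                                          uint32_t ssize,
                                          uint32_t dsize)
{
    assert(ssize != 0u && dsize != 0u);
    /* Assumption: the byte count is a multiple of both widths. */
    assert(nbytes % ssize == 0u && nbytes % dsize == 0u);

    access_count_t count = { nbytes / ssize, nbytes / dsize };
    return count;
}
```

For the case in the text, `minor_loop_accesses(4, 1, 4)` yields four reads and one write. Matching the source and destination widths, as in `minor_loop_accesses(4, 4, 4)`, reduces that to one read and one write, which is why the choice of sizes affects the time each channel occupies the engine.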
2.1. Scatter Gather feature

The Scatter Gather (SGA) feature allows an eDMA channel to load a different Transfer Control Descriptor (TCD) when its major loop completes. The basic idea is that when a channel completes its major loop, it is reloaded with a new TCD stored in memory, without any intervention of the CPU.
This capability lets the user define several TCDs for one channel, but it delays the eDMA transfers. To load the new TCD at the end of the major loop, the eDMA engine must dereference the TCD pointer and fetch the descriptor from the memory where it is stored (flash or RAM), which means crossing several buses. This adds clock cycles to the process, and the number can grow depending on bus availability.
The following figure shows the path that the eDMA engine must take to reload the TCD configuration (yellow) and the possible bottleneck caused by bus traffic (red), assuming the TCD is stored in RAM.
Figure 3. eDMA path to RAM memory when TCD reloading
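The reload behavior can be illustrated with a simplified software model of a descriptor chain. This is a sketch under stated assumptions and not the hardware TCD layout: `sw_tcd_t`, `tcd_after_major_loop`, and `tcd_reload_count` are hypothetical names, the `next` pointer stands in for the scatter-gather address field, and the ESG bit position (bit 4 of the control/status word) is an assumption to check against the reference manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the scatter-gather enable bit in the TCD
 * control/status word. */
#define TCD_CSR_ESG (1u << 4)

/* Simplified software model of a descriptor; not the register layout. */
typedef struct sw_tcd {
    uint32_t nbytes;            /* bytes per minor loop */
    uint16_t citer;             /* major loop iteration count */
    uint16_t csr;               /* control/status; only ESG is modelled */
    const struct sw_tcd *next;  /* stands in for the scatter-gather address */
} sw_tcd_t;

/* Returns the descriptor the channel would run after this major loop,
 * or NULL when scatter-gather is disabled and no reload takes place. */
static const sw_tcd_t *tcd_after_major_loop(const sw_tcd_t *done)
{
    assert(done != NULL);
    return (done->csr & TCD_CSR_ESG) ? done->next : NULL;
}

/* Counts the descriptor reloads for a chain starting at first. Each
 * reload is a fetch from memory and therefore extra bus traffic.
 * The chain must end with a descriptor that has ESG cleared (or a
 * NULL next pointer); a circular chain would never terminate here. */
static unsigned tcd_reload_count(const sw_tcd_t *first)
{
    unsigned reloads = 0;
    for (const sw_tcd_t *t = tcd_after_major_loop(first); t != NULL;
         t = tcd_after_major_loop(t)) {
        reloads++;
    }
    return reloads;
}
```

In this model, a chain of three descriptors costs two reloads. Where the descriptors are placed (flash or RAM) and how busy the crossbar is determine how long each of those fetches takes on real hardware.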
2.2. Channel to channel linking

The channel linking feature triggers an eDMA channel when another eDMA channel completes a transfer (major or minor loop). It lets channels be “connected” or “linked”, so that when one transfer completes, the transfer configured in another channel can start. This feature is often used when one transfer depends on the result of another or when transfers must run in a specific order.
In terms of memory and time, this feature is a better option than scatter gather. Unlike SGA, channel linking needs no extra memory for TCD configurations, because each channel is configured directly in its eDMA registers. And because the eDMA engine does not have to cross any external bus (such as the crossbar) to reload a TCD, channel linking adds no time to the process beyond the normal startup time of the eDMA channel.
This feature is limited by the number of channels available, which caps the number of TCD configurations. In addition, the minor-loop link settings share register space with the iteration count (the CITER and BITER fields), so enabling minor-loop channel linking narrows the count to 9 bits, which limits the channel to 511 minor loops.
The following figure shows how the control submodule performs the linking process without going out to the crossbar bus or the internal bus.
Figure 4. Linking process inside eDMA module
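The cost of minor-loop linking in terms of loop count can be sketched by packing the 16-bit iteration count word. This is a minimal sketch under assumptions, not a confirmed register definition: the function names are hypothetical, and the layout (link enable in bit 15, a 4-bit link channel number in bits 12:9, and a 9-bit count in bits 8:0 when linking is enabled, versus a 15-bit count otherwise) should be verified against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the 16-bit iteration count word. */
#define CITER_ELINK_BIT     (1u << 15)  /* minor-loop link enable */
#define CITER_LINKCH_SHIFT  9u          /* link channel in bits 12:9 */
#define CITER_LINKED_MAX    0x1FFu      /* 9-bit count when linked */
#define CITER_UNLINKED_MAX  0x7FFFu     /* 15-bit count when not linked */

/* Packs the count word with minor-loop linking enabled. */
static uint16_t citer_with_minor_link(uint16_t count, uint8_t link_channel)
{
    assert(count >= 1u && count <= CITER_LINKED_MAX);
    assert(link_channel < 16u);
    return (uint16_t)(CITER_ELINK_BIT |
                      ((uint32_t)link_channel << CITER_LINKCH_SHIFT) |
                      count);
}

/* Packs the count word with minor-loop linking disabled. */
static uint16_t citer_without_link(uint16_t count)
{
    assert(count >= 1u && count <= CITER_UNLINKED_MAX);
    return count;
}
```

Under these assumptions, a transfer that needs more minor loops than the linked count allows would have to be split, for example across several major loops, or use major-loop linking instead.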