I don't recall the TI designator, but they make some DSP parts that
have peripherals like MCUs. I know that some time back, ARM made a
push into DSP territory by adding some DSPish instructions to I
believe it was the CM3 devices, or maybe CM4.
Anyone here use these crossover devices? What sort of apps? Why did
you pick that device over others?
You are maybe thinking of the TMS320F family of DSP/MCU's from TI.
These have a traditional DSP-style processor core - 16-bit "char"
(no 8-bit byte access at all), gruesome assembly where each
instruction does several different things in a single cycle,
multiple memory buses for simultaneous accesses, hardware support
for cyclic buffers, FFT twiddling, etc.
...
The Cortex-M4 is basically a Cortex-M3 with DSP instructions added -
MACs in various formats, saturating arithmetic, and 8-bit and 16-bit
SIMD instructions (within 32-bit registers). ...
...
On 12/22/2022 15:36, David Brown wrote:
...
The Cortex-M4 is basically a Cortex-M3 with DSP instructions added -
MACs in various formats, saturating arithmetic, and 8-bit and 16-bit
SIMD instructions (within 32-bit registers). ...
...
Just a word of caution for Rick re this portion.
Make sure that a 32 bit accumulator will be enough for what you are
doing; it can easily fall short in many cases. "Normal" DSPs have
40 or so bits for this reason; or, you can pick some processor with
64 bit FPU MAC ability, 32 bit FPU will fall a lot shorter even than
the 32 bit integer regs David is mentioning.
David said it all, I am just cautioning because this is the kind of
"oh shit" factor which comes at the end of the project (a friend once
told me of that "oh shit", you either say it at the beginning or at
the end :).
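A minimal sketch of the headroom problem described above, written as portable C so it can be tried on a PC. The function names are made up for illustration, and the 32-bit version goes through unsigned arithmetic purely to model a wrapping accumulator without signed-overflow undefined behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Q15 dot product with a 32-bit accumulator (hypothetical helper).
   A product of two full-scale Q15 values needs about 30 bits, so only
   a handful of terms fit before the sum wraps around. */
int32_t dot_q15_acc32(const int16_t *a, const int16_t *b, size_t n)
{
    uint32_t acc = 0;  /* unsigned, so wrap-around is well defined */
    for (size_t i = 0; i < n; i++) {
        acc += (uint32_t)((int32_t)a[i] * (int32_t)b[i]);
    }
    /* Conversion back to signed is implementation-defined when the
       value is out of range; common compilers wrap modulo 2^32. */
    return (int32_t)acc;
}

/* Same dot product with a 64-bit accumulator: over 30 guard bits. */
int64_t dot_q15_acc64(const int16_t *a, const int16_t *b, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

With four full-scale negative samples the true sum is 2^32, which the 64-bit version returns and the 32-bit version loses entirely.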
On 2022-12-22, David Brown <david.brown@hesbynett.no> wrote:
You are maybe thinking of the TMS320F family of DSP/MCU's from TI.
These have a traditional DSP-style processor core - 16-bit "char"
(no 8-bit byte access at all), gruesome assembly where each
instruction does several different things in a single cycle,
multiple memory buses for simultaneous accesses, hardware support
for cyclic buffers, FFT twiddling, etc.
IIRC, branches were also delayed.
The later 320's (C30/C40 and on) were all 32-bit (in C: char, int,
long int, float, double were all "one byte" which contained
32-bits). And the floating point format wasn't IEEE.
That combination made supporting byte-oriented serial protocols that
used IEEE FP extra fun.
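A rough sketch of the byte-handling half of that problem, assuming each received octet lands in its own array element (which is what happens when "char" is wider than 8 bits). The function names are hypothetical, and an old DSP toolchain may not provide <stdint.h> at all. The result is only the raw IEEE 754 bit pattern; converting it to the DSP's native float format is a separate step not shown here.

```c
#include <stdint.h>

/* Assemble a big-endian 32-bit value from four received octets.
   Each octet sits in its own element, which may be wider than 8 bits,
   so mask explicitly instead of relying on the element width. */
uint32_t octets_to_u32_be(const unsigned char *rx)
{
    return ((uint32_t)(rx[0] & 0xFFu) << 24) |
           ((uint32_t)(rx[1] & 0xFFu) << 16) |
           ((uint32_t)(rx[2] & 0xFFu) << 8)  |
            (uint32_t)(rx[3] & 0xFFu);
}

/* Split a 32-bit value into four octets, most significant first. */
void u32_to_octets_be(uint32_t v, unsigned char *tx)
{
    tx[0] = (unsigned char)((v >> 24) & 0xFFu);
    tx[1] = (unsigned char)((v >> 16) & 0xFFu);
    tx[2] = (unsigned char)((v >> 8) & 0xFFu);
    tx[3] = (unsigned char)(v & 0xFFu);
}
```

Because everything is done with shifts and masks, the same code behaves identically on a PC with 8-bit char, which makes it easy to test off-target.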
The dev tools from TI were a bit clunky, but worked OK and were
available for Solaris (including the in-circuit emulators).
But, compared to what else was available 20+ years ago, they were damn
fast (especially for the price).
On Thursday, December 22, 2022 at 12:46:00 PM UTC-5, Dimiter wrote:
On 12/22/2022 15:36, David Brown wrote:
...
The Cortex-M4 is basically a Cortex-M3 with DSP instructions added -
MACs in various formats, saturating arithmetic, and 8-bit and 16-bit
SIMD instructions (within 32-bit registers). ...
...
Just a word of caution for Rick re this portion.
Make sure that a 32 bit accumulator will be enough for what you are
doing; it can easily fall short in many cases. "Normal" DSPs have
40 or so bits for this reason; or, you can pick some processor with
64 bit FPU MAC ability, 32 bit FPU will fall a lot shorter even than
the 32 bit integer regs David is mentioning.
David said it all, I am just cautioning because this is the kind of
"oh shit" factor which comes at the end of the project (a friend once
told me of that "oh shit", you either say it at the beginning or at
the end :).
I'm not selecting a DSP part. I typically use FPGAs for what I do.
Not because they are required for speed, but because they work well
and have complete flexibility. I used a $10 FPGA in a product I
designed in 2008 and have to refresh the design for a couple of parts
that are not made anymore. The new design will still use an FPGA. If
I need an MCU in the design, it will be a custom design in the FPGA.
I have one I've been pushing around in my head that would have one
CPU, pipelined to work
I was just curious about what people have used for DSP applications,
but in particular if anyone had used one of the "crossover" parts.
So far, the answer has been "no".
On 22/12/2022 20:57, Rick C wrote:
I was just curious about what people have used for DSP applications,
but in particular if anyone had used one of the "crossover" parts.
So far, the answer has been "no".
I don't know exactly how you are defining a "crossover" part. But if it
is "a DSP with microcontroller features", then the answer so far is
"yes". Both Grant and I have used TMS320F parts - but I would not
choose to use one again if I could avoid it. (I can't answer for Grant
there.) I have also used a "DSP with microcontroller features" from
Freescale (from the MC56000 family, IIRC) - though I hadn't mentioned
that at all.
And if you mean "a microcontroller with DSP features", then as I said
almost everyone who works with embedded software has used Cortex-M4
devices. I have lost count of the number of different ones I have used
(plus Cortex-M7, ColdFire, and PPC based microcontrollers that had DSP
features).
So I don't quite see how you could have interpreted the posts as "no".
On 22/12/2022 16:54, Grant Edwards wrote:
On 2022-12-22, David Brown <david.brown@hesbynett.no> wrote:
You are maybe thinking of the TMS320F family of DSP/MCU's from TI.
These have a traditional DSP-style processor core - 16-bit "char"
(no 8-bit byte access at all), gruesome assembly where each
instruction does several different things in a single cycle,
multiple memory buses for simultaneous accesses, hardware support
for cyclic buffers, FFT twiddling, etc.
IIRC, branches were also delayed.
If you say so - I don't remember. (Delayed branches are not uncommon in
processors designed for single-cycle instruction throughput - they are
also found in several RISC architectures.)
The later 320's (C30/C40 and on) were all 32-bit (in C: char, int,
long int, float, double were all "one byte" which contained
32-bits). And the floating point format wasn't IEEE.
I did not know they were part of the TMS320F family, though I know Texas
Instruments made other DSP's with 32-bit "char".
Rick C <gnuarm.del...@gmail.com> writes:
I was just curious about what people have used for DSP applications,
but in particular if anyone had used one of the "crossover" parts. So
far, the answer has been "no".
I've done some audio stuff on ordinary CPU's, that in an embedded
system would probably go on something like a Cortex M4, if that's
what you call a crossover part. The next thing after that is
probably a GPU or FPGA, either of which contains a stupendous
amount of parallel MAC's. As others have said, dedicated DSP's are
now pretty niche.
FPGA's may have displaced general purpose processors for some
realtime applications as well, since you get low latency and
deterministic timing without having to go crazy worrying about
caches and interrupts.
I didn't personally work on it, but spent a while studying a
cryptography app that ran on the now ancient Motorola DSP 56000 series.
The model number came from the architecture's 24 bit words and 56 bit
MAC accumulator. The app wasn't particularly connected with realtime or
with signal processing. Rather, the 24*24->56 MAC came in handy for
high precision arithmetic used by the crypto algorithm.
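To illustrate why a wide MAC accumulator helps with high precision arithmetic, here is a sketch of column-wise multi-word multiplication with 24-bit limbs. It is not the code from that app: the names are invented, limbs are treated as unsigned for simplicity (the real 56000 multiplier is signed fractional, so actual code would differ), and uint64_t stands in for the wide accumulator.

```c
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS 24
#define LIMB_MASK 0xFFFFFFu

/* r[0 .. 2n-1] = a[0 .. n-1] * b[0 .. n-1], little-endian 24-bit limbs.
   Requires n >= 1. Each column sums several 48-bit partial products
   before a single limb is stored; the guard bits above bit 47 are what
   make that possible without carry handling inside the inner loop. */
void mul_limbs24(const uint32_t *a, const uint32_t *b, uint32_t *r, size_t n)
{
    uint64_t acc = 0;  /* stand-in for a wide MAC accumulator */
    for (size_t k = 0; k + 1 < 2 * n; k++) {
        size_t lo = (k < n) ? 0 : k - n + 1;
        size_t hi = (k < n) ? k : n - 1;
        for (size_t i = lo; i <= hi; i++) {
            acc += (uint64_t)a[i] * b[k - i];   /* one MAC per step */
        }
        r[k] = (uint32_t)(acc & LIMB_MASK);     /* store low limb */
        acc >>= LIMB_BITS;                      /* keep the carry */
    }
    r[2 * n - 1] = (uint32_t)acc;
}
```

With a 56-bit accumulator there are 8 guard bits over a 48-bit product, so on the order of a couple of hundred products can be summed per column before overflow becomes a concern.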
On 2022-12-22, David Brown <david.brown@hesbynett.no> wrote:
On 22/12/2022 16:54, Grant Edwards wrote:
On 2022-12-22, David Brown <david.brown@hesbynett.no> wrote:
You are maybe thinking of the TMS320F family of DSP/MCU's from TI.
These have a traditional DSP-style processor core - 16-bit "char"
(no 8-bit byte access at all), gruesome assembly where each
instruction does several different things in a single cycle,
multiple memory buses for simultaneous accesses, hardware support
for cyclic buffers, FFT twiddling, etc.
IIRC, branches were also delayed.
If you say so - I don't remember. (Delayed branches are not uncommon in
processors designed for single-cycle instruction throughput - they are
also found in several RISC architectures.)
The later 320's (C30/C40 and on) were all 32-bit (in C: char, int,
long int, float, double were all "one byte" which contained
32-bits). And the floating point format wasn't IEEE.
I did not know they were part of the TMS320F family, though I know Texas
Instruments made other DSP's with 32-bit "char".
Ah, I overlooked the "F" in your original post. I don't remember any F
parts. Interestingly the Wikipedia page on TMS320 doesn't mention the
F parts at all. I did find this page about the TMS320F28335, but it's
a 32-bit part also:
https://www.ti.com/product/TMS320F28335
On Thursday, December 22, 2022 at 4:03:29 PM UTC-5, David Brown
wrote:
On 22/12/2022 20:57, Rick C wrote:
I was just curious about what people have used for DSP
applications, but in particular if anyone had used one of the
"crossover" parts. So far, the answer has been "no".
I don't know exactly how you are defining a "crossover" part.
Please read the first post in this thread for that.
But if it is "a DSP with microcontroller features", then the answer
so far is "yes". Both Grant and I have used TMS320F parts - but I
would not choose to use one again if I could avoid it. (I can't
answer for Grant there.) I have also used a "DSP with
microcontroller features" from Freescale (from the MC56000 family,
IIRC) - though I hadn't mentioned that at all.
And if you mean "a microcontroller with DSP features", then as I
said almost everyone who works with embedded software has used
Cortex-M4 devices. I have lost count of the number of different
ones I have used (plus Cortex-M7, ColdFire, and PPC based
microcontrollers that had DSP features).
So I don't quite see how you could have interpreted the posts as
"no".
I was looking for some insight into their experiences with such
devices for DSP work, and I'm counting both DSP like MCUs and MCU
like DSPs. I don't see in your post that you talk about any
particular experience, rather offer a 10,000 foot overview of the
state of the market. Thanks for that, but this is not new to me. So
your post was pretty much a "no", to me.
I guess I was not quite explicit enough in my initial post. I was
asking about specific experiences where a crossover part was chosen
for a project with a significant DSP content, which would have
required a DSP chip, if these devices were not available.
I am fully aware that MCUs are getting faster and more capable, but
that doesn't mean DSPs are not needed. It simply means they are used
in other applications that require more horsepower. Sometimes, it's
not even the horsepower, but the performance to power consumption
ratio. There are application specific DSPs for hearing aids that run
on very low power, much better than any MCU could do.
Years ago DSP split into two categories based on the cell phone
market. The high performance devices needed their own power plants,
but cranked out some serious MIPS/MFLOPS. The much smaller, lower
power, fixed point devices gained in speed, without sucking all the
juice from mobile batteries, while serving in hand sets. Now the
hand sets have dedicated CPU chips with built in DSP sections for the
front end processing of cell phones, rather than separate DSP chips.
There's no shortage of DSP cores in the world, we just don't see all
of them because they are part of system chips.
On Friday, December 23, 2022 at 1:31:55 AM UTC-5, Paul Rubin wrote:
Rick C <gnuarm.del...@gmail.com> writes:
I was just curious about what people have used for DSP
applications, but in particular if anyone had used one of the
"crossover" parts. So far, the answer has been "no".
I've done some audio stuff on ordinary CPU's, that in an embedded
system would probably go on something like a Cortex M4, if that's
what you call a crossover part. The next thing after that is
probably a GPU or FPGA, either of which contains a stupendous
amount of parallel MAC's. As others have said, dedicated DSP's are
now pretty niche.
FPGA's may have displaced general purpose processors for some
realtime applications as well, since you get low latency and
deterministic timing without having to go crazy worrying about
caches and interrupts.
I didn't personally work on it, but spent a while studying a
cryptography app that ran on the now ancient Motorola DSP 56000
series. The model number came from the architecture's 24 bit words
and 56 bit MAC accumulator. The app wasn't particularly connected
with realtime or with signal processing. Rather, the 24*24->56 MAC
came in handy for high precision arithmetic used by the crypto
algorithm.
At that time there were generally 16 bit fixed point DSPs and 32 bit
floating point DSPs. Neither was appropriate for audio work. 16 bits
is not enough resolution for high quality audio and 32 bit floating
point was overkill, using extra power and burning extra dollars.
Motorola came out with 24 bit devices as the sweet spot for high
quality audio work.
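As a rough worked figure for the resolution argument, the standard rule of thumb for the signal-to-quantization-noise ratio of an ideal N-bit converter driven by a full-scale sine wave is shown below. It is an idealisation, not a property of any particular device, and it says nothing about the extra headroom needed for intermediate results.

```latex
\mathrm{SQNR} \approx 6.02\,N + 1.76\ \mathrm{dB}
\qquad\Rightarrow\qquad
N = 16:\ \approx 98\ \mathrm{dB},
\qquad
N = 24:\ \approx 146\ \mathrm{dB}
```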
On 23/12/2022 02:11, Rick C wrote:
On Thursday, December 22, 2022 at 4:03:29 PM UTC-5, David Brown
wrote:
On 22/12/2022 20:57, Rick C wrote:
I was just curious about what people have used for DSP
applications, but in particular if anyone had used one of the
"crossover" parts. So far, the answer has been "no".
I don't know exactly how you are defining a "crossover" part.
Please read the first post in this thread for that.
I did. That's why I said I don't know exactly how you are defining
your personal meaning of "crossover part". But I see you've given more
information below, so maybe people can give you more helpful feedback
(or at least say that they don't have the relevant experience).
But if it is "a DSP with microcontroller features", then the answer
so far is "yes". Both Grant and I have used TMS320F parts - but I
would not choose to use one again if I could avoid it. (I can't
answer for Grant there.) I have also used a "DSP with
microcontroller features" from Freescale (from the MC56000 family,
IIRC) - though I hadn't mentioned that at all.
And if you mean "a microcontroller with DSP features", then as I
said almost everyone who works with embedded software has used
Cortex-M4 devices. I have lost count of the number of different
ones I have used (plus Cortex-M7, ColdFire, and PPC based
microcontrollers that had DSP features).
So I don't quite see how you could have interpreted the posts as
"no".
I was looking for some insight into their experiences with such
devices for DSP work, and I'm counting both DSP like MCUs and MCU
like DSPs. I don't see in your post that you talk about any
particular experience, rather offer a 10,000 foot overview of the
state of the market. Thanks for that, but this is not new to me. So
your post was pretty much a "no", to me.
Of course it is an overview. Do you want detailed information about
everything I have done for the past 15 years or so since Cortex-M
devices took over the embedded world?
I can give a bit more insight into my experience with the TMS320F24x
device. That was over 20 years ago, and lots will have changed since
then. The device was horrible to use. The assembly was impenetrable,
and extremely complicated to do well. The C compiler was hopelessly
inefficient, meaning you /had/ to use assembly for critical parts. The
hardware debugging tools were absurdly overpriced (some $1500 for what
was basically a couple of 74-series logic chips), and broke easily. The
software tools had annoying quirks. But the sensorless BLDC motor
control worked well in the end.
I would not willingly choose to do development on these parts again -
there are simply too many alternatives that are vastly easier to work
with for most purposes. But I know TI sell various pre-programmed parts
as dedicated motor control peripherals, and I'd be quite happy to
consider them.
As I said, the great majority of embedded microcontroller work is now
done with Cortex-M microcontrollers - they dominate the industry. At
the low end you have Cortex-M0 and M0+ devices for the very cheapest,
but the most popular are M3 or M4 parts (and the M7 at the high end).
The M4 is like an M3 but with added "DSP" instructions - MAC's of
various types, simple SIMD, saturating arithmetic. In reality,
relatively few people actually do anything that could be called "DSP"
work - it's usually more general control code. And when you want a
digital filter or FFT, you typically use ARM's optimised libraries.
Your code runs the same whether the device has DSP optimisation
instructions or not - only the speed is different.
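As an illustration of the "simple SIMD" idea, here is a portable C model of a dual 16-bit multiply-accumulate: two 16-bit samples packed in each 32-bit word, both products added to an accumulator. This is the kind of operation the M4's SMLAD instruction performs in hardware. The function names here are made up; on a real part you would normally rely on the compiler intrinsic or the vendor library rather than writing this by hand.

```c
#include <stdint.h>

/* Pack two signed 16-bit samples into one 32-bit word:
   lo in bits 15..0, hi in bits 31..16. */
uint32_t pack_s16x2(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Sign-extend the low 16 bits of a word without relying on
   implementation-defined narrowing conversions. */
static int32_t sext16(uint32_t half)
{
    int32_t v = (int32_t)(half & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;
}

/* Model of a dual MAC: acc + lo(x)*lo(y) + hi(x)*hi(y).
   Overflow of the 32-bit result is NOT handled in this sketch; the
   caller must keep the sum in range. */
int32_t dual_mac16(uint32_t x, uint32_t y, int32_t acc)
{
    return acc + sext16(x) * sext16(y)
               + sext16(x >> 16) * sext16(y >> 16);
}
```

Written this way the code compiles anywhere; whether the compiler turns it into a single instruction on an M4 depends on the toolchain and optimisation settings.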
So when you ask about "experience using these devices", you are really
asking "experience doing microcontroller development".
I guess I was not quite explicit enough in my initial post. I was
asking about specific experiences where a crossover part was chosen
for a project with a significant DSP content, which would have
required a DSP chip, if these devices were not available.
That is a different question, and more specific.
I've only done quite limited DSP algorithms (such as simple filters) in
my own code, and these devices are absolutely fine for that. As always,
you have to be careful about your scalings when working with fixed-point
numbers.
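A small sketch of what "being careful about scalings" looks like for the common Q15 format (16-bit signed, value = integer / 32768). The function name is invented for illustration, and the right shift of a negative value is technically implementation-defined in C, though it is an arithmetic shift on the usual embedded compilers.

```c
#include <stdint.h>

/* Multiply two Q15 numbers with rounding and saturation. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product, fits in 32 bits */
    p += 1 << 14;                         /* round to nearest */
    p >>= 15;                             /* rescale Q30 -> Q15 */
    if (p > INT16_MAX) {
        p = INT16_MAX;                    /* -1.0 * -1.0 cannot be +1.0 */
    }
    return (int16_t)p;
}
```

The one case that needs saturation is -1.0 times -1.0, since +1.0 is not representable in Q15; forgetting that case is a classic source of sign flips in filters.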
If you want floating point, some Cortex-M4 have single-precision
floating point (Cortex-M4F). You need to be careful to avoid
accidentally using double precision operations in your C code - there
are gcc flags to help warn you about this. If you want double
precision, it's worth going for an M7 microcontroller like an NXP RT10xx
device (ironically called a "crossover microcontroller" by NXP), since
these have double precision floating point in hardware.
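A tiny example of how double precision sneaks in. The two functions below return the same value here, but in the first one the unsuffixed constant is a double, so the multiply is done in double precision and then narrowed; on a part with only a single-precision FPU that typically means a software library call. gcc's -Wdouble-promotion option is one flag that reports the first form. The function names are just for illustration.

```c
/* Unsuffixed constant: x is promoted to double, multiplied in double
   precision, then converted back to float on return. */
float halve_promoted(float x)
{
    return x * 0.5;
}

/* Suffixed constant: the whole expression stays in single precision. */
float halve_single(float x)
{
    return x * 0.5f;
}
```

The same trap applies to calling sqrt() instead of sqrtf() and similar library functions.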
I have been involved in a project that was more relevant, using wavelet
transformations, but I did not work directly on the wavelet code. I did
help out on some of the optimising and translation from the original
code (from a PC). Working that way is not optimal, but it was good
enough - we required a certain number of transformations per second, and
got that from the chip we had on the board, and did not see any point in
going further.
There is no doubt that dedicated DSP cores have instruction types and
features that can make a significant difference to the efficiency of
some algorithms. A good DSP can do "x += *p++ * *q++;" in a single
operation, once per cycle. They generally support cyclic buffers
directly, which can save a fair bit of code. And they have the
specialised bit manipulation instructions useful for FFT's.
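For comparison, this is roughly what the multiply-accumulate loop with a cyclic buffer looks like when written out in plain C for a general-purpose core: a small FIR filter where the index wrap is explicit code. The type and function names and the tap count are arbitrary choices for the sketch. A DSP with hardware circular addressing does the wrap as part of the address update, so the inner loop collapses to the MAC alone.

```c
#include <stddef.h>
#include <stdint.h>

#define FIR_TAPS 4

typedef struct {
    int16_t delay[FIR_TAPS];  /* circular delay line */
    size_t  head;             /* index of the newest sample */
} fir_state;

/* Push one input sample and return the filter output
   y[n] = sum over k of coef[k] * x[n-k]. */
int32_t fir_step(fir_state *s, const int16_t *coef, int16_t in)
{
    /* Advance the write position, wrapping at the end of the buffer. */
    s->head = (s->head + 1) % FIR_TAPS;
    s->delay[s->head] = in;

    int32_t acc = 0;
    size_t idx = s->head;
    for (size_t k = 0; k < FIR_TAPS; k++) {
        acc += (int32_t)coef[k] * s->delay[idx];     /* the MAC */
        idx = (idx == 0) ? FIR_TAPS - 1 : idx - 1;   /* explicit wrap */
    }
    return acc;
}
```

Feeding an impulse (1, 0, 0, ...) into a zeroed state returns the coefficients in order, which makes a convenient sanity check.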
However, it is all about getting the results out in the time (and power
and cost budget) you need. And if your code runs fast enough on the
device you have, it really doesn't matter if a different device could do
it faster.
A lot of the choice will, as so often, come down to experience and
familiarity. Getting decent DSP algorithm performance from an M4 is not
too hard if you are already a good embedded programmer. It comes down
to knowing your toolchain, knowing how to write efficient code, and
knowing how to work with vendor's libraries. And since you have good
toolchains, easy and cheap debugging (usually), and peripherals such as
serial ports, USB, and Ethernet, you often have a much nicer development
environment. If you develop appropriately, the same code will also
compile directly on a PC making simulation and testing vastly easier.
On a DSP, getting optimal performance is very difficult - there is a
/lot/ you need to track, and you are often making use of so many
compiler extensions, intrinsics, etc., that you are really programming
in assembly. Getting the same code running on a PC for testing is
hugely harder. Accidentally getting significantly poorer efficiency is
very easy - you might find that writing "while (--n)" gives you
extremely fast specialised loop modes while "while (n--)" gives you
explicit decrements, comparisons and jumps. Toolchains are often poor
quality and very expensive (that is not universal, however). And
non-DSP code is much harder than in a microcontroller - you often don't
have access to 8-bit bytes, and portability between the DSP and other
processors is poor.
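One caution about the loop forms mentioned above: they are not drop-in replacements for each other, so changing one to the other to please a compiler also changes the iteration count. A quick sketch with invented function names; which form (if any) a given DSP compiler maps onto its hardware loop mode is toolchain-specific and not something this example can show.

```c
/* Body runs n-1 times for n >= 1.
   Do not call with n == 0: the pre-decrement wraps to UINT_MAX. */
unsigned count_pre(unsigned n)
{
    unsigned c = 0;
    while (--n) {
        c++;
    }
    return c;
}

/* Body runs n times, including the n == 0 case (zero iterations). */
unsigned count_post(unsigned n)
{
    unsigned c = 0;
    while (n--) {
        c++;
    }
    return c;
}

/* Body runs n times for n >= 1; same caveat for n == 0 as count_pre. */
unsigned count_do(unsigned n)
{
    unsigned c = 0;
    do {
        c++;
    } while (--n);
    return c;
}
```

If the pre-decrement test is what the compiler wants, the do/while form is the one that keeps the original number of iterations.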
We haven't talked much about peripherals or hardware, but DSP's usually
have fewer "general" peripherals, and their interfaces can be more
specialised.
I am fully aware that MCUs are getting faster and more capable, but
that doesn't mean DSPs are not needed. It simply means they are used
in other applications that require more horsepower. Sometimes, it's
not even the horsepower, but the performance to power consumption
ratio. There are application specific DSPs for hearing aids that run
on very low power, much better than any MCU could do.
Yes, that is correct.
DSP's are still very much an important technology, but they are getting
more niche. There are few people that develop with them - the majority
of companies that have a DSP on their boards will buy the code ready
made, often just as a binary blob or pre-programmed. In many cases, the
code is written by the companies that develop the DSP.
This is not just because getting maximal efficiency from a DSP is
technically hard and requires knowledge and experience (and if you don't
need maximal efficiency, why are you bothering with the DSP in the first
place?). IP and patent licensing is a nightmare in many of the
applications where DSPs really shine, such as in audio and video codecs.
If you are Sony or Sonos, you can afford a big development team and an
even bigger lawyer team and make your own audio codecs. For most
companies, it is a fraction of the overall price if you buy your DSP's
with licenses for codec binary blobs all in one.
Standalone DSP chips are also getting rarer - it is more common to see
them as accelerators alongside a "host" processor that handles the
non-DSP functionality, all within the same die.
Years ago DSP split into two categories based on the cell phone
market. The high performance devices needed their own power plants,
but cranked out some serious MIPS/MFLOPS. The much smaller, lower
power, fixed point devices gained in speed, without sucking all the
juice from mobile batteries, while serving in hand sets. Now the
hand sets have dedicated CPU chips with built in DSP sections for the
front end processing of cell phones, rather than separate DSP chips.
There's no shortage of DSP cores in the world, we just don't see all
of them because they are part of system chips.
Agreed.
Most (in terms of numerical quantities) are probably generated
specifically for the ASIC or dedicated chip they are used in. There are
parametrized DSP cores available that are often used with 24-bit or
18-bit "bytes" - TMS320's with 16-bit or 32-bit "char" are
programmer-friendly in comparison. And sometimes it is not easy to draw
the line between hardware filters with very programmable state machines,
and limited DSPs.
But a lot is changing. At the high end, processors with SIMD are able
to do many of the tasks that DSP's used to do. Other kinds of
accelerators such as found in graphics card cores can do a better job
than traditional DSPs, while also being easier to work with. At the
lower end, normal microcontrollers, possibly augmented with a few
DSP-friendly instructions, can do a better job. For your hearing aids,
when you have a Cortex-M device that takes less power than the leakage
current of the smallest battery while doing all the filtering fast
enough, the DSP has lost its advantage.
On 23/12/2022 09:14, Rick C wrote:
On Friday, December 23, 2022 at 1:31:55 AM UTC-5, Paul Rubin wrote:
I didn't personally work on it, but spent a while studying a
cryptography app that ran on the now ancient Motorola DSP 56000
series. The model number came from the architecture's 24 bit words
and 56 bit MAC accumulator. The app wasn't particularly connected
with realtime or with signal processing. Rather, the 24*24->56 MAC
came in handy for high precision arithmetic used by the crypto
algorithm.
At that time there were generally 16 bit fixed point DSPs and 32 bit
floating point DSPs. Neither was appropriate for audio work. 16 bits
is not enough resolution for high quality audio and 32 bit floating
point was overkill, using extra power and burning extra dollars.
Motorola came out with 24 bit devices as the sweet spot for high
quality audio work.
Yes. There are many manufacturers of 24-bit DSPs, and they almost all
have a background in audio.
(Now, of course, you just use the 32-bit - the millidollar difference in
hardware costs is worth it for the added convenience.)