I currently believe that one of the fallacies around LLMs is the assumption that training produces small, lightweight NNs (Neural Networks), which are then stuck with blurred categories and approximate judgments. But I suspect it is quite different: training produces very large, massive NNs, which can afford to represent ontologies quite precisely and with real breadth.

But how is that done? One puzzle piece could be new types of memory, so-called High-Bandwidth Memory (HBM), an architecture in which DRAM dies are stacked vertically and connected by Through-Silicon Vias (TSVs). It is found, for example, in NVIDIA GPUs like the A100 and H100. Compare that with the DDR3 that might sit in your laptop or PC. Could it give you a license to trash the L1/L2 caches with your algorithms?
| | HBM3 | DDR3 |
|---|---|---|
| Bandwidth | ~1.2 TB/s (per stack) | 12.8 GB/s to 25.6 GB/s |
| Latency | Low, optimized for real-time tasks | Higher |
| Power efficiency | More efficient despite high speeds | Higher power consumption than HBM3 |
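To get a feel for what that bandwidth gap means in practice, here is a minimal STREAM-style triad sketch in CUDA, the kind of kernel such numbers are usually probed with. It is purely bandwidth-bound, so the reported GB/s roughly tracks what the memory system delivers; the array size, launch configuration, and the "three arrays cross the bus" accounting are my own illustrative assumptions, not a vendor benchmark.

```cuda
// Minimal STREAM-style triad sketch: a[i] = b[i] + s * c[i].
// The kernel is purely bandwidth-bound, so the reported GB/s roughly
// tracks what the memory system (HBM on an A100/H100, DDR on a desktop
// board) can deliver. Array size and launch config are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void triad(float *a, const float *b, const float *c, float s, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = b[i] + s * c[i];
}

int main() {
    const size_t n = 1 << 26;               // ~67M floats, 256 MiB per array
    const size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMalloc((void **)&a, bytes);
    cudaMalloc((void **)&b, bytes);
    cudaMalloc((void **)&c, bytes);
    cudaMemset(b, 0, bytes);
    cudaMemset(c, 0, bytes);

    const unsigned int threads = 256;
    const unsigned int blocks = (unsigned int)((n + threads - 1) / threads);

    triad<<<blocks, threads>>>(a, b, c, 2.0f, n);   // warm-up launch

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    triad<<<blocks, threads>>>(a, b, c, 2.0f, n);   // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    // Three arrays cross the memory bus: read b, read c, write a.
    double gbps = 3.0 * (double)bytes / (ms * 1e-3) / 1e9;
    printf("triad: %.1f GB/s\n", gbps);

    cudaFree(a);
    cudaFree(b);
    cudaFree(c);
    return 0;
}
```

On an HBM-equipped GPU this kind of kernel typically reports well over a terabyte per second, while on a DDR3 system the same triad tops out somewhere around the tens of GB/s shown in the table above, which is roughly the gap the question about trashing L1/L2 caches is gesturing at.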