• Memory Powering the AI Revolution

    From Mild Shock@21:1/5 to All on Thu Jan 16 11:06:27 2025
    I currently believe that one of the fallacies around LLMs
    is the assumption that training produces small, lightweight
    NNs (Neural Networks), which would then be limited to
    blurred categories and approximate judgments. But I guess it
    is quite different: training produces very large, massive
    NNs, which can afford to represent ontologies precisely and
    with considerable breadth.

    But how is that made possible? One puzzle piece could be a
    new type of memory, so-called High-Bandwidth Memory (HBM),
    an architecture where DRAM dies are vertically stacked and
    connected using Through-Silicon Vias (TSVs). It is found,
    for example, in NVIDIA GPUs such as the A100 and H100.
    Compare that to the DDR3 you might find in your laptop or
    PC (see the table below). Could it give you a license to
    trash the L1/L2 caches with your algorithms?
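
    As a minimal sketch of what that would look like (assuming a
    CUDA-capable GPU; the kernel, array size and launch
    configuration are arbitrary illustrative choices), consider a
    plain streaming kernel. Every element is read or written
    exactly once, so L1/L2 reuse buys nothing and the measured
    figure is essentially the device's DRAM bandwidth:

        // Streaming "triad": performance is set almost entirely by
        // device memory bandwidth, since there is no data reuse for
        // the caches to exploit.
        #include <cstdio>
        #include <cuda_runtime.h>

        __global__ void triad(const float *a, const float *b, float *c, size_t n) {
            size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
            if (i < n) c[i] = a[i] + 2.0f * b[i];  // two reads, one write per element
        }

        int main() {
            const size_t n = 1 << 28;              // ~268M floats, ~1 GiB per array
            const size_t bytes = n * sizeof(float);
            float *a, *b, *c;
            cudaMalloc(&a, bytes); cudaMalloc(&b, bytes); cudaMalloc(&c, bytes);
            cudaMemset(a, 0, bytes); cudaMemset(b, 0, bytes);

            cudaEvent_t start, stop;
            cudaEventCreate(&start); cudaEventCreate(&stop);

            dim3 block(256), grid((unsigned)((n + 255) / 256));
            triad<<<grid, block>>>(a, b, c, n);    // warm-up launch
            cudaEventRecord(start);
            triad<<<grid, block>>>(a, b, c, n);
            cudaEventRecord(stop);
            cudaEventSynchronize(stop);

            float ms = 0.0f;
            cudaEventElapsedTime(&ms, start, stop);
            double gbps = 3.0 * bytes / (ms / 1e3) / 1e9;  // 2 reads + 1 write streamed
            printf("effective bandwidth: %.1f GB/s\n", gbps);

            cudaFree(a); cudaFree(b); cudaFree(c);
            return 0;
        }

    On an HBM-equipped GPU like the A100 or H100 the printed figure
    should land in the TB/s range; a CPU running the same streaming
    pattern against DDR3 tops out at a few tens of GB/s, which is
    exactly the gap the table below describes.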

                         HBM3                      DDR3
    Bandwidth            1.2 TB/s (per stack)      12.8 GB/s to 25.6 GB/s
    Latency              Low, optimized for        Higher latency
                         real-time tasks
    Power efficiency     More efficient despite    Higher power consumption
                         high speeds               than HBM3
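
    Back-of-envelope, using the table's own figures: a 70B-parameter
    model held in FP16 is roughly 140 GB of weights (an illustrative
    assumption, not a number from the table). Streaming those weights
    once takes about 140 GB / 1.2 TB/s ≈ 0.12 s from a single HBM3
    stack, versus about 140 GB / 25.6 GB/s ≈ 5.5 s from DDR3, a
    factor of nearly 50. That headroom is part of what makes very
    large, massive NNs practical at all.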

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Thu Jan 16 11:07:15 2025
    See also:

    The Special Memory Powering the AI Revolution https://www.youtube.com/watch?v=yAw63F1W_Us

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)