• Wafer viability (was Re: fun with AI. rofl...)

    From Scott Lurndal@21:1/5 to David Brown on Tue May 27 17:49:37 2025
    David Brown <david.brown@hesbynett.no> writes:
    On 27/05/2025 17:40, MitchAlsup1 wrote:
    On Tue, 27 May 2025 7:32:07 +0000, David Brown wrote:

    Yes, there are ways to increase the viability of a wafer.  There are
    the obvious ones - as you produce more of the wafer and design, you
    can fine-tune the process and eliminate the riskiest parts.  But you
    can also have a certain amount of redundancy to deal with damaged
    areas.  I don't know how much this is done in current processors, but
    certainly in the past companies have produced wafers where the design
    was a number of cores and size of cache, and if any of the cores or
    cache blocks did not work during testing then the chips were simply
    sold as smaller core count parts.  The wafer won't give a full output
    of the high-cost parts, but at least damaged parts won't be entirely
    wasted.

    For decades, single lines in a cache (any level) could be put in a
    "no allocate" state, so a 16KB L1 cache would become a 16KB-64B
    cache.

    Memory makes it easy to allow for a few bad transistors, wires, or
    connections and still work OK-ish. Function units, busses, and
    sequencers:: not so much.
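    As a minimal sketch of that line-disable idea (the cache geometry of
    64 sets x 4 ways of 64B lines = 16KB, and the no_allocate bitmap, are
    illustrative assumptions, not any particular design):

        /* Sketch of per-line "no allocate" fusing during victim
           selection. Names and sizes are assumptions for illustration. */
        #include <stdint.h>

        #define NUM_SETS 64
        #define NUM_WAYS 4

        /* One bit per line, set at wafer test for lines that failed. */
        static uint8_t no_allocate[NUM_SETS][NUM_WAYS];

        /* Pick a victim way in a set, skipping fused-out lines.
           Returns -1 if every way in the set is disabled. */
        int pick_victim(int set, int lru_way)
        {
            for (int i = 0; i < NUM_WAYS; i++) {
                int way = (lru_way + i) % NUM_WAYS;
                if (!no_allocate[set][way])
                    return way;   /* usable line: allocate here */
            }
            return -1;            /* whole set fused out */
        }

    A disabled line simply never wins allocation, so the cache behaves as
    a slightly smaller cache rather than a broken one.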

    Sure - unless you simply disable an entire CPU core as bad. (If the
    damage is on the shared logic, your whole chip is screwed.)

    It would be more likely that the entire tile (core, L2, core control
    logic) would be e-fused out and the on-chip routing (mesh, ring, bus)
    updated to bypass the faulty tile. Even the shared logic can be designed
    with some redundancy to support yield management.
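    A rough sketch of what "routing updated to bypass the faulty tile"
    could mean for a simple ring interconnect, assuming per-tile fuse
    bits readable at reset (all names here are hypothetical):

        #include <stdint.h>

        #define NUM_TILES 8

        /* Per-tile e-fuse bit: 1 = tile failed wafer test and is fused
           out. In real silicon this would come from a fuse bank at
           reset; here it is a plain array for illustration. */
        uint8_t tile_fused[NUM_TILES];

        /* next_hop[i] = first live tile clockwise from tile i. */
        uint8_t next_hop[NUM_TILES];

        void build_ring_routing(void)
        {
            for (int i = 0; i < NUM_TILES; i++) {
                int next = (i + 1) % NUM_TILES;
                /* Step past fused-out tiles so traffic bypasses them. */
                while (tile_fused[next] && next != i)
                    next = (next + 1) % NUM_TILES;
                next_hop[i] = (uint8_t)next;
            }
        }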

  • From MitchAlsup1@21:1/5 to Scott Lurndal on Wed May 28 22:31:52 2025
    On Tue, 27 May 2025 17:49:37 +0000, Scott Lurndal wrote:

    David Brown <david.brown@hesbynett.no> writes:
    On 27/05/2025 17:40, MitchAlsup1 wrote:
    On Tue, 27 May 2025 7:32:07 +0000, David Brown wrote:

    Yes, there are ways to increase the viability of a wafer.  There are
    the obvious ones - as you produce more of the wafer and design, you
    can fine-tune the process and eliminate the riskiest parts.  But you
    can also have a certain amount of redundancy to deal with damaged
    areas.  I don't know how much this is done in current processors, but
    certainly in the past companies have produced wafers where the design
    was a number of cores and size of cache, and if any of the cores or
    cache blocks did not work during testing then the chips were simply
    sold as smaller core count parts.  The wafer won't give a full output
    of the high-cost parts, but at least damaged parts won't be entirely
    wasted.

    For decades, single lines in a cache (any level) could be put in a
    "no allocate" state, so a 16KB L1 cache would become a 16KB-64B
    cache.

    Memory makes it easy to allow for a few bad transistors, wires, or
    connections and still work OK-ish. Function units, busses, and
    sequencers:: not so much.

    Sure - unless you simply disable an entire CPU core as bad. (If the
    damage is on the shared logic, your whole chip is screwed.)

    It would be more likely that the entire tile (core, L2, core control
    logic) would be e-fused out and the on-chip routing (mesh, ring, bus)
    updated to bypass the faulty tile. Even the shared logic can be
    designed with some redundancy to support yield management.

    Once you start having "unit counts" bigger than 8, 9 seems to be
    the correct choice: one bad, just fuse it out; all good, fuse one
    out at random. Either way, every part ships with the same 8 working
    units. Works for::

    cores, SRAMs, ports into crossbars, ...

    Then the 9-core "macro" becomes step-and-repeat for every future
    generation.
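    The arithmetic behind the spare, as a back-of-the-envelope check (the
    95% per-unit yield is an assumed number, purely for illustration):
    needing all 8 of 8 units good yields p^8, while the 9-unit macro
    survives one failure, yielding p^9 + 9*p^8*(1-p).

        #include <stdio.h>
        #include <math.h>

        int main(void)
        {
            double p = 0.95;  /* assumed per-unit yield, illustrative */

            /* All 8 of 8 units must be good. */
            double yield_8of8 = pow(p, 8.0);

            /* 9 units, at most one bad: all 9 good, or exactly one bad. */
            double yield_8of9 = pow(p, 9.0) + 9.0 * pow(p, 8.0) * (1.0 - p);

            printf("8-of-8 yield: %.1f%%\n", 100.0 * yield_8of8); /* ~66.3% */
            printf("8-of-9 yield: %.1f%%\n", 100.0 * yield_8of9); /* ~92.9% */
            return 0;
        }

    Under that assumed per-unit yield, the single spare lifts good-part
    yield from roughly two thirds to over 90%.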
