• 8080 question

    From dxforth@21:1/5 to All on Wed Aug 2 13:33:48 2023
    I have a count in a double register.

    Is there a way to decrement and test for 0 without using the A register?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Peter Heitzer@21:1/5 to dxforth on Wed Aug 2 08:40:42 2023
    dxforth <dxforth@gmail.com> wrote:
    > I have a count in a double register.
    >
    > Is there a way to decrement and test for 0 without using the A register?

    No fast way.
    Only the 8 bit register opcodes set the flags. So you cannot use
    a 16 bit decrement but have to do the decrement with the high and low registers.
    Decrement low register.
    If carry was set decrement the high register and continue.
    If z was set decrement the high register and check if carry was set.
    That means that the high register was zero before the decrement.
    If no carry was set you have to increment the high register in order to
    restore the value before the decrement.

    --
    Dipl.-Inform(FH) Peter Heitzer, peter.heitzer@rz.uni-regensburg.de

  • From Douglas Miller@21:1/5 to All on Wed Aug 2 04:04:13 2023
    NOTE! 8-bit increment/decrement does NOT affect the carry. Only sign or zero may be detected after those. If you use zero as the trigger for a dual 8-bit decrement loop (i.e. accomplishing a 16-bit decrement using two 8-bit decrements) you need to take into account that zero is not the same as carry and you will have to adjust your starting counts accordingly.

  • From Peter Heitzer@21:1/5 to Douglas Miller on Wed Aug 2 11:48:28 2023
    Douglas Miller <durgadas311@gmail.com> wrote:
    > NOTE! 8-bit increment/decrement does NOT affect the carry. Only sign or zero may be detected after those. If you use zero as the trigger for a dual 8-bit decrement loop (i.e. accomplishing a 16-bit decrement using two 8-bit decrements) you need to take into account that zero is not the same as carry and you will have to adjust your starting counts accordingly.

    You are right. My mistake. Using only the detection of zero, a possible
    solution is to do a 16-bit decrement and then, for the low and high bytes,
    an increment followed by a decrement. The value of the register does
    not change; only the flags are set. A test costs 36 cycles if
    both registers have to be checked, or at least 18 cycles if the low byte
    is not zero after the 16-bit decrement.
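
    The flag behaviour described here can be checked with a small Python model. This is only a sketch: the helper names (inr, dcr, pair_is_zero) are invented for illustration, and it assumes the usual 8080 behaviour where INR and DCR update the zero flag and leave carry alone. It says nothing about cycle counts.

```python
def inr(value):
    """INR r: add 1 modulo 256 and report the zero flag (carry is left alone, so it is not modelled)."""
    value = (value + 1) & 0xFF
    return value, value == 0


def dcr(value):
    """DCR r: subtract 1 modulo 256 and report the zero flag."""
    value = (value - 1) & 0xFF
    return value, value == 0


def pair_is_zero(high, low):
    """Model of: inr e / dcr e / jnz nonzero / inr d / dcr d / jnz nonzero.

    Returns (zero_flag, high, low) so the caller can see the pair is unchanged.
    """
    low, _ = inr(low)
    low, zero = dcr(low)
    if not zero:
        return False, high, low      # low byte nonzero: the high byte is never touched
    high, _ = inr(high)
    high, zero = dcr(high)
    return zero, high, low
```

    In this model the pair comes back unchanged for every 16-bit value, and the zero flag ends up set only when both bytes are zero.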


    --
    Dipl.-Inform(FH) Peter Heitzer, peter.heitzer@rz.uni-regensburg.de

  • From dxforth@21:1/5 to Peter Heitzer on Wed Aug 2 23:01:39 2023
    On 2/08/2023 9:48 pm, Peter Heitzer wrote:
    > Douglas Miller <durgadas311@gmail.com> wrote:
    >> NOTE! 8-bit increment/decrement does NOT affect the carry. Only sign or zero may be detected after those. If you use zero as the trigger for a dual 8-bit decrement loop (i.e. accomplishing a 16-bit decrement using two 8-bit decrements) you need to take into account that zero is not the same as carry and you will have to adjust your starting counts accordingly.
    >
    > You are right. My mistake. Using only the detection of zero a possible
    > solution is doing a 16 bit decrement and for the low and high bytes
    > an increment followed by a decrement. So the value of the register does
    > not change but only the flags are set. A test costs 36 cycles if
    > both registers have to be checked or at least 18 cycles if the low byte
    > is not zero after the 16 Bit decrement.

    My initial thought was to save/restore A e.g.

    push h
    mov l,a
    mov a,e
    ora d
    mov a,l
    pop h

    but hoping there was something smarter. I later discovered a logical error
    in my code and after correcting it the problem went away...

  • From Douglas Miller@21:1/5 to Peter Heitzer on Wed Aug 2 05:26:05 2023
    On Wednesday, August 2, 2023 at 6:48:32 AM UTC-5, Peter Heitzer wrote:
    > You are right. My mistake. Using only the detection of zero a possible
    > solution is doing a 16 bit decrement and for the low and high bytes
    > an increment followed by a decrement. So the value of the register does
    > not change but only the flags are set. A test costs 36 cycles if
    > both registers have to be checked or at least 18 cycles if the low byte
    > is not zero after the 16 Bit decrement.
    > --
    > Dipl.-Inform(FH) Peter Heitzer, peter....@rz.uni-regensburg.de

    Typically, I just see that the starting count is adjusted to account for the difference. These cases just use the zero flag to detect the carry over, but the starting count makes less sense (possibly, depending on whether it was an arbitrary number or had specific meaning). I've also seen these sorts of loops be "inverted" (do increment instead of decrement), which means the starting count needs to be negative - but usually makes more sense in terms of absolute numbers.

  • From Peter Heitzer@21:1/5 to dxforth on Wed Aug 2 14:03:12 2023
    dxforth <dxforth@gmail.com> wrote:
    > On 2/08/2023 9:48 pm, Peter Heitzer wrote:
    >> Douglas Miller <durgadas311@gmail.com> wrote:
    >>> NOTE! 8-bit increment/decrement does NOT affect the carry. Only sign or zero may be detected after those. If you use zero as the trigger for a dual 8-bit decrement loop (i.e. accomplishing a 16-bit decrement using two 8-bit decrements) you need to take into account that zero is not the same as carry and you will have to adjust your starting counts accordingly.
    >>
    >> You are right. My mistake. Using only the detection of zero a possible
    >> solution is doing a 16 bit decrement and for the low and high bytes
    >> an increment followed by a decrement. So the value of the register does
    >> not change but only the flags are set. A test costs 36 cycles if
    >> both registers have to be checked or at least 18 cycles if the low byte
    >> is not zero after the 16 Bit decrement.
    >
    > My initial thought was to save/restore A e.g.
    >
    > push h
    > mov l,a
    > mov a,e
    > ora d
    > mov a,l
    > pop h

    That makes 38 cycles + 10 cycles for the jump, which is more than my
    suggested code, but it uses fewer bytes and is more readable.

    > but hoping there was something smarter. I later discovered a logical error
    > in my code and after correcting it the problem went away...

    --
    Dipl.-Inform(FH) Peter Heitzer, peter.heitzer@rz.uni-regensburg.de

  • From Russell Marks@21:1/5 to dxforth on Wed Aug 2 18:32:39 2023
    dxforth <dxforth@gmail.com> wrote:

    [Re: testing if a register pair is zero on an 8080]

    > My initial thought was to save/restore A e.g.
    >
    > push h
    > mov l,a
    > mov a,e
    > ora d
    > mov a,l
    > pop h
    >
    > but hoping there was something smarter. I later discovered a logical error
    > in my code and after correcting it the problem went away...

    That makes it sound like you don't need the test any more, but I may
    as well still say that self-modifying code would be slightly faster:

    sta aop+1
    mov a,e
    ora d
    aop: mvi a,0
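
    A rough Python model of the idea, assuming a two-byte patch of writable memory at a made-up label aop that holds the MVI A opcode (3Eh) and its operand. The function name and layout are illustrative only. The point is that STA patches the operand, ORA sets the flags, and MVI then reloads A without touching them, so the patched byte has to live in RAM.

```python
MVI_A = 0x3E  # 8080 opcode for "mvi a,<byte>"


def de_is_zero_selfmod(a, d, e):
    """Model of: sta aop+1 / mov a,e / ora d / aop: mvi a,0.

    Returns (A after the sequence, zero flag left by ORA).
    """
    memory = bytearray([MVI_A, 0x00])  # the two bytes at the hypothetical label aop
    aop = 0
    memory[aop + 1] = a        # sta aop+1 : save A inside the MVI operand
    a = e                      # mov a,e
    a |= d                     # ora d     : Z is set only if D and E are both zero
    zero = (a == 0)
    a = memory[aop + 1]        # mvi a,n   : reload the saved A; MVI does not change flags
    return a, zero
```

    In the model A always comes back with its original value, and the zero flag reflects whether DE was 0.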

    -Rus.

  • From dxforth@21:1/5 to Russell Marks on Thu Aug 3 12:04:32 2023
    On 3/08/2023 4:32 am, Russell Marks wrote:
    > dxforth <dxforth@gmail.com> wrote:
    >
    > [Re: testing if a register pair is zero on an 8080]
    >
    >> My initial thought was to save/restore A e.g.
    >>
    >> push h
    >> mov l,a
    >> mov a,e
    >> ora d
    >> mov a,l
    >> pop h
    >>
    >> but hoping there was something smarter. I later discovered a logical error
    >> in my code and after correcting it the problem went away...
    >
    > That makes it sound like you don't need the test any more, but I may
    > as well still say that self-modifying code would be slightly faster:
    >
    > sta aop+1
    > mov a,e
    > ora d
    > aop: mvi a,0

    Only 2 extra instructions which makes it very clean. Must remember it
    for next time!

    Despite the 8080 having few registers I'm often surprised by how few
    times that's been a problem.

  • From Mark Lougheed@21:1/5 to All on Wed Aug 2 20:13:20 2023
    8080 or 8085? If 8085, there is the undocumented instruction "DSUB" (08h), where:
    HL = HL - BC (all flags are affected: Z, S, P, CY, AC, X5 and V)

    LXI H,<some number>
    LXI B,1
    DSUB
    JZ (or JNZ) somewhere

    https://electronicerror.blogspot.com/2007/08/undocumented-flags-and-instructions.html

  • From Peter Heitzer@21:1/5 to Mark Lougheed on Thu Aug 3 08:04:03 2023
    Mark Lougheed <mdlougheed@gmail.com> wrote:
    > 8080 or 8085? If 8085, there is the undocumented instruction "DSUB" (08h), where:
    > HL = HL - BC (Z, S, P, CY, AC and X5, V all flag receives influence)
    >
    > LXI H some number
    > LXI B 1
    > DSUB
    > JZ (or JNZ) somewhere
    >
    > https://electronicerror.blogspot.com/2007/08/undocumented-flags-and-instructions.html

    In 8080:
    PUSH B
    LXI B 0ffffh
    DAD B
    POP B
    JC iszero

    Needs 51 cycles and the counter to be tested must be in HL.
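
    To see what the carry flag reports after this sequence, here is a minimal Python model of DAD with BC = 0FFFFh. The function name is made up, and the model assumes the standard DAD behaviour (16-bit add into HL, carry set on overflow out of bit 15).

```python
def dad_with_ffff(hl):
    """Model of LXI B,0FFFFh / DAD B: returns (new HL, carry flag)."""
    total = hl + 0xFFFF
    return total & 0xFFFF, total > 0xFFFF
```

    In this model HL = 1 gives (0, carry set), HL = 2 gives (1, carry set), and HL = 0 gives (0FFFFh, carry clear). So carry appears to come out set for every nonzero starting value and clear only when HL was already 0. It reads more like a "no borrow" indicator than a "result is zero" one, and the sense of the conditional jump would need to be chosen with that in mind.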

    --
    Dipl.-Inform(FH) Peter Heitzer, peter.heitzer@rz.uni-regensburg.de

  • From Mark Ogden@21:1/5 to Peter Heitzer on Sun Aug 6 12:54:30 2023
    On Thursday, 3 August 2023 at 09:04:06 UTC+1, Peter Heitzer wrote:
    > Mark Lougheed <mdlou...@gmail.com> wrote:
    >> 8080 or 8085? If 8085, there is the undocumented instruction "DSUB" (08h), where:
    >> HL = HL - BC (Z, S, P, CY, AC and X5, V all flag receives influence)
    >>
    >> LXI H some number
    >> LXI B 1
    >> DSUB
    >> JZ (or JNZ) somewhere
    >>
    >> https://electronicerror.blogspot.com/2007/08/undocumented-flags-and-instructions.html
    >
    > In 8080:
    > PUSH B
    > LXI B 0ffffh
    > DAD B
    > POP B
    > JC iszero
    >
    > Needs 51 cycles and the counter to be tested must be in HL.
    > --
    > Dipl.-Inform(FH) Peter Heitzer, peter....@rz.uni-regensburg.de

    Depending on usage, one option is to pre-increment the two bytes of the count, as in
    lxi b,count + 101h

    then the test for 0 can be done with
    dcr c
    jnz not0
    dcr b
    jnz not0
    ; count was zero

    or similar variants
    e.g.
    lxi b,count + 101h
    loop:
    do something
    ...
    dcr c
    jnz loop
    dcr b
    jnz loop
    ; all done

    This can lead to quicker code, as the dcr b is only done once every time the c register reaches zero.
    Note: if the count is calculated, just do inr c, inr b.
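
    The two ways of biasing the count, and the bottom-tested loop, can be compared with a small Python model. The function names are invented for illustration, and the model assumes the assembler evaluates count + 101h as a single 16-bit sum.

```python
def bias_inr(count):
    """inr c / inr b: each byte is bumped on its own, with no carry between them."""
    return ((count >> 8) + 1) & 0xFF, ((count & 0xFF) + 1) & 0xFF


def bias_lxi(count):
    """lxi b,count + 101h: one 16-bit addition, so a low-byte carry reaches B."""
    value = (count + 0x0101) & 0xFFFF
    return value >> 8, value & 0xFF


def passes_bc(b, c):
    """Model of: loop: <body> / dcr c / jnz loop / dcr b / jnz loop.

    Returns how many times the body runs.
    """
    passes = 0
    while True:
        passes += 1                # loop body
        c = (c - 1) & 0xFF         # dcr c
        if c != 0:                 # jnz loop
            continue
        b = (b - 1) & 0xFF         # dcr b
        if b != 0:                 # jnz loop
            continue
        return passes
```

    Two things show up in this model. With the inr/inr bias the bottom-tested loop runs its body count + 1 times (passes_bc(*bias_inr(3)) gives 4), so the starting count may need adjusting by one for that form. And the lxi shortcut matches inr/inr only while the low byte of count is not 0FFh: bias_inr(0x00FF) is (1, 0) but bias_lxi(0x00FF) is (2, 0), which doubles the number of passes from 256 to 512.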

    Mark

  • From Phillip Stevens@21:1/5 to dxforth on Mon Aug 7 21:11:19 2023
    On Wednesday, 2 August 2023, dxforth wrote:
    > I have a count in a double register.
    > Is there a way to decrement and test for 0 without using the A register?

    If the target is 8085 then there is an undocumented flag and jump instruction available.

    The flag is usually called K, and I often use the JP NK instruction to emulate a Z80 LDIR instruction. The K flag is set on the underflow from 0 to 0xFFFF, so the loop counter needs to be pre-decremented.

    Usage examples are here: https://github.com/RC2014Z80/RC2014/blob/master/ROMs/CPM-IDE/acia85cf/cpm22preamble.asm#L34

    Of course this is only going to work if your target is actually 8085.

  • From dxforth@21:1/5 to Phillip Stevens on Tue Aug 8 22:54:02 2023
    On 8/08/2023 2:11 pm, Phillip Stevens wrote:
    > On Wednesday, 2 August 2023, dxforth wrote:
    >> I have a count in a double register.
    >> Is there a way to decrement and test for 0 without using the A register?
    >
    > If the target is 8085 then there is an undocumented flag and jump instruction available.
    >
    > The flag is usually called K and I often use the JP NK instruction to emulate a z80 LDIR instruction. The K flag is set on 0 to 0xFFFF so the loop counter needs to be pre-decremented.
    >
    > Usage examples are here. https://github.com/RC2014Z80/RC2014/blob/master/ROMs/CPM-IDE/acia85cf/cpm22preamble.asm#L34
    >
    > Of course this is only going to work if your target is actually 8085.

    A pity Intel didn't seriously look at an extended instruction set.
    Why Faggin left?

  • From Phillip Stevens@21:1/5 to dxforth on Tue Aug 8 15:10:25 2023
    On Tuesday, 8 August 2023 at 22:54:05 UTC+10, dxforth wrote:
    > On 8/08/2023 2:11 pm, Phillip Stevens wrote:
    >> Of course this is only going to work if your target is actually 8085.
    >
    > A pity Intel didn't seriously look at an extended instruction set.

    A bit OT but since you asked. The enhancements Intel made to the 8085, and then decided not to support, made the 8085 almost perfect imho. Undocumented stack access using DE instructions is much faster than IX/IY in the z80, and the 16 bit rotations really accelerated math (given no hardware multiply).

    Using these enhancements, and a “native” C compiler, the 8085 actually beats the z80 in some of our z88dk benchmarks.

    So one excellent code table of (256) instructions is a pretty nice design. https://feilipu.me/2021/09/27/8085-software/

    > Why Faggin left?

    Wasn’t there, but I guess it is easier to get rich working for yourself. 😊 Something he perhaps wouldn’t have achieved at Intel.

  • From dxforth@21:1/5 to Phillip Stevens on Wed Aug 9 12:34:34 2023
    On 9/08/2023 8:10 am, Phillip Stevens wrote:
    > On Tuesday, 8 August 2023 at 22:54:05 UTC+10, dxforth wrote:
    >> On 8/08/2023 2:11 pm, Phillip Stevens wrote:
    >>> Of course this is only going to work if your target is actually 8085.
    >>
    >> A pity Intel didn't seriously look at an extended instruction set.
    >
    > A bit OT but since you asked. The enhancements Intel made to the 8085, and then decided not to support, made the 8085 almost perfect imho. Undocumented stack access using DE instructions is much faster than IX/IY in the z80, and the 16 bit rotations really accelerated math (given no hardware multiply).
    >
    > Using these enhancements, and a “native” C compiler, the 8085 actually beats the z80 in some of our z88dk benchmarks.
    >
    > So one excellent code table (256) of instructions is a pretty nice design. https://feilipu.me/2021/09/27/8085-software/

    I assumed any undoc instructions would be either side-effects or planned but broken in some way. From what you say this doesn't appear to be the case here. Sounds like it was an executive decision to leave them out. Unfortunately the result is the same - software that uses undoc instructions has a certain odour to it.

    >> Why Faggin left?
    >
    > Wasn’t there, but I guess it is easier to get rich working for yourself. 😊
    > Something he perhaps wouldn’t have achieved at Intel.

    Though for creative folk leaving tends to be more about frustration. For Intel execs it would have been about where best to invest their money.

  • From Phillip Stevens@21:1/5 to All on Sat Aug 12 07:33:56 2023
    >> Using these enhancements, and a “native” C compiler, the 8085 actually beats the z80 in some of our z88dk benchmarks.
    >>
    >> So one excellent code table of (256) instructions is a pretty nice design.
    >> https://feilipu.me/2021/09/27/8085-software/
    >
    > I assumed any undoc instructions would be either side-effects or planned but broken in some way. From what you say this doesn't appear to be the case here.
    > Sounds like it was an executive decision to leave them out. Unfortunately the
    > result is the same - software that uses undoc instructions have a certain odour
    > to them.

    I wouldn’t worry too much. Tundra Semiconductor licensed the 80c85 design and published them as “enhanced instructions” in their datasheet, complete with their own mistakes and misinterpretation.

    http://images.100y.com.tw/pdf_file/34-TUNDRA-CA80C85B.pdf

    Ken Shirriff covers the reverse engineering in some detail and corrects the error in the Tundra datasheet.
    His whole series is great reading on the 8085.

    http://www.righto.com/2013/02/looking-at-silicon-to-understanding.html?m=1

    So safe to use with no hesitation or odour. :-)

  • From George Phillips@21:1/5 to All on Sat Aug 12 11:57:55 2023
    If you're willing to alter the count you can do the 16 bit loop as two 8 bit tests which will be faster. For example, suppose your 16 bit loop count is in DE. This code will execute the loop DE times:

    ; transform DE to split count
    dcx d
    inr d
    inr e
    loop:
    ; ... processing here
    dcr e
    jnz loop
    dcr d
    jnz loop

    To see how it works consider the two cases going in where E is zero and not zero. Or, equivalently, when DE is an exact multiple of 256 or not. Note that the value of E never changes. If E is not zero then 1 is added to D. The inner loop will do E iterations and then after that it will do 256 iterations D times (the original value of D). Check it with $0003 and $0103 to get the idea.

    If E is zero then D is not altered either. The inner loop will always be 256 iterations and will be executed D times. Checking $0100 and $0000 will show the correctness. Like with a conventional 16 bit loop, $0000 means 65536 iterations.
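
    The four check values mentioned above can be run through a small Python model of the transform and the loop. The function names are invented for illustration; the model only counts how often the loop body would execute.

```python
def split_count(de):
    """Model of: dcx d / inr d / inr e. Returns the adjusted (D, E)."""
    de = (de - 1) & 0xFFFF            # dcx d : 16-bit decrement, no flags involved
    d = ((de >> 8) + 1) & 0xFF        # inr d
    e = ((de & 0xFF) + 1) & 0xFF      # inr e
    return d, e


def passes_de(d, e):
    """Model of: loop: <body> / dcr e / jnz loop / dcr d / jnz loop."""
    passes = 0
    while True:
        passes += 1                   # loop body
        e = (e - 1) & 0xFF            # dcr e
        if e != 0:                    # jnz loop
            continue
        d = (d - 1) & 0xFF            # dcr d
        if d != 0:                    # jnz loop
            continue
        return passes
```

    In this model $0003 gives 3 passes, $0103 gives 259, $0100 gives 256 and $0000 gives 65536, and E after the transform always equals the original low byte.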

  • From dxforth@21:1/5 to Phillip Stevens on Sun Aug 13 12:35:12 2023
    On 13/08/2023 12:33 am, Phillip Stevens wrote:
    >>> Using these enhancements, and a “native” C compiler, the 8085 actually beats the z80 in some of our z88dk benchmarks.
    >>>
    >>> So one excellent code table of (256) instructions is a pretty nice design.
    >>> https://feilipu.me/2021/09/27/8085-software/
    >>
    >> I assumed any undoc instructions would be either side-effects or planned but
    >> broken in some way. From what you say this doesn't appear to be the case here.
    >> Sounds like it was an executive decision to leave them out. Unfortunately the
    >> result is the same - software that uses undoc instructions have a certain odour
    >> to them.
    >
    > I wouldn’t worry too much. Tundra Semiconductor licensed the 80c85 design and published them as “enhanced instructions” in their datasheet. Complete with their own mistakes and mis interpretation.
    >
    > http://images.100y.com.tw/pdf_file/34-TUNDRA-CA80C85B.pdf
    >
    > Ken Shirriff covers the reverse engineering in some details and corrects the error in the Tundra datasheet.
    > His whole series is great reading on the 8085.
    >
    > http://www.righto.com/2013/02/looking-at-silicon-to-understanding.html?m=1
    >
    > So safe to use with no hesitation or odour. :-)

    Intel gave Tundra license to reproduce the silicon. Publishing undoc instructions
    that were never their own, not industry standard, and unbelievably - screwing it
    up - might give some customers pause to think. Marketing gimmick gone wrong?

  • From dxforth@21:1/5 to George Phillips on Sun Aug 13 12:47:05 2023
    On 13/08/2023 4:57 am, George Phillips wrote:
    > If you're willing to alter the count you can do the 16 bit loop as two 8 bit tests which will be faster. For example, suppose your 16 bit loop count is in DE. This code will execute the loop DE times:
    >
    > ; transform DE to split count
    > dcx d
    > inr d
    > inr e
    > loop:
    > ; ... processing here
    > dcr e
    > jnz loop
    > dcr d
    > jnz loop
    >
    > To see how it works consider the two cases going in where E is zero and not zero. Or, equivalently, when DE is an exact multiple of 256 or not. Note that the value of E never changes. If E is not zero then 1 is added to D. The inner loop will do E iterations and then after that it will do 256 iterations D times (the original value of D). Check it with $0003 and $0103 to get the idea.
    >
    > If E is zero then D is not altered either. The inner loop will always be 256 iterations and will be executed D times. Checking $0100 and $0000 will show the correctness. Like with a conventional 16 bit loop, $0000 means 65536 iterations.

    Also ROMable

  • From Mark Ogden@21:1/5 to George Phillips on Sun Aug 13 08:29:00 2023
    On Saturday, 12 August 2023 at 19:57:57 UTC+1, George Phillips wrote:
    > If you're willing to alter the count you can do the 16 bit loop as two 8 bit tests which will be faster. For example, suppose your 16 bit loop count is in DE. This code will execute the loop DE times:
    >
    > ; transform DE to split count
    > dcx d
    > inr d
    > inr e
    > loop:
    > ; ... processing here
    > dcr e
    > jnz loop
    > dcr d
    > jnz loop
    >
    > To see how it works consider the two cases going in where E is zero and not zero. Or, equivalently, when DE is an exact multiple of 256 or not. Note that the value of E never changes. If E is not zero then 1 is added to D. The inner loop will do E iterations and then after that it will do 256 iterations D times (the original value of D). Check it with $0003 and $0103 to get the idea.
    >
    > If E is zero then D is not altered either. The inner loop will always be 256 iterations and will be executed D times. Checking $0100 and $0000 will show the correctness. Like with a conventional 16 bit loop, $0000 means 65536 iterations.

    If a count of zero is valid, then the tests need to be moved to the start of the loop:
    ; de = count
    inr d ; or lxi d, count + 101h
    inr e
    endtest:
    dcr e
    jnz loop
    dcr d
    jz done
    loop:
    ; processing here
    jmp endtest
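
    A short Python model of this top-tested form, using the inr d / inr e bias, shows that a count of zero skips the body entirely. The function name is made up for illustration and the model only counts body executions.

```python
def passes_top_tested(count):
    """Model of: inr d / inr e / endtest: dcr e / jnz loop / dcr d / jz done /
    loop: <body> / jmp endtest. Returns how many times the body runs."""
    d = ((count >> 8) + 1) & 0xFF     # inr d
    e = ((count & 0xFF) + 1) & 0xFF   # inr e
    passes = 0
    while True:
        e = (e - 1) & 0xFF            # endtest: dcr e
        if e == 0:                    # jnz loop not taken
            d = (d - 1) & 0xFF        # dcr d
            if d == 0:                # jz done
                return passes
        passes += 1                   # loop: body, then jmp endtest
```

    In this model a count of 0 gives 0 passes, 3 gives 3, $00FF gives 255, $0100 gives 256 and $FFFF gives 65535.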
