Forum: >>> Magnum BBS <<<

Concertina III May Be Returning

From John Savard@21:1/5 to All on Sun Aug 31 06:17:07 2025

I have had so much success in adjusting Concertina II to achieve my goals
more fully than I had thought possible... that I now think that it may be
possible to proceed from Concertina II to a design which gets rid of the
one feature of Concertina II that has been the most controversial.

Yes, I think that I could actually do without block structure.

What would Concertina III look like?

Well, the basic instruction set would be similar to that of Concertina II.
But the P bits would be taken out of the operate instructions, and so
would the option of replacing a register specification by a pseudo-
immediate pointer.

The tiny gaps between the opcodes of some instructions to squeeze out
space for block headers would be removed.

But the large regions of opcode space that were reserved for the shortest
block header prefixes would be reused for the formats that make it
possible to do without headers.

Instead of a block header being used to indicate code consisting of
variable-length instructions, variable-length instructions would be
contained within a sequence of pairs of 32-bit instructions of this form:

11110xx(17 bits)(8 bits)
11111x(9 bits)(17 bits)

Instructions could be 17 bits long, 34 bits long, 51 bits long, and so on,
any multiple of 17 bits in length.

In the first instruction slot of the pair, the two bits xx indicate, for
each of the two 17-bit regions of the variable-length instruction area
that start in it, whether that region is the first 17-bit area of an
instruction. The second instruction slot only contains the start of one
17-bit area, so only one bit x is needed. Since 17 is an odd number, this
meshes perfectly with the fact that the 17-bit area which straddles both
words isn't split evenly: 8 bits of it are in the first 32-bit instruction
slot, and the extra bit goes with the other 8 in the second.
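
As an illustration of the pair layout described above, here is a minimal Python sketch that splits two 32-bit words into three 17-bit areas with their start flags. It assumes the fields appear most significant bit first in the order written, and that a flag of 1 means "this area begins an instruction"; neither is stated in the post. The names unpack_pair and pack_pair are made up for the example.

```python
def pack_pair(areas, starts):
    """Build the two 32-bit words from three 17-bit areas and three start flags."""
    a0, a1, a2 = areas
    x0, x1, x2 = (int(s) for s in starts)
    # Word 0: 11110 | x0 x1 | area0 (17 bits) | high 8 bits of area1
    w0 = (0b11110 << 27) | (x0 << 26) | (x1 << 25) | (a0 << 8) | (a1 >> 9)
    # Word 1: 11111 | x2 | low 9 bits of area1 | area2 (17 bits)
    w1 = (0b11111 << 27) | (x2 << 26) | ((a1 & 0x1FF) << 17) | a2
    return w0, w1


def unpack_pair(w0, w1):
    """Return [(is_start, area), ...] for the three 17-bit areas in a pair."""
    if (w0 >> 27) != 0b11110 or (w1 >> 27) != 0b11111:
        raise ValueError("not a variable-length instruction pair")
    x0 = (w0 >> 26) & 1
    x1 = (w0 >> 25) & 1
    area0 = (w0 >> 8) & 0x1FFFF
    hi8 = w0 & 0xFF            # the 8 bits of the straddling area in word 0
    x2 = (w1 >> 26) & 1
    lo9 = (w1 >> 17) & 0x1FF   # the 9 bits of the straddling area in word 1
    area2 = w1 & 0x1FFFF
    area1 = (hi8 << 9) | lo9
    return [(bool(x0), area0), (bool(x1), area1), (bool(x2), area2)]
```

Each pair therefore carries 51 instruction bits in 64 bits of storage, with 13 bits going to the fixed prefixes and start flags.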

I had been hoping to use 18-bit areas instead, but after re-checking my
calculations, I found there just wasn't enough opcode space.

Long instructions that contain immediates would not be part of variable-
length instruction code. Instead, their lengths would be multiples of 32
bits, making them part of ordinary code with 32-bit instructions.

Their form would be like this:

32-bit immediate:

1010(12 bits)(16 bits)
10111(11 bits)(16 bits)

where the first parenthesized area belongs to the instruction, and the
second to the immediate.

48-bit immediate:

1010(12 bits)(16 bits)
10110(11 bits)(16 bits)
10111(11 bits)(16 bits)

64-bit immediate:

1010(12 bits)(16 bits)
10110(3 bits)(24 bits)
10111(3 bits)(24 bits)

Since the instruction, exclusive of the immediate, really only needs 12
bits (a 7-bit opcode and a 5-bit destination register), in each case there
is enough additional space for the instruction to begin with a few bits
that indicate its length, so that decoding is simple.
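
The three immediate layouts above can be checked and exercised with a short Python sketch. It assumes the immediate pieces are concatenated most significant piece first, which the post does not specify. Because the post also does not say where the length bits sit inside the instruction field, the hypothetical helper assemble_immediate takes the immediate width as an argument instead of decoding it.

```python
# Per word: (prefix value, prefix width, instruction-field width, immediate-field width)
LAYOUTS = {
    32: [(0b1010, 4, 12, 16), (0b10111, 5, 11, 16)],
    48: [(0b1010, 4, 12, 16), (0b10110, 5, 11, 16), (0b10111, 5, 11, 16)],
    64: [(0b1010, 4, 12, 16), (0b10110, 5, 3, 24), (0b10111, 5, 3, 24)],
}


def assemble_immediate(words, width):
    """Reassemble an immediate of the given width from its 32-bit words."""
    layout = LAYOUTS[width]
    if len(words) != len(layout):
        raise ValueError("wrong number of words for this immediate width")
    value = 0
    for word, (prefix, plen, _ilen, mlen) in zip(words, layout):
        if word >> (32 - plen) != prefix:
            raise ValueError("unexpected prefix bits")
        # Assumed ordering: earlier words hold the more significant pieces.
        value = (value << mlen) | (word & ((1 << mlen) - 1))
    return value
```

In every layout the three fields of each word add up to 32 bits, and the immediate pieces add up to the stated immediate width.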

The scheme is not really space-efficient.
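
To put rough numbers on that, the following Python sketch counts the bits each format spends on fixed prefixes and start flags. It is only arithmetic on the layouts given above; treating exactly those bits as "framing" is my own choice for the illustration, not a definition from the post.

```python
from fractions import Fraction

# name: (total bits in the format, bits spent on fixed prefixes and start flags)
FORMATS = {
    "variable-length pair": (64, 5 + 2 + 5 + 1),
    "32-bit immediate": (64, 4 + 5),
    "48-bit immediate": (96, 4 + 5 + 5),
    "64-bit immediate": (96, 4 + 5 + 5),
}


def overhead(name):
    """Fraction of the format's bits that carry framing rather than payload."""
    total, framing = FORMATS[name]
    return Fraction(framing, total)


for name in FORMATS:
    print(f"{name}: {float(overhead(name)):.1%} framing")
```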

But the question that I really have is... is this really any better than
having block headers? Or is it just as bad, just as complicated?

John Savard

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From John Savard@21:1/5 to BGB on Tue Sep 2 09:15:59 2025

On Sun, 31 Aug 2025 13:12:52 -0500, BGB wrote:

How about, say, 16/32/48/64/96:
xxxx-xxxx-xxxx-xxx0 //16 bit
xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxyy-yyy1 //32 bit
xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xx11-1111 //64/48/96 bit prefix

Already elaborate enough...
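
For reference, the quoted length encoding can be read as the following Python sketch. It assumes the rightmost character of each pattern is the least significant bit and that the low 16 bits are what the decoder sees first; the function name classify is invented for the example.

```python
def classify(low16):
    """Classify an instruction by the low bits of its first 16-bit parcel."""
    if low16 & 1 == 0:
        return "16-bit"                  # pattern ...xxx0
    if low16 & 0x3F == 0x3F:
        return "prefix (48/64/96-bit)"   # pattern ...xx11-1111
    return "32-bit"                      # pattern ...xxyy-yyy1, yyyyy not all ones
```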

Thank you for your interesting suggestions.

I'm envisaging Concertina III as closely based on Concertina II, with only minimal changes.

Like Concertina II, it is to meet the overriding condition that
instructions do not have to be decoded sequentially. This means that
whenever an instruction, or group of instructions, spans more than 32
bits, the 32 bit areas of the instruction, other than the first, must
begin with a combination of bits that says "don't decode me".

The first 32 bits of an instruction get decoded directly, and then trigger
and control the decoding of the rest of the instruction.

This has the consequence that any immediate value that is 32 bits or more
in length has to be split up into smaller pieces; this is what I really
don't like about giving up the block structure.
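
A minimal Python sketch of the "don't decode me" idea, using the prefixes from the layouts in the first post of this thread: words starting with 10110 or 10111 continue an immediate-carrying instruction, and a word starting with 11111 is the second half of a variable-length pair. Treating exactly these three 5-bit prefixes as the complete set is an assumption made for the illustration.

```python
# Top-5-bit patterns assumed to mean "this word is not the start of anything".
CONTINUATION_PREFIXES = {0b10110, 0b10111, 0b11111}


def is_continuation(word):
    """True if this 32-bit word marks itself as a non-first word."""
    return (word >> 27) in CONTINUATION_PREFIXES


def decodable_starts(words):
    """Indices of words that may be decoded directly.

    Each word is examined on its own; nothing is carried over from the
    preceding word, which is the point of the non-sequential requirement.
    """
    return [i for i, w in enumerate(words) if not is_continuation(w)]
```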

John Savard


From MitchAlsup@21:1/5 to All on Tue Sep 2 18:40:16 2025

John Savard <quadibloc@invalid.invalid> posted:

On Sun, 31 Aug 2025 13:12:52 -0500, BGB wrote:

How about, say, 16/32/48/64/96:
xxxx-xxxx-xxxx-xxx0 //16 bit
xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xxyy-yyy1 //32 bit
xxxx-xxxx-xxxx-xxxx-xxxx-xxxx-xx11-1111 //64/48/96 bit prefix

Already elaborate enough...

Thank you for your interesting suggestions.

I'm envisaging Concertina III as closely based on Concertina II, with only minimal changes.

Like Concertina II, it is to meet the overriding condition that
instructions do not have to be decoded sequentially. This means that
whenever an instruction, or group of instructions, spans more than 32
bits, the 32 bit areas of the instruction, other than the first, must
begin with a combination of bits that says "don't decode me".

The first 32 bits of an instruction get decoded directly, and then trigger and control the decoding of the rest of the instruction.

This has the consequence that any immediate value that is 32 bits or more
in length has to be split up into smaller pieces; this is what I really
don't like about giving up the block structure.

I found this completely unnecessary.

Only a small number of Major OpCodes can have constants, denoted by::
0b'001xxxdd dddsssss D12dsmin orxsssss

D=0 signifies '1' and '2' specify 5-bit immediates
D=1 signifies a constant
d=0 signifies 32-bit constant
d=1 signifies 64-bit constant
'1' signifies negation of Src1
'2' signifies negation of Src2

In effect, D12ds is a routing specifier, telling DECODE what to route
where in an easy to determine pattern. You could go so far as to call
it a routing OpCode. This field is a large contributor to how My 66000
requires fewer instructions than Other ISAs.
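
As a sketch of how the D12ds field might be picked out of an instruction word, assuming the 32-character pattern above is read most significant bit first (so D is bit 15 and the field occupies bits 15 down to 11). The names Routing, routing_field and constant_bits are invented for the example, and the 's' bit is extracted but not interpreted because the post does not describe it.

```python
from typing import NamedTuple


class Routing(NamedTuple):
    D: int    # 0: '1' and '2' specify 5-bit immediates; 1: a constant follows
    b1: int   # the bit written '1' in the post
    b2: int   # the bit written '2' in the post
    d: int    # 0: 32-bit constant, 1: 64-bit constant (when D=1)
    s: int    # not described in the post; left uninterpreted here


def routing_field(word):
    """Extract the five routing bits from bits 15..11 (assumed positions)."""
    return Routing(*(((word >> bit) & 1) for bit in (15, 14, 13, 12, 11)))


def constant_bits(word):
    """Size in bits of the attached constant, or 0 when D=0."""
    r = routing_field(word)
    if r.D == 0:
        return 0
    return 64 if r.d else 32
```

Because the field sits at a fixed position, the size of any attached constant can be determined from these few bits alone.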

However, I also found that STs need an immediate and a displacement, so,
Major == 0b'001001 and minor == 0b'011xxx has 4 ST instructions with
potential displacement (from D12ds above) and the immediate has the
size of the ST. This provides for::
std #4607182418800017408,[r3,r2<<3,96]
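
The 64-bit immediate in that example is easier to read in hexadecimal or as a floating-point number. This Python snippet reinterprets the integer as an IEEE 754 double; the helper name as_double is just for illustration.

```python
import struct


def as_double(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


imm = 4607182418800017408
print(hex(imm))        # 0x3ff0000000000000
print(as_double(imm))  # 1.0
```

So the example stores the double-precision value 1.0 with a single instruction.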

Lest one thinks this results in serial decoding, consider that the
pattern decoder is 40 gates (just larger than 3-flip-flops) so one
can afford to put this pattern decoder on every word in the inst-
buffer and then inst[0] selects inst[1], but inst[1] has already
selected inst[2] which has selected inst[3] and we have a tree
pattern that can parse 16-instructions in a 16-gate cycle time
from a 24-32 word input-buffer to DECODE. I call this stage of
the pipeline PARSE.
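
A software analogy of that tree, not the actual circuit: if every word in the buffer speculatively computes how long an instruction starting there would be, the real instruction starts can be found in a logarithmic number of rounds by pointer doubling. The lengths used in the test and the function name instruction_starts are hypothetical.

```python
def instruction_starts(lengths):
    """Find instruction starts given a speculative length (in words) per word.

    lengths[i] is the length an instruction would have if it began at word i;
    values at words that turn out not to be starts are simply ignored.
    """
    n = len(lengths)
    # nxt[i] is the word selected by word i; index n is a past-the-end sentinel.
    nxt = [min(i + lengths[i], n) for i in range(n)] + [n]
    is_start = [False] * (n + 1)
    is_start[0] = True
    span = 1
    while span < n:
        # One level of the tree: every known start marks the word its pointer
        # reaches; in hardware all of these happen at once.
        marked = is_start[:]
        for i in range(n):
            if is_start[i]:
                marked[nxt[i]] = True
        is_start = marked
        # Double every pointer: "inst[1] has already selected inst[2]".
        nxt = [nxt[j] for j in nxt]
        span *= 2
    return [i for i in range(n) if is_start[i]]
```

The number of rounds grows with the logarithm of the buffer size, which is why the selection does not have to serialize the decoder.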

Also note that 1 My 66000 instruction does the work of 1.4 RISC-V
instructions, so a 6-wide My 66000 machine is equivalent to an
8.4-to-9-wide RISC-V machine.

John Savard

