• Re: Chipsandcheese article on the CDC6600

    From Lawrence D'Oliveiro@21:1/5 to Thomas Koenig on Sat Jul 20 09:08:47 2024
    On Sat, 20 Jul 2024 08:35:20 -0000 (UTC), Thomas Koenig wrote:

    https://chipsandcheese.com/2024/04/01/inside-control-data-corporations-cdc-6600/

    Note the date. ;)

    It is true, though, that old Seymour Cray had little truck with fancy “operating systems” and “memory management” and “paging” and “protected mode” and all that jazz. He was just after speed, speed,
    speed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to All on Sat Jul 20 08:35:20 2024
    Not sure who follows this web site; it has quite a few interesting
    articles, usually on newer processor designs. Here's one that's
    maybe not quite so serious, but amusing and informative at the
    same time:

    https://chipsandcheese.com/2024/04/01/inside-control-data-corporations-cdc-6600/

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Terje Mathisen@21:1/5 to Thomas Koenig on Sat Jul 20 19:52:13 2024
    Thomas Koenig wrote:
    Not sure who follows this web site; it has quite a few interesting
    articles, usually on newer processor designs. Here's one that's
    maybe not quite so serious, but amusing and informative at the
    same time:

    https://chipsandcheese.com/2024/04/01/inside-control-data-corporations-cdc-6600/

    I really like their "tongue-in-cheek" delivery. :-)

    "Frontend: Branch Prediction
    There is no branch prediction."

    Terje

    --
    - <Terje.Mathisen at tmsw.no>
    "almost all programming can be viewed as an exercise in caching"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to All on Sat Jul 20 22:18:42 2024
    Errors in the text::


    a) The use of the word non-blocking in the sense that increment does
    not block MUL is weird; we use the term nowadays within an FU, not
    between FUs.

    b) CDC 6600 is NOT in-order; the whole motivation for the scoreboard
    is to enable Out-of-Order execution.

    c) At its introduction, the word cache had not yet been invented--so
    sure, it had no cache, and thereby had to have a powerful central
    memory (banked and interleaved).

    d) It did have a branch predictor--it predicted backwards branches were
    taken.

    e) The strange notion of self-modifying code needing to be 32
    instructions in advance of the instruction being modified is also
    weird.

    f) It is not In Order; the scoreboard tracks RAW and WAR hazards and
    prevents WAW from passing out of the decoder. Then execution order
    is based on the availability of operands (typical) and the
    availability of a result bus (novel); see the sketch below.

    g) Oh, and BTW, there were 20 ports to the register files (plural).

    h) The 2 increment units were 2 sets of operand registers but 1 ALU,
    and the ALU could not do back-to-back integer ADDs.

    Other than that, not bad.
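
    For concreteness, a minimal C sketch of the issue rules described in
    (b) and (f): stall in decode on WAW, then let each functional unit
    wait for its operands (RAW) and for a result bus. Register and unit
    counts are illustrative, not the 6600's, and this follows the
    description above rather than Thornton's actual design:

        #include <stdbool.h>

        #define NREGS 8   /* illustrative; the 6600 had 24 registers */
        #define NFUS  4   /* illustrative; the 6600 had 10 units     */

        typedef struct {
            bool busy;                    /* FU owns an instruction   */
            int  dest, src1, src2;        /* registers written/read   */
            bool src1_ready, src2_ready;  /* RAW: operands available? */
        } FU;

        static FU  fu[NFUS];
        static int pending_writer[NREGS]; /* FU that will write, or -1 */

        void sb_init(void)
        {
            for (int r = 0; r < NREGS; r++)
                pending_writer[r] = -1;
        }

        /* Decode stalls if the unit is busy or on a WAW hazard:
           some unit already holds a reservation on dest. */
        bool can_issue(int unit, int dest)
        {
            return !fu[unit].busy && pending_writer[dest] == -1;
        }

        void issue(int unit, int dest, int src1, int src2)
        {
            fu[unit] = (FU){ .busy = true,
                             .dest = dest, .src1 = src1, .src2 = src2,
                             /* RAW: ready only if nobody still owes it */
                             .src1_ready = pending_writer[src1] == -1,
                             .src2_ready = pending_writer[src2] == -1 };
            pending_writer[dest] = unit;
        }

        /* Result-bus write-back: release the reservation and wake RAW
           waiters. The real scoreboard also holds a unit off the bus
           while another unit still needs to read the OLD value of dest
           (the WAR check), omitted here for brevity. */
        void writeback(int unit)
        {
            int d = fu[unit].dest;
            for (int u = 0; u < NFUS; u++)
                if (fu[u].busy) {
                    if (fu[u].src1 == d) fu[u].src1_ready = true;
                    if (fu[u].src2 == d) fu[u].src2_ready = true;
                }
            pending_writer[d] = -1;
            fu[unit].busy = false;
        }

    Execution order then falls out of availability: a unit fires once
    both operand-ready bits are set and it can win a result bus.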

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Terje Mathisen on Sun Jul 21 01:06:02 2024
    On Sat, 20 Jul 2024 19:52:13 +0200, Terje Mathisen wrote:

    "Frontend: Branch Prediction There is no branch prediction."

    Is that like an American car? Great on the straights, not so good on
    cornering.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Savard@21:1/5 to ldo@nz.invalid on Sun Jul 21 07:35:55 2024
    On Sat, 20 Jul 2024 09:08:47 -0000 (UTC), Lawrence D'Oliveiro
    <ldo@nz.invalid> wrote:

    On Sat, 20 Jul 2024 08:35:20 -0000 (UTC), Thomas Koenig wrote:

    https://chipsandcheese.com/2024/04/01/inside-control-data-corporations-cdc-6600/

    Note the date. ;)

    That bit at the end, about how the world would need maybe only ten
    computers, because the 6600 is so powerful, and out-of-order machines
    running at 5 GHz would never be needed, of course, is the funny
    part...

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Savard on Sun Jul 21 22:01:59 2024
    On Sun, 21 Jul 2024 07:35:55 -0600, John Savard wrote:

    That bit at the end, about how the world would need maybe only ten computers, because the 6600 is so powerful, and out-of-order machines
    running at 5 GHz would never be needed, of course, is the funny part...

    Also a dig at IBM, from whose (early) boss (Watson Sr?) the quote about
    the world only needing ten computers is supposed to have originated.

    The 6600 (and its siblings) also popularized the term “supercomputer”.
    When people at the time thought “computer”, they thought “IBM”. But here
    was a machine that was so far ahead in performance, it left IBM (and
    everybody else) in the dust.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Lawrence D'Oliveiro on Sun Jul 21 22:21:41 2024
    On Sun, 21 Jul 2024 22:01:59 +0000, Lawrence D'Oliveiro wrote:

    On Sun, 21 Jul 2024 07:35:55 -0600, John Savard wrote:

    That bit at the end, about how the world would need maybe only ten
    computers, because the 6600 is so powerful, and out-of-order machines
    running at 5 GHz would never be needed, of course, is the funny part...

    Also a dig at IBM, from whose (early) boss (Watson Sr?) the quote about
    the world only needing ten computers is supposed to have originated.

    The 6600 (and its siblings) also popularized the term “supercomputer”. When people at the time thought “computer”, they thought “IBM”. But here
    was a machine that was so far ahead in performance, it left IBM (and everybody else) in the dust.

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Mon Jul 22 00:05:40 2024
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in the
    laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively junior
    programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost our
    industry leadership position by letting someone else offer the
    world’s most powerful computer.

    as quoted in Charles J Murray’s 1997 book “The Supermen”, page 93.
    Good description of the history of CDC and Cray Research/Cray
    Computer. Notwithstanding the plural in its title, it primarily
    focuses on Seymour Cray, and is not so complimentary about the work of
    some others, notably Steve Chen.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Niklas Holsti@21:1/5 to Lawrence D'Oliveiro on Mon Jul 22 10:43:44 2024
    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in the
    laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively junior
    programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost our
    industry leadership position by letting someone else offer the
    world’s most powerful computer.


    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson has answered his own question."

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Niklas Holsti on Mon Jul 22 13:08:27 2024
    On Mon, 22 Jul 2024 10:43:44 +0300
    Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:

    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in
    the laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively
    junior programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost
    our industry leadership position by letting someone else offer the
    world’s most powerful computer.


    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."




    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.
    Back in their time the 6600 was a huge success, and both the Model 85
    and the Model 91 were probably considered failures.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Eder@21:1/5 to Michael S on Mon Jul 22 13:46:20 2024
    On Mo 22 Jul 2024 at 13:08, Michael S <already5chosen@yahoo.com> wrote:

    On Mon, 22 Jul 2024 10:43:44 +0300
    Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:

    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in
    the laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively
    junior programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost
    our industry leadership position by letting someone else offer the
    world’s most powerful computer.


    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."




    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.
    Back in their time the 6600 was a huge success, and both the Model 85
    and the Model 91 were probably considered failures.

    And what does that tell us about computer architecture? Not much.
    But it says something about market dominance and power.

    'Andreas

    --
    ceterum censeo redmondinem esse delendam

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Mon Jul 22 13:41:40 2024
    On Mon, 22 Jul 2024 10:08:27 +0000, Michael S wrote:

    On Mon, 22 Jul 2024 10:43:44 +0300
    Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:

    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in
    the laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively
    junior programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost
    our industry leadership position by letting someone else offer the
    world’s most powerful computer.


    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."




    In the end, the influence of the 6600 on computers we use today is
    close to zero.

    CDC 6600 had a RISC instruction set !!

    On the other hand, the influence of the S/360 Model 85 is massive,
    and the influence of the S/360 Model 91 is significant, although far
    less than the credit it is often given in popular articles.
    Back in their time the 6600 was a huge success, and both the Model 85
    and the Model 91 were probably considered failures.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Michael S on Mon Jul 22 12:52:35 2024
    Michael S <already5chosen@yahoo.com> writes:
    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.

    Yes, all modern computers have virtual memory (which started with the
    Atlas, and later the S/360 Model 67), they have caches (which started
    with the Titan, and later the S/360 Model 85), and they have
    reservation stations (which started with the S/360 Model 91).

    However, the main reason why reservation stations won is because
    hardware branch prediction outpaced compiler branch prediction since
    the early 1990s*, and because the reorder buffer was invented, neither
    of which is due to anything done in any S/360 model or the CDC 6600.

    If hardware branch prediction had never been invented or had turned
    out to be a dud, maybe we would all be using EPIC architectures that
    use scoreboards rather than reservation stations; or maybe the
    register interlocks that were used in advanced in-order RISCs (those
    that Mitch Alsup calls OoO) and AFAIK in IA-64 implementations were
    good enough and one would have done without a scoreboard.

    [*] More supercomputing-oriented people may claim that it has to do
    with the number of in-flight memory accesses, but actually IA-64 shone
    on SPEC FP (where in-flight memory accesses are more important than
    for SPECint), so it seems that there are ways to get the needed
    in-flight memory accesses with in-order execution.

    Back in their time the 6600 was a huge success, and both the Model 85
    and the Model 91 were probably considered failures.

    From what I read, the Model 91 was considered a technical (and
    marketing) success, but commercially a failure (sold at a loss, and
    therefore quickly canceled). But apparently the market benefit was
    enough that they then built the 360/195 and 370/195. 15 91s were
    built and about 20 195s. The 195 was withdrawn in 1977, and AFAIK
    that was the end of IBM's supercomputing ambitions for a while. This
    may have had to do with the introduction of the Cray-1 in 1976 or the
    IBM 3033 in 1977. IBM eventually announced the optional vector
    facility for the 3090 in 1985. OoO processing vanished from S/360
    successors with the 195 and only reappeared quite a while after it had
    appeared in Intel and RISC CPUs.

    The Model 85 was built only 30 times, but it was the fastest IBM
    machine until the Model 195 (the CDC 7600 was faster, though, and also
    a bit faster than the 195). And the cache was then included in the
    Model 195, and on the 303x, and I expect all later IBM mainframes. So
    that certainly was a success.

    Concerning the CDC 6600, the barrel-processor features of its PPs can
    be considered predecessors of modern SMT (about as close as the Model
    91's reservation stations are to modern OoO).
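
    The PP "barrel" is simple enough to sketch in a few lines of C: ten
    contexts rotate through one shared execution pipe, one step per
    minor cycle. The context fields here are invented for illustration:

        #define NPP 10  /* ten PPs shared one execution unit */

        typedef struct {
            unsigned pc;   /* per-PP state; fields are illustrative */
            unsigned acc;
        } PPContext;

        static PPContext pp[NPP];

        /* Placeholder for one fetch/execute step of a PP program. */
        static void step(PPContext *c) { c->pc++; }

        /* One minor cycle: the current slot advances one step, then
           the barrel rotates to the next PP -- fixed round-robin
           multithreading, which is why it reads as a precursor of SMT. */
        void barrel_cycle(void)
        {
            static int slot = 0;
            step(&pp[slot]);
            slot = (slot + 1) % NPP;
        }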

    Concerning the Atlas, it's interesting that a project intended as a supercomputer introduced virtual memory, while Cray rejected VM for as
    long as he designed the CPUs.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Mon Jul 22 17:51:21 2024
    On Mon, 22 Jul 2024 13:41:40 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:

    On Mon, 22 Jul 2024 10:08:27 +0000, Michael S wrote:

    On Mon, 22 Jul 2024 10:43:44 +0300
    Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:

    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24
    people, including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which
    they officially announced their 6600 system. I understand that in
    the laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively
    junior programmer. Contrasting this modest effort with our own
    vast development activities, I fail to understand why we have lost
    our industry leadership position by letting someone else offer the
    world’s most powerful computer.


    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."




    In the end, the influence of the 6600 on computers we use today is
    close to zero.

    CDC 6600 had a RISC instruction set !!


    It has RISC-like features, most importantly a load-store architecture.
    Simple instruction formats and only two instruction widths also sound
    RISCy. But the hard coupling between A registers and X registers does
    not look like RISC to me.
    Maybe I didn't think enough about it.

    Anyway, it seems to me that modern RISC was re-invented more or less
    from scratch in the late 70s, and the influence of the 6600
    architecture likely didn't play a major role in that process.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Anton Ertl on Mon Jul 22 15:09:22 2024
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    Michael S <already5chosen@yahoo.com> writes:
    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.

    Yes, all modern computers have virtual memory (which started with the
    Atlas, and later the S/360 Model 67), they have caches (which started
    with the Titan, and later the S/360 Model 85), and they have
    reservation stations (which started with the S/360 Model 91).

    However, the main reason why reservation stations won is because
    hardware branch prediction outpaced compiler branch prediction since
    the early 1990s*, and because the reorder buffer was invented, neither
    of which is due to anything done in any S/360 model or the CDC 6600.

    FWIW, the Burroughs medium systems had hardware branch
    prediction circa 1979. It had some warts, however,
    when used in SMP configurations.

    Basically, the hardware would modify the branch opcode in
    memory after every branch to track the last two taken/not-taken
    states.
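
    A sketch of that mechanism in C, with the state kept in the opcode
    byte itself. The bit positions and the prediction rule are
    assumptions for illustration; the actual Burroughs encoding isn't
    given here:

        #include <stdbool.h>
        #include <stdint.h>

        #define HIST_MASK 0x03u  /* assumed: low 2 opcode bits = history */

        /* Predict from the stored history; this rule (follow the most
           recent outcome) is an assumption. */
        bool predict(uint8_t opcode)
        {
            return (opcode & 0x01u) != 0;
        }

        /* After resolution, rewrite the instruction in memory, shifting
           in the new outcome. On an SMP sharing code, processors stomp
           on each other's history -- hence the warts. */
        void update(uint8_t *opcode_in_memory, bool taken)
        {
            uint8_t hist = (uint8_t)(((*opcode_in_memory & HIST_MASK) << 1)
                                     | (taken ? 1u : 0u));
            *opcode_in_memory = (uint8_t)((*opcode_in_memory & ~HIST_MASK)
                                          | (hist & HIST_MASK));
        }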

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Anton Ertl on Mon Jul 22 15:10:50 2024
    On Mon, 22 Jul 2024 12:52:35 +0000, Anton Ertl wrote:

    Michael S <already5chosen@yahoo.com> writes:
    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.

    Yes, all modern computers have virtual memory (which started with the
    Atlas, and later the S/360 Model 67), they have caches (which started
    with the Titan, and later the S/360 Model 85), and they have
    reservation stations (which started with the S/360 Model 91).

    However, the main reason why reservation stations won is because
    hardware branch prediction outpaced compiler branch prediction since
    the early 1990s*, and because the reorder buffer was invented, neither
    of which is due to anything done in any S/360 model or the CDC 6600.

    If hardware branch prediction had never been invented or had turned
    out to be a dud, maybe we would all be using EPIC architectures that
    use scoreboards rather than reservation stations; or maybe the

    The CDC 7600 predicted backward branches to be taken, and this was
    worth a handful of % in performance gain:: and it used no storage to
    do it. So an "as dumb as possible" predictor delivered gains.

    It is all uphill from there.
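
    The whole 7600 "predictor", as a C one-liner (addresses are
    illustrative 32-bit values):

        #include <stdbool.h>
        #include <stdint.h>

        /* A branch to a lower address is assumed to be a loop back-edge
           and predicted taken; no tables, no history, no storage. */
        bool predict_taken(uint32_t branch_pc, uint32_t target_pc)
        {
            return target_pc < branch_pc;
        }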

    register interlocks that were used in advanced in-order RISCs (those
    that Mitch Alsup calls OoO) and AFAIK in IA-64 implementations were
    good enough and one would have done without a scoreboard.

    Register interlocks are the means to allow GHW to move instructions
    around in the pipeline--you just have to obey RAW, WAR, and WAW
    hazards.

    [*] More supercomputing-oriented people may claim that it has to do
    with the number of in-flight memory accesses, but actually IA-64 shone
    on SPEC FP (where in-flight memory accesses are more important than
    for SPECint), so it seems that there are ways to get the needed
    in-flight memory accesses with in-order execution.

    IA-64 had 2× the number of pins compared to its x86 brethren.
    No wonder it could consume more BW.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Thomas Koenig@21:1/5 to Scott Lurndal on Mon Jul 22 15:42:18 2024
    Scott Lurndal <scott@slp53.sl.home> schrieb:

    Basically, the hardware would modify the branch opcode in
    memory after every branch to track the last two taken/not-taken
    states.

    Ingenious.

    Talk about self-modifying code...


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Scott Lurndal on Mon Jul 22 16:33:28 2024
    scott@slp53.sl.home (Scott Lurndal) writes:
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    However, the main reason why reservation stations won is because
    hardware branch prediction outpaced compiler branch prediction since
    the early 1990s*, and because the reorder buffer was invented, neither
    of which is due to anything done in any S/360 model or the CDC 6600.

    FWIW, the Burroughs medium systems had hardware branch
    prediction circa 1979. It had some warts, however,
    when used in SMP configurations.

    Basically, the hardware would modify the branch opcode in
    memory after every branch to track the last two taken/not-taken
    states.

    That sounds like the two-bit scheme that early conditional branch
    predictors used, but inlined in the instructions and thus
    architecturally visible (whereas branch prediction is normally microarchitecture, i.e., architecturally invisible). These two-bit
    schemes were about as good as the better compiler-based schemes (IIRC
    10% mispredictions on typical integer code). It was branch prediction
    schemes with global history tables etc. that made hardware branch
    prediction much more accurate and meant that OoO execution
    outperformed EPIC.
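
    A minimal sketch of the global-history idea, gshare-style: the same
    two-bit counters, but indexed by the branch address XORed with a
    shift register of recent outcomes, so differently-correlated
    executions of a branch land on different counters. Table size and
    hash are illustrative:

        #include <stdbool.h>
        #include <stdint.h>

        #define TABLE_BITS 12
        #define TABLE_SIZE (1u << TABLE_BITS)

        static uint8_t  ctr[TABLE_SIZE];  /* 2-bit counters, 0..3   */
        static uint32_t ghr;              /* global history register */

        static uint32_t idx(uint32_t pc)
        {
            return (pc ^ ghr) & (TABLE_SIZE - 1);  /* fold in history */
        }

        bool predict(uint32_t pc)
        {
            return ctr[idx(pc)] >= 2;  /* 2 or 3 => predict taken */
        }

        void update(uint32_t pc, bool taken)
        {
            uint8_t *c = &ctr[idx(pc)];
            if (taken)  { if (*c < 3) (*c)++; }
            else        { if (*c > 0) (*c)--; }
            ghr = (ghr << 1) | (taken ? 1u : 0u);  /* append outcome */
        }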

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to mitchalsup@aol.com on Mon Jul 22 19:31:24 2024
    On Mon, 22 Jul 2024 15:10:50 +0000
    mitchalsup@aol.com (MitchAlsup1) wrote:


    IA-64 had 2× the number of pins compared to its x86 brethren.
    No wonder it could consume more BW.

    Madison Itanium with a 400 MHz FSB had the same theoretical memory
    bandwidth (6.4 GB/s) as AMD S939/S940 K8 Opteron/Athlon64 processors,
    and somewhat lower practical bandwidth because of the higher latency
    caused by the off-chip memory controller.
    Despite that, its SPECfp2000 scores were slightly (5-6%) higher.

    Contemporary Intel x86 processors (Nocona Xeon) had a data bus half
    as wide, but running at twice the data rate. So, at least
    theoretically, peak bandwidth was the same. Practically, bandwidth
    was somewhat lower because the source-synchronous bus of the Xeon had
    higher latency than the simpler clock-synchronous bus of the
    Itanium 2.
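
    The arithmetic behind "half as wide at twice the rate", using the
    commonly quoted bus parameters (assumptions, not stated above):
    Itanium 2 "Madison" with a 128-bit FSB at 400 MT/s, Nocona with a
    64-bit FSB at 800 MT/s:

        #include <stdio.h>

        int main(void)
        {
            /* peak bandwidth = transfers/s x bytes/transfer */
            double madison = 400e6 * 16;  /* 128-bit bus = 16 bytes */
            double nocona  = 800e6 *  8;  /*  64-bit bus =  8 bytes */
            printf("Madison: %.1f GB/s, Nocona: %.1f GB/s\n",
                   madison / 1e9, nocona / 1e9);  /* both 6.4 GB/s */
            return 0;
        }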

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to mitchalsup@aol.com on Mon Jul 22 16:40:48 2024
    mitchalsup@aol.com (MitchAlsup1) writes:
    If hardware branch prediction had never been invented or had turned
    out to be a dud, maybe we would all be using EPIC architectures that
    use scoreboards rather than reservation stations; or maybe the

    The CDC 7600 predicted backward branches to be taken

    That's a primitive form of compiler branch prediction. More advanced
    schemes had a direction hint in the instruction. These schemes are
    not hardware branch prediction as far as "compiler vs. hardware branch prediction" is concerned.

    register interlocks that were used in advanced in-order RISCs (those
    that Mitch Alsup calls OoO) and AFAIK in IA-64 implementations were
    good enough and one would have done without a scoreboard.

    Register interlocks are the means to allow GHW to move instructions
    around in the pipeline--you just have to obey RAW, WAR, and WAW
    hazards.

    What is GHW? Stanford MIPS and most of the MIPS R2000/R3000 moved
    instructions in the pipeline without interlocks. It's in their name:
    Microprocessor without Interlocked Pipeline Stages.

    [*] More supercomputing-oriented people may claim that it has to do
    with the number of in-flight memory accesses, but actually IA-64 shone
    on SPEC FP (where in-flight memory accesses are more important than
    for SPECint), so it seems that there are ways to get the needed
    in-flight memory accesses with in-order execution.

    IA-64 had 2× the number of pins compared to its x86 brethren.
    No wonder it could consume more BW.

    It did not help it a bit with integer code. If the myth were true
    that only OoO enables many in-flight memory accesses, it would not
    help with bandwidth-hungry code, either. The fact that IA-64
    implementations could make use of the bandwidth busts that myth.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Anton Ertl on Mon Jul 22 18:23:02 2024
    On Mon, 22 Jul 2024 16:40:48 +0000, Anton Ertl wrote:

    What is GHW?

    HW with a G that should not have been there.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to Anton Ertl on Mon Jul 22 13:09:12 2024
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    From what I read, the Model 91 was considered a technical (and
    marketing) success, but commercially a failure (sold at a loss, and
    therefore quickly canceled). But apparently the market benefit was
    enough that they then built the 360/195 and 370/195. 15 91s were
    built and about 20 195s. The 195 was withdrawn in 1977, and AFAIK
    that was the end of IBM's supercomputing ambitions for a while. This
    may have had to do with the introduction of the Cray-1 in 1976 or the
    IBM 3033 in 1977. IBM eventually announced the optional vector
    facility for the 3090 in 1985. OoO processing vanished from S/360
    successors with the 195 and only reappeared quite a while after it had appeared in Intel and RISC CPUs.

    ... also shortly after joining IBM, I was asked if I could help with
    a project to multi-thread the 370/195 .... also from the acs_end page:
    https://people.computing.clemson.edu/~mark/acs_end.html

    Sidebar: Multithreading

    In summer 1968, Ed Sussenguth investigated making the ACS/360 into a multithreaded design by adding a second instruction counter and a second
    set of registers to the simulator. Instructions were tagged with an
    additional "red/blue" bit to designate the instruction stream and
    register set; and, as was expected, the utilization of the functional
    units increased since more independent instructions were available.

    IBM patents and disclosures on multithreading include:

    US Patent 3,728,692, J.W. Fennel, Jr., "Instruction selection in a
    two-program counter instruction unit," filed August 1971, and issued
    April 1973.

    US Patent 3,771,138, J.O. Celtruda, et al., "Apparatus and method for serializing instructions from two independent instruction streams,"
    filed August 1971, and issued November 1973. [Note that John Earle is
    one of the inventors listed on the '138.]

    "Multiple instruction stream uniprocessor," IBM Technical Disclosure
    Bulletin, January 1976, 2pp. [for S/370]

    ... snip ...

    The 370/195 had a 64-instruction pipeline and could do out-of-order
    ... but didn't have branch prediction or speculative execution ... so
    conditional branches drained the pipeline and most codes ran at half
    the 195's rated throughput. Simulating a multiprocessor with red/blue
    instruction streams ... could get two half-rate streams running the
    195 at full speed (modulo MVT/MVS two-processor support only having
    1.2-1.5 times the throughput of a single processor). The whole thing
    was shut down when it was decided to add virtual memory to all 370s
    ... which was decided not practical for the 195.

    z196 (July 2010) documents claim that half of the per-processor
    improvement in MIP rate (compared to the previous z10) is due to the
    introduction of things like out-of-order.

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to Michael S on Mon Jul 22 12:57:03 2024
    Michael S <already5chosen@yahoo.com> writes:
    In the end, the influence of the 6600 on computers we use today is
    close to zero. On the other hand, the influence of the S/360 Model 85
    is massive, and the influence of the S/360 Model 91 is significant,
    although far less than the credit it is often given in popular
    articles.
    Back in their time the 6600 was a huge success, and both the Model 85
    and the Model 91 were probably considered failures.


    Amdahl wins the battle to make ACS 360-compatible, then shortly
    afterwards ACS was shut down (folklore is executives felt it would
    advance the state of the art too fast and IBM would lose control of
    the market), and Amdahl leaves IBM shortly after ... lots of history,
    including some of the ACS features showing up more than two decades
    later with ES/9000:
    https://people.computing.clemson.edu/~mark/acs_end.html
    Of the 26,000 IBM computer systems in use, 16,000 were S/360 models
    (that is, over 60%). [Fig. 1.311.2]

    Of the general-purpose systems having the largest fraction of total
    installed value, the IBM S/360 Model 30 was ranked first with 12%
    (rising to 17% in 1969). The S/360 Model 40 was ranked second with 11%
    (rising to almost 15% in 1970). [Figs. 2.10.4 and 2.10.5]

    Of the number of operations per second in use, the IBM S/360 Model 65
    ranked first with 23%. The Univac 1108 ranked second with slightly over
    14%, and the CDC 6600 ranked third with 10%. [Figs. 2.10.6 and 2.10.7]

    ... snip ...

    old email:

    To: wheeler
    Date: 04/23/81 09:57:42

    your ramblings concerning the corp(se?) showed up in my reader
    yesterday. like all good net people, i passed them along to 3 other
    people. like rabbits interesting things seem to multiply on the
    net. many of us here in pok experience the sort of feelings your mail
    seems so burdened by: the company, from our point of view, is out of
    control. i think the word will reach higher only when the almighty $$$
    impact starts to hit. but maybe it never will. its hard to imagine one
    stuffed company president saying to another (our) stuffed company
    president i think i'll buy from those inovative freaks down the
    street. '(i am not defending the mess that surrounds us, just trying to understand why only some of us seem to see it).

    bob tomasulo and dave anderson, the two poeple responsible for the model
    91 and the (incredible but killed) hawk project, just left pok for the
    new stc computer company. management reaction: when dave told them he
    was thinking of leaving they said 'ok. 'one word. 'ok. ' they tried to
    keep bob by telling him he shouldn't go (the reward system in pok could
    be a subject of long correspondence). when he left, the management
    position was 'he wasn't doing anything anyway. '

    in some sense true. but we haven't built an interesting high-speed
    machine in 10 years. look at the 85/165/168/3033/trout. all the same
    machine with treaks here and there. and the hordes continue to sweep in
    with faster and faster machines. true, endicott plans to bring the
    low/middle into the current high-end arena, but then where is the
    high-end product development?

    ... snip ...

    In the first part of the 70s, IBM had the Future System effort,
    completely different from 370 and going to completely replace it
    (internal politics during FS was killing off 370 products; the lack
    of new 370s during FS is credited with giving the clone 370 makers
    their market foothold, including Amdahl) ... when FS imploded there
    was a mad rush to get new stuff back into the 370 product pipelines,
    including kicking off the quick&dirty 3033 and 3081 efforts in
    parallel.
    http://www.jfsowa.com/computer/memo125.htm
    https://people.computing.clemson.edu/~mark/fs.html

    370/XA was referred to as "811" for the architecture/design
    documents' Nov 1978 publication date; nearly all of it was to address
    MVS shortcomings (aka the head of POK had shortly before managed to
    convince corporate to kill the VM370 product, shut down the
    development group and have all the people transferred to POK for
    MVS/XA; Endicott did eventually manage to save the VM370 product
    mission ... for the "mid-range").

    trivia: when I joined IBM, one of my hobbies was an enhanced
    production operating system for internal datacenters (including the
    world-wide sales&marketing support HONE). In the original morph of
    CP67->VM370, lots of stuff was dropped and/or simplified (including
    multiprocessor support). In 1974, I start moving a bunch of stuff to
    VM370R2, including a kernel reorg for multiprocessor support, but not
    actual multiprocessor support itself. In 1975, I move my CSC/VM
    system to VM370R3 and add multiprocessor support, originally for the
    US consolidated sales&marketing support HONE datacenters up in Palo
    Alto (the US systems had been consolidated into a single system
    image, a loosely-coupled, shared-DASD operation with load-balancing
    and fall-over, one of the largest such complexes in the world). The
    multiprocessor support allowed them to add a 2nd processor to each
    system (making it the largest in the world; airlines' TPF had similar
    shared-DASD complexes, but TPF didn't get SMP support for another
    decade). I had done some hacks in order to get a two-processor system
    twice the throughput of a single processor (at the time MVS
    documentation said two-processor MVS had 1.2-1.5 times the throughput
    of a single processor).

    With the implosion of FS (and the demise of the VM370 development
    group) ... I got roped into helping with a 16-processor 370 SMP, and
    we con'ed the 3033 processor engineers into working on it in their
    spare time (a lot more interesting than remapping 168 logic to 20%
    faster chips). Everybody thought it was great until somebody tells
    the head of POK it could be decades before the POK favorite-son
    operating system (MVS) would have (effective) 16-processor support
    (POK doesn't ship a 16-processor system until after the turn of the
    century), and the head of POK invites some of us to never visit POK
    again (and tells the 3033 processor engineers: heads down on 3033 and
    no distractions). Some POK executives were also out bullying internal
    datacenters (including HONE) that they had to convert from VM370 to
    MVS. Once 3033 was out the door, they start on trout/3090.

    trivia: in Jan 1979, I was con'ed into doing a 6600 Fortran benchmark
    on an engineering IBM 4341 (mid-range), for a national lab that was
    looking at getting 70 of them for a compute farm (sort of the leading
    edge of the coming cluster supercomputing tsunami) ... the
    engineering 4341 benchmark was slightly slower than the 6600, but the
    production machines that shipped were slightly faster.

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Tue Jul 23 00:20:46 2024
    On Mon, 22 Jul 2024 13:08:27 +0300, Michael S wrote:

    In the end, the influence of the 6600 on computers we use today is
    close to zero.

    He pioneered pipelining and multiple function units. He went on to pioneer vector processing (long vectors, not the short-vector SIMD stuff that
    infests CPU designs today). He was always very conservative in the
    fabrication technologies he adopted, but he was brilliant at pushing them
    to their limits.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Savard@21:1/5 to niklas.holsti@tidorum.invalid on Mon Jul 22 18:18:52 2024
    On Mon, 22 Jul 2024 10:43:44 +0300, Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:
    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in the
    laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively junior
    programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost our
    industry leadership position by letting someone else offer the
    world’s most powerful computer.

    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."

    While IBM did not appear to understand the wisdom in Cray's remark at
    the time - that a large organization can have internal politics and communications overhead and other things that hamper innovation - it
    did _eventually_ learn its lesson.

    So when it came time for IBM to make its mark in the new, emerging
    field of microcomputers, it had a small team, working in isolation
    from the rest of IBM, go and design the IBM Personal Computer in all
    its 4.77 MHz 8088 glory.

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Michael S on Tue Jul 23 00:21:44 2024
    On Mon, 22 Jul 2024 17:51:21 +0300, Michael S wrote:

    On Mon, 22 Jul 2024 13:41:40 +0000 mitchalsup@aol.com (MitchAlsup1)
    wrote:

    CDC 6600 had a RISC instruction set !!


    It has RISC-like features, most importantly a load-store architecture.
    Simple instruction formats and only two instruction widths also sound
    RISCy. But the hard coupling between A registers and X registers does
    not look like RISC to me.

    RISC-V is adopting a very similar idea, in preference to the widespread
    current SIMD fashion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lynn Wheeler on Tue Jul 23 00:22:51 2024
    On Mon, 22 Jul 2024 12:57:03 -1000, Lynn Wheeler wrote:

    Of the general-purpose systems having the largest fraction of total
    installed value ...

    What an odd way to measure market share, that will naturally favour more expensive machines over the more popular ones.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to John Savard on Tue Jul 23 00:38:53 2024
    On Tue, 23 Jul 2024 0:18:52 +0000, John Savard wrote:

    On Mon, 22 Jul 2024 10:43:44 +0300, Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:
    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24 people,
    including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which they
    officially announced their 6600 system. I understand that in the
    laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively junior
    programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost our
    industry leadership position by letting someone else offer the
    world’s most powerful computer.

    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."

    While IBM did not appear to understand the wisdom in Cray's remark at
    the time - that a large organization can have internal politics and communications overhead and other things that hamper innovation - it
    did _eventually_ learn its lesson.

    The largest a team should be is 11 + leader--Jesus tried for 12 and
    failed.

    So when it came time for IBM to make its mark in the new, emerging
    field of microcomputers, it had a small team, working in isolation
    from the rest of IBM, go and design the IBM Personal Computer in all
    its 4.77 MHz 8088 glory.

    John Savard

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Anton Ertl on Tue Jul 23 01:53:13 2024
    On Mon, 22 Jul 2024 12:52:35 GMT, Anton Ertl wrote:

    Yes, all modern computers have virtual memory (which started with the
    Atlas, and later the S/360 Model 67), they have caches (which started
    with the Titan, and later the S/360 Model 85), and they have
    reservation stations (which started with the S/360 Model 91).

    What happened to the 360/90? That was IBM’s long-promised answer to the
    CDC 6600. Was it ever more than vapourware? In the end, it was a classic
    case of over-promising and under-delivering.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to All on Tue Jul 23 01:56:25 2024
    On Tue, 23 Jul 2024 00:38:53 +0000, MitchAlsup1 wrote:

    The largest a team should be is 11 + leader--Jesus tried for 12 and
    failed.

    CDC had a team about 3× that size and succeeded.

    Just as well we don’t take religious texts as serious guides to anything important, do we? ;)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Lynn Wheeler on Tue Jul 23 01:58:21 2024
    On Mon, 22 Jul 2024 13:09:12 -1000, Lynn Wheeler wrote:

    IBM patents and disclosures on multithreading include:

    ...


    The whole thing was shutdown when it was decided to
    add virtual memory to all 370s ... which was decided not practical for
    195.

    Yet another bit of evidence that you don’t need to prove an idea works
    before getting a patent on it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to John Savard on Tue Jul 23 01:55:00 2024
    On Mon, 22 Jul 2024 18:18:52 -0600, John Savard wrote:

    While IBM did not appear to understand the wisdom in Cray's remark at
    the time - that a large organization can have internal politics and communications overhead and other things that hamper innovation - it did _eventually_ learn its lesson.

    So when it came time for IBM to make its mark in the new, emerging field
    of microcomputers, it had a small team, working in isolation from the
    rest of IBM, go and design the IBM Personal Computer in all its 4.77 MHz
    8088 glory.

    That turned out to be an exception (a temporary lapse in corporate
    control) rather than a rule. IBM as a whole learned no such lesson, as the subsequent PS/2 and OS/2 development showed.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Tue Jul 23 01:59:12 2024
    On Mon, 22 Jul 2024 15:09:22 GMT, Scott Lurndal wrote:

    Basically, the hardware would modify the branch opcode in memory after
    every branch to track the last two taken/not-taken states.

    Did the Burroughs share code between processes/threads?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lynn Wheeler@21:1/5 to mitchalsup@aol.com on Mon Jul 22 17:01:10 2024
    mitchalsup@aol.com (MitchAlsup1) writes:
    The largest a team should be is 11 + leader--Jesus tried for 12 and
    failed.

    trivia: the science center wanted to get a 360/50 to modify for
    virtual memory, but all the extra 360/50s were going to FAA ATC ...
    and so they had to settle for a 360/40 ... they implemented virtual
    memory with an associative array that held the process ID and virtual
    page number for each real page (compared to the Atlas associative
    array, which just had a virtual page number for each real page ...
    effectively just a single large virtual address space).
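
    A sketch of that lookup in C: one tag per real page holding (process
    ID, virtual page number); the Atlas tag lacked the process ID, hence
    a single large address space. Types and sizes are illustrative (a
    256 KB 360/40 would have 64 4 KB pages):

        #include <stdint.h>

        #define REAL_PAGES 64

        typedef struct {
            int      valid;
            uint16_t pid;  /* owning process: the field Atlas lacked */
            uint32_t vpn;  /* virtual page number */
        } FrameTag;

        static FrameTag frame_tag[REAL_PAGES];

        /* Translate (pid, vpn) to a real frame, or -1 for a page fault.
           The hardware searched all tags in parallel; software scans. */
        int translate(uint16_t pid, uint32_t vpn)
        {
            for (int f = 0; f < REAL_PAGES; f++)
                if (frame_tag[f].valid &&
                    frame_tag[f].pid == pid &&
                    frame_tag[f].vpn == vpn)
                    return f;
            return -1;
        }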

    the official IBM operating system for the (standard virtual memory)
    360/67 was TSS/360, which peaked at around 1200 people at a time when
    the science center had 12 people (that included the secretary)
    morphing CP/40 into CP/67.

    Melinda's history website:
    http://www.leeandmelindavarian.com/Melinda#VMHist
    description of CP/40 (for the modified 360/40):
    http://www.leeandmelindavarian.com/Melinda/JimMarch/CP40_The_Origin_of_VM370.pdf

    --
    virtualization experience starting Jan1968, online at home since Mar1970

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Lawrence D'Oliveiro on Tue Jul 23 12:40:32 2024
    On Tue, 23 Jul 2024 00:21:44 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Mon, 22 Jul 2024 17:51:21 +0300, Michael S wrote:

    On Mon, 22 Jul 2024 13:41:40 +0000 mitchalsup@aol.com (MitchAlsup1)
    wrote:

    CDC 6600 had a RISC instruction set !!


    It has RISC-like features, most importantly a load-store architecture.
    Simple instruction formats and only two instruction widths also sound
    RISCy. But the hard coupling between A registers and X registers does
    not look like RISC to me.

    RISC-V is adopting a very similar idea, in preference to the
    widespread current SIMD fashion.

    WTF are you talking about?
    Sounds like you either never read about CDC 6600 instruction set or
    completely forgot everything you ever read about it.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to Lawrence D'Oliveiro on Tue Jul 23 12:38:21 2024
    On Tue, 23 Jul 2024 00:20:46 -0000 (UTC)
    Lawrence D'Oliveiro <ldo@nz.invalid> wrote:

    On Mon, 22 Jul 2024 13:08:27 +0300, Michael S wrote:

    In the end, the influence of the 6600 on computers we use today is
    close to zero.

    He pioneered pipelining

    It (the 6600), not he (Seymour).
    No, it didn't.
    If instead of 'it' we talked about 'he', then the first Cray-Thornton
    pipelined computer is the 7600. But by then pipelining was hardly
    new. The 7600 can arguably be credited with "pipelining done right",
    but not as a pioneer in that area.

    and multiple function units.

    Multiple functional units existed before. The special thing about the
    6600 was that it had A LOT of them. Having a lot of non-pipelined
    functional units behind a 1-wide or even 2- or 3-wide front end
    sounds like an architectural dead end.

    He went on to
    pioneer vector processing (long vectors, not the short-vector SIMD
    stuff that infests CPU designs today).

    I am talking about the 6600, the computer, not Seymour Cray, the
    person, or Cray-Thornton, the team.
    Cray, the person, was innovative and influential.
    The 6600, the computer, was innovative but not influential in the
    long run.

    He was always very
    conservative in the fabrication technologies he adopted, but he was
    brilliant at pushing them to their limits.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Michael S@21:1/5 to John Savard on Tue Jul 23 12:53:57 2024
    On Mon, 22 Jul 2024 18:18:52 -0600
    John Savard <quadibloc@servername.invalid> wrote:

    On Mon, 22 Jul 2024 10:43:44 +0300, Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:
    On 2024-07-22 3:05, Lawrence D'Oliveiro wrote:
    On Sun, 21 Jul 2024 22:21:41 +0000, MitchAlsup1 wrote:

    IBM CEO said something to the effect:: How can a team of 24
    people, including the janitor, beat IBM ??

    From Watson’s memo:

    Last week Control Data had a press conference during which
    they officially announced their 6600 system. I understand that in
    the laboratory developing this system there are only 34 people,
    including the janitor. Of these, 14 are engineers and 4 are
    programmers, and only one person has a Ph.D., a relatively
    junior programmer. Contrasting this modest effort with our own vast
    development activities, I fail to understand why we have lost
    our industry leadership position by letting someone else offer the
    world’s most powerful computer.

    Per Wikipedia, Cray's reply was sardonic: "It seems like Mr. Watson
    has answered his own question."

    While IBM did not appear to understand the wisdom in Cray's remark at
    the time - that a large organization can have internal politics and communications overhead and other things that hamper innovation - it
    did _eventually_ learn its lesson.

    So when it came time for IBM to make its mark in the new, emerging
    field of microcomputers, it had a small team, working in isolation
    from the rest of IBM, go and design the IBM Personal Computer in all
    its 4.77 MHz 8088 glory.

    John Savard

    I don't like how IBM screwed up the interrupt architecture of the
    IBM PC, completely ignoring Intel's recommendation to assign hardware
    interrupts to INT #32 and higher.
    The unnecessary mess they created had a negative effect for a long
    time, 15 years at least.
    Also, I am not sure that at the time (1981) the 8250 was the optimal
    choice for the UART chip.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Tue Jul 23 13:20:21 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Mon, 22 Jul 2024 15:09:22 GMT, Scott Lurndal wrote:

    Basically, the hardware would modify the branch opcode in memory after
    every branch to track the last two taken/not-taken states.

    Did the Burroughs share code between processes/threads?

    That depends on which of the three mainframe lines you
    consider.

    Large systems (B6500 descendants) were multithreaded
    from the start.

    While medium systems user code was mostly single-threaded, the
    operating system had full multithreading, with hardware mutexes and
    condition variables via the LOK, UNLK, WAIT and CAUSE instructions.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Michael S on Tue Jul 23 15:22:12 2024
    On Tue, 23 Jul 2024 9:53:57 +0000, Michael S wrote:

    I don't like how IBM screwed up the interrupt architecture of the
    IBM PC, completely ignoring Intel's recommendation to assign hardware
    interrupts to INT #32 and higher.
    The unnecessary mess they created had a negative effect for a long
    time, 15 years at least.
    Also, I am not sure that at the time (1981) the 8250 was the optimal
    choice for the UART chip.

    It is not just IBM: every interrupt architecture prior to MSI-X
    messages is screwed up (and most after MSI-X, too).

    The property one wants is that VMexit has to do nothing to the interrupts/controllers in order to gain full control over the core.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Wed Jul 24 00:17:37 2024
    On Tue, 23 Jul 2024 13:20:21 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Mon, 22 Jul 2024 15:09:22 GMT, Scott Lurndal wrote:

    Basically, the hardware would modify the branch opcode in memory after
    every branch to track the last two taken/not-taken states.

    Did the Burroughs share code between processes/threads?

    Large systems (B6500 descendants) were multithreaded from the start.

    Not quite what I asked. I was wondering how those code patches would
    impact on shared code.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Lawrence D'Oliveiro on Wed Jul 24 13:22:46 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Tue, 23 Jul 2024 13:20:21 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Mon, 22 Jul 2024 15:09:22 GMT, Scott Lurndal wrote:

    Basically, the hardware would modify the branch opcode in memory
    after every branch to track the last two taken/not-taken states.

    Did the Burroughs share code between processes/threads?

    Large systems (B6500 descendants) were multithreaded from the start.

    Not quite what I asked. I was wondering how those code patches would
    impact on shared code.

    Global branch prediction, of course.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lawrence D'Oliveiro@21:1/5 to Scott Lurndal on Wed Jul 24 23:45:46 2024
    On Wed, 24 Jul 2024 13:22:46 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    I was wondering how those code patches would
    impact on shared code.

    Global branch prediction, of course.

    But the characteristics of a program run in one thread/process might
    not match those in another. If both runs are modifying the same code,
    the results will be ... sub-optimal.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From EricP@21:1/5 to Lawrence D'Oliveiro on Thu Jul 25 01:42:28 2024
    Lawrence D'Oliveiro wrote:
    On Wed, 24 Jul 2024 13:22:46 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    I was wondering how those code patches would
    impact on shared code.
    Global branch prediction, of course.

    But the characteristics of a program run in one thread/process might
    not match those in another. If both runs are modifying the same code,
    the results will be ... sub-optimal.

    Because they will both keep patching it to be what they want.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to Lawrence D'Oliveiro on Thu Jul 25 10:59:16 2024
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jul 2024 13:22:46 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    I was wondering how those code patches would
    impact on shared code.

    Global branch prediction, of course.

    But the characteristics of a program run in one thread/process might not match those in another.

    They might not, or they might. When the hardware branch predictor
    researchers looked into it, they found that there is more synergy than interference. Consequently, they did not take measures to avoid
    sharing. And for the approach of patching the hints in the code, the
    results of sharing will be beneficial on average, too, because the
    only difference from the 2-bit/branch predictor is that the latter is
    in microarchitectural state instead of in the code.
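 
    For contrast, the microarchitectural version is the textbook 2-bit
    saturating counter, kept in a table indexed by branch address rather
    than in the instruction itself (a sketch; the table size and the
    indexing are arbitrary here):
 
        /* Textbook 2-bit saturating-counter predictor: a table of
           counters indexed by branch PC, strengthened or weakened
           after each resolved branch. */
        #include <stdint.h>
 
        #define PHT_BITS 12
        #define PHT_SIZE (1u << PHT_BITS)
 
        /* 0,1 = strongly/weakly not-taken; 2,3 = weakly/strongly taken */
        static uint8_t pht[PHT_SIZE];
 
        static int predict(uint32_t pc) {
            return pht[(pc >> 2) & (PHT_SIZE - 1)] >= 2;
        }
 
        static void update(uint32_t pc, int taken) {
            uint8_t *c = &pht[(pc >> 2) & (PHT_SIZE - 1)];
            if (taken) { if (*c < 3) (*c)++; }
            else       { if (*c > 0) (*c)--; }
        }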

    Now somebody will point out that sharing makes it possible for an
    attacker to train branch predictors in one process to attack a
    different process through Spectre and friends. While preventing
    sharing would close that, it does not close training the predictors in
    the same thread.
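 
    The canonical Spectre v1 gadget (after Kocher et al.; the array
    names are the usual illustrative ones) shows what the training buys
    the attacker: once the bounds check is predicted taken, a
    misspeculated out-of-bounds x leaves a secret-dependent line in the
    cache:
 
        /* Spectre v1 bounds-check bypass (illustrative). After many
           in-bounds calls train the predictor, an out-of-bounds x is
           used speculatively: array1[x] fetches a secret byte and the
           dependent load into array2 encodes it in the cache. */
        #include <stdint.h>
        #include <stddef.h>
 
        uint8_t array1[16];
        uint8_t array2[256 * 64];  /* one 64-byte line per byte value */
        size_t  array1_size = 16;
        volatile uint8_t sink;
 
        void victim(size_t x) {
            if (x < array1_size)                /* predicted taken */
                sink = array2[array1[x] * 64];  /* secret-dependent fill */
        }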

    Closing Spectre through invisible speculation (several papers exist
    about that) makes it irrelevant (for Spectre) whether the predictors
    are shared or not. Of course, for invisible speculation the permanent
    branch predictors must not be updated speculatively, but that's
    probably better anyway.
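 
    A sketch of that last discipline (the structures are invented for
    illustration): speculative branch outcomes sit in a side buffer and
    reach the permanent tables only at retirement, so a squash leaves
    the predictor untouched:
 
        /* Illustrative retire-time predictor update. */
        #include <stdint.h>
 
        #define PHT_SIZE 4096
        static uint8_t pht[PHT_SIZE];  /* 2-bit counters */
 
        static void pht_update(uint32_t pc, int taken) {
            uint8_t *c = &pht[(pc >> 2) & (PHT_SIZE - 1)];
            if (taken) { if (*c < 3) (*c)++; }
            else       { if (*c > 0) (*c)--; }
        }
 
        struct pending { uint32_t pc; int taken; int valid; };
        #define ROB_SLOTS 64
        static struct pending rob[ROB_SLOTS];
 
        /* Speculative resolution: note the outcome, touch no tables. */
        static void record_speculative(int slot, uint32_t pc, int taken) {
            rob[slot] = (struct pending){ pc, taken, 1 };
        }
 
        /* Retirement: only now does the outcome reach the predictor. */
        static void retire(int slot) {
            if (rob[slot].valid)
                pht_update(rob[slot].pc, rob[slot].taken);
            rob[slot].valid = 0;
        }
 
        /* Squash: the predictor never sees the wrong-path outcome. */
        static void squash(int slot) {
            rob[slot].valid = 0;
        }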

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Scott Lurndal@21:1/5 to Anton Ertl on Thu Jul 25 14:43:44 2024
    anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jul 2024 13:22:46 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    I was wondering how those code patches would
    impact on shared code.

    Global branch prediction, of course.

    But the characteristics of a program run in one thread/process might not match those in another.

    They might not, or they might.

    Quite true. In this particular case, the characteristics were generally similar between tasks sharing the same text (executable code).

    When the hardware branch predictor
    researchers looked into it, they found that there is more synergy than interference.

    Indeed, internal benchmarking showed a definite improvement
    in all workloads.

    This being a 1960s-vintage mainframe architecture, the chances
    of introducing exploit code to attempt a Spectre-style attack
    on the 'branch predictor' were zero.


    Consequently, they did not take measures to avoid
    sharing. And for the approach of patching the hints in the code, the
    results of sharing will be beneficial on average, too, because the
    only difference from the 2-bit/branch predictor is that the latter is
    in microarchitectural state instead of in the code.

    Now somebody will point out that sharing makes it possible for an
    attacker to train branch predictors in one process to attack a
    different process through Spectre and friends. While preventing
    sharing would close that, it does not close training the predictors in
    the same thread.

    Closing Spectre through invisible speculation (several papers exist
    about that) makes it irrelevant (for Spectre) whether the predictors
    are shared or not. Of course, for invisible speculation the permanent
    branch predictors must not be updated speculatively, but that's
    probably better anyway.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From MitchAlsup1@21:1/5 to Anton Ertl on Thu Jul 25 17:22:44 2024
    On Thu, 25 Jul 2024 10:59:16 +0000, Anton Ertl wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:
    On Wed, 24 Jul 2024 13:22:46 GMT, Scott Lurndal wrote:

    Lawrence D'Oliveiro <ldo@nz.invalid> writes:

    I was wondering how those code patches would
    impact on shared code.

    Global branch prediction, of course.

    But the characteristics of a program run in one thread/process might not match those in another.

    They might not, or they might. When the hardware branch predictor researchers looked into it, they found that there is more synergy than interference. Consequently, they did not take measures to avoid
    sharing. And for the approach of patching the hints in the code, the
    results of sharing will be beneficial on average, too, because the
    only difference from the 2-bit/branch predictor is that the latter is
    in microarchitectural state instead of in the code.

    Now somebody will point out that sharing makes it possible for an
    attacker to train branch predictors in one process to attack a
    different process through Spectre and friends. While preventing
    sharing would close that, it does not close training the predictors in
    the same thread.

    Not allowing a dependent AGEN to happen when the first AGEN takes
    a fault ALSO prevents Spectre-like attacks {Whether the crack is
    opened up by the BP, IBP, or any other predictor.} Then not modifying
    any cache prior to instruction retirement cements the door closed.
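 
    A sketch of the dependent chain such a rule breaks (Meltdown-style;
    the names are illustrative): the first load's AGEN faults, and the
    second load's address is computed from the doomed result, so
    suppressing dependent AGENs kills the cache side channel:
 
        /* Dependent AGEN chain (illustrative). The first load faults;
           without the rule above, the second load still generates an
           address derived from the doomed value and fills a cache line
           the attacker can probe later. */
        #include <stdint.h>
 
        uint8_t probe[256 * 64];   /* one cache line per byte value */
        volatile uint8_t sink;
 
        void gadget(const uint8_t *kernel_addr) {
            uint8_t secret = *kernel_addr;  /* AGEN #1: faulting load */
            sink = probe[secret * 64];      /* AGEN #2: depends on #1;
                                               never issued if a faulting
                                               AGEN suppresses dependents */
        }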

    Closing Spectre through invisible speculation (several papers exist
    about that) makes it irrelevant (for Spectre) whether the predictors
    are shared or not. Of course, for invisible speculation the permanent
    branch predictors must not be updated speculatively, but that's
    probably better anyway.

    - anton

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Anton Ertl@21:1/5 to mitchalsup@aol.com on Fri Jul 26 16:36:07 2024
    mitchalsup@aol.com (MitchAlsup1) writes:
    On Thu, 25 Jul 2024 10:59:16 +0000, Anton Ertl wrote:
    Now somebody will point out that sharing makes it possible for an
    attacker to train branch predictors in one process to attack a
    different process through Spectre and friends. While preventing
    sharing would close that, it does not close training the predictors in
    the same thread.

    Not allowing a dependent AGEN to happen when the first AGEN takes
    a fault ALSO prevents Spectre-like attacks

    Spectre does not need a fault. You are probably thinking of Meltdown.
    That, at least, was fixed by Intel (and hopefully also ARM) in its
    original variant fairly soon, although other variants have been
    discovered since then (IIRC including some where the fault has nothing
    to do with addresses).

    Then not modifying
    any cache prior to instruction retirement cements the door closed.

    Not changing microarchitectural state (not just caches) through
    misspeculation (invisible speculation) is a proper fix for Spectre,
    and looks like the best fix to me.

    - anton
    --
    'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
    Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)