Discussion:
The Attack of the Killer Micros
Quadibloc
2024-02-14 05:50:41 UTC
Permalink
In the early days of the microcomputer era, one could either
have a cheap small computer with a single-chip CPU or, if
one wanted something bigger, get moderate performance
from bit-slice chips.

If you wanted higher performance than a bit-slice design would
allow, you had to use older, less highly integrated technology,
so the increase in cost was too large to be justified by the
increase in performance.

Eventually, the Pentium Pro, and its popular successor the Pentium
II, came along, and now a System 360/195 class architecture was placed
on a single chip (two dies in the Pentium Pro's case, though, as the L2
cache, while in the same package, had to be on a separate die) and the
problem was solved.

This explains my goal of including a Cray I style vector capability
on a microprocessor - this is the one historic thing not yet reduced
to a chip which extends into a performance space beyond that of the
360/195. My reasoning may be very naive, because I'm failing to take
into account how the current gap between CPU and DRAM speeds makes
older architectures not practical.

And, as I've noted also, the overwhelming dominance of Windows on
the x86 shows "there can be only one", which is why I want my new
architecture to offer something the x86 doesn't... efficient
emulation of older architectures with 36-bit, 48-bit, and 60-bit
words, so that those who have really old programs to run are no
longer disadvantaged.

While this seems like a super-niche thing to some, I see it as
something that's practically _essential_ to have a future world of
computers that doesn't leave older code behind - so that the
computer you already have on your desktop is truly general in its
capabilities.

I don't see FPGAs in their current form as efficient enough to
offer a route to the kind of generality I'm seeking.

By explaining what my goals are, rather than discussing the ISA
proposals that I see as a means to those goals, perhaps this makes
it possible for a better and more practical way to achieve those
goals to be suggested.

John Savard
John Dallman
2024-02-14 09:56:00 UTC
Permalink
I want my new architecture to offer something the x86 doesn't...
efficient emulation of older architectures with 36-bit, 48-bit,
and 60-bit words, so that those who have really old programs to
run are no longer disadvantaged.
While this seems like a super-niche thing to some, I see it as
something that's practically _essential_ to have a future world of
computers that doesn't leave older code behind - so that the
computer you already have on your desktop is truly general in its
capabilities.
If this had been available in the 1970s, as the IBM 700/7000 series and
others of their generation faded out of use, it would have been quite
useful.

All that code has been re-written for newer architectures or abandoned by
now; it ran on expensive systems for expensive purposes, so if it was
going to have continued uses there was usually budget to re-write it.

Now that there's general alignment on 32-bit or 64-bit addressing, 8-bit
bytes, and IEEE floating-point, portability is not such a big problem.

John
MitchAlsup1
2024-02-14 17:50:38 UTC
Permalink
Post by Quadibloc
In the early days of the microcomputer era, one could either
have a cheap small computer with a single-chip CPU or, if
one wanted something bigger, get moderate performance
from bit-slice chips.
If you wanted higher performance than a bit-slice design would
allow, you had to use older, less highly integrated technology,
so the increase in cost was too large to be justified by the
increase in performance.
Eventually, the Pentium Pro, and its popular successor the Pentium
II, came along, and now a System 360/195 class architecture was placed
on a single chip (two dies in the Pentium Pro's case, though, as the L2
cache, while in the same package, had to be on a separate die) and the
problem was solved.
This explains my goal of including a Cray I style vector capability
on a microprocessor - this is the one historic thing not yet reduced
to a chip which extends into a performance space beyond that of the
360/195.
It has not been reduced into practice because it takes too many pins,
wiggling at too high a rate, ...
Post by Quadibloc
My reasoning may be very naive, because I'm failing to take
into account how the current gap between CPU and DRAM speeds makes
older architectures not practical.
3 accesses per CPU cycle continuously (2 LDs and 1 ST) and hundreds
of banks {Without cache lines}
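To put rough numbers on that (the clock rate in this little sketch is an assumed figure, not something from the thread):

  #include <stdio.h>

  int main(void)
  {
      const double clock_hz      = 3.0e9; /* assumed CPU clock             */
      const double accesses      = 3.0;   /* 2 loads + 1 store per cycle   */
      const double bytes_per_ref = 8.0;   /* 64-bit words                  */

      double gb_per_s = clock_hz * accesses * bytes_per_ref / 1e9;
      printf("sustained bandwidth: %.0f GB/s of random word traffic\n", gb_per_s);
      /* ~72 GB/s with no cache lines to amortize it over, hence the
       * hundreds of independent banks and the pin problem. */
      return 0;
  }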
Post by Quadibloc
And, as I've noted also, the overwhelming dominance of Windows on
the x86 shows "there can be only one",
There is now an ARM Windows.
Post by Quadibloc
which is why I want my new
architecture to offer something the x86 doesn't... efficient
emulation of older architectures with 36-bit, 48-bit, and 60-bit
words, so that those who have really old programs to run are no
longer disadvantaged.
Do you have a market demand survey ??
Post by Quadibloc
While this seems like a super-niche thing to some, I see it as
something that's practically _essential_ to have a future world of
computers that doesn't leave older code behind - so that the
computer you already have on your desktop is truly general in its
capabilities.
I don't see FPGAs in their current form as efficient enough to
offer a route to the kind of generality I'm seeking.
By explaining what my goals are, rather than discussing the ISA
proposals that I see as a means to those goals, perhaps this makes
it possible for a better and more practical way to achieve those
goals to be suggested.
John Savard
Lawrence D'Oliveiro
2024-02-14 20:33:48 UTC
Permalink
Eventually, the Pentium Pro ...
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was that
Intel expected that the majority of Windows code would be 32-bit by that
point. It wasn’t.
And, as I've noted also, the overwhelming dominance of Windows on the
x86 shows "there can be only one", which is why I want my new
architecture to offer something the x86 doesn't... efficient emulation
of older architectures with 36-bit, 48-bit, and 60-bit words, so that
those who have really old programs to run are no longer disadvantaged.
Didn’t a company called “Transmeta” try that ... something like 30 years
ago? It didn’t work.

There is no path forward for Windows on non-x86. Only open-source software
is capable of being truly cross-platform.
Scott Lurndal
2024-02-14 21:32:11 UTC
Permalink
Post by Lawrence D'Oliveiro
Eventually, the Pentium Pro ...
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was that
Intel expected that the majority of Windows code would be 32-bit by that
point.
We used the P6 (aka the Pentium Pro) for a large massively parallel system
(64 2-processor nodes, each with a SCSI controller and 1Gb ethernet port)
running a single-system-image version of SVR4.2ES/MP.

I wouldn't call it a joke. We also had the orange books for the
never-built P7 (which morphed eventually into Itanium).
Post by Lawrence D'Oliveiro
Didn’t a company called “Transmeta” try that ... something like 30 years
ago? It didn’t work.
They tried to build an architecture that supported run-time
translation of x86 instructions to native instructions. Several
former colleagues worked there - one of whom is now with Apple managing
their ARM core development group. He used to take Linus Torvalds (another
former Transmeta employee) up in his Cessna 414 (a fun plane to fly).
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft. As has been noted, they do have
ARMv8 versions of windows 11.

https://learn.microsoft.com/en-us/windows/arm/overview
Lawrence D'Oliveiro
2024-02-15 00:50:09 UTC
Permalink
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft. As has been noted, they do have ARMv8
versions of windows 11.
They’ve been trying for years: Windows Phone 8, Windows RT, that laughable
“Windows 10 IOT Edition” for the Raspberry Pi, whatever the name is for
the current effort ... Windows-on-ARM has always been a trainwreck.
Scott Lurndal
2024-02-15 15:02:38 UTC
Permalink
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft. As has been noted, they do have ARMv8
versions of windows 11.
They’ve been trying for years: Windows Phone 8, Windows RT, that laughable
“Windows 10 IOT Edition” for the Raspberry Pi, whatever the name is for
the current effort ... Windows-on-ARM has always been a trainwreck.
https://azure.microsoft.com/en-us/blog/azure-virtual-machines-with-ampere-altra-arm-based-processors-generally-available/
Lawrence D'Oliveiro
2024-02-15 20:19:29 UTC
Permalink
Post by Scott Lurndal
Post by Lawrence D'Oliveiro
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft. As has been noted, they do have ARMv8
versions of windows 11.
They’ve been trying for years: Windows Phone 8, Windows RT, that laughable
“Windows 10 IOT Edition” for the Raspberry Pi, whatever the name is for
the current effort ... Windows-on-ARM has always been a trainwreck.
https://azure.microsoft.com/en-us/blog/azure-virtual-machines-with-ampere-altra-arm-based-processors-generally-available/
You know that most of Microsoft’s cloud is running Linux, right?
They’ve admitted as much themselves.
Anton Ertl
2024-02-15 07:24:56 UTC
Permalink
Post by Scott Lurndal
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft.
No. Microsoft is trying to commoditize their complement (in
particular, Intel) by making Windows on ARM viable, but the ISVs don't
play along. Of course some of that is Microsoft's own doing, as they
ensured in earlier iterations of this strategy (MIPS, PowerPC, Alpha
during the 1990s, IA-64 during the 2000s; there was also Windows RT)
that all ISVs who invested in non-IA-32/x64 Windows lost their
investment by MS dropping the support for these platforms. So now
every sane ISV just sits back and waits until Microsoft has made the
Windows-on-ARM market big on their own. Of course this does not work,
and the high prices and lack of alternative OS options of the
Windows-on-ARM hardware do not help, either.
Post by Scott Lurndal
As has been noted, they do have
ARMv8 versions of windows 11.
https://learn.microsoft.com/en-us/windows/arm/overview
Doomed.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Lawrence D'Oliveiro
2024-02-15 08:18:17 UTC
Permalink
Post by Anton Ertl
Microsoft is trying to commoditize their complement (in
particular, Intel) by making Windows on ARM viable, but the ISVs don't
play along.
Can you blame them? They are not going to port their proprietary apps to
ARM until they see the customers buying lots of ARM-based machines, and
customers are staying away from buying ARM-based machines because they
don’t see lots of software that will take advantage of the hardware.

Chicken-and-egg situation, and no way to break out of it.
Anton Ertl
2024-02-15 08:42:54 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Anton Ertl
Microsoft is trying to commoditize their complement (in
particular, Intel) by making Windows on ARM viable, but the ISVs don't
play along.
Can you blame them? They are not going to port their proprietary apps to
ARM until they see the customers buying lots of ARM-based machines, and
customers are staying away from buying ARM-based machines because they
don’t see lots of software that will take advantage of the hardware.
Chicken-and-egg situation, and no way to break out of it.
A possible way would be to offer the ARM-based systems much cheaper,
making the hardware attractive to users who do not use
architecture-specific ISV software. That would result in a
significant number of systems out there, and would inspire big ISVs
like Adobe to support them, increasing the appeal of the platform,
which again would result in increased sales, which would make the
platform attractive to additional ISVs, and so on.

The first part happened for Chromebooks and the Raspberry Pi, and,
e.g., VFX Forth (a proprietary Forth system, i.e., an ISV product) is
available on the Raspi, even though it does not run Windows.

But wrt Windows-on-ARM, what actually happens is that the laptops
are rather expensive. It seems that someone (Qualcomm? The
laptop producers? MS?) wants to milk that market before it has
calved. This doesn't work.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
John Levine
2024-02-15 19:17:49 UTC
Permalink
Post by Anton Ertl
Post by Lawrence D'Oliveiro
Chicken-and-egg situation, and no way to break out of it.
A possible way would be to offer the ARM-based systems much cheaper,
making the hardware attractive to users who do not use
architecture-specific ISV software. ...
Microsoft doesn't make PCs, and it is not clear to me how they would
bribe OEMs to do that without running into competition issues.

Apple has switched CPUs on the Mac three times, from 68K to PowerPC to
x86 to ARM, quite successfully, since they control both the hardware
and software. Each time they provided software emulation of the
previous CPU, and the new systems were enough faster that the
emulation speed was adequate. Since nobody writes anything in
assembler any more, these days building a version of software for the
new CPU needs little more than changing a few switches and
recompiling.

On my newish M2 Mac, the only thing that doesn't work is an add-in to
the calibre ebook package. Calibre is written in python, and includes
its own copy of python so you can install it as a single app. That
works fine, and most add-ins work fine. The one add-in that doesn't work calls
an external crypto library, but the copy of that library on my Mac is
ARM while calibre and the add-in are emulated x86. If I cared more
I could probably figure out where to get the x86 version of the library.

Someone else pointed to a press release about ARM chips in Microsoft's
cloud. Keep reading and it becomes clear that they mostly expect
people to run linux on them.
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Michael S
2024-02-15 20:58:37 UTC
Permalink
On Thu, 15 Feb 2024 19:17:49 -0000 (UTC)
Post by John Levine
Post by Anton Ertl
Post by Lawrence D'Oliveiro
Chicken-and-egg situation, and no way to break out of it.
A possible way would be to offer the ARM-based systems much cheaper,
making the hardware attractive to users who do not use
architecture-specific ISV software. ...
Microsoft doesn't make PCs, and it is not clear to me how they would
bribe OEMs to do that without running into competition issues.
Of course, Microsoft makes PCs.
https://en.wikipedia.org/wiki/Microsoft_Surface
I think, it is one of the reasons OEMs are not enthusiastic about
Windows on Arm. They don't want to compete with their OS supplier.

I asked Google "What is the market share of Microsoft Surface?"
The answer was "In the Personal Computing Devices category, Microsoft
Surface has a market share of about 2.4%."
That puts them near the #10 spot, give or take a place or two.
Post by John Levine
Apple has switched CPUs on the Mac three times, from 68K to PowerPC to
x86 to ARM, quite successfully since they control both the hardware
and software. Each time they provided software emulation of the
previous CPU, and the new systems were enough faster that the
emulation speed was adequate. Since nobody writes anything in
assembler any more, these days building a version of software for the
new CPU needs little more than changing a few switches and
recompiling.
On my newish M2 Mac, the only thing that doesn't work is an add-in to
the calibre ebook package. Calibre is written in python, and includes
its own copy of python so you can install it as a single app. That
works fine, most add-ins work fine. The one add-in that doesn't calls
an external crypto library, but the copy of that library on my Mac is
ARM while calibre and the add-in are emulated x86. If I cared more
I could probably figure out where to get the x86 version of the library.
Someone else pointed to a press release about ARM chips in Microsoft's
cloud. Keep reading and it becomes clear that they mostly expect
people to run linux on them.
That's what I expected without reading.
The only thing I was not sure about is whether Windows is supported
at all.
Maybe I should read it myself when I have nothing better to read.
Lawrence D'Oliveiro
2024-02-15 21:31:58 UTC
Permalink
Post by Michael S
Of course, Microsoft makes PCs.
https://en.wikipedia.org/wiki/Microsoft_Surface
I think, it is one of the reasons OEMs are not enthusiastic about
Windows on Arm. They don't want to compete with their OS supplier.
And the only reason Microsoft is offering any ARM-based machines at all is
to try to promote the platform.

I don’t think they’ve made money on any ARM machine they’ve ever sold.
Michael S
2024-02-15 21:57:34 UTC
Permalink
On Thu, 15 Feb 2024 21:31:58 -0000 (UTC)
Post by Lawrence D'Oliveiro
Post by Michael S
Of course, Microsoft makes PCs.
https://en.wikipedia.org/wiki/Microsoft_Surface
I think, it is one of the reasons OEMs are not enthusiastic about
Windows on Arm. They don't want to compete with their OS supplier.
And the only reason Microsoft is offering any ARM-based machines at
all is to try to promote the platform.
My theory is that Microsoft started down this route because they badly
want to be in the business of "always connected" devices. Nowadays they
hate selling software and very much prefer SaaS. "Always connected" helps
there, or at least Satya Nadella believes that it helps.
They bought Nokia's smart-phone division, but it didn't work.
So they tried something else, which didn't go great, but at least went better.
Post by Lawrence D'Oliveiro
I don’t think they’ve made money on any ARM machine they’ve ever sold.
For Surface as a whole I heard, not very recently, that it was
profitable. For the Arm-based Surface specifically, I'd think it is a
secret even within Microsoft, same as for any other Surface model in
isolation.
I happen to have a co-worker who was fired from MS Surface hardware
development not long ago. He worked there for many years and was never
told about the profitability of individual models.
Lawrence D'Oliveiro
2024-02-15 23:24:59 UTC
Permalink
Post by Michael S
They bought Nokia's smart phones division, but it didn't work.
They only did that as a last-ditch effort to save face, because the
company was on the verge of giving up on Windows Phone altogether and
switching to Android.
Lawrence D'Oliveiro
2024-02-17 00:42:14 UTC
Permalink
The Nokia decision to switch to Windows Phone looked unwise when it
was made, and that was amply proven in practice.
All the blame could very much be laid at the door of then-CEO and ex-
Microsoftie Stephen Elop.

The irony is that Nokia were already working on a decent Debian-based
phone, the N9, by the time he was appointed. He was too late to kill
it off completely, but he did see to it that it only got a limited
release and that there were no follow-up models.

As I recall, it received rave reviews in the markets where it was
released. Then once stocks ran out, that was the end of it. And Nokia went
back to losing money.
Terje Mathisen
2024-02-16 09:42:05 UTC
Permalink
Post by Michael S
On Thu, 15 Feb 2024 21:31:58 -0000 (UTC)
Post by Lawrence D'Oliveiro
Post by Michael S
Of course, Microsoft makes PCs.
https://en.wikipedia.org/wiki/Microsoft_Surface
I think, it is one of the reasons OEMs are not enthusiastic about
Windows on Arm. They don't want to compete with their OS supplier.
And the only reason Microsoft is offering any ARM-based machines at
all is to try to promote the platform.
My theory is that Microsoft started this route because they badly wants
to be in the business of "always connected" devices. Nowadays they hate
to sell software and very much prefer SaaS. "Always connected" helps
there or at least Satya Nadella believes that it helps.
They bought Nokia's smart phones division, but it didn't work.
So, they tried something else that went not great, but at least better.
Post by Lawrence D'Oliveiro
I don’t think they’ve made money on any ARM machine they’ve ever sold.
For Surface as a whole I heard, not very recently, that it was
profitable. For the Arm-based Surface specifically, I'd think it is a
secret even within Microsoft, same as for any other Surface model in
isolation.
I happen to have a co-worker that was fired from MS Surface hardware
development not long ago. He worked there many years and never ever was
told about profitability of individual models.
I have found the Surface machines to be very dependable; I have bought 3
of them over the years, starting with the original (?) Surface Pro, which
was the first real PC model. I have since given the first two to my
kids; both are still working, along with my newish (3-5 years?) night
table/travel machine.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Anton Ertl
2024-02-15 20:58:21 UTC
Permalink
Post by John Levine
Post by Anton Ertl
Post by Lawrence D'Oliveiro
Chicken-and-egg situation, and no way to break out of it.
A possible way would be to offer the ARM-based systems much cheaper,
making the hardware attractive to users who do not use
architecture-specific ISV software. ...
Microsoft doesn't make PCs
They make the Surface laptops and also make Surface all-in-one PCs
(but the latter have not been updated for a while).
Post by John Levine
and it is not clear to me how they would
bribe OEMs to do that without running into competition issues.
Anti-trust action has been much weaker in recent decades compared to
the 1970s
<https://doctorow.medium.com/an-antitrust-murder-whodunnit-49f3bd3cc69c>.
Post by John Levine
Since nobody writes anything in
assembler any more, these days building a version of software for the
new CPU needs little more than changing a few switches and
recompiling.
And yet, most ISVs generally don't provide ARM versions of their
Windows software.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Lawrence D'Oliveiro
2024-02-15 23:25:48 UTC
Permalink
Since nobody writes anything in assembler any more, these days building
a version of software for the new CPU needs little more than changing a
few switches and recompiling.
And yet, most ISVs generally don't provide ARM versions of their Windows
software.
For some reason, it’s hard (i.e. expensive) for proprietary software to be
cross-platform.
BGB
2024-02-15 22:19:41 UTC
Permalink
Post by Anton Ertl
Post by Lawrence D'Oliveiro
Post by Anton Ertl
Microsoft is trying to commoditize their complement (in
particular, Intel) by making Windows on ARM viable, but the ISVs don't
play along.
Can you blame them? They are not going to port their proprietary apps to
ARM until they see the customers buying lots of ARM-based machines, and
customers are staying away from buying ARM-based machines because they
don’t see lots of software that will take advantage of the hardware.
Chicken-and-egg situation, and no way to break out of it.
A possible way would be to offer the ARM-based systems much cheaper,
making the hardware attractive to users who do not use
architecture-specific ISV software. That would result in a
significant number of systems out there, and would inspire big ISVs
like Adobe to support them, increasing the appeal of the platform,
which again would result in increased sales, which would make the
platform attractive to additional ISVs, and so on.
The first part happened for Chromebooks and the Raspberry Pi, and,
e.g., VFX Forth (a proprietary Forth system, i.e., an ISV product) is
available on the Raspi, even though it does not run Windows.
But wrt Windows-on-ARM, what actually happens is that laptops with
that are rather expensive. It seems that someone (Qualcomm? The
laptop producers? MS?) wants to milk that market before it has
calved. This doesn't work.
Yeah.

I would be rather tempted by a many-core ARM machine...
If they were not so expensive that one may as well stick with x86...


Meanwhile, RasPi is cheap, but a RasPi is not a worthwhile alternative
to a desktop PC.

And, if one wants something they can use mainline PCIe cards with, and
can install a bunch of SATA HDDs or similar into, this is not cheap.


Though, I suspect, this may be similar to what killed the IA-64. History
might have gone quite differently if Intel, instead of targeting it at
the high end, had made it first available as a lower-cost alternative to
the Celeron line (or maybe even had tried to make inroads into the embedded
space, though at the time it would likely have been unable to compete with
MIPS and ARM in terms of being cheap enough for use in consumer
electronics, which seemed to be mostly dominated by 32-bit
single-issue cores).

Had it survived for longer, it could have maybe been a viable option for
smartphones and tablets.



But, yeah, for PC-class systems, x86 seemingly remains both the
cheapest and the most readily available option, and as long as this remains
true, x86 will hold its ground (possibly even more so than because of any
software-related issues, such as reduced performance under emulation, etc.).

...
John Dallman
2024-02-16 14:32:00 UTC
Permalink
Post by BGB
Though, I suspect, this may be similar to what killed the IA-64.
History might have gone quite differently if Intel, instead of
targeting it at the high-end, made it first available as a
lower-cost alternative to the Celeron line
Selling it to the Celeron market would have been impossible: the games
producers would not have wanted to support it, or found it too hard, much
like Cell a few years later. The x86 emulation would not have saved it:
that was slow by the standards of the time.
Post by BGB
Had it survived for longer, it could have maybe been a viable
option for smartphones and tablets.
IA-64 ran way too hot for portable devices. HP, who'd devised the
architecture, wanted it for large servers, and that was what it was
designed for. In the late 1990s, when those decisions were made, smart
mobile devices didn't exist.

John
Lawrence D'Oliveiro
2024-02-16 21:51:31 UTC
Permalink
Post by John Dallman
In the late 1990s, when those decisions were made, smart
mobile devices didn't exist.
Actually, they did. PDAs, remember?
John Dallman
2024-02-17 00:10:00 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
In the late 1990s, when those decisions were made, smart
mobile devices didn't exist.
Actually, they did. PDAs, remember?
True, but batteries of the period could not have supported Itanium's 100W+
power consumption for any useful time.

John
Lawrence D'Oliveiro
2024-02-17 00:39:22 UTC
Permalink
Post by John Dallman
Post by Lawrence D'Oliveiro
In the late 1990s, when those decisions were made, smart mobile
devices didn't exist.
Actually, they did. PDAs, remember?
True, but batteries of the period could not have supported Itanium's
100W+ power consumption for any useful time.
Nevertheless, smart mobile devices did exist.
BGB
2024-02-17 03:02:48 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
Post by Lawrence D'Oliveiro
In the late 1990s, when those decisions were made, smart mobile
devices didn't exist.
Actually, they did. PDAs, remember?
True, but batteries of the period could not have supported Itanium's
100W+ power consumption for any useful time.
Nevertheless, smart mobile devices did exist.
Presumably they could have scaled it down, while still keeping the core
ISA design intact?...

Like, presumably they had wanted to use the design for things big and
small, which would not have made sense if it could only be used in big
server chips.


Granted...
It would still probably have been unable to compete directly with, say,
32-bit ARM, in the low-power parts of the market.

But, maybe, say, as a CPU for home game-consoles or set-top boxes?...


Or those thin clients that did little other than dial into the internet
and run a web-browser?...

Saw a video about one of these ("i-Opener", IIRC), where apparently at
the time they were sticking a full-fledged PC motherboard in the things (with
a laptop-style LCD), sold at a loss, with the idea of making the money
back through people signing up with a particular ISP.

But then people realized they could plug a normal IDE HDD into
the things and use them as low-cost computers (running whatever OS they
wanted), resulting in the company trying to disable HDD booting and
de-solder the IDE connectors, before going out of business entirely...


Then again, the IA-64 might still not have survived even if it had
managed to entirely take hold of the set-top-box and internet appliance
market... (And this area has seemingly now morphed into the whole
"Internet of Things" thing, effectively shoving the internet
appliance into the front door of a refrigerator or similar, for a "Web
Enabled" refrigerator...).


Well, I guess there are also things like "smart bulbs", which can be
programmed to turn on/off or show various RGB colors, but which seemingly
lack the ability to be told to use Quake's "fluorescent flicker" effect
as a sort of "mood lighting"... (bonus points if they can also produce
the associated buzz-and-sparking ambient sound effect).

...
BGB
2024-02-17 21:39:08 UTC
Permalink
Presumably they could have scaled [IA-64] down, while still keeping the
core ISA design intact?...
Like, presumably they had wanted to use the design for things big
and small, which would not have made sense if it could only be used
in big server chips.
( Seems my post got posted; ironically, I don't see my own post here... )
Intel and HP showed no desire at the time to use IA-64 in anything
smaller than a workstation.
The huge number of architectural registers (128 64-bit integer, 128
82-bit floating point) would have made shrinks hard. But most of all, the
design is based on the compilers being able to solve a problem that can't
be solved in practice: static scheduling of memory loads in a system with
multiple levels of cache.
AFAIK:
I think the idea was that they already had 100+ registers internally
with their x86 chips (due to register renaming). And, the idea of having
128 GPRs in the IA-64, was to eliminate the register renaming?...


In my case, I have 64 registers on an FPGA, so 128 doesn't seem like too
huge of a stretch, and the FPGA can run this at around 1W.

Also, FWIW, a RV64G chip, with the privileged spec, would need 192
registers internally.


I wouldn't expect early-2000s ASICs to be that much worse than a modern
FPGA, when by all accounts the affordable/"consumer grade" FPGA lines
are still considerably more limited than a 20-year-old ASIC.



It seems like they probably could have made it work.
Granted, whether or not that would have made IA-64 "not suck" remains
to be seen.

Granted, I took a different approach from IA-64, as I realized fairly
early on that I didn't have the compiler technology to make things work
as IA-64 had intended, but could do a "cheaper" alternative to in-order
superscalar.

Though, at present, it seems things are pretty close, and it could have
made sense to also try to prioritize ease of superscalar as well.
But, maybe, say, as a CPU for home game-consoles or set-top
boxes?...
Or those thin clients that did little other than dial into the
internet and run a web-browser?...
It doesn't have any advantages for these roles over simpler, cheaper and
faster RISC or x86 designs.
Except, if they could have made the chip both cheaper and faster than a
corresponding OoO x86 chip.

As I understand it, this was the promise of IA-64.


It is like, if one looks at a Xeon, and then concludes that the Atom
would have been impossible, because of how expensive and power hungry
the Xeon is.

They could have made a chip, say, with only a tiny fraction as much
cache, ...


But, as noted, competing against the ARM chips that existed at the time
probably would have been no go.

Say:
16x 32-bit GPRs, fixed-length 32-bit instructions (say, ARMv5),
partial dual-issue, ...

Would likely have been pretty hard to beat in terms of cost, with
something like an IA-64.
John
Anton Ertl
2024-02-18 08:26:24 UTC
Permalink
Post by BGB
The huge number of architectural registers (128 64-bit integer, 128
82-bit floating point) would have made shrinks hard.
By the time the Itanium and Itanium II were delivered, not really. At
that time they already had the Pentium 4 with 128 physical integer
registers and 128 FP/SIMD registers
and the Pentium 4 was the bread-and-butter CPU for Intel; and if it
had been less power-hungry, it would also have been used for mobile.
Post by BGB
I think the idea was that they already had 100+ registers internally
with their x86 chips (due to register renaming). And, the idea of having
128 GPRs in the IA-64, was to eliminate the register renaming?...
No. The idea was that the IA-64 implementations would be ready in
1997, and that it would be superior in performance to the OoO
competition. That's also why they wanted to introduce it to the
market from the high end.

Another idea (and you see it in the IA-64 name that was later dropped
in favour of IPF, and in the IA-32 name that was invented around the
same time) was that in the transition to 64 bits, Intel's customers
would switch from IA-32 to IA-64, and of course that would happen on
servers and workstations first.

The reality was that IA-64 implementations were never generally
superior to the OoO competition. They were doing fine in HPC stuff,
but sucked in anything where performance is not dominated by simple
(software-pipelineable) loops.
Post by BGB
Except, if they could have made the chip both cheaper and faster than a
corresponding OoO x86 chip.
As I understand it, this was the promise of IA-64.
Yes. They just were not able to keep it. And the reason is that they
thought that scheduling in hardware is hard and inefficient, but it
turns out that branch prediction at compile time is so much worse than
hardware branch prediction at run-time that EPIC was not competitive
with OoO.
Post by BGB
It is like, if one looks at a Xeon, and then concludes that the Atom
would have been impossible, because of how expensive and power hungry
the Xeon is.
They wanted to produce superior performance by being wider than (they
thought) was practical for OoO: 6 wide for Merced and McKinley (later,
with Poulson, 12 wide). They did not produce superior performance,
and nowadays, the Cortex-X4 is 10-wide; and Golden Cove (Alder Lake
P-core) renames 6 instructions per cycle and at the same time
eliminates transitive moves and also transitive addition-by-constants.
Post by BGB
They could have made a chip, say, with only a tiny fraction as much
cache, ...
Yes, they could have made a, say, 3-wide IA-64 implementation and
designed it for low power and low area. The result would have been
even slower than the implementations they actually produced. But of
course, given that they thought that their architecture would show its
strengths at wide designs, they certainly did not want to go there at
the start.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Lawrence D'Oliveiro
2024-02-17 22:09:40 UTC
Permalink
But most of all,
the design is based on the compilers being able to solve a problem that
can't be solved in practice: static scheduling of memory loads in a
system with multiple levels of cache.
That seems insane. Since when did architectural specs dictate the levels
of cache you could have? Normally, that is an implementation detail, that
can vary between different instances of the same architecture.
John Levine
2024-02-17 22:25:24 UTC
Permalink
Post by Lawrence D'Oliveiro
But most of all,
the design is based on the compilers being able to solve a problem that
can't be solved in practice: static scheduling of memory loads in a
system with multiple levels of cache.
That seems insane. Since when did architectural specs dictate the levels
of cache you could have? Normally, that is an implementation detail, that
can vary between different instances of the same architecture.
The point of VLIW was to schedule this stuff statically at compile
time to make the best use of the memory architecture. It more or less
worked in the 1980s but as memory architectures got more complex, and
dynamic hardware scheduling got better, VLIW performance could never
keep up.
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Lawrence D'Oliveiro
2024-02-18 00:24:05 UTC
Permalink
A potential alternative would be something like a scaled-up 64-bit
variant of an ESP32 style design (or a 64-bit version of the Qualcomm
Hexagon).
Would you end up with something similar to RISC-V?
BGB
2024-02-18 23:28:07 UTC
Permalink
Post by Lawrence D'Oliveiro
A potential alternative would be something like a scaled-up 64-bit
variant of an ESP32 style design (or a 64-bit version of the Qualcomm
Hexagon).
Would you end up with something similar to RISC-V?
Well, like RISC-V with explicitly parallel instructions (rather than
implicitly via superscalar).

Both ESP32 and Hexagon had used instruction words that could be tagged
to execute in parallel. RISC-V doesn't do this, so the CPU would need to
look at the instructions (and check for register conflicts), before
deciding to do so. For a cost-effective implementation, this logic is
necessarily conservative.
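A rough sketch of what that conflict check amounts to, whether it is done by the decoder or by a compile-time bundling pass (the register fields and the pairing rule here are simplified and illustrative, not any specific ISA's):

  #include <stdbool.h>
  #include <stdio.h>

  typedef struct {
      int rd;         /* destination register, -1 if none */
      int rs1, rs2;   /* source registers, -1 if unused   */
  } DecodedOp;

  /* May op a and the following op b execute in the same cycle? */
  static bool can_pair(DecodedOp a, DecodedOp b)
  {
      if (a.rd >= 0 && (b.rs1 == a.rd || b.rs2 == a.rd))
          return false;             /* RAW: b reads what a writes        */
      if (a.rd >= 0 && a.rd == b.rd)
          return false;             /* WAW: both write the same register */
      return true;  /* a conservative implementation says "no" on anything unclear */
  }

  int main(void)
  {
      DecodedOp add = { .rd = 5, .rs1 = 1, .rs2 = 2 };  /* x5 = x1 + x2             */
      DecodedOp sub = { .rd = 6, .rs1 = 5, .rs2 = 3 };  /* x6 = x5 - x3 (RAW on x5) */
      DecodedOp xr  = { .rd = 7, .rs1 = 1, .rs2 = 3 };  /* independent              */

      printf("add+sub: %d\n", can_pair(add, sub));  /* 0: must serialize */
      printf("add+xr : %d\n", can_pair(add, xr));   /* 1: may dual-issue */
      return 0;
  }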



Or, something like my own BJX2 ISA, but it seems it isn't *that* far
from RISC-V in some areas, and trying to support both in the same CPU
core has led to some amount of convergence (in many cases where RV64 had a
feature but BJX2 lacked a corresponding one, BJX2 has ended up gaining the
feature in question, albeit often slightly modified, as the mechanisms
often don't demand that exactly the same instruction be implemented in
exactly the same way).

Say, for example, GT and LT are mirrors of each other; immediate size
and encoding matter a lot to the decoder but are mostly invisible to the
execute-stage logic, etc...

Well, and some major features, such as the presence or absence of ALU
status flags, didn't matter as neither ISA uses ALU status flags.

In my case, SR.T doesn't really count, as it is used almost
exclusively as a predication-control flag. If I were to do a new ISA
with a similar design, SR.T would likely be made exclusively a
predication-control flag, and I would find some other way to do "ADD with
Carry" and similar.


In my case, there are 1/4 as many hardware registers as IA-64, and 1/3
as would be needed for the RISC-V Privileged spec.


Though, it would still face the potential issue that WEX (Wide-EXecute)
eats some amount of encoding space, and on a "good" superscalar
implementation, or with OoO, its existence would become mostly moot (it
mostly matters for cores that are too cheap to do superscalar
effectively, but expensive enough to justify executing
instructions in parallel if the compiler helps them out).

Looks like scaling this up past a width of 2 or 3 becomes mostly no-go,
and say a core with 4+ lanes would almost invariably need to be OoO.

Say, the cost of the Execute stages goes up steadily, but the ability of
the compiler to statically schedule things becomes steadily less effective
(it seemingly doesn't really work much past 3-wide and a naive
strictly-in-order pipeline).


In practice, this part of the space seems to be mostly dominated by
higher-end microcontrollers, and DSPs (with non-budget "application
class" chips and above mostly going over to OoO).

Meanwhile, the low end of the microcontroller space tends to be
dominated by 8/16-bit scalar processors, and this probably isn't going
to change anytime soon (like, if you don't need more than a 16-bit
ALU and 2K of RAM, why go for anything bigger?...).

Like, the processing requirements for keyboards and mice haven't changed
much (and the main thing they need to deal with is the needless
complexity of the USB protocol).



In my case:

The gains of 3-wide over 2-wide are small; the main reason it is 3-wide in
my case is that, if I am already paying for 96-bit decode and
a 6R3W register file to get full advantage of the 2-wide case, the
incremental cost of a 3rd ALU and similar is low.

Though, I did save some cost by eliminating the Lane3 integer shift, since
Lane3 integer shift is rare and the integer shift logic isn't cheap (at
least vs ADD/SUB/AND/OR/XOR and MOV/EXTS/EXTU). Lane3 exists
mostly to provide spare register ports and handle the occasional MOV or ALU op.

Granted, my compiler's strategy is fairly naive:
First, emit code as if it were for a plain RISC-style ISA;
Then feed it through a stage that tries to shuffle and bundle instructions:
Shuffle first, to try to untangle RAW dependencies;
Then bundle, to try to increase ILP
(though this typically increases the number of RAW dependencies).

Code which has a big pile of mostly independent operations tends to
do better here (and code with lots of parallel expressions and lots of
variables seems to be an area where my ISA is beating RISC-V).

Note that for code with "small tight loops", it is a lot closer, and in
some cases, this is an area where some of RISC-V's design choices make
more sense.

For example, 2-register compare-and-branch operations are in
general kind of expensive, but seem to be useful in some cases with
tightly running loops:
while(cs<cse)
*ct++=*cs++;


But, in terms of facing off against RISC-V on performance, I
have ended up partly re-evaluating them.

And, in the case of tight loops, even the limitation that, in my case, they
only have an 8-bit displacement is less of an issue:
XG2:
BRLT Rm, Rn, Disp8s //Branch if Rn<Rm, +/- 256B
BRLT Rn, Disp13s //Branch if Rn< 0, +/- 8K
CMPGT Rn, Rm; BT Disp23s //Branch if Rm>Rn, +/- 8MB
Baseline:
BRLT Rm, Rn, Disp8s //Branch if Rn<Rm, +/- 256B
BRLT Rn, Disp11s //Branch if Rn< 0, +/- 2K
CMPGT Rn, Rm; BT Disp20s //Branch if Rm>Rn, +/- 1MB


If the loop is tight, then the branch target is much more likely to be
within the 256-byte window (and, if the CPU has the RV decoder, it
already needs to pay for the EX logic needed to support this
instruction). So it is more an open question of "is it worthwhile to
have this instruction in a CPU that doesn't have a RISC-V decoder?"

But it looks like, for best performance, it may be inescapable.
For the 32-bit encoding, displacement doesn't get any bigger, but it is
possible to use a jumbo-encoding to expand it to a 32-bit displacement.


Main weak points on the RV side are still the usual:
Lack of indexed load/store;
Poor handling of constants that don't fit in 12 bits.

So, say, Imm12s for ALU ops is arguably better than Imm9u/Imm9n/Imm10s
(or Imm10u/Imm10n/Imm11s), but not by enough to offset the ISA
effectively falling on its face when Imm12s fails.
Could be helped though if RV added a "LI Xd, Imm17s" instruction.

The "SHnADD" can instruction can help with indexed Load/Store, but would
not entirely "close the gap" for programs like Doom or similar (may or
may not make a difference for Dhrystone, where things are pretty tight,
but seemingly much of this is due to a relative lack of array-oriented
code in Dhrystone, so SHnADD would, similarly, not gain so much here).
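To illustrate the indexed load/store point, here is the same array access with the instruction sequences sketched in a comment (the sequences are illustrative, not actual compiler output, and the scaled-indexed form is written in a generic notation):

  #include <stdio.h>

  /* Plain RV64I:           With Zba (SHnADD):      Scaled indexed load:
   *   slli t0, a1, 2         sh2add t0, a1, a0       lw t1, [a0 + a1*4]
   *   add  t0, a0, t0        lw     t1, 0(t0)
   *   lw   t1, 0(t0)
   */
  static int get_elem(const int *a, long i)
  {
      return a[i];   /* one C expression; 3 / 2 / 1 instructions above */
  }

  int main(void)
  {
      int table[4] = { 10, 20, 30, 40 };
      printf("%d\n", get_elem(table, 2));   /* prints 30 */
      return 0;
  }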



But, yeah, higher priorities, if I were to redo things:
Put GBR and LR into GPR space;
Having these in CR space negatively affects prologs and epilogs;
Make encoding rules more consistent;
...


As noted, some of my ideas would make the first 4 registers special:
R0: ZR / PC
R1: LR / TBR (TP)
R2: SP
R3: GBR (GP)

But, likely keep similar register assignments to my existing ISA (but,
unlike my existing register space, could also be directly compatible
with the RISC-V ABI; without something like the existing XG2RV hack).
Putting LR and GBR in GPR space would help with prolog/epilog sequences;
A zero register would eliminate a lot of special cases.

But, this is not compatible with my existing ABI.
Where R2/R3 are used by the existing ABI, and R15 is SP.


But, unclear if it would be worthwhile, since any "redo" would still
have many of my existing issues, and is possibly moot *unless* it can
also either have a significant performance advantage over both RISC-V
and my existing ISA design, or if I instead switched over to RISC-V as
the primary ISA (and implemented the privileged spec) to potentially
leverage the ability to run an existing OS (such as Linux).

But, this latter point would likely only really matter if I found
detailed documentation, say, on SiFive's memory map and hardware
interfaces, and also cloned this (otherwise, a custom CPU core still
isn't going to be able to run an "off the shelf" Linux build; even if
the ISA itself matches up).

Though, a more intermediate option might just be to consider eliminating
the Read-Only pages range (from 0x00010000..0x00FFFFFF), and instead
moving this over to RAM (with 00000000..0000BFFF as ROM, and
0000C000..0000FFFF as SRAM), which would make the hardware memory map
more compatible with what GCC expects (though, would need to take care
such that loading the kernel doesn't stomp the Video RAM or similar,
which had otherwise been mapped into this "hidden" part of the RAM space).

Well, and any OS porting attempt would need to deal with things like the
different hardware interfaces and different approaches to interrupt
handling and similar (this mostly affects one's ASM code).

But, I guess, one limitation seems to be:
Vs the existing ISAs, there is little real way to gain much additional
performance for normal integer workloads (at the ISA level);
Similarly, there is no obvious way to either make the core significantly
cheaper or to make significant gains in terms of clock speed via ISA-level
design choices.



Some of my design attempts had lost the PrWEX encodings, but, if one
allows for a superscalar implementation, the loss of PrWEX isn't too bad
(could try to use optional superscalar support to mop this up).
Similarly, PrWEX is a fairly small subset of the total instruction count.


Say, if seen as 2 bits in the instruction word:
00: Execute if True
01: Execute if False
10: Scalar
11: Wide-Execute (arguably redundant on OoO).

Though, a few "arguably very wasteful" instructions (Load 24 bit
constants into a fixed register), have served multiple purposes (serving
as both the Jumbo Prefixes and PrWEX blocks), and in redesign attempts,
trying to eliminate or replace this "obvious waste" has left the problem
of how to deal with PrWEX and Jumbo prefixes. The unconditional branch
doesn't quite fit, as it also needs to be able to be predicated.

Though, I guess it could also be left as a more generalized:
Unconditional and Scalar-Only block.



But, my existing ISA has continued on with incremental fiddling:

The most recent change was tweaking the decoding rules for the AND
and RSUB instructions, as I was faced with an issue:
Negative immediates for AND and RSUB were a lot more common than
expected, but were not encodable with the prior rules;
And there isn't really enough encoding space to add new ones-extended
variants of these.

So, ended up changing the rules, noting that this seemingly would not
break existing binaries (the previous compiler output was not encoding
AND with immediate values between 512 and 1023), and would allow
encoding the apparent 12-15% of cases where the immediate was negative
(vs the roughly 1.5% of potential immediate values between 512 and 1023).
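For what it's worth, a common source of those negative AND immediates is clearing low-order bits with a complemented mask, as in this small C illustration:

  #include <stdio.h>

  int main(void)
  {
      unsigned long long p = 0x12345;

      /* ~0xF and ~0xFFF are, as immediates, the negative values -16 and
       * -4096 (all upper bits set), so only a sign/ones-extended immediate
       * form can encode them directly in a single AND-immediate. */
      unsigned long long a16  = p & ~0xFULL;    /* align down to 16   */
      unsigned long long page = p & ~0xFFFULL;  /* align down to 4096 */

      printf("%#llx -> %#llx, %#llx\n", p, a16, page);
      return 0;
  }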

Also looked at OR and XOR (which were in a similar situation), but noted
that it seems that negative operands to OR and XOR are very rare
(roughly 1% across multiple programs), so it is better to leave these as
the prior rule (even if this now partially breaks symmetry between
AND/OR/XOR).

The percentage of negative inputs to AND was missed previously, as the
stats weren't really distinguishing things based on which operator was
being looked at.

Note that for ADD (with both ADD and SUB combined into one operation):
Balance is ~ 60% positive, 40% negative.
So, zero-extended immediate values would only make sense with separate
ADD/SUB.



It is harder to confirm whether my original choice of Disp10s
for Load/Store, rather than Disp10u, was better. The ratio of
displacements in -511..-64 vs 512..1023 seems to vary between
programs, and is pretty close either way (in any case, probably not
worth dealing with binary breakage over something with an epsilon of
around 1%, and supporting 32-bit ops with a negative displacement is
slightly better in general than supporting only a positive displacement...).

...
Anton Ertl
2024-02-18 09:11:26 UTC
Permalink
Say, how well IA-64 could perform if only given, say, 16K of L1I$ and
128K of L2 cache, ...
Itanium (Merced) has 16KB I-cache and 96KB L2 cache.

Itanium 2 (McKinley) has 16KB I-cache and 256KB L2 cache.

They both have L3 caches.

But these CPUs actually do fine (for their time) on HPC-style stuff,
so the cache sizes are not the main problem. They perform badly at
code where the compiler cannot predict the branches well, even on
code that tends to perform well with small caches.

Of course, the Cortex-A53 and Bonnell also perform badly, for the
same reason, and Intel learned the lesson and replaced the in-order
Bonnell with the OoO line beginning with Silvermont, and up to the
recent Gracemont (Alder Lake E-core). Apple also went for OoO
E-cores. Only ARM is sticking to in-order cores.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
John Dallman
2024-02-17 22:30:00 UTC
Permalink
Post by Lawrence D'Oliveiro
But most of all, the design is based on the compilers being
static scheduling of memory loads in a system with multiple
levels of cache.
That seems insane.
To a modern understanding, it is insane. That's why I try to explain to
people who think "weird architecture from twenty years ago, didn't work
out, maybe I could make it work" that it is fundamentally flawed.
Post by Lawrence D'Oliveiro
Since when did architectural specs dictate the levels of cache
you could have? Normally, that is an implementation detail, that
can vary between different instances of the same architecture.
IA-64 did not attempt to dictate that, and implementations did have
varying levels and sizes of cache. That makes the attempt at static load
scheduling impractical, even if the processor wasn't taking interrupts.


John
Lawrence D'Oliveiro
2024-02-18 00:27:21 UTC
Permalink
Post by John Dallman
To a modern understanding, it is insane.
I think that was already becoming apparent even before it finally shipped.

I think HP and Intel started the project around 1990, and it only reached
production quality by nearly the end of that decade. During that time,
RISC architectures continued to improve, with things like superscalar,
multiple function units and out-of-order execution--basically leaving
IA-64 in the dust before it could even ship.

I think it was only fear of loss of corporate face that kept the project
going when it became clear it should have been abandoned.
Scott Lurndal
2024-02-18 01:10:46 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
To a modern understanding, it is insane.
I think that was already becoming apparent even before it finally shipped.
I think HP and Intel started the project around 1990,
HP and Intel didn't join forces on what became Itanium
until Intel gave up on the P7 project in 1994.
John Dallman
2024-02-18 08:59:00 UTC
Permalink
Post by Scott Lurndal
Post by Lawrence D'Oliveiro
I think HP and Intel started the project around 1990,
The HP and Intel didn't join forces on what became Itanium
until intel gave up on the P7 project in 1994.
And they didn't start publicising it until 1998, IIRC. If they thought it
wasn't going to work, they could have quietly cancelled it.

It seems to have been a result of groupthink that got established, rather
than face-saving. It was moderately convincing at the time; it took me a
fair while to abandon the intuitive reaction that it ought to be very
fast, and accept that measurements were the only true knowledge.

John
Anton Ertl
2024-02-18 11:50:49 UTC
Permalink
Post by John Dallman
And they didn't start publicising it until 1998, IIRC.
Well, according to ZDNet
<https://web.archive.org/web/20080209211056/http://news.zdnet.com/2100-9584-5984747.html>,
Intel and HP announced their collaboration in 1994, and revealed more
details in 1997. I find postings about IA64 in my archive from 1997,
but I remember reading stuff about it with no details for several
years. I posted my short review of the architecture in October 1999
<https://www.complang.tuwien.ac.at/anton/ia-64-1999.txt>, so by that
time the architecture specification had already been published.
Post by John Dallman
If they thought it
wasn't going to work, they could have quietly cancelled it.
After the 1994 announcement, some people might have asked at one point
what became of the project, but yes.
Post by John Dallman
It seems to have been a result of groupthink that got established, rather
than face-saving.
Yes.
Post by John Dallman
It was moderately convincing at the time; it took me a
fair while to abandon the intuitive reaction that it ought to be very
fast, and accept that measurement were the only true knowledge.
I certainly thought at the time that they were on the right track.
Everything we knew about the success of RISC in the 1980s and about
the difficulties of getting more instruction-level parallelism in the
early 1990s suggested that EPIC would be a good idea.

The worrying thing is that a few decades later, these ideas are still
so seductive, and the reasons why OoO+SIMD worked out better
are still so little known that people still think that EPIC (and its
incarnations IA-64 and Transmeta) is basically a good idea that just
had some marketing mistake (e.g., in this thread), or just would need
a few more good ideas (e.g., the Mill with its belt rather than
rotating register files).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
John Dallman
2024-02-18 15:44:00 UTC
Permalink
Post by Anton Ertl
The worrying thing is that a few decades later, these ideas are
still so seductive, and the reasons of why OoO+SIMD worked out
better are still so little-known that people still think that
EPIC (and their incarnations IA-64 and Transmeta) are basically
good ideas that just had some marketing mistake (e.g., in this
thread),
IA-64 certainly did have some marketing mistakes, but they weren't what
sank it.
Post by Anton Ertl
or just would need a few more good ideas (e.g., the Mill with
its belt rather than rotating register files).
That . . . seems fair, actually. Oh, well. I'll pull it out of my list of
tentative platform names.

It's been clear to me for a while that the differences between
conventional ISAs aren't actually very important, provided they can
exploit all the memory and cache bandwidth and latency available. As
things evolve, new problems arise with existing ISAs.

The triumph of OoO as a means of managing the delays between memory and
CPU suggests that an ISA that made it easier for a CPU to determine
dependencies in some way has potential to make fast processors cheaper. I
don't know how to do that, but it's worth thinking about.

John
MitchAlsup1
2024-02-18 21:48:57 UTC
Permalink
The worrying thing is that a few decades later, these ideas are still so
seductive, and the reasons why OoO+SIMD worked out better are
still so little-known that people still think that EPIC (and their
incarnations IA-64 and Transmeta) are basically good ideas that just had
some marketing mistake ...
The equivalent on the software side would be microkernels--again, there
are those who still think they can be made to work efficiently, in spite
of mounting evidence to the contrary.
When context switches take 1,000+ cycles but CALL/RET only take 5, µKernels
will never succeed. {That is a full context switch including ASID, IP, ROOT
pointers, complete register file, and all associated thread-state.}

µKernels can only succeed when context switch times are similar to CALL/RET.
Otherwise the performance requirements will end up dictating monolithic design.
Also, SIMD, while very fashionable nowadays, with its combinatorial
explosion in the number of added instructions, does tend to make a mockery
of the “R” in “RISC”. That’s why RISC-V is resurrecting the old Cray-style
long vectors instead.
Which is the point I have been stressing over the last ~year~: the R in
RISC needs to actually mean REDUCED. {{Any ISA with more than 200
instructions cannot be called RISC.}}
BGB
2024-02-19 00:13:42 UTC
Permalink
The worrying thing is that a few decades later, these ideas are still so
seductive, and the reasons why OoO+SIMD worked out better are
still so little known that people still think that EPIC (and their
incarnations IA-64 and Transmeta) are basically good ideas that just had
some marketing mistake ...
The equivalent on the software side would be microkernels--again, there
are those who still think they can be made to work efficiently, in spite
of mounting evidence to the contrary.
I suspect it is relative costs:
System call vs task switch;
Number of calls or task switches needed for a request;
...

Say:
Monolithic kernel:
OS request is 1 syscall.
Microkernel:
OS request is (2n)^d context switches.


Say, if a file IO request is handled directly by a system call, this is
faster than if one context switches to a VFS, then to the FS driver,
then to the block-device driver, and then all the way back up the path.
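
To put rough numbers on this, here is a minimal back-of-the-envelope
sketch in C. The context-switch and CALL/RET cycle counts are the rough
figures quoted upthread; the trap cost and the hop counts are
illustrative assumptions, not measurements of any real kernel.

  #include <stdio.h>

  /* Illustrative cost model only: the context-switch and CALL/RET cycle
     counts are the rough figures quoted upthread, the trap cost and the
     hop counts are assumptions, not measurements of any real kernel. */
  #define CTX_SWITCH_CYCLES 1000   /* full context switch, as cited above  */
  #define CALL_RET_CYCLES      5   /* in-kernel function call/return       */
  #define SYSCALL_CYCLES     150   /* assumed user->kernel->user trap cost */

  int main(void)
  {
      /* Monolithic: one trap, then VFS -> FS -> block driver as plain calls. */
      long mono = SYSCALL_CYCLES + 3 * CALL_RET_CYCLES;

      /* Microkernel: the request crosses to the VFS server, the FS server
         and the block-driver server, then unwinds back: roughly 6 switches. */
      long micro = 6 * CTX_SWITCH_CYCLES;

      printf("monolithic ~%ld cycles, microkernel ~%ld cycles (~%.0fx)\n",
             mono, micro, (double)micro / mono);
      return 0;
  }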


Hybrid approaches can work, say, where normal memory and filesystem
calls are handled by the kernel, but things like GUI can be handled with
IPC calls.

Granted, none of the mainstream OS's run the GUI directly in the kernel,
so this may not be a factor.
Also, SIMD, while very fashionable nowadays, with its combinatorial
explosion in the number of added instructions, does tend to make a mockery
of the “R” in “RISC”. That’s why RISC-V is resurrecting the old Cray-style
long vectors instead.
This is partly why I say it can be done, but it is preferable to go down
the "x" path, and not the "x^2" path.

NEON and the RISC-V 'P' extension are examples of what happens if one
goes head-first into the x^2 path...

Though, I suspect a partial reason why seemingly no one seriously
promotes the 'P' extension is that it is fairly obvious that this is
most likely unworkable.
Lawrence D'Oliveiro
2024-02-19 02:13:08 UTC
Permalink
... things like GUI can be handled with IPC calls.
Which is how X11 and Wayland do it. The bottleneck is in the user response
time, so the overhead of message-passing calls is insignificant.
Granted, none of the mainstream OS's run the GUI directly in the kernel,
so this may not be a factor.
Both Microsoft and Apple do tie their GUIs quite inextricably into the OS
kernel. That’s why you can’t customize them--at least, not in any easy way
that doesn’t threaten the stability of the system.
BGB
2024-02-19 08:51:55 UTC
Permalink
Post by Lawrence D'Oliveiro
... things like GUI can be handled with IPC calls.
Which is how X11 and Wayland do it. The bottleneck is in the user response
time, so the overhead of message-passing calls is insignificant.
IIRC, X11 had worked by passing message buffers over Unix sockets (with
Xlib as a wrapper interface over the socket-level interface).

These sockets apparently used a datagram based structure, similar to
UDP, except that message delivery was reliable and order preserving
(whereas UDP could deliver messages out of order, duplicated, or not at
all).

Though, admittedly, haven't looked in too much detail as to how the X11
protocol's messaging worked, as I didn't want to go this direction
(seemed like a needlessly high overhead way to approach a GUI).
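
The semantics described there (reliable, order-preserving messages over
a local socket) are essentially what a SOCK_SEQPACKET Unix-domain socket
provides; Xlib itself actually uses a plain SOCK_STREAM connection and
does its own request framing. A minimal sketch of the former, with the
socket path being just an example name:

  /* Connection-oriented, reliable, order-preserving datagram socket
     (SOCK_SEQPACKET over AF_UNIX).  "/tmp/demo.sock" is just an example. */
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <sys/un.h>

  int main(void)
  {
      struct sockaddr_un sa;
      int fd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
      if (fd < 0) { perror("socket"); return 1; }

      memset(&sa, 0, sizeof sa);
      sa.sun_family = AF_UNIX;
      strncpy(sa.sun_path, "/tmp/demo.sock", sizeof sa.sun_path - 1);

      if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
          perror("connect");          /* needs a listening peer, of course */
          return 1;
      }
      /* Each send() is delivered as one message, in order, or not at all. */
      send(fd, "hello", 5, 0);
      close(fd);
      return 0;
  }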


In my experimental GUI API used in TestKern (also used for full-screen
display as well), it instead maps a COM Object style interface over
system calls.

IIRC, Syscall Numbers:
1000..11FF: Plain Syscall
1200..13FF: Method Number (IPC Object)

When a method number is given, an object is supplied which receives the
method call, and may be redirected to the task containing the exported
object. There are two sub-variants of the mechanism, one where the
object on the client is merely a placeholder stub, and the actual object
only exists in the target task; and another variant where the object is
shared with the client task, but method calls (from any other task) will
redirect to the owning task.

The former type was used for TKGDI, whereas the latter was used for
TKRA-GL's OpenGL context.

There was an idea that any task could export an object (spawning an
appropriate handler task for the object's interface), but this part
isn't implemented as of yet (so, TKGDI and TKRA-GL exist mostly as
kernel-tasks).

Generally, exported interfaces would be identified by one of:
A pair of FOURCC or EIGHTCC values (the general idea being that FOURCC
values are zero-extended to 64 bits);
A 128-bit GUID, effectively using the EIGHTCC pair as a GUID (it is
possible to distinguish between the FOURCC, EIGHTCC, and GUID values,
mostly by looking at the bit-pattern in the pair of 64-bit values).

Where, the idea is that "public" interfaces would be described using
FOURCC or EIGHTCC values, and "private" interfaces and object instances
would use GUIDs.
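
As a rough illustration of that identification scheme: a FOURCC packs
into the same zero-extended 64-bit slot an EIGHTCC occupies, and the
pair of 64-bit values can then be classified by bit pattern. The helper
names, the "TKGD"/"DRAWCTX" codes, and the classification rule below are
my own assumptions for illustration, not TestKern's actual encoding.

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Hypothetical sketch: a pair of 64-bit values holding either two
     FOURCC/EIGHTCC codes or a 128-bit GUID.  The packing (FOURCC
     zero-extended to 64 bits) follows the description above; the names
     and the classification rule are assumptions, not the real encoding. */

  typedef struct { uint64_t lo, hi; } iface_id;

  static uint64_t pack_cc(const char *s)    /* 4 or 8 chars, zero-extended */
  {
      uint64_t v = 0;
      size_t n = strlen(s);
      memcpy(&v, s, n > 8 ? 8 : n);
      return v;
  }

  static int looks_like_cc(uint64_t v)      /* printable ASCII, zero padded */
  {
      for (int i = 0; i < 8; i++) {
          unsigned char c = (unsigned char)(v >> (8 * i));
          if (c != 0 && (c < 0x20 || c > 0x7E))
              return 0;
      }
      return v != 0;
  }

  int main(void)
  {
      iface_id pub  = { pack_cc("TKGD"), pack_cc("DRAWCTX") };  /* made up */
      iface_id priv = { 0x123e4567e89b12d3ULL, 0xa456426614174000ULL };

      printf("pub  looks like %s\n",
             looks_like_cc(pub.lo) && looks_like_cc(pub.hi)
             ? "a FOURCC/EIGHTCC pair" : "a GUID");
      printf("priv looks like %s\n",
             looks_like_cc(priv.lo) && looks_like_cc(priv.hi)
             ? "a FOURCC/EIGHTCC pair" : "a GUID");
      return 0;
  }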


Though, more work is still needed in these areas.

The existing implementation is a bit hacky and a bunch of stuff is still
hard-coded (say, requests for interfaces are routed directly, rather
than by registering an interface object and directing interface queries
through the registered interfaces, ...).


There is also a distinction between "local" and "global" memory, where
only global memory will necessarily be visible from other tasks.

Though, the model in this case is more like: every time a program
allocates memory with "GlobalAlloc", it behaves like a shared-memory
object that is simultaneously mapped to the same address in every
process on the system.

In this case, there would be separate address ranges for process-local
memory, say:
0000_00000000..3FFF_FFFFFFFF: Global
4000_00000000..7FFF_FFFFFFFF: Local
8000_00000000..BFFF_FFFFFFFF: System (Supervisor Mode, Memory)
C000_00000000..FFFF_FFFFFFFF: Special (Supervisor Mode, Hardware)
C000_00000000..CFFF_FFFFFFFF: No MMU, Cached
D000_00000000..DFFF_FFFFFFFF: No MMU, No-Cache
E000_00000000..EFFF_FFFFFFFF: Reserved
F000_00000000..FFFF_FFFFFFFF: MMIO
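
A minimal sketch of that split as C constants, with a classifier for the
top-level regions (the ranges are taken straight from the table above;
the type and function names are just illustrative, and the Supervisor
sub-ranges are not broken out):

  #include <stdint.h>
  #include <stdio.h>

  /* 48-bit address-space split from the layout above.  Names are
     illustrative, not taken from the actual TestKern sources. */
  typedef enum { AS_GLOBAL, AS_LOCAL, AS_SYSTEM, AS_SPECIAL } as_region;

  static as_region classify_addr(uint64_t a)
  {
      if (a <= 0x3FFFFFFFFFFFULL) return AS_GLOBAL;   /* 0000_.. - 3FFF_.. */
      if (a <= 0x7FFFFFFFFFFFULL) return AS_LOCAL;    /* 4000_.. - 7FFF_.. */
      if (a <= 0xBFFFFFFFFFFFULL) return AS_SYSTEM;   /* 8000_.. - BFFF_.. */
      return AS_SPECIAL;                              /* C000_.. - FFFF_.. */
  }

  int main(void)
  {
      static const char *name[] = { "Global", "Local", "System", "Special" };
      uint64_t probes[] = { 0x000012345678ULL, 0x400000001000ULL,
                            0x800000000000ULL, 0xF00000000100ULL };
      for (int i = 0; i < 4; i++)
          printf("%012llx -> %s\n", (unsigned long long)probes[i],
                 name[classify_addr(probes[i])]);
      return 0;
  }
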
Post by Lawrence D'Oliveiro
Granted, none of the mainstream OS's run the GUI directly in the kernel,
so this may not be a factor.
Both Microsoft and Apple do tie their GUIs quite inextricably into the OS
kernel. That’s why you can’t customize them--at least, not in any easy way
that doesn’t threaten the stability of the system.
AFAIK, Windows runs its GUI via service processes and IPC, rather than
by having the GUI as part of the kernel itself.

Apparently, there was a thing early on in the WinNT line where they
didn't want to have graphics drivers running in kernel space either, but
they eventually folded on this one, as running the graphics drivers in a
user-space process turned out to be unacceptable in terms of performance.


Though, it is still more tightly coupled than, say, the X11 Server,
which ran as a userspace process, with the window manager as another
process, ...


But, I guess I can contrast both of them with my project, where
effectively the existing GUI is part of the kernel, just forked off into
a kernel task (along with the systemcall handler task, ...).

...
Scott Lurndal
2024-02-19 15:06:35 UTC
Permalink
Post by BGB
Post by Lawrence D'Oliveiro
... things like GUI can be handled with IPC calls.
Which is how X11 and Wayland do it. The bottleneck is in the user response
time, so the overhead of message-passing calls is insignificant.
IIRC, X11 had worked by passing message buffers over Unix sockets (with
Xlib as a wrapper interface over the socket-level interface).
The shared memory extension allows clients to directly access
buffers in the server.

https://www.x.org/releases/X11R7.7/doc/xextproto/shm.html
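
A minimal sketch of the client side, assuming a running X server with
the MIT-SHM extension; error handling and the usual event loop are
omitted, and the 256x256 size is arbitrary:

  /* Draw into shared memory, then XShmPutImage() lets the server read
     the pixels directly instead of copying them through the X socket.
     Build with: cc shm.c -lX11 -lXext */
  #include <unistd.h>
  #include <sys/ipc.h>
  #include <sys/shm.h>
  #include <X11/Xlib.h>
  #include <X11/extensions/XShm.h>

  int main(void)
  {
      Display *dpy = XOpenDisplay(NULL);
      int scr = DefaultScreen(dpy);
      Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, scr), 0, 0,
                                       256, 256, 0, 0, BlackPixel(dpy, scr));
      GC gc = XCreateGC(dpy, win, 0, NULL);
      XMapWindow(dpy, win);

      XShmSegmentInfo shminfo;
      XImage *img = XShmCreateImage(dpy, DefaultVisual(dpy, scr),
                                    DefaultDepth(dpy, scr), ZPixmap,
                                    NULL, &shminfo, 256, 256);
      shminfo.shmid = shmget(IPC_PRIVATE, img->bytes_per_line * img->height,
                             IPC_CREAT | 0600);
      shminfo.shmaddr = img->data = shmat(shminfo.shmid, NULL, 0);
      shminfo.readOnly = False;
      XShmAttach(dpy, &shminfo);      /* server maps the same segment */

      /* The client scribbles directly into img->data here ... */

      XShmPutImage(dpy, win, gc, img, 0, 0, 0, 0, 256, 256, False);
      XSync(dpy, False);
      sleep(2);

      XShmDetach(dpy, &shminfo);
      shmdt(shminfo.shmaddr);
      shmctl(shminfo.shmid, IPC_RMID, NULL);
      XCloseDisplay(dpy);
      return 0;
  }
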
MitchAlsup1
2024-02-19 02:42:17 UTC
Permalink
Post by Anton Ertl
Post by John Dallman
And they didn't start publicising it until 1998, IIRC.
Well, according to ZDNet
<https://web.archive.org/web/20080209211056/http://news.zdnet.com/2100-9584-5984747.html>,
Intel and HP announced their collaboration in 1994, and revealed more
details in 1997. I find postings about IA64 in my archive from 1997,
but I remember reading stuff about it with no details for several
years. I posted my short review of the architecture in October 1999
<https://www.complang.tuwien.ac.at/anton/ia-64-1999.txt>, so by that
time the architecture specification had already been published.
Post by John Dallman
If they thought it
wasn't going to work, they could have quietly cancelled it.
After the 1994 announcement, some people might have asked at one point
what became of the project, but yes.
Post by John Dallman
It seems to have been a result of groupthink that got established, rather
than face-saving.
Yes.
Post by John Dallman
It was moderately convincing at the time; it took me a
fair while to abandon the intuitive reaction that it ought to be very
fast, and accept that measurements were the only true knowledge.
I certainly thought at the time that they were on the right track.
In 1991, when I first heard of what became Itanic while designing a
6-wide GBOoO machine, we had a quick look-see and came to the conclusion
it was doomed from the start.
Post by Anton Ertl
Everything we knew about the success of RISC in the 1980s and about
the difficulties of getting more instruction-level parallelism in the
early 1990s suggested that EPIC would be a good idea.
We came to the opposite conclusion.
Post by Anton Ertl
The worrying thing is that a few decades later, these ideas are still
so seductive, and the reasons why OoO+SIMD worked out better
are still so little known that people still think that EPIC (and their
incarnations IA-64 and Transmeta) are basically good ideas that just
had some marketing mistake (e.g., in this thread), or just would need
a few more good ideas (e.g., the Mill with its belt rather than
rotating register files).
- anton
Lawrence D'Oliveiro
2024-02-19 05:05:47 UTC
Permalink
Post by MitchAlsup1
We came to the opposite conclusion.
As they say, hindsight is 6/6.
Scott Lurndal
2024-02-18 16:16:10 UTC
Permalink
Post by John Dallman
Post by Scott Lurndal
Post by Lawrence D'Oliveiro
I think HP and Intel started the project around 1990,
HP and Intel didn't join forces on what became Itanium
until Intel gave up on the P7 project in 1994.
And they didn't start publicising it until 1998, IIRC. If they thought it
wasn't going to work, they could have quietly cancelled it.
I was at SGI in 1998, when some of SGI's compiler technology was
being considered for Merced.
Post by John Dallman
It seems to have been a result of groupthink that got established, rather
than face-saving. It was moderately convincing at the time; it took me a
fair while to abandon the intuitive reaction that it ought to be very
fast, and accept that measurements were the only true knowledge.
While that's fair, I'd suggest that there haven't been many successes
in the industry when attempting radical new architectures (Cray aside).
Lawrence D'Oliveiro
2024-02-18 21:12:08 UTC
Permalink
... I'd suggest that there haven't been many successes in
the industry when attempting radical new architectures (Cray aside).
Risky ideas are risky ...

After he left CDC, one might say Seymour Cray’s only real success was the
Cray-1. Not sure if the Cray-2 made much money, and the 3 and 4 didn’t
even make it into regular production.
MitchAlsup1
2024-02-18 21:41:55 UTC
Permalink
Post by Lawrence D'Oliveiro
... I'd suggest that there haven't been many successes in
the industry when attempting radical new architectures (Cray aside).
Risky ideas are risky ...
After he left CDC, one might say Seymour Cray’s only real success was the
Cray-1. Not sure if the Cray-2 made much money, and the 3 and 4 didn’t
even make it into regular production.
Seymour's talent was in packaging not in computer architecture.
Thornton was the computer µarchitect of the group.
Lawrence D'Oliveiro
2024-02-18 23:48:46 UTC
Permalink
Post by MitchAlsup1
Seymour's talent was in packaging not in computer architecture.
Bit unlikely, considering his supers didn’t use any very fancy packaging
techniques at all.
John Levine
2024-02-19 16:25:19 UTC
Permalink
Post by MitchAlsup1
Seymour's talent was in packaging not in computer architecture.
Bit unlikely, considering his supers didn’t use any very fancy packaging
techniques at all.
Huh? Maybe not for individual chips, but the wiring and cooling and overall
physical design were famous. Here's an article about it:

https://american.cs.ucdavis.edu/academic/readings/papers/CRAY-technology.pdf
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Lawrence D'Oliveiro
2024-02-20 01:05:17 UTC
Permalink
Post by John Levine
Post by Lawrence D'Oliveiro
Post by MitchAlsup1
Seymour's talent was in packaging not in computer architecture.
Bit unlikely, considering his supers didn’t use any very fancy packaging
techniques at all.
Huh? Maybe not for individual chips, but the wiring and cooling and
overall physical design were famous.
From Charles J Murray’s “The Supermen” (1997), pages 128-129:

“Cray had avoided the use of integrated circuits, or chips, for
nearly six years. As early as 1966, when he’d started on the CDC
7600, integrated circuits were commercially available at about
five dollars each, making them roughly equivalent in price to a
pile of discrete components. Even then, engineers understood the
advantages of integrated circuits: They eliminated the need for
careful hand soldering of individual components to a printed
circuit board.

“But Cray had always made a point of lagging a generation behind
the technology curve. That was precisely what he’d done on the
6600—using the silicon transistor almost a decade after its
introduction. ...

“In 1972 Cray knew it was time to use integrated circuits.”

So the Cray-1 was his first computer using integrated circuits.
MitchAlsup1
2024-02-19 18:22:17 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by MitchAlsup1
Seymour's talent was in packaging not in computer architecture.
Bit unlikely, considering his supers didn’t use any very fancy packaging
techniques at all.
Consider cooling a refrigerator-sized computer that emits 300 kW
of heat?? That IS a packaging problem, and an interesting one, too.
Lawrence D'Oliveiro
2024-02-18 21:05:30 UTC
Permalink
Post by John Dallman
And they didn't start publicising it until 1998, IIRC. If they thought
it wasn't going to work, they could have quietly cancelled it.
I certainly heard about it before then. As I understood it, things went
quiet because it was taking longer than expected to make it all work. But
there were obviously those sufficiently high up in the management chain
who were determined not to be proven wrong. Otherwise, it could have been
cancelled.
Terje Mathisen
2024-02-19 22:04:50 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
And they didn't start publicising it until 1998, IIRC. If they thought
it wasn't going to work, they could have quietly cancelled it.
I certainly heard about it before then. As I understood it, things went
quiet because it was taking longer than expected to make it all work. But
there were obviously those sufficiently high up in the management chain
who were determined not to be proven wrong. Otherwise, it could have been
cancelled.
I ordered the Itanium architecture manual as soon as the CPU was
announced, and was very impressed. If it had turned up just 3 years
after that (instead of 7?), and at the originally promised speed/clock
frequency, it would have been extremely competitive indeed.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Paul A. Clayton
2024-02-23 03:39:04 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
To a modern understanding, it is insane.
I think that was already becoming apparent even before it finally shipped.
I think HP and Intel started the project around 1990, and it only reached
production quality by nearly the end of that decade. During that time,
RISC architectures continued to improve, with things like superscalar,
multiple function units and out-of-order execution--basically leaving
IA-64 in the dust before it could even ship.
I think it was only fear of loss of corporate face that kept the project
going when it became clear it should have been abandoned.
But how would you use a simulator if you don't have a compiler?
He said, well that's true we don't have a compiler yet, so I
hand assembled my simulations. I asked "How did you do thousands
of line of code that way?" He said “No, I did 30 lines of code”.
Flabbergasted, I said, "You're predicting the entire future of
this architecture on 30 lines of hand generated code?"
[chuckle], I said it just like that, I did not mean to be
insulting but I was just thunderstruck. Andy Grove piped up and
said "we are not here right now to reconsider the future of this
effort, so let’s move on". I said "Okay, it's your money, if
that's what you want."
Suddenly this came up again later in another guise but again
Andy shut me off, he said "we're not here to discuss it". Gordon
Moore is sitting next to me and hasn’t said a word, he looks to
all intents and purposes like he's asleep. He's got his eyes
closed most of the time, you think okay, the guy's tired, he's
old. But no, 20 minutes into this, he suddenly opens his eyes
and he points to me and he asks, "did you ever get an answer
to your question?" and I said, "actually no, none that I can
understand". Gordon looked around and says, "how are we planning
to move ahead with this, if the answers don't make sense?" and
this time Andy Grove said to him "We’re not here to discuss
that, Gordon".
Scott Lurndal
2024-02-17 15:36:02 UTC
Permalink
Post by John Dallman
Post by Lawrence D'Oliveiro
Post by John Dallman
In the late 1990s, when those decisions were made, smart
mobile devices didn't exist.
Actually, they did. PDAs, remember?
True, but batteries of the period could not have supported Itanium's 100W+
power consumption for any useful time.
I was happy with my Linux-based Sharp Zaurus SL-5000 when it first came out
in 2001.
John Dallman
2024-02-16 08:55:00 UTC
Permalink
Post by Scott Lurndal
Post by Lawrence D'Oliveiro
There is no path forward for Windows on non-x86.
That's entirely up to Microsoft. As has been noted, they do have
ARMv8 versions of windows 11.
https://learn.microsoft.com/en-us/windows/arm/overview
Their attitude to it has evolved quite a bit. At first, there was Windows
RT, a cut-down version of Windows for 32-bit ARM, which was unsuccessful.
Then they produced full Windows for 64-bit ARM, which initially came with
a simplified GUI that was very limiting, although it could be turned off
to get the full OS.

That was viewed by MS as an "iPad killer", since it had a keyboard and
the "vastly superior Windows GUI" which did seem to be missing the point
quite badly. Development for it was supposed to be done on x64 Windows,
with the ARM Windows device being used via a USB connection, like iPad
development.

However, I found that was hopelessly inconvenient, and installed
compilers /on/ ARM Windows, using the built-in emulator, which was much
easier to work with, although a bit slow. It appears that plenty of other
people did the same thing, because MS now produce a native ARM64 Visual
Studio, after not producing non-x86 versions since NT4 days.

The available hardware has also evolved. At first, there were only tablets
and laptops, but now Microsoft and Qualcomm sell various mini-desktop
systems for development, which are cheaper and faster than the laptops.

The ecosystem is gradually growing, and ARM Windows is available on Azure.
Qualcomm claim their Snapdragon X Elite CPUs will compete with Apple's
CPUs, although proof will have to wait for them to be available.

John
Lawrence D'Oliveiro
2024-02-16 21:49:52 UTC
Permalink
Post by John Dallman
That was viewed by MS as an "iPad killer", since it had a keyboard and
the "vastly superior Windows GUI" which did seem to be missing the point
quite badly.
A similar thing is happening again, with Valve’s Linux-based Steam Deck,
that offers a handheld gaming platform with a purpose-built UI. Even
though WINE/Proton offers less-than-perfect compatibility with Windows-
only games, it still seems to have found a sustainable niche in the
market.

Microsoft has been showing off a “Handheld Mode” for Windows, in an
attempt to compete, but so far that’s just vapourware.
Post by John Dallman
Development for it was supposed to be done on x64 Windows,
with the ARM Windows device being used via a USB connection, like iPad
development.
Which is such a dumb thing to do, given the Linux alternatives offer self-
hosted development and deployment stacks. Even the humble Raspberry Pi
could manage that from Day 1.
Post by John Dallman
Qualcomm claim their Snapdragon X Elite CPUs will compete with Apple's
CPUs, although proof will have to wait for them to be available.
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole range
of different vendors.
John Dallman
2024-02-17 00:10:00 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Dallman
Development for it was supposed to be done on x64 Windows,
with the ARM Windows device being used via a USB connection, like
iPad development.
Which is such a dumb thing to do, given the Linux alternatives
offer self-hosted development and deployment stacks. Even the
humble Raspberry Pi could manage that from Day 1.
As I said, Microsoft's approach was widely rejected and they've abandoned
it.
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to
Qualcomm chips? ARM Linux can run on a whole range of ARM chips
from a whole range of different vendors.
My knowledge of that story is under NDA at present.

John
Michael S
2024-02-17 17:22:25 UTC
Permalink
On Sat, 17 Feb 2024 00:10 +0000 (GMT Standard Time)
Post by John Dallman
Post by Lawrence D'Oliveiro
Post by John Dallman
Development for it was supposed to be done on x64 Windows,
with the ARM Windows device being used via a USB connection, like
iPad development.
Which is such a dumb thing to do, given the Linux alternatives
offer self-hosted development and deployment stacks. Even the
humble Raspberry Pi could manage that from Day 1.
As I said, Microsoft's approach was widely rejected and they've
abandoned it.
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to
Qualcomm chips? ARM Linux can run on a whole range of ARM chips
from a whole range of different vendors.
My knowledge of that story is under NDA at present.
John
I don't know about you, but I personally find well implemented
cross-development far more convenient than 'native' development.
I never developed for Win-ARM64, so don't know how well-implemented it
was.
Many years ago I wrote a few programs for Win-CE on ARM32. Those were
relatively simple programs, so simple that I didn't bother to set up the
link between Visual Studio and my target platform. I just compiled on
my PC, copied to the target (originally via Windows sharing, but later
on that was found to be limiting, so we quickly switched to FTP) and
then ran them there via telnet.

In the case of CE, native development was not an option, but even if it
were an option I would not use it. First, because my preferred
programmer's editor would probably not be installed. Second, and far
more important, because it would be too much trouble keeping all sources
synchronized with the company's source control servers. There is
approximately zero chance that the target would be allowed to be
connected to the corporate network. And it does not matter whether the
target is WinArm32, WinArm64 or the LinArm32 that I developed for a
couple of years ago and am likely to touch again in the next couple of
weeks. I would not do it natively, even in the absence of
well-implemented integration between the cross-compiler and the target.

Maybe, if my apps were an order of magnitude more complicated than they
actually are, I'd feel differently. Maybe in that case I would prefer a
good native development environment over a non-integrated cross setup.
But I am sure that even then I'd prefer a well-integrated cross setup
over any native one.
John Dallman
2024-02-17 18:22:00 UTC
Permalink
Post by Michael S
I don't know about you, but I personally find well implemented
cross-development far more convenient than 'native' development.
I never developed for Win-ARM64, so don't know how well-implemented
it was.
Not well, for my uses. I don't do applications; I do porting and
performance work for mathematical modelling libraries. These are tested
in a command-line harness, which reads test data from a network server
(there's a lot of test data).

The Microsoft cross-development setup required doing everything in their
IDE. I find that very hard to use, because I'm partially sighted, and it
also doesn't understand our domain-specific programming language. That
compiles to C, but editing the C is a very poor idea: it's used as a
high-level assembly language and regenerated on every compile. Any
changes you make in it have to be back-translated into the DSL by hand
and edited into that, so nobody works that way, and the IDE is only
useful as a debugger.

The cross-development setup also required that all your test data be
bundled with the app and pushed onto the device via USB, controlled by
the IDE. There's enough test data to make that very slow indeed, and it
didn't appear possible to operate the device through the IDE. Instead,
you had to physically operate it. Somebody had apparently been told to
make it just like developing for iOS, and had given it most of those
disadvantages.

We'd killed all of those dragons in supporting iOS, and we really didn't
want to do it all again for a different platform. It was far easier to
put the devices on Ethernet, unlock the GUI and use them as ordinary
Windows machines, with our custom-written development environment.
Post by Michael S
In the case of CE, native development was not an option, but even if it
were an option I would not use it. First, because my preferred
programmer's editor would probably not be installed.
My favoured editor and tools ran straight away on ARM Windows 10, in the
x86 emulator. That made all of this practical. The difference from CE was
that ARM Windows 10 is real, full-fat Windows: the same kernel, userland,
APIs and utilities. It's compiled for ARM64, but it has an emulator to
run x86 Windows binaries (plus x86-64 if you're running Windows 11) which
works.

Lots of people seem to have done the same thing, given that MS have
switched plans and started producing native ARM64 versions of Visual
Studio (which I still don't use) and its compiler, linker, and so on,
which I will when I get to start that project.
Post by Michael S
Second, and far more important, because it would be too much
trouble keeping all sources synchronized with the company's source
control servers.
This is no problem at all for me. Being able to mount network filesystems
on ARM Windows solves that problem. This is partly because we don't have
full source trees in our working directories: the product is too big for
that, and takes too long to compile. So we have just a few source files
in our working directories and compile and link against the central build
tree. We can do that because the domain-specific language gives us far
more control over imports and exports than normal C or C++ programming.
Post by Michael S
There is approximately zero chance that the target would be allowed
to be connected to the corporate network.
It's real Windows. It integrates fine. Corporate IT can't forbid it
without rendering the company unable to produce software for paying
customers. They did not try.
Post by Michael S
Maybe, if my apps were an order of magnitude more complicated than
they actually are, I'd feel differently.
The main library that I work on is about 65MB as a Windows DLL; similar
sizes on x86-64 and ARM64. The test harness is about 5MB. The full test
data is somewhere over 300GB.

John
Michael S
2024-02-17 20:48:43 UTC
Permalink
On Sat, 17 Feb 2024 18:22 +0000 (GMT Standard Time)
Post by John Dallman
The Microsoft cross-development setup required doing everything in
their IDE.
That's very strange. I know for sure that VS2019 has fully functioning
command-line tools for aarch64. I was under the impression that VS2017
also had them.
It is typically more convenient to prepare the setup (project file) in
the IDE, but after that you don't have to touch the IDE at all if you
don't want to. Just type 'msbuild' from a command prompt and everything
is compiled exactly the same as from the IDE. At worst, sometimes you
need to add a few magic compilation options like 'msbuild
-p:Configuration=Release'.
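For instance, building a solution for the ARM64 platform from a plain
command prompt would be something like 'msbuild MySolution.sln
-p:Configuration=Release -p:Platform=ARM64', with MySolution.sln being
just a placeholder name here.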
John Dallman
2024-02-17 21:37:00 UTC
Permalink
Post by Michael S
Post by John Dallman
The Microsoft cross-development setup required doing everything in
their IDE.
That's very strange. I know for sure that in Vs2019 they have fully
functioning command line tools for aarch64. Was under impression
that VS2017 also has them.
They are there, and I use them. Do not try to use aarch64 tools before
VS.2019 v16.7, which was when some significant code generator fixes
appeared. VS.2022 is good from v17.0.0, and fixes a major misfeature in
VS.2019's floating-point code generation.
Post by Michael S
It is typically more convenient to prepare the setup (project file) in
the IDE, but after that you don't have to touch the IDE at all if you
don't want to. Just type 'msbuild' from a command prompt and everything
is compiled exactly the same as from the IDE. At worst, sometimes you
need to add a few magic compilation options like 'msbuild
-p:Configuration=Release'.
The problems were (a) the project system can't build the domain-specific
language I'm working in and (b) I could not find a way other than the IDE
to do the pushing to the device and running the app. I stopped looking
when I realised I could work on the device, so there may be a way to do
it without the IDE, but it's not something that's easy to find.

The IDE really does not work well with my partial sight. It expects me to
be able to sit far enough from the screen to see the whole screen, while
still able to read text on it. This is not achievable. I simply don't
have the angular discrimination to do it. I need to be very close to a
screen to read it - about 20cm is best - and then I'm simply unaware of
things happening around the edge. I have this problem with all IDEs;
Xcode is even more annoying than Visual Studio.

John
Lawrence D'Oliveiro
2024-02-17 22:05:00 UTC
Permalink
First, because my preferred programmer's editor would probably not be
installed.
A commonality of OS distribution would fix that. Seems a lot of
development is moving to Linux now, which is why Microsoft is putting so
much effort in WSL. The Raspberry Pi, in particular, runs the same sort of
Debian distro widely available on x86 and over half a dozen other
architectures.
Second, and far more important, because it would be too much trouble
keeping all sources synchronized with the company's source control
servers. There is approximately zero chance that the target would be
allowed to be connected to the corporate network.
But the target is connected to your main PC, so it could pull indirectly
from there. Or alternatively your main PC could push to it.
Lawrence D'Oliveiro
2024-02-17 00:38:13 UTC
Permalink
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole
range of different vendors.
Qualcomm paid for the port?!?
Can’t Microsoft afford to port Windows to anything else?
Anton Ertl
2024-02-17 18:08:36 UTC
Permalink
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole
range of different vendors.
Qualcomm paid for the port ?!?
Can’t Microsoft afford to port Windows to anything else?
Given what I read about the woes of running Linux (Android) on various
ARM-based SoCs, and the way that Windows deals with driver variations,
MS would have to pay additional SoC manufacturers to produce Windows
drivers, something that these SoC manufacturers are not set up to do.
So I guess that, indeed, MS does not want to afford the substantial
expense for porting Windows to additional SoCs, for now. I expect
that Qualcomm asked for money or other benefits to do that work for
MS, and likewise, the laptop manufacturer also had to be subsidized by
MS.

One solution would be if MS finally switched to using Linux as the
basis for Windows. Then they would automatically get all the stuff
that is done for Android and for the SBCs, although that is a sad
story, too.

Given the choice of an ARM-based system with some SoC-specific kernel
that is only supported for a few years, or some AMD64-based system,
which is supported by the Linux mainline for decades, I go for the
AMD64 system.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
John Dallman
2024-02-17 18:58:00 UTC
Permalink
Post by Anton Ertl
One solution would be if MS finally switched to using Linux as the
basis for Windows. Then they would automatically get all the stuff
that is done for Android and for the SBCs, although that is a sad
story, too.
Most Android device drivers are proprietary closed-source, belonging to
the SoC designers or device designers. Open-source Android drivers are
mostly written by reverse engineering the hardware, which is why fully
open-source Android offshoots, like LineageOS, usually only support
obsolete hardware.

John
Lawrence D'Oliveiro
2024-02-17 22:13:00 UTC
Permalink
One solution would be if MS finally switched to using Linux as the basis
for Windows.
Once they brought a Linux kernel into Windows with WSL2, it seemed
inevitable that they would rely on it more and more, until it became a
mandatory part of a Windows install.

I would call this
<https://www.theregister.com/2023/12/14/windows_ai_studio_preview/>
the first step.
Anton Ertl
2024-02-18 08:59:45 UTC
Permalink
Post by Lawrence D'Oliveiro
One solution would be if MS finally switched to using Linux as the basis
for Windows.
Once they brought a Linux kernel into Windows with WSL2, it seemed
inevitable that they would rely on it more and more, until it became a
mandatory part of a Windows install.
That's not what I mean. What I mean is to move Windows onto the
Linux kernel rather than its current VMS-inspired kernel, and on top
of Linux provide a proprietary layer that provides the Win32 etc. ABIs
and APIs (what WINE is trying to do, but of course the WINE project
has neither the resources nor the authority of Microsoft). Similar to
Android.

The benefit for Windows-on-ARM would be that all those SoCs that are
supported by Android would also support Windows right away. The
disadvantage would be that this support might be just as bad and
short-lived as for Android.

Thinking about it again, the proprietary-binary driver model of
Windows fits the tastes of these SoC manufacturers better than the
free source-level driver model of Linux, so once Windows-on-ARM
actually sells a significant number of SoCs, the SoC manufacturers
will happily provide such drivers.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Lawrence D'Oliveiro
2024-02-18 21:01:15 UTC
Permalink
Post by Anton Ertl
Post by Lawrence D'Oliveiro
Post by Anton Ertl
One solution would be if MS finally switched to using Linux as the
basis for Windows.
Once they brought a Linux kernel into Windows with WSL2, it seemed
inevitable that they would rely on it more and more, until it became a
mandatory part of a Windows install.
That's not what I mean. What I mean is to move Windows onto the
Linux kernel rather than its current VMS-inspired kernel ...
That is the next step. It would be the path of least resistance to
implement new functionality on the Linux side, and let the Windows kernel
wither away.
Anton Ertl
2024-02-18 11:19:33 UTC
Permalink
Post by Anton Ertl
Given the choice of an ARM-based system with some SoC-specific kernel
that is only supported for a few years
That's a false choice. See ARM BSA and SBSA.
Ok, I found "ARM Base System Architecture" and "Server Base System
Architecture". What I have not found (and I doubt that I will find it
there) is a mainline Linux kernel that runs on our Odroid N2 (SoC:
Amlogic S922X) and where perf stat produces results. I doubt that I
will find such a kernel in BSA or SBSA. By contrast, that's something
that our complete arsenal of machines with the AMD64 architecture
manages just fine. And that's just one thing.

For a more mainstream problem, installing a new kernel on an AMD64 PC
works the same way across the whole platform (well, UEFI introduced
some excitement and problems, but for the earlier machines, and the
ones from after the first years of UEFI, this went smoothly). By
contrast, for the ARM-based SoCs, I have to read up about the Do!s and
Don't!s for the Uboot for this particular SoC; I don't have time for
this nonsense, so I don't remember what the specific issues are, only
that there is quite a bit of uncertainty involved.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
Scott Lurndal
2024-02-18 16:22:59 UTC
Permalink
Post by Anton Ertl
Post by Anton Ertl
Given the choice of an ARM-based system with some SoC-specific kernel
that is only supported for a few years
That's a false choice. See ARM BSA and SBSA.
Ok, I found "ARM Base System Architecture" and "Server Base System
Architecture". What I have not found (and I doubt that I will find it
Amlogic S922X) and where perf stat produces results.
Does the Odroid N2 claim compliance to the BSA?

(It won't claim the SBSA, since it's not a server).

All the major OS vendors participate in the SBSA, and all
work properly on SBSA-compliant ARMv8/v9 systems, provided
drivers for proprietary hardware are available upstream
in the linux tree (something high-end SoC customers usually require).
Post by Anton Ertl
I doubt that I
will find such a kernel in BSA or SBSA. By contrast, that's something
that our complete arsenal of machines with the AMD64 architecture
manages just fine. And that's just one thing.
For a more mainstream problem, installing a new kernel on an AMD64 PC
works the same way across the whole platform (well, UEFI introduced
some excitement and problems, but for the earlier machines, and the
ones from after the first years of UEFI, this went smoothly).
All of our ARMv8 SoC's support either UEFI or uboot, it's up
to the customer to choose which to use based on their
requirements.
Anton Ertl
2024-02-18 18:05:42 UTC
Permalink
Post by Scott Lurndal
Post by Anton Ertl
Post by Anton Ertl
Given the choice of an ARM-based system with some SoC-specific kernel
that is only supported for a few years
That's a false choice. See ARM BSA and SBSA.
Ok, I found "ARM Base System Architecture" and "Server Base System
Architecture". What I have not found (and I doubt that I will find it
Amlogic S922X) and where perf stat produces results.
Does the Odroid N2 claim compliance to the BSA?
I have no idea.
Post by Scott Lurndal
All the major OS vendors participate in the SBSA, and all
work properly on SBSA-compliant ARMv8/v9 systems, provided
drivers for proprietary hardware are available upstream
in the linux tree (something high-end SoC customers usually require).
So the BSA label, if present, tells me that the SoC is supported by
mainline Linux. Unfortunately, most SoCs are not supported by
mainline Linux, because apparently significant hardware on the SoC is
supported only by some driver that sits on some forked Linux without
being upstreamed. And that's what results in smartphones with these
SoCs eventually not being able to get security updates.

As for high-end, I doubt that the SoC on a EUR 100 SBC meets that
description. But I don't think I will find a high-end SoC with a
Cortex-A73, much less in an SBC with support for a GNU/Linux
distribution rather than some Android system.

Overall, there are not that many SBCs around, and even fewer SoCs that
are used in them. The Rockchip SoCs we have used (RK3399, RK3588)
seem to be better supported than the Amlogic ones (S905, S922X). The
Raspis, when they eventually arrive, have good support, but they tend
to be quite late. E.g., we have had the Rock5B (with RK3588,
Cortex-A76s and A55s) for IIRC more than half a year before any word
about the Raspi5 (with a SoC with A76 cores) reached me. The bottom
line is that, for measuring how the A73 performs, the Odroid N2(+) is
the only game in town.
Post by Scott Lurndal
Post by Anton Ertl
For a more mainstream problem, installing a new kernel on an AMD64 PC
works the same way across the whole platform (well, UEFI introduced
some excitement and problems, but for the earlier machines, and the
ones from after the first years of UEFI, this went smoothly).
All of our ARMv8 SoC's support either UEFI or uboot, it's up
to the customer to choose which to use based on their
requirements.
Yes, I have seen uboot stuff in the documentation of the SBCs we use.
But the instructions for upgrading to a new kernel on these SBCs are
worrying.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-***@googlegroups.com>
John Levine
2024-02-17 02:40:29 UTC
Permalink
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole range
of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
--
Regards,
John Levine, ***@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly
Lawrence D'Oliveiro
2024-02-17 05:20:47 UTC
Permalink
Post by John Levine
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole
range of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
I wonder what they could be?

What’s so special about Qualcomm chips, that is so specific to Windows?
Because the products themselves don’t seem to reflect anything special.
MitchAlsup1
2024-02-17 20:05:05 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by John Levine
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole
range of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
I wonder what they could be?
WiFi radio transceivers, bluetooth, ...
Post by Lawrence D'Oliveiro
What’s so special about Qualcomm chips, that is so specific to Windows?
Because the products themselves don’t seem to reflect anything special.
Michael S
2024-02-17 20:20:57 UTC
Permalink
On Sat, 17 Feb 2024 20:05:05 +0000
Post by MitchAlsup1
Post by Lawrence D'Oliveiro
Post by John Levine
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to
Qualcomm chips? ARM Linux can run on a whole range of ARM chips
from a whole range of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
I wonder what they could be?
WiFi radio transceivers, bluetooth, ...
Those are trivial parts.
Much more importantly, they all have cellular modems.
MS wants their WinARM customers to be connected to the Internet all the
time, preferably even when the big application processor is put to sleep.
Post by MitchAlsup1
Post by Lawrence D'Oliveiro
What’s so special about Qualcomm chips, that is so specific to
Windows? Because the products themselves don’t seem to reflect
anything special.
Scott Lurndal
2024-02-17 16:45:48 UTC
Permalink
Post by John Levine
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to Qualcomm
chips? ARM Linux can run on a whole range of ARM chips from a whole range
of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
Unlikely. More likely they fit the power curves required for the portable
devices like the Surface and the Lenovo Thinkpad.

https://github.com/AmpereComputing/Windows-11-On-Ampere
Michael S
2024-02-17 17:34:03 UTC
Permalink
On Sat, 17 Feb 2024 16:45:48 GMT
Post by Scott Lurndal
Post by John Levine
Post by Lawrence D'Oliveiro
The other thing is: why is Windows-on-ARM so heavily tied to
Qualcomm chips? ARM Linux can run on a whole range of ARM chips
from a whole range of different vendors.
More likely the Qualcomm chips have some peripherals that Windows wants.
Unlikely. More likely they fit the power curves required for the
portable devices like the Surface and the Lenovo Thinkpad.
So do Mediatek chips.
And HiSilicon chips as well, but those, of course, are not an option in
the current political climate.
Post by Scott Lurndal
https://github.com/AmpereComputing/Windows-11-On-Ampere
Stefan Monnier
2024-02-14 21:57:54 UTC
Permalink
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was
that Intel expected that the majority of Windows code would be 32-bit
by that point. It wasn’t.
Maybe for some segment of the Windows world, but for the
workstation/unix/RISC world, the Pentium Pro was no joke at all: it was
a game changer.


Stefan
Lawrence D'Oliveiro
2024-02-15 00:51:03 UTC
Permalink
Post by Stefan Monnier
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was
that Intel expected that the majority of Windows code would be 32-bit
by that point. It wasn’t.
Maybe for some segment of the Windows world, but for the
workstation/unix/RISC world, the Pentium Pro was no joke at all: it was
a game changer.
A chip with the emphasis on 32-bit performance, later replaced by the
Pentium II, with a greater emphasis on 16-bit performance ... only in the
x86 world, eh?
MitchAlsup1
2024-02-15 01:00:09 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Stefan Monnier
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was
that Intel expected that the majority of Windows code would be 32-bit
by that point. It wasn’t.
Maybe for some segment of the Windows world, but for the
workstation/unix/RISC world, the Pentium Pro was no joke at all: it was
a game changer.
A chip with the emphasis on 32-bit performance, later replaced by the
Pentium II, with a greater emphasis on 16-bit performance ... only in the
x86 world, eh?
This sounds remarkably like you expected sane behavior from x86 land.
Quadibloc
2024-02-15 11:27:21 UTC
Permalink
Post by MitchAlsup1
Post by Lawrence D'Oliveiro
Post by Stefan Monnier
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was
that Intel expected that the majority of Windows code would be 32-bit
by that point. It wasn’t.
Maybe for some segment of the Windows world, but for the
workstation/unix/RISC world, the Pentium Pro was no joke at all: it was
a game changer.
A chip with the emphasis on 32-bit performance, later replaced by the
Pentium II, with a greater emphasis on 16-bit performance ... only in the
x86 world, eh?
This sounds remarkably like you expected sane behavior from x86 land.
A chip which had leading-edge 32-bit performance, but which performed
poorly on the existing software users already had installed, was replaced
by one which _still_ had great 32-bit performance, but which fixed the
defect of inferior support for the older software that was also in use.

How was that not eminently sane behavior on the part of Intel? And what
isn't sane about x86 users not spending money to replace software that
was doing the job perfectly well?

Only the reduced cache speed - which reduced manufacturing cost to something
sustainable in a consumer-priced product - compromised performance in general.

John Savard
MitchAlsup1
2024-02-14 22:29:39 UTC
Permalink
Post by Lawrence D'Oliveiro
Eventually, the Pentium Pro ...
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was that
Intel expected that the majority of Windows code would be 32-bit by that
point. It wasn’t.
I had the first 200 MHz Pentium Pro out of the Micron factory.
It ran DOOM at 73 fps and Quake at 45+ fps both full screen.
I would not call that a joke.

It was <essentially> the death knell for RISC workstations.
Lynn Wheeler
2024-02-26 17:58:42 UTC
Permalink
Post by MitchAlsup1
I had the first 200 MHz Pentium Pro out of the Micron factory.
It ran DOOM at 73 fps and Quake at 45+ fps both full screen.
I would not call that a joke.
It was <essentially> the death knell for RISC workstations.
2003, 32 processor, max. configured IBM mainframe Z990 benchmarked
aggregate 9BIPS

2003 Pentium4 processor benchmarked 9.7BIPS

Also, in 1988, an IBM branch office asked if I could help LLNL standardize
some serial stuff they were playing with, which quickly becomes the fibre
channel standard (FCS, initial 1gbit, full-duplex, 200mbytes/sec
aggregate). Then some IBM mainframe engineers become involved and
define a heavy-weight protocol that significantly reduces the native
throughput, which is released as FICON.

The most recent public benchmark I can find is a "PEAK I/O" benchmark
for a max. configured z196 getting 2M IOPS using 104 FICON (running over
104 FCS). About the same time an FCS was announced for E5-2600 blades
claiming over a million IOPS (two having higher throughput than 104
FICON). Also, IBM pubs recommend that System Assist Processors ("SAPs",
which do the actual I/O) be kept to no more than 70% busy ... which
would be about 1.5M IOPS.
--
virtualization experience starting Jan1968, online at home since Mar1970
John Dallman
2024-02-26 19:26:00 UTC
Permalink
I had the first 200 MHz Pentium Pro out of the Micron factory...
It was <essentially> the death knell for RISC workstations.
Yup. They struggled on for some time, but they never got near the
price-performance. When the Pentium Pro appeared, my boss was porting the
software I work on to Windows NT on MIPS, because NetPower reckoned they
had a market opportunity until they saw how fast PPro was. They switched
shortly thereafter:
<https://www.hpcwire.com/1996/02/16/netpower-migrates-from-mips-to-intels-x86-architecture/>

Just as well, really: the Microsoft MIPS compiler was missing some vital
fixes that had gone into SGI's compiler, and would have given loads of
trouble to anyone attempting to do anything mildly complicated.

John
Jean-Marc Bourguet
2024-02-26 19:48:50 UTC
Permalink
Post by John Dallman
I had the first 200 MHz Pentium Pro out of the Micron factory...
It was <essentially> the death knell for RISC workstations.
Yup. They struggled on for some time, but they never got near the
price-performance.
64-bit support was what kept RISC workstations alive for a time.
--
Jean-Marc
Lawrence D'Oliveiro
2024-02-26 21:26:09 UTC
Permalink
Post by Jean-Marc Bourguet
64-bit support was what kept RISC workstations alive for a time.
Still, nowadays it seems a lot of Windows software is still 32-bit.
Whereas on a 64-bit Linux workstation, everything is 64-bit.
John Dallman
2024-02-26 21:57:00 UTC
Permalink
Post by Lawrence D'Oliveiro
Post by Jean-Marc Bourguet
64-bit support was what kept RISC workstations alive for a time.
Still, nowadays it seems a lot of Windows software is still 32-bit.
Whereas on a 64-bit Linux workstation, everything is 64-bit.
It is a little harder to port 32-bit Windows applications to 64-bit,
because Windows uses the IL32LLP64 data model, rather than I32LP64.
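
A tiny illustration of the difference; the classic porting hazard is
code that stuffs a pointer into a long, which round-trips under I32LP64
but truncates under IL32LLP64:

  #include <stdio.h>

  int main(void)
  {
      /* I32LP64 (Unix-style 64-bit): int 4, long 8, pointer 8.
         IL32LLP64 (64-bit Windows):  int 4, long 4, pointer 8. */
      printf("int=%zu long=%zu long long=%zu void*=%zu\n",
             sizeof(int), sizeof(long), sizeof(long long), sizeof(void *));

      int x = 42;
      long l = (long)(void *)&x;   /* truncates on LLP64 if &x is above 4GB */
      printf("pointer round-trip through long: %s\n",
             ((void *)l == (void *)&x) ? "ok" : "broken");
      return 0;
  }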

Microsoft are gradually retiring 32-bit x86 versions of their operating
system, but they won't take away the ability to run 32-bit applications
in the foreseeable future, because there are still plenty around. That
means that applications that don't actually need 64-bit data addressing
can stay 32-bit, until someone decides to make the change. Even in a
single market segment, some companies have dropped 32-bit, while others
are still firmly 32-bit and seem scared of 64-bit.

John
Lawrence D'Oliveiro
2024-02-26 23:27:26 UTC
Permalink
Post by John Dallman
Microsoft are gradually retiring 32-bit x86 versions of their operating
system, but they won't take away the ability to run 32-bit applications
in the foreseeable future, because there are still plenty around.
I was mildly surprised to discover recently that Microsoft Visual
Studio only made the transition to 64-bit a couple of years ago. And
today I was even more surprised to discover that they haven’t quite
completed the transition: it seems the Windows Forms designer has
trouble because a lot of components are still 32-bit
<https://devclass.com/2024/02/26/microsoft-struggles-to-address-fallout-from-windows-forms-designer-failure-in-64-bit-visual-studio/>.
Terje Mathisen
2024-02-15 06:54:57 UTC
Permalink
Post by Lawrence D'Oliveiro
Eventually, the Pentium Pro ...
Ah, the poor Pentium Pro, that was a bit of a joke. The problem was that
That is so wrong that it isn't even funny.
Post by Lawrence D'Oliveiro
Intel expected that the majority of Windows code would be 32-bit by that
point. It wasn’t.
This is of course correct, but it really didn't matter!

What did matter, a lot, was the fact that when the PPro arrived, at an
initial speed of up to 200 MHz, it immediately took over the crown as
the fastest SPECint processor in the world. I.e. it was a huge deal and
it has been the basis for pretty much all x86 processors since then.

Dominating a market for ~30 years is not "a bit of a joke" imho.
Post by Lawrence D'Oliveiro
And, as I've noted also, the overwhelming dominance of Windoes on the
x86 shows "there can be only one", which is why I want my new
architecture to offer something the x86 doesn't... efficient emulation
of older architecures with 36-bit, 48-bit, and 60-bit words, so that
those who have really old programs to run are no longer disadvantaged.
Didn’t a company called “Transmeta” try that ... something like 30 years
ago? It didn’t work.
There is no path forward for Windows on non-x86. Only open-source software
is capable of being truly cross-platform.
That is correct, with the exception of special single-vendor platforms,
like the AS400 and several mainframes where the vendor makes sure that
all the old sw can still run with acceptable performance.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
Lawrence D'Oliveiro
2024-02-15 14:25:34 UTC
Permalink
Post by Terje Mathisen
What did matter, a lot, was the fact that when the PPro arrived, at an
initial speed of up to 200 MHz, it immediately took over the crown as
the fastest specINT processor in the world.
SPECint, but not SPECfp? After all, decent workstations had to have good
floating-point performance, and x86 was still saddled with that antiquated
8087-derived joke of a floating-point architecture.

Windows NT liked to call itself a “workstation” OS, but it was really just
a “desktop” OS.
s***@alum.dartmouth.org
2024-02-15 22:43:21 UTC
Permalink
Quadibloc <***@servername.invalid> wrote:
: While this seems like a super-niche thing to some, I see it as
: something that's practically _essential_ to have a future world of
: computers that doesn't leave older code behind - so that the
: computer you already have on your desktop is truly general in its
: capabilities.

This need is very real. At my first job the payroll ran on a
360, using the hardware emulator to run a 1401 simulator
for the 705, which ran the actual payroll. But...

The only example I pay much attention to is the various PDP-10
(not to be confused with DECSystem-10) simulators that run
PDP-10 code on current hardware faster than any actual 10
ever could. This seems like a much cheaper solution.

Sarr
--
--------
Sarr Blumson ***@alum.dartmouth.org
http://www-personal.umich.edu/~sarr/