Eric Bron wrote:
stupid dog wrote:
Hasn't it been confirmed many times that making predictions about the microarchitecture based on low-quality, low-resolution photos is a bad idea?
I don't see why it's worse than basing them on nothing at all
It isn't worse (not that it's better, either), but anyway it's just about the quality of Hans's recent predictions.
Quote:
stupid dog wrote:
Now, if Intel is able to implement a double-pumped FPU
I'd be interested to learn from the EEs on this board with recent experience (if any) whether it's indeed possible, i.e. to run the FPU at roughly 8 GHz on 32 nm. My understanding is that it's self-contained in a small area, so it's far easier to run it at twice the core clock than wider structures with more wiring delay.
Hmm... I always thought that the FPU is the most performance-critical unit, the one that actually holds back frequency increases. The only way to increase FPU throughput is to add pipeline stages (which alone is not a simple task for mul/div algorithms), and as a result mul/div instruction latency increases. This can greatly affect "regular" FPU/SSE2 code. Also, I heard that some stages are already double-pumped in Intel's radix-16 implementation.
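To make the throughput-versus-latency point concrete, here is a minimal toy model. Everything in it is an assumption for illustration: the function names, the 2.0 ns of logic delay and the 0.05 ns per-stage latch overhead are made-up values, not figures for any real FPU.

```python
# Toy pipeline model. All numbers used with it are hypothetical.

def pipeline(logic_ns, stages, overhead_ns):
    """Return (cycle time, total latency) in ns for a unit cut into `stages`."""
    # Each stage holds an equal share of the logic plus a fixed latch overhead.
    cycle_ns = logic_ns / stages + overhead_ns
    latency_ns = cycle_ns * stages
    return cycle_ns, latency_ns

def run_time_ns(n_ops, logic_ns, stages, overhead_ns, dependent):
    """Time to finish n_ops operations on the pipelined unit."""
    cycle_ns, latency_ns = pipeline(logic_ns, stages, overhead_ns)
    if dependent:
        # Each operation needs the previous result, so it pays full latency.
        return n_ops * latency_ns
    # Independent operations: one new operation enters per cycle.
    return latency_ns + (n_ops - 1) * cycle_ns

if __name__ == "__main__":
    for stages in (4, 8):
        indep = run_time_ns(100, 2.0, stages, 0.05, dependent=False)
        dep = run_time_ns(100, 2.0, stages, 0.05, dependent=True)
        print(stages, "stages:", round(indep, 2), "ns independent,",
              round(dep, 2), "ns dependent")
```

In this model the deeper pipeline finishes independent operations sooner, while a chain of dependent operations gets slower, and the latency counted in cycles doubles from 4 to 8.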
Quote:
stupid dog wrote:
and a double-pumped scheduler in SB
It's another subject entirely, isn't it?
Why? Didn't AMD/Intel choose to double the uops to implement SSE2 on 64-bit FPUs? Also, if you already have a double-pumped FPU, why not use it for "regular" x87/SSEx to double the throughput?
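As a rough sketch of the idea in the question above: a 128-bit packed-double add can be modelled as two 64-bit uops, and a unit that accepts two uops per core clock would then retire the pair in one core cycle. The names `add_pd_via_64bit_uops` and `core_cycles` are hypothetical, and the model says nothing about how any actual Intel or AMD design is built.

```python
import math

def add_pd_via_64bit_uops(a, b):
    """Add two packed-double vectors (2 x 64-bit lanes) as two 64-bit uops."""
    # One uop per 64-bit lane: low half first, then high half.
    uops = [("add64", a[0], b[0]), ("add64", a[1], b[1])]
    result = tuple(x + y for _, x, y in uops)
    return result, len(uops)

def core_cycles(n_uops, uops_per_core_clock):
    """Core clocks needed to issue n_uops to a single 64-bit unit."""
    # uops_per_core_clock = 1 models a normal unit, 2 a double-pumped one.
    return math.ceil(n_uops / uops_per_core_clock)

if __name__ == "__main__":
    result, n = add_pd_via_64bit_uops((1.0, 2.0), (3.0, 4.0))
    print(result, n, core_cycles(n, 1), core_cycles(n, 2))
```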
Hans de Vries wrote:
Eric Bron wrote:
I discussed this data dependency issue with Agner, who looked at AVX in great detail.
He concluded that all instructions which would create data dependencies were omitted.
There are many SSEx instructions without data dependencies whose 256-bit implementations were omitted too. This may indicate that the "second" 128-bit FPU is a bit simpler than the "first" one. Omitting the data-dependency instructions probably means that inter-FPU communication is a bit costly.