in the 2nd bullet he says : "[...] measures 1.42X speedup (would have predicted 1.5X with the current architecture in the limit; vs. 1.0X if we had double pumped). [...]"
so he is explicitly stating that it isn't double pumped, isn't he?
Hi, Eric ".....vs. 1.0X if we had double pumped......"
I read "double pumped" here as "using two cycles per 256 bit operation" at the same frequency which would indeed give you a maximum speed up of 1.0X compared to the current architecture, as he says.
Using two cycles at double the frequency should give a maximum speedup of 2.0X compared to the current architecture (limited to 1.42X by the L1 cache write ports at 48 bytes per cycle, if that's the bottleneck).
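To make the two readings of "double pumped" concrete, here is a minimal Python sketch of the peak-throughput arithmetic. The function name `peak_speedup` and its parameters are made up for illustration, the baseline is assumed to be one 128-bit operation per core cycle, and the L1 cache port limit is deliberately ignored, so it only reproduces the 1.0X and 2.0X upper bounds discussed here.

```python
def peak_speedup(vector_bits, unit_bits, clock_ratio, baseline_bits=128):
    """Peak vector throughput relative to a baseline that retires one
    baseline_bits-wide operation per core cycle (an assumption).

    vector_bits: width of one vector instruction (256 for AVX)
    unit_bits:   width of the execution hardware
    clock_ratio: execution-unit clock divided by core clock
    """
    # Number of passes through the hardware per instruction (ceiling division).
    passes = -(-vector_bits // unit_bits)
    # Bits retired per core cycle at peak, ignoring caches and scheduling.
    bits_per_core_cycle = vector_bits * clock_ratio / passes
    return bits_per_core_cycle / baseline_bits


# 128-bit hardware, two core cycles per 256-bit op, same frequency.
same_clock = peak_speedup(256, 128, clock_ratio=1)
# 128-bit hardware, two passes, but clocked at twice the core frequency.
double_clock = peak_speedup(256, 128, clock_ratio=2)
# Monolithic 256-bit hardware at the core frequency.
full_width = peak_speedup(256, 256, clock_ratio=1)

print(same_clock, double_clock, full_width)
```

In this simplified model the double-frequency case and the full-width case have the same peak, so a measured speedup alone would not tell them apart.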
I would expect a >20% increase in core-size with all 256bit units which simply isn't there. There should also be this 1500+ entry trace cache increasing the core transistor count...
I read "double pumped" here as "using two cycles per 256 bit operation" at the same frequency
well maybe, there are some clues (in the same thread at the AVX forum) in favor of your hypothesis:
- the "wrong chart" with AVX HIGH and AVX LOW labels
- Intel's Shih Kuo's remark: "It seems point 1) may have assumed it requires monolithic 256-bit hardware to achieve 1 cycle throughput for 256-bit AVX instructions. That's not true."
Hans de Vries wrote:
There should also be this 1500+ entry trace cache increasing the core transistor count...
I never heard of it for Sandy Bridge, where did you get the info ? a patent ?
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sat Oct 10, 2009 6:07 am
Joined: Wed Aug 08, 2007 1:43 pm Posts: 10
Quote:
Conclusion:
Seems highly likely now that Sandy Bridge's 256 bit AVX is implemented with 128 bit hardware running at twice the frequency, and becoming likely now: That Larrabee's 512 bit LNI uses the "same" 128 bit hardware running at 4 times the clock speed as the rest of the core....
Regards, Hans
Oh no... Hans repeats that crap again and again... Although, given the long history of incorrect predictions about Intel's architectures, it should not greatly surprise.
Oh no... Hans repeats that crap again and again... Although, given the long history of incorrect predictions about Intel's architectures, it should not greatly surprise.
come on stupid, don't you see it's a lot of fun to try to predict these things, even if proved partially wrong sometimes?
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sat Oct 10, 2009 1:03 pm
Joined: Thu Jul 26, 2007 9:09 pm Posts: 18 Location: Arnhem, Gelderland, The Netherlands, Europe
stupid dog wrote:
Oh no... Hans repeats that crap again and again... Although, given the long history of incorrect predictions about Intel's architectures, it should not greatly surprise.
Considering that this picture dates from a year before the launch (November 17, 2008), it is rather close. It's not perfect, but also not bad.
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sat Oct 10, 2009 4:10 pm
Joined: Tue Aug 07, 2007 11:57 am Posts: 190
jamannetje wrote:
stupid dog wrote:
Oh no... Hans repeats that crap again and again... Although, given the long history of incorrect predictions about Intel's architectures, it should not greatly surprise.
Considering that this picture dates from a year before the launch (November 17, 2008), it is rather close. It's not perfect, but also not bad.
J
The image is called "Nehalem_at_1st_glance" and is based on the very first raw Nehalem die picture which:
1) Totally conceals the L2 caches under the wiring. 2) Has half of the memory interface cut off !!
I had to paste another copy of the raw die's memory interface on top of the annotated die to make the size fit with that from later photos. The only info from Intel was the transistor count.
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sun Oct 11, 2009 6:39 am
Joined: Wed Aug 08, 2007 1:43 pm Posts: 10
Eric Bron wrote:
stupid dog wrote:
Oh no... Hans repeats that crap again and again... Although, given the long history of incorrect predictions about Intel's architectures, it should not greatly surprise.
come on stupid, don't you see it's a lot of fun to try to predict these things, even if proved partially wrong sometimes?
Wasn't it confirmed many times that making predictions on the microarchitecture, based on low-quality, low-resolution photos is a bad idea? An even worse idea is to defend one's unfulfilled predictions by inventing even more incredible ones. I do not think that a double pumped FPU is a bad idea, but even in such a "speed demon" architecture as NetBurst, Intel was able to implement only a double pumped simple integer ALU (ADD/SUB). Now, if Intel is able to implement double pumped FPU and double pumped scheduler in SB, I see no reason why not to implement double pumped integer part as well. This alone would give something like a 50% boost overall on integer code. But this is too good to be true.
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sun Oct 11, 2009 8:04 am
Joined: Wed Aug 15, 2007 3:06 am Posts: 51
What's so hard to believe about doubled FP resources on Sandy Bridge? According to PCWatch, the core size has indeed become slightly larger on Sandy Bridge compared to Westmere (unlike in your analyses).
I don't know what it is. Anti-Intel or Pro-AMD or a bit of both? As far as I remember, you didn't have such a bias a mere few years ago.
Wasn't it confirmed many times that making predictions on the microarchitecture, based on low-quality, low-resolution photos is a bad idea?
I don't see why it's worse than basing them on nothing at all
stupid dog wrote:
Now, if Intel is able to implement double pumped FPU
I'd be interested to learn from the EEs on the board with recent experience (if any) whether it's indeed possible (i.e. to have the FPU at roughly 8 GHz on 32 nm). My understanding is that it's self-contained in a small area, so it's far easier to run it at twice the core clock than wider structures with more wiring delay.
stupid dog wrote:
and double pumped scheduler in SB
It's another subject entirely, isn't it ?
Quote:
, I see no reason why not to implement double pumped integer part as well. This alone would give something like a 50% boost overall on integer code. But this is too good to be true.
since each regular ALU already has 1-clock throughput for common operations, and since the L1D cache, decoder and scheduler (for ex.) are the limiting factors, I don't see how you arrive at this 50% speedup => please provide more details
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sun Oct 11, 2009 11:58 am
Joined: Tue Aug 07, 2007 11:57 am Posts: 190
no@spam.com wrote:
> I had to paste another copy of the raw die's memory interface on top of the annotated die to make the size fit with that from later photos.
Liar. Here is what came out straight of Intel's presentations back then.
Hey mr. as...., that was not the first picture out on the web.
If you look at the annotated picture then you see the copy and paste I had to do. The memory interface contains two identical parts. It has been there all the time, copied many times over on the internet as a lasting proof.
Last edited by Hans de Vries on Sun Oct 11, 2009 1:06 pm, edited 1 time in total.
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sun Oct 11, 2009 12:18 pm
Joined: Tue Aug 07, 2007 11:57 am Posts: 190
Eric Bron wrote:
I'd be interested to learn from the EEs on the board with recent experience (if any) whether it's indeed possible (i.e. to have the FPU at roughly 8 GHz on 32 nm). My understanding is that it's self-contained in a small area, so it's far easier to run it at twice the core clock than wider structures with more wiring delay.
This is a fully automated process. The functional logic doesn't need to be faster. The point is that there are no data dependencies between half-stages. It's just a matter of inserting flip-flops at the right locations, where the signal propagation is at half the cycle time.
I discussed this data dependency issue with Agner, who looked at AVX in great detail. He concluded that all instructions which would create such data dependencies were omitted.
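As a rough illustration of why the absence of cross-half dependencies matters, the sketch below runs a 256-bit add as two passes over a 128-bit adder. It is only a software analogy, assuming 8 single-precision elements per 256-bit register; `add_128` and `add_256_two_pass` are hypothetical names, not anything taken from Intel's design.

```python
def add_128(x, y):
    """Stand-in for a 128-bit adder: four element-wise additions."""
    assert len(x) == len(y) == 4
    return [p + q for p, q in zip(x, y)]


def add_256_two_pass(a, b):
    """Element-wise add of two 8-element vectors using the 4-wide adder twice."""
    assert len(a) == len(b) == 8
    low = add_128(a[:4], b[:4])    # first pass: lower 128 bits
    high = add_128(a[4:], b[4:])   # second pass: upper 128 bits, uses nothing from `low`
    return low + high              # list concatenation, not arithmetic


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
result = add_256_two_pass(a, b)
print(result)
```

Because the second pass never reads the result of the first, the two passes can be issued back to back. An operation that moved data between the lower and upper halves would not split this cleanly.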
Post subject: Re: The first "Sandy Bridge" tape-out revealed?
Posted: Sun Oct 11, 2009 1:53 pm
Joined: Tue Aug 07, 2007 11:57 am Posts: 190
DavidC1 wrote:
What's so hard to believe about doubled FP resources on Sandy Bridge? According to PCWatch, the core size has indeed become slightly larger on Sandy Bridge compared to Westmere (unlike in your analyses).
I don't know what it is. Anti-Intel or Pro-AMD or a bit of both? As far as I remember, you didn't have such a bias a mere few years ago.
Intel would simply be way smarter to do it in the fashion I described:
1) Using less die area. 2) Using less static energy (leakage)
At least I would try to do it this way.
I don't get why people get angry and outraged. I remember a similar outrage because my simple annotated Nehalem picture had an L3 cache in it instead of a common L2 cache shared by all eight threads.
One or two were extremely insulting towards me. The only thing you can do is ignore those people and avoid getting angry and hitting back yourself. They'll turn into stalkers who follow you all over the internet in order to insult and harass you anonymously wherever and whenever they can.