The way I see it, Ruiz is more interested in selling what AMD has today than in looking at what it will have to sell, and compete against, in the future.
I remember him saying "our customers don't buy nanometers".
I don't think AMD lacked the resources or the plans to implement an MCM quad core. It would have been very easy to implement: just connect two dies with HT. Performance wouldn't have been optimal, but it would have been a better solution for servers than dual-core CPUs.
I believe this was just a political decision: top management wanted to talk about "native quad core advantage"...
In April 2005 even Intel didn't know whether it wanted to manufacture an MCM Wolfdale. Only two and a half years later, Whitefield and K9 are dead and everything has turned around 180 degrees.
In terms of AMD, I think there are two things to separate from each other.
We should talk about the Northbridge+IMC+L3-Cache team and the Core team.
As far as I can see, neither performed very well, each in a very different area.
The Core team didn't want to risk too much. Now they have an IPC improvement, but not as much as they wanted (I think a 4-issue-wide processor would have been the better decision).
The Northbridge team had a lot of big design challenges to fight. Furthermore, it was not possible to raise the clock speeds into the regions they wanted.
I think that Barcelona has a lot of headroom once all the speed paths are optimised. The independently clockable areas on the die are very impressive, but in my opinion a little bit too flexible. I hope that they will change this approach to a more static model.
Something like this for example:
CPU1 3GHz
CPU2 1.5GHz
CPU3 1.5GHz
CPU4 0.75GHz
NB 3GHz
I think it is not really helpful when you want to reach an L3 cache cell as fast as possible and you have to go through oddly clocked areas. A lot of performance is lost there.
On the point of the OP -- nothing is wrong with the AMD design team, they are some of the top in the field.... taken on its own, compared back against the prior core, Agena/Barcelona registered a very healthy gain (IPC efficiency) over K8... on average you will find roughly 15% or so...
The issue, from my perspective, is still with the 65 nm process in general: the transistor performance and thermals are just not good enough to make a viable product compared to the competition... can you blame design for this? In part, perhaps; there is certainly an interaction that must be accounted for when approaching design goals for clockspeed as well as IPC gains...
But if process does not deliver the electrical performance that the design goals need to meet the clock demand, then even a good design will result in a poor end product....
taken on its own, compared back against the prior core, Agena/Barcelona registered a very healthy gain (IPC efficiency) over K8... on average you will find roughly 15% or so...
When you say "on its own", you mean disregarding the 4 years' time that has passed? It seems to me that evaluating a core is pretty meaningless without pinning it to a timeline.
What I mean is relative to the older core. There does not need to be a specific timeline associated with revisions; they typically happen on timelines of 4-6 years (designing new architectures is a lengthy process). In short, it is all relative. For AMD, this is relative to K8, just as K8 was relative to K7 and so forth... The design goals of K10 were decided long before AMD knew of the potency of the C2D design. You cannot fault them for not aiming higher, as they expected another redesign of Netburst (or at least I suspect they did). You can fault them for underestimating the ferocity with which Intel would switch gears and complete the 'right hand turn'.
Compare the Core 2 microarchitecture to its lineage, where the immediately preceding revision was Yonah: there was a 15-20% IPC gain, healthy, good. Relative to that core revision, Core 2 was a successful redesign.
Similarly, K10 (as it is commonly called) is a healthy 15% or so over the prior generation K8. That is what my statement "taken on its own" means: compared against the natural architectural progression.
The problem is that everything in this duopoly is relative. While K10 was a healthy design revision, relative to the competitive product in the market it is falling about 5-10% short IPC-wise (i.e. clock for clock). What this says plainly is that, with Yonah in the mobile space, Intel already had an architecture that was equivalent to slightly better clock for clock, but with Netburst in the DT/server space no one really noticed (laptops were never intended for high performance computing).... Relative to Netburst, Core simply looked like a massive leap forward (and relative to Netburst it was).
So it is not fair to be hypercritical of K10 in the context of it being a revision of K8. The only criticism to be leveled is that AMD did not plan on competing against a major revision of an already good core; as such, they appear to have been caught flatfooted and to have set their design/performance goals too low.
taken on its own, compared back against the prior core, Agena/Barcelona registered a very healthy gain (IPC efficiency) over K8... on average you will find roughly 15% or so...
I think Barcelona's IPC gain over K8 is closer to 10% than 15%, and it comes with the added penalty of not being able to clock as high (and the TLB problem too of course ;-) )
Fair enough, I am going by the data I have seen (Phenom relative to Athlon) on the DT -- server is a different beast altogether, and I would say the actual data to make a good comparison is rather scant -- there is the Tom's article, and then all the Phenom reviews (without the patch), which do not show clock for clock in most cases. Anand did a relatively simple comparison, quick and dirty... then there were the leaked AMD documents that claimed 15% (which, from the data that can be gleaned, is more or less consistent).
The point is, K10 is an improvement over K8 ... just not enough to overcome the competition's clock-for-clock lead, and certainly not enough to overcome the clock deficit, which will exist for quite some time.
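To put rough numbers on the "IPC gain versus clock deficit" point, here is a minimal sketch that models single-thread performance as IPC times clock. The 15% IPC figure is the one quoted in this thread; the clock speeds are hypothetical values picked only for illustration, and the function names are mine, not from any tool.

```python
def relative_performance(ipc_ratio: float, clock_a_ghz: float, clock_b_ghz: float) -> float:
    """Performance of chip A relative to chip B, modelled simply as IPC x clock.

    ipc_ratio is A's IPC divided by B's IPC (1.15 means A does 15% more work per clock).
    """
    return ipc_ratio * (clock_a_ghz / clock_b_ghz)


def break_even_clock(ipc_ratio: float, clock_b_ghz: float) -> float:
    """Clock that chip A needs in order to match chip B under the same model."""
    return clock_b_ghz / ipc_ratio


if __name__ == "__main__":
    # Hypothetical clocks, for illustration only: a 2.3 GHz new core vs a 3.0 GHz old core.
    ratio = relative_performance(1.15, 2.3, 3.0)
    print(f"new core delivers {ratio:.2f}x the old core's single-thread performance")
    print(f"break-even clock for the new core: {break_even_clock(1.15, 3.0):.2f} GHz")
```

Under these assumed clocks the 15% IPC gain does not make up for the lower frequency, which is the situation described above. The model ignores memory, cache and multi-threaded effects, so treat it as back-of-envelope arithmetic only.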
10% is not impressive at all. They could have got that by just focusing on the L2 cache size and latency and improving that crap DDR2 controller.
Also, integer performance is basically the same, so quite a lot of applications will see unnoticeable speed increases, if any at all.
For me, that all adds up to being not good enough. It may be an improvement, but it's weak and not good enough for the amount of time they have had.
I'd like to see a K8 with K10's cache and memory controller improvements (without the TLB problem). I bet the difference between K8 and K10 on all but a very few benchmarks would be unnoticeable.
I think Barcelona's IPC gain over K8 is closer to 10% than 15%, and it comes with the added penalty of not being able to clock as high (and the TLB problem too of course ;-) )
Where did you get that from? There are no dual core Barcelonas and no quad core K8s, so any direct comparison is a fanboi armchair quarterbacking job.
You mean single-threaded jobs are "fanboi armchair quarterbacking jobs"?
Or perhaps you could enlighten us as to what a fair comparison would be... And even give us what you think the outcome would be.
Also, perhaps there are no quad-core K8s, but there are dual-core dual-socket K8 systems, which give us an approximation of how a "dumb" (e.g. two dies on the same socket) quad-core K8 could perform.
I think Barcelona's IPC gain over K8 is closer to 10% than 15%, and it comes with the added penalty of not being able to clock as high (and the TLB problem too of course ;-) )
Where did you get that from? There are no dual core Barcelonas and no quad core K8s, so any direct comparison is a fanboi armchair quarterbacking job.
-Charlie
I must have got that from the same place your Dancing in the Aisles came from.
You know I am right, just as I have known for a long time now how you are wrong.
Where did you get that from? There are no dual core Barcelonas and no quad core K8s, so any direct comparison is a fanboi armchair quarterbacking job.
You mean single-threaded jobs are "fanboi armchair quarterbacking jobs"? Or perhaps you could enlighten us as to what a fair comparison would be... And even give us what you think the outcome would be.
Also, perhaps there are no quad-core K8s, but there are dual-core dual-socket K8 systems, which give us an approximation of how a "dumb" (e.g. two dies on the same socket) quad-core K8 could perform.
Modern CPUs are mainly power limited, so given that both x86 CPU makers have (mostly) nailed themselves to certain power limits, clocks are dictated by that power. If they choose to make a dual core, they can clock it higher than a quad given the same power usage on both chips.
Now, go back and re-read the part I quoted: it specifically mentions not being able to clock as high alongside IPC. If there is no even comparison, you can't say it isn't able to clock as high, and until they come out with a 2C Barcelona, he will be spouting bullshit.
Dual socket is a straw man argument because it effectively doubles the power limit. If you don't get this part, you are hopeless.
A fair comparison would be a 1S 2C K10 cored machine vs the same system with a 1S 2C K8 cored CPU. All other parts can and will be kept equal, and power measured. That or you can just compare clocks when a 1S 2C K10 comes out.
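The power argument above can be sketched with a very rough scaling model. This is an assumption on my part, not something measured: dynamic power is taken as proportional to cores x f x V^2, and voltage is assumed to scale linearly with frequency, so power goes as cores x f^3. Leakage, uncore power and real voltage/frequency curves are ignored, and the 2.0 GHz quad is a hypothetical example.

```python
def clock_at_same_power(ref_clock_ghz: float, ref_cores: int, new_cores: int,
                        exponent: float = 3.0) -> float:
    """Clock a chip with new_cores could reach inside the reference chip's power budget.

    Assumed model: power ~ cores * f**exponent.
    exponent=3.0 assumes voltage scales linearly with frequency (P ~ f * V^2);
    exponent=1.0 assumes voltage stays fixed.
    """
    return ref_clock_ghz * (ref_cores / new_cores) ** (1.0 / exponent)


if __name__ == "__main__":
    # Hypothetical reference: a quad core at 2.0 GHz at its power limit.
    dual = clock_at_same_power(2.0, ref_cores=4, new_cores=2)
    print(f"dual core at the same power: about {dual:.2f} GHz")
```

Under this assumed cubic model, halving the core count buys roughly 26% more clock at the same power, which is why comparing a quad-core K10 against a dual-core K8 on clock speed alone is not an even comparison. It also shows why a dual-socket system is a different case: it gets two power budgets.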
Modern CPUs are mainly power limited, so given that both x86 CPU makers have (mostly) nailed themselves to certain power limits, clocks are dictated by that power. If they choose to make a dual core, they can clock it higher than a quad given the same power usage on both chips.
Why are the tri-core roadmaps showing lower speed bins than quad-core?
10% is not impressive at all. They could have got that by just focusing on the L2 cache size and latency and improving that crap DDR2 controller. Also, integer performance is basically the same, so quite a lot of applications will see unnoticeable speed increases, if any at all. For me, that all adds up to being not good enough. It may be an improvement, but it's weak and not good enough for the amount of time they have had. I'd like to see a K8 with K10's cache and memory controller improvements (without the TLB problem). I bet the difference between K8 and K10 on all but a very few benchmarks would be unnoticeable.
Perhaps you should submit your job application to AMD so they can hire better CPU architects.