 |
Aceshardware (not so) temporary home for the aceshardware community
|
| Author |
Message |
inf64
Joined: 04 Sep 2007 Posts: 69
|
Posted: Mon Nov 19, 2007 4:52 pm Post subject: |
|
|
| Gabriele Svelto wrote: | After looking at the review and considering the previous reviews of K10-based Opterons I'm starting to see a pattern: they screwed up the whole MC/L3 thingy.
The in-core changes which went into the K10 should have improved performance over K8, however these improvements have been MIA in a lot of benchmarks. The only explanation I can come up with is that the MC/L3 controller is holding the core back. All evidence points in that direction: new Opterons having significantly higher unloaded latency compared to K8s, Phenom's MC/L3 being stuck at 2.0 GHz and the consequent poor performance (relatively speaking).
I may be wrong and this is all speculation on my part, still I don't see any other potential reason why performance improvements over K8 (clock-for-clock, core-for-core) failed to materialize. |
My thoughts exactly. They messed up the whole northbridge (clock-wise) and they messed up the L3 (TLB bug). The northbridge should run at the same clock as the cores (or even faster on AM2+ boards).
This resulted in sub-par memory tests (compared to X2s, Phenom really sucks badly in this department), and the L3 bug costs them roughly 10%:
From the heise article (rough translation):
All Phenoms produced so far still have a bug in the Translation Lookaside Buffer (TLB) for the L3 cache. According to AMD, it only shows up under very rare load conditions and matches the bugs in Barcelona. AMD can work around the bug by disabling the entire TLB for the L3 cache, but by its own statements that costs ten percent or more in performance.
From all of this it seems that K10 wasn't quite finished. It is a monolithic quad core with an amazing level of integration (3 levels of cache, 4 cores on the same die, integrated northbridge, core improvements), but done on an SOI process, which is probably the reason for the low clock rates and defects. On top of that, they are still finding bugs in the design :(
I hope they fix these problems real soon, since K10 really does look good on paper, but in reality it looks like an unfinished and rushed product.
|
|
| Back to top |
|
 |
Gabriele Svelto
Joined: 27 Jun 2007 Posts: 290 Location: Milano, Italy
|
Posted: Mon Nov 19, 2007 5:03 pm Post subject: |
|
|
| inf64 wrote: | My thoughts exactly. They messed up the whole northbridge (clock-wise) and they messed up the L3 (TLB bug). The northbridge should run at the same clock as the cores (or even faster on AM2+ boards).
This resulted in sub-par memory tests (compared to X2s, Phenom really sucks badly in this department), and the L3 bug costs them roughly 10% |
A hasty fix in the IMC/L3 might also explain the >20 ns extra latency seen on 2.5 GHz K10-based Opterons compared to the K8 ones. I sincerely hope for AMD's sake that this is really a bug and that they can get it fixed quickly; if it's a design flaw they're in for a very rough ride.
| Quote: | | I hope they fix these problems real soon, since K10 really does look good on paper, but in reality it looks like an unfinished and rushed product. |
I agree. If they could pull off an HD2900XT-to-HD3870 style transition for Phenoms too, compared to the current offerings, it would be great, but I'm not holding my breath on this one.
|
|
|
 |
inf64
Joined: 04 Sep 2007 Posts: 69
|
Posted: Mon Nov 19, 2007 5:15 pm Post subject: |
|
|
Damn double post. Grr, the board's gremlins are @work again :S
Last edited by inf64 on Mon Nov 19, 2007 5:19 pm; edited 2 times in total |
|
|
 |
inf64
Joined: 04 Sep 2007 Posts: 69
|
Posted: Mon Nov 19, 2007 5:18 pm Post subject: |
|
|
| Gabriele Svelto wrote: |
I agree, if they could pull a HD2900XT-to-HD3870 transition for Phenoms too compared to the current offerings it would be great, but I'm not holding my breath on this one. |
I still hope they manage to release the 2.4 GHz and 2.6 GHz parts with the problems sorted out. It would be nice to see a "full featured" K10 MPU, with a full-speed L3 cache and no "showstopper" bugs. Even if it stays on 65nm (RV670 is 55nm, a great improvement over the 80nm R600), I just want to see it running without problems or a crippled cache.
One more thing I am looking forward to is user OC experiments where they raise the northbridge clock while OCing via HTT (since the L3 clock is linked to the base clock). Let's see if the NB can actually run stable above 2.2 GHz, just to rule out the L3 block inside it as the part that "won't clock".
PS: I just looked at the pcper article and the power consumption (even on the OCed 2.6 GHz, 1.3 V part) is really good. Phenom shines @ idle and it's good @ load. At least one positive :)
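To make the NB overclocking experiment concrete, here is a minimal sketch of how a clock domain tied to the base clock scales when the base clock is raised. The 200 MHz reference clock and the 10x NB multiplier are assumptions picked only to reproduce the 2.0 GHz figure mentioned in this thread; they are not taken from any review or datasheet.

```python
# Hypothetical numbers: a 200 MHz reference clock and a 10x NB multiplier
# are assumptions chosen to reproduce the 2.0 GHz figure from this thread.
def derived_clock_mhz(reference_mhz: float, multiplier: float) -> float:
    """Clock of a domain that runs at a fixed multiple of the reference clock."""
    return reference_mhz * multiplier


NB_MULTIPLIER = 10.0  # assumed, see note above

# Raising the reference clock drags the NB/L3 clock up with it.
for reference_mhz in (200, 210, 220, 230):
    nb_mhz = derived_clock_mhz(reference_mhz, NB_MULTIPLIER)
    print(f"reference {reference_mhz} MHz -> NB/L3 {nb_mhz:.0f} MHz")
```

Under these assumptions the NB only passes 2.2 GHz once the reference clock goes above 220 MHz, so that is roughly the range where such an experiment would get interesting.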
|
|
|
 |
lux_interior
Joined: 26 Jul 2007 Posts: 252
|
Posted: Mon Nov 19, 2007 5:21 pm Post subject: |
|
|
| Gabriele Svelto wrote: | | The same graph shows memory accesses having ~18ns of latency, that's quite unlikely. I don't know how the program they use works but it doesn't seem like it's returning correct information for blocks above 2 MiB unless the prefetcher is playing tricks behind its back. |
You're right, there's something fishy. They used RightMark Memory Analyzer; I don't know how reliable it is.
|
|
|
 |
Johan
Joined: 23 Jul 2007 Posts: 161 Location: Belgium
|
Posted: Mon Nov 19, 2007 5:26 pm Post subject: |
|
|
| Gabriele Svelto wrote: |
A hasty fix in the IMC/L3 might also explain the >20 ns extra latency seen on 2.5 GHz K10-based Opterons compared to the K8 ones. I sincerely hope for AMD's sake that this is really a bug and that they can get it fixed quickly; if it's a design flaw they're in for a very rough ride. |
Hmmm, the extra 20 ns, isn't that just a result of the fact that the CPU has to search through the L3 before it goes to the RAM?
|
|
|
 |
lux_interior
Joined: 26 Jul 2007 Posts: 252
|
Posted: Mon Nov 19, 2007 5:41 pm Post subject: |
|
|
| inf64 wrote: | | PS: I just looked at the pcper article and the power consumption (even on the OCed 2.6 GHz, 1.3 V part) is really good. Phenom shines @ idle and it's good @ load. At least one positive :) |
Well... it's compared on the one hand to higher clocked 90nm K8's, on the other hand to Core2's with a different video setup (the reported power consumption is for the entire system).
When measuring only CPU consumption compared to a similarly clocked K8, or to a much better performing 45nm Core2 Quad Extreme, it does not look so rosy:
http://www.hardware.fr/medias/photos_news/00/21/IMG0021456.gif
|
|
|
 |
bgx
Joined: 11 Oct 2007 Posts: 7
|
Posted: Mon Nov 19, 2007 6:04 pm Post subject: |
|
|
For the RightMark memory 'bug', I found this test on K8 which shows that indeed it does not test random memory latency but walks memory in a predictable fashion (forward, mid page):
http://www.digit-life.com/articles2/rmma-general/rmma-k7-k8.html
It is at 51 cycles on a 1.8 GHz Opteron (K8), so 40 cycles on faster unregistered memory does not seem off.
They also measured 17 cycles for the L2 on this K8, so the L3 does not seem to be that bad (unless random-access speed is much different).
In the Hardware.fr test, they also show some good core-for-core real-life benchmark improvements of K10 over K8 (up to 33%). That's really not too bad. Of course, with the clock speed being what it is, it is far too little against Core 2, but when AMD sorts its problems out, it should be quite competitive in the middle segment, which is good for us buyers.
As for the absolute minimum FPS, it is not the most interesting figure. The most interesting one is the lowest 10% of FPS readings: if at some point a platform problem (HD or whatever) causes a millisecond of very low FPS, nobody minds; it is when this happens repeatedly that it becomes a problem. Since the average FPS on the Intel machine is quite a bit higher, there is little we can conclude unless we have the whole picture...
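A rough sketch of the difference between the absolute minimum and a "lowest 10%" figure, assuming a list of per-frame render times in milliseconds is available. The function name, the made-up frame times, and the choice to average the slowest 10% of frames are all assumptions for illustration; none of the reviews discussed here publish this metric.

```python
def fps_summary(frame_times_ms, worst_percent=10):
    """Summarise a run from per-frame render times given in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    # Instantaneous FPS of every frame, slowest frames first.
    fps = sorted(1000.0 / t for t in frame_times_ms)
    n_worst = max(1, len(fps) * worst_percent // 100)
    worst = fps[:n_worst]
    return {
        "min_fps": fps[0],
        "worst_avg_fps": sum(worst) / len(worst),
        # Average FPS over the run = frames rendered / total time.
        "avg_fps": len(frame_times_ms) * 1000.0 / sum(frame_times_ms),
    }


# Made-up data: one isolated hiccup versus a sustained slow stretch.
single_spike = [10.0] * 99 + [100.0]        # one 100 ms frame
repeated_dips = [10.0] * 90 + [40.0] * 10   # ten 40 ms frames

print(fps_summary(single_spike))
print(fps_summary(repeated_dips))
```

With this invented data the single hiccup gives the lower absolute minimum (10 FPS against 25 FPS), yet its slowest-10% average stays near the normal frame rate, while the run with repeated dips sits at 25 FPS for the whole slowest 10%. That is the case the absolute minimum hides.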
|
|
|
 |
shank15217
Joined: 09 Oct 2007 Posts: 17
|
Posted: Mon Nov 19, 2007 6:54 pm Post subject: |
|
|
| jack wrote: |
It seems that 2.3GHz Phenom is actually slower than fastest 3.2GHz Athlon 64 on average on that benchmark suite. |
With a 900 MHz disadvantage you expected the Phenom to beat the Athlon 64? I wish people would stop blowing this out of proportion... Clock for clock the K10 is 15-18% faster than the K8, which makes it about 10% slower than Core 2 and 15-18% slower than Penryn, that's it. Yes, it's disappointing, but consider the K8 situation against Penryn: AMD would be at an average 30% clock-for-clock disadvantage.
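A back-of-the-envelope check of that argument, using only the numbers in this thread (2.3 GHz Phenom, 3.2 GHz Athlon 64, 15-18% clock-for-clock gain). It assumes performance scales linearly with clock speed and compares core against core, ignoring the extra cores; both are simplifications, and the function name is made up for this sketch.

```python
def k8_equivalent_ghz(k10_clock_ghz: float, per_clock_gain: float) -> float:
    """K8 clock that would match a K10 core, assuming linear clock scaling."""
    return k10_clock_ghz * (1.0 + per_clock_gain)


PHENOM_GHZ = 2.3
ATHLON64_GHZ = 3.2

low = k8_equivalent_ghz(PHENOM_GHZ, 0.15)
high = k8_equivalent_ghz(PHENOM_GHZ, 0.18)

print(f"2.3 GHz K10 is roughly a {low:.2f}-{high:.2f} GHz K8, core for core")
print(f"still behind a {ATHLON64_GHZ} GHz K8: {high < ATHLON64_GHZ}")
```

Under these assumptions the 2.3 GHz part lands somewhere around a 2.6-2.7 GHz K8 per core, so losing to a 3.2 GHz Athlon 64 in lightly threaded tests is what the clock-for-clock numbers predict.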
sorry double post
Last edited by shank15217 on Mon Nov 19, 2007 6:58 pm; edited 1 time in total |
|
|
 |
Gabriele Svelto
Joined: 27 Jun 2007 Posts: 290 Location: Milano, Italy
|
Posted: Mon Nov 19, 2007 6:55 pm Post subject: |
|
|
| Johan wrote: | | Hmmm the extra 20 ns isn't that just a result of the fact that the CPU has to search through the L3 before it goes to the RAM? |
I think we've already talked about this in another thread, though I don't remember if it was here or on RWT. Anyway, checking the L3 tags cannot take 20 ns; that would be 40 cycles @ 2.0 GHz, which is extremely unlikely. Naturally, if the L3 cache controller is buggy/flawed and needs an ugly fix to function properly, it might become quite possible :)
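For anyone who wants to redo the conversion between nanoseconds and cycles, a minimal sketch follows. The helper names are made up; the inputs are the figures quoted in this thread (20 ns at 2.0 GHz, and the 51 cycles at 1.8 GHz from the RMMA article).

```python
def ns_to_cycles(latency_ns: float, clock_ghz: float) -> float:
    """A clock of f GHz ticks f times per nanosecond."""
    return latency_ns * clock_ghz


def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Inverse conversion: how long a number of cycles takes."""
    return cycles / clock_ghz


# 20 ns of extra latency at a 2.0 GHz northbridge/L3 clock.
print(ns_to_cycles(20, 2.0))             # 40.0

# 51 cycles measured on a 1.8 GHz K8, expressed in nanoseconds.
print(round(cycles_to_ns(51, 1.8), 1))   # 28.3
```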
|
|
|
 |
jack
Joined: 27 Jun 2007 Posts: 357
|
Posted: Mon Nov 19, 2007 6:57 pm Post subject: |
|
|
| inf64 wrote: |
PS: I just looked at the pcper article and the power consumption (even on the OCed 2.6 GHz, 1.3 V part) is really good. Phenom shines @ idle and it's good @ load. At least one positive :) |
PCPer article shows 2.6GHz Phenom having a higher power consumption under load than 3GHz Core2 Quad. I wouldn't call that "good".
Anandtech has power consumption numbers made with identical video cards and they show that Phenom has a higher power consumption than identically clocked 65nm Core2 Quad.
|
|
|
 |
inf64
Joined: 04 Sep 2007 Posts: 69
|
Posted: Mon Nov 19, 2007 7:33 pm Post subject: |
|
|
| jack wrote: | | inf64 wrote: |
PS: I just looked at the pcper article and the power consumption (even on the OCed 2.6 GHz, 1.3 V part) is really good. Phenom shines @ idle and it's good @ load. At least one positive :) |
PCPer article shows 2.6GHz Phenom having a higher power consumption under load than 3GHz Core2 Quad. I wouldn't call that "good".
Anandtech has power consumption numbers made with identical video cards and they show that Phenom has a higher power consumption than identically clocked 65nm Core2 Quad. |
Yes, but C2Q doesn't have an integrated NB with 2 DCTs, IMC, crossbar switch and SRQ, does it? Add 15% on top of the C2Q's numbers and you get a more realistic picture.
Of course I am talking about CPU-only power consumption, not system level.
|
|
|
 |
Alberto
Joined: 04 Sep 2007 Posts: 111 Location: Italy
|
Posted: Mon Nov 19, 2007 7:51 pm Post subject: |
|
|
| inf64 wrote: | | jack wrote: | | inf64 wrote: |
PS: I just looked at the pcper article and the power consumption (even on the OCed 2.6 GHz, 1.3 V part) is really good. Phenom shines @ idle and it's good @ load. At least one positive :) |
PCPer article shows 2.6GHz Phenom having a higher power consumption under load than 3GHz Core2 Quad. I wouldn't call that "good".
Anandtech has power consumption numbers made with identical video cards and they show that Phenom has a higher power consumption than identically clocked 65nm Core2 Quad. |
Yes, but C2Q doesn't have an integrated NB with 2 DCTs, IMC, crossbar switch and SRQ, does it? Add 15% on top of the C2Q's numbers and you get a more realistic picture.
Of course I am talking about CPU-only power consumption, not system level. |
15%? You are wrong...
Moreover, the complex FSB logic is power-free for you, only for you.
http://www.matbe.com/articles/lire/561/amd-phenom-9600-et-9500/page7.php
CPU only, without the rest of the system. The difference is larger than your "15%".
Yet the system power consumption of Phenom is odd, so.......
Alberto.
|
|
|
 |
Uffe Merrild
Joined: 27 Jun 2007 Posts: 108 Location: Silkeborg, Denmark
|
Posted: Mon Nov 19, 2007 8:34 pm Post subject: |
|
|
| jamannetje wrote: | In the hardwarecanucks review I did find some bright spots for AMD. I do not know if this is partly due to GPU drivers, but in all game benchmarks the minimum framerate is higher than Intel's.
To me the minimum framerate is more important than the average and maximum framerate because it tells the most about the playability of a game.
Jeschael |
Minimum framerate is a bad benchmark if not done right. If you merely take the absolute minimum frame rate the computer cranked out, you won't get a fair picture. An HD read or similar interrupt at the wrong moment will make any minimum frame rate test bottom out the scale.
|
|
|
 |