Aceshardware Forum Index
(not so) temporary home for the aceshardware community
 

Nvidia's CUDA
ajensen



Joined: 01 Sep 2007
Posts: 133

Posted: Sun Jun 15, 2008 10:20 am

Our research group tested and scrapped CUDA, although it was close to perfect for our workload. No doubt we could get a massive performance increase, but it was too much work (many lines of code), it was too unlikely that nvidia would remain the performance leader over time, and compatibility issues were seen as very likely. For instance, the next big thing (tm) from nvidia might well be compatible with CUDA, but would we have to do the performance tuning all over again? With these highly specialized architectures, getting the software to do what you want takes about half the time, and the other half is spent on architecture-specific tuning to reach the performance potential of the hardware. A new architecture most likely means a new round of tuning.

It is actually worse than the VLIW dilemma. There, a new uarch means your binary is broken and all you need to do is recompile the code. Here, your binary probably still runs correctly, but dead slow. Recompiling might do the trick in some cases. In other cases you must rewrite the code because too many architectural details are exposed to the programmer, which of course is at least half the reason you can get so much performance in the first place.
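
To make the tuning point concrete, here is a minimal sketch in Python of what architecture-specific tuning amounts to in its simplest form: time one kernel over a few values of a tunable parameter and keep the fastest. The kernel, the function names and the candidate block sizes are all hypothetical and are not taken from the workload described above. The only point is that the winner is measured on one machine and has to be measured again on the next one.

```python
import time


def blocked_sum_of_squares(data, block):
    """Sum of squares computed block by block; `block` is the tunable knob."""
    total = 0.0
    for start in range(0, len(data), block):
        chunk = data[start:start + block]
        total += sum(x * x for x in chunk)
    return total


def tune_block_size(data, candidates, repeats=3):
    """Return the candidate with the lowest measured time on this machine."""
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            blocked_sum_of_squares(data, block)
            best = min(best, time.perf_counter() - t0)
        timings[block] = best
    # The "best" block size is an empirical result, not a property of the code.
    return min(timings, key=timings.get), timings


# Hypothetical input and candidate block sizes, chosen only for illustration.
data = [float(i) for i in range(20_000)]
candidates = [64, 256, 1024, 4096]
best_block, timings = tune_block_size(data, candidates)
print(best_block, timings)
```

The result is correct for every block size, which mirrors the complaint: nothing breaks on new hardware, the chosen parameter is simply no longer the fast one.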

IMO GPUs are not general purpose enough yet to make the effort worthwhile for us. The interesting question is whether they will have enough of a performance advantage over CPUs by the time they get there.

We decided to port from MATLAB to plain boring C++ rather than CUDA. This way we are guaranteed that the code will have some real value in the future, and hopefully the throughput/watt focus of CPUs will ensure enough performance. It also doesn't matter whether the CPU is SPARC, Power, IA64 or x64. Luckily the workload is trivial to scale, so it really boils down to an economic issue.
Del



Joined: 09 Aug 2007
Posts: 121

Posted: Sun Jun 15, 2008 11:56 am    Post subject: Re: CUDA, GPGPU etc.

malcolm wrote:

As someone currently developing in this area, I can tell you that it is something of a paradigm shift. All the big players know this (why else do you think AMD almost killed themselves purchasing ATI?), and as such are working very hard to position themselves for the new paradigm.
A hasty conclusion that shows nothing. The ATI purchase gave them Puma and 780G, and it is already paying off in the fastest growing market out there, laptops, a market where AMD needed a platform to realistically compete with Intel. This alone can explain the motivation behind the purchase.
Quote:
nVidia has CUDA on their GPUs, Intel has Ct (or will have) on Larrabee, AMD has Brook+ on their GPUs.
I have to agree with Charlie on this one. As long as there is no common standard, you cannot expect people to accept vendor lock-in at the programming language level. The sooner nvidia and AMD agree on a common framework, the better.
Quote:
Just to give you an idea of the gulf between a GPU and CPU currently. The new generation GPUs can do about 1TFlop peak, my CPU (AthX2) about 18MFlops... :)
You are not doing anybody any favours by exaggerating. The Phenom that I am writing this on can theoretically do 80 Gflops single precision (and will probably achieve within 10% of that on dense matrix multiplication using AMD libraries), and gives me a benchmarked memory bandwidth of about 10 GB/s with 800MHz DDR2. That is commodity RAM, and I can put 16GB of it on any mobo at very low cost. The power budget is also significantly lower than that of your reference GPU. What I am getting at is that 18x is not relevant; I believe you are looking at a best-case scenario of 10x on performance, and significantly less if you bring in power consumption. We now have blades where we can pack four sockets, so the picture doesn't change favourably if you bring in the density argument with Tesla.
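
For what it's worth, the 80 Gflops figure is consistent with a simple peak-rate calculation. A minimal sketch in Python, assuming a quad-core part at 2.5 GHz that issues 8 single-precision flops per core per cycle (one 4-wide SSE add plus one 4-wide SSE multiply). None of these three numbers is stated in the post, so treat them as assumptions. The 1 TFlop GPU peak is the figure quoted from malcolm.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: cores x clock (GHz) x flops per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


# Assumed, not from the post: 4 cores, 2.5 GHz, 8 SP flops per cycle per core.
cpu_peak = peak_gflops(cores=4, clock_ghz=2.5, flops_per_cycle=8)

# Quoted in the thread: "about 1TFlop peak" for the new generation of GPUs.
gpu_peak = 1000.0

ratio = gpu_peak / cpu_peak
print(f"CPU peak: {cpu_peak:.0f} GFLOPS, GPU/CPU peak ratio: {ratio:.1f}x")
# prints "CPU peak: 80 GFLOPS, GPU/CPU peak ratio: 12.5x"
```

On those assumptions the peak-to-peak gap is 12.5x, in the same ballpark as the 10x best case. Peak numbers say nothing about achieved throughput or power draw.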

I really do not want to be negative, but I am on the funding end, and I need to see the value proposition before I can give any GPGPU project my support. Nvidia did one thing right, and that was providing CUDA to the Linux community early. It has resulted in a couple of supported projects that I know of, though none where I was involved in the process. I would like to be convinced though, and I do keep an open mind. Hence I appreciate any effort you can make to convince me that CUDA, or even GPGPU in general, is the right bet.
ajensen



Joined: 01 Sep 2007
Posts: 133

Posted: Sun Jun 15, 2008 1:04 pm    Post subject: Re: CUDA, GPGPU etc.

Del wrote:
What I am getting at is that 18x is not relevant; I believe you are looking at a best-case scenario of 10x on performance, and significantly less if you bring in power consumption.

And by the end of this year we'll have 65nm CPU systems pulling SPECfp2006_rate baselines of about 100 per socket, a doubling compared to today's best-performing CPU systems. Perhaps Nehalem won't do too badly either with all its fancy SSE extensions, and that really puts pressure on GPGPU performance to make up for all the extra hassle.


Last edited by ajensen on Sun Jun 15, 2008 2:27 pm; edited 1 time in total
foo



Joined: 27 Jun 2007
Posts: 126

Posted: Sun Jun 15, 2008 1:11 pm

One of my students is working with CUDA.
We are doing security stuff with GPUs and we are seeing very nice speedups.
Groo



Joined: 22 Jul 2007
Posts: 178

Posted: Sun Jun 15, 2008 3:34 pm

Tvar/Malcolm,
The reason I say what I say is nicely summed up by Ajensen at the top of the page. I am not doubting the claims of NV directly: you CAN get the required speedups, sometimes by orders of magnitude.
That said, corps are not buying in. If you run the numbers, it isn't worth it. Specialized cases, yes. Armies of free grad students, yes. Real world where you have to pay programmers to write and maintain code? No.
On paper, it was a good idea. Once you take all the external factors into account, the price/performance advantage drops to near zero.
I have talked to a LOT of big companies who were the exact target market for GPGPU. None of them are deploying; they all do what Ajensen did: kick the tires, evaluate it, run the numbers and walk away.
In the end, it just isn't worth it, big picture, for most people.

-Charlie
who?



Joined: 01 Sep 2007
Posts: 540

Posted: Sun Jun 15, 2008 3:36 pm

As long as you don't see a SPEC speedup, it is not really ready ...

who?
Carfax



Joined: 07 Aug 2007
Posts: 40

Posted: Sun Jun 15, 2008 7:38 pm

who? wrote:

I want to see the CUDA encoding running at 700Kbps output in SD, and 2Mbps for HD ... it certainly looks pretty ugly... from what I saw ....
Remember, principle of benchmarking:
1) Make sure you do the work right!

They can have as much speed as they want; when they don't compress well, I can do it faster on the CPU. It is called mem copy :)

I don't think that any of those encoders can get close to the CPU version as soon as you get to internet bit rates or movie storage bit rates, at equal end quality. Avivo could not get close, and I expect the same from CUDA-like encoders. Brute force is not what you need when you try to encode "personal movies of 1 hour 30 minutes ;0)"

who?


Intel believes CPUs do more video encoding than GPUs
who?



Joined: 01 Sep 2007
Posts: 540

Posted: Mon Jun 16, 2008 12:35 am

Carfax wrote:
Intel believes CPUs do more video encoding than GPUs


At the end of the article, the quote is reversed. I said that C, C++ and ASM are the only languages that codec programmers use, not the other way around.
I work on the top 3 codecs, and I know that no CUDA will be done there, but they all run on x86, don't they :) If you write 10 lines of code and make them run on the other cores ... it is all win! ;)
I don't know why, but I feel very confident that in the video codec world, x86 has already won this area... hehehee

I'll say more when I can

who?
martinw



Joined: 06 Sep 2007
Posts: 139

Posted: Mon Jun 16, 2008 1:02 am

What Groo said. I'm in one of the key markets targeted by CUDA and GPGPU, and after the initial excitement it has mostly been a disappointment. There has been a lot of tire kicking but deployment is extremely limited, and I don't see that changing any time soon. It's just not worth it at the moment, for all the reasons given above.
who?



Joined: 01 Sep 2007
Posts: 540

Posted: Mon Jun 16, 2008 1:34 am

GPGPU only makes sense in the context of a uniform instruction set; other solutions are a wart on the architecture. It does not make any sense to split the instruction set between the CPU and the GPGPU. It needs at least a common ground.

who?
malcolm



Joined: 14 Jun 2008
Posts: 3

Posted: Mon Jun 16, 2008 3:35 am    Post subject: GPGPU and stream processing

Groo -
I now get exactly the point you are trying to make, and I find it quite easy to believe. Current hardware is not yet flexible enough, and current APIs are fragmented and difficult to develop for, i.e. the value proposition for a business is low.

However, I still believe we are on the cusp of a paradigm shift. The hardware is getting faster and more flexible (can't wait for Larrabee!), and I expect that eventually the API landscape will mature.

I also expect that the value proposition will be entirely different in a year or so when (if?) Fusion appears and Larrabee matures. It also makes me wonder what nVidia is going to do, although it might also explain why they've been so ornery lately! :)

On a personal note, I believe the most fascinating thing about stream processing tech is the enabling aspect. The order of magnitude (+) speedup can make some applications possible that wouldn't otherwise be.
For example:
http://fastra.ua.ac.be/en/index.html

Malcolm
martinw



Joined: 06 Sep 2007
Posts: 139

Posted: Mon Jun 16, 2008 6:13 pm    Post subject: Re: GPGPU and stream processing

malcolm wrote:
and I expect that eventually the API landscape will mature.


Speaking of which, it may be that OpenCL is a good candidate for a higher level cross-platform GPGPU API. Apple unveiled it last week at WWDC. I've heard positive reports from at least one person who attended. And it looks like AMD has already stepped on board:

http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~126593,00.html

In keeping with its open systems philosophy, AMD has also joined the Khronos Compute Working Group. This working group’s goals include developing industry standards for data parallel programming and working with proposed specifications like OpenCL. The OpenCL specification can help provide developers with an easy path to development across multiple platforms. [...] "We believe that OpenCL is a step in the right direction and we fully support this effort."
lux_interior



Joined: 26 Jul 2007
Posts: 252

Posted: Mon Jun 16, 2008 7:47 pm

Groo wrote:
It is a stunt though. Any time someone trots out a test app on card release, it was funded directly by the card maker, and that is about all you will ever see that does this. It is almost guaranteed not to run on the next generation part.

With CPUs however, the odds are MUCH greater that it will work on the next gen, and in 3 years there will still be support for SSEx.


There are stunts on CPUs as well. Does anyone still remember the 64-bit gzip tweaked by AMD to demonstrate the superiority of the K8 architecture?
Sure, that gzip probably works on current CPUs, except that nobody supports or maintains it, let alone packages it.
HighTech4US



Joined: 19 Aug 2007
Posts: 10

Posted: Wed Jun 18, 2008 2:32 pm    Post subject: Re: FUD

Tvar' wrote:
Charlie -

GPUs are turning into general purpose manycore processors - which is why Intel has to make Larrabee. Nvidia understands this well, your FUD to the contrary. They're not going to break forward compatibility with CUDA and PTX.

I'm actually a little surprised at your unfounded FUD. Too bad.


Why the surprise?

Charlie HATES nVidia and will post continuous FUD on anything they do or produce. Seems like nVidia in the past slighted Charlie and now he just acts like a spoiled child.
DUCK of DEATH



Joined: 04 Sep 2007
Posts: 104

Posted: Wed Jun 18, 2008 4:37 pm    Post subject: Re: FUD

HighTech4US wrote:
Why the surprise?

Charlie HATES nVidia and will post continuous FUD on anything they do or produce. Seems like nVidia in the past slighted Charlie and now he just acts like a spoiled child.

Nvidia aren't owned by AMD, so it is no surprise.
All times are GMT + 1 Hour