Cuda
From Ghoulwiki
Revision as of 19:32, 29 September 2007 by Ghoulsblade (talk | contribs)
- http://www.litec-computer.de/PC-Komponenten/Grafikkarten/PCI-express/nVidia/Gigabyte-GV-NX85T256H-8500GT-512MB-Dual-DVI-TV-out-passiv::13515.html
- http://www.litec-computer.de/PC-Komponenten/Grafikkarten/PCI-express/nVidia/ASUS-EN8500GT-SILENT-MAGIC-HTD-512MB-DVI-TV-out-passiv::13117.html
- Note : no 8800-series card (no atomic operations), only an 8500 or 8600, with as much RAM as possible (512 MB); no cards listed as "only 128-bit memory interface"
- svn+ssh://ghoulsblade@zwischenwelt.org/var/svn/robertprojarbeit
- http://zwischenwelt.org/svn/robertprojarbeit
- nvidia-cuda-forum http://forums.nvidia.com/index.php?showforum=62 (search for 8600)
- nvidia-cuda-homepage http://developer.nvidia.com/object/cuda.html
- FAQ : http://forums.nvidia.com/index.php?showtopic=36286&hl=8600 (many interesting programming tips)
- SIMD : http://en.wikipedia.org/wiki/Vector_processor
- samples http://developer.download.nvidia.com/compute/cuda/sdk/website/samples.html
- CUDA 1.0 announcement, 26 June : http://forums.nvidia.com/index.php?showtopic=39030&hl=8600
- $(CUDA_BIN_PATH)\nvcc.exe -arch sm_11 -ccbin "$(VCInstallDir)bin" -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/O2,/Zi,/MT -I"$(CUDA_INC_PATH)" -I./ -I../../common/inc -o $(ConfigurationName)\myproj.obj myproj.cu
##### ##### ##### ##### #####
device 0 name : GeForce 8500 GT
261888k totalGlobalMem
16k sharedMemPerBlock
8k regsPerBlock
32 warpSize
256k memPitch
512 maxThreadsPerBlock
512 maxThreadsDim[0]
512 maxThreadsDim[1]
64 maxThreadsDim[2]
63k maxGridSize[0]
63k maxGridSize[1]
1 maxGridSize[2]
64k totalConstMem
1 major
1 minor
1371k clockRate
256 textureAlignment
##### ##### ##### ##### #####
device 1 name : GeForce 8500 GT
261824k totalGlobalMem
16k sharedMemPerBlock
8k regsPerBlock
32 warpSize
256k memPitch
512 maxThreadsPerBlock
512 maxThreadsDim[0]
512 maxThreadsDim[1]
64 maxThreadsDim[2]
63k maxGridSize[0]
63k maxGridSize[1]
1 maxGridSize[2]
64k totalConstMem
1 major
1 minor
1371k clockRate
256 textureAlignment
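A listing like the one above can be produced with the CUDA runtime's device query calls. The following is a minimal sketch, not the program that generated the original output: it prints only a subset of the fields, and the helper `supportsGlobalAtomics` is a hypothetical name introduced here. It encodes the rule behind the card note (32-bit atomics on global memory need compute capability 1.1 or higher, which is also what `-arch sm_11` targets).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: 32-bit global-memory atomics need compute capability >= 1.1.
static bool supportsGlobalAtomics(int major, int minor) {
    return major > 1 || (major == 1 && minor >= 1);
}

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp p;
        if (cudaGetDeviceProperties(&p, dev) != cudaSuccess) continue;
        // Same "value label" layout as the listing, sizes shown in units of 1024.
        std::printf("##### ##### ##### ##### #####\n");
        std::printf("device %d name : %s\n", dev, p.name);
        std::printf("%lluk totalGlobalMem\n", (unsigned long long)(p.totalGlobalMem / 1024));
        std::printf("%lluk sharedMemPerBlock\n", (unsigned long long)(p.sharedMemPerBlock / 1024));
        std::printf("%d warpSize\n", p.warpSize);
        std::printf("%d maxThreadsPerBlock\n", p.maxThreadsPerBlock);
        std::printf("%d major\n", p.major);
        std::printf("%d minor\n", p.minor);
        std::printf("global atomics : %s\n",
                    supportsGlobalAtomics(p.major, p.minor) ? "yes" : "no");
    }
    return 0;
}
```

For the two 8500 GT devices listed (major 1, minor 1) the check reports atomics as available; a compute capability 1.0 card would report "no".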
N=65536 SX=2048 SY=64 SZ=2 I0=32
DATASIZE_IN_RAW=2304kb
DATASIZE_IN_STATE=256kb
DATASIZE_IN_INDEX=136kb
DATASIZE_IN_TOTAL=2696kb
DATASIZE_OUT_TOTAL=16384kb
0.74 sec : reading data from file
assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1)
assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1)
assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1)
assert passed : (sz < 255)
-2.478893,-2.459100,-2.459100,...,4.097229
0.09 sec : generating index data
0.02 sec : allocate and init device mem
9.55 sec : exec kernel on device
0.00 sec : receive results from device
atom[0]=57100 atom[1]=0
iNumResults=57100 kMaxResults=2097152
check : with index on cpu...
check : with index on cpu: iNumResults=57100
17.29 sec : check : with index on cpu
Press ENTER to exit...
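In the log above, `atom[0]` equals `iNumResults`, which suggests that the kernel counts its hits with an atomic counter in device memory. The sketch below shows that general pattern only. The predicate (value above a threshold), the names `collectHits` and `countHitsCpu`, and the buffer sizes are all assumptions made for illustration and are not taken from the project's kernel.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread tests one value and, on a hit, reserves an
// output slot by atomically incrementing a global counter.
__global__ void collectHits(const float* in, int n, float threshold,
                            unsigned int* counter, int* out, unsigned int maxResults) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (in[i] > threshold) {
        unsigned int slot = atomicAdd(counter, 1u);  // returns the value before the add
        if (slot < maxResults) out[slot] = i;        // hits beyond capacity are counted, not stored
    }
}

// CPU reference used to cross-check the device count.
static unsigned int countHitsCpu(const std::vector<float>& in, float threshold) {
    unsigned int c = 0;
    for (float v : in) if (v > threshold) ++c;
    return c;
}

int main() {
    const int n = 65536;
    const unsigned int maxResults = 65536;
    const float threshold = 89.5f;
    std::vector<float> host(n);
    for (int i = 0; i < n; ++i) host[i] = static_cast<float>(i % 100);

    float* dIn = nullptr;
    int* dOut = nullptr;
    unsigned int* dCounter = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(float)) != cudaSuccess ||
        cudaMalloc(&dOut, maxResults * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&dCounter, sizeof(unsigned int)) != cudaSuccess) {
        std::fprintf(stderr, "device allocation failed\n");
        return 1;
    }
    cudaMemcpy(dIn, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dCounter, 0, sizeof(unsigned int));

    const int threads = 256;
    collectHits<<<(n + threads - 1) / threads, threads>>>(dIn, n, threshold,
                                                          dCounter, dOut, maxResults);

    // This blocking copy also waits for the kernel to finish.
    unsigned int numResults = 0;
    cudaError_t err = cudaMemcpy(&numResults, dCounter, sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("device: iNumResults=%u cpu check: iNumResults=%u\n",
                numResults, countHitsCpu(host, threshold));

    cudaFree(dIn);
    cudaFree(dOut);
    cudaFree(dCounter);
    return 0;
}
```

The order in which hits land in `out` is not deterministic, so a comparison against a CPU run should compare counts or sorted results, as the "check : with index on cpu" step in the log appears to do with the count.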
coarse-strong-ultra-100-96_10000.txt
big file : 750 MB
N=65536 SX=2048 SY=64 SZ=2 I0=32
DATASIZE_IN_RAW=2304kb
DATASIZE_IN_STATE=256kb
DATASIZE_IN_INDEX=136kb
DATASIZE_IN_TOTAL=2696kb
DATASIZE_OUT_TOTAL=16384kb
line 1:coarse-strong-ultra-000000 000 0.04820 0.05043 0.05505 0.04887 0.05384 0.05995 0.06215 0.06603 0.06615 0.06679 0.06202 0.06686 0.06654 0.06499 0.06705 0.05835 0.05513 0.05442 0.05398 0.04658 0.05066 0.05162 0.04627 0.04324 0.04433 0.04061 0.03673 0.04281 0.04078 0.03822 0.03821 0.03952 0.03648 0.03312 0.03402 0.03277 0.03029 0.04310 0.04202 0.04168 0.03964 0.04356 0.04193 0.03870 0.04995 0.04837 0.04460 0.04857 0.04831 0.04714 0.04734 0.05127 0.05319 0.05641 0.05886 0.05365 0.05752 0.05877 0.05435 0.05526
line 2:0.05103 0.05033 0.04347 0.04750 0.04574 0.03781 0.03926 0.03612 0.03374 0.03207 0.03060 0.02828 0.02507 0.02721 0.02558 0.02493 0.03374 0.02920 0.03031 0.02906 0.03283 0.03207 0.03207 0.04170 0.04342 0.04020 0.04298 0.04131 0.04298 0.04404 0.04810 0.04727 0.04491 0.04764 0.04301 0.04383 0.04317 0.04511 0.04545 0.04818 0.04908 0.04463 0.04620 0.04699 0.04360 0.04529 0.04346 0.04122 0.04011 0.03737 0.03293 0.02865 0.02832 0.02520 0.02255 0.02125 0.02035 0.01192 0.01107 0.00685 0.01059 0.01442 0.01595 0.02073
line 3: 0.01839 0.01905 0.02360 0.02169 0.02749 0.02954 0.03428 0.03788 0.03793 0.04286 0.03990 0.04313 0.04426 0.04382 0.04529 0.04475 0.04587 0.04372 0.04377 0.04492 0.04141 0.04456 0.04552 0.04159 0.04223 0.03591 0.03698 0.03614 0.03736 0.03934 0.03516 0.03583 0.02827 0.02601 0.02256 0.02111 0.02014 0.02016 0.02150 0.01603 0.01632 0.01486 0.01818 0.01843 0.02040 0.02465 0.02337 0.02405 0.02452 0.02435 0.02745 0.02855 0.03582 0.03742 0.04081 0.04198 0.03672 0.03736 0.03749 0.03917 0.04121 0.04257 0.04485 0.0417
line 4:5 0.04121 0.04167 0.03555 0.03806 0.03863 0.03748 0.03451 0.03039 0.02957 0.02493 0.02768 0.03054 0.02767 0.02438 0.02099 0.01713 0.01462 0.01508 0.01577 0.01737 0.01791 0.02091 0.02171 0.02182 0.02522 0.02480 0.02493 0.02913 0.02996 0.03269 0.03296 0.03631 0.03866 0.03451 0.04187 0.03918 0.04450 0.04598 0.04060 0.04482 0.03857 0.04485 0.04499 0.04898 0.05087 0.04374 0.04700 0.04478 0.04084 0.04081 0.04103 0.04135 0.03853 0.03664 0.03382 0.02755 0.02830 0.02619 0.02489 0.02267 0.01911 0.01841 0.01437 0.015
line 5:69 0.01486 0.01429 0.01689 0.02116 0.02161 0.02497 0.02357 0.02480 0.02486 0.02717 0.03292 0.03393 0.03558 0.03961 0.04140 0.03921 0.03968 0.03907 0.04186 0.04094 0.04488 0.04213 0.04043 0.04132 0.04242 0.04413 0.04329 0.04607 0.04379 0.04371 0.04105 0.03754 0.03824 0.03826 0.03761 0.03649 0.03190 0.03074 0.02218 0.02409 0.02173 0.01948 0.01822 0.01906 0.02054 0.01712 0.01795 0.01876 0.01716 0.02161 0.02460 0.02727 0.02947 0.02950 0.03441 0.03185 0.03891 0.03999 0.04098 0.04559 0.04661 0.04359 0.04212 0.04
total lines : 1507200
Press any key . . .
big test 1 (without last index on device)
N=1048576 SX=32768 SY=1024 SZ=32 I0=32
DATASIZE_IN_RAW=36864kb
DATASIZE_IN_STATE=4096kb
DATASIZE_IN_INDEX=136kb
DATASIZE_IN_TOTAL=41096kb
DATASIZE_OUT_TOTAL=16384kb
ReadTextData data/Corel_ColorMoments_9d.ascii : 68040 lines of real data, added 980536 lines of random data
WARNING ! iRealNumLines=68040 does not match the hardcoded N=1048576
1.08 sec : reading data from file
assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1)
assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1)
assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1)
assert passed : (sz < 255)
-2.478893,-2.478893,-2.478893,...,4.097229
1.86 sec : generating index data
0.06 sec : allocate and init device mem
764.99 sec : exec kernel on device
0.00 sec : receive results from device
atom[0]=60051 atom[1]=0
iNumResults=60051 kMaxResults=2097152
check : with index on cpu...
check : with index on cpu: iNumResults=60051
357.94 sec : check : with index on cpu
Press ENTER to exit...
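The DATASIZE values in both test logs can be reproduced with simple arithmetic. The sketch below is host-only code and rests on a layout inferred from the numbers, not read from the project source: 9 floats per point (the Corel file name ends in "9d"), 4 bytes of state per point, and 8 bytes per result slot. The index size is copied from the logs rather than derived, and the "fits" check ignores any device memory already in use.

```cuda
#include <cstdio>

// Assumed layout, inferred from the logged sizes (not from the original source).
static unsigned long long rawKb(unsigned long long n) {
    return n * 9ULL * sizeof(float) / 1024ULL;   // 9 floats per point
}
static unsigned long long stateKb(unsigned long long n) {
    return n * 4ULL / 1024ULL;                   // 4 bytes of state per point
}
static unsigned long long outKb(unsigned long long maxResults) {
    return maxResults * 8ULL / 1024ULL;          // 8 bytes per result slot
}

int main() {
    const unsigned long long indexKb = 136;          // DATASIZE_IN_INDEX, copied from the logs
    const unsigned long long kMaxResults = 2097152;  // copied from the logs
    const unsigned long long deviceKb = 261888;      // totalGlobalMem of device 0
    const unsigned long long sizes[] = {65536ULL, 1048576ULL};

    for (unsigned long long n : sizes) {
        unsigned long long inTotal = rawKb(n) + stateKb(n) + indexKb;
        unsigned long long all = inTotal + outKb(kMaxResults);
        std::printf("N=%llu IN_TOTAL=%llukb OUT_TOTAL=%llukb fits=%s\n",
                    n, inTotal, outKb(kMaxResults), all <= deviceKb ? "yes" : "no");
    }
    return 0;
}
```

Under these assumptions the totals come out as 2696kb for N=65536 and 41096kb for N=1048576, matching DATASIZE_IN_TOTAL in the two logs, and both runs stay well below the 261888k reported for the 8500 GT.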