Difference between revisions of "Cuda"
From Ghoulwiki
					
										
					
					| Ghoulsblade (talk | contribs) | Ghoulsblade (talk | contribs)  | ||
| Line 172: | Line 172: | ||
| check : with index on cpu: iNumResults=60051 | check : with index on cpu: iNumResults=60051 | ||
| 357.94 sec : check : with index on cpu | 357.94 sec : check : with index on cpu | ||
| + | |||
| + | Press ENTER to exit... | ||
| + | </pre> | ||
| + | |||
| + | == big data with all index levels  used in device == | ||
| + | <pre> | ||
| + | N=1048576 | ||
| + | SX=32768 | ||
| + | SY=1024 | ||
| + | SZ=32 | ||
| + | I0=32 | ||
| + | DATASIZE_IN_RAW=36864kb | ||
| + | DATASIZE_IN_STATE=4096kb | ||
| + | DATASIZE_IN_INDEX=136kb | ||
| + | DATASIZE_IN_TOTAL=41096kb | ||
| + | DATASIZE_OUT_TOTAL=16384kb | ||
| + | ReadTextData data/Corel_ColorMoments_9d.ascii : 68040 lines of real data, added | ||
| + | 980536 lines of random data | ||
| + | WARNING ! iRealNumLines=68040 does not match the hardcoded N=1048576 | ||
| + | 0.96 sec : reading data from file | ||
| + | assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1) | ||
| + | assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1) | ||
| + | assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1) | ||
| + | assert passed : (sz < 255) | ||
| + | -2.478893,-2.478893,-2.478893,...,4.097229 | ||
| + | 1.86 sec : generating index data | ||
| + | 0.06 sec : allocate and init device mem | ||
| + | 354.73 sec : exec kernel on device | ||
| + | 0.00 sec : receive results from device | ||
| + | atom[0]=60051 atom[1]=0 iNumResults=60051 kMaxResults=2097152 | ||
| + | check : with index on cpu... | ||
| + | check : with index on cpu: iNumResults=60051 | ||
| + | 357.97 sec : check : with index on cpu | ||
| Press ENTER to exit... | Press ENTER to exit... | ||
| </pre> | </pre> | ||
Revision as of 19:48, 29 September 2007
- http://www.litec-computer.de/PC-Komponenten/Grafikkarten/PCI-express/nVidia/Gigabyte-GV-NX85T256H-8500GT-512MB-Dual-DVI-TV-out-passiv::13515.html
- http://www.litec-computer.de/PC-Komponenten/Grafikkarten/PCI-express/nVidia/ASUS-EN8500GT-SILENT-MAGIC-HTD-512MB-DVI-TV-out-passiv::13117.html
- beachten : keine 88 (kann kein atomic), nur 85 oder 86, möglichst viel ram (512mb) , keine karten mit "Nur 128-bit Speicherinterface"
- svn+ssh://ghoulsblade@zwischenwelt.org/var/svn/robertprojarbeit
- http://zwischenwelt.org/svn/robertprojarbeit
- nvidia-cuda-forum http://forums.nvidia.com/index.php?showforum=62 (search for 8600)
- nvidia-cuda-hp http://developer.nvidia.com/object/cuda.html
- FAQ : http://forums.nvidia.com/index.php?showtopic=36286&hl=8600 (many interesting programming tips)
- SIMD : http://en.wikipedia.org/wiki/Vector_processor
- samples http://developer.download.nvidia.com/compute/cuda/sdk/website/samples.html
- cuda 1.0 announcement 26.june : http://forums.nvidia.com/index.php?showtopic=39030&hl=8600
- $(CUDA_BIN_PATH)\nvcc.exe -arch sm_11 -ccbin "$(VCInstallDir)bin" -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/Wp64,/O2,/Zi,/MT -I"$(CUDA_INC_PATH)" -I./ -I../../common/inc -o $(ConfigurationName)\myproj.obj myproj.cu
##### ##### ##### ##### #####device 0
name : GeForce 8500 GT
    261888k totalGlobalMem
        16k sharedMemPerBlock
         8k regsPerBlock
        32 warpSize
       256k memPitch
       512 maxThreadsPerBlock
       512 maxThreadsDim[0]
       512 maxThreadsDim[1]
        64 maxThreadsDim[2]
        63k maxGridSize[0]
        63k maxGridSize[1]
         1 maxGridSize[2]
        64k totalConstMem
         1 major
         1 minor
      1371k clockRate
       256 textureAlignment
##### ##### ##### ##### #####device 1
name : GeForce 8500 GT
    261824k totalGlobalMem
        16k sharedMemPerBlock
         8k regsPerBlock
        32 warpSize
       256k memPitch
       512 maxThreadsPerBlock
       512 maxThreadsDim[0]
       512 maxThreadsDim[1]
        64 maxThreadsDim[2]
        63k maxGridSize[0]
        63k maxGridSize[1]
         1 maxGridSize[2]
        64k totalConstMem
         1 major
         1 minor
      1371k clockRate
       256 textureAlignment
N=65536 SX=2048 SY=64 SZ=2 I0=32 DATASIZE_IN_RAW=2304kb DATASIZE_IN_STATE=256kb DATASIZE_IN_INDEX=136kb DATASIZE_IN_TOTAL=2696kb DATASIZE_OUT_TOTAL=16384kb 0.74 sec : reading data from file assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1) assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1) assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1) assert passed : (sz < 255) -2.478893,-2.459100,-2.459100,...,4.097229 0.09 sec : generating index data 0.02 sec : allocate and init device mem 9.55 sec : exec kernel on device 0.00 sec : receive results from device atom[0]=57100 atom[1]=0 iNumResults=57100 kMaxResults=2097152 check : with index on cpu... check : with index on cpu: iNumResults=57100 17.29 sec : check : with index on cpu Press ENTER to exit...
coarse-strong-ultra-100-96_10000.txt
big file : 750 mb
N=65536 SX=2048 SY=64 SZ=2 I0=32 DATASIZE_IN_RAW=2304kb DATASIZE_IN_STATE=256kb DATASIZE_IN_INDEX=136kb DATASIZE_IN_TOTAL=2696kb DATASIZE_OUT_TOTAL=16384kb line 1:coarse-strong-ultra-000000 000 0.04820 0.05043 0.05505 0.04887 0.05384 0. 05995 0.06215 0.06603 0.06615 0.06679 0.06202 0.06686 0.06654 0.06499 0.06705 0. 05835 0.05513 0.05442 0.05398 0.04658 0.05066 0.05162 0.04627 0.04324 0.04433 0. 04061 0.03673 0.04281 0.04078 0.03822 0.03821 0.03952 0.03648 0.03312 0.03402 0. 03277 0.03029 0.04310 0.04202 0.04168 0.03964 0.04356 0.04193 0.03870 0.04995 0. 04837 0.04460 0.04857 0.04831 0.04714 0.04734 0.05127 0.05319 0.05641 0.05886 0. 05365 0.05752 0.05877 0.05435 0.05526 line 2:0.05103 0.05033 0.04347 0.04750 0.04574 0.03781 0.03926 0.03612 0.03374 0 .03207 0.03060 0.02828 0.02507 0.02721 0.02558 0.02493 0.03374 0.02920 0.03031 0 .02906 0.03283 0.03207 0.03207 0.04170 0.04342 0.04020 0.04298 0.04131 0.04298 0 .04404 0.04810 0.04727 0.04491 0.04764 0.04301 0.04383 0.04317 0.04511 0.04545 0 .04818 0.04908 0.04463 0.04620 0.04699 0.04360 0.04529 0.04346 0.04122 0.04011 0 .03737 0.03293 0.02865 0.02832 0.02520 0.02255 0.02125 0.02035 0.01192 0.01107 0 .00685 0.01059 0.01442 0.01595 0.02073 line 3: 0.01839 0.01905 0.02360 0.02169 0.02749 0.02954 0.03428 0.03788 0.03793 0.04286 0.03990 0.04313 0.04426 0.04382 0.04529 0.04475 0.04587 0.04372 0.04377 0.04492 0.04141 0.04456 0.04552 0.04159 0.04223 0.03591 0.03698 0.03614 0.03736 0.03934 0.03516 0.03583 0.02827 0.02601 0.02256 0.02111 0.02014 0.02016 0.02150 0.01603 0.01632 0.01486 0.01818 0.01843 0.02040 0.02465 0.02337 0.02405 0.02452 0.02435 0.02745 0.02855 0.03582 0.03742 0.04081 0.04198 0.03672 0.03736 0.03749 0.03917 0.04121 0.04257 0.04485 0.0417 line 4:5 0.04121 0.04167 0.03555 0.03806 0.03863 0.03748 0.03451 0.03039 0.02957 0.02493 0.02768 0.03054 0.02767 0.02438 0.02099 0.01713 0.01462 0.01508 0.01577 0.01737 0.01791 0.02091 0.02171 0.02182 0.02522 0.02480 0.02493 0.02913 0.02996 0.03269 0.03296 0.03631 0.03866 0.03451 0.04187 0.03918 0.04450 0.04598 0.04060 0.04482 0.03857 0.04485 0.04499 0.04898 0.05087 0.04374 0.04700 0.04478 0.04084 0.04081 0.04103 0.04135 0.03853 0.03664 0.03382 0.02755 0.02830 0.02619 0.02489 0.02267 0.01911 0.01841 0.01437 0.015 line 5:69 0.01486 0.01429 0.01689 0.02116 0.02161 0.02497 0.02357 0.02480 0.0248 6 0.02717 0.03292 0.03393 0.03558 0.03961 0.04140 0.03921 0.03968 0.03907 0.0418 6 0.04094 0.04488 0.04213 0.04043 0.04132 0.04242 0.04413 0.04329 0.04607 0.0437 9 0.04371 0.04105 0.03754 0.03824 0.03826 0.03761 0.03649 0.03190 0.03074 0.0221 8 0.02409 0.02173 0.01948 0.01822 0.01906 0.02054 0.01712 0.01795 0.01876 0.0171 6 0.02161 0.02460 0.02727 0.02947 0.02950 0.03441 0.03185 0.03891 0.03999 0.0409 8 0.04559 0.04661 0.04359 0.04212 0.04 total lines : 1507200 Drücken Sie eine beliebige Taste . . .
big test 1 (without last index on device)
N=1048576 SX=32768 SY=1024 SZ=32 I0=32 DATASIZE_IN_RAW=36864kb DATASIZE_IN_STATE=4096kb DATASIZE_IN_INDEX=136kb DATASIZE_IN_TOTAL=41096kb DATASIZE_OUT_TOTAL=16384kb ReadTextData data/Corel_ColorMoments_9d.ascii : 68040 lines of real data, added 980536 lines of random data WARNING ! iRealNumLines=68040 does not match the hardcoded N=1048576 1.08 sec : reading data from file assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1) assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1) assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1) assert passed : (sz < 255) -2.478893,-2.478893,-2.478893,...,4.097229 1.86 sec : generating index data 0.06 sec : allocate and init device mem 764.99 sec : exec kernel on device 0.00 sec : receive results from device atom[0]=60051 atom[1]=0 iNumResults=60051 kMaxResults=2097152 check : with index on cpu... check : with index on cpu: iNumResults=60051 357.94 sec : check : with index on cpu Press ENTER to exit...
big data with all index levels used in device
N=1048576 SX=32768 SY=1024 SZ=32 I0=32 DATASIZE_IN_RAW=36864kb DATASIZE_IN_STATE=4096kb DATASIZE_IN_INDEX=136kb DATASIZE_IN_TOTAL=41096kb DATASIZE_OUT_TOTAL=16384kb ReadTextData data/Corel_ColorMoments_9d.ascii : 68040 lines of real data, added 980536 lines of random data WARNING ! iRealNumLines=68040 does not match the hardcoded N=1048576 0.96 sec : reading data from file assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1) assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1) assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1) assert passed : (sz < 255) -2.478893,-2.478893,-2.478893,...,4.097229 1.86 sec : generating index data 0.06 sec : allocate and init device mem 354.73 sec : exec kernel on device 0.00 sec : receive results from device atom[0]=60051 atom[1]=0 iNumResults=60051 kMaxResults=2097152 check : with index on cpu... check : with index on cpu: iNumResults=60051 357.97 sec : check : with index on cpu Press ENTER to exit...

