Difference between revisions of "CudaVsNested"

From Ghoulwiki
 
Line 1: Line 1:
- <pre>
- N=41472
- SX=1728
- SY=72
- SZ=3
- I0=24
- DATASIZE_IN_RAW=1458kb
- DATASIZE_IN_STATE=162kb
- DATASIZE_IN_INDEX=58kb
- DATASIZE_IN_TOTAL=1678kb
+ <pre>N=32768
+ SX=2048
+ SY=128
+ SZ=8
+ I0=16
+ DATASIZE_IN_RAW=1152kb
+ DATASIZE_IN_STATE=128kb
+ DATASIZE_IN_INDEX=18kb
+ DATASIZE_IN_TOTAL=1298kb
  DATASIZE_OUT_TOTAL=65536kb
  assert passed : (N / kThreadBlockSize <= 63*1024 && "grid_size larger than supported (cudaGetDeviceProperties:maxGridSize: 63k currently)")
- ReadTextData data/Corel_ColorMoments_9d.ascii : 41472 lines of real data, added 0 lines of random data
+ ReadTextData data/Corel_ColorMoments_9d.ascii : 32768 lines of real data, added 0 lines of random data
  0.00 sec : reading data from file
Line 19: Line 18:
  assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1)
  assert passed : (sz < 255)
- -2.478893,-2.459100,-2.459100,...,3.864799
+ -2.453896,-2.369326,-2.369326,...,3.864799
- 0.05 sec : generating index data
+ 0.04 sec : generating index data
  0.05 sec : allocate and init device mem
- 2.09 sec : exec kernel on device
+ 1.21 sec : exec kernel on device
  0.00 sec : receive results from device
- atom[0]=31333 atom[1]=0 iNumResults=31333 kMaxResults=8388608
+ atom[0]=21875 atom[1]=0 iNumResults=21875 kMaxResults=8388608
  check : sequential...
- check:iNumResults=31466 maxc=26 for i=21904
+ check:iNumResults=21959 maxc=26 for i=17803
- 38.83 sec : check : sequential
+ 22.53 sec : check : sequential
- gpu/cpu=    0.1  ok  err<=0.5% N=41472 size=1MB IO=24 i3:0,0 tgpu=2.1s tcpu=38.8
+ gpu/cpu=    0.1  ok  err<=0.4% N=32768 size=1MB IO=16 i3:0,0 tgpu=1.2s tcpu=22.5

  Press ENTER to exit...
</pre>
 
</pre>
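
The two parameter blocks in the diff appear to follow the same arithmetic. The Python sketch below reproduces the logged header values under assumptions inferred from the numbers alone, not from the program source: SX, SY and SZ are N divided by I0, I0^2 and I0^3; each record holds nine 4-byte values (suggested by the file name Corel_ColorMoments_9d) plus 4 bytes of state; and each of the three index levels stores I0+1 four-byte entries per row. The function name and dictionary keys are hypothetical.

```python
def derived_params(n, i0):
    """Recompute the logged header values from N and I0 (inferred relations)."""
    # Assumption: one index row has I0+1 entries; the three levels
    # have 1, I0 and I0^2 rows respectively.
    index_entries = (i0 + 1) * (1 + i0 + i0 * i0)
    sizes_kb = {
        "RAW": n * 9 * 4 // 1024,            # assumed: 9 values of 4 bytes per record
        "STATE": n * 4 // 1024,              # assumed: 4 bytes of state per record
        "INDEX": index_entries * 4 // 1024,  # assumed: 4-byte entries, rounded down
    }
    sizes_kb["TOTAL"] = sum(sizes_kb.values())
    return {
        "SX": n // i0,
        "SY": n // i0**2,
        "SZ": n // i0**3,
        **sizes_kb,
    }


if __name__ == "__main__":
    # (N, I0) of the earlier and of the latest revision
    for n, i0 in [(41472, 24), (32768, 16)]:
        print(n, i0, derived_params(n, i0))
```

For (41472, 24) and (32768, 16) the sketch yields the same SX, SY, SZ and DATASIZE_IN_* values as the two logs, which suggests that only N and I0 were changed between the runs.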

Latest revision as of 18:24, 17 October 2007

N=32768
SX=2048
SY=128
SZ=8
I0=16
DATASIZE_IN_RAW=1152kb
DATASIZE_IN_STATE=128kb
DATASIZE_IN_INDEX=18kb
DATASIZE_IN_TOTAL=1298kb
DATASIZE_OUT_TOTAL=65536kb
assert passed : (N / kThreadBlockSize <= 63*1024 && "grid_size larger than supported (cudaGetDeviceProperties:maxGridSize: 63k currently)")
ReadTextData data/Corel_ColorMoments_9d.ascii : 32768 lines of real data, added 0 lines of random data
0.00 sec : reading data from file
assert passed : (INDEXPOS_0(I0) == INDEXSTART_1-1)
assert passed : (INDEXPOS_1(I0-1,I0) == INDEXSTART_2-1)
assert passed : (INDEXPOS_2(I0-1,I0-1,I0) == INDEX_END-1)
assert passed : (sz < 255)
-2.453896,-2.369326,-2.369326,...,3.864799
0.04 sec : generating index data
0.05 sec : allocate and init device mem
1.21 sec : exec kernel on device
0.00 sec : receive results from device
atom[0]=21875 atom[1]=0 iNumResults=21875 kMaxResults=8388608
check : sequential...
check:iNumResults=21959 maxc=26 for i=17803
22.53 sec : check : sequential
gpu/cpu=     0.1  ok  err<=0.4% N=32768 size=1MB IO=16 i3:0,0 tgpu=1.2s tcpu=22.5

Press ENTER to exit...
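
The final summary line can be cross-checked against the timings and result counts printed earlier in the log. This is a minimal Python sketch, assuming that gpu/cpu is the kernel time divided by the sequential check time, and that err is the share of sequential results missing from the GPU result list, rounded up to the next 0.1%. Both readings are inferred from the numbers, not taken from the program, and the function name is hypothetical.

```python
import math


def summarize(t_gpu, t_cpu, n_gpu, n_cpu):
    """Derive the summary figures from timings (seconds) and result counts."""
    missing = (n_cpu - n_gpu) / n_cpu  # fraction of CPU results not found by the GPU
    return {
        "gpu_over_cpu": t_gpu / t_cpu,
        "speedup": t_cpu / t_gpu,
        # Assumption: the log rounds the error up to one decimal place in percent.
        "err_bound_percent": math.ceil(missing * 1000) / 10,
    }


if __name__ == "__main__":
    # Values from the latest revision: kernel time, sequential check time,
    # GPU iNumResults, sequential iNumResults.
    s = summarize(1.21, 22.53, 21875, 21959)
    print(f"gpu/cpu={s['gpu_over_cpu']:.1f} "
          f"speedup={s['speedup']:.1f}x "
          f"err<={s['err_bound_percent']}%")
```

With the values of the latest revision, the ratio is about 0.05 (printed as 0.1 at one decimal place), the speedup is roughly 18.6x, and the error bound is 0.4%, in line with the logged summary.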