csa_decrypt_1block_005.cl
Which of the 7 csa_decrypt_1block_*.cl gives the best kps?
https://www.khronos.org/opencl/
https://www.khronos.org/opencl/community-resources/
https://github.com/KhronosGroup/OpenCL-Guide
https://workupload.com/file/qLPUZMqcWS6
https://github.com/jeremyong/opencl_in_action
-913098240 (signed 32-bit) = 0xC9933A00 = 3381869056 (unsigned 32-bit) = 3381M
4294967296 - 913098240 = 3381869056
41696649805824 / 12329469 / 1000 = 3381.86906555538
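The arithmetic above can be checked with a short Python sketch. It assumes the negative figure is a counter printed as a signed 32-bit integer, which is how the lines above read but is not stated outright. The helper name as_unsigned32 is made up for illustration.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFF

raw = -913098240                  # value as printed in the post
fixed = as_unsigned32(raw)        # same bits, read as unsigned

print(hex(fixed))                 # bit pattern in hex
print(fixed)                      # unsigned value
print(2**32 + raw)                # the post's own way: 4294967296 - 913098240
```

Masking with 0xFFFFFFFF and adding 2^32 give the same result for any negative 32-bit value, so either form works.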
KPH = KPS x 3600. Hours and Days give the time for the full range.

Device               Cores  Clock (MHz)  Device Power               KPS                    KPH     Hours    Days
RTX 4090             16384         2230      36536320  3,381,869,056.00  12,174,728,601,600.00     23.12    0.96
RTX 3090Ti           10752         1395      14999040  1,388,338,761.15   4,998,019,540,132.74     56.32    2.35
RTX 2080Ti            4352         1350       5875200    543,819,330.40   1,957,749,589,452.91    143.77    5.99
GTX 1080Ti            3584         1480       5304320    490,977,626.40   1,767,519,455,052.91    159.25    6.64
GTX 1070Ti            2432         1607       3908224    361,752,409.92   1,302,308,675,702.96    216.14    9.01
GTX 1060              1280         1506       1927680    178,429,610.37     642,346,597,323.77    438.20   18.26
GTX 1050Ti             768         1290        990720     91,702,867.48     330,130,322,927.36    852.62   35.53
AMD Radeon HD 6770M    480          725        348000     32,211,520.80     115,961,474,865.47  2,427.31  101.14
Device Power = cores x clock
KPS = 92.56 x Device Power
KPH = 333222 x Device Power
Hours = 844705468 / Device Power
Days = 35196061 / Device Power
Example for the AMD Radeon HD 6770M:
Device Power = 480 x 725 = 348000
KPS = 92.56 x 348000 = 32210880 keys per second
KPH = 333222 x 348000 = 115961256000 keys per hour
Hours = 844705468 / 348000 = 2427.31 hours
Days = 35196061 / 348000 = 101.14 days
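The formulas and the worked example above can be sketched in Python as follows. The four constants are copied from the post as written (they are rounded, so results differ slightly from the table). The function name estimate and the dictionary keys are made up for illustration. The figures are a linear scaling by cores x clock, not a measurement for each card.

```python
# Constants taken from the post's formulas (rounded by the author).
KPS_PER_POWER = 92.56        # KPS   = 92.56 x Device Power
KPH_PER_POWER = 333222       # KPH   = 333222 x Device Power
HOURS_NUMERATOR = 844705468  # Hours = 844705468 / Device Power
DAYS_NUMERATOR = 35196061    # Days  = 35196061 / Device Power

def estimate(cores, clock_mhz):
    """Estimate speed and full-range time from core count and clock."""
    power = cores * clock_mhz            # Device Power = cores x clock
    return {
        "device_power": power,
        "kps": KPS_PER_POWER * power,    # keys per second
        "kph": KPH_PER_POWER * power,    # keys per hour
        "hours": HOURS_NUMERATOR / power,
        "days": DAYS_NUMERATOR / power,
    }

# Same card as the worked example: AMD Radeon HD 6770M, 480 cores at 725 MHz.
result = estimate(480, 725)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

To estimate another card, pass its core count and clock in MHz, for example estimate(4352, 1350) for the RTX 2080Ti row.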
https://workupload.com/file/jxW8CgwGSYd
#PROGRAM_FILE:"csa_decrypt_1block_000a.cl"
#PROGRAM_FILE:"csa_decrypt_1block_001a.cl"
PROGRAM_FILE:"csa_decrypt_1block_002a.cl"
#PROGRAM_FILE:"csa_decrypt_1block_003a.cl"
#VECTORTEST
VECTORTEST:0 0 43 // If VECTORTEST > VECTORMAX (43), the vector test is ignored. If smaller, the test stops at the Nth compared key
#Main Speed Adjustments
MULTITHREADSIZE:4 1 2 4 # Number of CPU search threads to launch simultaneously
PES1ROUNDSB4PES2:16 # Number of PES1 rounds before testing for PES2 < 16
LOCALWORKDROUPSIZE:0 64 256 64 256 0 64 128 256 # 1) Set to the recommended value, a multiple of 32 or 64; newer fast GPUs can do 256
GLOBALWORKDROUPSIZE:0 1536 6144 3072 1536 1536 6144 1536 3072 1536 128 256 6144 0 1536 3072 6144 # 2) Set to the recommended value: note the number of compute units (CU). The first suggested value is CU x 256, then keep doubling it. In my case 6 * 256 = 1536, then 1536 3072 6144
LOOPSPERKERNEL:256 256 516 1024 2048 4096 # 3) Adjust LOOPSPERKERNEL to get a cadence of about 1 second
PES1:3CEBDC173C2BD64F651688F258D59705
PES2:AD43A480B11CDDBE60AC847768D7A771
PES3:113D7195079EBDB25A66B08092519DE7
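The rule of thumb in the GLOBALWORKDROUPSIZE comment (CU x 256, then doubling) can be sketched in Python like this. The function name and the doublings parameter are made up for illustration, and the number of compute units has to come from your own device info. These are only candidate values to try, not a tuning result.

```python
def global_work_size_candidates(compute_units, local_size=256, doublings=3):
    """List candidate global work sizes: CU x local_size, then doubled each step."""
    base = compute_units * local_size
    return [base * (2 ** step) for step in range(doublings)]

# The config author's case: 6 compute units with a local size of 256.
print(global_work_size_candidates(6))
```

Each candidate is a multiple of the local size, so any of them can be paired with a LOCALWORKDROUPSIZE of 64, 128 or 256 from the config above.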
If GLOBALWORKDROUPSIZE is the same, does higher MULTITHREADSIZE increase speed?
...Now do not go and attempt to do a 100 in MULTITHREADSIZE!! Just as moonbase did on his last attempt...
https://workupload.com/file/cttjNqqQYXW