Encryption Projects as SU group

moonbase

VIP
Donating Member
Messages
553
A test with CudaBISS v12_3_64 shows the key search speed to be 11609 million keys per second.
 

sdrfgs

Registered
Messages
101
I didn't have much time to test, but here is a very brief test of the 2 latest files.
P.S. Can we get a more organized update process?
E.g. cudabiss2024-v1.01.rar,
which would contain the 64 and 128 versions and perhaps a history txt file listing whatever revisions were made.



2/02/24 tested

CPU I7 5820K
Win 10 current fully updated
16 GB ddr 4x4 2133mhz
Gigabyte GTX 980 4GB
nvidia driver 546.12 DCH

input file

000000000000
010000000000
47420097FBD6514CF8A45710DEDC2BA8
47420090ACA55EC7AD6798DA9939A615
4742009C34C7B45CBFF0718AEDBE387B
1
1

cudabiss12_3_64.exe

1 copy: CPU 10-11%, GPU load 95-97%, 990-1004 MB VRAM used, memory controller load 30-33%
840796224 keys/s

cudabiss12_3_128.exe

1 copy: CPU 10-11%, GPU load 85-100%, 1328 MB VRAM used, memory controller load 33%
734254656 keys/s

Previously, using cudabiss51264.exe, I could get around 570 million keys/s! The 128 version appears to be slower than the 64 version on my 980.
 

sdrfgs

Registered
Messages
101
I also found that when running the above range, the runs end with:

Possible hit: 0 ff fe fd 57 d 21 85 nope
Possible hit: 0 ff ff fe 2c 90 61 1d nope
Possible hit: 1 0 0 1 10 97 6b 12 nope
Possible hit: 1 0 3 4 35 dd b 1d nope

which is outside the range in the input?
 

moonbase

VIP
Donating Member
Messages
553
I'm about to test the new versions. Did it resolve your problem with FFFF search ranges?

I did not test it with an FFFF end range. Ranges I tested v12_3_64 with were of the type listed below.

110000000000
120000000000

In practice it runs everything in the 11 range and it seemed to stop OK.
 

sdrfgs

Registered
Messages
101
My question was: why does it go past the set range?
Possible hit: 1 0 0 1 10 97 6b 12 nope
Possible hit: 1 0 3 4 35 dd b 1d nope

Shouldn't it stop at 010000000000?
Shouldn't the displayed output be the full 12 chars?
 

moonbase

VIP
Donating Member
Messages
553
To the best of my knowledge, all previous versions of CudaBISS omit the leading zero from the key pairs when searching.
This is not new: if there is a single character rather than a pair of characters, it is the leading zero that is missing.
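
As an illustration of the display issue only, here is a minimal Python sketch that re-pads each value of a "Possible hit" line to two hex digits. The function name pad_hit_bytes is made up for this example and is not part of CudaBISS; the line format is taken from the output quoted earlier in the thread.

```python
def pad_hit_bytes(line: str) -> str:
    """Return the byte values of a 'Possible hit:' line as two-digit uppercase hex."""
    # Keep only the text after the colon, then drop the trailing "nope" marker.
    fields = line.split(":", 1)[1].split()
    hex_fields = [f for f in fields if f.lower() != "nope"]
    # Parse each field as hex and print it back with a leading zero where needed.
    return " ".join(f"{int(f, 16):02X}" for f in hex_fields)


print(pad_hit_bytes("Possible hit: 1 0 3 4 35 dd b 1d nope"))
# prints: 01 00 03 04 35 DD 0B 1D
```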

The program might be running so fast that it runs over into the first few keys beyond the stop point before the "check and stop" command is implemented.
The overrun is only milliseconds; it does not really add to the duration or change the key search speed.
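
To put a number on the overrun, the sketch below converts a "Possible hit" line into a 12-character key and measures how far it lies past the stop value. It assumes the 8 displayed bytes are the 6 key bytes plus two checksum bytes (bytes 4 and 8, each the sum of the three bytes before it, modulo 256). That layout is an assumption, although the hit lines quoted in this thread are consistent with it. The names hit_to_key and overrun are hypothetical.

```python
def hit_to_key(line: str) -> str:
    """Turn a 'Possible hit:' line into a 12-character hex key (checksum bytes dropped)."""
    fields = line.split(":", 1)[1].split()[:8]
    cw = [int(f, 16) for f in fields]
    # Assumed layout: bytes 4 and 8 are checksums of the three bytes before them.
    if cw[3] != sum(cw[0:3]) % 256 or cw[7] != sum(cw[4:7]) % 256:
        raise ValueError("checksum bytes do not match the assumed layout")
    return "".join(f"{b:02X}" for b in cw[0:3] + cw[4:7])


def overrun(line: str, stop: str) -> int:
    """Keys between the stop value and the hit (zero or negative: not past the stop)."""
    return int(hit_to_key(line), 16) - int(stop, 16)


STOP = "010000000000"
RATE = 840796224  # keys/s reported for the GTX 980 earlier in the thread

for hit in ("Possible hit: 0 ff fe fd 57 d 21 85 nope",
            "Possible hit: 1 0 3 4 35 dd b 1d nope"):
    extra = overrun(hit, STOP)
    print(hit_to_key(hit), extra, f"{max(extra, 0) / RATE * 1000:.1f} ms past stop")
```

Under these assumptions the last hit in the list is roughly 54 million keys past the stop value, which at the reported speed is well under a tenth of a second.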

I tested v12_3_128 on a single instance and it is slightly slower than v12_3_64. The slight speed loss will be magnified once multiple simultaneous instances are run.
I will run some more detailed tests later today for v12_3_128.
 

sdrfgs

Registered
Messages
101
Yes, re dropping leading digits: that has always been the case in previous versions. I just thought I would mention it to the guy doing the new releases. It's just a visual issue that slightly annoys me when the hits scroll by. Perhaps it could be adjusted in a new release?

Yeah, it goes over the range, but like you said, it doesn't affect the real outcome.

If we can ask for any improvement, could the processing time also be shown in minutes and seconds beside the (ms) figure? That would be a bit more useful.
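
Something along these lines is what I mean. This is only a Python sketch of the conversion, not code from the tool, and format_duration is a made-up name.

```python
def format_duration(ms: int) -> str:
    """Show a millisecond count together with hours, minutes and seconds."""
    total_seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{ms} ms ({hours}h {minutes:02d}m {seconds:02d}.{millis:03d}s)"


print(format_duration(18670432))
# prints: 18670432 ms (5h 11m 10.432s)
```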

I can confirm the 128 release is slower for me also.

I ran these on a remote PC over VNC (which itself uses some GPU to refresh the desktop display, so the results are slower than actual "live" performance).

Remote machine: see earlier in the thread (3060 Ti, 16 GB RAM, X299 machine).

64 version 2673800960 keys/s
128 version 2574168320 keys/s

64 version, 2 copies: CPU 33%
copy 1 1677634304
copy 2 1679028736

total 3356663040 keys/s

64 version, 3 copies: CPU 43% total, 14-14.5% usage each on average
copy 1 1150338432
copy 2 1154346112
copy 3 1153731712

total 3458416256 keys/s
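
For anyone comparing multi-instance runs, a small Python sketch like the one below adds up the per-copy speeds and shows the gain over a single instance. The figures are the 64-version numbers listed above; the helper name scaling is made up for this example.

```python
def scaling(single: int, copies: list[int]) -> tuple[int, float]:
    """Return the combined keys/s of all copies and the ratio to one instance."""
    total = sum(copies)
    return total, total / single


SINGLE = 2673800960  # one copy of the 64 version on the remote machine

runs = {
    "2 copies": [1677634304, 1679028736],
    "3 copies": [1150338432, 1154346112, 1153731712],
}

for label, copies in runs.items():
    total, ratio = scaling(SINGLE, copies)
    print(f"{label}: {total} keys/s total, {ratio:.2f}x a single copy")
```

On these numbers the third copy adds only a few percent over two copies.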
 

moonbase

VIP
Donating Member
Messages
553
After a more detailed test of v12_3_128 I find it is slightly slower than v12_3_64.
 

sdrfgs

Registered
Messages
101
on my 980

cudabiss12_3_64.exe

1 copy: CPU 10-11% (via Task Manager); GPU-Z readings: GPU load 95-97%, 990-1004 MB VRAM used, memory controller load 30-33%
copy 1 840796224 keys/s

2 copies: CPU 10-11% per copy (via Task Manager); GPU-Z readings: GPU load 98-99%, 1849 MB VRAM used, memory controller load 38-40%
copy 1 488355264
copy 2 488336640
total 976691904 keys/s

I didn't try 3 copies yet.
 

Lak7

Registered
Messages
36
The 64 and 128 refer to the number of CUDA threads set at compile time. The sweet spot seems to be 64 for most cards, but you don't know until it's tested.
I have the number of CUDA blocks set at 10240; the last "original" version was at 512 (which is why running 8 instances works: it fills the newer units).
Just throwing this out there: this is exactly the same program as before, just compiled with different versions of the CUDA Toolkit (the 12_3 part), and sometimes with changes to the number of CUDA threads and blocks (aside from having the Possible Hits scrolling or not, and letters being in uppercase or lowercase).
For the pre-RTX cards, I would use the original version, whichever runs fastest. Here are the last originals from Nov 2011 ....


"102464" means 1024 threads, 64 blocks, defines the size of the workload.
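
To give a feel for the workload sizes, here is a rough Python sketch that multiplies blocks by threads per block. The 10240-block figure with 64 or 128 threads comes from this post; the 512 x 64 row is only an assumed stand-in for the old build, and a one-dimensional launch is assumed throughout. The 1024 limit is the CUDA maximum number of threads per block on current GPUs.

```python
MAX_THREADS_PER_BLOCK = 1024  # CUDA per-block limit on current GPUs


def threads_per_launch(blocks: int, threads_per_block: int) -> int:
    """Total GPU threads created by one kernel launch (1-D grid assumed)."""
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("threads per block must be between 1 and 1024")
    return blocks * threads_per_block


configs = {
    "v12_3_64 (10240 blocks x 64 threads)": (10240, 64),
    "v12_3_128 (10240 blocks x 128 threads)": (10240, 128),
    "old build, assumed 512 blocks x 64 threads": (512, 64),
}

for name, (blocks, threads) in configs.items():
    print(f"{name}: {threads_per_launch(blocks, threads)} threads per launch")
```

A larger launch gives a big GPU more work to schedule per call, which would fit with the old builds needing several instances to fill newer cards.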
 

sdrfgs

Registered
Messages
101
Thanks for posting cudabiss 2.12, but as mentioned earlier in the thread, it does not work on my 980 GPU.
Your newer compiled cudabiss12_3_64.exe, however, works fantastically!

It's nice to see there is still some interest in using this tool.

I believe 1024 threads per block is the maximum CUDA supports. I ran the NVIDIA Visual Profiler over the original 51264 release long ago. That really showed how inefficient the old release was on newer cards.

Have you run it on your newer release?
 

moonbase

VIP
Donating Member
Messages
553
The latest test with the RTX 4090 gives a key search speed of 14939 million keys per second.
 

moonbase

VIP
Donating Member
Messages
553
Done it, got past 15 billion keys per second with the RTX4090.
Very slight change to the XMP settings in the motherboard BIOS got it over the line.
 

Lak7

Registered
Messages
36
The old version "should" run on any post GTX 2xx series - "should"
That's why testing is important :)
 

Lak7

Registered
Messages
36
Done it, got past 15 billion keys per second with the RTX4090.
Very slight change to the XMP settings in the motherboard BIOS got it over the line.

Key search speed was 15076 million keys per second, ie, 15.076 billion keys per second.
The time for a biss full range key search at this speed is 5.19 hours.

I think there might be more speed to come from the RTX4090 card with a higher spec motherboard and CPU.
It might be possible to get the biss full range key search time to below 5 hours with an RTX4090 card?
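
The arithmetic behind those figures can be sketched in a few lines of Python. It assumes the full range is every 12-hex-digit key, i.e. 2**48 keys, which reproduces the 5.19 hours quoted above. The function names are made up for this example.

```python
KEYSPACE = 2 ** 48  # assumption: every 12-hex-digit key, 000000000000 to FFFFFFFFFFFF


def full_search_hours(keys_per_second: float) -> float:
    """Hours needed to cover the whole key space at a constant speed."""
    return KEYSPACE / keys_per_second / 3600


def required_rate(hours: float) -> float:
    """Keys per second needed to cover the whole key space in the given time."""
    return KEYSPACE / (hours * 3600)


for rate in (11609e6, 14939e6, 15076e6):
    print(f"{rate / 1e6:.0f} million keys/s -> {full_search_hours(rate):.2f} hours")

print(f"needed for 5 hours: {required_rate(5) / 1e9:.2f} billion keys/s")
```

On that assumption, getting below 5 hours needs a little over 15.6 billion keys per second, so roughly 4% more than the 15.076 billion measured.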

A screen grab of the 15.076 billion keys per second is attached below.
View attachment 49749
Nice.
It's very PCI intensive. You can't use riser cards like in mining.
 

moonbase

VIP
Donating Member
Messages
553
I used a PCIe x16 extension cable, same principle as a riser card except it connects the GPU to the motherboard using the full 16 PCIe lanes.
 

moonbase

VIP
Donating Member
Messages
553
From my own personal tests I have found CudaBISS v12_3_64 to be the fastest of all the versions.

Is there any further scope with the coding in CudaBISS to improve on the speed of v12_3_64, possibly to take into account the architecture of the RTX 4000 series of cards?
Or do you think it is now maxed out for speed of key search in its current status?
 

C0der

Registered
Messages
270
There might be room for improvement. But since I don't have the source code, it's hard to tell.
 

sdrfgs

Registered
Messages
101
You can run the NVIDIA Visual Profiler on the exe to see where it hits limits and where it could be optimized.
 