TODO: Test SilentArmy v5 on a wide variety of devices #5
Windows build please
It is failing, I have to fix it up...
If you have a 1070 or 1080, please test it:
To try: I removed all the thread waiting..
The v5 SilentArmy code doesn't seem to be too great when converted to CUDA :( However, it might just be the way it's looping... I might be able to fix it.
Yeah, there is some major issue between CUDA and that code that was posted for OpenCL. CUDA keeps "timing out" when I try to make the same kind of calls... I guess it'll have to be a work in progress for now. I'll post again when I see improvement with my GTX 650, which should then indicate some progress with 1080s, etc.
I see this from the author of SilentArmy, mbevand:
Well, based on that guy's message I reverted to a more direct conversion; maybe it'll work better: https://github.com/maztheman/nheqminer/releases/download/v0.4h/v0.4h_MAXWEL_PLUS.7z It was super slow on a GTX 650, but that looks like it is because of the shared atomics.
Hmm, weird: I get 78 sol/s with my R9 290 with the v5 code in OpenCL. 0 sol/s is probably because some timeout is causing a crash.
I ran both versions. Setup: 6x GTX 1070, Windows 10, latest drivers, etc. (anno vers). 0.4h plain: 241 sol/s at 51% power usage. The previous version got about 250, so it's a little less, but power usage is much more stable.
Hmm, 241 is really not that competitive with the SA Linux version, is it?
Honestly I haven't tried the Linux version yet. The zcminer-dev Windows version gets about 300, so it's not that bad. Do you have a Linux example with actual speeds?
Compiled. For me it crashes after 20 sec (I see a constant increase in GPU memory use until all 4 GB fills up) and then the NVIDIA card is locked in the P5 state. [23:00:05][0x000011dc] Using SSE2: YES
Yes, sorry, there is a memory "leak" that I have already fixed but not yet pushed.
I tried it on a 750 Ti (compute 5.0) and I get this message on Windows 8.1: missing file MSVCP140.dll.
Oh, you need to install the redist file:
@chronosek I've checked in the fixes.
That worked... thx :)
@maztheman thx, it's not crashing now, but I always get 0 sol/s no matter what -cb, -ct I set.
Yeah, some internal issue; I'll have to fully debug it again...
Getting 17-18 sol/s with a 1060 3GB card on 0.4h. I found the best rate for this card is obtained with -cb 128 -ct 32 (autodetect puts it at -cb 63 -ct 64 and only yields 15 to 16 sol/s).
Removing packed atomic counters = bad. I tested it in OpenCL, and it is fastest. I tested packed 64-bit atomics too: 1-2% slower.
tpruvot did a good job porting to CUDA, especially the atomic part (working stable at 58 sol/s, but eqm still does 66 sol/s). I think maztheman's problems were from some hardcoded values or other code that was not in SilentArmy.
I added a test program that will help me debug this CUDA issue. Please post your log files.
I created a new build that should "work" on 1080s and 1070s again. It probably won't break any records though...
I uploaded a 16-thread version.
Thanks for your testing! Not exactly what I was expecting, but it seems each card has its own sweet spot.
t16: Whenever I start the miner it always shows 42 blocks and 64 threads. Is that what the mining software thinks I should be using?
The original miner used that information, but I don't use it. 42 is 7 x your SM count, which I guess is 6. It seems SilentArmy uses a different block dimension so that it can use the thread ID as an index into the hash table.
My results with latest beta: GTX 580:
GTX 550 Ti:
GTX 650:
@drigger Thanks for the testing.
I'm getting around 100 sol/s with a GTX 980 on version l; now with CPU (7 out of 8 threads) it gives around 120 sol/s. On previous versions (i) the best (42 sol/s) was -cs -cb 256 -ct 32 -cd 0. This is a huge improvement! Keep up the good work, Maz :)
Thanks, I'll keep porting any enhancements I find.
@jddebug Kenai/Soldotna area. Thanks for the updates @maztheman, I'll post updates for all my stuff tomorrow.
Can you please send me a private generation absenteeism?
@ceozero Sorry, I don't understand what you mean...
@maztheman I need a new miner. You need to modify the code and recompile. Can you do it?
Yeah, I can compile anything for Windows.
May I know your email address? Or do you want to send a mail to me: [email protected]
I sent you an email.
Getting 75-85 sol/s off my GTX 1060 3GB with 0.4l; for some reason that number doesn't improve much when I have all 4 CPUs running too. I was getting better results using your miner for CPU and SilentArmy for GPU, at a combined 100 sol/s. Default detection for the 1060 is 63 blocks and 64 threads; I cap out at about 75 with this setting. 128 blocks and 32 threads seems to give me the best result, capping out at about 85 sol/s. Is there a calculator out there that will tell you the best block and thread count for any given model and RAM size of CUDA video card?
I can't run it; I'm still working on that.
@dtawom, SilentArmy does not include some of the changes I added, so it may be more efficient for that specific card, unless you are using the https://github.com/zawawawa/silentarmy version. I ported those changes and some other NVIDIA-specific parameters and such. If I had all the NVIDIA cards, I would be testing and testing before I created builds; I just have to trust that the changes I'm porting "work", and work well, on all NVIDIA 10xx-series cards. There is a "version 6" of SilentArmy coming VERY soon. I will be porting those changes ASAP. Stay tuned...
@maztheman Yeah, the zawawawa one is the one I was using. The difference is so negligible (5-10 sol/s per machine) that it is easier just to use your latest build only. The new NVIDIA build you made for older cards is getting me about 40 sol/s on my GTX 580, about 15 more sol/s than I was getting before. Thanks so much for the updates; they've got me hitting peaks of 650 sol/s now across all my rigs. Looking forward to SilentArmy v6; I hope the improvements are as significant as they were from 4 to 5.
I'm just glad the work I'm doing is useful to someone :-)
@maztheman When is the update to make it faster?
A few more days.
I dunno if this guy is running off SilentArmy v6 or something else, but these numbers are so sexy. Gonna give them a try on my 1060 rigs. https://forum.z.cash/t/ewbfs-nvidia-cuda-zcash-miner-1060-170-h-s-gtx-1070-250-h-s/12523
Oh, it's closed source, so I can't see what he is doing. I can probably profile the exe and see what I can see.
Confirmed 175 sol/s on a 1060 with no CPU. You should do what this guy does and take a couple percent for yourself. I'd rather support you than him.
Holy moly, getting triple the speed off my R9 Furys now with Claymore: 325 H/s per card with https://github.com/nanopool/ClaymoreZECMiner
:) Wow, looks like Claymore is on top again!
Looks like an update will happen pretty soon...
New build, new thread.
Discuss your results here