NxtMint Java minter

Author Topic: NxtMint Java minter  (Read 50874 times)

z38630610

  • Newbie
  • *
  • Karma: +0/-0
  • Offline Offline
  • Posts: 5
    • View Profile
Re: NxtMint Java minter
« Reply #120 on: January 21, 2015, 11:27:22 am »

How do I set it up for 2 GPU cards?
-d 0,1
Logged

lopalcar

  • Hero Member
  • *****
  • Karma: +99/-15
  • Offline Offline
  • Posts: 561
    • View Profile
Re: NxtMint Java minter
« Reply #121 on: January 21, 2015, 11:41:37 am »

how  to set up for 2 gpu cards
Using the latest NxtMint-1.1.0.jar hosted in google drive, edit NxtMint.conf adding:
gpuDevice=X

where X is a number between 0 and (NumberOfGPUs-1).
Add that line once for each device you want to run, giving the X for each GPU.
Logged

lopalcar

  • Hero Member
  • *****
  • Karma: +99/-15
  • Offline Offline
  • Posts: 561
    • View Profile
Re: NxtMint Java minter
« Reply #122 on: January 21, 2015, 11:44:40 am »


I have 16 compute units, as follows.  BTW, it seems that the latest version has poor performance when minting EGOLD compared with the publicly released 1.1.0 version.

Code: [Select]
05:17:34 INFO Main.buildGpuList: GPU device 0: NVIDIA CUDA, 3072MB global memory, 48KB local memory, 16 compute units
I also have poor performance: on a 6990, each GPU now gives me 45 MHash, where before it gave me 100 MHash, so I don't gain anything by using the whole card.
Each GPU has 1536 cores and 24 compute units.
Logged

crimi

  • Hero Member
  • *****
  • Karma: +122/-11
  • Offline Offline
  • Posts: 863
    • View Profile
Re: NxtMint Java minter
« Reply #123 on: January 21, 2015, 02:55:03 pm »

Got 2 cards running now works fine so far.


So if minting is like a big sudoku where you look for a solution, would it make sense to just stop the puzzle and start a new one if you think it is going to take too long?
Sometimes I need 1 million hashes, sometimes only 30k hashes at the moment, to find 200 units of EGOLD.
« Last Edit: January 21, 2015, 06:20:44 pm by crimi »
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #124 on: January 21, 2015, 03:09:59 pm »

I have 16 compute units, as follows.  BTW, it seems that the latest version has poor performance when minting EGOLD compared with the publicly released 1.1.0 version.

Code: [Select]
05:17:34 INFO Main.buildGpuList: GPU device 0: NVIDIA CUDA, 3072MB global memory, 48KB local memory, 16 compute units
Hmm - I didn't change the Keccak25 or Sha256 code.  I'll rerun some Keccak25 and Sha256 minting and see how it looks.

16 compute units is outstanding (wish I had one :)).  So even the old code which used 4 compute units wasn't stressing your card.  The change to configure dynamically should take care of that problem.  However, you will probably have to increase the Java heap since each instance requires 128KB of global memory (which is mapped to Java memory).  Each compute unit would run 256 instances, so 16*256*128KB=512MB.
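The arithmetic above can be sketched as a small helper. This is only an illustration of the estimate, not code taken from NxtMint: the class and method names are made up, and the 128KB and 256 figures are simply the ones quoted in this post.

```java
// Hypothetical helper: estimates the memory Scrypt minting would need,
// using the per-instance and per-compute-unit figures quoted in the post.
final class ScryptMemoryEstimate {
    // 128KB of global memory per kernel instance (figure from the post)
    static final int KB_PER_INSTANCE = 128;
    // 256 kernel instances per compute unit (figure from the post)
    static final int DEFAULT_INSTANCES_PER_UNIT = 256;

    /** Returns the estimated requirement in megabytes. */
    static long requiredMegabytes(int computeUnits, int instancesPerUnit) {
        long totalKb = (long) computeUnits * instancesPerUnit * KB_PER_INSTANCE;
        return totalKb / 1024;
    }

    public static void main(String[] args) {
        // 16 compute units * 256 instances * 128KB
        System.out.println(requiredMegabytes(16, DEFAULT_INSTANCES_PER_UNIT) + " MB");
    }
}
```

If the estimate is close to or above the current Java heap limit, the -Xmx value on the Java command line would need to be raised accordingly.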
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #125 on: January 21, 2015, 03:11:40 pm »

how  to set up for 2 gpu cards
Add a gpuDevice parameter to NxtMint.conf.  The default is gpuDevice=0 if it isn't specified (that is, use the first GPU found)

For example, to use the first two GPU devices, add:
  gpuDevice=0
  gpuDevice=1
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #126 on: January 21, 2015, 03:18:11 pm »

I also have poor performance, in a 6990, each GPU now gives me 45mhash and before it was giving me 100mhash, so I don't gain anything using the whole card
Each GPU has 1536 cores and 24 compute units
OK - I'm guessing it isn't using all cores in a compute unit.  That could be a result of the changes for Scrypt.  I'll look into it.

24 compute units will be even worse than 16 as far as Java memory is concerned.  I wish I could find out how many cores are available but Aparapi doesn't return that information.  So I have to assume it is 256 per compute unit but in your case it would be 64 (1536/24).  I guess I can add it as a configuration parameter and assume 256 if it is not specified.  That should improve performance further and reduce memory requirements (both in Java and on the graphics card).
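A rough sketch of the fallback described above, assuming the core count arrives as an optional configuration value. The names are hypothetical and this is not the actual NxtMint code; it only shows the "divide cores by compute units, else assume 256" rule.

```java
// Hypothetical helper: picks the number of kernel instances per compute unit.
final class InstancesPerUnit {
    // Value assumed when the core count is not configured (from the post)
    static final int DEFAULT_INSTANCES = 256;

    /**
     * @param totalCores   configured core count for the card, or 0 if not specified
     * @param computeUnits number of compute units reported by the device
     */
    static int instancesPerComputeUnit(int totalCores, int computeUnits) {
        if (computeUnits <= 0) {
            throw new IllegalArgumentException("computeUnits must be positive");
        }
        if (totalCores <= 0) {
            return DEFAULT_INSTANCES;   // nothing configured: fall back to 256
        }
        // Integer division, but never fewer than one instance per unit
        return Math.max(1, totalCores / computeUnits);
    }

    public static void main(String[] args) {
        // The 6990 figures from this thread: 1536 cores, 24 compute units
        System.out.println(instancesPerComputeUnit(1536, 24));
    }
}
```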

Logged

lopalcar

  • Hero Member
  • *****
  • Karma: +99/-15
  • Offline Offline
  • Posts: 561
    • View Profile
Re: NxtMint Java minter
« Reply #127 on: January 21, 2015, 03:45:53 pm »

I also have poor performance, in a 6990, each GPU now gives me 45mhash and before it was giving me 100mhash, so I don't gain anything using the whole card
Each GPU has 1536 cores and 24 compute units
OK - I'm guessing it isn't using all cores in a compute unit.  That could be a result of the changes for Scrypt.  I'll look into it.

24 compute units will be even worse than 16 as far as Java memory is concerned.  I wish I could find out how many cores are available but Aparapi doesn't return that information.  So I have to assume it is 256 per compute unit but in your case it would be 64 (1536/24).  I guess I can add it as a configuration parameter and assume 256 if it is not specified.  That should improve performance further and reduce memory requirements (both in Java and on the graphics card).
Ok thank you :)
I don't know what memory you are referring to... I ran the jar file with -Xmx1024m, and neither the RAM nor the GPU memory is stressed... I also know that the GPU is at 99% usage, but it makes less noise than before, so I suppose it isn't using all the available cores... "I don't really have any idea how the GPU usage is measured then :S "
The GPU memory load is 346MB, the same as at idle, and the jar file uses 184MB of RAM.
Hope this info helps you somehow.
Ah, I'm also mining EGOLD, so I don't understand how the Scrypt changes have any influence here.
Logged

Jimmy2011

  • Sr. Member
  • ****
  • Karma: +24/-19
  • Offline Offline
  • Posts: 329
    • View Profile
Re: NxtMint Java minter
« Reply #128 on: January 21, 2015, 04:05:23 pm »


I think GTX 580 has 16 compute units with total 512 cores, 32 cores for each unit, right?
Logged
NXT-LX5G-L63N-ST8S-9LVZY

Jimmy2011

  • Sr. Member
  • ****
  • Karma: +24/-19
  • Offline Offline
  • Posts: 329
    • View Profile
Re: NxtMint Java minter
« Reply #129 on: January 21, 2015, 04:06:15 pm »

I guess I can add it as a configuration parameter and assume 256 if it is not specified.  That should improve performance further and reduce memory requirements (both in Java and on the graphics card).

It will be great.
Logged
NXT-LX5G-L63N-ST8S-9LVZY

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #130 on: January 21, 2015, 05:07:30 pm »

Ok thank you :)
I don't know what memory you are referring to... I ran the jar file with -Xmx1024m, and neither the RAM nor the GPU memory is stressed... I also know that the GPU is at 99% usage, but it makes less noise than before, so I suppose it isn't using all the available cores... "I don't really have any idea how the GPU usage is measured then :S "
The GPU memory load is 346MB, the same as at idle, and the jar file uses 184MB of RAM.
Hope this info helps you somehow.
Ah, I'm also mining EGOLD, so I don't understand how the Scrypt changes have any influence here.
I was referring to the local memory on the card.  Each compute unit has its own high-speed local memory.  But you have plenty, so that isn't a problem.

I changed the mint worker to calculate the workgroup range before executing the kernel.  This is done for all algorithms, so it would affect Keccak25 and Sha256.  I think the range call is causing the Aparapi kernel code to reset cached data.  I'll change this to calculate the range during initialization and save the result.  That should get us back to where we were before the Scrypt changes.
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #131 on: January 21, 2015, 05:12:00 pm »


I think GTX 580 has 16 compute units with total 512 cores, 32 cores for each unit, right?
The spec sheet says 512 but you should check the device information for your card since card manufacturers can improve the base specs from NVIDIA (overclocking, extra compute units, etc).  For example, the spec sheet for the GT 720 has 192 cores.  But the card manufacturer placed two compute units on the card, giving me 384 cores and 2 compute units.

On Windows, the NVIDIA Control Panel will show the number of cores in the System Information section.
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #132 on: January 21, 2015, 07:54:37 pm »

OK - new test version on Google Drive.  Hopefully, Keccak25 and Sha256 hash rates will be back to where they were.

The gpuDevice configuration parameter now has 2 fields - the first one is the device number (0, 1, etc) and the second one is the total number of processing cores on the graphics card.  For example, if GPU 0 has 384 cores, you would specify
  gpuDevice=0,384

For Scrypt, the memory allocation is approximately 128KB per processing core.  So a system with 512 cores will require 64MB of Java heap and 64MB of adapter memory.  If you run out of Java heap memory, increase the value for the -Xmx parameter on the Java command line.
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #133 on: January 21, 2015, 08:56:37 pm »

I did some testing and performance is actually better if you don't specify the number of kernel instances per compute unit.  I think this happens because OpenCL will execute all of the requested kernel instances before returning control to Aparapi, which is faster than having Aparapi initiate additional executions.

So, just specify 'gpuDevice=n' and NxtMint will use all of the compute units on the card with 256 kernel instances per compute unit.  If this causes a resource shortage on the card, then you can reduce the number of kernel instances per compute unit by specifying the number of processor cores as 'gpuDevice=n,cores'.  The kernel instances per compute unit will then be calculated as (the number of processing cores) / (number of compute units).

Note that SCRYPT requires 128KB per kernel instance.  The total number of instances is (instances per compute unit) x (number of compute units).  So 24 compute units with 256 instances per compute unit would require 24 * 256 * 128KB or 768MB, in both the Java heap and adapter memory.
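For illustration, the two accepted forms ('gpuDevice=n' and 'gpuDevice=n,cores') could be read along these lines. This is a sketch only: the class below is hypothetical and is not how NxtMint itself parses its configuration file.

```java
// Hypothetical parser for the value part of a gpuDevice line,
// e.g. "0" or "0,384" (device number, optional total core count).
final class GpuDeviceSetting {
    final int deviceNumber;
    final int cores;   // 0 means "not specified"

    private GpuDeviceSetting(int deviceNumber, int cores) {
        this.deviceNumber = deviceNumber;
        this.cores = cores;
    }

    static GpuDeviceSetting parse(String value) {
        // Split on the comma, tolerating spaces around it
        String[] fields = value.trim().split("\\s*,\\s*");
        if (fields.length > 2 || fields[0].isEmpty()) {
            throw new IllegalArgumentException("Expected 'n' or 'n,cores': " + value);
        }
        int device = Integer.parseInt(fields[0]);
        int cores = (fields.length == 2) ? Integer.parseInt(fields[1]) : 0;
        if (device < 0 || cores < 0) {
            throw new IllegalArgumentException("Negative values are not allowed: " + value);
        }
        return new GpuDeviceSetting(device, cores);
    }

    public static void main(String[] args) {
        GpuDeviceSetting s = parse("0,384");
        System.out.println("device " + s.deviceNumber + ", cores " + s.cores);
    }
}
```

With only the device number given, `cores` stays 0 and the caller would fall back to 256 kernel instances per compute unit, as described above.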
Logged

Jimmy2011

  • Sr. Member
  • ****
  • Karma: +24/-19
  • Offline Offline
  • Posts: 329
    • View Profile
Re: NxtMint Java minter
« Reply #134 on: January 22, 2015, 02:44:45 am »


The latest version is so great. 10X speed-up compared with the last version!

I confirmed my GPU has 512 cores.

The following results show that the number of instances affects the performance a lot. The hash rate is very low if each core runs only 1 instance; however, there is a ~8X speed-up if each core runs the default 8 instances (32 cores per unit, a default total of 32*8=256 instances per unit, that is 16*32*8=4096 instances in total). On the other hand, "gpuDevice = 0, 8192" was not supported when I tried to set 16 instances for each core. BTW, what does "gpuIntensity" mean if the gpuDevice parameter is used?


Code: [Select]
gpuDevice=0,512
HR = 0.0025MH/s

Code: [Select]
#gpuDevice=0,512
HR = 0.019MH/s
Logged
NXT-LX5G-L63N-ST8S-9LVZY

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #135 on: January 22, 2015, 04:02:10 am »


The latest version is so great. 10X speed-up compared with the last version!

I confirmed my GPU has 512 cores.

The following results show that the number of instances affects the performance a lot. The hash rate is very low if each core runs only 1 instance; however, there is a ~8X speed-up if each core runs the default 8 instances (32 cores per unit, a default total of 32*8=256 instances per unit, that is 16*32*8=4096 instances in total). On the other hand, "gpuDevice = 0, 8192" was not supported when I tried to set 16 instances for each core. BTW, what does "gpuIntensity" mean if the gpuDevice parameter is used?


Code: [Select]
gpuDevice=0,512
HR = 0.0025MH/s

Code: [Select]
#gpuDevice=0,512
HR = 0.019MH/s
The device reports maximum sizes to Aparapi, which verifies the requested execution range is valid.  I'll add the maximum item size and maximum group size to the GPU information that NxtMint displays when it starts up.

gpuIntensity determines the number of kernel executions before control is returned to Java.  As you increase the intensity, the card is busier but display performance will begin to suffer since display updates get queued behind the minting kernel executions.

I think we both arrived at the same conclusion on number of work items per work group.  My performance increased if I had 8 or 16 work items per work group.  The performance decreased if the number went below 4 or above 16.  I think the problem results from memory cache contention.  Scrypt maintains 128 bytes of state information for each instance and this data is read/written continuously during the hashing.

Try the NxtMint-1.1.0.jar that I just placed on Google Drive.  It sets the local size (number of work items) to 8 and sets the global size (number of work groups per execution) to 256 x (number of compute units).  See if that gives you better performance.
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #136 on: January 22, 2015, 04:04:47 am »

Here is the current hash rate on my GT 720 (384 cores, 2 compute units).  This is minting ELEMS and worker 4 is the GPU worker.
Code: [Select]
10:59:35 FINE MintWorker.run: Worker 4: 21.65 MHash, 0.0104 MHash/s
10:59:56 FINE MintWorker.run: Worker 1: 5.40 MHash, 0.0026 MHash/s
10:59:56 FINE MintWorker.run: Worker 2: 5.39 MHash, 0.0026 MHash/s
10:59:56 FINE MintWorker.run: Worker 3: 5.32 MHash, 0.0025 MHash/s
10:59:56 FINE MintWorker.run: Worker 0: 5.39 MHash, 0.0026 MHash/s
11:00:04 INFO MintWorker.run: Worker 4 found solution for counter 26
11:00:04 FINE Mint.mint: Solution for counter 26 added to pending queue
11:00:04 FINE MintWorker.run: Worker 1 abandoning counter 26
11:00:04 FINE MintWorker.run: Worker 2 abandoning counter 26
11:00:04 FINE MintWorker.run: Worker 3 abandoning counter 26
11:00:04 FINE MintWorker.run: Worker 2 starting on counter 27
11:00:04 FINE MintWorker.run: Worker 3 starting on counter 27
11:00:04 FINE MintWorker.run: Worker 4 starting on counter 27
11:00:04 FINE MintWorker.run: Worker 0 abandoning counter 26
11:00:04 FINE MintWorker.run: Worker 1 starting on counter 27
11:00:04 FINE MintWorker.run: Worker 0 starting on counter 27
11:00:04 INFO Mint.mint: Solution for counter 26 submitted
Logged

Jimmy2011

  • Sr. Member
  • ****
  • Karma: +24/-19
  • Offline Offline
  • Posts: 329
    • View Profile
Re: NxtMint Java minter
« Reply #137 on: January 22, 2015, 04:14:52 am »


Try the NxtMint-1.1.0.jar that I just placed on Google Drive. It sets the local size (number of work items) to 8 and sets the global size (number of work groups per execution) to 256 x (number of compute units).  See if that gives you better performance.

Is the above the default setting?

How do I set the number of work groups per execution? It seems that the GPU here can support a maximum of 1024 work groups.

Code: [Select]
12:26:38 INFO Main.lambda$null$2: GPU device 0: NVIDIA CUDA
  3072MB global memory, 48KB local memory, 16 compute units, Max work group size 1024

« Last Edit: January 22, 2015, 04:33:36 am by Jimmy2011 »
Logged
NXT-LX5G-L63N-ST8S-9LVZY

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #138 on: January 22, 2015, 04:33:38 am »

Is the above default setting? What is "number of work groups per execution" ?
Scrypt now hardcodes the work group size to 8.  Specifying the number of cores will not change this value (it will still affect the other algorithms).

The way Aparapi works is as follows:
  • Data is transferred from host memory to adapter memory
  • The OpenCL program is executed and Aparapi waits for it to complete
  • Data is transferred from adapter memory to host memory
OpenCL handles scheduling and dispatching work groups.  Each work group contains one or more work items (instances of the kernel program).  OpenCL terminates when all of the work groups have been processed.  The gpuIntensity value determines the number of work groups to be scheduled.  So, the higher the intensity, the longer OpenCL runs before completing.

Multiple work groups can be dispatched concurrently, so Scrypt should still keep the adapter busy as long as the work group size divides evenly into the number of cores on the compute unit.  I chose 8 instead of 16 to handle cases like 48 cores.
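The divisibility point can be checked with a few lines of Java. This is just a sketch of the reasoning, with made-up names; the core counts in `main` are examples, not a list of supported cards.

```java
// Hypothetical check: does a work group size divide evenly into the
// number of cores on one compute unit?
final class WorkGroupFit {
    static boolean dividesEvenly(int coresPerComputeUnit, int workGroupSize) {
        return workGroupSize > 0 && coresPerComputeUnit % workGroupSize == 0;
    }

    public static void main(String[] args) {
        int[] coresPerUnit = {32, 48, 64, 192};   // example values only
        for (int cores : coresPerUnit) {
            System.out.println(cores + " cores: size 8 fits=" + dividesEvenly(cores, 8)
                    + ", size 16 fits=" + dividesEvenly(cores, 16));
        }
    }
}
```

For a compute unit with 48 cores, a work group size of 8 divides evenly (6 groups) while 16 does not, which is the case mentioned above.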
Logged

ScripterRon

  • Hero Member
  • *****
  • Karma: +75/-2
  • Offline Offline
  • Posts: 523
    • View Profile
Re: NxtMint Java minter
« Reply #139 on: January 22, 2015, 04:40:08 am »

How to set the number of work groups per execution? It seems that the GPU here can support max 1024 work groups.
Code: [Select]
12:26:38 INFO Main.lambda$null$2: GPU device 0: NVIDIA CUDA
  3072MB global memory, 48KB local memory, 16 compute units, Max work group size 1024
No, it means a work group can have a maximum of 1024 work items.  I don't think there is a limit on the number of work groups other than limits imposed by the operating system or device driver (I know the NVIDIA device driver will fail and restart if the OpenCL program runs too long -- had that happen many a time when the code I wrote went into a loop or hung on recursive calls -- at least things are much better than 10 years ago when you would get a blue screen of death).
Logged