Arc Plot GPU plotting 0.6 release: please help test and send feedback

After implementing a new kernel in the CPU code (v0.5), I was finally able to port it to CUDA, which accelerates phase 1 on the GPU. Please check the details on GitHub and help test. All feedback is welcome, especially bug and benchmark reports.

How fast is an RTX 4090 for a plot? :grin:

Edit 2: Sorry, it doesn’t work for me: illegal instruction, core dumped.
Edit: Never mind, Ubuntu was still updating; I thought it had finished.

What versions of Ubuntu and CUDA are supported? I am trying it on an old PC, with this result:

./arc_plot: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by ./arc_plot)
./arc_plot: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by ./arc_plot)
./arc_plot: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by ./arc_plot)
./arc_plot: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by ./arc_plot)
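
For anyone hitting the same loader errors, here is a rough sketch of how to compare what the binary asks for against what the system has. It only parses messages like the ones above and uses Python's standard `platform.libc_ver()`; the helper names (`highest_required`, `is_new_enough`, `local_glibc`) are made up for illustration and are not part of Arc Plot.

```python
import platform
import re


def parse_version(text):
    """Turn '2.34' into (2, 34) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


def highest_required(error_output, symbol="GLIBC"):
    """Return the highest '<symbol>_x.y' version named in loader error text, or None."""
    pattern = rf"\b{re.escape(symbol)}_(\d+(?:\.\d+)+)"
    versions = re.findall(pattern, error_output)
    return max(versions, key=parse_version) if versions else None


def is_new_enough(installed, required):
    """True if the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


def local_glibc():
    """Return the local glibc version string, or None if it cannot be determined."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None
```

Paste the error text into `highest_required(...)` and compare the result with `local_glibc()`. If `is_new_enough` comes back False, the OS release is older than the one the binary was built on.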

GitHub says it’s Ubuntu 22.04.
You seem to be running on a plain old CPU without AVX2 support :smile:

Could you run "cat /proc/cpuinfo" and post the result? Maybe your CPU is missing an instruction set somewhere.
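
If the full cpuinfo dump is too long to post, a small sketch like this pulls out just the relevant bits. It assumes the x86 Linux layout where feature flags sit on a line starting with "flags"; the function names are only for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. a non-x86 layout)


def simd_report(cpuinfo_text):
    """Report whether AVX and AVX2 appear among the CPU feature flags."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2")}


# Usage on a Linux box:
#     with open("/proc/cpuinfo") as f:
#         print(simd_report(f.read()))
```

A result with "avx" True and "avx2" False would match the illegal instruction crash reported above.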

Muc is right: I don’t have AVX2, only AVX is listed.

Phase 1, table 1 uses ChaCha8 to create the 32G initial table. Computing that on the GPU and then copying it back to CPU RAM would cost more time than just running it in CPU code, unless you have 32G of video RAM. So AVX2 helps this part run faster, but an AVX-only CPU can still run it. I’ll patch AP in the next release.

I wouldn’t worry about supporting ancient hardware, just add it to minimum spec.

You can try the 0.6.1 release; it should be able to run on your old server.

Is anyone using Arc Plot? Please post some results along with your config. Thanks.

Usage is almost the same as MM. I tested CPU plotting only: on a 5900X it took 19 min, some 15% faster than MM.

The tar file contains only one file. How do I use that on Ubuntu?!

Exciting to see this.

It would be helpful if you could provide a few examples of how to run it. I’m testing on a couple of different plotters, and it took me a minute to notice that the GPU was off by default.

Phase 1 plot time on an 11900K with a 110GB local RAM disk went from 700 seconds to 200 seconds once I enabled the GPU (RTX 3080).

I see there is a -R option and am wondering how that might differ from using a disk for the temp directories.

Great to see innovations like this.
Couple things. Call these observations or feedback.

First, fix spelling mistake on GitHub. Above the fold spelling mistake makes the project look bush.

Requirements don’t make sense. A 4-core CPU, great! But 128GB RAM? Do tell where you will find old boards with greater than 32GB max capacity. 20 or 30 Series Nvidia GPU? Too bad you can’t make use of older GPUs or AMD.

I’m confused about the requirements. If this is meant for consumers, do you not see 128GB RAM as a barrier? Who wants to invest that much into this? It says to me you aren’t using it unless you’re running a server board.

So to me the requirements don’t match the claim of “focus on consumer hardware”. If you showed this running on Z97 boards or newer with 16GB to 32GB RAM and say, older GPUs that meet the 8GB VRAM requirements, then maybe. If all your data is based on 128GB RAM? I guess I’ll wait this out for some progression.

Not trying to dump on it and I’m sure it’s a project done in your spare time and not as some money making venture at all. GPU plotting is in its infancy and I should appreciate that fact. I hope at some point it’s a serious consideration. I just think the goal with anything related to Chia should be about how to reuse old computer gear that needs some “use case” scenario. Maybe a bit slow, slightly less power efficient, but still fully usable for the sake of a crypto POW blockchain. We shouldn’t be plucking new anything off store shelves for the sake of Chia farming.

If GPUs can simplify the plotting? It will be very interesting. Might taint some of the green aspect of Chia though.

-R is for plotting with 110G of RAM without setting up a -2 ramdisk. All data goes through RAM directly, bypassing the file system API, so it will be a little more efficient than a ramdisk, but not by much. It also saves you from creating a ramdisk.

Thank you for such great feedback.
On the hardware requirements:
a.) A 4-core CPU for GPU plotting, because any GPU can max out its performance with 4 threads. When 1.0 is released, all of the load can shift onto the GPU, so a 4-6 core CPU per GPU should be able to fully utilize one GPU’s raw power.
b.) The 128G of RAM has a slightly more complicated answer. As we probably all know, plotting processes about 1.3T of data in total. When phase 1 plotting time drops from e.g. 600 sec to 200 sec, the IO bandwidth needed is 1.3T / 200 sec = 6.5G/sec. With only 32G of RAM plus an SSD, a lot of that bandwidth pressure gets dumped onto the SSD, and no consumer-level SSD can provide such sustained IO, even with 2 SSDs in RAID 0.
c.) An 8G GPU, RTX 20 series or newer. This is because 10 series Nvidia cards (CUDA compute capability 6.x) have no atomicCAS(uint16_t) support, and going with atomicCAS(uint32_t) would blow the shared memory (which is 48K). The only solution is to create a new algorithm to make it happen. (I’ll try to solve that after the 1.0 full GPU function release.)
d.) No AMD or Intel GPU support yet. That’s on the todo list; I have to build the functions one by one.
e.) Please forgive me, it is a bush project so far :grin:. But I’ll try to make it better and better with such limited resources.
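
To make the arithmetic in (b) easy to replay with other numbers, here is a tiny sketch. It takes the 1.3T total and the 600/200 sec timings from the post as given and assumes 1T = 1000G; the function name is just for illustration.

```python
def required_bandwidth_gb_s(total_data_tb, seconds):
    """Average IO bandwidth in G/sec needed to move total_data_tb in the given time.

    Assumes 1T = 1000G, which matches the 1.3T / 200 sec = 6.5G/sec figure above.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return total_data_tb * 1000 / seconds


# Figures from the post: about 1.3T of data, 600 sec before vs 200 sec after.
slow = required_bandwidth_gb_s(1.3, 600)  # roughly 2.17 G/sec
fast = required_bandwidth_gb_s(1.3, 200)  # 6.5 G/sec
```

Cutting the time to a third triples the sustained bandwidth the storage has to deliver, which is the pressure described in (b).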

Get help:
./arc_plot --help

Run:
CPU, 32G + SSD:
./arc_plot -r cpu_core_num -n num_of_plots -u 256 -v 256 -t your_ssd_path -d your_destination_path -c your_contract -f your_farm_key

CPU, 110G:
./arc_plot -R cpu_core_num -n num_of_plots -u 256 -v 256 -t your_ssd_path -d your_destination_path -c your_contract -f your_farm_key

CPU + GPU, 110G:
./arc_plot -G -r cpu_core_num -n num_of_plots -u 256 -v 256 -t your_ssd_path -d your_destination_path -c your_contract -f your_farm_key
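
If you script your plotting runs, the three variants above can be assembled with something like the sketch below. It uses only the flags shown in this post, exactly as posted; the mode names ("cpu_ssd", "cpu_ram", "gpu_ram"), the function name, and the example paths and keys are hypothetical. Check ./arc_plot --help before relying on it.

```python
import shlex


def build_command(mode, threads, plots, tmp_dir, dest_dir, contract, farm_key):
    """Build an arc_plot argument list for one of the three variants posted above."""
    mode_flags = {
        "cpu_ssd": ["-r", str(threads)],        # CPU, 32G + SSD
        "cpu_ram": ["-R", str(threads)],        # CPU, 110G
        "gpu_ram": ["-G", "-r", str(threads)],  # CPU + GPU, 110G
    }
    if mode not in mode_flags:
        raise ValueError(f"unknown mode: {mode!r}")
    return [
        "./arc_plot", *mode_flags[mode],
        "-n", str(plots),
        "-u", "256", "-v", "256",  # values copied from the examples above
        "-t", tmp_dir,
        "-d", dest_dir,
        "-c", contract,
        "-f", farm_key,
    ]


# Hypothetical values, for illustration only.
argv = build_command("gpu_ram", 4, 1, "/mnt/ssd", "/mnt/farm", "your_contract", "your_farm_key")
command_line = shlex.join(argv)  # shell-safe string; pass argv itself to subprocess.run
```

Keeping the command as a list avoids quoting problems if a path contains spaces.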

No, I don’t consider this bush at all. It’s just that typos can make something look lesser than it is. It’s like tucking in your shirt before showing up to a business meeting.

Thank you for the explanation in terms of technical reasons for the requirements.

It’s impressive to bring something out first. There is a difference between working on something behind closed doors and putting it out there, and you’ve put it out there. It’s a big deal. GPU plotting? It’s very interesting for sure. I’m all ears. I’m not buying a new GPU though, and sure, I wish a power hungry 1080 Ti could do the trick rather than buying more hardware to plot with.

I love computers and tech for the very reason you can’t predict how people will make use of it or what they might decide to create with it. This makes Chia very interesting. Community innovations that Chia themselves endorse!

Will you be releasing the source code at some point, or will this remain a closed source deal?

It will be open source after I have finished all the design and implementation. Right now I have no time to clean up the code, and the design is still drifting too much.