To expand on what @zogstrip said: Every 10 seconds or so, your farmer will send a request to the harvester with a challenge. The harvester then:
Checks if any plots pass the filter. This is super-fast.
If any plots passed the filter, it checks them in a more involved process. Also pretty fast, but not as fast.
With large numbers of plots, harvesting plots on a network share can be problematic. See @codinghorror’s thread about that.
If you want to see this stuff in action, set your logging level to “INFO” in the config file for your harvester, and watch the logs. You’ll probably see very fast times when no plots pass the filter, and slightly longer times when one or more pass.
From one of my boxes:
0 plots were eligible for farming feda00b3ec... Found 0 proofs. Time: 0.00696 s. Total 139 plots
1 plots were eligible for farming feda00b3ec... Found 0 proofs. Time: 0.15860 s. Total 139 plots
2 plots were eligible for farming d645418820... Found 0 proofs. Time: 0.66652 s. Total 139 plots
There’s a lot of variance in the time based on system load, etc, but note that higher numbers of eligible plots tend to push the farming time higher.
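If you want to check that trend on your own logs, here is a minimal sketch that averages the lookup time per number of eligible plots. It assumes your log lines have the same shape as the three quoted above; the names `LINE_RE` and `summarize` are made up for this example and are not part of the harvester.

```python
import re
from collections import defaultdict

# Matches harvester INFO lines shaped like the three samples quoted above.
LINE_RE = re.compile(
    r"(?P<eligible>\d+) plots were eligible for farming (?P<challenge>\w+)\.\.\. "
    r"Found (?P<proofs>\d+) proofs\. Time: (?P<seconds>[\d.]+) s\. "
    r"Total (?P<total>\d+) plots"
)


def summarize(lines):
    """Return {eligible plot count: average lookup time in seconds}."""
    times = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # silently skip anything that is not a lookup line
            times[int(match["eligible"])].append(float(match["seconds"]))
    return {n: sum(ts) / len(ts) for n, ts in sorted(times.items())}


sample = [
    "0 plots were eligible for farming feda00b3ec... Found 0 proofs. Time: 0.00696 s. Total 139 plots",
    "1 plots were eligible for farming feda00b3ec... Found 0 proofs. Time: 0.15860 s. Total 139 plots",
    "2 plots were eligible for farming d645418820... Found 0 proofs. Time: 0.66652 s. Total 139 plots",
]

for eligible, average in summarize(sample).items():
    print(f"{eligible} eligible: {average:.5f} s average")
```

Because it uses `search` rather than `match`, it should also cope with a timestamp or logger name in front of each line, as long as the rest of the line looks the same.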
Many thanks for the answers, @zogstrip & @timdev.
But what I really need to know is:
Does the harvester need to read all the plots for every challenge received (every 10s)? That would require very high bandwidth usage. Or is this not how it works? I feel I’m missing something (any reference on that would be helpful).
The way they did this is clever; clearly there’s some metadata about the plot file stored in memory, so the harvester can run a band filter that immediately eliminates a bunch of plots from the file-based, in-depth checking!
Otherwise, if it had to scan every proof on disk every time there was a harvester check… my god… that’d be disastrous, the drives would be at 100% load all the time!
The way it is now, it eliminates the vast majority of plots from contention via a quick in-memory metadata check, then only does an in-depth, file-level check on the relatively small number of plots (1? 5? 10?) that pass the initial filter.
When a challenge comes in, the harvester filters the plots it’s harvesting. This appears to be done based on data cached in memory on the harvester, though I suppose it might read a few bytes from each plot - but even if that’s the case, it’s a tiny amount of data.
For each challenge, there’s about a 1/512 chance of any given k32 plot passing the filter. For all plots that pass the filter, the harvester checks the plot. That requires actually reading some non-trivial amount of data from the plot. Checking is going to be 99.9% of the network traffic for your network-mounted filesystem.
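To make the two-stage idea concrete, here is a rough sketch of how a filter like this could be evaluated from in-memory data alone. Everything in it is an assumption for illustration: the hashing scheme, the 9-bit threshold (picked only because 2^9 = 512 gives the 1/512 rate above), and the names `passes_filter` and `FILTER_BITS`. It is not taken from the harvester code.

```python
import hashlib
import os

# Assumption: 9 leading zero bits, because 2**9 = 512 matches the ~1/512 rate.
FILTER_BITS = 9


def passes_filter(plot_id: bytes, challenge: bytes) -> bool:
    """Hypothetical filter: hash plot id + challenge, pass if the top bits are zero."""
    digest = hashlib.sha256(plot_id + challenge).digest()
    top_bits = int.from_bytes(digest[:2], "big") >> (16 - FILTER_BITS)
    return top_bits == 0


# 139 made-up plot ids held in memory; no plot file is opened at this stage.
plot_ids = [os.urandom(32) for _ in range(139)]
challenge = os.urandom(32)

eligible = [pid for pid in plot_ids if passes_filter(pid, challenge)]
print(f"{len(eligible)} of {len(plot_ids)} plots would need an actual disk read")
```

The point of the sketch is only that the first stage needs a small identifier per plot, while the expensive read is reserved for whatever ends up in `eligible`.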
Does the harvester need to read all the plots for every challenge received (every 10s)
The simple answer is “no”. It needs to read from some small subset of plots (1 out of every 512, on average). So if you have 512 plots, you would expect that it would usually have to check one of them. But sometimes zero, and sometimes 2 or 3, but on average, 1.
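The “sometimes zero, sometimes 2 or 3” part can be worked out with a binomial distribution, assuming each plot passes independently with probability 1/512. A small sketch (the helper name `prob_eligible` is just for this example):

```python
from math import comb


def prob_eligible(n_plots: int, k: int, p: float = 1 / 512) -> float:
    """Binomial probability that exactly k of n_plots pass the filter."""
    return comb(n_plots, k) * p**k * (1 - p) ** (n_plots - k)


n = 512
print(f"expected eligible plots per challenge: {n / 512:.2f}")
for k in range(4):
    print(f"P({k} eligible) = {prob_eligible(n, k):.3f}")
```

With 512 plots, roughly 37% of challenges should find no eligible plot at all, so a run of “0 plots were eligible” lines in the log is nothing to worry about.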
I have 3 harvester-plotters, connected to a main farmer.
I see the same challenge hash repeat 64 times in the GUI, and all 3 harvesters receive it.
What I don’t understand: shouldn’t this challenge hash be different every time?
Only once did I see multiple challenge hashes appear in the GUI, and I could see all 3 harvesters receiving the different ones. Doesn’t that mean that a single challenge hash is not proved against the collective sum of plots on all harvesters?
Is the challenge hash renewing only after a count of 64 a bottleneck at my end, or is this perfectly normal?
You’ve probably seen a contention glut; challenges should come sequentially, I think. But I haven’t seen multiple harvesters on a network, nor do I know the protocol. It looks like the only documentation for it is the code.
This leaves me with my other question!
Is every challenge sent to ALL harvesters, or to just one of them?
What is good practice: keeping all plots on the harvesters that plotted them, or moving them to a main harvester once plotted?
The farmer should notify all harvesters of the challenge, I think.
But if you miss a challenge, keep in mind that you don’t necessarily get all the challenges circulating in the network either. So yeah, you miss out just a little more with this multi-harvester setup. Whether it’s a systemic bug, I have no idea. You should ask on the official Keybase channel, where the developers are.
I think the software has performed really well for everyone to date, considering how little there is to it. You can see on GitHub that there isn’t much of it, e.g. in the harvester files. I could perhaps learn it if I wanted to. However, because I don’t, and because it’s so small, my confidence would be limited. Foreseeing lots of possibly unhandled corner-case behaviour, I’d strive to eliminate the network.