Stales on Chia: What Causes Them and How to Reduce Them (you could be losing tons to this)

Those of you coming from GPU mining may recall stales: shares submitted to the pool after the relevant block has already been found. These shares can still be sent in an attempt to find the same block, so that you can get an uncle reward after the fact. Chia has its own terminology. Much of what follows comes from what others have discussed and discovered in our Discord and in the Chia community.
In Chia, stales come about when it takes too long to look up a plot. This can happen to both solo farmers and pool farmers. Several solo farmers only realized after joining a pool that they had been making less than expected because of stales they had never noticed before. I’ve heard people say anything under 5% is normal for stales; I’d say you should aim for under 2%. It’s almost impossible to get stales down to 0, so expect at least a few, and if you have none, something is likely wrong. It’s generally recommended to keep lookup times under 3 seconds. Anything over 5 seconds will likely be stale. Pools usually do not credit you for stales, although there may be Chia pools that do.
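
If you want to check these numbers against your own farm, here is a minimal Python sketch. It assumes your harvester log has lines containing a `Time: <seconds> s` field; the sample lines are made up for illustration, and the function names are hypothetical rather than part of the Chia software. It simply sorts lookups into bands using the 3 and 5 second guidelines above.

```python
import re

# Assumed log format: each lookup line contains "Time: <seconds> s".
# Adjust the pattern if your log lines look different.
TIME_PATTERN = re.compile(r"Time: (\d+(?:\.\d+)?) s")

GOOD_LIMIT = 3.0   # guideline from the post: keep lookups under 3 seconds
STALE_LIMIT = 5.0  # guideline from the post: over 5 seconds is likely stale


def lookup_times(lines):
    """Extract lookup times (in seconds) from log lines that contain one."""
    times = []
    for line in lines:
        match = TIME_PATTERN.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def summarize(times):
    """Count lookups in each band and return the likely-stale percentage."""
    good = sum(1 for t in times if t < GOOD_LIMIT)
    slow = sum(1 for t in times if GOOD_LIMIT <= t <= STALE_LIMIT)
    stale = sum(1 for t in times if t > STALE_LIMIT)
    percent = 100.0 * stale / len(times) if times else 0.0
    return {"good": good, "slow": slow, "likely_stale": stale, "stale_percent": percent}


# Made-up sample lines, for illustration only.
sample = [
    "1 plots were eligible for farming ... Found 0 proofs. Time: 0.41 s. Total 120 plots",
    "2 plots were eligible for farming ... Found 0 proofs. Time: 3.70 s. Total 120 plots",
    "1 plots were eligible for farming ... Found 1 proofs. Time: 6.20 s. Total 120 plots",
    "1 plots were eligible for farming ... Found 0 proofs. Time: 0.95 s. Total 120 plots",
]

report = summarize(lookup_times(sample))
print(report)
```

To run it on a real log, pass the lines of the file to `lookup_times` instead of `sample`. Keep in mind that a slow lookup is only a likely stale, not a confirmed one; your pool's own stats remain the authoritative count.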

Causes of Stales:

  1. Active Use: The most frequent cause of stales currently is farming from an HDD that is also being plotted to. Because the drive is in active use, plot lookups will be slow. This only lasts while you are plotting, so it can be acknowledged and ignored.
  2. Sleep Mode: HDDs need to be kept active and spinning. Depending on your settings and the brand, some external HDDs will turn off after a few minutes. Make sure to disable the “turn off disk after X minutes” setting and the USB selective suspend setting. You can also consider installing software that keeps HDDs active.
  3. Internet Connection: An unstable internet connection can also cause stales. Note that bandwidth (i.e. download/upload speed) is not that important to Chia, as little data is sent. Ping itself is also not that important, as there is quite a bit of time to get your submission there.
  4. HDD Formatting: There are some issues with the Chia software itself. Some have noticed that 1.2.2 causes exFAT-formatted drives on macOS to have very long lookup times. We suggest the XFS format on Linux if possible. We’ve also heard good things about ext4.
  5. Pool Processing: When you submit a partial, it has to be processed by the pool’s servers. The reference pool design uses variable difficulty to keep the load on the pool’s servers to a minimum. However, things add up, so problems may arise that require hardware or software upgrades.
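
For the sleep mode cause, a keep-alive tool can be as simple as the Python sketch below, which writes a tiny file to each drive on a timer. This is only an illustration: the function names, the `.keepalive` filename, and the mount points are all hypothetical, and whether a small periodic write actually stops a given drive from spinning down depends on the drive and enclosure, so treat that as an assumption to test.

```python
import os
import time


def touch_drive(mount_point, filename=".keepalive"):
    """Write a timestamp to a small file on the drive and force it to disk."""
    target = os.path.join(mount_point, filename)
    with open(target, "w") as handle:
        handle.write(str(time.time()))
        handle.flush()
        os.fsync(handle.fileno())  # push the write past the OS cache
    return target


def keep_alive(mount_points, interval_seconds=60, rounds=None):
    """Touch every drive each interval; rounds=None runs until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        for mount_point in mount_points:
            try:
                touch_drive(mount_point)
            except OSError as error:
                # A missing or read-only drive should not stop the others.
                print(f"could not write to {mount_point}: {error}")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
    return done


# Example (hypothetical mount points, replace with your own plot drives):
# keep_alive(["/mnt/plots1", "/mnt/plots2"], interval_seconds=60)
```

The interval should be shorter than the drive's spin-down timeout. Disabling the sleep settings mentioned in cause 2 is still the first thing to try; a script like this is a fallback for drives that ignore those settings.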

Feel free to post below if there are any mistakes, edits, tips to add, or good suggestions, and I’ll try to add them in. Note that the 5 causes above are just the most common ones I’m aware of; there will be other reasons.

There are actually 28 seconds to turn in a proof. I’d really be surprised if folks with directly attached disks are having issues. I can get all the blocks for my proofs through HTTP requests to object storage, turn the proof in, and win. I’ve done it in 24 and 26 seconds. I’ve only actually missed a reward once, with a 46-second turn-in; fortunately that was on Flax, not Chia.

The five-second warning that was added to the logs is on the quality check. Basically, it reads down the right-hand side of the binary tree and partially computes the hash to see if a real proof would be anywhere close to sufficient. 5 seconds is actually too long at that point if you had to do that many reads seven more times sequentially. Fortunately, chiapos 1.0.3 had a change so that all the reads for a proof happen in parallel. When I’m speed testing my plots, I sometimes find that a full proof goes faster than the quality check.
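
To put rough numbers on the sequential case, here is a small Python sketch. It rests on two assumptions that are a reading of the description above rather than measurements: each of the seven further rounds of reads takes about as long as the quality check, and in the parallel case those seven rounds together cost roughly one round. The function names are made up for the example.

```python
SUBMISSION_WINDOW = 28.0  # seconds available to turn in a proof, per the post
EXTRA_ROUNDS = 7          # further rounds of reads after the quality check


def sequential_total(quality_seconds):
    """Quality check plus seven more rounds, one after another."""
    return quality_seconds * (1 + EXTRA_ROUNDS)


def parallel_total(quality_seconds):
    """Quality check plus the remaining rounds issued at once (idealized)."""
    return quality_seconds * 2


for t in (1.0, 3.0, 5.0):
    seq = sequential_total(t)
    par = parallel_total(t)
    print(f"{t:.0f}s check: sequential {seq:.0f}s, parallel {par:.0f}s, "
          f"sequential fits: {seq <= SUBMISSION_WINDOW}")
```

Under these assumptions, a 5-second quality check would mean about 40 seconds for a fully sequential proof, which is past the 28-second window, while a 3-second check would come to 24 seconds and still fit.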

But I’m interested to see the issues people are having, and the resolutions.

I’m focusing more on the time you have to get it to the pool. Reference pools do not process things fast. 5 seconds is a good guideline to stay under.

Why do pools need the partials so much faster than the real blockchain needs full proofs?

I/O- and computation-wise, those tasks are the same effort, right?

I’m not an expert and I don’t want to be dissing other pools too much, so I’m not going to answer.

That being said, the devs called it a “reference” pool protocol, as in it was supposed to be referred to, not used.
