Description
If the compaction loop fails for any reason partway through, the rows handled by earlier iterations have already been compacted and the compact-rev key has already been updated, but the expected compact-rev value held in memory has not. On the next pass the node sees the mismatch, concludes that some other node has compacted, and skips the following interval.
This has been reported several times:
- k3s database size grow due to slow compact process k3s#10626 (reply in thread)
- k3s server create 100% CPU load k3s#11251 (reply in thread)
The first instance was in an odd multi-master Galera cluster, but the second was on plain old sqlite.
This is because if any compaction fails, we restart the outer loop:
kine/pkg/logstructured/sqllog/sql.go
Lines 155 to 157 in c1b2bd8
without recording any of the work done by prior successful iterations of the inner loop:
kine/pkg/logstructured/sqllog/sql.go
Lines 165 to 167 in c1b2bd8
We should fix that, but we should also figure out how to better handle locking errors when trying to compact.
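To make the failure mode concrete, here is a minimal simulation of the behavior described above. It is not the kine code: the `compactor` type, its fields, `compactBatch`, `run`, and the `recordProgress` switch are all hypothetical names invented for this sketch, and the database is replaced by an in-memory counter.

```go
package main

import (
	"errors"
	"fmt"
)

// compactor is a hypothetical stand-in for the compaction state; none of
// these names are taken from the kine source.
type compactor struct {
	storedRev   int64 // compact-rev key as persisted in the database
	expectedRev int64 // compact-rev value this node believes is persisted
}

var errLocked = errors.New("database is locked")

// compactBatch simulates one inner-loop iteration. It persists compact-rev
// as `to`, or fails without changes when failAt > 0 and to >= failAt.
func (c *compactor) compactBatch(to, failAt int64) error {
	if failAt > 0 && to >= failAt {
		return errLocked
	}
	c.storedRev = to
	return nil
}

// run simulates one interval of the outer loop, compacting toward target in
// fixed-size batches. With recordProgress=false the in-memory expectation is
// only updated when every batch succeeds (the behavior described above).
// With recordProgress=true it follows each successful batch.
func (c *compactor) run(target, batch, failAt int64, recordProgress bool) (skipped bool, err error) {
	if c.storedRev != c.expectedRev {
		// Mismatch: assume another node compacted, adopt its value, skip.
		c.expectedRev = c.storedRev
		return true, nil
	}
	rev := c.storedRev
	for rev < target {
		next := rev + batch
		if next > target {
			next = target
		}
		if err := c.compactBatch(next, failAt); err != nil {
			return false, err // back to the outer loop
		}
		rev = next
		if recordProgress {
			c.expectedRev = rev
		}
	}
	c.expectedRev = rev
	return false, nil
}

func main() {
	for _, record := range []bool{false, true} {
		c := &compactor{}
		// Batches ending at 100 and 200 succeed; the one ending at 300 fails.
		_, err := c.run(300, 100, 250, record)
		skipped, _ := c.run(400, 100, 0, record)
		fmt.Printf("recordProgress=%v firstErr=%v skippedNext=%v storedRev=%d\n",
			record, err, skipped, c.storedRev)
	}
}
```

With `recordProgress=false` the second interval is skipped even though this node did the earlier work itself; with `recordProgress=true` the second interval proceeds normally. This only illustrates the bookkeeping half of the problem and says nothing about why the batch failed in the first place.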
For sqlite at least, this may be related to go-sqlite3's BeginTx ignoring TxOptions.
This is BAD, as the default behavior of sqlite transactions is to... not actually start a transaction:
https://sqlite.org/forum/info/c3cb9524bef62b67#forum11484
A bare BEGIN (as in BEGIN DEFERRED) does not start a transaction. It turns off the auto-commit machinery so that the transaction commenced by the next statement is not automatically committed at the end of the execution of that statement. If that statement is a "read" statement, then the transaction is a read transaction. If that statement is a "write" statement, then the transaction is a write transaction. BEGIN IMMEDIATE and BEGIN EXCLUSIVE both turn off the auto-commit machinery and start a transaction (write or exclusive respectively)
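One possible direction, offered as a sketch rather than a tested fix: go-sqlite3 accepts a `_txlock` DSN parameter (`deferred`, `immediate`, or `exclusive`) that selects which BEGIN statement the driver issues. The helper below only builds such a DSN using the standard library. `withTxLock` and the example DSN are hypothetical, and whether this setting suits kine's sqlite backend is an assumption that would need checking.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withTxLock returns dsn with the _txlock query parameter set to mode.
// Hypothetical helper for illustration; it does not open a database.
func withTxLock(dsn, mode string) (string, error) {
	switch mode {
	case "deferred", "immediate", "exclusive":
	default:
		return "", fmt.Errorf("unsupported _txlock mode %q", mode)
	}
	// Split off any existing query string so other parameters are kept.
	base, rawQuery, _ := strings.Cut(dsn, "?")
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	q.Set("_txlock", mode)
	// Encode sorts parameters by key, so the output order is stable.
	return base + "?" + q.Encode(), nil
}

func main() {
	// The DSN below is an example value, not kine's actual default.
	dsn, err := withTxLock("file:state.db?_journal=WAL", "immediate")
	if err != nil {
		panic(err)
	}
	fmt.Println(dsn) // file:state.db?_journal=WAL&_txlock=immediate
}
```

With `immediate`, the write lock would be requested when the transaction begins instead of at the first write statement, so lock contention should surface at BEGIN instead of partway through a compaction transaction.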