(Investigation) Improve the contentStoreCleaner mechanism
The contentStoreCleaner is responsible for cleaning up orphaned content from the content store. Currently this is done in batches of 1k, and when the database is large (150 mil nodes, with 500k or 1 mil orphan nodes), each batch takes roughly 2.5 to 3 min to complete. Please view this comment for more details on the environment used for tests.
This means that we need weeks to delete a large amount of data (A/C for ).
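To make the "weeks" estimate concrete, here is a rough back-of-the-envelope calculation using the figures above (batches of 1k, 2.5 to 3 min per batch) applied to 10 mil orphans. It assumes batches run strictly one after another and that the per-batch time stays constant as the backlog shrinks; neither has been verified. The class and method names are illustrative only.

```java
public class PurgeEstimate {

    // Number of batches needed for `total` items at `batchSize` items per batch
    // (integer ceiling division).
    public static long batches(long total, long batchSize) {
        return (total + batchSize - 1) / batchSize;
    }

    // Estimated wall-clock days, assuming sequential batches and a constant
    // time per batch (both are assumptions, not measured behaviour).
    public static double days(long total, long batchSize, double minutesPerBatch) {
        return batches(total, batchSize) * minutesPerBatch / (60.0 * 24.0);
    }

    public static void main(String[] args) {
        long orphans = 10_000_000L;
        long batchSize = 1_000L;
        System.out.println("batches: " + batches(orphans, batchSize));
        System.out.println("days at 2.5 min/batch: " + days(orphans, batchSize, 2.5));
        System.out.println("days at 3.0 min/batch: " + days(orphans, batchSize, 3.0));
    }
}
```

Under these assumptions, 10 mil orphans need 10,000 batches, which works out to roughly 17 to 21 days of continuous running.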
Check whether there is a way to reduce the time taken to purge the orphan nodes, so that 10 mil documents can be deleted in a more timely manner. Also, it looks like the current batch size is hard-coded:
so there may be an improvement if the batch size could be changed.
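As a starting point for the investigation, the following is a minimal, self-contained sketch of what a configurable batch size could look like. It is not the actual contentStoreCleaner code: `BatchedOrphanCleaner`, `clean`, `demo`, and the `cleaner.batchSize` system property are all hypothetical names, and the in-memory queue only stands in for the real orphan lookup.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

// Hypothetical illustration only; not the real contentStoreCleaner.
public class BatchedOrphanCleaner {

    private final int batchSize;

    public BatchedOrphanCleaner(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // fetchBatch returns up to the requested number of not-yet-deleted orphan
    // URLs, or an empty list when none remain. Returns the number of batches run.
    public int clean(IntFunction<List<String>> fetchBatch, Consumer<String> deleter) {
        int batches = 0;
        while (true) {
            List<String> batch = fetchBatch.apply(batchSize);
            if (batch.isEmpty()) {
                return batches;
            }
            batch.forEach(deleter);
            batches++;
        }
    }

    // Runs the cleaner against an in-memory queue of fake orphan URLs.
    public static int demo(int orphanCount, int batchSize) {
        Deque<String> pending = new ArrayDeque<>();
        for (int i = 0; i < orphanCount; i++) {
            pending.add("store://orphan-" + i);
        }
        List<String> deleted = new ArrayList<>();
        return new BatchedOrphanCleaner(batchSize).clean(max -> {
            List<String> batch = new ArrayList<>();
            while (batch.size() < max && !pending.isEmpty()) {
                batch.add(pending.poll());
            }
            return batch;
        }, deleted::add);
    }

    public static void main(String[] args) {
        // Hypothetical property name; falls back to the current value of 1000.
        int batchSize = Integer.getInteger("cleaner.batchSize", 1000);
        System.out.println("batches run: " + demo(2500, batchSize));
    }
}
```

A larger batch size only helps if the per-batch cost is dominated by fixed overhead (for example, the query that finds orphans) rather than by the per-item deletes, so the investigation should measure both before settling on a value.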
Depending on the change, testing scenarios could include: only fileContentStore, file and s3contentstore, only s3contentstore, s3 with deleted content store, azurecontentstore