Search Services 2.0.1 killed by Solr OOM killer script during indexing (reindex)
Description
During a reindex using 2.0.1, Solr is repeatedly killed by the Solr OOM killer script (oom_solr.sh). After each restart, indexing makes progress for a while and then Solr is killed again. Indexing eventually completes after several restarts.
REPRODUCTION
Bootstrap the supplied test database
Disable content indexing (the content is not available)
Trigger the index creation
EXPECTED
Indexing completes without incident
OBSERVED
Towards the end of indexing, Solr starts being killed, with approximately 120k of the 4.5 million transactions remaining.
About 60 of the remaining transactions are large, ranging from 1,000 to 59,544 nodes.
YourKit snapshots suggest that the problem threads are ForkJoinPool worker threads for the cascadeTracker (SolrInformationServer.cascadeUpdateV2).
My test memory settings were SOLR_JAVA_MEM="-Xms3g -Xmx6g". In both cases, the total heap flat-lined at close to the limit for several minutes before being killed.
The Solr logs also show tens of thousands of the following exceptions from the time indexing starts.
Activity
I will close the ticket; please reopen it if further work is required.
Assigned to validate whether increased memory settings are acceptable as a workaround.
It also looks like there is an issue with the design of the solution, given that it requires that many groups.
A PR has been created to provide a development feature that skips indexing the initial transactions for a repository:
The repository has been indexed using the following JVM settings for Solr:
Deployment environment is available in
The node 16848799 is a user associated with 91,353 groups.
Every time the node is modified, all these GROUP parent paths need to be updated.
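To give a sense of the scale involved, here is a minimal sketch of how 91,353 parent-path updates could be split into bounded batches so that they are not all held in memory at once. This is only an illustration, not the fix adopted in this ticket and not how SolrInformationServer.cascadeUpdateV2 works; the class name, the batchRanges helper, and the batch size of 1,000 are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CascadeBatchSketch {

    // Hypothetical helper: splits `total` items into half-open index ranges
    // [start, end) of at most `batchSize` items each.
    static List<int[]> batchRanges(int total, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int start = 0; start < total; start += batchSize) {
            int end = Math.min(start + batchSize, total);
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        int groupPaths = 91_353; // group count reported for node 16848799
        int batchSize = 1_000;   // assumed value, chosen only for illustration

        List<int[]> ranges = batchRanges(groupPaths, batchSize);
        int[] last = ranges.get(ranges.size() - 1);

        // Each modification of the node would mean working through every batch.
        System.out.println(ranges.size() + " batches, last covers ["
                + last[0] + ", " + last[1] + ")");
    }
}
```

With these assumed numbers, a single modification of the user node results in 92 batches of parent-path updates, which may help explain why transactions touching this node are so expensive.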