mirror of https://github.com/citusdata/citus.git
The first and main issue was that we were putting absolute pointers into shared memory for the `steps` field of the `ProgressMonitorData`. This pointer was being overwritten every time a process requested the monitor steps, which is the only reason why this even worked in the first place.

To quote a part of a relevant Stack Overflow answer:

> First of all, putting absolute pointers in shared memory segments is
> terrible terrible idea - those pointers would only be valid in the
> process that filled in their values. Shared memory segments are not
> guaranteed to attach at the same virtual address in every process.
> On the contrary - they attach where the system deems it possible when
> `shmaddr == NULL` is specified on call to `shmat()`

Source: https://stackoverflow.com/a/10781921/2570866

In this case a race condition occurred when a second process overwrote the pointer in between the first process's write and read of the `steps` field. This issue is fixed by no longer storing the pointer in shared memory. Instead we now calculate its position every time we need it.

The second race condition I have not been able to trigger, but I found it while investigating this. The issue was that we published the handle of the shared memory segment before we initialized the data in the steps. This means that during initialization of the data, a call to `get_rebalance_progress()` could read partial data in an unsynchronized manner.
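As a rough illustration of the first fix, the sketch below derives the step array's address from the segment's base address on every access instead of storing a pointer inside the segment. All names here (`DemoMonitor`, `DemoStep`, `DemoMonitorSteps`, `DemoMonitorCreate`, `DemoAttachElsewhere`) are hypothetical and are not taken from `multi_progress.c`. The layout (steps placed directly behind the header) is an assumption, and a heap copy stands in for a second process attaching the same segment at a different virtual address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-step payload; the real step layout is not shown here. */
typedef struct DemoStep
{
	uint64_t shardId;
	uint64_t progress;
} DemoStep;

/*
 * Hypothetical header at the start of the segment. It deliberately has no
 * "steps" pointer: an absolute address is only meaningful in the process
 * that wrote it.
 */
typedef struct DemoMonitor
{
	uint64_t processId;
	int stepCount;
} DemoMonitor;

/* Assumed layout: the steps sit directly behind the header. */
static DemoStep *
DemoMonitorSteps(DemoMonitor *monitor)
{
	return (DemoStep *) (monitor + 1);
}

/* Total number of bytes needed for the header plus stepCount steps. */
static size_t
DemoMonitorSize(int stepCount)
{
	return sizeof(DemoMonitor) + (size_t) stepCount * sizeof(DemoStep);
}

/*
 * Allocate a zeroed "segment" and fill in the header before returning it,
 * so callers never see a partially initialized monitor.
 */
static DemoMonitor *
DemoMonitorCreate(uint64_t processId, int stepCount)
{
	DemoMonitor *monitor = calloc(1, DemoMonitorSize(stepCount));

	assert(monitor != NULL);
	monitor->processId = processId;
	monitor->stepCount = stepCount;
	return monitor;
}

/*
 * Stand-in for another process attaching the segment at a different
 * address: copy the raw bytes into a fresh allocation.
 */
static DemoMonitor *
DemoAttachElsewhere(const DemoMonitor *source)
{
	size_t size = DemoMonitorSize(source->stepCount);
	DemoMonitor *copy = malloc(size);

	assert(copy != NULL);
	memcpy(copy, source, size);
	return copy;
}
```

Because `DemoMonitorSteps()` depends only on the address at which the caller sees the header, it gives a valid pointer in every "attachment", whereas a pointer stored inside the segment would still refer to the first mapping.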
multi_progress.c