On the vagaries of the MS, casacore and python-casacore.  #25

@JSKenyon

This issue aims to keep track of problems/limitations uncovered in third-party software/formats.

casacore

Casacore is not thread-safe, so a single (non-multi) MS cannot be read from multiple threads. This forces memory allocations and reads to happen in a single thread, which is an unnecessary bottleneck. I have opened an issue on the casacore repository in an attempt to get some feedback from the developers.
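
To make the bottleneck concrete, here is a minimal sketch of the workaround this forces: every read is funnelled through a single thread while the compute is spread over a pool. It uses only the standard library; `read_chunk` and `process_chunk` are hypothetical stand-ins, not casacore or python-casacore calls.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def read_chunk(index):
    """Hypothetical stand-in for a read that is not thread-safe."""
    return {
        "index": index,
        "reader": threading.current_thread().name,
        "values": list(range(index * 4, index * 4 + 4)),
    }


def process_chunk(chunk):
    """Hypothetical stand-in for the compute step."""
    return chunk["index"], sum(chunk["values"]), chunk["reader"]


def run(n_chunks=8, n_workers=4):
    # A pool with a single worker guarantees that all reads happen in the
    # same thread, one after the other.
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="reader") as reader, \
         ThreadPoolExecutor(max_workers=n_workers, thread_name_prefix="compute") as compute:
        reads = [reader.submit(read_chunk, i) for i in range(n_chunks)]
        # Compute tasks are handed to the pool as each read completes.
        results = [compute.submit(process_chunk, r.result()) for r in reads]
        return [f.result() for f in results]
```

However many compute workers are available, throughput in this arrangement is capped by how fast the single reader thread can deliver chunks.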

python-casacore

Python-casacore doesn't drop the GIL. This PR demonstrates the ramifications of failing to drop the GIL and presents a possible solution to the problem. In my opinion this MUST be changed; otherwise any parallel processing will always be crippled as threads wait for the GIL, effectively becoming serialised.
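
The serialisation effect can be illustrated by analogy with a standard-library sketch. It does not touch python-casacore: `time.sleep` stands in for an extension call that releases the GIL, and a pure-Python loop stands in for one that holds it. This assumes a standard (not free-threaded) CPython build, and the function names are my own.

```python
import threading
import time


def gil_releasing_call(duration):
    # time.sleep releases the GIL while it waits, much like an extension
    # call that drops the GIL around its I/O.
    time.sleep(duration)


def gil_holding_call(n_iterations):
    # A pure-Python loop holds the GIL on a standard CPython build, much
    # like an extension call that never drops it.
    total = 0
    for i in range(n_iterations):
        total += i
    return total


def wall_time(target, arg, n_threads):
    """Run `target(arg)` in n_threads threads and return the elapsed time."""
    threads = [threading.Thread(target=target, args=(arg,)) for _ in range(n_threads)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1, 4):
        released = wall_time(gil_releasing_call, 0.2, n)
        held = wall_time(gil_holding_call, 2_000_000, n)
        print(f"{n} thread(s): GIL released {released:.2f}s, GIL held {held:.2f}s")
```

On a GIL build, the wall time of the GIL-releasing case should stay roughly flat as threads are added, while the GIL-holding case should grow roughly in proportion to the number of threads, which is the behaviour described above.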

MS

The measurement set, whilst convenient and familiar, is an antique which I believe should be retired/rethought. The frustration generated by the above points and discussion with @sjperkins led me to perform a proof-of-concept test of an alternative storage format. For this test, I elected to write the data to zarr, which interacts well with xarray.Dataset objects. I have called this format a ZMS. Each experiment operated on the same data (around 72GB in the DATA column) using 36 threads. The following table summarises my results:

Parameters                 Max memory   Wall time
MS, GIL                    ~130GB       10m00s
MS, No GIL                 ~50GB        9m26s
ZMS, GIL                   ~140GB       5m30s
ZMS, No GIL                ~140GB       4m44s
MS, No GIL, No writes      ~20GB        6m22s
ZMS, No GIL, No writes     ~120GB       1m31s

Note that in the ZMS cases, the reads are not subject to the GIL, but we still write to the MS, and those writes are still affected by the GIL. The maximum memory use requires some explanation - in theory all of the experiments should have the same memory footprint. The large differences are related to the GIL and parallel reads:

  • In the MS, GIL case, lots of data ends up stuck in memory while threads wait for the GIL.
  • In the MS, No GIL case, the footprint drops off as the compute is very fast and no longer blocked by the GIL. Thus the nominal footprint is related to the read-rate, compute-rate and write-rate.
  • In the ZMS, GIL case, there is no GIL on the reads and reads can happen in parallel, so a huge amount of data can be read in quickly. The maximum footprint is therefore relatively consistent with the amount of data we expect to process at once.
  • In the ZMS, No GIL case, the same argument as for the ZMS, GIL case applies.
  • In the MS, No GIL, No writes case, I do not write back to the MS. I just produce gain solutions. Here, we see the serialisation effect clearly in the low memory footprint - the data gets processed almost as fast as it can be read.
  • In the ZMS, No GIL, No writes case, the large memory footprint arises because we are reading a large amount of data in parallel. Note however that in this case there is no interaction with the measurement set - data is read from the ZMS and the gains are written as zarr. The massive improvement in performance is the clincher for me - interaction with the MS causes slowdowns in this parallel processing regime. Of course, this result should be taken with a pinch of salt, as in practice writing out to some format will always be necessary.

Finally, as an awesome side effect of using the ZMS, the input shrinks from a 331GB MS (72GB per data column) to a 64GB ZMS, roughly a fivefold reduction. Again, this needs to be taken with a pinch of salt as the ZMS does not have 100% of the MS data. That being said, there is clear potential for data compression with basically no effort. This also ameliorates I/O problems as there is less data to be read. Of course, this comes at the cost of decompressing the data on read, but this is highly optimised and seems to work well.

To go further with this PoC would require writing output columns to the ZMS. This would allow for parallel writes. I likely won't try this right now, but I think there is definitely an argument to be made for moving away from the MS entirely.
