Friday 18 March 2016

Processing compressed data

Even on modern machines, X-ray diffraction data can still be big. However, pixel array data also compress well with e.g. gzip: say 20:1 on CBF files from a Pilatus detector. If only they could be processed compressed...

... well they can be. Actually xia2 can run just fine with compressed images:

xia2 -atom Zn image=/Volumes/DATA/data/thermc_1_0001.cbf.gz:1:1800

and this works with XDS as well. In the case of this thermolysin data, which were kept on an external USB3 drive, the whole xia2 job was about 20% faster using the compressed data rather than the raw. Worth considering, as it also saved about 95% of the storage space.
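
To check what ratio you get on your own images, something like the following Python sketch should do. The function name gzip_stats is made up for illustration, and it compresses in memory with the default gzip level, so the numbers may differ a little from the command-line gzip.

```python
import gzip


def gzip_stats(path):
    """Return (ratio, fraction_saved) for gzip-compressing the file at path.

    Hypothetical helper: reads the whole file into memory, so it is only
    meant for single images, not whole data sets.
    """
    with open(path, "rb") as handle:
        raw = handle.read()
    if not raw:
        raise ValueError("empty file: compression ratio is undefined")
    packed = gzip.compress(raw)
    ratio = len(raw) / len(packed)
    # A ratio of r:1 means the compressed copy needs 1/r of the space.
    return ratio, 1.0 - 1.0 / ratio
```

A ratio of 20:1 gives a saving of 1 - 1/20 = 95%, which is where the storage figure above comes from.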

The effects for even bigger data sets could be more substantial, as here the 1800 images were able to fit in the cache. If that were not the case the time saving would be greater, as the compressed data could be cached but the raw could not...
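
The caching argument is just arithmetic, and a rough Python sketch of it follows. The function name and every size you pass in are placeholders: I am assuming a fixed size per image and a fixed amount of memory available for the file cache, which is a simplification of what the operating system really does.

```python
def fits_in_cache(n_images, bytes_per_image, cache_bytes, ratio=1.0):
    """Return True if the whole data set would fit in the file cache.

    ratio is the compression ratio (1.0 for raw data, e.g. 20.0 for 20:1).
    All inputs are estimates supplied by the caller; nothing is measured.
    """
    stored_bytes = n_images * bytes_per_image / ratio
    return stored_bytes <= cache_bytes
```

If the raw data set fails this check but the compressed one passes, that is the situation where reading compressed images should pay off most.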

Worth thinking about next time you complain about the amount of storage X-ray data takes up.

YMMV, not tested on non-CBF images, some software may not support this, ...
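
If you have scripts of your own that read images, supporting both forms is not much work. Here is a minimal Python sketch, assuming the only thing that matters is the .gz suffix; the function names are made up, and this is not a description of how xia2 or XDS handle it internally.

```python
import gzip


def open_image(path):
    """Open an image for binary reading, decompressing on the fly if gzipped.

    Hypothetical helper: decides purely on the file name ending in '.gz'.
    """
    if path.endswith(".gz"):
        return gzip.open(path, "rb")
    return open(path, "rb")


def read_image_bytes(path):
    """Return the uncompressed bytes of a raw or gzipped image file."""
    with open_image(path) as handle:
        return handle.read()
```

The rest of the script then sees the same bytes whether the file on disk was compressed or not.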
