Big data supercomputing simplified by 800 lines of code

Scientists have developed "Charliecloud", a compact 800-line program that lets supercomputer users work in the high-performance world of Big Data without burdening computer centre staff with the peculiarities of their particular software needs.

"Big Data analysis projects need to use different frameworks, which often have dependencies that differ from what we have already on the supercomputer," Reid Priedhorsky of the High Performance Computing Division at Los Alamos National Laboratory, the lead developer of Charliecloud, said in a statement.

"So, we've developed a lightweight 'container' approach that lets users package their own user-defined software stack in isolation from the host operating system," he added.

Users can thus add the packages they need as images and run their programs on the supercomputer without having to reconfigure the system.

This maintains a "convenience bubble" of administrative freedom while protecting the security of the larger system.

Charliecloud was built on two bedrock principles of computing: the principle of least privilege and the Unix philosophy to "make each program do one thing well."

By comparison, competing products range from 4,000 to over 100,000 lines of code.