genxnetwork/crypt4gh-scone-example

Crypt4GH on SCONE example

Motivation

As stated in "Crypt4GH: A secure method for sharing human genetic data":

Crypt4GH, a new standard file container format from the Global Alliance for Genomics and Health (GA4GH), allows genomic data to remain secure throughout their lifetime, from initial sequencing to sharing with professionals at external organizations.

While Crypt4GH solves the problems of securing data at rest and data in transit, data in use still needs to be addressed. We propose using Trusted Execution Environment (TEE) technology, specifically its Intel SGX implementation, for protected in-memory processing of genomic data. This approach also provides the means to implement a Key Management System (KMS) using a remote Configuration and Attestation Service (CAS).

Example of using the SCONE CAS as a KMS for Crypt4GH

This example provides two services:

  • encrypt - encrypts a VCF file using the Crypt4GH Python library
  • process - processes the encrypted file by extracting the ID column
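
As a rough illustration of what the process step computes once the input has been decrypted, the sketch below extracts the ID column from plain VCF text. It uses only the Python standard library and is not the repository's code (which may rely on VCFPy instead); the function name `extract_ids` is hypothetical. The only fact it relies on is the VCF layout, where ID is the third tab-separated column of each data line.

```python
from typing import Iterable, Iterator


def extract_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the ID column (third tab-separated field) of each VCF data line."""
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines, meta-information lines (##...) and the #CHROM header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            # Malformed record: skipped here, a stricter tool might raise instead.
            continue
        yield fields[2]
```

Because the function is a generator, it can be fed a file object or `sys.stdin` directly and never holds more than one line in memory, which matters given the enclave memory limits discussed under the known limitations.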

Encryption keys are:

components.png

We used Azure Confidential Computing to run this example, but it can run on any other SGX-capable hardware, including bare metal.

Requirements

Build

To build the Docker image for the services and to generate and upload the session to CAS, run:

$ make

Build and start the SCONE Local Attestation Service (LAS) container:

$ docker-compose up -d las

Encrypt a file

$ cat input.vcf | docker-compose run --rm encrypt > ./input.c4gh

Process encrypted input

Run the "process" service to extract the IDs from the encrypted VCF input:

$ cat input.c4gh | docker-compose run --rm process 2>/dev/null

Run example without Docker and SCONE, with explicit keys

$ pip3 install -r requirements.txt
$ export SENDER_KEY=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
$ export RECIPIENT_KEY=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
$ python app encrypt input.vcf | python app process
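
The example values above are 64 hexadecimal characters long, which suggests 32-byte keys encoded as hex. Under that assumption (it is inferred from the examples, not confirmed by the repository), a minimal sketch of reading and validating such a key from the environment could look like this. The helper name `load_hex_key` and the `KEY_BYTES` constant are hypothetical.

```python
import os
from typing import Mapping

# Assumption: keys are 32 bytes, i.e. 64 hex characters like the example values.
KEY_BYTES = 32


def load_hex_key(name: str, environ: Mapping[str, str] = os.environ) -> bytes:
    """Read a hex-encoded key from an environment variable and check its length."""
    value = environ.get(name)
    if value is None:
        raise KeyError(f"environment variable {name} is not set")
    try:
        key = bytes.fromhex(value.strip())
    except ValueError as exc:
        # Do not echo the value itself: it is key material.
        raise ValueError(f"{name} is not valid hexadecimal") from exc
    if len(key) != KEY_BYTES:
        raise ValueError(f"{name} must decode to {KEY_BYTES} bytes, got {len(key)}")
    return key
```

Calling `load_hex_key("SENDER_KEY")` and `load_hex_key("RECIPIENT_KEY")` at startup makes a malformed key fail early with a clear message rather than deep inside the encryption step.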

Known limitations

  • The SGX memory limit causes overhead on SGX VMs for memory-intensive tasks
  • Software needs additional adaptation to run in the enclave
  • The fork system call is close to impossible to use

Further plans

  • htslib-crypt4gh integration enabling popular bioinformatics tools such as samtools and bcftools to work in the enclave
  • Demonstrate remote block fetching from encrypted SAM/VCF files for secure processing
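
To make the block-fetch idea concrete: the Crypt4GH format encrypts data in 64 KiB plaintext segments, each stored with a 12-byte nonce and a 16-byte authentication tag, so a reader can map a plaintext byte range to the ciphertext byte range it must download. The sketch below does only that arithmetic. It is not part of this repository, it assumes a file without edit lists, and the function name `cipher_range` and the `header_len` parameter (the size of the Crypt4GH header, which varies per file) are hypothetical.

```python
SEGMENT_SIZE = 65536  # plaintext bytes per Crypt4GH data segment
NONCE_SIZE = 12       # ChaCha20-Poly1305 nonce stored with each segment
MAC_SIZE = 16         # Poly1305 authentication tag stored with each segment
CIPHER_SEGMENT_SIZE = SEGMENT_SIZE + NONCE_SIZE + MAC_SIZE


def cipher_range(offset: int, length: int, header_len: int) -> tuple:
    """Map plaintext bytes [offset, offset + length) to the ciphertext range to fetch.

    Returns (start, end, skip): fetch file bytes [start, end), decrypt the
    segments, then discard the first `skip` plaintext bytes.
    """
    if offset < 0 or length <= 0 or header_len < 0:
        raise ValueError("offset and header_len must be >= 0 and length > 0")
    first = offset // SEGMENT_SIZE                # first segment touched
    last = (offset + length - 1) // SEGMENT_SIZE  # last segment touched
    start = header_len + first * CIPHER_SEGMENT_SIZE
    # The final segment of a file may be shorter, so callers should clamp
    # `end` to the actual file size.
    end = header_len + (last + 1) * CIPHER_SEGMENT_SIZE
    skip = offset % SEGMENT_SIZE
    return start, end, skip
```

A client could pass `start` and `end` to an HTTP range request and decrypt only the returned segments inside the enclave.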

References

  • Crypt4GH - encryption utility
  • SCONE - a Secure Container Environment
  • VCFPy - a Python 3 library with good support for both reading and writing VCF files
