Lehigh University Library & Technology Services (LTS) administers the Secure Research Cloud (SRC) to provide our research community with a computing environment that meets the compliance requirements for hosting sensitive data.
What is sensitive data?
Access to certain high-value research data carries extra requirements designed to protect against unauthorized access and, more importantly, to prove to auditors and external partners that the data are protected. We colloquially call these data sensitive, but more formally they include personally identifiable information (PII), protected health information (PHI), and controlled unclassified information (CUI). We classify sensitive data as "Class I: Critical" according to the Office of Institutional Data (OID).
How does the SRC differ from other environments?
Most research computing happens in one of two places. First, projects that require small amounts of data, proximity to a machine like a microscope, or a high level of customization can use a laptop or workstation located in a research lab. Second, larger, more resource-intensive projects with shared data and higher levels of collaboration are hosted on our high-performance computing (HPC) clusters (named Sol and Hawk), or our associated storage system (called Ceph).
These systems provide a high level of traditional security: researchers are given granular permissions over their own datasets. Because we strictly control access to each dataset, these systems are also capable of hosting export-controlled data. While our HPC clusters are secured against unauthorized access, we do not maintain the highly detailed documentation and monitoring required to prove this level of security. One exception is the physical health data warehouse (PHDW), which provides an air-gapped computing environment (i.e., computing that is not connected to any network). Documenting our security protocols in this way is what distinguishes the SRC and PHDW from other computing environments.
Does my research require the SRC?
Researchers must sign a data use agreement (DUA) in order to receive sensitive data from a data provider. This process, overseen by the LTS Information Security group, sets out the terms under which a Lehigh faculty member can work with sensitive data. After the DUA is signed by both parties, LTS will work with the data provider to securely ingest the data. If you are not sure whether you have a sensitive dataset, you should request a consultation with our team.
If you would like to learn more about HPC, you are welcome to visit the overview on our research computing page or our research computing knowledge base (at researchcomputing.lehigh.edu).
Where is the SRC?
The SRC was built inside the Amazon Web Services (AWS) cloud by LTS staff. Since our staff carefully control and monitor this virtual private cloud (VPC), and it is logically separated from others, you can think of this environment as effectively inside the Lehigh network. In practice, the SRC is composed of storage buckets in the AWS S3 service. Our associated computing environment can access these storage buckets directly.
How do I use the SRC?
After you request a consultation with our team and we ingest data from your provider, you will find that the SRC largely resembles a high-performance computing (HPC) environment. Our team will install the software you require and offer one of two computing environments:
- AWS AppStream provides a virtual desktop environment similar to LUapps.
- AWS EC2 provides a Linux-based server.
In both cases, we will build a custom environment to host your data and software.
Funding
Lehigh University provides the infrastructure and administration that supports this environment. We charge your grant, department, or college a la carte for the resources you consume in the cloud. A typical computing project should cost about $4/day using AWS on-demand pricing, meaning that you pay only for the computing resources that you consume. Storage costs roughly $23/TB/month, ideally paired with a mirror or backup of the original data. Our team can provide more specific pricing once we learn more about your project.
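As a rough illustration of how these figures combine, the sketch below estimates a monthly bill from the approximate rates quoted above ($4 per day of computing and $23 per TB per month of storage). The function name and parameters are hypothetical, and the sketch assumes that a mirror or backup is billed as a second full copy at the same storage rate, which may not match actual pricing. It is not an official calculator; our team can provide a real estimate.

```python
# Approximate rates quoted on this page; actual AWS charges will vary.
COMPUTE_USD_PER_DAY = 4.0        # typical on-demand computing cost per day of use
STORAGE_USD_PER_TB_MONTH = 23.0  # approximate storage cost per TB per month


def estimate_monthly_cost(compute_days, storage_tb, mirrored=False):
    """Return a rough monthly cost estimate in US dollars.

    compute_days: days in the month on which computing resources are used
    storage_tb:   size of the dataset in terabytes
    mirrored:     ASSUMPTION: a mirror or backup is billed as a second
                  full copy at the same storage rate
    """
    if compute_days < 0 or storage_tb < 0:
        raise ValueError("compute_days and storage_tb must be non-negative")
    copies = 2 if mirrored else 1
    compute_cost = compute_days * COMPUTE_USD_PER_DAY
    storage_cost = storage_tb * copies * STORAGE_USD_PER_TB_MONTH
    return round(compute_cost + storage_cost, 2)


if __name__ == "__main__":
    # Example: 20 days of computing on a 2 TB dataset with a mirror.
    print(estimate_monthly_cost(20, 2, mirrored=True))
```

Under these assumptions, 20 days of computing on a mirrored 2 TB dataset comes to $80 for computing plus $92 for storage.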
Takeaways
Lehigh researchers who require a compliant computing environment to host their sensitive data should request a consultation with our team. We will customize the SRC to meet your needs by learning about the data, provider, software, and workflow necessary to complete your project.
Faculty with prospective interest in the SRC are welcome to engage with us in tandem with their data providers. We can also provide centralized HPC resources so that you can work with mock data before you use the SRC. Our goal is to ensure that faculty have the software, storage, and computing necessary to advance their research projects.