compare Azure and Amazon S3 data storage #6
-
I don't have a lot of experience with AWS, so I don't know how much I can help with a comparison. But if I have some idea of the type of data and usage, I can try to pull together some information about which Azure storage options you might want to consider. Conceptually, Azure should be able to store the data and provide options for accessing it internally and externally. If the data you are going to be storing is sensitive, we may want to involve some people who are more experienced with security.
-
Hi Matt,
I submitted the ticket for Nora Lee, who is copied here along with Ran. Here's the description of the data.
"The grant would provide funding to do many of the things that folks at the UHC are doing for SALURBAL, including providing 'high-security-compliant, high-capacity informatics infrastructure suitable for data integration, storage, management, and sharing and ensure that the data meet the FAIR principles: findability, accessibility, interoperability, and reusability.'"
I can speak about costs for S3. However, we only use it for storage of data that doesn't contain PHI.
Jody
Project Manager
Urban Health Collaborative
http://drexel.edu/uhc/
-
Azure can pretty much handle any type of data. But exactly how much it costs and what storage type is best is going to depend on several factors. So here are some preliminary questions that will help me narrow down what type of storage you need:
-
Azure lets you scale storage up as needed, so not knowing the data size is fine operationally (though it makes estimating cost difficult). It sounds like there is no defined schema, so you're probably looking at a NoSQL database or object store of some sort. Azure Cosmos DB or Azure Blob Storage is probably what I'd suggest. Pricing information for both is on their Azure pricing pages. Without any idea of what the data flow will be, though, it will be very difficult to set up anything that performs well on either AWS or Azure, to give a firm recommendation about which products to use, or to give an accurate cost estimate.
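To illustrate why the data flow matters so much for the estimate, here is a rough sketch of how a monthly bill could be broken down. It assumes a simple three-part model (stored volume, outbound transfer, request count); the `StorageRates` class, the `estimate_monthly_cost` function, and every number in it are hypothetical placeholders, not real Azure or AWS prices. Actual rates would have to come from the providers' current price lists.

```python
from dataclasses import dataclass


@dataclass
class StorageRates:
    """Hypothetical per-unit prices; fill in from the provider's price list."""
    per_gb_month: float      # price to keep 1 GB stored for one month
    per_gb_egress: float     # price per GB transferred out to external users
    per_10k_requests: float  # price per 10,000 read/write operations


def estimate_monthly_cost(rates, stored_gb, egress_gb, requests):
    """Sum the storage, outbound-transfer, and request charges for one month."""
    storage = stored_gb * rates.per_gb_month
    egress = egress_gb * rates.per_gb_egress
    operations = (requests / 10_000) * rates.per_10k_requests
    return round(storage + egress + operations, 2)


# Placeholder rates only (assumption): NOT real Azure or AWS pricing.
example = StorageRates(per_gb_month=0.02, per_gb_egress=0.05, per_10k_requests=0.01)

# Same stored volume, two very different access patterns.
low = estimate_monthly_cost(example, stored_gb=1_000, egress_gb=50, requests=100_000)
high = estimate_monthly_cost(example, stored_gb=1_000, egress_gb=5_000, requests=10_000_000)
print(low, high)
```

Under these made-up rates, the same amount of stored data costs very different amounts depending on how often it is read and how much is downloaded externally, which is why the usage pattern needs to be known before either platform can be priced.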
-
Welcome @bayerfj
Thanks for opening your first GitHub issue and welcome to the UHC GitHub 😃! @bayerfj just double checking: is this for the same grant as Leslie's issue #3?
My two cents
My intuition says both would work and have similar performance/cost. But given the existing Azure infrastructure and the people already using Azure, I think we should build on that foundation for this project. Also, if there is PHI involved, we would need to get IT support to properly configure the storage to be HIPAA compliant. @bayerfj Do you know if AWS or S3 storage is used by other groups at Dornsife?
-
It’s associated with Leslie’s issue #3. Leslie’s issue references multiple data coordinating grants with similar needs. So this is one of them; I’m just not sure if it’s the same one.
Is there a way I can change this issue to Nora Lee? She’s the one working on the grant and should respond to these questions.
We use Amazon S3 for Gina Lovasi’s data storage. HOWEVER, we don’t use it in the same way Leslie describes in issue #3. Nora confirmed they want to set up something similar to how SALURBAL uses Azure. She used Azure for the budget in the grant. I think that was probably the best way to go.
Jody
Project Manager
Urban Health Collaborative
http://drexel.edu/uhc/
-
As far as I am aware, Drexel's agreement with Microsoft is also already intended to be HIPAA compliant. Though it doesn't sound like PHI is involved in this case.
@rl627 you are using a storage account with blob storage, right? And that's getting served to a web app?
-
I am using AWS for my grant, mostly because it is what is preferred by the UniAndes team that is doing the data processing and analyses. We are mostly using computing instances rather than storage. Getting it set up through Drexel via the NIH STRIDES program was a bit of a hassle. If you do need HIPAA-standard storage, the Azure option would be better, and having the security folks at Drexel check or set that up would be much more helpful than trying to figure it out on AWS without assistance, as that can be very complicated. I think either will be adequate for storage purposes, but if you are running any computing instances for specific tasks, my understanding is that AWS and Google Cloud have better infrastructure for that.
-
Name
Jody Bayer
Title
Project Manager
Department
Urban Health Collaborative
Request type
technical/infrastructure consult for grant submission
Request description
Compare Azure and S3 and understand the advantages and limitations of both for a grant submission. The size of the data and its usage are unknown, but the data will be large. Users will be internal and external to Drexel. Nora Lee will provide more information on the type of data.
Example
No response
Data
No response
Notes
from Nora