Documents of D6.2 namespace
- D6.2:Prototype Implementation
- Related topics:
- D6.2:Virtual Folder and West-Life Portal integration
- D6.2:Virtual Folder and Partner Portal integration
- D6.2:Virtual Folder and PDB Components integration
- D6.2:Virtual Folder and Cloud integration
- D6.2:Virtual Folder and Access to Dataset
- D6.2:Meeting and conferences notes
User interface task details https://github.com/h2020-westlife-eu/wp6-repository/issues/11
Compressed Archive of Data and Metadata
A ZIP/TGZ archive contains the user's acquired RAW data together with some metadata about the device, acquisition, processing software, etc. Instead of building ZIP/TGZ archives in batch, consider streaming the data, so that a zip/tgz is created on demand.
Reasons to stream:
- User can select specific folders/files to be included in the zip stream.
- No need to store extremely large zip files on the server, and no need to wait until a large archive has been built.
- If space on the repository storage needs to be saved, compression can be configured transparently at the filesystem level, for example with BTRFS.
- Streaming is simple: a prototype was made as a CGI script which compresses the user's directory on demand and sends the stream directly to the HTTP session channel.
- Such a stream can be redirected directly to another service, e.g. virtual folder storage.
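The on-demand streaming described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the prototype CGI script itself: the function names `stream_tgz` and `write_cgi_response`, the archive file name `dataset.tgz`, and the header values are all hypothetical. It relies on the standard `tarfile` stream mode `"w|gz"`, which writes sequentially to a non-seekable output, so no archive is stored on the server.

```python
import os
import tarfile


def stream_tgz(base_dir, selected, out):
    """Write a gzip-compressed tar of the selected paths under base_dir to out.

    `out` is any writable binary stream (HTTP response body, pipe, socket
    file), so the same function can feed a browser or another service.
    """
    base = os.path.realpath(base_dir)
    members = []
    for rel in selected:
        full = os.path.realpath(os.path.join(base, rel))
        # Validate every path before sending any bytes: refuse selections
        # that escape the user's directory (e.g. "../other-user").
        if os.path.commonpath([base, full]) != base:
            raise ValueError(f"path outside base directory: {rel}")
        members.append((full, rel))

    # "w|gz" = write a gzip-compressed stream of tar blocks, no seeking.
    with tarfile.open(fileobj=out, mode="w|gz") as tar:
        for full, rel in members:
            tar.add(full, arcname=rel)  # directories are added recursively


def write_cgi_response(base_dir, selected, out):
    """CGI-style response: headers first, then the archive bytes.

    Header values and the file name are assumptions for illustration.
    """
    out.write(
        b"Content-Type: application/gzip\r\n"
        b'Content-Disposition: attachment; filename="dataset.tgz"\r\n'
        b"\r\n"
    )
    out.flush()
    stream_tgz(base_dir, selected, out)
```

In a CGI script, `write_cgi_response(user_dir, ["raw", "metadata"], sys.stdout.buffer)` would send only the chosen folders; passing a different writable stream instead of standard output is one way to redirect the archive to another service.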
Reason to build ZIP/TGZ archives:
- organization of datasets into one entity - tgz/zip archive
Proposal of system components
As identified in the use cases earlier, the repository should have the following key features:
- storage - to store data uploaded by the user, so that it can later be downloaded into the user's space
- catalog - support for generating and publishing metadata (CRISP)
- authentication/authorization - so that appropriate access is granted
Existing solutions of repositories of data tagged by advanced metadata
Some of the open source projects that address the features above:
- distributed network filesystems (NFS, AFS, CVMFS, SSHFS) - address feature 1. (storage)
- Dataverse - addresses 2. and 3.; written mainly in Java EE; has its own proprietary solution for the user database (4.); documentation on integrating with third-party SSO is available
- ICAT project - addresses 2. and partially 3. Used and contributed to by STFC at Diamond Light Source, RAL.
- DSpace - addresses 2., 3. and partly 4.; worth integrating with an external SSO.
- ARIA - authentication service provided by Instruct; addresses 4. Chosen as the exemplar solution for West-Life; see D4.2.
Some exemplar specialized implementations of repositories
- build the first prototype with a mockup of the UI
- decide on the metadata standard - 11th Sep., after the Open Science Fair Conference; candidates: DCAT, CERIF, VOID,...
- Storage - CVMFS - westlife.egi.eu
- Data Catalog - Dataverse
- Publication - Dataverse
- Authentication - ARIA
- User Interface - mockups