D6.2:RepositoryParts

From West-Life

User Interface

User interface task details https://github.com/h2020-westlife-eu/wp6-repository/issues/11

Compressed Archive of Data and Metadata

A ZIP/TGZ archive containing the user's acquired raw data together with some metadata about the device, acquisition, processing software, etc. Instead of building ZIP/TGZ archives in batch, consider streaming the data, so that a zip/tgz is made on demand.

Reasons to stream:

  1. The user can select specific folders/files to be included in the zip stream.
  2. There is no need to store extremely big zip files on the server, and no need to wait until a big archive has been made.
  3. If space on the repository storage needs to be saved, compression can be configured transparently at the filesystem level, for example with Btrfs.
  4. Streaming is simple: a prototype has been made, a CGI script that compresses the user's directory on demand and sends the stream directly to the HTTP session channel.
  5. Such a stream can be redirected directly to another service: virtual folder storage.
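
The on-demand approach can be illustrated with a minimal Python sketch. It is not the prototype CGI script mentioned above; the function name `stream_tgz`, the `metadata.json` member name and the metadata fields are assumptions made for illustration. It writes a gzip-compressed tar stream of the selected folders/files straight to an output channel, so no archive is stored on the server.

```python
import io
import json
import tarfile
from pathlib import Path
from typing import BinaryIO, Iterable


def stream_tgz(root: Path, selected: Iterable[str], metadata: dict, out: BinaryIO) -> None:
    """Write a .tgz stream of the selected paths under `root` to `out`.

    Hypothetical helper: names and layout are illustrative assumptions.
    """
    root = root.resolve()
    # Mode "w|gz" is tarfile's non-seekable stream mode: bytes are pushed to
    # `out` while compressing, and no archive file is created on disk.
    with tarfile.open(fileobj=out, mode="w|gz") as archive:
        # Put the metadata first as a small JSON member (assumed file name).
        payload = json.dumps(metadata, indent=2).encode("utf-8")
        info = tarfile.TarInfo(name="metadata.json")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

        for relative in selected:
            path = (root / relative).resolve()
            # Refuse selections that escape the user's directory, e.g. "../x".
            if path != root and root not in path.parents:
                raise ValueError(f"path outside dataset root: {relative}")
            # Directories are added recursively by default.
            archive.add(path, arcname=str(path.relative_to(root)))
```

In a CGI-style script, `out` could be `sys.stdout.buffer` after the HTTP headers have been written; the same function could equally write to a connection opened towards another service.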

Reason to build a ZIP/TGZ archive:

  1. Organization of datasets into one entity: the tgz/zip archive.


Proposal of system components

As identified in the use cases before, the repository should have the following key features:

  1. storage - to store data uploaded by the user, so that they can be downloaded later into the user's space
  2. catalog - support for generating and publishing metadata (CRISP)
  3. authentication/authorization - so that the appropriate access is given


Existing solutions of repositories of data tagged by advanced metadata

Some of the open source projects that address the features above:

  • distributed network filesystems - NFS, AFS, CVMFS, SSHFS - address feature 1. storage
  • Dataverse - addresses 2. and 3.; written mainly in Java EE; regarding 4., it has a proprietary solution for the user database, and docs on integrating with third-party SSO are available
  • ICAT project - addresses 2. and partially 3. Used and contributed to by STFC at Diamond Light Source at RAL.
  • DSpace - addresses 2., 3. and partly 4.; worth integrating with an external SSO.
    • DSpace
    • example implementation: the repository of theses at Charles University in Prague, dspace.cuni.cz
  • ARIA - authentication service provided by Instruct; addresses 4. Decided to be the exemplar solution of West-Life; see D4.2.

Some exemplar specialized implementations of repositories

Conclusion

  1. Build the first prototype with a mockup of the UI.
  2. Take a decision on the metadata standard by 11th Sep., after the Open Science Fair Conference: DCAT, CERIF, VOID,...


First prototype/mockup

  1. Storage - CVMFS - westlife.egi.eu
  2. Data Catalog - Dataverse
  3. Publication - Dataverse
  4. Authentication - ARIA
  5. User Interface - mockups