Analysis and design documents
The goal of the Virtual Folder (VF), part of WP6 (data management), is to integrate the existing external and internal repositories involved in different stages of the life cycle of structural biology data. Additional goals are use cases related to the processing and analysis of data and to searching and presentation capabilities. Two main roles (actors) contribute to the use cases of the VF, see Figure 1: the Scientist (operates domain-specific tools) and the Scientific IT Admin (operates IT resources). Configuration and Provisioning is mainly covered by WP4 (Operation), which is the domain of the IT Administrator. Task 5.4 (WP5) intends to allow customization of the virtual machine, so a general Scientist can also perform some tasks of the configuration use case.
The details of the Virtual Folder component are shown in the following figure. The HTTP server component provides HTTP-related services: a web application served as HTML pages, a WebDAV module with direct access to the repositories, and a reverse proxy to other web-related services. The repositories directory may contain all mountable local or external repositories that are accessible at the file system level. For example, B2Drop can be mounted using the DavFS component, which mounts any WebDAV interface into the local file system using a FUSE driver.
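To illustrate what direct access to a repository through the WebDAV module could look like from a client, the following minimal Python sketch builds a WebDAV PROPFIND request for one directory and parses a Multi-Status response into a listing. It is not taken from the prototype: the function names, the /webdav/b2drop/ path and the sample response are assumptions made only for this illustration, and the request is built but not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV_NS = {"d": "DAV:"}  # standard WebDAV XML namespace

def build_propfind(url):
    """Build a Depth: 1 PROPFIND request that lists a single collection."""
    body = (
        b'<?xml version="1.0" encoding="utf-8"?>'
        b'<d:propfind xmlns:d="DAV:"><d:prop>'
        b"<d:getcontentlength/><d:resourcetype/>"
        b"</d:prop></d:propfind>"
    )
    return urllib.request.Request(
        url,
        data=body,
        method="PROPFIND",
        headers={"Depth": "1", "Content-Type": "application/xml"},
    )

def parse_multistatus(xml_text):
    """Return (href, is_collection) pairs from a 207 Multi-Status body."""
    root = ET.fromstring(xml_text)
    entries = []
    for response in root.findall("d:response", DAV_NS):
        href = response.findtext("d:href", default="", namespaces=DAV_NS)
        collection = response.find(".//d:resourcetype/d:collection", DAV_NS)
        entries.append((href, collection is not None))
    return entries

# Hypothetical response for a mounted repository; paths are illustrative only.
SAMPLE_RESPONSE = """<d:multistatus xmlns:d="DAV:">
  <d:response>
    <d:href>/webdav/b2drop/</d:href>
    <d:propstat><d:prop>
      <d:resourcetype><d:collection/></d:resourcetype>
    </d:prop></d:propstat>
  </d:response>
  <d:response>
    <d:href>/webdav/b2drop/model.pdb</d:href>
    <d:propstat><d:prop>
      <d:resourcetype/>
      <d:getcontentlength>1024</d:getcontentlength>
    </d:prop></d:propstat>
  </d:response>
</d:multistatus>"""
```

Passing the request to urllib.request.urlopen against a real WebDAV endpoint would return a body of the same shape, which parse_multistatus turns into a list of paths and directory flags.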
High level view
Technical view of prototype
Single deployment scenario
The RESTful services and WebDAV services components provide an API to access the metadata stored in the internal DB. Control scripts trigger control and configuration actions on the repositories. With respect to the further development of Task 5.4 of WP5 (customized virtual machine) and to the Configuration and Provisioning use case (Figure 1), the deployment of the Virtual Folder can be separated into two scenarios:
1. The Single deployment scenario consists of a virtual machine with the Virtual Folder and all other related software and services, see Figure 3.
Cloud deployment scenario
2. The Cloud deployment scenario shares the same components, but they are distributed across multiple computational resources, with the VF integrated mainly via the WebDAV protocol, see Figure 4.
Related packages: WP4_Scipion_Usecase, WP4_Architecture
In the Single deployment scenario, the virtual machine (VM) contains all components. It can be deployed on the user's own computational resources (personal workstation, notebook, or as a VM in an IaaS cloud). The basic interfaces give access to the VM and its features: SSH gives console access, and HTTP gives a rich web application and a programmatic web API to access the functionality, e.g. via the WebDAV protocol. The data storage repositories mounted to the VM can generally use two strategies:
1. File synchronization (FileSynchronization component): a local copy of each file is stored internally and synchronized with the external provider. PROS: it allows file-level integration with existing tools and software. In case of connection failure, the user can work with the local copy, and synchronization is performed after reconnection. CONS: as copies of the files are kept locally, the limit of the internal disk space can be reached; this issue can be partially addressed by selective synchronization.
2. File access on the remote location (FileAccess component). PROS: only the files that are immediately used are downloaded into the local cache. Some storage providers can be mounted transparently into the local file system, e.g. using a FUSE driver. CONS: in case of connection failure, no files are accessible. Providers offer proprietary APIs to access their resources, so a different driver must be implemented or used for each. There can be a performance issue when such a resource is accessed transitively via the WebDAV module.
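The trade-off between the two strategies can be sketched in a few lines of Python. This is only an illustrative model, not the prototype's implementation: RemoteProvider is an in-memory stand-in for an external storage provider, and the method names (ping, read, write, sync) are assumptions. The FileAccess class also assumes that the mount contacts the provider on every access, which is one way to model the statement that no files are accessible during a connection failure.

```python
class ConnectionLost(Exception):
    """Raised by the stand-in provider when the connection is down."""

class RemoteProvider:
    """In-memory stand-in for an external storage provider (hypothetical)."""
    def __init__(self, files):
        self.files = dict(files)
        self.online = True

    def ping(self):
        if not self.online:
            raise ConnectionLost("provider unreachable")

    def read(self, name):
        self.ping()
        return self.files[name]

    def write(self, name, data):
        self.ping()
        self.files[name] = data

class FileSynchronization:
    """Strategy 1: keep local copies of selected files and push changes later."""
    def __init__(self, provider, selected):
        self.provider = provider
        # Selective synchronization: only the chosen files use local disk space.
        self.local = {name: provider.read(name) for name in selected}
        self.dirty = set()

    def read(self, name):
        return self.local[name]  # served locally, works without a connection

    def write(self, name, data):
        self.local[name] = data
        self.dirty.add(name)  # remembered until the next successful sync

    def sync(self):
        for name in sorted(self.dirty):
            self.provider.write(name, self.local[name])
        self.dirty.clear()  # reached only if every push succeeded

class FileAccess:
    """Strategy 2: fetch files on demand and cache only those that were used."""
    def __init__(self, provider):
        self.provider = provider
        self.cache = {}

    def read(self, name):
        # Assumption: every access contacts the provider, so this raises offline.
        self.provider.ping()
        if name not in self.cache:
            self.cache[name] = self.provider.read(name)
        return self.cache[name]
```

With provider.online set to False, FileSynchronization.read and write keep working on the local copy and sync pushes the changes after reconnection, while FileAccess.read raises ConnectionLost even for a file that was fetched earlier.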
Webapp class diagram