D6.1:Analysis



Workflow

[Figure: workflow diagram]

System functionalities

High level view

[Figure: high-level view of system functionalities]

Use cases that relate to the Virtual Folder deliverable:

[Figure: use case diagram]

Virtual Folder

Provides the File Transfer integration style <ref>http://www.enterpriseintegrationpatterns.com/patterns/messaging/IntegrationStylesIntro.html</ref>. It includes an internal file repository and external repositories.

Internal file repository

The internal file repository will provide limited scratch disk space, available either within the virtual machine or via connected external data storage that is mounted through file system capabilities (NFS, ...) and tightly coupled with the virtual machine instance. Installed applications can use the internal file repository to execute processing tasks. Other functionality:

  • provide WebDAV access - e.g. mounted with davfs, http://manpages.ubuntu.com/manpages/wily/man8/mount.davfs.8.html
  • policy of what to synchronize from external repositories
    • B2DROP
      • owncloudclient - lets the user select which folders not to synchronize, so the policy should be configured as the complement (which folders to synchronize); local copies are stored and remain available offline
      • B2DROP can also be mounted via WebDAV - no synchronization takes place, so it cannot be used offline
    • Dropbox
      • another folder can be synchronized
      • does not provide WebDAV and does not have it on its roadmap (as of 2016/04/19)
    • Google cloud, Amazon S3, ...
      • Luna technologies have a prototype integration with these
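
The complement policy mentioned for owncloudclient can be sketched as follows. This is only an illustration, assuming the client is configured with a list of folders to exclude; the function names and folder names are hypothetical and are not part of any ownCloud API.

```python
from pathlib import PurePosixPath


def excluded_for(all_folders, wanted):
    """Turn a 'what to synchronize' policy into an exclusion list.

    A folder is kept if it is a wanted folder, lies beneath one,
    or is an ancestor of one (so the path to it still exists).
    """
    wanted_paths = [PurePosixPath(w) for w in wanted]
    excluded = []
    for name in all_folders:
        path = PurePosixPath(name)
        keep = any(
            path == w or w in path.parents or path in w.parents
            for w in wanted_paths
        )
        if not keep:
            excluded.append(name)
    return excluded


def folders_to_sync(all_folders, excluded):
    """Return the folders that remain once the exclusion list is applied."""
    excluded_paths = [PurePosixPath(e) for e in excluded]
    selected = []
    for name in all_folders:
        path = PurePosixPath(name)
        # Skip a folder if it is excluded itself or lies beneath an excluded one.
        if not any(path == e or e in path.parents for e in excluded_paths):
            selected.append(name)
    return selected


# Hypothetical remote folder listing and synchronization policy.
remote = ["raw", "raw/2015", "raw/2016", "results", "tmp"]
policy = ["raw/2016", "results"]
exclusions = excluded_for(remote, policy)
```

The exclusion list computed this way is what would be handed to the synchronization client, while the policy itself stays expressed as the folders the user actually wants.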

External Repository

The external repository will address the need for long-term preservation of structural biology (SB) raw data and of metadata about processing after the end of the SB project/grant within which the data were generated. This use case will connect to user-specific data storage as well as to Westlife project-specific data storage, which might contain raw or result data for further processing. These points should be considered:

  • B2DROP, Google cloud, Amazon S3, other WebDAV-aware services ...
  • policy of automatically storing raw data and metadata in Westlife-specific data storage

This will be addressed in depth in WP6 deliverable D6.2.
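
As a rough illustration of storing a file in a WebDAV-aware service, the sketch below builds an HTTP PUT request, which is the method WebDAV uses to upload a resource. The storage URL, the remote path and the helper name are assumptions made for this example; authentication and the actual sending of the request are left out.

```python
import urllib.parse
import urllib.request


def build_webdav_put(base_url, remote_path, payload):
    """Build (but do not send) a PUT request that uploads payload to a WebDAV path."""
    # Percent-encode the path so that spaces and similar characters are valid in a URL.
    quoted = urllib.parse.quote(remote_path.lstrip("/"))
    url = base_url.rstrip("/") + "/" + quoted
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


# Hypothetical endpoint and file name, for illustration only.
request = build_webdav_put(
    "https://storage.example.org/webdav/",
    "raw data/image 001.cbf",
    b"example bytes",
)
# Sending it would be urllib.request.urlopen(request), once credentials are configured.
```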

Configuration and Provisioning

The package will cover functionality for feature selection and its configuration. Internal package selection and configuration will depend on other Westlife project deliverables [[1]]. Selection and integration of external (third-party) modules (EUDAT, DROPBOX, ...) and provisioning to a public or private cloud will include:

  • Configuration - modification of provisioning scripts (BASH, CHEF, DOCKER, ...), configuration of the VM system (proxy, size, ...) and integration parameters for external data repositories (EUDAT, DROPBOX, ...)
  • VM provisioning - addressed by WP4
    • vagrant - command line tool for headless provisioning and configuration of VMs https://www.vagrantup.com/docs/
    • provisioning - vagrant can use BASH scripts (plain scripts, but quite hard to configure and prone to spaghetti code for complex deployments) or one of:
      • CHEF - a DSL in the Ruby language; handles complex scenarios, but with complex configuration files https://docs.chef.io/chef_solo.html
      • DOCKER - tools to package software together with its libraries and dependent tools
      • BITNAMI - prepared popular open source stacks such as LAMP, the RUBY stack etc.
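
To make the configuration step more concrete, the sketch below renders a minimal Vagrantfile from a few VM parameters. It is one possible approach under stated assumptions rather than the project's provisioning code: the VirtualBox provider, the box name and the script name bootstrap.sh are chosen only for illustration.

```python
def render_vagrantfile(box, memory_mb, cpus, provision_script):
    """Render a minimal Vagrantfile that provisions the VM with a shell script."""
    return (
        'Vagrant.configure("2") do |config|\n'
        f'  config.vm.box = "{box}"\n'
        # Assumption: VirtualBox is the provider; other providers use other settings.
        '  config.vm.provider "virtualbox" do |vb|\n'
        f'    vb.memory = {memory_mb}\n'
        f'    vb.cpus = {cpus}\n'
        '  end\n'
        f'  config.vm.provision "shell", path: "{provision_script}"\n'
        'end\n'
    )


# Hypothetical VM size and provisioning script.
vagrantfile_text = render_vagrantfile("centos/7", 2048, 2, "bootstrap.sh")
```

Writing the returned text to a file named Vagrantfile and running vagrant up in the same directory would then create and provision the VM headlessly.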

Visualization

Visualization will provide functionality to browse local or remote data files, i.e. the results of structural biology processing, mainly PDB files. There are existing solutions that might be used or integrated into the package:

  • NGL viewer - WebGL-based app to visualize proteins from PDB files and PDB databases - open source; an extension could be made to visualize local files in the VM and files from other PDB databases

Processing and Analytics

This use case supports processing with tools. Support for existing tools from Westlife partners is addressed within WP5. The use case connects data with workflows and automatically records metadata about input datasets, output datasets and details of processing.

  • CCP4 for structural biology - tools to process X-ray diffraction images and produce protein structures.
  • WeNMR, ... are addressed within WP5.
  • a general mathematical/scientific analytics tool to analyse and manipulate data, e.g. in an Excel-like style, integrated into the CCP4 suite
  • metadata - the W3C standard PROV-O (WP7 will devise additional metadata standards related to new domain-specific challenges)
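
The automatic recording of processing metadata could, for example, produce a PROV-O description of each processing run. The sketch below builds such a record as a JSON-LD style dictionary. The PROV-O terms (prov:Activity, prov:Entity, prov:used, prov:wasGeneratedBy, prov:wasAssociatedWith) come from the W3C vocabulary, while the function name and all identifiers are hypothetical.

```python
import json
from datetime import datetime, timezone

PROV_NS = "http://www.w3.org/ns/prov#"


def provenance_record(activity_id, inputs, outputs, started, ended, tool_id):
    """Describe one processing run: what it used, what it generated, and when."""
    graph = [{
        "@id": activity_id,
        "@type": "prov:Activity",
        "prov:startedAtTime": started.isoformat(),
        "prov:endedAtTime": ended.isoformat(),
        "prov:used": [{"@id": item} for item in inputs],
        "prov:wasAssociatedWith": {"@id": tool_id},
    }]
    for item in outputs:
        # Each output dataset is an entity generated by the processing activity.
        graph.append({
            "@id": item,
            "@type": "prov:Entity",
            "prov:wasGeneratedBy": {"@id": activity_id},
        })
    return {"@context": {"prov": PROV_NS}, "@graph": graph}


# Hypothetical identifiers for a single processing run.
record = provenance_record(
    "urn:example:run-1",
    ["urn:example:diffraction-images"],
    ["urn:example:structure.pdb"],
    datetime(2016, 4, 19, 7, 21, tzinfo=timezone.utc),
    datetime(2016, 4, 19, 8, 0, tzinfo=timezone.utc),
    "urn:example:ccp4",
)
serialized = json.dumps(record, indent=2)
```

Storing the serialized record next to the output files keeps the link between input datasets, processing and results available after the project ends.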