D2.4 Software prototype v1
Contents
1. [Figure 19: Integration design diagram for the Rosetta-PROBADO3D Connector component. Recoverable labels: Type: RESTful API, Standalone Service; a RESTful API taking the respective session ID; index actions and persisting in database; trigger after SIP generation with session ID; triggers indexing of metadata.] 3.2.5 PROBADO3D. The PROBADO framework allows integration of content-based indexing and retrieval methods for non-textual documents. The PROBADO3D architecture follows a three-layer approach, which consists of a repository layer, a core system layer and a presentation layer. Distributed local repositories implement document-type-specific indexing and accessing techniques, including rich metadata models. The PROBADO3D core layer keeps track of all document repositories registered in the system. It maintains an integrated index of all documents. The presentation layer offers rich user access methods, including graphical query specification and document visualization. PROBADO defines a system protocol based on web service technology. It allows dispatching content-based and metadata-based user queries to local repositories, which manage the primary documents. Synchronization [Page footer: DURAARK, Durable Architectural Knowledge; FP7 ICT Digital Preservation; Grant agreement No. 600908; D2.4]
2. Figure 10: List of SDO endpoints and search mask. 2.1.4 Workflow: Geometric Enrichment. The start page for the geometric enrichment (see Figure 11) shows a list of available point cloud tools. In this version, the registration prototype from the D4.1 software deliverable is available and can be selected. This software is a standalone desktop application which needs to be installed before first use. If the software is not installed yet, the stakeholder is provided with a download link and installation instructions. After a successful installation, a click on the icon opens the Session Page (see 2.1.1.1). Here the stakeholder can select one of the existing sessions or create a new session. A click on the Start button of an existing session starts the download of the IFC/E57 files denoted in the session. After a successful download, the registration prototype opens with the downloaded files as input. Figure 12 shows the resulting GUI with the two selected files loaded. At this point the reader is referred to D4.1, Appendix A, for a description of the usage of the registration software prototype. After the registration process is finished, the resulting mapping RDF file has to be stored on the local hard drive. This file serves as input file for the SIP Generation workflow describ
3. 7384dee45c5e }, { id: 1, label: Empire State Building, options: { demo_mode: true }, files: [ { id: 0, path: fixtures/repository/CCO_DTU Building127_Arch_CONF.ifc, name: CCO_DTU Building127_Arch_CONF.ifc, type: ifc, size: 10.74 MB }, { id: 1, path: fixtures/repository/CCO_DTU Building127_Arch_CONF.e57, name: CCO_DTU Building127_Arch_CONF.e57, type: e57, size: 535.30 MB } ], uuid: 0ffe055e-1360-47d4-a16c-026880c9eba5 } ] Listing 1: Example response listing all available sessions. Query: http://workbench.duraark.eu/services/session/0 Example response: { id: 0, label: CCO_DTU Building127, files: [ { id: 0, path: fixtures/repository/CCO_DTU Building127_Arch_CONF.ifc, name: CCO_DTU Building127_Arch_CONF.ifc, type: ifc, size: 10.74 MB }, { id: 1, path: fixtures/repository/CCO_DTU Building127_Arch_CONF.e57, name: CCO_DTU Building127_Arch_CONF.e57, type: e57, size: 535.30 MB } ], uuid: 0ffe055e-1360-47d4-a16c-026880c9eba5 } Listing 2: Example response listing session with ID 0. 1.2 File Identification API. Description: Queries the DROID file identification component to identify the E57 file of a given session. The example response sho
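The session endpoint quoted in this excerpt returns JSON, so it can be consumed from any language with a JSON parser. The following is a minimal sketch in Python, not the project's own client code. It assumes the URL layout shown in the query above and the field names visible in Listing 2 (id, label, files, name, type, size); the file names `building.ifc` and `building.e57` and the helper names `session_url` and `files_by_type` are hypothetical. The response is parsed from an inline sample instead of being fetched over the network.

```python
import json

# Reduced sample modelled on Listing 2. The file names are hypothetical
# placeholders, not the names used in the deliverable.
SAMPLE_RESPONSE = """
{
  "id": 0,
  "label": "CCO_DTU Building127",
  "files": [
    {"id": 0, "name": "building.ifc", "type": "ifc", "size": "10.74 MB"},
    {"id": 1, "name": "building.e57", "type": "e57", "size": "535.30 MB"}
  ]
}
"""


def session_url(session_id, base="http://workbench.duraark.eu/services"):
    """Build the session query URL, following the pattern of the example query."""
    return f"{base}/session/{session_id}"


def files_by_type(session):
    """Map each file type in a session record (e.g. 'ifc', 'e57') to its file name."""
    return {entry["type"]: entry["name"] for entry in session["files"]}


session = json.loads(SAMPLE_RESPONSE)
print(session_url(session["id"]))
print(files_by_type(session))
```

In a real client the sample string would be replaced by the body of an HTTP GET on `session_url(0)`; whether the service requires authentication or further headers is not stated in the excerpt.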
4. coordinate_metadata: undefined, scan_count: 1, image_count: 1, scan_size: 1, image_size: 1, scans: [ { name: parking000, guid: F0B3C105-325B-4FC9-9E01-3130153F9800, original_guids: [], description: "", sensor_vendor: "", sensor_model: "", sensor_serial_number: "", sensor_hardware_version: "", sensor_software_version: "", sensor_firmware_version: "", temperature: 0, relative_humidity: 3.4028234663852886e+038, atmospheric_pressure: 3.4028234663852886e+038, acquisition_start: { year: 1980, month: 1, day: 6, hour: 0, minute: 0, seconds: 0 }, acquisition_end: { year: 1980, month: 1, day: 6, hour: 0, minute: 0, seconds: 0 }, pose: { rotation: { w: 0.99996960774189081, x: 0.0074585516927261801, y: 0.0022701539983015365, z: 0 }, translation: { x: 89.951072690000004, y: 1.8420018, z: 2e-008 } }, index_bounds: { row_minimum: 0, row_maximum: 3470, col_minimum: 0, col_maximum: 8213, return_minimum: 0, return_maximum: 0 }, cartesian_bounds: { x_minimum: 68.432470999999993, x_maximum: 57.134830999999998, y_minimum: 59.8972309
5. index_field: false, return_count_field: false, return_maximum: 0, time_stamp_field: false, is_time_stamp_invalid_field: false, time_maximum: 0, intensity_field: true, is_intensity_invalid_field: false, intensity_scaled_integer: 3.0518509475997192e-005, color_red_field: true, color_green_field: true, color_blue_field: true, is_color_invalid_field: false }, points_size: 27802731 } ], images: [ { name: parking000, guid: 76BD148C-D22A-4FE3-8CB2-0FB01F96698B, description: "", representation: spherical, acquisition_datetime: { year: 1980, month: 1, day: 6, hour: 0, minute: 0, seconds: 0 }, associated_data3D_guid: F0B3C105-325B-4FC9-9E01-3130153F9800, sensor_vendor: "", sensor_model: "", sensor_serial_number: "", pose: { rotation: { w: 0.70283815264201144, x: 0.077691052038131911, y: 0.088163333920523737, z: 0.70157669443617587 }, translation: { x: 89.951072690000004, y: 1.8420018, z: 2e-008 } }, visual_ref_representation: { jpeg_image_size: 0, png_image_size: 0, image_mask_size: 0, image_width: 0, image_height: 0 }, pinhole_representation: { jpeg_image_size: 0, png_image_size: 0, image_mask_size: 0, image_width
6. [Figure 4: Session Page, showing a session name input field and the Current Sessions list with ID, Label and Files: 0, CCO_DTU Building127, 2; 1, Empire State Building, 2.] 2.1.1.1 Session Page. SIP generation is organized in so-called sessions. The stakeholder creates a new session via the New Session button after entering a name for it. The session is added to the list at the bottom of the page and can be started via the Start button or deleted via the red cross button. The purpose of a session is (a) to start a session, work on it and resume it at a later point, and (b) to allow collaborative work on a session, e.g. Stakeholder A starts a session and provides input files, and Stakeholder B works on the corresponding metadata record. To allow easy testing of the application, one predefined session is provided that already contains input files. When the predefined session is selected, the page described next in this user manual (the file upload) is skipped. Creating a new session and starting it opens the File Upload Page. Related content: Appendix 1.1 shows the API. 2.1.1.2 File Upload Page. This page allows the stakeholder to upload files relating to the same building(s) into the session. An IFC file and/or an E57 file have to be uploaded. If both file type
7. A click on a metadata entry allows the entry to be changed. It is also possible to add new entries from a list. [Figure 7: Metadata Extraction Page, showing the edit dialog ("Please enter the new value") and the IFC metadata key/value table with entries such as archiver_organization_phone, archiver_organization_name, creator_software_name, creator_mail, creator_phone, creator_organization_name, enrichment_vocabulary and creator.] Footnote 2: In this version of the application the definition of mandatory metadata is not finalized yet and will change in future versions. 2.1.1.5 Semantic Enrichment Page. This component uses metadata extracted from the ingested IFC file to search for additional information with which the session will be enriched. The search is able to incorporate different sources from the available metadata, e.g. city names. This version of the Workbench takes the postal address in the metadata as the search criterion. The page shows a list of the related linked open data (LOD) sets and is stored within th
8. control, automatic update mechanisms, access control, etc. On the backend, another module management component keeps track of all the registered web services. The UI modules communicate with a web service via a RESTful API. The API handles a request from a UI module and delegates it to the web service implementation, which in turn delivers the requested data back. The architecture allows for two communication scenarios. In the first, a UI module communicates directly with a web service, which is the case when the web service needs to be configured by the user, e.g. by entering metadata that the service then processes. The second scenario covers direct communication between two web services. This is the case if, for instance, the service responsible for generating a SIP package asks the file identification service to verify that a file has the correct type before creating the package. In both cases the defined RESTful API is the enabler for this kind of application-to-application communication. A stakeholder interacts with the frontend part of the framework. She does not have to know anything about the web services that do the actual data processing. Also, the web services, as the name suggests, can be distributed over the network; it is not necessary for them to reside within a single server context. For instance, the services
9. Section 3.2.5 gives a deeper look into the PROBADO3D component used. Appendix 1.7 and Appendix 1.8 show the API. 2.1.3 Workflow: Semantic Archive Maintenance. This workflow includes two tools, which are selectable via the Semantic Archive Maintenance start page: (1) SDO Information and (2) the Dataset Crawler Module, which is described in D3.3; please refer to that document for a user manual. The SDO Information tool allows looking up information that is stored in the Semantic Digital Observatory (SDO). The SDO component discovers and retrieves suitable architecture-relevant datasets by crawling linked open data sources and provides structured metadata on those datasets. The Dataset Crawler Module is part of the SDO and performs the actual crawling of data. A detailed explanation of both can be found in D3.3; here, the GUI integrated in the WorkbenchUI is described. Figure 10 shows the SDO Information page. The stakeholder is provided with a list of data sources which are used for crawling linked open data. A name, description, URL and last crawl date are displayed for all the endpoints. The Search Topic box allows searching for specific data sets in all of the listed endpoints; after clicking the Search button, a list with the results is displayed. Related content: Report D
10. methods allow the repositories to inform the core system about availability and updates of hosted contents. PROBADO3D is used as the interface for browsing and searching the generated SIP items. This can be done either by using the PROBADO3D web pages or the various web interfaces provided by the PROBADO3D service. PROBADO3D is especially tailored to the needs of the architectural domain and establishes a search & retrieval infrastructure (e.g. indexing, 3D PDF preview generation, etc.) which can be easily utilized for the various DURAARK needs. [Figure 20: Integration design diagram for the PROBADO3D Search & Retrieval component. The WorkbenchUI Search & Retrieval Page (type: frontend-integrated) gives the stakeholder a filterable listing of the generated SIPs and sends data requests to the PROBADO3D component (type: RESTful API, standalone service, with database access), which offers a RESTful API for metadata listing and searching and sends the result back.] 3.2.6 Geometric Enrichment. The components developed in WP4 and WP5 are implemented as standalone desktop tools which do not directly connect to remote services but instead process files residing on the user's client computer. Their main purpose is the enrichment of datasets before the actual ingest takes
11. carried out in a so-called page, following web application terminology. Figure 3 shows the general structure of a page. At the top there is a Next/Previous button bar that moves the stakeholder from one page to the next or previous one. Below it, the page title and a description of the current workflow step are given. The bottom-most section contains the interactive part of the page and/or displays data. The usage of those parts is the focus of this user manual. Workflows 3 and 4 are special, as they provide a selection of tools before starting the user interaction. Depending on the task, it is possible that only a single page contributes to a workflow. The remainder of this section goes through the four workflows and describes each contributing page. When applicable, the component connected to the GUI page is mentioned, so that interested readers can get a more technical description of the component in Section 3.2 or dive into the description of the corresponding application programming interface (API) in Appendix 1. Screenshot text (File Identification page, with Previous/Next buttons): File Identification: Identification for uploaded files. This component provides feedback of the DROID file identification tool that is triggered for the uploaded files. Status: At present, file identification is supported for E57 files only. Profile patterns for IFC should be ready by the end of the summer 2014. Identified File
12. environment is to a large degree the same on different platforms, e.g. Microsoft Windows, Linux and MacOS, but also on the very popular mobile platforms (Android, iOS, Windows Mobile, etc.). This has the considerable advantage that an application developed with a web technology stack is automatically usable on the most popular desktop and mobile platforms without the need to change the application code. Developing against a browser environment has restrictions that are relevant for the DURAARK context. The data sets stakeholders work with can be huge. The web-based Workbench runs on a remote server, and it is necessary to transmit the files from the local hard disk to the remote server, where the different services have access to them. Even with a reasonably decent network connection, uploading a file that is hundreds of megabytes in size takes multiple minutes or even hours. NodeJS as runtime environment for web services: The web services developed in DURAARK are contained within a NodeJS environment. Their purpose is to wrap standalone executables or other web services developed in the project and provide a RESTful API for accessing their functionality to a GUI layer or other services (e.g. application-to-application communication). The wrapper layer is rather thin. It takes care of starting an executable or web service and processing its output so that it is consumable by a client. NodeJS is a reasonable choice f
13. format so that it may be further processed by other components or ingested into the archive alongside the E57 data files. Footnote 8: DROID profiling tool, http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/ [Figure 16: Integration design diagram for the Metadata Extraction component. The WorkbenchUI Metadata Extraction Page (type: frontend-integrated, no stakeholder interaction) provides the input file to the E57 Metadata Extraction component (type: RESTful API, standalone executable on the backend), a RESTful API taking the respective session ID; the executable extracts the metadata and the metadata record is sent back.] 3.2.3 SIP Generator. A digital archive must have features and methods to receive and manage digital content. Wherever possible this should be done in an automatic process, which means that digital objects should be delivered in a structured and standardized way. To achieve this, software is developed within the DURAARK project that generates a Submission Info
14. place. For the first DURAARK system prototype we have focused on the integration of the registration prototype (D4.1) for demonstrating the workflow using standalone desktop tools; other software prototypes of WP4 and WP5 will work in a similar manner. Figure 21 shows an overview of the registration component's input/output specification. The envisioned workflow for using the registration component is as follows. During the preparation of the ingest of multiple new datasets of the same building using the WorkbenchUI (for instance, multiple scans taken at different points in time, or a point cloud and a corresponding BIM model), the user has the opportunity to select a pair of datasets which shall be registered, i.e. spatially aligned to each other. It is assumed that the datasets are available as local files during the ingest process. After selection of the datasets, the registration prototype (i.e. the software prototype executable) may be started from within the WorkbenchUI; it is automatically provided with the file paths of the selected files as command line arguments. These paths are used to initialize the file chooser which is presented to the user by the executable. At this point the reader is referred to D4.1, Appendix A, for a description of the usage of th
15. possible to customize the workflows to fit the various needs of different stakeholders. This flexibility will support the acceptance of the Workbench as a service platform. Developing the web services separately from the user interface forces the development to focus on how a service is exposed to the external world through a reasonable API. Moving forward with the corresponding GUI at the same time tests the API and allows it to be enhanced on the go. The result is a stable, well-tested interface to long-term archival services, together with a GUI that shows how to use those services correctly. This is the first version of the software prototype, and the internal structure as well as the GUI will be incrementally improved and adapted to the needs defined by the evaluation activities in WP7. The general architecture and design decisions, however, proved to be suited to the purpose of this deliverable, namely to provide an integrated platform with workflows to perform long-term archival use cases in the context of BIM data. Appendices. 1 Service Endpoints: RESTful API. Description: In this section the RESTful API endpoints are listed. Examples for accessing the API are given, as well as the corresponding JSON responses. Internally the Workbench is usi
16. related to the SDA (see D3.3) and the PROBADO3D service are running on different servers than the rest of the components, which are located on a single server in the current setup. 3.1.2 Frontend: User Interface (UI) Modules. A User Interface (UI) Module is a visual page within the web browser that (a) displays data and (b) allows for interaction with the user. The technology stack consists of HTML and CSS for the visual representation of data and JavaScript for the user interaction logic. The DURAARK framework uses an existing JavaScript library that is tailored for presenting data in a web browser and for manipulating this data. The library is called Backbone.Marionette. Backbone.Marionette provides the basic tools for a structured development of user interface logic, is actively developed and has a broad and active community. Footnote: MarionetteJS, http://marionettejs.com 3.1.3 Backend: Web Services. This part is the backend of the framework, running on a server. It provides the developer with base classes that cover common functionality used when developing web services for the DURAARK project. The base classes allow creating a RESTful API around standalone executables that take a file as input and produce (a) an output file or (b) console output for further p
17. 2.1.1.4 Metadata Extraction Page. On this page the session is searched for an IFC and an E57 file. If one or both of them are found, the metadata for the files is extracted in a background process. The extracted metadata is listed and can be changed by the stakeholder by clicking on the respective cell. If one or more mandatory metadata entries are not present in the IFC file, the application automatically adds those entries and colors them red, so that the stakeholder gets a visual hint on which mandatory entries are still missing. Be aware that no validation of entered metadata takes place at the moment. After changes are made, the Save button that appears has to be clicked to persist the changes. The resulting metadata entries are stored and will be added to the final SIP file in the form of an RDF Turtle file. Related content: Section 3.2.2 gives a deeper look into the E57 metadata extractor component used. The IFC metadata extractor is described in D3.3. Appendix 1.4 and Appendix 1.3 show the API. [Screenshot of the Metadata Management page with Previous/Next buttons. Page text: "Metadata extraction and manipulation from IFC/E57 files. If an IFC/E57 file is present in the session, the metadata of those files is extracted in a background process and is displayed to the user."]
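The handling of mandatory entries described for the Metadata Extraction Page can be illustrated with a short sketch. It is not the Workbench implementation: the function name `add_missing_mandatory` is hypothetical, the list of mandatory keys is a placeholder (the deliverable states that the definition of mandatory metadata is not finalized), and treating an entry with an empty value as missing is an assumption. The returned list of missing keys is what a user interface could use to color entries red.

```python
def add_missing_mandatory(metadata, mandatory_keys):
    """Return (completed_metadata, missing_keys) without modifying the input.

    A mandatory key counts as missing when it is absent or has an empty
    value (assumption). Absent keys are added with an empty string so that
    they show up as editable entries.
    """
    completed = dict(metadata)
    missing = []
    for key in mandatory_keys:
        if not completed.get(key):
            completed.setdefault(key, "")
            missing.append(key)
    return completed, missing


# Hypothetical mandatory set; key names follow the entries visible in Figure 7.
MANDATORY = ["creator", "creator_mail", "creator_phone"]

extracted = {"creator": "Morten Jensen", "creator_mail": ""}
completed, missing = add_missing_mandatory(extracted, MANDATORY)
print(completed)
print(missing)
```

Returning a copy keeps the extracted record unchanged until the stakeholder explicitly saves, which matches the Save step described above.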
18. size of the session files, this process takes up to a few minutes. After a successful creation of the archive, the SIP can be downloaded via the Download SIP button that appears. Hidden from the user, the generated SIP file is passed to the PROBADO3D Rosetta Connector (see Section 3.2.4), which creates an entry in the PROBADO3D component's internal database (see Section 3.2.5) to allow the stakeholder to search the metadata of generated SIPs later on (see 2.1.2 for the workflow description). This page finishes the SIP Generation workflow and yields a SIP file that is ready for uploading to the digital preservation system. The actual upload is targeted for future versions of the Workbench. The SIP package will then target the commercial Rosetta DPS. Related content: Section 3.2.3 gives a deeper look into the SIP Generation component used. Appendix 1.6 shows the API. Section 3.2.4 explains the PROBADO3D Rosetta Connector component, Section 3.2.5 the PROBADO3D component. Footnote 3: 7zip download URL: http://www.7-zip.org/download.html Footnote 4: Despite the name, the component does not yet derive its input data from the Rosetta system but works directly on the SIP file. This will change in future versions of the Workbench. [Screenshot header: Home, Generate SIP.]
19. standardized way. As the frontend and backend logic in the project is JavaScript, JSON is a natural choice for exchanging data between the web services and the UI modules. Every JavaScript implementation includes tools for parsing and reading JSON out of the box, making the format very easy to use. When other programming languages are used to access the DURAARK web services, tools are available to handle JSON in those languages. 4.2 Risk assessment. This section summarizes the Impact section by listing the discussed technical risks, their consequences and treatment actions. Risk Description: The development of web-technology-based applications loses momentum, resulting in an unsupported development stack. Risk Assessment: Impact: High. Probability: Low. Description: Currently the web browser and the corresponding web technology stack are gaining much attention in application development, mostly because of the advantage of platform independence in the context of mobile development. The probability that the web technology stack is abandoned in the future is rather low. Contingency Solution: WP2 is closely following the developments of web technologies. If the momentum gets lost, the endorsed technology will be evaluated and a plan for porting the existing software will be made. Because of the modular design of the DURAARK framework, a change to existing and well-established technology stacks (e.g. Qt/C++, XAML/C#, Swing/Java) would be possible t
20. 0, image_height: 0, focal_length: 0, pixel_width: 0, pixel_height: 0, principal_point_x: 0, principal_point_y: 0 }, spherical_representation: { jpeg_image_size: 0, png_image_size: 23551883, image_mask_size: 0, image_width: 8187, image_height: 3471, pixel_width: 0.00076745772193576565, pixel_height: 0.0007666681778157584 }, cylindrical_representation: { jpeg_image_size: 0, png_image_size: 0, image_mask_size: 0, image_width: 0, image_height: 0, pixel_width: 0, pixel_height: 0, radius: 0, principal_point_y: 0 } } ] } Listing 5: Example response showing the metadata of the E57 file in session 0. 1.5 Semantic Enrichment API. Description: Triggers the semantic enrichment component and lists the data sets it finds, based on the metadata found in the IFC file and possibly edited by the stakeholder. The example response shows an excerpt of the list of data sets found for session 0. Example query and response. Query: http://workbench.duraark.eu/services/semanticenrichment/0 Example response: [ { dataset_id: 28, dataset_name: enipedia, resource_id: 184805, resource_uri: http://enipedia.tudelft.nl/data/EU-ETS/person/S%F8ren%20Holm, property_uri: http://e
21. 3.3 explains the SDO and the Dataset Crawler Module. [Screenshot of the Semantic Digital Observatory page. Page text: "Interface to the Semantic Digital Observatory content. Allows the lookup of SDO information." Search Mask: "Enter a topic of top specific resources and datasets. The topic has to be a DBpedia category: only the category name and not the full category URI, i.e. Architecture and not http://dbpedia.org/page/Category:Architecture." Search topic field and Search button. Endpoint table with columns Name, Description, Uri, Crawl date. Row 1: VU University Amsterdam; "The Amsterdam Museum dataset describes more than 70 000 cultural heritage objects related to the city of Amsterdam described by the museum. The metadata was retrieved from an XML Web API of the museum's Adlib collection database and converted to RDF compliant with the Europeana Data Model (EDM). This makes the Amsterdam Museum data the first of its kind to be officially converted and made available in this format."; http://semanticweb.cs.vu.nl/europeana/sparql; 2014-07-08T12:26:28.000Z. Row 2: transport data.gov.uk; "Transport-related linked data from data.gov.uk. Namespace for roads. Namespace for stations. Namespace for airports. Road traffic statistics SCOVO"; http://services.data.gov.uk/transport/sparq; 2014-07-06T20:20:34.000Z.]
22. 99999998, y_maximum: 70.512130999999997, z_minimum: 2.0202709999999997, z_maximum: 3.779801 }, spherical_bounds: { range_minimum: 1.6562939999999999, range_maximum: 90.929899999999989, elevation_minimum: 1.0909121353667537, elevation_maximum: 1.5701933463079427, azimuth_minimum: 0, azimuth_maximum: 6.4112263142845904e-007 }, intensity_limits: { intensity_minimum: 0, intensity_maximum: 1 }, color_limits: { color_red_minimum: 0, color_red_maximum: 255, color_green_minimum: 0, color_green_maximum: 255, color_blue_minimum: 0, color_blue_maximum: 255 }, point_fields: { cartesian_x_field: true, cartesian_y_field: true, cartesian_z_field: true, cartesian_invalid_state_field: true, spherical_range_field: false, spherical_azimuth_field: false, spherical_elevation_field: false, spherical_invalid_state_field: false, point_range_minimum: 268.43545599999999, point_range_maximum: 268.43545499999999, point_range_scaled_integer: 9.9999999999999995e-007, angle_minimum: 0, angle_maximum: 0, angle_scaled_integer: 0, row_index_field: true, row_index_maximum: 4095, column_index_field: true, column_index_maximum: 16383, return
23. [Figure 15: Integration design diagram for the File Identification component.] 3.2.2 E57 Metadata Extraction. The E57 metadata extractor is a shared library written in C++ which uses libE57 at its core to parse E57 point cloud files and extract meta information such as the number of scans, number of points, acquisition date, dimensions of embedded images, etc. In addition to the library, a command line tool is provided which exposes the library's functionality. This command line tool may be used as a stand-alone component for metadata extraction by calling the executable from another process, without having to link the library code directly into another component. When the tool is executed with the help argument, a concise usage guide is printed. Otherwise the tool must be given at least an input E57 file using the input parameter. The output may either be written to a file, using the output parameter to specify the output file path, or, if no output parameter is given, written to standard output for piping to another process. The desired output format may be specified using the format parameter, which can have either json or xml as its value. JSON is the default if no format is specified. The extracted metadata is output in a structured hierarchical
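Calling the command line tool from another process, as described in this excerpt, could look roughly like the sketch below. It rests on assumptions: the executable name `e57_metadata` and the `--input`, `--output` and `--format` spelling of the flags are hypothetical, since the excerpt names only the parameters (input, output, format) and the allowed format values (json, xml). The helper names are likewise illustrative.

```python
import json
import subprocess

ALLOWED_FORMATS = ("json", "xml")


def build_extractor_command(input_path, output_path=None, fmt=None,
                            executable="e57_metadata"):
    """Assemble the argument list for the (hypothetically named) extractor.

    Omitting output_path leaves the result on standard output; omitting
    fmt relies on the tool's documented default, JSON.
    """
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    command = [executable, "--input", input_path]
    if output_path is not None:
        command += ["--output", output_path]
    if fmt is not None:
        command += ["--format", fmt]
    return command


def extract_metadata(input_path, executable="e57_metadata"):
    """Run the tool without an output path and parse the JSON it prints."""
    command = build_extractor_command(input_path, executable=executable)
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


print(build_extractor_command("scan.e57", output_path="scan.xml", fmt="xml"))
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces, and `check=True` turns a non-zero exit code of the tool into an exception instead of a silent parse failure.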
24. DURAARK: Durable Architectural Knowledge. D2.4 Software prototype v1. DURAARK, FP7 ICT Digital Preservation, Grant agreement No. 600908. Date: 2014-07-31. Version: 1.0. Document id: duraark/2014/D.2.4/v1.0. Grant agreement number: 600908. Project acronym: DURAARK. Project full title: Durable Architectural Knowledge. Project's website: www.duraark.eu. Partners: LUH, Gottfried Wilhelm Leibniz Universitaet Hannover (Coordinator) [DE]; UBO, Rheinische Friedrich-Wilhelms-Universitaet Bonn [DE]; FhA, Fraunhofer Austria Research GmbH [AT]; TUE, Technische Universiteit Eindhoven [NL]; CITA, Kunstakademiets Arkitektskole [DK]; LTU, Lulea Tekniska Universitet [SE]; Catenda, Catenda AS [NO]. Project instrument: EU FP7 Collaborative Project. Project thematic priority: Information and Communication Technologies (ICT), Digital Preservation. Project start date: 2013-02-01. Project duration: 36 months. Document number: duraark/2014/D.2.4. Title of document: Software prototype v1. Deliverable type: Software prototype. Contractual date of delivery: 2014-07-31. Actual date of delivery: 2014-07-31. Lead beneficiary: Fraunhofer Austria (FhA). Author(s): Martin Hecher <martin.hecher@fraunhofer.at> (FhA), Dag Fjeld Edvardsen <dag.fjeld.edvardsen@catenda.no> (Catenda), Sebastian Ochman
25. [Screenshot text, Figure 9] SIP Generation: Overview of the SIP file content. In this component all uploaded/ingested files, along with the metadata, are structured and packaged in accordance with the implementation specifics of the DPS system. Status: At present the SIP package is structured for the commercial Rosetta system. This includes the structuring of all digital objects and the records in the METS file in accordance with the specification of the vendor. Content Overview (File name, File type, File size): CCO_DTU Building127_Arch_CONF ifc, ifc, 10.74 MB; CCO_DTU Building127_Arch_CONF e57, e57, 535.30 MB. Figure 9: SIP Generation Page. 2.1.2 Workflow: Search & Retrieve. This workflow lets the stakeholder search the metadata that was ingested into the PROBADO3D database via the SIP generation workflow. PROBADO3D is a content-based indexing and retrieval service for non-textual documents, e.g. for BIM-related metadata. Keep in mind that the generated SIP itself is not persisted at the moment, only the metadata is; persisting the SIP is the task of the DPS, which will be integrated in future versions. The page starts with a listing of all generated SIP creation events and allows the stakeholder to inspect the corresponding metadata. The Search field provides a filtering method: the stakeholder enters a search term, resulting in a full-text query over all metadata entries, and the matching SIP creation events are listed. Related content
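The filtering behaviour of the Search field can be pictured with a small sketch. It is an illustration only, not the PROBADO3D implementation: the event structure and field names are hypothetical, and a plain case-insensitive substring match stands in for whatever full-text index the service actually uses.

```python
def search_events(events, term):
    """Return the events whose metadata contains `term` (case-insensitive)."""
    needle = term.casefold()
    return [
        event for event in events
        if any(needle in str(value).casefold()
               for value in event["metadata"].values())
    ]


# Hypothetical SIP creation events; the field names are illustrative only.
events = [
    {"session": 0, "metadata": {"creator": "Morten Jensen", "address": "Lyngby"}},
    {"session": 1, "metadata": {"creator": "Jane Doe", "address": "Bonn"}},
]
matches = search_events(events, "lyngby")
```

Here `matches` contains only the event of session 0, because the term is compared against every metadata value rather than a single field.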
26. 3.1.1 Overall Architecture. Figure 13: Integration design diagram for the DURAARK framework. The overall architecture is directly derived from the web application pattern described in Section 3.1. On the frontend side, so-called User Interface (UI) Modules are responsible for displaying data and interacting with the user. On the backend side, Web Services process data and deliver it in a consumable form to the UI modules. The web service layer of the DURAARK framework provides a RESTful API for communication between service and UI module. The actual implementation of the web service has to be provided by the developer. This decoupled approach makes it easy to exchange the implementation of a web service with another or an updated one without having to change (a) the code in the consuming UI module and (b) the API code of the web service. Figure 13 shows the overall architecture. The framework holds a list of UI modules which can be registered to the system, allowing a central module management with version
27. arts developed in the project which are accessible as web services. The DURAARK framework provides the infrastructure to connect the graphical user interface of the integrated prototype (the WorkbenchUI) with the web services via a RESTful API. The tools for the geometric enrichment workflow, currently containing the D4.1 software deliverable application, are the second type of components. Those are graphical standalone applications which cannot reasonably be transferred to a web service implementation at the moment, as they require graphical user interaction that is not easily done via a UI module because of the web browser runtime environment. For this reason the DURAARK framework provides the possibility to start up the tools via a UI module (see 2.1.4). The stakeholder uses the application and produces a result, which in turn is handled by a corresponding workflow in the WorkbenchUI. In the current version the D4.1 'Documenting the Changing State of Built Architecture' application (in short: registration prototype) produces a geometric mapping file between IFC/E57 files. The IFC/E57 input files are determined by the stakeholder via the WorkbenchUI, meaning that DURAARK's web services and the developed desktop applications work on the very same files, and the produced output file is used as an input file in the SIP Generation workflow (see Section 2.1.1). The approach of separating a service from its user interface is a powerful me
28. can be added to a SIP file within the SIP Generation workflow. Covered use case: UC9. Semantic Archive Maintenance: The maintenance of the SDA component is managed by this workflow. A stakeholder is provided with graphical user interfaces for the content of the SDO and SDA sub-components, which include crawling, profiling and archiving evolving temporal states of the Linked Datasets used in the Long Term Preservation scenarios covered by DURAARK. Covered use case: UC3. [Screenshot text, Figure 2] DURAARK Workbench: Welcome to the DURAARK Workbench. The DURAARK Framework provides a set of workflows supporting a stakeholder with long-time archival tasks in the BIM domain. See this document for a user manual and a technical behind-the-scenes description. Enjoy! Available Workflows: Geometric Enrichment, SIP Generation, SDA Maintenance, Search & Retrieval. Figure 2: A screenshot of the integrated Workbench software prototype for selecting a workflow. The main part of this document is dedicated to the description of these workflows in Section 2.1. The remainder of this report is structured in the following way: Section 2 describes the DURAARK Workbench, including the workflow description in the form of a user manual. Section 3 sheds light on th
29. chanism to enable new and existing applications, possibly written in programming languages other than Javascript, to integrate the DURAARK functionality, as a RESTful API is client agnostic. This way the GUI is completely independent of the service implementation. Figure 14 shows how the DURAARK framework connects the UI modules of the WorkbenchUI with the web service and desktop application components. This section gives an overview of the components that are developed within the DURAARK project and which are not already described in report D3.3. D3.3 covers the Semantic Digital Archive (SDA), the Semantic Digital Observatory (SDO) including the Dataset Crawler Module mentioned in the user manual (Section 2.1), and the component for the semantic enrichment of an IFC file. This report (D2.4) gives an overview of the technical implementation of the following components: File Identification; E57 Metadata Extraction; SIP Generator; Rosetta PROBADO3D Connector; PROBADO3D; Geometric Enrichment tools. [Figure 14 diagram labels: Stakeholder A, Stakeholder B; DURAARK Framework; Workbench: File Upload Page, Metadata Extraction Page, Search & Retrieval Page, Geometric Enrichment Page; Geometric Enrichment tools runtime] Web Services ru
30. dhoven University of Technology ifcspfrdfcat 0 0la xsd string foaf based_near geo pos lat 55 68300000 geo pos lon 12 55000000 duraark floor count 8 xsd integer duraark room count 55 xsd integer dbpedia owl address Lyngby xsd string dc creator Morten Jensen xsd string duraark enrichment_vocabulary http dbpedia org property xsd string duraark enrichment_vocabulary http sws geonames org xsd string duraark enrichment_vocabulary http vocab getty edu aat xsd string duraark enrichment_vocabulary http vocab getty edu ontology xsd string. Listing 4: Example response listing the metadata of the IFC file in session 0, encoded in RDF Turtle and wrapped into JSON. 1.4 E57 Metadata Extraction API. Description: Triggers the E57 metadata extractor component to query the metadata for the E57 file of a given session. The example response shows the metadata for the E57 file in session 0 as a JSON response. Example query and response. Query: http://workbench.duraark.eu/services/e57m/0. Example response: e57_metadata guid 7E3C7C9C-EFCB-4F5A-9A6F-98A08F72FB1B version major 1 version minor 0 creation datetime year 2011 month 11 day 3 hour 19 minute 5 seconds 35 548999786376953
31. e RDF file that goes into the SIP file at the end of the workflow. Future versions will allow more fine-grained control over the enrichment process as well as manual modification of the found data sets. Related content: D3.3 gives a deeper look into the Semantic Enrichment component used; Appendix 1.5 shows the API. [Screenshot text] Semantic Enrichment: Automatic semantic enrichment of the IFC file. This component uses metadata extracted from the ingested IFC file(s) to provide additional information, which is stored in the buildm metadata record. Status: At present this workflow supports the lookup of location information (e.g. city names) in relevant datasets stored in the SDA component. Future versions will allow manual modification and extension of this metadata. Enriched Metadata Listing Based on Postal Address (columns: Dataset ID, Dataset Name, Resource ID, Resource URI, Property URI): 28, enipedia, 184805, http enipedia tudelft nl data EU ETS person S%F8ren%20Holm, http enipedia tudelft nl data EU ETS city; 28, enipedia, 238963, http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant, http enipedia tudelft nl wiki Property City; 28, enipedia, 238963, http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant, http www w3 org 2000 01 rdf schema label; 28, enipedia, 238963, http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant, http semantic mediawiki o
32. e architectural design of the workbench. In Section 4 a rationale for design decisions is given, together with a discussion of their risks. Finally, a conclusion and impact description is given in Section 6. Source Code: The source code of the Workbench itself, as well as of most individual components, is available under an Open Source license and can be accessed at the following URLs: (Workbench) https://github.com/DURAARK/workbench and (E57 metadata extractor) https://github.com/DURAARK/e57Extract 2 DURAARK Workbench. The DURAARK Workbench is a service-oriented platform comprising the software deliverables produced over the lifetime of the DURAARK project. The functionality of the deliverables is accessible via a coherent graphical user interface (GUI). In the remainder of this section the GUI is referred to as the WorkbenchUI and the service-oriented platform as the Workbench; the functional units provided by the software deliverables are called Components. The WorkbenchUI is the graphical part of this software deliverable, allowing a stakeholder to go through a workflow. Section 2.1 explains the intended usage of the WorkbenchUI in the form of a user manual. Each workflow is described, accompanied by screenshots of the application. The actual software is available via the URL http workbench duraar
33. e data the client can specify via an HTTP header entry which data format it wants to retrieve. In DURAARK the default, and currently the only, format is JSON; however, a service can also be implemented to support XML as the result encoding. Footnote: Roy Fielding's description of REST: http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
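The header-based format selection mentioned above is commonly done with the standard HTTP `Accept` header. The sketch below only builds a request object and does not send it; treating `Accept` as the header the Workbench inspects is an assumption, the service URL is reconstructed from the report's example query, and the `xml` entry is hypothetical, since JSON is currently the only supported format.

```python
import urllib.request

# Media types a client might ask for; only JSON is supported by the report's
# prototype, the XML entry illustrates a possible future encoding.
MEDIA_TYPES = {"json": "application/json", "xml": "application/xml"}


def build_request(url, fmt="json"):
    """Create a GET request that states the desired response format."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


request = build_request("http://workbench.duraark.eu/services/session")
```

A server that honours the header can answer the same URI with different encodings, which keeps the resource address independent of the data format.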
34. e registration software prototype. After the datasets have been registered and the resulting mapping has been exported as an RDF file, the exported file may be selected and uploaded to the WorkbenchUI for inclusion in the SIP generation workflow (see 2.1.1). [Figure 21 diagram labels: WorkbenchUI, Session Page, Type: Frontend integrated; Component: Registration Prototype, Type: Start-up API, Standalone Desktop Tool; the stakeholder selects the input files; a RESTful API takes the respective session id; tool execution is triggered and starts with the selected input files; the tool generates the registration file (mapping rdf) in a semi-automatic process.] Figure 21: Integration design diagram for the registration component. 4 Decisions & Risks. 4.1 Technical decisions and impacts. Web-based user interface for the integrated prototype: The software's graphical user interface (WorkbenchUI) is developed with a web technology stack running in a web browser. The browser environment has advantages over a standalone desktop application, the most important one being the platform independence of the application. A web browser provides a standardized environment developers can work with. This
35. ed in Section 2.1.1. [Screenshot text, Figure 11] Geometric Enrichment: Enrichment of 3D data. The prototypes for the geometric enrichment tools available in the DURAARK tool set are executed on a local desktop machine that allows sufficiently performant interaction. Status: At present the main point cloud relevant tool is the guided semi-automatic registration of E57 point clouds to a manually modeled IFC building file. Available Tools: At present the registration tool is available. Further tools will follow in M20. IFC/E57 Registration. Figure 11: Geometric Enrichment tool selection page. [Screenshot text, Figure 12] Navigation; Rendering; Registration: Auto align A > B, Auto align B > A, Advanced settings; Export transformations: Export file path. Figure 12: Start page of the registration prototype. 3 Technical Implementation. 3.1 Software Design. Web applications in many cases follow a common pattern that consists of three layers: (i) a frontend layer containing the user interface logic and the display of data (the GUI), (ii) a backend layer that processes and provides data, and (iii) a communication layer between those two. The frontend layer is located in the user's web browser, the backend layer is runnin
36. eir own existing applications. The report guides a stakeholder through the usage of the graphical user interface, describes the components on a technical level, and gives interested readers and developers information on how to use the Workbench as a service provider. Table of Contents: 1 [title illegible in source]; 2 DURAARK Workbench; 2.1 User Manual; 2.1.1 Workflow: SIP Generation; 2.1.2 Workflow: Search & Retrieve; 2.1.3 Workflow: Semantic Archive Maintenance; 2.1.4 Workflow: Geometric Enrichment; 3 Technical Implementation; 3.1 Software Design; 3.1.1 Overall Architecture; Frontend: User Interface (UI) Modules; Backend: Web Services; 3.2 Components; 3.2.1 File Identification; 3.2.2 E57 Metadata Extraction; 3.2.3 SIP Generator; 3.2.4 Rosetta PROBADO3D Connector; 3.2.5 PROBADO3D; 3.2.6 Geometric Enrichment; 4 Decisions & Risks; 4.1 Technical decisions and impacts; 4.2 Risk assessment; 5 Licenses; 6 Conclusions & Impact; Appendices; 1 Se
37. formation Type or gener ated: software, generated, DURAARK Framework, MIT, D2.4; software, generated, DURAARK Workbench, MIT, D2.4; software, used, Backbone Marionette, MIT, http://marionettejs.com; software, used, NodeJS, MIT, http://nodejs.org. Licenses regarding the components from D3.3 can be found in the respective report. 6 Conclusions & Impact. The Workbench as an integrated software prototype provides a platform for integrating existing and future software prototype deliverables into a set of workflows. Stakeholders defined in earlier reports are able to perform the use cases UC1, UC2, UC3, UC8 and UC9 when stepping through the four provided workflows: SIP Generation, Search & Retrieval, Geometric Enrichment and Semantic Archive Maintenance. The DURAARK Workbench is divided into two conceptual parts: the web services providing functionality in the context of long-term archival of BIM data, and a graphical user interface to access the functionality from the point of view of a stakeholder. This conceptual separation of concerns is a central aspect of the project's architecture. After the lifetime of the project the prototype should be usable by various stakeholders, either as a frontend user or as a developer. With the component based architecture it is
38. g on a server host accessible via the internet. The communication layer is a data exchange protocol that transports data over a network connection, e.g. a RESTful API. For the DURAARK project it is necessary to integrate different components from partners into a coherent integrated software prototype. The input and output characteristics of the developed components are suited to be mapped to the described common pattern. For instance, to upload data to the DPS it is first necessary to select the files that should be persisted. The user selects files in the web browser, which happens in the frontend layer. Those files are then uploaded to a web server and checked for the correct file type, which happens in the backend layer. The other components developed in DURAARK (see 3.2 for a list and the respective descriptions) fit this pattern, too. As a consequence, and to have a platform for connecting the heterogeneous components, the decision was taken to develop a general framework, the DURAARK Framework, providing a sound base for developers in the project and possible future third party developers. The vision for the DURAARK Framework is to provide a future-proof, extensible and lightweight software library for building web applications focusing on long-term archival of data. Footnote 5: See http://www.infoq.com/articles/rest-introduction for an introduction to REST and RESTful APIs.
39. ibed in this report; more in-depth technical aspects can be found in report D3.3, describing the first SDA prototype. The Point Cloud tools, responsible for the geometric enrichment of E57 files: those tools generate additional files containing corresponding information, which is then uploaded to the preservation system via the Workbench. From the set of planned tools, this version of the integrated prototype contains the point cloud registration prototype described in report D4.1. There is no integration of the software deliverable produced in D5.1 yet. That software is responsible for the recognition of meaningful shapes and for point cloud compression. The integration will be done for the milestone in M30, which will also contain D5.2 (due in M20). This way the M30 prototype will contain both WP5 deliverables in a way that is consistent from the view of a stakeholder, extending the Workbench with WP5's topic 'Recognition of Architecturally Meaningful Structures and Shapes'. Also, the point cloud compression feature will only be used in M30, to provide the stakeholder with an interactive 3D preview of a point cloud. From an implementation point of view the integration of D5.1 and D5.2 is very similar to that of D4.1; there are no conceptual tasks left to solve for their integration. A figure gives an overview of the structure of the reports that accompany this software deliverable.
40. jects exist that allow converting existing web applications into standalone desktop applications, where the majority of the existing source code can be reused without additional programming work (atom-shell: https://github.com/atom/atom-shell; node-webkit: https://github.com/rogerwang/node-webkit). WP2 will look into these projects to assess their capabilities for producing a desktop application as an alternative to the current web application. This would remove the need for an internet connection and for long upload times for large files, as the services working on the files would run locally on the user's computer with access to the local files. However, some services in DURAARK depend on an internet connection (e.g. the semantic enrichment and the SIP upload to a digital preservation service) and will not be usable without it. Still, the session-based design of the Workbench allows performing the steps where no internet connection is required and passing the session on to an internet-enabled computer to resume it there. 5 Licenses. The following table gives an overview of the licences of the software generated and used for the implementation of the web services and UI modules. IPR IP used Software name License In
41. k eu for testing. The WorkbenchUI interacts with the components through a service-oriented application programming interface (API) layer. In Section 3.2 a functional description of each component is given, together with the current state of its implementation. Appendix 1 describes the API to the components. 2.1 User Manual. This user manual guides a stakeholder through the usage of the WorkbenchUI by describing four workflows: 1. SIP Generation, 2. Search & Retrieval, 3. Geometric Enrichment, 4. Semantic Archive Maintenance. The WorkbenchUI is a web application accessible with a web browser via the URL http://workbench.duraark.eu. Workflows 1, 2 and 4 run solely within a web browser. Workflow 3 (Geometric Enrichment) uses the point cloud registration prototype from software deliverable D4.1, which is implemented as a desktop GUI application. This application has to be installed on the stakeholder's computer; the process is described in the corresponding section of the user manual. The general user interaction paradigm for the WorkbenchUI is to first select the desired workflow from the start screen of the application, depicted in Figure 2. The stakeholder is then guided through the individual steps of the workflow. Each step is
42. n SIP Generaton Page [Figure 18 diagram labels: a RESTful API taking the respective session id; send generation command; send back status; Rosetta Upload Logic (in future versions); the SIP Generator is triggered after all data is available.] Figure 18: Integration design diagram for the SIP Generator component. 3.2.4 Rosetta PROBADO3D Connector. In general, the Rosetta PROBADO3D Connector is responsible for indexing data uploaded to the Rosetta system. Rosetta provides a REST interface that gives access to the uploaded data. The connector will use this interface to request the metadata that is necessary for the indexing process, so that the user can search the dataset later on. However, for this M18 software prototype the Rosetta system is not targeted yet. Therefore the connector takes the RDF metadata file generated via the SIP generation workflow and indexes the data directly from that file, instead of requesting the same information from the Rosetta REST interface. Internally, a new dataset entry is created for each generated SIP. The dataset is filled with the given metadata and stored in the internal database of the PROBADO3D system. Component Rosetta PROBADO3D Connector Component PROBADO3D
43. n (ochmann@cs.uni-bonn.de, UBO), Michael Panitz (michael.panitz@tib.uni-hannover.de, LUH), Hamid Rofoogaran (hamid.rofoogaran@ltu.se, LTU), Ujwal Gadiraju (gadiraju@l3s.de, L3S), Besnik Fetahu (fetahu@l3s.de, L3S). Responsible editor(s): Martin Hecher (martin.hecher@fraunhofer.at, FhA). Quality assessor(s): Jakob Beetz (j.beetz@tue.nl, TUE), Martin Tamke (martin.tamke@kadk.dk, CITA). Approval of this deliverable: Jakob Beetz (j.beetz@tue.nl, TUE), Stefan Dietze (dietze@l3s.de, LUH). Distribution: Public. Keywords list: prototype, workbench, use cases. Executive Summary. This report describes the first version of the integrated software prototype, comprising the software prototypes developed in DURAARK so far. It exposes the functionality of the prototypes as a service-oriented platform (the Workbench) and provides it to stakeholders via a coherent graphical user interface (the WorkbenchUI), yielding an integrated application for performing long-term archival tasks for BIM data from the view of a frontend stakeholder. Additionally, the software acts as a service provider for third party developers, so that they can integrate the functionality developed in DURAARK in th
44. ne In this case the stakeholder is asked to check the uploaded E57 file and upload the corrected file. The purpose of the screen is to prevent the unnoticed upload of an invalid E57 file into a long-term preservation system. Also, the following metadata extraction requires a correctly identified file type as input to prevent follow-up errors in the application. After a successful identification the stakeholder clicks on the Next button to proceed to the Metadata Extraction Page. At present file identification is supported for E57 files only. Profile patterns for IFC should be ready by the end of summer 2014. Related content: Section 3.2.1 gives a deeper look into the File Identification component used; Appendix 1.2 shows the API. [Screenshot text, Figure 6] File Identification: Identification for uploaded files. This component provides feedback from the DROID file identification tool that is triggered for the uploaded files. Status: At present file identification is supported for E57 files only. Profile patterns for IFC should be ready by the end of summer 2014. Identified Files (Filename, Identified Format): CCO_DTU Building127_Arch_CONF e57, E57 point cloud. Figure 6: File Identification Page. Footnote: DROID profiling tool: http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid
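To illustrate why a backend check can catch an invalid upload before it reaches the preservation system, the sketch below inspects the leading bytes of a file. It is a simplified stand-in and not how DROID works internally or is invoked: DROID matches PRONOM signature files, whereas this example hard-codes two well-known file headers (E57 files start with `ASTM-E57`, STEP physical files such as IFC start with `ISO-10303-21`). The function name and the IFC label are hypothetical.

```python
# Leading byte sequences of the two formats handled by the Workbench.
SIGNATURES = {
    "E57 point cloud": b"ASTM-E57",
    "IFC (STEP physical file)": b"ISO-10303-21",
}


def identify(data: bytes):
    """Return a format label for the given file content, or None if unknown."""
    for label, magic in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return None


def identify_file(path):
    """Read only the first bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

A file that merely carries the `.e57` extension but lacks the header yields `None`, which is the situation in which the stakeholder would be asked to check and re-upload the file.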
45. ng a session system that holds the input files and information on them. Each session has an ID, and most of the provided services work based on that session ID. The existence of a session is therefore a prerequisite. In M18 a session can only be created via the WorkbenchUI (see Section 2.1.1.1 on how to do that). Currently the system provides two predefined sessions with ID 0 and ID 1. One of those IDs will be used for the following examples. Sessions newly created via the GUI can be used, too. The session ID can be found on the Session Page. 1.1 Session Management API. Description: Queries the data for the available sessions. The example response lists two available sessions with ID 0 and ID 1. The second example only lists data from session 0. Example query and response. Query: http://workbench.duraark.eu/services/session. Response: id 0 Label CCO_DTU Building127 files id 0 path fixtures repository CCO DTU Building127 Arch CONF ifc name CCO DTU Building127 Arch CONF ifc type ifc size 10.74 MB id 1 path fixtures repository CCO DTU Building127 Arch CONF e57 name CCO DTU Building127 Arch CONF e57 type e57 size 535.30 MB uuid 390685d6 e055 4fcO 9133
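A client of the session-based services might assemble its queries as sketched below. The base URL and the `session` and `e57m` service names are reconstructed from the report's example queries; that a single session is addressed by appending its ID to the session URL is an assumption, and the JSON field names and file names in the sample are hypothetical placeholders modelled loosely on the example response.

```python
import json

BASE_URL = "http://workbench.duraark.eu/services"


def service_url(service, session_id=None):
    """Build the URL of a service, optionally scoped to one session ID."""
    url = f"{BASE_URL}/{service}"
    return url if session_id is None else f"{url}/{session_id}"


def files_of_type(session, file_type):
    """List the names of a session's files that have the given type."""
    return [f["name"] for f in session["files"] if f["type"] == file_type]


# Placeholder response body; field and file names are illustrative only.
sample = json.loads("""
[{"id": 0, "label": "demo", "files": [
    {"id": 0, "name": "building.ifc", "type": "ifc", "size": "10.74 MB"},
    {"id": 1, "name": "building.e57", "type": "e57", "size": "535.30 MB"}]}]
""")
e57_files = files_of_type(sample[0], "e57")
```

Because every service is keyed by the session ID, a client needs only that one value to move from the session listing to, for example, the E57 metadata query.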
46. nipedia tudelft nl data EU ETS city resource value 184805 http enipedia tudelft nl data EU ETS person S%F8ren%20Holm http enipedia tudelft nl data EU ETS city; dataset id 28 dataset name enipedia resource id 238963 resource uri http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant property uri http enipedia tudelft nl wiki Property City resource value 238963 http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant http enipedia tudelft nl wiki Property City; dataset id 28 dataset name enipedia resource id 238963 resource uri http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant property uri http www w3 org 2000 01 rdf schema label resource value 238963 http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant http www w3 org 2000 01 rdf schema label; dataset id 28 dataset name enipedia resource id 238963 resource uri http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant property uri http semantic mediawiki org swivt 1 0 page resource value 238963 http enipedia tudelft nl wiki Copenhagen_Hydro_Powerplant http semantic mediawiki org swivt 1 0 page. Listing 6 Truncated Example response li
47. not web based Risk Description (incl. Cause): Javascript is too slow for either a user interface or a web service task, as it is an interpreted, CPU-bound language. Risk Assessment: Impact: Medium. Probability: Medium. Description: Javascript is a scripting language executed by an interpreter. Compared to compiled languages like C or Java, an interpreted language is slower by design. Contingency Solution: For the user interface, a CPU-intensive task can be delegated to a backend web service. On the backend, a CPU-intensive task can be handled in the programming language of choice and then wrapped via NodeJS. Risk Description: The stakeholder has no or only slow access to the internet, so the web application cannot be executed or file uploads take too long. Risk Assessment: Impact: High. Probability: Low. Description: As a web application, the DURAARK Workbench heavily depends on an internet connection with reasonable bandwidth (a) for accessing the application and (b) for uploading files to the web services. A missing connection prevents the usage of the software; a slow connection degrades the user experience dramatically. Contingency Solution: The M18 version of the prototype is a pure web application and will not work without an internet connection. However pro
48. ntime NodeJS on server s standalone desktop 3 application Figure 14: Architectural diagram for the Workbench platform. 3.2.1 File Identification. Since exact file format identification is needed for preservation planning of the ingested files, the widely used file format identification tool from the National Archives, DROID (Digital Record and Object IDentification), was chosen for the DURAARK Workbench. DROID is developed for archives and institutions which have to identify the file formats of their stored objects. It identifies formats based on patterns (e.g. file format extension, internal IDs, etc.) and is updated constantly through XML-based signature files, which provide the link to the entries in the PRONOM technical registry with their assigned PUID (PRONOM Unique IDentifier). Figure 16 shows the integration of the component into the Workbench. [Diagram labels: WorkbenchUI, File Selection Page, Type: Frontend integrated; Component: File Identification, Type: RESTful API, Standalone Executable on Backend; the stakeholder selects the IFC/E57/mapping rdf files she wants to work with; a RESTful API takes the respective session id and provides input to the executable; DROID determines the file format; the E57 file gets identified] File
49. of the REST principles (see a simplified explanation of them in Appendix 2) already gives a developer a lot of knowledge about the provided interface. Access to the API is recommended, though not standardized, via HTTP verbs that already have a semantic meaning, e.g. GET for retrieving information, POST for creating new entities and PUT for updating existing entities. The second principle of REST is the use of Uniform Resource Identifiers (URIs), which uniquely identify a provided entity or resource (e.g. a session in DURAARK) and which can be shared or bookmarked. The RESTful API allows the functionality developed in DURAARK to be accessed by existing or new applications that are not implemented in JavaScript. The only prerequisite for accessing the API is a network socket, which is available in all relevant programming languages.

Footnote 10: See http://www.infoq.com/articles/rest-introduction for an introduction to REST and RESTful APIs.

JSON as data exchange format: A RESTful API is capable of answering requests in different formats representing the same information, e.g. XML, JSON or a custom format. In DURAARK, JSON is the chosen exchange format. JSON means JavaScript Object Notation and was developed for the JavaScript language to exchange data in a
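The mapping of HTTP verbs onto operations described in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `/sessions` URI layout, the `SESSIONS` store and the `handle` function are hypothetical and are not taken from the DURAARK implementation.

```python
import json

# Hypothetical in-memory store; a real service would persist sessions elsewhere.
SESSIONS = {}


def handle(method, uri, body=None):
    """Map an HTTP verb plus URI onto a session operation; return (status, JSON text)."""
    parts = [p for p in uri.split("/") if p]
    if not parts or parts[0] != "sessions" or len(parts) > 2:
        return 404, json.dumps({"error": "unknown resource"})
    if len(parts) == 1:
        if method == "POST":  # POST creates a new entity
            session_id = str(len(SESSIONS))
            SESSIONS[session_id] = body or {}
            return 201, json.dumps({"uri": "/sessions/" + session_id})
        return 405, json.dumps({"error": "method not allowed"})
    session_id = parts[1]
    if session_id not in SESSIONS:
        return 404, json.dumps({"error": "unknown session"})
    if method == "GET":  # GET retrieves information
        return 200, json.dumps(SESSIONS[session_id])
    if method == "PUT":  # PUT updates an existing entity
        SESSIONS[session_id] = body or {}
        return 200, json.dumps(SESSIONS[session_id])
    return 405, json.dumps({"error": "method not allowed"})
```

Because every session is addressed by its own URI, a client can store or share `/sessions/0` and later retrieve the same resource with a plain GET.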
50. oo

Footnote 11: JSON explanation and standard description: http://json.org

Risk Description: JavaScript as the main programming language for backend and frontend is not accepted by the community.
Risk Assessment: Impact: High. Probability: Low.
Description: In a community, it is possible that the respective programmers use multiple programming languages. A widespread myth, though one with decreasing tendency, blames JavaScript for being a non-compatible language compared to Python, Java, etc., which results from JavaScript's eventful history.
Contingency Solution: If the community does not adopt the JavaScript-based approach of the DURAARK framework, it is still possible to use the existing functionality via the RESTful API. Adding a new web service is possible, as providing a RESTful API for a functional block does not demand a JavaScript implementation and can be achieved in any other language. The DURAARK project endorses developing modular backend functionality and exposing it via a well-defined API. The only disadvantage is that the respective developer cannot use the already existing DURAARK framework. The integration of new UI modules which are not based on a web technology stack is supported. DURAARK is already integrating standalone desktop applications which are
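To illustrate the contingency above, namely that a functional block can be exposed over HTTP and JSON without any JavaScript, here is a minimal sketch using only the Python standard library. The `word_count` function, the `/services/wordcount` path and the `call_service` helper are hypothetical placeholders, not part of the DURAARK framework.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def word_count(text):
    """Placeholder functional block; stands in for any non-JavaScript component."""
    return {"words": len(text.split())}


class ServiceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/services/wordcount":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(word_count(payload.get("text", ""))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call_service(text):
    """Start the service on a free local port, send one request, shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ServiceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("POST", "/services/wordcount",
                     body=json.dumps({"text": text}).encode("utf-8"),
                     headers={"Content-Type": "application/json"})
        result = json.loads(conn.getresponse().read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Any client that can open a network socket and parse JSON can call such a service, which is the property the contingency relies on.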
51. or a server backend, as it has become very popular in recent years: it is easy to program, provides a scalable architecture, and has a large community that adds a lot of useful functionality in the form of modules. The programming language used is JavaScript, which is consistent with the UI module programming language. The advantage is that, for programmers familiar with JavaScript on the browser side, the entry hurdle for developing JavaScript-based web services is low. Knowledge of a single programming language suffices to write both user interface logic and web services for DURAARK.

Footnote: Client-side web standards are organized in multiple standards bodies and working groups. The most prominent ones are the World Wide Web Consortium (W3C, http://w3.org) and the Web Hypertext Application Technology Working Group (WHATWG, http://www.whatwg.org).

RESTful API: The service-oriented architecture of the DURAARK framework separates functionality providers from the respective user interface(s). The communication layer was chosen to be a RESTful API. REST means Representational State Transfer and is a way to implement heterogeneous application-to-application communication, also including the communication with a user interface module. With a RESTful API, the definition
52. ration of a SIP (Submission Information Package) file from a set of given input files. In this case the workflow covers UC1, the deposition of 3D architectural objects, as well as UC9, the enrichment of the BIM/IFC model with metadata from a repository. This is a list of the workflows provided by the workbench so far:

SIP Generation: A stakeholder selects a set of input files describing a building. After file identification, the automatically extracted metadata of the files is shown and can be edited. Based on the metadata, an automatic enrichment with Linked Open Data is performed and stored in a metadata record. In a final step, the input files and the metadata record are archived into a downloadable SIP file, and the metadata record of the SIP is indexed into a PROBADO3D database for later search & retrieval. Covered use cases: UC1, UC8.

Search & Retrieval: A stakeholder is provided with a list of generated SIP files. Metadata records for the SIP can be displayed. A full-text search within all metadata records allows the stakeholder to filter the list of files. Covered use case: UC2.

Geometric Enrichment: The geometric enrichment workflow is based on the desktop application yielded from the software deliverables D4.1 and D5.1 in M12. After the selection of one or multiple IFC and E57 files, a stakeholder is provided with a graphical user interface for performing a geometric registration of the input files. The process yields a mapping file that
53. rg swivt 1 O page

Figure 8: Semantic Enrichment Page

2.1.1.6 SIP Generation Page

This page presents all files that will be packaged in accordance with the implementation specifics of the digital preservation system (DPS). The engineering metadata from the extraction of the IFC and E57 files, together with the descriptive metadata from the enrichment process, are put together into a single RDF file (buildm.ttl) that goes into the archive. This includes the structuring of all digital objects and the metadata records into a METS file in accordance with the specification of the vendor. The Content Overview lists the package archive with file names, sizes and types. Clicking the Generate SIP button starts the background process to generate the archive. In this process the metadata RDF file is mapped to the METS structure, yielding a sip.xml METS file. The resulting archive is a ZIP file with the sip.xml and a content folder as root items. The content folder contains the uploaded files together with the RDF metadata file. The generated ZIP file is of version 2.0, for which we recommend the free software 7zip for opening the archive. Tests showed that the integrated ZIP archive handler in Microsoft Windows 7 was not always capable of opening the valid archive. Depending on the file
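The archive layout described in the excerpt above (a METS file and a content folder as root items) can be sketched with Python's standard `zipfile` module. This is a hedged illustration, not the project's Java SIP generator: the function name `build_sip` is hypothetical, and the METS file name `sip.xml` is an assumption read from the garbled source text.

```python
import io
import zipfile


def build_sip(files, mets_xml, mets_name="sip.xml"):
    """Return ZIP bytes with the METS file at the root and payload files under content/.

    `files` maps file names to bytes; `mets_name` is an assumed file name.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(mets_name, mets_xml)
        for name, data in files.items():
            archive.writestr("content/" + name, data)
    return buffer.getvalue()
```

Building the archive in memory keeps the sketch self-contained; a service would more likely stream the result to disk or to the download response.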
54. rmation Package (SIP) to be delivered to a DPS. The SIP generator software will support producers with the process of compiling digital assets to be ingested into a digital archive. Input to this module consists of manually entered data by the producer/user captured by the GUI, uploaded files, and automatically captured metadata such as file identification results, consisting of e.g. unique ID, size and hash sum. The SIP generator is written in Java and uses a database for temporary storage of metadata. Figure 17 shows the sequential actions taken to generate a SIP package. Figure 19 shows the integration into the Workbench.

[Figure 17: Sequence diagram for the SIP generation. Labels recovered from the diagram: upload files, enter info, start; fileUID, filename, pathConfig, pathOutputfile; write UUID, write Filepath, write Status, write Name, write Version, write PUID, write Mime, write IdWarning.]

Integration diagram, labels recovered: Component SIP Generator; type: RESTful API, standalone executable on backend; stores metadata information of the given session. Workbench UI, SIP Generation Page; type: frontend integrated; responsibility: user gets a summary of the contents and starts SIP generatio
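The automatically captured metadata mentioned above (unique ID, size and hash sum) could be gathered as in the following sketch. It is written in Python for illustration, whereas the actual SIP generator is written in Java; the function name, the field names and the choice of SHA-256 are assumptions, since the source does not name the hash algorithm.

```python
import hashlib
import uuid


def capture_metadata(name, data):
    """Collect basic technical metadata for one uploaded file (illustrative fields)."""
    return {
        "uuid": str(uuid.uuid4()),                   # random unique ID for the file
        "name": name,
        "size": len(data),                           # size in bytes
        "sha256": hashlib.sha256(data).hexdigest(),  # hash sum; algorithm is an assumption
    }
```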
55. rocessing. For instance, one component takes care of file identification. The component is available as a standalone executable and needs an IFC or E57 file as input. Its output is a description file that contains information about the provided file. This is a typical processing step for web services in the DURAARK context and is common to other components in the project, too. The framework supports the developer in creating a RESTful API around a given functionality, e.g. a standalone executable. The implementation providing the concrete functionality (e.g. the file identification component) is exchangeable, and the API does not have to be changed when it is used with another implementation of the service. This approach encourages stable API development and a clear separation of concerns between service interfaces and their implementation. As a basis for the web services part of the DURAARK framework, the software library NodeJS is used, which provides the functionality to start a web server and handles requests from and responses to clients (e.g. a UI module). Its programming language is JavaScript, and it is a stable and well-tested software library with a broad user community and an active development line.

Footnote: NodeJS, http://nodejs.org

3.2 Components

Components are the functional p
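The separation between a stable service interface and an exchangeable implementation, as described in the excerpt above, can be sketched as follows. The names `run_executable` and `make_endpoint` are hypothetical; the DURAARK framework itself builds on NodeJS, so this Python version only illustrates the design idea.

```python
import json
import subprocess
import sys


def run_executable(command, path):
    """Run a standalone executable on one file and return its trimmed standard output."""
    result = subprocess.run(command + [path], capture_output=True, text=True, check=True)
    return result.stdout.strip()


def make_endpoint(identify):
    """Wrap any identification callable behind the same JSON-returning interface."""
    def endpoint(path):
        return json.dumps({"file": path, "description": identify(path)})
    return endpoint
```

A usage sketch: `make_endpoint(lambda p: run_executable([sys.executable, "-c", "import sys; print(sys.argv[1])"], p))` and `make_endpoint(lambda p: "E57 point cloud")` expose the same response shape, so clients do not notice which implementation sits behind the endpoint.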
56. rvice Endpoints (RESTful API Description)
1.1 Session Management
1.2 File Identification
1.3 IFC Metadata Extraction
1.4 E57 Metadata Extraction
1.5 Semantic Enrichment
1.6 SIP Generator
1.7 PROBADO3D List
1.8 PROBADO3D Fulltext Search
2 Representational State Transfer (REST) Principles

1 Introduction

This report describes the first version of the integrated software prototype, which is referred to as the DURAARK Workbench in the remainder of the document. The purpose of the Workbench is to provide an integrated platform for the software deliverables developed in the project so far as well as for future ones. Currently the following software prototypes are included:

• The Workbench, acting as a service-oriented platform for the functionality developed in DURAARK and providing a coherent web-based user interface to access the functionality from a stakeholder point of view. The user manual and the technical architecture description are available in this report.
• The Semantic Digital Archive (SDA), which consists of a number of sub-components integrated into the workbench. While their general use is descr
57. s

[Screenshot residue: table with the columns Filename and Identified Format; row: CCO DTU Building127 Arch CONF e57, E57 pointcloud.]

Figure 3: Page layout example. Top: workflow navigation; center: description of the workflow step; bottom: area for user interaction and/or data display.

2.1.1 Workflow: SIP Generation

The SIP generation workflow allows the stakeholder to upload data files describing a building, which are then packaged into a single SIP file that is ready to be uploaded into a digital preservation system. In the process the files are identified and metadata is added.

[Screenshot text, SIP Generation start page:] Welcome to the DURAARK SIP Generation workflow. The generation of SIPs is organized in Sessions. A session consists of a selection of files that you want to archive to a long term archival system. The result of a session is a SIP file that is ready for uploading to the archival system. There are existing sessions prepared so that one can test the SIP generation without having to upload eventually big IFC or E57 files. Clicking the Start button starts to guide the user through the process. You can also create a new session and upload your own files. Use the New Session button for this. As this is a prototype system, newly generated sessions will not be persisted on the
58. s are uploaded, an optional registration file between the two can be selected. The workflow for creating a registration file is described in 2.1.4. For uploading, the stakeholder selects the desired files from the computer and presses Upload. When the upload is finished, which is indicated via a message, the Next button is enabled to continue to the File Identification Page.

[Screenshot text, File Management:] Upload your IFC/E57/Mapping files you want to work with. This file manager allows to upload an IFC, E57 and a geometric mapping RDF file to a session. Select an IFC file. Select an E57 file. Select an E57 registration file. Buttons: Upload, Next.

Figure 5: File Upload Page

2.1.1.3 File Identification Page

If a file of type E57 is present in the session, the file is identified via the DROID file profiling tool from the National Archives. Depending on the size of the file, this process can take up to a few minutes. The result of the identification is presented in the Identified Files section of the page. A green label in the table cell Identified Format indicates a successful identification, a red label an unsuccessful o
59. sponse lists the metadata for previous generated SIPs

1.8 PROBADO3D Fulltext Search

API Description: Allows searching the metadata of all previous SIP generation entries. The example response contains a single result. The start and count parameters can be used for pagination.

Example query and response:

Query: https://ogo.cgv.tugraz.at/api/Models?fulltextQuery=CCO&start=0&count=1

Example response:

sessionId: a2844222-8d49-4734-a4b7-322c2ffa64fc
startIndex: 0
count: 1
totalResultCount: 19
resultItems:
  documentIdentifier: 1015
  description: Test Ingestion
  title: CCO DTU Building127 Arch CONF
  creatorPersonId: 4
  geoLocation: <?xml version="1.0" encoding="utf-8"?><Point xmlns="http://www.opengis.net/gml"><pos>50.94158 6.958498</pos></Point>
  physicalAssets: null
  fileInfos:

Listing 9: The example response lists the metadata for previous generated SIPs

2 Representational State Transfer (REST) Principles

The following is a simplified description of a selection of REST principles
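Pagination with the start and count parameters described above can be sketched as follows. The parameter names `fulltextQuery`, `start` and `count` come from the example query; the base URL below is a deliberately invalid placeholder, and both helper functions are hypothetical.

```python
from urllib.parse import urlencode

# Placeholder base URL; the real endpoint is the one given in the example query.
BASE_URL = "https://example.invalid/api/Models"


def search_url(query, start=0, count=10, base_url=BASE_URL):
    """Build a full-text search URL for one page of results."""
    params = {"fulltextQuery": query, "start": start, "count": count}
    return base_url + "?" + urlencode(params)


def page_starts(total_result_count, count):
    """Return the start offsets needed to walk through all results page by page."""
    return list(range(0, total_result_count, count))
```

With the `totalResultCount` of 19 from the example response and a page size of 5, a client would request the offsets 0, 5, 10 and 15.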
60. sting data sets found by the semantic enrichment component

1.6 SIP Generator

API Description: Triggers the SIP Generator component. The example response shows the URL for downloading the generated SIP.

Example query and response:

Query: http://workbench.duraark.eu/services/semanticenrichment/0

Example response:

url: 22844222-8d49-4734-a4b7-322c2ffa64fc.zip

Listing 7: Example response containing the URL for downloading the generated SIP

1.7 PROBADO3D List

API Description: Lists the metadata of all previous SIP generation entries. The example response contains a single entry. The start and count parameters can be used for pagination.

Example query and response:

Query: https://ogo.cgv.tugraz.at/api/Models?start=0&count=1

Example response:

sessionId: a2844222-8d49-4734-a4b7-322c2ffa64fc
startIndex: 0
count: 1
totalResultCount: 19
resultItems:
  documentIdentifier: 1015
  description: Test Ingestion
  title: CCO DTU Building127 Arch CONF
  creatorPersonId: 4
  geoLocation: <?xml version="1.0" encoding="utf-8"?><Point xmlns="http://www.opengis.net/gml"><pos>50.94158 6.958498</pos></Point>
  physicalAssets: null
  fileInfos:

Listing 8: The example re
61. the authoritative description can be found in Roy Fielding's excellent PhD thesis.

Every resource has an ID: In DURAARK, a simple example to explain this principle is an uploaded IFC file. The file gets the ID 0 and is accessible by this ID from other services or a UI module. For the web there is a unified concept for IDs, the URI. URIs make up a global namespace, with the advantage that resources behind a REST service are always accessible via the same URI, which can be shared and bookmarked.

Interlinkage between resources: Via hyperlinking it is possible to link from one resource to another. The different resources do not have to be provided by the same service; they can be distributed.

Use of standard methods: The data behind a URI is served via the HTTP application protocol, which in turn is based on the TCP transport protocol. HTTP provides standard methods for accessing and manipulating the resource identified by the URI, e.g. GET, POST, PUT or DELETE. For every resource, those standard methods provide clear semantics on what the programmer intends to do with the resource. For instance, calling the DELETE method on a URI clearly states that the resource should be deleted. The standardized concept of the URI and the standard methods provided by HTTP give clear guidance, even without extensive documentation, on how to use a REST interface.

Resources have multiple representations: When accessing a URI to retriev
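The principle that one resource can have several representations can be illustrated with a small sketch that serializes the same data as JSON or XML depending on the requested media type. The `represent` function and the XML element names are hypothetical; DURAARK itself uses JSON as its exchange format.

```python
import json
import xml.etree.ElementTree as ET


def represent(resource, media_type):
    """Serialize the same resource dictionary as JSON or XML text."""
    if media_type == "application/json":
        return json.dumps(resource, sort_keys=True)
    if media_type == "application/xml":
        root = ET.Element("resource")
        for key in sorted(resource):
            child = ET.SubElement(root, key)
            child.text = str(resource[key])
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported media type: " + media_type)
```

In an HTTP service, the media type would typically be taken from the request's Accept header, so the URI of the resource stays the same for every representation.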
62. With the integrated software prototype, a stakeholder is able to perform a selection of the use cases defined in report D2.1. The selection is the following:

• UC1: Deposit 3D architectural objects
• UC2: Search and retrieve archived objects
• UC3: Maintain Semantic Digital Archive
• UC8: Exploit contextual information for urban planning
• UC9: Enrich BIM/IFC model with metadata from a repository

[Figure 1: Overview of the scope of the M18 software prototypes and respective reports. Labels recovered from the diagram: Integrated Software Prototype 1; D2.4 integrated DURAARK workbench; Point Cloud Tools; Semantic Digital Archive; Digital Preservation System; D4.1.]

The workbench organizes the use cases into workflows. A workflow is a step-by-step process on how to achieve the purpose of one or multiple use cases. For instance, one of the implemented workflows handles the gene
63. ws status for the E57 file in session 0.

Example query and response:

Query: http://workbench.duraark.eu/services/fileid/0

Example response:

name: CCO_DTU Building127_Arch_CONF e57
format: fmt/643
valid: true
formatString: E57 point cloud

Listing 3: Example response showing the status of the E57 file identification of session 0

1.3 IFC Metadata Extraction

API Description: Triggers the IFC metadata extractor component to query the metadata for the IFC file of a given session. The example response shows metadata for the IFC file in session 0 as an RDF Turtle string wrapped into a JSON response.

Example query and response:

Query: http://workbench.duraark.eu/services/ifcm/0

Example response:

rdf:
prefix dct
prefix dbp prop
prefix geo pos
prefix xsd
prefix duraark
prefix qudt
prefix dbpedia owl
prefix foaf
prefix dc
duraark object identifier 2eD6iPVCPFOADV8eYNtazn xsd string
foaf name DTU 127 xsd string
dbp prop startDate 1970 01 01 01 00 00 xsd date
dbpedia owl buildingStartYear 1970 01 01 01 00 00 xsd date
duraark length_unit MILLIMETRE xsd string
duraark authoring_tool Autodesk Revit 2013 Autodesk Revit 2013 2013 xsd string
duraark authoring_tool Ein
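A client consuming the file identification response shown in Listing 3 might summarize it as in the sketch below. The field names `name`, `format`, `valid` and `formatString` are taken from the example response; the function name and the sample file name used in the usage note are hypothetical, and the response is assumed to be a single JSON object.

```python
import json


def summarize_identification(response_text):
    """Turn a file identification response (JSON text) into a one-line summary."""
    record = json.loads(response_text)
    status = "identified" if record.get("valid") else "not identified"
    return "{}: {} as {} ({})".format(
        record.get("name"), status, record.get("formatString"), record.get("format")
    )
```

For a response with the name `scan.e57`, the format `fmt/643`, `valid` set to true and the format string `E57 point cloud`, the summary reads `scan.e57: identified as E57 point cloud (fmt/643)`.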