User manual for Jwire 1.0

Contents

1. [Figure 1: Editing wire.conf]

• Use the path field to select a new configuration file.
• Use "Apply changes" to save the changes.
• Use "Save as" to save the changes to a file without applying them.
• Use the "Select a configuration" drop-down to load the data stored in a file for later modification.

[Figure 2: Commands toolbar]

Select the nodes in the tree that correspond to the fields to edit in wire.conf. The values that can be edited for the selected field appear on the right side. Different types of fields have different controls: some are numeric values, others are tables, others are text fields, and so on.
2. Magnifying glass: Using the mouse pointer, a specific zone of the graph can be magnified while the rest is shown reduced, as when using a magnifying glass.

Aggregations: This option shows or hides the aggregations. In the force directed web graph it is activated by default and shows the web sites as colored spots. In a multilayer graph it is deactivated by default, but it can be activated by selecting the type of aggregation to show, either by second level domain or by first level domain. In this case the aggregations are shown by giving the same color to the nodes that belong to the same group.

Tooltips: Shows the name of a node when the mouse pointer hovers over it. Deactivated by default.

4. Treemap: Only for the multilayer graph. Shows a treemap of the current graph in a separate window.

5. Radial graph: Only for the last level of the multilayer graph. Shows the radial graph of the current graph in a separate window.

6. Up one level: Only for the multilayer graph. Goes up one level from the current one.

[Figure 24: Radial graph]

There is a panel for modifying the properties of the system of forces that controls the layout of the graphs, and a search box for finding nodes by name. The panel is hidden by defaul
3. [Figure 12: Configuring applications]

[Figure 13: Accessing the files]

For this:

1. From the Execution tab, select the Files tab in the left panel.
2. A hierarchical tree is shown with the structure of the files and folders created by WIRE to store the generated reports. Right-clicking on the panel shows a pop-up menu. Choosing the Configure option opens a dialog box that contains a table. The two columns of this table store the paths of the applications to use and the file extensions associated with those applications. Using the dialog box buttons, or clicking on the table rows, it is possible to edit, add or delete any association. If the wildcard value is given as the extension, the associated application opens any file that has no other association set in the table.

Once Jwire is configured, the files generated by WIRE can be opened in the following way:

1. From the Execution tab, select the Files tab in the left panel in
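The association table with a wildcard fallback described above can be pictured with a short sketch. This is only an illustration in Python, not Jwire's own code: the `ASSOCIATIONS` table, the application paths, the `application_for` function and the use of `*` as the wildcard value are all assumptions, since the manual does not show the actual wildcard character.

```python
import os

# Hypothetical association table: file extension -> application path.
# The "*" wildcard key and the application paths are assumptions made
# for illustration; the manual does not state them.
ASSOCIATIONS = {
    "pdf": "/usr/bin/evince",
    "txt": "/usr/bin/gedit",
    "*": "/usr/bin/less",
}


def application_for(filename, associations=ASSOCIATIONS):
    """Return the application associated with a file, or None.

    An exact extension match wins; otherwise the wildcard entry,
    if present, is used as the fallback.
    """
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension in associations:
        return associations[extension]
    return associations.get("*")
```

With this table, `application_for("site.pdf")` resolves to the PDF viewer, while a file with an unlisted extension falls back to the wildcard entry.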
4. Radial graph
25 Treemap
26 Force panel and nodes searcher
27 Data about the nodes

1 Requirements

Jwire is implemented in standard Java with the Sun SDK, version 6 update 18. Jwire therefore requires a Java JRE, version 5 or later, installed beforehand on the system where it will run. Version 6 update 10 or later is recommended in order to use all of its features.

2 Installation and execution

Unzip or untar the Jwire archive (zip or tar) in the folder from which it will be launched. The archive contains all the files and folders Jwire requires. Jwire treats the current directory as its root directory and stores there all the information it needs to work properly.

To run Jwire, launch the execution script for the operating system on which it is installed: .sh for Unix/Linux or .bat for Windows.

3 Configuration module

3.1 Edit wire.conf

1. Select the Configuration tab.
2. Click the button labeled wire.conf.
5. circle with the white cross that is shown next to the existing tabs. Each time a new tab is created, the title "No name" is assigned by default. This can be changed when saving the execution lists, as mentioned below.

4.1.2 Visualizing and exporting the standard output

Two panels are associated with each execution list. They show the standard output and the standard error of each executed command in the list. The content of these panels can be exported to a text file by right-clicking to show a pop-up menu, choosing the Export option and specifying the destination file.

[Figure 9: Several lists at the same time]
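The behaviour of an execution list (commands run in order, with standard output and standard error captured separately and exportable to a text file) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not how Jwire is implemented: `run_command_list` and `export_output` are hypothetical names, and the commands passed in are whatever the caller supplies rather than real WIRE commands.

```python
import subprocess
import sys


def run_command_list(commands):
    """Run each command in order; collect (command, stdout, stderr)."""
    results = []
    for command in commands:
        # capture_output keeps the two streams separate, like the two panels.
        completed = subprocess.run(command, capture_output=True, text=True)
        results.append((command, completed.stdout, completed.stderr))
    return results


def export_output(results, path):
    """Write the captured output of every command to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for command, out, err in results:
            handle.write("$ " + " ".join(command) + "\n")
            handle.write(out)
            handle.write(err)
```

For a quick check, a placeholder command such as `[sys.executable, "-c", "print('ok')"]` can stand in for a real crawler command.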
6. [Figure 18: Information about the graph]

5.2 Multilayer graphs

The multilayer graph is a set of web graphs that represent different levels within a hierarchical containment structure. The data generated by WIRE is placed on a single level, which is reached from higher levels that are created artificially and represent groupings of that data. The following diagram shows how the levels of the multilayer graph are built.

[Figure 19: Hierarchical aggregation of the multilayer graph]

The nodes of the lowest level are extracted by WIRE. Jwire shows the top level by default, and the lower levels can be reached by double-clicking on the node that represents the aggregation to explore.

[Figure 20: Level 0]

[Figure 21: Level 1]
7. [Figure 27: Data about the nodes. The example window lists the fields URL, Web site, Total degree (65), In degree (10), Out degree (55) and Neighbors (65) for the node www.usal.es/webusal on the web site www.usal.es.]
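The connectivity figures in the node data window can be related to the links of the graph with a small sketch. The manual does not say how Jwire computes them, so the Python function below is an assumption: it takes the total degree as in degree plus out degree (consistent with the 10 + 55 = 65 shown in the figure) and counts neighbors as the distinct nodes linked in either direction. The name `node_stats` is hypothetical.

```python
def node_stats(links, node):
    """Compute connectivity data for one node.

    links: list of (source, target) pairs, one per directed link.
    """
    in_degree = sum(1 for source, target in links if target == node)
    out_degree = sum(1 for source, target in links if source == node)
    # Distinct nodes reachable over one link in either direction.
    neighbors = {t for s, t in links if s == node}
    neighbors |= {s for s, t in links if t == node}
    neighbors.discard(node)  # a self-link does not make a node its own neighbor
    return {
        "in_degree": in_degree,
        "out_degree": out_degree,
        "total_degree": in_degree + out_degree,
        "neighbors": len(neighbors),
    }
```

Under these assumptions the neighbor count can be lower than the total degree, for example when two pages link to each other.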
8. [Figure 10: Standard output and error]

4.1.3 Controls bar

The bar in the upper part of Jwire's central execution panel contains several controls, with the following functions:

[Figure 11: Controls bar]

• Execute: Executes the command list.
• Stop: Stops the execution of the command list, if possible.
• Save: Saves the command list to a file for later reuse.
• Load: Loads into Jwire a previously saved command list from a file.
• Configure: Configures the parameters of each WIRE command according to the needs of that command.
• Remove: Removes the selected command from the command list.

4.2 Managing WIRE reports

From Jwire it is possible to open the reports generated by the WIRE commands. Jwire uses the applications available on the operating system on which it is installed to open those files. Thus, before using this functionality, Jwire must be configured to indicate which applications will open the different files.
9. User manual for Jwire 1.0
September 2010
Luis Alberto García Hernández

Information about the author

Luis Alberto García Hernández
Informatics engineer
luisalbertogh@hotmail.com
http://it.luisalbertogh.net

This document can be freely distributed.
(c) 2010 Luis Alberto García Hernández

Abstract

WIRE is a crawler, or web spider, developed by a team from the University of Chile and released for public use under the GPL license. The crawler can crawl and extract huge amounts of information from the Internet by accessing many web sites. Given the necessary computational resources and time, it can crawl the whole domain of a given country.

This crawler is used by professors and researchers from the University of Salamanca in their work in different fields of computer science, such as Web Mining and Cybermetrics.

Because of the complexity of the operations performed by the crawler and the lack of a graphical user interface, WIRE can be tedious and difficult to use for non-advanced users. Also, with crawlers in general and WIRE in particular, the amount of information retrieved from the crawled networks is in most cases very large. Analyzing such an amount of data can be complex without the necessary techniques and tools.

The goal of this project is to analyze the different visualization techniques that can be applied to the
10. analysis of the data obtained by a crawler, in this case WIRE, and to develop a set of visualization tools that implement those techniques in order to verify and validate their use. This set of tools will be developed as a graphical environment for WIRE named Jwire. Thus the original project title, "WIRE, a multifunction crawler", has been replaced by the current one.

The project adds to WIRE a new graphical user interface from which those visualization tools can be used together with the whole set of WIRE functionalities, but in a simpler and quicker way, making WIRE more accessible to any kind of user, automating tasks and, in summary, adding value to the existing application.

This document presents a brief graphical description of the functionalities of Jwire version 1.0. For more information about the operation of Jwire, consult the technical manual.

Contents

List of Figures
1 Requirements
2 Installation and execution
3 Configuration module
  3.1 Edit wire.conf
  3.2 Edit urls file
4 Execution module
  4.1 Execution of WIRE commands
    4.1.1 Create several execution lists
    4.1.2 Visualizing and exporting the standard output
    4.1.3 Controls bar
  4.2 Managing WIRE repor
11. cting the Execution tab:

[Figure 7: Execution module]

1. Select the WIRE command from the commands list.
2. Drag the selected commands to the execution list.
3. Optionally, configure the execution parameters of each command using the Configure button on the control bar.
4. Execute the selected list. The commands in the list are executed in order, from left to right. The command being executed is shown with a red background.
5. The standard output and standard error of the commands appear in the lower part of the window.

For more information about the control bar in the execution module, see section 4.1.3.

[Figure 8: Command execution]

4.1.1 Create several execution lists

Several execution lists can be handled at the same time by creating new tabs in the upper part of the central canvas. It is enough to click on the green
12. data on the screen in the form of web graphs and other visualizations that are useful for analyzing the obtained data.

A tool bar, in the form of a row of buttons, offers the following functions:

[Figure 15: Visualization toolbar]

• Create net: Generates a file in .net (Pajek) format with the data extracted in the last WIRE execution. This is useful for visualizing or manipulating the data with applications such as Pajek, or others that only work with the .net format.
• Create GraphML: Generates an XML file with the GraphML schema for web graphs. This step is necessary in order to visualize in Jwire the data generated by WIRE.
• Open file: Opens the selected file in a separate window and shows it in text mode.
• Conversor: Converts files from .net to GraphML and vice versa.
• Export: Exports the graph shown in the visualization area to an image.
• Load graph: Once the GraphML file has been generated, it can be opened as a web graph by selecting it. There are two visualization modes: the force directed graph, which shows all the generated data in a web graph driven by a system of artificial forces, and the multilayer graph, which shows the data distributed in a hierarchical structure that forms a graph of 3 different levels.
• Stop: Stops or restarts the movement of the web graph.

The way to
13. visualize the extracted data is the following:

1. Generate the XML file in GraphML format by clicking the corresponding button on the tool bar. The file must have the .xml extension.
2. Load the file with the graph. Use the drop-down on the tool bar, selecting first the type of graph to visualize and then the corresponding XML file.

[Figure 16: Visualizing data in Jwire]

5.1 Force directed web graph

The following is an example of the visualization of data generated by WIRE as a force directed graph. By default, the aggregations are shown inside the graph as colored spots that group the nodes belonging to the same web site. That is, each colored spot is a single web site, such as www.something.com.

[Figure 17: Force directed web graph]

The lower part of the visualization area shows information about the graph and about the nodes and links selected by hovering the mouse pointer over them. Normally it shows the name of the open file, the total number of nodes, the name of the selected node or aggregation, and the number of nodes in the aggregation.
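The conversion from the Pajek .net format to GraphML performed by the Conversor button can be pictured with a rough sketch. The Python function below is an assumption-laden illustration, not Jwire's converter: `net_to_graphml` is a hypothetical name, it reads only the `*Vertices` and `*Arcs`/`*Edges` sections of a simple .net file, it drops vertex labels and link weights, and it emits only the bare GraphML structure. Whatever additional attributes Jwire expects in its GraphML files are not documented here, so the output is not claimed to load in Jwire.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def net_to_graphml(net_text):
    """Convert minimal Pajek .net text into a GraphML string."""
    root = ET.Element("graphml", xmlns=GRAPHML_NS)
    graph = ET.SubElement(root, "graph", edgedefault="directed")
    section = None
    for raw in net_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("*"):
            # Section headers such as "*Vertices 2" or "*Arcs".
            section = line.split()[0].lower()
            continue
        parts = line.split()
        if section == "*vertices":
            # Only the numeric id is kept; the quoted label is ignored.
            ET.SubElement(graph, "node", id="n" + parts[0])
        elif section in ("*arcs", "*edges"):
            ET.SubElement(graph, "edge",
                          source="n" + parts[0], target="n" + parts[1])
    return ET.tostring(root, encoding="unicode")
```

A two-vertex input with a single arc yields a `graphml` document containing two `node` elements and one `edge` element.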
14. [Figure 3: Properties of wire.conf]

3.2 Edit urls file

1. Select the Configuration tab.
2. Click the button labeled start_urls.txt.

[Figure 4: Editing the urls file]

• Use the path field to select a new urls file.
• Use "Apply changes" to save the changes made.
• Use "Save as" to store the changes in a file without applying them to the urls file.
• Use the "Select a configuration" drop-down to load the data stored in a file for later modification.

[Figure 5: Commands toolbar]

• Use the table to add, delete and edit urls for use with Jwire.

[Figure 6: Urls table, listing http://www.dot.tk, http://www.tcl.tk, http://www.usal.es and http://campus.usal.es]

4 Execution module

4.1 Execution of WIRE commands

After sele
15. [Figure 22: Level 2]

As can be seen in levels 1 and 2, the previous level is always shown in a reduced view on the left side, so the origin of the current view stays visible. Double-clicking on that reduced view returns to the corresponding level.

5.3 Tools for graphs

For both the force directed web graph and the multilayer graph there is a set of tools for manipulating and visualizing the different components of the graphs. These tools implement most of the standard techniques for graph visualization and are recommended for getting the best view of the graphs. Most of these tools are accessible from the tool bar on the right side of the visualization area.

[Figure 23: Toolbar for graphs]

This tool bar is not always the same: the available options differ for each type of graph. The image shows all the existing options, which represent the following functions, from top to bottom:

1. Global view: Shows on the graph a reduced area that contains the global view of the graph, with a rectangle that marks the zone of the graph shown enlarged in the central area of the window.
16. t but it can be shown by clicking or dragging the black arrows on the left side of the window.

Finally, certain features have been added to the web graph to ease its visual analysis. Nodes selected with the mouse pointer are automatically colored red, and the nodes directly linked to them are colored yellow. Clicking on a node while pressing the SHIFT key shows a small window with data about the connectivity of the node within the graph (degrees, neighbours, etc.). Clicking on a node while pressing the Control key opens the default web browser of the system, which tries to access the web address represented by the node.

[Figure 25: Treemap]

[Figure 26: Force panel and nodes searcher, with NBodyForce, DragForce and SpringForce settings and a Search field]
17. the window.
2. A hierarchical tree is shown with the structure of the folders and files created by WIRE to store the generated reports. Browse this structure to find the desired file and double-click on it to open it in a separate application.

Jwire looks for the files generated by WIRE in the default location that WIRE uses to store them, which depends on the folder WIRE uses to store its data. To configure the path of this folder, it is necessary to modify the wire.conf file and the WIRE environment variable.

Finally, the content of the reports directory tree can be refreshed with the Refresh option of the pop-up menu of the Files tab.

5 Visualization module

The Jwire visualization and visual analysis tools are available by activating the Visualization tab.

[Figure 14: Visualization module]

Jwire offers WIRE a set of visualization tools that implement most of the standard techniques for visualizing and analyzing the data extracted from a network, in particular web graphs and treemaps. The visualization module uses the WIRE commands that format and generate the extracted data, and then uses the recommended visualization techniques to show those
18. ts
5 Visualization module
  5.1 Force directed web graph
  5.2 Multilayer graphs
  5.3 Tools for graphs

List of Figures

1 Editing wire.conf
2 Commands toolbar
3 Properties of wire.conf
4 Editing the urls file
5 Commands toolbar
6 Urls table
7 Execution module
8 Command execution
9 Several lists at the same time
10 Standard output and error
11 Controls bar
12 Configuring applications
13 Accessing the files
14 Visualization module
15 Visualization toolbar
16 Visualizing data in Jwire
17 Force directed web graph
18 Information about the graph
19 Hierarchical aggregation of the multilayer graph
20 Level 0
21 Level 1
22 Level 2
23 Toolbar for graphs
24
