1. or an electronic document. If the input document is imaged pages, layout analysis and OCR are first performed on the document. This may be done by separate components or by a more sophisticated OCR system, such as those marketed by Nuance (http://www.nuance.com) or ABBYY (http://www.abbyy.com), which will convert a scanned document into a PDF document.

[0032] When working with electronic documents with some markup, such as some PDF documents, the sections of text can be identified either directly from the tags or, if the tags do not contain section information, using heuristics based on line spacing, font height, and indentation. For example, regions of text with the same line spacing and font height are considered to be in the same section, unless the left edge is indented, indicating that the current line is the beginning of a new section. A larger spacing between a pair of lines also indicates the start of a new section.

Keyphrase Discovery

[0033] There are a number of ways to identify keyphrases, and any can be used (Turney 1997). A straightforward method is to tag the part of speech (POS) of the text and then identify POS tag sequences that correspond to a noun phrase (Turney 1997). Another method is to identify sequences of words between stop words or non-content words (Chen 1995).

[0034] When a document has multiple sections and keyphrases are selected to be representati
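As an illustration of the stop-word method mentioned in paragraph [0033], the following minimal Python sketch collects runs of consecutive non-stop words as candidate keyphrases. The function name `candidate_keyphrases`, the tiny `STOP_WORDS` list, and the choice to treat punctuation as a phrase boundary are assumptions made for this example; they are not taken from the application or from the cited Chen 1995 method.

```python
import re

# Hypothetical, deliberately small stop list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "by", "for", "in", "is",
              "of", "on", "or", "that", "the", "to", "with"}


def candidate_keyphrases(text, stop_words=STOP_WORDS):
    """Return runs of consecutive non-stop words, in order of appearance.

    Stop words and punctuation both end the current run (an assumption of
    this sketch, so that phrases do not span clause boundaries).
    """
    candidates = []
    run = []
    # Tokens are either lowercase words or single non-letter, non-space characters.
    for token in re.findall(r"[a-z]+|[^a-z\s]", text.lower()):
        if token.isalpha() and token not in stop_words:
            run.append(token)
        else:
            if run:
                candidates.append(" ".join(run))
            run = []
    if run:
        candidates.append(" ".join(run))
    return candidates


if __name__ == "__main__":
    print(candidate_keyphrases("the size of the screen area, and the font height"))
```

The candidates produced this way would still need to be scored and ranked before one is chosen for a section.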
2. and interprets the command that the user had inputted through the display. The processing unit may utilize RAM/ROM 806 and the CPU 805 for processing the information. For example, if the user input is a command to highlight all of the squares of documents corresponding to the keyword "ipsum", the processing unit will process those instructions and forward them to the display controller 803, which then proceeds to highlight all of the squares of documents with at least one incidence of that keyword. Similarly, if the user inputs a command to highlight all of the squares of documents corresponding to multiple keywords, then only the squares of the documents with those multiple keywords are highlighted. Other embodiments of the invention are also possible through this example computer platform. Furthermore, the computer platform is not limited to receiving commands by tactile interaction; other I/O devices 804, as previously described, may be attached to the computer platform for inputting commands for the processing unit.

[0069] Finally, it should be understood that processes and techniques described herein are not inherently related to any particular apparatus and may be implemented by any suitable combination of components. Further, various types of general purpose devices may be used in accordance with the teachings described herein. It may also prove advantageous to construct specialized apparatus to perform the metho
3. and wanting to keep the font size reasonably readable. Our method is one example of keyphrase selection. Any method which allows for selecting keyphrases with a specified maximum number of terms per keyphrase and a ranking of keyphrases can be used.

Interfaces

[0047] The system supports multiple different visualizations and interaction techniques.

[0048] FIG. 1 illustrates a multi-document view in accordance with an embodiment of the inventive concept. On the collection overview screen 100, keywords that best describe the set of documents in the collection are distributed about the interface 101. Each document is represented by a square 102. Square location is determined by the number of occurrences of the displayed keywords in the corresponding document. For example, the document represented by a square 105 in FIG. 1 is located slightly closer to "ipsum" than "lorem" and not near "dolor" or "sit amet". This indicates that the term "ipsum" occurs more times than "lorem" in the document and that the terms "dolor" and "sit amet" do not occur in the document. Square size is determined using the sum of the occurrences of all keyphrases currently in view in the corresponding document; thus, a small square may represent a large document that is not well represented by the keyphrases currently in view. Selecting a keyphrase highlights all of the squares of documents with at least one occurrence
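One way to picture the layout rule in paragraph [0048] is the short Python sketch below. The application only states that square location depends on keyword occurrence counts and that square size depends on their sum; the occurrence-weighted average of keyword positions, the linear size formula, and the names `place_square`, `min_side`, and `scale` are assumptions chosen for illustration.

```python
def place_square(counts, anchors, min_side=4.0, scale=2.0):
    """Return (x, y, side) for one document square, or None if no displayed
    keyword occurs in the document.

    counts  -- mapping keyword -> occurrences in this document
    anchors -- mapping displayed keyword -> (x, y) position on the screen
    The weighted-average placement is an assumption of this sketch.
    """
    total = sum(counts.get(keyword, 0) for keyword in anchors)
    if total == 0:
        return None
    # Pull the square toward each keyword in proportion to its occurrences.
    x = sum(counts.get(k, 0) * pos[0] for k, pos in anchors.items()) / total
    y = sum(counts.get(k, 0) * pos[1] for k, pos in anchors.items()) / total
    # Size grows with the summed occurrences of the keywords currently in view.
    side = min_side + scale * total
    return (x, y, side)


if __name__ == "__main__":
    anchors = {"lorem": (0.0, 0.0), "ipsum": (10.0, 0.0), "dolor": (0.0, 10.0)}
    print(place_square({"lorem": 1, "ipsum": 3}, anchors))
```

With these inputs the square lands closer to "ipsum" than to "lorem" and away from "dolor", which matches the qualitative behavior described for square 105.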
4. it hard for users to read textual information using such a device. Various solutions to this problem are being developed in the industry.

[0005] There are many methods that have been proposed for viewing documents and web pages on small screens. For example, Woodruff et al., "Using thumbnails to search the Web", Pages 198-205, ACM CHI '01, augment web search by automatically increasing the font size of search terms on returned documents. While the authors did not design their system for use on mobile devices, it could be implemented on mobile phone web browsers. However, the described approach does not use text summaries of segmented regions, only increases font size in situ rather than offering multiple different visualizations and interactors, and also does not provide mechanisms for visualizing keyphrases across a non-web document.

[0006] Berkner et al., "Image and Display Dependent Thumbnails", Pages 53-65, SPIE '04, create a condensed view of a document page, or a "SmartNail", by generating a layout with minimal white space that is composed of selected text in a readable size and selected images. In the created condensed view, the original document layout is usually changed. The goal of this study is to create a readable thumbnail for smaller displays such as PDAs. However, in the described system, there is no indexing between different sections and the original text.

[0007] The system described in Erol et al., "Multimedia Thumbnails for Do
5. of that keyphrase 103, in this case "ipsum". When multiple keyphrases are selected, only the squares of documents with at least one occurrence of each selected keyword are highlighted 104, in this case "ipsum" and [illegible]. This approach could be scaled to larger sets of displayed keyphrases using combinations of pan-and-zoom interaction techniques and 3D visualizations.

[0049] FIG. 2 illustrates a document overview with a keyphrase selection list. On the document overview screen 200, keyphrases 201 appear on a selection window. As the user scrolls through the selection list 202, the document segments corresponding to those keyphrases are highlighted 203. In the example figure, segments corresponding to the chosen keyphrase "Global Project" are highlighted in the upper left and lower left pages. A user can navigate through the different keyphrases in the selection list by, for example, using the up and down keys, dragging a pen up and down on the screen, or using a touch panel or the like, and can also navigate through the highlighted segments. Within a keyphrase, a user can navigate through different highlighted segments. When a highlighted segment is selected, it is outlined. Here, the segment in the upper left page is highlighted. When a user enters the input to do so by, for example, pressing the fire button (the middle key on a mobile phone) or tapping a highlighted area with a pen or the like, the interfac
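The multi-keyphrase highlighting rule stated at the start of this passage (a square is highlighted only if its document contains at least one occurrence of each selected keyphrase) can be sketched in a few lines of Python. The function name `highlighted_documents` and the dictionary layout are hypothetical, and returning an empty set when nothing is selected is an assumption of this example.

```python
def highlighted_documents(doc_counts, selected):
    """Return the set of document ids whose squares should be highlighted.

    doc_counts -- mapping document id -> {keyphrase: occurrence count}
    selected   -- iterable of currently selected keyphrases
    A document qualifies only if every selected keyphrase occurs at least once.
    """
    selected = list(selected)
    if not selected:
        return set()  # assumption: no selection means no highlighting
    return {
        doc_id
        for doc_id, counts in doc_counts.items()
        if all(counts.get(phrase, 0) >= 1 for phrase in selected)
    }


if __name__ == "__main__":
    docs = {
        "doc1": {"ipsum": 3, "lorem": 1},
        "doc2": {"ipsum": 2},
        "doc3": {"dolor": 4},
    }
    print(highlighted_documents(docs, ["ipsum"]))
    print(highlighted_documents(docs, ["ipsum", "lorem"]))
```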
6. the device to specify positions in a plane 0057 An external storage device 712 may be connected to the computer platform 701 via bus 704 to provide an extra or removable storage capacity for the computer platform 701 In an embodiment of the computer system 700 the external removable storage device 712 may be used to facilitate exchange of data with other computer systems 0058 The invention is related to the use of computer sys tem 700 for implementing the techniques described herein In an embodiment the inventive system may reside on a machine such as computer platform 701 According to one embodiment ofthe invention the techniques described herein are performed by computer system 700 in response to pro cessor 705 executing one or more sequences of one or more instructions contained in the volatile memory 706 Such instructions may be read into volatile memory 706 from another computer readable medium such as persistent stor age device 708 Execution of the sequences of instructions US 2009 0193350 A1 contained in the volatile memory 706 causes processor 705 to perform the process steps described herein In alternative embodiments hard wired circuitry may be used in place ofor in combination with software instructions to implement the invention Thus embodiments ofthe invention are not limited to any specific combination of hardware circuitry and soft ware 0059 The term computer readable medium as used herein refer
7. the documents displayed are filtered by that value. By toggling to neutral from another value, the filtering on that value is undone. Once a small number of documents are left, a user can indicate to display the document titles, as shown in FIG. 6C. A user may also select a document to view using a document viewer mode. FIG. 6D shows the use of author metadata, where documents by the author "j adcock" have been selected for display. In addition to removing document icons by toggling metadata values, other visual cues could be used, such as changing the colors or transparency of the icons, as well as drawing or not drawing the icons.

Exemplary Computer Platform

[0054] FIG. 7 is a block diagram that illustrates an embodiment of a computer/server system 700 upon which an embodiment of the inventive methodology may be implemented. The system 700 includes a computer/server platform 701, peripheral devices 702, and network resources 703.

[0055] The computer platform 701 may include a data bus 704 or other communication mechanism for communicating information across and among various parts of the computer platform 701, and a processor 705 coupled with bus 704 for processing information and performing other computational and control tasks. Computer platform 701 also includes a volatile storage 706, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 704 for storing various information as well a
8. SYSTEM AND METHOD FOR SUPPORTING DOCUMENT NAVIGATION ON MOBILE DEVICES USING SEGMENTATION AND KEYPHRASE SUMMARIZATION

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This regular U.S. patent application is based on and claims the benefit of priority under 35 U.S.C. 119 from provisional U.S. patent application No. 61/024,087, filed on Jan. 28, 2008, the entire disclosure of which is incorporated by reference herein.

[0002] This application also claims the benefit of priority of, and is a continuation-in-part of, U.S. application Ser. No. 12/242,757 by common inventors Scott Carter, Francine Chen, and Patrick Chiu, filed Sep. 30, 2008, and entitled "SYSTEM AND METHOD FOR SUPPORTING DOCUMENT NAVIGATION ON MOBILE DEVICES USING SEGMENTATION AND KEYPHRASE SUMMARIZATION", which in turn claims the benefit of priority under 35 U.S.C. 119 from provisional U.S. patent application No. 61/024,087, filed on Jan. 28, 2008. Application Ser. No. 12/242,757 is fully incorporated herein by reference for all purposes.

FIELD OF THE INVENTION

[0003] This invention generally relates to presenting information on information displays and, more specifically, to using displays of small size to render documents in a form convenient for viewing by a user.

BACKGROUND OF THE INVENTION

[0004] The size limitations of ultra-portable hand-held devices such as cell phones or PDAs limit the size of the screen area available for viewing information. This makes
9. US 20090193350A1

(19) United States
(12) Patent Application Publication
(10) Pub. No.: US 2009/0193350 A1
CARTER et al.
(43) Pub. Date: Jul. 30, 2009

(54) SYSTEM AND METHOD FOR SUPPORTING DOCUMENT NAVIGATION ON MOBILE DEVICES USING SEGMENTATION AND KEYPHRASE SUMMARIZATION

(75) Inventors: Scott CARTER, Los Altos, CA (US); Francine CHEN, Menlo Park, CA (US); Patrick CHIU, Menlo Park, CA (US)

Correspondence Address: SUGHRUE MION, PLLC, 2100 PENNSYLVANIA AVENUE, N.W., SUITE 800, WASHINGTON, DC 20037 (US)

(73) Assignee: FUJI XEROX CO., LTD., Tokyo (JP)
(21) Appl. No.: 12/268,343
(22) Filed: Nov. 10, 2008

Related U.S. Application Data
(63) Continuation-in-part of application No. 12/242,757, filed on Sep. 30, 2008.
(60) Provisional application No. 61/024,087, filed on Jan. 28, 2008.

Publication Classification
(51) Int. Cl.: G06F 3/048 (2006.01)
(52) U.S. Cl.: 715/765

(57) ABSTRACT

Described is a system that characterizes segments of a document with one or more keyphrases and then uses the keyphrases to help users find interesting parts of the document. Keyphrases are displayed with information about the location of the phrase in the document and are used as pointers to quickly move from an overview to a section of potential interest. In another implementation, when there are many documents in a collection, the inventive multi-document view can be used to reduce the number of documents presented, helping a user to more efficien
10. ally, Leuski's Lighthouse, described in "Lighthouse: showing the way to relevant information", Pages 125-130, IEEE InfoVis '00, 2000, is a search engine that presents returned documents with both a flat list and a cluster of spheres positioned according to the similarity of their corresponding documents.

[0011] Despite the foregoing advances, the conventional industry approaches are deficient in their ability to facilitate efficient use of displays of small size to render documents in a form convenient for viewing by a user.

SUMMARY OF THE INVENTION

[0012] The inventive methodology is directed to methods and systems that substantially obviate one or more of the above and other problems associated with conventional techniques for displaying documents on small information displays.

[0013] Various embodiments of the inventive concept include devices, methods, and computer readable mediums containing computer code for identifying multiple segments of a document; determining at least one keyphrase and associating the determined at least one keyphrase with each identified segment; displaying the determined at least one keyphrase; and, upon a user's selection of the at least one keyphrase, enabling the user to view the corresponding segment of the document.

[0014] Various embodiments of the inventive concept also include a device with a display unit, a sensing unit which is configured to sense input, a processing unit which is operable to pro
11. cept 0021 FIG 2 illustrates document overview with key phrase selection list in accordance with an embodiment ofthe inventive concept 0022 FIG 3 illustrates page overview in accordance with an embodiment of the inventive concept 0023 FIG 4 illustrates reflow left and zoomed right views in accordance with an embodiment of the inventive concept 0024 FIG 5 illustrates alternative keyphrase selection list in accordance with an embodiment of the inventive concept 0025 FIG 6A 6D illustrate another example of a multi document view in accordance with an embodiment of the inventive concept 0026 FIG 7 illustrates an exemplary embodiment of a computer platform upon which the inventive system may be implemented 0027 FIG 8 illustrates an example functional diagram of how the present invention relates to the computer platform DETAILED DESCRIPTION 0028 Inthe following detailed description reference will be made to the accompanying drawings in which identical Jul 30 2009 functional elements are designated with like numerals The aforementioned accompanying drawings show by way of illustration and not by way of limitation specific embodi ments and implementations consistent with principles of the present invention These implementations are described in sufficient detail to enable those skilled in the art to practice the invention and it is to be understood that other implementa tions may be utilized and
12. cess the input to identify multiple segments of a docu ment and to forward instructions to a display controller to highlight zoom or navigate through the identified document segments and the display controller operable to process the forwarded instructions and to generate a resulting visual rep resentation for display on the display unit The processing unit is further configured to determine at least one keyphrase and associate the determined at least one keyphrase with each identified segment 0015 Various embodiments of the inventive concept also include devices methods and computer readable mediums for displaying documents as visual representations and US 2009 0193350 A1 grouping the documents based on the occurrences of key phrases wherein the size of each visual representation depends on a function of the number of occurrences of all keyphrases in the corresponding document highlighting all ofthe visual representations with at least one occurrence of a selected keyphrase and highlighting only visual representa tions with at least one occurrence of each selected keyphrase when multiple keyphrases are selected 0016 Various embodiments of the inventive concept include methods computer programming products and sys tems for preparing multiple documents determining at least one value of at least one type of metadata corresponding to each of the multiple documents and associating the at least one value of the metadata with t
13. cuments", Pages 231-240, ACM Multimedia '06, automatically creates an animation that pans to important segments on a web page. The described approach also includes audio cues that include keyphrases for the document text as well as figure captions. However, this approach does not augment manual interaction and relies on audio, which at times may be unavailable or inappropriate.

[0008] In M. Hood, "E-Newspapers: Digital Deliverance", IEEE Spectrum, February 2007, an iLiad document reader operates to overlay the title and first sentence of news articles on top of the full document.

[0009] Hearst's TileBars, described in "TileBars: Visualization of Term Distribution Information in Full Text Information Access", Pages 59-66, ACM CHI '95, 1995, include rows of tiles that correspond to the results of query term sets, where each tile represents a text segment and the length of a row represents the length of the document. The term frequency is indicated by the gray level of the tile, and the term distribution by these tiles as they appear in the overall graphic representation.

[0010] Rattenbury and Canny's CAAD system, described in "CAAD: An Automatic Task Support System", Pages 687-696, ACM CHI '07, 2007, represents collections of documents in a pannable, zoomable interface. However, this system clusters files related to a common activity rather than keyphrases, and the display is not designed for a mobile interface. Addition
14. d steps described herein The present invention has been described in relation to particular examples which are intended in all respects to be illustrative rather than restrictive Those skilled in the art will appreciate that many different combinations of hardware software and firmware will be suitable for prac ticing the present invention For example the described soft ware may be implemented in a wide variety of programming or scripting languages such as Assembler perl shell PHP Java etc 0070 Moreover other implementations of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein Various aspects and or components of the described embodiments may be used singly or in any combination in the inventive information display and navigation system It is intended that the specification and examples be considered as exemplary only with a true scope and spirit of the invention being indicated by the following claims What is claimed is 1 A method comprising preparing a plurality of documents determining at least one value of at least one type of meta data corresponding to each ofthe plurality of documents and associating the at least one value of the metadata with the corresponding document displaying a plurality of icons corresponding to the plural ity of documents in 2 or 3 dimensions in a first display region and the a
15. document is represented by a square, as in FIG. 1, but this view differs in several ways. Rather than being laid out based on keywords, the documents are laid out on the horizontal axis according to a value of a date type of metadata, such as creation date, publication date, last modified date, last annotated date, and last referenced date. The vertical axis is random, to spread out the documents. In addition, metadata is presented in a sidebar 601. Examples of metadata types include keyphrases, authors, topics, location, time, or publications. One type of metadata values is displayed at a time. For example, the most frequent keyphrases could be displayed as shown in FIG. 6A, and then, upon pushing a button, the metadata type could cycle so that authors are displayed, for example. When a user indicates that a metadata value, e.g., "Search" 602, is of interest, a symbol is displayed beside the metadata 602, as shown in FIG. 6B, and only documents with that metadata value are displayed. When a user indicates that a metadata value, e.g., "Video" 603, is not of interest, a symbol is displayed beside the metadata 603, as shown in FIG. 6B, and all documents with those metadata values are not displayed. Other indicators could be used instead of symbols, such as color (e.g., red and green) or font (e.g., bold and italic). Each metadata value can be set to either symbol or to neutral. By toggling a metadata value to either symbol,
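The three-state metadata filter described in this passage can be sketched in Python as follows. The state names `"include"`, `"exclude"`, and `"neutral"` stand in for the symbols used in the figures and are hypothetical, as is the function name `visible_documents`. Requiring every value of interest to be present when several are marked is an assumption; the text only describes the single-value case.

```python
def visible_documents(docs, states):
    """Return the ids of documents that remain displayed after filtering.

    docs   -- mapping document id -> set of metadata values for the current type
    states -- mapping metadata value -> "include", "exclude", or "neutral"
    Values missing from `states` are treated as neutral.
    """
    required = {value for value, state in states.items() if state == "include"}
    excluded = {value for value, state in states.items() if state == "exclude"}
    return [
        doc_id
        for doc_id, values in docs.items()
        # Keep a document only if it has every value of interest
        # and none of the values marked as not of interest.
        if required <= values and not (excluded & values)
    ]


if __name__ == "__main__":
    docs = {
        "a": {"Search", "Display"},
        "b": {"Search", "Video"},
        "c": {"Office"},
    }
    print(visible_documents(docs, {"Search": "include", "Video": "exclude"}))
```

Setting a value back to `"neutral"` removes it from both sets, which corresponds to undoing the filtering on that value.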
16. e zooms in on the appropriate highlighted segment.

[0050] FIG. 3 illustrates a page overview. On the page overview screen 300, all of the keyphrases are overlaid on top of their respective segments 301. The keyphrases can also be mapped to numbers on the keypad, which are shown directly next to each keyphrase 302. When a user either taps a keyphrase with a pen or enters a number, the interface zooms in on the appropriate segment. Here, if the user pressed key 3, the application would zoom into that segment.

[0051] FIG. 4 illustrates reflow (left) and zoomed (right) views. On the zoomed image 401 and text reflow screens 400, the user can navigate through a page's keyphrases on a selection window 402. As the user scrolls through the selection list, the document segments corresponding to those keyphrases are highlighted in an overview visualization 403. A user navigates through the different keyphrases in the selection list using the up and down keys or by dragging a pen up and down on the screen, and selects a keyphrase by pressing the fire button or by selecting a highlighted keyphrase 404. When a user selects a keyphrase, the application then zooms into the appropriate segment 405.

[0052] FIG. 5 illustrates an alternative keyphrase selection list. In another embodiment, keyphrases in the selection window are shown with small graphic icons next to them 500. This technique adheres to an effective information visua
17. east value of at least one type of metadata corresponding to each of the plurality of documents, and associate the at least value of the metadata with the corresponding document; and
a display controller coupled to the display unit and operable to cause to be displayed a plurality of icons corresponding to the plurality of documents in 2 or 3 dimensions in a first display region and at least one of the at least value of the metadata at a second display region,
wherein the sensing unit is operable to sense selecting of at least one value of the metadata by a user, and
wherein the processing unit is further operable to cause varying the display states of the plurality of the icons in the first display region based on the selected at least one value of the metadata.

* * * * *
18. ed and include:

[0037] 1) tf: number of times a term occurs in the current section
[0038] 2) tf: number of times a term occurs in the document
[0039] 3) number of documents in which a term occurs at least once in an English corpus. We used a list from the Berkeley and Stanford Digital Libraries project, which was available at ftp://elib.cs.berkeley.edu/outgoing/docfreq but is not available online anymore.
[0040] 4) df: number of sections in a document in which a term occurs at least once
[0041] 5) k: number of times the candidate keyphrase has previously been selected as a keyphrase
[0042] 6) t: number of tokens in the keyphrase
[0043] 7) l: location of first mention of the term in the paragraph

[0044] The weighted combination of terms is given by

$$\mathrm{Score}(k, s, d) = \sum_i \lambda_i \, f_i(k, s, d)$$

[0045] where $\lambda_i$ is the weight given to feature $i$ and $f_i(k, s, d)$ is the value of feature $i$ for keyphrase candidate $k$ in section $s$ in document $d$. Other combination or ranking models can be used. For example, if training data labeled with keyphrases for each section is available, then more powerful models, such as a maximum entropy model (Berger et al. 1996), could be used instead.

[0046] Once each of the keyphrases is scored, they are then ranked against each other and the best keyphrase(s) is selected for each section. For our application, we select only the best keyphrase and limit the maximum number of terms to two because of limited screen space
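The scoring and selection steps of paragraphs [0044] to [0046] can be sketched in Python as a weighted sum over named features followed by picking the top-ranked candidate with at most two terms. The feature names, the weight values, and the function names `score` and `best_keyphrase` are hypothetical; the application does not give concrete weights, and the negative weight on prior selections is only one plausible choice.

```python
def score(features, weights):
    """Weighted linear combination: sum of weight * feature value.

    Features without a weight contribute nothing (weight 0.0).
    """
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def best_keyphrase(candidates, weights, max_terms=2):
    """Return the highest-scoring candidate with at most `max_terms` terms.

    candidates -- mapping keyphrase -> {feature name: feature value}
    Returns None when no candidate satisfies the length limit.
    """
    eligible = {
        phrase: feats
        for phrase, feats in candidates.items()
        if len(phrase.split()) <= max_terms
    }
    if not eligible:
        return None
    return max(eligible, key=lambda phrase: score(eligible[phrase], weights))


if __name__ == "__main__":
    # Illustrative weights only; real weights would be tuned or learned.
    weights = {"tf_section": 1.0, "tf_document": 0.5, "times_selected": -2.0}
    candidates = {
        "screen area": {"tf_section": 3, "tf_document": 5, "times_selected": 0},
        "font": {"tf_section": 2, "tf_document": 9, "times_selected": 1},
        "small information displays": {"tf_section": 10, "tf_document": 10, "times_selected": 0},
    }
    print(best_keyphrase(candidates, weights))
```

In this example the three-term candidate is excluded by the length limit even though its raw score is highest, which mirrors the two-term cap described for small screens.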
19. he telephone line and use an infra red transmitter to convert the data to an infra red signal An infra red detector can receive the data carried in the infra red signal and appropriate cir cuitry can place the data on the data bus 704 The bus 704 carries the data to the volatile storage 706 from which pro cessor 705 retrieves and executes the instructions The instructions received by the volatile memory 706 may option ally be stored on persistent storage device 708 either before or after execution by processor 705 The instructions may also be downloaded into the computer platform 701 via Internet using a variety of network data communication protocols well known in the art 0062 The computer platform 701 also includes a commu nication interface such as network interface card 713 coupled to the data bus 704 Communication interface 713 provides a two way data communication coupling to a network link 714 that is connected to a local network 715 For example com munication interface 713 may be an integrated services digi tal network ISDN card or a modem to provide a data com munication connection to a corresponding type of telephone line As another example communication interface 713 may be a local area network interface card LAN NIC to provide a data communication connection to a compatible LAN Wireless links such as well known 802 11a 802 11b 802 11g and Bluetooth may also used for network implementa tion In any such imp
20. he corresponding document displaying multiple icons corresponding to the multiple docu ments in 2 or 3 dimensions in a first display region and at the least one value ofthe metadata at a second display region and selecting the at least one value of the metadata for varying the display states of the multiple icons in the first display region based on the selected at least one value of the metadata 0017 Additional aspects related to the invention will be set forth in part in the description which follows and in part will be obvious from the description or may be learned by practice of the invention Aspects of the invention may be realized and attained by means of the elements and combina tions of various elements and aspects particularly pointed out in the following detailed description and the appended claims 0018 Itis to be understood that both the foregoing and the following descriptions are exemplary and explanatory only and are not intended to limit the claimed invention or appli cation thereof in any manner whatsoever BRIEF DESCRIPTION OF THE DRAWINGS 0019 The accompanying drawings which are incorpo rated in and constitute a part of this specification exemplify the embodiments of the present invention and together with the description serve to explain and illustrate principles of the inventive technique Specifically 0020 FIG 1 illustrates a multi document view in accor dance with an embodiment of the inventive con
21. [Drawing sheets 5 to 8 of 8, Patent Application Publication US 2009/0193350 A1, Jul. 30, 2009. Figure 5: keyphrase selection list with placeholder entries (Lorem, Ipsum dolor, Sit amet, Consectetuer). Figure 6C: multi-document view with document titles displayed. Figure 6D: multi-document view with an Authors sidebar (g golovchinsky, j adcock, t j pickens, p chiu, s carter). The Figure 6 sidebars also list the keyphrase values Collaboration, Search, Display, Office, and Video, with reference numerals 601 to 604. Sheet 7: block diagram, labels illegible. Figure 8: functional diagram with Sensing Unit, Display Controller, and Processing Unit.]
22. lementation communication interface 713 sends and receives electrical electromagnetic or optical signals that carry digital data streams representing various types of information Jul 30 2009 0063 Network link 713 typically provides data commu nication through one or more networks to other network resources For example network link 714 may provide a connection through local network 715 to a host computer 716 ora network storage server 717 Additionally or alternatively the network link 713 may connect through gateway firewall 717 to the wide area or global network 718 such as an Inter net Thus the computer platform 701 can access network resources located anywhere on the Internet 718 such as a remote network storage server 719 On the other hand the computer platform 701 may also be accessed by clients located anywhere on the local area network 715 and or the Internet 718 The network clients 720 and 721 may them selves be implemented based on the computer platform simi lar to the platform 701 0064 Local network 715 and the Internet 718 both use electrical electromagnetic or optical signals that carry digital data streams The signals through the various networks and the signals on network link 714 and through communication interface 713 which carry the digital data to and from com puter platform 701 are exemplary forms of carrier waves transporting the information 0065 Computer platform 701 can send messages and
23. lization design principle known as "Small Multiples" (Tufte 1990). The graphic icon represents a document page with regions highlighted 503 that correspond to the spatial location of each instance of the keyphrase on the page, or the segments on the current page in which the keyphrase appears. Also in this embodiment, a horizontal pane 501 across the top of the selection list highlights all of the pages in the document on which the highlighted keyphrase appears. Boxes highlight the page the user is currently viewing as well as the currently selected keyphrase 502. The small graphic icons allow the reader to infer semantic information about each keyphrase by its location (e.g., a keyphrase is part of the title). The distribution of the keyphrases can also be read off by looking at these graphic icons. The highlights could additionally be coded by color or intensity to indicate the number of times a keyphrase appears in the segment or, for the horizontal pane, the page.

[0053] FIGS. 6A through 6D illustrate another embodiment of a multi-document view in accordance with an embodiment of the inventive concept. When there are many documents in a collection, the inventive multi-document view can be used to reduce the number of documents presented, helping a user to more efficiently find documents of interest. In this view, a user, possibly repeatedly, filters the documents displayed based on the metadata values. An exemplary view is shown in FIG. 6A. Each
24. n the selected least one type of the value of the metadata is a keyphrase and wherein only the icons corresponding to the documents containing with the keyphrase are displayed in the first display region 10 The method of claim 1 further comprising displaying titles of the documents corresponding to the icons displayed in the first display region 11 A computer readable medium embodying a set of instructions which when executed by one or more proces sors cause the one or more processors to perform a method comprising preparing a plurality of documents determining at least one value of at least one type of meta data corresponding to each ofthe plurality of documents and associating the at least value ofthe metadata with the corresponding document displaying a plurality of icons corresponding to the plural ity of documents in 2 or 3 dimensions in a first display region and at least one value of the metadata at a second display region and selecting the at least one value ofthe metadata for varying the display states of the plurality of the icons in the first display region based on the selected the at least one value of the metadata 12 The computer readable medium of claim 11 wherein varying the display states comprises at least one of drawing or not drawing the icons changing colors of the icons and changing transparency of the icons 13 The computer readable medium of claim 11 wherein each ofthe plurality of icons are
25. positioned in the first display region based, at least in part, on a value of a date type of metadata of the corresponding the plurality of documents. 14. The computer-readable medium of claim 11, wherein the metadata comprises at least one type of a keyphrase, an author, a topic, a location, a time, or a publication associated with the corresponding document. 15. The computer-readable medium of claim 14, wherein one type of the metadata is displayed at a time. 16. The computer-readable medium of claim 11, wherein the metadata comprises a frequent keyphrase. 17. The computer-readable medium of claim 11, wherein, upon selecting the at least one value of the metadata, only the icons corresponding to the documents filtered using the selected least one value of the metadata are displayed in the first display region. 18. The computer-readable medium of claim 17, wherein the method further comprises changing a color of the selected at least one value of the metadata. 19. The computer-readable medium of claim 17, wherein the selected least one type of the value of the metadata is a keyphrase, and wherein only the icons corresponding to the documents containing with the keyphrase are displayed in the first display region. 20. An apparatus comprising: a display unit; a sensing unit operable to sense input; a processing unit coupled to the sensing unit and operable to process the input and to prepare a plurality of documents, determine at l
26. receive data, including program code, through the variety of network(s) including Internet 718 and LAN 715, network link 714 and communication interface 713. In the Internet example, when the system 701 acts as a network server, it might transmit a requested code or data for an application program running on client(s) 720 and/or 721 through Internet 718, gateway/firewall 717, local area network 715 and communication interface 713. Similarly, it may receive code from other network resources. [0066] The received code may be executed by processor 705 as it is received, and/or stored in persistent or volatile storage devices 708 and 706, respectively, or other non-volatile storage for later execution. In this manner, computer system 701 may obtain application code in the form of a carrier wave. [0067] FIG. 8 illustrates an example functional diagram of how the present invention relates to the computer platform. [0068] Presented is an example of how an exemplary embodiment of the present invention utilizes segmentation and keyphrase summarization for document navigation, the example computer platform being used, and an example as to how it relates to the computer platform. Here, the figure illustrates the collection overview screen embodiment. When input is given through the display 800, a sensing unit 801 senses the input and forwards it to the processing unit. This information is then sent to a processing unit 802, which processes the information
27. s instructions to be executed by processor 705. The volatile storage 706 also may be used for storing temporary variables or other intermediate information during execution of instructions by processor 705. Computer platform 701 may further include a read-only memory (ROM or EPROM) 707 or other static storage device coupled to bus 704 for storing static information and instructions for processor 705, such as basic input-output system (BIOS), as well as various system configuration parameters. A persistent storage device 708, such as a magnetic disk, optical disk, or solid-state flash memory device, is provided and coupled to bus 701 for storing information and instructions. [0056] Computer platform 701 may be coupled via bus 704 to a display 709, such as a cathode ray tube (CRT), plasma display, or a liquid crystal display (LCD), for displaying information to a system administrator or user of the computer platform 701. An input device 710, including alphanumeric and other keys, is coupled to bus 701 for communicating information and command selections to processor 705. Another type of user input device is cursor control device 711, such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to processor 704 and for controlling cursor movement on display 709. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows
28. s to any medium that participates in providing instructions to processor 705 for execution. The computer-readable medium is just one example of a machine-readable medium, which may carry instructions for implementing any of the methods and/or techniques described herein. Such a medium may take many forms, including but not limited to non-volatile media and volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 708. Volatile media includes dynamic memory, such as volatile storage 706. [0060] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, a flash drive, a memory card, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read. [0061] Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to processor 705 for execution. For example, the instructions may initially be carried on a magnetic disk from a remote computer. Alternatively, a remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 700 can receive the data on t
29. t least one value of the metadata at a second display region; and selecting the at least one value of the metadata for varying the display states of the plurality of the icons in the first display region based on the selected the at least one of the value of the metadata. 2. The method of claim 1, wherein varying the display states comprises at least one of drawing or not drawing the icons, changing colors of the icons, and changing transparency of the icons. 3. The method of claim 1, wherein each of the plurality of icons are positioned in the first display region based, at least in part, on a value of a date type of metadata of the corresponding plurality of documents. 4. The method of claim 1, wherein the metadata comprises at least one type of a keyphrase, an author, a topic, a location, a time, or a publication associated with the corresponding document. 5. The method of claim 4, wherein one type of the metadata is displayed at a time. 6. The method of claim 1, wherein the metadata comprises a frequent keyphrase. 7. The method of claim 1, wherein, upon selecting the at least one of the value of the metadata, only the icons corresponding to the documents filtered using the selected least one of the value of the metadata are displayed in the first display region. (Jul. 30, 2009) 8. The method of claim 7, further comprising changing a color of the selected at least one of the value of the metadata. 9. The method of claim 7, wherei
30. that structural changes and/or substitutions of various elements may be made without departing from the scope and spirit of present invention. The following detailed description is, therefore, not to be construed in a limited sense. Additionally, the various embodiments of the invention as described may be implemented in the form of a software running on a general purpose computer, in the form of a specialized hardware, or combination of software and hardware. [0029] Viewing and identifying interesting sections of documents on a small screen, such as on a cell phone or PDA, is difficult. An embodiment of the invention provides a method that uses keyphrases for easily moving to interesting sections of the document, while at the same time helping users to be aware of document context as they read portions of the document. Technical Details [0030] To process a document and create a visualization, sections or segments of text are first identified. In our implementation, these sections generally correspond to paragraphs or figure captions. The sections could alternatively be specified to be coarser, such as text under a sub-heading. Next, one or more keyphrases are associated with each text section. The keyphrases and identified sections are then used by the interface for visualizing and interacting with the document. Text Section Identification [0031] The input document may be a set of imaged pages, such as from a scanned paper document
31. tly find documents of interest. In this view, a user filters the displayed documents, possibly repeatedly, based on metadata values. In one implementation, icons corresponding to documents are displayed on a display device together with metadata corresponding to the documents. When a value of the metadata is selected by the user, the display state of the icons corresponding to the documents is varied based on the selected value of the metadata. Patent Application Publication, Jul. 30, 2009, US 2009/0193350 A1. [Drawing sheets 1 to 4 of 8: Figure 1 (reference numerals 101, 102, 104, 105); Figure 2 (201, 202; labels "Fuji Xerox", "Toyota Motor", "Global Project", "customer service"); Figure 3 (300, 301, 302; labels "Frontline Reports", "Fuji Xerox", "Toyota Motor", "service manual", "Exit"); Figure 4 (401, 402, 404; labels "Toyota Motor", "Global Project", "Fuji Xerox", "customer service", "service manual", "full screen")]
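The metadata-based filtering summarized above, in which selecting a metadata value changes the display state of the document icons, might be sketched as follows. This is an illustrative sketch only: the `Document` class, the `display_states` function, the "drawn"/"hidden" state names, and the choice to combine repeated selections as a conjunction are assumptions of this example, not details taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    # Maps a metadata type (e.g. "keyphrase", "topic") to its set of values.
    metadata: dict = field(default_factory=dict)


def display_states(documents, selections):
    """Return {title: "drawn" | "hidden"} for the given selections.

    `selections` is a list of (metadata_type, value) pairs accumulated as
    the user filters repeatedly. A document stays drawn only if it matches
    every selection (conjunction is an assumption of this sketch).
    """
    states = {}
    for doc in documents:
        visible = all(
            value in doc.metadata.get(mtype, set())
            for mtype, value in selections
        )
        states[doc.title] = "drawn" if visible else "hidden"
    return states


docs = [
    Document("Report A", {"keyphrase": {"Fuji Xerox", "customer service"}}),
    Document("Report B", {"keyphrase": {"Toyota Motor"}}),
]
first = display_states(docs, [("keyphrase", "Fuji Xerox")])
narrowed = display_states(
    docs, [("keyphrase", "Fuji Xerox"), ("keyphrase", "Toyota Motor")]
)
```

Other display states named in the claims, such as a color or transparency change, could be returned in place of "hidden" without altering the matching logic.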
32. ve of each section, there are methods that take into account previous keywords or keyphrases that have already been identified in the text and give greater weight to terms that have not been selected as a keyphrase. Carbonell and Goldstein (1998) proposed the use of Maximal Marginal Relevance to rank documents using a weighted combination of the similarity of a document to a query and the similarity of a document to previously selected documents. Brants et al. (2004) propose the selection of keywords and keyphrases for interactive topic-based summarization using a statistical measure of segment characterization and differentiation, such as pointwise Mutual Information. [0035] An embodiment of the inventive method for identifying keyphrases identifies sequences of words between stop words as candidate keyphrases. For each section of text, the candidate keyphrases are scored and the best N keyphrases selected, where N is pre-specified and may be dependent on the amount of screen space available in the application. [0036] To select the best keyphrases, we use a weighted combination of features, similar in spirit to a maximum entropy model. Keyphrases are found for each section, taking into account the keyphrases selected for other sections. The selection could be optimized over all combinations, but for simplicity, we order the text sections and then select keyphrases for each text section, one section at a time. The features are text bas
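The procedure in paragraphs [0035] and [0036] (candidates taken as word sequences between stop words, scored by a weighted combination of features, and chosen section by section while accounting for earlier selections) can be sketched as below. This is a rough sketch under stated assumptions: the excerpt does not list the actual features or weights, so the stop-word list, the two features used here (frequency and phrase length), the penalty for already-selected phrases, and all weight values are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list for illustration.
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "by", "for",
              "in", "is", "of", "on", "or", "the", "to", "with"}


def candidate_phrases(text):
    """Return runs of consecutive non-stop words as candidate keyphrases."""
    phrases = []
    # Split on punctuation first so a candidate never spans a sentence break.
    for chunk in re.split(r"[^\w\s]+", text.lower()):
        run = []
        for word in chunk.split():
            if word in STOP_WORDS:
                if run:
                    phrases.append(" ".join(run))
                run = []
            else:
                run.append(word)
        if run:
            phrases.append(" ".join(run))
    return phrases


def select_keyphrases(sections, n=2, w_freq=1.0, w_len=0.5, w_seen=2.0):
    """Greedily pick the best n keyphrases per section, in section order.

    Score = w_freq * frequency + w_len * words in phrase, minus w_seen if
    the phrase was already chosen for an earlier section (assumed features).
    """
    already_selected = set()
    result = []
    for text in sections:
        counts = Counter(candidate_phrases(text))

        def score(phrase):
            value = w_freq * counts[phrase] + w_len * len(phrase.split())
            if phrase in already_selected:
                value -= w_seen
            return value

        # Highest score first; ties are broken alphabetically.
        best = sorted(counts, key=lambda p: (-score(p), p))[:n]
        already_selected.update(best)
        result.append(best)
    return result


sections = [
    "The service manual is in the service manual of the project.",
    "The service manual and the service manual are for customer service.",
]
chosen = select_keyphrases(sections, n=1)
```

In this example the second section picks "customer service" even though "service manual" is more frequent there, because the latter was already selected for the first section.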