
Mobile image-based information retrieval system


Contents

1. tially, it allows an object to be recognized based on a single picture.
[0068] The macro algorithmic principles of the object recognition engine are: extraction of feature vectors 162 from key interest points 164, comparison 168 of corresponding feature vectors 166, similarity measurement, and comparison against a threshold to determine whether the objects are identical or not (see FIG. 6). Actually, we believe that today there is a large consensus that the elements listed above are the basic elements of any successful recognition system.
[0069] Taking Lowe's system as the baseline implementation, we suggest employing certain alternative sub-modules to perform certain steps better:
[0070] 1) Interest Operator
[0071] Using phase congruency of Gabor wavelets is superior to many other interest point operators suggested in the literature, such as affine Harris or DOG Laplace (Kovesi 1999).
[0072] 2) Feature Vectors
[0073] Instead of Lowe's SIFT features, we make extensive use of Gabor wavelets as a powerful general-purpose data format to describe local image structure. However, where appropriate, we augment them with learned features reminiscent of the approach pioneered by Viola and Jones (Viola and Jones 1999). Finally, we started to study the use of a dictionary of parameterized sets of feature vectors extracted from massive image data sets that show variations under changing viewpoint and lighting conditions of generi
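The pipeline summarized in paragraph 0068 (compare corresponding feature vectors, measure similarity, compare against a threshold) can be illustrated with a minimal sketch. It assumes plain numeric feature vectors and cosine similarity; the function names, the averaging rule, and the 0.9 threshold are hypothetical choices for illustration and are not taken from the application, which relies on Gabor wavelet features.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def object_similarity(query_features, model_features):
    """Average, over the model features, of the best match found in the query.

    This aggregation rule is an assumption made for the sketch.
    """
    if not query_features or not model_features:
        return 0.0
    best = [max(cosine_similarity(m, q) for q in query_features)
            for m in model_features]
    return sum(best) / len(best)

def is_same_object(query_features, model_features, threshold=0.9):
    """Decide identity by comparing the similarity score against a threshold."""
    return object_similarity(query_features, model_features) >= threshold
```

In practice the threshold would be tuned on labeled image pairs; the value here only makes the example concrete.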
2. The remote media server forwards mobile media content to the mobile telephone based on the associated text identifier.
[0007] In a more detailed feature of the invention, the remote recognition server may include means for adding an object representation to the database using the mobile telephone.
[0008] Alternatively, the present invention may be embodied in an image-based information retrieval system that includes a mobile telephone and a remote server. The mobile telephone has a built-in camera and a communication link for transmitting an image from the built-in camera to the remote server. The remote server has an optical character recognition engine for generating a first confidence value based on an image from the mobile telephone, an object recognition engine for generating a second confidence value based on an image from the mobile telephone, a face recognition engine for generating a third confidence value based on an image from the mobile telephone, and an integrator module for receiving the first, second, and third confidence values and generating a recognition output.
[0009] In more detailed features of the invention, the object recognition engine may comprise a textured object recognition engine, a rigid texture object recognition engine, and/or an articulate object recognition engine.
[0010] Additionally, the present invention may be embodied in an image-based information retrieval system that includes a
3. NOTE: This will not remove any media content associated with this image.
[0180] 7.3.5 Adding Content for an Image in the OR Server Database
[0181] If you used the client to add an image to the OR Server database, supplied it with a new ID, and do not see it in the combo box on the Update page, do the following:
[0182] Follow the Adding New Content instructions and use the reference name you entered on the client for the ID.
[0183] 7.4 Adding an Image to the OR Server Using the Client
[0184] Step 1
[0185] Find an appropriate object that you wish to attach content to (FIG. 10).
[0186] Step 2
[0187] Run iScout. It will initialize the camera, allowing you to take a picture of the object.
[0188] Click the joystick in to snap an image (FIG. 11).
[0189] Step 3
[0190] After taking an image of the object, you will be presented with two choices:
[0191] 1. Recognize Image
[0192] 2. Add to Database
[0193] Select Add to Database, click Option, and then Continue. The application will ask if it can connect to the internet; click Yes.
[0194] You will be prompted for a reference name. Type in a name using the phone's keypad. If you already set up content for this object using the Specifying Content section, you may enter the reference name you added to the system. Alternatively, you can supply a new name now and follow the Add New Content section to supply content at a later time. NOTE: Spa
4. are likely to be more similar.
[0048] Location information can also be used in obvious ways. Staying with the hotel example, one would arrange the search process such that the query activates only the object representations of hotels that are close to the current location of the user.
[0049] Overall, it will be helpful to organize the image search such that objects are looked up in a sequence in which object representations close in time and space will be searched before object representations that are older, were taken at a different time of day, or carry a location label further away are considered.
[0050] 3.0 Client Side
[0051] 3.1 Feature Extraction on the Client Side
[0052] The simplest implementation of a search engine is one in which the recognition engine resides entirely on the server. However, for a couple of reasons it might be more desirable to run part of the recognition on the phone. One reason is that this way the server has less computational load and the service can be run more economically. The second reason is that the feature vectors contain less data than the original image, so the amount of data that needs to be sent to the server can be reduced.
[0053] 3.2 Caching of Frequent Searches
[0054] Another way to keep the processing more local on the handset is to store the object representations of the most frequently requested objects locally on the handset. Information on frequently requested searches
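The lookup sequence described in paragraph 0049 (search representations close in time and space first) can be sketched as a sort over a combined cost. Everything specific here is an assumption for illustration: the `ObjectRepresentation` fields, the linear cost, and the weights `w_km`, `w_day`, and `w_hour` do not come from the application.

```python
import math
from dataclasses import dataclass

@dataclass
class ObjectRepresentation:
    text_id: str        # textual ID forwarded on a match
    lat: float          # location label, degrees
    lon: float
    age_days: float     # how long ago the reference image was taken
    hour_of_day: float  # local hour (0-24) at which it was taken

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def hour_gap(h1, h2):
    """Circular difference between two times of day, in hours."""
    d = abs(h1 - h2) % 24
    return min(d, 24 - d)

def search_order(reps, user_lat, user_lon, query_hour,
                 w_km=1.0, w_day=0.1, w_hour=0.5):
    """Return representations sorted so that near, recent, same-time-of-day ones come first."""
    def cost(rep):
        return (w_km * haversine_km(user_lat, user_lon, rep.lat, rep.lon)
                + w_day * rep.age_days
                + w_hour * hour_gap(query_hour, rep.hour_of_day))
    return sorted(reps, key=cost)
```

A recognition engine could then walk this list in order and stop at the first sufficiently confident match, which is one way to realize the "searched before" ordering.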
5. can be obtained on an overall, group, or individual user level.
[0055] 3.3 Image Region Delivery on Demand
[0056] To recognize an object in a reliable manner, sufficient image detail needs to be provided. In order to strike a good balance between the desire for low bandwidth and a sufficiently high image resolution, one can use a method in which a lower-resolution representation of the image is sent first. If necessary, and if the object recognition engines discover a relevant area that matches one of the existing object representations well, one can transmit additional detail.
[0057] 3.4 Over-the-Air Download
[0058] For a fast proliferation of the search service, it will be important to allow an over-the-air download of the client application. The client-side application would essentially acquire an image and send appropriate image representations to recognition servers. It would then receive the search results in an appropriate format. Advantageously, such an application would be implemented in Java or BREW so that it is possible to download the application over the air instead of preloading it on the phone.
[0059] 3.5 Reducing the Search Through Extra Input
[0060] Often it will be helpful to provide additional input to limit the image-based search to specific domains such as "travel guide" or "English dictionary". External input to confine the search to specific domains can come from a variety of sources. One is, of course, text
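The two-pass transfer in paragraph 0056 (send a low-resolution image first, then send more detail only for a region the server finds relevant) can be sketched on a grayscale image stored as a list of rows. The helper names, the `(x, y, width, height)` region format, and the naive subsampling are assumptions for illustration; a real client would low-pass filter before downsampling and would compress what it sends.

```python
def downsample(image, factor):
    """First pass: keep every `factor`-th pixel in both directions (naive subsampling)."""
    return [row[::factor] for row in image[::factor]]

def scale_region(region, factor):
    """Map a region reported in low-resolution coordinates back to full resolution."""
    x, y, w, h = region
    return (x * factor, y * factor, w * factor, h * factor)

def crop(image, region):
    """Second pass: cut the full-resolution detail for the requested region."""
    x, y, w, h = region
    return [row[x:x + w] for row in image[y:y + h]]

# Example: an 8x8 image, a 4x reduction, and a server that asks for the top-right quadrant.
full = [[y * 8 + x for x in range(8)] for y in range(8)]
preview = downsample(full, 4)                              # 2x2 preview sent first
detail = crop(full, scale_region((1, 0, 1, 1), 4))         # 4x4 detail sent on demand
```

The preview here carries 4 pixels and the follow-up 16, against 64 for the full image, which is the bandwidth trade-off the paragraph describes.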
6. input via typing or choosing from a menu of options. Another one is input via Bluetooth or other signals emitted from the environment. A good example of the latter might be a car manual. While the user is close to the car for which the manual is available, a signal is transmitted from the car to his mobile device that allows the search engine to offer a specific search tailored to car details. Finally, a previous successful search can cause the search engine to narrow down the search for a subsequent search.
[0061] Accordingly, with reference to FIG. 5, the present invention may be embodied in an image-based information retrieval system 10 including a mobile telephone 12 and a remote server 14. The mobile telephone has a built-in camera 16, a recognition engine 32 for recognizing an object or feature in an image from the built-in camera, and a communication link 18 for requesting information from the remote server related to a recognized object or feature.
[0062] Accordingly, with reference to FIGS. 4 and 5, the present invention may be embodied in an image-based information retrieval system that includes a mobile telephone 12 and a remote recognition server 14. The mobile telephone has a built-in camera 16 and a communication link 18 for transmitting an image 20 from the built-in camera to the remote recognition server. The remote recognition server has an optical character recognition engine 22 for generating a first confi
7. mobile telephone and a remote server. The mobile telephone has a built-in camera, a recognition engine for recognizing an object or feature in an image from the built-in camera, and a communication link for requesting information from the remote server related to a recognized object or feature.
[0011] In more detailed features of the invention, the object may be an advertising billboard and the related information may be a web page address. Alternatively, the object may be a car and the related information may be a car manual. Also, the object may be a product and the related information may be a payment confirmation. The object may be a bus stop sign and the related information may be real-time information on the arrival of the next bus. Further, the object may be a book and the related information may be an audio stream.
[0012] In other more detailed features of the invention, the object feature may be text and the related information may be a translation of the text or a web page address, provided in real time. Similarly, the object feature may be an advertisement and the related information may be a web page address. Also, the object feature may be a picture and the related information may be an audio stream. Further, the object feature may be an equipment part and the related information may be an operation and maintenance manual for the equipment.
[0013] Other objects, features, and advantages will become apparent to those skilled in th
8. [Drawing sheets 4 to 6 of 11 (FIG. 5 and adjacent figures), US 2006/0240862 A1, Patent Application Publication, Oct. 26, 2006. The figure label text was not recoverable.]
9. multiple specialized recognition engines that analyze the query images with respect to different objects.
[0042] We suggest an architecture in which multiple recognition engines are applied to an incoming image. Each engine returns its recognition results with confidence values, and an integrating module outputs a final list of objects recognized. The simplest fusion rule is an "and" rule that simply sends all the relevant textual IDs to the media server. Another useful rule, if one wants to reduce the feedback to a single result, is to introduce a hierarchy among the recognition disciplines. The channel that is highest in the hierarchy and returns a result is selected to forward the text ID to the media server. FIG. 4 shows an effective recognition server 14 that is comprised of multiple specialized recognition engines that focus on recognizing certain object classes.
[0043] 2.2 Maintaining the Image Database
[0044] Objects change. Therefore, it is important to regularly update the object representations. This can be achieved in two ways. One way is that the service providers regularly add current image material to refresh the object representations. The other way is to keep the images that users submit for query and, upon recognition, feed them into the engine that updates the object representations. The latter method requires a confidence measure that estimates how reliable a recognition result is. This is necessary in o
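The two fusion rules in paragraph 0042 can be sketched as small functions over per-engine results. The `EngineResult` type, the engine labels, and the `min_confidence` cutoff used to decide what counts as a "relevant" result are assumptions made for this illustration; the application does not specify them.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class EngineResult:
    engine: str              # illustrative label, e.g. "ocr", "object", "face"
    text_id: Optional[str]   # textual ID of the match, or None if nothing was recognized
    confidence: float        # confidence value reported by the engine

def fuse_and(results: List[EngineResult], min_confidence: float = 0.5) -> List[str]:
    """'And' rule: forward every relevant textual ID to the media server."""
    return [r.text_id for r in results
            if r.text_id is not None and r.confidence >= min_confidence]

def fuse_hierarchy(results: List[EngineResult], priority: List[str],
                   min_confidence: float = 0.5) -> Optional[str]:
    """Hierarchy rule: the highest-ranked engine that returned a result wins."""
    by_engine = {r.engine: r for r in results}
    for name in priority:
        r = by_engine.get(name)
        if r is not None and r.text_id is not None and r.confidence >= min_confidence:
            return r.text_id
    return None
```

With results from an OCR engine (no match), an object engine, and a face engine, `fuse_and` returns both matched IDs, while `fuse_hierarchy` with priority `["ocr", "face", "object"]` returns only the face engine's ID.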
10. US 20060240862A1
(19) United States
(12) Patent Application Publication
(10) Pub. No.: US 2006/0240862 A1
Neven et al.
(43) Pub. Date: Oct. 26, 2006
(54) MOBILE IMAGE-BASED INFORMATION RETRIEVAL SYSTEM
(76) Inventors: Hartmut Neven, Malibu, CA (US); Hartmut Neven SR., Aachen (DE)
Correspondence Address: ROBROY R FAWCETT, 1576 KATELLA WAY, ESCONDIDO, CA 92027 (US)
(21) Appl. No.: 11/433,052
(22) Filed: May 12, 2006
Related U.S. Application Data
(63) Continuation-in-part of application No. 11/129,034, filed on May 13, 2005, which is a continuation-in-part of application No. 10/783,378, filed on Feb. 20, 2004.
(60) Provisional application No. 60/570,924, filed on May 13, 2004. Provisional application No. 60/680,908, filed on May 13, 2005. Provisional application No. 60/727,313, filed on Oct. 17, 2005.
Publication Classification
(51) Int. Cl.: H04M 1/00 (2006.01)
(52) U.S. Cl.: 455/550.1
(57) ABSTRACT
An image-based information retrieval system including a mobile telephone, a remote recognition server, and a remote media server, the mobile telephone having a built-in camera and a communication link for transmitting an image from the built-in camera to the remote recognition server and for receiving mobile media content from the remote media server
[Front-page figure labels: "2. Server 120 matches the image and sends the ID associated with the match to the media server"; "Object Recognition Server 122".]
11. ates how VMS allows using traditional print media as pointers to interactive content.
[0109] Another useful application of image-based search exists in the print-to-internet space. By submitting a picture showing a portion of a printed page to a server, a user can retrieve additional real-time information about the text. Thus, together with the publishing of the newspaper, magazine, or book, it will be necessary to submit digital pictures of the pages to the recognition servers so that each part of the printed material can be annotated. Since today's printing process in large part starts from digital versions of the printed pages, this image material is readily available. In fact, it will allow using printed pages in whole new ways, as they could now be viewed as mere pointers to more information that is available digitally.
[0110] A special application is an ad-to-phone-number feature that allows a user to quickly input a phone number into his phone by taking a picture of an ad. Of course, a similar mechanism would be useful for other contact information such as email, SMS, or web addresses.
[0111] 5.2.1 Interactive Digital Billboard
[0112] Visual advertising content may be displayed on a digital billboard or large television screen. A user may take a picture of the billboard and the displayed advertisement to get additional information about the advertised product, enter a contest, etc. The effectiveness of the advertisem
12. bile media content to the mobile telephone based on the associated text identifier.
2. An image-based information retrieval system as defined in claim 1, wherein the remote recognition server has an optical character recognition engine for generating a first confidence value based on an image from the mobile telephone, an object recognition engine for generating a second confidence value based on an image from the mobile telephone, a face recognition engine for generating a third confidence value based on an image from the mobile telephone, and an integrator module for receiving the first, second, and third confidence values and generating the associated text identifier.
3. An image-based information retrieval system as defined in claim 2, wherein the object recognition engine comprises a textured object recognition engine.
4. An image-based information retrieval system as defined in claim 2, wherein the object recognition engine comprises a rigid texture object recognition engine.
5. An image-based information retrieval system as defined in claim 2, wherein the object recognition engine comprises an articulate object recognition engine.
6. An image-based information retrieval system as defined in claim 1, wherein the remote recognition server includes means for adding an object representation to the database using the mobile telephone.
7. An image-based information retrieval system, comprising: a mobile telephone and a remote server; t
13. c surface patches ("Locons").
[0074] 3) Matching 170
[0075] Almost all matching routines described in the literature only consider similarity between feature vectors. We also explicitly estimate displacement vectors as well as parameter sets that describe environmental conditions such as viewpoint and illumination conditions. This can be achieved by considering the phase information of Gabor wavelets or through training of dedicated neural networks.
[0076] Consequently, we believe that our system can more rapidly learn new objects and recognize them under a wider range of conditions than anyone else. Last but not least, we have extensive experience in embedded recognition systems. The recognition algorithms are available for various DSPs and microprocessors.
[0077] 4.1.1 View Fusion
[0078] To support the recognition of objects from multiple viewpoints, feature linking is applied to enable the use of multiple training images for each object to completely cover a certain range of viewing angles.
[0079] If one uses multiple training images of the same object without modification of the algorithm, the problem of competing feature datasets arises. The same object feature might be detected in more than one training image if these images are taken from a sufficiently similar perspective. The result is that any given feature can be present as multiple datasets in the database. Since any query feature can be matched to only one of the featu
14. can assist the recognition process. Inside the coffeehouse you study the menu, but your French happens to be a bit rusty. Your image-based search engine supports you in translating words from the menu so that you have at least an idea of what you can order.
[0090] This anecdote could of course easily be extended further. Taking a more abstract viewpoint, one can say that image-based search hyperlinks the physical world, in that any recognizable object, text string, logo, face, etc. can be annotated with multimedia information.
[0091] 5.1 Travel and Museum Guides
[0092] In the specific case of visiting and researching the art and architecture of museums, image-based information access can provide museum visitors and researchers with the most relevant information about the entire artwork or parts of an artwork in a short amount of time. The users of such a system can conveniently perform image-based queries on the specific features of an artwork, conduct comparative studies, and create personal profiles about their artworks of interest. FIG. 7 illustrates an example of the intelligent museum guide, where on the left side the user has snapped an image of the artwork of his/her interest, and on the right side the information about the artwork is retrieved from the server. In addition, users can perform queries about specific parts of an artwork, not just about the artwork as a whole. The system works not only for paintings but for al
15. ces in the reference name are not permitted at this time.
[0195] Click Options, Continue once again. You may be prompted with a choice of how to connect to the internet. Select the default.
[0196] You will see the message "Successfully Opened Output Stream". The image is now being sent to the Recognition Server. This may take several seconds to complete.
[0197] The system will respond that the image has been saved once the operation is complete. You are now ready to test the recognition of this object.
[0198] 7.5 Recognizing an Object with the Client
[0199] Referring back to FIG. 3, the following is an overview of the process of recognizing an image.
[0200] Step 1
[0201] Follow Step 1 and Step 2 from the Adding an Image to the OR Server section to capture an image of the object.
[0202] Step 2
[0203] Select Recognize Image.
[0204] Click Option, then Continue.
[0205] You will see the message "Successfully Opened Output Stream". The image is now being sent to the Recognition Server. This may take several seconds to complete.
[0206] Step 3
[0207] Depending on the content associated with the object (see Specifying Content below), you may see any of the following:
[0208] 1. A simple message stating "Received Message", followed by the reference name. You may use this name in the Specifying Content section to have the Media Server return more appealing content when this object is recognized.
[0209] 2. An image, reference name, an
16. cognition and Bar Code Readers
[0086] A face recognition engine described in U.S. Pat. No. 6,301,370, FACE RECOGNITION FROM VIDEO IMAGES, Oct. 9, 2001 (Maurer Thomas, Elagin Egor Valerievich, Nocera Luciano Pasquale Agostino, Steffens Johannes Bernhard, Neven Hartmut) also allows new entries to be added to the library using small sets of facial images. This system can be generalized to work with other object classes as well.
[0087] Adding additional engines such as optical character recognition modules and bar code readers allows for a yet richer set of visual patterns to be analyzed. Off-the-shelf commercial systems are available for licensing to provide this functionality.
[0088] 5.0 Applications of the Visual Mobile Search Service
[0089] Let us start the discussion of the usefulness of image-based search with an anecdote. Imagine you are traveling in Paris and you visit a museum. If a picture catches your attention, you can simply take a photo and send it to the VMS service. Within seconds you will receive an audio-visual narrative explaining the image to you. If you happen to be connected to a 3G network, the response time would be below a second. After the museum visit you might step outside and see a coffeehouse. Just taking another snapshot from within the VMS client application is all you have to do in order to retrieve travel guide information. If location information is available through triangulation or inbuilt GPS, it
17. d URL. You may need to press the up and down arrows to see the entire message. Select Options > Go To Hyperlink to launch the internet browser and view the web page referred to by the URL.
[0210] 3. An "Object Not Found" message. The image was not recognized by the Recognition Server.
[0211] If the object has already been added to the OR Server database, try to recognize it again.
[0212] If it has not been added to the OR Server database, you may wish to follow the Adding an Image to the OR Server Database section so that it too may be recognized.
[0213] The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
[0214] It shou
18. dence value based on an image from the mobile telephone, an object recognition engine 24 and/or 26 for generating a second confidence value based on an image from the mobile telephone, a face recognition engine 28 for generating a third confidence value based on an image from the mobile telephone, and an integrator module 30 for receiving the first, second, and third confidence values and generating a recognition output. The recognition output may be an image description 32.
[0063] 4.0 The Recognition Engines
[0064] The heart of the VMS system is the suite of recognition engines that can recognize various visual patterns, from faces to bar codes.
[0065] 4.1 Textured Object Recognition
[0066] We first discuss the general object recognition engine that can learn to recognize an object from a single image. If available, the engine can also be trained with several images from different viewpoints or a short video sequence, which often contributes to improving the invariance under changing viewing angle. In this case one has to invoke the view fusion module that is discussed in more detail below.
[0067] One of the most important features of an image-based search service is that it is possible for a user who is not a machine vision expert to easily submit entries to the library of objects that can be recognized. A good choice to implement such a recognition engine is based on the SIFT feature approach described by David Lowe in 1999. Essen
19. e art from the following detailed description. It is to be understood, however, that the detailed description and specific examples, while indicating exemplary embodiments, are given by way of illustration and not limitation. Many changes and modifications within the scope of the following description may be made without departing from the spirit thereof, and the description should be understood to include all such modifications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] The invention may be more readily understood by referring to the accompanying drawings, in which:
[0015] FIG. 1 is a figure illustrating the main components of the Visual Mobile Search (VMS) Service;
[0016] FIG. 2 is a figure illustrating the population of a database of a VMS server with image content pairs;
[0017] FIG. 3 is a figure illustrating the process of retrieving mobile content from the media server through visual mobile search;
[0018] FIG. 4 is a figure illustrating an effective recognition server;
[0019] FIG. 5 is a block diagram of an image-based information retrieval system;
[0020] FIG. 6 is a flow diagram for an operation of an object recognition engine;
[0021] FIG. 7 illustrates an example of an intelligent museum guide implemented using the VMS service;
[0022] FIG. 8 illustrates an example of how VMS may be used as a tool for a tourist to access relevant information based on an image;
[0023] FIG. 9 illustrates an examp
20. e number of feature matches per hypothesis and boost recognition performance at very little computational cost.
[0082] 4.1.2 Logarithmic Search Strategy
[0083] An efficient implementation of a search service requires that the image search is organized such that it scales logarithmically with the number of entries in the database. This can be achieved by conducting a coarse-to-fine, simple-to-complex search strategy such as described in (Beis and Lowe, 1997). The principal idea is to do the search in an iterative fashion, starting with a reduced representation that contains only the most salient object characteristics. Only matches that result from this first pass are investigated more closely, using a richer representation of the image and the object. Typically this search proceeds in a couple of rounds until a sufficiently good match using the most complete image and object representation is found.
[0084] To cut down the search times further, we also propose to employ color histograms and texture descriptors such as those proposed under the MPEG-7 standard. These image descriptors can be computed very rapidly and help to readily identify subsets of relevant objects. For instance, a printed text tends to generate characteristic color histograms and shape descriptors. Thus it might be useful to limit the initial search to character recognition if those descriptors lie within a certain range.
[0085] 4.2 Face Recognition Engine, Optical Character Re
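The iterative pruning idea in paragraph 0083 can be illustrated with a small sketch: each round compares a longer prefix of the feature vector and keeps only the best fraction of candidates. This is not the Beis and Lowe method, and its first pass is still linear in the database size, so it does not by itself give logarithmic scaling. The assumptions, all hypothetical, are that vectors are ordered with the most salient components first, and that `rounds` and `keep` take the values shown.

```python
import math

def prefix_distance(a, b, dims):
    """Euclidean distance using only the first `dims` components."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(dims)))

def coarse_to_fine_search(query, database, rounds=(2, 4, 8), keep=0.5):
    """Return the key of the best match after successive refinement rounds.

    `database` maps an object ID to its feature vector; each round ranks the
    surviving candidates on a richer representation and discards the rest.
    """
    candidates = list(database.keys())
    for dims in rounds:
        candidates.sort(key=lambda name: prefix_distance(query, database[name], dims))
        n_keep = max(1, math.ceil(len(candidates) * keep))
        candidates = candidates[:n_keep]
    return candidates[0]
```

In the example below, objects "a" and "b" are indistinguishable on the first two components, so both survive the coarse pass, and the richer second pass separates them.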
21. e phone and the recognition servers is handled via multimedia messaging (MMS). FIG. 1 illustrates the main components of the Visual Mobile Search Service.
[0036] To make use of the VMS service, the application developer submits a list of pictures and associated image IDs in textual format to the visual recognition server. An application developer 126, who can occasionally be an end user himself, submits images 114 annotated with textual IDs 128 to the recognition servers. FIG. 2 illustrates the population of the database with image content pairs.
[0037] FIG. 3 shows in more detail the steps involved in retrieving mobile content and how the system refers an end user to the mobile content. 1) The user takes an image with his camera phone 12 and sends it to the recognition server 122. This can be accomplished either by using a wireless data network such as GPRS, or the image could be sent via multimedia messaging (MMS), as this is supported by most wireless carriers. 2) The recognition server uses its multiple recognition engines to match the incoming picture against object representations stored in its database. We recommend using multiple recognition experts that specialize in recognizing certain classes of patterns. Currently we use a face recognition engine and an engine that is good for recognizing textured objects. Optical character recognizers and bar code readers try to identify text strings or bar codes. For a more detailed description of the r
22. e username and 1234 for the password.
[0162] Click Add New Record.
[0163] Type a name for the object into the ID field. This can be either a new name or the reference name you used when adding an image to the OR Server database. NOTE: Spaces in the ID are not permitted at this time.
[0164] Use the fields to supply an image from your computer, descriptive text, and a URL that the client can open if desired. If you do not want the client to automatically open a web browser, you may enter "none" in the field.
[0165] 7.3.2 Updating/Viewing Existing Content
[0166] Click Update.
[0167] Select the ID you wish to update or view from the dropdown.
[0168] This will give you a preview of the content for the given ID.
[0169] Modify anything you wish to change.
[0170] Click Update when finished.
[0171] 7.3.3 Adding an Image to the OR Server Database
[0172] Another way to add images to the OR Server, other than using the client, is to add an image directly from your computer.
[0173] Click Add New Record under Image Database Administration.
[0174] Enter a reference name and use the Browse button to load an image.
[0175] Click Review/Delete Image Database Record to view the added image.
[0176] 7.3.4 Reviewing Images in the OR Server Database
[0177] Choose an ID and click Review/Delete Record.
[0178] If desired, click Delete on the image to remove it from the objects that may be recognized.
[0179]
23. ecognition engines, please refer to section 3.0. 3) Successful recognition leads to a single or several textual identifiers denoting objects, faces, or strings that are passed on to the so-called media server 130. Upon receipt of the text strings, the media server sends associated mobile multimedia content back to the VMS client on the phone. This content could consist of a mix of data types such as text, images, music, or audio clips. In a current implementation, the media server often just sends back a URL that can be viewed on the phone using the inbuilt web browser.
[0038] Please note that the content could simply consist of a URL that is routed to the browser on the phone, which will then open the referenced mobile webpage through standard mobile web technology.
[0039] 2.0 Useful Server-Side Features
[0040] 2.1 Multiple Engines on the Server
[0041] Years of experience in machine vision have shown that it is very difficult to design a recognition engine that is equally well suited for diverse recognition tasks. For instance, engines exist that are well suited to recognize well-textured rigid objects. Other engines are useful to recognize deformable objects such as faces or articulate objects such as persons. Yet other engines are well suited for optical character recognition. To implement an effective vision-based search engine, it will be important to combine multiple algorithms in one recognition engine or, alternatively, install
24. ent can be measured in real time by counting the number of clicks the advertisement generates from camera phone users. The content of the advertisement may be adjusted to increase its effectiveness based on the click rate. [0113] The billboard may provide time-sensitive advertisements that are targeted to passing camera phone users, such as factory workers arriving at or leaving work, parents picking up kids from school, or the like. The real-time click rate of the targeted billboard advertisements may confirm or refute assumptions used to generate the targeted advertisement. [0114] 5.3 Payment Tool [0115] Image recognition can also be beneficially integrated with a payment system. When browsing merchandise, a customer can take a picture of the merchandise itself, of an attached barcode, of a label, or of some other unique marker and send it to the server on which the recognition engine resides. The recognition results in an identifier of the merchandise that can be used in conjunction with user information, such as the customer's credit card number, to generate a payment. A record of the purchase transaction can be made available to a human or machine-based controller to check whether the merchandise was properly paid for. [0116] 5.4 Learning Tool For Children [0117] A group of users in constant need of additional explanations are children. Numerous educational games can be based on the ability to recognize objects. For example, one can train the recognit
25. entive concepts disclosed herein. Various modifications to these embodiments may be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments, e.g., in an instant messaging service or any general wireless data communication applications, without departing from the spirit or scope of the novel aspects described herein. Thus, the scope of the invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. The word "exemplary" is used exclusively herein to mean "serving as an example, instance, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments. What is claimed is: 1. An image-based information retrieval system, comprising a mobile telephone, a remote recognition server, and a remote media server, the mobile telephone having a built-in camera and a communication link for transmitting an image from the built-in camera to the remote recognition server and for receiving mobile media content from the remote media server, the remote recognition server for matching an image from the mobile telephone with an object representation in a database and forwarding an associated text identifier to the remote media server, and the remote media server for forwarding mo
26. he mobile telephone having a built-in camera and a communication link for transmitting an image from the built-in camera to the remote server, the remote server having an optical character recognition engine for generating a first confidence value based on an image from the mobile telephone, an object recognition engine for generating a second confidence value based on an image from the mobile telephone, a face recognition engine for generating a third confidence value based on an image from the mobile telephone, and an integrator module for receiving the first, second, and third confidence values and generating a recognition output. 8. An image-based information retrieval system as defined in claim 7, wherein the object recognition engine comprises a textured object recognition engine. 9. An image-based information retrieval system as defined in claim 7, wherein the object recognition engine comprises a rigid texture object recognition engine. 10. An image-based information retrieval system as defined in claim 7, wherein the object recognition engine comprises an articulate object recognition engine. 11. An image-based information retrieval system, comprising a mobile telephone and a remote server, the mobile telephone having a built-in camera, a recognition engine for recognizing an object in an image from the built-in camera, and a communication link for requesting information from the remote server related to a recogni
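The integrator module of claim 7 receives one confidence value per engine and generates a recognition output. The claim does not say how the values are combined, so the following Python sketch simply assumes the highest-confidence result wins, provided it clears a minimum confidence. The function name integrate, the engine names, and the min_confidence parameter are hypothetical and do not come from the source.

```python
from typing import Dict, Optional, Tuple


def integrate(results: Dict[str, Tuple[str, float]],
              min_confidence: float = 0.5) -> Optional[Tuple[str, str]]:
    """Pick a recognition output from per-engine results.

    results maps an engine name to (label, confidence).
    Returns (engine, label) for the most confident engine, or None when no
    engine reaches min_confidence. This max-confidence rule is an assumption;
    the source only states that the integrator receives the three values.
    """
    if not results:
        return None
    engine, (label, confidence) = max(results.items(),
                                      key=lambda item: item[1][1])
    if confidence < min_confidence:
        return None
    return engine, label
```

A weighted or learned combination would fit the same interface; the rule above is only the simplest choice that satisfies the claim language.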
27. iled Feb. 20, 2004, entitled IMAGE BASED SEARCH ENGINE FOR MOBILE PHONES WITH CAMERA, and which claims the benefit of U.S. Provisional Application No. 60/570,924, filed May 13, 2004, which applications are incorporated herein by reference. This application also claims the benefit of U.S. Provisional Application No. 60/727,313, filed Oct. 17, 2005, which application is incorporated herein by reference. BACKGROUND [0002] 1. Field [0003] Embodiments of the invention relate generally to information retrieval systems, and more particularly, to a mobile image-based information retrieval system. [0004] 2. Background [0005] Almost all mobile phones come with an integrated camera or image capture device. The camera is typically used for taking pictures for posterity purposes; however, there are many other applications for which the images may be applied. SUMMARY [0006] The present invention may be embodied in an image-based information retrieval system including a mobile telephone, a remote recognition server, and a remote media server. The mobile telephone has a built-in camera and a communication link for transmitting an image from the built-in camera to the remote recognition server and for receiving mobile media content from the remote media server. The remote recognition server matches an image from the mobile telephone with an object representation in a database and forwards an associated text identifier to the remote media server.
28. ion system to know all countries on a world map. Other useful examples would be numbers or letters, parts of the body, etc. Essentially, a child could read a picture book just by herself by clicking on the various pictures and listening to audio streams triggered by the outputs of the recognition engine. [0118] Other special needs groups that could greatly benefit from the VMS service are blind and vision-impaired people. [0119] 5.5 Treasure Hunt Games [0120] Object recognition on mobile phones can support a new form of games, for instance a treasure hunt game in which the player has to find a certain scene or object, say the facade of a building. Once the player takes the picture of the correct object, he gets instructions on which tasks to perform and how to continue. [0121] 5.7 Product Information and User Manuals [0122] Image-based search will be an invaluable tool to the service technician who wants more information about a part of a machine; he now has an elegant image-query-based user manual. [0123] Image-based information access facilitates the operation and maintenance of equipment. By submitting pictures of all equipment parts to a database, service technicians will be able to retrieve information about the equipment they are dealing with continuously and effortlessly. They thereby drastically increase their efficiency in operating gear and in maintenance operations. [0124] 5.9 Public Space Annotation [0125] Anothe
29. ionality to mobile application developers and to the users of mobile phones. Mobile phone users can use the inbuilt camera 16 of a mobile phone 12 to take a picture 114 of an object of interest and send it via a wireless data network 118, such as, for example, the GPRS network, to the VMS server 120. The object gets recognized, and upon recognition the servers will take the action the application developer requested. Typically this entails referring the sender to a URL with mobile content 121 designed by the application developer, but it can entail more complex transactions as well. [0032] VMS Servers: Typically we organize the VMS servers into two main parts. 1.0 System Architecture 1.1 Overview [0033] Visual Recognition Server 122 (also sometimes referred to as the object recognition (oR) server): Recognizes an object within an image, interacts with the Media Server to provide content to the client, and stores new objects in a database. [0034] Media Server 124: Responsible for maintaining content associated with a given ID and delivering the content to a client. It also provides a web interface for changing content for a given object. [0035] VMS Client: Mobile phones are responsible for running the VMS client to send images and receive data from the server. The VMS client is either pre-installed on the phone or comes as an over-the-air update in a Java or BREW implementation. Alternatively the communication between th
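The division of labor between the Visual Recognition Server and the Media Server described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Neven Vision implementation: the class names, the method names (set_content, content_for, handle_query), and the toy lookup table standing in for a recognition engine are all hypothetical, and the content is reduced to a single URL per ID.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class MediaServer:
    """Maintains the content (here just a URL) associated with a text ID."""
    content_by_id: Dict[str, str] = field(default_factory=dict)

    def set_content(self, object_id: str, url: str) -> None:
        self.content_by_id[object_id] = url

    def content_for(self, object_id: str) -> Optional[str]:
        return self.content_by_id.get(object_id)


@dataclass
class RecognitionServer:
    """Recognizes an object in an image and asks the media server for content."""
    recognize: Callable[[bytes], Optional[str]]  # hypothetical engine: image -> ID or None
    media_server: MediaServer

    def handle_query(self, image: bytes) -> Optional[str]:
        object_id = self.recognize(image)
        if object_id is None:
            return None  # nothing recognized, so no content is sent back
        return self.media_server.content_for(object_id)


# Toy stand-in for a recognition engine: looks the raw bytes up in a table.
known_images = {b"poster-bytes": "movie_poster"}
media = MediaServer()
media.set_content("movie_poster", "http://example.com/trailer")
server = RecognitionServer(recognize=known_images.get, media_server=media)
```

Calling server.handle_query(b"poster-bytes") returns the URL stored for the ID movie_poster, which mirrors the typical case in which the sender is referred to a URL with mobile content.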
30. ld be noted that the methods described herein may be implemented on a variety of communication hardware, processors, and systems known by one of ordinary skill in the art. For example, the general requirement for the client to operate as described herein is that the client has a display to display content and information, a processor to control the operation of the client, and a memory for storing data and programs related to the operation of the client. In one embodiment, the client is a cellular phone. In another embodiment, the client is a handheld computer having communications capabilities. In yet another embodiment, the client is a personal computer having communications capabilities. In addition, hardware such as a GPS receiver may be incorporated as necessary in the client to implement the various embodiments described herein. The various illustrative logics, logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional process
31. le of how VMS may be used in using traditional print media as pointers to interactive content; and [0024] FIGS. 10 and 11 are figures used to describe the use of the VMS client. [0025] Like numerals refer to like parts throughout the several views of the drawings. DETAILED DESCRIPTION [0026] The disclosed invention exploits the opportunity that mobile phones with inbuilt cameras are proliferating at a rapid pace. Driven by the low cost of cameras, the percentage of camera phones among all mobile phones is rapidly increasing as well. The expectation is that in a few years on the order of one billion mobile handsets with cameras will be in use worldwide. [0027] This formidable infrastructure may be used to establish a powerful image-based search service, which functions by sending an image acquired by a camera phone to a server. The server hosts visual recognition engines that recognize the objects shown in the image and that return search results in an appropriate format back to the user. [0028] The disclosure at hand also describes in detail the realization of the overall system architecture as well as the heart of the image-based search service, the visual recognition engines. The disclosure lists multiple inventions on different levels of the mobile search system that make it more conducive to successful commercial deployments. [0029] [0030] [0031] The visual mobile search (VMS) service is designed to offer a powerful new funct
32. look-up can translate the word before it is processed further. [0104] 5.2 Media Bridging and Mobile Advertising [0105] Image-based search can support new print-to-internet applications. If you see a movie ad in a newspaper or on a billboard, you can quickly find out with a single click in which movie theaters it will show. [0106] Image-based mobile search can totally alter the way many retail transactions are done. To buy a Starbucks coffee on your way to the airplane, simply click on a Starbucks ad. This click brings you to the Starbucks page; a second click specifies your order. That is all you will have to do. You will be notified via a text message that your order is ready. An integrated billing system takes care of your payment. [0107] A sweet spot for a first commercial roll-out is mobile advertising. A user can send a picture of a product to a server that recognizes the product and associates the input with the user. As a result, the sender could be entered into a sweepstake or could receive a rebate. He could also be guided to a relevant webpage that gives him more product information or allows him to order this or similar products. [0108] Image-based search using a mobile phone is so powerful because the confluence of location, time, and user information with the information from a visual often makes it simple to select the desired information. The mobile phone naturally provides context for the query. FIG. 9 illustr
33. mically viable fashion, we propose to apply the following business models. [0132] The VMS service is best offered on a transaction fee basis. When a user queries the service, a transaction fee applies. Of course, individual transaction fees can be aggregated into a monthly flat rate. Typically the transaction fee is paid by the user or is sponsored by, say, advertisers. [0133] To entice users to submit interesting images to the recognition service, we suggest putting in place programs that provide for revenue sharing with the providers of annotated image databases. This is a bit akin to the business model behind iStockPhoto. [0134] 7.0 Tutorial For a Current Implementation [0135] This section describes in detail the steps a user has to go through to handle a current implementation of VMS called the Neven Vision oR system. The client is called iScout, is implemented in Java, and runs on a Nokia 6620 phone. [0136] 7.1 Overview [0137] The following is a brief tutorial for using the Object Recognition (oR) system that includes step-by-step instructions for Adding Images to the oR Server Database, Recognizing an Image, and Specifying Content. A brief troubleshooting section is also included. [0138] 7.2 Installation [0139] In order to use this document you will need to install the oR client, named iScout, on a Nokia 6620 phone. [0140] Download the client application from the internet onto a computer. [0141] [0142] 1. Yo
34. most any other object of interest as well: statues, furniture, architectural details, or even plants in a garden. [0093] The proposed image-based intelligent museum guide is much more flexible than previously available systems, which, for example, perform a pre-recorded presentation based on the current position and orientation of the user in the museum. In contrast, our proposed Image-Based Intelligent Museum Guide has the following unique characteristics: [0094] 1) Users can interactively perform queries about different aspects of an artwork. For example, as shown in FIG. 2, a user can ask queries such as "Who is this person in the cloud?" Being able to interact with the artworks will make the museum visit a stimulating and exciting educational experience for the visitors, specifically the younger ones. [0095] 2) Visitors can keep a log of the information that they asked about the artworks and cross-reference them. [0096] 3) Visitors can share their gathered information with their friends. [0097] 4) Developing an integrated global museum guide is possible. [0098] 5) No extra hardware is necessary, as many visitors carry cell phones with inbuilt cameras. [0099] 6) The service can be a source of additional income where applicable. [0100] Presentation of the retrieved information will also be positively impacted by the recognition ability of the proposed system. Instead of having a one-explanation-that-fits-all f
35. [Figure residue: drawing sheets 7 through 11 of 11, including FIG. 9 (labels: "Picture of movie poster", "Watch the trailer") and FIG. 11; Patent Application Publication, Oct. 26, 2006, US 2006/0240862 A1.] MOBILE IMAGE-BASED INFORMATION RETRIEVAL SYSTEM. CLAIM OF PRIORITY [0001] This application is a continuation-in-part of U.S. application Ser. No. 11/129,034, filed May 13, 2005, entitled IMPROVED IMAGE BASED SEARCH ENGINE FOR MOBILE PHONES WITH CAMERA, which is a continuation-in-part of U.S. application Ser. No. 10/783,378, f
36. or, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. [0215] The various illustrative logics, logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. [0216] The embodiments described above are exemplary embodiments. Those skilled in the art may now make numerous uses of, and departures from, the above-described embodiments without departing from the inv
37. or an artwork, it is possible to organize the information about different aspects of an artwork in many levels of detail and to generate a relevant presentation based on the requested image-based query. Dynamically generated presentations may include still images and graphics, overlay annotations, short videos, and audio commentary, and can be tailored for different age groups and users with various levels of knowledge and interest. [0101] The museum application can readily be extended to other objects of interest to a tourist: landmarks, hotels, restaurants, wine bottles, etc. It is also noteworthy that image-based search can transcend language barriers, and not just by explicitly invoking an optical character recognition subroutine. The Paris coffeehouse example would work the same way with a sushi bar in Tokyo. It is not necessary to know Japanese characters to use this feature. FIG. 8 illustrates how VMS may be used as a tool for a tourist to quickly and comfortably access relevant information based on an acquired image. [0102] 5.1.1 Optical Character Recognition with Language Translation [0103] A specific application of the image-based search engine is recognition of words in a printed document. The optical character recognition sub-engine can recognize a word, which then can be handed to an encyclopedia or dictionary. In case the word is from a different language than the user's preferred language, a dictionary
38. r important area is situations in which it is too costly to provide desired real-time information. Take a situation as mundane as waiting for a bus. Simply by clicking on the bus stop sign you could retrieve real-time information on when the next bus will come, because the location information available to the phone is often accurate enough to decide which bus stand you are closest to. [0126] 5.10 Virtual Annotation [0127] A user can also choose to use the object recognition system to annotate objects in a way akin to Virtual Post-it Notes. A user can take a photo of an object and submit it to the database together with a textual annotation that he can retrieve later when taking a picture of the object. [0128] 5.11 User Generated Content [0129] Another important application is to offer user communities the possibility to upload annotated images that support searches that serve the needs of the community. To enable such use cases, which allow users who are not very familiar with visual recognition technology to submit images used for automatic recognition, one needs to take precautions that the resulting databases are useful. A first precaution is to ensure that images showing identical objects are not entered under different image IDs. This can be achieved by running a match for each newly entered image against the database that already exists. [0130] 6.0 Business Models [0131] To offer the image-based search engine in an econo
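The first precaution of paragraph 0129, matching each newly entered image against the existing database so that the same object is not filed under two IDs, can be sketched as follows. This is a toy illustration only: an image is represented as a set of feature labels and compared with Jaccard overlap, which stands in for whatever match the recognition engine actually performs. The names enroll and jaccard, the threshold value, and the data layout are assumptions and do not come from the source.

```python
from typing import Callable, Dict, FrozenSet, List

Image = FrozenSet[str]  # toy image representation: a set of feature labels


def jaccard(a: Image, b: Image) -> float:
    """Overlap of two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def enroll(db: Dict[str, List[Image]], proposed_id: str, image: Image,
           similarity: Callable[[Image, Image], float] = jaccard,
           threshold: float = 0.5) -> str:
    """Add an image to db and return the ID it was filed under.

    If the image matches an object already in the database, it is stored
    under that existing ID instead of the proposed one.
    """
    for existing_id, images in db.items():
        if any(similarity(image, stored) >= threshold for stored in images):
            images.append(image)  # file the new view under the existing ID
            return existing_id
    db[proposed_id] = [image]
    return proposed_id
```

With an empty database, enrolling a first image under eiffel and then a strongly overlapping image under tour_eiffel leaves a single entry, eiffel, holding both views.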
39. rder not to pollute the database. There are different ways to generate such a confidence measure. One is to use match scores and topological and other consistency checks that are intrinsic to the object recognition methods described below. Another way is to rely on extrinsic quality measures, such as determining whether a search result was accepted by a user. This can with some reliability be inferred from whether the user continued browsing the page to which the search result led and/or whether he did not do a similar query shortly after. [0045] 2.3 Databases that Sort the Available Images by Location, Time, and Context [0046] To facilitate the recognition, it is important to cut down the number of object representations against which the incoming image has to be compared. Often one has access to other information in relation to the image itself. Such information can include time, location of the handset, user profile, or recent phone transactions. Another source of external image information is additional inputs provided by the user. [0047] It will be very beneficial to make use of this information to narrow down the search. For instance, if one attempts to get information about a hotel by taking a picture of its facade and knows it is 10 pm, then it will increase the likelihood of correct recognition if one selects from the available images those that have been taken close to 10 pm. The main reason is that the illumination conditions
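The 10 pm hotel example amounts to filtering the stored images by time of day before running the recognition match. A minimal Python sketch is given below. The record layout (StoredImage with an hour_taken field), the function names, and the two-hour default window are assumptions made for illustration; the source does not specify how close "close to 10 pm" is.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredImage:
    object_id: str
    hour_taken: float  # local time of day in hours, 0 <= hour_taken < 24


def hours_apart(a: float, b: float) -> float:
    """Circular distance between two times of day, in hours (at most 12)."""
    diff = abs(a - b) % 24.0
    return min(diff, 24.0 - diff)


def candidates_near_time(db: List[StoredImage], query_hour: float,
                         window_hours: float = 2.0) -> List[StoredImage]:
    """Keep only the stored images taken within window_hours of the query time."""
    return [img for img in db
            if hours_apart(img.hour_taken, query_hour) <= window_hours]
```

The distance is computed on a 24-hour circle so that a query at 23:00 still matches images taken at 01:00. The same pattern extends to location or user profile by adding further filters ahead of the match.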
40. re datasets in the database, some valid matches will be missed. This will lead to more valid hypotheses, since there are multiple matching views of the object in the database, but with fewer matches per hypothesis, which will diminish recognition performance. To avoid this degradation in performance, feature datasets can be linked so that all datasets of any object feature will be considered in the matching process. [0080] To achieve the linking, the following procedure can be used. When enrolling a training image into the database, all features detected in this image will be matched against all features in each training image of the same object already enrolled in the database. The matching is done in the same way that the object recognition engine deals with probe images, except that the database comprises only one image at a time. If a valid hypothesis is found, all matching feature datasets are linked. If some of these feature datasets are already linked to other feature datasets, these links are propagated to the newly linked feature datasets, thus establishing networks of datasets that correspond to the same object feature. Each feature dataset in the network will have links to all other feature datasets in the network. [0081] When matching a probe image against the database 172, in addition to the direct matches, all linked feature datasets will be considered valid matches. This will significantly increase th
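The link propagation of paragraph 0080 yields networks in which every feature dataset is linked to every other one, which is the behavior of a disjoint-set (union-find) structure. The sketch below uses that structure as a stand-in; the source does not name a data structure, and the class name FeatureLinks and the string keys such as img1/f3 are hypothetical.

```python
from typing import Dict, Hashable, Set


class FeatureLinks:
    """Tracks networks of feature datasets that depict the same object feature."""

    def __init__(self) -> None:
        self._parent: Dict[Hashable, Hashable] = {}

    def _find(self, x: Hashable) -> Hashable:
        """Return the representative of x's network, registering x if new."""
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            self._parent[x] = self._parent[self._parent[x]]  # path halving
            x = self._parent[x]
        return x

    def link(self, a: Hashable, b: Hashable) -> None:
        """Link two matching datasets; existing links on either side carry over."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_a] = root_b

    def linked(self, x: Hashable) -> Set[Hashable]:
        """All datasets in the same network as x, including x itself."""
        root = self._find(x)
        return {y for y in list(self._parent) if self._find(y) == root}
```

At probe time, a direct match on one dataset can be expanded with linked() so that every dataset in its network counts as a valid match, as paragraph 0081 describes.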
41. t feature is text and the related information is a web page address. 20. An image-based information retrieval system as defined in claim 19, wherein the remote server provides the web page address to the mobile telephone in real time. 21. An image-based information retrieval system as defined in claim 17, wherein the object feature is an advertisement and the related information is a web page address. 22. An image-based information retrieval system as defined in claim 17, wherein the object feature is a picture and the related information is an audio stream. 23. An image-based information retrieval system as defined in claim 17, wherein the object feature is an equipment part and the related information is an operation and maintenance manual for the equipment.
42. the recognition server for matching an image from the mobile telephone with an object representation in a database and forwarding an associated text identifier to the remote server, and the remote media server for forwarding mobile media content to the mobile telephone based on the associated text identifier. [Figure residue: drawing sheets 1 through 3 of 11; Patent Application Publication, Oct. 26, 2006, US 2006/0240862 A1. Recoverable labels from FIG. 1: "1 Image taken from mobile phone camera and sent to the Object Recognition Server"; "3 Media server uses the mobile content associated with the ID to transmit information back to the mobile phone"; "Media Server"; "Text Identifier (ID)"; "Mobile Phone"; "Mobile Media Content"; reference numeral 12.]
43. u must have a Bluetooth adapter installed on your machine. Installing application using Bluetooth [0143] 2. On the phone, navigate to Connect > Bluetooth. [0144] 3. Select it and make sure Bluetooth is on. [0145] 4. On your computer, browse to the folder to which you copied the iScout0.6.jar installation file. [0146] 5. Right-click on the file and select Send To > Bluetooth Device. [0147] 6. Click Browse. [0148] 7. Your phone's name should appear in the list. Select it and click OK. [0149] 8. Click Next. [0150] 9. On the phone, click Yes to accept the message. [0151] 10. When the message alert pops up, click show. [0152] 11. This will launch the installer. Click Yes throughout and accept all defaults. [0153] Installing application using Nokia PC Suite and data cable [0154] 1. Install the program and USB drivers for the PC using the CD that came with the phone. [0155] 2. After successful installation, plug in your phone to the data cable. [0156] 3. Right-click on iScout1.0.jar and select Install with Nokia Application Installer. [0157] 4. Follow the instructions to install the application. [0158] 7.3 Specifying Content on the Media Server [0159] The Media Server can be used for setting up content to be displayed on a client when an object is recognized. [0160] 7.3.1 Associating New Content with an Image in the OR Server Database [0161] Go to http://recognitionserver.nevenvision.com/or and enter your user for th
44. zed object. 12. An image-based information retrieval system as defined in claim 11, wherein the object is an advertising billboard and the related information is a web page address. 13. An image-based information retrieval system as defined in claim 11, wherein the object is a car and the related information is a car manual. 14. An image-based information retrieval system as defined in claim 11, wherein the object is a product and the related information is a payment confirmation. 15. An image-based information retrieval system as defined in claim 11, wherein the object is a book and the related information is an audio stream. 16. An image-based information retrieval system as defined in claim 11, wherein the object is a bus stop sign and the related information is real-time information on the arrival of the next bus. 17. An image-based information retrieval system, comprising a mobile telephone and a remote server, the mobile telephone having a built-in camera, a recognition engine for recognizing an object feature in an image from the built-in camera, and a communication link for requesting information from the remote server related to a recognized object feature. 18. An image-based information retrieval system as defined in claim 17, wherein the object feature is text and the related information is a translation of the text. 19. An image-based information retrieval system as defined in claim 17, wherein the objec
