        Readiris User's Manual
         Contents
1.  The page analysis will even detect zones where you get white text on a black background. Recognizing such inserts is no problem: while the preview displays the scanned document correctly on screen, Readiris "inverts" the image when the need arises to recognize such text blocks. (You can have your scanner generate fully inverted images to process pages with white text on a black background. See below.)

    ONE AND A HALF: SORTING WINDOWS

    Readiris not only detects the various blocks but also sorts them: the zones are sorted top-down, left to right by default to cope with columnized documents.

    Evidently, you can modify the sort order. To do so, click the "Sort" button on the main toolbar. The mouse cursor becomes a pointing hand as soon as the "sort mode" is enabled.

    Click on the windows you want to include. Windows you do not click on are simply ignored (excluded from recognition). It's easy to see which windows are selected and which aren't: the selected windows have their full color, non-selected windows have a lighter color tone.

    [Screenshot: Readiris main window showing C:\Program Files\Readiris\english.jpg, page 1 of 1]
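The default reading order described above (top-down, left to right, so that columns are read one after the other) can be illustrated with a small sketch. This is not Readiris code: the `Zone` type, the pixel coordinates and the `column_tolerance` parameter are hypothetical, and grouping zones into columns by their left edge is only one plausible way to obtain such an order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A hypothetical text window: a label plus the position of its top-left corner."""
    name: str
    left: int
    top: int


def sort_zones(zones, column_tolerance=50):
    """Order zones column by column (left to right), top-down inside each column."""
    columns = []  # list of (reference_left_edge, [zones in that column])
    for zone in sorted(zones, key=lambda z: z.left):
        # A zone joins the current column when its left edge is close enough
        # to the left edge of the zone that opened that column.
        if columns and abs(zone.left - columns[-1][0]) <= column_tolerance:
            columns[-1][1].append(zone)
        else:
            columns.append((zone.left, [zone]))

    ordered = []
    for _, members in columns:
        ordered.extend(sorted(members, key=lambda z: z.top))
    return ordered


zones = [
    Zone("right column, bottom", 410, 600),
    Zone("left column, bottom", 40, 620),
    Zone("right column, top", 400, 100),
    Zone("left column, top", 45, 90),
]
reading_order = [z.name for z in sort_zones(zones)]
print(reading_order)
```

With the sample zones, the left column is read from top to bottom before the right column. Changing the sort order by hand, as the "Sort" button allows, would simply replace this computed order with the order in which the windows were clicked.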
2.  [Screenshot: two Adobe Acrobat windows showing the sample document "Autoformat.pdf", one with its bookmarks pane open]

    ... OR READING THEM

    Let's look the other way for a moment. As Readiris offers full support of the Adobe Acrobat PDF format, you won't just generate PDF files, you can also read them.

    "Repurposing" PDF documents may be a major application of Readiris. There are several reasons why this is the case. First of all, it's a way of converting images into text: open image-based PDF documents, execute the recognition and save the OCR result to a text document (in any supported text format). Text files are editable, image files are not.

    Second case: you can convert image-based PDF files to text-based PDF documents. You then execute the recognition on "image only" PDF files and save the OCR results as text-based PDF documents. Text-based PDF files are searchable and ed
3.  The image smoothening can also be enabled when you load prescanned images into memory.

    [Screenshot: "Open" dialog with the options "Digital camera", "Force as 300 dpi", "Smoothen color images" and "Load PDF documents in color"]

    The brightness now. By "brightness" we actually mean the black-and-white threshold. The setting "Automatic" determines the bilevel threshold automatically. Apply a different threshold when necessary by darkening or lightening the black-and-white image: when you darken the image, more pixels become black in the black-and-white version; when you lighten the image, fewer pixels become black in the black-and-white version.

    Note above all that no image adjustment is executed until you click the "Apply" button. By clicking "OK", you execute the adjustment and close the window. Here's an example where we lightened the black-and-white image dramatically (though admittedly not with OCR accuracy in mind).

    [Screenshot: image adjustment window with the "Smoothen color image" option, the "Brightness" setting on "Automatic" and the buttons "OK", "Cancel", "Apply" and "Help"]
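To make the effect of the black-and-white threshold concrete, here is a minimal sketch. It is not the Readiris conversion routine: it assumes greyscale values from 0 (black) to 255 (white), a fixed threshold instead of the "Automatic" setting, and it models "darkening" as raising the threshold and "lightening" as lowering it. The function names are hypothetical.

```python
def to_bilevel(gray_rows, threshold=128):
    """Return rows of 1 (black) and 0 (white): a pixel turns black when it is darker than the threshold."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


def count_black(bilevel_rows):
    """Count the black pixels of a bilevel image."""
    return sum(sum(row) for row in bilevel_rows)


# A tiny greyscale "image": 0 is black, 255 is white.
gray = [
    [20, 90, 140, 250],
    [60, 130, 170, 240],
]

normal = count_black(to_bilevel(gray, threshold=128))
darker = count_black(to_bilevel(gray, threshold=180))   # darkened: more pixels become black
lighter = count_black(to_bilevel(gray, threshold=70))   # lightened: fewer pixels become black
print(lighter, normal, darker)
```

The three counts rise with the threshold, which is the behavior the brightness setting describes: a darker setting keeps faint strokes at the risk of filling in the background, a lighter setting does the opposite.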
4.  Readiris
    USER'S GUIDE

    I.R.I.S.
    Document to Knowledge

    Readiris Pro

    © 2002 I.R.I.S. All rights reserved. OCR technology by I.R.I.S. Connectionist, AutoFormat and Linguistic technology by I.R.I.S.

    SAVE TIME: NO MORE RETYPING

    Congratulations on acquiring Readiris. This software package will undoubtedly be of great help in recapturing your texts, tables and graphics.

    As efficient as computers are, you have to key in your information first. If you have ever retyped a 15-page report or a large table of figures, you know how tedious and time-consuming it can be. Use this state-of-the-art OCR package to automatically enter text in your applications and you'll acquire an unprecedented level of efficiency and comfort.

    Scan a printed or typed document, indicate the zones of interest (or have the system detect them for you) and execute the character recognition. Documents composed of many pages are processed from start to finish in a single effort. A few mouse clicks beat long hours of work as Readiris converts your paper documents into editable computer files: it's up to 40 times faster than manual retyping.

    The wizard guides you through the OCR process comfortably: answer a few simple questions and you'll obtain quick and easy results with Readiris. You can send the reading results directl
5.  User Lexicons to "Boost" the Linguistics
    Readiris Changes Languages As Needed
    Reading Documents with Mixed Languages
    Defining the Document Characteristics
    Readiris Gets More Intelligent Each Time
    Learn
    Don't Learn
    Delete
    Undo
    Finish
    Abort
    The Role of Font Dictionaries
    Sending the Result Directly to Your Application
    Saving the Results in a Text File
    Creating Portable Documents
    ... or Reading Them
    Recognizing Tables
    Editing Multipage Documents
    Starting a New Document
    Organizing the [illegible]
    Setting Up Your Scanner
    Bring Color to Your Text Scans
6.  [Screenshot: list of output formats, including Sun StarOffice, MS-DOS text, Unicode text, WordPerfect and WordStar]

    The option "Open after Saving" is largely similar to the "send" feature: you open the recognized document once it's saved.

    [Screenshot: "Output" options with "External file", the format "Microsoft Word 97, 2000, 2002 (*.rtf)" and the option "Open after saving" enabled]

    However, the method used to address the target application is different. This time, the Windows file types determine which application will be started up. It's as if you double-clicked the output file in the Windows Explorer. (With the option "Send to", Readiris addresses specific target applications directly.)

    [Screenshot: Windows "Folder Options" dialog, "File Types" tab, showing that files with the extension DOC open with Microsoft Word]

    CREATING PORTABLE DOCUMENTS

    We'll go deeper into one format: Adobe Ac
7.  Excel, so we select Excel as target application under the "Format" button.

    [Screenshot: "Output" options with "Send to" set to Microsoft Excel, chosen from the list of target applications]

    The spreadsheet is started up automatically and the result looks like this: the typical table structure with rows and columns is recreated, and you are immediately ready to process the data.

    [Screenshot: Microsoft Excel worksheet containing the recognized table]

    You may come across "ungridded" tables the page analysis does not detect as table zones because the columns are too widely spaced. (Readiris tries to avoid confusion with columnized text blocks.) To create a table window manually, click on the "Table Window" tool in the image toolbar and proceed as usual; the button's tooltip again indicates the number of table windows.

    GETTING ON-LINE HELP

    This concludes
8.  The Readiris software is delivered exclusively on an autorunning CD-ROM. To install, simply insert the CD-ROM in your CD-ROM drive and wait for the installation program to start running. Follow the on-screen instructions.

    Should the installation not begin to run when the CD-ROM is inserted in your CD-ROM drive, run the setup program MENU.EXE to install the software.

    Users of Windows XP, Windows 2000 and Windows NT must ensure that they have the necessary access rights (contact the system administrator if necessary).

    Some installation options are offered. Be sure to install the linguistic databases of all languages you intend to read. (By default, all lexicons are installed.) You are recommended to install the sample images which are used in the tutorials of this manual.

    [Screenshot: InstallShield Wizard, "Select Components", listing Languages, Sample Images, Electronic Manual and Adobe Acrobat Reader]

    Similarly, install the Acrobat Reader software required to access the software documentation, should this be necessary. The electronic manual is by default copied to your hard disk
9.  You can also leave it on the CD-ROM.

    The submenu "I.R.I.S. Applications - Readiris" under the "Programs" menu is created automatically by the installation program.

    [Screenshot: "I.R.I.S. Applications" submenu with the entries Cardiris, IRISPen and Readiris]

    The same holds for a shortcut to Readiris on the Windows desktop. As a result, you are able to start Readiris directly from your desktop.

    UNINSTALLING THE READIRIS SOFTWARE

    There are only two correct ways of uninstalling Readiris: using the Readiris "uninstall" program and using the Windows (un)install wizard. You are strongly recommended not to uninstall Readiris or its software modules by manually erasing the program files.

    Readiris "uninstall" program

    Select "Uninstall Readiris" under the submenu "I.R.I.S. Applications - Readiris" to start the Readiris "uninstall" program and follow the on-screen instructions.

    [Screenshot: "Readiris" submenu with the entry "Uninstall Readiris"]

    Windows (un)install wizard

    Execute the following steps to make use of the Windows (un)install wizard.

    - Click "Settings" under the "Start" menu of Windows and go to the "Control Panel".
    - Click the icon "Add/Remove Programs" under the contro
10.  and on the Readiris icon, to recognize them.

    As soon as pages get processed, an additional toolbar, the page toolbar, is added on the right side. It represents the various pages of the document and gives access to the page commands using the right-click (the "Context" menu).

    GETTING STARTED WITH A FIRST TUTORIAL

    The best way to become familiar with the operation of Readiris is undoubtedly by using it. A number of prescanned images are provided with the software: they allow you to get started even when there is no scanner connected to your computer. Let's turn to these now.

    The "Source" button on the main toolbar determines whether you are going to use a scanner or a prescanned image as image source.

    Color, greyscale and black-and-white images are supported on an equal basis. Readiris allows you to open Adobe Acrobat PDF documents, JPEG images, Paintbrush (PCX) images, DCX fax images (a multipage version of the Paintbrush format), PNG images, TIFF images (uncompressed, LZW, PackBits, Group 3 and Group 4 compressed), multipage TIFF images and Windows bitmaps (BMP). (This capability is particularly useful to convert your faxes into editable text files.)

    As you are going to open a prescanned image, you should select the disk (and not the scanner) as image source with the "Source" button.

    Next, click the "Open" button. When you select the disk as image source, the "Scan" button is replaced by the "Open" button an
11.  Know that the tooltip of the "Learn" button indicates at all times which font dictionary is currently active and in which mode that dictionary operates.

    [Screenshot: tooltip of the "Learn" button, reading "Interactive learning", the dictionary file name and "New Dictionary"]

    When you enter the interactive learning, the dictionary and its operating mode are indicated in the window title: you should click the "Abort" button and start over in case they are wrong.

    [Screenshot: learning window titled "New Dictionary" followed by the dictionary file name, with the buttons "Don't Learn" and "Abort"]

    SENDING THE RESULT DIRECTLY TO YOUR APPLICATION

    The interactive training concludes the character recognition. As Microsoft Word operates as output target by default, your wordprocessor is started up automatically at the end of the recognition (if necessary) and the recognized text is inserted.

    You may get a progress bar on screen as the recognized document gets formatted. (Whether this progress bar appears on screen or not depends on the size of the document and the complexity of the formatting to be performed.)

    The scanned image is displayed again with the zoning as created to be available for further processing. It stays there until you scan another page.

    You have indeed converted a paper document into an editable computer file, and up to 40 times faster than manual retyping at that. Go ahead and compare it with the image you have inside your Readiris window.

    Actually, Readiris offers three differen
12.  You can even drag several prescanned images from the Windows Explorer onto the Readiris window. The same argument holds: all images you drag onto the Readiris window are added to the current document until you click the command "New Document". (Readiris sorts the images automatically: image 001.tif precedes 002.tif precedes 003.tif etc.)

    The page toolbar on the right side (it is displayed as soon as pages get processed) represents the various pages of the document and gives access to the page commands (using the right-click).

    The current page is highlighted in the page toolbar and mentioned in the Readiris title bar.

    The page toolbar comes with a tooltip: hold your mouse pointer over a page thumbnail to learn which image was loaded into the memory. (If a multipage image was opened, there's obviously just one file for all the images.) When you are scanning multipage documents, the tooltip simply mentions the scanner model.

    [Screenshot: page toolbar tooltip showing a scanner model]

    Load the sample image MULTIPAG.TIF and start the recognition. The various pages are displayed one after the other; the Readiris title bar indicates the page number.

    [Screenshot: Readiris main window showing C:\Program Files\Readiris\multipag.tif, page 2 of 5]
13.  Readiris is unable to recognize, such as mathematical and scientific symbols and dingbats. Some examples: Readiris can be trained to recognize the "π" symbol as "pi" or a dingbat as "Tel.". However, the list of recognized symbols cannot be extended with the symbols z and [illegible].

    The recognized text is displayed progressively and the system stops on doubtful characters or (if you are dealing with touching characters, "ligatures") on doubtful character strings. They are always presented in their context; the doubtful characters are highlighted. Unrecognized characters are represented by a tilde (the "~" symbol).

    [Screenshot: learning window titled "New Dictionary" followed by the dictionary file name]

    The first thing you should do is verify whether you activated the correct font dictionary and dictionary mode: these are always indicated in the title of the learning window. If that is not the case, click the "Abort" button (the document image is redisplayed with the zoning as was created), enable the right font dictionary or dictionary mode and run the OCR again. (The operation of font dictionaries will be discussed shortly.)

    If necessary, enter a character (or character string) for the incorrect or unknown shape and click one of the following buttons.

    Learn

    You agree with the proposed solution or correct it. The program saves this doubtful character in the font dictionary as "sure", final. Future recognition will no longer require your
14.  inside your Readiris window.

    But how do you save the text of additional pages? Or, in other words, how do you process documents consisting of multiple pages? It's actually very simple: go on recognizing pages, but enable the option "Append" when you are saving to the same file. (If you append to an existing file, be sure it isn't currently open, because that will prevent you from writing to it.) Secondly, don't forget to put the font dictionary in the append mode so that you can continue the font training comfortably.

    [Screenshot: "Save" dialog with the file name "Readiris", the type "Microsoft Word 97, 2000, 2002" and the option "Append" enabled]

    As soon as you scan pages (or open image files) inside a document, you have to decide whether you want to start a new document or complete the current document.

    [Dialog: "Are you ready to delete the current document?" with the buttons "Yes", "No" and "Cancel"]

    Answer "no" to add pages to the current document; answer "yes" to create a new document. (This answer has the same effect as the command "New Document" under the "File" menu.)

    But there's a more efficient way of recognizing several pages than scanning and OCRing them one after the other: processing multipage documents directly.

    To scan a document composed of several pages in one operation, enable the document feeder of your scanner with the option "ADF" under the "Scanner" button.

    [Screenshot: "Scanner" options, including "Landscape", "ADF", "Invert" and "Digital camera"]
15.  intervention: the shape is considered learnt once and for all.

    In the example above, the system stops on a soiled character, and we click "Learn" to accept a shape which cannot be confused with other characters.

    Don't Learn

    You agree with the proposed solution or correct it. The difference with the "Learn" button is that the learnt symbol gets the status "unsure" in the dictionary. For future recognition, the system will propose the "learnt" solution but still require a confirmation.

    This button is used for symbols which might be confused with others: a defaced "e" which might be mistaken for a "c", a damaged "t" which closely resembles an "r" etc.

    [Screenshot: learning window titled "New Dictionary" followed by the dictionary file name]

    The "e" above is seriously damaged (in fact it closely resembles another symbol) and you should click "Don't Learn" so as not to confuse the two symbols.

    Delete

    The displayed form is eliminated from the output. This button is used to ignore "noise" on the documents (spots, coffee stains etc. which might get recognized as points, commas and what have you) and to erase any other unwanted symbol.

    Undo

    You go back to correct mistakes. You can undo the nine last decisions.

    Finish

    The learning process is aborted but the OCR continues in automatic mode. All decisions by the system thereafter are accepted without user validation.

    Click this button when you see that the recogniti
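The difference between "Learn" and "Don't Learn", and the limit of nine undoable decisions, can be modeled in a few lines. This is a hypothetical sketch for illustration only: the class, its method names and the string shape identifiers are assumptions and say nothing about how Readiris actually stores its font dictionaries. It covers the "Learn", "Don't Learn" and "Undo" decisions, not "Delete" or "Finish".

```python
from collections import deque


class FontDictionary:
    """Hypothetical model of a font dictionary filled by interactive learning."""

    def __init__(self):
        self.entries = {}               # shape id -> (character, status)
        self.history = deque(maxlen=9)  # only the nine last decisions can be undone

    def learn(self, shape, character):
        """'Learn': store the solution as sure; no confirmation is asked again."""
        self._remember(shape)
        self.entries[shape] = (character, "sure")

    def dont_learn(self, shape, character):
        """'Don't Learn': store the solution as unsure; it is proposed but still confirmed."""
        self._remember(shape)
        self.entries[shape] = (character, "unsure")

    def undo(self):
        """Revert the most recent decision; return False when nothing is left to undo."""
        if not self.history:
            return False
        shape, previous = self.history.pop()
        if previous is None:
            self.entries.pop(shape, None)
        else:
            self.entries[shape] = previous
        return True

    def needs_confirmation(self, shape):
        """Unknown and unsure shapes still stop the recognition for user validation."""
        entry = self.entries.get(shape)
        return entry is None or entry[1] == "unsure"

    def _remember(self, shape):
        # Keep the previous state of the shape so that undo can restore it.
        self.history.append((shape, self.entries.get(shape)))


dictionary = FontDictionary()
dictionary.learn("shape-17", "a")        # a soiled but unmistakable shape
dictionary.dont_learn("shape-42", "e")   # a damaged "e" that could be confused
print(dictionary.needs_confirmation("shape-17"))
print(dictionary.needs_confirmation("shape-42"))
```

A bounded `deque` is used for the history so that older decisions silently drop out once nine newer ones have been recorded, which mirrors the nine-step undo limit described above.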
16.  our overview of Readiris. Some last-minute information may not be included in this manual. We thus recommend that you consult the on-line help system for additional information on Readiris.

    Go to the "Help" menu to do so. The command "Help Topics" and its shortcut key F1 allow you to navigate through the many help topics.

    [Screenshot: Readiris help window with the topics "Introducing OCR", "Recognizing Documents", "How to...", "Reference Information", "Software Versions and Options", "Product Registration" and "Product Support"]

    The other commands of the "Help" menu tell you how to get product support, how to contact I.R.I.S., give direct access to the I.R.I.S. home page etc.
17.  pages for you. Enable the option "Detect Page Orientation" under the "Settings" menu and Readiris will correct the page orientation where needed.

    You can make good use of the image DESKEW.JPG in the Readiris folder if you want to try it. Enable the options "Page Deskewing" and "Detect Page Orientation" before you open the image and let Readiris restore the Tower of Pisa the way we like it.

    [Screenshot: Readiris main window showing C:\Program Files\Readiris\deskew.jpg, page 1 of 1, with the page displayed upside down]

    ADJUSTING THE SCANNED IMAGES

    As was already indicated, powerful intelligent routines automatically convert color and greyscale images into black-and-white. Should this still be necessary, the user can optimize the image further for th
18.  struct them for you by recreating the tables cell by cell in your spreadsheet or by inserting a table object inside your wordprocessor files.

    Let's explore the different solutions, starting with the "gridded" or "framed" table: it has borders around the cells.

    [Screenshot: Readiris main window showing C:\Program Files\Readiris\tables.jpg, page 1 of 1, a sample page titled "Reading Tables" that contains a gridded table]
19.  top accuracy, including low-quality documents, faxes and dot-matrix printouts. It copes beautifully with badly scanned and copied documents containing too light or dark font shapes. Joined characters ("ligatures") are resolved and fragmented forms, such as dot-matrix symbols, are recomposed.

    User verification in pop-up style not only flags doubtful characters but also increases the system's precision. All solutions confirmed by the user are memorized, increasing speed and confidence as you go along. Using Readiris means rendering it more intelligent each time. This powerful learning tool allows you to train Readiris on special characters such as mathematical symbols and dingbats, but also to handle distorted fonts as you will find in real documents.

    To increase your productivity further, Readiris not only recognizes your texts, but can format them for you as well. Make use of "autoformatting" and Readiris recreates a facsimile copy of the scanned document: the word, paragraph and page formatting of the original document are retained.

    Similar typefaces are used; the point sizes and typestyles as used in the source document are maintained across the recognition. The placement of columns, text blocks and graphics follows your original documents. And as Readiris supports greyscale and color scanning effortlessly, you can recapture any graphics, be they line art, black-and-white photos or color illustrations. When a document contains table
20.  [Screenshot: Readiris window with a Russian sample document]

    The end result looks like this when opened with the wordprocessor: you may have to select a Cyrillic font to display the Russian text correctly.

    [Screenshot: WordPad showing the recognized Russian text]

    To mix other languages, simply select the language with the most extended character set. If you have a document where the, say, French translation is placed alongside an English text, you have to select French as language to ensure that the accented characters get recognized correctly.

    DEFINING THE DOCUMENT CHARACTERISTICS

    Now that the language is set, we'll turn to the other document characteristics. You can fine-tune the recognition by specifying some document features: the font type and character pitch. (These commands do not apply to Asian documents.) Let's clarify what this means.

    Let's start with the command "Font Type" under the "Settings" menu. The font modes separate "normal" documents from dot-matrix printed documents. "Draft" or "9-pin" dot-matrix symbols are made up of isolated, separate dots, and highly specialized recognition rout
21.  view and edit such documents. Office XP and 2000 were specifically designed to cope with documents in many different languages. (Refer to the Readiris "Read Me" file for more information on this subject.)

    Selecting the proper document language is imperative. Based on the selection of a language, the software knows which symbol set to recognize. Multi-linguistic support ensures that "exotic" characters are recognized correctly.

    Secondly, the software extensively uses linguistic databases to validate its results. Suppose that you have to read the word "president" where an ink stain makes the "r" look like an "f". Looking things up in the English lexicon, Readiris will detect autonomously that the word "president" is being read and that it doesn't make any sense to recognize the symbol "f". (This "self-learning" technique is of course highly dependent on the linguistic context.)

    Linguistics offer useful help to solve ambiguous cases such as an "O" which might be mistaken for a "0". Another typical example is the letter "I" and number "1" which have an identical form in many fonts; think of texts produced on old typewriters. The linguistic context helps to determine whether you are dealing with "I" or "1".

    The illustration below shows various shapes of "l" and "I". The shapes on the first line are unambiguous, the shapes on the second line are ambiguous, but linguistics can solve them. W
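The lexicon check described in this excerpt can be sketched as follows. This is an illustration under stated assumptions, not the linguistic engine of Readiris: the `CONFUSABLE` table, the tiny lexicon and the `resolve` function are all hypothetical, and a real system would weigh candidates rather than take the first match.

```python
from itertools import product

# Hypothetical table of look-alike characters; each entry lists the character
# itself first, followed by the shapes it is commonly confused with.
CONFUSABLE = {
    "0": "0O", "O": "O0",
    "1": "1Il", "I": "I1l", "l": "lI1",
    "f": "fr",  # assumption: an ink stain can make an "r" look like an "f"
}


def resolve(raw_word, lexicon):
    """Return the first variant of raw_word found in the lexicon, else raw_word unchanged."""
    options = [CONFUSABLE.get(ch, ch) for ch in raw_word]
    # Try every combination of look-alike substitutions; the unchanged word comes first.
    for candidate in product(*options):
        word = "".join(candidate)
        if word.lower() in lexicon:
            return word
    return raw_word


lexicon = {"president", "oil", "in"}
print(resolve("pfesident", lexicon))
print(resolve("0il", lexicon))
print(resolve("1n", lexicon))
```

Trying every combination grows quickly with the number of ambiguous characters in a word, so this brute-force search only suits short words; it is meant to show why the result depends so strongly on the selected language and its lexicon.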
22.  [Screenshot: German sample document laid out in columns, with the commands "apply to whole document" and "Start new column"]

    TEXT FORMATTING, PART 2

    The other layout options are "Create Body Text" and "Retain Word and Paragraph Formatting".

    As the icon on the right side illustrates, creating body text means you create a non-formatted, "running" text. The text will be captured, but its formatting is entirely ignored. Use this option when you just need to recapture a text but not its layout.

    [Screenshot: "Layout" options "Create body text", "Retain word and paragraph formatting" and "Recreate source document"]

    The option "Retain Word and Paragraph Formatting" represents the middle road: the word formatti
23. Different Devices, Different Resolution
[illegible entry]
Saving Specific Settings
Scanning Documents
Adjusting the Scanned Images
Letting the OCR Wizard Work for You
Readiris Recreates Your Document Layout
Columns Please, Not Frames
[illegible entry]
Saving Graphics Separately
Taking Graphics to the Hilt
Reading Faxes and Deferred Recognition
[illegible entry]
Recognizing Business Cards
Scanning Business Cards
It Takes a Business Card Reading Mode
Recognizing Business Cards
Getting On-line [illegible]

CREDITS AND COPYRIGHTS

The Readiris software is designed and developed by I.R.I.S. OCR Con
24. DOS Text, ASCII etc. do not support advanced formatting codes and therefore cannot offer autoformatting. The Adobe Acrobat PDF format, on the other hand, was designed to copy the look of your documents. PDF documents by nature imply autoformatting.

When the recognized text is opened using a wordprocessor, the text looks like this without any intervention by the user.

[Screenshot: the recognized document "Autoformat" opened in Microsoft Word. Its text reads as follows.]

Autoformatting

The aim of "autoformatting" is to recreate a facsimile copy of the original document.

The OCR process does more than just recognize your text, it can format it for you too.

In a way, text recognition is becoming more and more page recognition or document recognition.

Whether your OCR software reformats the recognized text or not is up to the user. You can perform OCR because you just need the text, in which case you will edit and format it yourself, and you can recreate the source document, including its formatting.

The various levels of formatting are: creating body text, retaining the word and paragraph formatting and creating a facsimile copy.

Creating body text means no formatting is appl
25. Formatting is applied: you get a continuous "running" text. All formatting, if any, is done afterwards by the user.

If you retain the word and paragraph formatting, the font type, size and typestyle are maintained across the recognition. The justification of the paragraphs is also detected. However, no graphics are captured and the columns aren't recreated: the paragraphs just follow each other etc.

Autoformatting recreates a facsimile copy of the original document: the text blocks, graphics and tables are recreated in the same place and the word and paragraph formatting are maintained across the recognition.

As a result, you get a true copy of your source document, be it a compact and editable text file, no longer a scanned image of your document.

Click the "Format" button on the main toolbar and choose to send the OCR result to Microsoft Word, or select the RTF (Rich Text Format) or Word (DOC) format. Secondly, select "Recreate Source Document" as layout option. (The option "Merge Lines into Paragraphs" is enabled by default to apply wordwrap within the paragraphs.)

[Screenshot: the "Layout" options "Create body text", "Retain word and paragraph formatting" and "Recreate source document" (selected), with the option "Use columns instead of frames" enabled]

Whether layout reconstruction is available depends on the selected output mode. Some "poor" formats generating "plain" text such as Text (ANSI), MS-
26. [Screenshot: the Windows Explorer context menu of an image file, with the commands "Open with", "Cut", "Copy", "Create Shortcut", "Paste" and "Properties"]

That does not mean the OCR is promptly executed: to give the user full flexibility, Readiris is simply started up and the image is opened.

The image toolbar on the right side of the Readiris application window contains all commands you need during the image preview: tools to indicate the zones of interest, to rotate the image, zoom in and out etc.

ZOOMING IN ON IMAGES

Readiris has several commands that allow you to zoom in on a scanned image, for instance to verify the scanning quality.

The image toolbar contains buttons that allow you to zoom in at real size, to fit the image to the page width and to fit the entire image in the preview window.

The "View" menu contains the same commands and adds two extra zoom levels: you can display the image at 50% and 200% of its actual size. (At actual size, a screen pixel corresponds to an image pixel.) Shortcuts are available for all zoom levels.

[Screenshot: the "View" menu with the commands "Fit to Window" (Ctrl+F), "Fit to Width", "Actual Size" (Ctrl+1) and "200% Actual Size" (Ctrl+2)]

Also notice that the zoom levels are available on the right click. Click with the right mouse button to invoke the "Context" menu and select the appropriate zoom level.

[Screenshot: the Readiris application window]
27. R.I.S." under the "Help" menu of Readiris details the ways in which you can get in touch with I.R.I.S.

[Screenshot: the Readiris help window, opened on the topic "How to Get in Touch with I.R.I.S.". It lists the following contact details.]

Head Office, Belgium
Phone: +32 10 45 13 64
Fax: +32 10 45 34 43

I.R.I.S. on the Internet
I.R.I.S. home page: http://www.irislink.com
Readiris web site: http://www.readiris.com
On-line shop: http://shop.irislink.com
E-mail info: info@irislink.com
E-mail sales: sales@irislink.com
E-mail support: support@irislink.com

USA Office, East Coast
Phone: 1 561 395 7831 / 800 447 4744
Fax: 1 561 347 6267

USA Office, West Coast
Phone: 1 480 854 3111 / 800 7USAIRIS
Fax: 1 480 854 2929

France Office
Phone: 0810 00 19 27
Fax: 0810 42 41 43

An application icon in the submenu "I.R.I.S. Applications - Readiris" under the "Programs" menu takes you directly to the I.R.I.S. home page. So do the Readiris startup screen and the command "I.R.I.S. on the Internet" under the "Help" menu of Readiris.

[Screenshot: the "I.R.I.S. Applications" submenu of the Windows "Programs" menu] I.R.I.S. on th
28. You can also use the command "Select Source" under the "File" menu.

[Screenshot: the "Select Source" dialog listing the installed Twain sources]

Once the scanner is selected, the same window may allow you to set the scanning resolution, the page format and orientation, brightness and contrast, and may allow you to indicate whether you are going to use the scanner's document feeder. With Twain compliant scanners, all scanning parameters are often set inside the Twain interface.

Set the brightness and, if available, the contrast.

By enabling the option "Landscape", you indicate that the selected page orientation is wide ("landscape") instead of tall ("portrait"). The page orientation actually applies to reduced page formats: on an A4 flatbed scanner, you can scan, say, A5 pages (half that big) in portrait or landscape format, but you can obviously only scan the full A4 surface in one direction.

The option "Invert" allows you to generate "inverted" images in the black-and-white scanning mode: you can activate this option to process full pages with white text on a black background.

BRING COLOR TO YOUR TEXT SCANS!

Readiris supports black-and-white, greyscale and color images on an equal basis, so you are free to choose the color mode that best suits your needs. To include lineart graphics in the recognized documents, scan in black-and-white 
29. [Screenshot: the "Adjust Image" window over a sample page, with "Despeckle" set to off]

The first two options concern color and greyscale images; the last one, "Despeckle", exclusively concerns black-and-white images. "Despeckling" means that the "parasite pixels", also called "salt and pepper noise", will be removed from black-and-white images.

[Illustration: the sample text "If computers can't adapt easily, then maybe the people using them can." before and after despeckling]

Be sure that you don't erase spots that are too big, otherwise you might start erasing the dots on the "i" etc., portions of dot matrix letters etc.

[Screenshot: the "Despeckle" slider, ranging from 0 to 20 and set to remove 10-pixel dots, with the warning "Removing too large dots may erase useful information from the image"]

The best way of optimizing the images for the OCR process is this: place the adjustment window where it doesn't prevent you from judging the image adjustment you execute. Adapt the parameters, clicking "Apply" each time, until the image is crisp and clear.

LETTING THE OCR WIZARD WORK FOR YOU

Let's get started capturing documents now. Instead of going through all the parameters, we'll use the OCR wizard, a very comfortable way of recognizing pages.

Click the "OCR Wizard" button on the main toolbar, or select the command "OCR Wizard" under the "Process" menu.
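The despeckling described in this excerpt can be sketched as follows. This is a generic illustration of removing "salt and pepper noise", not the Readiris implementation: the function name `despeckle`, the list-of-lists image representation and the reading of the slider value as a pixel count per blob are all assumptions. The sketch finds each connected group of black pixels and erases the groups that are no larger than a given size.

```python
from collections import deque


def despeckle(image, max_size):
    """Return a copy of a binary image (1 = black, 0 = white) in which every
    4-connected black blob of at most max_size pixels has been erased."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected blob of black pixels.
            blob = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small blobs are treated as noise and painted white.
            if len(blob) <= max_size:
                for by, bx in blob:
                    result[by][bx] = 0
    return result


page = [
    [1, 0, 0, 0],   # a lone speck in the top-left corner
    [0, 0, 1, 1],
    [0, 0, 1, 1],   # a 4-pixel blob, standing in for part of a letter
]
print(despeckle(page, 1))   # only the speck disappears
print(despeckle(page, 4))   # too aggressive: the "letter" is erased as well
```

The second call shows the risk the manual warns about: with too large a threshold, useful detail such as the dot on an "i" is removed together with the noise.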
30. [Screenshot: the "OCR Wizard" command and the "OCR Wizard" toolbar button]

The wizard guides you through the OCR process comfortably: answer a few simple questions and you'll obtain quick and easy results with Readiris.

[Screenshot: the first screen of the OCR wizard, with the text "The OCR wizard leads you through the OCR process comfortably. Just answer these questions and you'll get quick results with Readiris. Click Next to begin." and the option "Enable Wizard on Startup"]

Actually, the OCR wizard starts running each time you start up Readiris; you can avoid this by disabling the option "Enable Wizard on Startup" in the first screen of the wizard (and with the equivalent option under the "Settings" menu).

READIRIS RECREATES YOUR DOCUMENT LAYOUT

The OCR wizard renders the recognition process highly automatic, but "automatic" OCR should not be confused with autoformatting. "Autoformatting" means that Readiris recreates a facsimile copy of the scanned document: the word, paragraph and page formatting of your original document are applied.

Similar typefaces (serif and sans serif, proportional and fixed, normal and condensed) are used as in the source document; the point sizes and typestyles (bold, italic and underlined) are maintained across the recognition. The tabs and the alignment (left, centered, right and justified) of each text block are recreated. The placement of columns, text blocks and graphics follows your original document.

In other word
31. [Screenshot: the sample image, whose visible text ends: "confusion with columnized text blocks. When your tables exclusively contain numeric characters, enable the numeric reading mode with the "Language" button on the main toolbar for increased accuracy."]

Run the recognition with the layout option "Retain Word and Paragraph Formatting" or "Recreate Source Document" enabled and the table gets recreated. Open your wordprocessor to have a look at the result. (You could obviously have included the text paragraphs in the text file as well.)

[Screenshot: the recreated table "Performance test optical media" opened in Microsoft Word, with columns for average access time, CPU utilization, video clip playbacks and sequential read speed]

Now the "ungridded" example: it has no borders around the cells. Note that the page analysis nevertheless detects the table.

[Screenshot: the image file "table.jpg" opened in Readiris]
32. d above the text in a two-layered PDF file. Use the "Search" tool of Adobe Acrobat (Reader) and this becomes quickly obvious.

[Screenshot: the document "Autoformat.pdf" in Adobe Acrobat, with a search hit highlighted in the text and the "Find" dialog open]

Click the "Format" button to discover an option that concerns the Acrobat PDF format: "Create Bookmarks".

[Screenshot: the options "Merge lines into paragraphs", "Include graphics" and "Create bookmarks"]

The option "Create Bookmarks" sees to it that a bookmark is created for each document element: the graphics as well as the text blocks and tables. For the text zones, Readiris applies an intelligent algorithm to come up with a title, a "summary", per zone; the tables and graphics are simply numbered. (Another navigational element of PDF documents, page thumbnails, can be created dynamically by your Adobe Acrobat (Reader) software.)
33. d the corresponding "Scan" command under the "Process" menu is replaced by the "Open" command.

[Screenshot: the "Open" button]

You could also select the command "Open" from the "File" menu and open a prescanned image directly; this works even if your scanner operates as current image source.

You are invited to select an image file. Select the file ENGLISH.JPG in the Readiris folder. As this sample file is a color image, it is not only read from disk: a "binarized", black-and-white version is created for the OCR process.

[Screenshot: the progress message shown while english.jpg is loaded]

Finally, the image is displayed in the image zone. The page toolbar indicates that a single page is loaded into Readiris.

[Screenshot: the progress message "Converting", followed by the Readiris window with english.jpg loaded as page 1 of 1]

A third way of opening prescanned images is the use of "drag and drop": drag images from the Windows Explorer onto the Readiris image zone or on the

[Screenshot: the sample page, which reads: "A word about OCR. The aim of OCR is to automatically enter printed text documents in a very effective and low cost way. Although the first research and development on Optical Character Recognition (OCR) began more than 30 years ago, this technology is still unknown by most of the people who could use it for their document entry applications. Now, you can use th"]
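The "binarized" version mentioned in this excerpt is a two-level copy of the color image. The manual does not say how Readiris computes it; a minimal sketch, assuming a fixed brightness threshold (the function name `binarize`, the default threshold of 128 and the pixel representation are all assumptions), looks like this:

```python
def binarize(pixels, threshold=128):
    """Convert rows of (r, g, b) tuples into rows of 1 (black) and 0 (white).

    A pixel becomes black when its perceived brightness falls below the
    threshold. The weights are the common ITU-R BT.601 luma coefficients.
    """
    result = []
    for row in pixels:
        out = []
        for r, g, b in row:
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            out.append(1 if luma < threshold else 0)
        result.append(out)
    return result


# White paper, black ink and a dark red heading.
row = [(255, 255, 255), (0, 0, 0), (200, 30, 30)]
print(binarize([row]))  # the red heading ends up black, like the ink
```

A single fixed threshold fails on colored or uneven backgrounds, which is one reason an OCR package offers brightness and smoothing controls before recognition.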
34. de to change the window size.

To move a window, simply select it and drag it to another location.

To delete windows, select them and choose the "Cut" or "Clear" command from the "Edit" menu. The "Cut" command cuts the window(s) to an internal buffer; "Clear" erases the window(s) irretrievably. When you paste zones, they are inserted in their original position, and you have to drag them to their new location.

In fact, all familiar commands from the "Edit" menu apply to the windows: you can delete, cut, copy and paste them. The "Undo" command also applies: if you have unfortunately deleted, moved, resized etc. some windows, "Undo" will cancel the last operation.

[Screenshot: the "Edit" menu with the commands "Cut", "Copy", "Paste", "Clear" (Del), "Delete Small Windows" (Ctrl+M) and "Select All" (Ctrl+A)]

Also note that shortcuts are available for all commands. Let's give an example: to erase all existing windows, you can choose the command "Select All" or its shortcut Ctrl+A and click the command "Clear" or its shortcut Delete. You are now ready to recreate the necessary layout. To restore the previous layout, you can choose "Undo" or the shortcut Ctrl+Z.

THREE: SAVING WINDOWING TEMPLATES

The resulting windowing layouts can be saved as zoning templates for future use with the command "Save Layout" under the "File" menu and loaded into memory with the command "Load Layout".

[Screenshot: the "Save Layout" command]

If you have to recognize documents with a similar layo
35. [Screenshot: the Readiris window during interactive learning on a sample legal text]

If the interactive learning is enabled, you go through the recognition and learning phases page by page. The dictionary mode "New" is used for the first page and the mode "Append" for the successive pages.

When you click the "Finish" button, all decisions by the system thereafter are accepted without user validation. In other words, the interactive learning is aborted for all pages; the OCR for this document continues in automatic mode.

The recognition result of multipage documents is saved in a single output file. When the recognition result is sent to a target application, multiple pages get created inside a single document.

EDITING MULTIPAGE DOCUMENTS

The user can edit multipage documents, mainly to correct scanning errors: he can delete pages from the document and move pages to other locations in the document.

The navigation first. To go to a page, click on its icon in the page toolbar or hold your cursor over its thumbnail, invoke the "Context" menu by right-clicking and use the command "Select Page". To go to the previous page, you can use the shortcut PageUp; to go to the next page, press PageDn. Or use the corresponding commands under the "View" menu.

Prev
36. e Internet

[Screenshot: the "I.R.I.S. Applications" submenu with the entries "Readiris", "Uninstall Readiris" and "User's Manual"]

INSTALLED FILES

The installation program has created a folder where the Readiris files are located. Never try to uninstall Readiris or some of its modules by manually erasing the program files: use the Readiris "uninstall" program or the Windows (un)install wizard instead. (See above.)

Read Me files and documentation
README.DOC: "Read Me" file
MANUAL.PDF: User's manual (in Adobe Acrobat format)

Scanner drivers

Finally, you may find some scanner drivers on the Readiris CD-ROM under the folder "Drivers".

I.R.I.S. offers no guarantee that drivers are supplied for your scanner model or that the drivers supplied on the Readiris CD-ROM will work (well) with your scanner model.

Don't hesitate to contact your scanner manufacturer or its representative should problems with scanner drivers continue. Most manufacturers allow you to download the latest versions of the scanner drivers from their web site.

REGISTER TO VOTE

Don't forget to register your Readiris licence. Doing so will allow us to keep you informed of future product developments and related I.R.I.S. products. The registration benefits, including free product support and special offers, are strictly limited to registered users.

You can register in many ways: by sending in your registration card or faxing its electronic counterpart, by calling I.R.I.S. during working hours and by f
37. e by adding one room after the other.

(Creating polygonal table windows doesn't make any sense.)

[Screenshot: the Readiris window with a polygonal window drawn on the sample page english.jpg]

Furthermore, manual windowing can be combined with window sorting: you can draw new windows even when the "sort mode" is enabled. You then use sorting to include a number of detected windows and manually create some other windows where the page analysis didn't yield the appropriate results. As soon as you start creating windows in the "sort mode", all zones you didn't select are promptly erased.

To modify, move and delete windows, you need to select them first. To do so, select the "Window Selection" or "arrow" tool in the image toolbar and click inside a window. Rectangular markers now appear at each corner and in the middle of the window sides.

[Illustration: a selected window around the heading "A word about OCR"]

To unselect windows, click the mouse button elsewhere. To select additional windows, hold down the Shift key while clicking on these extra windows.

To select a window and the included windows (of another type), hold down the Ctrl key while clicking on the main window.

So much for selecting windows. To modify a window, select it, put your mouse cursor over a marker and drag the si
38. e consecutive OCR process. Select the command "Adjust Image" under the "Process" menu to do so.

[Screenshot: the "Adjust Image" command (Ctrl+J)]

When you access this command, the black-and-white version is displayed automatically. (It's as if you disabled the option "Display Document in Color".)

There are some complicated concepts here, and we need to discuss them in detail.

[Screenshot: the "Adjust Image" window with the option "Smoothen color image", the "Brightness" settings "Automatic" and "Manual", the "Despeckle" slider (off) and the buttons "Cancel", "Apply" and "Help"]

The option "Smoothen Color Image" renders greyscale and color images more homogeneous by "flattening", smoothing out relative differences in intensity. As a result, a stronger contrast is created between the foreground (the text) and the background (a color, artwork etc.).

This preprocessing feature may seem highly technical and difficult to understand, but it certainly has its role to play: with some scanner models, this reduction of the sharpness is needed to recognize color and greyscale images. Smoothening is sometimes the only way to separate text from the colored background. Below is a sample image that is simply illegible without image smoothing.

[Illustration: a sample color image with text printed over a colored background]
39. [Screenshot: the sample image in the Readiris window. Its text reads: ""Ungridded" tables don't have any borders around the cells. When the columns of ungridded tables are too widely spaced, the page analysis may not detect a table window to avoid confusion with columnized text blocks. When your tables exclusively contain numeric characters, enable the numeric reading mode with the "Language" button on the main toolbar for increased accuracy. Finally, you can send your tables of figures directly to Microsoft Excel by selecting the spreadsheet as target application; refer to the "Format" button on the main toolbar. 2002 Copyright Image Recognition Integrated Systems. Web site: http://www.irislink.com"]

For optimal OCR accuracy, you should limit recognition to the numeric symbols with the "Language" button. (The numeric mode is not strictly numeric: besides the symbols 0 to 9, it includes the comma, the dot and a number of other symbols.)

[Screenshot: the "Language" window with the option "Numeric" enabled]

As you can only do this when the table doesn't contain any alphabetic symbols (otherwise the text portions won't be recognized correctly), we can activate the numeric mode now but couldn't do it for the first table.

This time, we will send the OCR result directly to the spreadsheet Microsoft
40. end of the recognition, the target application is started up and the recognized document is opened inside a new text file or worksheet.

[Screenshot: the message "Please wait while loading Microsoft Word 97, 2000, 2002"]

Don't forget that the option "Send to" also allows you to copy the recognized text to the Windows clipboard, so there is no strict need to export the result or save it to an external file.

SAVING THE RESULTS IN A TEXT FILE

You can indeed write the OCR result to an "external" file. Here again, Readiris supports a wide range of file formats incorporating all popular wordprocessors, spreadsheets, web applications etc.: Microsoft Word (DOC), RTF ("Rich Text Format"), HTML etc.

[Screenshot: the "Output" window with the option "External file" selected and the list of file formats open, among them Adobe Acrobat PDF, Microsoft Excel, Microsoft Word, Microsoft Works, OpenOffice.org Writer and Rich Text Format]
41. ever your scanning mode may be, use a scanning resolution of 300 dpi for normal applications. Use a higher resolution of 400 dpi for small print (below 10 point) and when the document is very degraded.

Readiris reads point sizes of 6 to 72 point (0.08" to 1", or 0.21 to 2.54 cm).

[Illustration: sample characters at 6 point and at 72 point]

Readiris also recognizes "drop letters", large caps that cover several lines. (These can of course be no bigger than 72 point.)

[Illustration: a sample paragraph starting with a drop letter, which reads: "Readiris reads drop letters, also called "drop" caps, that cover several lines and assigns them to their starting line."]

As optimal OCR requires a resolution between 300 and 400 dpi, Readiris warns you when you're submitting images with a resolution lower than 200 dpi or higher than 800 dpi. However, Readiris can correct scans with too much detail for you. Enable the option "Optimize Resolution for OCR" in the scan settings to do so. Whenever the image resolution of your scans exceeds 600 dpi, the resolution is reduced for the OCR process.

[Screenshot: the option "Optimize resolution for OCR"]

There are other cases where you will want to avoid this warning: when you're reading faxes (which have a resolution of 100 or 200 dpi), when you're creating images with a digital camera (where the resolution is unknown) and when you're opening images where the file header contains an incorrect resolution. To process such images hassle free, enable the option "Force to 300 dpi". This setting applies to both direct scanning and t
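The figures in this excerpt can be checked with a little arithmetic: a typographic point is 1/72 inch, and an inch is 2.54 cm. The sketch below reproduces the conversions and restates the resolution rules as plain functions. The function names are made up for the example, and the returned messages are not Readiris messages.

```python
POINTS_PER_INCH = 72
CM_PER_INCH = 2.54


def point_size(points, dpi=300):
    """Return (height in inches, height in cm, height in pixels at dpi)."""
    inches = points / POINTS_PER_INCH
    return inches, inches * CM_PER_INCH, round(inches * dpi)


def recommended_dpi(smallest_point_size, degraded=False):
    """300 dpi for normal work; 400 dpi for print below 10 point or for
    very degraded documents, as the manual advises."""
    return 400 if smallest_point_size < 10 or degraded else 300


def resolution_check(dpi):
    """Mirror the warning limits quoted above (below 200 or above 800 dpi)."""
    if dpi < 200:
        return "warning: resolution too low"
    if dpi > 800:
        return "warning: resolution too high"
    return "ok"


for pts in (6, 72):
    inches, cm, pixels = point_size(pts)
    print(f"{pts} point = {inches:.2f} in = {cm:.2f} cm = {pixels} pixels at 300 dpi")

print(recommended_dpi(8), resolution_check(100), resolution_check(300))
```

At 300 dpi a 6-point character is only about 25 pixels tall, which helps explain why small print benefits from the higher 400 dpi setting.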
42. h a carriage return added at the end of each line.

(This option is not available when the PDF format is selected: Adobe Acrobat PDF files always store text line by line.)

The "Format" button contains some formatting options we haven't discussed yet; this will be done shortly.

SETTING UP YOUR SCANNER

Let's set our scanner up now. It is assumed that the scanner hardware and necessary drivers are installed correctly.

If your Readiris software licence was bundled with a scanner or digital camera, this step probably is unnecessary as your hardware may already be set up under Readiris.

Click the "Scanner" button on the main toolbar.

[Screenshot: the "Scanner" button]

Click the button "Scanner Model" to determine your scanner model.

[Screenshot: the "Scanner" window with the scanner type and page format, the buttons "Config." and "Scanner Model", the resolution (300), the brightness and color mode settings and the options "Invert", "Digital camera", "Force as 300 dpi", "Optimize resolution for OCR" and "Smoothen color images"]

When you select the option "<Image>" as "scanner", prescanned images function as image source at all times; you won't even have to select the disk as image source with the "Source" button on the main toolbar.

The "Config." button is only available when your scanner allows it. It gives access to some advanced scanning parameters: with Twain scanners, clicking the "Config." button allows you to select the Twain source.
43. h numerous advanced features. We will discuss all major features in this chapter and add many tips and hints concerning the use of Readiris.

STARTING THE SOFTWARE UP

Click on the Readiris application in the submenu "I.R.I.S. Applications - Readiris" or click on the shortcut to the Readiris application on your desktop.

[Screenshot: the "I.R.I.S. Applications" submenu with the entries "I.R.I.S. on the Internet", "Readiris", "Uninstall Readiris" and "User's Manual"]

The Readiris startup screen and application window are displayed. The startup screen displays the version and copyrights of the Readiris software. It also gives direct access to I.R.I.S.'s home page: simply click on the URL to visit the I.R.I.S. web site. Clicking the mouse anywhere else makes this screen disappear.

[Screenshot: the startup screen, which reads: "Readiris release 9.0. 2002, Image Recognition Integrated Systems SA. All rights reserved. For more info on new products and upgrades, visit our web site www.irislink.com"]

The next window concerns the OCR wizard: click "Cancel" for the time being.

THE FIRST-TIME STARTUP

Depending on the software bundle you acquired, the first startup may be special: you may be prompted to register your licence.

If this is the case, the use of Readiris is limited to 30 days, and by registering, you receive a free softkey from I.R.I.S. to continue using the software after the first month.

It take
44. he opening of prescanned images.

[Screenshot: the scanner options "Invert", "Digital camera", "Force as 300 dpi", "Smoothen color images" and "Load PDF documents in color"]

When your images are acquired by a digital camera instead of a scanner, it is mandatory that you enable a special option (that also applies to scans and prescanned images).

[Screenshot: the same options, with "Digital camera" enabled]

By doing this, you enhance the image before it gets recognized. There are specific challenges to be met when it comes to digital cameras: they produce low resolution images, even when you hold the camera very close over your document, and the image resolution is in any case unknown.

There are some "finer points" to be aware of when it comes to successfully recognizing images captured with a digital camera.

First of all, select the highest possible image resolution. Create for instance 2,048 x 1,536 size images when 1,024 x 768 and 640 x 480 images are also supported. Secondly, enable the "macro" mode of your camera to take closeups (which is always the case when you photograph documents). This mode was designed to capture flowers, insects etc. Otherwise, the images are unsharp and illegible.

Limit yourself to no or small compression: important compression reduces the sharpness of 
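A quick calculation shows why the excerpt calls camera images "low resolution" and why the highest image size matters. The sketch assumes that an A4 page (297 mm on its long side) exactly fills the long side of the photo; that framing, like the function name `effective_dpi`, is an assumption made for the example and is not stated in the manual.

```python
MM_PER_INCH = 25.4


def effective_dpi(pixels_long_side, page_long_side_mm=297.0):
    """Pixels per inch obtained when the long side of a page of the given
    length exactly fills the long side of the photo."""
    return pixels_long_side / (page_long_side_mm / MM_PER_INCH)


for width, height in ((2048, 1536), (1024, 768), (640, 480)):
    print(f"{width} x {height}: about {effective_dpi(width):.0f} dpi on an A4 page")
```

Under this assumption even the largest of the three image sizes stays below 200 dpi, well short of the 300 dpi recommended for scanning, and the smaller sizes fall far lower.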
45. hen the context does not suffice, the user inter

READIRIS CHANGES LANGUAGES AS NEEDED

But the buck doesn't stop here: Readiris can switch languages in the middle of a sentence without any help from the user. When Western words pop up in Greek, Cyrillic or Asian documents (many untranscribable proper names, brand names etc. are written using the familiar Western symbols), Readiris can switch to the correct alphabet automatically. In other words, you can activate a mixed alphabet of Greek, Cyrillic or Asian and Western characters.

Be sure to select "Greek-English" or the appropriate Cyrillic language setting (for instance "Byelorussian-English"). In other words, don't try to just select "Greek" or "Byelorussian" as document language and hope that the Western symbols will come out fine.

Here's an example where a Russian text contains some English words. (Open the image file ALPHABET.TIF if you want to try it for yourself.)

(Screenshot: the Readiris window with the document language "Russian - English" selected and the image alphabet.tif loaded.)
46. (Screenshot: the Readiris main window showing a sample page.)

Page decomposition uses three window types: text, graphic and table windows. Readiris discriminates text blocks, tables and graphic zones containing photos, illustrations etc. on the page. (Saving graphics and recognizing tables will be discussed at great length below.)

A color code indicates the window type: text zones have a yellow border, graphic windows have a blue border and tables a purple border.

The number of windows is indicated at all times in the tooltips of the "Text Window", "Graphic Window" and "Table Window" tools.

Page analysis is fast, skew tolerant and highly accurate: it traces complex "irregular" shapes
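The color code and the per-type window counts described above can be summarized in a few lines. This is a minimal sketch in Python for illustration only: `BORDER_COLORS` and `count_windows` are hypothetical names and do not correspond to anything in the Readiris software.

```python
from collections import Counter

# Border color per window type, as given in the manual.
BORDER_COLORS = {"text": "yellow", "graphic": "blue", "table": "purple"}


def count_windows(window_types):
    """Count the windows per type, the way the three tool tooltips report them.

    Types that are not one of the three known window types are ignored.
    """
    counts = Counter(window_types)
    return {kind: counts.get(kind, 0) for kind in BORDER_COLORS}
```

For a page with two text zones and one table, `count_windows(["text", "text", "table"])` reports two text windows, no graphic window and one table window.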
47. ied, you get a continuous, running text. All formatting, if any, is done afterwards by the user.

If you retain the word and paragraph formatting, the font type, size and type style are maintained across the recognition. The justification of the paragraphs is also detected. However, no graphics are captured and the columns aren't recreated: the paragraphs just follow each other etc.

To see the effect correctly, you need to enable the "WYSIWYG" mode of your wordprocessor, mostly called "page layout" mode. However, if you send the recognized document directly to Microsoft Word, the page or print layout view is activated automatically.

(Screenshot: the Microsoft Word "View" menu with the Normal, Web Layout, Print Layout and Outline views.)

In short, Readiris not only recognizes your texts, but can format them for you as well. OCR isn't just text recognition anymore, it is becoming more and more page or document recognition as well.

COLUMNS PLEASE, NOT FRAMES!

The formatting option "Use Columns instead of Frames" determines how the "autoformatting" gets done: the text blocks, tables and graphics can either be stored in frames or in editable columns.

"Frames" are separate containers for text used to po
48. illing out a registration form on the I.R.I.S. web site.

(Screenshot: Readiris help, topic "Register Your Readiris Licence". Why you should register: registering allows us to keep you informed of future product developments and related I.R.I.S. products. Registering entitles you to free product support and special offers. Depending on the software bundle, you'll receive the softkey in return as may be needed to continue using Readiris after one month. How to: start the registration wizard, send in your registration card by mail, or access the Readiris registration form on the I.R.I.S. web site.)

The Readiris registration wizard as you'll find under the menu "Register" of the Readiris software can guide you through the registration process comfortably.

(Screenshot: Readiris registration wizard.) Welcome to the Readiris registration wizard. It allows you to register your Readiris software license. Registering allows us to keep you informed of future product developments and related I.R.I.S. products. Registering entitles you to free product supp
49. ines are used to recognize them.

"Letter quality" dot matrix printing (also called "25 pin" or "NLQ" dot matrix) requires the "normal" setting, as do the printing qualities typeset, typewritten, laser printed and inkjet printed.

The setting "Automatic" means that Readiris will detect the font mode automatically. Let Readiris "auto detect" the font mode in all cases (unless you are sure only dot matrix documents are being read). Obviously, "Automatic" is the default value.

The font type is indicated in the tooltip of the "Recognize" button: when no message is added to the tooltip, the "auto detection" of the printing quality applies; when the message "Dot Matrix" shows up in the tooltip, the dot matrix reading mode is enabled.

The character pitch can be set with the command "Character Pitch" under the "Settings" menu.

With fixed or "monospaced" fonts, all symbols of the font have the same width. An "i" takes up as much horizontal space on a line as a "w", as is the case in this sentence. Think of documents produced using a typewriter, where the carriage moves a fixed distance for each typed symbol.

A proportional pitch means that the width of a character depends on its shape. Symbols like "m" and "w" are wider (take more h
50. ious Page (PageUp), Next Page (PageDown)

Let's edit the document now. To delete a page from the document, hold your cursor over its thumbnail, right-click it and use the command "Delete Page". To move a page up in the document, use the command "Move Page Up", and to move a page down, use the command "Move Page Down".

STARTING A NEW DOCUMENT

You can use the command "New Document" under the "File" menu to close the current document.

This command "cleans the slate". Any document loaded into memory, containing a single page or multiple pages, is erased. You are now ready to create a new document.

But you can also create a new document from within the current document. As long as the OCR was not executed, the system assumes that you want to add pages to the current document. You can for instance scan all the pages in the scanner's autofeeder, fill the feeder again and start over. All pages scanned will compose a single document. Or you could scan a number of pages and add some image files, say, faxes. These pages again form a single document: all you have to do is change the image source in between with the "Source" button.

When the OCR was already executed and you re-initiate the scanning (or the loading of images), you are prompted to start a new document or complete the current document.

(Dialog "Readiris":) Are you ready to dele
51. (Screenshot: the sample document about OCR, displayed in the Readiris window.)

Readiris icon and they are promptly opened.

You can even open images from within the Windows Explorer: right-click an image file and select the command "Recognize" from the "Context" menu. (This command only appears when the file's file type is supported.)

(Screenshot: the Windows Explorer context menu for an image file.)
52. itable; "image only" PDF files are not.

Finally, converting PDF files is a way of "unlocking" PDF content. You can recognize "read only" PDF documents, where the text is normally inaccessible. With unprotected PDF files, the content can be retrieved, copied and saved to an RTF file; with "read only" files, the content cannot be extracted. These documents can only be viewed and printed.

An important nuance: Readiris does not open password-protected PDF documents, even if all other PDF security barriers are broken down by Readiris.

Proceed as usual: load PDF files into memory as you open prescanned images, faxes, snapshots made with your digital camera etc. Still, there's a specific option that concerns PDF files. You can open them as color and as black-and-white documents. This option is offered because rasterizing color documents is much slower.

(Dialog options: Digital camera, Force as 300 dpi, Smoothen color images, Load PDF documents in color)

RECOGNIZING MULTIPLE PAGES

After the OCR, the scanned image is redisplayed with the zoning as created, to be available for further processing.

You can now open the recognized text with your wordprocessor or text editor, import it into your desktop publishing software or any other text-based application. Go ahead and compare it with the image you have
53. l panel.

(Screenshot: the Windows "Add or Remove Programs" window with Readiris selected and the "Change/Remove" button.)

- Follow the on-screen instructions to remove the Readiris software.

INSTALLING SOFTWARE OPTIONS

There's a single software option available for the Readiris software: the "Asian OCR add-on". It allows you to read Japanese, Traditional Chinese, Simplified Chinese and Korean. This software is again delivered on an autorunning CD-ROM.

(Screenshot: Readiris help, topic "Software Option: Asian OCR Add-on", section "Reading Asian documents".) The software option "Asian OCR Add-On" offers recognition of the Asian language
54. ll use. Select a format that's supported by your paint or photo retouching software. The JPEG, TIFF and Paintbrush (PCX) formats are supported. Enable the option "Greyscale/Color" to save the graphic as a color or greyscale graphic.

(Dialog "Save Graphics": file types TIFF (*.tif), Paintbrush (*.pcx) and JPEG (*.jpg).)

READING FAXES AND DEFERRED RECOGNITION

Saving images as image files opens another possibility: you can save the full page and perform deferred OCR on it later on. (That's what we did with the prescanned images of our tutorials.)

Simply scan the document. Select the command "Save Full Page as Image" under the "File" menu to save a single page. You'll again be prompted to save the entire page as TIFF or Paintbrush (PCX) file.

Select the command "Save All Pages as Image" to save a multipage document. A single file format is available here: multipage TIFF.

You can now select the disk as image source and open the image file with the "Open" button or with the corresponding command under the "Process" menu. (If you use the "Open" command under the "File" menu, you don't even have to update the image source.)

As color, greyscale and black-and-white images are supported on an equal basis, Readiris opens Adobe Acrobat PDF documents, JPEG images, Pain
55. n the dictionary is full: the results of the learning are no longer held in memory or written to a dictionary.

You can set the dictionary mode inside the command "Font Dictionary" or directly under the "Learn" menu. Three dictionary modes are available: new, append and read.

(Menu "Learn": New Font Dictionary, Append Font Dictionary, Read Font Dictionary, Interactive Learning)

By selecting "New Font Dictionary", you indicate that the training results will be saved in a new dictionary. (If you select an existing dictionary, its contents will be erased.)

The append mode indicates that the training results will be saved in an existing dictionary: the recognition makes use of the extra intelligence already contained in the dictionary, and you add new font shapes to it. In simple terms, this option allows you to build up a font dictionary in several steps.

(When you enter a filename for a new dictionary and activate the "append" mode, an empty font dictionary is created and you complete it.)

With the last option, "Read Font Dictionary", the dictionary functions in read-only mode: you make use of the dictionary without adding new font shapes to it.

Select the new mode when a single page is recognized. To recognize many pages of the same type (pages with the same fonts and printing quality), select the new mode for the first page, the append mode for a few pages more and the read mode for the rest of the document(s).
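The three dictionary modes can be pictured with a small model. This is only a sketch in Python, not Readiris code: the names `Mode`, `open_dictionary`, `learn` and `mode_for_page`, the in-memory `dict` that stands in for a dictionary file, and the `warmup_pages` parameter are all assumptions made for illustration. The 500-shape limit is the one stated in this manual.

```python
from enum import Enum

MAX_SHAPES = 500  # limit stated in the manual for one font dictionary


class Mode(Enum):
    NEW = "new"
    APPEND = "append"
    READ = "read"


def open_dictionary(existing, mode):
    """Return the shapes available to the recognition after opening.

    `existing` is the stored content (a dict), or None when no file exists yet.
    """
    if mode is Mode.NEW:
        return {}                    # contents of an existing dictionary are erased
    return dict(existing or {})      # append and read start from what is stored


def learn(shapes, mode, shape, character):
    """Try to store one trained shape; return True when it was stored."""
    if mode is Mode.READ:
        return False                 # read-only: nothing is added
    if shape not in shapes and len(shapes) >= MAX_SHAPES:
        return False                 # dictionary full: training has no effect
    shapes[shape] = character
    return True


def mode_for_page(page_index, warmup_pages=3):
    """Suggested mode per page: new, then append for a few pages, then read."""
    if page_index == 0:
        return Mode.NEW
    if page_index <= warmup_pages:
        return Mode.APPEND
    return Mode.READ
```

The value of `warmup_pages` is arbitrary here; the manual only says "a few pages more", so how long to stay in append mode before switching to read mode remains a judgment call for each batch of documents.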
56. nd the text may run from top to bottom, from right to left. And if you forgot to select the proper language, select it afterwards: Readiris re-executes the page analysis automatically.

Some documents have many "stray" dots on the page, may generate a black page border around the actual image etc. To erase all small windows (it's assumed they don't contain any text) and re-sort the remaining zones, you can click the command "Delete Small Windows" under the "Edit" menu.

TWO: WINDOWING A SCANNED IMAGE MANUALLY

Page analysis is the automatic way of windowing a scanned page. Alternatively, you can zone an image manually with the windowing tools of Readiris.

To draw a rectangle around a zone of interest, select the corresponding tool in the image toolbar, click the cursor in the upper left corner of the window, stretch the window by moving the mouse to the lower right corner and click again. (Sides smaller than 1 mm are not allowed: they wouldn't even contain a single character anyway.)

The windows are automatically sorted in the order of creation; arrows indicate the sort order.

You can also frame "irregular" text blocks by drawing polygonal windows around them. Non-rectangular windows are created by merging rectangular zones: as soon as two rectangles (of the same type) intersect, they become a single window automatically. In a way, you're building a hous
57. nectionist, AutoFormat and Linguistic technology by I.R.I.S. I.R.I.S. owns the copyrights to the Readiris software, the OCR technology, the linguistic technology, the on-line help system and this manual.

AutoFormat, Cardiris, Connectionist, I.R.I.S. Linguistic Technology, the I.R.I.S. logo and Readiris are trademarks of I.R.I.S.

Acrobat, Reader and the PDF format are (registered) trademarks of Adobe. AsianBridge is a trademark of TwinBridge. AsianSuite is a trademark of Union Way. Excel, Windows and Word are registered trademarks of Microsoft. Intel is a registered trademark of Intel. WordPerfect is a registered trademark of Corel.

Chapter 1
INSTALLATION

This chapter discusses the system requirements and installation of the Readiris software.

SYSTEM REQUIREMENTS

This is the minimal system configuration required to use Readiris:

- a 486-based Intel PC or compatible. A Pentium-based PC is recommended.
- 32 MB RAM. 64 MB RAM is recommended to process greyscale and color images.
- 110 MB free disk space. 95 MB of disk space suffices when you leave the sample files on the CD-ROM.
- the Windows XP, Windows ME, Windows 2000, Windows 98, Windows NT 4.0 or Windows 95 operating system.

Note that some scanner drivers may not work under the latest Windows version(s). Refer to the documentation supplied with your scanner to see which platforms are supported.

INSTALLING THE READIRIS SOFTWARE
58. (Dialog options: Force as 300 dpi, Smoothen color images)

Place the pages of your document in the automatic document feeder and start the scanning: all pages are scanned until the document feeder is empty.

You can also open multiple prescanned images. To load several images, select the first image and hold down the Ctrl key as you select additional images. To load a continuous range of images, select the first image and hold down the Shift key as you select the last image.

(Dialog "Open": several sample image files selected, with the options Digital camera, Force as 300 dpi, Smoothen color images and Load PDF documents in color.)

The same effect can be obtained comfortably from within the Windows Explorer: select several image files, right-click and select the command "Recognize" from the "Context" menu. You can repeat this operation: all images you send to Readiris are appended to the current document until you click the command "New Document".
59. ng (font type (serif / sans serif, proportional / fixed, normal / condensed), point size and typestyle (bold, italic and underlined)) is retained across the recognition, and so is the paragraph formatting: the tabs and the alignment (left, centered, right and justified).

Don't confuse this formatting option with "full" autoformatting: this option just puts one paragraph after the other; it does not recreate columns or copy the relative position of the various zones.

SAVING GRAPHICS SEPARATELY

In our example, the graphic was included in the recognized text; whether this is the case depends on the formatting option "Include Graphics". Whether it is possible to save graphics inside the text again depends on the output mode. "Poor" text formats such as Text (ANSI) etc. don't store graphics.

(Dialog "Options": Merge lines into paragraphs, Include graphics)

Still, with Readiris, you can save graphics without performing text recognition. As Readiris generates black-and-white, greyscale and color images, you can capture lineart graphics and photos.

How? Draw a graphic zone around the illustrations, cartoons etc. you need. Creating graphic windows manually is done in the same way as drawing text and table windows: simply select the "Graphic Window" tool now.

Next, choose the command "Save Graphics" under the "File" menu.

You are prompted to specify a filename. Determine which graphic file format you wi
60. [illegible entry]
Table of Contents
[illegible entry]

Chapter 1: Installation
  System Requirements
  Installing the Readiris Software
  Uninstalling the Readiris Software
    Readiris "uninstall" program
    Windows (un)install wizard
  Installing Software Options
  Installing Related [illegible]
  [illegible entry]
  Read Me files and [illegible]
  [illegible entry]
  [illegible entry]
  Getting Product Support
  Getting in Touch with I.R.I.S.

Chapter 2: Guided Tour
  Starting the Software Up
  The First-Time Startup
  Discovering the Readiris Interface
  Getting Started with a First Tutorial
  [illegible entry]
  One: Decomposing a Scanned Image
  One and a Half: Sorting Windows
  Two: Windowing a Scanned Image Manually
  Three: Saving Windowing Templates
  Readiris Takes You around The World
61. of Readiris are stored in the settings files.

SAVING SPECIFIC SETTINGS

The default settings will obviously be used at each program startup, but you can save specific settings as well to avoid having to redefine the operational parameters. The commands "Save Settings" and "Load Settings" under the "File" menu take care of this.

Let's give an example: if you regularly have to OCR English documents with a specific layout, you are advised to create a settings file for this type of document. You would then select "English" as the document language, load a specific zoning template to avoid having to reapply the same windowing each time, disable learning but activate a font dictionary in the "read" mode because the same typefaces are used systematically etc.

If you are unsure what the current settings are, you don't have to "plunge" into every menu and command to discover what they are. You can use the command "Info" from the "File" menu to get an overview.

(Dialog "Information on settings": an overview of the scanner, format, page and document settings.)

SCANNING DOCUMENTS

Now that our scanner is set up, we want to get started scanning documents. There are some elements you
62. on English. To go to another letter, say T, press BackSpace before you enter the "T" character.

Readiris is far from limited to English: up to 104 languages are supported. All American and European languages are supported, including the Central European languages, Greek, Turkish, the Cyrillic ("Russian") and the Baltic languages.

Optionally, you can read Asian documents: the extra module "Asian OCR add-on" offers recognition of Japanese, Simplified Chinese, Traditional Chinese and Korean. (Simplified Chinese is used on China's mainland and in Singapore, whereas Traditional Chinese is used by Hong Kong, Taiwan, Macau and the overseas Chinese communities.)

Also note that the British and American (or should we say "international") variants of the English language are distinguished.

It takes the appropriate Windows configuration to display Central European, Greek, Turkish, Cyrillic and Baltic characters. You may have to install the Windows multilanguage support before your Windows system is able to cope with these languages.

On a Windows XP, 2000 and Windows NT 4.0 operating system, select the icon "Regional Settings" (and Languages) under the "Control Panel".

(Dialog "Regional and Language Options", tab "Languages". Text services and input languages: to view or change the languages and methods you can use to enter text, click Details.) Supplemental language sup
63. on is highly accurate and does not require detailed proofreading.

Don't confuse "Finish" with the "Abort" button: with "Abort", no output is generated and you start all over; with "Finish", the text is created, it just isn't proofread in detail.

THE ROLE OF FONT DICTIONARIES

The results of each training session are temporarily held in the computer's memory but can and should be stored in files called "dictionaries" for future use.

These font dictionaries should be loaded into memory when you want to recognize similar documents in order to make use of the extra intelligence they contain. In this way, Readiris takes into account the intelligence stored in these font libraries. You could say that Readiris gets more intelligent each time you use it.

How does this work? The operation of font dictionaries is controlled by the "Learn" menu: you have to select a dictionary with the command "Font Dictionary" and determine its mode of operation.

(Dialog "Dictionary": file selection with the modes New Dictionary, Append Dictionary and Read Dictionary.)

Font dictionaries are limited to 500 shapes, and you are advised to create separate dictionaries for specific applications, for instance per type of document. Dictionaries have the default extension .DUS. Training no longer has effect whe
64. orizontal space on a line) than the "thin" characters "I" or "4". Virtually all books, magazines and newspapers are printed in proportional pitch.

The simplest solution is to leave this option at all times on the default value "Automatic", which means that Readiris will detect the character pitch automatically.

READIRIS GETS MORE INTELLIGENT EACH TIME!

When the document language is selected and document characteristics are set, enable the interactive learning and click the "Recognize" button.

The OCR progress is indicated on screen. You can click the "Stop" button to abort the text recognition.

At the end of the recognition, Readiris enters the interactive learning phase (when the learning is enabled with the "Learn" button on the main toolbar).

(Interactive learning does not apply to Asian documents: learning does not make sense for these languages which use thousands of different symbols, and you'd have to be able to enter the ideograms, not an easy task when using a Western keyboard.)

Font training can substantially enhance the accuracy of the recognition system. When the user tries to read distorted, defaced forms as are found in real documents or stylized font shapes which Readiris does not recognize optimally, training can overcome this temporary "failure".

User learning is also used to train the system on special symbols which
65. ort and special offers.

Depending on the software version you acquired, you'll receive the softkey in return as may be needed to continue using the Readiris software after one month.

GETTING PRODUCT SUPPORT

The command "Product Support" under the "Help" menu of Readiris details how you can get technical support. Please describe the phenomenon you experience clearly and include all relevant data concerning Readiris, your scanner and your computer system.

(Screenshot: Readiris help, topic "How to Get Product Support". Free technical support is offered to all registered customers. Registering also entitles you to special offers.
Europe: hotline 32 10 45 13 64 (working hours, all major languages), fax 32 10 45 34 43.
USA: hotline 1 561 395 7851 / 800 477 4744 (working hours), fax 1 502 507 3418.
WWW: troubleshooting info on the support page of www.irislink.com.
E-mail: support@irislink.com)

Chapter 2
GUIDED TOUR

Readiris is a state-of-the-art OCR package equipped wit
66. (Screenshot: the zoom commands of the "View" menu, including Fit to Window, Fit to Width, 50% of Actual Size, Actual Size and 200% of Actual Size.)

Finally, you can double-click the right mouse button over a region of the scanned image to zoom in at real size immediately. Repeat the operation to zoom out again.

ONE: DECOMPOSING A SCANNED IMAGE

Now that the image is scanned, you have to indicate which parts you want to convert into editable text by drawing frames, so-called "windows", around the zones of interest.

Actually, Readiris will do this for you automatically when the option "Page Analysis" is enabled on the main toolbar.

Automatic page decomposition is particularly useful when columnized texts and documents with a complex page layout, possibly including graphics and tables, are recognized.

(Screenshot: the Readiris main window with the sample image english.jpg, page 1 of 1.)
67. port

[Windows dialog] Most languages are installed by default. To install additional languages, select the appropriate check box below: "Install files for complex script and right-to-left languages (including Thai)" or "Install files for East Asian languages".

On Windows ME and 98, select the icon "Add/Remove Programs" under the "Control Panel" to find out if the module "Multilanguage Support" is installed on your PC.

[Screenshot: "Add/Remove Programs Properties" dialog, "Windows Setup" tab, with the component "Multilanguage Support" checked]

To view and edit Asian documents, you can install an Asian version of the Windows operating system or run specialized "emulating" software (such as UnionWay AsianSuite or TwinBridge AsianBridge) on a Western version of Windows to correctly represent the ideograms of these Asian languages. Finally, you can use Word 2002 or 2000 to
68. [Screenshot: Readiris main window showing the sample image, with the "Page Analysis" button on the main toolbar]

Page analysis is enabled by default. To force Readiris to decompose the current page (because you disabled page analysis by accident, because you erased some windows erroneously and want to redo the page analysis etc.), you can simply click the button "Analyze Page" in the image toolbar.

Select the document language before executing the page analysis when you are dealing with Asian documents. Specific routines are used for these languages: the interline spacing of Asian documents is in most cases bigger than in Western documents, the text is made up of small icons ("ideograms") that could easily be seen as graphic zones in Western documents a
69. recognized text or not is up to the user. You can perform OCR because you just need the text, in which case you will edit and format it yourself, and you can recreate the source document, including its formatting.

The various levels of formatting are creating body text, retaining the word and paragraph formatting, and creating a facsimile copy.

Creating body text means no formatting is applied: you get a continuous, running text. All formatting, if any, is done afterwards by the user.

If you retain the word and paragraph formatting, the font type, size and typestyle are maintained across the recognition. The justification of the paragraphs is also detected. However, no graphics are captured and the columns aren't recreated (the paragraphs just follow each other etc.).

Autoformatting recreates a facsimile copy of the original document: the text blocks, graphics and tables are recreated in the same place and the word and paragraph formatting are maintained across the recognition.

As a result, you get a true copy of your source document, be it a compact and editable text file, no longer a scanned image of your document.

The format "PDF Image Text" yields different results: Readiris creates a searchable PDF file that contains the recognized text and the page image. The page image is containe
70. robat PDF. Readiris allows you to create PDF documents of two types: PDF Text and PDF Image Text.

[Screenshot: the "Output" options, with "Adobe Acrobat PDF Image Text (*.pdf)" and "Adobe Acrobat PDF Text (*.pdf)" among the formats listed for "Send to" and "External file"]

What's the difference between the two? When you select the format "PDF Text", Readiris creates a PDF file that contains the text result. (Graphics may occur, but only when graphic zones occur on the page: photographs, artwork etc.) In other words, the page image is not contained in the single-layered PDF file.

[Screenshot: Adobe Acrobat showing the file autoformat.pdf]

Autoformatting

The aim of "autoformatting" is to recreate a facsimile copy of the original document.

The OCR process does more than just recognize your text, it can format it for you too.

In a way, text recognition is becoming more and more page recognition or document recognition.

Whether your OCR software reformats the 
71. s: Japanese, Simplified Chinese, Traditional Chinese and Korean.

Tip: a large number of Asian languages such as Malay, Tagalog etc. are supported by the "standard" Readiris software because they use the Latin alphabet.

What it takes (the working environment): To view and edit Asian documents, you can use Word 2002 (Office XP) or Word 2000 (Office 2000) or install a localized Asian version of the Windows operating system.

Alternatively, you can run specialized "emulating" or overlay software such as UnionWay AsianSuite or TwinBridge AsianBridge on a Western version of Windows to correctly represent the ideograms of these languages.

By installing this option, specific documentation becomes available that discusses how you can recognize Asian documents.

[Screenshot: Windows Start menu, I.R.I.S. Applications folder, with the entries Cardiris, IRISPen, Readiris, "I.R.I.S. on the Internet", "Reading Asian documents", "Uninstall Readiris" and "User's Manual"]

INSTALLING RELATED PRODUCTS

Depending on the software bundle you acquired, Readiris may be supplied with an evaluation version of the related product Cardiris, a business card organizer.

If this free software package is included on your Readiris CD-ROM, it is also installed using the autorunning CD-ROM and following the on-screen instructions.

Contact I.R.I.S. to learn more about complementary software: the command "Contact I
72. s, Readiris allows you to archive a true copy of your documents, be it an editable and compact text file instead of a scanned image.

All this implies that the sorting of windows only partially applies when "autoformatting" is used: you can include and exclude zones, but any re-ordering of zones is simply ignored.

Here's an example of how it works. To get acquainted with this feature, open the image AUTOFORM.JPG which is found in your Readiris folder.

[Screenshot: Readiris main window showing the sample image autoform.jpg]
73. s, Readiris reorganizes them in real cells and recreates the cell borders of the original tables.

In other words, Readiris allows you to archive a true copy of your documents, be it editable and compact text files instead of scanned images. Various levels of formatting are available; the choice is up to the user.

You can even recognize business cards with Readiris: scan your business cards, recognize them and convert them into an address database. Think of your last exhibition when you came back with an entire stack of business cards and it took your secretary two days to encode them.

The card's data is extracted automatically from the image and the recognized data is assigned to specific database fields. Readiris extensively uses a knowledge database, thus acquiring the necessary intelligence to discriminate the first and last name, a city and its state, a telephone and a fax number etc. The resulting data can be sent directly to your contact management software such as Microsoft Outlook (Express) or any vCard compliant application.

Readiris supports a wide range of popular scanners: numerous flatbed scanners, sheetfed scanners, "all-in-one" devices or "MFPs" ("multifunctional peripherals") and digital cameras can be used. Readiris also supports the Twain scanning standard and some scanning platforms.

TABLE OF CONTENTS

Save Time, No More Retyping
74. s your identification number to generate the softkey, be sure that this number is available or mentioned when you register your licence.

[Screenshot: Readiris dialog box. "The identification number on this machine is: ... To enable this software, you need a key. Please contact I.R.I.S. to obtain this key. Enter your key number." with the option "I don't have this key"]

DISCOVERING THE READIRIS INTERFACE

The Readiris application window not only contains command menus but also two button bars that give quick access to all frequent commands. Initially, some command menus are dimmed: they concern the preview. As long as no image is opened, they are unavailable.

[Screenshot: Readiris application window with the main toolbar buttons Sort, Recognize, Language (English), Page Analysis, Learn, Format and Scanner]

The same goes for the image toolbar on the right side of the application window: it contains all commands you need during the image preview. The main toolbar on the left gives quick access to all frequent general commands.

To learn which command corresponds to a certain button, hold your mouse pointer over it for a while: a tooltip will tell you what the button does.

[Screenshot: tooltip on the "OCR Wizard" button]

The window pane or image zone is where the scanned images are displayed.

You can drop image files onto the image zone
75. [Screenshot: Readiris main window showing a Dutch sample document]

READIRIS TAKES YOU AROUND THE WORLD

Assuming that the windows are correctly defined, you are now almost ready to execute the character recognition. We say "almost" because we haven't verified the language and document settings yet.

The language setting can be found on the main toolbar.

Click the "Language" button to modify the document language.

[Screenshot: "Language" dialog box listing, among others, Numeric, English, Chinese (Simplified), Chinese (Traditional), Corsican, Croatian, Haitian Creole and Hani]

You can press a letter key to move to it directly: if English is currently selected, and you want to select Occitan, you can press the "O" key on your keyboard to go directly to the Occitan language. When several languages have the same initial, press the letter several times to go through the options. Let's give an example: Readiris reads English and Estonian. By pressing "E" once, you select English, by pressing "E" a second time, you select Estonian, and by pressing "E" a third time, you're back 
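The letter-key navigation described above can be modelled in a few lines. This is only a sketch under stated assumptions, not Readiris code: the function name `next_language`, the sample language list, and the wrap-around behaviour when the current language already matches the key are all hypothetical illustrations of the text.

```python
def next_language(languages, current, key):
    """Return the language selected after pressing `key` (hypothetical model)."""
    # All languages whose name starts with the pressed letter, in list order.
    matches = [name for name in languages if name.upper().startswith(key.upper())]
    if not matches:
        return current  # no language starts with this letter: keep the selection
    if current in matches:
        # Already on a matching language: advance to the next one, wrapping around.
        return matches[(matches.index(current) + 1) % len(matches)]
    return matches[0]


# Assumed sample list; the real dialog box offers many more languages.
LANGUAGES = ["Dutch", "English", "Estonian", "Occitan"]

selection = "Dutch"
for _ in range(3):
    selection = next_language(LANGUAGES, selection, "E")
    print(selection)
```

Starting from a language that does not begin with "E", three presses of "E" give English, then Estonian, then English again, which matches the example in the text.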
76. should be aware of.

First of all, pay some attention to lineskew. Although the page analysis and recognition are skew tolerant, it may become difficult to window and OCR a page correctly when the skew is too significant. Limited lineskew (less than 0.5°) can be ignored because the OCR accuracy does not suffer.

The option "Page Deskewing" under the "Settings" menu determines whether pages which were scanned at an angle will be deskewed ("straightened") automatically; limited lineskew gets ignored. This option is disabled by default.

If you forgot to enable this option, use the "Deskew Page" button on the image toolbar (and the command "Deskew Page" under the "Process" menu) to "straighten" pages which were scanned at an angle.

The deskewing takes a few seconds: the image is analyzed to detect the skew angle (if any), the color or greyscale image and its black-and-white version are deskewed and the page analysis gets re-executed.

[Screenshot: progress messages "Detecting lineskew" and "Deskewing"]

You may also need to adjust the page orientation. Use the rotation tools on the image toolbar. (Corresponding commands are found under the "View" menu.) Three rotation directions are available: to the left, to the right and upside down. Rotation also takes a few seconds as the image itself is updated, not just the display on screen.

[Screenshot: rotation buttons "Rotate Right" and "Turn Upside Down"]

However, Readiris can correct badly oriented
77. sition several blocks of text, graphics and tables on a page. With columns, the text flows naturally from one column to the next, and columnized texts are much easier to edit.

We now assume that real columns do occur on the scanned document: when the system is unable to detect columns in the source document, this formatting mode uses frames anyway as a "fallback" position.

You can make good use of the image COLUMNS.TIF in the Readiris folder if you want to try it.

[Screenshot: the recognized document "columns" in Microsoft Word, with the Word dialog box for column settings (number of columns, width and spacing) open]
78. t methods when it comes to saving the OCR result: sending the recognized document directly to a target application, saving the result in an external file and copying the result to the Windows clipboard.

The output target is selected using the "Format" button on the main toolbar (or the command "Text Format" under the "Settings" menu).

[Screenshot: "Text Format" dialog box. Output: "Send to" (Microsoft Word 97 / Word 2000 / Word 2002), "External file", "Open after saving". Layout: "Create body text", "Retain word and paragraph formatting", "Recreate source document", "Use columns instead of frames". Options: "Merge lines into paragraphs", "Include graphics"]

The "Send to" feature offers a direct OCR link between your scanner and your Windows applications: you send the scanned documents directly to your wordprocessor, spreadsheet or web browser, to Adobe Acrobat (Reader) etc.

[Screenshot: the "Send to" list, including AbiSource AbiWord, Adobe Acrobat (Reader) Image Text, Adobe Acrobat (Reader) Text, Clipboard, Corel WordPerfect, Microsoft Excel, Microsoft Internet Explorer, Microsoft Word, Netscape, OpenOffice.org Writer 1.0, Software602 602Pro PC Suite, Sun StarOffice 6.0, Web browser and WordPad]

At the 
79. tbrush (PCX) images, DCX fax images (a multipage version of the Paintbrush format), PNG images, TIFF images (uncompressed, LZW, PackBits, Group 3 and Group 4 compressed), multipage TIFF images and Windows bitmaps (BMP).

This capability is particularly useful to convert your faxes into editable text files. Readiris uses extra intelligence when it comes to reading faxes: the software detects the typical fax resolutions (100 x 200 dpi ("normal quality"), 200 x 200 dpi ("fine quality") and 200 x 400 dpi ("superfine quality")) and "preprocesses" these images automatically to ensure optimal OCR results.

Nevertheless, it's still a good idea to ask your correspondents to send faxes with the "fine" quality: those faxes will yield better OCR results.

Don't forget that you can right-click on images in the Windows Explorer and select the command "Recognize" from the "Context" menu to open images. Alternatively, you can use "drag and drop": drop image files from the Windows Explorer onto the image zone or icon of Readiris and they are promptly opened.

RECOGNIZING TABLES

So far, we've recognized texts and faxes and we've saved graphics. Let's process a table now. Take a table of figures and scan it, or open the sample image TABLES.JPG in your Readiris folder.

Actually, the image TABLES.JPG contains two tables, and that's no coincidence. The page analysis zones them as table windows, and Readiris will recon
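As a small illustration of the fax resolutions named above, the following sketch maps a resolution pair to its quality label. It is an assumption-laden example, not how Readiris detects faxes: the function name `fax_quality` and the lookup table are hypothetical, and the dpi pairs are copied as they appear in the text.

```python
# Resolution pairs (in dpi) exactly as listed in the manual text.
FAX_QUALITIES = {
    (100, 200): "normal",
    (200, 200): "fine",
    (200, 400): "superfine",
}


def fax_quality(x_dpi, y_dpi):
    """Return the fax quality label for a resolution, or None if it is not a typical fax resolution."""
    return FAX_QUALITIES.get((x_dpi, y_dpi))


print(fax_quality(200, 200))
print(fax_quality(300, 300))
```

An image whose resolution is not in the table is simply not treated as a fax in this sketch.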
80. te the current document. [Dialog buttons: Yes, No, Cancel]

ORGANIZING THE TEXT OUTPUT

Saving or exporting the text means more than selecting an output method or defining a filename for the output file. You also select a file format and determine the appearance of the recognized text. In short, you have to decide where you want to take the text before you launch the execution.

Some options of the "Format" button allow you to influence the look of the text output.

The text flow of the output document is directly influenced by the option "Merge Lines into Paragraphs".

[Screenshot: Options "Merge lines into paragraphs" and "Include graphics"]

Keep this option enabled to have Readiris detect the paragraphs. Readiris will then apply the normal wordwrap typical of wordprocessors; otherwise, a carriage return is added after each line and hyphenated words remain so. Paragraph detection is enabled by default.

Let's give an example to clear things up. When the first three lines of a column are "The new presi-", "dent waved from the balcony." and "His wife had joined him.", the paragraph detection gives you the following result: "The new president waved from the balcony. His wife had joined him." The hyphenated parts of the word "president" were "reglued" and a space was added at the end of the first sentence, thus creating naturally flowing text.

Had paragraph detection not been enabled, the original layout would have been retained, wit
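The effect of paragraph detection on the example above can be sketched as follows. This is a simplified illustration, not Readiris's actual routine: the function name `merge_lines` is hypothetical, and the sketch assumes that every line ending in a hyphen carries a word split across the line break.

```python
def merge_lines(lines):
    """Merge the lines of one paragraph into running text (hypothetical helper)."""
    paragraph = ""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip empty lines
        if paragraph.endswith("-"):
            # Reglue a word that was hyphenated across the line break.
            # Assumption: a trailing hyphen always marks a split word.
            paragraph = paragraph[:-1] + line
        elif paragraph:
            # Otherwise join the lines with a single space.
            paragraph += " " + line
        else:
            paragraph = line
    return paragraph


column = ["The new presi-", "dent waved from the balcony.", "His wife had joined him."]
print(merge_lines(column))
```

A real implementation would have to tell split words from genuinely hyphenated compounds, which this sketch does not attempt.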
81. the captured text. Zoom manually to crop your document (some cameras are bundled with photo stitching software, but don't bother using it for document capture).

Hold the camera directly above the document to avoid capturing the document at an angle. However, avoid shadows cast on the document by the camera or your hand. Produce stable images. Consider mounting your camera on a tripod when necessary.

Disable the flash when you're filming glossy paper, otherwise the image may be too light. Generally speaking, adapt the brightness and contrast to the environment: daylight, lamp light, neon light etc. (Some cameras can be calibrated by filming a white document.)

To give it a try, open the image DIGITAL.JPG in the Readiris folder and execute the recognition.

[Screenshot: Readiris main window showing the sample image digital.jpg]

SAVING DEFAULT SETTINGS

Set all scanning parameters correctly and click the command "Save Default Settings" under the "File" menu to save the current settings as default settings for future use.

Settings files contain more than the scanner settings: they also determine whether you are going to use interactive learning, which language the documents have, which output mode is used (for instance send text to WordPad) etc. In short, all operational settings 
82. to include black-and-white photos, scan in greyscales; to include color pictures, scan in color.

But why would you reduce the bit depth of the images during the scan? It goes without saying that greyscale and color images are slower to acquire and require more RAM memory than "bilevel" images.

Scanning in greyscale and color isn't just useful to save the graphics with sufficient quality. In some instances, it's also useful or necessary to obtain good OCR results. When text is printed on a color background, scanning in color may create the tone differences that are lacking in black-and-white images. When there is only limited contrast between the text and the background, the background can create "noise" that renders the recognition difficult or impossible.

Think for instance of black text printed on a dark background: when scanning such a document in black and white, you may not be able to "drop" the background color without losing the text information as well, as much as you may try to adjust the scanner brightness.

[Sample image: text about Masayoshi Son and Yasumitsu Shigeta]
83. ut: for instance a 50-page report where the header and footer should be excluded for obvious reasons; a single template can be applied to zone all 50 pages.

When you load a template into memory, page analysis is disabled automatically. The zoning template remains active until you re-enable page analysis on the main toolbar.

Actually, there's a nice alternative for zoning templates: the preview tool "Ignore Exterior Zone" limits the page decomposition to the "cropped" portion of the image.

Select this tool and frame the portion of the image you want to process. When you're dealing with a multipage document, you can exclude the same outer zone from page analysis on every page. (Re-execute the page analysis to cancel the image "cropping", or change the zones manually.)

[Screenshot: Readiris main window showing a Dutch sample document, with the language set to Dutch]
84. [Sample image: text about Masayoshi Son and Yasumitsu Shigeta]

Readiris creates a black-and-white version for every greyscale and color image. Thanks to its intelligent routines, even tough cases get solved: here's how a "difficult" image gets binarized.

[Sample image: the binarized version of the same text]

To view a scanned image in black and white, disable the option "Display Document in Color" under the "View" menu.

DIFFERENT DEVICES, DIFFERENT RESOLUTION

What
85. y to your wordprocessor and spreadsheet. To recognize faxes and convert PDF documents, you can drag the image files from the Windows Explorer to the Readiris application window. Or right-click on an image to send it promptly to Readiris.

Readiris recognizes tabular data and recreates them as worksheets or as table objects inside your wordprocessor: your numeric data are immediately ready for further processing.

Based on the Connectionist technology from I.R.I.S., Readiris represents the best OCR has to offer. Font-independent feature extraction is complemented by self-learning techniques derived from a proprietary neural network. The system can learn new characters through context analysis; linguistic knowledge about syllables and words improves the OCR performance.

Readiris supports up to 104 languages: all American and European languages are supported, including the Central European languages, the Baltic languages, Greek and the Cyrillic ("Russian") languages. Optionally, you can read four Asian languages: Japanese, Simplified and Traditional Chinese and Korean. Readiris even copes with mixed alphabets: the software detects "Western" words that pop up in Greek, Cyrillic and Asian documents (many untranscribable proper names, brand names etc. are written using the Western symbols).

Readiris uses linguistics during the recognition phase, not after it. As a direct result, Readiris recognizes documents of all kinds with
    