Home
        triroff, an adaptation of the device-independent troff for formatting tri
         Contents
1.                                                                                                                                                                                                                                                                                                                                                                                                                                                 Ossana     Kernisghan     ditroff    ditroff                                                                                                                                     Buchman     Berry    fforld                    ditroff                                                                                                                                                              bditroff        3                                                                               0894   3982 89 030003   24  12 00 Received 22 August 1988       1989 by John Wiley  amp  Sons  Ltd  Revised 22 November 1989                                 Z  BECKER AND D M  BERRY          WACO Aol HRA   OH 4  eA OCREYE bw SRR gm t       xu sa Jn      AKA LAR            de GN NM IN d 4 Eee iEn ES  KR CREE 24 RING CORE    EHAR Y RAW YEO       OA ee  HAKAN LEAD VIG oR OAK IRS 28  Shar nx A Siw Se BU KR Rn oO                wea mS  Zo                 Rae dH           dB    moe  mio         lt  RS au  p SE Eo 00     IND m dB  260 KAKRA OWS  Offa vite   RESCH YK FLORA SO  sO PNEU
2.       Ree 6               Q0 d  amp  d t  2ggocG  u                      PUOHH Q7Uoo 0 LOO ceca RA      re a2      A dd Grae n duo       4    G  mooodB8sSd mo  ton          0 nd qx d Quas  PL OHNHQCHRUHS AMER SO          XK  e Hoveipe 6  RISK AANA a Day Hons LE  e D G4H BOT PS D HR AOR Bo v  ss          A X   LX E      DA ai s     WR         i  5 X    H                a       nr ER E E                                               Ossana and  Buchman and    E     amp         up EE       d NEUSS p  MSEK        EMER         Soe Sy   clE    Fy  Bu ae E ge  Rain RE  ER Aye   g  KE Ey E  BEE Re Jg       iz  Hex i    IIS ME  abb un          e             rE PILLE   KENE d  R  Hg 8  MR         5  2H    Ds      J  Cep                                          itroff              1 KERE    WC ang eK EHX E edi        m cp ONE RE EMS WP BO eX             WE E  EES RE ERE DX   E          EE                  M o m  REREPMB E  iip   KENE Eirg E  ss Iq O  RERE A i8         BREHM           KERE       thi me RIES  lt         gt                      Rk wh ERR 22  RE SR eR  HR El m  Bf  RRE OCunage    d d cr Meu d t                 dd      ToP h OHH    EL          Re          Ms onc ES         ZE Te       om BE       wae Has C  HER a      lt      ex m  KEKEE  HHEN de                      tE    Kernighan   s ditr          Berry   s ffortid         geod motuum            OP R UYES o E     e o n                                               There has already been much work done in    With variants 
3.      ll  n  XXu        TS    additional   if invoked from diverted text  br   AB CD        TE    additional   if invoked from diverted text  Jf     in    reset indentation to previous value   ll    reset line length to previous value    The output obtained is     J awp    In other words  the layout of the page is unchanged by the application of bditroff   Thus  the page prepared only by ditroff and a device driver can be used as a guide to the  ultimate appearance of the page after application of bditroff    Finally observe that the algorithm is page preserving  That is  the page on which a  given occurrence of a character appears does not change  although the character   s loca   tion on that page might very well change  This fact means that the algorithm needs only  to consider one page at a time  that the maximum storage required for the program is that  to store one page  and that the output of the algorithm can be conveniently passed to any  device driver which works page by page  That is  it can assume that once it has built the  description of page n and has seen the beginning of the next  it may print page n with the  assurance that no more information for page n can arrive later     A POSTSCRIPT device  driver behaves under this assumption  as the POSTSCRIPT language is a page description  language     REQUIREMENTS ON THE INPUT TO bditroff    In order for the algorithm described in the previous section to work it must be possible  for the program    1  to determine t
4.    18     19   20     21     22     23     24            Abe and D M  Berry     indx and findphrases  A System for Generating Indexes for  Ditroff Documents     Software   Practice and Experience  19  1   1 34  1989     C  Buchman  D M  Berry  and J  Gonczarowski     DITROFF FFORTID  An Adaptation of the  UNIX DITROFF for Formatting Bi Directional Text     ACM Transactions on Office Informa   tion Systems  3  4    1985        N  Batchelder and T  Darrell     Psfig        DITROFF Preprocessor for POSTSCRIPT Figures        Computer and Information Science Department  University of Pennsylvania  Philadelphia  PA  19104       J D  Becker     Multilingual Word Processing     Scientific American  251  1   96 107  1984      J D  Becker     Arabic Word Processing     Communications of the ACM  30  7   600 611  1987      J L  Bentley and B W  Kernighan   GRAP     A Language for Typesetting Graphs  Tutorial and    User Manual     Computing Science Technical Report No  114  AT amp T Bell Laboratories  Mur   ray Hill  NJ 07974  December  1984        S  Carson and D M  Berry     Alg      Filters for Typesetting Algorithms     News     Usenet  1985        K P  Chow and C T  Hung     Chinese Workbench  An Interactive Environment for Chinese    Writers     Technical Report TR 87 07  Centre of Computer Studies and Applications  Univer   sity of Hong Kong  June  1987        EgWord Version 2 2 English Reference Manual  Ergosoft Corp   Tokyo  Japan  October 1986     FeiMa II  Chinese Word Proces
5.   ps  11   N  and  in for desired printing    br    break       Make sure that the appearance of mono spacing is not   N  destroyed by spreading characters to fill the line   na       signal beginning of vertical text   NN x TS    additional NN if invoked from diverted text   VS   n VV   set vertical spacing    55   n VM 12    set space width to be the distance between      columns        nr VM 5      should be 4 or 5 to get spacewidth 4 or 5 times normal   nr VV  n  s    FORMATTING TRI DIRECTIONAL TEXT 19              should be 1 or 1 1 times current  ps   de E   br   break       signal ending of vertical text   NN x TE    additional NN if invoked from diverted text   ad    go back to normal spreading of lines    VS    reset vertical spacing what it was    55 12    set space width back to normal                   The  BT macro uses the values of the registers VM and VV to adjust the inter word  hor   izontal  space and the inter line  vertical  space to help make it clearer to the human eye  that the text is to be read from top to bottom rather than horizontally  It is recommended  that VM be set to 4 or 5 and that VV be set to 1 or 1 1 times the current point size    To assist the user in in forcing the location of the columns  the macros   RA    BC  and    EC are defined which force a given number of right adjusted columns  force a given  number of centered columns  and reset normal page margins  respectively  Their  definitions are        de RA   force right adjusted   1 
6.   the tri directional formatting system  described herein has some weaknesses  Some are easily repaired  and others are not  The  problems and possible solutions are presented one by one in this section     Orientation of punctuation characters    Readers who know a Far East language will have noticed that the punctuation symbols in  the examples are oriented incorrectly for top to bottom printing  They are oriented  correctly for horizontal printing  Specifically  the stand alone punctuation symbols  the  period  the comma  etc   are in the lower left hand corner of their bounding boxes  and  the bracketing punctuation symbols  the parentheses  the braces  etc   are oriented in their  bounding boxes to wrap around the ends of enclosed horizontal text  A stand alone sym   bol needs to be in the center of gravity of its bounding box  and a bracketing symbol  needs to be rotated 90  counter clockwise in its bounding box so that it can wrap around  an end of top to bottom text    Probably the simplest solution is to add the missing alternative forms to the character  sets in unused positions  Then  the bditroff program can be modified so that when it is  working with a region of top to bottom text  it simply replaces the codes for the horizon   tal versions of these characters by those for the alternative  vertical versions of the same  characters     22 Z  BECKER AND D M  BERRY          Inclusion of proportional spaced Latin characters in top to bottom text    triroff supports 
7.  and registers if any of them conflict with those of the base  macro set used  as indeed happened when these were used with the macro package sup   plied by the editors of this journal for preparing the camera ready copy  Moreover  these  macros assume that their invocations are not inside diverted text  If they are invoked in  diverted text  the transparent output commands beginning with      x must be changed to  begin with NNNN   x  i e   an extra pair of  V s must be added to delay the output until the  surrounding text is printed    In order to allow proper control of horizontal spacing in a horizontal printing of the  Far East language fonts  the following special characters have been provided        1  The interword space has been set at  125 em so that the proper horizontal  inter character spacing can be obtained just by making each Far East language  character a word    2  The blank character  f  al   al in the upper left hand corner of the charac   ter matrix has the same 1 em width that all other characters have    3  The    character  which is normally 1 6 of an em space  has been set to have  the width of a full character so it can be used to force a full character width  without forcing a font switch to font al  the font of the blank character and  without forcing emission of a character  i e  ditroff treats use of     as a move   ment    4         V  character  which is normally 1 12 of an em space  has been set to have  half the width of the     character  It
8.  ffortid has been applied is irrelevant to bditroff  because it leaves horizontal lines intact    The description of the algorithm in the next subsection appeals to these invariants to  demonstrate that the algorithm does work     THE bditroff ALGORITHM AND ITS USE    The following discussion assumes the following input  in which lower case letters  represent characters of a variable width font to be printed from left to right and in which  upper case letters represent characters of a constant width font to be printed top to   bottom  Note that the algorithm can be used to print any text from top to bottom so long  as the text is composed of characters that have the same constant width  either naturally  or via the   cs command  Consider the input     site R   s9ada is a trademark of the u s  dept  of defense   ms dos is a trademark of microsoft  inc  s 10           br   Nx TS          ABCDEF  GHIJKL  MNOPQR  STUV  WXYZ  Pees        TE   ft R     s9ffortid is a trademark of berry computer scientists  ltd   unix is a trademark of at amp t bell laboratories  s 10    Note the   br commands before and after what is considered top to bottom text  Note  also the transparent outputs  V  x TS and   x TE  signalling the start and end of what  is to be printed from top to bottom  Assuming a line length that allows 5 constant width  characters and intervening blanks per line  and ignoring page breaks that might occur in  the midst of the example  the output of ditroff follows  shown sch
9.  met  The basic  structure proposed works because of certain observable invariants in tri directional text   An algorithm that exploits these invariants is given  The algorithm allows a tri directional  formatter to be built on the existing full function ditroff system  It is explained how the  algorithm and the underlying ditroff system can be exploited by the layout designer to  achieve most desired effects  Some of these effects are illustrated by examples  However   not all desired effects are achievable  the weaknesses of the present system are identified  and solutions are proposed for them  Implementation of these solutions are left to future  work     EXISTING SOFTWARE    There are a variety of systems for formatting Japanese or Chinese printed from left to  right mixed with other left to right languages  e g   English  These include Kameyama  and Hasebe   s jtroff 17  11   Nagashima and Kawabata   s early adaptation of TEXTM 19   for Japanese 22   Saito s JTEX 24   and Berry   s and Chow s adaptations of ditroff 12  8    These can also print Japanese and Chinese from top to bottom simply by printing from  left to right in landscape mode with a font consisting of the characters rotated 90  coun   terclockwise  All of these systems use the standard 94x94 matrix arrangement of the    8 Z  BECKER AND D M  BERRY          Japanese or Chinese characters  as the case may be  The first four of these formatters  have modified the base program  troff or TEX  so that two byte ch
10.  section on weaknesses  i e   with variable width Latin language text printed sideways so that  its left to right flow matches the top to bottom flow of the containing Far East language text  Due  to the limitations mentioned in that section  these abstracts could not be typeset as part of the docu   ment that ends with the end of this paragraph  Instead they had to be typeset as a single other docu   ment and pasted in     26 Z  BECKER AND D M  BERRY          Figure 2  Japanese Abstract  Figure 3  Chinese Abstract     
11.  the file is stored as it appears  changing either of these lengths means mas   sive manual editing  If one could recover the original entry order of the characters from  the text  the original entry algorithms could be applied to the stream of characters relative  to the new line and page lengths  However  with raw printable characters  in which punc   tuation is not distinguished by language  complete recovery is impossible because of  punctuational ambiguities  Thus a system in which the text is stored in the original input  order with the original language information is more general  This same problem occurs  with many  but not all  WYSIWYG systems  because many of them store the characters  in the visual order     FORMATTING TRI DIRECTIONAL TEXT 9          DESIRABLE PROPERTIES OF A TRI DIRECTIONAL FORMATTING  SYSTEM    There appear to be a number of desirable properties that should be satisfied by a tri   directional formatting system that make it more general  more functional  easy to use   and easy to program    In order to allow maximum formatting flexibility with easily changed sizing charac   teristics  it is necessary to store the characters of the document in the order entered  with  backspacing corrections normalized out   and to let the formatting algorithm be applied at  the time of printing  This suggests a batch formatting system  but it does not preclude a  WYSIWYG system  so long as the formatting algorithm is fast enough to be applied on  the fly  with li
12.  too is treated by            as a movement  with no character emission    5  The   amp  character is still the zero width character    6  Finally the    XX character has been provided in the S font as printing nothing  but having a width equal to that of the blank and all other characters  It is a  true character  so ditroff emits a character  It is on the first special font  so it  can be used regardless of the current font without having to request a new  font     EXAMPLES    This section shows a sample of input and of printing it both from left to right and from  top to bottom  Unlike in 12   there is no need for cutting and pasting to get both of the  outputs on the same page    The text is a famous Chinese poem  composed around 700 A D  by the most  renowned poet in China  Li Bai  The input is     N   eON ae N   c1N bO N   ccN cO N   b7N ee N   b8N   7 N   alN a4  N   b5N b   N   cON a7 N   c3N c   N   beN e5  f cl        N   alN a3  Dr    f  da   aa  f c6  ac N   cbN be N   ccN cO N   b7N ee N   alN a4  N   cAN e3 N   c6N ac N   bbN d7 N   b8N ce N   b6N bf N   alN a3    FORMATTING TRI DIRECTIONAL TEXT 21          Its output in left to right mode is     ik AA HOC  Bee                                Its output in top to bottom mode is                   zB                   X                           it      Hb       o          This printing was done using the   BT    ET    RA  and   EC macros defined above     WEAKNESSES    The enthusiasm of the authors notwithstanding
13.  would be set to mean    right to left    when the  result of saying         TB    is found and would be set to mean    left to right    when the  result of saying      x TM    is found  The result of saying      x TE    can be used to end  the top to bottom region for either        CONCLUSIONS    As can be seen from this paper  triroff works mostly as desired  That is  the three pro   grams ditroff  ffortid  and bditroff combine to produce an effective tri directional for   matter that accepts any input accepted by ditroff  including that produced by any of  ditroff   s preprocessors  works with any set of ditroff macros  and generates output indis   tinguishable from ditroff   s  This output is then acceptable to any ditroff device driver   The use of this software to typeset this paper is a demonstration of this claim    The main strength of the triroff approach is its modularity  This modularity allows  each new direction of printing to be attacked as a separate problem uncluttered by con   cerns with other directions and other formatting problems  This modularity allows the use  of an unmodified ditroff  which in turn allows the use of all of ditroff s preprocessors and  macro packages    There are a number of minor problems both in appearance and in function  However   their solutions are straightforward because of this modularity  For example  changing the  orientation of the punctuation symbols and moving them in the bounding box involves no  change to any of the prog
14. AL TEXT 11          per row in addition to whatever else is available for the device  Because the standard  ditroff program is used  all existing preprocessors and macro packages still work  The  system generates output in exactly the same format as is generated by the existing stan   dard ditroff  Thus all existing postprocessors still work  and the formatted output can be  printed on any existing device for which both a device driver and all the required fonts  are available  The ability to format right to left and top to bottom text in addition to  left to right text is created by the addition of two programs between the basic ditroff pro   gram and the device drivers  These two programs each accept as input the output of ditr   off and produce output in the same form as ditroff output  Thus these two programs can  accept as input each other   s output as well and can send their outputs to the same post   processors to which ditroff can send its output  Because the input to the underlying ditroff  is in time order  ditroff s output reflects formatting decisions made as if all the text were  written from left to right    The first of the additional programs is ffortid  which  on a line by line basis  reorgan   izes the line of text so that each contiguous sub line of text in left to right fonts is printed  from left to right and each contiguous sub line of text in right to left fonts is mirrored  about its own center in its current position so that it is printed from right 
15. ELECTRONIC PUBLISHING  VOL  2 3   3 26  DECEMBER 1989           triroff  an adaptation of the device independent  troff for formatting tri directional text                         ZEEV BECKER  pn     RT                                M  AND DANIEL BERRY  73 9x77 7   X JV NY FH  Computer Science Department       Technion    JL  Haifa 32000 w 1t  Israel fti Al   SUMMARY    This paper describes a system for formatting documents consisting of text written in  languages printed in three different directions  left to right  right to left  and top to bottom   For example  this paper is such a document because it contains text written in English   Hebrew  Japanese  and Chinese  The system assumes that the input is in the order in which  the text is read aloud  and it produces output in which each language is printed in its own  correct direction  but for which a human cognizant of the reading conventions will reproduce  the input order  The system consists of three major pieces of software  Ossana and  Kernighan   s ditroff for formatting text consisting of only left to right or unidirectional text   Buchman and Berry   s ffortid for rearranging right to left language text buried in ditroff out   put to be printed from right to left  and a new program bditroff for arranging that top to   bottom text buried in ditroff output is printed from top to bottom    Below are translations of this English language abstract  except for this paragraph  into  Hebrew  Japanese  and Chinese  The lat
16. RIPT fonts  then the added characters  can  in fact  be POSTSCRIPT programs that build and show the numerals using  digits from one of the available Latin fonts  Since the length of these numerals  cannot be too much longer than the width of a normal Far East language char   acter  the number of these is limited  Given the fonts used in this paper  it  appears that the maximum length of such numerals is two digits  thus only 99  numeral characters would have to be added to the Far East font    2  Use the above mentioned facility  such as available in devps  that can rotate  any text any angle to build a macro that rotates its argument about its center  and fools ditroff into believing that the size of the argument is the same as that    FORMATTING TRI DIRECTIONAL TEXT 23          of the normal Far East font character     As one of the referees pointed out  in the Xinjinang Uighur region of the PRC  there  is another language spoken which is not written in any of the directions covered so far in  this paper  The language is Mongolian  and it is written from top to bottom on lines that  flow from left to right  Even if a font were available for the language  the current version  of the software cannot handle its printing direction  However  it would not be difficult to  make printing in the Mongolian direction another direction supported by the bditroff pro   gram  It would mean making the direction in which the reconstructed columns flow  determined by a variable  The variable
17. aracter codes are  acceptable as input  The two byte codes are generally distinguished from ASCII charac   ters by having the eighth bit turned on in each of the two bytes  The latter two formatters  avoid having to modify the base program  ditroff  by considering each row of the matrix  to be a separate font and addressing the cdth character of the abth row as  f  abN  cd   This  by the way  is the scheme adopted by the system presented herein  as it too is based  on ditroff    Among the products for Japanese and Chinese wordprocessing on the Macintosh     are EgWord 9  and FeiMa 10   Both EgWord  a Japanese word processor  and FeiMa  a  Chinese word processor  seem not to be able to deal with bi directional text  All of their  examples are strictly left to right  However  since they run on a Macintosh computer  with its standard user interface  there is nothing to stop the user from rearranging some  text to be written from top to bottom  and cutting this text out and pasting it into another  document whose text is printed from left to right  In addition  at least FeiMa gives the  user a choice of printing direction when requesting the printing of a document  however   this direction applies to the whole document    One paper from the Xinjinang Uighur autonomous region 28  describes UKKMC   DOS  a version of MS DOS    which is capable of accepting input in English  Uighur   Kazak  Kirgiz  Mongolian  and Chinese  The text of each language is displayed on the  screen printed 
18. columns       set the page offset to the current line length minus the      new line length defined below    nr XX NWAn  lu  NNS1 NNw  NN     a1NN  al   u     nr XX    NN 1 1   NNW NN    a1NN  u     in    n XXu    N set line length to  N 1 times the width of a standard   N  character  plus   N 1 minus 1  times the width of the   N  inter word space    ene XX      1   w      f  a1NN  al   u     nr XX       1 1    w      E al    u     ll   n  XXu                    de EC    end columns or centering   in    reset left margin to previous value   11    reset right margin to previous value        de         centering   1 columns       set the page offset to half of  the current line length      minus the new line length defined below     nr XX    n  1lu     1   w          a1NN  al   u     nr XX    NN 1 1   NNw NN    a1NN    u   2u    in    n  XXu    if invoked in diverted text  use     po      set line length to    1 times the width of a standard   N  character  plus   N 1 minus 1  times the width of the      inter word space     nr XX      1   w      f  a1NN  al   u     nr XX       1 1    w    NN      1    u     ll   n  XXu                20 Z  BECKER AND D M  BERRY             In order that  RA and  CE work properly with respect to the spacewidth established for  the top to bottom text in the   BT macro  it is necessary that the  RA or   CE come after  the  BT    Obviously  these and other macro definitions given in this section must be modified to  use different names for macros
19. ction being identified by the current font     The system uses the standard  essentially unchanged underlying ditroff formatting  program to make all the formatting decisions  Fonts for the standard character sets from  Korea  Japan  and the People s Republic of China are arranged as a 94x94 matrix  For  the purpose of ditroff formatting of Far East language text  the martix is treated as 94  fonts each containing 94 characters  all exactly the same size  Thus  the only changes to  the ditroff program as distributed by AT amp T are the changes of the constant defining the  number of fonts mounted to a number large enough to accommodate the 94 row fonts  plus whatever else is mounted on the local printing devices and of the constant defining  the size of the character set to something large enough to accommodate the 94 characters         To make sure that requiring line breaks before and after each block of top to bottom text is reasonable  we  went to the neighborhood bookstore and bought a Japanese magazine with both left to right and top to   bottom printing of Japanese  In all cases of switch of direction  there was an accompanying line break  That is  there was no case of beginning top to bottom printing on the same horizontal line that contains the preceding  left to right text  and there was no case of beginning left to right printing on the same horizontal line serving  as the bottom line of the rectangle of the preceding top to bottom text     FORMATTING TRI DIRECTION
20. ell Laboratories  June 21  1978     M E  Lesk  TBL     A Program to Format Tables  Bell Laboratories  Murray Hill  NJ 07974   1978                  and                      OA BR  20 TeX2  YS          CANON                  1986     7     17H     J F  Ossana  NROFF TROFF User s Manual  Bell Laboratories  Murray Hill  NJ 07974   October 11  1976    Y  Saito     Report on JTEX  a Japanese TEX     TUGboat  2    1987                                                FORMATTING TRI DIRECTIONAL TEXT 25    25     26     27     28     29     30         Trickey     DRAG     A Graph Drawing System     in Electronic Publishing    88  ed  J  Andr    and H  van Vliet  Cambridge University Press  Cambridge  UK   1988        The UNIX Programmer   s Manual     Technical Report  Bell Telephone Laboratories  Murray  Hill  NJ 07974  June  1981     International UNIX  Supplement to UNIX World  1989    Z  Wu  W  Islam  J  Jin  S  Janbolatov  and J  Song     A Multi Language Characters Operating  System on IBM PC XT Microcomputer     in Proceedings of Second International Conference  on Computers and Applications  Beijing  PRC   June  1987     T  Wolfman     flo     A Language for Typesetting Flowcharts     M Sc  Thesis  Technion  Haifa   Israel  1989     C  J  Van Wyk     IDEAL User   s Manual     Computing Science Technical Report No  103  Bell  Laboratories  December 17  1981      APPENDIX    This appendix contains the Japanese and Chinese abstracts printed in the additional form mentioned  in the
21. ematically  i e   after  passing it through the device driver        ada is a trade   mark of the  u s  dept  of  defense  ms     14 Z  BECKER AND D M  BERRY          dos is a trade        mark of  microsoft  inc   A B C D E  FGH IJ  K LMNO  P O  IR 285 E  U VMWXY  2    ffortid is a  trademark of  berry com   puter scien   tists  ltd  unix  is a trademark  of at amp t bell  laboratories     The rectangular region to be reorganized by bditroff is the 5x6 region containing the  typewriter font characters  This region   s last four characters are blanks of exactly the  same size as the letters  Suppose that all of the text fits on one page  Then  bditroff reads  the characters in the region in a left to right  top to bottom sweep  as A  B  C          blank   blank  blank  blank  It then lays them out in a top to bottom  right to left sweep in the  same order to fill the same region  After this reorganization  the text is  schematically     ada is a trade   mark of the  u s  dept  of  defense  ms   dos is a trade   mark of  microsoft  inc   Y  Z       dH                         Hn    A  B  C  D  F  ffortid is a  trademark of  berry com   puter scien   tists  ltd  unix  is a trademark    of at amp t bell  laboratories     The reader should note that the algorithm is being applied by the formatting software on  these examples  If however  the page break were to come  relative to the original ditroff  output  after the third line of the constant width text  i e   between the    and 
22. he exact position of each character on the page  and  2       independently determine the line and page boundaries in the input        EA major source of problems with tbl is that it can violate this single pass page construction property  If a table  with more lines than can fit on the current page also has vertical lines  which are normally drawn after  finishing the last row of the table  these lines get drawn on the second page from the projection of the start of  the table onto this page to the end of the table on this page     18 Z  BECKER AND D M  BERRY          The ditroff output consists of a preamble describing the device  followed by a  sequence of page descriptions each beginning with a page command of the form pn sig   nalling the beginning of page number n  The description of a page consists logically of a  sequence of  position  character  pairs  each describing exactly where on the page to print  a character  The actual form of the position information is as occasional absolute coordi   nates with intervening horizontal and vertical movements  Thus a program reading this  output must keep a position state and follow the relative movements in order to calculate  the exact position of each character  Embedded among these  position  character  pairs   and actually independent of them  are end of line markers  of the form nb a  the impor   tant thing here is the n  the    and the a give the amount of space before and after the line  in the device s units  These ma
23. he line  will have exactly the same length    4  Within any contiguous rectangle of top to bottom Far East language text       2 The origin of the name    bditroff    is that to get    ditroff    written from top to bottom in the unenhanced ditroff   one says  b    ditroff  this utterance appears to the shell as    bditroff      For the length to be totally independent of the order  it is required that any kerning algorithm have the current  font s direction as a parameter in order to know which pairs of letters must be kerned  If the algorithm is  table driven  then the kerning distance of the pair  X Y  must be adjusted to look good when Y is printed to the  left of X     12    Z  BECKER AND D M  BERRY          ditroff input format       ditroff             ditroff output format       ffortid             Z output format       OO                                   ditroff output format    Figure 1  Data flow of triroff     FORMATTING TRI DIRECTIONAL TEXT 13          within a page  any permutation of the text within the rectangle will exactly fill  the rectangle  For the purpose of this statement  trailing blanks are considered  to be characters  This invariant works because the widths and the heights of all  Far East language characters are the same     Observe that application of ffortid to the output of ditroff should not affect the holding  of the invariants  ffortid merely permutes the characters of horizontal lines  these lines  remain horizontal  In fact  whether or not
24. ices  rang   ing from line printers  dot matrix printers  laser printers  through to photo   typesetters    2  all the various software which allows preview  on a high resolution screen  of  how a document will appear when printed on some printing device  and   3  Indx 1   for preparing a back of document index     Finally  the macro packages include mm  man  ms  me  and mX  which are described in  the various versions of the UNIX Programmer   s Manual 26   The primary advantage of  the ditroff system over other more monolithic systems is precisely that it is not monol   ithic and is composed of many programs  which may be combined for application to a  document to get their combined functionality  It is relatively easy to add new programs  with new functionality as evidenced by the ever growing collection of pre  and postpro   cessors cited above  Thus  it is relatively easy to experiment with new functionality in the  context of a full function system    These formatting programs were developed in a primarily English speaking environ   ment  However  in principle  these programs can be used in conjunction with any  language written from left to right with lines flowing top to bottom for which fonts are  mounted on the printing device    The goal of the authors and their colleagues has been to adapt the ditroff collection to  the multi lingual setting  A ditroff postprocessor  ffortid    2   has been developed to  make the collection useable as a bi directional formatting sy
25. in its own proper direction as it is being entered  The system is a  WYSIWYG system in which the screen appearance reflects exactly what is in the file   The text in the file is broken into pages  Each page is treated as a two dimensional array  of characters  When a page is seen on the screen  the array is displayed directly on the  screen with each row of the array being displayed on a separate row of the screen  Entry  of a Latin character causes cursor movement one position to the right both on the screen  and in the array  entry of a Uighur  Kazak  or Kirgiz character causes cursor movement  one position to the left  and entry of a Chinese or Mongolian character causes cursor  movement one position downward  In the editor  one moves around this array to directly  address each character in its own position    Working with this system is rather straightforward precisely because the visual image  is a very accurate model of the internal structure  One is formatting the text as it is  entered  with the software taking care of most of the global formatting details  such as  keeping to the line length and to the page length  It is easy to address each character for  direct manipulation of the character    However  certain formatting operations are difficult with this system  most notably  changing the page sizing characteristics  e g   the line length  and the page length   Because the formatting is done during entry  based on the current setting of line and page  lengths  and
26. inclusion of Latin language text among top to bottom text  but only in an  unrotated advancing downward form using the constant width Latin characters found in  row three of the JIS  the GB 2312  and the KS C 5601 character sets  It is also common  these days  to rotate the Latin language text so that its natural right to left flow matches  the top to bottom flow of the surrounding Far East language text  That is  the Latin text  is printed in a variable width font sideways with its base line coinciding with the line run   ning down the left edge of the vertical column containing the text  This printing is  achieved by having a Far East language font with its letters rotated 90   counter   clockwise  and printing this Far East font together with the available Latin fonts from left  to right  If such a page is then rotated 90   counter clockwise  it appears to the reader that  the Far East language characters are printed right side up top to bottom and the Latin  letters are printed sideways  Probably  this style arose simply because it is so easy to  implement with modern printing devices    Figures 2 and 3  found in the appendix  show the Japanese and Chinese abstracts of  the paper printed in that style  The reason these figures are in the appendix is to preserve  the truth of the claim  made at the beginning of the paper  that the entire paper is printed  as a single document with the software describe herein  The appendix figures cannot be  printed in the same run of ditr
27. m top to bottom with the columns  laid out from right to left  one gets as output something like     10 Z  BECKER AND D M  BERRY          English  nay                   To minimize both programming effort and user learning effort  it is useful that the for   matting software be an upward compatible extension of an existing system  For max   imum functionality  it is useful that the underlying existing system be a stabilized system  capable of dealing with pictures  tables  graphs  equations  indices  tables of contents   bibliographical citations  program code formatting  indexing  etc  One example of such a  system is the UNIX documentor   s workbench  DWB     or ditroff collection  Another is  the        collection    In this respect  it is best of all if the underlying system   s software can be used  unchanged  Then  only the new capabilities need to be programmed  Full functionality is  obtained with no additional programming effort  Finally  the user community can rely on  extant behavior being reproduced  even down to the bugs that have become features     BASIC STRUCTURE OF SYSTEM    This paper describes a system for tri directional formatting based on the UNIX DWB or  ditroff collection  The system assumes input    1  in time order  and  with line breaks before and after each contiguous stream of constant width  characters   in Far East language fonts  all characters are the same width   to  be printed from top to bottom    and   3  with the current language and dire
28. may be the bottom of a page  In any case  the region  cannot be larger than one page  Any such region is a rectangle bounded by    the beginning    the end    the left margin of the page  and  the right margin of the page     rN    The rearrangement algorithm makes a column of text as long as necessary to fill the  region  All the extra blanks end up in the left most columns  If the user does not desire  this sort of filling  then it is straightforward for the user to adjust the page offset  line  length  page length  line spacing  etc  to obtain the desired physical appearance  The  authors examined Japanese magazines and found that the spacing between successive  characters in a column is about  1 times the character size  but that the space between  columns is about 1 2 times the character size  To achieve this appearance with the    16 Z  BECKER AND D M  BERRY          algorithm  it suffices for ditroff to be told that the vertical spacing is 1 1 times the current  point size  as opposed to the more usual 1 2 times   and that the spacewidth is 1 2 ems  as  opposed to the more usual approximately  333 ems for variable width fonts or 1 em for  fixed width fonts   Of course  it is necessary to reset these upon leaving a region of top   to bottom text    Observe that with this algorithm  text of length n residing within one line in ditroff     e g      ABCD    will end up being in what appears to be a right to left order  e g      DCBA    as the algorithm fills a 1xn region t
29. nes written in the  right to left direction  These same newspapers and magazines have advertisements using  Latin language text written in the left to right direction  Thus  in one document  text is  written in three directions  left to right  right to left  and top to bottom    In the Xinjinang Uighur autonomous region of the People s Republic of China  inha   bitants speak the languages  Uighur  Kazak  and Kirgiz 28   which are written in the  right to left direction  as are their linguistic cousins Arabic  Farsi  and Urdu  When com   bined with the general use of Chinese and Latin languages in the country  the need arises  in the region for tri directional formatting  For example  at universities in the region  a  technical paper might very well be written in a local language  use English for technical    FORMATTING TRI DIRECTIONAL TEXT 7          words  have formulae  and be required to have a Chinese abstract    Finally  any business contract for high technology work done jointly by companies  from a Far East country and a Mid East country could require tri directional formatting    There is an additional language used in the Xinjinang Uighur region  whose correct  traditional printing direction was learned only after the software was written and the  referees    comments on the first draft of this paper had been received  Mongolian is the  language  and in its traditional writing direction  the characters flow from top to bottom  on a line  and the lines flow from left 
30. of the UNIX    operating system spreading throughout the world 27   there  is a concern to adapt the various UNIX facilities to be useable in a variety of languages  multi lingual formatting  For example  Becker describes one such what you see is what     KEY WORDS Document Preparation Multi lingual Multi directional Troff Typesetting    and in mixed language environments 13      BACKGROUND    FORMATTING TRI DIRECTIONAL TEXT 5          you get  WYSISYG  formatting system 4   This paper describes new software whose  general goal is to help adapt the facilities of the UNIX device independent troff  known  as ditroff 23  16   to the multi lingual environment    The ditroff system is composed of a basic formatter called ditroff 16 23  plus a  number of preprocessors  postprocessors  and macro packages  Among the preprocessors  are    refer  for handling bibliographical citations 20    ideal  for drawing pictures 30     pic  for drawing pictures 15     grap  for plotting graphs 6     drag for drawing directed graphs 25     flo for drawing flowcharts 29     psfig for including figures drawn in POSTSCRIPTTM 3     alg and its derivatives for formatting included program code 7   tbl for laying out tables 21   and   eqn for laying out mathematical formulae 14      OO AON      de                        Among the postprocessors are    1  all the various device drivers for translating ditroff output into the instructions  needed to print the formatted documents on the various printing dev
31. off that prints the rest of the paper  even with a rotated  Latin font  because the Latin font does not meet the constant width requirement for using  bditroff  It is typeset as a separate document using the trick of the rotated Far East  language font    In order to be able to print the appendix in the same run of ditroff that prints the  paper  it is necessary to have a ditroff device driver that can change from portrait to  landscape mode and vice versa at any arbitrary point in the document  The particular  device driver used to print this paper  psdit  from Adobe   s TRANSCRIPT    package  does  not have this capability  There exists a device driver  namely Pipeline Associate   s  devpsTM that has facilities for rotating arbitrary text at any angle  Thus  it should be pos   sible to put the needed capability into any device driver    In top to bottom Far East language text  a short multi digit numeral in a Latin text  font is occasionally printed as a unit unrotated  with its base line perpendicular to the  vertical axis of the column that contains it  This works nicely when the numeral is short  enough so that it does not stick out to far from the width of the column  This cannot be  done in the current version of the software  However  there are a number of ways that  this feature can be implemented as easy extension of the current software     1         to the Far East language font all possible short multi digit numerals as  single characters  If the fonts are POSTSC
32. op to bottom  with the columns of length one being  filled in from the right to the left  If it is desired to obtain an nx1 space  with all the text  down one column  one must trick the formatter a bit  ditroff can be forced to format the  text in a line length equal to the width of one character  Then the text gets printed  correctly  top down  without application of bditroff  and applying bditroff reorganizes the  rectangle of width one character into itself  Thus  one can have the    B C D printed  downward  in a right justified column  by giving the input         set the page indentation to the current line length      minus the new line length defined below    nr XX Mn  lu Nw NfCA u    in   n  XXu       set line length to the indentation plus the width of a      standard character    nr XX   w    fCA   u              11  n XXu   Nx TS    additional   if invoked from diverted text  Ir   AB CD   Hons   Nx TE NV additional   if invoked from diverted text        br   in    reset indentation to previous value   ll N  reset line length to previous value    Doing so yields the output           uu ow    One can obtain a centered column by giving the input     FORMATTING TRI DIRECTIONAL TEXT 17              set the indentation to half of  the current line length      minus the new line length defined below     nr XX   n  lu  w       CA   u   2u    in   n  XXu       set line length to indentation plus the width of a       standard character    nr XX   w  fCA    u                  
33. op to bottom direction  is called the Far East languages     There are a number of writing directions dealt with in this paper  they are identified  as follows        The direction of writing in which the characters flow from left to right on a  line and the lines flow from top to bottom is called the left to right direction    2  The direction of writing in which the characters flow from right to left on a  line and the lines flow from top to bottom is called the right to left direction    3  The direction of writing in which the characters flow from top to bottom on a  line and the  vertical  lines flow from right to left is called the top to bottom  direction    4  Together the left to right and right to left directions are called the horizontal  directions while the top to bottom direction is a vertical direction     THE NEED FOR BI  AND TRI DIRECTIONAL FORMATTING    Throughout the Far East  documents are written containing a mixture of text in Far East  and Latin languages  The Latin language text may include mathematical formulae  It is  often desired to print the Far East language text in the traditional top to bottom direction   While it is possible to print the Latin language text letter by letter in the same direction   it is preferable to print the Latin language text in its traditional left to right direction    Moreover  in Hong Kong and Japan  newspapers and magazines have their main line  Far East language written in the top to bottom direction and their headli
34. rams composing triroff  it requires only the use of a different  Far East language font containing the reoriented and repositioned punctuation symbols as  added characters  The solution to these problems are left for future work    Of course  the ultimate judge of the quality of the software is the user  Accordingly   the bditroff software described herein is available from the second author for a nominal  fee and under a non disclosure agreement     ACKNOWLEDGEMENTS    The Japanese and Chinese translations of the abstract were provided by Taiichi Yuasa     iX X      and Kam Pui Chow                respectively  Low Hwee Boon           X   provided samples of Hong Kong and Singaporean magazines and newspapers    The authors thank one particular enthusiastic  but critical  anonymous referee  whose  hard questions resulted in a greatly improved paper    devps is a trademark of Pipeline Associates  Inc  DWB is a trademark of AT amp T  Bell Laboratories  ffortid is a trademark of Berry Computer Scientists  Linotronic is a  trademark of Linotronic  Inc  Macintosh is a trademark of Apple Computers  Inc  MS     24    Z  BECKER AND D M  BERRY          DOS is a trademark of Microsoft  Inc  POSTSRIPT is a trademark of Adobe Computer  Systems  TEX is a trademark of the American Mathematical Society  TRANSCRIPT is a  trademark of Adobe Computer Systems  UNIX is a trademark of AT amp T Bell Labora   tories     REFERENCES    1     2     OUR    11     12     13     14     15     16     17  
35. rkers are necessary and cannot be calculated from the  movements  There is no guarantee that all large movements to the left with small move   ments downward are ends of lines  One finds such movements in equations  graphs  pic   tures  tables  etc    Because there are no end of line markers in TEX s DVI output format  the system  structure adopted in this paper cannot be applied to make a tri directional version of TEX   Instead  one must make modifications to TEX either to have it do the reorganization or to  have it emit end of line markers 18   In either case  one cannot use the standard distri   buted TEX and one faces the problem of maintaining more than one version of the pro     gram     ACTUAL PROGRAM    In the input  one must signal the beginning of the text to be printed vertically by use of  the transparent output V  x TS and signal its ending by use of V  x TE  If the text to be  printed vertically appears in a diversion  the signals must be preceded by one                 level of diversion  In addition  if the signals occur in macro definitions  each         must be  doubled  These signals must be preceded by breaking commands such as   br  To assist  the user in dealing with the top to bottom text  macros   BT and  ET are defined which  do these activities and which also adjust the line and word spacing to produce nicely  spaced columns  Their definitions are            de BT    begin top to bottom processing    N  The user is presumed to have properly set the
36. sor  English Reference Manual  Version 3 0   Wu Corp      Avon  Connecticut  August  1986                      and                                                                                                      A  tat  in                                               pp 1 10   1985     C H  Ip  D M  Berry         K P  Chow     CWPR  a Chinese Japanese Word Processing System  for Use with UNIX Device Independent TROFF     in Proceedings of Second International  Conference on Computers and Applications  Beijing  PRC   June  1987     R  Kasbarian     The Language of Choice     International UNIX 41 46  1989     B W  Kernighan and L L  Cherry  Typesetting Mathematics     User   s Guide  Second Edition    Bell Laboratories  Murray Hill  NJ 07974  1978    B W  Kernighan     PIC     A Graphics Language for Typesetting  User Manual     Computing  Science Technical Report No  85  Bell Laboratories  March  1982     B W  Kernighan     A Typesetter independent TROFF     Computing Science Technical Report  No  97  Bell Laboratories  March  1982     SWS A and                     UU EC UR Gr MARY A A JTROFFO B8     in                            31             60                                  D E  Knuth and P  MacKay     Mixing Right to left Texts with Left to right Texts     TUGboat   8  1    1987     D E  Knuth  The TEXbook  Addison Wesley  Reading  MA  1984           Lesk     Some Applications of Inverted Indexes on the UNIX System     Computing Science  Technical Report No  69  B
37. stem  in conjunction with  Arabic  Farsi  and Hebrew fonts  In addition  the ability to handle very large character  sets  such as those used in Japan  Korea  and the People s Republic of China  requiring  two bytes for encoding characters  has been added 12   Interestingly  the three character  sets from these countries  the JIS  KS C 5601  and GB 2312 standards are all arranged as  94x94 matrices  Given that all characters in all of these character sets are exactly the    6 Z  BECKER AND D M  BERRY          same square size  the width tables and the processing for these character sets are nearly  identical    For the purpose of identifying groups of languages with similar formatting problems   the following group names are used in this paper     1         group of languages  including English  whose members are printed with  alphabets of size less than 256 in the left to right direction is called the Latin  languages  even though it includes many languages  such as Greek  and Rus   sian  not written with the Latin alphabet    2         group of languages  including Arabic  Farsi  and Hebrew  whose members  are printed with alphabets of size less than 256 in the right to left direction is  called the Middle East languages  even though it includes languages  such as  Urdu  whose locale is not really in the Middle East    3  The group of languages including Chinese  Japanese  and Korean whose  members are printed with alphabets of size greater than 256  traditionally in  the t
38. ter two are each printed twice  once in a modern left   to right style  and once in a more traditional top to bottom style  The software described in  this paper was used to format and typeset this paper     Yspn    Draw  pan pm  Pawa uv mins        12         minii  maw Dyan ODOM NITY  n  n INN      own         mas may moii INIA DO     D 5n NAW TWIN         Jno us      TNA RANT  055 noynom  OWN                  23 NoaTw        55 120 v55 AP DD                 v Rp1 NW 12 1702 Tini                    NAAT    Ossana Pv ditroff igypyy open      naama noma  oopn        nw     nA ox  meopna nao  ym  anon            r0   Berry  Buchman 70 ffortid  7nx 122 1X par  Prawa p                     Tay  Kernighan  nun  nyana vo2on 770   bditroff non miam praw  pan oDTYY 72 ditroff    v 0995 ADA DRAW  ma   num noynmen           35 ditroffn  v 0999 non                                  JU AER                                                                                     3                                                                                                                                                                                                                                                                                                                                                                              1                                                                                                                                                              
39. the P  the  output would be  schematically     ada is a trade   mark of the    FORMATTING TRI DIRECTIONAL TEXT 15    u s  dept  of  defense  ms   dos is a trade   mark of  microsoft  inc   MJG  N  O    ON  HT  n1  pr  U  QU         NK  Xxx  GHU    P  Q  R  ffortid is a  trademark of  berry com   puter scien   tists  ltd  unix  is a trademark    of at amp t bell  laboratories     Here  the horizontal rule represents the page boundary    If one is using a ditroff macro package in which page headers and footers are gen   erated  even just page numbering  then additional measures must be taken lest the header  and footer be included in the regions that are to be rearranged into top to bottom printing   The macro that is invoked at the page bottom trap must issue the same commands that are  used to end a top to bottom region before it emits any of the regular page footer text   Moreover  it must arrange that the very next invocation of the page header macro issue  the same commands that are used to begin a top to bottom region after it has emitted any  of the regular page header text  This arranging is done by setting a register to a value  which is interrogated by the page header macro    Consider any region in which bditroff has been asked to rearrange the text to be  printed from top to bottom  The beginning of the region may have been requested expli   citly by the user or it may be the top of a page  The end of the region may have been  requested explicitly by the user or it 
40. to left  To this  program  top to bottom text is treated as left to right text  The second of the additional  programs is bditroff   On a page by page basis  bditroff reorganizes the text on a page so  that each contiguous nxm rectangle scanned left to right  top to bottom  of text in top to   bottom fonts is permuted to become an        rectangle scanned top to bottom  right to  left  Because the scanning directions of the two rectangles are perpendicular to each  other and the characters are all the same size  the nxm and the mxn rectangles actually  occupy the same area on the paper  Thus the structure of the system is as given in Figure  1     INVARIANTS THAT ALLOW THIS SYSTEM TO WORK    The simple  modular structure described in the previous section works because of a  number of invariants that apply both to the text and its printing     1     given horizontal line on the page consists either of left to right and right   to left text or of top to bottom text  This is the case because of the line breaks  that are required at each point of changing from horizontal to vertical text or  vice versa    2  While reading  within each contiguous rectangle of horizontal text on a page   one does not move from a line   to the next until one is finished reading all the  text on line    Within line    one may in fact bounce around reading in alternat   ing directions  however  no text is read more than once    3  Within any such horizontal line  any permutation of the characters in t
41. to right  How this direction can be handled is dis   cussed later in the section on weaknesses  implementing this solution is left for future  work    Note that if the tri directional formatting problem is solved  then any bi directional  sub problem  e g   left to right and right to left  left to right and top to bottom  and  right to left and top to bottom  is also solved     THIS PAPER    This paper describes a pair of programs that enhance the ditroff collection to be tri   directional  One of these  ffortid  which provides the right to left formatting capability   was described in detail in an earlier paper 2   The second of these  bditroff  which pro   vides the top to bottom formatting capability  is the focus of this paper  The sequential   piped composition of ditroff with these two programs is called triroff  By enhancing an  existing full function formatting system  it is intended to be able to use the existing  system   s preprocessors and macros with no change  Indeed  this very paper was typeset  camera ready for this journal on a Linotronic    300 at 1250 dpi with the help of refer   pic  eqn  ditroff  ffortid  bditroff  and variants of the ms and MX macro packages that  were developed for this journal    The plan for the rest of the paper is as follows  Existing software is surveyed in order  to be able to determine desirable properties of a tri directional formatting system  Then it  is possible to identify a basic structure that allows these properties to be
42. ttle observable delay to the user  This order of input is called time order  it  is the order in which the text is thought of as it is being written  It is the order in which  the properly formatted text is read out loud by a human reader cognizant of the multi   directional text reading conventions  It is also the order in which the letters would appear  on paper if all languages were written in the same  say left to right  direction  This time  order is the input order that is assumed by a variety of multi lingual systems  specifically  those implemented by Joseph Becker 4  5   by Pierre MacKay and Donald Knuth 18    and by Cary Buchman  Daniel Berry  and Jakob Gonczarowksi 2   The groups doing  these projects seem to have arrived  independently  at the conclusion that time order is  best  That is  each group had written at least a draft of its paper and code before any of  the others    papers had appeared    Thus  from the input  shown in stylized form      ft R   Roman   English    in  5i V  effective        11 4 51    line length 4 inches  zT    PR    predominantly right to left   ft HB   Hebrew   yarn    br    PL    predominantly left to right   ft KT   Katakana                    br    BT    begin top to bottom    ft HR   Hiragana                      SD    ft CH   Chinese              br    ET    end top to bottom          assuming that English and Katakana are printed from left to right  Hebrew is printed from  right to left  and Hiragana and Chinese are printed fro
    
Download Pdf Manuals
 
 
    
Related Search
    
Related Contents
Informations de garantie, de réglementation et de sécurité  若者を取り巻く - 国民生活センター    User Guide  User Manual - INDUSTRIAL COMMS  Raypak MVB TYPE B 504-2004 User's Manual  (Rev. M) User Manual  Qu`est-ce que l`ANRAT ? Présentation complète  Fiche Produit  F900 - Citizen    Copyright © All rights reserved. 
   Failed to retrieve file