SpeechRecorder Quick Start and User Manual
Contents
1. [Figure 4: Recording phases. Prompt display is modal during prerecording, active during recording, and modal during postrecording.]

the utterance. A postrecording phase is most commonly used to avoid signal truncation due to clicking the Stop button too early.

The timing for time-dependent prompts has to be set to appropriate values by the script author, e.g. to make sure that the recording time is sufficient for prompt playback and recording.

3 Menu File

The menu File contains commands to create, open, close, import and save projects, and to quit the application.

3.1 New command

The New command prompts the user for a project name and creates a project directory at the location provided. Initially, this directory contains an empty speaker database, a directory for the audio recordings, a project configuration file, a sample recording script, and the recording script DTD.

3.2 Open command

The Open command prompts the user to select a project from the list of known projects. These projects must reside in the SpeechRecorder directory in the user's home directory. A project can only be opened if no other project is open.

3.3 Close command

Close closes the current project. If this project has been changed, e.g. by editing the speaker database, the user is prompted to save or discard the changes prior to closing the project.

[Figure 5: File menu (Open, Close, Import, Save, Quit)]
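The timing advice for the recording phases can be checked mechanically before a script is used. The following is only a minimal sketch, not part of SpeechRecorder: the names Phase, phase_timeline and prompt_fits are hypothetical, the default of 0 ms for the delays is an assumption (the manual takes defaults from the project settings), and the prompt is assumed to start playing at the beginning of the RECORDING phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str      # phase name as used in the manual
    light: str     # traffic light colour shown during this phase
    start_ms: int  # offset from the start of the first timed phase
    end_ms: int


def phase_timeline(recduration, prerecdelay=0, postrecdelay=0):
    """Return the timed phases of one recording item, in order.

    All arguments are in milliseconds, like the attributes of <recording>.
    The IDLE phase has no fixed length, so it is not part of the result.
    """
    if recduration <= 0:
        raise ValueError("recduration must be positive")
    if prerecdelay < 0 or postrecdelay < 0:
        raise ValueError("delays must not be negative")
    phases = []
    t = 0
    for name, light, length in (
        ("PRERECORDING", "yellow", prerecdelay),
        ("RECORDING", "green", recduration),
        ("POSTRECORDING", "yellow", postrecdelay),
    ):
        if length > 0:  # a zero-length phase is simply skipped
            phases.append(Phase(name, light, t, t + length))
        t += length
    return phases


def prompt_fits(recduration, prompt_ms):
    """True if a prompt of prompt_ms is shorter than the recording duration."""
    return prompt_ms < recduration
```

With the values of the manual's recording sample (prerecdelay 2000, recduration 20000, postrecdelay 500), phase_timeline(20000, 2000, 500) places the RECORDING phase between 2000 ms and 22000 ms, and an audio prompt of 5000 ms passes the prompt_fits check.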
2. and close the dialog with the button Select.

Click the button Record to start your recording session.

Stop the current recording by clicking Stop or by waiting until the recording timeout has been reached.

After the recording has ended, the signal is displayed on the screen. Click on Play to listen to the recording.

Proceed to the next item by clicking >>. Start the next recording with the Record button.

After the final item has been recorded, SpeechRecorder displays a message. Click OK to acknowledge the message.

You will find your recordings in the subdirectory RECS of the project directory.

You're done: you've recorded your first session using SpeechRecorder!

Demo Script

The demo script consists of two sections. The first section contains test items; recording progress is manual, i.e. you have to click to start and end a recording and click to proceed to the next item, and only the supervisor view is shown. The second section contains sample prompts in different languages and of different types. Recording progress is semi-automatic, i.e. after a recording is stopped the script proceeds automatically to the next item, and the speaker view is displayed.

Multiple Displays

If you have two displays attached to your machine, the supervisor view will always be shown on the primary display and the speaker view on the secondary display(s) (see fig. 1 a and b).
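The two demo sections differ only in a few <section> attribute values, so a mistyped value is easy to miss when editing a script by hand. The sketch below is an illustration only, not a SpeechRecorder tool: check_script, ALLOWED, ITEMCODE_RE and the DEMO fragment are hypothetical. The allowed values are the ones the manual lists for <section>. The manual recommends only a-z, A-Z and 0-9 for itemcode; allowing the underscore as well is an assumption made here because the manual's own samples use codes such as demo_001.

```python
import re
import xml.etree.ElementTree as ET

# Allowed values as listed in the manual for the <section> attributes.
ALLOWED = {
    "speakerdisplay": {"yes", "no"},
    "order": {"sequential", "random"},
    "mode": {"manual", "autoprogress", "autorecording"},
    "promptphase": {"idle", "recording"},
}

# Assumption: underscore is accepted in addition to the recommended a-z, A-Z, 0-9.
ITEMCODE_RE = re.compile(r"[A-Za-z0-9_]+")


def check_script(xml_text):
    """Return a list of problems found in a recording script fragment."""
    problems = []
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        label = section.get("name", "(unnamed)")
        # All <section> attributes are optional, so only present values are checked.
        for attr, allowed in ALLOWED.items():
            value = section.get(attr)
            if value is not None and value not in allowed:
                problems.append(
                    f"section {label}: {attr}={value!r} not in {sorted(allowed)}"
                )
        for rec in section.iter("recording"):
            code = rec.get("itemcode")
            if code is None:
                problems.append(f"section {label}: recording without itemcode")
            elif not ITEMCODE_RE.fullmatch(code):
                problems.append(
                    f"section {label}: itemcode {code!r} has risky characters"
                )
    return problems


# Hypothetical fragment with two deliberate mistakes in the second section.
DEMO = """<recordingscript>
  <section name="Introduction" order="sequential" speakerdisplay="no"
           mode="manual" promptphase="idle">
    <recording itemcode="demo_000" recduration="20000">
      <recprompt><mediaitem>Welcome</mediaitem></recprompt>
    </recording>
  </section>
  <section name="Recording Session" order="shuffled" speakerdisplay="yes"
           mode="autoprogress">
    <recording itemcode="demo/001" recduration="20000">
      <recprompt><mediaitem>Hello</mediaitem></recprompt>
    </recording>
  </section>
</recordingscript>"""

for problem in check_script(DEMO):
    print(problem)
```

The check reports the unknown order value and the slash in the item code; it does not validate the script against the DTD.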
3. SpeechRecorder is an application for script-driven speech, audio and signal recordings. Its main features are:

• platform independence
• automatic and manual recording progress
• local and remote recordings via the Internet
• number of recording channels dependent only on the audio hardware
• speaker and supervisor views on multiple screens
• full Unicode text, image and audio prompts

Quick Start

SpeechRecorder organizes recordings in projects. A project is a combination of a speaker database, a set of recording scripts and a set of recording sessions. A recording session consists of an individual speaker, a recording script, the selected recording settings, and a directory into which the recorded files are written.

Download SpeechRecorder from http://www.speechrecorder.org. Java Web Start on your machine should automatically start SpeechRecorder.

Select the command File > New from the menu and give the project a name. The following items will now be created:

• a project directory in your home directory
• a sample recording script
• an empty speaker database
• a project configuration file

On the left side of the display a small traffic light will show up. In the middle, the prompt area is displayed, and on the right side the contents of the recording script are listed (see fig. 1 a).

In the Settings menu select the option Speakers and enter data for a speaker. Select the speaker in the table
4. recordings will be made using the new sample rate.

Contacts and Copyright

SpeechRecorder is being developed by the Institute of Phonetics and Speech Processing of Ludwig Maximilian University in Munich, Germany. Its main authors are Christoph Draxler and Klaus Jänsch. Many people have contributed to the software by providing localized versions of the graphical user interface or by suggesting improvements to the software.

The software is Copyright 2007-2008 by Ludwig Maximilians University of Munich, Germany. You may use the software free of charge for academic research and development and commercial purposes. We particularly encourage the use of SpeechRecorder in university or school courses on speech recording. You may distribute the software freely provided that the packed .jnlp or .jar files are not altered.

The software is provided "as is". The authors and Ludwig Maximilians University cannot be held responsible for any damage caused by the use of the software.
5. the string contains characters that have a special meaning for the file system. It is thus recommended to use only the characters a-z, A-Z or 0-9 for the itemcode attribute.

C.1 Time-dependent prompts

The following list describes issues with time-dependent prompts, i.e. audio and video prompts. Note that the playback of the time-dependent prompt will be recorded.

• For time-dependent prompts, set the promptphase attribute of the enclosing section to recording. Otherwise multiple playbacks of the time-dependent prompt may occur. Furthermore, make sure that the recording duration is longer than the duration of the time-dependent prompt. Otherwise the playback of the prompt will continue until after the end of the recording duration; it may even overlap with the playback of the subsequent time-dependent prompt.

C.2 Platform dependencies

Mac OS X: Audio device selection does not work in Mac OS X and Java versions prior to Java 2 1.5.0_07. The sample rate must be set to 44.100 kHz, 16 bit, stereo PCM in the Settings dialog, and only one audio device may be connected. Note: on Mac OS X notebooks this audio device is already used by the internal microphone.

Windows XP: If a Windows beep is output via an M-Audio Mobile Pre USB device, the recording sample rate is reset to the sample rate of the beep, i.e. usually 22.050 kHz. SpeechRecorder does not detect this change of sample rate. All subsequent
6. 3.4 Import command

The Import command prompts the user for a project archive in zip format. The archived project will be deployed in the project directory.

3.5 Save command

Save saves the current speaker database and project settings in local files in the project directory.

3.6 Quit command

Quit exits the application. The user is prompted to save any changes to the current project.

4 Menu Settings

The Settings menu allows the user to configure the current project, to edit the speaker database, to set the recording parameters, to skip to a given recording item, and to toggle the speaker display on and off.

[Figure 6: Settings menu (Project, Speaker, Recording, Skip, Signal Display, Speaker Window)]

4.1 Project command

The Project command opens a dialog window with the tabs Project, Audio, Speakers, Recording and Prompting.

The Project tab (fig. 7) presents the project name, its location on disk, and the audio class used to record audio.

The Speakers tab contains the location of the speaker database. This database can be stored in the project directory or any other accessible location in the local file system.

The Recording tab (fig. 8) allows the user to set the recording parameters. They include the sample rate, quantization, byte order, encoding, number of channels, whether repeated recordings of an item overwrite previous ones or are stored as versions, the progress mode, resetting the level meter f
7. [Figure 1: SpeechRecorder supervisor (a) and speaker (b) views. The speaker view shows the prompt "Welcome to the SpeechRecorder Demo Script".]

Contents

1 Recording Script
    1.1 The <section> element
    1.2 The <recording> element
    1.3 The <mediaitem> element
2 Recording Phases
3 Menu File
    3.1 New command
    3.2 Open command
    3.3 Close command
    3.4 Import command
    3.5 Save command
    3.6 Quit command
4 Menu Settings
    4.1 Project command
    4.2 Speaker command
    4.3 Recording command
    4.4 Skip command
    4.5 Signal display command
    4.6 Speaker Window command
5 Recordings via the Internet
6 Miscellaneous
A Recording script DTD
B Reserved keywords for recording scripts
C Known issues
    C.1 Time-dependent prompts
    C.2 Platform dependencies
Contacts and Copyright

1 Recording Script

A script speci
8. SpeechRecorder Quick Start and User Manual

Christoph Draxler
draxler@phonetik.uni-muenchen.de
Institut für Phonetik und Sprachverarbeitung
Universität München

[Figure 1 (a): Screenshot of the SpeechRecorder supervisor view, window title "Speech Recorder 2.0.4, Copyright 2004-2006". It shows the menus File, Settings and Speaker Window, the data of the speaker with code QS, the recordings directory RECS, the list of recording script items (demo_000, demo_001, ...) with prompts in several languages, the signal display, the section information "Introduction, manual, sequential, idle, false", and the transport buttons <<, Record, >>.]
9. duction or Narrative.

speakerdisplay indicates whether the speaker view will be shown or not; allowed attribute values are yes and no.

order specifies the order in which the items in this section will be presented. The allowed values are sequential or random.

mode controls the recording progress. The attribute value manual means that the user has to click once to advance to the next recording item and again to start the recording; autoprogress means that the user clicks only once to advance to and immediately start the next recording; autorecording, finally, means that the script proceeds to the next item and starts its recording without user action. However, the user may pause the script and resume recording later.

promptphase specifies when the prompt item is displayed: idle displays the item already before the actual recording, e.g. to give the user time for preparation; recording shows the prompt only during the recording phase (see section 2 for details and Appendix C.1 for problems when using audio or video prompts).

Sample sections

    <section name="Introduction" order="sequential" speakerdisplay="no" mode="manual" promptphase="idle">
    </section>
    <section name="Recording Session" order="random" speakerdisplay="yes" mode="autoprogress" promptphase="idle">
    </section>

Section display

Information on the section is displayed below the table with the recording items in the supervisor view (fig. 3).
10. ed. alt contains the text that is displayed if the item cannot be retrieved from the external source. autoplay, modal and volume apply only to time-dependent prompt items, i.e. audio or video clips. If autoplay is set to yes, the clip plays automatically as soon as the item is displayed; otherwise the user has to start playback explicitly. With modal set to yes, item playback cannot be interrupted, and volume determines the audio volume for playback. width and height specify the width and height in pixels of the image or video to display.

It is up to the recording script author to set the mediaitem attribute values to meaningful values. SpeechRecorder accepts the combinations given in table 1.

An audio <mediaitem> element without contents displays a generic symbol for audio playback. An audio <mediaitem> element with contents displays the text contents and plays back the audio.

    mimetype        src
    text/UTF-8
    text/rtf        URL
    image/jpeg      URL
    audio/x-wave    URL
    video/mpeg      URL

Table 1: Meaningful <mediaitem> attribute combinations. The table has further columns for alt, autoplay, modal, width, height and volume; their cell marks did not survive extraction.

<mediaitem> sample

The following <mediaitem> element displays a text prompt:

    <mediaitem mimetype="text/UTF-8">Welcome to the SpeechRecorder Demo Script</mediaitem>

This <mediaitem> element sh
11. fies which items are to be recorded. A script consists of two parts: a header containing meta data items, and the recording script proper. The recording script is divided into sections. A section is an organizational unit that specifies the presentation order and progress mode for the recording items it contains.

A recording item consists of the instructions, the prompt item and a comment. Instructions and comment are optional. A prompt item consists of text, an image or an audio clip. The text may be stored in the recording script or fetched from an external file or URL. Images and audio clips must be loaded from external sources, e.g. a file or a URL.

[Figure 2: Structure of a SpeechRecorder recording script]

A recording script is stored as an XML document. The DTD is given in Appendix A. SpeechRecorder does not yet have an editor for recording scripts. Hence, recording scripts must be created and edited using an external XML editor.

1.1 The <section> element

A section groups together items that are presented and recorded in a similar manner. In a recording script the <section> tag is defined as follows:

    <!ELEMENT section (nonrecording | recording)+>
    <!ATTLIST section
        name           CDATA #IMPLIED
        speakerdisplay CDATA #IMPLIED
        order          CDATA #IMPLIED
        mode           CDATA #IMPLIED
        promptphase    CDATA #IMPLIED>

All attributes are optional.

name specifies the name of the section, e.g. Intro
12.
    >
    <!ATTLIST session id CDATA #REQUIRED>
    <!ELEMENT metadata (key, value)>
    <!ELEMENT key (#PCDATA)>
    <!ELEMENT value (#PCDATA)>
    <!ELEMENT recordingscript (section+)>
    <!ATTLIST section
        name           CDATA #IMPLIED
        speakerdisplay CDATA #IMPLIED
        order          CDATA #IMPLIED
        mode           CDATA #IMPLIED
        promptphase    CDATA #IMPLIED>
    <!ELEMENT section (nonrecording | recording)+>
    <!ELEMENT nonrecording (mediaitem)>
    <!ELEMENT recording (recinstructions?, recprompt, reccomment?)>
    <!ATTLIST recording
        itemcode     CDATA #REQUIRED
        recduration  CDATA #REQUIRED
        prerecdelay  CDATA #IMPLIED
        postrecdelay CDATA #IMPLIED
        finalsilence CDATA #IMPLIED
        beep         CDATA #IMPLIED
        rectype      CDATA #IMPLIED>
    <!ELEMENT recinstructions (#PCDATA)>
    <!ATTLIST recinstructions
        mimetype CDATA #IMPLIED
        src      CDATA #IMPLIED>
    <!ELEMENT recprompt (mediaitem)>
    <!ELEMENT reccomment (#PCDATA)>
    <!ELEMENT mediaitem (#PCDATA)>
    <!ATTLIST mediaitem
        mimetype CDATA #IMPLIED
        src      CDATA #IMPLIED
        alt      CDATA #IMPLIED
        autoplay CDATA #IMPLIED
        modal    CDATA #IMPLIED
        width    CDATA #IMPLIED
        height   CDATA #IMPLIED
        volume   CDATA #IMPLIED>

[The occurrence indicators (?, +, *) did not survive extraction. They are shown above only where the manual text states optionality or repetition.]

B Reserved keywords for recording scripts

A recording script may contain the following keywords for recording progress, presentation order and recording type. Recording progress and presentation order are defined via attributes of the tag <section>, recording ty
13. hich recording is active but the prompt is still inactive (see 2 for details). finalsilence is a flag for silence detection to stop recording. If it is set to a value > 0, recording stops after the specified amount of silence. beep is a flag that determines whether a beep is to be played prior to recording (see C for details). Finally, rectype is one of audio or video (see C for details).

Recording sample

    <recording prerecdelay="2000" recduration="20000" postrecdelay="500" itemcode="demo_001">
        <recprompt>
        </recprompt>
    </recording>

1.3 The <mediaitem> element

The <mediaitem> element holds the prompt item. It may be empty or contain text, which is displayed on the screen.

    <!ELEMENT mediaitem (#PCDATA)>
    <!ATTLIST mediaitem
        mimetype CDATA #IMPLIED
        src      CDATA #IMPLIED
        alt      CDATA #IMPLIED
        autoplay CDATA #IMPLIED
        modal    CDATA #IMPLIED
        width    CDATA #IMPLIED
        height   CDATA #IMPLIED
        volume   CDATA #IMPLIED>

All attributes are optional.

mimetype specifies the type of prompt item. The encoding of the prompt text is inherited from the encoding of the entire recording script, and hence for text prompts this attribute is not used. However, for image and audio prompts this attribute provides a hint for displaying the prompt item: image items are drawn on the screen, audio is played via the system speakers or a headphone.

src is a file name or a URL from which a prompt item is retriev
14. o the appropriate item in the recording script.

4.5 Signal display command

Not yet implemented.

4.6 Speaker Window command

The Speaker Window command toggles the speaker view on the secondary displays on and off.

[Figure 8: Settings > Project > Recording dialog window. Settings shown: Sample rate 44100.0, Channels 2, Byte order Little Endian, Encoding PCM_SIGNED, Sample size 2, Overwrite, Recording mode manual, Autoprogress to next unrecorded item, Reset peak at start of recording, Number of audio lines 1, Prerecdelay 1000, Postrecdelay 500, Recording URL/directory.]

5 Recordings via the Internet

One of the most interesting features of SpeechRecorder is its ability to transfer audio files to a remote server. This is achieved using http (the hypertext transfer protocol) in combination with the post method for sending data from the client to a server.

To address a server via the Internet, the server address must be provided as a URL in the Settings > Project > Recording dialog:

    http://SERVER_NAME/SERVER_PATH

with http the data transfer protocol, SERVER_NAME the IP name of your server, and SERVER_PATH the directory on the server.

SpeechRecorder will then encode the data to be sent to the server as attribute-value pairs for the following attributes:

cmd    command to ser
15. or every recording, default values for pre- and postrecording phases, and the location for the recorded audio files. Note that if the location for recorded audio files begins with http, then the files are saved to a server via the http protocol over the Internet. In this case the server must be configured to accept input via web forms with data transferred using the post method (see section 5 for details).

The Prompting tab (fig. 9) displays the lists of fonts for prompt and instruction texts and the location of the recording script file. This recording script file can be stored in the project directory or in any accessible location in the local file system.

[Figure 7: Settings > Project command. The Project tab shows Name (QuickStartManual), Description, and Audiocontroller class name (ipsk.audio.impl.j2audio.J2AudioController).]

4.2 Speaker command

Speaker opens the speaker database and allows entering, deleting or selecting a speaker.

4.3 Recording command

Recording shows an audio mixer, allowing the user to select input and output devices and their levels. Note that JavaSound does not detect all mixer controls of a given hardware configuration. If you do not see your input devices in the list, then select them via the system control panel of your operating system.

4.4 Skip command

Skip prompts the user for a script item number and skips t
16. ows a formatted text loaded from a file:

    <mediaitem mimetype="text/rtf" src="promptText.rtf"/>

This <mediaitem> element shows an image loaded from a URL:

    <mediaitem mimetype="image/jpeg" src="http://www.speechrecorder.org/prompts/images/FelixWas.jpg" alt="Boy and washing machine"/>

2 Recording Phases

Each recording is performed as a sequence of phases. The sequence of phases is shown in fig. 4. A modal prompt display means that the prompt item is shown but marked as inactive, e.g. by using greyed-out text, low-resolution images or a disabled audio button. The default setting is to have modal prompt display during the prerecording and postrecording phases, and an active prompt display during recording. The attribute promptphase of a <section> element determines the start of an active prompt display, and it overrides the default setting.

IDLE             no recording, red light; the prompt item is only displayed if the attribute promptphase is set to idle
PRERECORDING     recording, yellow light, modal prompt item display
RECORDING        recording, green light, active prompt item display
POSTRECORDING    recording, yellow light, modal prompt item display

A prerecording phase is useful to either record environment noise prior to the main recording or to give the speaker a precisely delimited time to prepare
17. pe via an attribute of <recording>, and mime types via <mediaitem>.

recording progress    attribute mode; values: manual, autoprogress, autorecording
presentation order    attribute order; values: sequential, random
recording type        attribute rectype; values: audio, video. Note: video as a recording type is not yet implemented.
mime types            text/utf-8 for text; audio/x-wave, audio/x-aiff for audio; image/jpeg, image/gif for images

C Known issues

The following list contains some of the known problems of SpeechRecorder. If you find further bugs and errors, please contact draxler@phonetik.uni-muenchen.de.

• No recording script editor. Recording scripts have to be edited using an external XML editor.
• The following elements and attributes are defined in the recording script DTD but have not yet been implemented in SpeechRecorder:
    - The <nonrecording> element is not yet implemented.
    - The mimetype and src attributes of the <recinstructions> element are ignored.
    - Playing a beep before a recording and silence detection to stop a recording are not yet implemented.
    - Recording video is not yet implemented.
• The directory into which the audio files are saved is named after the speaker number in the speaker database. Currently, this number is not visible in the speaker database.
• Attribute values for itemcode may be arbitrary strings, and the value becomes part of the audio file name. This may cause problems if
18. [Figure 3: Section information display in the supervisor view, e.g. "Introduction, manual, sequential, idle, false"]

1.2 The <recording> element

The <recording> element defines the id, contents and timing of the current recording item. It consists of the optional <recinstructions> and <reccomment> elements and the mandatory <recprompt> element. <recinstructions> and <reccomment> simply contain text, which is displayed to both the speaker and the supervisor, or to the supervisor only, respectively.

    <!ELEMENT recording (recinstructions?, recprompt, reccomment?)>
    <!ATTLIST recording
        itemcode     CDATA #REQUIRED
        recduration  CDATA #REQUIRED
        prerecdelay  CDATA #IMPLIED
        postrecdelay CDATA #IMPLIED
        finalsilence CDATA #IMPLIED
        beep         CDATA #IMPLIED
        rectype      CDATA #IMPLIED>

<recinstructions> may have the attributes mimetype and src to allow instructions to be read in from an external source (see C for details).

The attributes itemcode and recduration of <recording> are mandatory. They uniquely identify a recording item and specify the duration of the recording of this item. itemcode can be an arbitrary string; however, because the itemcode becomes part of the audio file name, it may not contain characters that have a special meaning in the file system (see C for details). recduration specifies the recording time in milliseconds. prerecdelay and postrecdelay specify, in milliseconds, a time span during w
19. ver:
           store_audio      audio signal data
           store_log        log file data
           store_timelog    log file timestamp
itemcode       unique id within a recording session
speakercode    speaker code
speakerid      unique speaker id
extension      signal filename extension
script         name of recording script
session        session id
line
recversion     version of the recording file, augmented for every re-recording of the same prompt

[Figure 9: Settings > Project > Prompting dialog window. It shows the prompt, instruction and description font settings (family, bold, italic, size), the Prompts URL (file:QuickStartManual_script.xml), and the options Automatic prompt play, Show separate prompt window, and Show transport buttons in prompt window.]

Your server must correctly extract and interpret the attribute values and read the signal data, which is sent via the post method. This is usually achieved by a cgi module in the server, by external scripts called by the server, or by a Java server such as Apache Tomcat.

6 Miscellaneous

SpeechRecorder logs its activities into plain text log files. The number of log files is dependent on the platform.

A Recording script DTD

    <!ELEMENT session (metadata, recordingscript)
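On the server side, the attribute-value pairs of section 5 have to be turned into a storage location. The sketch below shows one cautious way to do that for the store_audio command. It is not the behaviour of any SpeechRecorder server component: the function audio_target, the set of fields it requires, and the directory layout session/speakercode/itemcode_recversion.extension are all hypothetical, and the decoding of the post request into a dictionary is assumed to have happened already.

```python
import re
from pathlib import PurePosixPath

# cmd values listed in the manual.
KNOWN_COMMANDS = {"store_audio", "store_log", "store_timelog"}

# Fields this sketch needs to name an audio file. Which fields SpeechRecorder
# always sends is not stated in the manual, so this tuple is an assumption.
NEEDED = ("session", "speakercode", "itemcode", "recversion", "extension")

_UNSAFE = re.compile(r"[^A-Za-z0-9_-]")


def audio_target(fields):
    """Return a relative path for a store_audio request, or None otherwise.

    `fields` holds the already decoded attribute-value pairs as strings.
    The resulting directory layout is a hypothetical convention.
    """
    cmd = fields.get("cmd")
    if cmd not in KNOWN_COMMANDS:
        raise ValueError(f"unknown cmd: {cmd!r}")
    if cmd != "store_audio":
        return None
    missing = [key for key in NEEDED if not fields.get(key)]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Replace everything outside a conservative character set, so that no
    # field value can introduce a path separator or a ".." component.
    clean = {key: _UNSAFE.sub("_", fields[key]) for key in NEEDED}
    name = f"{clean['itemcode']}_{clean['recversion']}.{clean['extension']}"
    return PurePosixPath(clean["session"], clean["speakercode"], name)


example = {
    "cmd": "store_audio",
    "session": "12",
    "speakercode": "QS",
    "itemcode": "demo_001",
    "recversion": "0",
    "extension": "wav",
}
print(audio_target(example))
```

Sanitizing the values matters because itemcode may be an arbitrary string chosen by the script author, and every value arrives from the network.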