
Executing recognizers under ELAN

1. Figure 7: Speaker diarization

9 Skin Color Estimation with GuiSkin

The Skin Color Estimation App is a graphical tool for estimating skin color in a video by manually selecting the best intervals in the YUV domain. This step has to be done before the video file can be successfully processed by the Hands and Head tracking recognizers.

   1. Open GuiSkin and import a video: FILE > LOAD VIDEO.
   2. Use the color intervals to mark all skin with blue ink, and make sure no background is marked.

Advice: GuiSkin has six sliders: Y for the brightness, U and V for the color components. The first three sliders represent the mid-point of the selected range; the last three sliders represent the size of the selected range. U usually ranges between 80 and 130, V between 125 and 175. Estimating the color involves trial and error. Our advice is to start with the second slider and
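The slider description in the GuiSkin section can be read as three intervals in YUV space, one per component. The Python sketch below illustrates that reading only; it is not GuiSkin's code. The names `Interval` and `is_skin`, the example Y range, and the assumption that each range is centred on its mid-point are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One slider pair: the mid-point and the full size of the accepted range."""
    mid: float
    size: float

    def contains(self, value: float) -> bool:
        # Assumption: the range extends size/2 on either side of the mid-point.
        return abs(value - self.mid) <= self.size / 2


def is_skin(y: float, u: float, v: float,
            y_range: Interval, u_range: Interval, v_range: Interval) -> bool:
    """A pixel counts as skin only if all three components fall inside their ranges."""
    return y_range.contains(y) and u_range.contains(u) and v_range.contains(v)


# Typical values from the guide: U between 80 and 130, V between 125 and 175.
U_RANGE = Interval(mid=105, size=50)
V_RANGE = Interval(mid=150, size=50)
# Hypothetical starting point for Y: accept every brightness value (0 to 255).
Y_RANGE = Interval(mid=127.5, size=255)
```

Narrowing `size` on one component at a time mirrors the trial-and-error procedure the guide recommends.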
2. Figure 6: IAIS 04 Speech/Non-Speech recognizer

8 IAIS 06 Speaker diarization

The speaker diarization recognizer tries to assign each segment of the recording to its respective speaker. It should therefore detect the number of speakers in the recording and who is speaking when.

Change the recognizer to IAIS 06 Speaker diarization (Figure 7: Speaker diarization).

Within the parameter section, fill in the input file name (tick "file"), which contains the speech/non-speech information you created earlier, and the output file name, which will hold the results of the speaker diarization. For example:

   Input: C:\Users\diasor\Documents\Recognizers\output audio speech.xml
   OUT xml (tier holding the speaker diarization), e.g.: C:\Users\diasor\Documents\Recognizers\output audio speaker.xml

Another option within the parameter section is to choose the tier (default; tick "tier"). It allows you to choose a tier you previously created with another recognizer.

Then click on START.
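The audio recognizers in this guide form a chain: the fine segmentation result feeds the speech/non-speech detection, whose result in turn feeds the speaker diarization. The Python sketch below is only a bookkeeping aid under that reading; the `Stage` class and `missing_inputs` function are hypothetical, and the file names repeat the guide's examples.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Stage:
    recognizer: str
    tier_file: Optional[str]  # result of an earlier recognizer, if one is needed
    output: str               # file this recognizer writes


PIPELINE = [
    Stage("IAIS 03 Fine audio segmentation", None,
          "output audio fine segmentation.xml"),
    Stage("IAIS 04 Speech/Non-Speech detection",
          "output audio fine segmentation.xml", "output audio speech.xml"),
    Stage("IAIS 06 Speaker diarization",
          "output audio speech.xml", "output audio speaker.xml"),
]


def missing_inputs(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Return (recognizer, file) pairs whose input no earlier stage has produced."""
    produced = set()
    missing = []
    for stage in stages:
        if stage.tier_file is not None and stage.tier_file not in produced:
            missing.append((stage.recognizer, stage.tier_file))
        produced.add(stage.output)
    return missing
```

Running `missing_inputs(PIPELINE)` before starting a recognizer is one way to confirm that the file you are about to tick as input has already been created.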
3. Figure 3: IAIS 02 Standard audio segmentation

If the output file is not defined, ELAN automatically saves the output in the Recognizers folder created in section 4, Using the ELAN audio recognizers.
4. User guide: Executing recognizers under ELAN

Christopher Rosenthal, Przemyslaw Lenkiewicz, Diana Ransgaard Sørensen

Content

   1 Install ELAN 4.1.2
   2 Install the Skin Color Estimation Application
   3 ELAN extensions
   4 Using the ELAN audio recognizers
   5 IAIS 02 Standard homogeneous segmentation
   6 IAIS 03 Fine audio segmentation
   7 IAIS 04 Speech/Non-Speech detection based on pre-trained acoustic models
   8 IAIS 06 Speaker diarization
   9 Skin Color Estimation with GuiSkin
   10 Using the hands and head tracking recognizer
   Appendix: Guidelines for video capture

List of figures

   Figure 1 Extraction of extensions
   Figure 2 Open new audio file
   Figure 3 IAIS 02 Standard audio segmentation
   Figure 4 ELAN automatically saves the output
   Figure 5 Fine audio section
   Figure 6 IAIS 04 Speech/Non-Speech recognizer
   Figure 7 Speaker diarization
   Figure 8 Skin color estimation
   Figure 9 Open new video file
   Figure 10 Video Recognizer and Parameters
5. 4 Using the ELAN audio recognizers

To start using the audio recognizers, open ELAN, go to FILE > NEW and select the audio file by moving it into the Selected Files menu (Figure 2: Open new audio file). Then click OK.

Note: The files can be placed on any drive. Go to the location (either a folder or the desktop) and create a new folder called, e.g., Recognizers. The folder must be accessible with write permissions for the recognizer. Select the desired audio files and copy and paste them into the newly created location.

Figure 2: Open new audio file

To let ELAN create audio segmentations within an audio file, the program offers two recognizers:

   1. IAIS 02 Standard homogeneous segmentation, which splits audio on significant changes (e.g. new speaker, music)
   2. IAIS 03 Fine audio segmentation, for splitting audio into utterance-level segments

Both can be used for audio segmentation; however, IAIS 03 Fine audio segmentation gives the user more control to fine-tune the results.

5 IAIS 02 Standard homogeneous segmentation

6. Choose the IAIS 02 Standard homogeneous segmentation recognizer from the Recognizer dropdown list (Figure 3: IAIS 02 Standard audio segmentation).

Define the output file where the results of the segmentation will be stored. The file has to be located in a folder which is accessible with write permissions for the recognizer, e.g. the Recognizers folder created in section 4, Using the ELAN audio recognizers. The name of the file needs to end with .xml. Try to give this file a meaningful name, as the file will be reused later, for example:

   C:\Users\diasor\Documents\Recognizers\output audio standard segmentation.xml

Click on START and let ELAN define segments within your audio file.
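The two requirements on the output file (a name ending in .xml and a folder the recognizer can write to) can be checked before starting a recognizer. The following Python sketch is a hypothetical helper, not part of ELAN; the function name `check_output_file` and its messages are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import List


def check_output_file(path_str: str) -> List[str]:
    """Return a list of problems with the chosen output file; empty if none found."""
    problems = []
    path = Path(path_str)
    # The guide requires the file name to end with .xml.
    if path.suffix.lower() != ".xml":
        problems.append("file name does not end with .xml")
    # The folder must exist and be writable for the recognizer.
    folder = path.parent
    if not folder.is_dir():
        problems.append(f"folder does not exist: {folder}")
    elif not os.access(folder, os.W_OK):
        problems.append(f"folder is not writable: {folder}")
    return problems
```

Note that `os.access` only reports the permissions of the user running the script, which is assumed here to be the same user who runs ELAN.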
7. Another option within the parameter section is to choose the tier (default; tick "tier"). It allows you to choose a tier you previously created with another recognizer.

Then click on START.

Parameters shown in the IAIS 04 Speech/Non-Speech detection panel:

   Input audio: input audio file
   xml: Result of the segmentation, with optional manual labels for training (Speech Validated or Nonspeech Validated). The recognizer will use the validated segments to adapt the speech/non-speech model. Example: C:\Users\diasor\Documents\Recognizers\output audio fine segmentation.xml
   aux: Speech model to be used (you can use a model you have trained before)
   aux: Non-speech model to be used (you can use a model you have trained before)
   Output xml: Tier holding the speech/non-speech segmentation. Example: C:\Users\diasor\Documents\Recognizers\output audio speech.xml
8. Figure 11 Create tiers from segment
   Figure 12 Example of four tiers with annotation

1 Install ELAN 4.1.2

The ELAN annotation tool allows accessing the recognizers that are running on the Max Planck servers. In order to use this functionality, first download ELAN.

Note: ELAN can be placed on any drive, for example C:\Program Files (x86)\ELAN 4.1.2

2 Install the Skin Color Estimation Application

Necessary for optimal configuration of the Hands and Head Tracking Recognizer. It can be downloaded from the following location. For instructions on using it, please see section 9, Skin Color Estimation with GuiSkin.

3 ELAN extensions

Download the extensions files and extract them (Figure 1: Extraction of extensions) to your ELAN extensions folder. If ELAN is installed on the C: drive, place the downloaded extensions at C:\Program Files (x86)\ELAN 4.1.2\extensions
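The extraction step for the ELAN extensions can also be scripted instead of using an archive tool. The Python sketch below uses only the standard `zipfile` module; the function name `extract_extensions` is hypothetical, and both the archive path and the destination are passed in by the caller, since the guide does not state the exact archive name or its internal folder layout.

```python
import zipfile
from pathlib import Path
from typing import List, Union


def extract_extensions(archive: Union[str, Path],
                       destination: Union[str, Path]) -> List[str]:
    """Extract every entry of the archive into destination and return the entry names."""
    destination = Path(destination)
    # Create the target folder if it does not exist yet.
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()
```

Writing under C:\Program Files (x86) usually requires administrator rights, so the script may need to be run from an elevated prompt.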
9. Figure 12: Example of four tiers with annotation

Appendix: Guidelines for video capture

   • Decent resolution: at least standard definition (720 x 576 pixels)
   • Decent video quality: the higher the bitrate, the better
   • Uniform lighting conditions: neither too dark nor too bright [Figure: bad examples]
   • The color of the clothes should be different from the color of the skin. The same applies to the background if it is very close to the hands. [Figure: bad examples and a good example]
   • No more than two persons in the scene
   • Fixed camera

Other settings that are not mandatory but that potentially yield better automatic annotations:

   • People should face the camera
   • People should be close to the camera [Figure: a bad example and a good example]
   • If there are two persons, they should be at the same distance from the camera
   • Tracking is easier if the person wears long-sleeved clothes instead of short-sleeved ones
   • Background removal works only with a static or almost static background (background removal is used only if the color of the objects in the background is similar to skin color, though)
10. observe the changes in the image. When a reasonable amount of skin color is highlighted, use the 5th slider to adjust the range for this value. The two remaining sliders can serve to limit the amount of non-skin pixels highlighted. Sometimes it cannot be avoided that some parts of the background are marked as well. The priority should always be that all skin is marked. Non-moving background can be cancelled out by the recognizers later on.

Figure 8: Skin color estimation

Once done calibrating, press SAVE RESULTS and save the given XML file in your work folder from section 4, Using the ELAN audio recognizers, or save it anywhere and then copy or move it to your work folder.

10 Using the hands and head tracking recognizer

Open ELAN, go to FILE > NEW, move your video into the SELECTED FILES window and press OK.

Advice: for information about decent resolution, video quality and lighting conditions, read Appendix I, Guidelines for video capture.
11. Figure 10: Video Recognizer and Parameters

In its current version this recognizer is somewhat unstable: it can either crash or keep you waiting endlessly for the results. To check whether it is still working, open your work folder and see whether the files you have just specified above are changing in size. If they are growing, the recognizer is still working. If not, it has finished. In that case press CANCEL once or twice (this is a bug and it should be fixed soon).

Once the program has finished, you have to import the tiers manually, one at a time. Go to FILE > IMPORT > IMPORT TIERS FROM RECOGNIZERS. Open output.mpg.xml, which you can find in your RECOGNIZERS folder, and import the tiers one at a time, i.e. by selecting just one tier at a time and clicking CREATE for each of them (Figure 11: Create tiers from segment).
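The manual check described for the hands and head tracking recognizer (watching whether the output files keep growing) can be automated. The Python sketch below is a hypothetical helper, not part of ELAN; the name `is_still_growing`, the default waiting time and the injectable `sleep` argument are assumptions made for illustration.

```python
import os
import time
from typing import Callable, Dict, Iterable


def is_still_growing(paths: Iterable[str],
                     wait_seconds: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if at least one file got larger during the waiting period."""
    paths = list(paths)

    def sizes() -> Dict[str, int]:
        # A file that does not exist yet is treated as having size 0.
        return {p: os.path.getsize(p) if os.path.exists(p) else 0 for p in paths}

    before = sizes()
    sleep(wait_seconds)
    after = sizes()
    return any(after[p] > before[p] for p in paths)
```

Calling the function with the three output files you entered under Parameters gives the same answer as checking the folder by hand: True means the recognizer is still writing, False means it has stopped, which, as the guide notes, may be because it finished or because it crashed.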
12. Figure 4: ELAN automatically saves the output

6 IAIS 03 Fine audio segmentation

IAIS 03 Fine audio segmentation works in a similar way to the IAIS 02 recognizer. The difference is that it offers the user the chance to modify a parameter which controls the sensitivity of the segmentation process.

Choose the fine audio segmentation (Figure 5: Fine audio section) and define the output file where the results of the segmentation will be stored, for example:

   C:\Users\diasor\Documents\Recognizers\output audio fine segmentation.xml

Use the parameter under Settings to tune the sensitivity of the recognizer and the size of the resulting segments.

Choose whether the recognizer should perform merging of the results. This step merges resulting segments that are neighbors and have high similarity.

Click on START and let ELAN define segments within your audio file.
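The guide does not say how the merging stage measures similarity. As a rough illustration of what "merging neighbors with high similarity" means, the Python sketch below joins adjacent segments whenever a caller-supplied test says they are similar. Everything in it (the `Segment` class, `merge_neighbours`, the `similar` callback and the `max_gap_ms` parameter) is hypothetical and does not describe the IAIS 03 recognizer's actual method.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    start_ms: int
    end_ms: int
    label: str


def merge_neighbours(segments: List[Segment],
                     similar: Callable[[Segment, Segment], bool],
                     max_gap_ms: int = 0) -> List[Segment]:
    """Merge each segment into its predecessor if they touch and are similar."""
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s.start_ms):
        last = merged[-1] if merged else None
        if (last is not None
                and seg.start_ms - last.end_ms <= max_gap_ms
                and similar(last, seg)):
            # Extend the previous segment instead of adding a new one.
            merged[-1] = Segment(last.start_ms, max(last.end_ms, seg.end_ms), last.label)
        else:
            merged.append(seg)
    return merged


def same_label(a: Segment, b: Segment) -> bool:
    """Stand-in similarity test: two segments are similar if their labels match."""
    return a.label == b.label
```

With merging switched off the recognizer would correspond to returning the input list unchanged, which typically leaves more and shorter segments.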
13. Figure 5: Fine audio section

7 IAIS 04 Speech/Non-Speech detection based on pre-trained acoustic models

To detect which parts of the recording contain human speech, change the recognizer to IAIS 04 Speech/Non-Speech detection based on pre-trained acoustic models (see Figure 6: IAIS 04 Speech/Non-Speech recognizer).

Within the parameter section, fill in the input file name (tick "file"), which contains the segmentation information you created earlier, and the output file name, which will hold the speech/non-speech results. For example:

   Input: C:\Users\diasor\Documents\Recognizers\output audio fine segmentation.xml
   OUT xml (tier holding the speech/non-speech segmentation), e.g.: C:\Users\diasor\Documents\Recognizers\output audio speech.xml
14. Figure 11: Create tiers from segment

If everything has worked out, ELAN should have created four tiers with annotation information. See the example in Figure 12, Example of four tiers with annotation.
15. Figure 9: Open new video file

At the upper right, click on VIDEO RECOGNIZER (Figure 10: Video Recognizer and Parameters) and choose "Tracks motion of hands and head". Under Parameters you will see four empty text fields:

   1. IN csv: xml containing result of segmentation1 XML_filename
      Here you need to import the XML file you created with GuiSkin in the previous section.
   2. OUT aux: xml updated with hands movement information
      This file holds the resulting tiers and annotation segments describing which body parts are moving. Fill in X:\recognizers\output.mpg.xml
   3. OUT csv: csv files with hands/head frame-by-frame information
      This file holds the position of the tracked body parts for each frame of the video, using X and Y coordinates. Fill in X:\recognizers\output.csv
   4. OUT aux: video with overlayed hands/heads tracked
      Here you can save the video with ellipses over the tracked body parts. Fill in X:\recognizers\output.avi

Once filled in, press START.
