
SAIL. User Guide - European Bioinformatics Institute


Contents

1. [Screenshot: Report constructor, Parameter list tab (Vocabulary: Mets), with columns Code, Name and Description. Visible rows run from APOB (Apo B, Biochemistry Apolipoprotein B) and BASO (Basophils, Blood Basophils) through BICEPS, BMI, BMO, BP, BYR, CHD, CRP, DB, DCHD, DMI, DS, EDU, EOSIN, EXYR, FAMH, FMHRT, FMSTRK, FMT2D, GLU and GLUM to the Affymetrix genome-wide genotyping parameters (Affymetrix Genome-wide Human SNP Array 100k Set, 5.0 and 500k Set).]
2. [Screenshot: parameter list showing KoraF3 and DGI parameters next to their descriptions (for example Coronary Heart Disease, Date first stroke, Basal visit date, Birth month, Birth year, Blood pressure, CRP, Cholesterol medication, Diabetes), with the request options Use split by collection and Use relations.]

Create new classifiers

Two new categories can be found under the Classifier tab: Classifier list and Classifiers tree. They display which classifiers have been defined in SAIL and how they are structured.
3. [Screenshot residue: report legend ending "parameter with code ALC and name Alcohol".]

Export Lists of Parameters

Users can export lists of parameter definitions from the user interface. Parameter lists can be used as templates to create new vocabularies, to understand parameter structure, and to see how to prepare a dataset for upload into SAIL. The following steps are needed to export a list of parameters:

Step 1. To export all the parameters visible in the parameter list window, first click on Extra.

[Screenshot: Report constructor with the Parameter list tab open (NO FILTER) and columns Code, Name, Description, Filter and Records. Rows include DGI Affected_Status (3142 records), the Affymetrix genome-wide genotyping parameters, AGE (33080 records) and the KoraF3 age parameter RTALTERU.]
4. [Screenshot: Collection tab (tabs: Report constructor, Classifiers, Projections, Study, Collection, Collection View, Metadata Import) showing a collection named TestCollection with Classifier/Tag entries Repository, Description and Country.]

Step 7. To import availability data to a collection, start by selecting the collection to which you want to add data and click the Import Data button.

[Screenshot: collection list (MolOBB, NFBC66, EGP, STR, Genmets Case, Genmets Control, KoraF4, KoraF3, ERF, DGI, among others) with Add, Edit and Import Data buttons.]

Step 8. In the pop-up window, click on Add to select the file you want to use to upload availability data.

Step 9. Once the file is selected, click on Upload. Messages will be displayed in the Note section showing the state of the data upload.

[Screenshot: File upload dialog with State and Filename columns; the selected file is shown as "Queued for upload".]

Step 10. Once the data load is finished, close the File upload dialog window and return to the Collection tab.

Metadata Import: relations and vocabulary upload

The Metadata Import tab is used to upload vocabulary files and files containing information about the relation
5. Step 4. Click on Add and all the parameters will be added to the Report request with the OR connector.

[Screenshot: Report constructor with Vocabulary set to ANY. The Report request panel holds the added alcohol parameters; the parameter list shows columns Code, Name, Description, Filter and Records, including AGE (33080 records), ALC Alcohol (13741 records), the KoraF3 and DGI alcohol parameters, and Alcohol quantity.]
6. [Screenshot: parameter list (Description, Filter and Records columns) with a Report request containing ALC Alcohol and the options Use split by collection, Use relations, Specify collections and Specific relations.]

Step 12. Notice that in this case some of the results carry an R superscript, indicating that the value includes entries for a parameter related to the main one named in the column header.

[Screenshot: report results, total records 33284, with rows for Genmets Case and Genmets Control.]
7. Step 2. In the Report request panel, click on the check box Split by collection at the bottom of the panel.

[Screenshot: Report constructor, Parameter list tab (Vocabulary: Mets), with columns Code, Name and Description, listing parameters from AGE (Age) and AGEST (Gestational age) through ALC (Alcohol), Alcohol quantity, ANTIHYPR, APOB, BASO, BICEPS, BMI, BMO, BP, BYR, CHD, CRP, DB, DCHD, DMI, DS, EDU, EOSIN, EXYR, FAMH, FMHRT, FMSTRK and FMT2D to GLU (Glucose).]
8. Another way to use relations is to select only one parameter and then tell the Report request panel to look for relations.

Step 7. Select the parameter you want to add to the report and press Add.

[Screenshot: Report constructor with Vocabulary set to ANY and columns Code, Name, Description, Filter and Records, listing DGI Affected_Status, the Affymetrix genome-wide genotyping parameters, AGE, the KoraF3 age parameters, ALC Alcohol, the KoraF3 and DGI alcohol parameters, and Alcohol quantity.]
9. In the list of collections that belong to the Study, click on the one to which you want to upload data relations and click on Select.

[Screenshot: collection list showing EGP, STR and Genmets Case.]

Step 10. Select the Add button to choose the file with the data relations and click Ok.

Step 11. Select Upload.

[Screenshot: study window with columns Total Samples, Eligible Samples and Selected Samples; the file is shown as "Queued for upload".]

Step 12. The number of samples eligible and selected in a study is displayed in the right-hand columns of the study window.

Create a new collection

The Collection tab allows administrators to create or edit collections and to add new availability data to a collection.

Step 1. To create a new collection, select Add.

[Screenshot: Collection tab listing collections and their record counts, including STR (8467), Genmets Case (946), Genmets Control (965), KoraF4 (1814), KoraF3 (1644), ERF (3205) and DGI (3142), with the buttons Add and Import Data.]

Step 2. In the Add collection tab, once you have entered the name of the new collection, click the Add button in the Structured description section.

[Screenshot: Add collection form with Name, Structured description (Classifier, Tag, Info) and the buttons Remove, Save and Cancel.]

Step 3. In the classifier list select Reposito
10. [Screenshot: Specify collections pop-up listing collections with record counts (MolOBB 69, STR 8467, Genmets Case 946, Genmets Control 965, KoraF4 1814, KoraF3 1644, ERF 3205, DGI 3142, plus NFBC66, EGP and UK Twins), shown over the parameter list with Genmets Case and Genmets Control marked.]

Step 6. Again, click Query to check your results.

[Screenshot: Report 6 results, total records 33284, split by collection. Genmets Case: 946 (100%), 912 (96%), 946 (100%), 946 (100%), 941. Genmets Control: 946 (98%), 931, 964 (99%), 965 (100%), 941 (97%). Summary: 1911, 1892, 1843, 1910, 1911, 1882. Legend: parameter with code AGE and name Age; parameter with code ALC and name Alcohol; parameter with code BM]
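Splitting a report by collection amounts to counting the matching samples per collection and adding a summary row. The following is a minimal sketch of that idea, not SAIL's implementation: the record layout, the function name `split_by_collection` and the whole-number percentage (computed by integer division) are assumptions made for illustration.

```python
from collections import Counter


def split_by_collection(records, code):
    """Count, per collection, the samples that have data for `code`.

    Returns (rows, summary): rows maps a collection name to a
    (count, percent) pair, summary is the total count over all collections.
    """
    totals = Counter(r["collection"] for r in records)
    hits = Counter(r["collection"] for r in records if code in r["available"])
    rows = {
        name: (hits[name], hits[name] * 100 // totals[name])
        for name in totals
    }
    return rows, sum(hits.values())


# Hypothetical availability records: one per sample.
records = [
    {"collection": "Genmets Case", "available": {"AGE", "ALC"}},
    {"collection": "Genmets Case", "available": {"AGE"}},
    {"collection": "Genmets Control", "available": {"AGE", "ALC"}},
]

rows, summary = split_by_collection(records, "ALC")
print(rows, summary)
```

With the three hypothetical records above, ALC is available for one of the two Genmets Case samples and for the single Genmets Control sample, so the summary count is 2.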
11. in count or only those collections selected from the pop-up list once you click on the Select button.

Use relations: includes in a query those other parameters that have any type of relation with the selected parameters.

Specific relations: lets you select which types of relation to take into account when you make a query. The list of available relations is displayed when you click on the Select button.

Query button: sends the request to the server and opens a new tab with the results.

Right-click on one of the parameters of your query to get the following options in a small pop-up window (callouts 26 to 29 in the figure):

Remove button: removes the selected object (parameter, enumeration or group) from the request.
Down button: moves the selected object one line down.
Up button: moves the selected object one line up.
Filter option: brings up the filtering pop-up window, similar to the one opened with button 7.

The simplest request is about the availability of a single parameter. To construct such a request, choose the parameter of interest and add it with the Select button.

[Screenshot: Report constructor parameter list (NO FILTER) showing, among others, Myocardial infarction, MONOC (Monocytes), MS (Metabolic syndrome), NEFA, NEUT (Neutrophils) and Platelets.]
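The effect of the Use relations option can be pictured as widening a query from one parameter to that parameter plus everything related to it, and then taking the union of the matching samples. The sketch below is only an illustration under stated assumptions: the dictionaries `RELATIONS` and `AVAILABILITY`, the dotted code spelling, the sample identifiers and the function `matching_samples` are hypothetical, and the real server-side logic is not described in this guide.

```python
# Hypothetical relation table: main parameter -> related parameter codes.
RELATIONS = {
    "ALC": {"KoraF3.RTALKKON", "DGI.ALCOHOL"},
}

# Hypothetical availability index: parameter code -> samples that have a value.
AVAILABILITY = {
    "ALC": {"s1", "s2"},
    "KoraF3.RTALKKON": {"s3"},
    "DGI.ALCOHOL": {"s2", "s4"},
}


def matching_samples(code, use_relations=False):
    """Return the samples available for `code`, optionally including
    samples available for any parameter related to it."""
    codes = {code}
    if use_relations:
        codes |= RELATIONS.get(code, set())
    result = set()
    for c in codes:
        result |= AVAILABILITY.get(c, set())
    return result


print(sorted(matching_samples("ALC")))
print(sorted(matching_samples("ALC", use_relations=True)))
```

Restricting `RELATIONS` to chosen relation types before the lookup would correspond to the Specific relations option.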
12. you can create a report where you only select those samples with values within a range.

Step 1. Select the parameter that you want to add by clicking on the funnel icon (Filter column).

[Screenshot: Report constructor, Parameter list tab (Vocabulary: Mets), with columns Code, Name, Description, Filter and Records. Rows include BP (32112 records), BYR (14994), CHD (1029), CRP (23530), DB (20080), EDU (11391), EXYR (15334), FMHRT (5148), FMSTRK (7814), FMT2D (7838), GLU (31068) and GLUM (11339), followed by the Affymetrix genome-wide genotyping parameters.]
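Filtering by value narrows a parameter to the samples whose value matches: one chosen value for an ENUM variable, or a Min/Max range for a numeric one. A minimal sketch under assumptions: the record layout and the helper names `filter_enum` and `filter_range` are hypothetical, and treating both range bounds as inclusive is a guess rather than documented behaviour.

```python
def filter_enum(records, code, allowed):
    """Keep records whose value for `code` is one of the allowed ENUM values."""
    return [r for r in records if r.get(code) in allowed]


def filter_range(records, code, minimum, maximum):
    """Keep records whose numeric value for `code` lies in [minimum, maximum]."""
    return [
        r for r in records
        if r.get(code) is not None and minimum <= r[code] <= maximum
    ]


# Hypothetical sample records.
records = [
    {"id": "a", "SEX": "Woman", "AGE": 34},
    {"id": "b", "SEX": "Man", "AGE": 61},
    {"id": "c", "SEX": "Woman", "AGE": None},  # age not recorded
]

women = filter_enum(records, "SEX", {"Woman"})
aged_30_to_50 = filter_range(records, "AGE", 30, 50)
print([r["id"] for r in women], [r["id"] for r in aged_30_to_50])
```

Records with no value for the filtered parameter are dropped by the range filter, since they cannot be shown to fall inside the range.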
13. [Screenshot: projection tree with nodes including Liver, Blood, P3G, ENGAGE and Undefined.]

Using different projections we can easily organize parameters in as many ways as required for a particular case.

Collections and Studies

What is a collection?

A collection in SAIL is an availability data set coming from one data provider, where all the samples have been annotated using a common vocabulary.

What is a study in SAIL?

A study is a way to group availability data coming from different collections, with the common denominator that the data has been used during the development of a study. This way of grouping data is useful when samples from many collections have been used and the user wants to keep track of which samples have been selected for each study. Samples that take part in a study must have two labels: one indicating whether the sample was eligible for the study, and another showing whether the sample has been used during the study. Samples do not need to be eligible in order to be used in a study: eligibility only means that the sample has the feature under study, and we may also want to add some control samples. For example, in a study of diabetes we may want to use samples where the individual has diabetes (eligible) and samples where the individual is healthy (non-eligible).

Parameters import

Parameters can be entered into the system using the parameter edit form. This is a common form for both
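The two study labels described above (eligible, and used or selected) are independent of each other, which a small data model makes explicit. This is a sketch only: the class `StudySample`, its field names and the function `study_counts` are hypothetical and not taken from SAIL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudySample:
    sample_id: str
    collection: str
    eligible: bool  # the sample has the feature under study
    selected: bool  # the sample was actually used in the study


def study_counts(samples):
    """Return (total, eligible, selected) counts for a study."""
    total = len(samples)
    eligible = sum(s.eligible for s in samples)
    selected = sum(s.selected for s in samples)
    return total, eligible, selected


# Hypothetical diabetes study drawing on two collections.
samples = [
    StudySample("S1", "KoraF3", eligible=True, selected=True),   # case
    StudySample("S2", "KoraF3", eligible=False, selected=True),  # healthy control
    StudySample("S3", "DGI", eligible=True, selected=False),     # eligible, not used
]

print(study_counts(samples))
```

Sample S2 shows the case the text calls out: a control that is used in the study without being eligible.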
14. [Screenshot: parameter list next to exported parameter definitions. Each definition has the fields Parameter, Name, Description, Tag (Vocabulary) and Variable (with Type); the Affymetrix genome-wide genotyping parameters (Human SNP Array 5.0, 500k Set and 6.0, and Genome-wide genotyping) each carry an Availability variable of type BOOLEAN.]
15. [Table continued from the previous page; the leading Sample and Sex columns are inferred from the notes below it.]

Sample    Sex      AGE (Age)   GLUTM (Concentration)   GLUTM (Timing)
OX1A 01   Male     1           1                       Fasting
OX1A 02   Female   1           0
OX1A 03            0           1                       Non-fasting
OX1A 04            1           1                       Fasting

1. OX1A 01: sample has full information.
2. OX1A 02: sample has no information about glucose concentration.
3. OX1A 03: sample has no information about sex and age.
4. OX1A 04: sample has full information, but information about sex is not disclosed.

Report Constructor Interface

Now we are ready to get a sample availability report. To create a report, we prepare a request using the Report Constructor. The Report Constructor consists of two panels: the left panel presents the parameters as a plain list or as a projection tree, and the right panel holds the request.

[Screenshot: Report constructor with the parameter list (Code, Name, Description, Records) on the left and the Report request panel on the right.]
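The availability table at the start of this excerpt can be read as one row of flags per sample, and the Records figure for a parameter as the number of rows where that flag is set. A sketch under assumptions: the dictionary layout, the key names `GLUTM_CONC` and `GLUTM_TIMING`, the use of `None` for missing or undisclosed values and the function `availability` are illustrative choices, not SAIL's upload format.

```python
# Rows modelled on the example table; None means no (disclosed) information.
samples = {
    "OX1A 01": {"SEX": "Male", "AGE": 1, "GLUTM_CONC": 1, "GLUTM_TIMING": "Fasting"},
    "OX1A 02": {"SEX": "Female", "AGE": 1, "GLUTM_CONC": 0, "GLUTM_TIMING": None},
    "OX1A 03": {"SEX": None, "AGE": 0, "GLUTM_CONC": 1, "GLUTM_TIMING": "Non-fasting"},
    "OX1A 04": {"SEX": None, "AGE": 1, "GLUTM_CONC": 1, "GLUTM_TIMING": "Fasting"},
}


def availability(rows, code):
    """Count the samples that carry information for `code`.

    A flag of 0 and a missing value (None) both count as unavailable.
    """
    return sum(1 for row in rows.values() if row.get(code))


for code in ("SEX", "AGE", "GLUTM_CONC", "GLUTM_TIMING"):
    print(code, availability(samples, code))
```

Note that availability records whether a value exists, not the value itself: the 1 under AGE means an age is on record, not that the age is 1.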
16. [Screenshot: parameter list (Vocabulary: Mets) with columns Code, Name, Description, Filter and Records, from DB (Type of diabetes, 20080 records) through EDU, EXYR, FMHRT, FMSTRK and FMT2D to GLU (Glucose, 31068 records), GLUM (Glucose medication, 11339 records) and the Affymetrix genome-wide genotyping parameters.]

Step 3. Press the Add button at the bottom of the list.

[Screenshot: Report constructor with GLU Glucose added to the Report request panel.]
17. [Screenshot: exported parameter definitions whose relation entries (MetSRelations, Synonym) link AGE and KoraF3 RTALTERU, shown next to the parameter list and the request options.]

Step 4. To export a set of selected parameters, first click on the parameters you want to export in the parameter list. You can select a contiguous set by clicking the first parameter and then, keeping the Shift key pressed, clicking the last one; this selects all the parameters between the two. To select a subset of individual parameters, Ctrl-click (Cmd-click on a Mac) the name of each parameter you want.

[Screenshot: Report constructor with the Parameter list tab open (NO FILTER) and report tabs Report 1 to Report 5.]
18. [Screenshot: parameter list with columns Code, Name, Description, Filter and Records, from the KoraF3 age parameters through ALC Alcohol, Alcohol quantity, the antihypertensive parameters, APOB, BMI, Basal visit date, BASO and BICEPS to BYR and Birth month.]

Step 2. Pressing the Relations button in the parameter list panel opens a pop-up window that displays the names of the related parameters and the type of relation they have.
19. [Screenshot: exported parameter definitions next to the parameter list. Recoverable entries: AGE, with MetSRelations Synonym relations to KoraF3 RTALTERU and DGI VISITAGE and a variable of type INTEGER; Alcohol, with Synonym relations to KoraF3 RTALKKON and DGI ALCOHOL and a variable Status of type ENUM; Alcohol quantity (grams absolute ethanol per week), with a Synonym relation to DGI ALCODOS and a variable Quantity of type INTEGER; Body Mass Index kg/m2, with Tag Definition WHO.]
20. [Screenshot: parameter list with columns Code, Name, Description, Filter and Records, from the Affymetrix genome-wide genotyping parameters through AGE (33080 records), the KoraF3 age parameters, ALC Alcohol (13741), Alcohol quantity (16252), ANTIHYPR (11677), APOB (1781), BMI (32569) and Basal visit date (3142) to BASO and BICEPS, with the Use split by collection option visible.]
21. [Screenshot: collection edit form with URL and Contacts fields, a Classification section (Classifier, Tag, Info; classifier type Annotation) and the buttons Add Tag, Edit Tag, Remove Tag, Save and Cancel.]

What is a vocabulary in SAIL?

Vocabularies are lists of parameters that have been defined for describing samples. Different types of sample collections, studies, and even different users may use different terms when describing their samples. To harmonize the annotation of samples, improve compatibility among collections and ease querying, SAIL supports the creation of vocabularies and provides tools to relate terms from multiple vocabularies.

What is a parameter from the SAIL point of view?

The simplest and most commonly used case is a parameter that consists of a single variable. Such a parameter describes a single physical value, such as temperature or concentration. In addition to variables, every parameter has a Code, a Name and a Description. Code attaches a short and stable designation to each parameter. Name is a short but human-readable designation of the parameter; it can be translated into different languages if required. Description is the free-text part of the parameter structure; it can be as long as required and is expected to contain as much information about the nature and origin of the parameter as possible. The variable itself can belong to different types: ENUM, INTEGER, STRING, REAL, BOOLEAN, and a special type of Boolean called TAG.
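The parameter structure just described (Code, Name, Description, plus one or more typed variables) maps naturally onto a small data model. The sketch below is an assumption-laden illustration: the class names `VariableType`, `Variable` and `Parameter`, the `values` field for ENUM variables and the example values are hypothetical and do not reflect SAIL's internal schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class VariableType(Enum):
    """The variable types listed in the guide."""
    ENUM = "ENUM"
    INTEGER = "INTEGER"
    STRING = "STRING"
    REAL = "REAL"
    BOOLEAN = "BOOLEAN"
    TAG = "TAG"  # described as a special kind of Boolean


@dataclass
class Variable:
    name: str
    type: VariableType
    values: tuple = ()  # allowed values; only meaningful for ENUM variables


@dataclass
class Parameter:
    code: str         # short, stable designation
    name: str         # short, human-readable designation
    description: str  # free text, as long as required
    variables: list = field(default_factory=list)


# Illustrative single-variable parameter with example enumeration values.
sex = Parameter(
    code="SEX",
    name="Sex",
    description="Sex",
    variables=[Variable("Sex", VariableType.ENUM, ("Man", "Woman"))],
)
print(sex.code, sex.variables[0].type.value, sex.variables[0].values)
```

Keeping `variables` as a list leaves room for parameters with more than one variable, which the text implies by calling the single-variable case the simplest one.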
22. [Screenshot: parameter list from NEFA (Non-Esterified Fatty Acids), NEUT, PLAT, POP, RCELL, RESD, SET, SEX (33102 records), SMK, SMKQ1, SMKQ2, STRK, STUDY, SUBS, SUPRAI, TAP, Total Cholesterol and TG to TRICEPS, Zygosity twins, UA, WST, Waist to hip ratio and WT (Weight, kg), with the filter pop-up open for SEX (variable Sex, type ENUM, values Man and Woman).]
23. [Screenshot: Records column of the parameter list with the request options Use split by collection, Use relations, Specify collections and Specific relations.]

Step 2. Alternatively, you can select all the parameters between two selected ones by holding the Shift key while making your selection.

[Screenshot: Report constructor, Parameter list tab (Vocabulary: Mets), with columns Code, Name, Description, Filter and Records, listing APOB (1781 records), BASO (69), BICEPS (69), BMI (32569), BMO (6008), BP (32112), BYR (14994), CHD (1029), Cholesterol medication (14771) and CRP (23530).]
24. Step 9. Click on Specific relations and press Select.

[Screenshot: Report constructor with Vocabulary set to ANY and ALC Alcohol in the Report request panel; the parameter list shows DGI Affected_Status, the Affymetrix genome-wide genotyping parameters, AGE, the KoraF3 age parameters and the alcohol parameters.]
25. box: one can choose a tag of the appropriate classifier here. Only parameters with that tag will be shown. Together with classifiers, this combo box is used as a filter of parameters. The special value ANY means that parameters with any tag of the corresponding classifier will be shown.
Description column: part of the parameter description, or the entire description if it is short.
Search field: one can write a pattern to filter parameters. It is possible to choose which parts of a parameter (code, name, description) participate in filtering.
Filter column: by selecting this option you can specify a subset, by value, of the parameter you want to add to the query. For example, for a variable of type ENUM you can choose to display only entries with one particular value, and for variables with an integer value you can select entries within a range between a Min and a Max value.
Records column: the total count of database records that contain information described by this parameter.
Variables column: the number of variables available for the corresponding parameter.
Enumerations column: the total number of enumeration values for this parameter. Enumerations are both qualifiers and variables of type ENUM.
Filtered value: when the parameter added to the query has been filtered, it appears in the report request window with
26. correspond to Timing (when sample was taken) </int></void> <void method="add"><int>Variant ID, i.e. the variant ID that corresponds to fasting</int></void> </object></void></object></void></object></java>
Notice that predefined queries can be made by a combination of Parameters and subqueries.
Example of the SQL code to upload three predefined queries for Metabolic Syndrome (IDF, WHO and NCEP). IDs are based on actual parameterIDs in the current main SAIL instance; personal installations of the system may have different IDs assigned to the parameters, queries, parts and variants.
First create the entries for the 3 main predefined queries in the expression table:
  insert into expression values N IDF 2 IDF description for a person to be defined as having metabolic syndrome
  insert into expression values N WHO 2 WHO clinical criteria for Metabolic Syndrome
  insert into expression values N NCEP 3 NCEP definition for metabolic syndrome
IDF: WST > threshold or BMI > 30, and at least 2 of the following subqueries: TG; HDL and SEX, or HDL treatment; BP or ANTIHYPR; GLU fasting or DB Type2 or FMT2D
  insert into expression values N 1 Central Obesity
  insert into expression values N 2 Additional criteria IDF
  insert into expression values N 2 HDL and SEX
  insert into expression values N 1 BP or ANTHY
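The idea of a predefined query built from parameters and subqueries can be sketched in Python. This is only an illustration, not SAIL's implementation: the `Expression` class, the `holds` function and the reading of the numeric column as "minimum number of parts that must hold" are assumptions made for this example, and value filters (thresholds, fasting variants) and the HDL branch are left out.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Expression:
    """Hypothetical predefined query: at least `min_matches` parts must hold."""
    name: str
    min_matches: int
    parts: List[Union[str, "Expression"]] = field(default_factory=list)


def holds(expr: Expression, available: Set[str]) -> bool:
    """Return True if enough parts of `expr` are satisfied for one sample."""
    matches = 0
    for part in expr.parts:
        if isinstance(part, Expression):
            matches += holds(part, available)   # nested subquery
        else:
            matches += part in available        # plain parameter code
    return matches >= expr.min_matches


# Simplified, IDF-like structure (assumed grouping, no value filters).
central = Expression("Central Obesity", 1, ["WST", "BMI"])
additional = Expression("Additional criteria", 2, [
    "TG",
    Expression("BP or ANTIHYPR", 1, ["BP", "ANTIHYPR"]),
    Expression("GLU or DB or FMT2D", 1, ["GLU", "DB", "FMT2D"]),
])
idf_like = Expression("IDF-like", 2, [central, additional])

print(holds(idf_like, {"BMI", "TG", "ANTIHYPR"}))
print(holds(idf_like, {"BMI", "TG"}))
```

The nesting mirrors the way a query row can point either at a parameter or at another query, which is what lets one definition be reused as a subquery of another.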
27. creation and amending of parameters. This form has a basic section that includes the Code, Name and Description of the parameter, and six additional sections: Annotations, Variables, Qualifiers, Inheritance, Classification and Relations.
[Screenshot: Edit parameter form for the parameter with Code SMKQ1, Name "Smoking quantity 1" and Description "number of cigarette day at time of phenotyping", with one variable, Number, of type INTEGER.]
Structured description section
Structured descriptions, or annotations, are like descriptions, but every such description has an attached tag of a classifier of type Parameter Annotation. Using structured descriptions allows the description of a parameter to be split into well-defined sections.
Variables section
The Variables section is for adding, editing and removing variables. Both owned and inherited variables are listed in this section. Only owned variables can be edited or removed. There is a form for manipulating variables.
[Screenshot: Edit Variable form with Name "Number", Type, Description "Number of cigarette per day", and the Predefined variants and Variants fields.]
The form allows changing the variable name, choosing the type for a new variable, and editing the description of the variable. For ENUM variables there are two additional fields. One is for choosing to use only predefined values; otherwise new case
28. 5. Click on Query to get your report.
[Screenshot: report (Total records 33284) for the three parameters named Alcohol, with codes ALC, KoraF3 RTALKKON and DGI ALCOHOL.]
29. 8. At the bottom of the Report request panel, click on Use relations. This will enable all types of relations.
[Screenshot: Report constructor, Parameter list with Vocabulary set to ANY.]
30. [Screenshot: Quick query button.]
And getting such a report:
[Screenshot: report for the parameter with code AGE and name Age and the parameter with code ALC and name Alcohol.]
That means that we have 33102 samples; 33080 of them have information about age and 13741 have information about alcohol; 13720 have information about both age and alcohol status.
We can make more interesting reports using enumerations. To add enumerations to a request, we first need to select a parameter with an enumerated variable or qualifier and then press the Filter button. A dialog with all available enumerations will appear. Here we can choose one of the available enumerations and add it to the request.
[Screenshot: Report constructor.]
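How the combined counts in such a report relate to each other can be shown with a small Python sketch. The data and the `count_with` helper are hypothetical and stand in for whatever SAIL does internally: each sample is reduced to the set of parameter codes for which it has data, and a combined count is the number of samples that have all requested codes.

```python
# Hypothetical availability data: sample ID -> parameter codes with data.
availability = {
    "S1": {"AGE", "ALC"},
    "S2": {"AGE"},
    "S3": {"ALC"},
    "S4": set(),
}


def count_with(availability, *codes):
    """Count samples that have data for every listed parameter code."""
    wanted = set(codes)
    return sum(1 for params in availability.values() if wanted <= params)


total = len(availability)
with_age = count_with(availability, "AGE")
with_alc = count_with(availability, "ALC")
with_both = count_with(availability, "AGE", "ALC")
print(total, with_age, with_alc, with_both)
```

The count for a combination can never exceed the count for any single parameter in it, which matches the report above: 13720 samples with both values is lower than both 33080 and 13741.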
31. 3. A new pop-up window appears with the visible parameters and their definitions.
[Screenshot: pop-up window listing each parameter with its Name, Description and variable Type.]
32. [Screenshot: Parameter list showing the Illumina genotyping array parameters.]
Selected parameters are shown in the Report request panel.
How to add group of parameters into a report
1. Select parameters from the parameter list (the left panel of the screen) while pressing the Ctrl key (cmd key on Mac) to add parameters into the group one by one.
[Screenshot: Report constructor.]
33. 2. Now select one of the parameters you want to add to the group and again add it to the query.
[Screenshot: Report constructor, Parameter list.]
34. 6. Select Export selected, and a new pop-up window will show the descriptions of the selected parameters.
[Screenshot: Report constructor.]
35. [Screenshot: Parameter tree tab, with parameters grouped under the Source vocabularies DGI, FinnTwin, General, KoraF3 and Mets.]
Hierarchies allow displaying parameter relations and inheritance. For example, a parameter called Familiar Diabetes will be rel
36. template. You can use Ctrl-click on the name of a parameter (cmd-click on Mac) to select parameters one by one, or you can Shift-click on the first and last parameters of a list to select all the parameters in between.
[Screenshot: administrator interface with the tabs Report constructor, Classifiers, Projections, Study, Collection, Collection view, Metadata and Import.]
37. 5. In the report you will get your parameters with the comment "filtered" and an indication of the filtering that was applied.
[Screenshot: report with Total records 33102 for the parameter with code EXYR and name Year examination, filtered by Year.]
How to use split by collection
When querying more than one collection at the same time, it may be useful to split the results by collection, so that it is easier to choose which data provider contains the data of interest. To split results by collection:
1. Select the parameters you are interested in and add them to the report request.
[Screenshot: Report constructor, Parameter list for the Mets vocabulary.]
38. 2. In the list of available options, select Export visible.
[Screenshot: Report constructor.]
39. Configuration of SAIL
The basic functionality of SAIL can be used without any extra configuration. If you want to enable the admin interface, which provides advanced functionality, see the instructions in the Configuration section.
Configuration
Only 2 files need to be modified in order to enable the admin interface. In the file CATALINA_HOME/webapps/WEB-INF/web.xml you need to describe the security role that enables the admin interface:
  <security-role>
    <description>The role that is required to log in to the SAIL</description>
    <role-name>SAILAdmin</role-name>
  </security-role>
To define the user and password for the security role defined above, you need to modify the file CATALINA_HOME/conf/tomcat-users.xml. You may need admin rights in order to edit this file. Within the section <tomcat-users> you need to create a role for the SAIL admin:
  <role rolename="SAILAdmin"/>
You also need to create the user name and the password to be used when logging in to the admin interface of SAIL:
  <user username="SAILAdmin" password="password" roles="SAILAdmin"/>
Every time a configuration file is changed, you will need to restart Tomcat.
Understanding SAIL: Parameters, Classifiers and Vocabularies
The main goal of developing SAIL is to produce a tool for sample availability reports. SAIL uses availability information that is collected from a number of repositories. It allo
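Before restarting Tomcat it can be convenient to check that the users file really contains both entries. The following Python sketch is not part of SAIL; the function name `has_admin_login` and the sample file content are made up for illustration, and it only assumes the `role` and `user` elements shown above.

```python
import xml.etree.ElementTree as ET


def has_admin_login(tomcat_users_xml, role="SAILAdmin"):
    """Check that `role` is declared and that some user is assigned to it."""
    root = ET.fromstring(tomcat_users_xml)

    def local(tag):
        # Drop an XML namespace prefix such as "{uri}role" if one is present.
        return tag.rsplit("}", 1)[-1]

    role_defined = any(
        local(el.tag) == "role" and el.get("rolename") == role for el in root
    )
    user_has_role = any(
        local(el.tag) == "user"
        and role in [r.strip() for r in el.get("roles", "").split(",")]
        for el in root
    )
    return role_defined and user_has_role


sample = """<tomcat-users>
  <role rolename="SAILAdmin"/>
  <user username="SAILAdmin" password="password" roles="SAILAdmin"/>
</tomcat-users>"""

print(has_admin_login(sample))
```

In practice the string would be read from CATALINA_HOME/conf/tomcat-users.xml; the check says nothing about web.xml, which has to be verified separately.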
40. EX NCEP
  insert into expression values N 1 BP or ANTIHYPR NCEP
  insert into expression content values 3 191 0 NULL
  insert into expression content values 3 49 0 NULL
  insert into expression content values 3 0 15 NULL
  insert into expression content values 3 0 16 NULL
  insert into expression content values 3 0 17 NULL
  insert into expression content values 15 32 0 NULL
  insert into expression content values 15 12 0 NULL
  insert into expression content values 16 45 0 NULL
  insert into expression content values 16 12 0 NULL
  insert into expression content values 17 56 0 NULL
  insert into expression content values 17 59 0 NULL
  update expression content set filter
    <?xml version="1.0" encoding="UTF-8"?>
    <java version="1.6.0_16" class="java.beans.XMLDecoder">
      <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
        <void property="Variants">
          <object class="java.util.ArrayList">
            <void method="add">
              <object class="java.util.ArrayList">
                <void method="add"><int>232</int></void>
                <void method="add"><int>396</int></void>
              </object>
            </void>
          </object>
        </void>
      </object>
    </java>
  where expressionID 3 and ParameterID 191
EXAMPLE FILES
These files are for reference only. If you want to use these files as a test in a self installation you need to create the correspo
41. European Bioinformatics Institute
SIMBioMS.org
SAIL User Guide
Version 1.01
INDEX
What is SAIL
Install SAIL
Understanding SAIL: Parameters, Classifiers and Vocabularies
What is a classifier in SAIL
What is a vocabulary in SAIL
Parameter classification
Collections and Studies
What is a Collection
What is a study
Parameter import
Structure description section
Variable section
Qualifiers section
Inherited parameters section
Classification section
Relations section
Parameters import file format vocabulary import
Data availability import file format
Report Constructor Interface
How to Use SAIL
Summary of Data
User interface
How to add a parameter into a report
How to add a group of parameters into a report
How to add a parameter to a group
How to add enumerated values of a parameter into a report
How to add value ranges for a parameter into a report
How to split by collection
How to combine parameters
Use split by collection
Use Parameter relations
Export list of parameters
Parameter trees and hierarchies
How to combine parameters
Administrator Interface
Produce a template for data import
Create new classifiers
Create projections
Create a new study
Create a new collection
Metadata Import relations and vocabulary upload
Application Design
SAIL Glossary
Appendix
Vocabulary import
Data availability import
Quick guide to create Data and vocabulary import files
Hidden feature Predefined quer
42. [Screenshot: report for the parameters AGE, ALC, BMI, DB (Type of diabetes) and EDU (Education).]
Use Parameter Relations
In SAIL, collections can be annotated using the same or different vocabularies. To facilitate queries across collections whose parameters are annotated with different vocabularies, SAIL makes use of relations. Relations can be used in your queries in different ways. One method is to add all the related parameters to your query:
1. Start by selecting the parameter that you want to add.
[Screenshot: Report constructor, Parameter list with Vocabulary set to ANY.]
43. 5. In the pop-up window, select the collections you want to use and click OK.
[Screenshot: Choose collections dialog with the Collection list.]
44. ENUM is used for variables that take one of a fixed number of string values, for example MALE and FEMALE. Here is an example of a simple parameter:
  Parameter AGE
  Name Age
  Description Age of patient
  Variable Age
  Type INTEGER
Example with an ENUM variable:
  Parameter TWINZYG
  Name Zygosity twins
  Description Type of twins zygosity
  Variable Type
  Type ENUM
  Variant monozygotic
  Variant dizygotic
  Variant opposite sex dizygotic
Another type of parameter is one that cannot be described by a single variable. For example, blood pressure can be defined with two INTEGER variables, one for systolic and one for diastolic pressure:
  Parameter BP
  Name Blood pressure
  Description Blood pressure
  Variable Systolic
  Type INTEGER
  Variable Diastolic
  Type INTEGER
The next case is when we have a value described by one variable and, in addition, need to attach some information about how this value was taken. For example, we need to measure temperature, but it is important when the measurement was made. So each temperature reading must be qualified by an enumeration: Morning, Afternoon. Parameters with qualifiers were introduced for this purpose. For instance:
  Parameter GLUTM
  Name Glucose w timing
  Description Glucose with timing mMol L
  Variable Concentration
  Type REAL
  Qualifier Timing
  Variant fasting
  Variant non fasting
Note that we can have as many qualifiers as required. The previous example introduces one problem. It may be cases
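The three kinds of parameter described here (single variable, several variables, variable plus qualifier) can be pictured as a small data model. The Python classes below are a hypothetical sketch for illustration only; the class and field names are not taken from SAIL, and only the example values come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variable:
    name: str
    type: str                                   # e.g. "INTEGER", "REAL", "ENUM"
    variants: List[str] = field(default_factory=list)   # used by ENUM only


@dataclass
class Qualifier:
    name: str
    variants: List[str]                         # the enumeration that qualifies a value


@dataclass
class Parameter:
    code: str
    name: str
    description: str
    variables: List[Variable] = field(default_factory=list)
    qualifiers: List[Qualifier] = field(default_factory=list)


# Two variables, no qualifier.
bp = Parameter(
    "BP", "Blood pressure", "Blood pressure",
    variables=[Variable("Systolic", "INTEGER"), Variable("Diastolic", "INTEGER")],
)

# One variable qualified by when the sample was taken.
glutm = Parameter(
    "GLUTM", "Glucose w timing", "Glucose with timing mMol L",
    variables=[Variable("Concentration", "REAL")],
    qualifiers=[Qualifier("Timing", ["fasting", "non fasting"])],
)

print([v.name for v in bp.variables], glutm.qualifiers[0].variants)
```

Keeping qualifiers separate from variables reflects the distinction made above: a qualifier does not hold a measured value, it describes the circumstances of the measurement.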
45. PR
  insert into expression values N 1 Glu Fasting or DB type 2 or FMT2D
  insert into expression content values 1 0 4 NULL
  insert into expression content values 1 0 8 NULL
  insert into expression content values 4 31 0 NULL
  insert into expression content values 4 32 0 NULL
  insert into expression content values 5 0 6 NULL
  insert into expression content values 5 0 7 NULL
  insert into expression content values 5 0 8 NULL
  insert into expression content values 6 45 0 NULL
  insert into expression content values 6 12 0 NULL
  insert into expression content values 7 56 0 NULL
  insert into expression content values 7 59 0 NULL
  insert into expression content values 8 191 0 NULL
  insert into expression content values 8 41 0 NULL
  insert into expression content values 8 28 0 NULL
  update expression content set filter
    <?xml version="1.0" encoding="UTF-8"?>
    <java version="1.6.0_16" class="java.beans.XMLDecoder">
      <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
        <void property="intRanges">
          <object class="java.util.ArrayList">
            <void method="add">
              <object class="uk.ac.ebi.sail.client.common.IntRange">
                <void property="limitLow"><int>30</int></void>
                <void property="limitHigh"><int>100</int></void>
                <void property="partID"><int>3 1</int></void>
              </object>
              <
46. Temperature
  Description Body temperature
  Variable Value
  Type REAL
Parameter is the code of the parameter. The code must be unique across all SAIL parameters. The best way to ensure uniqueness is to prefix the code with the name of the vocabulary (MetS in this example).
Name is the name of the parameter. It is not required to be unique.
Description is a free-text description of the parameter.
Variable introduces a new variable within the parameter. The name of a variable must be unique within a parameter.
Type can be one of the following: REAL, INTEGER, STRING, BOOLEAN, DATE, ENUM or TAG.
In some cases a parameter can have two or more variables:
  Parameter MetS BP
  Name Blood pressure
  Description Blood pressure measured according to standard technique
  Variable Systolic
  Description Systolic part of blood pressure
  Type INTEGER
  Variable Diastolic
  Description Diastolic part of blood pressure
  Type INTEGER
ENUM variables must be described in a special way. It must be declared whether the variable has predefined variants or not. Variants can also have a numeric value that is a reference to the real value:
  Parameter MetS SEX
  Name Sex
  Description Gender of a patient
  Variable Sex
  Type ENUM
  Predefined YES
  Variant Man
  Variant Woman 2
A list of variants can be left open:
  Parameter MetS RESD
  Name Country
  Description Country of residence
  Variable Country
  Type ENUM
  Predefined NO
Data availability import
The data import file contains t
47. [Screenshot: Parameter list showing the Affymetrix and Illumina genotyping array parameters.]
2. If the parameter is of type INTEGER and contains real values that can be used, you will see a range already displayed, with infinite lower and upper limits.
[Screenshot: Report constructor.]
4. Click OK again and your parameter will be added to the Report request panel.
[Screenshot: Report constructor, Parameter list for the Mets vocabulary.]
48. a set of parameters described as above, we need a way to classify them. Tags are used for classification. A tag is a simple string that is attached to a parameter. Every tag belongs to a tag class. A tag class denotes an entire classification field, and every tag stands for a particular region in that field. For example, suppose that we have a number of figures of different shapes and colors, and we need to classify them by these features. We need the tag classes Color and Shape. Then we need a number of tags within every class, e.g. Red, Blue and Green within Color, and Square and Circle within Shape.
[Figure: figures tagged by the Color class and by the Shape class (Circle, Square).]
We can consider the tag class as a classifier and tag values as classifier values. Every class has two options. The first is whether the classifier is mandatory, which means that every parameter must have one of the tags from this class. The second is whether the classifier allows several tags to be attached to one parameter.
[Screenshot: Edit classifier form with Name "Knowledge domain", Description "Knowledge domain to which parameter is related", the options "Allow multiple tags" and "This classifier is mandatory", and the tags systems biology, clinical trials and populational genomics.]
  Parameter GLU
  Name Glucose
  Description Glucose mMol L
  T
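The two classifier options can be read as simple validation rules on the tags attached to a parameter. The Python sketch below illustrates that reading; the `Classifier` class and `check_tags` function are hypothetical names chosen for this example, not SAIL code.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Classifier:
    name: str
    tags: Set[str]
    mandatory: bool = False        # every parameter must carry one of the tags
    allow_multiple: bool = False   # a parameter may carry several tags


def check_tags(classifier: Classifier, assigned: List[str]) -> List[str]:
    """Return a list of problems with the tags assigned to one parameter."""
    problems = []
    unknown = [t for t in assigned if t not in classifier.tags]
    if unknown:
        problems.append("unknown tags: " + ", ".join(unknown))
    if classifier.mandatory and not assigned:
        problems.append("classifier is mandatory but no tag is assigned")
    if not classifier.allow_multiple and len(assigned) > 1:
        problems.append("only one tag is allowed")
    return problems


color = Classifier("Color", {"Red", "Blue", "Green"}, mandatory=True)

print(check_tags(color, ["Red"]))
print(check_tags(color, []))
print(check_tags(color, ["Red", "Blue"]))
```

An empty list means the assignment is acceptable; each violated option adds one message.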
49. ab or comma delimited text where each row corresponds to one sample. Such a file can be produced by Excel or Open Office, so the format is described here as an Excel spreadsheet. SAIL can accept data in the following simple format:

    SAMPLE ID   Var 1 ref   Var 2 ref             Var 3 ref   Var 4 ref
    S1          1           (availability sign)   2.3         Male
    S2          0           (availability sign)   4.5         Female

    In this example Var 1 is a string variable with variants coded as 0 and 1, for example the presence of some disease. Var 2 may be a numeric or enumerated variable; the actual values are not disclosed, so the availability sign is used instead of values. Var 3 is a numeric variable with real values. Var 4 is an enumerated variable with real enumerated values, Male and Female.
    To annotate data, one should use references to the variables that are already described within SAIL. Variable references must be either in the form <Parameter code>, in cases when a parameter has only one variable and no qualifiers, or in the form <Parameter code> <varname> in the general case. Example:

    SAMPLE ID   SEX      BP Systolic   BP Diastolic
    S1          Male     120           80
    S2          Female

    In the next section a special format for the import of variables is described. To refer to a variable in a data submission header, the following format should be used: <Parameter code> <Variable name>. Example: MetS BP Systolic. If only one variable is defined for a parameter, then the <Variable name> part of the header can be skipped.
    SAMPLE MetS SEX MetS TEMP V MetS BP Syst MetS BP Diast Me
50. about Sex. A more complex example: we choose Age and then Alcohol.
    [Screenshot: Report constructor with the Parameter list open (Study: ANY, Collection: ANY, no filter), with the columns Code, Name and Description. The list runs from AGE (Age), AGEST (Gestational age) and ALC (Alcohol) through the MetS parameters to the DGI vocabulary entries.]
51. accepts comma or tab separated files that can easily be created with Excel or any other tool. Such CSV or TSV files represent a matrix where columns are variables or qualifiers of the corresponding parameters and each row contains information about a single sample. Here are the rules for preparing an import file:
    • The first row is the data header that defines the format of this data set. The column header format is <Parameter code> <Variable name>, e.g. TWINZYG Type. A column with a qualifier is described in a similar way: GLUTM Timing.
    • The first column must always be a set of sample IDs.
    • If a parameter has more than one variable or has qualifiers (own or inherited), the file must contain columns for every variable and/or qualifier of this parameter.
    • Availability information for variables of types STRING and BOOLEAN (including special TAG booleans) is represented by 0 and 1.
    • STRING, REAL or INTEGER variables can also use the real text or numeric value to show availability.
    • Values of enumerated variables or qualifiers must be represented by strings. Such strings must correspond to an enumerated value in the description of this variable or qualifier.
    • If some samples have no values for an ENUM, REAL or INTEGER variable or qualifier, then the corresponding cells can be empty.
    • If we need to designate that a variable or qualifier has a value but this value can't be disclosed, then the special symbol can be used.
    SAMPLEID ID SEX Sex
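The import rules above can be sketched as a small reader that turns a tab-separated file into per-sample availability flags. This is only an illustration of one reading of the rules, not SAIL's importer. Two details are assumptions because they are not legible in this text: the separator between parameter code and variable name (taken here to be `.`) and the symbol for an undisclosed value (taken here to be `@`). The function names are hypothetical.

```python
import csv
import io

SEPARATOR = "."    # assumption: character between parameter code and variable name
UNDISCLOSED = "@"  # assumption: marks "a value exists but is not disclosed"


def parse_header(column):
    """Split a column header into (parameter code, variable name or None)."""
    code, _, variable = column.partition(SEPARATOR)
    return code, (variable or None)


def is_available(cell, var_type):
    """Decide availability of one cell according to the rules listed above."""
    cell = cell.strip()
    if cell == "":
        return False  # empty cell: no value for this sample
    if cell == UNDISCLOSED:
        return True   # value exists but is hidden
    if var_type in ("STRING", "BOOLEAN") and cell == "0":
        return False  # explicit 0/1 availability flag
    return True       # flag 1, or a real text, numeric or enumerated value


def read_availability(text, types, delimiter="\t"):
    """Return {sample ID: {(code, variable): bool}} for a delimited data set."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(rows)
    columns = [parse_header(h) for h in header[1:]]  # first column holds sample IDs
    result = {}
    for row in rows:
        if not row:
            continue  # skip blank lines
        sample_id, cells = row[0], row[1:]
        result[sample_id] = {
            col: is_available(cell, types.get(col, "STRING"))
            for col, cell in zip(columns, cells)
        }
    return result


example = "SAMPLE\tSEX\tBP.Systolic\tBP.Diastolic\nS1\tMale\t120\t80\nS2\tFemale\t\t@\n"
types = {("SEX", None): "ENUM", ("BP", "Systolic"): "REAL", ("BP", "Diastolic"): "REAL"}
availability = read_availability(example, types)
```

For sample S2 the empty systolic cell counts as unavailable, while the undisclosed diastolic value still counts as available.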
52. ag: Ontology P3G. Tag: Ontology ENGAGE. Tag: Physical value Concentration. Tag: Knowledge domain Systems biology. Variable: Concentration. Type: REAL.
    Tags provide an easy way to group and search parameters. In SAIL one can also filter parameters by the value of a classifier. One useful way to represent parameters is in a tree-like projection. In a projection tree every layer of the tree corresponds to one of the classifiers and every branch corresponds to a tag from that classifier. Leaves of the tree correspond to parameters. When more than one classifier has been defined for our data set, we need to define the order of preference in which the different classifiers will be used when building the tree. This specification of the classifier hierarchy within a tree is known in SAIL as a projection.
    Let's consider an example:

    Parameter   Classifier: Tag
    P1          Ontology: P3G; Ontology: ENGAGE; Knowledge domain: Systems biology
    P2          Ontology: ENGAGE; Knowledge domain: Systems biology
    P3          Ontology: ENGAGE; Knowledge domain: Clinical trials
    P4          Ontology: P3G; Knowledge domain: Clinical trials
    P5          Ontology: ENGAGE

    Now we need to choose a projection. Let it be Ontology, Knowledge domain. The tree will then look like this:
    [Figure: tree with root Parameters. Branch P3G has the sub-branches Liver (P1) and Blood (P4). Branch ENGAGE has the sub-branches Liver (P1, P2), Blood (P3) and Undefined (P5).]
    If we change the order of the classifiers (Knowledge domain, Ontology), the tree will look as such: Parameters
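The way a projection turns tags into a tree can be sketched with a short recursive function: one tree layer per classifier, one branch per tag, parameters as leaves, and an "Undefined" branch for parameters with no tag in a classifier. This is a minimal illustration, not SAIL's code; `build_tree` is a hypothetical name and the tag assignments below are one reading of the example table.

```python
def build_tree(parameters, projection):
    """Nest parameters by classifier, one tree layer per classifier in `projection`.

    parameters: {parameter name: {classifier name: [tag, ...]}}
    projection: ordered list of classifier names
    """
    if not projection:
        return sorted(parameters)  # leaves: parameter names
    classifier, rest = projection[0], projection[1:]
    branches = {}
    for name, tags in parameters.items():
        # A parameter with several tags appears under each of them;
        # a parameter with no tag in this classifier goes to "Undefined".
        for tag in tags.get(classifier) or ["Undefined"]:
            branches.setdefault(tag, {})[name] = tags
    return {tag: build_tree(members, rest) for tag, members in branches.items()}


parameters = {
    "P1": {"Ontology": ["P3G", "ENGAGE"], "Knowledge domain": ["Systems biology"]},
    "P2": {"Ontology": ["ENGAGE"], "Knowledge domain": ["Systems biology"]},
    "P3": {"Ontology": ["ENGAGE"], "Knowledge domain": ["Clinical trials"]},
    "P4": {"Ontology": ["P3G"], "Knowledge domain": ["Clinical trials"]},
    "P5": {"Ontology": ["ENGAGE"]},
}

by_ontology = build_tree(parameters, ["Ontology", "Knowledge domain"])
by_domain = build_tree(parameters, ["Knowledge domain", "Ontology"])
```

Swapping the order of the classifiers in the projection changes which classifier forms the top layer, while the set of leaves stays the same.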
53. [Screenshot: Parameter list with the columns Code, Name, Description, Filter and Records, sorted by name. It mixes MetS, DGI and KoraF3 parameters, for example the Affymetrix genome-wide genotyping arrays, AGE (Age, 33080 records), KoraF3 age parameters, ALC (Alcohol, 13741 records), the antihypertensive treatment parameters, APOB (Apo B, 1781 records) and BMI.]
54. [Screenshot: Parameter list (monocytes, neutrophils, platelets, sex, smoking, stroke, total cholesterol, triglycerides, weight and others) followed by the resulting report. Total records: 33102.] parameter with code ALC and na
55. an icon of a small yellow funnel.
    • Name column: This column shows the names of parameters. If the request contains a group of parameters, then "Parameter group N" is shown here.
    • Query type: This allows you to define whether the parameters in the query all need to be available (AND) or at least one of them needs to be available (OR) for an entry to be counted as positive in the results report.
    • Remove from query button: Select from the list of parameters the one that you want to remove and click this button to remove it from the query.
    • Remove all: Click this option if you want to clean your query and remove all the selected parameters at once.
    • The Quick query button allows you to select one parameter from the Parameter list and make a simple query for availability.
    • The Add button allows you to add a parameter to the Report request view. An alternative way to add a parameter is to double-click its name.
    • Add to group allows you to group parameters into a set where at least one of the parameters needs to be available.
    • Relations button: This button opens a dialog that shows all relations of the selected parameter. Related parameters can then be added to the selection. This button is disabled if the selected parameter has no relations.
    • The Extra button displays a set of options to export or edit parameters.
    • If this check box is checked, then the result will be split first in relation to repository.
    • This radio button allows you to choose if either all repositories will participate
56. and to which classifier group they belong. More interesting is the option to create new classifiers or edit existing ones.
    1. To create a new classifier, start by selecting the Add button at the bottom of the screen.
    [Screenshot: Classifiers list with the columns Classifier, Description, Type, Mandatory and Tags. Entries: Vocabulary (Classification by relation to some dictionary; Parameter), Repository Description (Set of tags for structured description of Repository; Collection annotation), Classifier type (Classifier for classifiers classification; Classifier), MetSRelations (Metabolic syndrome vocabulary relations; Relation), Definition (Metabolic syndrome definitions; Parameter). None is mandatory.]
    2. In the new tab, called Add classifier, you have the following fields:
    • Name is how the classifier is going to be called.
    • Description (optional) holds the description of your classifier.
    • Type holds the type of classifier that you want to create. The main type is Parameter, and it is the one we will use in our example. For a description of the different types, see the section Understanding SAIL Classification.
    [Screenshot: Add classifier tab.] Name ClasiE
57. ated to parameters Mother diabetes, Father diabetes, Sibling diabetes and so on. All these parameters will inherit characteristics from Familiar diabetes, so the descriptions for the generic variables that describe diabetes have to be introduced only once. This is useful when creating a vocabulary to avoid redundancy. It also allows queries that use the more generic parameter and return all the samples that have been annotated using the more specific versions of the parameter.
    [Screenshot: Parameter hierarchy tab of the Report constructor for the DGI vocabulary. Parameters such as Sex, Birth year, Regular smoker and Alcohol appear at the top level. Family history heart disease (DGI FMHRT) contains Myocardial infarction in family (with Mother had MI, Siblings had MI, Children have MI) and Congestive heart in family (with Father, Mother, Sibling and Children had congestive heart).]
58. between parameters. For a review of how these files should be formatted, refer to the Appendix section. Uploading vocabularies and relations works the same way; the only difference is that you select the Upload vocabulary or the Upload relations button, depending on what type of data you want to upload. We are going to review how to upload a vocabulary file.
    1. Start by clicking the Upload vocabulary button.
    [Screenshot: Import tab with the Upload vocabulary and Upload relations buttons.]
    2. In the pop-up dialog box select Add.
    [Screenshot: File upload dialog with the Add, Close and Upload buttons.]
    3. Once you have selected the file to upload, click Upload.
    [Screenshot: File upload dialog listing Book1.txt with the state "Queued for upload".]
    4. Once the vocabulary has been uploaded successfully, close the dialog window.
    Application design
    SAIL was designed as a client-server application. The server part is written according to the Java Servlet specification and runs inside a Tomcat web application container. The client part is intended to be run inside common web browsers. The client application is based on the Google Web Toolkit (GWT) technology with the Ext JS widget library. GWT allows developing client applications using the Java programming language. Java code is translated into JavaScript to be ex
59. ction information.
    [Screenshot: Report constructor, Report 1. ALC (Alcohol) has been added to the Report request panel with filters, and the Parameter list below shows the columns Code, Name, Description, Filter and Records.]
60. d. As the Parameter code must be unique, it is recommended to give it a prefix followed by a colon. Such a prefix can be a short designation of the vocabulary to which the parameter belongs, e.g. MetS:Glu for Glucose concentration in the vocabulary for Metabolic syndrome.
    The Parameter row must be followed by a mandatory Name row:
    Name Parameter Name
    The name is a human-readable name for the parameter. There is no requirement of uniqueness for the parameter name. It can also contain spaces.
    The next row is a Description:
    Description Description text
    Description provides a description for the parameter. The description text can contain the end-of-line symbol \n, which will be translated into a new line. There is another way to provide multiline descriptions: the Description line can be repeated to provide separate lines of description. These two examples are equivalent:
    Description Description text line 1\nline 2\nand line3
    and
    Description Description text line 1
    Description line 2
    Description and line3
    The next, optional row is Annotation. An annotation is a structured description. It consists of several parts that are text with attached tags. Tags must belong to classifiers of the Parameter Annotation type. The text representation in an Annotation line is similar to that of a Description line. All Annotation lines with the same tag are merged together with a new line delimiter.
    Annotation Classifier name Tag name Description text
    The Inherit row is optional. It can be used in any part of
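The equivalence of the two ways to write a multiline description can be sketched as follows. This is an illustration only, assuming that the end-of-line symbol is written as a backslash followed by `n` and that each file row has already been split into a keyword and its text; `merge_description` is a hypothetical helper, not part of SAIL.

```python
def merge_description(rows):
    """Join all Description rows of one parameter into a single multi-line text.

    `rows` is a list of (keyword, text) pairs, already split from the file lines.
    A literal backslash-n inside the text is translated into a real newline,
    and repeated Description rows are joined with newlines.
    """
    lines = [
        text.replace("\\n", "\n")
        for keyword, text in rows
        if keyword == "Description"
    ]
    return "\n".join(lines)


# One Description row with embedded end-of-line symbols ...
single = [("Description", r"Description text line 1\nline 2\nand line3")]

# ... and the same description spread over three Description rows.
repeated = [
    ("Description", "Description text line 1"),
    ("Description", "line 2"),
    ("Description", "and line3"),
]

same = merge_description(single) == merge_description(repeated)
```

Both inputs produce the same three-line description, so `same` is true.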
61. d Param2 have to be true, and for Param5 OR Param6 we would define depth 1, as the query will be true if at least one of the parameters is true.
    • Description: Here the user can add a description of the query to better understand what it is coding.
    The expression table is used to define the queries and subqueries and their depths. In the expression content table the user specifies which parameters belong to a query or subquery, and also which subqueries are combined into a complex query. The columns available are:
    • ExpressionID: This column is used to define which expression from the expression table we are about to describe.
    • ParameterID (mutually exclusive with SubexpressionID): Here you add the ID of the parameters that you want to use in your expression. Adding a value here means that the system is going to check whether the specified parameter annotation is available for the existing samples.
    • SubexpressionID (mutually exclusive with ParameterID): If your query is formed by a combination of subqueries, you need to specify the IDs of the subqueries that you want to combine. In this column you specify the ID of the subquery from the expression table that you want to use.
    • Filter: For parameters where real data has been provided, it is possible to specify only a subset of values to be considered in the query. To do so, use the Filter column. There are two types of filters that you can apply:
      o Ranges: If your parameter contains nu
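The interplay of the expression table and the expression content table can be sketched as a recursive evaluation. This sketch assumes that depth means the minimum number of members (parameters or subexpressions) that must hold: 1 for an OR of two parameters and, by the same reading, 2 for an AND of two parameters. The data layout and the name `evaluate` are hypothetical, and filters are left out.

```python
def evaluate(expression_id, depths, contents, sample):
    """Return True if `sample` satisfies the expression.

    depths:   expression ID -> depth (assumed: minimum number of true members)
    contents: expression ID -> list of ("param", id) or ("sub", id) members
    sample:   set of parameter IDs available for this sample
    """
    hits = 0
    for kind, member in contents[expression_id]:
        if kind == "param":
            hits += member in sample  # parameter annotation available?
        else:
            hits += evaluate(member, depths, contents, sample)  # subexpression
    return hits >= depths[expression_id]


# E1 = Param1 AND Param2, E2 = Param5 OR Param6, E3 = E1 AND E2
depths = {"E1": 2, "E2": 1, "E3": 2}
contents = {
    "E1": [("param", "Param1"), ("param", "Param2")],
    "E2": [("param", "Param5"), ("param", "Param6")],
    "E3": [("sub", "E1"), ("sub", "E2")],
}

full = evaluate("E3", depths, contents, {"Param1", "Param2", "Param6"})
partial = evaluate("E3", depths, contents, {"Param1", "Param5"})
```

The first sample satisfies both subqueries, so `full` is true; the second lacks Param2, so E1 fails and `partial` is false.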
62. [Screenshot: Edit classifier tab with the tags Cold Params, Warm Params and Hot Params, and the Select Tag dialog offering the tags Relation, Annotation, Parameter and Classifier.]
    10. Select the type of tag you want to add. These tags have been defined in a classifier of type Classifier that holds definitions of classifier types, to help with grouping classifiers by type.
    [Screenshot: Edit classifier tab. Name: ClasiExample. Description: Example of a classifiers. Type: Classifier. Tags: Cold Params, Warm Params, Hot Params.]
    11. Click Save to finish editing.
    [Screenshot: Classifiers list, now including ClasiExample (Example of a classifiers; type Classifier; not mandatory).]
    The Classifier tree tab allows displaying classifiers grouped by type as well as browsing what tags have been defined for a classifi
63. [Screenshot: Parameter list with the columns Code, Name, Description, Filter and Records, and the options Use split by collection, Use relations, Specify collections and Specific relations.]
    10. Select Synonym and press OK.
    [Screenshot: Report constructor with the tabs Report 1 to Report 4 and Alcohol in the Report request panel; the Parameter list is shown for Vocabulary: ANY.]
64. [Screenshot: Parameter list showing the genome-wide genotyping parameters, with the options Use split by collection, Use relations, Specify collections and Specific relations.]
    3. Double-click the displayed range. In the new Range pop-up, add the lower and upper limit values and click OK.
    [Screenshot: Report constructor with the Parameter list filtered by the MetS vocabulary. EXYR (Year examination) is selected, its variable Year is of type INTEGER, and the Edit Range dialog shows the Lower limit and Upper limit fields.]
65. [Screenshot: Parameter hierarchy for the DGI vocabulary, continuing with Family type 2 diabetes (Mother DM, Father DM, Sibling DM, Children DM, Other relatives with DM), Weight, Height, BMI, Waist, Hip, Waist/Hip, Fasting glucose, Glucose type, Fasting insulin and Insulin type.]
    How to combine parameters
    1. Combine parameters. There are two possibilities:
    • Logical AND: Selected parameters that are listed in the Report request panel are combined with the logical operation AND. The report in SAIL is created by gradually adding parameters to the request. For instance, if the parameters are listed in the order MetS SEX and MetS GLU, then first the samples with a provided gender are selected from the database, and second the samples with a measured glucose level are selected among them. You can use the Up and Down buttons at the bottom of the Report request panel to change the order of the selected parameters. See the sections How to add parameter into a report and How to add possible values of parameter into a report.
    • Logical
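The gradual selection described for logical AND can be sketched as a funnel: each parameter in the request narrows the set of samples left by the previous one. This is an illustration with made-up sample data, not SAIL's counting engine; `funnel` is a hypothetical name.

```python
def funnel(samples, request):
    """Apply the requested parameters in order and count the samples left after each.

    samples: {sample ID: set of parameter codes available for that sample}
    request: ordered list of parameter codes, combined with logical AND
    """
    remaining = set(samples)
    counts = []
    for parameter in request:
        remaining = {s for s in remaining if parameter in samples[s]}
        counts.append((parameter, len(remaining)))
    return counts


# Made-up availability data for three samples.
samples = {
    "S1": {"SEX", "GLU"},
    "S2": {"SEX"},
    "S3": {"GLU"},
}

sex_first = funnel(samples, ["SEX", "GLU"])
glu_first = funnel(samples, ["GLU", "SEX"])
```

Changing the order of the parameters changes the intermediate counts, while the final count of samples that have both parameters stays the same.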
66. ect itself but values of variables. For example, a qualifier can describe when a temperature variable was measured, like MORNING or AFTERNOON.
    Parameter inheritance
    In some cases a parameter doesn't represent a new notion but extends the view of an existing parameter. In such cases we can use one parameter as a base to produce the new one. The new parameter will have all variables and qualifiers of the inherited parameter plus its own.
    Appendix
    Vocabulary import
    Data in SAIL is described by a set of parameters. A parameter represents one phenotype entity. Such entities can be represented by single measurable values like Height or Weight, by a few values (blood pressure: systolic and diastolic), or by an even more complex set of values for measurements with attached conditions. Every single value is represented by a variable. Parameters in SAIL usually consist of one or more variables. There can be cases when a parameter contains no variables at all. For example, when you want to describe the same parameter in two different languages, you can create the first parameter with the full description and list of variables, and then create a second parameter with the new name in a different language, an Inherited tag pointing to the original parameter, and no extra information. A variable in SAIL can't exist outside of the parameter context. In most cases one parameter contains only one single variable.
    Example 1. Temperature
    Parameter MetS Temp Name
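Parameter inheritance as described above can be sketched with a small data model in which a derived parameter sees the variables and qualifiers of its base plus its own. This is a hypothetical model for illustration, not SAIL's internal representation; the parameter codes `TEMP` and `TEMP_ALT` and the variable name `Value` are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    """Hypothetical model of a parameter with optional inheritance."""
    code: str
    variables: dict = field(default_factory=dict)   # variable name -> type
    qualifiers: dict = field(default_factory=dict)  # qualifier name -> allowed values
    inherits: "Parameter | None" = None

    def all_variables(self):
        """Own variables plus everything inherited from the base parameter."""
        inherited = self.inherits.all_variables() if self.inherits else {}
        return {**inherited, **self.variables}

    def all_qualifiers(self):
        """Own qualifiers plus everything inherited from the base parameter."""
        inherited = self.inherits.all_qualifiers() if self.inherits else {}
        return {**inherited, **self.qualifiers}


# Base parameter with one variable and one qualifier.
temperature = Parameter(
    code="TEMP",
    variables={"Value": "REAL"},
    qualifiers={"Timing": ["MORNING", "AFTERNOON"]},
)

# A second parameter (for example the same notion named in another language)
# that has no variables of its own and only points to the original.
temperature_alt = Parameter(code="TEMP_ALT", inherits=temperature)

inherited_variables = temperature_alt.all_variables()
inherited_qualifiers = temperature_alt.all_qualifiers()
```

The derived parameter declares nothing itself, yet it exposes the variable and the qualifier of the base parameter.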
67. ecuted inside the client browser. The Ext JS widget library provides a rich set of well-developed widgets for building a program-like interface for a web page. This set of widgets consists of windows, menus, tool bars, etc. Google Web Toolkit also provides means for seamless client-server interactions and enables a software engineer to reference server methods from client Java code.
    [Figure: Web browser containing the browser DOM and the GWT environment, communicating over HTTP.]
    The server part of SAIL is mostly a kind of database management layer. It controls the creation and modification of SAIL objects like parameters, classifiers and so on. Another important part of SAIL's backend is a module that deals with data and metadata import. It parses, validates and loads data and metadata into a database. The final component is the sample counting engine, which processes queries and counts availability based on the results of the query.
    The database itself can roughly be divided into metadata storage and data availability storage. Metadata storage contains information about parameter structure, classification, relations, repositories, classifiers and projections. Data availability storage contains information about samples, such as identifiers, relations to repositories, relations to parameters and, finally, availability information.
    [Figure: SAIL web application, reached over XML/HTTP, consisting of the counting engine, the object controller and the data/metadata import module.]
    SAIL Glossary
    Parameter: Parameter serves for the descr
68. en creating a parameter we can then specify to which vocabulary it belongs by adding the corresponding tag.
    [Screenshot: Edit classifier tab. Name: Vocabulary. Description: Classification by relation to some dictionary. Tags: MetS, FinnTwin, General, KoraF4, KoraF3, DGI. Classification: Classifier type, Parameter.]
    • Classifier: Classifiers of type Classifier are used to add information to other types of classifiers so that they can be easily organized. This is useful when there are many classifiers defined in SAIL and a user wants to filter them to one specific type in order to find the classifiers they want to edit.
    [Screenshot: Edit classifier tab. Name: Classifier type. Description: Classifier for classifiers classification. Tags: Relation, Annotation, Parameter, Classifier.]
69. en creating a summary of the contents of a collection.
    An optional Description line can set a description for the variable. The syntax is the same as for the parameter's description.
    If the type of the variable is ENUM, then Predefined and Variant lines can be entered in the context of the variable.
    The Variant row provides values for an enumerated variable. The syntax is as follows:
    Variant variant string variant coding
    The variant coding is an integer number that is used if the variable has a numeric representation of variants. The variant coding is optional. The Variant line should be repeated for every enumerated value.
    The Predefined row takes the form:
    Predefined yes or no
    The Predefined row determines whether this variable can only accept values from a predefined set or whether it can include new values that come along with the data.
    The Qualifier row starts a new context within the parameter context. It terminates a previous variable or qualifier context if one exists. The Qualifier row syntax is as follows:
    Qualifier qualifier name
    A qualifier section is similar to an ENUM variable section, except that the Type row is not used within the qualifier context, because a qualifier is always enumerated.
    Data availability import file format
    Having a system to describe, create and organize parameters, we can now concentrate on sample availability counts. SAIL has its own database that holds information about sample availability. Before using the system we need to import/export such information. SAIL
70. er.
    [Screenshot: Classifier tree tab showing Classifiers by type, Classifier type.]
    Create projections
    Projections are ways to organize data based on the tags used to define parameters. Projections are used in the user interface, in the Parameter tree and Parameter list sections, to show how a tree should be built or to filter parameters. To learn more about the use of projections, read the section Understanding SAIL Classification.
    1. To create a new projection, start by selecting Add.
    [Screenshot: Projections tab with the columns Projection and Classifiers. Existing projections: Source (Vocabulary) and Classifiers by type (Classifier type). Add and Edit buttons.]
    2. Name the newly created projection.
    [Screenshot: Add projection tab. Name: TestProjection. Empty Classifiers table with the Add, Remove, Up, Down, Save and Cancel buttons.]
    3. In the Classifiers section select Add.
    [Screenshot: list of available classifiers with the columns Classifier, Description, Type, Mandatory and Tags: Vocabulary (6 tags), Repository Description (5 tags), Classifier type (4 tags), MetSRelations.]
71. [Screenshot: Parameter list with the Description and Records columns, and a Report request panel containing Age, Alcohol, BMI, Type of diabetes and Education.]
    3. Click Query to see your results.
    [Screenshot: Report 4. Total records: 33284. The report shows counts and percentages per collection for each requested parameter. The legend reads: parameter with code AGE and name Age; parameter with code ALC and name Alcohol.] paramete
72. [Screenshot: Parameter list filtered by the MetS vocabulary, with the columns Code, Name, Description, Filter and Records, from APOB (Apo B, 1781 records) to GLUM (Glucose medication, 11339 records) and the Affymetrix genome-wide genotyping parameters. The Report request panel contains DMI (Date first MI) and BP (Blood pressure).]
73. [Screenshot: Report request options.] The report will look like this: [Screenshot: Report 1, total records 33102, with the legend entries "parameter with code ALC and name Alcohol filtered out by Status is passed", "... Status is never" and "... Status is current".] Here we can see all values of the enumeration and the corresponding counts.

By default, parameters added to a query are combined with the AND option, which means that both parameters have to be available in a sample for it to be counted as positive. Another option is to select OR as the linker, which means that at least one of the parameters has to be present. To do this, select several parameters in the list, then click the OR button to change the type of request. [Screenshot: Report constructor with Study ANY, Collections ANY and NO FILTER selected.]
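The difference between the two linkers can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not SAIL's implementation: the function name `count_samples` and the toy data are assumptions made for the example.

```python
def count_samples(samples, codes, linker="AND"):
    """Count samples that have data for the given parameter codes.

    samples: mapping of sample ID -> set of parameter codes with data
             (hypothetical in-memory stand-in for the availability data).
    linker:  "AND" requires every code, "OR" requires at least one.
    """
    if linker not in ("AND", "OR"):
        raise ValueError("linker must be 'AND' or 'OR'")
    combine = all if linker == "AND" else any
    return sum(
        1 for available in samples.values()
        if combine(code in available for code in codes)
    )


# Toy availability data, invented for the example.
samples = {
    "s1": {"ALC", "SMK"},
    "s2": {"ALC"},
    "s3": {"SMK"},
    "s4": set(),
}

print(count_samples(samples, ["ALC", "SMK"], "AND"))  # 1 (only s1 has both)
print(count_samples(samples, ["ALC", "SMK"], "OR"))   # 3 (s1, s2 and s3)
```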
74. [Screenshot: parameter list with the "Select relations" dialog open, offering the MetSRelations classifier with the relations Synonym and Partial match; parameters from individual collections (for example KoraF3 and DGI entries for Age, Alcohol, Antihypertensives and BMI) are listed next to the corresponding MetS parameters.]
75. [Screenshot: Edit classifier dialog with the classifier type selector and the Add Tag, Remove Tag, Save and Cancel buttons.]

• Relation: These classifiers are used to specify the types of relations between parameters allowed for a specific vocabulary.

[Screenshot: Edit classifier dialog. Name: MetSRelations. Description: Metabolic syndrome vocabulary relations. Tags: Synonym, Partial match.]

• Collection, Study and Parameter Annotation: These three types of classifiers are used to add descriptions to the collections, studies or parameters themselves, and not to the data they hold. For example, a Collection classifier can be used to describe a repository, and it may have tags recording who the data provider is and how to contact the person responsible for the repository. This is useful information that is not directly related to the sample data.

[Screenshot: Edit classifier dialog. Name: Repository Description. Description: Set of tags for structured description of Repository. Tags: Description, Country, Owner
76. [Screenshot: MetS parameter list with record counts.] 4. Now select the parameter that you want to add and press the "Add to group" button. [Screenshot: Report constructor with vocabulary MetS; the Report request panel contains AGE, SEX, CHD and CM.]
77. ies Example Files Vocabulary Data Availability Relations Study

What is SAIL

SAIL (SAmple availability index) is a web application that provides a federated view of data availability across different data repositories. SAIL has been designed to handle both data availability matrices and metadata, that is, the data used to describe availability data. SAIL is divided into two user interfaces, one for end users and one for administrators. The user interface provides the means to browse metadata and to prepare and view availability reports. The administrator interface is intended for editing metadata and the underlying structures, and for importing both metadata and availability data.

Install SAIL

Pre-requisites

To install SAIL, the following software needs to be preinstalled on your system:
• Apache Tomcat
• MySQL database

Installation procedure

SAIL is distributed as a package that consists of 3 files:
• SAIL.war: Web Application Archive file. This is the application file.
• sail.xml: an example config file.
• sail_schema.sql: contains the schema that needs to be loaded into the database.

After downloading the distribution package from http://www.simbioms.org (downloads page, SAIL section), extract the three files into a local folder.

Installation steps

1. First you need to load the database schema. There are two ways to do this:
a. You can create the database and load the schema from the command prompt wi
78. iption of one particular characteristic of an object. Such a characteristic can be a simple property of the object, like human height; a measured value, like temperature; or some binary state, like whether a patient has a disease or not. Characteristics can be more complex, like blood pressure, which requires two values to be described. Parameters have:
• a code
• a name
• a description
• a set of variables (1 or more)
• a set of qualifiers (0 or more)

Parameter code

The parameter code is a short alphanumeric identifier of a parameter. The code must be unique across the entire set of parameters. The code uses only Latin characters and is localization independent.

Variable

A variable is the mandatory part of a parameter (there can be exceptions: parameters with no variables). Parameters can have one or several variables. A variable describes an atomic property of a parameter. In most cases such a property is a numeric representation of some physical value, such as concentration, temperature, pressure and so on. In other cases a variable describes non-numeric properties, for example free-text descriptions, an enumerated property of an object or a boolean state of an object. Variables have a name, a description and a type. The set of types is fixed and consists of ENUM, STRING, INTEGER, REAL, BOOLEAN and TAG. ENUM type variables can have a set of allowed values.

Qualifier

A qualifier is an optional part of a parameter. A qualifier is similar to an enumerated variable but, in contrast to a variable, it doesn't describe the obj
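The parameter model described above can be sketched as a small set of Python data classes. This is an illustrative sketch only: the class and field names are assumptions made for the example and do not come from the SAIL code base, and the qualifier is modelled minimally because its full description is not available here.

```python
from dataclasses import dataclass, field
from typing import List

# The fixed set of variable types listed in the text.
VARIABLE_TYPES = {"ENUM", "STRING", "INTEGER", "REAL", "BOOLEAN", "TAG"}


@dataclass
class Variable:
    name: str
    type: str
    description: str = ""
    # Only meaningful for ENUM variables (assumption of this sketch).
    allowed_values: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.type not in VARIABLE_TYPES:
            raise ValueError(f"unknown variable type: {self.type}")


@dataclass
class Qualifier:
    # Hypothetical minimal shape: a name plus its enumerated values.
    name: str
    allowed_values: List[str] = field(default_factory=list)


@dataclass
class Parameter:
    code: str
    name: str
    description: str = ""
    variables: List[Variable] = field(default_factory=list)
    qualifiers: List[Qualifier] = field(default_factory=list)


# Blood pressure needs two values, so it carries two variables.
bp = Parameter(
    code="BP",
    name="Blood pressure",
    description="Blood pressure systolic/diastolic (mm Hg)",
    variables=[Variable("Systolic", "INTEGER"), Variable("Diastolic", "INTEGER")],
)
```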
79. is row oriented. That means that every unit takes one row. Rows in turn can have several parts, separated by tabs or, in the case of Excel, located in several cells in one row. Any number of empty rows is allowed in any part of the file. Such rows can be used as separators to make the file easier for humans to read. Every non-empty row must begin with one of the following keywords:
• Parameter
• Tag
• Variable
• Qualifier
• Description
• Variant
• Inherit
• Relation
• Type
• Predefined
• Annotation

Every keyword can be used only in the appropriate context. The file begins with a global context. In the global context only Tag and Parameter rows are allowed. A Tag row in the global context provides a tag description that will be applied to every parameter defined within this file. The Tag line syntax is as follows:

Tag [tab] Classifier Name [tab] Tag Name

For example:

Tag [tab] Vocabulary [tab] MetS

The global context can contain as many Tag lines as required, or none at all. A Parameter row starts the parameter context. This context spans until the next Parameter row or the end of the file. There is no way to go back from the parameter context to the global context. All keywords can be used inside the parameter context, but there are some rules. The Parameter line syntax is as follows:

Parameter [tab] Parameter Code

The parameter code is the unique identifier of the parameter. The code must consist of the symbols a-z, A-Z, 0-9 and colon. Spaces are not allowe
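The context rules above can be sketched as a small reader for the tab-separated form of the file. This is a minimal illustration, not SAIL's importer: the function name `parse_rows` and the returned structure are assumptions, and the keyword-specific rules inside the parameter context are not checked because they are not fully described here.

```python
KEYWORDS = {
    "Parameter", "Tag", "Variable", "Qualifier", "Description", "Variant",
    "Inherit", "Relation", "Type", "Predefined", "Annotation",
}
GLOBAL_KEYWORDS = {"Tag", "Parameter"}


def parse_rows(text):
    """Split a tab-separated vocabulary file into global tags and parameters."""
    global_tags = []   # (classifier name, tag name) pairs from the global context
    parameters = []    # one dict per Parameter row, in file order
    current = None     # None while we are still in the global context

    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # empty rows are allowed anywhere
        parts = [part.strip() for part in line.split("\t")]
        keyword = parts[0]
        if keyword not in KEYWORDS:
            raise ValueError(f"line {line_no}: unknown keyword {keyword!r}")
        if keyword == "Parameter":
            if len(parts) < 2 or not parts[1]:
                raise ValueError(f"line {line_no}: missing parameter code")
            current = {"code": parts[1], "rows": []}
            parameters.append(current)
        elif current is None:
            if keyword not in GLOBAL_KEYWORDS:
                raise ValueError(
                    f"line {line_no}: {keyword} is not allowed in the global context"
                )
            global_tags.append(tuple(parts[1:3]))
        else:
            current["rows"].append(tuple(parts))
    return global_tags, parameters


sample = "Tag\tVocabulary\tMetS\n\nParameter\tAGEVIS\nVariable\tAge\nType\tINTEGER\n"
tags, params = parse_rows(sample)
```

Once a Parameter row has been seen, `current` is never reset to `None`, which mirrors the rule that there is no way back to the global context.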
80. [Screenshot: Report constructor parameter list with NO FILTER selected, showing Code, Name, Description, Filter and Records columns; parameters from all collections are listed together, for example the DGI, KoraF3 and MetS entries for Age, Alcohol, Antihypertensives and BMI.]
81. [Screenshot: MetS parameter list with record counts.] 3. To add a parameter to a group, select in the Report request window (the one on the right) the parameter that will be part of the group. Once it is selected, the "Add to group" button will become available. [Screenshot: Report constructor with vocabulary MetS; the Report request panel contains AGE, SEX and CHD.]
82. [Screenshot: Report constructor with the MetS parameter list, from Platelets to Waist to hip ratio; the Report request panel contains SEX.] [Screenshot: report with total records 33102; SEX: 33102 (100%); legend: parameter with code SEX and name Sex.] It means that we have 33102 samples in the database and all of them have information
83. me Alcohol; parameter with code SMK and name Smoking status. [Screenshot: report for ALC and SMK.] This report means that the entire database contains 6120 records: 5301 of them have information about smoking status, 5677 about alcohol status, and 5699 have smoking status or alcohol status or both.

How to use SAIL

1. Open the link http://www.ebi.ac.uk/Tools/sail
2. From the welcome page, in the top left corner, you can select the tabs to browse a summary of the existing collections or to make a query (red arrows). Alternatively, you can select one of the options in the menu at the bottom right (green).
3. Use the Summary to view general information about the collections.
4. You can extend the available information about a collection by selecting the inverted triangle at the right side of each collection.

[Screenshot: Summary tab listing the collections.]
DGI: Records 3142, Last update 28 April 2009 16:16:38
ERF: Records 3205, Last update 24 April 2009 00:00:00
KoraF4: Records 1814, Last update 23 April 2009 00:00:00
KoraF3: Records 1644, Last update 23 April 2009 00:00:00
UK Twins: Records 6008, Last update 09 March 2009 00:00:00
EGP: Records 998, Last update 14 January 2009 00:00:00
STR: Records 8467, Last update 14 January 2009 00:00:00
STR, expanded: Repository Description, Description: Swedish Twin Registry (STR) cohort; Repository Description, Country: Sweden; Repository Description, Owner: Karolinska Institutet; Repository Descrip
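The report figures quoted at the start of this item (6120 records, 5301 with smoking status, 5677 with alcohol status, 5699 with either) also determine two numbers that the report does not show directly. The short calculation below is a worked example using inclusion-exclusion; the variable names are chosen for the illustration only.

```python
total = 6120      # records in the database
smoking = 5301    # records with smoking status
alcohol = 5677    # records with alcohol status
either = 5699     # records with smoking status or alcohol status or both

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
both = smoking + alcohol - either
# Records with neither parameter are those outside the OR result.
neither = total - either

print(both)     # 5279
print(neither)  # 421
```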
84. meric values, you can specify a filter so that only those samples whose value falls within a range are selected. The filter would look like:

    <?xml version="1.0" encoding="UTF-8"?>
    <java version="1.6.0_16" class="java.beans.XMLDecoder">
     <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
      <void property="intRanges">
       <object class="java.util.ArrayList">
        <void method="add">
         <object class="uk.ac.ebi.sail.client.common.IntRange">
          <void property="limitLow">
           <int>Minimum Value, i.e. 10</int>
          </void>
          <void property="limitHigh">
           <int>Maximum Value, i.e. 30</int>
          </void>
          <void property="partID">
           <int>Part ID of the parameter we want to filter</int>
          </void>
         </object>
        </void>
       </object>
      </void>
     </object>
    </java>

o Variants: If your parameter is an enumeration, you can use a filter that selects only those samples where the value of the parameter is a specific variant. The filter would look like:

    <?xml version="1.0" encoding="UTF-8"?>
    <java version="1.6.0_16" class="java.beans.XMLDecoder">
     <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
      <void property="Variants">
       <object class="java.util.ArrayList">
        <void method="add">
         <object class="java.util.ArrayList">
          <void method="add">
           <int>Part ID, i.e. the part ID that
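Writing the range filter by hand is error prone, so it can help to generate it. The sketch below builds the same element structure as the range filter shown above using Python's standard `xml.etree.ElementTree`. The helper name `build_int_range_filter` is an assumption for the example; the class names and the `version` attribute are copied from the filter text, and nothing here has been checked against a running SAIL instance.

```python
import xml.etree.ElementTree as ET


def build_int_range_filter(low, high, part_id):
    """Return the XML text of a ComplexFilter holding one IntRange."""
    java = ET.Element(
        "java", {"version": "1.6.0_16", "class": "java.beans.XMLDecoder"}
    )
    complex_filter = ET.SubElement(
        java, "object", {"class": "uk.ac.ebi.sail.client.common.ComplexFilter"}
    )
    ranges = ET.SubElement(complex_filter, "void", {"property": "intRanges"})
    range_list = ET.SubElement(ranges, "object", {"class": "java.util.ArrayList"})
    add_call = ET.SubElement(range_list, "void", {"method": "add"})
    int_range = ET.SubElement(
        add_call, "object", {"class": "uk.ac.ebi.sail.client.common.IntRange"}
    )
    # One <void property="..."><int>value</int></void> per IntRange property.
    for prop, value in (("limitLow", low), ("limitHigh", high), ("partID", part_id)):
        holder = ET.SubElement(int_range, "void", {"property": prop})
        ET.SubElement(holder, "int").text = str(value)

    body = ET.tostring(java, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example: keep samples whose value lies between 10 and 30 for part 7
# (the part ID 7 is made up for the illustration).
filter_xml = build_int_range_filter(10, 30, 7)
```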
85. [Screenshot: Report constructor with vocabulary MetS; parameter list from FAMH (Family History) to the Illumina genome-wide genotyping arrays with record counts, followed by a second view of the same list starting at BP (Blood pressure).]
86. mport sample relation
2. In the Add study window, add the name of the Study you want to create. If you want to add any descriptors, click on Add.
3. Here you can add the name of your descriptor or select an existing tag by clicking on Change Tag.
4. If you select Change Tag, a new pop-up window with the available classifiers will be displayed. Here you can select the one you want to use and click on Select.
5. In the Collection List section, click on the Add button to get a pop-up window with the list of available collections. Here you can select the collections that are part of the study by clicking in the selection box. Once done, click OK.
6. Click Save to finish creating your Study. [Screenshot: Study tab listing TestStudy with Total Samples 0, Eligible Samples 0 and Selected Samples 0.]
7. Now you need to import the sample relations data into the newly created study. To do so, you need a file with the format SampleID, Eligible, Used, with values 0 and 1 for Eligible and Used. No headers are used in the file. Select the study to which you want to add data and click on Import sample Relation. [Screenshot: Study tab listing TestStudy with Total Samples 0, Eligible Samples 0 and Selected Samples 0.]
9
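Before importing, the sample relations file can be checked with a short script. The sketch below assumes the three fields are separated by whitespace (the separator is not stated above) and that blank lines may be skipped; the function names and the summary keys are invented for the example, with the keys loosely mirroring the Total, Eligible and Selected Samples columns.

```python
def read_sample_relations(lines):
    """Parse 'SampleID Eligible Used' rows; Eligible and Used must be 0 or 1."""
    rows = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        sample_id, eligible, used = fields
        if eligible not in ("0", "1") or used not in ("0", "1"):
            raise ValueError(f"line {line_no}: Eligible and Used must be 0 or 1")
        rows.append((sample_id, eligible == "1", used == "1"))
    return rows


def summarise(rows):
    """Count total, eligible and used samples."""
    return {
        "total": len(rows),
        "eligible": sum(1 for _, eligible, _ in rows if eligible),
        "used": sum(1 for _, _, used in rows if used),
    }


rows = read_sample_relations(["example1 1 1", "example2 1 0", "example3 0 0"])
summary = summarise(rows)
```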
87. [Screenshot: MetS parameter list continued, with Code, Name, Description, Filter and Records columns, from EOSIN (Eosinophils) to the Illumina genome-wide genotyping arrays.]
88. [Screenshot: parameter list of genome-wide genotyping arrays (Affymetrix and Illumina) with the Report request options "Use split by collection", "Use relations", "Specify collections" and "Specific relations".] 2. Press the Add button at the bottom of the li
89. ncoding="UTF-8"?>
<java version="1.6.0_16" class="java.beans.XMLDecoder">
 <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
  <void property="Variants">
   <object class="java.util.ArrayList">
    <void method="add">
     <object class="java.util.ArrayList">
      <void method="add"><int>232</int></void>
      <void method="add"><int>396</int></void>
     </object>
    </void>
   </object>
  </void>
 </object>
</java>' where expressionID = 9 and ParameterID = 191;

update expression_content set filter='<?xml version="1.0" encoding="UTF-8"?>
<java version="1.6.0_16" class="java.beans.XMLDecoder">
 <object class="uk.ac.ebi.sail.client.common.ComplexFilter">
  <void property="variants">
   <object class="java.util.ArrayList">
    <void method="add">
     <object class="java.util.ArrayList">
      <void method="add"><int>41</int></void>
      <void method="add"><int>368</int></void>
     </object>
    </void>
   </object>
  </void>
 </object>
</java>' where expressionID = 9 and ParameterID = 41;

NCEP

At least 3 of the following criteria have to be true:
• GLU (Fasting)
• WST and SEX
• TG
• HDL and SEX
• BP or ANTIHYPR

insert into expression values N 2 WST AND SEX NCEP
insert into expression values N 2 HDL AND S
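The NCEP rule above is an "at least k of n" condition over criteria that are themselves AND/OR combinations of parameters. The sketch below shows one way to evaluate such a rule in Python. It is an illustration under stated assumptions: each criterion is treated as satisfied when the parameters it names are available for a sample, the fasting qualifier on GLU is ignored, and the helper names are invented. How SAIL evaluates the expression tables is not described here.

```python
def all_of(*codes):
    """Criterion that holds when every listed parameter is available."""
    return lambda available: all(code in available for code in codes)


def any_of(*codes):
    """Criterion that holds when at least one listed parameter is available."""
    return lambda available: any(code in available for code in codes)


# The five NCEP criteria as listed in the text.
NCEP_CRITERIA = [
    all_of("GLU"),
    all_of("WST", "SEX"),
    all_of("TG"),
    all_of("HDL", "SEX"),
    any_of("BP", "ANTIHYPR"),
]


def at_least(k, criteria, available):
    """True when at least k of the criteria hold for the given parameter set."""
    return sum(1 for criterion in criteria if criterion(available)) >= k


# GLU, WST+SEX and TG hold: three criteria, so the rule is met.
print(at_least(3, NCEP_CRITERIA, {"GLU", "WST", "SEX", "TG"}))   # True
# Only GLU and BP-or-ANTIHYPR hold: two criteria, so it is not.
print(at_least(3, NCEP_CRITERIA, {"GLU", "SEX", "ANTIHYPR"}))    # False
```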
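90. ndatory Variant Variant Qualifier Description Predefined Mandatory Variant Variant

Description [tab] Blood pressure systolic diastolic mm Hg
Tag [tab] Vocabulary [tab] TestVocabulary
Tag [tab] Definition [tab] IDF
Tag [tab] Definition [tab] WHO
Tag [tab] Definition [tab] NCEP
Relation [tab] TestRelations [tab] Synonym [tab] BP
Variable [tab] Systolic
Type [tab] INTEGER
Variable [tab] Diastolic
Type [tab] INTEGER

Parameter [tab] DIAB
Name [tab] Type of diabetes
Description [tab] Type of diabetes
Tag [tab] Vocabulary [tab] TestVocabulary
Relation [tab] TestRelations [tab] Partial match [tab] DB
Variable [tab] Type
Type [tab] ENUM
Predefined [tab] NO

Parameter [tab] GLUC
Name [tab] Glucose
Description [tab] Glucose mMol L
Tag [tab] Vocabulary [tab] TestVocabulary
Tag [tab] Definition [tab] IDF
Tag [tab] Definition [tab] WHO
Tag [tab] Definition [tab] NCEP
Relation [tab] TestRelations [tab] Partial match [tab] GLU
Variable [tab] Concentration
Type [tab] REAL

Parameter [tab] GLUCTM
Name [tab] Glucose with Timing and Type
Description [tab] Glucose mMol L with timing and type of tissue
Tag [tab] Vocabulary [tab] TestVocabulary
Inherit [tab] GLUC
Qualifier [tab] Timing
Description
Predefined [tab] YES
Mandatory [tab] NO
Variant [tab] fasting
Variant [tab] non fasting
Qualifier [tab] Biomaterial
Description
Predefined [tab] YES
Mandatory [tab] NO
Variant [tab] plasma
Variant [tab] serum

DATA AVAILABILITY (values listed by column)

SAMPLE ID: example, example2, example3, example4, example5, example6, example7, example8, example9, example10, example11, example12, example13, example14, example15, example16
AGEVIS: 35, 28, 29, 31, 54, 18
ANTIHYPER: Accupril 1 Alatone Accupril Oy SS Alatone Alatone 1 Alatone
ALCQUANT: 1, 12
BMIDX: 21, 27, 21, 21, 22
BPSD Systolic: 12, 14, 12, 12, 14, 12, 12, 12, 12, 14, 12
BPSD Diastolic: 7 00
DIAB Type: 2
GLUCTM Concentration: 4 4 2 3 8 5 1 4 4 2 3 8 5 1 4 4 4 2 3 8 5 1
GLUCTM Timing: fasting, non fas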
91. nding entries in the classifier, collection and study sections. You also need to create the vocabulary for MetS in order to be able to use the synonyms.

VOCABULARY

Parameter [tab] AGEVIS
Name [tab] Age
Description [tab] Age at visit
Tag [tab] Vocabulary [tab] TestVocabulary
Relation [tab] TestRelations [tab] Synonym [tab] AGE
Variable [tab] Age
Type [tab] INTEGER

Parameter [tab] ANTIHYPER
Name [tab] Antihypertensives
Description [tab] Antihypertensive treatment
Tag [tab] Vocabulary [tab] TestVocabulary
Tag [tab] Definition [tab] IDF
Tag [tab] Definition [tab] WHO
Tag [tab] Definition [tab] NCEP
Relation [tab] TestRelations [tab] Synonym [tab] ANTIHYPR
Variable [tab] Type
Type [tab] ENUM
Predefined [tab] NO

Parameter [tab] ALCQUANT
Name [tab] Alcohol quantity
Description [tab] grams absolute ethanol week
Tag [tab] Vocabulary [tab] TestVocabulary
Relation [tab] TestRelations [tab] Synonym [tab] ALCQ
Variable [tab] Quantity
Type [tab] INTEGER

Parameter [tab] BMIDX
Name [tab] BMIDX
Description [tab] Body Mass Index kg m2
Tag [tab] Vocabulary [tab] TestVocabulary
Tag [tab] Definition [tab] WHO
Relation [tab] TestRelations [tab] Synonym [tab] BMI
Variable [tab] BMI
Type [tab] REAL

Parameter [tab] IMC
Name [tab] Indice Masa Corporal
Description [tab] Spanish equivalent to BMI
Tag [tab] Vocabulary [tab] TestVocabulary
Inherit [tab] BMIDX

Parameter [tab] BPSD
Name [tab] Blood pressure
Description Tag Tag Tag Tag Relation Variable Type Variable Type Parameter Name Description Tag Relation Variable Type Predefined Parameter Name Description Tag Tag Tag Tag Relation Variable Type Parameter Name Description Tag Inherit Qualifier Description Predefined Ma
92. ndows) or terminal window (Mac OS X, Linux). Run the commands (replace user, password and database with your own values; database is the name of the database just created, i.e. sail):

    mysql -uuser -ppassword -e "create database sail"
    mysql -uuser -ppassword -Ddatabase < <Location of file>/sail_schema.sql

b. If you are using MySQL database management software such as phpMyAdmin, you will need to create the database first. In the database tab, go to the "create new database" section and add the name of the database, i.e. sail. Click on Create. Once the database is created, go to the Import tab and select the file to import, sail_schema.sql. Check that you have selected the SQL option in "Format of imported file". Click Go. Now your database should be ready. See the phpMyAdmin documentation, or that of your database management software, for more information on how to create a database and load the database schema.

2. Deploy the SAIL software. The easiest way to install SAIL is by copying the SAIL.war file into your CATALINA_HOME/webapps directory. CATALINA_HOME is defined by your Tomcat installation. Once the file is copied, Tomcat will automatically deploy SAIL and create the sail directory with the application. Note: it may take a few minutes for Tomcat to detect and deploy SAIL.war.

3. Edit the file sail.xml. This file is used to configure the database access in SAIL. You need to change the values of SAIL_DBUserName and SAIL_DBPassword to those needed to access your MySQL instance. Once the file is edited, copy it to CATALINA_HOME/conf/Catalina/localhost/sail.xml

4
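The two file copies in steps 2 and 3 can be scripted. The sketch below is a convenience helper written for this guide, not part of the SAIL distribution: the function name `deploy_sail` is an assumption, it expects sail.xml to have been edited already, and it only copies files; it does not start or restart Tomcat.

```python
import shutil
from pathlib import Path


def deploy_sail(dist_dir, catalina_home):
    """Copy SAIL.war and the edited sail.xml into a Tomcat installation.

    dist_dir:      folder where the distribution package was extracted.
    catalina_home: the Tomcat installation directory (CATALINA_HOME).
    Returns the two destination paths.
    """
    dist = Path(dist_dir)
    home = Path(catalina_home)

    webapps = home / "webapps"
    if not webapps.is_dir():
        raise FileNotFoundError(f"{webapps} does not exist; check CATALINA_HOME")

    # Step 2: the WAR goes into webapps, where Tomcat picks it up.
    war_target = webapps / "SAIL.war"
    shutil.copy2(dist / "SAIL.war", war_target)

    # Step 3: the edited config goes to conf/Catalina/localhost/sail.xml.
    conf_dir = home / "conf" / "Catalina" / "localhost"
    conf_dir.mkdir(parents=True, exist_ok=True)
    context_target = conf_dir / "sail.xml"
    shutil.copy2(dist / "sail.xml", context_target)

    return war_target, context_target


# Usage (paths are placeholders):
#   deploy_sail("/path/to/extracted/package", "/path/to/tomcat")
```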
93. nes 11. Click on Query to get your report. [Screenshot: Report constructor with the Report 1 to Report 4 tabs open; vocabulary ANY; parameter list with Code, Name and Description columns covering the DGI, KoraF3 and MetS parameters.]
94. [Screenshot: MetS parameter list with record counts; the Report request panel contains SEX.] Selected parameters with enumerations are shown in the Report request panel. Notice the superscript 1 on top of the SEX header, linking to a legend that says that the parameter has been filtered to show only those entries with Sex values Man or Woman. [Screenshot: Report 4, total records 33102; SEX: 33102; legend: parameter with code SEX and name Sex filtered out by Sex is Man OR Woman.]

How to add value ranges for a parameter into a report

In cases where integer values for a particular parameter are provided, for instance MetS EXYR (year of examination)
95. ng and at least 2 of the following:
• BP or ANTIHYPR
• TG
• HDL and SEX
• BMI or WSTIHIP and SEX

insert into expression values N 1 Insulin Resistance
insert into expression values N 2 Additional Criteria WHO
insert into expression values N 1 BP or ANTIHYPR WHO
insert into expression values N 2 HDL and SEX WHO
insert into expression values N 1 BMI OR WSTIHIP AND SEX
insert into expression values N 2 WSTIHIP and SEX

insert into expression_content values (2, 0, 9, NULL);
insert into expression_content values (2, 0, 10, NULL);
insert into expression_content values (9, 41, 0, NULL);
insert into expression_content values (9, 28, 0, NULL);
insert into expression_content values (9, 191, 0, NULL);
insert into expression_content values (10, 49, 0, NULL);
insert into expression_content values (10, 0, 11, NULL);
insert into expression_content values (10, 0, 12, NULL);
insert into expression_content values (10, 0, 13, NULL);
insert into expression_content values (13, 0, 14, NULL);
insert into expression_content values (11, 56, 0, NULL);
insert into expression_content values (11, 59, 0, NULL);
insert into expression_content values (12, 45, 0, NULL);
insert into expression_content values (12, 12, 0, NULL);
insert into expression_content values (13, 31, 0, NULL);
insert into expression_content values (14, 34, 0, NULL);
insert into expression_content values (14, 12, 0, NULL);

update expression_content set filter='<?xml version="1.0" e
96. [Screenshot: Report constructor parameter list showing MetS parameters and parameters from the DGI collection, with Filter and Records columns and the Report request options.]

Code column: It contains parameter codes, the short stable identifiers of parameters.

Classifiers combo box: Here is the list of all available classifiers. One can choose a classifier and then choose the corresponding tags in the Tags combo box. There is a special NO FILTER value, which means that no filtering by tags is required.

Name column: Here are the names of parameters.

Tags combo box: If a classifier is chosen in the classifiers combo
97. [Screenshot: MetS parameter list with Code, Name, Description, Filter and Records columns] When adding a parameter to a group, notice that it is added with the opposite linker (OR) to the one used for the rest of the parameters in the query (AND). How to add enumerated values of a parameter into a report. In some cases data providers submit real values for a particular parameter, for instance MetS SEX, gender from ha
98. or You can select a group of parameters from the parameter list. In that case the parameters within a group are combined with a logical OR operation. For instance, if a group consists of MetS SEX and MetS GLU, samples with a provided gender or a measured glucose level will be selected from the database. See section How to add group of parameters into a report. 2. Press the Make report button at the bottom of the Report request panel. 3. The report appears in the next tab. Congratulations, you have done it! Administrator Interface. The administrator interface of SAIL allows the user to import new data into SAIL, add new vocabularies and define new data relations. It also allows data to be structured into groups, trees and/or hierarchies. After logging in to the Admin interface, the user is presented with a new set of tabs to choose from. Two of them are similar to those in the user interface: Report Constructor and Collection view (Summary view). The new tabs are Classifiers, Projections, Study, Collection and Metadata Import. Produce a template for data import. The Report constructor offers similar features to its counterpart in the user interface. One of the new features available from the Administrator interface is the ability to create templates for data import based on a set of selected parameters. To do so, the following steps are required: 1. Select the list of parameters that you want to use in your
99. parameter context to show that the parameter extends the definition of an already existing parameter or group of parameters. The syntax is as follows: Inherit Parameter code. Parameter code must be the code of an existing parameter. Note: The inherited parameter can already exist in the SAIL database or can be defined earlier in the same file. A parameter can inherit several other parameters, so the Inherit line can be repeated for every inherited parameter. The Relation row designates a relation of the current parameter with some other parameter. The syntax is: Relation Classifier name Tag name Parameter code. Classifiers must be of type RELATION. This line is optional and can be repeated as many times as required. The Tag row has the same syntax as in the global context, but in the context of a parameter it applies the tag only to the current parameter. This line is optional and can be repeated for different tags. The Variable row starts a new variable context within the current parameter context. A variable context spans until the next variable context, the next qualifier context or the end of the current parameter context. The Variable line syntax is as follows: Variable Variable name. The Variable line must be followed by a mandatory Type line. The Type line syntax is: Type Variable type. Variable type must be one of the following: STRING, INTEGER, REAL, BOOLEAN, ENUM, DATE, or a special type of Boolean called TAG. TAG is a special Boolean that indicates that this variable is going to be displayed wh
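The row rules above can be sketched as a small validator. This is only an illustration, not SAIL's own importer: it assumes each row has already been split into cells with the keyword in the first cell, and the names `ParameterContext`, `Variable` and `parse_parameter_rows` are hypothetical. Tag and Qualifier rows are left out of the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Variable types listed in the guide for the Type row.
VARIABLE_TYPES = {"STRING", "INTEGER", "REAL", "BOOLEAN", "ENUM", "DATE", "TAG"}


@dataclass
class Variable:
    name: str
    type: Optional[str] = None


@dataclass
class ParameterContext:
    inherits: List[str] = field(default_factory=list)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)
    variables: List[Variable] = field(default_factory=list)


def parse_parameter_rows(rows):
    """Parse rows (lists of cells) that belong to one parameter context.

    Hypothetical helper: the cell layout is an assumption based on the
    syntax lines quoted in the guide.
    """
    ctx = ParameterContext()
    pending = None  # variable still waiting for its mandatory Type row
    for cells in rows:
        keyword, args = cells[0], cells[1:]
        if pending is not None and keyword != "Type":
            raise ValueError(f"Variable {pending.name!r} has no Type row")
        if keyword == "Inherit":
            ctx.inherits.append(args[0])
        elif keyword == "Relation":
            classifier, tag, code = args
            ctx.relations.append((classifier, tag, code))
        elif keyword == "Variable":
            pending = Variable(args[0])
            ctx.variables.append(pending)
        elif keyword == "Type":
            if pending is None:
                raise ValueError("Type row without a preceding Variable row")
            if args[0] not in VARIABLE_TYPES:
                raise ValueError(f"Unknown variable type {args[0]!r}")
            pending.type = args[0]
            pending = None
        else:
            raise ValueError(f"Row keyword {keyword!r} is not handled in this sketch")
    if pending is not None:
        raise ValueError(f"Variable {pending.name!r} has no Type row")
    return ctx
```

Rejecting a Variable row that is not immediately followed by a Type row mirrors the "mandatory Type line" rule; a real importer may be more lenient.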
100. r with code BMI and name BMI; parameter with code DB and name Type of diabetes; parameter with code EDU and name Education. 4. Alternatively, if you want to check only some of the collections, select the Specify collections check box and then press Select. [Screenshot: Report constructor with the MetS parameter list and the Report request panel]
101. relations Metabolic syndrome definitions [Screenshot: classifier list showing ClasiExample with tags Cold Params, Warm Params and Hot Params] 7. Add a description to the classifier. [Screenshot: Edit classifier form, Name ClasiExample, Description Example of a classifiers] 8. In the Classification section, click on the black inverted triangle to expand the options and select Add Tag. [Screenshot: Edit classifier form with the Select classifier dialog] 9. Select Classifier type from the list of options and click Select. [Screenshot: Edit classifier form]
102. [Screenshot: MetS parameter list, continued] 2. In the pop-up window, select the enumerated values that you want to add to your report request. Enumerated values are those with a value of 1 in the enumeration column. [Screenshot: Report constructor with the MetS parameter list]
103. [Screenshot: MetS parameter list with record counts; the pop-up for parameter EXYR (Year examination) shows the variable Year of type INTEGER]
104. rmonised metabolic syndrome vocabulary has the following possible values: Man, Woman, and a value that means that real values are not provided. You can create a report with enumerated values for a selected parameter in the following way: 1. Select the parameter from the parameter list by clicking on the funnel icon. [Screenshot: Report constructor with the MetS parameter list]
105. [Screenshot: parameter list with the Quick query, Add to group, Relations and Extra Query buttons] 2. Click on the button labelled Extra. [Screenshot: Extra panel listing KoraF3 and DGI parameters with their descriptions]
106. ry description as this is the type of classifier that holds the descriptors available for new collections (they have been defined in the Classifier section). Click on Select. [Screenshot: Add collection form, Name TestCollection, with the Select classifier dialog listing Repository Description] 4. Select the Tag that you want to use and click Select. [Screenshot: Select Tag dialog listing Description, Country, Owner, URL and Contacts] 5. Add the value that you want to use for that descriptor. Notice that the top of the window specifies the type of descriptor to which you are adding a value. Click OK. Repeat these steps for as many descriptors as you want to add. [Screenshot: tag value dialog, Type Repository Description, Country: United Kingdom] 6. Once you have finished creating a new Collection, click on Save.
107. [Screenshot: parameter list, continued] The selected group is shown in the Report request panel. How to add a parameter into a group. Sometimes you may want to group parameters in your request so that you can run complex queries that report samples where at least one of the parameters of each group is present. Say, for example, that you want to get all samples that have Age, Sex and at least one of Date of coronary heart disease diagnosis or Date of first myocardial infarction. You can group the last two parameters together to achieve this. 1. First select the parameters Age and Sex and add them to your query. [Screenshot: Report constructor with AGE and SEX in the Report request panel]
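The selection logic described here (single parameters combined with AND, parameters inside a group combined with OR) can be sketched as follows. This is a minimal illustration under assumptions: the function name `select_samples`, the in-memory `samples` dictionary and the sample IDs are hypothetical, and SAIL itself evaluates such requests against its database rather than in Python.

```python
def select_samples(samples, required, groups):
    """Return IDs of samples that have every required parameter and at
    least one parameter from each group.

    samples:  dict mapping sample ID -> set of parameter codes with data
    required: parameter codes combined with AND
    groups:   list of lists of parameter codes; codes inside a group are
              combined with OR, and the groups themselves with AND
    """
    selected = []
    for sample_id, available in samples.items():
        has_required = all(code in available for code in required)
        has_groups = all(
            any(code in available for code in group) for group in groups
        )
        if has_required and has_groups:
            selected.append(sample_id)
    return selected


# Hypothetical samples: Age, Sex, and at least one of DCHD or DMI.
samples = {
    "S1": {"AGE", "SEX", "DCHD"},
    "S2": {"AGE", "SEX"},
    "S3": {"AGE", "SEX", "DMI"},
    "S4": {"SEX", "DCHD", "DMI"},
}
print(select_samples(samples, ["AGE", "SEX"], [["DCHD", "DMI"]]))
```

With these made-up samples only S1 and S3 are selected: S2 has neither date parameter and S4 lacks Age.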
108. st [Screenshot: Report constructor parameter list with NO FILTER selected, showing the Gen parameters (Genotyped, Longitudinal data, Metabonomics data, Proteomics data, Transcriptomics data) and the genome-wide genotyping parameters]
109. st [Screenshot: Report constructor with the MetS parameter list and a Report request containing AGE, SEX and CHD]
110. [Screenshot: example report table] Hidden feature: Predefined queries. To allow for the storage and reuse of frequently requested complex queries, SAIL introduced the capability to create predefined queries and save them within the database. Currently SAIL doesn't have a graphical interface for creating predefined queries, so they have to be created directly in the database. To do so, you need to use two tables: expression and expression content. In the expression table the user has three columns available: • Name (optional): Here you specify the name of the predefined query that will be displayed in the user interface. If you leave this column empty, the predefined query will NOT be available from the interface (i.e. when you don't want to make subqueries available). • Depth: Predefined queries consist of a group of nested subqueries, where one or many of the subqueries have to return values. In the depth field you specify how many of the subqueries that form the final predefined query have to be true. For example, in a query where we want to retrieve samples with the following combination of parameters available: ((Param1 AND Param2) OR (Param3 AND Param4) OR (Param5 OR Param6)) AND Param7 AND Param8, we would specify a depth of 3, as Param7 and Param8 have to be true and at least one of the other 3 subqueries has to be true. In the case of Param1 AND Param2 the depth would be 2, as both Param1 an
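One way to read the Depth column is as a threshold: an expression holds when at least `depth` of its direct children (parameters or subexpressions) hold. The sketch below models that reading in memory. It is an interpretation of the description above, not SAIL's actual evaluator, and the `Expression` class and `matches` function are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Expression:
    depth: int  # how many direct children must be true
    children: List[Union[str, "Expression"]] = field(default_factory=list)
    name: str = ""  # empty name: not shown in the interface


def matches(expr: Expression, available: Set[str]) -> bool:
    """True if at least expr.depth children hold for this sample."""
    hits = 0
    for child in expr.children:
        if isinstance(child, Expression):
            hits += matches(child, available)  # nested subquery
        else:
            hits += child in available  # plain parameter code
    return hits >= expr.depth


# The example from the text, nested under this reading.
one_of_three = Expression(depth=1, children=[
    Expression(depth=2, children=["Param1", "Param2"]),
    Expression(depth=2, children=["Param3", "Param4"]),
    Expression(depth=1, children=["Param5", "Param6"]),
])
query = Expression(depth=3, children=[one_of_three, "Param7", "Param8"],
                   name="Example query")
```

Under this reading an AND of n children has depth n and an OR has depth 1, which matches the depth of 2 given for Param1 AND Param2.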
111. [Screenshot: parameter list, continued] 5. Now click on the Extra button. [Screenshot: Report constructor with several Report tabs open and the parameter list shown with NO FILTER selected]
112. [Screenshot: parameter detail panel showing vocabulary MetS, a MetSRelations Synonym relation and the variable Year of type INTEGER] Parameter Trees and Hierarchies. Parameter trees and hierarchies are used to show how parameters are organized and how they relate to each other. Parameters are organized in trees depending on the values of the TAGs in their description. They can be organized as belonging to a vocabulary and then to a specific subgroup of parameters within the vocabulary. For example, you can choose to display all the parameters that belong to vocabulary MetS, from MetS all the parameters associated with Disease, and within the disease subset all the parameters related to Cancer. Tree structures are defined through the administrator interface. [Screenshot: Report constructor]
113. those values selected. Press the OK button and the parameter will be added to the query. Notice that the parameter will show a small funnel icon, which means that the parameter is [Screenshot: Report constructor with the MetS parameter list, from MI (Myocardial infarction) to WT (Weight)]
114. ting fasting non fasting fasting non fasting fasting non fasting fasting fasting non fasting fasting non fasting GLUCTM Biomaterial plasma serum plasma serum plasma serum STUDY example example2 example3 example6 example8 example9 example11 example12 example15 example16
RELATIONS (parameter, classifier, tag, related parameter):
ANTIHYPER  TestRelations  Synonym  ANTIHYPR
ANTIHYPR  TestRelations  Synonym  ANTIHYPER
AGE  TestRelations  Synonym  AGEVIS
AGEVIS  TestRelations  Synonym  AGE
ALCQ  TestRelations  Synonym  ALCQUANT
ALCQUANT  TestRelations  Synonym  ALCQ
BMIDX  TestRelations  Synonym  BMI
BMI  TestRelations  Synonym  BMIDX
BPSD  TestRelations  Synonym  BP
BP  TestRelations  Synonym  BPSD
DIAB  TestRelations  Partial match  DB
DB  TestRelations  Partial match  DIAB
GLUC  TestRelations  Partial match  GLU
GLU  TestRelations  Partial match  GLUC
115. tion URL http ki sefki jsp polopoly jsp d 96 10 amp l en Repository Description Contacts User Interface. How to add a parameter into a report. You can add a particular parameter, for instance MetS GLU (glucose from the harmonised metabolic syndrome vocabulary), into a report in the following way: 1. Select the parameter from the parameter list (left panel of the screen). [Screenshot: Report constructor parameter list with NO FILTER selected]
116. [Screenshot: classifier list with Definition (Metabolic syndrome definitions, type Parameter) and ClasiExample (Example of a classifiers, type Classifier)] 4. Select a classifier from the list in the pop-up window. Repeat for a couple of parameters; beware that all the parameters in a projection have to be of the same type. 5. Once the parameters have been added, they can be reordered by using the UP and DOWN buttons. 6. Click on Save when finished creating your projection. [Screenshot: Projections tab with projection TestProjection built from Definition and Vocabulary] 7. In the Parameter Tree view under Report Constructor you can select the newly created projection in the drop-down menu. Create a new study. Studies are used to group data that come from different collections but share some characteristic that makes them suitable to be combined. A study is a level above Collection: data can belong to only one Collection but can belong to several studies. 1. To create a new study, start by selecting Add. [Screenshot: Study tab with columns Total Samples, Eligible Samples and Selected Samples]
117. [Screenshot: MetS parameter list with Name, Description and Records columns, from Age to Glucose, next to the Report request panel]
118. [Screenshot: parameter list with the Export visible, Export selected, New parameter, Edit parameter and Data template buttons] 3. Now select the option Data Template, select the location to store the template file and click OK. [Screenshot: list of KoraF3 and DGI parameter codes]
119. values can be added during data submission. The other one is a table for predefined values; any number of values can be added to this table. Qualifiers section: The Qualifiers section is similar to the variable section. The qualifier and variable forms are similar, the difference being that the qualifier form has no Type field. Inherited parameters section: In the Inherited parameters section one can add and remove inherited parameters. Not only directly inherited parameters are listed but also all ancestors. Such ancestors are shown for information only and can't be removed explicitly. Classification section: The Classification section contains tags that are used for parameter classification (see the classifiers section). Relations section: The Relations section contains Parameter <-> Tag pairs. Parameter is the destination of the relation and Tag describes the type of relation. Relations are unidirectional, so if two parameters are related we need to create entries in both parameters, specifying the type of relation. Parameters import file format (vocabulary import): In addition to using a form for parameter input, it is possible to batch upload new parameters. The batch upload file format was designed so that it can be easily prepared using spreadsheet programs like MS Excel. The file is plain tab-delimited text in Unicode encoding. Such a file can be created in MS Excel by choosing the corresponding option, Save As Unicode Text. File format
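If the import file is produced by a script rather than by Excel, a tab-delimited Unicode text file can be written as sketched below. Two assumptions are made here: that SAIL accepts the UTF-16 text with a byte order mark that Excel's "Unicode Text" option produces, and that each row carries a keyword followed by its value, as in the parameter examples of this guide. The helper names are hypothetical.

```python
import csv


def write_parameter_file(path, rows):
    """Write rows (lists of cells) as tab-delimited UTF-16 text."""
    # The "utf-16" codec prepends a byte order mark; newline="" lets the
    # csv module control the line endings itself.
    with open(path, "w", encoding="utf-16", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)


def read_parameter_file(path):
    """Read the file back into a list of rows, e.g. to check it."""
    with open(path, "r", encoding="utf-16", newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t")]


# Hypothetical content modelled on the GLU example in this guide.
example_rows = [
    ["Parameter", "GLU"],
    ["Name", "Glucose"],
    ["Description", "Glucose (mMol/L)"],
    ["Variable", "Concentration"],
    ["Type", "REAL"],
]
```

Calling `write_parameter_file("glu.txt", example_rows)` would produce a file that opens cleanly in a spreadsheet program; whether the importer needs further header rows is not covered by this sketch.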
120. void></object></void></object></java>' where expressionID = 4 and ParameterID = 31
update expression content set filter = '<?xml version="1.0" encoding="UTF-8"?><java version="1.6.0_16" class="java.beans.XMLDecoder"><object class="uk.ac.ebi.sail.client.common.ComplexFilter"><void property="Variants"><object class="java.util.ArrayList"><void method="add"><object class="java.util.ArrayList"><void method="add"><int>232</int></void><void method="add"><int>396</int></void></object></void></object></void></object></java>' where expressionID = 8 and ParameterID = 191
update expression content set filter = '<?xml version="1.0" encoding="UTF-8"?><java version="1.6.0_16" class="java.beans.XMLDecoder"><object class="uk.ac.ebi.sail.client.common.ComplexFilter"><void property="Variants"><object class="java.util.ArrayList"><void method="add"><object class="java.util.ArrayList"><void method="add"><int>41</int></void><void method="add"><int>368</int></void></object></void></object></void></object></java>' where expressionID = 8 and ParameterID = 41
WHO DB Type2 or FMT2D or GLU Fasting or GLU Timi
121. when we have a more generic parameter (i.e. Glucose Concentration) and parameters that are more specific and look like an extension of the more generic one (i.e. Glucose concentration with Timing). If no relation is established between these two parameters, queries using the more generic parameter will present incomplete availability counts, because none of the samples whose availability information is annotated using the more specific parameter will be taken into consideration. To avoid this situation, parameter inheritance was introduced.
Parameter    GLU
Name         Glucose
Description  Glucose mMol L
Variable     Concentration
Type         REAL
Parameter    GLUTM
Name         Glucose w timing
Description  Glucose with timing mMol L
Inherit      GLU
Qualifier    Timing
Variant      fasting
Variant      non fasting
In the example above, parameter GLUTM inherits parameter GLU. This means that GLUTM also has the variable Concentration, which coincides with the corresponding variable of the GLU parameter. In a report where GLU must be counted, GLUTM will also be counted. All qualifiers and variables from the top-level parameter are inherited by the derived parameter. If the derived parameter doesn't add any variables or qualifiers of its own, it is a full alias of the basic parameter. This may be useful in cases where we need two different names for one parameter, for example when we want to add multiple language support. Parameter classification. Having created
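The effect of inheritance on availability counts can be illustrated with a small in-memory model. It is a sketch under assumptions, not SAIL's implementation: the `Parameter` class, the `availability` function and the sample records are hypothetical, and only the GLU/GLUTM relationship is taken from the example above.

```python
class Parameter:
    def __init__(self, code, variables=(), qualifiers=(), inherits=()):
        self.code = code
        self.own_variables = list(variables)
        self.own_qualifiers = list(qualifiers)
        self.inherits = list(inherits)  # directly inherited parameters

    def ancestors(self):
        """All directly and indirectly inherited parameters."""
        seen = []
        for parent in self.inherits:
            for candidate in [parent] + parent.ancestors():
                if candidate not in seen:
                    seen.append(candidate)
        return seen

    def all_variables(self):
        """Inherited variables followed by the parameter's own ones."""
        names = []
        for param in self.ancestors() + [self]:
            for name in param.own_variables:
                if name not in names:
                    names.append(name)
        return names

    def counts_as(self, other):
        """True if data annotated with self should count for other."""
        return other is self or other in self.ancestors()


def availability(parameter, records):
    """Count distinct samples annotated with the parameter or a descendant.

    records: list of (sample ID, Parameter) annotations
    """
    return len({sid for sid, annotated in records if annotated.counts_as(parameter)})


glu = Parameter("GLU", variables=["Concentration"])
glutm = Parameter("GLUTM", qualifiers=["Timing"], inherits=[glu])
records = [("S1", glu), ("S2", glutm), ("S3", glutm)]  # made-up samples
```

Here a query on GLU counts all three samples, while a query on GLUTM counts only the two annotated with the more specific parameter, which is the behaviour the text describes.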
122. ws the generation of complex queries to get as fine a report as required.

A major part of biomedical investigations is data collection and annotation. The best way to annotate collected data and metadata is to use commonly accepted terms and notions. There are a number of projects that develop sets of such terms for different fields of bioinformatics. SAIL also deals with the description of submitted data. A big part of SAIL is a subsystem to create, manage and classify the parameters that are used for data descriptions.

What is a classifier in SAIL?

Classifiers in SAIL are the basic descriptor units and are used to add information to Parameters and Collections of parameters in a structured way. SAIL allows for 6 types of classifiers:

Parameter: This is the main type of classifier. A classifier of this type is used when creating a vocabulary to add features to a parameter. Parameter classifiers consist of Name, Type and Tags as mandatory fields, and Description and Classification as optional fields. There are also two radio buttons: one specifies whether a parameter can have more than one value of this type of classifier, and the other specifies whether the classifier is mandatory, so that a parameter must have it defined when creating a vocabulary. One example would be a classifier named Vocabulary, used to specify which vocabulary a parameter belongs to. As SAIL may contain various vocabularies, they are defined as tags in the classifier. Wh
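The fields of a Parameter classifier listed in excerpt 122 can be pictured as a small record with two flags. The sketch below is an assumption-laden illustration, not SAIL code: the class `Classifier`, the method `check` and the tag names `VocabA` and `VocabB` are hypothetical and chosen only to show how the mandatory fields and the two options could be enforced.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Classifier:
    """Hypothetical model of a SAIL classifier of type Parameter."""
    name: str                      # mandatory
    type: str                      # mandatory
    tags: List[str]                # mandatory, at least one tag
    description: str = ""          # optional
    classification: str = ""       # optional
    allow_multiple: bool = False   # may a parameter carry several tags?
    mandatory: bool = False        # must every parameter define it?

    def __post_init__(self):
        if not self.name or not self.type or not self.tags:
            raise ValueError("Name, Type and Tags are mandatory fields")

    def check(self, assigned: List[str]) -> List[str]:
        """Return the problems found in the tags assigned to one parameter."""
        problems = []
        if self.mandatory and not assigned:
            problems.append(f"{self.name} is mandatory")
        if len(assigned) > 1 and not self.allow_multiple:
            problems.append(f"{self.name} allows only one tag")
        unknown = [t for t in assigned if t not in self.tags]
        if unknown:
            problems.append(f"unknown tags for {self.name}: {unknown}")
        return problems


# A classifier named Vocabulary whose tags are the vocabularies defined in
# the system (tag names here are made up for the example).
vocabulary = Classifier(name="Vocabulary", type="Parameter",
                        tags=["VocabA", "VocabB"], mandatory=True)

print(vocabulary.check(["VocabA"]))  # no problems reported
print(vocabulary.check([]))          # reports the missing mandatory classifier
```

Keeping the checks in one method that returns a list of messages makes it easy to report every problem for a parameter at once instead of stopping at the first.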
123. [Screenshot: "Add classifier" dialog with the fields Name (ClasiExample), Description and Type (Classifier), the options "Allow multiple tags" and "This classifier is mandatory", a Tags table (Tag name, Info) with Add Tag, Edit Tag and Remove Tag buttons, a Classification selector, and Save and Cancel buttons.]

3 To add tags to the classifier, select Add Tag. We are going to add three tags: Cold, Warm and Hot Params. Enter the name and description of each one and click OK.

[Screenshot: "Edit Tag" dialog with Name set to Hot Params; the Tags table already lists Cold Params and Warm Params.]

4 Once you have finished creating your classifier, click Save.

[Screenshot: classifier list, now including ClasiExample]

Name             Description
Vocabulary       Classification by relation to some dictionary
Repository       Set of tags for structured description of Repository
Classifier type  Classifier for classifiers classification
MetSRelations    Metabolic syndrome vocabulary relations
Definition       Metabolic syndrome definitions
ClasiExample

6 Click on the Edit button.
