RDSAT 5.6 User Manual - Respondent Driven Sampling
Contents
1. FIGURE 1.11 Imported RDSCM Data

Change the network sizes to their appropriate values and save the data as described in the section entitled Preparing Data from Excel. Figure 1.11 shows fictitious data exported from RDSCM.

RDS INCORPORATED

Loading, Viewing and Editing Data in RDSAT

This chapter covers how to load data into RDSAT. Topics covered include loading RDSAT format files, setting options for analysis, and viewing/editing the data.

Loading Data

FIGURE 2.1 RDSAT Open New RDS Button
(The main window in the figure reads: Respondent Driven Sampling Analysis Tool v 5.6.0. If you are new to Respondent Driven Sampling, refer to the documentation included with this distribution. More help and resources are available on the web at http://www.respondentdrivensampling.org. Erik Volz, Douglas Heckathorn, Department of Sociology, Cornell University.)

First open the c
2. In our case, we want the prevalence of HIV among males within the population. Thus the numerator is Group 1 1 (HIV-positive males) and the denominator contains BOTH Group 1 1 and Group 1 2 (non-HIV-positive males).

[Figure: Estimate Prevalence dialog with Numerator and Denominator group selectors]

Once the analysis is performed, the output will appear in a new tab called ratio. The output contains a prevalence estimate and a confidence interval for that estimate, as well as the groups used by the function and the Key of Group and Trait Correspondence.

In our example, 14.9% of males are estimated to be HIV positive. The confidence interval for this estimate is 10.3% to 19.6%. Thus we are 95% confident that between 10.3% and 19.6% of males are HIV positive in this population.

[Figure: RDSAT main window showing the Ratio tab and the Key of Group and Trait Correspondence]

The RDSAT File Menu

The RDS Analysis Tool File Menu has multiple features located in it. This chapter describes how to use them.

RDS Analysis Tool File Menu Features

New RDS: This feature allows one to open a new RDS data set. The button Open New RDS on the m
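The numerator and denominator selection in the Estimate Prevalence example can be read as a ratio of group estimates. The Python sketch below shows only that arithmetic. The function name `prevalence` and the two proportion values are hypothetical placeholders rather than RDSAT output, and the confidence interval that RDSAT reports is not reproduced.

```python
def prevalence(numerator_groups, denominator_groups, estimates):
    """Ratio of summed group estimates: numerator over denominator."""
    top = sum(estimates[g] for g in numerator_groups)
    bottom = sum(estimates[g] for g in denominator_groups)
    if bottom <= 0:
        raise ValueError("denominator groups have no estimated population share")
    return top / bottom


# Hypothetical population proportion estimates keyed by group label.
# "1 1" stands for HIV-positive males, "1 2" for non-HIV-positive males.
estimates = {"1 1": 0.07, "1 2": 0.40}

hiv_among_males = prevalence(["1 1"], ["1 1", "1 2"], estimates)
print(f"{hiv_among_males:.1%}")
```

Note that the denominator lists both male groups, mirroring the instruction that it must contain Group 1 1 as well as Group 1 2.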
3. Estimate Number of Waves Required in RDSAT's Analyze menu. This will cause the window of Figure 6.2 to appear. Then select a starting group for a hypothetical sample. Next, choose a convergence radius. The smaller this number, the higher the confidence in the result will be; however, the dataset will take longer to analyze. The default is .02, which should serve as a good starting point. A radius of .02 means that the population proportions will change by less than .02 with further recruitment. In other words, the sample population proportions are considered converged (at equilibrium) when the change in population proportions between waves is less than the convergence radius times the population proportions. Select Analyze, and this utility will use the Markov process implicit in the calculated transition probabilities to check how many waves are required for the sample population proportions to reach equilibrium. The results of the analysis will be output to a new report page. See Figure 6.3.

FIGURE 6.2 RDSAT Waves Estimation Window
(Fields: Group with Initial Recruit, Convergence Radius)

FIGURE 6.3 RDSAT Waves Estimation

Figure 6.3 is a screenshot of the waves estimation output
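The wave estimate described here can be pictured as repeatedly multiplying the current group proportions by the transition probabilities. The Python sketch below is an illustration under stated assumptions, not RDSAT's code: the transition matrix is hypothetical, the function name `waves_to_converge` is invented, and the stopping rule used is the simpler of the two readings in the text (every proportion changes by less than the radius between consecutive waves).

```python
def waves_to_converge(transition, start_group, radius=0.02, max_waves=1000):
    """Count waves until group proportions stop moving by `radius` or more.

    transition[i][j] is the probability that a recruiter in group i
    recruits someone in group j. Wave 1 consists only of `start_group`.
    """
    n = len(transition)
    current = [1.0 if g == start_group else 0.0 for g in range(n)]
    history = [current]
    for wave in range(2, max_waves + 1):
        # One Markov step: the next wave's composition.
        nxt = [sum(current[i] * transition[i][j] for i in range(n))
               for j in range(n)]
        history.append(nxt)
        # Assumed stopping rule: absolute change below the radius for all groups.
        if all(abs(a - b) < radius for a, b in zip(nxt, current)):
            return wave, history
        current = nxt
    raise RuntimeError("proportions did not converge within max_waves")


# Hypothetical two-group transition matrix (each row sums to 1).
P = [[0.836, 0.164],
     [0.556, 0.444]]

waves, history = waves_to_converge(P, start_group=0)
for number, proportions in enumerate(history, start=1):
    print(number, [round(p, 3) for p in proportions])
```

A smaller radius makes the loop run for more waves before it stops, which is the trade-off between precision and analysis time mentioned above.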
4. FIGURE 1.5 Sample RDS Data in SPSS

If the data you wish to analyze is in an SPSS spreadsheet (see Figure 1.5), you may convert it to the RDS format by copying and pasting the data into an Excel spreadsheet. First, organize the columns so that the main data set appears in the standard RDSAT format, namely: Respondent ID, Self-Reported Network Size, Coupon Received from Recruiter, Coupons given to Respondent (C1 to C3 in Figure 1.5), and finally other variables you want to analyze, like gender, race, age, etc.

Note: In this sample data set, the variable label for Respondent or Survey ID is rid, for the network size is net, for the coupon received from the recruiter is coupon, and for the coupons given to respondents is C1-C4. These variable labels may look different when exported from RDSCM 3.0 or from the questionnaire data file.

Variable                          Variable Label                      Data Source
Survey ID                         Survey ID                           RDSCM 3.0
Network Size                      Degree                              Questionnaire data file
Coupon received from recruiter    Coupon_submitted                    RDSCM 3.0
Coupons given to respondent       Coupon_given_0 to Coupon_given_2    RDSCM 3.0
5. [NetDraw layout dialog settings: Equal Edge Lengths; Starting Positions: Current positions; No. of iterations: 1000 (if you get overlapping chains, increase this); Distance Between Components: 10; Proximities: geodesic distances]

Click OK and you should see your recruitment chains.

The Attribute File

The attribute file is VERY similar to the RDSAT data file. To make it:

1. Open the RDSAT data file with Excel.
2. Replace RDS with *node data in the first line (all lower case, no space between * and node, 1 space between node and data).
3. Replace the sample size (row 2, column 1) with Respondent ID.
4. Delete the columns of Coupon #s since they are not needed.
5. Save the file as a Tab delimited text file (do not overwrite your RDSAT file).
6. Go back to NetDraw and select File > Open > VNA Text File > Attributes. In the popup, select the file you just saved and select the Node Attribute(s) bullet under Type of Data. Click OK.
7. Your attributes are now loaded.
8. NetDraw is almost completely interactive and fairly straightforward to use. You can control individual nodes by clicking on them, or groups of nodes by using the popup menus on the side. For example, select Properties > Nodes > Color > Attribute-based. This will bring up a popup box with a pull-down menu with all your attributes in it. Selecting an attribute will color code the nodes for that attribute. A
6. FIGURE 3.3 RDSAT Analyze Breakpoint Button

To analyze a breakpoint, click on Analyze Breakpoint in the main window (see Figure 3.3). A breakpoint analysis can be done on any trait, but it is more effective to use traits with many values, such as age in the data set of New York jazz musicians. The bound fields allow the range of values to be chosen over which the breakpoint will be set. For example, from the NYC Jazz dataset located in the RDSCM distribution folder (see Chapter 2 for details), age is selected from the drop-down list. The step size is set to 1, and 25 and 50 are entered for the lower and upper bound (see Figure 3.4). This will perform a breakpoint analysis for groups above and below 25, then above and below 26, and so on.

FIGURE 3.4 RDSAT Breakpoint Analysis Window
(Fields: Trait to Analyze, Lower Bound, Upper Bound, Step)

In the above window we are s
7. FIGURE 5.1 RDSAT Replace Missing Data

Impute Median Values

This feature calculates the median value of the trait being analyzed and replaces all missing data cells with this median value. First, click Impute Median Values on the left side of the Edit Data screen. Select the trait you want to replace values in and click Commit Changes. To make the changes permanent, click Save RDS Data File.
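As an illustration of what median imputation does to one trait column, here is a short Python sketch. It is not RDSAT's implementation: the function name `impute_median` is invented, the missing-data code 9999 is taken from the screenshot settings, and rounding the median to an integer is an assumption made because RDSAT data must be integer valued.

```python
from statistics import median


def impute_median(values, missing=9999):
    """Replace every `missing` entry with the median of the observed values."""
    observed = [v for v in values if v != missing]
    if not observed:
        # Nothing to compute a median from; leave the column unchanged.
        return list(values)
    # Assumption: round to an integer because the data file holds integers only.
    fill = int(round(median(observed)))
    return [fill if v == missing else v for v in values]


ages = [40, 9999, 64, 41, 9999]
print(impute_median(ages))
```

The input list is left untouched and a new list is returned, which loosely mirrors the separation between Commit Changes and Save RDS Data File in the Edit Data screen.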
8. The actual output is listed below for a partition analysis of the New York Jazz dataset. See Chapter 2 for more information on this dataset.

Number Of Waves Required: 4
History of convergence of sample population proportions
Wave number 1:  Group Number 1: 1.0    Group Number 2: 0.0
Wave number 2:  Group Number 1: 0.836  Group Number 2: 0.164
Wave number 3:  Group Number 1: 0.79   Group Number 2: 0.21
Wave number 4:  Group Number 1: 0.778  Group Number 2: 0.222

What this information means is that it took a total of 4 recruitment waves before the population estimates changed by less than .02 times the population proportion (assuming a convergence radius of .02). As we can see, the change in the proportion estimate of Group 1 from wave 3 to 4 is .79 - .778 = .012, which is less than .02 x .79 = .0158. The change for Group 2 is the same size: .222 - .21 = .012.

Save RDS Analysis (in the File menu)

Allows the report pages from the analysis to be saved to a formatted HTML file. The analysis can then be viewed at any time with any web browser, and it can be cut and pasted onto most spreadsheets. In the current version of RDSAT, only saving to HTML is possible; however, copying and pasting should allow the data to be imported into many applications, including plain text editors.

Estimate Prevalence

Prevalence estimation is now possible with RDSAT 5.6. As an example, we will determine the HIV prevalence and confidence interval among males in an RDS sam
9. estimation algorithms in RDSAT:
1. LLS Population Weights: Multiplicative factors by which the Least Squares Estimates are different from the naive estimates.
2. Data Smoothed Population Weights: Multiplicative factors by which the Data Smoothed Estimates are different from the naive estimates.

Recruitment Component (RCx): The recruitment component of the RDS estimator. RCx * DCx = RDS estimator for group x.

Degree Component (DCx): The degree component of the RDS estimator. RCx * DCx = RDS estimator for group x.

Standard Error of P: The standard error of the RDS estimator, based on the RDS bootstrapping algorithm.

Confidence Intervals: Are obtained by bootstrapping the original sample. The confidence intervals only correspond to the Least Squares population estimates and can be set in the options panel (click Options in the main window).

Network Sizes and Homophily

This tab displays Homophily, Affiliation and Average Network Sizes.

FIGURE 4.4 RDSAT Single Variable Partition Analysis Network Sizes Tab

Adjusted Average Network Sizes: Network sizes are adjusted for sampling bias. In a chain referral sample, those with more connections and larger personal network sizes tend to be over-represented in the sample. This can poten
10. feature parses the data so that each cell in the recruitment matrix is as close as possible to the number entered in the field. The default is 30 because most statistical procedures require each cell in the matrix to have 30 or more cases. The results are interpreted in the same way as a partition analysis.

Custom: This allows partitions to be created based on non-overlapping ranges of values. For instance, selecting a trait such as age and using a custom partition with parameters {10,20},{21,30},{31,40},{41,50} would create groups based on those intervals of ages. Each range must be enclosed in curly braces and delimited with commas. Ranges should not overlap. Upper and lower bounds may be the same, however (e.g. {30,30}), if a group must be based on only one value.

Note: It is very easy to create a partition with a great number of groups, such as by selecting Complete with a trait with many values (e.g. age). In general, the amount of data is insufficient to handle partitions with such a large number of groups, and the analysis will fail.

Breakpoint Analysis

Breakpoint analysis allows one trait to be analyzed over a range of possible breakpoints. This is very useful for continuous variables such as age.
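The rules for custom ranges (curly braces, comma delimiters, no overlap, equal bounds allowed) can be checked mechanically. The Python sketch below is a hypothetical helper, not part of RDSAT: the names `parse_ranges` and `group_of` are invented, the exact input syntax is assumed from the description above, and numbering groups from 0 is an assumption.

```python
import re


def parse_ranges(spec):
    """Parse a spec such as "{10,20},{21,30}" into a list of (low, high) pairs."""
    pairs = re.findall(r"\{\s*(-?\d+)\s*,\s*(-?\d+)\s*\}", spec)
    ranges = [(int(low), int(high)) for low, high in pairs]
    if not ranges:
        raise ValueError("no ranges found; expected e.g. {10,20},{21,30}")
    for low, high in ranges:
        if low > high:
            raise ValueError(f"lower bound exceeds upper bound in {{{low},{high}}}")
    # Sort a copy so overlap is detected regardless of the order entered.
    ordered = sorted(ranges)
    for (_, previous_high), (low, _) in zip(ordered, ordered[1:]):
        if low <= previous_high:
            raise ValueError("ranges should not overlap")
    return ranges


def group_of(value, ranges):
    """Return the index of the range containing `value`, or None if none does."""
    for index, (low, high) in enumerate(ranges):
        if low <= value <= high:
            return index
    return None


age_ranges = parse_ranges("{10,20},{21,30},{31,40},{41,50}")
print(group_of(25, age_ranges), group_of(60, age_ranges))
```

A single-value group such as {30,30} passes the checks because only a lower bound greater than the upper bound is rejected.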
11. FIGURE 5.3 RDSAT Add Field Sample Weights

Extra RDSAT Features

The RDS Analysis Tool has several extra features that will be discussed in this chapter.

Estimate Number of Waves Required

FIGURE 6.1 RDSAT Estimate Number of Waves Required Menu Item

This feature allows hypothetical recruitment scenarios to be examined. A group is selected to be the initial recruiters, and they are allowed to recruit based on their transition probabilities until the population proportions converge to the actual sample proportions. This helps in determining how many waves of recruitment are necessary before the population is at equilibrium. First click on
12. the number of positives recruited by negatives. However, in the demographically adjusted matrix, these will be, if not equal, at least strongly correlated.

Sample population sizes: Reports the total number of recruits in each group.

Initial Recruits: Reports the number of seeds from each group, i.e., people recruited by the researcher in each group.

Note: Much of the data reported above also have corresponding data-smoothed estimates. Data Smoothing is a method for eliminating deviations in cross-group recruitments that occur due to chance. For more information about data smoothing, refer to Douglas D. Heckathorn, 2002, "Respondent-Driven Sampling II: Deriving Valid Population Estimates from Chain-Referral Samples of Hidden Populations," Social Problems, v. 49, No. 1, pages 11-34.

Estimation

Displays estimates of population proportions.

FIGURE 4.3 RDSAT Single Variable Partition Analysis Estimation Tab

Total Distribution of Recruits: The raw count of recruits in the data set for each group. The total is the sample size minus the number of seeds.

Estimated Population Proportions: The estimated population proportion can either be calculated using the linear least squares alg
13. [Figure residue: RDSAT Edit Data screen listing respondent rows with an added sample-weight column. Legible controls and settings: Impute Median Values, Impute Degree, Sample Size, Number of Coupons per Recruit 3, Value for Missing Data 9999.]
14. FIGURE 5.2 RDSAT Impute Median Values
(Legible settings in the figure: Sample Size 595, Number of Coupons per Recruit 3, Value for Missing Data 9999. The dialog reads: Selected Attribute will have missing data replaced with median values.)

Impute Degree

This feature imputes missing values on degree (Network Size). To use this feature fi
15. FIGURE 1.6 RDS Data highlighted in SPSS

Highlight all relevant columns in the dataset. To do this, first click on the left-most column header; this should highlight the entire first column. Next, hold down the Shift key and press the right arrow key until all the desired fields have been highlighted (see Figure 1.6). Finally, either press Ctrl+C on the keyboard or click Edit > Copy on the menu screen to copy the data to the clipboard. Paste this data into the third line of a blank Excel spreadsheet (see Figure 1.7) and add the relevant header information described in the previous section entitled Preparing Data from Excel.

Note: If there are missing data entries in the SPSS dataset, they will be denoted by a period (.). However, RDSAT only accepts integers in the dataset. Before saving to the Tab Delimited Text Format, you must replace all occurrences of a period with the missing data value (an integer). This can be done by pasting the data into Excel (Figure 1.7) and clicking Edit > Replace in the Excel menu bar. In the window that appears, type a period in the Find what textbox and the missing data value in the Replace with textbox (see Figure 1.8). Then click Replace All.

FIGURE 1.8 Excel replace dialog window

Preparing data from SAS

If the data to be analyzed is in a SAS data f
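For readers who prefer a script to the Excel Replace dialog, the same substitution can be sketched in Python. This is an illustrative alternative, not a step from the manual: the function name `replace_missing` is invented and the sample rows are made up. Only cells that consist of a lone period are replaced, which is an assumption about how SPSS marks missing entries in the pasted data.

```python
def replace_missing(rows, missing_value):
    """Swap cells that are exactly "." for the integer missing-data code."""
    token = str(missing_value)
    return [[token if cell.strip() == "." else cell for cell in row]
            for row in rows]


# Made-up tab-delimited rows as they might look after pasting from SPSS.
pasted = "14256002\t12\t.\t40\n14256003\t.\t14250007\t64"
rows = [line.split("\t") for line in pasted.splitlines()]

cleaned = replace_missing(rows, missing_value=0)
print("\n".join("\t".join(row) for row in cleaned))
```

Matching whole cells rather than single characters means a period inside some other value would be left alone.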
16. FIGURE 4.2 RDSAT Single Variable Partition Analysis Recruitment Tab

Note: Seeds are not included in the sample population sizes.

Key of Group and Trait Correspondence: The green Key of Group and Trait Correspondence at the bottom is used to interpret the data related to recruitment in the analysis. It lists all of the various groups that were analyzed and relates them to the trait they have in common. In this example, Group 0 corresponds to Race 1. Looking at the Race variable, we see that the races are listed in parentheses by their initials (WBO): W = White, B = Black, O = Other. So Group 0 corresponds to the first race in the list, White; Group 1 corresponds to Black and Group 2 to Other.

Recruitments: Matrix of recruitments to and from each group. The vertical axis (rows) depicts the recruiters and the horizontal axis (columns) shows the recruits. For example, this matrix tells us that Group 0 recruited 36 other people in Group 0.

Transition probabilities: Normalizes recruitments by dividing by the total number of recruitments made by the recruiting group, and gives the probability of one group recruiting another. For example, Group 1 recruited 94 from the same group, and so the normalized transition probability is 94/(94+32+18) = .652, where the denominator is the total num
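The normalization in the transition probability example is a division of each row of the recruitment matrix by its row total. A brief Python sketch follows. The function name `transition_probabilities` is invented and the matrix is hypothetical: only the counts 36, 94, 32 and 18 come from the text, and the placement of 32 and 18 within the Group 1 row is assumed.

```python
def transition_probabilities(recruitments):
    """Divide each recruiter row by its total number of recruitments."""
    probabilities = []
    for row in recruitments:
        total = sum(row)
        # A group that recruited nobody gets a row of zeros.
        probabilities.append([count / total if total else 0.0 for count in row])
    return probabilities


# Rows are recruiters, columns are recruits (Groups 0, 1, 2).
# Hypothetical counts, apart from 36 and the Group 1 values 94, 32, 18.
recruitment_matrix = [
    [36, 20, 10],
    [32, 94, 18],
    [5, 12, 8],
]

transitions = transition_probabilities(recruitment_matrix)
print(transitions[1][1])  # 94 / (94 + 32 + 18)
```

Each row of the result sums to 1, so the rows can be read as the recruiting pattern of one group.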
17. FIGURE 3.1 RDSAT Analyze Partition Button

A partition is a user-defined set of groups. Everyone in the population belongs to a group in a partition. The groups are defined by common traits. For instance, a simple partition would consist of just one trait, such as gender. Those with a gender of 1 would form one group, those with a gender of 2 another. A multi-trait partition of race and gender can also be created. A group would then be defined by both a gender and a race value. For example, (race, gender) = (1, 1) would be a separate group from (race, gender) = (2, 1), although both groups have the same gender.

FIGURE 3.2 RDSAT Analyze Partition Window

The partition panel is divided into three parts (see Figure 3.2). The top left contains a list of all traits that may be used for analysis. The top right contains a list of all traits that will be used to make the partition. The bottom contains options for parsing the trait data. To include a trait in the partition, select it and press the right arrow. To remove it from the partition, select it and press the left arrow. For each of the traits included in the partition, how to parse the data values must be selected.

Data Parsing Options

Complete: This option will find every di
18. Hidden Populations for HIV Surveillance. By Robert Magnani, Keith Sabin, Tobi Saidel and Douglas Heckathorn. In AIDS, 2005.

Appendix 1: The RDS Data File

Components of Core Data Files

Note that all data outside of the first two lines must be integer valued.

Header on line 1: Every core data set must begin with the string RDS on the first line.

Parameters on line 2: From left to right, the second line must contain the following integer-valued information:
- Sample Size
- Maximum number of coupons received by a recruit in the sample
- Value for missing data. This value will be used throughout the analysis to refer to missing data. It will override all other values, so it is important to choose an integer value that will not occur elsewhere in the data.

Main data: Subsequent lines contain the main recruitment information, with each line corresponding to a recruit. Arrange the columns from left to right as follows:
- Recruit ID: an integer code acting as the recruit's name
- Personal Network Size
- The serial number of the coupon the recruit received. NOTE: if the recruit is a seed, then this number must be set to the missing data value.
- Serial numbers of the coupons given to the recruit. This data will take up the number of columns specified by the max number of coupons given to a recruit parameter specified on line two. If the recruit was given a number of coupons l
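The layout described in this appendix can be summarized with a small reader. The Python sketch below is an illustration only, under several assumptions: the function name `parse_rds` and the sample file are invented, fields are taken to be separated by whitespace, any tokens after the three integers on line 2 are treated as trait headers, and every data row is assumed to carry the full number of coupon columns.

```python
def parse_rds(text):
    """Read an RDS-style core data file into a list of recruit records."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or lines[0].strip() != "RDS":
        raise ValueError("line 1 must be the string RDS")
    parameters = lines[1].split()
    sample_size, max_coupons, missing = (int(p) for p in parameters[:3])
    traits = parameters[3:]  # assumption: trait headers follow the parameters
    expected = 3 + max_coupons + len(traits)
    recruits = []
    for line in lines[2:]:
        cells = [int(cell) for cell in line.split()]  # all data must be integers
        if len(cells) != expected:
            raise ValueError(f"expected {expected} columns, got {len(cells)}")
        given = cells[3:3 + max_coupons]
        recruits.append({
            "id": cells[0],
            "network_size": cells[1],
            # A seed has the missing-data value in the coupon-received column.
            "coupon_received": None if cells[2] == missing else cells[2],
            "coupons_given": [c for c in given if c != missing],
            "traits": dict(zip(traits, cells[3 + max_coupons:])),
        })
    if len(recruits) != sample_size:
        raise ValueError("number of data rows does not match the sample size")
    return recruits


# Invented three-recruit file: 2 coupons per recruit, 0 marks missing data.
sample = """RDS
3 2 0 Gender(mf)
1 10 0 101 102 1
2 5 101 103 104 2
3 8 102 0 0 1
"""

recruits = parse_rds(sample)
seeds = [r["id"] for r in recruits if r["coupon_received"] is None]
print(seeds)
```

Because the missing-data value overrides all other values, the sketch compares against it before interpreting a coupon number, which is why the appendix advises choosing a value that cannot occur elsewhere in the data.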
19. Problems 44: 174-199.
  o The original article in which RDS was introduced.

Respondent-Driven Sampling II: Deriving Valid Population Estimates from Chain-Referral Samples of Hidden Populations. By Douglas D. Heckathorn. Social Problems, 2002.
  o Article extending the RDS method to include calculation of standard errors and post-stratification to control for differences in network size and clustering across groups.

Salganik, Matthew J. and Douglas D. Heckathorn. In press (December 2004). Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling. Sociological Methodology.
  o Article showing, through both analytic means and simulations, that the RDS population estimator is statistically unbiased.
  o Outstanding Article Publication Award of the Mathematical Sociology Section of the American Sociological Association.

Extensions of Respondent-Driven Sampling: A New Approach to the Study of Injection Drug Users Aged 18-25. By Douglas D. Heckathorn, Salaam Semaan, Robert S. Broadhead and James J. Hughes. AIDS and Behavior, 2002.
  o Empirical evaluation of some of the assumptions underlying RDS and its use to study younger drug injectors.

Group Solidarity as the Product of Collective Action: Creation of Solidarity in a Population of Injection Drug Users. By Douglas D. Heckathorn and Judith E. Rosenstein. Advances in Group Processes, 2002.

Development of a Theory of Collective Action: From the Emergence of Norms to AIDS Pr
20. RDS INCORPORATED

RDS Analysis Tool v5.6 User Manual

RDS INC.
RDSAT 5.6 User Manual
(c) RDS Incorporated
45 Beckett Way, Ithaca, NY 14850
607-255-4368
Publication Date: December 6, 2006

Table of Contents

Installing the RDS Analysis Tool V5.6
Basic Layout Information
Preparing Data from Excel
Preparing Data from SPSS
Preparing data from SAS
Preparing Data from the RDS Coupon Manager
Loading Data
Viewing Data
Setting Options For Analysis
Adjust Average Network Sizes
Number of Re-Samples
Confidence Interval
Pull In Outliers of Network Sizes
Algorithm Types
Partition Analysis
Data Parsing Options
Complete
[entry illegible]
Analyze Continuous Variable
Custom
Breakpoint Analysis
Interpreting a Partition Analysis
[entry illegible]
Key of Group and Trait Correspondence
Recruitments
Transition probabilities
Sample population sizes
Initial Recruits
Estimation
Network Sizes and Homop
21. ain screen serves the same function.

View/Edit RDS: This feature opens the Edit Data screen. The Edit Data button on the main screen serves the same function.

Save RDS Analysis: This feature saves an RDS partition analysis in the form of a text file. It can be imported to Excel as a delimited file.

Print: This feature prints an RDS analysis.

Export DL Network File: Allows the recruitment chain data to be exported to a DL network file. DL format is recognized by numerous network analysis packages, including UCINET and Pajek. Pajek in particular can be used to create attractive social network visualizations, as seen in Figure 7.1.

FIGURE 7.1 Pajek Generated Social Network Visualization

UCINET: http://www.analytictech.com/ucinet_5_description.htm
PAJEK: http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Export Population Weights: This function exports a text file of Population Weights (from the Population Estimates table under the Estimation tab; see XX XX) for each respondent, based on the most recent partition analysis. Weights are linked to respondents by the respondent ID.

Export Individualized Weights: This function exports a text file of individualized RDS weights for each respondent. The weights are calculated based on respondents' individual degrees and the latest partition analysis performed. When generated for a dependent variable, these weights can be used to weight the entire data set f
22. FIGURE 2.2 RDSAT Edit Data Button

View the data loaded by clicking on the Edit Data button, or select View/Edit RDS from the File menu. A new window will pop up displaying the contents of the data files you have loaded (see Figure 2.3). Sample size (264), the value for missing data (0), and the number of coupons per respondent (7) are displayed on the left. The table columns may be rearranged by clicking and dragging them. Click on Save RDS Data to save the data loaded into one file with an .rds extension. The next time this file is loaded, all data, including the core and trait data, will load automatically.

Trait data is any variable that is not core data. Core data consists of the respondent ID, network size, and coupons. Trait data can be Race, Age, etc.

Notice that when a cell in the table is clicked on, its contents may be changed. The changes will be saved to any data file created with the Save RDS Data button.

Note: Be careful not to delete data unintentionally.
23. at SAS will export the data as a space-delimited data file and not a tab-delimited data file. RDSAT is capable of reading both file types. The completed data file will resemble the example below:

RDS
530 11 0 sex agecat race
(data rows illegible in this copy)

Preparing Data from the RDS Coupon Manager

FIGURE 1.9 Excel text import window (Text Import Wizard, Step 1 of 3)

To load data exported from the RDS Coupon Manager, click File > Open in Excel's menu bar and select the exported data. The window of Figure 1.9 should appear. Select Delimited in the file type section and click Next.

FIGURE 1.10 Excel text import window (Text Import Wizard, Step 2 of 3)

In the next wizard screen, be sure to check the box entitled Space. You should see the data line itself up properly at this point (see Figure 1.10). Finally, click Finish.
24. ber of recruits Group 1 made.

Demographically adjusted Recruitment Matrix: Gives hypothetical recruitments if each group recruited with equal effectiveness. Transition probabilities implied by this matrix are identical to those of the original Recruitment Matrix. It is well known that some groups of respondents recruit more than others, e.g., HIV positives often recruit substantially more than do negatives. This is shown in the recruitment matrix if the number of recruitments by HIV positives (i.e., the row sum in the matrix) exceeds the number of recruitments of HIV positives (i.e., the column sum in the matrix). The demographically adjusted recruitment matrix shows what the recruitment matrix would have looked like if all groups had recruited equally (i.e., so row and column sums are equal) without any change in recruitment patterns (i.e., no change in transition probabilities). This type of adjusted matrix is useful for testing one of the assumptions of the statistical theory on which RDS is based, which holds that if recruitment effectiveness is uniform across groups, cross-group recruitments will tend to be equal. Therefore, the cross-group recruitments in the adjusted matrix will differ only by amounts consistent with stochastic variation.

Thus, if positives recruit more than negatives, then in the original recruitment matrix, all else equal, the number of negatives recruited by positives will tend to be greater than
25. coupons given to each respondent, the symbol for missing values. In this sample dataset, the number of respondents is 264, the number of coupons distributed to each respondent is 4, and 0 entries are treated as missing data.

[Figure 1.2: Sample RDS Data in an Excel Spreadsheet]

Note: In this sample data set, each recruiter is given 4 coupons to distribute, and the coupon numbers are 8 digits.

[Figure 1.3: Excel Spreadsheet Custom Field Headers and Data (columns shown: Gender, Race, Age, Airplay)]

Column headers must be entered for all fields other than the main data set (i.e., respondent or survey ID, network size, coupon received from recruiter, coupons given to respondents), such as Gender, Race, Age, etc. If a data value corresponds to a specific group (for example, if a value of 1 corresponds to Male and 2 to Female), you can indicate this in the data set. Abbreviat
26. detailed discussion of the various features of NetDraw is beyond the scope of this document.
27. e the group with a single character, for example, "m" for Male and "f" for Female. Add the abbreviations, in order of increasing value, to the gender header, surrounded by parentheses. In this example, the resulting header would be Gender(mf). Similarly, to indicate for the Race header that Whites correspond to group 1, Blacks to group 2, and all other races to group 3, you may use Race(WBO).

[Figure 1.4: Excel Save As Dialog]

To save this data set to a file, choose Save As and choose the "Text (Tab Delimited)" format.

Preparing Data from SPSS
28. each group. The vertical axis (rows) depicts the recruiters and the horizontal axis (columns) shows recruits.

Re-samples: The number of times random subsets of the data are sampled to derive the bootstrap confidence intervals. More re-sampling will result in better confidence intervals but will be more CPU intensive.

Respondent: A participant in an RDS sampling study.

Respondent ID: A unique integer representing a respondent in a given RDS dataset.

Sample Population Proportions: The naive estimates of population proportions, without correction for over-sampling and other biases.

Sample Population Sizes: The total number of recruits in each group.

Self-Reported Network Size: The number of individuals a respondent reports he or she has in his or her network.

Transition Probabilities: Normalizes recruitments by dividing by the total number of recruitments made by each recruiting group, and gives the probability of one group recruiting another.

Unadjusted Network Sizes: A straightforward arithmetic mean of the sample's network sizes.

Waves Estimation: This feature allows hypothetical recruitment scenarios to be examined. The sample population proportions are considered converged when the change in population proportions between waves is less than the convergence radius times the population proportions.

References

Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations. By Douglas D. Heckathorn. Social
29. Appendix 3: Graphing Recruitment Chains with NetDraw
The Attribute File

RDSAT 5.6 Basics

This chapter will introduce the basics of the RDS Analysis Tool version 5.6. Topics covered include installing the Analysis Tool and preparing data from SPSS, Excel, and the RDS Coupon Manager.

Installing the RDS Analysis Tool v5.6

The RDS Analysis Tool is installed using a standard Windows installer application. First, download the installer to a temporary folder from the following web address (URL): http://www.respondentdrivensampling.org. Click on "Downloads" and select the download that matches your particular operating system and Java configuration. If you are unsure about your Java configuration and are running Windows, choose Option 2, which includes the Java Virtual Machine (JVM). Once the file is downloaded, double-click the newly downloaded application. The installer program will guide you through the installation process. Default installation options are recommended and assumed throughout this manual. To open the program, double-click the RDSAT icon or select it from the Programs listing in the Start Menu.

Basic Layout Information

All RDSAT features are located on the right-hand side of the main screen as buttons, or in the menu bar (see Figure 1.1). The current dataset being analyzed is displayed in the selection men
30. electing Age as the variable to be analyzed and choosing where the breakpoints will lie. A Step of 5 with lower and upper bounds of 25 and 50 will break the dataset into the following 7 categories:

Recruits age 25 or under
Recruits 26-30
Recruits 31-35
Recruits 36-40
Recruits 41-45
Recruits 46-50
Recruits age 51 or older

Likewise, a Step of 1 would produce 27 different categories: one for recruits 25 or under, one for recruits of each age from 26 to 50, and one for recruits age 51 or older.

Interpreting Analysis Results

This chapter explains how to interpret the results of an RDSAT analysis. The various size and proportion estimates are explained along with their corresponding graphs and diagrams.

Interpreting a Partition Analysis

First, create a simple partition with one variable and the "complete" option, as shown in Figure 4.1. Click Analyze.

[Figure 4.1: RDSAT Single Variable Partition Analysis]

After a moment, the results of the analysis will be output to the pages in the main window. To move between pages of the analysis, click on the corresponding tab.

Recruitment: Displays general statistics regarding the recruitment.

[Figure: Recruitment tab, Recruitment by Race(WBO)]
31. eption of an anomaly at 500.

[Figure: Complete Degree Distribution (x-axis: Degree; y-axis: Frequency)]

Interpreting a Breakpoint Analysis

A breakpoint analysis breaks a dataset into groups based on a single continuous variable. A continuous variable of interest might be Age, where one wouldn't examine each individual age as a separate group, but rather a range of ages. As such, there is no recruitment data for breakpoint analyses. Rather, there are interesting trends to notice in homophily and population proportion as the breakpoint is shifted and respondents are moved from the upper group to the lower group. The Estimation tab shows a table of Least Squares population estimates corresponding to each breakpoint value. Similarly, the Network Sizes and Homophily tables are arranged by breakpoint value (see Figure 4.5).

[Figure 4.5: RDSAT Breakpoint Analysis Estimation Tab]

Viewing the data in the graphics tab will often make patterns very clear. For example, in the breakpoint analysis of Chapter 3, New York jazz musicians were analyzed based on their age. Try clicking on Homophily in the
32. ess than that, set some of the values to the missing data value. For example, below are the first 7 lines of the core data set for Doug Heckathorn's New York jazz musicians:

RDS
264 7 0
1 350 0 14250004 14250005 14250006 14256002 901 0 0
2 0 0 14250007 14250008 14250009 14256003 902 0 0
3 585 0 14250010 14250011 14250012 14256004 903 0 0
4 400 0 14250025 14250026 14250027 14256009 904 0 0
5 150 0 14250022 14250023 14250023 14256008 905 0 0

Appendix 2: RDSAT Questions & Answers

Are seeds included in the RDSAT analyses/calculations?

Yes, because recruitments by seeds are treated like any other recruitments, and all recruitments in combination are used to calculate the transition probabilities. In contrast, the self-reported network sizes of seeds are not used to calculate network size estimates, because seeds were not recruited by a peer; they were recruited by key informants or in some other manner.

If a participant reports that the person who gave them a coupon is a stranger, are they included in the RDSAT analysis? If so, what are the implications for the recruitment chains that follow?

In RDS studies, recruitment rights are both scarce and valuable, so respondents tend not to waste them on strangers. Recruitment by strangers therefore tends to be rare, generally 1 to 3%. A reasonable research strategy is to check to see if the respondents recruited by strangers differ significantly from other respondents, and if n
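As a sketch of the core data set layout shown in the Appendix 1 sample, the following Python reads the first lines of such a file. The field interpretation is an assumption drawn from the manual's description (an "RDS" marker line; then sample size, number of coupons, and missing-data value; then one row per respondent with ID, network size, own coupon, and the coupons given out). The function name `parse_core_data` and the dictionary keys are hypothetical, and this is not RDSAT's own loader.

```python
SAMPLE = """RDS
264 7 0
1 350 0 14250004 14250005 14250006 14256002 901 0 0
2 0 0 14250007 14250008 14250009 14256003 902 0 0
"""


def parse_core_data(text):
    """Parse the two-line header and the respondent rows of a core data set."""
    lines = [line.split() for line in text.strip().splitlines()]
    if lines[0] != ["RDS"]:
        raise ValueError("first line should be the RDS marker")
    sample_size, coupons, missing = lines[1][:3]
    coupons = int(coupons)
    records = []
    for fields in lines[2:]:
        network_size = fields[1]
        records.append({
            "id": fields[0],
            # A network size equal to the missing-data value is stored as None.
            "network_size": None if network_size == missing else int(network_size),
            "own_coupon": fields[2],
            "coupons_given": fields[3:3 + coupons],
            # Any analysis variables (e.g. Gender, Race) would follow the coupons.
            "other_fields": fields[3 + coupons:],
        })
    return {
        "sample_size": int(sample_size),
        "coupons": coupons,
        "missing": missing,
        "records": records,
    }


parsed = parse_core_data(SAMPLE)
print(parsed["sample_size"], len(parsed["records"]))
```

Values are kept as strings except where a number is needed, so coupon numbers are not altered by the parser.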
33. evention, and the Analysis of Social Structure. By Douglas D. Heckathorn. In New Directions in Sociological Theory: Growth of Contemporary Theories, Joseph Berger and Morris Zelditch (editors). Rowman and Littlefield, 2002.
o History of RDS and the research project from which it emerged.

Heckathorn, Douglas D., and Joan Jeffri. 2003. "Social Networks of Jazz Musicians." Pp. 48-61 in Changing the Beat: A Study of the Worklife of Jazz Musicians, Volume III: Respondent-Driven Sampling: Survey Results by the Research Center for Arts and Culture. National Endowment for the Arts Research Division Report #43. Washington, DC, 2003.
o Use of RDS to study a non-stigmatized hidden population, jazz musicians.

Finding the Beat: Using Respondent-Driven Sampling to Study Jazz Musicians. By Douglas D. Heckathorn and Joan Jeffri. Poetics, 2001.
o Use of RDS to study a non-stigmatized hidden population, jazz musicians.

Making Unbiased Estimates from Hidden Populations Using Respondent-Driven Sampling. By Matthew J. Salganik and Douglas D. Heckathorn. Paper presented at the International Social Network Conference, February 2003, Cancun, Mexico.

Street and Network Sampling in Evaluation Studies of HIV Risk-Reduction Interventions. By Salaam Semaan, Jennifer Lauby, and Jon Liebman. AIDS Review, 2002.
o Comparison and Evaluation of Alternate Methods for Sampling Hidden Populations.

Review of Sampling Hard to Reach and
34. further recruitment waves do not change the population proportion by a significant amount.

Mean Network Size, N (adjusted): Network sizes are adjusted for sampling bias. In a chain-referral sample, those with more connections and larger personal network sizes tend to be over-represented in the sample. This can potentially bias sample estimates. To learn more about the methods used, refer to "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling" by Douglas Heckathorn and Matthew Salganik.

Mean Network Size, N (unadjusted): Straightforward arithmetic mean of the sample's network sizes.

Homophily (Hx): A measure of preference for connections to one's own group. Varies between -1 (completely heterophilous) and +1 (completely homophilous).

Affiliation Homophily (Ha): A homophily measure based on the equilibrium proportions. It provides a measure of homophily which is not affected by differential degree across groups.

Degree Homophily (Hd): A measure of the level of homophily that is attributable to differential degree across groups.

Population Weights: The population weights can be calculated using either the linear least squares algorithm or the data smoothing algorithm, depending on how the options are set for the RDS analysis. In the above diagram, the data smoothing algorithm was used. See the Algorithms section of Chapter 2 for more information on the difference between various
35. graphics tab of the RDSAT main window.

[Figure: Homophily by breakpoint for the Lower Group and Upper Group (x-axis: Breakpoint; y-axis: Homophily)]

There are several visible patterns. Homophily tends to zero as the age variable increases. This implies that differences in age become less important for choosing relationships the older the recruits are. It is also notable that the older group is always more homophilous than the younger group. Finally, it is possible to see that homophily is strongest where age is the lowest (25). This implies that young jazz musicians show a strong preference for relationships with other young jazz musicians.

[Figure: Population Proportions by breakpoint for the Lower Group and Upper Group (x-axis: Breakpoint)]

Next, click on LLS Population Proportions on the Graphics page to find the breakpoint where the population of the upper group equals that of the lower group. From this it can be inferred that half of the musicians are less than 43 years old. Note that although the graph's x-axis ranges from 0 to 25, we are conducting a breakpoint analysis on groups age 25 to 50. Therefore, the above intersection corresponds to an age of 43 (18 + 25), not 18.

Handling Missing Data in the Dataset

Most data sets contain missing data. RDSAT offers two ways of setting missing data. Both of these options will be covered in this chapter.

RDSAT employs two features to handle missing data. The first makes i
36. hily
Adjusted Average Network Sizes
Unadjusted Network Sizes
Network Size Information
Affiliation
Graphics and Histograms
Transition Probabilities
Degree List
Bootstrap Simulation Results
Degree Distributions
Interpreting a Breakpoint Analysis
Replace Missing Data
Impute Median Values
Add Field Sample Weights
Estimate Number of Waves Required
Save RDS Analysis in the File Menu
RDS Analysis Tool File Menu Features
New RDS
Save RDS
Export DL Network File
Export Population Weights
Export Individualized Weights
Export Estimation Table
Export Table of Recruitments
Options
Exit
RDS Glossary of Terms
Appendix 1: The RDS Data File
Appendix 2: RDSAT Questions & Answers
37. ile, then the following steps will transform the data from a SAS data file to a data file that can be read by RDSAT. First, export the SAS data file using the following code fragment. The portions enclosed in < > brackets are specific to the dataset and must be altered.

data <one>;
set <name of your main SAS data file>;
file "<Target Directory\RDSATdata.txt>";
put @1 SurveyID Degree Coupon_submitted Coupon_given_0 Coupon_given_1 Coupon_given_2 age sex race;
run;

Note: The < > brackets indicate that the user fills in this information. Age, sex, and race are examples of variables you might want to analyze.

There are two features of note in the above code. First, the output file must be a text file (suffix .txt) or a data file (suffix .dat). RDSAT only reads these file types. Second, the variables that comprise the main data set (SurveyID, Degree, Coupon_submitted, Coupon_given_0, Coupon_given_1, Coupon_given_2) must be in the order shown above. Then add the variables you want to analyze, such as age, sex, and race. RDSAT requires that the data be placed in this order, and doing so in the output step will save time.

Once the data has been exported, open the file using Notepad or WordPad and add the two-line header as described in the section of this chapter entitled Preparing Data from Excel. An example header is displayed at the top of the data file fragment below. The data file is ready to be read by RDSAT. Note th
38. ing. The recommended algorithm is Data Smoothing, which adjusts recruitments across groups, providing tighter confidence intervals than the naive LLS method. Enhanced Data Smoothing assigns a tiny non-zero number to all cells in the recruitment matrix, then uses Data Smoothing. This allows an analysis to include non-recruiting groups, which would normally fail using LLS or Data Smoothing.

Analyzing a Dataset

This chapter introduces the analysis features of RDSAT. This is the heart of the software's functionality. Topics include Partition Analysis, Breakpoint Analysis, and Custom Analysis.

Partition Analysis

Once an RDS dataset is successfully loaded, click on Analyze Partition in the upper right of the main window (see Figure 3.1). Clicking this button opens the window of Figure 3.2.
39. n respondent's degree and the partition variable. When calculated for a dependent variable, the data set can be weighted by this value for multivariate analysis.

Degree: The respondent's degree, or personal network size.

Export Table of Recruitments

Options

This feature opens the options menu. The Change Options button on the main screen serves the same function.

Exit

This feature exits the RDS Analysis Tool program. It serves the same function as the "x" in the top right-hand corner of Windows programs.

NOTE: Make sure not to close the program without saving necessary data changes.

RDS Glossary of Terms

Adjust Average Network Size Option: In a chain-referral sample, those with more connections and larger personal network sizes tend to be over-represented in the sample. This RDSAT option corrects this bias.

Adjusted Average Network Sizes: Network sizes that are adjusted for sampling bias.

Affiliation Matrix: Displays preference measures for connections between all group pairs. The diagonal of this matrix is the homophily within a group.

Bootstrap Simulation Results: Shows the histogram of bootstrap estimates of Least Squares population proportions. The horizontal axis depicts population estimates for the specified group. The vertical axis shows the frequency of the bootstrap estimate.

Breakpoint Analysis: A breakpoint analysis allows one trait to be analyzed over a range of possible breakpoints. This i
40. or multivariate analysis.

Export Estimation Table

This function exports a text file of output and weights, corresponding to the most recent partition analysis performed, for each respondent in the data. In essence, this reproduces the Population Estimates table from the Estimation tab in RDSAT; therefore, a partition analysis MUST be performed in order for this function to be available. See XXXX in this manual for a more detailed explanation of the Population Estimates table. The exported fields are:

RID: The Respondent ID.
Group: Group number to which the respondent belongs.
PopEst: The RDS population proportion estimate of the respondent's group.
Sample: The sample proportion of the respondent's group.
RecruitProp: The recruitment proportion of the respondent's group.
Equilibrium: The equilibrium proportion of the respondent's group.
Hx: The RDS homophily measure for the respondent's group.
Ha: The affiliation homophily measure for the respondent's group.
Hd: The degree homophily measure for the respondent's group.
Weight: The population weight for the respondent's group.
RecComponent: The recruitment component for the respondent's group (RCx).
DegComponent: The degree component for the respondent's group (DCx).
IndDegreeComp: The degree component based on the respondent's individual degree. This value is unique to the respondent.
IndweightComp: The individualized RDS estimator weight based o
41. ore data set. The core data set contains information about the sample size, missing data values, and number of coupons per respondent. Start the RDS Analysis Tool and choose Open New RDS, or select the File menu and click on New RDS (see Figure 2.1). When a file chooser dialog window appears, select the RDS data file and choose Open. The nyjazz.txt file included in this distribution is a good sample file to work with if no real dataset is available. If the default installation directory was used, this sample file will be located at C:\Program Files\rdsat\nyjazz.txt. For more information on the core data set, refer to Appendix 1. Data pertaining to other population features of interest can also be included in this file. Analysis cannot be carried out until this data is loaded.

Note: The sample RDS data set of New York jazz musicians was collected by Douglas Heckathorn and analyzed in "Finding the Beat: Using Respondent-Driven Sampling to Study Jazz Musicians," Douglas D. Heckathorn and Joan Jeffri, Poetics, 2001.

Viewing Data
42. orithm or the data smoothing algorithm, depending on how the options are set for the RDS analysis. In the above diagram, the data smoothing algorithm was used. See the Algorithms section of Chapter 2 for more information on the difference between various estimation algorithms in RDSAT.

1. Least Squares Population Proportions: Reports the estimated population proportions of each group, using linear least squares to solve the population equations.
2. Data Smoothed Population Proportions: Reports estimated population proportions for the Data Smoothed population equations.

Sample Population Proportions: Reports the sample population proportions, also called the naive estimates of population proportions. The term naive is used because the proportion is a simple ratio of how many of a particular group were recruited to the total number of recruits. It is not adjusted for any statistical biases. To learn more about the methods used, refer to "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling" by Douglas Heckathorn and Matthew Salganik.

Recruitment Proportions: The unadjusted recruitment proportions for the sample. This is the same as the Sample Population Proportions if seeds were not included in the calculation.

Equilibrium Sample Distribution: The equilibrium sample population proportions indicate each group's population size after the proportions have converged to their equilibrium value. This occurs when
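The naive sample proportion and the idea of proportions converging to equilibrium over recruitment waves can be sketched in Python as follows. This is an illustration under stated assumptions, not RDSAT's implementation: the transition probabilities, the starting composition, the default radius of 0.02, and the exact form of the stopping rule (change between waves no larger than the radius times the current proportion) are all hypothetical choices.

```python
def sample_proportions(group_counts):
    """Naive estimate: recruits in each group divided by all recruits."""
    total = sum(group_counts)
    return [count / total for count in group_counts]


def waves_until_converged(transitions, start, radius=0.02, max_waves=100):
    """Apply the transition probabilities wave by wave until every group's
    proportion changes by no more than `radius` times its current value."""
    n = len(transitions)
    current = list(start)
    for wave in range(1, max_waves + 1):
        following = [
            sum(current[i] * transitions[i][j] for i in range(n))
            for j in range(n)
        ]
        converged = all(
            abs(following[j] - current[j]) <= radius * following[j]
            for j in range(n)
        )
        current = following
        if converged:
            return wave, current
    return max_waves, current


# Hypothetical example: all seeds come from group 1.
transitions = [[0.6, 0.4], [0.2, 0.8]]
waves, composition = waves_until_converged(transitions, start=[1.0, 0.0])
print(waves, [round(p, 3) for p in composition])
print(sample_proportions([30, 70]))
```

Even with every seed drawn from one group, the composition moves toward the same equilibrium that any other starting point would reach.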
43. ot, then to treat these as valid recruitments. A maximally conservative research strategy would be to delete from the data set the serial number linking the recruit to the stranger recruiter. The recruit would then be treated as a seed, and the stranger recruiter would become the terminus of a recruitment chain. Neither respondent would be deleted from the data set, but the number of peer recruitments would be reduced.

Are there any other essential variables we should be analyzing in RDSAT, other than gender, race, and age?

The variables to be analyzed depend on the research questions being addressed. RDS is a sampling method, a method for drawing statistically valid samples, so its role is to help ensure that the answers are statistically valid.

How does restricting recruitment to specific races affect the legitimacy of the survey and/or RDSAT analysis?

This restriction of the sampling frame narrows the scope of the study; e.g., limiting recruitment to Latino IDU would mean that the study would yield no information about non-Latino IDU or Latina IDU. How to best choose the sampling frame depends on the aims of the study.

How does RDSAT account for missing data? For example, one of our sites lost 2 interviews (handheld computer malfunction), one from a seed and the other from a non-seed respondent.

Currently, RDSAT will not process the entire recruitment chain linked to a record with missing data.

How does RDSAT adj
44. ple. First, a partition analysis of the relevant variables must be run (see p. 17 for more information on executing a multivariate partition analysis). Once you have done a partition analysis, identify the groups of interest for prevalence estimation using the Key. In our example, HIV-positive males are Group 1 1 and non-HIV-positive males are Group 1 2.

[Figure: RDSAT main window showing the Key of Group and Trait Correspondence]

We are now ready to perform prevalence estimation. From the menu items, select Analyze > Estimate Prevalence.

[Figure: Analyze menu with Estimate Prevalence selected]

The prevalence function requires you to enter the denominator and numerator used for estimation. Use the Select Group buttons to enter these fields. The groups appearing in the pull-down menu correspond to groups from the most recent partition analysis performed. Then click OK.
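The numerator and denominator choice can be illustrated with a small Python sketch. It assumes that the point estimate is the ratio of summed group population estimates, which is how the numerator and denominator are described in this manual; how RDSAT derives the confidence interval is not covered. The function name `prevalence`, the group keys, and the estimate values are hypothetical.

```python
def prevalence(population_estimates, numerator_groups, denominator_groups):
    """Ratio of the summed estimates of the numerator groups to the
    summed estimates of the denominator groups."""
    numerator = sum(population_estimates[g] for g in numerator_groups)
    denominator = sum(population_estimates[g] for g in denominator_groups)
    return numerator / denominator


# Hypothetical population estimates keyed by (gender, HIV status) group.
estimates = {(1, 1): 0.08, (1, 2): 0.42, (2, 1): 0.05, (2, 2): 0.45}

# HIV prevalence among males: Group 1 1 over Group 1 1 plus Group 1 2.
hiv_among_males = prevalence(estimates, [(1, 1)], [(1, 1), (1, 2)])
print(round(hiv_among_males, 3))
```

Note that the denominator includes the numerator group, matching the instruction that it contain both HIV-positive and non-HIV-positive males.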
45. [Figure 2.3: RDSAT Spreadsheet View (side panel shows Sample Size: 595; Number of Coupons per Recruit: 3; Value for Missing Data: 9999)]

Setting Options For Analysis

Before conducting an analysis, check the options that will be used. Choose Options from the main window. The window of Figure 2.4 will appear.

[Figure 2.4: RDSAT Options Window]

Adjust Average Network Sizes

In a chain-referral sample, those with more connections and larger personal network sizes tend to be over-represented in the sample. This can potentially bias sample estimates. The phenomenon can be corrected, however, and the RDS Analysis Tool does so by default. To learn more about the methods used, refer to "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling" by Douglas Heckathorn and Matthew Salganik. If you do not wish to adjust the average network sizes for this sample bias, uncheck the box.

Number of Re-samples

This is the number of times the data is re-sampled to derive the bootstrap confidence intervals. For accurate confidence intervals, keep this option at or above the default value of 2500. For optimal accuracy, a number over 15,000 is recommended. Be aware, however, tha
46. raph displays the adjusted network sizes of each group. Observe that group 3 (the rightmost bar) has the highest network size.

Transition Probabilities

This is a 2-dimensional histogram of the transition probabilities. A brighter color corresponds to a higher value. It is a method of visualizing the corresponding transition matrix.

[Figure: Transition Probabilities]

Degree List

List of all network sizes reported in the sample. The list is sorted from least to greatest for easy viewing of the distribution.

[Figure: Sorted Degree Sequence]

In the graph above, we see that there are a few respondents with networks as large as 900, but most respondents fall within a degree of 100-300.

Bootstrap Simulation Results

Shows the histogram of bootstrap estimates of Least Squares population proportions. The horizontal axis depicts population estimates for the specified group. The vertical axis shows the frequency of the bootstrap estimate.

[Figure: Frequency of Population Proportions from Bootstrap Procedure (x-axis: Population Prop.; y-axis: Frequency)]

Degree Distributions

Distribution of network sizes for each group and for the population as a whole. The diagram below happens to be of the entire population. We see that most members of the population have network sizes close to 100 or 200, and the frequency of higher network sizes decreases, with the exc
47. rst run a partition analysis. This analysis defines the groups that will be used to impute the degree. Next, click Impute Degree. To make changes permanent, click Save RDS Data File.

Note: The Impute Degree feature only functions after a partition has been analyzed, because it imputes each missing degree with the mean unadjusted network size of the partition group to which the respondent belongs. To learn more about partition analysis, see Chapters 3 and 4 of this manual.

Add Field Sample Weights

This feature adds the Field Sample Weights to the RDS data file. It only appears in the Edit Data screen when a partition has been analyzed. In the Edit Data screen, click Add Field Sample Weights. A new column of data will appear that contains the Field Sample Weights. Click Save RDS Data File to make this change permanent.
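The Impute Degree behaviour described above, filling a missing degree with the mean unadjusted network size of the respondent's partition group, can be sketched in Python. This is an illustration only: the record layout, the function name `impute_degree`, and the sample values are hypothetical, the missing-data value of 9999 is taken as an example, and each group is assumed to have at least one reported degree.

```python
MISSING = 9999  # example missing-data value


def impute_degree(records, missing=MISSING):
    """Replace each missing degree with the plain arithmetic mean of the
    reported degrees in the same group.

    `records` is a list of (group, degree) pairs.
    """
    reported = {}
    for group, degree in records:
        if degree != missing:
            reported.setdefault(group, []).append(degree)
    group_means = {g: sum(values) / len(values) for g, values in reported.items()}
    return [
        (group, group_means[group] if degree == missing else degree)
        for group, degree in records
    ]


records = [("M", 10), ("M", 30), ("M", MISSING), ("F", 8), ("F", MISSING)]
imputed = impute_degree(records)
print(imputed)
```

Because the group means come from the partition, a different partition would generally give different imputed values, which is why the feature requires a partition analysis first.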
48. s very useful for continuous variables such as age.

Complete Variable Analysis: This option will find every distinct value in the data file associated with a variable (trait) and create new groups based on that value.

Confidence Interval: The value of this parameter determines the level of confidence for the confidence intervals reported in the analysis. The default, .05, measures the normalized length of a tail of the distribution of population proportions. In short, it determines 90% confidence for the intervals reported in the analysis.

Cut Outliers: An RDSAT option that eliminates extremely small and large outliers in network sizes from the dataset.

Data Smoothed Population Proportions: Reports estimated population proportions for the Data Smoothed population equations.

Data Smoothed Population Weights: Multiplicative factors by which the Data Smoothed estimates differ from the naive estimates.

Degree Distributions: Distribution of network sizes for each group and for the population as a whole.

Degree List: List of all network sizes reported in the sample. The list is sorted from least to greatest for easy viewing of the distribution.

Demographically-adjusted Recruitment Matrix: Gives hypothetical recruitments if each group recruited with equal effectiveness. Transition probabilities implied by this matrix are identical to those of the original Recruitment Matrix.

DL Network File: DL format is recogni
49. stinct value in the data file associated with that trait and create new groups based on that value. For example, if the trait "gender" has two values in the data file (1, 2), the complete option will make a new group associated with each of these values. If the trait "race" has three values (10, 11, 12), then the complete option will create 3 more groups corresponding to those trait values. If both gender and race are included in the partition, there will be 2 x 3 = 6 groups in all (race, gender): (10, 1), (11, 1), (12, 1), (10, 2), (11, 2), (12, 2).

Breakpoint: This will take every value below the specified breakpoint and create a new group based on it; a 2nd group is created based on every value greater than or equal to the specified breakpoint. This is different from a breakpoint analysis (discussed in the next section) in that only one breakpoint is chosen for the dataset, rather than a range of breakpoints. The analysis is identical to a complete partition analysis, with the exception of creating exactly 2 groups from a partition in the dataset rather than one for every possible trait value. For example, the trait "age" has a range of values associated with it. It would be impractical to create a group for every distinct age, but by choosing breakpoint with a value of 40, the population can be divided into a group less than 40 years old and a group 40 years old or older.

Analyze Continuous Variable

This
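The Complete and Breakpoint options described above can be sketched in Python. This only illustrates how the groups are formed; the function names and the sample values are hypothetical, and it is not RDSAT's code.

```python
from itertools import product


def complete_groups(*trait_columns):
    """One group per combination of the distinct values of each trait."""
    distinct = [sorted(set(column)) for column in trait_columns]
    return list(product(*distinct))


def breakpoint_group(value, breakpoint):
    """Values below the breakpoint form one group; values greater than
    or equal to it form the other."""
    return "lower" if value < breakpoint else "upper"


# Hypothetical columns using the trait values from the example above.
race = [10, 11, 12, 10, 12]
gender = [1, 2, 2, 1, 1]

groups = complete_groups(race, gender)
print(len(groups), groups)

ages = [23, 39, 40, 57]
print([breakpoint_group(age, 40) for age in ages])
```

With three race values and two gender values the complete option yields six groups, while a breakpoint always yields exactly two, with the breakpoint value itself falling in the upper group.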
50. t possible to reassign another value to missing data. In this way, respondents for whom data are missing can be included in the analysis to see whether missing data is random or associated with other variables. For example, in an analysis of HIV prevalence, respondents would be divided into three categories: positive, negative, or missing. One could then run analyses in a statistics program to see whether having missing data is correlated with other terms, such as race/ethnicity. The other data imputation procedure sets missing values at the median of the variable.
Replace Missing Data: This feature is found in the Edit Data menu. It lets you choose a trait and specify which value the missing data within that trait should take. This option can also be used to give missing data a unique value, so that groups can form on the basis of whether they have missing data. For reference, the missing data value is displayed on the left-hand side of the Edit Data screen. To use this feature, click Replace Missing Data. In the pop-up, select both the trait for which data is to be replaced and the value to which it should be set. Click Commit Changes to replace the data. Then click Save RDS Data File to make the changes permanent. To re-analyze a dataset, close the Edit Data screen and run new analyses.
[Figure: Edit Data screen with Network Size, Own Coupon, and Coupons columns and the Save RDS Data File button]
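The two imputation ideas in this excerpt, giving missing data a value of its own and setting it to the median of the variable, can be sketched in Python as follows. This is only an illustration under stated assumptions: the row layout, the `MISSING` code of -1, and both function names are hypothetical and are not taken from RDSAT, which shows its actual missing data value on the Edit Data screen.

```python
from statistics import median

MISSING = -1  # hypothetical missing-data code, for illustration only


def replace_missing(rows, trait, new_value, missing=MISSING):
    """Return copies of the rows with missing values of one trait replaced."""
    return [
        {**row, trait: new_value} if row[trait] == missing else dict(row)
        for row in rows
    ]


def impute_median(rows, trait, missing=MISSING):
    """Set missing values of a trait to the median of its observed values."""
    observed = [row[trait] for row in rows if row[trait] != missing]
    return replace_missing(rows, trait, median(observed), missing)


rows = [
    {"id": 1, "hiv": 1, "age": 30},
    {"id": 2, "hiv": MISSING, "age": MISSING},
    {"id": 3, "hiv": 2, "age": 50},
]

# Give missing HIV status its own category (3) so it forms a separate group.
recoded = replace_missing(rows, "hiv", 3)
# Fill the missing age with the median of the observed ages.
filled = impute_median(rows, "age")
print(recoded[1]["hiv"], filled[1]["age"])
```

Both helpers return new rows rather than editing in place, loosely mirroring the manual's distinction between committing changes and saving them to the data file.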
51. t the bootstrap is demanding of CPU time. There may be a short wait if this value is set to a high number.
Confidence Interval: The value of this parameter determines the level of confidence for the confidence intervals reported in the analysis. The default, 0.05, measures the normalized length of a tail of the distribution of population proportions. In short, with 0.05 cut from each of the two tails, it yields 90% confidence for the intervals reported in the analysis.
Pull In Outliers of Network Sizes: With this option you may eliminate extremely small and large outliers in network sizes. Check the box and enter the percentage of each end of the network size distribution that you would like pulled in. For example, a value of 5 would pull in the top 5% and bottom 5% of the network size values. If this option is selected and the program encounters an individual whose network size is outside the specified bounds, that network size is set to the value of the nearest (lower or upper) bound. A modest value is recommended. To view the changes, use the View/Edit utility. The changes made by the Pull In Outliers of Network Sizes option may then be saved to a data file. Note: Check for outliers by running a univariate frequency in SAS, SPSS, or Excel before importing data to RDSAT.
Algorithm Type: Three different algorithms are available for analyzing an RDSAT dataset: Linear Least Squares (LLS), Data Smoothing, and Enhanced Data Smooth
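The Confidence Interval and Pull In Outliers options above can be made concrete with a small Python sketch. It rests on assumptions: the function names are hypothetical, and the rank-based rule used to locate the lower and upper bounds is one plausible reading of the description, not RDSAT's documented algorithm.

```python
def confidence_level(tail):
    """Confidence level implied by cutting `tail` from each end of the distribution."""
    return 1 - 2 * tail


def pull_in_outliers(sizes, percent):
    """Clamp network sizes lying in the top or bottom `percent` of the sample.

    The bounds are the values found `percent` of the way in from each end
    of the sorted sizes (an assumed rule, for illustration only).
    """
    ordered = sorted(sizes)
    n = len(ordered)
    k = min(int(n * percent / 100), (n - 1) // 2)  # values treated as outliers per end
    lower, upper = ordered[k], ordered[n - 1 - k]
    return [min(max(size, lower), upper) for size in sizes]


# The default tail of 0.05 corresponds to 90% confidence.
print(confidence_level(0.05))

# With 10 respondents and a value of 10, one size at each end is pulled in.
sizes = [1, 5, 6, 7, 8, 9, 10, 11, 12, 500]
print(pull_in_outliers(sizes, 10))
```

Clamping keeps every respondent in the dataset, which is why the option is described as pulling outliers in rather than deleting them.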
52. tially bias sample estimates. To learn more about the methods used, refer to "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling" by Douglas Heckathorn and Matthew Salganik.
Unadjusted Network Sizes: Straightforward arithmetic mean of the sample's network sizes.
Network Size Information: Displays the minimum and maximum network sizes for the sample.
Homophily: A measure of preference for connections to one's own group. Varies between -1 (completely heterophilous) and +1 (completely homophilous).
Affiliation Matrix: Displays the same preference measures, but for all group pairs.
Graphics and Histograms: This tab displays visual illustrations of data presented in the previous sections of this chapter.
Homophily [Figure: Homophily bar chart]: This graph displays homophily within 3 different groups. Each group is shown as a separate bar. The graph illustrates that Group 2 (the middle bar) has the highest homophily, roughly 0.3, followed by Group 1 (the leftmost bar) and Group 3 (the rightmost bar).
Population Proportions [Figure: Population Proportions bar chart]: This graph displays the population proportion of each group. The y-axis is the population proportion and should be read as a percentage. We see that Group 1 (the leftmost bar) comprises more than half the total population, followed by Groups 2 and 3.
Average Adjusted Network Sizes [Figure: Avg Net Sizes chart]: This g
53. u entitled Rds Data File. When a dataset has been analyzed, all graphs and figures can be found in the set of tabbed windows at the bottom of the main screen.
[FIGURE 1.1: RDSAT Main Window]
Preparing Data from Excel
RDSAT accepts data in the form of a text file. To load an existing Excel spreadsheet into RDSAT, the columns of the dataset must be in the following order:
Respondent ID
Self-Reported Network Size
Coupon Received from Recruiter
Coupons given to Respondent (C1 to C4)
Other variables then follow (e.g., gender, race, age, etc.). The first two rows of the spreadsheet make up the RDSAT header. The first line must be RDS. The second line is the sample size, the number of
54. ust for differential coupon distribution. For an in-depth look at the methods used in RDS analysis, please consult "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling". The citation for this paper can be found in the References section of this manual. Please also consult the References section for more RDS-related literature.
Appendix 3: Graphing Recruitment Chains with NetDraw
Graphing recruitment chains can be done using NetDraw, a network graphing program that comes with UCINet. Graphing an RDS recruitment chain requires 2 different data files:
1. The DL File, created with RDSAT, contains information on the structure of the chains (who recruited whom).
2. The Attribute File contains information on the respondents and is created from the RDSAT data file.
The DL File
1. To create the file, load your data into RDSAT. Select File > Export DL Network File. Save the file.
2. Open UCINet.
3. Click on the Draw menu option. This will open NetDraw.
4. Once you have opened NetDraw (it should say "NetDraw Visualization Software" at the top), open the DL File you saved by selecting File > Open > Ucinet DL text file > Network (1-mode). Open the DL file you created. You should see a few red dots on the screen.
5. To view the recruitment chain, select Layout > Graph Theoretic layout > Spring Embedding. Select the following criteria in the popup box: Layout Criteria Distances
55. zed by numerous network analysis packages, including UCINet and Pajek. Pajek in particular can be used to create attractive social network visualizations.
Enhanced Data Smoothing: An RDSAT option that allows analysis to take place even in a dataset with no recruitment data for a particular group.
Homophily: A measure of preference for connections to one's own group. Varies between -1 (completely heterophilous) and +1 (completely homophilous).
Impute Missing Data and Re-Analyze: Sets missing data to their most probable value given the transition probabilities.
Initial Recruits: Reports the number of seeds (i.e., people recruited by the researcher) in each group.
Least Squares Population Proportions: Reports the estimated population proportions of each group, using linear least squares to solve the population equations.
LLS Population Weights: Multiplicative factors by which the Least Squares estimates differ from the naive estimates.
Partition: A user-defined set of groups. Everyone in the population belongs to a group in a partition. The groups are defined by common traits.
Re-Analyze with Specified Missing Data: This feature lets you choose a trait and specify which value the missing data within that trait should take. It can also be used to give missing data a unique value, so that groups can form on the basis of whether they have missing data.
Recruitment Matrix: Matrix of recruitments to and from