User Guide for the Public-use Microdata File Survey of
Contents
1. were reduced by the proportion of rent charged to business. This may affect comparisons with data from previous years.

For the 2001 and 2005 reference years, extra questions were included for use in the weighting of the Consumer Price Index. This change may affect some historical comparisons. For example, in both 2005 and 2001, questions were added under Personal care to collect extra information about hair care products, makeup, fragrances, deodorants and oral hygiene products. As a result of these extra questions, respondents may have given more precise information, and the increase in the estimated expenditures for Personal care in 2001 and 2005 may have been caused by an improvement in respondent recall. (Statistics Canada, page 13, Catalogue no. 62M0004XCB.) The effect of additional questions on estimates is difficult to quantify. However, in 2002, when the extra questions were removed, the estimate for Personal care spending decreased again. For the 2006 SHS and subsequent years, the extra questions of 2005 were retained.

The section of the questionnaire which covers Repairs and improvements of owned principal residences was extensively revised. From 1997 to 2003, this section had three broad questions: "Additions, renovations and other alterations", "Replacement or new installation of built-in equipment, appliances and fixtures", and "Repairs and maintenance". Starting with the 2004 Survey of Household Spending, there were
2. Example 1b: Approximation of CV for estimates of totals (qualitative variable)

Let us assume we now want to estimate the total number of Canadian one-person households, as well as the total number of Canadian households made up of one person living in different types of accommodation. In this case, variable X is defined as having the value 1 if the household is a one-person household; if not, it is 0. We must create five Z variables: Z1 has a value of 1 if the type of residence occupied is a single-family dwelling and 0 if not; Z2 equals 1 if the type of residence is semi-detached and 0 if it is not; Z3 equals 1 if the type of residence is a townhouse and 0 if it is not; Z4 equals 1 if the type of residence is a row house and 0 if it is not. Finally, Z5 equals 1 if the type of house is "other" and 0 if it is not. Y1 is defined as the product of X and Z1, Y2 as the product of X and Z2, etc.

The estimates obtained are 3,644,715 for the set of one-person households, 1,163,660 for single-family dwellings, 147,987 for semi-detached houses, 181,246 for town houses and 2,151,822 for other. We want to calculate the approximate CVs for these estimates. Figure 6 shows the results for some steps in the calculation of the approximate CV. The results presented for steps 4 to 9 are the results for Manitoba, presented as an example for a province (they will be used for comparison in the next example), while tho
3. In the territories, coverage was restricted to 91.7% in the Yukon, 91.5% in the Northwest Territories and 91.4% in Nunavut. Note that the coverage in Nunavut for 2005 had decreased to 68.3% but is now back at the level it was before 2005 (91.4%). Users should remember this when comparing aggregated data over time.

Spending data were collected for every household member at the time of the interview, including those who joined the household in 2007 or 2008, regardless of whether the previous household existed or the person was living alone. Data were not collected for those who left the household in 2007 or 2008. As a result, an important difference between the 2006-2007 SHS and previous SHS methodology is the elimination of the distinction between part-year and full-year members and households. Persons temporarily living away from their families (for example, students at university) were included in the household to avoid double counting.

Survey content and reference period

Detailed information was collected about expenditures for consumer goods and services, changes in assets, mortgages and other loans, and annual income. This information was collected for the calendar year 2007 (the survey reference year). Information was also collected about dwelling characteristics (e.g., type and age of heating equipment) and household equipment (e.g., appliances, communications equipment and vehicles). This type of information was collected as
4. differences in the documents to be published or released.

f) Unrounded estimates cannot, under any circumstances, be published or released in any way whatsoever by users. Unrounded estimates give the impression that they are much more precise than they actually are.

Guidelines for the weighting of the sample for totalling purposes

The sample design used for the SHS is not self-weighted, meaning that the households in the sample do not all have the same sampling weight. To produce simple estimates, including standard statistical tables, users must use the appropriate sampling weight. Otherwise, the estimates calculated using the microdata files cannot be considered as representative of the observed population and will not correspond to those that would be obtained by Statistics Canada using this microdata file. See "Weighting, re-weighting and Census historical revision of SHS". Users should also note that, depending on the method they use to process the weight field, some software packages may not produce estimates that correspond exactly to those of Statistics Canada using this microdata file.

Types of estimates: categorical versus quantitative

Before discussing how SHS data can be totalled and analysed, it is useful to describe the two main types of estimations that may be produced from the microdata file for the Survey of Household Spending.

Categorical estimates

Categorical estimates are estimates of the number or percentage of househol
5. fourteen detailed questions and two columns, giving respondents the opportunity to split the costs for each question between "Repairs and maintenance" and "Improvements and alterations".

Beginning with the 2006 SHS, computer-assisted personal interviews (CAPI) replaced the previous paper questionnaire. The household members, dwelling characteristics, and household facilities and equipment are all as of the time of the interview, instead of as of December 31 as in previous years. Household spending data were collected for the reference year for all members of the household as of the time of the interview, eliminating the distinction between part-year and full-year members and households.

Guidelines for tabulation, analysis and dissemination

This section describes the guidelines that users should follow when totalling, analysing, publishing or releasing data taken from the public-use microdata file.

Guidelines for rounding

To ensure that estimates from this microdata file intended for publication or any other type of release correspond to estimates that would be obtained by Statistics Canada, we strongly recommend that users comply with the following guidelines for rounding estimates:

a) Estimates in the body of a statistical table must be rounded to the nearest hundred using the traditional rounding technique, i.e., if the first or only number to be eliminated is between 0 and 4, the preceding number does not change. If the first or only
6. number to be eliminated is between 5 and 9, the value of the last number to be retained increases by 1. For example, when using the traditional technique of rounding to the nearest hundred, if the last two numbers are between 00 and 49, they are replaced by 00 and the preceding number (denoting hundreds) stays as is. If the last two numbers are between 50 and 99, they are replaced with 00 and the preceding number is increased by 1.

b) Totals, partial sub-totals and total sub-totals in statistical tables must be calculated using their unrounded corresponding components, then rounded in turn to the closest hundred using the traditional rounding technique.

c) Means, ratios, rates and percentages must be calculated using unrounded components (i.e., numerators and/or denominators) and then rounded to one decimal using the traditional rounding technique.

d) Totals and differences in aggregates or ratios must be calculated using their corresponding unrounded components, then rounded to the nearest hundred or decimal place using the traditional rounding technique.

e) If, due to technical or other limitations, a technique other than traditional rounding is used, with the result that the estimates to be published or released differ in any form from the corresponding estimates that would be obtained by Statistics Canada using this microdata file, we strongly advise users to indicate the reasons for the
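The rounding guidelines above can be sketched in a few lines of Python. This is a minimal illustration, not Statistics Canada's own procedure: it assumes the guideline means rounding to the nearest hundred units (as the 00-49 / 50-99 example suggests), the function name `round_traditional` is hypothetical, and the component values are made up.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_traditional(value, unit=100):
    """Round a non-negative value to the nearest multiple of an integer unit.

    Halves round up (a remainder of 50 goes to the next hundred), unlike
    Python's built-in round(), which rounds halves to the nearest even value.
    """
    steps = (Decimal(str(value)) / Decimal(unit)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return int(steps) * unit

# Made-up unrounded component estimates for one table column.
components = [1249, 1250, 3644715]

# Guideline (a): each cell is rounded on its own.
rounded_components = [round_traditional(c) for c in components]

# Guideline (b): the total is computed from UNROUNDED components, then rounded.
rounded_total = round_traditional(sum(components))

print(rounded_components)
print(rounded_total)
```

Computing the total from the unrounded components, rather than summing the rounded cells, is the point of guideline (b): the two approaches can differ by a rounding unit in general.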
7. sheet music and other printed matter
Education: supplies, textbooks, text books for post-secondary, tuition fees for post-secondary, and other educational services
Games of chance

Non-response error

Non-response error occurs in sample surveys because not all potential respondents cooperate fully. The extent of non-response varies from partial non-response to total non-response.

Total non-response occurs when the interviewer is unable to contact the respondent, no member of the household is able to provide information, or the respondent refuses to participate in the survey. Total non-response is handled by adjusting the basic survey weight for responding households to compensate for non-responding households. For the 2007 Survey of Household Spending, the overall response rate was 65.1%. See Table 1 for provincial response rates.

In most cases, partial non-response occurs when the respondent does not understand or misinterprets a question, refuses to answer a question, or is unable to recall the requested information. Imputing missing values compensates for this partial non-response.

The importance of the non-response error is unknown, but in general this error is significant when a group of people with particular characteristics in common refuse to cooperate and where those characteristics are important determinants of survey results.

Table 1: Response rates, Canada and pr
8. the question "Is your house: Owned mortgage-free by your household?" or "Owned with one or more mortgages by your household?"

To obtain a weighted mean expressed by the formula X/Y, the numerator (X) is calculated as a quantitative estimate and the denominator (Y) as a categorical estimate. For example, to estimate mean household expenditures for electricity by owners, you must:

a) estimate the total expenditure for electricity for households where the residence is owned, using the method described above;
b) estimate the number of owned households by adding the final weights for all records with a positive response to the question "Is your house: Owned mortgage-free by your household?" or "Owned with one or more mortgages by your household?"; and then
c) divide the estimate obtained in a) by the one calculated in b).

Guidelines for statistical analysis

The Survey of Household Spending is based on a complex survey design that includes stratification and multiple stages of selection, as well as uneven respondent selection probabilities. The use of data from such complex surveys poses problems for analysts because the survey design and the selection probabilities influence the estimation and variance calculation methods to be used.

Although numerous analytical methods in statistical software packages allow for the use of weights, the meaning or definition of weights differs from that suitable for a sample survey. As a resul
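Steps a) to c) for a weighted mean can be sketched as follows. This is an illustrative sketch only: the record layout, the field names (`weight`, `owner`, `electricity`) and the values are hypothetical and are not the variable names used in the microdata file.

```python
# Hypothetical records: one per sampled household, with its final weight.
records = [
    {"weight": 250.0, "owner": True,  "electricity": 1200.0},
    {"weight": 300.0, "owner": False, "electricity": 800.0},
    {"weight": 150.0, "owner": True,  "electricity": 1600.0},
]

def weighted_total(rows, value_key, in_domain):
    """Quantitative estimate: sum of value x final weight over the domain."""
    return sum(r["weight"] * r[value_key] for r in rows if in_domain(r))

def weighted_count(rows, in_domain):
    """Categorical estimate: sum of final weights over the domain."""
    return sum(r["weight"] for r in rows if in_domain(r))

def is_owner(row):
    return row["owner"]

total_x = weighted_total(records, "electricity", is_owner)  # step a)
count_y = weighted_count(records, is_owner)                 # step b)
mean = total_x / count_y                                    # step c)

print(total_x, count_y, mean)
```

The mean is computed from the unrounded numerator and denominator; any rounding for publication would be applied only to the final result.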
9. the following warning should accompany the estimates: "The user is advised that (specify the data) do not meet Statistics Canada's quality standards for this statistical program. Conclusions based on these data will be unreliable and most likely invalid."

Computation of approximate CVs

In order to provide a way of assessing the quality of estimates, Statistics Canada has produced a coefficient of variation table (CV table) which is applicable to estimates of averages, ratios and totals obtained from this public-use microdata file for the major variables of the SHS, by province and at the Canada level (see Appendix D).

The CV of an estimate is defined to be the square root of the variance of the estimate, divided by the estimate itself and expressed as a percentage. The numerator of the CV is a measure of the sampling error of the estimate, called the standard error, and is calculated at Statistics Canada with the bootstrap method. This method requires, among other things, information about the strata and the clusters, which cannot be given on the public-use microdata file for reasons of confidentiality.

So that users may estimate CVs for variables not included in the CV tables, Statistics Canada has produced a set of rules to obtain approximate CVs for a wide variety of estimates. It should be noted that these rules provide approximate and therefore unofficial CVs. The quality of the approximation, however, is quite satisfac
10. the initial editing at the same time. For example, the range edit provided a minimum and maximum amount for certain purchases and was triggered if the amount entered by the interviewer was unusual. Other edits indicated inconsistencies in responses (e.g., if the household tenure was renter but no rent was paid). In addition to automatic edits built into the electronic questionnaire, a balance edit (comparing total revenues, expenses, and changes in assets and liabilities) performed by the interviewer acted as a check on data quality.

The next stage of editing was done in the head office to verify unusual or high values and inconsistencies and to correct invalid responses.

If a household indicated that it had an expense but could not provide the amount, these missing responses were imputed using the nearest-neighbour method. Statistics Canada's Canadian Census Edit and Imputation System (CANCEIS) was used to insert values from donor records having similar characteristics, chosen specifically to fit the variable. For example, total household income was used for most variables; dwelling type, household size and province were also frequently used.

Tabulation for the 2007 Survey of Household Spending was completed usin

Footnote 1: A detailed description of the Labour Force Survey sampling frame can be found in Methodology of the Canadian Labour Force Survey, Statistics Canada, Catalogue no. 71-526-XIE.
11. totals or means, medians or other central tendency measurements of quantities based on all members of the observed population or based on some of them. They also explicitly include estimates in the form X/Y, where X is an estimate of the total quantity for the observed population and Y is an estimate of the number of individuals in the observed population who contribute to that total quantity.

An example of a quantitative estimate is mean annual expenditure for personal and health care per household in the target population. The numerator corresponds to an estimate of total annual expenditure for personal and health care, and the denominator corresponds to an estimate of the number of households in the population.

Example of quantitative question: "In 2007, how much did your household spend for telephone services?"

Totalling of quantitative estimates

Quantitative estimates can be obtained from the microdata file by multiplying the value of the desired variable by the final weight of each record and then adding this quantity for all records of interest. For example, to obtain an estimate of total expenditure for electricity by households that were owners at the time of interview, the value reported for the question "In 2007, how much did your household spend on electricity?" is multiplied by the final weight of the record, and then that result is summed over all records with a positive response to
12. (End of the preceding calculation table. First column: number of one-person households in Manitoba; second column: total expenditure on furnishings and household equipment for one-person households in Manitoba.)

Step 5: (first column cut off); 1,079,909
Step 6: 0.13; 108.24
Step 9: 62,823,895; 1.3576 x 10^14
Step 10: 62,823,895; 1.3576 x 10^14
Step 11: 5.70; 9.97

The estimate of the mean would be 116,890,010 / 138,947 = 841.3. How do we determine the CV of this estimate? Rule 2 should be applied in this case. Thus the CV of this mean may be approximated with the CV of the numerator (the total expenditure on furnishings and household equipment in Manitoba for one-person households). This CV is 9.97%.

Example 2: Approximation of CV for estimating ratios

Let us assume we want to estimate the ratio between the total expenditures on furnishings and household equipment for "couples without children" households in urban Manitoba and rural Manitoba.

Figure 8: Calculation of ACV

First column: total expenditure on furnishings and household equipment for households consisting of a couple without children and without additional persons in Manitoba, urban. Second column: the same total for Manitoba, rural.

Step 4: 177,415,907; 35,538,850
Step 5: 1,079,909; 1,079,909
Step 6: 164.29; 32.91
Step 9: 2.7383 x 10^14; 6.5057 x 10^13
Step 10: 2.7383 x 10^14; 6.5057 x 10^13
Step 11: 9.33; 22.70

The estimate of the ratio would be equal to 177,415,907 / 35,538,850 = 4.99: "couple without children" households in urban Manitoba spend approximately 5 times more on furnishings than those in rural Manitoba. How does the user dete
13. Catalogue no. 62M0004XCB

User Guide for the Public-use Microdata File
Survey of Household Spending, 2007

July 2009

Income Statistics Division
Statistics Canada
Ottawa K1A 0T6
Telephone: 613-951-7355

This document is available in French.

Income Statistics Division, Statistics Canada must be credited when reproducing or quoting any part of this document.

Table of contents

Introduction
  Background
  New for 2007
  Other documents
  For further information
Technical characteristics of the file
Survey methodology
  The survey universe
  Survey content and reference period
  The sample
  Data collection
  Data processing and quality control
  Weighting, re-weighting and Census historical revis
14. ata variability is the difference between members of the population with respect to spending on a specific item or the presence of a specific dwelling characteristic or piece of household equipment. In general, the greater these differences are, the larger the sampling error will be. In addition, the larger the sample size, the smaller the sampling error.

Standard error and coefficient of variation

A common measure of sampling error is the standard error (SE). Standard error is the degree of variation in the estimates as a result of selecting one particular sample rather than another of the same size and design. It has been shown that the "true" value of the characteristic of interest lies within a range of ±1 standard error of the estimate for 68% of all samples, and ±2 standard errors for 95% of all samples.

The coefficient of variation (CV) is the standard error expressed as a percentage of the estimate. It is used to indicate the degree of uncertainty associated with an estimate. For example, if the estimate of the number of households having a given dwelling characteristic is 10,000 households and the corresponding CV is 5%, then the "true" value is between 9,500 and 10,500 households 68% of the time, and between 9,000 and 11,000 households 95% of the time.

Standard errors for the 2007 Survey of Household Spending were estimated using the bootstrap method. This method is suitab
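The relationship between the standard error, the CV and the ranges quoted in the 10,000-household example can be sketched as follows. The function names are hypothetical; the arithmetic follows the definition given above (CV = standard error as a percentage of the estimate).

```python
def standard_error(estimate, cv_pct):
    """Recover the standard error from an estimate and its CV (in percent)."""
    return estimate * cv_pct / 100.0

def cv_percent(estimate, se):
    """CV: the standard error expressed as a percentage of the estimate."""
    return 100.0 * se / estimate

def range_around(estimate, cv_pct, n_standard_errors):
    """Range of +/- n standard errors around the estimate."""
    se = standard_error(estimate, cv_pct)
    return estimate - n_standard_errors * se, estimate + n_standard_errors * se

# The example from the text: 10,000 households with a CV of 5%.
print(standard_error(10_000, 5))   # 500.0
print(range_around(10_000, 5, 1))  # (9500.0, 10500.0), the 68% range
print(range_around(10_000, 5, 2))  # (9000.0, 11000.0), the 95% range
```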
15. ce interval is desired
z = 1.6 if a 90% confidence interval is desired
z = 2 if a 95% confidence interval is desired
z = 3 if a 99% confidence interval is desired

Note: Release guidelines which apply to the estimate also apply to the confidence interval. For example, if the estimate is not releasable, then the confidence interval is not releasable either.

Example 4

A 95% confidence interval for the estimated mean of spending on household furnishings and equipment for one-person households in Manitoba would be calculated as follows:

EST = 841.3
z = 2
ACV = 9.97%

$$CI = \left(841.3 - 2 \times 841.3 \times \frac{9.97}{100},\ 841.3 + 2 \times 841.3 \times \frac{9.97}{100}\right) = (673.5,\ 1{,}009.1)$$

How to do a Z-test

Coefficients of variation may also be used to perform hypothesis testing, a procedure for distinguishing between population parameters using sample estimates. The sample estimates can be totals, averages, ratios, etc. Tests may be performed at various levels of significance, where a level of significance is the probability of concluding that the characteristics are different when, in fact, they are identical.

Let EST1 and EST2 be sample estimates for 2 characteristics of interest. Let the approximate CV of the difference EST1 - EST2 be ACV_DIFF. If z = 100 / ACV_DIFF is less than 2, then no conclusion about the difference between the characteristics is justified at the 5% level of significance. If, however, this ra
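The confidence interval formula, with the z values listed in this guide, can be sketched as below and checked against Example 4. The function name and the lookup table `Z_VALUES` are hypothetical conveniences; the z values themselves (1, 1.6, 2, 3) are the approximate ones given in the text.

```python
# z values as listed in the guide for each confidence level (in percent).
Z_VALUES = {68: 1.0, 90: 1.6, 95: 2.0, 99: 3.0}

def confidence_interval(est, acv_percent, level=95):
    """CI = (EST - z * EST * ACV / 100, EST + z * EST * ACV / 100)."""
    z = Z_VALUES[level]
    half_width = z * est * acv_percent / 100.0
    return est - half_width, est + half_width

# Example 4: mean spending of 841.3 with an approximate CV of 9.97%.
low, high = confidence_interval(841.3, 9.97, level=95)
print(round(low, 1), round(high, 1))  # 673.5 1009.1
```

As the note above states, an interval computed this way inherits the release status of the estimate it is built on.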
16. d range of values. For example, a 95% confidence interval can be described as follows: if sampling of a population is repeated many times, each sample leading to a new confidence interval for an estimate, then in 95% of the samples the interval will cover the true population value.

Using the CV of an estimate, its confidence intervals may be obtained assuming that, under repeated sampling of the population, the various estimates obtained for a characteristic are normally distributed around the true population value. Using this assumption, the chances are about 68 out of 100 that the difference between a sample estimate and the true population value would be less than one standard error, about 95 out of 100 that the difference would be less than two standard errors, and about 99 out of 100 that the difference would be less than three standard errors. These different degrees of confidence are referred to as the confidence levels.

Confidence intervals for an estimate, EST, are generally expressed as two numbers, one below the estimate and one above the estimate, as (EST - k, EST + k), where k is determined depending on the level of confidence desired and the sampling error of the estimate.

Confidence intervals for an estimate can be calculated by first determining the ACV of the estimate and then using the following formula to convert to a confidence interval (CI):

$$CI = \left(EST - z \times EST \times \frac{ACV}{100},\ EST + z \times EST \times \frac{ACV}{100}\right)$$

where
z = 1 if a 68% confiden
17. ds in the survey's target population that have certain characteristics or belong to a defined category. The number of households reporting a particular expenditure is an example of this type of estimate. The expression "aggregate estimate" can also be used to refer to an estimate of the number of individuals with a given characteristic.

Examples of categorical questions:

Does anyone in your household use the Internet from home?
_ yes
_ no

When was this dwelling originally built?
_ 1945 or earlier
_ 1946-1960
_ 1961-1970
_ 1971-1980
_ 1981-1990
_ 1991-2008

Is your dwelling:
_ Owned without a mortgage by your household?
_ Owned with a mortgage(s) by your household?
_ Rented by your household?
_ Occupied rent-free by your household?

Totalling of categorical estimates

Estimates of the number of persons with a given characteristic can be obtained from the microdata file by adding the final weights of all records containing the desired characteristic or characteristics. Percentages and ratios in the X/Y form are obtained as follows:

a) by adding the final weights of records containing the desired characteristic for the numerator (X);
b) by adding the final weights of records containing the desired characteristic for the denominator (Y);
c) by dividing the estimate for the numerator by the estimate for the denominator.

Quantitative estimates

Quantitative estimates are estimates of
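The categorical totalling steps a) to c) amount to summing final weights and taking a quotient. A minimal sketch, assuming a hypothetical record layout (`weight`, `internet`) with made-up values rather than the actual microdata variable names:

```python
# Hypothetical records with a final weight and one categorical answer.
households = [
    {"weight": 200.0, "internet": "yes"},
    {"weight": 300.0, "internet": "no"},
    {"weight": 500.0, "internet": "yes"},
]

def weighted_share(rows, key, category):
    """Return (estimated count, estimated percentage) for one category."""
    numerator = sum(r["weight"] for r in rows if r[key] == category)  # step a)
    denominator = sum(r["weight"] for r in rows)                      # step b)
    return numerator, 100.0 * numerator / denominator                 # step c)

count_yes, pct_yes = weighted_share(households, "internet", "yes")
print(count_yes, pct_yes)
```

Here the denominator is every record; for a percentage within a subgroup, the denominator sum would be restricted to that subgroup in the same way as the numerator.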
18. e microdata file presents data from the 2007 Survey of Household Spending (SHS), conducted from January to March 2008. Information about the spending habits, dwelling characteristics and household equipment of Canadian households during 2007 was obtained by asking people in the 10 provinces and three territories to recall their expenditures for the previous calendar year (spending habits) or to report as of the time of the interview (dwelling characteristics and household equipment).

Conducted since 1997, the Survey of Household Spending integrates most of the content found in the Family Expenditure Survey and the Household Facilities and Equipment Survey. Many data from these two surveys are comparable to the Survey of Household Spending data. However, some differences related to methodology, to data quality and to definitions must be considered before comparing these data. See "For further information" below.

New for 2007

For the 2006 reference year, automatic edits built into the electronic questionnaire replaced the balance edit and regional office editing performed in previous years. For the 2007 reference year, balance edit checks were reinstated.

Other documents

- Data dictionary (variable specifications, code sets and other information) is available in PDF format.
- Record layout is available in Excel format.
- Appendices are available in Excel format: Appendix A presents the frequency counts for non-dollar variables in the public use
19. e negatively correlated.

Examples

Detailed calculations of approximate CVs used for estimating totals are initially presented using fictional cases. Then actual cases of estimating totals, averages or proportions, ratios and differences based on microdata file data will be presented, so users can check results and ensure that the method used was valid.

Part 1: Fictional case, details of calculating an approximated CV for estimating a total

A) Quantitative variable

Let us assume we wanted to estimate the total for a quantitative expenditure variable (X) for households containing at least one person less than 18 years of age. To illustrate this procedure, we will use a fictional sample (see Figure 3), on which we will present calculation details (see Figure 4) for each of the eleven steps described above. As this procedure is applied independently within each province, we shall merely describe calculations for one province. Let us use the following sample for Ontario.

[Figure 3: Fictional example. Table listing, for each record identifier, the initial data, including the number of children and the variable of interest; the values are not legible.]

In step 1, we define the domain of interest by creating a binary variable equal to 1 for all units belonging to the domain. In the present case, these are households with at least one child between the ages of 0 and 17 years. We then proceed to steps 2 throug
20. f sampling error, as evaluated using the coefficient of variation (CV), as shown in the table below. In addition to calculating CVs, users should also read the section of this document regarding the characteristics of data quality.

Whatever CV is obtained for an estimate from this microdata file, users should determine the number of sampled respondents who contribute to the calculation of the estimate. If this number is less than 30, the weighted estimate should not be released, regardless of the value of the CV for this estimate. For weighted estimates based on sample sizes of 30 or more, users should determine the CV of the rounded estimate, following the guidelines below.

Figure 2: Sampling variability guidelines

Type of estimate: 1. Acceptable
CV (in %): 0.0 to 16.5
Guidelines: Estimates can be considered for general unrestricted release. Requires no special notation.

Type of estimate: 2. Marginal
CV (in %): 16.6 to 33.3
Guidelines: Estimates can be considered for general unrestricted release but should be accompanied by a warning cautioning subsequent users of the high sampling variability associated with the estimates. Such estimates should be identified by the letter M (or in some other similar fashion).

Type of estimate: 3. Unacceptable
CV (in %): Greater than 33.3
Guidelines: Statistics Canada does not recommend the release of estimates of unacceptable quality. However, if the user chooses to do so, then estimates should be flagged with the letter U (or in some other similar fashion) and
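The release rules in Figure 2 can be expressed as a small decision function. This is a sketch under stated assumptions: the function name and return strings are hypothetical, the category names "acceptable" and "marginal" are assumed from the M and U flags, and the CV is rounded to one decimal first so that the 16.5 / 16.6 and 33.3 boundaries are unambiguous.

```python
def release_category(cv_percent, n_respondents):
    """Classify an estimate using the Figure 2 thresholds.

    n_respondents is the number of sampled (unweighted) respondents
    contributing to the estimate; fewer than 30 blocks release outright.
    """
    if n_respondents < 30:
        return "do not release (fewer than 30 respondents)"
    cv = round(cv_percent, 1)  # assumption: thresholds apply to one decimal
    if cv <= 16.5:
        return "acceptable"
    if cv <= 33.3:
        return "marginal (flag M, add warning)"
    return "unacceptable (flag U, add warning if released)"

print(release_category(9.97, 120))
print(release_category(22.70, 45))
print(release_category(40.0, 45))
print(release_category(5.0, 12))
```

The sample-size check comes first because, per the text, an estimate based on fewer than 30 respondents should not be released regardless of its CV.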
21. f code sets for non-spending variables.

Income values at the household, reference person, and spouse of reference person levels were rounded in the following manner:

- For income values between 1 and 9,999, round to the nearest 100.
- For income values between 10,000 and 99,999, round to the nearest 1,000.
- For income values between 100,000 and 999,999, round to the nearest 10,000.
- For income values between 1,000,000 and 9,999,999, round to the nearest 100,000.
- For income values between 10,000,000 and 99,999,999, round to the nearest 1,000,000 (there are no such values on the 2007 file).

The variables "Purchase price of dwelling" and "Selling price of dwelling" were also rounded.

Appendices (see accompanying Excel file)

Appendix A: Frequency counts
Appendix B: Averages, aggregates, minimum and maximum values
  Part 1 of 2: Suppressed (PUMF) file
  Part 2 of 2: Unsuppressed (survey) file
Appendix C: Inclusion of spending variables in past microdata files
Appendix D: Coefficients of variation for published data from the 2007 SHS
  Part 1 of 3: Average expenditure per household, Canada and provinces
  Part 2 of 3: Median expenditure per household reporting, Canada and provinces
  Part 3 of 3: Dwelling characteristics and household equipment, Canada and provinces
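The income rounding brackets described under the confidentiality rules can be sketched as a lookup. This is an illustration only: the function name is hypothetical, the guide does not say how ties or values outside the listed ranges (zero, negative, or 100,000,000 and over) were handled, so halves are assumed to round up and out-of-range values are returned unchanged.

```python
# (lower bound, upper bound, rounding unit), as listed in the guide.
INCOME_BRACKETS = [
    (1, 9_999, 100),
    (10_000, 99_999, 1_000),
    (100_000, 999_999, 10_000),
    (1_000_000, 9_999_999, 100_000),
    (10_000_000, 99_999_999, 1_000_000),
]

def round_income(value):
    """Round a whole-dollar income to the unit of the bracket it falls in."""
    for low, high, unit in INCOME_BRACKETS:
        if low <= value <= high:
            # Integer arithmetic; a remainder of exactly half a unit rounds up
            # (an assumption, since the guide does not state the tie rule).
            return (value + unit // 2) // unit * unit
    return value  # assumption: values outside the listed ranges are untouched

print(round_income(4_567))      # 4600
print(round_income(12_345))     # 12000
print(round_income(1_250_000))  # 1300000
```

Note that rounding can move a value across a bracket boundary (for example 9,960 becomes 10,000); the bracket is chosen from the unrounded value in this sketch.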
22. g a PC client-server-based system. This system provides tools (database querying, searching and viewing capabilities) for spotting systematic errors.

Weighting, re-weighting and Census historical revision of SHS

Users should note that the weights for the SHS reference years 1997 to 2003 have been revised. These revisions were published along with the 2005 survey results in December 2006.

The estimation of population characteristics from a sample survey is based on the idea that each sampled household represents a certain number of other households in addition to itself. These numbers are called the survey weights of the sample. To improve the representativity of the sample, the weights are adjusted so that the estimates from the sample are in line with population totals, or benchmarks, from other independent sources of information that are considered reliable. This is called weight calibration.

SHS uses two sources for calibration. The first source is the Census of Population, which provides demographic benchmarks. From 1997 to 2003, SHS used benchmarks derived from the 1996 Census. Since the Census is conducted once every five years, Statistics Canada projects the Census results for later years up to the present, and then revises those estimates when the next Census data become available. The projections use a variety of secondary information, including administrative data on births, deaths and migration.

The second source used for adjustin
23. g the survey weights for SHS is T4 data from Canada Revenue Agency, which ensures that the estimated distribution of earners in the survey matches the one in the Canadian population.

It was decided to take advantage of this historical revision to also introduce an improved calibration strategy for the SHS weights. Improvements to the calibration strategy were deemed necessary to put emphasis on SHS needs, such as the age groups used for calibration, and to take into account the quality of the benchmarks. It was also felt that there were too many benchmarks, leading to too many constraints on the weights, and that this produced undesirable results such as negative weights, which were not acceptable.

The current calibration strategy is as follows:

- Age: At the provincial level, there are controls for 8 age groups: 0-6, 7-17, 18-24, 25-34, 35-44, 45-54, 55-64, 65+. At the CMA level, two age groups: 0-17, 18+.
- There are controls for three size-of-household categories: one person, two persons, 3+.
- T4 adjustments are made to the weights of the population for income from wages and salaries: 0-25 percentile, 25-50, 50-65, 65-75, 75-95, 95-100.

Due to their smaller population, only two age groups are used for the three northern territories: number of persons under 18 and number of persons 18 and older. The weights are also calibrated to the totals for one-person households, two
24. h 9 to estimate variance, which will lead to calculation of the CV. We thus obtain the following results:
[Figure 4: Calculation details for approximating the CV of a total, steps 1 to 9]
If we wanted to know the CV for Ontario, we would perform the following calculation:
$$CV_{ONT} = 100 \times \frac{\sqrt{Variance_{ONT}}}{Estimation_{ONT}} = 100 \times \frac{\sqrt{(\text{Step 9})_{ONT}}}{(\text{Step 4})_{ONT}}$$
If we wanted to know the CV for Canada, we would proceed in a similar manner by totalling the results for each province. In other words:
$$CV_{CAN} = 100 \times \frac{\sqrt{Variance_{CAN}}}{Estimation_{CAN}} = 100 \times \frac{\sqrt{\sum_{p} Variance_{p}}}{\sum_{p} Estimation_{p}}$$
where p runs over all the province codes on the file, including code 00.
B. Qualitative variable (categorical)
In the event a categorical variab
25. ion of SHS ... 8
Data quality ... 9
Sampling error ... 9
Non-sampling error ... 10
The effect of large values ... 13
Comparability over time ... 13
Guidelines for tabulation, analysis and dissemination ... 14
Guidelines for rounding ... 14
Guidelines for the weighting of the sample for totalling purposes ... 15
Types of estimates: categorical versus quantitative ... 15
Confidentiality of the public-use microdata ... 29
Appendices (see accompanying Excel file) ... 31
Appendix A: Frequency counts ... 31
Appendix B: Averages, aggregates, minimum and maximum values ... 31
Appendix C: Inclusion of spending variables in past microdata files ... 31
Appendix D: Coefficients of variation for published data from the 2007 SHS ... 31
Introduction
Background
This public us
26. le for variance estimation of non-smooth statistics such as quintiles. For more information on standard errors and coefficients of variation, refer to the Statistics Canada publication Methodology of the Canadian Labour Force Survey, Catalogue no. 71-526-XIE. Coefficients of variation are available on request: contact Client Services, Income Statistics Division (1-888-297-7355; income@statcan.gc.ca).
Data suppression
For reliability reasons, estimates with CVs greater than 33% are normally suppressed. Since CVs are not calculated for all estimates, data suppression for the Survey of Household Spending has been based on a relationship between the CV and the number of households reporting expenditure on an item. Analysis of past survey results indicates that CVs usually reach this level when the number of households reporting an item drops to about 30. Therefore, data have been suppressed for spending on items reported by fewer than 30 households. However, data for suppressed items do contribute to summary-level variables. For example, the expenditure for a particular category of clothing might be suppressed, but this amount forms part of the total expenditure estimate for clothing.
Non-sampling error
Non-sampling errors occur because certain factors make it difficult to obtain accurate responses, or responses that retain their accuracy throughout processing. Unlike sampling error, non-sampling error is not readily quantified. Four sources of n
27. le is estimated, the steps in calculating the approximate CV will be the same as in the quantitative variable example presented. Instead of a quantitative value for the variable of interest X, we would create a dichotomous variable that would be equal to 1 if the household has the features we want to estimate. If not, it would be equal to 0.
To estimate categorical variables, various approaches may be used for defining the domain and the variable of interest, all of which will produce the same results.
Let us assume we want to estimate the number of households consisting of more than one person living in a single-family dwelling. We could proceed in different ways:
1. The binary variable is equal to 1 for all households, and variable X is equal to 1 for households consisting of more than one person living in a single-family dwelling.
2. The binary variable is equal to 1 for all households consisting of more than one person, and variable X is equal to 1 for all households the members of which live in a single-family dwelling.
3. The binary variable is equal to 1 for all households the members of which live in a single-family dwelling, and variable X is equal to 1 for all households made up of more than one person.
4. The binary variable is equal to 1 for all households made up of more than one person living in a single-family dwelling, and X is equal to 1 for all households.
Whatever approach is used, the resu
28. lting Y variable (step 2) will be equal to 1 if the household possesses all the necessary features (more than one person and living in a single-family dwelling). If not, it will be equal to 0. Results in terms of point estimates and estimates of variance (CV) will thus be the same.
Part 2: Actual cases based on the microdata file
Example 1a: Approximation of CV for estimates of totals (quantitative variable)
Let us assume that we have estimated that household furnishings and equipment expenditures for one-person households in Manitoba total $116,890,010. We have to estimate the approximate CV for this estimate. Users must therefore follow steps 1 to 11 of rule 1:
1. Create a binary variable whose value is 1 if the household is a one-person household and resides in Manitoba; otherwise it equals 0.
2. Y is defined for each household as the product of the binary variable and the total household furnishing and equipment expenditures variable.
Note that the estimate of spending on household furnishings and equipment is obtained by summing, over all households, the product of variable Y (defined in 2) and the final weight of the household. Figure 5 shows the results of some of the steps in the approximate CV calculation.
Figure 5: Calculation of ACV
Step   Total spending on household furnishings and equipment for one-person households in Manitoba
4      116,890,010
5      1,079,909
6      108.24
9      1.3576 x 10^14
10     1.3576 x 10^14
29. microdata file. They are included to help you verify your tabulations.
Appendix B presents expenditure data tabulated using the public-use microdata file and also using the internal survey database. They are included to help you verify your tabulations.
Appendix C contains a table indicating the spending variables included in previous public-use microdata files of the Survey of Household Spending and the Family Expenditure Survey.
Appendix D presents the coefficients of variation for published data from the 2007 SHS.
For further information
Additional information about the SHS can now be obtained free on the Statistics Canada web site (www.statcan.gc.ca). See especially:
• Note to former users of data from the Family Expenditure Survey (62F0026MIE2000002)
• Note to former users of data from the Household Facilities and Equipment Survey (62F0026MIE2000003)
• User Guide for the Survey of Household Spending, 2007 (62F0026MIE2009001)
• Methodology for the Survey of Household Spending (62F0026MIE2001003)
• 2003 Survey of Household Spending: Data Quality Indicators (62F0026MIE2005006)
For more information about the current survey results and related products and services, or to enquire about the concepts, methods or data quality of the Survey of Household Spending, contact Client Services (613-951-7355; 1-888-297-7355; fax: 613-951-3012; income@statcan.gc.ca), Income Statistic
30. More formally, steps 1 to 10 above can be obtained with the following formula:
$$\sum_{p=1}^{11} \sum_{k \in S_p} w_k (w_k - 1) \left( Y_k - m_k \frac{\sum_{j \in S_p} w_j Y_j}{\sum_{j \in S_p} w_j m_j} \right)^2$$
where the index $p$ corresponds to provinces, $S_p$ is the sample of respondents for province $p$, the index $k$ corresponds to households, $w_k$ is the final weight for the $k$-th household, $m_k$ is the household size for the $k$-th household and $Y_k$ is the value of the variable Y (defined in step 2 above) for the $k$-th household. As you can see, index $p$, the province indicator, takes values ranging from 1 to 11. Eleven distinct province codes appear on the microdata file: one for each of the ten provinces and a 00 province code assigned to a set of records for reasons of confidentiality. See Confidentiality of the public-use microdata on page 29.
Important: When estimating variance for a given domain, do not limit yourself to units belonging to the domain. The entire sample should always be used to estimate variance. Units that do not belong to the domain of interest are not considered when computing the point estimate of the total, but do contribute when estimating the variance.
Rule 2: Approximating CV for estimates of averages or proportions
An estimated mean or proportion is obtained by the ratio of two estimated totals. For a proportion, the numerator is an estimate that is a sub-set of the denominator, for example the proportion of expenditures for households in Manitoba compared t
31. o all Canadian households. The CV of an estimated mean or proportion tends generally to be slightly lower than the corresponding CV of the numerator. The CV of an estimated mean or proportion can thus be approximated with the CV of the numerator, and the technique described in rule 1 can be used.
Rule 3: Approximating CV for estimates of ratios
Ratio refers to the relationship between any two estimates of totals for which rule 2 does not apply. Approximate CVs for any other types of ratio may be calculated using the following formula:
$$ACV_R = \sqrt{ACV_N^2 + ACV_D^2}$$
where $ACV_R$ is the approximate CV of the ratio, $ACV_N$ is the approximate CV of the numerator of the ratio and $ACV_D$ is the approximate CV of the denominator of the ratio. The formula will tend to overestimate the CV if the two estimates forming the ratio are positively correlated, and underestimate the CV if these two estimates are negatively correlated.
Rule 4: Approximating CVs for estimates of differences
The approximate CV of a difference between any two estimates ($EST_{DIFF} = EST_1 - EST_2$) is given by:
$$ACV_{DIFF} = \frac{\sqrt{(EST_1 \times ACV_1)^2 + (EST_2 \times ACV_2)^2}}{EST_{DIFF}}$$
where $ACV_1$ is the approximate CV associated with $EST_1$ and $ACV_2$ is the approximate CV associated with $EST_2$. The formula will tend to overestimate the CV if the two estimates forming the difference are positively correlated, and underestimate the CV if these two estimates ar
32. of the time of the interview.
Because the Survey of Household Spending is designed principally to provide detailed information on non-food expenditures, only an overall estimate of food expenditure is recorded. Detailed information on food expenditure is provided by the Food Expenditure Survey, which is conducted every four to six years. It was last conducted in 2001. In February 2003, the results were published in Food Expenditure in Canada, 2001 (Catalogue no. 62-554-XIE).
The sample
The sample size for the 2007 Survey of Household Spending was 21,407 eligible households. The regular SHS sample was a stratified, multi-stage sample selected from the Labour Force Survey (LFS) sampling frame. Sample selection comprised two main steps: the selection of clusters (small geographic areas) from the LFS frame, and the selection of dwellings within these selected clusters. The LFS sampling frame mainly uses 2001 Census geography and 2001 population counts.
Data collection
The 2007 Survey of Household Spending was conducted from January to March 2008. Data were collected by computer-assisted personal interview (CAPI) using a laptop personal computer. A copy of this questionnaire is available on request.
Data processing and quality control
As with the 2006 Survey of Household Spending, for the 2007 SHS the interviewers recorded the information provided by the respondents using a laptop and performed
33. on-sampling error can be identified: coverage error, response error, non-response error and processing error.
Coverage error
Coverage error results from inadequate representation of the intended population. This error may occur during sample design or selection, or during data collection and processing.
Response error
Response error may be due to many factors, including faulty design of the questionnaire, interviewers' or respondents' misinterpretation of questions, or respondents' faulty reporting.
Several features of the survey help respondents recall their expenditures as accurately as possible. First, the survey period is the calendar year, because it is probably more clearly defined in people's minds than any other period of similar length. Second, expenditure on food can be estimated as either weekly or monthly expenses, depending on the respondent's purchasing habits. Third, expenses on smaller items purchased at regular intervals are usually estimated on the basis of amount and frequency of purchase. Purchases of large items (automobiles, for example) are recalled fairly easily, as are expenditures on rent, property taxes and monthly payments on mortgages. However, even with these items, the accuracy of data depends on the respondent's ability to remember and willingness to consult records.
In the Survey of Household Spending, the difference between receipts and disbursemen
34. or absence of these extremes.
Comparability over time
Conducted since 1997, the Survey of Household Spending integrates most of the content found in the Family Expenditure Survey and the Household Facilities and Equipment Survey. Many variables from these two surveys are comparable to those in the Survey of Household Spending. However, some differences related to the methodology, to data quality and to definitions must be considered before making comparisons. For more information, refer to Note to Former Users of Data from the Family Expenditure Survey (Catalogue no. 62F0026MIE2000002) and Note to Former Users of Data from the Household Facilities and Equipment Survey (Catalogue no. 62F0026MIE2000003). Both documents are available free of charge on the Statistics Canada web site (www.statcan.gc.ca).
Historical data from the 1997 to the 2003 surveys of household spending have been re-weighted using the weighting methodology described in the section Weighting. Historical comparisons between data from those surveys and data from recent years of the Survey of Household Spending should generally be made with re-weighted data, although the differences between survey estimates from the old and new methodologies appear to be minimal at a summary level. Certain populations or variables, however, may be more strongly affected.
Starting with the 1997 Survey of Household Spending, Tenants' maintenance, repair and alterations and Insurance premiums
35. ovinces 2007
Province                     Eligible        No contact  Refusals  Unusables (2)  Usables  Response rate (3)
                             households (1)                                                (%)
Newfoundland and Labrador    1,776           198         278       49             1,251    70.4
Prince Edward Island         890             94          192       14             590      66.3
Nova Scotia                  1,966           311         394       68             1,193    60.7
New Brunswick                1,783           194         250       98             1,241    69.6
Quebec                       2,621           297         584       57             1,683    64.2
Ontario                      3,110           489         758       119            1,744    56.1
Manitoba                     1,960           198         369       71             1,322    67.4
Saskatchewan                 1,901           108         375       91             1,327    69.8
Alberta                      2,011           244         342       107            1,318    65.5
British Columbia             2,359           234         473       88             1,564    66.3
Yukon                        410             86          53        1              270      65.9
Northwest Territories        400             100         31        3              264      66.0
Nunavut                      220             34          10        3              173      78.6
Canada                       21,407          2,587       4,109     771            13,940   65.1
1. There is no longer a distinction between part-year and full-year households.
2. Rejected at the editing stage.
3. (Usable / eligible) x 100.
Processing error
Processing errors may occur in any of the data processing stages, for example during data entry, editing, weighting and tabulation. See Data processing and quality control for a description of the steps taken to reduce processing error.
The effect of large values
For any sample, estimates can be affected by the presence or absence of extreme values from the population. These extreme values are most likely to arise from positively skewed populations. The nature of the subject matter of the SHS lends itself to such extreme values. Estimates of totals, averages and standard errors may be greatly influenced by the presence
36. person households and households with three or more persons. Before the historical re-weighting, the calibration strategy varied slightly between the territories and between survey years. The northern calibration is now consistent across all three territories and over time.
The weights and calibration strategy were implemented for SHS for the years 1997 and onward, resulting in revised estimates of household spending for each year up to 2003. Users of SHS data should take care to make comparisons using the re-weighted data.
Data quality
For more detailed information, see the Survey of Household Spending: Data Quality Indicators, soon to be available free on the Statistics Canada web site at www.statcan.gc.ca.
Sampling error
Sampling errors occur because inferences about the entire population are based on information obtained from only a sample of the population. The sample design, the variability of the data and the sample size determine the size of the sampling error. In addition, for a given sample design, different methods of estimation will result in different sampling errors.
The design for the 2007 Survey of Household Spending was a stratified multi-stage sampling scheme. The sampling errors for multi-stage sampling are usually higher than for a simple random sample of the same size. However, the operational advantages outweigh this disadvantage, and the fact that the sample is also stratified improves the precision of estimates. D
37. rmine the CV of this estimate. We have already calculated CVs for each of the two estimates involved in estimating the ratio. We would thus apply rule 3 to obtain the desired CV:
$$ACV_R = \sqrt{ACV_N^2 + ACV_D^2} = \sqrt{9.33^2 + 22.70^2} = 24.54\%$$
This CV should be identified as Marginal (see Guidelines for release), as it is quite high, being between 16.6% and 33.3%.
Example 3: Approximation of CV for estimating differences
Let us assume we wanted to estimate the difference between total expenditures on furnishings and household equipment in Alberta and in Manitoba, as well as the CV for this difference. We would estimate total expenditures on furnishings and household equipment, along with their respective CVs, for Manitoba (total = $762,835,523, CV = 3.65%) and for Alberta (total = $2,956,581,785, CV = 4.53%). Estimation of the difference would thus be $2,956,581,785 - $762,835,523 = $2,193,746,262. Rule 4 can be applied to obtain the desired CV:
$$ACV_{DIFF} = \frac{\sqrt{(EST_1 \times ACV_1)^2 + (EST_2 \times ACV_2)^2}}{EST_{DIFF}} = \frac{\sqrt{(2{,}956{,}581{,}785 \times 4.53)^2 + (762{,}835{,}523 \times 3.65)^2}}{2{,}193{,}746{,}262} = 6.24\%$$
How to obtain confidence limits
Although coefficients of variation are widely used, a more intuitively meaningful measure of sampling error is the confidence interval of an estimate. A confidence interval constitutes a statement on the level of confidence that the true value for the population lies within a specifie
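The arithmetic of rules 3 and 4 used in the ratio and difference examples above can be checked with a short Python sketch. The function names `acv_ratio` and `acv_difference` are hypothetical and chosen only for this illustration. Taking the absolute value of the difference in the denominator is an assumption made here so that the result stays positive whichever estimate is larger.

```python
from math import sqrt


def acv_ratio(acv_numerator, acv_denominator):
    """Rule 3: approximate CV (in %) of a ratio of two estimated totals."""
    return sqrt(acv_numerator ** 2 + acv_denominator ** 2)


def acv_difference(est1, acv1, est2, acv2):
    """Rule 4: approximate CV (in %) of the difference est1 - est2.

    acv1 and acv2 are the approximate CVs (in %) of est1 and est2.
    The absolute difference is used in the denominator (an assumption of
    this sketch) so the CV is positive regardless of the order.
    """
    diff = est1 - est2
    if diff == 0:
        raise ValueError("CV of a zero difference is undefined")
    return sqrt((est1 * acv1) ** 2 + (est2 * acv2) ** 2) / abs(diff)


# Figures quoted in the examples above
print(round(acv_ratio(9.33, 22.70), 2))
print(round(acv_difference(2_956_581_785, 4.53, 762_835_523, 3.65), 2))
```

Both formulas ignore any correlation between the two estimates, so, as the rules themselves note, the result may over- or underestimate the CV when the estimates are correlated.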
38. s Division.
Technical characteristics of the file
Content: Household spending, dwelling characteristics and household equipment, 2007
Source: Survey of Household Spending, 2007, Income Statistics Division, Statistics Canada
Data set definition
Data set name: SHS2007.TXT
Number of records: 13,939
Format
Record length: 2,066
Survey methodology
For more detailed information, see the Methodology of the Survey of Household Spending, available free on the Statistics Canada web site at www.statcan.gc.ca.
The survey universe
The 2007 Survey of Household Spending was carried out in private households in Canada's 10 provinces and three territories. The following groups were excluded from the survey:
• those living on Indian reserves and crown lands (with the exception of the territories);
• official representatives of foreign countries living in Canada, and their families;
• members of religious and other communal colonies;
• members of the Canadian Forces living in military camps; and
• people living full-time in institutions, for example inmates of penal institutions and chronic care patients living in hospitals and nursing homes.
The survey covers about 98% of the population in the 10 provinces
39. se presented for steps 10 and 11 are Canada-wide.
Types of dwelling:
Single-family: single detached
Semi-detached: double
Town houses: row or terrace
Other: duplex, apartment, hotel, mobile home, other
Figure 6: Calculation of ACV
Step  One-person      One-person households   One-person households   One-person households   One-person households
      households      living in a             living in a             living in a             living in
                      single-family dwelling  semi-detached dwelling  townhouse               other housing
4     138,947         65,206                  3,130                   5,365                   65,246
5     1,079,909       1,079,909               1,079,909               1,079,909               1,079,909
6     0.13            0.06                    0.003                   0.005                   0.06
9     62,823,895      29,287,210              1,243,214               2,274,055               26,783,367
10    8,730,161,681   2,273,106,572           326,137,377             373,581,362             5,122,737,690
11    2.56            4.10                    12.20                   10.66                   3.33
Example 1c: Approximation of CV for estimates of totals used in the calculation of average expenditure
Let us assume we want to estimate average expenditure on furnishings and household equipment for one-person households in Manitoba. To do so, we would have to estimate the number of one-person households in Manitoba, as well as the total of their expenditure on furnishings and household equipment.
Figure 7: Calculation of ACV
Step  Number of one-person       Total expenditure on furnishings and household equipment
      households in Manitoba     for households consisting of one person in Manitoba
4     138,947                    116,890,010
5     1 079 90
40. t although the estimates done using those packages are in many cases accurate, the variances calculated have almost no significance.
For numerous analytical techniques (for example linear regression, logistic regression and variance analysis), there is a way to make the application of standard packages more significant. If the weights of the records contained in the file are converted so that the mean weight is 1, the results produced by standard packages will be more reasonable and will take into account uneven selection probabilities, although they still cannot take into account the stratification and the cluster distribution of the sample. The conversion can be done by using, in the analysis, a weight equal to the original weight divided by the mean of the original weights for the sampling units (households) that contribute to the estimator in question. However, because this method still does not take into account the sample design (stratification and clusters), the estimates of the variance calculated in this way will very likely be underestimates of the true values.
Guidelines for release
Before releasing and/or publishing estimates taken from the microdata file, users must first determine the level of reliability of the estimates. The quality of the data is affected by the sampling error and the non-sampling error, as described above. However, the level of reliability of estimates is determined solely on the basis o
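The weight conversion described above for use with standard packages (each weight divided by the mean of the weights) can be sketched in Python as follows. The function name `rescale_weights` is hypothetical. The caller is assumed to pass only the weights of the households that contribute to the estimator in question.

```python
def rescale_weights(weights):
    """Divide each weight by the mean weight so the rescaled mean is 1.

    `weights` should hold the final weights of only those households
    that contribute to the estimator being computed.
    """
    if not weights:
        raise ValueError("at least one weight is required")
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]


# Three households with final weights 2, 4 and 6 (mean weight 4)
print(rescale_weights([2, 4, 6]))  # [0.5, 1.0, 1.5]
```

The rescaled weights keep the relative selection probabilities but, as stated above, still carry no information about stratification or clustering, so variances computed with them are likely to be too small.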
41. te a quantitative variable, create a variable Y representing the product of the binary variable and the variable of interest. To estimate a categorical variable, create a variable Z equal to 1 if the categorical variable is equal to the value of interest, and equal to 0 otherwise. Define variable Y as the product of the binary variable and Z.
3. Do step 4 to step 9 for each province separately.
4. Calculate the sum, over all the households, of the product of the final weight (section Weighting) and Y; this sum represents the estimate of the total for the domain of interest in the province under consideration.
5. Calculate the sum, over all the households, of the product of the final weight and the household size.
6. Divide the result obtained in step 4 by the result obtained in step 5.
7. For each household, multiply the result obtained in step 6 by the household size.
8. For each household, define a variable, say E, by the subtraction of the result obtained in step 7 from Y.
9. Calculate the sum, over all the households, of the product of (the final weight minus 1), the final weight, and E squared; this sum represents the estimated variance of the total estimated at step 4.
10. Add up the result obtained in step 9 for each province.
11. The ACV is defined to be 100 times the square root of the result obtained in step 10, divided by the estimate. The estimate is the sum, over all the provinces, of the result obtained in step 4.
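As a rough illustration, steps 1 to 11 of rule 1 can be sketched in Python. The record layout (dictionary keys `province`, `weight` and `size`) and the function and parameter names are assumptions made for this sketch; they are not variable names from the microdata file.

```python
from collections import defaultdict
from math import sqrt


def approx_cv_total(households, in_domain, value):
    """Return (estimate, ACV in %) for an estimated total, following rule 1.

    households: iterable of dicts with hypothetical keys 'province',
                'weight' (final weight) and 'size' (household size).
    in_domain:  function returning True if a household is in the domain.
    value:      function returning the variable of interest
                (use 1 for a categorical count).
    """
    by_province = defaultdict(list)
    for h in households:
        # Steps 1-2: Y is the variable of interest inside the domain, else 0.
        # Households outside the domain are kept: they still feed the variance.
        y = value(h) if in_domain(h) else 0.0
        by_province[h["province"]].append((h["weight"], h["size"], y))

    estimate = 0.0
    variance = 0.0
    for rows in by_province.values():                     # step 3
        total_y = sum(w * y for w, _, y in rows)          # step 4
        total_size = sum(w * m for w, m, _ in rows)       # step 5
        ratio = total_y / total_size                      # step 6
        # Steps 7-9: E = Y - ratio * size, then sum (w - 1) * w * E^2
        variance += sum((w - 1) * w * (y - ratio * m) ** 2 for w, m, y in rows)
        estimate += total_y                               # steps 10-11

    return estimate, 100 * sqrt(variance) / estimate      # step 11


sample = [
    {"province": "A", "weight": 2, "size": 1, "spend": 10, "one_person": True},
    {"province": "A", "weight": 3, "size": 2, "spend": 99, "one_person": False},
    {"province": "B", "weight": 4, "size": 1, "spend": 5, "one_person": True},
]
est, acv = approx_cv_total(sample, lambda h: h["one_person"], lambda h: h["spend"])
print(est, acv)
```

The sketch passes the whole sample and zeroes Y outside the domain rather than filtering records, so that out-of-domain households still contribute to the variance. It does not guard against an estimate of zero, for which the CV is undefined.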
42. tio is larger than 2, the observed difference is significant at the 5% level.
Example 5
Let us suppose we wish to test, at the 5% level of significance, the hypothesis that there is no difference between the total of spending on furnishings and equipment in Alberta and the same total in Manitoba. From example 3, the approximate CV of the difference between these two estimates was found to be 6.24% and z = 16.03. Since this value is greater than 2, it must be concluded that there is a significant difference between the two estimates at the 0.05 level of significance.
Confidentiality of the public-use microdata
Microdata files for public use differ in many ways from the master file of the survey held by Statistics Canada. These variations are due to measures taken to preserve the anonymity of respondents to the survey. The confidentiality of this file is ensured mainly by reducing information, i.e., deleting variables or suppressing or collapsing some of their detail.
To protect confidentiality:
All explicitly identifying information, such as identification numbers, was removed from the file. Names and addresses are not data captured.
228 records had their province codes set to 0 due to special characteristics, e.g., exceedingly high or low expenditure values. These records were reweighted. Other records were also reweighted for confidentiality reasons.
There was top coding and collapsing o
43. tory, especially for the most reliable estimates. Note that the accuracy of this approximation is reduced when the domains become smaller. Therefore, the CV approximation method must be used prudently when the domains are small. The document on data quality for the 1997 SHS contains the results of the evaluation of the performance of the CV approximation method.
How to obtain approximate CVs
The following rules should enable the user to determine the approximate coefficients of variation for estimates of totals, means or proportions, ratios, and differences between such estimates, for sub-populations (domains) for which the Bootstrap CV is not provided in the CV tables.
Important: If the number of observations on which an estimate is based is less than 30, the weighted estimate should not be released, regardless of the value of the CV for this estimate.
Rule 1: Approximating CVs for estimates of totals (aggregates)
All the steps below must be followed to obtain an approximate CV (ACV) for an estimate of a total, either a number of households possessing a certain characteristic (categorical estimate) or a total of some expense for all households (quantitative estimate), for a sub-population (domain of interest).
1. Create a binary variable for each household, equalling 1 if the household is part of the domain of interest (i.e., possesses the desired characteristic) and 0 otherwise.
2. To estima
44. ts is calculated as a check on respondents' recall. This important quality control tool involves the balancing of receipts (income and other money received by the household) and disbursements (total expenditure plus the variable Money flows: assets, loans and other debts) for each questionnaire. If the difference is greater than 30% of the larger of receipts or disbursements, the record is considered unusable and therefore will not be used.
In 2007, in order to reduce respondent burden, new screening questions were added to the questionnaire for some categories. Since the answers to these questions were yes or no, where the response was negative the interviewer would skip the remaining parts of the question and go to the next one. This would save time and result in a shorter interview. The addition of the screening questions did not change the reporting percentage for most categories. However, we have noted that for a few categories it has resulted in a lower-than-expected percentage reporting, and therefore slightly lower averages for some items under that category. These screening questions will be modified for the 2008 SHS. The following is a list of the categories where the screening questions may have affected the response rate:
• Cooking equipment
• Microwave ovens
• Sewing machines, vacuum cleaners
• Home and workshop tools and equipment
• Other lawn, garden and snow removal tools
• Use of recreational facilities
• Maps
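The receipts and disbursements balance check described above (a record is unusable when the two differ by more than 30% of the larger amount) can be illustrated with a minimal Python sketch. The function name `is_usable` and the `tolerance` parameter are hypothetical; the survey's actual edit rules may involve further conditions not described here.

```python
def is_usable(receipts, disbursements, tolerance=0.30):
    """Return True if receipts and disbursements balance within tolerance.

    The record is treated as unusable when the absolute difference is
    greater than `tolerance` times the larger of the two amounts.
    """
    larger = max(receipts, disbursements)
    return abs(receipts - disbursements) <= tolerance * larger


print(is_usable(50_000, 40_000))  # difference 10,000 vs limit 15,000 -> True
print(is_usable(50_000, 30_000))  # difference 20,000 vs limit 15,000 -> False
```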