B. Statistical Methods
This submission requests clearance for sampling and school recruitment activities for the High School Longitudinal Study of 2009 (HSLS:09) main study to be completed in 2009. This section provides a description of the target universe for this study, followed by an overview of the sampling and statistical methodologies proposed for the main study. We will also address suggested methods for maximizing response rates and for tests of procedures and methods, and we will introduce the statisticians and other technical staff responsible for design and administration of the study.
The target population for the HSLS:09 main study consists of 9th grade students in public and private schools that include 9th and 11th grades; their parents; and corresponding math and science teachers, school administrators, and high school counselors. The required respondent samples will be selected from all public and private schools with 9th and 11th grades in the 50 states and the District of Columbia.1 Excluded from the target universe will be specialty schools such as Bureau of Indian Affairs schools; special education schools for people with disabilities; area vocational schools that do not enroll students directly; schools that serve students at detention centers and rehabilitative or correctional facilities; and schools for the dependents of U.S. personnel overseas.
The primary sampling units (PSUs), schools, were selected from two U.S. Department of Education databases. The Common Core of Data (CCD) was used to select public schools, and private schools were selected from the Private School Survey (PSS) universe files. The full-scale study sample of schools was selected before the field test sample to reduce possible overlap between the samples and thus reduce burden. However, the early-selected full-scale study sample will be “refreshed” with a small supplemental sample of schools that became eligible in the time between the administration of the field test and of the full-scale study. The secondary sampling units (SSUs), students, will be selected from student rosters obtained from the sample schools. The PSU and SSU sampling procedures for this study are detailed in the next section.
The following section describes sampling procedures for the main study for which clearance is requested. The selection plan for the full-scale study sample of schools is discussed first, followed by the selection procedures for the student samples for the main study, which will be conducted in 2009. This section also describes the procedures to be followed after data collection, including survey weight adjustments, to measure and reduce bias associated with nonresponse.
RTI used NCES’s 2005-2006 CCD as the public school sampling frame and the 2005-2006 PSS as the private school sampling frame. Given that these two sample sources provide comprehensive listings of schools, and that CCD and PSS data files have been used as school frames for a number of other school-based surveys, it is particularly advantageous to use these files in HSLS:09 for comparability and standardization across NCES surveys.
As mentioned earlier, the survey population for the full-scale study of HSLS:09 consists of all ninth-graders in the 50 states and District of Columbia enrolled in
regular public schools, including state department of education schools, that include 9th and 11th grades; and
Catholic and other private schools that have 9th and 11th grades.
Excluded from this study is the following list of schools:
schools with no 9th or 11th grade;
ungraded schools;
Bureau of Indian Affairs schools;
special education schools;
area vocational schools that do not enroll students directly;
schools that serve students at detention centers and rehabilitative or correctional facilities;
Department of Defense schools; and
closed public schools.
The school samples were selected using a stratified probability proportional to composite size methodology developed by RTI statisticians (Folsom, Potter, and Williams, 1987). This methodology will support the desired oversampling of students in key analytical domains (e.g., Asians), maintain near equal sampling weights for students within each domain, and result in approximately equal total student sample sizes within sampled schools. Details of school sample selection for the main study are provided next.
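The idea behind a composite size measure can be sketched as follows. This is a minimal illustration, not RTI's implementation: the domain names, enrollment counts, sampling rates, and function names are all hypothetical. Each school's size measure is the sum, over domains, of its enrollment times the desired overall sampling rate for that domain. Schools are then selected with probability proportional to that measure, and within-school rates are set so that every student in a domain has the same overall selection probability.

```python
from typing import Dict, List

Counts = Dict[str, int]

def composite_size(counts: Counts, rates: Dict[str, float]) -> float:
    """Composite size: sum over domains of enrollment x desired overall rate."""
    return sum(rates[d] * n for d, n in counts.items())

def school_probabilities(schools: List[Counts], rates: Dict[str, float],
                         n_schools: int) -> List[float]:
    """Selection probability of each school, proportional to composite size."""
    sizes = [composite_size(c, rates) for c in schools]
    total = sum(sizes)
    return [n_schools * s / total for s in sizes]

def within_school_rates(prob: float, rates: Dict[str, float]) -> Dict[str, float]:
    """Within-school rate per domain so that school prob x rate = overall rate."""
    return {d: r / prob for d, r in rates.items()}

# Hypothetical frame of four schools and hypothetical overall rates
# (the Asian domain is oversampled relative to all other students).
RATES = {"asian": 0.10, "other": 0.03}
SCHOOLS = [
    {"asian": 10, "other": 190},
    {"asian": 40, "other": 260},
    {"asian": 5, "other": 95},
    {"asian": 20, "other": 380},
]
PROBS = school_probabilities(SCHOOLS, RATES, n_schools=2)

# Expected number of sampled students in each school, if that school is selected.
EXPECTED_PER_SCHOOL = [
    sum(within_school_rates(p, RATES)[d] * n for d, n in c.items())
    for p, c in zip(PROBS, SCHOOLS)
]
```

In this toy frame the expected student sample is the same in every selected school, which is the "approximately equal total student sample sizes" property described above, while students in each domain keep equal overall selection probabilities.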
The public and private school samples for the full-scale study under the original specifications are large enough to secure approximately 800 participating schools, combined. The required samples were selected from the CCD (2005–2006) and PSS (2005–2006) within sampling strata defined by
school type: Public, Catholic, or Other private schools;
Census region: Northeast, Midwest, South, or West; and
locality: City, Suburban, Town, or Rural.
Prior to sample selection, the list frames were randomly sorted by state within Census region to maximize the spread of the sample across the U.S.
Per request from NCES, RTI selected a supplemental sample of public schools for the following states to ensure state-representative estimates could be produced: Florida, Georgia, Michigan, North Carolina, Ohio, Pennsylvania, Tennessee, and Washington. Power analyses determined that a minimum of 40 participating schools per state would satisfy the analyses. State-level estimates are also required for California and Texas, but the original full-scale study samples had sufficient numbers of public schools within these two states. Because the original full-scale study sample had been selected before the state-specific supplemental sample request, we used a Keyfitz (1951) sampling procedure to minimize the chance of selecting schools already selected for the HSLS:09 full-scale study, the HSLS:09 field test, and the 2010 PISA. The required sample of public schools was again selected from the CCD (2005–2006) within sampling strata defined by
state: Florida, Georgia, Michigan, North Carolina, Ohio, Pennsylvania, Tennessee, and Washington; and
locality: City, Suburban, Town, or Rural.
The state-specific samples will increase the required number of participating public schools from 600 to approximately 756.
Enrollment information was unavailable for less than 0.1 percent of the eligible schools on the CCD sampling frame. Prior to sample selection, RTI imputed the missing ninth-grade enrollment counts with the median value within region, locality, and race/ethnicity categories.
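A minimal sketch of median imputation within cells is shown below. The record layout, key names, and values are hypothetical and are not drawn from the CCD; the sketch also assumes that every cell contains at least one school with a reported count.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Optional, Tuple

CELL_KEYS = ("region", "locality", "race_category")  # hypothetical key names

def impute_median_enrollment(schools: List[Dict]) -> List[Dict]:
    """Fill missing grade-9 enrollment with the median of reported values
    among schools in the same region x locality x race/ethnicity cell."""
    reported: Dict[Tuple, List[int]] = defaultdict(list)
    for s in schools:
        if s["grade9"] is not None:
            reported[tuple(s[k] for k in CELL_KEYS)].append(s["grade9"])

    result = []
    for s in schools:
        rec = dict(s)                      # copy so the frame is not mutated
        rec["imputed"] = rec["grade9"] is None
        if rec["imputed"]:
            # statistics.median raises an error if the cell has no reported values
            rec["grade9"] = median(reported[tuple(rec[k] for k in CELL_KEYS)])
        result.append(rec)
    return result

# Hypothetical frame records; one school is missing its enrollment count.
FRAME = [
    {"id": 1, "region": "South", "locality": "Rural", "race_category": "A", "grade9": 100},
    {"id": 2, "region": "South", "locality": "Rural", "race_category": "A", "grade9": 140},
    {"id": 3, "region": "South", "locality": "Rural", "race_category": "A", "grade9": 120},
    {"id": 4, "region": "South", "locality": "Rural", "race_category": "A", "grade9": None},
    {"id": 5, "region": "West", "locality": "City", "race_category": "B", "grade9": 300},
]
IMPUTED = impute_median_enrollment(FRAME)
```

Flagging imputed records, as done here with the `imputed` key, keeps the imputation traceable after sample selection.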
As mentioned earlier, a new sample of schools will be added to the main study sample to account for new schools or those that became eligible after the sampling frames were constructed. For this purpose, we will compare the 2005–2006 and 2006–2007 CCDs to determine the frequency of new public high schools. Moreover, districts associated with this fresh sample of schools will be contacted to identify eligible schools recently opened in their jurisdiction. The districts will be provided with a list of all public schools on the sampling frame in their district to help them identify the appropriate schools. Analogous activities will be carried out for private schools using the 2007–2008 PSS. Additional sources of information, such as Quality Education Data (QED), will be examined for use in identifying new schools.
A sample larger than 956 schools (800 + 156) is necessary to compensate for anticipated nonresponse and ineligibility. Per NCES standards, we will target a weighted response rate of at least 70 percent at the school level. In unweighted terms, this means that a sample of 1,366 (956 / 0.7) schools will be required to secure 956 participating schools. Based on our experience with the Education Longitudinal Study of 2002 (ELS:2002), about 4 percent of sampled schools will emerge as ineligible for this study. Consequently, the projected size of the starting sample will be 1,421 (1,366 × 1.04) schools. Moreover, based on ELS:2002 response rates, we expect that an additional sample of 175 schools will be needed to secure 956 participating schools, for a grand total of 1,596 (1,421 + 175) schools. Table 10 provides the total number of sampled and participating schools by the original design strata and includes the additional 156 state-specific public schools.
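The arithmetic in the preceding paragraph can be retraced with a short sketch. The function name is hypothetical, and rounding up at each step is an assumption that happens to reproduce the stated figures; the text does not say which rounding rule was applied.

```python
import math

def starting_sample(target: int, response_rate: float,
                    ineligible_share: float, extra: int):
    """Retrace the school sample-size steps described in the text."""
    # Inflate the participation target for school-level nonresponse.
    for_nonresponse = math.ceil(target / response_rate)
    # Inflate again for schools expected to be ineligible (multiplied, as in the text).
    for_ineligibility = math.ceil(for_nonresponse * (1 + ineligible_share))
    # Add the additional schools suggested by ELS:2002 response rates.
    return for_nonresponse, for_ineligibility, for_ineligibility + extra

STEPS = starting_sample(target=956, response_rate=0.70,
                        ineligible_share=0.04, extra=175)
```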
Table 10. Illustrative school sample allocation and expected yields (full-scale study HSLS:09)
| School stratum | Total sampled | Total participating | Northeast sampled | Northeast participating | Midwest sampled | Midwest participating | South sampled | South participating | West sampled | West participating |
|---|---|---|---|---|---|---|---|---|---|---|
| Total | 1,596 | 956 | 274 | 165 | 397 | 238 | 614 | 367 | 311 | 186 | 
| Public, total | 1,263 | 755 | 200 | 120 | 301 | 180 | 506 | 302 | 256 | 153 | 
| Public, city | 365 | 219 | 52 | 31 | 78 | 47 | 137 | 82 | 98 | 59 | 
| Public, suburban | 468 | 279 | 90 | 54 | 114 | 68 | 166 | 99 | 98 | 58 | 
| Public, town | 132 | 79 | 23 | 14 | 32 | 19 | 50 | 30 | 27 | 16 | 
| Public, rural | 298 | 178 | 35 | 21 | 77 | 46 | 153 | 91 | 33 | 20 | 
| Catholic, total | 165 | 100 | 45 | 27 | 57 | 35 | 40 | 24 | 23 | 14 | 
| Catholic, city | 92 | 56 | 21 | 13 | 32 | 19 | 28 | 17 | 11 | 7 | 
| Catholic, suburban | 51 | 30 | 19 | 11 | 18 | 11 | 7 | 4 | 7 | 4 | 
| Catholic, town | 18 | 10 | 4 | 2 | 6 | 4 | 4 | 2 | 4 | 2 | 
| Catholic, rural | 4 | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 
| Other private, total | 168 | 101 | 29 | 18 | 39 | 23 | 68 | 41 | 32 | 19 | 
| Other private, city | 73 | 44 | 11 | 7 | 15 | 9 | 27 | 16 | 20 | 12 | 
| Other private, suburban | 57 | 35 | 8 | 5 | 16 | 10 | 26 | 16 | 7 | 4 | 
| Other private, town | 16 | 10 | 3 | 2 | 4 | 2 | 8 | 5 | 1 | 1 | 
| Other private, rural | 22 | 12 | 7 | 4 | 4 | 2 | 7 | 4 | 4 | 2 | 
*Counts include additional sample selected for state-level estimates.
We will closely monitor the school recruitment activities and release additional schools as needed to ensure that we reach our goal of 956 participating schools. To this end, in addition to the above sample of 1,596 schools, a reserve pool of 366 schools will be selected should observed yield rates fall below expectations. Operationally, the entire sample of 1,962 (or, 1,596 + 366) schools will be randomly partitioned within each stratum into two release pools and two reserve pools. The two release pools will compose the basic sample of 1,596 schools. If needed, schools randomly grouped by reserve pool and stratum will be released for data collection in only those strata with low yield rates.
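One way to picture the partition into release and reserve pools is the sketch below. It is only an illustration under stated assumptions: the function name, the `release_share` parameter, and the rule of alternating schools between the two pools of each kind are hypothetical, since the text does not specify how schools are divided among pools within a stratum.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

def partition_into_pools(schools: List[Tuple[str, str]],
                         release_share: float, seed: int = 0) -> Dict[str, str]:
    """Randomly assign each (school_id, stratum) to one of two release pools
    or one of two reserve pools, separately within each stratum."""
    rng = random.Random(seed)
    by_stratum: Dict[str, List[str]] = defaultdict(list)
    for school_id, stratum in schools:
        by_stratum[stratum].append(school_id)

    pools: Dict[str, str] = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)                              # random order within stratum
        n_release = round(len(ids) * release_share)   # assumed allocation rule
        for i, school_id in enumerate(ids):
            if i < n_release:
                pools[school_id] = f"release_{i % 2 + 1}"
            else:
                pools[school_id] = f"reserve_{(i - n_release) % 2 + 1}"
    return pools

# Hypothetical sample: 20 schools in stratum A and 10 in stratum B.
SAMPLE = [(f"A{i}", "A") for i in range(20)] + [(f"B{i}", "B") for i in range(10)]
POOLS = partition_into_pools(SAMPLE, release_share=0.8, seed=42)
```

Because the assignment is random within stratum, a reserve pool can later be released in only those strata where yields are low without disturbing the probability structure of the sample.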
As with the HSLS:09 field test, RTI will use data from QED to obtain principal and district superintendent names along with related information needed to contact the sampled schools. Contacted schools who agree to participate will be asked to provide current student rosters within a few months of the start of data collection. For refusing schools, an abbreviated questionnaire will be used to obtain important school-characteristic data to complement frame information. The resulting information will enable us to conduct a more effective analysis of nonresponse bias.
All sampled schools will be contacted and asked to upload their student lists to a secure website to serve as sampling frames for student samples. Moreover, a backup option will allow schools to provide their student lists via e‑mail of zipped/password-protected files. If the school cannot provide electronic lists, we will ask for paper lists to be faxed to a fax machine in a locked room at RTI. For data security reasons, we will request that paper lists not be mailed. RTI will ask each sample school to provide the following information for each eligible student:
student ID number;
full name;
sex;
race (White; Black; Asian; Native Hawaiian or Other Pacific Islander; American Indian or Alaska Native);
ethnicity (Hispanic indicator, regardless of race); and
whether an Individualized Education Program (IEP) has been filed for the student (yes, no).
Race/ethnicity will be needed to guide oversampling of minority students. Moreover, race/ethnicity along with gender and IEP indicators often serve as effective variables for nonresponse adjustments in the full-scale study.
As requested by NCES, no students will be excluded from the sampling frame because of disabilities or language problems. Specifically, the HSLS:09 main study will include students with severe mental disabilities, those with limited command of the English language for understanding the survey materials, and students with physical or emotional problems. Schools will identify such students, and we will work with the schools to determine if any accommodations can be made for these students to complete the survey and assessment. Students who cannot complete the survey or cognitive tests will be excused from doing so; however, contextual information about such students will be collected from teachers, principals, high school counselors, and parents.
The student lists will be reviewed for quality, and schools whose lists fail the quality checks will be recontacted by the school recruiter to resolve observed discrepancies.2 We will proceed with selecting sample students when we have either confirmed that the list received is correct or have received a corrected list. Students will be sampled on a flow basis as student lists are received. We will stratify the lists by race/ethnicity and select a systematic sample of students from the resulting lists. For schools that provide paper lists, RTI will use a two-stage process that we have used effectively to select systematic samples from paper lists. This simple, yet scientific, method eliminates the need for data entry of the entire list of students when such lists are provided on paper. Instead, only information for sampled students will be data-entered.
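The stratified systematic selection described above can be sketched as follows for an electronic list. This is a minimal illustration, not RTI's procedure: the roster layout, stratum labels, sample sizes, and function names are hypothetical, and the sketch uses a fractional sampling interval with a random start.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")

def systematic_sample(items: Sequence[T], n: int, rng: random.Random) -> List[T]:
    """Take every k-th item from a random start, where k = len(items) / n."""
    if n <= 0 or not items:
        return []
    if n >= len(items):
        return list(items)
    step = len(items) / n                 # fractional sampling interval (> 1)
    start = rng.random() * step           # random start in [0, step)
    last = len(items) - 1
    return [items[min(int(start + k * step), last)] for k in range(n)]

def stratified_systematic_sample(roster: List[Dict], sizes: Dict[str, int],
                                 seed: int = 0) -> List[Dict]:
    """Group the roster by stratum, then sample systematically in each group."""
    rng = random.Random(seed)
    by_stratum: Dict[str, List[Dict]] = defaultdict(list)
    for student in roster:
        by_stratum[student["stratum"]].append(student)
    sample: List[Dict] = []
    for stratum, students in by_stratum.items():
        sample.extend(systematic_sample(students, sizes.get(stratum, 0), rng))
    return sample

# Hypothetical roster of 100 ninth-graders in two race/ethnicity strata.
ROSTER = [{"id": i, "stratum": "asian" if i < 20 else "other"} for i in range(100)]
SELECTED = stratified_systematic_sample(ROSTER, {"asian": 10, "other": 15}, seed=1)
```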
On average, 25 ninth-grade students will be randomly selected from each HSLS:09 sample school, for a base sample of approximately 23,900 (956 × 25) students. Moreover, this base sample will be augmented by selecting at least 1,800 additional Asian students, for a total sample of 25,700 students.3 This augmentation is required to ensure that this subpopulation meets the minimum sample size needed to achieve the following general precision requirements:
detect a 15 percent change in proportions across waves of the study;
detect a 5 percent change in means;
produce relative standard errors of 10 percent or less for proportion estimates based on data from a single wave of data collection; and
produce relative standard errors of 2.5 percent or less for estimated means based on data from a single wave of data collection.
Using student enrollment counts from the CCD/PSS and relying on our experience from the field test, the student sampling rates will be set in advance based on race/ethnicity. Students will be sampled from the student lists RTI will receive from sample schools, using a stratified, systematic sampling procedure. Sample sizes will be monitored by race/ethnicity and the sampling rates will be adjusted, if necessary, to achieve all sample size goals. While we expect to achieve the stated response and eligibility rates, an early identification of low sample yields will be vital in making sure we can adjust appropriately to reach our target yields. Table 11 shows a possible student sample allocation and yield for the HSLS:09 full-scale study. We anticipate requesting student lists and drawing student samples on a flow basis between August and November 2009.
Table 11. Student sample allocation and expected yields for HSLS:09 ninth-graders
| School stratum | Total sample | Total respondents | Hispanic sample | Hispanic respondents | Asian sample | Asian respondents | Black sample | Black respondents | Other sample | Other respondents |
|---|---|---|---|---|---|---|---|---|---|---|
| Total | 23,900 | 20,888 | 2,900 | 2,221 | 2,652 | 2,082 | 2,943 | 2,236 | 15,405 | 14,349 | 
| Northeast | 4,302 | 3,760 | 523 | 399 | 476 | 372 | 530 | 403 | 2,773 | 2,586 | 
| Public, city | 747 | 652 | 91 | 69 | 83 | 66 | 92 | 70 | 480 | 447 | 
| Public, suburban | 1,328 | 1,160 | 162 | 124 | 148 | 115 | 163 | 124 | 854 | 797 | 
| Public, town | 406 | 355 | 49 | 37 | 44 | 36 | 50 | 38 | 262 | 243 | 
| Public, rural | 508 | 444 | 61 | 47 | 56 | 45 | 63 | 47 | 328 | 305 | 
| Catholic, city | 387 | 340 | 47 | 36 | 43 | 34 | 48 | 36 | 249 | 233 | 
| Catholic, suburban | 320 | 280 | 38 | 30 | 35 | 26 | 40 | 30 | 207 | 194 | 
| Catholic, town | 99 | 86 | 12 | 9 | 11 | 8 | 12 | 10 | 64 | 59 | 
| Catholic, rural | 30 | 26 | 3 | 3 | 3 | 2 | 3 | 3 | 20 | 18 | 
| Other private, city | 180 | 157 | 22 | 16 | 20 | 15 | 22 | 16 | 116 | 108 | 
| Other private, suburban | 138 | 119 | 16 | 12 | 15 | 10 | 16 | 12 | 90 | 85 | 
| Other private, town | 42 | 37 | 5 | 4 | 4 | 3 | 5 | 4 | 26 | 25 | 
| Other private, rural | 117 | 104 | 17 | 12 | 14 | 12 | 16 | 13 | 77 | 72 | 
| Midwest | 6,005 | 5,248 | 729 | 559 | 667 | 524 | 738 | 561 | 3,871 | 3,604 | 
| Public, city | 1,046 | 915 | 127 | 98 | 115 | 93 | 129 | 96 | 674 | 628 | 
| Public, suburban | 1,601 | 1,399 | 194 | 149 | 178 | 138 | 197 | 149 | 1,032 | 963 | 
| Public, town | 490 | 429 | 59 | 45 | 55 | 43 | 60 | 46 | 316 | 295 | 
| Public, rural | 1,105 | 966 | 134 | 103 | 123 | 96 | 136 | 103 | 713 | 663 | 
| Catholic, city | 598 | 522 | 72 | 56 | 67 | 53 | 73 | 56 | 385 | 357 | 
| Catholic, suburban | 343 | 299 | 42 | 32 | 38 | 30 | 42 | 32 | 221 | 206 | 
| Catholic, town | 105 | 92 | 13 | 10 | 12 | 9 | 12 | 11 | 68 | 62 | 
| Other private, city | 269 | 235 | 33 | 25 | 30 | 23 | 33 | 25 | 173 | 161 | 
| Other private, suburban | 274 | 240 | 34 | 25 | 31 | 24 | 34 | 26 | 175 | 164 | 
| Other private, town | 84 | 73 | 10 | 8 | 9 | 8 | 10 | 8 | 56 | 50 | 
| Other private, rural | 90 | 78 | 11 | 8 | 9 | 7 | 12 | 9 | 58 | 55 | 
| South | 8,903 | 7,779 | 1,080 | 827 | 989 | 777 | 1,096 | 832 | 5,738 | 5,343 | 
| Public, city | 1,881 | 1,644 | 228 | 174 | 208 | 164 | 231 | 176 | 1,214 | 1,129 | 
| Public, suburban | 2,379 | 2,078 | 289 | 220 | 264 | 208 | 293 | 223 | 1,533 | 1,427 | 
| Public, town | 728 | 636 | 88 | 68 | 81 | 64 | 90 | 68 | 469 | 436 | 
| Public, rural | 2,002 | 1,749 | 242 | 186 | 223 | 175 | 247 | 187 | 1,290 | 1,199 | 
| Catholic, city | 538 | 470 | 66 | 50 | 60 | 45 | 66 | 50 | 346 | 324 | 
| Catholic, suburban | 138 | 119 | 16 | 12 | 16 | 12 | 18 | 12 | 88 | 83 | 
| Catholic, town | 42 | 37 | 5 | 4 | 4 | 3 | 5 | 4 | 26 | 25 | 
| Other private, city | 508 | 444 | 61 | 47 | 56 | 45 | 62 | 47 | 328 | 305 |
| Other private, suburban | 435 | 380 | 53 | 41 | 48 | 37 | 54 | 41 | 281 | 262 |
| Other private, town | 133 | 116 | 16 | 12 | 14 | 12 | 16 | 12 | 86 | 80 |
| Other private, rural | 119 | 106 | 16 | 13 | 15 | 12 | 14 | 12 | 77 | 73 |
| West | 4,690 | 4,101 | 568 | 436 | 520 | 409 | 579 | 440 | 3,023 | 2,816 |
| Public, city | 1,314 | 1,149 | 159 | 122 | 146 | 115 | 162 | 123 | 848 | 790 |
| Public, suburban | 1,533 | 1,339 | 186 | 141 | 170 | 135 | 189 | 144 | 988 | 919 |
| Public, town | 469 | 410 | 56 | 45 | 53 | 41 | 58 | 44 | 303 | 281 |
| Public, rural | 388 | 340 | 47 | 36 | 43 | 34 | 48 | 36 | 250 | 234 |
| Catholic, city | 209 | 183 | 25 | 20 | 23 | 19 | 26 | 20 | 135 | 125 |
| Catholic, suburban | 137 | 120 | 16 | 12 | 16 | 12 | 16 | 12 | 88 | 83 |
| Catholic, town | 42 | 37 | 5 | 4 | 3 | 3 | 5 | 4 | 27 | 25 |
| Other private, city | 358 | 314 | 44 | 33 | 39 | 30 | 44 | 34 | 231 | 217 |
| Other private, suburban | 114 | 101 | 13 | 11 | 12 | 10 | 14 | 11 | 75 | 69 |
| Other private, town | 35 | 31 | 4 | 3 | 4 | 3 | 4 | 3 | 22 | 21 |
| Other private, rural | 91 | 77 | 13 | 9 | 11 | 7 | 13 | 9 | 56 | 52 |
Analogous to the field test sample, one math and one science teacher will be selected for each ninth-grade student. Where sample students have more than one math or science teacher in fall 2009, we will randomly sample one of the teachers. In addition, for each sample school there will be one sample high school counselor and one sample parent. In two-parent households, the parent most knowledgeable about the student’s school situation and experience will be asked to participate.
After data collection, survey data must go through several steps before analysis and reporting tasks can begin. Once data have been compiled and edited, survey weights will be computed, followed by variance estimation and imputation of missing data. In this section we provide a brief overview of each of these steps for the HSLS:09 full-scale study.
Virtually all survey data are weighted before they can be used to produce reliable estimates of population parameters. While reflecting the selection probabilities of sampled units, weighting also attempts to compensate for practical limitations of a sample survey, such as differential nonresponse and undercoverage. Furthermore, by taking advantage of auxiliary information about the target population, weighting can reduce the variability of estimates. The weighting process essentially entails four major steps. The first step consists of the computation of design or base weights. In the second step, base weights will be adjusted for nonresponse, while in the third step nonresponse-adjusted weights will be further adjusted so that aggregate counts can match reported estimates for the target population. Finally, adjusted weights will go through a series of quality control checks to detect extreme outliers and to prevent any computational as well as procedural errors.
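The first three weighting steps can be illustrated with a traditional weighting-class sketch. This is not the WTADJUST approach planned for HSLS:09; the record layout, cell labels, and control total below are hypothetical, and the fourth step (quality control checks) is omitted.

```python
from collections import defaultdict
from typing import Dict, List

def weighting_class_weights(units: List[Dict],
                            control_totals: Dict[str, float]) -> Dict[int, float]:
    """Base weights -> nonresponse adjustment within cells -> poststratification.
    Returns final weights for responding units only."""
    # Step 1: base (design) weight is the inverse of the selection probability.
    base = [1.0 / u["prob"] for u in units]

    # Step 2: within each cell, respondents carry the weight of nonrespondents.
    all_w: Dict[str, float] = defaultdict(float)
    resp_w: Dict[str, float] = defaultdict(float)
    for u, w in zip(units, base):
        all_w[u["cell"]] += w
        if u["responded"]:
            resp_w[u["cell"]] += w
    adjusted = {u["id"]: w * all_w[u["cell"]] / resp_w[u["cell"]]
                for u, w in zip(units, base) if u["responded"]}

    # Step 3: scale so weighted counts match known totals in each poststratum.
    post_of = {u["id"]: u["post"] for u in units}
    post_w: Dict[str, float] = defaultdict(float)
    for uid, w in adjusted.items():
        post_w[post_of[uid]] += w
    return {uid: w * control_totals[post_of[uid]] / post_w[post_of[uid]]
            for uid, w in adjusted.items()}

# Hypothetical units: equal selection probabilities, one nonrespondent in cell "a".
UNITS = [
    {"id": 1, "prob": 0.1, "cell": "a", "post": "all", "responded": True},
    {"id": 2, "prob": 0.1, "cell": "a", "post": "all", "responded": False},
    {"id": 3, "prob": 0.1, "cell": "b", "post": "all", "responded": True},
    {"id": 4, "prob": 0.1, "cell": "b", "post": "all", "responded": True},
]
WEIGHTS = weighting_class_weights(UNITS, control_totals={"all": 50.0})
```

In this toy example the respondent in cell "a" ends up with twice the weight of each respondent in cell "b", and the final weights sum to the control total.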
The HSLS:09 multilevel and multicomponent design introduces significant complexity to the task of weighting. Cognizant of this complexity, RTI will make every effort to keep the resulting weights as simple and intuitive as possible. A minimum of two sets of weights will be required for the analysis of the HSLS:09 data: school weights and student weights. While we expect to secure the stated rates of response, if response rates fall below the accepted limit (at either the unit or the item level), we will carry out a detailed nonresponse bias analysis to estimate the extent of the incurred bias and to identify appropriate methods for nonresponse weight adjustment.
Several methods have been suggested for measuring nonresponse bias. In the simplest form, this bias can be approximated temporally by comparing responses obtained from those who respond earlier in the data collection period against late respondents. The incurred bias due to nonresponse can be measured more systematically, however, as the difference between survey estimates and their respective target parameters, that is, the values that would result if a complete census were conducted and all units responded. For instance, when estimating a population mean ($\mu$) based on respondents only ($\bar{y}_R$), nonresponse bias can be expressed as

$$B(\bar{y}_R) = \bar{y}_R - \mu .$$

However, for variables that are available from the sampling frame, $\mu$ can be estimated by $\hat{\mu}$ without sampling error, in which case the bias in $\bar{y}_R$ can then be estimated by

$$\hat{B}(\bar{y}_R) = \bar{y}_R - \hat{\mu} .$$

Moreover, an estimate of the population mean based on respondents and nonrespondents can be obtained by

$$\hat{\mu} = (1 - \eta)\,\bar{y}_R + \eta\,\bar{y}_{NR} ,$$

where $\bar{y}_{NR}$ is the mean for nonrespondents and $\eta$ is the weighted unit nonresponse rate, based on design weights prior to nonresponse adjustment. Consequently, the bias in $\bar{y}_R$ can then be estimated by

$$\hat{B}(\bar{y}_R) = \eta\,(\bar{y}_R - \bar{y}_{NR}) .$$
That is, the estimate of the nonresponse bias is the difference between the mean for respondents and the mean for nonrespondents, multiplied by the weighted nonresponse rate, using the design weights prior to nonresponse adjustment. This basic approach will be used to measure bias in key survey estimates by relying on data that will be available for both respondents and nonrespondents.
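The bias estimate just described, the weighted nonresponse rate times the difference between the respondent and nonrespondent means, can be computed as in the sketch below. The function names and the data are hypothetical; the variable is assumed to be known for both respondents and nonrespondents (for example, from the sampling frame), and the weights are design weights.

```python
from typing import List, Sequence

def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def nonresponse_bias(values: List[float], weights: List[float],
                     responded: List[bool]) -> float:
    """Estimated bias of the respondent mean:
    weighted nonresponse rate x (respondent mean - nonrespondent mean)."""
    r = [(v, w) for v, w, flag in zip(values, weights, responded) if flag]
    nr = [(v, w) for v, w, flag in zip(values, weights, responded) if not flag]
    eta = sum(w for _, w in nr) / sum(weights)      # weighted nonresponse rate
    mean_r = weighted_mean([v for v, _ in r], [w for _, w in r])
    mean_nr = weighted_mean([v for v, _ in nr], [w for _, w in nr])
    return eta * (mean_r - mean_nr)

# Hypothetical frame variable for four units with equal design weights.
VALUES = [10.0, 20.0, 30.0, 40.0]
DESIGN_WEIGHTS = [1.0, 1.0, 1.0, 1.0]
RESPONDED = [True, True, True, False]
BIAS = nonresponse_bias(VALUES, DESIGN_WEIGHTS, RESPONDED)
```

Here the respondent mean is 20, the full-sample mean is 25, and the estimated bias is -5, which matches the difference between the respondent mean and the full-sample mean.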
To reduce some of the bias due to nonresponse, when appreciable bias is detected at any level, design weights will be adjusted within cells indexed by variables that are deemed strong predictors of response status. To identify such variables, which typically include sampling stratification variables and indicators that can efficiently partition units into homogeneous segments, we will rely on classification procedures such as CHAID (chi-square automatic interaction detection). CHAID is a hierarchical clustering algorithm that successively partitions units according to a categorical characteristic. The algorithm begins with all sample units as a whole and cycles over each predictor to find the optimal partition of the units. The most significant predictor is identified, resulting in partitioning of units into smaller subsets. Next, the algorithm is applied to each partitioned subset of units to find further partitions using the remaining predictors. The process stops after a specified number of partitioning steps or if none of the partitions at a given step is found to be significant.
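A single partitioning step of this kind can be sketched as follows. This is a deliberately simplified illustration and not a full CHAID implementation: it ranks predictors by the raw Pearson chi-square statistic against response status, which is only a fair comparison when predictors have the same number of categories, whereas CHAID proper merges categories and compares adjusted p-values. The predictor names and data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

def chi_square(xs: Sequence[str], ys: Sequence[bool]) -> float:
    """Pearson chi-square statistic for a categorical predictor vs. response status."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    stat = 0.0
    for r, n_r in row_totals.items():
        for c, n_c in col_totals.items():
            expected = n_r * n_c / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat

def best_split(units: List[Dict], predictors: List[str],
               response_key: str = "responded") -> str:
    """Return the predictor most strongly associated with response status."""
    ys = [u[response_key] for u in units]
    return max(predictors, key=lambda p: chi_square([u[p] for u in units], ys))

# Hypothetical units: response depends on locality but not on sex.
UNITS = [
    {"locality": "city", "sex": "F", "responded": True},
    {"locality": "city", "sex": "M", "responded": True},
    {"locality": "city", "sex": "F", "responded": True},
    {"locality": "city", "sex": "M", "responded": True},
    {"locality": "rural", "sex": "F", "responded": False},
    {"locality": "rural", "sex": "M", "responded": False},
    {"locality": "rural", "sex": "F", "responded": False},
    {"locality": "rural", "sex": "M", "responded": False},
]
FIRST_SPLIT = best_split(UNITS, ["locality", "sex"])
```

Applying `best_split` again within each resulting subset, using the remaining predictors and a significance-based stopping rule, would mirror the recursive process described above.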
For HSLS:09 all weight adjustments—including those for nonresponse and poststratification—will be calculated using RTI’s WTADJUST procedure in SUDAAN® (Research Triangle Institute, 2008). This procedure has been proven to produce weights with less variability than what is achievable via traditional methods. Two reasons are noted: First, WTADJUST allows a much larger set of variables and their interactions to be used during the model development for nonresponse and raking adjustments, hence enabling the weighted data to mimic the distribution of the target universe with respect to a more comprehensive set of indices. Second, this desirable property is achieved while preventing the adjusted weights from becoming too extreme. That is, the procedure produces study estimates that better represent the target universe without increasing variance of estimates significantly, which would otherwise reduce the power of statistical tests.
For variance estimation, we will create sets of 200 balanced repeated replication (BRR) weights for school and student samples. The BRR weights are appropriate for use in NCES’s Data Analysis System (DAS) and do not affect the analysis weights used for point estimation. The BRR weighting process will replicate the full weighting process and will use procedures developed for a number of other studies, including ELS:2002 and the National Study of Postsecondary Faculty. In addition, design strata and PSUs will be included on the electronic code book for analysts wanting to use Taylor series variance estimation rather than BRR weights.
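How replicate weights are used for variance estimation can be sketched as follows. The example is a toy under stated assumptions: it uses 4 replicates rather than 200, two strata with two PSUs each, hypothetical values, and the basic BRR formula (the average squared deviation of the replicate estimates from the full-sample estimate). It does not reproduce the replicated weight adjustments described above.

```python
from typing import List, Sequence

def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def brr_variance(values: Sequence[float], full_weights: Sequence[float],
                 replicate_weights: List[Sequence[float]]) -> float:
    """Basic BRR variance: mean squared deviation of replicate estimates
    from the full-sample estimate."""
    theta = weighted_mean(values, full_weights)
    reps = [weighted_mean(values, w) for w in replicate_weights]
    return sum((t - theta) ** 2 for t in reps) / len(reps)

# Units ordered as: stratum 1 PSU a, stratum 1 PSU b, stratum 2 PSU a, stratum 2 PSU b.
VALUES = [10.0, 14.0, 20.0, 28.0]
FULL_WEIGHTS = [1.0, 1.0, 1.0, 1.0]

# Each replicate keeps one PSU per stratum with doubled weight and drops the other.
# The four patterns below form a balanced set for two strata.
REPLICATE_WEIGHTS = [
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 2.0, 0.0],
    [2.0, 0.0, 0.0, 2.0],
    [0.0, 2.0, 0.0, 2.0],
]
VARIANCE = brr_variance(VALUES, FULL_WEIGHTS, REPLICATE_WEIGHTS)
```

For this toy mean the result agrees with the usual two-PSU-per-stratum variance estimate, which is the property a balanced set of replicates is meant to deliver for linear statistics.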
Missing values due to item nonresponse will be imputed after the data are edited. Imputation will be performed for items commonly used to define analysis domains, items that are frequently used in crosstabulations, and items needed for weighting. Missing items from HSLS:09 will be imputed using the weighted hot deck method in RTI’s HOTDECK procedure in SUDAAN® (Research Triangle Institute, 2008; Iannacchione, 1982). By incorporating the sampling weights, this method of imputation takes into account the unequal probabilities of selection in the original sample while controlling the expected number of times a particular respondent’s answer will be used as a donor.
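The role of sampling weights in donor selection can be illustrated with the simplified sketch below. It is not the weighted sequential hot deck implemented in SUDAAN: here a donor is simply drawn at random within an imputation class with probability proportional to its weight. The record layout, key names, and data are hypothetical, and each class is assumed to contain at least one donor.

```python
import random
from collections import defaultdict
from typing import Dict, List

def weighted_hot_deck(records: List[Dict], item: str, class_key: str,
                      weight_key: str = "weight", seed: int = 0) -> List[Dict]:
    """Fill missing `item` values from a donor in the same imputation class,
    chosen with probability proportional to the donor's sampling weight."""
    rng = random.Random(seed)
    donors: Dict[str, List[Dict]] = defaultdict(list)
    for r in records:
        if r[item] is not None:
            donors[r[class_key]].append(r)

    completed = []
    for r in records:
        rec = dict(r)                       # copy so the input is not mutated
        if rec[item] is None:
            pool = donors[rec[class_key]]   # assumed non-empty
            donor = rng.choices(pool, weights=[d[weight_key] for d in pool], k=1)[0]
            rec[item] = donor[item]
        completed.append(rec)
    return completed

# Hypothetical records with one missing item per imputation class.
RECORDS = [
    {"id": 1, "cls": "public", "weight": 30.0, "hours": 5},
    {"id": 2, "cls": "public", "weight": 10.0, "hours": 8},
    {"id": 3, "cls": "public", "weight": 20.0, "hours": None},
    {"id": 4, "cls": "private", "weight": 15.0, "hours": 12},
    {"id": 5, "cls": "private", "weight": 15.0, "hours": None},
]
COMPLETED = weighted_hot_deck(RECORDS, item="hours", class_key="cls", seed=7)
```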
Our procedures for maximizing response rates at the institution and respondent levels are based on our successful experience on predecessor and other similar studies. In this section we discuss methods for maximizing response rates for school recruitment as well as for students, parents, and school staff.
School Recruitment. Achieving high school participation rates on voluntary research studies has proven increasingly difficult in recent years. Our experience is that many schools already feel burdened by mandated “high stakes” testing and, at the same time, are hampered by fiscal and staffing constraints. Moreover, we will face roadblocks not only at the school, but also at the district level, where research studies must sometimes comply with stringent requirements to submit formal and detailed applications similar to those one would submit to an IRB before individual schools can even be contacted. The keystone of our plan to work with school districts and schools is to demonstrate the importance of the study while maintaining flexibility in our negotiations with school districts and schools.
Recruitment will commence immediately after the sample is drawn. Sample materials to be sent to states, districts, and schools are provided in appendix A. We will send succinct yet compelling advance materials to the school districts and schools to introduce the study. Within a few days of receipt of the materials, a trained recruiter will contact the school district or school to discuss their participation in the study. Our recruiters are hired for their knowledge, skill, and ability to communicate clearly, and for their proven ability to develop relationships with district and school contacts that will foster participation and persist throughout the in-school follow-ups for the longitudinal study.
As much as possible, we will shift the burden from the school to RTI staff. Possible ways of shifting the burden include scheduling contacts or survey administrations to best fit the school calendar, mailing consent forms to parents from RTI, providing compensation for time/help completing forms, offering a session administrator to come to the school to compile sampling information, and having a session administrator coordinate all aspects of survey day (e.g., posting reminders, processing consents, and gathering students). These options have proven helpful on similar studies to gain cooperation in schools that expressed scheduling, burden, or staffing concerns.
One of the key factors to a successful recruitment period is time. A task force convened in 2004 to help NCES brainstorm ways to improve school response rates in their international studies recommended that all recruitment activities begin at least 1 year prior to the start of data collection.
It is worth noting that our proposed sample design will not cluster schools at the district level. This will mitigate the undesirable situation of losing clusters of schools from sample districts that opt not to participate in this study.
Student. Ensuring a high student response at each school begins several weeks prior to the student session. Session administrators will work closely with the school coordinators to coordinate the logistics of the sessions and notify students about the sessions. Because the sampled students are not selected by classroom and are dispersed across multiple classes, there is a heavy burden on the school coordinator to inform students about the session, distribute parental consent materials, and ensure that the students arrive at the prescribed location at the scheduled date and time. Session administrators will assume as much of this burden as is possible and permissible by the school.
In our experience, ensuring that students are made aware of the session is the most critical aspect of making sure they arrive at the session at the scheduled time. Despite receiving the consent form to take home, students do not necessarily distinguish the form from other materials they take home, and they often forget about the session without frequent reminders. To help remind students about the sessions, we will implement options such as distributing postcard reminders a day or two prior to the session, notifying the teachers of selected students, asking the school coordinator to make an announcement on the PA system, and having the session administrator visit a few days prior to the session and convene a brief meeting of the student sample members to encourage participation. We will collect from each school the parent contact information that will be used to conduct the parent survey. If phone numbers are provided, the session administrator will contact parents a day or two prior to the session so that they can remind the students when to arrive.
Each week, project staff will conduct group strategy calls with the session administrators to discuss the status of the schools with test dates scheduled for the coming 2 weeks. The purpose of these conference calls is to learn about the preparedness of each school for the student session, identify any concerns about anticipated response rate or computer capabilities at the school, provide a forum for brainstorming solutions to anticipated problems, and share success stories and lessons learned from other schools. Project staff will follow up frequently with SAs who report problems or concerns with the preparations for student sessions at particular schools.
Our plans for student incentives were described in section A9.
Parent. We will have several opportunities to interact with parents to encourage their participation in the study. The parental consent form will be sent home with the students several weeks before the student session, and the letter will mention that the parent interview is forthcoming. We will collect parent contact information from the school after the student sample is identified. We will send a letter to the parent via e-mail and Federal Express to initiate the parent interview, providing a URL and credentials for the web instrument and a telephone number that can be used for a telephone interview. If we have a telephone number, the SA will contact the parent to remind him or her of the student session, and will take the opportunity to build a relationship with the parent and encourage participation from both the student and parent. Parents who do not complete the web instrument will be followed up via CATI. In the main study, paper-and-pencil versions of the questionnaire will be available for parents who do not have a telephone or Internet access. The parent interview will be translated into Spanish to accommodate parents with limited or no English proficiency.
There is no precedent for offering an incentive to complete the parent questionnaire. Thus, we have not included a parent incentive in our budget for the HSLS:09.
School Staff (School Administrators, Counselors, Teachers). School staff will receive a letter to initiate their questionnaire about 3 weeks prior to the student session. The session administrator will work with the school coordinator to prompt school staff to complete their interview. While at the school, the SA will prompt for any outstanding staff questionnaires. If the questionnaires still have not been completed by 1 week after the session(s) are complete in the school, we will commence CATI follow-up.
In the main study, teachers will have an option to complete a paper-and-pencil version of the questionnaire. Past experience has demonstrated the need for a teacher-level incentive to achieve high response rates, and many schools have required that teacher compensation be commensurate with their hourly wage. Thus, we have proposed a $25 base teacher incentive.
Most of the procedures and methods to be used in the HSLS:09 base year have already been tested in prior NCES studies. Two-stage testing was first implemented in ELS:2002. All-electronic questionnaire approaches (web supplemented by CATI) have been implemented in the ELS:2002 second follow-up (2006), NPSAS, BPS, and B&B. The major new procedural modification in HSLS:09 is the use of an in-school computer-assisted assessment. The practical, technical, and logistical challenges of this assessment approach were explored in a special pilot test, and the approach was implemented on a larger scale in the field test (both pilot and field test experience are described in the Field Test Report, currently in draft form, which will be shared with OMB upon completion). Procedures for the full-scale study have been refined to reflect the field test experience. Main study operational procedures and results will be documented and analyzed in a methodological chapter to be included as part of the base year Data File Documentation report for data users.
A number of individuals have consulted with NCES and RTI on the sampling design and recruitment plans for HSLS:09. Members of the Technical Review Panel are listed in section A8 of this document. In addition, Dr. Laura LoGerfo, Research Scientist, and Dr. Jeffrey Owings, Associate Commissioner for the Elementary/Secondary and Library Studies Division, at NCES have reviewed and approved the statistical aspects of the study. Other statistical reviewers at NCES include Marilyn Seastrom, Chief Statistician, and the following statistical program staff: John Wirt, Tate Gould, and Michael Ross. Table 12 provides the names of additional consultants on statistical aspects of HSLS:09.
Table 12. Consultants on statistical aspects of HSLS:09
| Name | Affiliation | Telephone | 
| James Chromy | RTI | (919) 541-7019 | 
| Steven J. Ingels | RTI | (202) 974-7834 | 
| Jill A. Dever | RTI | (202) 974-7820 | 
| Peter H. Siegel | RTI | (919) 541-5902 | 
| Daniel J. Pratt | RTI | (919) 541-6615 | 
| John Riccobono | RTI | (919) 541-7006 | 
| Deborah Herget | RTI | (919) 485-7793 | 
| Gary Phillips | AIR | (202) 403-6916 | 
| Steve Leinwand | AIR | (202) 403-6926 | 
References
Folsom, R.E., Potter, F.J., and Williams, S.R. (1987). Notes on a Composite Size Measure for Self-Weighting Samples in Multiple Domains. In Proceedings of the Section on Survey Research Methods (pp. 792–796). American Statistical Association.
Iannacchione, V.G. (1982). Weighted Sequential Hot Deck Imputation Macros. In Proceedings of the Seventh Annual SAS User’s Group International Conference (pp. 759–763). Cary, NC: SAS Institute, Inc.
Keyfitz, N. (1951). Sampling with Probabilities Proportional to Size: Adjustment for Changes in the Probabilities. Journal of the American Statistical Association, 46(253), 105–109.
RTI International. (2008). SUDAAN language manual, release 10.0. Research Triangle Park, NC: Research Triangle Institute.
1 While the full-scale HSLS:09 sample will include only 9th-grade students, the field test sample will include both 9th- and 12th-grade students to anticipate the progression that will be observed when reassessing the sampled 9th-grade students in 2012.
2 Inevitably, there will be inconsistencies between the student counts obtained from the sample schools and those on the CCD/PSS. When the relative magnitude of an observed discrepancy exceeds 25 percent, the case will be examined further. For instance, for public schools this measure will be the absolute value of (List – CCD)/List.
3 Sample augmentation will not be necessary for Hispanic or Black students, since sufficient sample sizes to support analyses by race/ethnicity will be secured for such students as part of the base sample of 20,000 students.
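The enrollment check described in footnote 2 can be sketched as follows. This is an illustrative sketch only, not the study's production procedure: the function names and the example counts are hypothetical, while the measure |List – CCD|/List and the 25 percent threshold are taken from the footnote.

```python
def relative_discrepancy(list_count: int, frame_count: int) -> float:
    """Return |List - frame| / List, the measure given in footnote 2.

    list_count  : student count from the roster supplied by the school
    frame_count : student count on the sampling frame (CCD or PSS)
    """
    if list_count <= 0:
        raise ValueError("list_count must be a positive number of students")
    return abs(list_count - frame_count) / list_count


def needs_review(list_count: int, frame_count: int,
                 threshold: float = 0.25) -> bool:
    """Flag a school when the discrepancy exceeds the threshold."""
    return relative_discrepancy(list_count, frame_count) > threshold


# Hypothetical counts: a roster of 400 students against a frame count of 290.
print(needs_review(400, 290))  # discrepancy is 110/400 = 0.275
```

A discrepancy exactly equal to the threshold is not flagged here, because the footnote refers to discrepancies that exceed 25 percent.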
| File Type | application/msword | 
| Author | Laura F. LoGerfo | 
| Last Modified By | #Administrator | 
| File Modified | 2009-07-15 | 
| File Created | 2009-07-15 |