Study Measures
The study directly employed three types of measures:
- a self-administered staff questionnaire to provide information on the educational background and experience of teachers in the Upgrade classrooms;
- a battery of observation measures, the Observation Measures of Language and Literacy Instruction (OMLIT, Goodson et al., 2004), that focuses on the language and literacy environment of and interactions within the preschool classroom, but also captures a wide range of other activities,11 paired with the Arnett Caregiver Rating Scale (Arnett, 1989), that rates the caregiver’s emotional tone, discipline style, supervision of and interest in children and encouragement of independence; and
- the Test of Preschool Emergent Literacy (TOPEL: Lonigan, Wagner, Torgesen, & Rashotte, 2002), a standardized assessment of the aspects of language development and pre-literacy skills that research has shown to predict later reading success.
We discuss the rationale for the selection of the observational and child assessment measures below.
In addition, center- and classroom-level scores on the LAP-D, a broad diagnostic screening measure applied to four-year-olds receiving subsidies for child care, were provided by the School Readiness Coalition for use as covariates in the analysis.
Classroom Environment Measures
The model tracing the pathway of effects of the language and literacy interventions in the Miami experiment shows that impacts on children depend on prior changes in the children’s experiences in the child care centers. That is, the interventions must first change the center environments as a necessary condition for improving outcomes for the children. Random assignment allows us to attribute treatment-control differences in children’s outcomes to the interventions without knowing anything about the center environments. Nevertheless, the impacts on children will be better understood if we know the extent to which the centers themselves changed. In the worst-case scenario, if we failed to find any impacts on children, it would be important to know whether the lack of impacts resulted from the failure of the interventions to effect significant changes in the centers. Further, in the event that there are child impacts, we wanted to know how these were achieved: what types of changes occurred in the centers, and how much change did it take to translate into benefits for children? Therefore, the design of the study called for measuring treatment-control differences in the center environments, in addition to measuring differences in child outcomes.
Because the purpose of assessing center environments is to identify differences between treatment and control centers that could be logically linked to effects on children, we wanted to use measures that would be sensitive to changes in those aspects of the center care environments that the interventions were hypothesized to modify.12 This required an initial analysis of the expected differences between classrooms using the intervention curricula and the “business-as-usual” classrooms.
Examination of the goals and activities of the three interventions led us to identify the following aspects of the treatment classrooms as central to the changes that should result from implementing any of the three curricula:
- Focused emergent literacy activities
  - Phonological awareness activities (singing, breaking apart words into syllables, language games about alliteration and rhyming)
  - Print knowledge activities (alphabet knowledge, letter-sound correspondence, grammatical rules)
  - Print awareness activities (focus on uses of print, emphasis on reading aloud)
  - Oral language activities (in-depth discussions, conversations, scaffolded language, open-ended questions, exposure to new vocabulary)
  - Writing activities (dictation, invented spelling, journals)
- Reading aloud using dialogic reading methods
- Small group activities involving caregivers and children (individual children, pairs, small groups)
- Integration of print throughout the day and throughout the classroom
  - Authentic print, literacy activities
  - Print-rich classroom environments
- Caregiver engagement with the children in activities outside management/routines.
The OMLIT (Observation Measures of Language and Literacy Instruction) was a new battery developed for the national study of the Even Start Family Literacy Program being conducted by the U.S. Department of Education. The CLIO13 study was also an experimental test of early childhood language and literacy curricula, and, as with the Miami study, CLIO needed measures of classroom process that would be sensitive to the interventions. The CLIO study team also reviewed available measures, including the ELLCO and the ECERS-R, and determined that new measures would have to be developed if measuring effects on classroom process was a priority. The Department of Education supported the development of the OMLIT battery, with the charge that the measure be closely linked to the most up-to-date research on instructional practices shown to predict children’s reading and other academic outcomes in school. The development of the OMLIT took nearly two years and included reliability studies and multiple rounds of piloting in child care centers. In the CLIO study, the OMLIT was administered in the field over three years, with trained observers using the measure in more than 200 classrooms in each year and with inter-observer agreement calculated for each group of observers.
Given the more than adequate reliability of the OMLIT battery (see discussion in Attachment B), its clear link to all of the critical classroom outcomes in the study, and its track record in large-scale applied research, we selected the OMLIT for the Miami study. We considered administering the ECERS-R along with the OMLIT for purposes of comparison with other early childhood studies. However, we judged that the two measures would have to be administered in separate visits to classrooms (i.e., observers could not reliably code both the OMLIT and the ECERS-R simultaneously). The cost of the additional training and of doubling the visits to classrooms was determined to be prohibitive, especially in light of what we believed to be the limited usefulness of the ECERS-R for measuring treatment-control differences (as opposed to characterizing the quality of the child care centers in the Miami sample relative to other samples).
Measures of Child Outcomes
The goal of the Miami-Dade experiment was to improve the language development of the children in the centers, since the first round of county-wide testing had shown that the children receiving child care subsidies scored, on average, at the 30th percentile on the language subscale of the LAP-D. At the same time, children in the Miami-Dade public schools were performing poorly in the high-stakes testing conducted statewide in 3rd grade. Therefore, the School Readiness Coalition (SRC) was interested in testing preschool curricula designed specifically to improve language and early literacy skills, which might lead to improved performance when the children reached 3rd grade.
The SRC planned to continue its own testing of subsidized and other low-income children using the LAP-D.14 The LAP-D, which is administered by staff from the county agency that provides resource and referral services and administers subsidies, requires more than an hour of testing per child. In light of this ongoing county-wide testing program, the SRC was cautious about conducting additional testing of children for the purposes of the experiment. Therefore, the following guidelines had to be met in selecting child outcome measures:
- The testing had to impose as little additional burden as possible on the children (and classrooms), with the goal of less than 30 minutes of testing/child; and
- The testing should focus on outcomes that were not already assessed on the LAP-D.
Further, a high proportion of the children in the study classrooms came from Spanish-speaking homes and varied substantially in their English language skills. Although the curricula were all English language/literacy curricula, all three also provided support for Spanish-speaking children. Therefore, the test battery had to have equivalent Spanish and English language versions (and had to articulate an acceptable policy about language of testing).
From the perspective of the study, other guidelines included:
- The outcome battery needed to be sensitive to the content of the curricula, to increase the chances of detecting impacts;
- The outcome battery needed to use standardized, norm-referenced measures that provided strong scores for multivariate analyses and allowed for comparison to normal development; and
- The outcome battery should assess skills identified in the research to have longer-term significance for children’s academic success.
The study team reviewed the available child assessment measures and consulted with national experts in language development (e.g., Drs. Christopher Lonigan of Florida State University and David Kaplan of the University of Texas). The team also reviewed the measures being used in other national early childhood studies, including the national study of the Even Start Family Literacy Program, the National Head Start Impact study, the National Head Start Reporting System, the PCER (Preschool Curriculum Evaluation Research) studies, and the national evaluation of Early Reading First. Across these studies, one measurement battery was consistently used to assess children’s emergent literacy skills: the TOPEL (Test of Preschool Emergent Literacy),15 which tests three major domains (Phonological Awareness, Print Knowledge, and Definitional Vocabulary). English and Spanish versions of the test were available. In light of the county’s administration of the LAP-D, we recommended that the additional child assessments for the experiment use the TOPEL, since it met all of the study criteria as well as the SRC guidelines, and the recommendation was accepted.

