Blog post based on an article in Journal of Child Language, written by Carla Hudson Kam

It is obvious that language learning in children is affected by input: children exposed to English learn English and children exposed to Mandarin learn Mandarin. But the relationship between the input children receive and what they learn is more specific than that.

For instance, we know that children whose caregivers produce lots of complex sentences produce complex sentences earlier than children who hear fewer of them. Children’s input comes from sources beyond their caregivers’ natural speech, however, and researchers have become increasingly interested in input from these other sources. My co-author and I were interested in the prevalence of a specific type of information in children’s books but needed to know which books we should analyze (if you’re interested in books as input, presumably you should analyze books that children are actually being exposed to).

We conducted a survey of parents and caregivers asking about the English-language books they were reading to their child, to help us select our books for analysis. Although we initially created this dataset for our own research, we quickly realized its broader potential and decided to share it with the wider research community.

The resulting database – which we call the Infant Bookreading Database (or IBDb) – includes responses from 1,107 caregivers of children aged 0-36 months, who answered questions about the five books they were reading to their child most often at the time. The dataset also includes demographic and language development information, so the data can be analyzed separately for children of different ages, genders, or language skill levels, or by the caregiver’s age, gender, or education level.

There were 2,227 unique titles listed by caregivers, and 1,617 of them were identifiable (meaning we could figure out exactly which book the respondent was referring to). One of the most striking things about the identifiable titles is how much variation there is in which books children are hearing. Only one book was listed by more than 200 respondents (Goodnight Moon by M. Wise Brown), two were listed more than 100 but fewer than 200 times (The Very Hungry Caterpillar by E. Carle and Brown Bear, Brown Bear, What Do You See? by B. Martin and E. Carle), and only four books were listed 50-99 times (so each by at least 4.5% of our sample). The overwhelming majority of titles were listed by only one or two respondents. So while a small number of fairly popular books show up again and again in our data, most books are being read to only one or two children in our sample.

The IBDb is available for download for use by researchers (and anyone else who is interested) at linguistics.ubc.ca/ubc-ibdb/.

