A test of verbatim memory

Matt Wagers

At my Stevenson Winter Faculty lecture of 9 Feb 2011, I performed an informal test of verbatim memory (word-for-word memory) with two test utterances, both of which were contained in my talk. Some studies (Sachs, 1967; Jarvella, 1971) suggest that verbatim memory is quite poor once any other sentences intervene. Others (Bates, Masling & Kintsch, 1978, i.a.) suggest that in a more naturalistic setting, verbatim memory is preserved.

There were 48 respondents overall: 24 responded to utterance #1, 24 responded to utterance #2. Although I did not measure the time elapsed between utterance time and test, I can estimate: there were approximately 1000 words of text in my script between utterance and test. There were approximately 1.6 syllables per word. So let's estimate 1600 syllables. Already that is an order of magnitude higher than Sachs (1967), who reported results for 160 syllables. Perhaps 5-7 minutes elapsed, assuming a relatively fast speech rate with few pauses.
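The back-of-the-envelope estimate above can be written out as a short Python sketch. All inputs are the approximate figures quoted in the text; the implied speech rates are only a back-calculation from the 5-7 minute guess, not a measurement.

```python
# Approximate figures quoted in the text
words = 1000               # words of script between utterance and test
syllables_per_word = 1.6   # rough average
sachs_syllables = 160      # the figure cited from Sachs (1967)

syllables = words * syllables_per_word
ratio = syllables / sachs_syllables   # how much longer than Sachs's delay

print(f"estimated syllables: {syllables:.0f}")
print(f"ratio to Sachs (1967): {ratio:.0f}x")

# Speech rate implied by each end of the 5-7 minute guess
for minutes in (5, 7):
    rate = syllables / (minutes * 60)
    print(f"{minutes} min -> {rate:.1f} syllables per second")
```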

We can use the d' measure of sensitivity as a measure of memory for surface form.

For each utterance,
perfect performance would be 3.5 (approximately);
chance performance would be 0.
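As a minimal sketch of where those two reference points could come from, the Python below computes d' from response counts with 12 respondents per version. The 1/(2N) adjustment for rates of exactly 0 or 1 is an assumption on my part (the text does not say which correction was used), and the function name d_prime is purely illustrative.

```python
from statistics import NormalDist

def d_prime(hits, n_targets, false_alarms, n_lures):
    """d' = z(hit rate) - z(false-alarm rate)."""
    def rate(k, n):
        # Assumed correction: a rate of 0 becomes 1/(2n) and a rate of 1
        # becomes 1 - 1/(2n), so that z() stays finite.
        return min(max(k / n, 1 / (2 * n)), 1 - 1 / (2 * n))
    z = NormalDist().inv_cdf   # inverse of the standard normal CDF
    return z(rate(hits, n_targets)) - z(rate(false_alarms, n_lures))

# 12 respondents saw the Target and 12 saw the Lure
ceiling = d_prime(12, 12, 0, 12)   # everyone correct
chance = d_prime(6, 12, 6, 12)     # 'yes' equally likely to Target and Lure

print(f"ceiling d' = {ceiling:.2f}")
print(f"chance d'  = {chance:.2f}")
```

Under this assumed correction the ceiling comes out a little under 3.5, which is in line with the approximate value given above.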

As a point of comparison, I estimated that Sachs's (1967) results showed d' values ranging from 0.25 to 0.50 for the syntactic conditions.

Test utterance no. 1

Target: Rarely can we recall our linguistic exchanges verbatim
Lure: We can rarely recall our linguistic exchanges verbatim

χ² = 6.4, df = 1, p < 0.05

  1. p(Hit) = Probability of saying 'yes' to the Target = 8 / 12 = 0.67
  2. p(False alarm) = Probability of saying 'yes' to the Lure = 1 / 12 = 0.08
  3. d' = z(p(Hit)) - z(p(False alarm)) = 1.84
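The three steps above can be checked with a few lines of Python. This is only a sketch: z() is taken to be the inverse of the standard normal cumulative distribution function, and the comparison between rounded and unrounded proportions is my own addition. The reported 1.84 appears to correspond to the proportions rounded to two decimals; the unrounded proportions give a slightly lower value, about 1.81.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf   # inverse standard normal CDF

p_hit = 8 / 12   # 'yes' responses to the Target
p_fa = 1 / 12    # 'yes' responses to the Lure

# d' from the unrounded proportions
d_exact = z(p_hit) - z(p_fa)

# d' from the proportions rounded to two decimals (0.67 and 0.08)
d_rounded = z(round(p_hit, 2)) - z(round(p_fa, 2))

print(f"unrounded proportions: d' = {d_exact:.3f}")
print(f"rounded proportions:   d' = {d_rounded:.3f}")
```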

Interim conclusion #1: Surface memory was modestly preserved for this utterance.

Its d' score was 1.84

Test utterance no. 2

Target: It's the ideas that we care about
Lure: It's the ideas we care about

χ² = 4.2, df = 1, p < 0.05

Interim conclusion #2: Surface memory was also modestly preserved for this utterance.

Its d' score was 1.35

General conclusions

There was good evidence for preservation of surface memory for these two items. There was some conservatism (a bias to say 'no') in the responses to the target of utterance #1. The response pattern to utterance #1 ("rarely can we ...") is consistent with theories that marked constructions should be more memorable (or be associated with greater familiarity). Preposing the adverb 'rarely' and inverting the subject phrase and auxiliary is rare, and the construction is also restricted in the discourse contexts in which it can occur. This could also explain why nearly everyone rejected the unmarked version: rejecting requires only some evidence that you didn't hear something (i.e., low familiarity), whereas accepting requires a threshold quantity of evidence that you did hear it.

Target utterance #2 was marked in at least one sense: given its position in the text, where it was delivered "as a punchline", it is somewhat odd prosodically to have pronounced the 'that'. However, unlike in #1, target and lure in #2 make the same contribution to the discourse.

Could these results have been obtained by chance? Given that each test went to only 24 people, how likely is it that the results would have been as far from a 50/50 outcome as they were? We performed Pearson's chi-squared test to answer that question. The statistics in small type under each table show that in neither case are those outcomes very likely to occur by chance.
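For readers who want to retrace the test, here is a sketch in Python for utterance #1, using only the standard library. The 2x2 counts are derived from the proportions reported above (8 of 12 'yes' to the Target, 1 of 12 'yes' to the Lure). Applying Yates' continuity correction is an assumption: the text only says "Pearson's chi-squared test", but the corrected statistic is the one that reproduces the reported 6.4. The function names are illustrative.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d, yates=True):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction (assumed, see the note above)
        diff = max(diff - n / 2, 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability for a chi-squared variable with df = 1."""
    return erfc(sqrt(chi2 / 2))

# Utterance #1: rows are Target and Lure, columns are 'yes' and 'no'
corrected = chi2_2x2(8, 4, 1, 11)
uncorrected = chi2_2x2(8, 4, 1, 11, yates=False)

print(f"corrected:   chi2 = {corrected:.2f}, p = {p_value_df1(corrected):.3f}")
print(f"uncorrected: chi2 = {uncorrected:.2f}, p = {p_value_df1(uncorrected):.3f}")
```

With the correction the statistic is 6.4, matching the value reported for utterance #1; without it the statistic is about 8.7. Either way p falls below 0.05.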

A big thanks to Mark Norris & Sylvia Soule for their help in collecting and tabulating the responses.