Tuesday, February 21, 2012

GoFuckYoMama, WSJ!

My Sister Eileen

Everything you always wanted to know about my sister Eileen and more

Lexical Bias in Cross-Dialect Word Recognition in Noise
Cynthia G. Clopper, Ohio State University
Janet B. Pierrehumbert, Northwestern University
Terrin N. Tamati, Indiana University
Abstract
Lexical bias is a well-known factor affecting phonological categorization in spoken word
recognition. The current study examined the interaction between lexical bias and dialect variation
in spoken word recognition in noise. The stimulus materials were real English words in two
regional American English dialects. To manipulate lexical bias, target words in the word-competitor
condition were selected so that predicted phonological confusions across dialects
resulted in real English words. The target words in the nonword-competitor condition were
selected so that predicted phonological confusions did not result in real English words. Word and
vowel recognition performance were more accurate in the nonword-competitor condition than the
word-competitor condition for both talker dialects. An examination of the responses to specific
vowels revealed the role of dialect variation in eliciting this effect. For example, in the word-competitor
condition, Northern /ɛ/ was misidentified as /æ/ significantly more often than chance,
consistent with the Northern backing and lowering of /ɛ/. Thus, when the predicted phonological
confusions were real words, listeners could respond with either the target word or the confusable
minimal pair. When the predicted phonological confusions were not real words, the listeners
exhibited a lexical bias and responded with the target word.
1. Introduction
Lexical bias is a well-known factor affecting phonological categorization in spoken word
recognition tasks. Ganong (1980) observed that listeners shifted the location of their phonetic
category boundary for /t/ vs. /d/ depending on the lexical status of the endpoint stimulus items.
For the task vs. dask continuum, listeners responded /t/ more frequently than /d/, but for the tash
vs. dash continuum, listeners responded /d/ more frequently than /t/. The difference between /t/
and /d/ responses for the two sets of stimulus materials was greatest in the middle, most
phonologically ambiguous, part of the continua. The “Ganong effect,” therefore, reflects an
advantage for real word responses over nonword responses in lexical identification tasks with
ambiguous stimulus materials. This lexical bias effect can be interpreted as the result of the
robustness of the human speech recognition system to acoustic variability. When presented with
ambiguous acoustic input, the speech recognition system is biased to match the input to an
existing entry in the lexicon. However, in situations of dialect contact, this mechanism for error
correction could lead to cross-dialect intelligibility errors. An utterance in one dialect may be
acoustically ambiguous in another, leading the listener to select the wrong lexical item in
recognition. Thus, the lexical bias effect may produce cross-dialect errors in spoken word
recognition.
Previous studies demonstrating a lexical bias in spoken word recognition tasks have used
speech synthesis techniques to create ambiguous stimulus materials, along continua such as /t/ -
/d/, /b/ – /d/, or /g/ – /k/ (Fox, 1984; Ganong, 1980; Pitt, 1995). However, it is well known that the
perception and processing of synthetic speech stimulus materials can differ substantially from the
perception and processing of natural speech (Winters & Pisoni, 2005). Thus, one goal of the
current study was to examine the extent to which naturally-occurring sources of variability in the
speech signal, such as regional dialect, can lead to phonological ambiguity and produce lexical
bias effects in spoken word recognition.
In American English, vowel differences across dialects involve mostly phonetic shifts in the
vowel space. These phonetic vowel shifts could lead to phonologically ambiguous stimulus
materials. Rakerd and Plichta (2003) found that listeners shifted their phonetic category
boundary for a synthetic continuum from /ɑ/ to /æ/ depending on the dialect of the preceding
carrier phrase. For carrier phrases with fronted /ɑ/s, the perceived category boundary was closer
to /æ/, whereas for carrier phrases with backed /ɑ/s, the boundary was closer to /ɑ/ (cf. Kraljic,
Brennan, & Samuel, 2008; Stevens, McQueen, & Hartsuiker, 2007, where isolated words did not
elicit phonetic category boundary shifts). While this finding is based on the perception of
synthetically produced stimulus materials, it suggests that listeners are sensitive to phonetic
vowel differences across dialects and that dialect variation can affect phonological processing.
Naturally-occurring dialect variation has also been shown to affect processing in speech
intelligibility tasks. Clopper and Bradlow (in press) observed significant effects of dialect
familiarity on sentence transcription performance in noise. Listeners more accurately transcribed
sentences produced in a familiar dialect than in an unfamiliar dialect (see also Mason, 1946).
Similarly, Adank and McQueen (2007) and Labov and Ash (1997) found that listeners exhibited
faster and more accurate word comprehension performance for words produced in a familiar
dialect than in an unfamiliar dialect. The results of these cross-dialect speech intelligibility tasks
suggest that the phonetic vowel shifts in unfamiliar dialects lead to phonological confusions,
which, in turn, lead to slower and less accurate lexical processing. This interpretation is further
supported by the results of lexical decision tasks across dialects. Floccia, Goslin, Girard, and
Konopczynski (2006) reported that lexical decision performance was slower and less accurate for
unfamiliar dialects than familiar dialects, suggesting that phonological ambiguity due to
unfamiliar vowel shifts also leads to slower and less accurate lexical decision judgments.
This hypothesis that phonetic dialect variation leads to phonological ambiguity is only
indirectly supported by the previous research examining the effects of dialect familiarity on
speech intelligibility and lexical decision performance. In particular, the differences in
performance between familiar and unfamiliar dialects are attributed to phonetic processing deficits
for the less familiar variants in the unfamiliar dialects. Thus, the second goal of the current study
was to more directly test the hypothesis that phonetic dialect variation leads to phonological
confusions. To determine how phonetic vowel variation across dialects affects phonological
classification, the error patterns produced by listeners in a word recognition task in noise were
examined to identify consistent phonological vowel confusions within and across dialects. If the
interpretation of the effects of dialect familiarity on speech intelligibility and lexical decision
performance is correct, cross-dialect phonological confusions should be predictable from existing
phonetic descriptions of the different dialects. Vowels that belong to different phonological
categories in the two dialects, but that are phonetically similar to each other, should be
perceptually confusable (see Best, 1995).
To examine the interaction between lexical bias and naturally-occurring dialect variation, a
set of predicted phonological confusions was established for vowels in two dialects of American
English (Midland and North) based on previous descriptions of regional dialect variation in the
United States (e.g., Labov, Ash, & Boberg, 2006). The error patterns produced by the listeners
were compared across two conditions in the word recognition task. In one condition, the target
words were selected so that the predicted phonological confusions resulted in real English words
(the word-competitor condition). In the other condition, the target words were selected so that the
predicted phonological confusions did not result in real English words (the nonword-competitor
condition). If naturally-occurring sources of variability, such as regional dialect, can produce
lexical bias effects, more predicted phonological confusions should be observed in the word-competitor
condition, when an alternative real English word was available to the listeners as a
possible response, than in the nonword-competitor condition, when an alternative real English
word was not available as a possible response.
2. Methods
2.1. Listeners
Fifty-five Ohio State University undergraduates were recruited to participate in the word
recognition experiment. Data from sixteen of the participants were excluded from the analysis:
seven participants were bilingual, four participants reported a history of a hearing or speech
disorder, two participants were substantially older than the other participants, and three sets of
data were not recorded due to a computer error. The remaining 39 listeners were all monolingual
native speakers of American English between the ages of 17 and 28 years old with no reported
history of hearing or speech disorders. The residential histories of the listeners varied. Twenty of
the listeners were lifetime residents of the Midland dialect region, one listener was a lifetime
resident of the New England dialect region, nine listeners were lifetime residents of the Northern
dialect region, and the remaining nine listeners had lived in more than one dialect region before
age 18. Nineteen of the listeners participated in the word-competitor condition. The other 20
listeners participated in the nonword-competitor condition. All of the listeners received partial
course credit in an introductory linguistics course for their participation.
2.2. Predicted Phonological Confusions
Two regional dialects of American English, Midland and North, were selected for the current
study. Both dialects are spoken in the state of Ohio and are, therefore, highly familiar to the Ohio
State University undergraduates who participated as listeners. A map depicting the two dialects
is shown in Figure 1. The state of Ohio is indicated by a star in the figure. Schematics of the
vowel systems of the Midland and Northern dialects are shown in Figure 2. The Midland dialect
is characterized by fronting of the back vowels /u/ and /ow/ and a merger of the low back vowels
// and // (Labov et al., 2006). The Northern dialect is characterized by the Northern Cities
Chain Shift, which includes the fronting and raising of /æ/, fronting and lowering of // and //,
backing and/or lowering of // and //, and backing of // (Labov et al., 2006). An initial acoustic
analysis of the stimulus materials also suggested lowering of // in the Northern dialect (see also
Labov et al., 2006).
Based on this characterization of these two dialects, a set of predicted phonological
confusions was established. The baseline for comparison for these predicted confusions was
assumed to be the “standard” variety of American English, depicted in Figure 2 by the location of
the phonetic vowel symbols. This strong assumption is based on the results of previous cross-dialect
speech perception studies, which suggest that listeners use the standard variety as their perceptual
filter in speech processing tasks. Clopper and Bradlow (in press) and Floccia et al. (2006)
observed a processing benefit for the standard variety in their sentence transcription and lexical
decision tasks, respectively. In addition, Labov and Ash (1997) found that even native speakers
of Southern American English could only identify extremely Southern-shifted vowels with
moderate accuracy. Thus, the predicted phonological confusions in the current study were based
on a comparison to the standard description of the American English vowel system.
[Insert Figure 1 about here]
[Insert Figure 2 about here]
A summary of the predicted phonological vowel confusions is shown in Table 1. For the
Midland dialect, // was predicted to be confusable with // and /ow/ was predicted to be
confusable with //. While /u/ is fronted in the Midland dialect, it is typically not fronted enough
to be confusable with /i/ or //. Thus, no confusions were predicted for the Midland fronted /u/.
For the Northern dialect, // was predicted to be confusable with //, // with /æ/, /æ/ with //, //
with /æ/, // with //, // with //, and // with //. Note that the predicted phonological
confusions were based on comparisons of only first and second formant frequencies and that
vowel duration and formant trajectory were not taken into consideration. Thus, these predictions
are based on a very rough assessment of phonetic similarity across vowel categories.
[Insert Table 1 about here]
2.3. Talkers
Six female talkers were selected for the word recognition experiment from the Indiana Speech
Project Corpus (Clopper et al., 2002). The talkers were all monolingual native speakers of
American English and ranged in age from 19 to 21 years old at the time of recording. Three of
the talkers were classified as representatives of the Northern dialect. They had lived exclusively
in the Northern dialect region until age 18 and both parents of each of the Northern talkers were
also raised in the Northern dialect region. The other three talkers were classified as
representatives of the Midland dialect. They had lived exclusively in the Midland dialect region
until age 18 and both parents of each of the Midland talkers were also raised in the Midland
dialect region.
2.4. Stimulus Materials
Fifty-five real English monosyllabic words were selected as stimulus materials for the word-competitor
condition. Five different words were selected for each of 11 vowels (i, ɪ, ej, ɛ, æ, ɑ, ɔ,
ʌ, ow, ʊ, u). For the eight vowels for which phonological confusions were predicted based on the
descriptions of the Midland and Northern dialects (ɪ, ɛ, æ, ɑ, ɔ, ʌ, ow, ʊ), the target words were
selected so that the predicted phonological confusions resulted in real English words. For
example, all of the selected words containing /ɛ/ had a real minimal pair competitor in English
containing /æ/ (e.g., bet and bat). The competitor words did not appear in the stimulus list.
Target words were selected to control for mean lexical frequency and familiarity across vowels
and between target words and their phonological competitors. Target and competitor words were
not significantly different in mean lexical frequency or familiarity, except for target // words,
which were less frequent and less familiar on average than their // competitors, given the
relatively small number of CC words in English and the relatively low familiarity and frequency
of those words. Overall, the target words were highly familiar (mean on a 7-point scale = 6.94;
Nusbaum et al., 1984) and moderately frequent (mean log frequency = 2.38).
Forty additional real English monosyllabic words were selected for the nonword-competitor
condition. For the three vowels for which phonological confusions were not predicted across the
two dialects (i, ej, u), the same words were used in the nonword-competitor condition and the
word-competitor condition. For the eight vowels for which phonological confusions were
predicted across the two dialects, five new words containing each vowel were selected for the
nonword-competitor condition so that the predicted phonological confusions did not result in real
English words. For example, none of the selected words containing /ɛ/ had a real minimal
pair competitor in English containing /æ/ (e.g., chess and *chass). As in the word-competitor
condition, the target words in the nonword-competitor condition were highly familiar (mean on a
7-point scale = 6.92) and moderately frequent (mean log frequency = 2.15). Independent sample
t-tests confirmed that the stimulus materials in the two conditions did not differ significantly in
terms of lexical frequency or familiarity.
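The comparison of the two stimulus sets can be sketched as follows. This is only an illustration: the text does not say whether the t-tests assumed equal variances, so the pooled-variance (Student) form is an assumption, the function name `pooled_t` is hypothetical, and the ratings below are invented placeholders rather than the study's data.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom). The equal-variance form is an
    assumption; the paper does not state which variant was used.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # unbiased sample variance
    var_b = statistics.variance(sample_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, n_a + n_b - 2

# Invented familiarity ratings on a 7-point scale (placeholders only).
word_condition = [7.0, 6.9, 7.0, 6.8, 7.0]
nonword_condition = [6.9, 7.0, 6.8, 7.0, 6.9]

t_value, df = pooled_t(word_condition, nonword_condition)
print(f"t({df}) = {t_value:.3f}")
```

The resulting statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a p-value.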
The stimulus materials were segmented into individual digital sound files at a sampling rate
of 22,050 Hz with 16-bit resolution. The sound files were mixed with speech-shaped white noise
at a signal-to-noise ratio of +2 dB for presentation to the listeners.
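Mixing at a fixed signal-to-noise ratio amounts to scaling the noise so that the ratio of speech power to noise power matches the target value in decibels. The sketch below shows one way this could be done; the function names are hypothetical, the sine tone and Gaussian noise are stand-ins for the recorded words and the speech-shaped noise, and nothing here is taken from the authors' actual processing scripts.

```python
import math
import random

def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db` (in dB),
    then add it to `speech`. Returns (mixture, scaled_noise)."""
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled

# Stand-in signals: one second at the sampling rate reported in the text.
SAMPLE_RATE = 22050
rng = random.Random(0)
speech = [0.5 * math.sin(2 * math.pi * 220 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
noise = [rng.gauss(0.0, 1.0) for _ in range(SAMPLE_RATE)]  # plain white noise, not speech-shaped

mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=2.0)
achieved_snr_db = 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))
print(round(achieved_snr_db, 2))
```

Computing power over the whole file is a simplification; a real pipeline might measure power over the speech portion only.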
2.5. Procedure
The listeners were seated at personal computers equipped with headphones and a keyboard. In
each of the two conditions, the listeners were presented with each of the 55 target words produced
by each of the six talkers, for a total of 330 trials.1 They were instructed to listen to each word
and to type the word that they heard onto the computer screen. They were permitted to listen to
each stimulus item exactly one time before responding and were asked to make their best guess
on every trial. The stimulus materials were presented in a different fully randomized order for
each listener.
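The trial structure described above (every target word crossed with every talker, presented in a fresh random order for each listener) can be sketched as follows. The word and talker labels and the use of a per-listener seed are illustrative assumptions, not details reported in the text.

```python
import itertools
import random

N_WORDS = 55    # target words per condition (from the text)
N_TALKERS = 6   # three Northern and three Midland talkers (from the text)

def build_trial_list(words, talkers, seed):
    """Cross every word with every talker and shuffle the result.

    A separate seed per listener gives each listener a different
    fully randomized order (the seeding scheme is an assumption).
    """
    trials = list(itertools.product(words, talkers))
    random.Random(seed).shuffle(trials)
    return trials

# Placeholder labels; the actual word list is not reproduced here.
words = [f"word_{i:02d}" for i in range(N_WORDS)]
talkers = [f"talker_{i}" for i in range(N_TALKERS)]

trials_listener_1 = build_trial_list(words, talkers, seed=1)
trials_listener_2 = build_trial_list(words, talkers, seed=2)
print(len(trials_listener_1))  # 55 words x 6 talkers = 330 trials
```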
The responses were scored for both correct word and correct vowel. For example, the
response code to the target word code was scored as correct for both word and vowel, whereas the
response coat for the target word code was scored as incorrect for word but correct for
vowel. Homophones (e.g., need for knead) and obvious typographical and spelling errors were
corrected (e.g., yaght for yacht). Multisyllabic words, nonsense words, heteronyms (e.g., read),
and words with multiple possible pronunciations (e.g., mauve) were scored as incorrect.
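The scoring rules above can be sketched as a small function. The mini-lexicon, the ASCII vowel labels, and the lookup tables are hypothetical stand-ins for whatever resources the scoring actually relied on; the sketch also assumes that any response missing from the lexicon is a nonsense or multisyllabic word.

```python
# Hypothetical mini-lexicon mapping monosyllabic words to ASCII vowel labels.
VOWEL_OF = {
    "code": "ow", "coat": "ow", "knead": "i",
    "yacht": "a", "bet": "E", "bat": "ae",
}
HOMOPHONES = {"need": "knead"}        # response spelling -> stimulus spelling
SPELLING_FIXES = {"yaght": "yacht"}   # obvious misspellings to correct
EXCLUDED = {"read", "mauve"}          # heteronyms and multi-pronunciation words

def score_response(target, response):
    """Return (word_correct, vowel_correct) for one typed response."""
    response = response.strip().lower()
    response = SPELLING_FIXES.get(response, response)
    response = HOMOPHONES.get(response, response)
    # Excluded words and anything outside the lexicon (nonsense or
    # multisyllabic responses) count as incorrect on both measures.
    if response in EXCLUDED or response not in VOWEL_OF:
        return (False, False)
    word_correct = response == target
    vowel_correct = VOWEL_OF[response] == VOWEL_OF[target]
    return (word_correct, vowel_correct)

print(score_response("code", "code"))
print(score_response("code", "coat"))
```

Under these assumptions, `code` for target `code` is correct on both measures, while `coat` for target `code` is incorrect for word but correct for vowel, matching the example in the text.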
3. Results
Average word and vowel accuracy scores for each talker dialect in each condition are shown in
Table 2. At the word level, performance was more accurate for the Northern talkers than the
Midland talkers across both lexical conditions. In addition, word recognition performance was
more accurate in the nonword-competitor condition than the word-competitor condition for both
talker dialects. A repeated-measures ANOVA on word accuracy with talker dialect as a within-subject
factor and lexical condition as a between-subject factor confirmed significant main effects
of talker dialect (F(1, 37) = 16.2, p < .001) and lexical condition (F(1, 37) = 16.7, p < .001). The
interaction was not significant.
[Insert Table 2 about here]
At the vowel level, performance did not differ across the two talker dialects in either
condition. However, vowel recognition performance was more accurate in the nonword-competitor
condition than the word-competitor condition for both talker dialects. A repeated-measures
ANOVA on vowel accuracy with talker dialect as a within-subject factor and lexical
condition as a between-subject factor confirmed a significant main effect of lexical condition
(F(1, 37) = 28.8, p < .001). The main effect of talker dialect and the interaction were not
significant.
The significant effect of talker dialect on word recognition performance, but not vowel
recognition performance, suggests that while the Northern talkers were more intelligible overall,
this intelligibility benefit may have been limited to consonants. The significant effect of lexical
condition on both word and vowel recognition performance is consistent with the lexical bias
effect. When the predicted phonological confusions were real words, listeners could respond
with either the target word or the confusable minimal pair competitor. When the predicted
phonological confusions were not real words, however, the listeners exhibited a lexical bias and
responded with the target word. This interpretation is further supported by an analysis comparing
the responses to the words for which no phonological confusions were predicted and which were
shared across the two conditions (i, ej, u) to the responses to the words for which phonological
confusions were predicted and that differed across the two conditions (ɪ, ɛ, æ, ɑ, ɔ, ʌ, ow, ʊ). At
both the word and vowel levels, performance did not differ between the word- and nonword-competitor
conditions for the target words for which no phonological confusions were predicted.
However, performance was significantly better in the nonword-competitor condition than the
word-competitor condition at both the word and vowel levels (t(37) = -4.9, p < .001 for words;
t(37) = -6.1, p < .001 for vowels) for the target words for which phonological confusions were
predicted. One additional source of evidence for the lexical bias effect in the current study is that
the listeners in the two conditions produced on average the same number of nonword responses in
the task. Thus, in the nonword-competitor condition, the listeners did not produce more nonword
competitors as responses, but instead produced more target words as responses. Taken together,
the analyses of the response patterns in the two competitor conditions suggest a lexical bias in the
word recognition task that resulted in better overall performance in the nonword-competitor
condition than in the word-competitor condition.
An examination of the responses by individual vowel further confirmed the lexical bias effect
and also revealed the role of dialect variation in eliciting this effect in natural speech. For each
lexical condition, a stimulus-response vowel confusion matrix was constructed based on the
target and response vowels for each listener. Vowel error patterns that were significantly
different from chance were identified using a binomial distribution analysis. Table 3 summarizes
the significant error patterns (p < .05) observed in each of the two lexical conditions for each of
the two talker dialects.
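A binomial analysis of a confusion matrix can be sketched as below. The text does not report the chance probability that was used, so the value of 1/10 (ten equally likely incorrect vowel categories out of eleven) is an assumption, as are the function names, the ASCII vowel labels, and the invented counts.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def significant_confusions(matrix, chance, alpha=0.05):
    """Flag off-diagonal cells whose counts exceed chance.

    `matrix` maps each target vowel to a dict of response vowel -> count.
    """
    flagged = []
    for target, row in matrix.items():
        n = sum(row.values())  # total responses to this target vowel
        for response, k in row.items():
            if response != target and binomial_upper_tail(k, n, chance) < alpha:
                flagged.append((target, response))
    return flagged

# Invented counts for two target vowels (placeholders, not the study's data).
matrix = {
    "E": {"E": 70, "ae": 25, "I": 5},
    "i": {"i": 97, "I": 3, "E": 0},
}
CHANCE = 1 / 10  # assumption: ten equally likely incorrect vowel categories

flagged = significant_confusions(matrix, CHANCE)
print(flagged)
```

With these invented counts only the "E" to "ae" cell is flagged, because 25 of 100 responses is far above what a chance rate of 0.1 would produce.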
[Insert Table 3 about here]
In the word-competitor condition, we observed a number of the predicted phonological
confusions, including    and ow   for the Midland talkers and   ,   æ,   , and 
  for the Northern talkers, suggesting that dialect variation can lead to predictable
phonological confusions between vowels. We also observed a number of unpredicted confusions,
including ɑ → ʌ, ɔ → ʌ, and ow → u for the Midland talkers and ɑ → ʌ and ow → ʊ for the
Northern talkers. The ɑ → ʌ confusions for both the Midland and Northern talkers were
overwhelmingly the result of cop → cup confusions. The Midland ɔ → ʌ confusions were
primarily the result of caught → cut and hawk → huck confusions. These confusions between the
low back vowels /ɑ, ɔ, ʌ/ are consistent with the results of previous research on the perception of
American English vowels (e.g., Hillenbrand, Getty, Clark, & Wheeler, 1995; Peterson & Barney,
1952), which also revealed more perceptual confusions among these vowels than for vowels in
other parts of the vowel space. The ow → ʊ confusions for both the Midland and Northern
talkers were overwhelmingly the result of pole → pull and bowl → bull confusions. For many
Ohioans, /ow/ and /ʊ/ are merged in pre-lateral position, which may account for the high error
rate for these two target words. The ow → u confusion for the Midland talkers, however, appears
to reflect the /ow/ fronting that is characteristic of the Midland dialect.
Finally, two of the predicted confusions for the Northern talkers were not observed: æ  
and   æ. An examination of the individual target words revealed that the æ   confusion
was observed for the Northern talkers significantly more often than chance for one of the five
target words (bad), and that the   æ confusion was also observed for the Northern talkers
significantly more often than chance for one of the five target words (sock). The remaining four
target words for /æ, / did not exhibit the predicted confusions. An acoustic analysis of the
stimulus materials in the current study suggests that duration or formant trajectory differences
between [æ] and [] may explain the lack of æ   confusability and that second formant
frequency differences between [] and [æ] may explain the lack of   æ confusability. In the
stimulus materials in the word-competitor condition, the Northern [æ] was too long or too
diphthongal to be confusable with // (see also Hillenbrand et al., 1995) and the Northern [] was
not fronted enough to be confusable with /æ/.
In the nonword-competitor condition, fewer significant vowel confusions were observed
overall. Two of the significant vowel confusions in the nonword-competitor condition were also
observed in the word-competitor condition:    for the Midland talkers and    for the
Northern talkers. The ɑ → ʌ confusions for the Midland talkers were primarily the result of dock
→ duck confusions. As noted above, the perceptual confusions among the low back vowels /ɑ, ɔ,
ʌ/ may reflect more general perceptual confusions in that part of the vowel space. The remaining
three significant vowel confusions in the nonword-competitor condition were not observed in the
word-competitor condition:    for the Midland talkers, and   æ and   j for the
Northern talkers. The   æ confusion for the Northern talkers was predicted based on the
previous descriptions of the two dialects. However, the   j confusion for the Northern talkers
and the    confusion for the Midland talkers were not predicted. An acoustic analysis of the
stimulus materials revealed that // is raised for the Midland talkers, which may account for the 
  confusion for the Midland talkers (see also Durian, Dodsworth, & Schumacher, to appear).
Similarly, the fronted [] produced by the Northern talkers may be similar to the nucleus of /j/,
resulting in the   j confusion for the Northern talkers. In addition, in the stimulus materials
for the nonword-competitor condition, the Northern [] was more fronted and longer in duration
than in the word-competitor condition, which may account for the significant confusions between
// and /æ/ and between // and /j/ in the nonword-competitor condition, but not the word-competitor
condition. Similarly, while the Midland [] was raised in the stimulus materials in
both the word-competitor and nonword-competitor conditions, it was shorter in duration in the
nonword-competitor condition. Thus, the Midland [] may have been too long to be confusable
with // in the word-competitor condition, but not in the nonword-competitor condition.
Taken together, the results of the analysis of the vowel error patterns revealed more predicted
phonological confusions in the word-competitor condition, when an alternative real English word
was available to the listeners as a possible response, than in the nonword-competitor condition,
when a real English word was not available as a possible response. However, vowel confusion
patterns were also affected by other factors, such as overall perceptual confusability (e.g., /, ,
/), phonetic context (e.g., pre-lateral position), and acoustic similarity in terms of duration (e.g.,
/, æ/) and formant frequencies (e.g., /, /).
3.1. Effects of Listener Dialect
Given the uneven distribution of listeners by residential history, and the relatively small number
of participants from the Northern dialect region or with mobile backgrounds, listener dialect
background was not included as a factor in the statistical analyses of the results reported above.
Previous research on dialect classification (Clopper & Pisoni, 2006), dialect intelligibility
(Clopper & Bradlow, in press), and cross-dialect lexical recognition memory (Tamati, 2008)
using similar stimulus materials and similar listener populations has revealed only very minor
effects of listener background on dialect perception. The effects of listener dialect in the word
recognition task are therefore presented here without statistical analysis and are interpreted
somewhat cautiously.
Average word and vowel accuracy scores for each talker dialect in each condition for each
listener group are shown in Table 4. The General American listener group consisted of 10
lifetime Midland residents and 1 lifetime New England resident in the word-competitor condition
and 10 lifetime Midland residents in the nonword-competitor condition (see Clopper & Bradlow,
in press; Labov, 1998). The Northern listener group consisted of 3 lifetime Northern residents in
the word-competitor condition and 6 lifetime Northern residents in the nonword-competitor
condition. The mobile listener group consisted of 5 listeners in the word-competitor condition
and 4 listeners in the nonword-competitor condition who had lived in more than one dialect
region before the age of 18 years old. In the word-competitor condition, the Northern listeners
performed somewhat better overall than the General American listeners, who performed
somewhat better overall than the mobile listeners. In the nonword-competitor condition, the
mobile listeners performed best overall, followed by the Northern and the General American
listeners. Given the relatively small number of participants in the Northern and mobile listener
groups, the inconsistent pattern of performance across the two conditions, and the results of
previous research demonstrating negligible effects of listener dialect on performance in similar
perceptual tasks with similar populations, these differences in overall performance would most
likely not be significant even with larger samples of participants from these college-educated
populations.
[Insert Table 4 about here]
The vowel error patterns observed in the current study provide an opportunity to obtain more
fine-grained insights into the role of talker-listener dialect match or mismatch in speech
processing. For example, in the word-competitor condition, the General American and mobile
listeners made many    confusions for the Northern talkers, whereas the Northern listeners did
not. The Northern listeners, however, made many   i confusions for the Midland talkers, and
the General American and mobile listeners did not, suggesting that the lowered and backed
Northern [] was confusable with // for the non-Northern listeners, but that the higher and fronter
Midland [] was confusable with /i/ for the Northern listeners. Similarly, the Northern [æ] was
confused with // in the word-competitor condition and the Northern [] was confused with /j/ in
the nonword-competitor condition by the General American listeners, and the Northern [] was
confused with /æ/ in both conditions by the mobile listeners. These confusions were predicted
based on the phonological descriptions of the two dialects and suggest that a dialect match
between the talker and the listener is beneficial for vowel recognition in noise.
However, the vowel error patterns within each listener group also suggest that a dialect match
does not always facilitate vowel recognition. For example, the Northern listeners exhibited a
relatively large number of   j confusions for the Northern talkers in both conditions and the
General American listeners exhibited a relatively large number of    confusions for the
Midland talkers in the word-competitor condition. Thus, even when talkers and listeners share a
dialect background, phonological confusions may occur (Labov & Ash, 1997). A summary of the
talker-listener dialect match and mismatch error patterns is shown in Table 5.
[Insert Table 5 about here]
4. General Discussion
The first goal of the current study was to examine the interaction between lexical bias and dialect
variation in a spoken word recognition task in noise. Taken together, the results confirm a lexical
bias in the current spoken word recognition task. First, we observed more word and vowel
recognition errors overall in the word-competitor condition, when both the target and the
competitor were real English words, than in the nonword-competitor condition, when the target
was a real English word, but the competitor was a nonword. Second, the effect of condition was
not significant for the target words that were identical across the two conditions (/i, ej, u/),
suggesting that the manipulation of target words and competitors was responsible for the effect of
condition on performance. Whereas previous studies of lexical bias have typically relied on
synthetic acoustic continua (e.g., Fox, 1984; Ganong, 1980), the findings in the current study
suggest that lexical bias can also be observed for more naturally-occurring variability. In
particular, vowel variation in American English leads to continuous acoustic-phonetic variability
across dialects that is similar to the acoustic continua produced in the laboratory and used for
studying category boundaries, and it is exactly this naturally-occurring variation that elicited the
effects of lexical bias in the current study.
The second goal of our study was to test the hypothesis that phonetic variation across dialects
can lead to predictable phonological confusions. The findings from the analysis of the vowel
confusion patterns in the word recognition task suggest that accurate predictions about
phonological confusions can be made based on phonetic descriptions of dialect variation. Out of
the nine predicted phonological confusions shown in Table 1, six were observed in the word-competitor
condition and two were observed in the nonword-competitor condition. However,
some of the predicted confusions did not emerge as significant in the analysis, suggesting that
first and second formant frequencies are not adequate to make accurate predictions about all
phonological confusions. In the current study, vowel duration appeared to play an important role
in limiting perceptual confusions, particularly for the shorter, lax vowels. Vowel productions that
were too long were not misidentified as the lax vowels /, /, despite formant frequency
distributions that overlapped with these vowels in the acoustic space. While formant trajectories
were not explicitly examined in the current study, they may also play a substantial role in
perceptual phonological confusions. Thus, an empirical measure of phonetic similarity or
category overlap that incorporates both spectral and temporal information is needed to make more
accurate and more comprehensive predictions about phonological confusions across dialects.
The descriptive analysis of the role of listener dialect in vowel confusion patterns in the word
recognition task suggested that phonological confusions occur both within and across dialects.
Thus, while a match between the talker’s and the listener’s dialect may facilitate vowel
recognition performance for some categories, talker-listener dialect matches do not necessarily
lead to accurate vowel recognition performance (see also Labov & Ash, 1997). Additional
research with balanced listener groups and more locally-oriented talker and listener populations is
needed to examine the potentially important roles of listener background and talker-listener
dialect match or mismatch in dialect perception and processing.
The predicted phonological confusions in the current study were also based on a comparison
to an idealized “standard” variety of American English. This comparison to the standard
produced a reasonably good set of predicted phonological confusions for the current study, but
comparisons to an idealized standard may not be appropriate in all cases. Experimental work is
needed to determine what listeners use as their baseline phonological system in laboratory
perception tasks. For example, they may rely on a standard variety, their own native variety, or
some other variety triggered by the experimental environment (Hay, Nolan, & Drager, 2006;
Niedzielski, 1999). If it is determined that listeners tend to rely on standard varieties in
experimental tasks, additional research will be needed to establish criteria for determining what
the standard variety of a given language is in a given region.
In the current study, the stimulus materials were presented in a fully randomized order and
the listeners were not told in advance that the stimulus materials were produced by talkers from
different dialects. The error patterns produced by the listeners suggest that the experimental
design prevented them from adapting to the dialect differences across the talkers. However, we
might predict that the lexical bias effect would be reduced if listeners were able to adapt to talker
dialect before, or during, the word recognition task. Online perceptual adaptation to dialect
variation has been reported for longer utterances, including sentences and passages (Floccia et al.,
2006; Maye, Aslin, & Tanenhaus, 2008). Thus, additional research is needed to explore the time
course of dialect adaptation and the extent to which dialect adaptation reduces the lexical bias
effect in cross-dialect word recognition tasks.
The lexical bias effect is often cited as evidence for an interaction between lexical and
acoustic information during speech processing. For example, Ganong (1980) argued that the
greater lexical bias for ambiguous than for unambiguous stimulus items could not be accounted
for by a model that categorically favored word over nonword responses, but instead required a
weighted or probabilistic phonetic classification judgment that could be affected by top-down
lexical knowledge (such as whether the utterance is a word or nonword). However, Fox (1984)
conducted a replication of Ganong’s (1980) study and analyzed response times in addition to
phonetic identification judgments. Fox (1984) observed longer response times for the ambiguous
than the unambiguous stimulus items and argued that this difference in response times for the two
types of stimuli might reflect autonomous phonetic processing followed by lexical processing.
Thus, he concluded that the lexical bias effect does not necessarily require interaction between
phonetic and lexical levels of processing. Pitt (1995) also analyzed response times in a phonetic
classification task to examine the locus of the lexical bias effect in processing. He observed a
response bias effect on discriminability in a lexical bias condition (word vs. nonword), but not in
a non-lexical bias condition (monetary payoff for responding with one phoneme category or the
other). Pitt (1995) argued that his results suggest an online interaction between lexical and
acoustic information in phonetic processing, rather than a post-perceptual response bias imposed
by the experimenter, such as monetary compensation. However, the lexical bias effect could also
reflect post-perceptual integration of lexical and acoustic information, as in Vitevitch, Luce,
Charles-Luce, and Kemmerer’s (1997) account of the effects of phonotactic probabilities on
nonword rating task performance.
The results of the current study suggest an alternative interpretation of the lexical bias effect
that does not involve the directionality (top-down vs. bottom-up) of information flow, but rather
the online integration of different types of information, from acoustic, lexical, and social sources.
Social information has already been shown to affect response biases in phonetic processing. For
example, Niedzielski (1999) found that listeners’ responses in an acoustic vowel matching task
were biased depending on where they believed the talker was from. When the listeners were told
that the talker was from the same city as them (Detroit, MI), they selected canonical or “standard”
vowels as the best matches to the targets. However, when the listeners were told that the talker
was from a culturally distinct region (Canada), they selected acoustically similar vowels as the
best matches to the targets. Thus, explicit social information about the talker affected the
listeners’ perceptual similarity judgments for vowels in an ostensibly purely phonetic task (see
also Hay et al., 2006).
Maye et al. (2008) found that listeners accepted nonsense words containing shifted vowels as
words in a lexical decision task (e.g., weckud for wicked) following dialect adaptation, suggesting
another alternative interpretation of the lexical bias effect. Specifically, participants may exhibit
a general bias to respond yes in experimental settings. In the Maye et al. (2008) study, this bias
would result in more word judgments for weckud (see also Hazan & Barrett, 2000). In the current
study, this bias would result in more real word responses, and would inhibit the participants from
responding with nonsense words in the nonword-competitor condition. Additional research is
therefore needed to explore the social motivations of participants in experimental settings to
determine the extent to which external factors may affect performance.
Social indexical information has been found to affect other well-established linguistic
processes, including semantic priming (Hay, Warren, & Drager, 2006) and speeded lexical
classification (Clopper, 2007). In speech production, social indexical information interacts with
other well-known sources of linguistic variability, such as semantic context (Pierrehumbert &
Clopper, 2006) and lexical neighborhood density (Munson, 2007). In the current study, word
recognition performance was affected by the interaction between acoustic-phonetic dialect
variation and the structure of the English lexicon (i.e., the presence vs. absence of minimal pairs).
Recent models of speech processing based on exemplar theories of perception and memory can
account for the interactions between phonetic, lexical, and social information that have been
observed across these speech perception and production tasks (e.g., Goldinger, 1998; Johnson,
1997; Pierrehumbert, 2002). In these models, individual acoustically-detailed utterances are
stored in long-term memory. Thus, lexical representations include acoustic information and it is
not necessary to posit autonomous acoustic-phonetic and lexical levels of processing (cf. Norris
et al., 2000). In an exemplar-based model, the interaction between lexical and acoustic
information in phonetic processing that Ganong (1980) and Pitt (1995) proposed to account for
the lexical bias effect is an inherent component of the model. Thus, the results of the current
study also contribute to the growing body of research demonstrating the important role of social
variation in speech production and perception, as well as the need for models of speech
processing that can account for the interactions between social and linguistic sources of
variability (Pierrehumbert, 2006).
Acknowledgements
This work was supported by the James S. McDonnell Foundation Award 21002061 to
Northwestern University and the Department of Linguistics at the Ohio State University. The
authors would like to thank John Pate for assistance with data collection.
Notes
1 Due to an oversight in setting up the experiment, the target word hug was presented to listeners
in the word-competitor condition four times (by one Midland talker and by all three Northern
talkers), for a total of 328 trials in the word-competitor condition.
References
Adank, P., & McQueen, J. M. (2007). The effect of an unfamiliar regional accent on spoken word
comprehension. Proceedings of the 16th International Congress of Phonetic Sciences, 1925-1928.
Best, C. T. (1995). A direct realist view on cross-language speech perception. In W. Strange (Ed.), Speech
Perception and Linguistic Experience (pp. 171-204). Timonium, MD: York Press.
Clopper, C. G. (2007). Effects of dialect variation on speeded word classification. Poster presented at
Acoustical Society of America 153, Salt Lake City, UT. June 4-8.
Clopper, C. G., & Bradlow, A. R. (in press). Perception of dialect variation in noise: Intelligibility and
classification. Language and Speech.
Clopper, C. G., Carter, A. K., Dillon, C. M., Hernandez, L. R., Pisoni, D. B., Clarke, C. M., Harnsberger, J.
D., & Herman, R. (2002). The Indiana Speech Project: An overview of the development of a multi-talker
multi-dialect speech corpus. Research on Spoken Language Processing Progress Report No. 25
(Speech Research Laboratory, Indiana University, Bloomington), pp. 367-380.
Clopper, C. G., & Pisoni, D. B. (2006). Effects of region of origin and geographic mobility on perceptual
dialect categorization. Language Variation and Change, 18, 193-221.
Durian, D., Dodsworth, R., & Schumacher, J. (to appear). Convergence in urban blue collar Columbus
AAVE and EAE vowel systems? Publication of the American Dialect Society.
Floccia, C., Goslin, J., Girard, F., & Konopczynski, G. (2006). Does a regional accent perturb speech
processing? Journal of Experimental Psychology: Human Perception and Performance, 32, 1276-
1293.
Fox, R. A. (1984). Effect of lexical status on phonetic categorization. Journal of Experimental Psychology:
Human Perception and Performance, 10, 526–540.
Ganong, W. F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental
Psychology: Human Perception and Performance, 6, 110-125.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review,
105, 251-279.
Hay, J., Nolan, A., & Drager, K. (2006). From fush to feesh: Exemplar priming in speech perception.
Linguistic Review, 23, 351-379.
Hay, J., Warren, P., & Drager, K. (2006). Factors influencing speech perception in the context of a merger-in-progress.
Journal of Phonetics, 34, 458-484.
Hazan, V., & Barrett, S. (2000). The development of phonemic categorization in children aged 6-12.
Journal of Phonetics, 28, 377-396.
Hillenbrand, J., Getty, L. A., Clark, M. J., & Wheeler, K. (1995). Acoustic characteristics of American
English vowels. Journal of the Acoustical Society of America, 97, 3099-3111.
Johnson, K. (1997). Speech perception without speaker normalization: An exemplar model. In K. Johnson
& J. W. Mullennix (Eds.), Talker Variability in Speech Processing (pp. 145-166). San Diego:
Academic Press.
Kraljic, T., Brennan, S. E., & Samuel, A. G. (2008). Accommodating variation: Dialects, idiolects, and
speech processing. Cognition, 107, 54-81.
Labov, W. (1998). The three dialects of English. In M. D. Linn (Ed.), Handbook of Dialects and Language
Variation (pp. 39-81). San Diego: Academic Press.
Labov, W., & Ash, S. (1997). Understanding Birmingham. In C. Bernstein, T. Nunnally, & R. Sabino
(Eds.), Language Variety in the South Revisited (pp. 508-573). Tuscaloosa, AL: University of
Alabama Press.
Labov, W., Ash, S., & Boberg, C. (2006). Atlas of North American English. New York: Mouton de
Gruyter.
Mason, H. M. (1946). Understandability of speech in noise as affected by region of origin of speaker and
listener. Speech Monographs, 13(2), 54-68.
Maye, J., Aslin, R. N., & Tanenhaus, M. K. (2008). The Weckud Wetch of the Wast: Lexical adaptation to a
novel accent. Cognitive Science, 32, 543-562.
Munson, B. (2007). Lexical characteristics mediate the influence of sex and sex typicality on vowel-space
size. Proceedings of the 16th International Congress of Phonetic Sciences, 885-888.
Niedzielski, N. (1999). The effect of social information on the perception of sociolinguistic variables.
Journal of Language and Social Psychology, 18, 62-85.
Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is
never necessary. Behavioral and Brain Sciences, 23, 299-370.
Nusbaum, H. C., Pisoni, D. B., & Davis, C. K. (1984). Sizing up the Hoosier Mental Lexicon: Measuring the
familiarity of 20,000 words. Research on Speech Perception Progress Report No. 10 (Speech Research
Laboratory, Indiana University, Bloomington), pp. 357–376.
Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. Journal of the
Acoustical Society of America, 24, 175–184.
Pierrehumbert, J. B. (2002). Word-specific phonetics. In C. Gussenhoven & N. Warner (Eds.), Laboratory
Phonology 7 (pp. 101-140). Berlin: Mouton de Gruyter.
Pierrehumbert, J. B. (2006). The next toolkit. Journal of Phonetics, 34, 516-530.
Pierrehumbert, J. B., & Clopper, C. G. (2006). Contextual predictability and Northern Cities vowel shifting.
Poster presented at the Tenth Conference on Laboratory Phonology, Paris, France. June 29-July 1.
Pitt, M. A. (1995). The locus of the lexical shift in phoneme identification. Journal of Experimental
Psychology: Learning, Memory, & Cognition, 21, 1037–1052.
Rakerd, B., & Plichta, B. (2003). More on perceptions of // fronting. Paper presented at New Ways of
Analyzing Variation 32, Philadelphia, PA. October 9-12.
Stevens, M. A., McQueen, J. M., & Hartsuiker, R. J. (2007). No lexically-driven perceptual adjustments of
the [x]-[h] boundary. Proceedings of the 16th International Congress of Phonetic Sciences, 1897-1900.
Tamati, T. N. (2008). Effects of Dialect and Talker Variability on Lexical Recognition Memory.
Unpublished Honors thesis, Ohio State University, Columbus.
Vitevitch, M. S., Luce, P. A., Charles-Luce, J., & Kemmerer, D. (1997). Phonotactics and syllable stress:
Implications for the processing of spoken nonsense words. Language and Speech, 40, 47-62.
Winters, S., & Pisoni, D. (2005). Speech synthesis: Perception and comprehension. In K. Brown (Ed.),
Encyclopedia of Language and Linguistics Vol. 12 (pp. 31-49). London: Elsevier.
Table 1. Predicted phonological vowel confusions for the Midland and Northern dialects.
Midland North
  
ow  
  
  æ
æ  
  æ
  
  
  
Table 2. Percent correct words and vowels in each condition for each talker dialect.
                      Words             Vowels
Condition             Midland  North    Midland  North
Word-competitor       64       67       82       81
Nonword-competitor    71       74       87       88
Table 3. Significant vowel confusions in each lexical condition for each talker dialect.
Condition Midland North
Word-competitor   
  
  
ow  
ow  u
  
  æ
  
  
  
ow  
Nonword-competitor   
  
  æ
  j
  
Table 4. Percent correct words and vowels in each condition for each talker dialect for each
listener dialect. The number of listeners in each listener group is shown in parentheses.
                             Words             Vowels
Listener dialect             Midland  North    Midland  North
General American
  Word-competitor (11)       64       67       82       80
  Nonword-competitor (10)    69       73       87       88
North
  Word-competitor (3)        69       72       87       85
  Nonword-competitor (6)     72       75       86       88
Mobile
  Word-competitor (5)        60       65       79       81
  Nonword-competitor (4)     75       74       89       88
Table 5. Vowel confusions for talker-listener dialect matches and mismatches.
Talker Dialect
Listener Dialect Midland Northern
General American      
æ  
  j
Mobile   
  æ
North   i   j
Figure 1. Map of the Northern (dark gray) and Midland (light gray) dialect regions in the United
States. The state of Ohio is indicated by the black star.
Figure 2. Schematics of the Midland (left) and Northern (right) vowel systems

THE GENERATION OF REGIONAL PRONUNCIATIONS OF ENGLISH
FOR SPEECH SYNTHESIS1
Susan Fitt
e-mail sue@cstr.ed.ac.uk
Centre for Speech Technology Research
University of Edinburgh
80 South Bridge
Edinburgh
UK
ABSTRACT
Most speech synthesisers and recognisers for English
currently use pronunciation lexicons in standard British
or American accents, but as use of speech technology
grows there will be more demand for the incorporation
of regional accents. This paper describes the use of
rules to transform existing lexicons of standard British
and American pronunciations to a set of regional
British and American accents. The paper briefly
discusses some features of the regional
accents in the project, and the framework used for
generating pronunciations. Certain theoretical and
practical problems are highlighted; for some of these,
solutions are suggested, but it is shown that some
difficulties cannot be resolved by automatic rules.
However, although the method described cannot
produce phonetic transcriptions with 100% accuracy, it
is more accurate than using letter-to-sound rules, and
faster than producing transcriptions by hand.
1. INTRODUCTION
For some applications of speech synthesis, and for some
users, output in standard accents is inappropriate, and
as the use of speech systems increases there will be an
increase in demand for regional accents of English.
Access to regional pronunciation variants will also be
of value for speech recognition systems. A labour-efficient
way of producing these is needed; this paper
describes the production by rule of pronunciation
lexicons for five accents of English, using as input the
information already contained in a lexicon of standard
British and American pronunciations. There is the
added benefit that since many linguistic rules are used
by more than one accent, the ground-work is laid for
producing further accents.
2. REGIONAL ACCENTS
Three British accents were chosen (as spoken in
Edinburgh, Cardiff and Leeds, to represent Scottish,
Welsh and Northern English), and two American ones
(New York and South Carolina, to represent Eastern
and Southern American); regional features were based
primarily on the descriptions in [1], with native-speaker
input where possible. The regional accents are
abbreviated in this paper as: Br(Sc) = Edinburgh;
Br(W) = Cardiff; Br(N) = Leeds; Am(E) = New York;
and Am(S) = South Carolina. For the standard accents,
Br(RP) = RP, and Am(Gen) = General American.
The accents generated represent fairly educated
regional speech, though some optional rules were
included which produce broader accents. The division
between ‘obligatory’ and ‘optional’ rules is somewhat
artificial, as there may be speakers from the region who
have a noticeably local accent but do not use all of the
‘obligatory’ rules as their speech is somewhat closer to
the standard accent. However, it enables us to produce
pronunciation lexicons which represent the main
features of the regional accents, while allowing some
freedom of variation.
Some examples of the regional characteristics to be
included in a lexicon, i.e. excluding such features as
rhythm and intonation, are given below. (Throughout
this paper, transcriptions are given in IPA unless
otherwise specified.)
Feature                              Example      Br(RP)       Br(Sc)
Rhoticity                            ‘horse’      \hOs\        \hO¨s\
Vowel length/quality distinctions    ‘tide’       \taId\       \t¿id\
                                     ‘tied’       \taId\       \taòed\
Figure 1: Some features of Edinburgh English
Feature                              Example      Br(RP)       Br(W)
Presence of \ñ\                      ‘llewelyn’   \lu”E.lIn\   \ñu”E.lIn\
Full vowel in final syllables        ‘endless’    \”End.l«s\   \”End.lEs\
Figure 2: Some features of Cardiff English
1 This work was supported by France Telecom CNET under the contract 6RC0328.
Feature                              Example      Br(RP)       Br(N)
Different use of \Ï\ - \A\           ‘hat’        \hÏt\        \hat\
(realised in Leeds as \a\ – \aÉ\)    ‘dance’      \dAns\       \dans\
                                     ‘part’       \pAt\        \paÉt\
Optional \h\-dropping                ‘hot’        \ht\         \t\
Figure 3: Some features of Leeds English
Feature                              Example      Am(Gen)      Am(E)
Presence of \Iu\                     ‘new’        \nu\         \nIu\
Optional use of \Ng\ for certain     ‘clingy’     \”klIN.i\    \”klIN.gi\
instances of \N\
Figure 4: Some features of New York English
Feature                              Example      Am(Gen)      Am(S)
Non-rhoticity                        ‘heart’      \hA¨t\       \hAÉt\
Use of \I\ rather than \i\ in        ‘happy’      \”hÏ.pi\     \”hÏ5.pI\
certain environments
Figure 5: Some features of South Carolina English
3. RULE FRAMEWORK
Previous work had produced a pronunciation lexicon
containing over 110,000 words, for use in diphone
synthesis. These were transcribed in RP and General
American, using machine-readable phonetic alphabets,
and parts of speech were also included in the entries.
This lexicon was used as the basis for the current work.
The RP transcriptions were used as the basic input for
generating the British accents, while the General
American transcriptions were used to generate the
American regional accents.
3.1. Alignment Rules
A number of the rules rely on descriptions of
relationships between the original pronunciation and
the spelling. For example, part of the rule for producing
\x\ in Edinburgh English can be stated as follows:
Replace a \k\ or \g\ which represents ‘ch’ or ‘gh’,
and is not part of a syllable-initial cluster, with \x\.
We then need an alignment to distinguish between the
\k\ in RP ‘lochside’, which represents orthographic ‘ch’
and should be converted to Br(Sc) \x\, and the \k\ in RP
‘dockside’, which represents orthographic ‘ck’ and so
remains as \k\ in Br(Sc). It is easy to see the
correspondence between the orthography and the
pronunciation, but less easy to formulate rules to
express this accurately (see [2]). An alignment
algorithm was designed for the existing lexicons,
grouping letters or short sequences of letters with
phonemes or sequences of phonemes; the output of this
was used as the input to the transformation rules.
Syllable boundaries and stress patterns were retained in
the alignment as they were often useful for
transformation rules.
3.2. Remapping Rules
The first and simplest step in creating regional
pronunciations was to remap the correspondences
between machine-readable symbols and phonemes for
each accent, to allow for different phonemic
inventories. These remappings are context-free. In
many cases, this allowed the regional accents to use the
same machine-readable transcription as the standard
accent. For example, Leeds English does not
differentiate between \U\ and \¿\, whereas RP has both.
The symbols ‘u’ and ‘uh’, which represent \U\ and \¿\ respectively for RP, can both be remapped to represent
\U\ in Leeds English. This gives us:
Word       Machine-readable transcription    Br(RP)    Br(N)
‘put’      p * u t                           \pUt\     \pUt\
‘putt’     p * uh t                          \p¿t\     \pUt\
Figure 6: Remapping of ‘u’ and ‘uh’ for Leeds English
3.3. Rewrite Rules
The second method used, and the most important one,
was context-sensitive rewrite rules, based on the
existing transcriptions but also permitting other
information in the lexicon, such as part of speech, to be
used as input. The rewrite rules fall into a number of
categories, as described below. Some of the examples
have been simplified here due to lack of space.
For some of the rules a number of different
formulations would be possible. For instance,
glottalisation of \t\ may vary by phonetic environment
and social context as well as speaker, with final \t\ being transformed to a glottal stop more readily than
medial \t\. For this project, a typical set of
environments was used for such cases.
3.3.1. Pre-lexicon Transformations
These are rules for producing a basic pronunciation
lexicon for each accent.
a) Obligatory rules – a set of rules which are always
applied, for example non-rhoticity in South
Carolina:
‘start’: Am(Gen) \stA¨t\ → Am(S) \stAÉt\
b) Obligatory lexical features – isolated words which
have unpredictable regional pronunciations, for
example ‘with’ in Edinburgh English:
‘with’: Br(RP) \wID\ → Br(Sc) \wIT\
c) Optional rules – a set of rules which may optionally
be applied. These rules give ‘broader’
pronunciations than the obligatory rules alone, for
example, use of \In\ rather than \IN\ to represent
‘-ing’ in various accents, including Cardiff:
‘thinking’: Br(RP) \”TINk.IN\ → Br(W) \”TINk.In\
d) Optional lexical features – isolated words which for
some speakers have unpredictable pronunciations
in the regional accent, for example ‘make’ in Leeds
English:
‘make’: Br(RP) \meIk\ → Br(N) \mEk\
3.3.2. Post-lexicon Transformations
These rules apply to the output of the pre-lexicon
transformations, and concern allophones, which it is
not necessary to include in a lexicon. Some allophone
rules were included in the pre-lexicon transformations
if they had complex contextual descriptions, including
for example morphological information.
The allophones are variants of a single phoneme
(though in a few cases, such as Edinburgh \aòe\ – \¿i\, it
is not clear whether a given alternation is allophonic or
phonemic). Allophones are used in all specified
contexts, with no lexical exceptions. Some of these
would be produced naturally by subjects recording
diphones, but others rely on a wide context (for
example, in South Carolina vowels may be conditioned
by the vowel in the next syllable). Rules are therefore
needed to specify the contexts of these allophones.
a) Obligatory – in natural speech, these would be
produced by all subjects with the given accent, for
example use of taps in various American accents:
‘catty’: Am(E) \”kÏ.ti\ → \”kÏ.|i\
b) Optional – in a natural situation these may vary
according to the subject or the formality of the
situation, for example Edinburgh glottal stops:
‘hot’: Br(Sc) \hOt\ → \hO/\
3.3.3. Connected Speech Rules
As some accents have rules which apply in connected
speech, these have been included in the framework.
a) Obligatory rules – these include removal of pre-consonantal
word-final \r\ in Cardiff English. (This
has been transcribed in non-rhotic accents to allow
for linking ‘r’.)
‘car park’: Br(W) \kaɨ paÉk\ → \kaÉ paÉk\
b) Optional rules – Leeds English may use \r\ instead
of pre-vocalic word-final \t\:
‘shut up’: Br(N) \SUt Up\ → \SU¨ Up\
4. RESULTS
The remapping rules cover a fair number of cases, and
are straightforward. More interesting issues arise from
the rewrite rules.
4.1. Relationship between British and American
Pronunciations
Sometimes the most accurate results are obtained by
taking a feature of one of the transcribed accents for use
in one of the generated accents of the other country. For
example, in Br(RP) the ASCII combination |i@| has
been used to represent \I«\ (or \i«\) in words such
as ‘happier’, ‘topiary’, ‘fearing’ and ‘fear’. However, for
Cardiff English this needs to be split three ways – \i«\ for ‘happier’ and ‘topiary’, \iÉ\ for ‘fearing’, and \jä\ for
‘fear’. Some generalisations can be made about the
phonetic environments in which they occur, but a more
accurate transformation can be made by including the
Am(Gen) transcriptions in the rule environment. We
then have, for |i@| preceding orthographic ‘r’:
Rule i): where Am(Gen) has \i«\, \i«±\, \j«\ or \i®E\,
transform Br(RP) \I«\ to Br(W) \i«\
Examples: ‘happier’, ‘topiary’
Rule ii): in the environment preceding \r\ plus a
vowel, where Am(Gen) has \I\ not preceding a
geminate \r\, change Br(RP) \I«\ to Br(W) \i\
Example: ‘fearing’
Rule iii): other cases of Br(RP) \I«\ before
orthographic ‘r’ become Br(W) \jä\.
Example: ‘fear’
No explicit alignment had been produced for matching
the Br(RP) and Am(Gen) transcriptions with each
other, and they sometimes had different numbers of
syllables, or different stress patterns. For example, the
alignments for ‘topiary’ were as follows:
Br(RP)    orthog.   t    o      p     ia    r     y
          phon.     t    *ou    .p    i@    .r    ii
Am(Gen)   orthog.   t    o      p     i     a     r     y
          phon.     t    *ou    .p    ii    .~e   .r    ii
Figure 7: Alignments between the orthography
and the machine-readable phonetic alphabet
for ‘topiary’ in Br(RP) and Am(Gen)
However, nearly all cases were covered by looking for
the relevant sequence at the same location in both
transcriptions, and if this failed, comparing the
previous and following segments.
4.2. One-to-Many Relationships
Some one-to-many relationships, like the Cardiff
example described above in 4.1, can be predicted on the
basis of information in the lexicon. However, other one-to-many
relationships are problematic. For example,
both Edinburgh and South Carolina distinguish
between ‘hoarse’ and ‘horse’, which in RP and General
American are homophones. The difference cannot be
predicted from the spelling, as there is no consistent
correspondence between the different spellings of this
set of words and the different vowels, nor can it be
predicted from the part of speech tags. This type of split
is the main problem in generating regional
pronunciations by rule, as it cannot be resolved except
by hand-tagging of individual lexical items, which is
not a linguistically satisfactory solution, and is not
practical in the current framework.
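The contrast between a predictable merger and an unpredictable split can be made concrete with a minimal sketch. It is an illustration only, not the project's implementation. The Leeds remapping of ‘u’ and ‘uh’ follows Figure 6, but the output label "U", the tags "NORTH" and "FORCE", and both function names are assumptions introduced here.

```python
# Many-to-one: the Leeds remapping from Figure 6. Both RP symbols map to a
# single Leeds vowel, so a context-free table is enough. The output label "U"
# is a placeholder, not the lexicon's real symbol.
LEEDS_REMAP = {"u": "U", "uh": "U"}


def remap(symbols, table):
    """Apply a context-free symbol remapping; unlisted symbols pass through."""
    return [table.get(symbol, symbol) for symbol in symbols]


# One-to-many: a single standard vowel splits in two in the regional accent.
# Nothing in the transcription says which way a word goes, so each word needs
# a hand-assigned tag. "NORTH" and "FORCE" are illustrative labels only.
HAND_TAGS = {"horse": "NORTH", "hoarse": "FORCE"}


def split_vowel(word, default="NORTH"):
    """Return (tag, was_hand_tagged) for a word in the split set."""
    if word in HAND_TAGS:
        return HAND_TAGS[word], True
    # Untagged words fall back to a default, which may simply be wrong.
    return default, False


print(remap("p * u t".split(), LEEDS_REMAP))
print(remap("p * uh t".split(), LEEDS_REMAP))
print(split_vowel("hoarse"))
```

The remapping table scales to the whole lexicon at no cost per word, whereas the tag table grows with every word in the split set, which is why this kind of split is described above as impractical in the current framework.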
glottal stops for \t\, than more frequent ones. This factor
has not been investigated for the current work, but it is
possible that word-length might be an approximation to
this. More likely is that word frequency in spoken
language (not currently included in the lexicon) would
provide a basis for distinguishing such groups of words.
More detailed semantic or etymological information
would also be of assistance. For example, \ñ\ in Welsh is
only used in native Welsh names or loanwords, and this
information is not available in the lexicon.
5. EVALUATION
Large-scale evaluation of the output was unfortunately
not possible due to the lack of comparable work in this
area, but native speakers of the accents were consulted
where possible and the transcriptions were compared to
descriptions and examples in other sources. The rules
used seemed to produce acceptable output for the
different accents, but some were more successful than
others. Particularly problematic were South Carolina,
with its large number of allophones, and Edinburgh,
which has a very different vowel system from RP.
4.3. Missing Information
Certain features of the various accents are predictable,
but rely on information not currently contained in the
lexicon.
4.3.1. Morphology
The primary type of missing information is
morphological. Some rules for phonemes or allophones
depend on morphological boundaries, but these are not
explicitly marked in the lexicon. Some of this
information can be deduced from the current format, for
example by reference to parts of speech, orthography
and the phonetic environment, or by lists of affixes. For
Edinburgh English we can use the spelling,
pronunciation and part of speech to differentiate
between the past tense verb ‘mooed’, which contains a
morphological boundary and so has a long vowel, and
the noun ‘mood’, which does not.
The discussions with native speakers were invaluable,
as this enabled checking of a wider range of examples
than are commonly available in the literature. However,
it should be noted that native speakers from the same
region did not always agree with each other on the
lexical incidence of features, or even in some cases on
the phonemic inventory. While some regional features
have been studied sociolinguistically (for example, see
[3]), others have not, making consistency difficult. One
solution to this is to model each accent on a single
speaker; another is to study several speakers in order to
produce an integrated model of the regional variation.
‘mooed’: Br(RP) \mud\ ® Br(Sc) \muÉd\ ‘mood’: Br(RP) \mud\ ® Br(Sc) \mud\
Not all cases, however, are so transparent, particularly 6. CONCLUSIONS
compounds. In South Carolina, \nt\ may optionally be
reduced to \n\, following a stressed vowel and preceding
a vowel ([1], Vol. 3, p. 552). Syllable boundaries are
irrelevant, but the \t\ should not be the first syllable of a
free morpheme. So, we have:
It is possible to develop regional pronunciations by rule
from existing standard pronunciations, and most
systemic differences can be covered in this way.
However, there are certain features, for some accents in
particular, which cannot be accurately generated by this
‘winter’: Am(Gen) \”wIn.t«±\ ® Am(S) \”wIn.«\ method. 2
Unfortunately, the rule as formulated cannot be
prevented from applying to compounds such as
‘meantime’, wrongly giving us:
7. REFERENCES
[1] Wells, John C. (1982). “Accents of English.”
‘meantime’: Am(Gen) \”min.taIm\ Cambridge: Cambridge University Press. ® Am(S) \”min.aIm\2
[2] Lawrence, S.G.C, and Kaye, G. (1986).
“Alignment of phonemes with their corresponding
orthography.” Computer Speech and Language,
Vol. 1, pp. 153-65.
4.3.2. Other
It has been suggested ([3], p. 162) that lexical items
which are ‘learned’ are less prone to some kinds of
casualisation or reduction processes, such as the use of [3] Reid, Euan (1978). “Social and stylistic variation
in the speech of children: some evidence from
Edinburgh.” In: Peter Trudgill (ed.),
Sociolinguistic patterns in British English,
pp. 158-71. London: Edward Arnold.
2Other processes would subsequently apply to these strings,
such as allophonic adjustments.
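The optional \nt\ reduction discussed in Section 4.3.1 can be sketched in a few lines of Python. This is only an illustration, not the rule formalism used in the paper: the function name `reduce_nt`, the ASCII stand-in transcriptions (`'` for primary stress, `.` for a syllable boundary, `@` for schwa, rhoticity omitted) and the small vowel set are all assumptions made for the example. Because these toy transcriptions, like the lexicon described above, carry no morpheme boundaries, the rule also fires on the compound ‘meantime’.

```python
import re

# Hypothetical ASCII stand-ins: "'" = primary stress, "." = syllable
# boundary, "@" = schwa. The vowel set is an assumption that only needs
# to cover the example words below.
VOWELS = "aeiouI@"

# \nt\ -> \n\ after a stressed vowel and before a vowel. Syllable
# boundaries are irrelevant, so an optional "." may sit between n and t.
NT_RULE = re.compile(
    rf"('[^.{VOWELS}]*[{VOWELS}]+)n(\.?)t(?=[{VOWELS}])"
)


def reduce_nt(transcription: str) -> str:
    """Apply the optional reduction wherever its context is met."""
    return NT_RULE.sub(r"\1n\2", transcription)


print(reduce_nt("'wIn.t@"))    # 'winter'   -> 'wIn.@
print(reduce_nt("'min.taIm"))  # 'meantime' -> 'min.aIm (over-application)
print(reduce_nt("'wInt.ri"))   # no vowel after t, so unchanged
```

If morpheme boundaries were available (written here, purely hypothetically, as `+`), a form such as `'min+taIm` would no longer match the pattern and would be left alone, which is the information the paper notes is missing from the lexicon.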
Evolution Publishing

Linguistic Geography of the Mainland United States


©1999 C. Salvucci
 
Traditionally, dialectologists have listed three dialect groups in the United States: Northern, Midland, and Southern–although some scholars prefer a two-way classification of simply Northern and Southern, and one may also find significant differences in where the boundaries of each area are drawn. The map shown above represents a synthesis of various independent field studies this century. These are, in chronological order: the Linguistic Atlas fieldwork begun under the direction of Hans Kurath in the 1930s; the informal but extensive personal observations of Charles Thomas in the 1940s; the DARE fieldwork of the 1960s under Frederic Cassidy; and the Phonological Atlas fieldwork of William Labov during the 1990s. Although it may seem that a great amount of data has been collected over a short time span, the shifts in American dialects this century have been rapid enough to outpace the data collection. What appears to be a well-entrenched dialect marker today, such as the Northern Cities Shift, may barely appear in earlier studies–affecting both classification and mapping. Nevertheless, some basic observations on current American linguistic geography can be made.
The New England Dialects
These dialects are non-rhotic, dropping r’s before consonants and at the end of words. This area is further subdivided into Eastern New England, including Boston and much of Maine, where O and AU shift into an intermediate vowel so that cot and caught are merged. Transitional between Eastern New England and New York, Western New England is less well defined. Providence retains R-dropping, but does not merge O and AU.
The New York Dialects
New York City has a rather anomalous linguistic situation, in that its local dialect was not reproduced further westward and therefore cannot be fit into any larger regional grouping such as New England or the Midland.(1) Like New England, the dialect is R-dropping–other features are more generally common to the Northeastern seaboard. The Hudson Valley dialect of Albany, though R-preserving, is nevertheless close enough to New York City’s to be grouped with it: both of them shared a Dutch linguistic substratum which is now only vestigial.
The Great Lakes Dialects
Among all the dialect regions, the Great Lakes region is perhaps the most homogeneous, since the major cities in this area (Syracuse, Rochester, Buffalo, Cleveland, Detroit, Chicago, Milwaukee) are simultaneously undergoing a chain shift known as the Northern Cities Shift, with a rotation of the short vowels so that “they may be heard as members of another phoneme by listeners from another dialect area with consequent confusion of meanings: Ann as Ian, bit as bet, bet as bat or but, lunch as launch, talk as tuck, locks as lax” (Labov 1991). This area is fully R-preserving, even though the earliest settlers of this area were primarily New Englanders. At present New England influence is evident only in the lexicon.
The Upper Midwest Dialects
This area is characterized mainly by a conservative vowel scheme, where the long vowels have remained purely monophthongal (a trait often attributed to Scandinavian influence), exemplified in the widely known long O in the name Minnesota. Along the northern border are found Canadianisms such as the centralized long I in fuyr (fire) and the centralized ow “uh-oo” in ouwt (out).
The Midland Dialects
Midland dialects retain R in all positions, and long I is not flattened (monophthongized) as uniformly as in the South, but the Midland is otherwise not very easy to describe as a whole, since “each of the Midland cities – Philadelphia, Pittsburgh, Columbus, Cincinnati, Indianapolis, St. Louis, Kansas City – has its own local character” (Labov 1997). More southerly Midland cities have a typically Southern fronted nucleus in ow, e.g. aout (out); more northerly Midland cities tend not to. Labov (1997) on this basis divides the area horizontally into a North Midland and South Midland.(2) Previous researchers have also seen east-west distinctions, separating the Pennsylvania dialect(s) from those of the Lower Midwest (Kurath 1949, Thomas 1958, Carver 1989).
The Western Dialects
Western phonology has only recently begun to diverge, primarily with the merger of AU into the short O class: e.g. cot for both caught and cot, and the fronting of the long U class, e.g. “ih-oo” in words such as two. Otherwise it appears that the Western dialects were formed primarily from a Midland base, since both groups are similarly conservative in their phonology–in fact it was certainly Midland and Western dialects which were so often lumped together under the catch-all phrase “General American”.(3) Westward migration has also carried typically Northern features into the Pacific Northwest, and Southern features into the Southwest: both phonology (Labov 1997) and lexicon (Carver 1989) have been affected.
Endnotes
(1) Many scholars have defined New York City as “Northern” by virtue of its geographical location: but naming a “Northern” group of dialects is misleading if it implies the kind of shared phonology which we see in the Southern dialects. I share the view that a general term “Northern” makes the most sense if used the way most Americans would understand it: i.e. any dialect that does not have the full monophthongization of long I and is therefore not Southern.
(2) The South Midland described in Kurath 1949 and Kurath and McDavid 1961 is wholly different from Labov’s, referring to the area here termed Mountain Southern. Kurath’s “North Midland” is called here, as in Labov, simply Midland.
(3) See particularly Thomas 1958, which merges the Midland with a large part of the West, while cordoning off the Northwest and Southwest Coast with, as he admits, “ill-defined” boundaries. Modern linguists have been sharply critical of the now disused term “General American” but it does seem that in the early 20th century a huge area of the country used a quite similar phonology.
Bibliography
Carver, Craig M. 1989. American Regional Dialects: A Word Geography. Ann Arbor:University of Michigan Press.
Cassidy, Frederic G., ed. 1985-. The Dictionary of American Regional English. Cambridge:Belknap Press.
Kurath, Hans. 1949. A Word Geography of the Eastern United States. Ann Arbor:University of Michigan Press.
Kurath, Hans and Raven I. McDavid. 1961. The Pronunciation of English in the Atlantic States. Ann Arbor:University of Michigan Press.
Labov, William. 1991. The Three Dialects of English. In Penny Eckert, ed. New Ways of Analyzing Sound Change. New York:Academic Press, pp. 1-44.
Labov, William, Sharon Ash and Charles Boberg. 1997. A National Map of the Regional Dialects of American English.
Thomas, Charles K. 1958. The Phonetics of American English. New York.



Regional vocabularies of American English

From Wikipedia, the free encyclopedia
For variations in the pronunciation of spoken English in North America, see North American English regional phonology.
Regional vocabularies of American English vary. Below is a list of lexical differences in vocabulary that are generally associated with a region. A term featured on a list may or may not be found throughout the region concerned, and may or may not be recognized by speakers outside that region. Some terms appear on more than one list.

Regionalisms

Historically, a number of everyday words and expressions used to be characteristic of different dialect areas of the United States, especially the North, the Midland, and the South; many of these terms spread from their area of origin and came to be used throughout the nation. Many today use these different words for the same object interchangeably, or to distinguish between variations of an object. Such traditional lexical variables include:[1][2]
  • faucet (North) and spigot (South);[3]
  • frying pan (North and South, but not Midland), spider (New England; obsolete),[4] and skillet (Midland, Gulf States);
  • clapboard (chiefly Northeast) and weatherboard (Midland and South);[5]
  • gutter (Northeast, South), eaves trough (in-land North, West), and rainspouting (chiefly Maryland and Pennsylvania);
  • pit (North) and seed (elsewhere);
  • teeter-totter (widespread),[3] seesaw (South and Midland), and dandle (Rhode Island);
  • firefly (less frequent South and Midland) and lightning bug (less frequent North);
  • pail (North, north Midland) and bucket (Midland and South).
Many differences, however, still hold and mark boundaries between different dialect areas, as shown below. From 2000 to 2005, for instance, The Dialect Survey queried North American English speakers’ usage of a variety of linguistic items, including vocabulary items that vary by region.[6] These include:
  • generic term for a sweetened carbonated beverage
  • drink made with milk and ice cream
  • long sandwich that contains cold cuts, lettuce, and so on
  • rubber-soled shoes worn in gym class, for athletic activities, etc.
Below are lists outlining regional vocabularies in the main dialect areas of the United States.

The Northeast

  • brook – creek. Mainly New England, now widespread but especially common in the Northeast.[2]
  • cellar – alternate term for basement.[7]
  • sneaker – although found throughout the U.S., appears to be concentrated in the Northeast. Elsewhere (except for parts of Florida) tennis shoe is more common.[8]
  • soda – a soft drink[9]

New England

See also: Boston accent
  • bulkhead – cellar hatchway[2]
  • cabinet (Rhode Island) – milkshake[2]
  • frappe (eastern Massachusetts) – milkshake[2]
  • grinder – submarine sandwich[2]
  • hosey – (esp. parts of Massachusetts & Maine) to stake a claim or choose sides, to claim ownership of something (sometimes, the front seat of a car)[2]
  • intervale – bottomland; mostly historical[2]
  • johnnycake (also Rhode Island jonnycake) – a type of cornmeal bread[2]
  • leaf peeper – a tourist who has come to see the area’s vibrant autumn foliage[2]
  • necessary – outhouse, privy[2]
  • packie – a liquor store (package store)[2]
  • quahog – pronounced “koe-hog,” it properly refers to a specific species of clam but is also applied to any clam[2]
  • rotary – traffic circle[2]
  • tonic (eastern Massachusetts) – soft drink[2]

Northern New England

  • ayuh – “yes” or affirmative[2]
  • dooryard – area around the main entry door of a house, specifically a farmhouse. Typically including the driveway and parking area proximal to the house[2]
  • Italian (sandwich) – (Maine) submarine sandwich[2]
  • logan (also pokelogan) – a shallow, swampy lake or pond (from Algonquian)[2]
  • muckle – to grasp, hold-fast, or tear into[2]

Mid-Atlantic

New York City Area (including adjacent New Jersey and Connecticut)

  • catty corner – on an angle to a corner[2]
  • dungarees (archaic) – jeans[2]
  • egg cream – a mixture of cold milk, chocolate syrup, and seltzer[2]
  • hero – submarine sandwich[2]
  • kill – a small river or strait, in the name of specific watercourses; e.g. Beaver Kill, Fresh Kills, Kill Van Kull, Arthur Kill (from Dutch)[2]
  • potsy – hopscotch[2]
  • punchball – a baseball-like game suitable for smaller areas, in which a fist substitutes for the bat and a “spaldeen” is the ball[2]
  • scallion – spring onion[2]
  • stoop – a small porch or steps in front of a building, originally from Dutch[10]

Other Mid-Atlantic areas

The North

  • braht or brat – bratwurst[2]
  • breezeway (widespread) – a hallway connecting two buildings[2]
  • bubbler (esp. Wisconsin and the Mississippi and Ohio river valleys) – a water fountain[2]
  • clout (originally Chicago, now widespread) – political influence[2]
  • davenport (widespread) – a sofa, or couch[2]
  • euchre (throughout the North) – card game similar to spades[2]
  • fridge (throughout North and West) – refrigerator[2]
  • hot dish (esp. Minnesota) – a simple entree cooked in a single dish, related to casserole[11]
  • paczki (in Polish settlement areas, esp. Michigan, Ohio and Wisconsin) – a jelly donut[2]
  • pop (widespread in North and West) – a soft drink, carbonated soda[2]
  • soda (parts of Wisconsin) – soft drink[9]
  • Yooper (Michigan) – people who reside in the Upper Peninsula of Michigan[12]

The Midland

  • barn-burner (now widespread) – an exciting, often high-scoring game, esp. a basketball game[2]
  • dinner (widespread) – the evening meal; the largest meal of the day, whether eaten at mid-day or in the evening[2]
  • hoosier (esp. Indiana) – someone from Indiana; (outside of Indiana, esp. in the St. Louis, Missouri area) a person from a rural area, comparable to redneck[3]
  • mango – green bell pepper, sometimes also various chili peppers[2]
  • outer road – a frontage road or other service road[2]
  • pop – a soft drink (except in a large area centered on St. Louis, Missouri, where soda predominates)[9]

The South

  • alligator pear – avocado[2]
  • banquette (southern Louisiana) – sidewalk, foot-path[2]
  • billfold (widespread, but infrequent Northeast, Pacific Northwest) – a man’s wallet[2]
  • cap (also Midlands) – sir (prob. from “captain”)[2]
  • chill bumps (also Midlands) – goose bumps[2]
  • chunk – toss or throw an object[3]
  • coke – any brand of soft drink[9]
  • commode (also Midlands) – bathroom; restroom; particularly the toilet[2]
  • crocus sack (Atlantic), croker sack (Gulf) – burlap bag[2]
  • cut on/off – to turn on/off[2]
  • directly – in a minute; soon; momentarily[2]
  • dirty rice (esp. Louisiana) – Cajun rice dish consisting of rice, spices, and meat[2]
  • fais-dodo (southern Louisiana) – a party[2]
  • fix – to get ready, to be on the verge of doing; (widespread but esp. South) to prepare food[2]
  • house shoes – bedroom slippers[2]
  • lagniappe (Gulf, esp. Louisiana) – a little bit of something extra[2]
  • locker (esp. Louisiana) – closet[2]
  • make (age) (Gulf, esp. Louisiana) – have a birthday; “He’s making 16 tomorrow.”[2]
  • neutral ground (Louisiana, Mississippi) – median strip[2]
  • po’ boy (scattered, but esp. South) – a long sandwich, typically made with fried oysters, clams, or shrimp[2]
  • put up – put away, put back in its place[2]
  • yankee – northerner; also damn yankee, damned yankee[2]
  • yonder (esp. rural) – over there, or a long distance away; also over yonder[13]

The West

  • barrow pit (esp. Rocky Mountains) – a ditch to conduct water off a surface road[2]
  • davenport (widespread) – couch or sofa[2]
  • pop (widespread in West and North) – carbonated beverages; soda predominates in California, Arizona, southern Nevada[9]
  • snowmachine (Alaska) – a motor vehicle for travel over snow. Outside Alaska known as a snowmobile[14]

Pacific Northwest

  • chechaco – derogatory term for newcomers to the Northwest. (from Chinook Jargon)[14]
  • crummy – a vehicle used to transport forest workers[2]
  • gyppo – contract work (or worker). Corruption of “gypsy”[2]
  • potlatch – a social gathering; a Native American festival during which the chief gives away his possessions (from Chinook Jargon)[2]
  • Skid road or Skid row – a path made of logs or timbers along which logs are pulled; (widespread) a run-down, impoverished urban area[2][14]
  • skookum – good, strong, powerful, first rate. (from Chinook Jargon)[2]
  • snoose – chewing snuff or dipping tobacco, especially taken by loggers[14]
  • tyee – Chief, boss, a person of distinction. (from Chinook Jargon)[14]

References

  1. Examples in this section are from the Dictionary of American Regional English (2002), except where otherwise noted.
  2. Cassidy, Frederic Gomes, and Joan Houston Hall (eds). (2002) Dictionary of American Regional English. Cambridge, MA: Harvard University Press.
  3. Metcalf, Alan A. (2000) How we talk: American regional English today. New York: Houghton Mifflin Harcourt.
  4. Allen, Harold Byron, and Gary N. Underwood (eds). (1971) Readings in American Dialectology. New York: Appleton-Century-Crofts.
  5. Wood, Gordon Reid. (1971) Vocabulary change: a study of variation in regional words in eight of the Southern States. Carbondale, IL: Southern Illinois University Press.
  6. Vaux, Bert, Scott A. Golder, Rebecca Starr, and Britt Bolen. (2000-2005) The Dialect Survey. Survey and maps.
  7. “Dialect Survey – Level of a building that is partly or entirely underground”. University of Wisconsin–Milwaukee. Retrieved 2008-06-17.
  8. “Dialect Survey – General term for rubber-soled shoes worn for athletic activities, etc.”. University of Wisconsin–Milwaukee. Retrieved 2008-06-17.
  9. Campbell, Matthew T. (2003) Generic names for soft drinks by county. Map.
  10. “Stoop | Define Stoop at Dictionary.com”. Dictionary.reference.com. Retrieved 2011-02-01.
  11. Mohr, Howard. (1987) How to Talk Minnesotan: A Visitor’s Guide. New York: Penguin.
  12. Binder, David. (14 September 1995). “Upper Peninsula Journal: Yes, They’re Yoopers, and Proud of it.” New York Times, section A, page 16.
  13. Wolfram, Walt, and Natalie Schilling-Estes. (2006) American English: dialects and variation, second edition. New York: Wiley-Blackwell.
  14. Oxford English Dictionary Second Edition. Oxford: Oxford University Press, 1989.

How Linguists Approach the Study of Language and Dialect
John R. Rickford
(ms. January 2002, for students in Ling 73, AAVE, Stanford)
            Since we will be drawing primarily on linguistic research to tell the story of African American Vernacular English [AAVE], we need to explain some of the premises under which linguists operate, the kinds of principles which are usually covered in the first chapter of introductory textbooks on linguistics.
The first such premise is that linguistics is a descriptive rather than a prescriptive discipline. By this we mean that our objective is to describe the systematic nature of language as used by the members of particular speech communities rather than to pass (prescriptive) judgments about how well they speak or how they should or should not be using their language. The study of people’s attitudes towards one variety or another is an interesting subfield of linguistics, one which can help us to understand the social distribution of dialects or the direction of language change, and one which can be helpful in formulating policy about which varieties to use in the schools and how. But even here, the linguist is primarily describing the attitudes rather than prescribing what they should be. [Will this stop us from suggesting that attitudes towards AAVE shouldn't be negative?]
A second, related premise is that every naturally used language variety is systematic, with regular rules and restrictions at the lexical, phonological and grammatical level. Although non-linguists sometimes assume that some dialects–usually nonstandard ones–don’t have any rules, or that they are simply the result of their speakers’ laziness, carelessness, or cussedness, linguists usually feel quite differently, both on empirical grounds (dialects always turn out to have regular rules), and on theoretical grounds. The theoretical reason is that the successful acquisition and use of a language variety in a community of speakers would be impossible if language were not systematic and rule-governed. If every speaker could make up his or her own words and rules for pronunciation and grammar, communication between different speakers would be virtually impossible.
Note, too, that linguists use the term dialect as a neutral term to refer to the systematic usage of a group of speakers–those in a particular region or social class, for instance–and that the term has within linguistics none of the negative connotations which it sometimes has in everyday usage (for instance, meaning “nonstandard” or “substandard” speech, or the speech of people from other regions besides one’s own).  Everyone speaks a dialect–at least one.
The third premise of linguistics which we think it is important to emphasize is that in trying to understand and describe the system of a language, we give primary attention to speech rather than writing. One obvious reason for this is that the written language omits valuable information about the pronunciation or sound system of a language. But there are other reasons, including the fact that people all over the world learn to speak before they learn to read or write, and the fact that competence in the spoken variety of at least one language is universal to all normal human beings, but literacy is a more restricted skill (in fact, some languages do not even have writing systems). Of course the written language is, to varying extents, related to the spoken language. Comparing and contrasting the two is a fascinating enterprise, and some of the evidence about AAVE which we will consider in this book will be drawn from literature, as some of the excerpts considered above already demonstrate. But because non-linguists often attach greater authority to the written rather than the spoken word (“if it’s in print, then it must be right”) it’s important to emphasize that linguists tend to make precisely the opposite assumption.
The fourth and final premise of linguistics is that although languages are always systematic, variation among their speakers is absolutely normal.  Although we sometimes think or act as if there were one entity called American or British English–and grammatical handbooks help to reinforce this fiction–we know from actual experience that the “language” varies from one region to another, from one social group to another, and even (when region and social group are held constant), from one occasion or topic to another.
The most significant variations or differences within languages occur at the level of the lexicon (vocabulary), phonology (pronunciation), grammar (morphology and syntax), and usage. Moreover, they are not just qualitative, in the sense that dialect A uses one feature and dialect B another, but they may also be quantitative, in the sense that dialect A uses one feature more often than dialect B does. (This is particularly true of phonological and grammatical features which have social or stylistic significance.) Finally, variation may be regional, social or stylistic in its origins, and the methods that linguists have used to study each type differ slightly. We will now elaborate on these important concepts and provide examples.
Lexical variation
Differences in vocabulary are one aspect of dialect diversity which people notice readily and comment on quite frequently. They are certainly common enough as markers of the differences between geographical areas or regions–for instance the fact that “a carbonated soft drink” might be called pop in the inland North and the West of the United States, soda in the Northeast, tonic in Eastern New England, and cold drink, drink or dope in various parts of the South (Carver 1987:268). Or the fact that a person who was “tired” or “exhausted” might describe themselves as being all in if they were from the North or West, but wore out or give out if they were from the South (ibid.:273). Accordingly, lexical differences play a significant role in regional dialectology (the study of regional dialects), and in popular treatments of American dialects like the documentary film American Tongues, lexical differences are given prime coverage.
Lexical differences are not as salient in distinguishing the speech of different social or socioeconomic classes, and they have accordingly played a much smaller role in social dialectology (the study of social dialects), which has concentrated instead on differences in phonology and grammar. Nevertheless they are certainly an aspect of ethnic differences–for instance, knowledge of the term ashy to describe the “whitish or grayish appearance of skin due to exposure to wind and cold” (Smitherman 1994:49) is widespread among African Americans but less so among European Americans (Labov et al 1968:???)–and several dictionaries of African American English have appeared over the past several years. Lexical differences are also a factor in stylistic variation (for instance, whether one describes oneself as being exhausted or pooped), and in what are sometimes called the “genderlects” of men versus women (for instance, it has been claimed that women are more likely to describe an item as lovely or divine).
One area where social group differences are reflected strongly in the lexicon is in variation according to age group, particularly in the slang of teenagers and young adults.  Accurate definitions of slang are elusive, in part because some words fall more decisively into this category than others, but the term is commonly understood to include the informal in-group vocabulary of young people or non-mainstream groups, and to include items which are relatively short-lived (Wolfram 1991:46-50).  Slang is often particularly rich in evaluative terms; for instance Smitherman’s (1994:91-92) entry for def, a reduction of definitely  which means “great; superb; excellent” lists these older synonyms:  boss, mean, cool, hip, terrible, outa sight, monsta, dynamite, and these newer ones:  fresh, hype, jammin, slammin, kickin, bumpin, humpin, phat, pumpin, stoopid stupid, vicious, down, dope, on and raw.  Although most of these terms have originated and are best known within the African American community, the popularity of African American music and culture has also made many of them familiar to teenagers from other ethnic groups, so much so that these and other slang terms might, in some areas, be considered symbols of youth culture rather than Black culture.  However, African American teenagers often coin new in-group slang terms as fast as their former terms spread to other ethnic groups, and there remain significant differences between the slang of Whites and Blacks (T. Labov 1992).  At the same time, some items which originate as slang become part of the informal vocabulary of older age groups and eventually of the country as a whole, for instance buck “dollar”.
Phonological variation
Phonological variation refers to differences in pronunciation within and across dialects, for instance the fact that people from New York and New England might pronounce “greasy” with an s, while people from Virginia and points further South might pronounce it with a z.  Or the fact that working class people across the United States are more likely than are upper middle class speakers to pronounce the initial th of they and similar words with a d.
Phonological variants are fairly salient as markers of regional dialect. For instance, the stereotypical Bostonian pronunciation of “Park your car in Harvard yard” as Pahk yo’ car in Hahvahd yahd includes not only the r-lessness of Pahk, yo’, Hahvahd and yahd (the r in car is retained because the following word begins with a vowel)–a feature shared with many other American dialects, particularly in the South–but also the more distinctive use in these words of a long maximally low or open front vowel [a] where other dialects use a slightly fronter and less open vowel [Å] (Wells 1982:522). In order to represent the pronunciations with some precision, linguists often use a phonetic alphabet in which each distinguishably different sound is uniquely represented by a different symbol, rather than the relatively unphonetic spelling system of English, in which one sound is often represented by different spellings (e.g. the sound “sh” represented by sh in sheet but by ti in nation) and different sounds by one spelling (e.g. s represents an “s” sound in bets but a “z” sound in beds). Sounds and words represented in phonetic spelling are enclosed in square brackets; a key to the phonetic spellings used in this work is included at the beginning of this volume.
One relevant aspect of phonological variation worth noting is that it is often conditioned by the phonological environment–that is, by WHERE in an utterance (word-initially, word-finally, before r, and so on) the sound occurs. We’ve already seen one example of this in the fact that postvocalic [r] is not lost in Boston when the next word begins with a vowel (this is sometimes referred to as “linking r”). Another example which is relevant to this volume is the fact that the distinction between [ɪ] and [ɛ] which is evident in pig versus peg and other words is lost (or neutralized) in Southern speech before a following nasal consonant, as in pin and pen, both pronounced [pɪn]. As a result of this merger, speakers sometimes have to clarify which word is meant by asking for a “sticking [pɪn]” (pin) rather than a “writing [pɪn]” (pen). This feature is also characteristic of AAVE across the United States.
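The conditioning by environment described here can be written as a small context-sensitive rule. The following is a minimal sketch rather than a description of any particular linguist's formalism; the function name `merge_before_nasal`, the ASCII symbols (`I` for the vowel of pig, `E` for the vowel of peg, `N` for the velar nasal) and the segment lists are assumptions made for illustration.

```python
# Nasal consonants that trigger the neutralization (assumed inventory).
NASALS = {"m", "n", "N"}


def merge_before_nasal(segments):
    """Replace E with I only when the next segment is a nasal."""
    merged = []
    for i, seg in enumerate(segments):
        following = segments[i + 1] if i + 1 < len(segments) else None
        if seg == "E" and following in NASALS:
            merged.append("I")   # pen -> [pIn]
        else:
            merged.append(seg)   # all other environments are untouched
    return merged


WORDS = {
    "pin": ["p", "I", "n"],
    "pen": ["p", "E", "n"],
    "pig": ["p", "I", "g"],
    "peg": ["p", "E", "g"],
}

for word, segs in WORDS.items():
    print(word, "".join(merge_before_nasal(segs)))
```

Under this rule pin and pen come out identical, while pig and peg stay distinct because no nasal follows the vowel.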
The pin/pen example is just one example of a fairly common situation in which phonological mergers in one dialect make homonyms (two or more words with different meanings, pronounced alike) of words which are kept apart in other dialects.  Perhaps the best known example of this is the pronunciation of Mary, merry and marry as homonyms in the Midland (Southern Pennsylvania, Ohio, and so on) and many parts of the West (Reed 1977:31).  Consonant loss–a relatively common process in AAVE–is also a major source of mergers and homonyms (e.g. told, with loss of final d,  becoming homophonous with toll).
Phonological variation–particularly insofar as it involves consonants–is central to social variation and stylistic variation too, and we will provide relevant examples below.
Grammatical variation
What we have been referring to as grammatical variation really involves two sub-types:  morphology and syntax.  Morphology refers to the structure or forms of words, including the morphemes or minimal units of meaning which comprise words, for instance the morphemes {un} “not” and {happy} “happy” in unhappy, or the morphemes {cat} “cat” and {s} “plural” in cats.  Syntax refers to the structure of larger units like phrases and sentences, including rules for combining and relating words in sentences, for instance the rule that in English yes/no questions, auxiliaries must occur at the beginning of sentences, before the subject noun phrase (e.g. Can John go? versus the statement John can go).
One can find examples of regional variation of both types.  For instance, the form (or morphology) of the past tense of catch, climb and draw was sometimes catched, clum and drawed respectively in parts of the East but only caught, climbed and drew respectively in the Western US, at least according to a report more than forty years ago (Atwood 1953:???).  In the Midwest of the US (including Wisconsin, Ohio and Iowa) and other regions (parts of Pennsylvania, New Jersey, West Virginia), one can use anymore with the meaning of “nowadays” in positive sentences like “He smokes a lot anymore,” but in the rest of the country, anymore can only be used with the meaning of “no longer” and only in negative sentences, as in “He doesn’t smoke a lot anymore” (Labov 1973).  Perhaps even more dramatic is the use of “So don’t I” in Boston and other parts of New England where other dialects would use “So do I”:
(13)     A:   Mary likes liver.
B:  So don’t I (Boston usage for “So do I.”).
Both of the latter examples might be classified as syntactic variation, because they involve relations between words within or across sentences.  The Boston example is in a sense morphosyntactic, since it involves the form of the auxiliary (don’t vs. do) following an adverb (so) which expresses agreement with the proposition of a preceding sentence.  Variation in the form of the past participle after have or had–“He had gone” versus “He had went”–is also morphosyntactic, involving variation in the form of the main verb (morphology) in combination with particular auxiliaries (syntax).
Grammatical variation is much more common as a marker of social dialects and formal/informal styles than it is of regional dialects, with non-standard or vernacular variants sometimes being strongly stigmatized for their associations with limited education or use by the lower working class, but simultaneously being strongly admired and adopted for their connotations of informality, masculinity or non-pretentiousness.  Whether positive or negative, grammatical variables tend to have strong social marking.  One example at the level of morphology is the absence of third person present tense -s, as in “She like Ø liver.”  (In this and other examples we will use the symbol Ø to mark the point at which an omitted feature might have occurred.)  This feature is common in working class AAVE in Detroit and elsewhere in the US, but it is also common in other working class English varieties, for instance among English speakers in Norwich, as shown in  figure 1 below.  A syntactic example is the use of multiple negation in AAVE and other vernacular English dialects, with negation marked both on the auxiliary verb and on the indefinite noun or adverb, as in “I didn’t see nobody” versus Standard English, which permits negative marking on only one constituent, as in “I didn’t see anybody” (negative verbal auxiliary) or  “I saw nobody” (negative indefinite noun).
[NOTE:  ALL MAPS AND FIGURES ARE AT END OF PAPER]
INSERT FIGURE 1:   Absence of third person present tense singular s (she walk Ø) by social class, among African American speakers in Detroit and White speakers in Norwich (from Holmes 1992, p. 159, drawing on Wolfram 1969 and Trudgill 1974).
Most of the descriptive research which linguists have done on AAVE over the past thirty years has been focused on its grammar, particularly on its distinctive pre-verbal tense-aspect markers, like invariant habitual be  (He be workin “He is usually working”) and stressed BIN (She BIN had one “She’s had one for a long time”).  These may appear to be simple lexical items, but they fall under “grammar” rather than “lexicon” because they have grammatical rather than lexical meaning, serving to signal grammatical relationships (and participating in a system of tense-aspect oppositions) rather than possessing semantic content in and of themselves.  (Contrast bucket, walk, which refer to entities or events in the real world, outside of language, rather than expressing grammatical relationships).
Language use/Speech events and expressive language use
We have mentioned so far that dialects and styles can differ at the level of their words, sounds, and grammatical patterns.  These are the three components of language that have been investigated in dialectology and linguistics for more than a century and the ones that are usually covered in introductory books on these subjects.  A fourth level, one which has only begun to receive serious attention over the past thirty years, involves what we might characterize, with deliberate vagueness, as language use.  By this we mean, in the first instance, a community’s rules for constructing, participating in and (where relevant) evaluating verbal activities larger than the sentence, including narratives and telephone conversations and verbal routines like lecturing or telling jokes which are often described as “speech events.”  But we also include under this category the variegated aspects of language use which fall under the “ethnography of speaking,” including conventions for speaking loudly, softly, much, a little, or not at all, whether addressees are to remain silent or vocally interactive during a speaker’s turn, whether one is expected to broach or avoid certain topics and make extensive use of simile, metaphor and rhyme, and so on (Hymes 1973).  We also include rules for turn-taking and other aspects of what is normally included under Conversation Analysis (Sacks, Schegloff et al), as well as the rules for conversational implicature, presupposition, and speech acts (events like commands, requests, promises and threats which are usually accomplished through the use of words) which fall within narrower definitions of “pragmatics” (Levinson 1983).
Although different regions do have different conventions for language use, this is not something that has been systematically investigated by dialect geographers.  Most of what is known about variation in language use has come from studies of different social groups, including men versus women (for instance, in mixed-sex conversation men interrupt women more often than women interrupt men, see Zimmerman and West 1975:115-116), and particularly, different national or ethnic groups.  For instance, based on Plato’s descriptions, Athenian talkers appear to have been verbose, Spartans laconic and Cretans pithy (Hymes 1973:44).  The conversational patterns of visiting Scandinavian neighbors have been described as involving long (apparently comfortable) silences, in contrast with the conversational patterns of Antiguans, in which people speak continuously and “contrapuntally,” with numerous interruptions and overlaps (Reisman 1974).  A pattern similar to that of the Antiguans has been reported, within the US, as being characteristic of informal Jewish interaction among close family or friends (Tannen 19??).
Distinctive African American speech events and patterns of language use have been investigated fairly extensively over the past quarter century and in this course we will consider the nature and significance of a variety of verbal activities common in the African American speech community, including preaching, call and response, sounding, signifying, marking, woofing, loud talking, toasts, the dozens, rapping, and so on.
2.  Regional Variation
We have already defined regional dialects as varieties of a language which are spoken in different geographical areas.  Here we will describe some of the methods which are used to collect and display regional dialect data, identify the major dialect regions in the United States, and summarize some of the reasons why regional dialect differences arise.
Methods.  Ever since its beginnings in the late nineteenth century, regional dialectology has depended on the dialect questionnaire as one of its main data sources. In 1876, George Wenker mailed a dialect questionnaire to thousands of schoolmasters in the North of Germany, depending on them to complete and return it on their own.  Although this method is still followed, most subsequent dialectologists preferred the method of Jules Gilliéron, who in 1896 sent a trained fieldworker (Edmond Edmont) into different parts of France to conduct dialect questionnaires in person.  Trained fieldworkers can get a more reliable record of pronunciation, and they can also pursue alternatives and report relevant observations about informants’ responses which can be highly instructive.
The regional dialect surveys which together make up the Linguistic Atlas of the United States and Canada–beginning with the Linguistic Atlas of New England (LANE, see Kurath et al 1939-43)–have all depended on fairly long questionnaires, administered in the field by trained fieldworkers.  Similarly, the Dictionary of American Regional English (DARE), under the editorship of Frederic G. Cassidy, drew on the usage of 2,777 informants from 1,002 communities across the United States.  These informants were interviewed between 1965 and 1970 by 72 fieldworkers, using a questionnaire with as many as 1,847 items, such as the following:
A1       What do you call the time in the early morning before the sun comes into sight?
A6       What time is this?  (Show picture of clock face at 10:45)
H60     The lumpy white cheese that is made from sour milk.
Y18     To leave in a hurry:  “Before they find this out, we’d better ________!”
Like many dialect questionnaires, these questions attempt to get at local word usage indirectly, without using the word in question or its equivalent in another dialect, to avoid influencing the informants’ response.  DARE fieldworkers also tape-recorded an average of half an hour’s speech from their informants (1843 recordings in all), and these were sampled to provide information on pronunciation differences across the US.  Fieldwork for the Linguistic Atlas of the Gulf States (LAGS), conducted between 1968 and 1983, was even more ambitious in this regard, yielding 5,300 hours of tape-recorded speech (Pederson 1993:31).  Both sets of recordings should constitute valuable archives for future dialect research, particularly as measures of how much change in “real” time has occurred in the interim.
Many dialect surveys in the US and Europe have depended for their informants on older people who were born and raised in the community and hadn’t moved around much.  This is a good strategy for helping to capture distinctive local traditions, but regional dialectology has also been criticized for its tendency to over-represent male respondents, under-represent modern usage, and avoid stratified random samples (see Pickford 1956, Chambers and Trudgill 1980:24-36).
Isoglosses and dialect areas.  One way of displaying the results of a regional dialect survey is to include the different variants in a table or list, with annotations about where each is most prevalent.  But a more graphic way of doing this is to chart the distribution of the variants on a dialect atlas or map, as Reed 1977:99 (drawing on data in Kurath 1949, figure 125) did for the North Eastern variants of “cottage cheese” in the US (see map 1).  The lines separating the areas in which each variant is used (Dutch cheese, pot cheese, and smearcase) are called isoglosses.
INSERT MAP 1 HERE, from Reed 1977:99, based on Kurath 1949, fig. 125
A related way of displaying regional dialect data is to use a symbol for every location on a map in which a certain variant is attested, as in map 2 (from Cassidy 1985:883), which shows where in the US the noun curd, “freq. pl., also curd-cheese,” was offered in response to question H60, reprinted above. Note that the DARE maps of the United States differ from conventional maps because the amount of space which they allocate to each state is based on the size of its population rather than its land area (Carver 1985:xxiii).
INSERT MAP 2 HERE:  from DARE 1985, p. 883
When the isoglosses for different words, pronunciations or grammatical features bundle together, they are usually taken to define a dialect area.  In map 3, for instance (from Kurath 1949, figure 42) the isoglosses separate the Northern dialect area, in which pail, faucet, skunk and merry Christmas! are used, from the Midland and South dialect areas in which bucket, spicket, pole-cat and Christmas gift! are used respectively.
INSERT MAP 3 HERE:  Kurath 1949, fig. 42
Dialect areas of the US.   Map 4, from Carver (1987:248), provides a comprehensive depiction of dialect areas in the continental US.  Its regions are based on overlapping geographical layers in which particular sets of words (lexical isoglosses) occur.  The fundamental regional divide which Carver finds is between the North and the South.  The North is further divided into the Upper North, the Lower North, and the West, and the South is further divided into the Upper South and the Lower South, with other regional subdivisions as indicated.
INSERT MAP 4 HERE:  The Major Dialect Regions Summarized (from Carver 1987:248)
Four additional comments remain to be made in relation to Map 4. The first is that Carver’s Lower North and Upper South are more or less equivalent to Kurath’s Midland area, which does not, on Carver’s (1987:161) evidence, constitute a unified dialect area.  The second is that lexical subdivisions of the West are less clear-cut than those of the East in part because its settlement and development are more recent, and in part because dialect research in this part of the country has been less extensive than it has been in the East.  The third point to note is that the words associated with each of the regions in map 4 don’t occur with equal frequency throughout the region.  As shown by map 5 for the North layer, some areas (darker shadings) are denser than others, with DARE informants showing familiarity with a greater number of words than the DARE informants in other (lighter shaded) areas.  Finally, it is interesting to note that when Americans are asked to indicate on a map their subjective sense of major dialect areas of the US, their results correspond to the objective divisions of map 4 to a considerable extent, as shown by map 6 from Preston (1996:305, figure 5), a composite of the responses for 147 Michigan respondents.  Note that the areas on which these Michigan respondents show the greatest agreement are the South (delimited by 94% of respondents), and the North (delimited by 61% of respondents), corresponding more or less to the major North/South divide of map 4.
INSERT MAP 5 HERE: Relative densities of the North layer (from Carver 1987:57)
INSERT MAP 6 HERE:  Michigan respondents’ computer-generated mental map of US speech regions (from Preston 1996:305, figure 5)
Why dialect differences arise and persist.  Before we leave the subject of regional dialect differences, we might consider briefly how such differences arise and why they persist.  One factor is the influence of geographical barriers.  A river, a mountain range, or an expanse of barren land, can serve to keep two populations apart, creating or maintaining differences in usage between dialects on either side.  The Ohio River, for instance, helps to define the division between the dialect areas of the North and the South shown in map 4.   Other factors besides geography which help to create and maintain regional dialects include political boundaries, settlement patterns, migration and immigration routes, territorial conquest, and language contact.  In Texas, for instance, contact with Louisiana French in the East has led to loans like jambalaya “rice stew” and bayou “inlet,” while contact with Mexican Spanish along the South Western border has yielded loans like mesa “dry plateau,” and lariat “rope with a noose”  (Reed 1977:52).  One question for us in this volume will be whether the existence and persistence of a distinctive variety like AAVE can be attributed in part to factors similar to those which produce regional dialects, in particular to social barriers between African Americans and other ethnic groups (particularly Whites) and/or to settlement patterns as African Americans migrated North and West from the South.
Contrary to what many people seem to think, television has not had much influence in spreading dialect patterns or obliterating dialect differences, particularly in phonology and grammar, the domains of language which are less easily noticed or controlled than the lexicon is (see Trudgill 1983:61).  One reason for this is that television is a non-interactive medium; viewers don’t talk back to it and, if they do, the television characters certainly do not respond to them in return.  It is the responses of the people we speak to in our everyday lives–indicating varying degrees of comprehension, non-comprehension, approval and non-approval of the way we speak–that cause us to modify our dialects, depending of course, on our attitudes towards those people and whether we care about their opinions  (see Giles 19??, Le Page and Tabouret-Keller 1985).
Social and Stylistic Variation
Social dialects are varieties distinguished according to the social groups who use them, for instance, upper middle class versus working class speakers (social class), men versus women (sex or gender), young people versus old (age), African Americans versus European Americans (ethnicity or race), people who are part of a particular network at school or in the neighborhood versus those who are not (network).   In theory, since individuals typically belong to several different groups simultaneously, their speech patterns might be taken to reflect the simultaneous intersection of their social categories and experiences, e.g. the speech of young upper middle class White female “jocks” from Chicago (see Eckert 1989).  In practice, however, social dialectologists or sociolinguists tend to consider the linguistic correlates of social categories one category at a time–for instance, the effects of social class membership on the use of third person singular present tense -s absence–and data on the simultaneous effects of social categories (e.g. class and sex) are presented less often (but see figure 3 below).  Interactions between social class and style are commonly noted, however, and stylistic variation will also be considered in this section.
The issue of social variation is critical to discussions of AAVE because the most vernacular features (e.g. “He Ø tall,”  “We be jumping,”  “Ain’ nobody done nothin”) are used most frequently by speakers of the working and lower class.   Geographical region does not appear to make a significant difference, except insofar as the lexicon, especially slang, is concerned, but social class is definitely relevant, and the relevance of age, sex and social network has also been raised in several studies.
Methods.  Although the systematic study of social dialects (about thirty years old) is a lot more recent than the systematic study of regional dialects (about one hundred and twenty years old), the methods of regional dialectology could not simply be extended to social dialectology.  For one thing, social dialect differences tend to be reflected more often in phonology and grammar than in the lexicon, and they are more often socially marked as prestigious or stigmatized than regional differences are.  If we were to attempt to get at social differences in language just by asking people which pronunciation or grammatical pattern they used (the equivalent of the regional dialectologist’s lexical questionnaire), we might find, in the first place, that people’s actual usage might lie below the level of conscious awareness.  Moreover, people might tend to under-report their actual use of socially stigmatized variants in everyday life, and over-report their use of socially prestigious variants, as a number of studies (Labov, Trudgill) have shown.   The direct elicitation of speakers’ intuitions can still be useful, and we will draw on data derived from this approach at several points in this volume, but what social dialectologists have usually relied on as their principal data source is samples of people speaking informally, analyzed to see which variants speakers use and how often.  The major means of achieving this goal has been to tape-record speakers in relatively informal interaction, either in conversation with their peers (close friends and family members) or in spontaneous interviews lasting an hour or more in which certain topics are included to produce more excited interaction and make the interviewee less conscious of his or her speech.  
Two favorite topics of this type are descriptions of situations in which the speaker was in danger of being killed, and descriptions of games he/she played as a child, but any topic in which the speaker seems to get involved or excited may be pursued.
Social Class.  For social dialectology, one’s sample needs to include representatives of each of the social groups being investigated, and while it is relatively easy to differentiate men versus women, or teenagers versus middle-aged adults, distinguishing between different socioeconomic or social classes is a bit more difficult.  The most common way of doing this in sociolinguistics is to ask people about their occupations, their educational backgrounds, their incomes, their residence types (number of rooms and location) and/or their lifestyles, and then use one of the sociological status scales (for instance Warner 1960, Hollingshead 1958, Højrup 1983, Milroy and Milroy 1992) to assign them to one of four or five socioeconomic groups (e.g. Lower Working Class, Upper Working Class) depending on their answers.  Sometimes occupational prestige (as assessed by independent surveys) is given the greatest weight in such rankings, and income somewhat less, partly because income can be unreliably reported for various reasons, and because it doesn’t always correlate directly with social standing or status.   There is some debate in sociolinguistics about whether speakers’ evaluations of their own and others’ social class standing should be given greater weight than they are in most studies, and about whether conflict models of social class (for instance Marx, Dahrendorf) should be used more often (see Rickford 1986, Williams 1992), but to discuss this here would take us too far afield.
The differences between social dialects are usually, as we have noted before, quantitative rather than qualitative.  Accordingly, social stratification in language is usually represented by means of displays like figure 1, above, which shows the relative frequency with which one of the variants of a variable is used by the different social classes, as a proportion of all the cases in which it could have been used, following Labov’s (1966:49) principle of accountability.  In this figure, the variant displayed is absence of third person singular present tense -s (i.e., the percentage of the time speakers said forms like “He walkØ” rather than “He walks”), and the figure is an example of sharp stratification, with significant differences between the usage of the working class and middle class groups, both in Norwich and Detroit.
INSERT FIGURE 1:  Absence of third person present tense s, Norwich & Detroit (Holmes 1992:159)
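The arithmetic behind a display like figure 1 can be sketched in a few lines of Python. This is only an illustration of the principle of accountability as described above: the function name, the group labels and the coded tokens are all invented for the example and are not data from Wolfram, Trudgill or any other study. The key point is that every context in which the variable could have occurred is counted, not just the contexts in which the variant of interest was actually used.

```python
from collections import defaultdict

def variant_percentages(tokens, variant):
    """Return, per group, the percentage of potential occurrences realized as `variant`.

    `tokens` holds one (group, realized_variant) pair for EVERY context in which
    the variable could have occurred, not only those where `variant` was used.
    """
    used = defaultdict(int)      # times the variant of interest was used
    possible = defaultdict(int)  # all contexts where the variable could occur
    for group, realized in tokens:
        possible[group] += 1
        if realized == variant:
            used[group] += 1
    return {group: 100.0 * used[group] / possible[group] for group in possible}

# Hypothetical coded tokens (invented for illustration, not data from any study):
# "absent" = "He walkØ", "present" = "He walks".
tokens = [
    ("working class", "absent"), ("working class", "absent"),
    ("working class", "absent"), ("working class", "present"),
    ("middle class", "present"), ("middle class", "present"),
    ("middle class", "present"), ("middle class", "absent"),
]
rates = variant_percentages(tokens, "absent")
print(rates)
```

With these made-up tokens the working class group shows -s absence in three of four possible contexts and the middle class group in one of four, which is the kind of gap that a sharp stratification display makes visible.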
Figure 2, by contrast, is an example of gradient or fine stratification, since the frequencies for the different social classes are much closer to each other, appearing as a continuum of fine shadings rather than a series of discrete and sharply separated breaks.  This figure is from the work of William Labov, the leading pioneer in the methodological and theoretical aspects of social dialect variation, particularly as it relates to ongoing changes in the language (see Labov 1972, 1994).  The variable it depicts is the pronunciation of postvocalic r in words like car or beard, and it’s a good one for illustrating a number of sociolinguistic generalizations and distinctions.  In the first place, note that social stratification and stylistic differentiation are shown simultaneously, and that both pattern quite regularly–each socioeconomic group increasing its relative frequency of r-pronunciation as the stylistic context becomes more formal, while, within each style, higher socioeconomic groups show more r-pronunciation than lower ones.  The one exception is the “crossover” pattern of the Lower Middle Class in word-list and minimal pair styles, where consciousness of the variable under investigation is greatest; in these contexts, the linguistically insecure Lower Middle Class speakers use even more of the prestige variant than speakers from the highest socioeconomic group, the Upper Middle Class.  This “crossover” pattern by an intermediate social group is often symptomatic of ongoing change, and, as comparisons with older records as well as contemporary age groups verify, the pronunciation of r in New York City does in fact represent a change in progress, with the youngest members of the upper middle class showing the greatest use of the new r-pronouncing norm.
INSERT FIGURE 2:  Variation in r-pronunciation in NYC (from Labov 1994:87)
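The two patterns just described, regular style shifting and the “crossover” of an intermediate group, can be stated as simple checks on a table of percentages. The sketch below is illustrative only: the function names are our own, and the numbers are round placeholder values invented for the example, not Labov’s New York City figures. It assumes that groups are listed from lowest to highest status and styles from least to most formal.

```python
def style_shift_is_regular(rates, styles):
    """True if every group uses the prestige variant at least as often
    in each more formal style as in the preceding, less formal one."""
    return all(
        all(by_style[a] <= by_style[b] for a, b in zip(styles, styles[1:]))
        for by_style in rates.values()
    )

def crossovers(rates, groups, styles):
    """List (group, style) pairs where a lower-ranked group exceeds the
    highest-ranked group; `groups` is ordered from lowest to highest status."""
    top = groups[-1]
    return [
        (group, style)
        for style in styles
        for group in groups[:-1]
        if rates[group][style] > rates[top][style]
    ]

groups = ["lower class", "working class", "lower middle class", "upper middle class"]
styles = ["casual", "careful", "reading", "word list", "minimal pairs"]

# Placeholder percentages of r-pronunciation, invented for illustration only.
values = {
    "lower class":        [0, 10, 20, 30, 40],
    "working class":      [5, 15, 25, 35, 45],
    "lower middle class": [10, 20, 30, 70, 80],
    "upper middle class": [20, 30, 40, 60, 70],
}
rates = {group: dict(zip(styles, row)) for group, row in values.items()}

print(style_shift_is_regular(rates, styles))
print(crossovers(rates, groups, styles))
```

In this invented table every group shifts toward more r-pronunciation as the style becomes more formal, and the only crossovers are those of the lower middle class in the word-list and minimal-pair styles, mirroring the pattern described for figure 2.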
By contrast, a number of other variables, such as the pronunciation of the suffix in walking and other gerunds as -in [ɪn] instead of -ing [ɪŋ] are stable sociolinguistic variables, showing no significant differences across age groups, and no evidence of ongoing linguistic change (Labov 1972:238-240).  Both (-r) and (-ing) are sociolinguistic markers, meaning that they vary simultaneously by social group membership and style, in contrast with indicators, which are correlated with geographical region or social group membership only, and show little or no stylistic variation.  An example of an indicator is the variable (a:) in Norwich, England, involving the pronunciation of the vowel in cart, path and similar words.  In general, speakers of a language are more aware of markers than indicators; the fact that they increase or decrease their use of a marker in different styles is in part a reflection of this awareness.  It is sometimes possible for markers to reach an even greater level of social awareness and commentary, and become a linguistic stereotype, popularly associated with a particular region (e.g. the Brooklynese pronunciation of “thirty third” as toity toid) or social group (e.g. the characterization of working class speakers as “always” saying dese, dem and dose, with initial d instead of th [ð]).  Linguistic stereotypes are often no more accurate than social stereotypes, representing behavior as categorical when it is in fact variable (as with dese, dem and dose), or as current when actual usage has changed (as with the toity toid stereotype–see Chambers and Trudgill 1980:88).
We might also note, on the basis of figures 1 and 2, that the relative status of a linguistic feature as prestigious or stigmatized is usually a direct reflection of the social status of the groups who use it most often.  Third person singular -s absence in Norwich and Detroit, clearly associated with working and lower class usage, is a stigmatized feature.  But r-pronunciation in New York, associated with middle class usage, is a prestige feature.  Note that the situation in relation to the prestigious “received pronunciation” (R.P.) of England is quite the opposite, with r-lessness being the prestige norm.  This example is good for illustrating the fact that the relative prestige of a feature (usefully defined by Weinreich 1953:??? as “function in social advance”) is not simply a function of whether it corresponds to the standard spelling or whether it involves “deletions”; in England, deleted “r” is prestigious, but in New York it is stigmatized, based on the usage of the highest social classes in each community.  Of course the situation is often more complex.  Some AAVE variables are simultaneously stigmatized–so far as usage in the formal mainstream contexts of work and the classroom is concerned–and prestigious, so far as usage in informal contexts of solidarity and ethnicity or youth identity affirmation is concerned.  An alternative approach to this ambiguity is to see them as representing different kinds of “prestige”: the overt institutional norms of higher-status groups recognized by society at large and maintained by teachers, the media and other “agents of standardization” versus the covert, often counter-culture norms embraced by intermediate and lower status social groups with little or no institutional support (Wolfram 1991:98).
We have concentrated so far on social variation by social class, one of the most salient correlates of social variability in studies of AAVE and in sociolinguistics more generally.  Four other aspects which we will briefly consider in this introduction are variation according to ethnicity, age,  sex or gender, and social network.  We’ll also consider another approach to stylistic variation besides the one exemplified in figure 2.
Ethnicity.  A speaker’s ethnic or racial group may also have a significant effect on the language they use, but we will discuss this issue quite briefly here, since it is, in a sense the focus of this entire volume.  This is particularly so since the bulk of sociolinguistic research on language and ethnicity has in fact focused on the linguistic relationship between the English of African Americans and European Americans, as a glance at any of the introductory textbooks in sociolinguistics (e.g. Holmes 1992, Wardhaugh 1992, ) will reveal.  In these texts, discussions of language and ethnicity turn out to be primarily discussions of AAVE.
It is relevant to consider other kinds of ethnic influence, however, for one common source of distinctiveness in ethnic dialects is the influence of foreign languages spoken as a first language by an individual or by his or her parents and grandparents.  For instance, Maori speakers in New Zealand may use greetings like kia ora and other Maori words in their English, especially with other Maori speakers, and Jewish Americans may make greater use of ethnically marked terms like oy vay and shlemiel  (which come from Yiddish) than other ethnic groups (Holmes 1992:191-93).  Or, to give a phonological example, Mexican American speakers of English sometimes use a voiceless [s] rather than a voiced [z] (saying “soo” for “zoo”), and this may be attributed to the influence of Spanish, which does not have voiced [z] in word-initial or word-final position (Valdés 1988:130). The question then arises of whether the Vernacular English of African Americans might be attributed to the influence of African languages spoken by their forebears who came from Africa hundreds of years ago, or to the influence of creole languages which they and their early descendants might have acquired in the New World.   The answer is that some of the distinctiveness of AAVE might be attributed to passive inheritance from an ancestral language, but not all of it can.  The maintenance if not the creation of some of the linguistic differences between the speech of African Americans and other ethnic groups must be attributed to other factors, including segregation, migration within the US, and a desire to express a distinctive ethnic identity (Le Page and Tabouret Keller, 1985).
Age.  Age-related variation in language may reflect either age-grading or change in progress.  Age-grading involves features associated with specific age groups as a developmental or social stage, as in the two-word utterances of children around eighteen months of age (“Mommy sock,” “Drink soup”–Moskowitz 1985:55), or the in-group slang of teenagers (rad “cool”, gnarly “gross/cool”–T. Labov 1992:350).  Normally, speakers give up the features associated with a particular stage as they grow older. In the case of change in progress, however, age differences reflect an actual change in community norms, as with the pronunciation of r in New York City, exemplified in figure 2 above.  The study of age differences is important for the study of language change (“change in apparent time”–Bailey et al 1991), but it can sometimes be difficult to tell whether one is dealing with stable age-grading or with change in progress (see Labov 1981, Rickford et al 1991:127-8), so one might seek out evidence of change in real time (across samples from two or more points in time).  For instance, speakers of American English who are 19 years and younger tend to omit the verb (goes or be concerned) in as far as constructions (for instance, in “As far as the white servants Ø, it isn’t clear”) far more often than speakers aged 60 years or older do; this evidence of change in apparent time is backed up by real-time evidence that in the late nineteenth and in the first half of the twentieth century, the verb was almost never deleted in this construction (Rickford et al 1995).
One little-studied aspect of age-related variation in language is the question of whether adolescence, which is such a significant physical and sociopsychological period in the transition from childhood to adulthood, is accompanied by equally significant linguistic developments.  One of the few studies with relevant data on this issue is Wolfram’s (1969) study of sociolinguistic variation in the AAVE of Detroit, which shows usage by children (ages 10-12), adolescents (ages 14-17) and adults for several variables.  Although the adolescents in Wolfram’s study appear to be intermediate between the oldest and youngest groups, we know from other data that adolescents sometimes use AAVE as a symbolic group marker, more so than other age groups, and that this leads them to use their AAVE features more than any other age group.  The nature and sociolinguistic significance of the adolescent stage are currently being investigated by Penelope Eckert at Stanford.
Gender. The study of language and gender (“gender” is often preferred to “sex” because it emphasizes the sociocultural rather than biological differences between men and women) has mushroomed over the past two decades, and it would be impossible to summarize here the main approaches to this subject or the most interesting findings.  However, one aspect of this research which is particularly relevant to this volume is the finding that women tend to use non-standard or vernacular variants less often than men.  For instance, as figure 3 shows, women in Norwich, England use non-standard [ɪn] as the suffix in walking and similar gerunds less often than men from the same social class do (the effect is most marked in classes 2 and 3, the lower middle and upper working class); this has also been found to be true in the US (Detroit, New York and Philadelphia), and in Canada and Australia as well (Labov 1990:211).  Studies of other variables–for instance, multiple negation, the absence of third-singular present tense -s, or the simplification of word-final consonant clusters (tol’, fas’)–show similar results.  Various reasons have been suggested for this common finding–perhaps women are more status-conscious than men, or perhaps they have a more significant role to play as upholders of society’s notions of “correctness,” or perhaps the men use the vernacular forms more often to express machismo, or perhaps there was an interviewer effect, with the prevalence of male interviewers leading to more comfortable interviews and more informal usage with men (Holmes 1992:171-181).  Whichever one or combination of these explanations turns out to be most significant, it is clear that gender is a significant aspect of dialect variation to which we must attend in considering AAVE.  This is all the more so since Wolfram’s (1969) study of Detroit has already shown greater use of AAVE features by males than by females.
Moreover, most of the literature on AAVE is based on studies of males, interviewed by other males (Edwards 1992, and Rickford and McNair Knox 1994 are recent exceptions), and there is a widespread assumption that AAVE is really the province of streetwise inner-city males.  We believe that this is a misconception, or at least that there are women whose vernacular usage equals and even surpasses that of men.  Certainly this appears to be the case with certain vernacular variables–like invariant habitual be–which represent change in progress.  This may be an instantiation of the other major generalization about language variation and gender–that women lead in linguistic change, regardless of whether the incoming variants are prestigious or not, and whether they are below the level of conscious awareness or not (Labov 1990:213-219).
[Figure 3: -in by sex and class in Norwich (Holmes 1992:169, from Trudgill 1983)]
Network.        Another aspect of social differentiation which can affect language use even when class, ethnicity, age and gender are held constant is social network, a measure of association patterns within a community.  For instance, Labov and Robins reported as early as 1969 that there was a sharp distinction between the linguistic behavior and reading scores of preadolescent and teenage African American boys in Harlem depending on whether they were members of neighborhood peer groups like the Jets and the Cobras or whether they were not.  Peer group members not only used higher frequencies of copula absence and other vernacular features than non-peer group members, but more of them were below grade level in reading, and they tended to be further behind–three or more years below grade level compared with one to two years for non-members (one third of whom were on or above grade level in reading).
More generally, Milroy (1980) has shown, with data from Belfast English, that networks which are dense (close-knit, with each member of the network knowing each other) and multiplex (with members knowing and interacting with each other in multiple capacities, e.g. as friends, coworkers, and family members) are powerful forces in the maintenance of local vernacular norms.  Edwards (1992) has shown the relevance of network analysis to the use of AAVE in Detroit.
Style.   Most of the subtypes of variation which we have considered so far involve variation according to USER–influenced by the region or social group(s) from which the speaker comes.  Stylistic variation, by contrast, involves variation according to USE (Halliday 1964), and it may be evident in the speech of a single individual or relatively homogeneous group, no matter how narrowly defined.
There are two principal approaches to the study of style in sociolinguistics. The first, associated with Labov (1966) and exemplified by figure 2, assumes that styles can be ranked on a continuum of attention paid to speech, from casual speech on one end, to word lists and minimal pairs on the other.  The primary means of eliciting samples of different styles in this approach is to vary the topics discussed (e.g. career plans versus childhood games) and the tasks which the interviewee is assigned (e.g. talking about childhood experiences versus reading a short passage).  An alternative approach, adopted by Labov et al (1968) but best represented in Bell (1984), assumes that styles essentially represent speakers’ responses to their audience.  The primary means of eliciting samples of different styles in this approach is to vary the interlocutor, for instance, by recording the same speaker with a different interviewer or in interaction with in-group members rather than outsiders.   Evidence of style shifting in AAVE has come mainly from this second approach (e.g. Labov et al 1968, Rickford and McNair Knox 1994)–including dramatic changes in AAVE use according to audience.

Soda, Pop, or Coke? America’s First Dictionary of Dialects


  • Language columnist, lexicographer, and humorist

  • March 6, 2009 • 8:00 am PST

The Dictionary of American Regional English, a comprehensive lexicon of local language quirks, nears completion

If you’re living in a snowpocalyptic wasteland like the ice planet Hoth, Buffalo, New York, or much of the United States lately, you’ve probably shoveled some snow onto the berm.
Berm?
Oh, excuse me, depending on where you live, you may know that strip of grass between the sidewalk and street by another name, such as boulevard, devil strip, grass plot, neutral ground, parking strip, parkway, terrace, tree belt, or tree lawn.
The language of grass strips is just one of thousands of areas of American life documented in the Dictionary of American Regional English, one of the oldest and most ambitious projects in the history of American lexicography. Dedicated to capturing all terms that are not part of standard English nationwide, DARE dates from the 1960s and will finally be fully published in 2010 (though its eventual digitization promises to enrich and expand the dictionary indefinitely).
The founder and patron word saint of DARE is Fred Cassidy, an English professor at the University of Wisconsin who died in 2000. In the sixties, Cassidy was inspired by Britain’s English Dialect Dictionary and vowed to create one for the United States. This was a tall order. As even non-geography majors tend to notice, the United States is freaking huge, and its regions are larger than most countries.
To gather info, Cassidy assembled a team of 80 fieldworkers (mainly graduate students) to find informants who could answer questions on local language. Between 1965 and 1970, DARE informants from over a thousand communities (who included engineers, homemakers, storekeepers, journalists, museum staff, ranchers, postal workers, teachers, miners, truckers, barbers, librarians, loggers, students, and seamstresses) were asked the 1800-question survey, which covered a metric bazillion-load of hyper-specific subjects, such as:
“Very small insects, almost too small to see, that get under your skin and cause itching”
“What nicknames do people have around here for a small eating place where the food is not especially good?”
“When yellowish stuff comes out of a person’s ear, he has a ______”
“Words or expressions used here, where one person supposedly casts a spell over another:”
The results produced the most exhaustive look ever at regional language in America, which does more than define the terms collected: DARE is a historical dictionary, so it includes representative citations to show how the words are used. But the most unique element of DARE is the maps: proportional population maps that shrink low-pop areas and embiggen densely populated regions to give a sense of precisely where the terms are found, and how widespread they are. Some exist in only one city, while others span several regions, but as long as the words are not used from sea to shining sea, they belong in DARE.
Exhaustive documentation of regional language has rarely been a national priority, and funding problems have plagued DARE over the years. Chief Editor Joan Hall, who has worked for DARE since 1975, and her colleagues at the University of Wisconsin are giddy at the thought of finally reaching Z next year. (The previous volumes were published in 1985, 1991, 1996, and 2002, covering A-Sk.) And yet, the long delays have had an upside, as the tremendous digital resources now available have enriched the later volumes. Hall singled out wharfing as an entry that’s fuller now than if it had been published decades ago: “For the entry for wharfing, a ramp that gives access to the mow of a barn…we started with three or four anecdotal quotes, all from New England, from 1999-2005. But Google Books took us back to 1823, 1852, 1881, 1916, and 1928, and Lexis-Nexis gave us 1860. All quotes were from New England, so they greatly strengthened the regionality and showed us the scope of the usage.”
For a word-lover like myself, DARE is a kind of unholy cross between crack, the Bacon Explosion, and a rainbow made of chocolate. If words float your boat to pleasant waters too, you might just drown in the wonders of DARE. Here are some words and expressions I found for the first time in this unique word-book; they are the mere tip of the tip of the dictionary-berg:
mubble-squibble
If you thought there was only one word for a noogie, here’s a synonym from North Carolina that will liven up your childhood memoirs.
monkey’s wedding
Like dog’s breakfast, this expression (found in Maine) describes a hot mess, a real cluster-something.
discomgollifusticated
This New Englandism makes the standard discombobulated seem succinct and restrained. Just a few words down is the even more extravagant discumgalligumfricated.
cockroach killers
Found in New Jersey, this is a term for shoes that are pointy enough to go medieval on our revolting friends.
to fight one’s hat
This southwestern expression means “to struggle uselessly,” which makes sense if you’ve ever tried to pick a fight with a lid, most of which are neither pugnacious nor easily offended.
death balls
You may know them as dust bunnies, but in the dust bunny community, this southwestern Missouri term is favored since it commands more respect. The example in DARE indicates that death balls reveal not only past squalor: they foretell future death. (Note to self: clean under couch).
son of a biscuit
A Wisconsin euphemism that’s polite enough for all ages.
But if you think DARE is only useful to the chronically word-loving, think again. A surprising assortment of professions have come calling at DARE’s door, including actors doing dialect research, test-makers trying to make exams comprehensible to non-standard English speakers, and doctors who know firsthand that dialect variation can have a real effect on treatment if patients use high blood, low blood, or the sugar instead of the standard diabetes.
DARE has even taken a bite out of crime, and I don’t mean a police officer used one of the enormous volumes to bludgeon a suspect, a la Vic Mackey. Forensic linguist Roger Shuy has used DARE a few times: as a tool in puzzling out the Unabomber’s background and, later, in a kidnapping case when the kidnapper used devil strip in the ransom note. Turns out that term (one of the synonyms for tree lawn) is used mainly around Akron, Ohio, which helped investigators narrow their search.
All too often patriotism is wrapped in flags and accompanied by bombs, as if troops and wars and governments were all there is to be proud about, nation-wise. But I’m proud to be part of a discomgollifusticated nation that gives mubble-squibbles liberally. DARE is a vivid, vibrant reminder of rich regional language that is ever-changing and not going away. If you care about language that is particularly, peculiarly, and distinctly American, then son of a biscuit, it is your patriotic duty to get thee to a library (or Amazon) and give DARE a chance.


Various Lexical Resources

Lexical References

Local & Regional Lexicons & Glossaries (U.S.)

  • How to Talk San Franciscan — Find out about Bezerkely, Snob Hill, Coitus Tower, The Herpes Triangle, and more.
  • The Pittsburgh Dictionary — Learn to speak “Pittsburghese.” Learn to drop your D’s just like a real Pittsburgher: say cooun’t for couldn’t, wooun’t for wouldn’t, and din’t for didn’t.
  • CoalSpeak — A “list of words, phrases, locations, and colloquialisms commonly used in the Coal Region of Eastern Pennsylvania.”
  • American Slanguages: The Hick-to-Hip Translation Guide. Lists slang words and phrases for many cities, both American and international, including Atlanta, Boston, Chicago, Detroit, Hong Kong, L.A., New Orleans, San Francisco, and, of course, Seattle. What this page lacks in depth, it makes up for in its breadth.
  • The House of the Rising Sun Lexicon. A site dedicated to the patois of past and present residents of a shifting group-residence of young people and students (Sacramento State and CSU Chico, apparently), with the primary focus being on Sacramento and Northern California slang. An excellent compilation! — Note: As of late 2004, the parent site for the HRS Lexicon, DrinkMoreThinkLess.com, seems to have disappeared. One can only imagine that the perpetrators of the HRS Lexicon have finally grown up and would be embarrassed to have it read by employers, wives, or children, or perhaps might simply now want to have them. Purely in the interest of preservation of what is a very interesting time capsule, however, I here present links to all 12 pages of the HRS Lexicon as they still exist in the Internet Archive (you will need to click each link individually, as the internal JavaScript menu in the pages, for jumping to the separate pages, does not work): dict.html, dict2.html, dict3.html, dict4.html, dict5.html, dict6.html, dict7.html, dict8.html, dict9.html, dict10.html, dict11.html, dict12.html.

International Lexicons, Glossaries & Dictionaries

Ethnic Lexicons, Glossaries, and Dictionaries

Computer, Hacker & Web Lexicons, Glossaries & Dictionaries

  • The Microsoft Lexicon — A lexicon of terms originating from the Microsoft campus (a world unto itself). Find out what Blue Badger, BOOP, Dogfood, Facemail, Gronk, and Permatemp mean.
  • The Ultimate Silicon Valley Slang Page — Learn what Chat-Fly, Nerd Bird, and Loser Error mean.
  • A Hacker’s Glossary — A short list of hacker terms. Learn what deep magic, raster burn, and wetware mean.
  • The New Hacker’s Dictionary — The granddaddy of hacker slingo references. Learn what bag on the side, barfmail, cybercrud, feeping creaturism, field circus, fritterware, joe code, luser, paper-net, scratch monkey, security through obscurity, and war dialer mean.
  • Web-To-English Dictionary — the original Webspeak list posted at The Asylum, which is no more. Only available at one other place on the Web, that I know of (after an exhaustive search). Presented here in the interest of preservation.

Various Peculiar Lexicons and Glossaries

Other Resources

Contributions, suggestions, corrections, and emendations are welcome: send them to Steve Callihan at steve@callihan.com.
Please make sure that your return e-mail address is accurate and functioning. That way, you’ll save me from taking the time to answer your query, only to have my answer to you bounced back to me as undeliverable. I answer all e-mail queries, usually on the same day, so if you don’t receive a reply, please don’t blame me.

A Seattle Lexicon
Copyright © 1998-2001 Steve Callihan


General Chowhounding Topics

Discuss chow in general, including nationally available products, internet & mail-order, national cuisines and tips for chowhounding.

Italian-American – Regional Lexicon

I’ll post this here because I think it’s a larger issue than any one of the Northeastern US boards, plus it’s of national interest. (Hey, I’m in Indiana…)
When I’ve been in Northeastern USA Italian delis and restaurants, I’ve noticed many ways in which Italian names for foods and dishes get changed, usually shortened.
I’d be interested in seeing how many such modifications there are, and also whether we should view the modifications as carryovers from Italian dialects or instead as accommodations to English. (By accommodation to English, I mean especially the very frequent dropping of gender/number vowels from the noun endings in standard Italian). My starter examples:
MOOTS-adell or MOOTS-arell (mozzarella)
pro-SHOOT (prosciutto)
manigott (baked manicotti)
And are there exceptions? I don’t think anyone says “pizz” for pizza. (But I have heard that people pronounce “apizza” as “a-BEETZ.”) Does anyone say “spa-GETT” for spaghetti?
By Bada Bing on Jan 25, 2012 11:12 AM

44 Replies so Far

  1. I have actually heard someone from Northern Jersey say spa-GETT. I’m not sure if he was joking, but judging by his affection for gabagoo and mootz, I wouldn’t be surprised if he actually has spaghett’ night at home. Curiously, though, he likes to finish dinner with an “expresso.”
    By JungMann on Jan 25, 2012 11:21 AM
    1. Cool. But what’s gabagoo?
      I should add “pasta fa-ZOOL” to my original list.
      By Bada Bing on Jan 25, 2012 12:01 PM
      1. Gabagoo = capicolla ham
        By JungMann on Jan 25, 2012 12:27 PM
      2. There are *two* words that sound like “gabagoo”.
        Only one of the words is a pronunciation of the Italian lunchmeat, capicola … the other is a word that basically means “stuff to eat” … often desirable type food, sandwiches …. basically, good stuff to chew.
        Pronouncing the “c” sound as a “g” sound and chopping the end of the word is a very Sicilian way to speak.
        So this results in a pronunciation for the lunchmeat ham capicola as “gabagol”. (pronounced “gahba-gole”)
        The other term meaning stuff to eat, is “gabbagool”.
        In the original Godfather movie you can hear both terms actually being used in the same sentence.
        The character “Paulie” is working at the Corleone wedding and a button from a few feet away says: “Hey Paulie … I got two gabbagool … gabagol and a prosciutto!”; and throws Paulie two sandwiches, one capicola and the other prosciutto.
        By Sonny_Funzio on Jan 29, 2012 02:51 PM
  2. Never heard “pizz” or “spa-gett” but pro-shoot, manigott and moots-arell are definitely heard in the outer boroughs. A couple more:
    ravs – ravioli
    ice-cee – Italian ice
    ga-knowl – cannoli
    By EM23 on Jan 25, 2012 12:34 PM
  3. Traditionally, it’s the Southern Italian dialect (?) to drop the last vowel in many words and names. As an aside, many pizza places here in CT advertise and sometimes use in their name, A-peez, instead of “pizza.” Anyone who has heard Giada De Laurentiis say “spaghetti” can’t help but notice the stress she places on the last syllable…almost OVER stressing it.
    Yes..gabagoo is cappicola, cavadeel = cavatelli, pasta fazoo = pasta e fagioli.
    By njmarshall55 on Jan 25, 2012 12:35 PM
  4. This comment is only tangentially related. Over the summer I was in Istria–part of Croatia.
    The word for their locally made, air dried ham was Przut. Which spelled phonetically can be more-or-less approximated by: pir-ZHOOT.
    So. Totally different language, but it sounds like the italian american and/or sicilian pronunciation of prosciutto. Sometimes very old words transcend dialects, and even languages.
    By egit on Jan 25, 2012 02:58 PM
  5. When I took Italian in Rhode Island, there was a guy in my class who was so proud of the Italian he spoke and thought he’d do great in Italy. When he started dropping the ends of words, the teacher said he needed to learn how to speak “real” Italian or people would think he was stupid. But I heard all those words said by most of the Italian-Americans I knew who often did not otherwise speak any Italian…..they also knew how to swear and gesture in “Italian.”
    By escondido123 on Jan 25, 2012 03:55 PM
    1. As a linguistic matter, I would find it odd if a whole region of (southern) Italy just stopped even bothering with the gender and singular-plural markings of noun endings. It’s always seemed to me more likely to reflect the fact that English doesn’t use noun endings for gender and therefore also for singular/plural distinctions. Once you start sprinkling Italian words into English, people who actually know Italian would understand that there is no function in English for the varied endings. That’s why English speakers almost invariably say lasagna (Italian singular) when they almost invariably mean lasagne (Italian plural).
      But I truly don’t know. Maybe Sicilian and Campanian dialects really don’t observe gender/plurals marking. That teacher who said your classmate might sound stupid could mean simply that he thinks Campanians sound stupid!
      By Bada Bing on Jan 25, 2012 04:04 PM
      1. I agree. I think it is an Italian-American thing. Maybe the reason people think it is what is spoken in the south of Italy is because so many Italian-Americans came here from the south…because they were so damn poor.
        By escondido123 on Jan 25, 2012 05:31 PM
      2. Gender/plurals are used throughout Southern Italy, with varying emphases. In Calabria, they are clearly voiced, if not in a standard way. The street dialects of Naples, which have so widely informed Italian American food talk, regularly cut off or dramatically silence vowel endings, so that “Napoli” is “Napul(e)”. “Guaglione” becomes “guaglion’” and the like. Neapolitans have traditionally shaped Italian American food culture in many ways–linguistically, too. The rest of Campania varies somewhat from this metropolitan standard.
        By bob96 on Jan 27, 2012 12:09 AM
    2. That’s funny. Your teacher thought such pronunciation makes a person sound stupid.
      Well, the Calabrese side of my family might say that of my Sicilian side too.
      This sort of thing made our holidays quite, ah, festive.
      Mashing words together, specific pronunciation of certain letters, and chopping the ends off words is a very Sicilian way to speak.
      One, if not somewhat “famous”, then useful example is (again) from the Godfather movie when Luca Brasi is in Tattaglia’s bar and a guy across the bar says to him in English “I am Bruno Tattaglia” … and Luca says back, “io te conosco” … but he pronounces it in Sicilian so he says something that sounds like “eedee gohnosh”. Yup, mashed and chopped.
      Differences in language and in lots of other things too. Food for instance. You’d be surprised how much can be said at a family get together at the holidays about whether it is appropriate to put sugar in spaghetti sauce, when half your family is Calabrese and half is Sicilian.
      You know that when it gets loud and lapses into Italian, that is when Nonna (grandma) is going to get the 3-foot wooden laundry spoon to restore order.
      (Anyone else remember the laundry spoon?)
      Different dialects and customs came together to become what Americans think of as “Italian”.
      By Sonny_Funzio on Jan 29, 2012 03:36 PM
      1. Sonny, the laganatura was the tool that kept order in our house.
        When mom would say (in her dialect), ‘Mo piglia lu laganatura’ … that sent us all scampering.
        By Cheese Boy on Jan 29, 2012 10:21 PM
  6. I grew up in northern NJ and heard it both ways. Just last night my mom sent me this link about dropping vowels in parts of NYC and NJ.
    http://chowhound.chow.com/topics/418951
    By calliope_nh on Jan 25, 2012 04:10 PM
    1. Perfect background thread!
      By Bada Bing on Jan 26, 2012 04:33 AM
  7. cov-a-deal = cavatelli
    ri-gawt = ricotta
    sa-drules = cetriolo
    non-food
    sfa-cheem = fare (to do, or when I was a kid, “little doer” really meant to mean little fucker)
    kaga-zowt = codardo? (again as a kid, it was implied “shitty pants”)
    And it’s not a poor thing, or an invented Italian American thing, it is absolutely a regional dialect phenom. It’s not heard a lot in Italy anymore because of improved education, etc. Much like American affectations, such as mew-vee=movie, feew-ed=food, or southernisms, like ya’ll, or fixinto or windas (windows) or awn (on).
    By BiscuitBoy on Jan 26, 2012 07:09 AM
  8. ^^And it’s not a poor thing, or an invented Italian American thing, it is absolutely a regional dialect phenom.^^ Exactly.
    There is something to getting “sh” in there, too–e.g., there’s a bakery in Hartford, Mozzicato De Pasquale. My mom always says De Pash-quall.
    shvoyadell = sfogliatelle
    pizza freet = pizza fritte (fried dough)
    CENT’ANNI! Notice even when Sinatra used to toast it, it was more like CHEN DAHN!
    ETA: How did I forget pizza fritte?! And for non-food: baKOWzoo = bathroom (backhouse/outhouse).
    By kattyeyes on Jan 26, 2012 07:43 AM
  9. We were recently in Firenze and heard someone order a “cappooch” at a bar (coffee shop). He was served a cappuccino. Guess it happens in Italy as well.
    But as BiscuitBoy said in his post, we shorten English words and skew their pronunciations as well. It’s just part of language.
    I have to admit that since travelling in Italy a few times now, it is difficult for me to say certain food words the way I grew up saying them as an Italian American. I have to pronounce them more the way the Italians in Italy do or it sounds weird to me. Which makes me sound a bit affected when going to my local salumeria and pastry shops here where I live or with my family. It’s a very interesting topic.
    By ttoommyy on Jan 26, 2012 07:45 AM
  10. The vast majority of Italians who emigrated to the northeastern US were from southern Italy, most commonly from Sicily, Calabria and Campania. While it’s not accurate to speak of southern Italian as a single language, the various southern dialects do share some peculiarities of pronunciation. Three features, among others, that can be found in these dialects:
    1. The vowel “o” in standard Italian often becomes “u” (pronounced “oo”) in the southern dialects.
    2. Unvoiced consonants in the standard language can be voiced in the southern dialects. For example, t becomes d, k becomes g, and p becomes b.
    3. There is a tendency not to pronounce the vowel in unstressed syllables fully, but rather to pronounce it indistinctly as a schwa. From there, continuing evolution of the language in America has resulted in dropping the vowel entirely when it is at the end of a word.
    Pronouncing capicolla as “gabuhgool” illustrates all of these traits: replacement of unvoiced c by voiced g, replacement of o by u (oo), replacement of the middle unstressed i by a schwa, and dropping of the final unstressed vowel entirely.
    By cheesemaestro on Jan 26, 2012 07:57 AM
    1. Great distillation of the emerging patterns!
      By Bada Bing on Jan 26, 2012 12:59 PM
    2. I don’t believe there is a “K” in Italian. Were you referring to the pronunciation of the hard “c”, such as in capicolla?
      By RGC1982 on Jan 26, 2012 05:35 PM
      1. Right. There is no K in Italian, but there is a k sound, which can also be referred to as a hard c.
        By cheesemaestro on Jan 26, 2012 06:11 PM
    3. Thanks for making clear the points I was attempting to illustrate above. Food vocabularies and pronunciation carry the longest across generations, and with them more history and tradition than many realize. There are, indeed, reasons for “mozzarell’” that are not related to Vinnie on the corner.
      By bob96 on Jan 27, 2012 12:12 AM
  11. Having lived many years in NJ, I have heard all sorts of abominable pronunciations of words (both in and out of the Italian-American food context). Some of the bastardizations, I submit, are from two forms of “Americanizations” as well – employing familiar sounds and truncating. For example, one deli near me has counter staff that all tend to correct me when I order prosciutto (pro-shoot) or sopressata (super-sot). Then, there is the tendency for English words to be “clipped” over time – think “lab,” “fax,” or “gas.”
    By MGZ on Jan 26, 2012 08:01 AM
    1. re: MGZ
      What is “abominable” about these pronunciations? Italy was, and remains, a country of dialects, many so different one from another that mutual comprehension is difficult, if not impossible. What we know as standard Italian is a form of the Tuscan dialect that became the lingua franca after Italy was unified. The language of the media and government across Italy is standard Italian, which is taught universally in the schools. When Italians from different regions speak with each other, they will also use standard Italian, but when people speak with others from the local area, it is usually in their own dialect. The dialects are not inferior bastardizations of the standard language. They are living representations of the language in their own right, and they have evolved over time in different directions. That evolution has continued among Italian-Americans whose families have lived abroad for a hundred years or more. Think about how American English has changed from British English in pronunciation, vocabulary, etc. Should Americans consider that their form of English is a bastardization and an abomination compared to what is spoken in the UK?
      By cheesemaestro on Jan 26, 2012 08:44 AM
      1. “Think about how American English has changed from British English in pronunciation, vocabulary, etc.”
        That was basically my point. The words are pronounced in neither their American English form, nor any clearly recognizable Italian form. Instead, they are articulated by English only speakers with a momentary, exaggerated accent employing “sounds” from their everyday English usage. To me, it’s funny because it’s phony and yet subscribed to as though it lends authenticity.
        My apologies if I was a bit hyperbolic in my word choice.
        By MGZ on Jan 26, 2012 09:13 AM
      2. A similar pattern of language variation applies in Germany. The “standard” dialect (used by newscasters and college teachers and so forth) actually is locatable to an area in the north-center, roughly around Göttingen. When I learned pretty good standard German, boy was I surprised to find that I couldn’t understand the Bayrisch dialect from the South nor the Alemannisch (from the Southwest). Indeed, when I was there in the late 1980s, most Germans could discern nuances of dialect from just 10 or 20 miles down the road.
        By Bada Bing on Jan 26, 2012 01:05 PM
      3. I am not so convinced that this is a regional issue. Many years ago, when I had the opportunity to study Italian in school, my mother and her sisters almost didn’t recognize the Italian I was learning, yet Italian was their first language. They were native speakers, and they had to learn English when they got to school in New York. I am sure that the New York Diocese taught quite a few native Italian speakers up to and during the 1930s. IMO, this is a class issue, perhaps more than a regional differentiation. They referred to the proper Italian I was learning, found in books and newspapers, as “Alt-Italiano”, or “High Italian”, i.e., the language of the upper class and the educated. My family were farmers and laborers, so they were clearly not of the upper class. Now, when you consider that most immigrants from about the turn of the 20th century through pre-WWII were running from poverty and lack of opportunity in Italy, and were usually uneducated, and then consider the tendency to “Americanize” words, you can easily see how many words are mispronounced so commonly. It is the same way in other languages too, including English.
        By RGC1982 on Jan 26, 2012 05:44 PM
  12. I think there is a huge long thread elsewhere on CH about this thing.
    By rockandroller1 on Jan 26, 2012 08:07 AM
    1. Is it possibly the one calliope_nh suggested above?
      http://chowhound.chow.com/topics/418951
      By Bada Bing on Jan 26, 2012 01:01 PM
  13. can’t forget:
    shka-tawl – escarole
    and the ever popular, but misunderstood:
    fon-guwl – fare culo, literally, to do in the ass
    By BiscuitBoy on Jan 26, 2012 11:17 AM
    1. vaffanculo!
      or
      vattelo a pigliare in culo!
      LOL
      By Novelli on Jan 26, 2012 01:28 PM
      1. Or the more polite “Vaffanapoli”.
        By bob96 on Jan 27, 2012 12:15 AM
        1. re: bob96
          Which is “go to Naples.” A slur if ever there is one from anyone north of Naples (or so those northerners believe).
          By ttoommyy on Jan 27, 2012 05:52 AM
  14. Hazleton, PA, has a thick-dough, tomato-topped bread called “pitz”, the kind of specialty locals take on the plane back to their new homes out west.
    Where the hell did southern NJ get the panzerotti???? I only know of it there.
    By Passadumkeg on Jan 26, 2012 11:17 AM
  15. Residing in the boroughs of NYC and constantly being surrounded by Italian neighbors, I think I can say that I have heard it all. Dialects from the beloved regions of Sicily, Calabria, Naples, and Bari were those most popular and those most heard around me. Nothing surprises me in pronunciation any more, but I do often chuckle at the “similarities” in pronunciation we all share. I used to get uncomfortable having to decipher what was just spoken to me in conversation, but now I find it just adds to the fun of the moment. As a fellow Italian, I take great pride to say that it’s great to hear Italians speak their native language here in the United States. *** I just wish there was more of it — dialect or otherwise. ***
    Wanted to add that there is an interesting wiki entry explaining the origin of the Buca di Beppo name. It fits right in with the format of this thread. ; > ) I can envision a Sicilian guy sayin’ this right now. LOL.
    By Cheese Boy on Jan 26, 2012 01:37 PM
  16. Both of my grandparents came from the same little town in Italy (in the province of Benevento; today there are about 1200 people living there)
    My grandfather was never educated and learned to read and write here in the US (he was 15 when he got here)
    Their dialect was similar to what has been turned into what I call “Italian-American Speak” for many different words. Even though our last name ends in a vowel, the pronunciation of that vowel has gone from an unstressed “eh” at the end of the name to being totally unpronounced by anyone in our family
    “Sah la mun eh” has turned into “SAYL uh MOAN”
    Many “c’s” were pronounced with a “g” such as gabagooleh (for capicolla) with the “eh” being nearly silent
    Another word that my grandparents used to say was “buh-zil-igo” stress on the “buh” (or at least that’s how it sounded to me) yeah, that’s basil = basilico
    I remember being in elementary school at lunch, one of the kids asked what I was eating. (very large green squash, cooked with tomatoes, onions, garlic and basil) I told the kid “gah-goots” because that’s how as a kid, I processed the Italian version of what my grandparents called cucuzza. (He tattled and told the lunch monitor I said I was eating guts for lunch)
    Kids eating peanut butter and jelly or bologna sandwiches thought I ate “weird” stuff, so telling someone you’re eating “gah-goots” probably doesn’t help
    I digress… while I don’t really LIKE the Americanized version of “poor people” Italian, I can understand where it comes from. Besides, who wants to walk around sounding like Giada De Laurentiis all the time with her over-pronunciation! (spah-GET-tee)
    By cgarner on Jan 27, 2012 06:01 AM
    1. There was a tremendous Arabic influence on southern Italian pronunciation, and you hear similar sounds in Portugal and southern Spain. Sicily in particular was a crossroads for every kind of language, and its Italian reflects its remarkable place in history as an early “melting pot.”
      By teezeetoo on Jan 27, 2012 07:09 AM
      1. My mom’s family is from Siracusa Sicily and their last name is an Arabic word for “arrogant” (Salafia)
        Siracusa was the gateway for all kinds of cultures coming into Italy. There’s Greek, Spanish, Egyptian, Germanic and Norman presences all over the old city. (it’s really quite beautiful there)
        Her family, in contrast to my Dad’s, was well-to-do and educated. They prided themselves on speaking “Proper” Italian. Though when my Mom went to Rome with friends, her “proper Italian” was called into question, when a jewelry store owner asked her if she was Sicilian. Apparently the dialect was still evident to a native Roman.
        By cgarner on Jan 27, 2012 07:23 AM
    2. “kids eating peanut butter and jelly or bologna sandwiches thought I ate “weird” stuff, so telling someone you’re eating “gah-goots” probably doesn’t help”
      LOL I am literally at work laughing out loud over this. I bet most of us first or second generation Italian Americans can tell a story along these lines. Thanks for the laugh and memories.
      By ttoommyy on Jan 27, 2012 07:20 AM
      1. My Mom can! ttoommy, she’s first generation and they moved from Philadelphia into the middle of farm lands of Bucks County.
        Her grandparents lived with her and her Grandmother would pack her lunch. Imagine being 12 years old and your lunch is a focaccia with anchovies and garlic and rosemary on it
        sure it was delicious, but it smelled like, well, anchovies and garlic
        By cgarner on Jan 27, 2012 09:30 AM
        1. Had a few of these experiences growing up, but there was one sandwich I did not like at all and I hid it under my chest of drawers for almost a week so I wouldn’t have to eat it. It was La Poveretta on two pieces of Wonder Bread. My mom made it even worse because she made hers with red wine vinegar. Awful thing for a young kid to eat. Here’s a recipe: http://www.cooks.com/rec/view/0,1850,142164-236202,00.html
          Also, be reminded that we Italians are not alone in this circle of ridicule for what we eat. Does anyone recall the reference to Moose Caca ? Here it is at 3:03 or so.
          Look –> http://www.youtube.com/watch?v=vS1aFV…
          By Cheese Boy on Jan 27, 2012 12:07 PM
    3. But in a way, both pronunciations share the same trait – that attempt at temporary accent. I mean, who wouldn’t be amused by a fourth or fifth generation American of Welsh ancestry adopting an accent to articulate the words “bangers” or “pudding?”
      By MGZ on Jan 27, 2012 07:28 AM
  17. another item of the Italian-American lexicon …
    With some old folks … when it comes to spaghetti … sauce is “gravy” and the pasta noodle, almost regardless of type is referred to as “macaroni”.
    By Sonny_Funzio on Jan 29, 2012 03:58 PM
American English Regional Speech and Dialect Samples
(See also Dialects in American English and A Dialect Map of American English)

Regional Variations on Arthur the Rat (standard speech-testing passage)



Speech Examples from the New England Region


The Mid-Atlantic Region

The State of Pennsylvania

The Midwest


Examples from the Southern Region


The Mountain States

The American Southwest

The Pacific Northwest, Alaska and Hawaii

Spoken Language Resources from Academic Web Projects

Oral History, Folklore and Other Live Speech Samples

Slang-Related Web Links (non-region-specific)

General Web Resources on Spoken American English




Last Updated and All Links Checked on 23 October 2010

Frog Hair to Woolies: Dust Bunnies by 173 Other Names

Regional Dictionary Travels America; New England’s Willywags

By RYAN SAGER

What do you call those soft rolls of dust that collect on the floor under your bed? Many people know them as dust bunnies. But in parts of the Northeast, you’d call them dust kitties; in the South, house moss; in Pennsylvania, you might call them woolies.
There are, in fact, at least 174 names by which Americans call these bits of fluff, including bunny tails, frog hair, cussywop, woofinpoofs and—perhaps most evocatively—ghost manure.

That we can identify these words today is largely a testament to the vision of one man: Frederic Cassidy, a professor of English at the University of Wisconsin in Madison who conceived the Dictionary of American Regional English (known as DARE) in a 1962 speech to the American Dialect Society.
Mr. Cassidy died in 2000, at the age of 92, having made it to “O” in his quest to catalog American English in all its rip-staving (that is Ozarkian for rip-roaring) regional diversity. His tombstone bears a simple inscription: “On to Z!”
“That was his rallying cry for about the last decade of his life,” said Joan Houston Hall, 65, who joined DARE in 1975 and took over as its chief editor after Mr. Cassidy’s death. In March, Harvard University Press will publish the Dictionary’s Volume V, finishing off the alphabet with slab through zydeco, nearly half a century after the first fieldworkers fanned out in “Word Wagons” to 1,002 communities across America, administering a 1,600-item questionnaire to sometimes-suspicious, often-perplexed locals.
The fruits of their labors have been a feast for the lexicographically inclined ever since. What does a patient in the South mean when he complains of dew poison? What does a waitress in California mean when she offers you coffee and snails? Where would you go if a New Englander directed you to the willywags?
(Answers: The patient has a rash on his feet or legs. The waitress is offering you cinnamon rolls with your cup of joe. The New Englander means what others might call the boonies.)
As the repository of answers to such questions (the dictionary contains nearly 60,000 entries and is the only project of its type that is national in scope), the folks at DARE have long acted as a clearinghouse for all sorts of odd requests, by everyone from doctors to dialogue coaches to presidents.
Ms. Hall remembers a call she took in the early 1990s from a lawyer whose client had called a former girlfriend a mud flap. Could the phrase be used as a term of endearment?
[Photo: Joan Houston Hall is chief editor of the Dictionary of American Regional English. Credit: Associated Press]
“I could neither confirm nor deny,” said Ms. Hall, who searched DARE’s archives but found nothing. “It was only years later, driving down the highway behind a big truck, that I realized he may have been referring to those curvaceous silhouettes you see,” Ms. Hall said. “So, I suppose that could be complimentary.”
In 1992, a member of President George H.W. Bush’s staff called on Mr. Cassidy when the president baffled reporters by calling an argument over who had run the first negative ad of the campaign a case of “who shot John.” Mr. Cassidy found that the term originated with a children’s game; an Iowa variant is “who shot the bear,” and in southern Appalachia, who-shot-John is slang for corn whiskey, primarily moonshine.
The next year, reporters rang DARE when President Bill Clinton said a critic didn’t know him “from Adam’s off-ox.” The phrase turned out to be common west of the Appalachians, meaning, “he doesn’t know anything about me.”
DARE has even been used to solve crimes. Roger Shuy, a retired forensic linguist, recounted the case of a child abduction in which the kidnapper left a note demanding ransom of $10,000, directing: “Put it in the green trash kan on the devil strip” at the corner of two streets.
The kidnapper tried to disguise his education with “kan” (elsewhere spelling “precious” correctly), but devil strip is a term for the strip of grass between the sidewalk and the roadway, one used solely in a small area around Akron. When law enforcement’s suspect list included just one educated man from Akron, the police got a confession.
Some linguists worry that television and the Internet will wash away America’s diverse regional vocabulary. The Subway sandwich chain, for instance, is eroding regionalisms like grinder (New England), hero (New York City), hoagie (Pennsylvania and New Jersey), zep (southeastern Pennsylvania) and spucky (Boston).
But new regionalisms are being minted. Relatively new (that is, in the last 40 years), the term skeevy has arisen primarily in Connecticut, New York and New Jersey to describe something gross or dirty. Out of Northern California, there has been hella, used as an intensifier, as in “that’s hella cool.”
[Photo: The project was started by Frederic Cassidy, shown in 1949. Credit: University of Wisconsin-Madison]
Indeed, while some might find tweet-speak hella skeevy, it looks like the future of discovering regionalisms is online.
A paper from Carnegie Mellon University in 2010 looked at regionalisms on Twitter, using geo-tagged posts. The authors found that while Northern Californians were hella tired, New Yorkers were deadass tired. And while sumthin’ means something in most cities, it is suttin’ in New York City.
Erin McKean, founder of online dictionary Wordnik, and a member of the DARE advisory board, said that Internet subcultures will increasingly be sources of new words. She points to a book, “Slayer Slang,” which cataloged the jargon of online fans of the “Buffy the Vampire Slayer” movie and TV show (e.g., slayage and Buffyverse).
“These words give us a sense of kinship and belonging,” she said. “It doesn’t matter if we live online all the time.”
While DARE isn’t ready to add the Buffyverse to its roster of regions, it is launching a digital version in 2013. Beyond that, Ms. Hall would like to field an updated language survey, which would be partly conducted online.
Ms. Hall said that she now has a new motto, paying homage to Mr. Cassidy (by way of Dr. Seuss): “On Beyond Zebra!” In the spirit of Volume V, Mr. Cassidy would surely be suffancified.

…and I am Sid Harth@topcogitoergosum.com
