Qualitative acoustic analysis in the study of motor speech disorders

Julie M. Liss
Department of Communication Disorders, 115 Shevlin Hall, University of Minnesota, Minneapolis, Minnesota 55455

Gary Weismer
Department of Communicative Disorders and the Waisman Center, University of Wisconsin, Madison, Wisconsin 53705

(Received 9 March 1992; accepted for publication 29 July 1992)

Traditional measurements performed on the acoustic signals of normal speech are frequently used to quantify the acoustic characteristics of disordered speech as well. This letter demonstrates how important aspects of speech production deficits in motor speech disorders may be overlooked if stringent quantification procedures are employed, especially in the stage of exploratory data analysis. It is suggested that qualitative procedures, wherein phenomena are inferred from visual examination of certain acoustic displays, are useful to supplement traditional measurements, and moreover, that they be used to point to the types of measurements that should be made in the finer-grained stages of quantitative analysis.

PACS numbers: 43.70.Dn

INTRODUCTION

We have asserted elsewhere that traditional acoustic measures of temporal and spectral characteristics of normal speech may not necessarily reveal the inherently "important" aspects of disordered speech production (Weismer and Liss, 1991; see also Westbury, 1991). By "important," we mean those characteristics of the acoustic signal that are likely to reflect aspects of disordered sensorimotor control and/or perceptual phenomena that will play a prominent role in a theory of motor speech disorders. We have also argued that because the traditional parametric approaches necessarily exclude idiosyncratic and aberrant speech production from the analyses, and in fact may be compromised by the variability, much potentially revealing information is sacrificed (Sussman et al., 1988). The purpose of this letter is to describe how the inclusion of a qualitative level of acoustic analysis can elucidate potentially "important" phenomena in the disordered production of vocalic segments, and how this can lead to theoretical notions that yield testable hypotheses in both the quantitative and qualitative domains. By "qualitative" level of analysis, we mean a visual inspection and interpretation of acoustic data, especially involving repetitions of utterances by individual speakers. The examples described in this letter were obtained in the context of a larger investigation of the acoustic effects of contrastive stress¹ among normal geriatric men, and subjects exhibiting apraxia of speech and ataxic dysarthria (Liss and Weismer, submitted). Whereas the current examples involve qualitative analyses of formant trajectories associated with vocalic nuclei, the analysis approach might be applied to any set of acoustic or kinematic data (see Cooke and Brown, 1986, for a discussion of a similar method in research on limb motor control, and Corcos et al., 1990, for examples of such analyses).

J. Acoust. Soc. Am. 92 (5), November 1992

I. METHODS AND PROCEDURES

For the larger study, subjects included four each of control (C), apraxic (A), and ataxic dysarthric (D) speakers. Detailed descriptions of these subjects can be found in other published works (e.g., McNeil et al., 1990). For the purposes of demonstration, we will examine data from one normal male subject (age 63), and two male speakers with apraxia of speech (age 54 and 62) who were free of significant concomitant dysarthria or aphasia.

Subjects produced phrases and sentences following tape-recorded stimuli. These productions were recorded on cassette tape using high fidelity equipment. The two utterances analyzed in this investigation, "buy Bobby a poppy," and "build a big building," were produced five times each in two conditions (for a total of 40 test utterances per subject). These were randomly distributed among other sentences and phrases. Two speaking conditions were utilized in the larger study: neutral and contrastive stress. Each sentence was produced in a neutral condition, for which no specific directions about stress placement were given, and in a condition in which each of the content words was contrastively stressed.

The quantitative acoustic analysis of the larger investigation consisted of measurement of segment and utterance durations, and measurement of formant transition characteristics. Here we will discuss only formant transition characteristics for /aɪ/ ("buy") and /ɪl/ ("build"). These segments were selected because they are associated with relatively large and complex changes in articulatory configuration that are reflected most notably in the trajectory of the second formant frequency (F2). We use the term "trajectory" to refer to the time-frequency path of the center of the formant band across the entire duration of the vocalic nucleus. Transition duration (TD, in ms) and transition extent (TE, in Hz) of the most rapidly changing portions of the trajectories were measured using a Kay DSP 5500 workstation using a wide-band (300 Hz) spectrographic display (0-4 kHz scale expansion) and the associated waveforms (see Weismer and Liss, 1991). Slope values were calculated from these measures (TE/TD). Slope values were not calculated for productions that did not contain the expected trajectory shape [e.g., Fig. 1(b), production #2 does not contain a 20-Hz change in any 20-ms segment, which is our operational definition of "flat" (see Weismer et al., 1988)]. Inter- and intrajudge reliability was acceptable for all quantitative measures (Liss and Weismer, submitted).

In the qualitative phase of the analysis, multiple trajectory tracings were superimposed and visually examined to identify and describe patterns and phenomena. This technique was originally designed to accommodate formants that were excluded from quantitative assessment because they could not be measured according to our operational definitions. Such formants typically corresponded with aberrant productions of the vowel that did not yield the expected trajectories for the segments of interest. In the figures shown here, there are five trajectories plotted per panel, corresponding to five repetitions of a particular utterance/condition. Mean slope values (from the quantitative analysis) and a transcription of each of the syllables containing the plotted formant trajectories are also included in the figures. The transcriptions are included only to describe the general segmental characteristics of the syllables, and not as indices of the normality or aberrancy of the productions.

The trajectories shown in this paper (Figs. 1 and 2) are used to illustrate how the qualitative procedure can be used to identify and catalogue phenomena at the individual subject level of analysis. Our point is that this kind of qualitative examination can expose phenomena that must be explained in theoretical accounts, and in fact takes advantage of intra- and intersubject variability as objects of theoretical interest, as opposed to a view wherein that variability is an obstacle to successful statistical treatment of the data.

0001-4966/92/112984-04$00.80 © 1992 Acoustical Society of America

II. RESULTS AND DISCUSSION

A. Temporal translocation and gesture scaling

Figure 1(a) displays five F2 trajectories for the segment /ɪl/ (from "build") produced by a control subject, and Fig. 1(b) shows F2 trajectories for the same segment produced

[Figure 1: superimposed F2 trajectory tracings. Vertical axis: frequency (ticks at 800, 1300, and 1800 Hz); horizontal axis: TIME (ms). Panel (a), CONTROL, "BIG": slope (N = 5), mean = 9.62 Hz/ms, SD = 1.48. Panel (b), APRAXIC, "BIG": slope (N = 4), mean = 5.23 Hz/ms, SD = 0.81. Per-token transcriptions not recoverable.]

FIG. 1. (a) Five F2 trajectories for the syllable nucleus /ɪl/ produced in the "BIG" stress condition by a neurologically normal subject; (b) five formant trajectories for the same syllable and stress condition produced by a subject with apraxia of speech.

[Figure 2: superimposed F2 trajectory tracings, same axes as Fig. 1. Panel (a), APRAXIC, "BUILD": slope (N = 5), mean = 5.48 Hz/ms, SD = 1.58. Panel (b), APRAXIC, "BIG": slope (N = 5), mean = 4.22 Hz/ms, SD = 1.74. Per-token transcriptions not recoverable.]

FIG. 2. (a) Five F2 trajectories for the syllable nucleus /ɪl/ produced in the "BUILD" stress condition by a subject with apraxia of speech; (b) five formant trajectories produced by the same subject in the "BIG" stress condition.

by an apraxic speaker. All of these trajectories were taken from utterances in which the word "big" was contrastively stressed.

Using these two sets of trajectories, it is possible to demonstrate how traditional measures of formant slope and segment duration can under-represent important differences between these sets of formants. Consider that the mean slope value of the steepest portions of the trajectories from the normal speaker [Fig. 1(a)] is 9.62 Hz/ms (s.d. = 1.48). Compared to the mean slope (5.23, s.d. = 0.81) of four of the trajectories produced by the apraxic speaker in Fig. 1(b), we can conclude that the trajectories of the control speaker are steeper. Further comparison reveals that the transition durations of the apraxic subject are also generally greater than those of the normal speaker. Thus, from this traditional quantitative approach, we can infer that the major articulatory gesture² for /ɪl/ was produced at a slower rate and over a greater period of time, as compared to the gesture for the normal speaker.

This conclusion is consistent with other studies of the articulatory deficit in apraxia of speech (cf. Kent and Rosenbek, 1983), and one could be inclined to stop here. However, the conclusion of slowness and the possible lengthening of the major articulatory gesture does not capture all that is unusual about the trajectory repetitions. Despite the general similarities among the most rapidly changing portions of the trajectories, there are substantial differences among the durations of the segments that precede them. In Fig. 1(a), the segment of the trajectory preceding the downward sloping portion is relatively brief and topologically uniform³ across four of the five tokens. This is not true for the repetitions of the apraxic speaker [Fig. 1(b)] where the trajectories appear to be "translocated" across the time axis. Specifically, these trajectories generally end in the expected downward slope, but the durations of the flat portions preceding the downward segments range from approximately 25 to nearly 200 ms. These plateaus would be consistent with periods of relative articulatory immobility, and must be explained in any comprehensive theory of motor speech disorders. The point here is that without visual examination of these superimposed formant trajectories, one would not think a priori to measure the time preceding the downward sloping portion of the trajectories. Thus, the qualitative step of visual examination of the disordered trajectories reveals a phenomenon that lends itself easily to quantification; that is not apparent from the tracings of a normal speaker; and that may be associated with underlying issues of motor control.

Another example can be found in Fig. 2(a). Note that four of the trajectories have essentially the same starting frequency (around 1300 Hz). Nevertheless, the onsets of the downward sloping transition in this set of trajectories (the "temporal translocation" of the transitions) are highly variable, apparently as a consequence of variability in gesture scaling following release of the /b/. These gesture scaling difficulties, as evidenced here by large variations in the magnitudes of the initial rising frequency swings along the F2 trajectories, should be associated with variations in the onset time of the major articulatory gesture (i.e., the transition). In other words, temporal translocation of transitions is probably related to a gesture scaling problem that is expressed in the form of variability of the gesture magnitude. A direct quantitative prediction of this view is that hyperscaled gestures (those with large initial swings in F2) should be associated with later-occurring onsets of the major articulatory gesture. Examination of the five trajectories in Fig. 2(b) generally supports this idea. Here again, the development of such predictions is based largely on visual inspection of the superimposed trajectories, and would not emerge from strict adherence to traditional quantification.

B. Articulatory decomposition/segmentalization

The preceding examples illustrate that observations from qualitative analysis (temporal translocation) can lead to testable hypotheses about the underlying mechanism (gesture scaling). Taking this one step further, the qualitative analysis can also point to hypotheses that bear directly on the development of theories of motor speech disorders. The scaling problem described above and its possible relationship to temporal translocation of transitions may also be related to, or be a byproduct of, articulatory decomposition (or, segmentalization) of speech. This phenomenon is thought to be characterized by a reduction in the degree of overlap between successive articulatory gestures (Weismer and Liss, 1991), resulting in speech acoustic waveforms that reflect reduced coarticulation or coproduction (i.e., evidence of motor control deficits). Articulatory decomposition has been observed previously for both apraxic and dysarthric speakers (Kent and Rosenbek, 1983; Weismer and Liss, 1991; Weismer et al., 1992), and is likely to be largely responsible for the perception of "scanning speech" (reduced segment duration contrasts) in these disorders (e.g., Ziegler and von Cramon, 1986). Compare the productions of an apraxic speaker shown in Fig. 2(a) and (b) to those of the control speaker in Fig. 1(a). Four of the five F2 values at the onset of the normal trajectories, or at the point where the consonant constriction is just released into the vocalic segment, range between 1600 and 1700 Hz. This is a reasonable "target" value for /ɪ/ produced by a geriatric male in this phonetic context. Four trajectories show a brief steady state (about 30 ms) around these frequencies, followed by the characteristic falling transition. In contrast, seven of the ten /ɪl/ trajectories produced by the apraxic speaker [Fig. 2(a) and (b)] have starting frequencies that are substantially lower than expected for /ɪ/, due to the failure to position the tongue in a high-front position during the period of vocal tract closure for the word-initial /b/. At the release of the /b/, then, the tongue must move to the required high-front position, a gesture reflected in several of these trajectories as the initial rising portion of the F2 trajectory. These various attempts result in different scalings of the gesture, as well as the phenomenon identified above as "temporal translocation" of transitions.
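As a rough illustration of how the plateau that precedes the falling transition might be quantified once visual inspection has flagged it, the following Python sketch applies the 20-Hz-in-20-ms criterion from the Methods to a sampled F2 trajectory and computes a slope as TE/TD. The function names, the 10-ms sampling interval, and all trajectory values are hypothetical and are not data from this study; the sketch is not the measurement procedure that was carried out on the spectrographic displays.

```python
from typing import List, Optional, Tuple

# A trajectory is a time-ordered list of (time in ms, F2 in Hz) samples.
Trajectory = List[Tuple[float, float]]


def is_flat(traj: Trajectory, min_change_hz: float = 20.0,
            window_ms: float = 20.0) -> bool:
    """True if no two samples at most window_ms apart differ by min_change_hz or more."""
    for i, (t0, f0) in enumerate(traj):
        for t1, f1 in traj[i + 1:]:
            if t1 - t0 > window_ms:
                break
            if abs(f1 - f0) >= min_change_hz:
                return False
    return True


def falling_onset_ms(traj: Trajectory, min_change_hz: float = 20.0,
                     window_ms: float = 20.0) -> Optional[float]:
    """Start time of the first window_ms span that contains a fall of
    min_change_hz or more; None if the trajectory never falls that fast."""
    for i, (t0, f0) in enumerate(traj):
        for t1, f1 in traj[i + 1:]:
            if t1 - t0 > window_ms:
                break
            if f0 - f1 >= min_change_hz:
                return t0
    return None


def slope_hz_per_ms(te_hz: float, td_ms: float) -> float:
    """Transition slope as transition extent over transition duration (TE/TD)."""
    if td_ms <= 0:
        raise ValueError("transition duration must be positive")
    return te_hz / td_ms


# Hypothetical tokens sampled every 10 ms (illustrative values only).
control_like = [(0, 1650), (10, 1650), (20, 1650), (30, 1650),
                (40, 1560), (50, 1470), (60, 1380), (70, 1290)]
apraxic_like = [(t, 1300) for t in range(0, 160, 10)] + \
               [(160, 1250), (170, 1200), (180, 1150), (190, 1100)]
flat_token = [(t, 1300) for t in range(0, 200, 10)]

for name, token in [("control-like", control_like),
                    ("apraxic-like", apraxic_like),
                    ("flat", flat_token)]:
    print(name, "flat:", is_flat(token), "fall onset (ms):", falling_onset_ms(token))
```

Because the onset is reported as the start of the first qualifying 20-ms window, it can precede the visible bend in the tracing by up to one window length. Comparing onset times across repetitions of one speaker would give one possible index of the temporal translocation described above.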

C. Theoretical considerations

In our experience, the kinds of phenomena described in this paper are common in the speech of individuals with motor speech disorders. We have shown how a qualitative analysis procedure (i.e., visual examination of superimposed trajectories) can lead to theoretical perspectives on the nature of the speech production deficit in motor speech disorders and yield some testable (and quantifiable) ideas. Specifically, among the articulatory aberrancies identified here (temporal translocation of transitions, inappropriate scaling of gestures, articulatory decomposition/segmentalization), we postulate that the first two are probably byproducts of the last. One possibility is that segmentalization of articulatory gestures induces compensatory responses on the part of the speaker that lead to variations in gesture scaling, and thus temporal translocation of transitions.

This is an attractive theoretical notion for at least three reasons. First, it attempts to unify the explanatory apparatus of various spatio-temporal articulatory difficulties in some motor speech disorders by means of a general, well-attested phenomenon (i.e., segmentalization). Second, it fits in with certain contemporary accounts of successive articulatory overlap (namely, the notion of gesture sliding and blending described by Saltzman and Munhall, 1989) that might permit theoretical analysis of motor speech disorders within the framework of a theory of normal speech production. Third, as noted above, it suggests certain empirical tests that are amenable to quantitative analysis.

The full potency of qualitative analyses likely will be realized when such attempts are guided, but not limited, by theoretical considerations. In efforts to identify the potentially "important" acoustic measures in motor speech disorders, we regard the qualitative procedure as indispensable.

ACKNOWLEDGMENTS

This research was supported by a University of Minnesota Graduate School Grant-in-Aid of Research, Artistry, and Scholarship, and NIH Grant No. NS 18797. We extend our gratitude to Karen Forrest for her thoughtful comments on an earlier version of this paper. We also wish to acknowledge the laboratory assistance provided by Kristin Little and Stephanie Hanson. Address reprint requests to Julie Liss, Ph.D., Department of Communication Disorders, 115 Shevlin Hall, University of Minnesota, Minneapolis, MN 55455.

Cooke, J. D., and Brown, S. H. (1986). "Science and statistics in motor physiology," J. Motor Behav. 17, 489-492.

Corcos, D. M., Agarwal, G. C., Flaherty, B. P., and Gottlieb, G. L. (1990). "Organizing principles for single-joint movements. IV. Implications for isometric contractions," J. Neurophysiol. 64, 1033-1042.

Kent, R. D., and Rosenbek, J. C. (1983). "Acoustic patterns of apraxia of speech," J. Speech Hear. Res. 26, 231-249.

Liss, J. M., and Weismer, G. (submitted). "Acoustic characteristics of contrastive stress production in control geriatric, apraxic, and ataxic dysarthric speakers," submitted to Clin. Linguist. Phon.

McNeil, M. R., Liss, J. M., Tseng, C-H., and Kent, R. D. (1990). "Effects of speech rate on the absolute and relative timing of apraxic and conduction aphasic sentence production," Brain Lang. 38, 135-158.

Saltzman, E. L., and Munhall, K. G. (1989). "A dynamical approach to gestural patterning in speech production," Ecol. Psychol. 1, 333-382.

Sussman, H. M., Marquardt, T. P., MacNeilage, P. F., and Hutchinson, J. A. (1988). "Anticipatory coarticulation in aphasia: Some methodological considerations," Brain Lang. 35, 369-379.

Weismer, G., Kent, R. D., Hodge, M., and Martin, R. (1988). "The acoustic signature for intelligibility test words," J. Acoust. Soc. Am. 84, 1281-1291.

Weismer, G., and Liss, J. M. (1991). "Acoustic/perceptual taxonomies of disordered speech," in Dysarthria and Apraxia of Speech: Perspectives on Management, edited by C. Moore, K. Yorkston, and D. Beukelman (Brookes, Baltimore), pp. 245-270.

Weismer, G., Martin, R., Kent, R. D., and Kent, J. F. (1992). "Formant trajectories in men with amyotrophic lateral sclerosis," J. Acoust. Soc. Am. 91, 1085-1098.

Westbury, J. R. (1991). "On the analysis of speech movements," J. Acoust. Soc. Am. 89, 1870(A).

Ziegler, W., and von Cramon, D. (1986). "Disturbed coarticulation in apraxia of speech: acoustic evidence," Brain Lang. 29, 34-47.

¹Contrastive stress is a common therapy technique whereby a speaker is taught to emphasize the content or important words in an utterance to enhance intelligibility and message transfer.

²By "major articulatory gesture" we mean the part of the vocalic nucleus that demands the most extensive (and perhaps the most rapid) change in vocal tract configuration to produce the desired sound sequence. The acoustic correlate of this is the operationally defined transition portion of the formant trajectory.

³"Topologically uniform" implies a consistent trajectory shape across repetitions. The implication is not that the specific frequency-time coordinates are identical or nearly so across repetitions, but that the different trajectories could be translated up or down the frequency scale (or, in some cases, the duration scale) with the resulting superimposition showing a minimum variability for a particular portion of the trajectory. In normal speakers, the amount of frequency or time scale translation required to demonstrate topological uniformity is typically very minor. Trajectories that are not topologically uniform, by this definition, would not show a reduction of the trajectory variability when this scale translation is performed.
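The notion of topological uniformity given in footnote 3 could be approximated numerically along the following lines. This Python sketch is a minimal illustration under several assumptions that are not part of the original analysis: the repetitions are sampled on a common time grid, only translation along the frequency scale is attempted (translation along the duration scale is omitted), each repetition is shifted by its own mean frequency, and variability is summarized as the across-repetition standard deviation averaged over time points. The function names and trajectory values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# Each repetition is a list of F2 values (Hz) on a shared time grid.
Repetitions = List[List[float]]


def spread_hz(reps: Repetitions) -> float:
    """Across-repetition standard deviation, averaged over time points."""
    return mean([pstdev(column) for column in zip(*reps)])


def translate_to_common_level(reps: Repetitions) -> Repetitions:
    """Shift each repetition along the frequency scale by removing its own mean."""
    return [[f - mean(rep) for f in rep] for rep in reps]


def spread_before_and_after(reps: Repetitions) -> Tuple[float, float]:
    """Variability of the superimposed tracings before and after translation."""
    return spread_hz(reps), spread_hz(translate_to_common_level(reps))


# Hypothetical tokens (illustrative values only).
# Same shape at three frequency levels: uniform after translation.
same_shape = [[1650, 1650, 1500, 1350],
              [1700, 1700, 1550, 1400],
              [1600, 1600, 1450, 1300]]
# Falling portion begins at different times: shape differs across tokens.
translocated = [[1300, 1300, 1200, 1100],
                [1300, 1300, 1300, 1200],
                [1300, 1200, 1100, 1000]]

for name, reps in [("same shape", same_shape), ("translocated", translocated)]:
    before, after = spread_before_and_after(reps)
    print(name, "spread before:", round(before, 1), "after:", round(after, 1))
```

In this toy case the same-shape set collapses onto a single curve after translation, whereas the translocated set retains residual variability that a frequency shift alone cannot remove. A fuller treatment would also search over shifts along the time axis, as the footnote allows.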