IPO perceptual study of intonation (mind map)
This mind map collects reading notes on the IPO (Institute for Perception Research) perceptual study of intonation.
Edited 2024-03-22 13:26:53
A perceptual study of intonation: an experimental-phonetic approach to speech melody
“Introduction”
“1.1 A dilemma”
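A metalanguage is needed in order to talk about the phenomenon. Its vocabulary should consist of suitable descriptive units for referring to entities and structures at various levels of observation.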
“one needs a metalanguage in which one can talk about the phenomenon. The vocabulary of such a metalanguage should consist of suitable descriptive units with which one can refer to entities and structures at various levels of observation.” (p. 3)
Speech melody varies continuously and spans linguistic units as long as complete utterances; whether and how it can be segmented into smaller units is far from evident.
“Speech melody is a continuously varying attribute that encompasses linguistic units as long as complete utterances, and it is far from evident whether, and if so how, it can be segmented into smaller units.” (p. 3)
Impressionistic auditory descriptions remain hard to interpret and may not represent what other listeners perceive.
“impressionistic auditory descriptions remain difficult to interpret and may not be representative of other listeners' perceptions.” (p. 4)
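The minute physical variations that could be recorded were almost impossible to interpret in perceptual and communicative terms, so the researcher again faced the difficulty of finding suitable descriptive units.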
“the minute physical variations that could be recorded were almost impossible to interpret in perceptual and communicative terms. Once more the researcher was confronted with the difficulty of finding suitable descriptive units.” (p. 4)
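In short, the student of intonation faces a dilemma: the linguistic approach risks overlooking phonetically important features, while the instrumental-phonetic approach risks missing what is communicatively relevant.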
“In summary, the student of intonation faces a dilemma: either he chooses the linguistic approach at the risk of overlooking phonetically important features, or he opts for an instrumental-phonetic angle, thus increasing the chance of missing the communicatively relevant essentials.” (p. 4)
“1.2 A way out”
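A very promising way out may be to develop a 'model of the listener', which should eventually answer the many questions subsumed under a more general one: what does the listener make of pitch in speech?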
“a very promising solution to the multitude of linguistic and phonetic problems may reside in the development of a 'model of the listener'. Such a model should eventually answer the many questions subsumed under the more general one: what does the listener make of pitch in speech” (p. 4)
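Answering this question means establishing which melodic units the listener distinguishes, how he builds them into the overall percept of a pitch contour, how he relates perceived contours to more abstract melodic entities (intonation patterns), and how he integrates melodic and textual information into one linguistic message. Discovering the structure a listener imposes on melodic variation amounts to revealing major aspects of his intonational competence.
“To answer this question implies that one brings to light which melodic units the listener distinguishes, how he structures them to the overall percept of a pitch contour, how he relates perceived contours to more abstract melodic entities (intonation patterns), how he integrates melodic and textual information into one linguistic message, etc. To discover the kind of structure a listener imposes on the melodic variations in speech amounts to revealing major aspects of his intonational competence.” (p. 4)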
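A language user's intonational competence covers knowledge of melodic function as well as melodic form, but assessing the formal properties of intonation takes logical precedence over studying its linguistic and expressive use. The eventual goal is to grasp the communicative value of intonation; the immediate concern is a descriptive framework for the melodic properties of speech and the intonational features of language.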
“A language user's intonational competence not only comprises knowledge about melodic form, but also about melodic function. However, the assessment of the formal properties of intonation takes logical precedence over the study of its linguistic and expressive use. Eventually we want to come to grips with the communicative value of intonation, but our immediate concern is to develop a descriptive framework for the melodic properties of speech and for the intonational features of language.” (p. 5)
Psychoacoustic thresholds and communicative relevance form the lower and upper boundaries of the perceptual inquiry.
“Psychoacoustic thresholds and communicative relevance constitute the lower and upper boundaries that delimit the province of our perceptual quest.” (p. 5)
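The central question is therefore not so much 'What melody do we perceive?' as 'Which properties of the acoustic signal are relevant for our perception of speech melody?' and, subsequently, 'Which physiological mechanisms control these perceptually relevant acoustic features?'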
“Thus our central concern is not so much the question 'What melody do we perceive?', but rather 'Which properties of the acoustic signal are relevant for our perception of speech melody?' and, subsequently, 'Which physiological mechanisms control these perceptually relevant acoustic features?'” (p. 5)
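A perceptual approach is subjective in itself. Warranted generalizations about the perception of speech melody require checking whether a given melodic impression can be reproduced in the same listener and in other listeners.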
“a perceptual approach is subjective per se. In order to formulate warranted generalizations about the perception of speech melody, one needs to ascertain whether a particular melodic impression can be reproduced in the same listener as well as in other listeners” (p. 6)
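The approach is best described as an experimental-phonetic study of how the listener's intonational knowledge is brought to bear on his perception and interpretation of spoken language.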
“our approach can best be characterized as an experimental-phonetic study of how the listener's intonational knowledge is brought to bear on his perception and interpretation of spoken language.” (p. 6)
“1.3 Overview of the contents”
“Institute for Perception Research (IPO)”
The authors have little to say about the general 'state of the art' and instead limit themselves to an in-depth treatment of the so-called 'Dutch School' of intonation.
“In fact, we will have little to say about the 'state of the art' in general, but will rather limit ourselves to an in-depth treatment of what has become known as 'The Dutch School' of intonation.” (p. 6)
“chapter 2: we provide succinct information on the physiological mechanisms that determine the rate of vocal-fold vibration. We also discuss how intonation is embedded in the speech signal, and how the relevant information can be extracted from the signal in more or less automatic ways. Finally we devote attention to the perception of pitch.” (p. 6)
“chapter 3: sketches the general framework of our perceptually oriented analysis of intonation. It introduces our basic assumption and the descriptive units with which we operate at different levels of analysis.”
“First we explain how the artificial manipulation of fundamental frequency (F0) is at the core of a series of procedures that yield descriptive units of increasing complexity.” (Hart et al., 1990, p. 7)
“we elucidate how the atomistic pitch movements can be used in the construction of a global pitch contour, through the intermediate unit of the pitch movement configuration. Again we explain how these sequential constraints of the pitch movements can be discovered, how they can be formalized in a grammar of intonation, and how the validity of the predictions made by this grammar can be assessed.” (Hart et al., 1990, p. 7)
“Finally, we discuss how perceptually distinct pitch contours may be related to more abstract categories, the intonation patterns” (Hart et al., 1990, p. 7)
“chapter 4: formulate the acquired insights in the form of ten propositions.”
“The first six propositions pertain to the phonetic part of our theory. They primarily deal with our melodic model of intonation and with the relationship between the perceptual and the acoustic or physiological manifestations of intonation.” (Hart et al., 1990, p. 7)
“The next three propositions concern our views on certain functional aspects of pitch in speech, viz. the role of intonation in sentence accentuation and syntactic-boundary marking, and its contribution to the overall meaning of an utterance.” (Hart et al., 1990, p. 8)
“The last proposition regards a psycholinguistic issue, viz. the amount of preplanning that is required for a speaker to successfully integrate melodic and functional requirements in the control of his pitch.” (Hart et al., 1990, p. 8)
“chapter 5: declination, i.e. the actual or virtual tendency for a pitch contour to gradually drift downward in the course of an utterance”
“The chapter is subdivided in three phonetic sections (acoustics, production and perception), followed by a fourth in which some functional aspects of the phenomenon are at issue.” (Hart et al., 1990, p. 8)
“The acoustics section discusses the difference between topline and baseline declination and illustrates how our stylization technique can be used to reliably measure the declination rate. Predictions about the variable rate of declination are stated in a formula. The first section ends with a discussion of declination resetting.” (Hart et al., 1990, p. 8)
“The second section explores the possible physiological causes of declination and examines to what extent it may be actively controlled by the speaker.” (Hart et al., 1990, p. 8)
“The perceptual side, dealt with in the third section, raises the issue of the psychological reality of declination” (Hart et al., 1990, p. 8)
“In the final section of this chapter we address the possible influence of declination on the overall interpretation of an utterance.” (Hart et al., 1990, p. 8)
“chapter 6: attempt to further reduce the phonetic structure of Dutch pitch contours to its essential properties”
“we introduce and define a number of descriptive units and categories, and show how they apply to abstract intonational structures.” (Hart et al., 1990, p. 8)
“Then we present a set of derivation rules that convert underlying intonation patterns into more elaborate, concrete melodic entities” (Hart et al., 1990, p. 8)
“Finally we show how melodic structures and textual elements can be mapped onto each other.” (Hart et al., 1990, p. 8)
“Phonetic aspects of intonation”
“2.0 Introduction”
“Intonation, as we have defined it, is the ensemble of pitch variations in the course of an utterance”
“if intonation is approached from a phonetic angle, its form of appearance can be described in perceptual, acoustic and physiological terms.”
“2.1 The physiology of intonation”
“2.2 Acoustic manifestations of vocal-cord vibration and their measurement”
“what extent the measurements are interpretable in terms of how the auditory system processes them”
“Measurement F0 by ear”
“Measurement by eye”
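“Choice of units: Hz versus semitones”
With a view to pitch perception, frequency distances matter more than absolute frequencies, and the size of those distances should be expressed independently of the incidental frequency. This makes it possible to compare F0 curves from different speakers with different voice ranges.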
“with a view to the perception of pitch, we are more interested in frequency distances than in the absolute frequencies themselves, and that we want to express the magnitudes of these distances independently of the incidental frequency. This makes it possible to compare F0 curves from different speakers, with different ranges of voice.”
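The preference for semitones over Hz can be illustrated with a short calculation. The sketch below uses the standard definition of a semitone as one twelfth of an octave; the function name and the example frequencies are illustrative choices, not taken from the source.

```python
import math


def semitone_distance(f1_hz: float, f2_hz: float) -> float:
    """Signed distance from f1_hz to f2_hz in semitones (12 per octave)."""
    if f1_hz <= 0 or f2_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f2_hz / f1_hz)


# The same 50 Hz step is a different interval in a low and in a high voice.
low_voice_step = semitone_distance(100.0, 150.0)   # about 7.0 semitones
high_voice_step = semitone_distance(200.0, 250.0)  # about 3.9 semitones

# Equal frequency ratios give equal distances, whatever the absolute frequency.
octave_low = semitone_distance(100.0, 200.0)
octave_high = semitone_distance(200.0, 400.0)
```

Because the distance depends only on the ratio of the two frequencies, an octave rise measures 12 semitones for every speaker, which is what allows F0 curves from different voice ranges to be compared.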
concluding remarks
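Local perturbations are caused by the varying acoustic coupling between the voice source and the vocal tract. Until these mechanisms are fully understood, interpreting F0 curves remains a difficult task.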
“the local perturbations are caused by the varying acoustic coupling between the voice source and the vocal tract. As long as these mechanisms are not fully understood, the interpretation of F0 curves remains a difficult task.”
“how F0 is converted by the auditory system into a melodic continuum”
“2.3 Perception”
Opening
“'psychophonetic' experiments too are not fully representative of the way in which pitch in real speech is perceived.”
Development
How and when pitch is perceived: harmonics in the dominance region are practically always present with sufficient energy. Although the speech signal is not exactly periodic, pitch perception in speech need not be considered essentially different from pitch perception in a strictly periodic signal.
“How and when is pitch perceived” “Harmonics in the dominance region are practically always present with sufficient energy. Although the speech signal is not exactly periodic, there is no need to consider the perception of pitch in a speech signal to be essentially different from the perception of pitch in a strictly periodic signal.”
Turn
“the differential threshold of pitch (2.3.2.1), the differential threshold of pitch distance (2.3.2.2), the absolute threshold of pitch change (2.3.2.3), and the differential threshold of pitch change (2.3.2.4).”
Conclusion
“except with respect to the differential threshold of pitch, no substantial discrepancies exist between the results from psychoacoustic and 'psychophonetic' experiments.”
“2.4 Integration”
Ideally, a phonetic description that is as complete as possible would combine the three aspects presented in this chapter.
“Ideally, a phonetic description that would be as complete as possible, would combine in itself the three aspects given in this chapter. In analogy to the study of segmental phenomena, one could begin with production, subsequently give an acoustic description of the signal, and next try to establish the relationship between these two. Afterwards, one could concentrate on perception, and examine which acoustic properties contribute to what extent and in what way. If both the relationship between production and the acoustic characteristics and that between the latter and perception are sufficiently understood, then, in addition the bridge from production to perception can be built.”
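By analogy with the study of segmental phenomena, one could begin with production, then give an acoustic description of the signal, and then try to establish the relationship between the two.
Afterwards, one could concentrate on perception and examine which acoustic properties contribute, to what extent and in what way.
Once the relationship between production and acoustics and the relationship between acoustics and perception are both sufficiently understood, the bridge from production to perception can be built.
However, studying the production of intonation through electromyography and measurements of subglottal pressure and lung volume has proved very complicated. Beyond the technical problems inherent in these measurements, there is a more fundamental one: the physiological mechanisms involved relate far less transparently to what is produced than, for example, the tongue tip, the jaws and the velum do.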
“However, the study of the production of intonation, by means of electromyography and measurements of subglottal pressure and lung volume, has appeared to be very complicated. Apart from technical problems inherent in all these measurements, there is a more fundamental problem, viz. that the physiological mechanisms involved have far less transparent relationships with what is produced than, e.g., the tongue tip, the jaws and the velum have”
“Part of this intransparency is caused by the lack of adequate units for a phonetic description.”
To address the specific question of these relationships, a selection criterion is needed for deciding what to look for in the abundance of possible physiological recordings.
“To address the specific question of the relationships mentioned above, one needs a selection criterion in order to decide what to look for in the abundance of possible physiological recordings. Such a criterion can only be based on hypotheses about the correspondence between the muscular activities on the one hand, and as many discrete events in the acoustic signal on the other. Thus, the first problem is to find a way to describe the F0 curve in terms of discrete events.”
Such a criterion can only rest on hypotheses about the correspondence between muscular activities on the one hand and an equal number of discrete events in the acoustic signal on the other.
The first problem is therefore to find a way to describe the F0 curve in terms of discrete events.
For these reasons, the authors chose to start with the measurement of F0 and then to ask which aspects of F0 variation contribute substantially to the perception of speech melody.
“For these reasons, we opted to start with the measurement of F0, and next to consider the question as to which aspects of F0 variation give substantial contributions to the perception of speech melody.”
Only later, once enough was known about the relative importance of F0 variation for perception, did it become meaningful to take up physiological measurements and to look for the muscular and respiratory activities involved.
“It was only in a later stage, when we had gained enough insight into the relative importance of the F0 variation for perception, that we found it meaningful to take up physiological measurements, and to try to look for muscular and respiratory activities”
“The IPO approach”
“3.0 Introduction”
Opening
“it has been our deliberate choice to first concentrate on the study of the acoustic phenomena, and to try to develop a method of data reduction on the basis of perceptual tolerances.”
Development
“Data reduction in itself would already make the data more manageable, but if it is done on the basis of perceptual tolerances, it might provide us with perceptual units in a melodic description.”
“This, in turn, might enable us to interpret these data so as to reflect how the human listener processes them. Ultimately, this may help to offer a solution to the problem of finding suitable descriptive units.”
Turn
“Since it seems impossible, in a phonetic approach, to establish a direct link between recordings of F0 and the abstract mental categories of the basic intonation patterns”
Conclusion
“necessary to make a detour. In each of the steps of this perception-guided detour, subjective similarity is at stake, from auditory identity in the first, to a more abstract kind of resemblance in the last step”
“3.1 Pitch movements”
“3.2 Pitch contours”
“3.3 Intonation patterns”
“3.4 A schematic survey”
“F0 curve and close-copy stylization; perceptual equality”
“With the criterion of perceptual equality, close copies are made, which contain all and only the perceptually relevant pitch movements.”
“With the criterion of perceptual equivalence, the close copies are transformed into standardized stylizations. Once their acceptability has been established, the standardized perceptually relevant pitch movements can now be considered as the minimal descriptive units”
“3.5 A general characterization of the IPO approach”
It is an experimental-phonetic, bottom-up approach, but more than a mere instrumental, acoustic analysis followed by statistical processing that tests the significance of certain postulated differences.
“It is an experimental-phonetic, bottom-up approach, but it is more than a mere instrumental, acoustic analysis followed by statistic processing that examines the potential significance of certain postulated differences. Although our starting point certainly consists of measuring F0, the centre of gravity is in perception. The motivation of this choice is not only to achieve data reduction in an effective way, but, moreover, to examine how the information in the speech melody is processed by the listener. Since we have assumed that the perceptually relevant information resides in those F0 changes that are brought about voluntarily by the speaker, we must try to find experimental support for this assumption by studying production aspects by means of physiological measurements. In this way, the usual requirement of phonetic research is met: viz. that it should deploy tripartite activities, in the fields of production, of acoustic manifestation and of perception, although not necessarily in that order, nor with equal weights.”
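Although the starting point is the measurement of F0, the centre of gravity lies in perception.
This choice is motivated not only by effective data reduction but also by the aim of examining how the listener processes the information in speech melody.
Since the perceptually relevant information is assumed to reside in the F0 changes that the speaker brings about voluntarily, experimental support for this assumption must be sought by studying production through physiological measurements.
This meets the usual requirement of phonetic research, namely that it covers production, acoustic manifestation and perception, though not necessarily in that order or with equal weight.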
Borrowing the broad-versus-narrow dichotomy from segmental phonetic transcription, the IPO approach is narrow-oriented, but not too narrow, since it strives for data reduction through perceptual criteria. It is not as broad as the 'Tune' approach. It aims to link concrete, atomistic features to more abstract, global structures, for example by examining which analytic detail is responsible for the identity of an intonation pattern.
“If we borrow the dichotomy of broad versus narrow from segmental-phonetic transcription, we can say that the IPO approach is narrow-oriented; but not too narrow, because it strives at data reduction by means of the application of perceptual criteria. It is not as broad as the 'Tune' approach. In fact, it does aim at trying to establish a link between concrete and atomistic features and more abstract and global structures, for example by examining what analytic detail is responsible for the identity of an intonation pattern.”
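The approach is also characterized by a firm belief in the hierarchical organization of intonation. As explained in chapter 4, proposition 7, the authors reject the view that an entire pitch contour can generally be generated from local considerations alone: it is the intonation pattern that dictates the local choice among pitch movements and the order in which they may appear.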
“The approach is, furthermore, characterized by a firm belief in a hierarchical organization of intonation. As will be explained in chapter 4, proposition 7, we do not adhere to the view that it is generally possible to generate an entire pitch contour solely on the basis of considerations of a local nature. It will be made clear that it is the intonation pattern which dictates the local choice among the various pitch movements, and the order in which they may appear.”
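The intonation pattern may therefore look like a theoretical construct introduced merely to implement the highest level of the hierarchy, but such a strategy would run counter to the IPO approach.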
“The intonation pattern thus may seem to be a theoretical construct, introduced merely to implement the highest level of the hierarchical organization, but such a strategy would be contrary to the IPO approach. Neither is it our intention to lay bare the subdivision of speech melody into a limited number of intonation groups on the basis of any kind of linguistic categorization, such as statement vs question, or of paralinguistic, for instance attitudinal, features. Rather, our experimentation is directed towards getting insight into the listener's internal representation of the intonational system of his native language. If in the experimental outcome the listeners give evidence of sufficient agreement on the categorization of melodic shapes, we shall have to incorporate it in our theoretical account. But in the absence of such evidence, we do not feel entitled to introduce a categorization on a priori grounds
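Nor is the intention to subdivide speech melody into a limited number of intonation groups on the basis of linguistic categories (such as statement versus question) or paralinguistic features (such as attitude).
Rather, the experiments aim at insight into the listener's internal representation of the intonational system of his native language. If listeners agree sufficiently on how melodic shapes are categorized, that categorization has to be incorporated into the theory; without such evidence, the authors do not feel entitled to introduce a categorization on a priori grounds.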
“A theory of intonation”
“Proposition 1: In the phonetic perception of pitch in speech the listener is sensitive to a highly restricted class of F0 changes only: viz. those that have been intentionally produced by the speaker”
“the extensive stylization to which the natural course of F0 can be submitted without any perceptual consequences cannot be explained on purely auditory grounds”
“the spectral complexity of the speech signal makes its auditory analysis more difficult than that of the sinusoidal stimuli used in most psychophysical investigations”
“from which one gathers an impression of how surprisingly large a difference in size or slope has to be for two F0 changes to be perceived as unequal.”
“in order to explain why speech-pitch perception is so selectively sensitive, we will still need an additional principle that is not auditory but phonetic in nature. That principle is stated as our basic assumption: viz. that only those F0 changes are relevant for perception that have been intentionally produced by the speaker as physical properties that are cues to the intonation pattern that he wants to produce.”
“perception is constrained by his knowledge of two sorts of production facts: (a) what is physiologically possible; and (b) what is allowed by the language-specific rules for intonation.”
“Proposition 1.1: Those F0 variations that are relevant for the perception of speech pitch can be approximated as strictly linear changes in the (log) F0 versus time domain”
“F0 rise or fall occurs, the natural course of this variable can be stylized as a straight, slightly tilted line, the so-called 'declination line'”
“grossly approximated by straight lines that depart from or return to the tilted baseline. This concatenation of straight lines thus effectively reduces the continuously changing F0 to a sequence of discrete movements.”
“linear approximations, when made audible in resynthesis, are perceptually indistinguishable from their natural counterparts”
“Straight lines may not be the only possible elements for the stylization of an F0 curve (see, e.g., Fujisaki and Sudo, 1971).”
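Proposition 1.1 describes stylization as a concatenation of straight lines in the (log) F0 versus time domain. The sketch below shows one way such a stylized contour could be evaluated, assuming the contour is given as a list of breakpoints; the function names, the 100 Hz reference and the breakpoint values are hypothetical and serve only as an illustration, not as the IPO procedure itself.

```python
import math


def hz_to_st(f_hz: float, ref_hz: float = 100.0) -> float:
    """Frequency in semitones relative to ref_hz (the reference is an arbitrary choice)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def st_to_hz(st: float, ref_hz: float = 100.0) -> float:
    """Inverse of hz_to_st."""
    return ref_hz * 2.0 ** (st / 12.0)


def stylized_f0(t_s: float, breakpoints: list[tuple[float, float]]) -> float:
    """F0 in Hz at time t_s on straight lines drawn in the log-F0 (semitone) domain.

    breakpoints: (time in seconds, F0 in Hz) pairs with strictly increasing times.
    """
    if len(breakpoints) < 2:
        raise ValueError("need at least two breakpoints")
    if not breakpoints[0][0] <= t_s <= breakpoints[-1][0]:
        raise ValueError("time lies outside the stylized contour")
    for (t0, f0), (t1, f1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t_s <= t1:
            frac = (t_s - t0) / (t1 - t0)
            st = hz_to_st(f0) + frac * (hz_to_st(f1) - hz_to_st(f0))
            return st_to_hz(st)
    raise ValueError("breakpoint times must be increasing")


# Hypothetical contour: a slowly declining baseline with one rise and one fall.
contour = [(0.0, 120.0), (0.4, 117.0), (0.5, 160.0),
           (0.8, 157.0), (0.9, 112.0), (1.5, 105.0)]
halfway_up = stylized_f0(0.45, contour)  # midpoint of the rise, in Hz
```

Because the interpolation is linear in semitones rather than in Hz, the midpoint of a rise is the geometric mean of its two end frequencies, not their arithmetic mean.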
“Proposition 1.2: The linear changes that approximate the natural course of F0 have to be 'glides' rather than 'jumps'”
“all perceptually relevant F0 changes require a certain amount of time, so as to give the impression of a pitch movement, not of a jump in pitch”
“in production too, F0 changes require a certain amount of time - even if the subject is instructed to produce a sudden jump in pitch”
“F0 changes in natural speech are effected more slowly than the maximum speed allowed by the laryngeal control”
“Proposition 2: At the first level of description, the smallest unit of perceptual analysis is the pitch movement”
“unavoidable to analyse it in fairly 'global' terms, viz. with descriptive units that are co-extensive with a clause or sentence”
“Palmer advocated an alternative approach in which the Tune is decomposed into its constituent elements, the pitch movements. The analytic detail that is introduced by the choice of a small-size descriptive unit may entail the risk that one can no longer see the wood for the trees”
“how the 'atomistic' and the 'global' levels of analysis can be integrated and reconciled.”
“As such, they elegantly delimit discrete perceptual events. At the same time they capture the essence of the physical phenomenon that gives rise to these perceptual impressions. Therefore, the pitch movement is at the same time a perceptual and an acoustic unit”
“Proposition 2.1: Pitch movements can be decomposed into perceptual features along the following dimensions: their direction; their timing with regard to syllable boundaries; their rate of change; and their size”
“the standardization procedure, on the other hand, reveals into how many categories each dimension has to be divided and what the standard specifications for each category are. Presumably, the dimensions themselves are universal, whereas the categorization is language-specific”
“correctness of a transcription can be verified by making it audible through resynthesis, using the standard specifications inherent in the transcription symbols.”
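The four dimensions named in Proposition 2.1 can be pictured as the fields of a small record. The sketch below is only an illustration: the class and field names, the timing labels and the numeric values are hypothetical and not taken from the source, which states that the categories along each dimension are language-specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitchMovement:
    """One perceptually relevant pitch movement (illustrative field names)."""
    direction: str        # "rise" or "fall"
    timing: str           # position relative to the syllable boundary (hypothetical labels)
    rate_st_per_s: float  # rate of change, in semitones per second
    size_st: float        # excursion size, in semitones

    def __post_init__(self):
        if self.direction not in ("rise", "fall"):
            raise ValueError("direction must be 'rise' or 'fall'")
        if self.rate_st_per_s <= 0 or self.size_st <= 0:
            raise ValueError("rate and size must be positive")

    def duration_s(self) -> float:
        """Time needed to complete the movement at the given rate."""
        return self.size_st / self.rate_st_per_s


# Hypothetical example values, chosen only to show the arithmetic.
early_rise = PitchMovement("rise", "early", rate_st_per_s=50.0, size_st=6.0)
late_fall = PitchMovement("fall", "late", rate_st_per_s=50.0, size_st=6.0)
```

Two movements count as different units as soon as they differ along any one of the four dimensions, which is how the record-style representation mirrors the idea of a feature decomposition.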
“Proposition 2.2: The maximum number of categories along each melodic dimension is limited by universal constraints”
“Collier (1983) presents an overview of some of the perceptual and physiological constraints that define the class of possible tone and intonation systems”
“As far as the variable speed of pitch movements is concerned, it appears from data by, for example, Sundberg (1979) that rises that cover an interval of four to twelve semitones require a minimum duration of 85 to 100 ms, which amounts to a physiologically constrained maximum rate of change of 120 semitones per second”
“can one discriminate - within the bounds of a single syllable - rises or falls that differ in slope”
“A comparable question can be asked concerning the discriminability of pitch movements that differ in size,”
“Hart (1981) has shown that in the perception of running speech a pitch range of one octave can be quantized into no more than three or four distinguishable intervals”
“This perceptual constraint suggests that, in tone languages, the number of contrastive pitch levels per octave cannot exceed four”
“very few tone languages exploit this maximum number of contrasts among so-called 'register tones'” (register tones: high, mid and low)
“Many more have tonal systems in which a contrast between two or three 'register' tones is supplemented with a contrast between dynamic 'contour' tones.” (contour tones: rising and falling)
“a perceptual limit to the number of pitch movements that can be discriminated on the basis of their position in the syllable. In an average syllable of 200 ms duration, no more than three distinct positions can be kept apart perceptually.”
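The figures quoted from Sundberg (1979) and Hart (1981) can be checked with simple arithmetic. The sketch below rests on two assumptions that are not stated in the source: that the largest interval (12 semitones) pairs with the longest minimum duration (100 ms), and that an octave is divided into equal intervals. The function name is illustrative.

```python
def rate_st_per_s(size_st: float, duration_ms: float) -> float:
    """Average rate of a pitch movement in semitones per second."""
    return size_st * 1000.0 / duration_ms


# Assumed pairing: a 12-semitone rise completed in 100 ms.
max_rate = rate_st_per_s(12, 100)  # 120.0 semitones per second

# One octave (12 semitones) split into three or four equal intervals,
# as a rough reading of "three or four distinguishable intervals".
interval_if_three = 12 / 3  # 4.0 semitones per interval
interval_if_four = 12 / 4   # 3.0 semitones per interval
```

Under these assumptions the 120 semitones per second quoted above follows directly from the 12-semitone, 100 ms case, and distinguishable intervals in running speech would be on the order of three to four semitones wide.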
“Proposition 3: There are no 'pitch levels'”
“rises and falls are the smallest units of melodic description. They are also the basic units, in the sense that we assume them to be the true targets of intonation production.”
“By positing pitch movements as the basic melodic targets we reject the alternative view that the speaker primarily intends to hit a particular pitch level and that the resulting movements are only the physiologically unavoidable transitions between any two basic levels.”
“the excursion of a pitch movement does not enter into the definition of its basic melodic characteristics, except if it is conspicuously smaller than standard”
“with prominence-lending pitch movements the excursion size is determined by quite a different prosodic variable, viz. degree of accentuation”
“differences in the average range of the pitch movements from one utterance to the next correlate with changing paralinguistic factors, such as the emotional state of the speaker”
“the primacy of pitch levels can be refuted on the following grounds. If the levels themselves were the primary targets of intonation production, one would expect the transitions between them (i.e. the pitch movements) to have invariant melodic properties”
“We have seen that this is not the case: pitch movements differ among each other in many ways and these differences contribute to their melodic identity”
“In other words, the pitch movements have properties that are not merely accidental and predictable consequences of their putative function of bridging two pitch levels”
“if one wants to predict correctly the relevant melodic properties of the pitch movements, while defending the primacy of the pitch levels, then these target levels will have to be specified in greater detail.”
“But to enrich the 'levels' concept with a temporal dimension deprives it of its simplicity”
“Apart from a specification for direction (rise or fall), they also require a specification for range”
“The concept of pitch level may play a role in the typology of pitch movements, but it cannot be substituted for them as a descriptive unit.”
“it appears that the 'levels' approach cannot cope with relevant timing-distinctions in speech pitch. Ladd (1983a) has attempted to provide a solution while arguing that the time dimension can be incorporated in the 'levels' approach, not as a basic feature, but as a 'modification'.”
“A further problem with Ladd's analysis is, that [ ± delayed peak] only allows for a two-category distinction, whereas Dutch differentiates three positional categories.”
“we believe that the use of 'levels' in a phonetic analysis of intonation is an oversimplification. And, even though it may be a commendable attempt at phonological data reduction, its application on the phonetic level runs counter to the phonetic facts of pitch-change production and perception.”
“Proposition 4: The combinatory possibilities of the pitch movements are highly constrained and can be expressed F0rmally in a 'grammar' of intonation” 命题4:音高运动的组合可能性受到高度限制,并且可以用语调“语法”正式表达
“not sufficient to establish the inventory of its perceptually relevant pitch movements” 不足以建立其感知相关的俯仰运动的清单
“the pitch movements reveals that some of them combine with some others into higher-order structures and that, in turn, these structures can only be concatenated in compliance with specific rules of sequence” 音高运动表明,其中一些结构与其他一些结构结合成更高阶的结构,而这些结构又只能按照特定的顺序规则进行连接
“specify the combinatory possibilities of the pitch movements, a 'grammar' can be designed that generates all and only the permissible sequences of pitch movements in a given domain” 指定音高运动的组合可能性,可以设计一个“语法”来生成给定域中所有且仅允许的音高运动序列
“grammar of Dutch intonation” 荷兰语语调语法
“Proposition 4.1: At the second level of description, pitch movements combine into 'configurations'” 命题 4.1:在第二级描述中,音高运动组合成“配置” “Proposition 4.2: Pitch-movement configurations belong to one of the following paradigmatic classes: 'Prefix', 'Root' or 'Suffix'” 命题 4.2:音高运动配置属于以下聚合类别之一:“前缀”、“词根”或“后缀” “Proposition 4.3: At the third level of description, pitch-movement configurations combine into contours in accordance with syntagmatic constraints” 命题 4.3:在第三级描述中,音高运动配置根据组合约束组合成轮廓
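The three levels named in Propositions 4.1 to 4.3 (movements, configurations, contours) can be illustrated with a toy acceptor. This is only a sketch: the configuration names (`P1`, `R1`, ...) are hypothetical placeholders, and the rule "any number of Prefixes, exactly one Root, at most one Suffix" is an assumption made for illustration, since these notes do not quote the actual syntagmatic rules of the Dutch grammar.

```python
import re

# Hypothetical inventory: configuration name -> paradigmatic class.
# The names are placeholders, not the configurations of the Dutch grammar.
CONFIG_CLASS = {
    "P1": "Prefix",
    "P2": "Prefix",
    "R1": "Root",
    "R2": "Root",
    "S1": "Suffix",
}

# Assumed syntagmatic rule (illustration only):
# any number of Prefixes, exactly one Root, at most one Suffix.
CONTOUR_RULE = re.compile(r"(?:Prefix )*Root(?: Suffix)?")


def is_well_formed(configurations):
    """Return True if the sequence of configuration names obeys the assumed rule."""
    try:
        classes = [CONFIG_CLASS[name] for name in configurations]
    except KeyError:
        return False  # unknown configuration name
    return CONTOUR_RULE.fullmatch(" ".join(classes)) is not None
```

Under this assumed rule, `is_well_formed(["P1", "R1", "S1"])` accepts the sequence, while `is_well_formed(["S1", "R1"])` rejects it, which mirrors the idea of a grammar that generates "all and only" the permissible sequences.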
“Proposition 5: The unlimited number of different pitch contours are manifestations of a finite number of basic intonation patterns” 命题5:无限数量的不同音高轮廓是有限数量的基本音调模式的表现
“diversity stems from two sources: on the one hand, the actual sequence of pitch movements may be different from one contour to the other; on the other hand, the same sequence may be distributed differently over the utterance. The latter variation is caused mainly by differences in the location of the sentence accents.” 多样性源于两个来源:一方面,一个轮廓与另一个轮廓的实际音高运动序列可能不同;另一方面,相同的序列可能在话语中以不同的方式分布。后者的变化主要是由句子重音位置的差异引起的。
“the intuitive resemblance must have its origin in a more abstract organizational principle, viz. the existence of underlying melodic categories which we call 'intonation patterns'.” 直观的相似性必定起源于一个更抽象的组织原则,即我们称之为“语调模式”的潜在旋律类别的存在。
“each intonation pattern generates a number of 'variants' and all of these form a set of similar contours by virtue of their common origin.” 每种语调模式都会产生许多“变体”,所有这些变体都因其共同的起源而形成一组相似的轮廓。
“the listeners could consistently group the variants of the same pattern and discriminate the variants of different patterns.” 听众可以一致地将同一模式的变体分组并区分不同模式的变体。
“Their pairing of resembling contours is in accordance with our predicted grouping of pitch contours that derive from the same intonation pattern.” 它们的相似轮廓配对符合我们预测的源自相同语调模式的音高轮廓分组。
“In order to make a distinction between 'pattern' and 'contour' we now introduce the following notational convention” 为了区分“模式”和“轮廓”,我们现在引入以下符号约定
“the patterns are transcribed with the pitch-movement symbols that are their obligatory ingredients and these symbols are given between slashes” 这些模式是用音高运动符号转录的,这些符号是它们的必备成分,并且这些符号在斜杠之间给出 “the contours are transcribed with all the pitch movements they are composed of and these symbols are given between square brackets.” 轮廓记录了它们所组成的所有音高运动,这些符号在方括号之间给出。
“basic Tunes or patterns constitute a psychological reality” 基本的曲调或模式构成了心理现实
“Proposition 6: All the relevant features of pitch movements are controlled by the activity of the laryngeal muscles” 命题6:所有音调运动的相关特征均由喉部肌肉的活动控制
“the difficulty is compounded by the fact that one cannot always differentiate in the EMG recording of a single muscle those laryngeal gestures that were intended to produce the successive ingredients of the pitch contour and those that are not related to intonation production. The interpretation has to be mediated by insight into the perceptual relevance of the F0 changes that have been produced.” (p. 89) 更困难的是,人们不能总是在单个肌肉的肌电图记录中区分那些旨在产生音调轮廓的连续成分的喉部姿势和那些与语调产生无关的喉部姿势。解释必须通过对已产生的 F0 变化的感知相关性的洞察来调解。
“the fundamental frequency of vocal-fold vibration is, in principle, dependent upon two factors: the tension of the vocal folds themselves and the amount of subglottal pressure. There is an increasing body of production data to show that the control of F0 is sited in the larynx and that Ps has a negligible effect on the production of F0 rises and falls” (p. 89) 原则上,声带振动的基频取决于两个因素:声带本身的张力和声门下压力的大小。越来越多的生产数据表明,F0 的控制位于喉部,Ps 对 F0 生产上升和下降的影响可以忽略不计
“all pitch movements have their origin in the activity of the laryngeal muscles, predominantly in that of the cricothyroid (CT) muscles” (p. 89) 所有音高运动都起源于喉部肌肉的活动,主要是环甲肌 (CT) 的活动
“However, as will be explained in chapter 5, the gradual decrease of Ps may be held responsible to a large extent for the phenomenon of declination.” (p. 95) 然而,正如第 5 章将要解释的那样,Ps 的逐渐减小可能在很大程度上导致下倾现象。
“Proposition 7: Intonation takes precedence over accentuation in determining the shape of a pitch contour” 命题 7:在确定音高轮廓的形状时,语调优先于重音
“'prosody' as a cover term for at least three non-segmental phenomena: pitch variation, prominence and prosodic duration” “韵律”作为至少三种非分段现象的涵盖术语:音高变化、突出度和韵律持续时间
“Indeed, our dealing with prosodic matters has confronted us with the interaction of intonation and accentuation, especially at the phonetic level of description” 事实上,我们在处理韵律问题时面临着语调和重音的相互作用,尤其是在描述的语音层面上。
“it is still open to two alternative interpretations, to be paraphrased as two competing hypotheses.” 它仍然可以接受两种不同的解释,可以解释为两种相互竞争的假设。
“Hypothesis 1: Those pitch movements that occur on prominent syllables are entirely and exclusively caused by the accentuation demands of the utterance; the remaining pitch movements (and declination) result from the requirements of intonation proper.” 假设1:突出音节上发生的音高运动完全且仅仅是由话语的重音要求引起的;其余的音高运动(和下倾)来自语调本身的要求。
“This hypothesis states that a pitch contour is a sequence of pitch movements that are caused either by the accentual or by the intonational demands of the utterance, but never by both requirements simultaneously.” 该假设指出,音高轮廓是由语音的重音或语调要求引起的一系列音高运动,但绝不会同时由这两种要求引起。 “In other words, the overall pitch contour is a linear addition of accentual and intonational features.” 换句话说,整体音高轮廓是重音和语调特征的线性相加。 “we have argued against this functional separation of pitch phenomena. If rises and falls can provoke the impression of prominence, either singly or in combination, it should be immaterial which type of pitch change or combination of changes is selected to produce a pitch accent. In actual fact, this freedom of choice does not exist.” 我们反对这种音高现象的功能分离。如果上升和下降可以单独或组合地引起突出的印象,那么选择哪种类型的音高变化或变化组合来产生音高重音应该是无关紧要的。事实上,这种选择的自由并不存在。 “contours that violate this restriction do, in fact, produce the required pitch-accent impression, but they sound ill-formed. So, we suggest that the choice of the kind of accent-lending pitch movements is subordinate to the kind of intonation pattern that is to be implemented..” 事实上,违反此限制的轮廓确实会产生所需的音调印象,但听起来格式不正确。因此,我们建议重音借用音高运动类型的选择从属于要实施的语调模式类型。
“Hypothesis 2: (a) The melodical and sequential properties of all the pitch movements in a contour are solely determined by the intonation pattern that has been selected. (b) Among the pitch movements of any contour there may be one or more with such phonetic properties as are necessary to induce the perception of a pitch accent. (c) The location of the accent-lending pitch movement(s) is determined by the position of the words that have to be accented.” 假设 2:(a) 轮廓中所有音高运动的旋律和顺序特性仅由已选择的语调模式决定。 (b) 在任何轮廓的音高运动中,可能有一个或多个具有诱发音高重音感知所必需的语音特性。 (c) 重音借出音调移动的位置由必须重音的单词的位置决定。
“In summary, one cannot predict for every single accent which type of pitch movement should be used to implement it, unless one derives this choice from the prior decision as to which intonation pattern is to be implemented.” 总之,人们无法预测每一个重音应该使用哪种类型的音高运动来实现,除非这一选择是从先前关于要实现哪种语调模式的决定中推导出来的。
“Proposition 8: The correspondence between intonation and syntax is neither obligatory nor unique” 命题8:语调和句法之间的对应关系既不是强制性的,也不是唯一的
“with regard to such a demonstrable relation between syntax and intonation, we want to emphasize two points: first, the observable correspondence is not obligatory; and second, there are no melodic features that are uniquely and exclusively used for the purpose of marking aspects of syntactic structure” (p. 100) 对于句法和语调之间这种明显的关系,我们要强调两点:第一,可观察到的对应关系不是必然的;其次,没有旋律特征独特且专门用于标记句法结构方面的目的
“It must be kept in mind that we limit ourselves to purely melodic features, and refrain from taking the functional contribution of other prosodic factors into account (in particular pause and preboundary lengthening).” (p. 101) 必须记住,我们将自己限制在纯粹的旋律特征上,并且避免考虑其他韵律因素的功能贡献(特别是停顿和前界延长)。
“Optional correspondence:If an utterance consists of two or more clauses, the syntactic boundary can be marked by intonational means. However, the speaker always has the option not to do so.” (p. 101) 如果一个话语由两个或多个从句组成,则可以通过语调手段来标记句法边界。然而,发言者始终可以选择不这样做。
“Non-unique correspondence:Evidently, then, the correspondence between syntax and intonation is not deterministic: clause boundaries need not be marked melodically, and if they are, they are accompanied by pitch configurations that also occur in other syntactic environments” (p. 102) 显然,语法和语调之间的对应关系不是确定性的:子句边界不需要以旋律方式标记,如果是的话,它们会伴随着在其他句法环境中也出现的音高配置
“Bolinger's (1957/8: 36) statement: 'Intonation operates in its own sphere, and the uses that grammar makes of it are catch-as-catch-can': the two do not necessarily go separate ways, but they do not cling together all the time, either.” (p. 106) 博林格 (1957/8: 36) 的说法:“语调在它自己的范围内运作,语法对它的使用是无计划的”:两者不一定分道扬镳,但它们并不一直粘连在一起。
“The limited correspondence between intonation and syntactic structure makes it possible that, on a rare occasion, speech melody may constrain the syntactic hypothesis-formation in the presence of surface-structure ambiguity.” (p. 106) 语调和句法结构之间的有限对应性使得在极少数情况下,语音旋律可能会在表面结构模糊性存在的情况下限制句法假设的形成。
“some ambiguities may be resolved by prosodic means other than intonational. The number and location of sentence accents, the temporal structure of the words, the presence of pauses, etc., may be other (and better) cues for the syntactic organization of an utterance.” (p. 108) 有些歧义可以通过语调以外的韵律方式来解决。句子重音的数量和位置、单词的时间结构、停顿的存在等,可能是话语句法组织的其他(也是更好的)线索。
“the presence of a temporal marker induces the perception of a prosodic boundary in the great majority of the cases, whether or not it is accompanied by a pitch marker. Pitch markers alone are far less effective cues than temporal markers alone: only in about one-third of the cases does a pitch marker by itself lead to the perception of a prosodic boundary. In cases of conflict, the temporal marker always overrules the effect of the pitch marker.” (p. 108) 在大多数情况下,时间标记的存在会引起韵律边界的感知,无论它是否伴有音高标记。单独的音高标记远不如单独的时间标记有效:只有大约三分之一的情况下,音高标记本身会导致韵律边界的感知。在发生冲突的情况下,时间标记始终会否决音高标记的效果。
“Proposition 9: Intonation features have no intrinsic meaning” 命题9:语调特征没有内在意义
“intonation may affect the semantic interpretation of an utterance” (p. 110) 语调可能会影响话语的语义解释
“To the extent that intonation can differentiate between alternative syntactic interpretations of the same surface structure, it may be said to serve a semantic purpose, if only indirectly: it influences the syntactic analysis, which in turn affects the semantic interpretation.” (Hart et al., 1990, p. 110) 就语调可以区分同一表面结构的不同句法解释而言,它可以说服务于语义目的,即使只是间接的:它影响句法分析,进而影响语义解释。 “however, the semantic function of intonation is looked for in the paralinguistic domain of 'attitude'. The so-called 'tone of voice' is considered a means of vocal expression by which the speaker can add certain shades of meaning to an utterance: surprise, boredom, disbelief, anger, etc.” (Hart et al., 1990, p. 110) 然而,语调的语义功能是在“态度”的副语言领域中寻找的。所谓的“语气”被认为是一种声音表达方式,说话者可以通过这种方式为话语添加一定的含义:惊讶、无聊、怀疑、愤怒等。
“we might envisage that attitudinal considerations dictate the choice of a particular intonation pattern.” (p. 111) 我们可能会设想,态度方面的考虑决定了特定语调模式的选择。
“But for all we know, the number of basic intonation patterns per language is very limited, whereas the variety of shades of meaning and attitudes is nearly infinite” (p. 111) 但据我们所知,每种语言的基本语调模式的数量非常有限,而含义和态度的深浅变化却几乎是无限的
“We speculate that these intonation patterns choices are influenced by the attitudinal meaning that a speaker wants to add to the literal meaning of his utterances. But the actual encoding of this attitudinal meaning into an individual pitch contour is evidently governed by so many pragmatic and situational factors that we are still looking for a manageable experimental paradigm in which to tackle this complicated issue.” (p. 114) 我们推测这些语调模式的选择受到说话者想要添加到其话语的字面含义中的态度含义的影响。但是,将这种态度意义实际编码为单个音高轮廓显然受到许多实用和情境因素的控制,因此我们仍在寻找一种可管理的实验范式来解决这个复杂的问题。
“Proposition 10: The successful programming of a pitch contour requires only a limited 'look-ahead' strategy that integrates intonational, accentual and surface-syntactic information” 命题 10:音高轮廓的成功编程只需要一个有限的“前瞻”策略,该策略整合了语调、重音和表面句法信息
“The interaction of intonation with accentuation and syntax, described in propositions 8 and 9, implies that, in the process of programming a pitch contour, the speaker has to consult at least three sources of knowledge. The question then is: at which point(s) in time does which sort of knowledge become important?” (p. 114) 命题 8 和 9 中描述的语调与重音和句法的相互作用意味着,在设计音高轮廓的过程中,说话者必须查阅至少三个知识源。那么问题是:哪种知识在哪个(哪些)时间点变得重要?
“a possible programming model one in which the relevant decisions are taken sequentially, as one goes along through the utterance that is being developed” (p. 114) 一种可能的编程模型,其中相关决策是按照正在开发的话语进行的顺序做出的
“Tone-Sequence (TS) model. It describes the generation of a pitch contour as the realization of a sequence of pitch movements whose order of appearance is determined by the sequential constraints of the intonation grammar, and whose actual distribution over the utterance is a function of local syntactic and/or accentual features that are attached to the syllable. The locality principle implies that no 'look ahead' is necessary.” (p. 115) 音序 (TS) 模型。它将音高轮廓的生成描述为一系列音高运动的实现,其出现顺序由语调语法的顺序约束确定,并且其在话语上的实际分布是局部句法和/或重音特征的函数附加在音节上的。局部性原则意味着不需要“前瞻性”。
“Indirect evidence for the 'look-ahead' component in the speaker's strategy comes from observable intonation errors. Such errors often occur in reading aloud, when the speaker literally fails to look ahead far enough.” (p. 116) 说话者策略中“前瞻”成分的间接证据来自于可观察到的语调错误。这种错误经常发生在朗读过程中,因为说话者确实没有向前看得足够远。
Contradiction
“the programming of a pitch contour requires the anticipation of at least two sorts of non-intonational information, viz. the presence of a next accent, specified as [+ last], and the occurrence of a clause-internal phrase boundary.” (p. 118) 音高轮廓的编程需要预期至少两种非语调信息,即,下一个重音的存在,指定为 [+ last],以及子句内部短语边界的出现。
“The important conclusion is that these two types of interaction cannot be accounted for on a syllable-by-syllable basis, as the Tone-Sequence model” (p. 118) 重要的结论是,这两种类型的交互不能像音序模型那样逐个音节地解释
Summary
“Our theory is based on a bottom-up approach that is concerned with the concrete manifestation of intonation in the form of speech melody” (p. 119) 我们的理论基于自下而上的方法,涉及语调以言语旋律的形式出现的具体表现
“Because we advocate the hierarchical supremacy of the intonation patterns, our views belong in the class of 'Contour-Interaction' models, not in that of 'Tone-Sequence' theories, to use Ladd's (1983a) terminology.” (p. 120) 因为我们主张语调模式在层级上的至上地位,所以用 Ladd (1983a) 的术语来说,我们的观点属于“轮廓交互”模型一类,而不属于“音序”理论一类。
Contour interaction means that the listener's perception of speech melody is constrained by top-down information: he makes use of knowledge concerning the structural properties of contours, expressed in the intonation grammar. On the speaker's side, contour interaction implies that the successful production of a pitch contour requires a limited 'look-ahead' strategy that allows the integration of melodic and textual information. 轮廓交互意味着听者对语音旋律的感知受到自上而下的信息的约束:他利用有关轮廓结构特性的知识,这些知识以语调语法表达。在说话者方面,轮廓交互意味着音高轮廓的成功产生需要有限的“前瞻”策略,该策略允许整合旋律和文本信息。
“Declination”
“5.0 Introduction”
“make a distinction between local and global attributes in intonational matters” 区分语调问题中的局部属性和全局属性
“Local attributes comprise characteristics of F0 relating to only a few syllables, whereas global attributes extend over longer stretches of speech, such as an entire clause or utterance” 局部属性包括仅与几个音节相关的 F0 特征,而全局属性则扩展到较长的语音片段,例如整个从句或话语 “Admittedly, the gradual movements may extend over quite a number of syllables, but they do not cover an entire clause or utterance.” 诚然,渐进的运动可能会延伸到相当多的音节,但它们并不涵盖整个从句或话语。
“most important global attribute is the observed tendency of F0 to decrease slowly from beginning to end of an utterance” 最重要的全局属性是观察到 F0 从话语开始到结束缓慢下降的趋势
“relate the local rises and falls to some kind of reference line, for which we first chose a horizontal line through the overall average of F0 in the utterance” 将局部的上升和下降与某种参考线联系起来,为此我们首先选择一条通过话语中 F0 的整体平均值的水平线
“also a perceptually relevant attribute: it simply sounded more natural whenever declination was added.” 下倾也是一个与感知相关的属性:只要添加了下倾,听起来就更自然了。
“the basic assumption, formulated in chapter 3, that pitch movements considered relevant by the listener should be related to certain activities on the part of the speaker, characterized as discrete commands to the vocal cords, and should be recoverable as so many discrete events in the F0 recording.” 第 3 章提出的基本假设是,听者认为相关的音高运动应该与说话者的某些活动相关,其特征是对声带的离散命令,并且应该像 F0 中的许多离散事件一样可复现。
“but in stretches with mere declination, such activity is almost always entirely absent.” 但在仅有下倾的地区,这种活动几乎总是完全不存在。
“The basic assumption was put forward in an attempt to bring about a distinction between programmed, voluntary F0 changes, on the one hand, and physiologically determined, involuntary fluctuations, on the other.” 提出基本假设的目的是试图区分一方面是程序化的、自愿的 F0 变化,另一方面是生理决定的、非自愿的波动。
“since there is generally no pitch-related laryngeal activity in stretches with mere declination” 因为在仅有下倾的延伸段中通常不存在与音调相关的喉部活动
“The corresponding activity may be considered, for instance, to take place in the respiratory system.” 例如,可以认为相应的活动发生在呼吸系统中。
“Another consequence of the perceptual relevance of declination is that it might be linguistically relevant” 下倾感知相关性的另一个结果是它可能在语言上相关
“we will deal with acoustic and perceptual aspects of declination, and with a technique to establish its form of appearance in a reliable way” 解决下倾的声学和感知方面的问题,并采用一种技术以可靠的方式确定其外观形式
“a model will be proposed in which a control mechanism, meant to compensate for the loss of air during speech, and hence of air pressure, gives rise to declination as an inevitable by-product.” 提出一个模型,其中的控制机制旨在补偿语音期间的空气损失,从而补偿气压的损失,因而不可避免地产生下倾。
“communicative aspects” 交际方面
“perceived peak height as a function of position in the utterance” 感知峰值高度作为话语中位置的函数 “an expectation as to utterance duration on the basis of declination-related cues” 基于下倾相关线索对话语持续时间的期望 “declination resets and syntactic structure” 下倾重置和句法结构 “communicative relevance of resets” 重置的交际相关性
“5.1 Acoustic and perceptual aspects”
“the acoustic manifestation of declination is not always without problems” 下倾的声学表现并不总是没有问题
“there are quite a number of local perturbations, to the effect that the global trend is hardly visible, or at least that it is difficult to decide how to draw the (lower) declination line (the baseline, as it is called by Maeda, 1976).” 有相当多的局部扰动,导致整体趋势几乎不可见,或者至少很难决定如何绘制(较低的)下倾线(即 Maeda, 1976 所说的基线)。 “As an alternative, one could try to draw the so-called topline, connecting the peaks” 作为一种替代方案,人们可以尝试绘制所谓的顶线,把各个峰值连接起来 “This is, of course, impossible if there is only one peak, but there is a more general drawback. Peaks are usually associated with accented syllables” 当然,如果只有一个峰值,这是不可能的,但还有一个更普遍的缺点。峰值通常与重音音节相关 “it may be risky to connect successive peaks or valleys if one is not sure that these are the results of F0 changes that belong to the same category.” 如果不确定这些是属于同一类别的 F0 变化的结果,那么连接连续的峰或谷可能是有风险的。 “several reasons to consider this theoretical issue a more serious threat to the adequacy of the topline than to that of the baseline” 有几个理由认为这个理论问题对顶线的充分性的威胁比对基线的威胁更严重 “One is that while raising F0 requires muscular contraction, which can be applied in various degrees, its lowering is mostly the product of muscular relaxation, thus making the valleys phonetically more equivalent than the peaks. Another is that in utterances with few pitch accents the baseline will show up without any interpretative inference being necessary” 一是虽然提高 F0 需要肌肉收缩(可以不同程度地施加),但其降低主要是肌肉放松的产物,因此谷在语音上比峰更加等价。另一个是,在音高重音很少的话语中,基线会直接显现出来,而无需任何解释性推理
“In order to circumvent the difficulties with fitting the baseline by eye as mentioned above, it is often helpful to first make a close-copy stylization, and then to try to find out if the lower stretches can be replaced by pieces of one straight line, without serious perceptual consequences.” 为了避免上述通过肉眼拟合基线的困难,通常首先进行近距离复制风格化,然后尝试找出下部延伸是否可以用一条直线替换,这通常是有帮助的,没有严重的感知后果。
“The intermediate step of making a close copy together with a perceptual check may sometimes reveal that the baseline is interrupted by a 'declination reset': in many longer utterances, a rapid jump upwards of the baseline is observed at one or more places” 制作 close copy 并进行感知检查的中间步骤有时可能会显示基线被“下倾重置”中断:在许多较长的话语中,在一个或多个地方观察到基线快速向上跳跃
“In reporting on our own occupations with declination in Dutch and British English, we will maintain a dichotomy between the slope of declination, and of the baseline resets” 在报告我们自己对荷兰语和英式英语下倾的研究时,我们将把下倾斜率与基线重置分开讨论
“5.1.1 Slope”
How is declination recognized?
“literally present as a tilted line whenever there are no local events” 每当没有局部事件时,下倾就直接表现为一条倾斜线 “present in a more implicit way, e.g. as the consequence of F0 falls being larger than rises” 以更隐含的方式呈现,例如由于 F0 的下降幅度大于上升幅度
“one with genuine declination, and rises and falls of equal size; the other one with falls larger than rises and no further declination, but such that the utterance-initial and -final frequencies were the same as in the former versions. When presented with these contours, listeners largely agreed that the second versions were entirely unnatural.” 一个版本具有真正的下倾,并且上升和下降的幅度相等;另一个版本的下降幅度大于上升幅度,并且没有进一步的下倾,但话语的起始频率和终止频率与前一个版本相同。当听到这些轮廓时,听众基本上一致认为第二个版本完全不自然。

“Having established the need of the physical presence of a slowly falling course of F0 during stretches without local changes, we might ask if it is possible to make generalizing statements about the slope of the tilted baseline, on the basis of observed systematic properties. In other words, just as with the other perceptually relevant pitch movements, we are in need of some kind of standardization.” 在确定了在没有局部变化的拉伸过程中 F0 缓慢下降过程的物理存在的必要性之后,我们可能会问是否可以根据观察到的系统特性对倾斜基线的斜率做出概括性的陈述。换句话说,就像其他感知相关的音高运动一样,我们需要某种标准化。
“We feel no reason to reject the straight-line approximation, mainly because the concaveness just mentioned is, in view of the relatively low sensitivity of the human auditory system for differences in slope” 我们觉得没有理由拒绝直线近似,主要是因为刚才提到的凹度是考虑到人类听觉系统对斜率差异的敏感度相对较低 “The second question has to do with the topline. It has already been mentioned that it is generally more difficult to draw one line through the peaks than through the valleys. Nevertheless, it is not very difficult to see that, in general, the topline declines more steeply than the baseline, also in log-converted recordings. In our practice of making standardized stylizations, we make the topline (and the middle line in British English) parallel to the baseline, thus violating the acoustic reality. However, although highly trained listeners are capable of hearing the difference between parallel and converging toplines, contours with parallel toplines are not judged less natural than those with convergence.” 第二个问题与顶线有关。前面已经提到过,画一条穿过各个峰的线通常比画一条穿过各个谷的线更困难。然而,不难看出,一般来说,顶线比基线下降得更陡,在对数转换的记录中也是如此。在我们进行标准化风格化的实践中,我们使顶线(以及英式英语中的中线)与基线平行,从而违背了声学现实。然而,尽管训练有素的听众能够听出平行顶线和会聚顶线之间的差异,但具有平行顶线的轮廓并不被认为比具有会聚顶线的轮廓更不自然。
“5.1.2 Baseline resets” 5.1.2 基线重置
Focus: comparing start frequency, end frequency, and slope
“The slopes of the declination-line fragments did not obey the formulae for uninterrupted declination lines, as was mentioned earlier. For this material, they appeared to be systematically less steep than would be calculated on the basis of the individual durations. On average, the difference turned out to be about 1 ST/s (with a standard deviation of 0.7 ST/s).” 如前所述,下倾线片段的斜率不符合不间断下倾线的公式。对于这批材料,它们的陡度似乎系统地低于根据各个持续时间计算的陡度。平均而言,差异约为 1 ST/s(标准差为 0.7 ST/s)。 “The acoustic analysis further showed that for three of the four speakers (forty-five sentences altogether), the end frequencies of the successive partial declination lines had a small, gradual decline; the resulting difference between the end frequencies of the first and the last partial declination line was 1 ST on average” 声学分析进一步表明,对于四个说话者中的三个(总共四十五个句子),连续的部分下倾线的终止频率有一个小的、逐渐的下降;第一条和最后一条部分下倾线的终止频率之间的差平均为 1 ST
“5.2 Production”
“declination slope” 下倾斜率
“examine whether there is a physiological mechanism that can be held responsible for declination in general. As will be shown in 5.2.1.1, a serious candidate for the explanation of declination is the slow decrease of the subglottal pressure (Ps), although several additional effects are conceivable” 检查是否存在某种生理机制可以解释一般的下倾。正如 5.2.1.1 所示,下倾的一个重要候选解释是声门下压力 (Ps) 的缓慢下降,尽管可以想象到一些额外的影响
“the activity of the external intercostal muscles such that the rather quick decrease of the volume of the thoracic cavity in normal breathing is slowed down (Ladefoged, 1967: 14). However, the expenditure of air causes the subglottal pressure to decrease, to the effect that the pressure drop over the glottis will soon fall below a minimum value needed for phonation, if no countermeasure is taken.” 外部肋间肌的活动,使得正常呼吸时胸腔体积相当快的减小速度减慢(Ladefoged,1967:14)。然而,空气的消耗导致声门下压力降低,结果是,如果不采取对策,声门上的压降将很快降至发声所需的最小值以下。 “This countermeasure consists in reducing the volume of the thoracic cavity by gradually relaxing the external intercostal muscles (and later, if necessary, activating the internal intercostal and other muscles).” 该对策包括通过逐渐放松外肋间肌(随后,如有必要,激活内肋间肌和其他肌肉)来减小胸腔的体积。 “From the measurements by Ohala (1970), Collier (1975a) and Atkinson (1978) it becomes apparent, however, that the effect of this countermeasure is such that there remains a slow decrease of Ps, and this may well cause F0 to fall gradually.” 然而,从 Ohala (1970)、Collier (1975a) 和 Atkinson (1978) 的测量中可以明显看出,这种对策的效果是 Ps 仍然缓慢下降,这很可能导致 F0 逐渐下降。 “Such cases indicate that sometimes the decrease of Ps is not sufficient to explain the observed declination slope. Apparently, an additional pitch-lowering mechanism may be involved.” 这种情况表明,有时 Ps 的减小不足以解释观测到的下倾斜率。显然,可能涉及额外的音高降低机制。 “a gradual relaxation of CT-activity could be held responsible for the steeper initial part of the declination line.” CT 活动的逐渐松弛可能是下倾线初始部分较陡的原因。 “no reason to assume that in spontaneous speech the mechanism responsible for declination would be radically different from that in read-out speech.” 没有理由假设在自发言语中导致下倾的机制与朗读言语中的机制完全不同。 “The outcome seems to suggest that speakers apply a certain amount of preplanning: if the duration of the utterance they are about to produce is known in advance, they can choose a start frequency and a slope suitable to finish at their individual end frequency.” 结果似乎表明说话者进行了一定程度的预先规划:如果他们将要发出的话语的持续时间提前已知,他们就可以选择一个起始频率和一个适合在其各自的终止频率处结束的斜率。
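The preplanning idea in the last quotation (a known duration lets the speaker pick a start frequency and a slope that land on a fixed end frequency) comes down to simple arithmetic once the baseline is treated as a straight line on a semitone scale. The sketch below assumes exactly that straight-line approximation; the function names are illustrative and the only formula used is the standard semitone definition, 12 times the base-2 logarithm of a frequency ratio. It is not the declination formula of the source, which these notes do not quote.

```python
import math


def semitones(f_from_hz, f_to_hz):
    """Signed interval from f_from_hz to f_to_hz in semitones (ST)."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


def declination_slope(f_start_hz, f_end_hz, duration_s):
    """Slope in ST/s of a straight baseline that runs from f_start_hz to f_end_hz."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return semitones(f_start_hz, f_end_hz) / duration_s


def baseline_hz(f_start_hz, slope_st_per_s, t_s):
    """Baseline frequency in Hz at time t_s, for a line that is straight in semitones."""
    return f_start_hz * 2.0 ** (slope_st_per_s * t_s / 12.0)
```

For instance, a baseline that falls one octave (from 100 Hz to 50 Hz) over 4 s has a slope of -3 ST/s; a longer utterance with the same start and end frequencies gets a proportionally shallower slope.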
“answer to the question whether the mechanism responsible for the decreasing Ps is under voluntary control of the speaker, or must be considered an automatism” 回答以下问题:导致 Ps 降低的机制是否受到说话者的自愿控制,或者必须被视为自动机制
“What we have found above, for a more dynamic, speech-like situation (reiterant speech), constitutes an even more delicate balance: a compensation which is very close to, but consistently below 100 per cent. It is hardly conceivable that the speaker is able to perform this voluntarily. Rather, we must presume that the muscular activity involved is subject to an automatic control system” 我们在上面发现,对于一种更加动态、更接近言语的情况(重复音节言语,reiterant speech),存在一种更加微妙的平衡:一种非常接近但始终低于 100% 的补偿。很难想象说话者能够自主地做到这一点。相反,我们必须假设所涉及的肌肉活动受到自动控制系统的控制 “declination is largely an automatic by-product of properties of the respiratory system, since it is very unlikely that the tracheal-pull mechanism is under voluntary control of the speaker, either.” 下倾很大程度上是呼吸系统特性的自动副产品,因为气管拉动机制也不太可能受到说话者的自主控制。 “it is not necessary to assume that the speaker controls the declining pitch syllable by syllable. On the contrary, he only needs to give a suitable value to one parameter for each fragment of speech which he plans to produce on one uninterrupted baseline. And in as far as the speaker has knowledge in advance about the duration of that fragment, he must be considered to be a match for this task.” 不必假设说话者逐个音节地控制下降的音高。相反,他只需要为他计划在一条不间断的基线上产生的每个语音片段的一个参数赋予一个合适的值。只要说话者事先知道该片段的持续时间,就必须认为他能够胜任这项任务。
“declination resets” 下倾重置
“Ps alone could not be held responsible for the observed amount of resetting, but that some laryngeal activity is involved as well.” Ps 本身不能对观察到的重置量负责,其中也涉及一些喉部活动。 “Ps primarily provides the driving force for phonation, and cannot explain the clause-initial F0 values in any detail.” Ps 主要提供发声的驱动力,无法详细解释从句起始处的 F0 值。 “On the other hand, the difference in F0 for each pair of clause onsets correlated significantly with the difference in CT activity in the corresponding first syllables. This strongly supports the hypothesis that the onset frequency of the declination line in general, and of the newly started declination line after a reset in particular, is dependent on the level of CT activity.” 另一方面,每对从句起始的 F0 差异与相应第一个音节中 CT 活动的差异显著相关。这有力地支持了以下假设:一般而言,下倾线的起始频率,特别是重置后新开始的下倾线的起始频率,取决于 CT 活动的水平。
“5.3 Communicative aspects”
slope
“the question how peak height is experienced by the listener.” 听众如何体验峰值高度的问题。 “Given the dependence of declination slope on utterance duration, it is to be expected that the slope may serve to enable the listener to predict, before the utterance has actually ended, for how long it is still going to be continued” 考虑到下倾斜率对话语持续时间的依赖性,可以预期该斜率可以使听者能够在话语实际结束之前预测它还将持续多长时间
slope - peak height
“The general hypothesis was that 'listeners perceptually compensate for the declination effect when evaluating the relative height of successive peaks in an utterance' (p. 39).” 一般假设是“听众在评估话语中连续峰值的相对高度时,会在感知上补偿下倾效应”(第 39 页)。 “If we give the stimuli a standard slope of declination (one that is appropriate to the length of the utterance) then the second peak will be considered to be as high as the first one when it is objectively lower, and to be higher than the first peak when it is objectively equal to it.” 如果我们给刺激一个标准的下倾斜率(适合于话语长度的斜率),那么当第二个峰值客观上较低时,它将被认为与第一个峰值一样高;当它客观上与第一个峰值相等时,它将被认为高于第一个峰值。 “When the stimuli are given a declination slope that is steeper than standard, the compensation will be proportionally greater; in other words, the overestimation of the relative height of the second peak will be stronger.” 当刺激的下倾斜率比标准更陡时,补偿将成比例地更大;换句话说,对第二个峰值相对高度的高估会更强。 “When the two peaks are superimposed on a monotonous (not declining) baseline, the overestimation of the height of the second peak will be reduced to zero” 当两个峰值叠加在单调(不下降)基线上时,对第二个峰值高度的高估将减少为零 “When the stimuli consist of two syllables, each with a F0 peak, but separated by silence, likewise no compensation will take place.” 当刺激由两个音节组成,每个音节都有一个 F0 峰值,但被静音分开时,同样不会发生补偿。 “The conditions were named N (normal baseline), S (steep baseline), M (monotonous baseline) and IS (isolated peaks).” 这些条件分别命名为 N(正常基线)、S(陡峭基线)、M(单调基线)和 IS(孤立峰值)。 “the conclusion must be that listeners are sensitive to perceived amount, but besides are influenced by expected amount of declination.” 结论一定是听众对感知到的下倾量很敏感,但除此之外还受到预期下倾量的影响。
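One way to picture the compensation hypothesis is to measure each peak as an excursion above the declining baseline rather than as an absolute pitch. The sketch below assumes full compensation against a straight baseline expressed in semitones, which is a simplification for illustration only: the quoted conclusion says that listeners respond to both the perceived and the expected amount of declination, so this is not the authors' model. All names and numbers are hypothetical.

```python
def excursion_above_baseline(peak_st, t_s, baseline_start_st, slope_st_per_s):
    """Height of a peak in ST above a straight baseline evaluated at time t_s.

    Assumes full perceptual compensation, an idealization for illustration.
    """
    baseline_at_t = baseline_start_st + slope_st_per_s * t_s
    return peak_st - baseline_at_t


# Hypothetical stimulus: second peak is 1.5 ST lower in absolute terms.
first_peak = (6.0, 0.5)   # (pitch in ST, time in s)
second_peak = (4.5, 2.0)

# Declining baseline (-1 ST/s): both peaks have the same excursion.
declining = [excursion_above_baseline(p, t, 0.0, -1.0) for p, t in (first_peak, second_peak)]

# Monotonous baseline (slope 0): the absolute difference is left uncompensated.
monotonous = [excursion_above_baseline(p, t, 0.0, 0.0) for p, t in (first_peak, second_peak)]
```

With the declining baseline the two excursions come out equal, matching the prediction that an objectively lower second peak is heard as equally high; with the flat baseline the 1.5 ST difference remains, matching the prediction for condition M.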
Slope - a predictor of total duration?
“If a speaker intends to say a long sentence, but for some reason stops at a point at which the word content could suggest that the sentence is finished, can a listener then nevertheless decide, on the basis of phenomena connected with declination alone, that the sentence is not yet finished?” 如果说话者打算说一个长句子,但由于某种原因停在单词内容可能表明该句子已完成的点上,那么听者是否仍然可以仅根据与下倾相关的现象判断出该句子还没有说完? “three different cues may be present: viz. a higher start frequency, a less steep slope, and a higher end frequency than if the actually produced short utterance were said intentionally.” 可能存在三种不同的线索,即:与有意说出实际产生的短话语相比,更高的起始频率、更平缓的斜率和更高的终止频率。 “Two such bias effects were: if no pitch accent occurred (the presence or absence of pitch accents was one of the experimental variables), there was a bias towards 'not finished'; the longest sentence tended to be judged 'finished' more often than chance.” 两个这样的偏差效应是:如果没有出现音高重音(音高重音的存在与否是实验变量之一),则存在偏向“未完成”的偏差;最长的句子被判断为“已完成”的频率往往高于随机水平。
reset
reset - Relation with syntactic structure
“The general tendency in the corpus is that the part of the sentence after the reset does not link up with the constituent immediately preceding it, but with an earlier constituent” 语料库中的总体趋势是重置后的句子部分不与紧邻其之前的成分联系起来,而是与更早的成分联系起来
reset - Influence on interpretation
“one or more resets are steeper than the slope in an uninterrupted baseline. This implies that a reset can be introduced in an uninterrupted baseline by rotating clockwise the baseline parts on either side of the point where the reset is wanted, the left side round its starting point, the right side round its endpoint. Likewise, a reset can be removed by rotating the parts in the other direction until they have become collinear.”
“resets were introduced or removed to see if, in the first instance, listeners could hear the difference. The results were rather disappointing, probably due to the long durations and complicated structures of the sentences.”
“a set of twelve much simpler ambiguous sentences was constructed.”
“The outcome of these experiments could lead to the generalizing view that it is local events only that have a communicative function, since, of course, resets are local events.”
“an alternative way of reasoning is also possible. The fundamental frequency of a speaker is bound to a lowest limit. It is by virtue of this limit that the global phenomenon of declination must inevitably every now and then be interrupted by a reset. In other words, if the declination slope were nil, resetting would not be necessary. In that sense, not only to declination resets, but also to declination, a communicative function can be attributed.”
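The rotation procedure in the first quote can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: "rotation" is approximated by multiplying the slope (time and frequency axes have different units), both parts are given the same steeper slope, and the function name `introduce_reset` and all numbers are hypothetical.

```python
def introduce_reset(t0, f0, t1, f1, t_reset, steeper_factor):
    """Split a linear baseline from (t0, f0) to (t1, f1) at t_reset.

    The left part keeps its starting point and the right part keeps its
    endpoint; both get the original slope times steeper_factor.
    Returns (new_slope, left_end, right_start, reset_size).
    """
    slope = (f1 - f0) / (t1 - t0)
    new_slope = slope * steeper_factor
    left_end = f0 + new_slope * (t_reset - t0)      # pivot: starting point
    right_start = f1 - new_slope * (t1 - t_reset)   # pivot: endpoint
    return new_slope, left_end, right_start, right_start - left_end


# Hypothetical baseline: 10 st falling to 6 st over 4 s, reset wanted at 2 s.
print(introduce_reset(0.0, 10.0, 4.0, 6.0, 2.0, 1.5))  # upward jump at 2 s
print(introduce_reset(0.0, 10.0, 4.0, 6.0, 2.0, 1.0))  # collinear, no reset
```

With a factor of 1.0 the two parts are collinear again, which corresponds to removing the reset by rotating the parts back.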
“5.4 Concluding remarks”
“By means of the application of the close-copy technique, it has become possible to determine the slope of the baseline more reliably”
“since it can be done not only on the basis of mere visual criteria, but also with the aid of the required perceptual equality between close-copy and resynthesized original”
Production of declination and resets
“As for the production of declination, evidence already existed in favour of a relationship with the decreasing subglottal pressure”
“For resets, on the other hand, there is substantial evidence that these are voluntarily controlled by the speaker, witness the laryngeal activity involved in their production.”
Declination and resets in short and long utterances
“For relatively short utterances, the speaker may in most cases well know in advance their duration.”
“Longer utterances are more difficult to preprogram as a whole, but, since the probability of the occurrence of resets increases with utterance length, knowing in advance their entire duration becomes a far less important issue.”
“signalling function of declination appears to be restricted, since its possible acoustic attributes, in as far as they are systematically present, are only just above the threshold of perception in most cases.”
Some follow-up questions
“the question of whether the mental projection of the baseline can be interfered with when different declination slopes are applied to the stimulus material.”
“examine whether listeners' interpretations can be influenced by manipulating the location of resetting”
“Linguistic generalizations”
“6.0 Introduction”
“Our study of Dutch intonation has resulted in a melodic description of (nearly) all possible pitch contours of that language”
“Contours are global melodic entities that tend to coincide with clauses or complete utterances. They can be broken down into the structural units that we labelled Prefix, Root and Suffix configurations, each of which consists of one or more discrete pitch movements.”
“The atomistic pitch movements are the elementary descriptive units of our melodic model. We have supplemented their perceptual characterization with an acoustic definition that can be used to control the F0 parameter of synthetically produced speech”
“We have also established a link between perceptually relevant pitch changes and the voluntary manoeuvres that a speaker executes in order to produce variations in the course of F0”
“insights as to how the infinite variety of pitch contours relates to a restricted set of more abstract melodic categories or intonation patterns.”
“different contours belong to the same intonation pattern if they share a common Root configuration”
“'grammar of intonation', which is primarily a generative device that produces well-formed strings of pitch movements without much internal structure”
“It may be attempted to express melodic regularities in terms of basic forms, which then enter a set of rules that derive all possible alternative representations”
“efforts to make the structure of Dutch intonation”
“number of descriptive units and categories and will show how they apply to abstract intonational structures”
“a set of derivation rules that convert underlying intonation patterns into more elaborate and concrete melodic entities”
“how melodic structures and textual (sentential and segmental) elements can be mapped onto each other”
“examine whether our own generalizations can profitably be pushed one step further by introducing even higher levels of abstraction.”
“6.2 Ladd's analysis of the 'hat pattern'”
“a 'top-down' approach, going from underlying forms to systematic phonetic representations, should never lose sight of the phonetic reality at the bottom end of the derivational process. Therefore, in working out an alternative analysis of our own, we have proceeded in a 'bottom-up' fashion, and have avoided positing underlying representations that depart from surface phonetic forms.” (p. 167)
“6.3 General discussion”
melodic features of Dutch
“'basic contour' is defined by the grammar: it is any sequence of pitch movements that can be generated by its first rule, (R1).”
“Basic contours minimally consist of an obligatory Root, which can be preceded by a Prefix and - in certain cases - be followed by a Suffix. Their maximum expansion is obtained by multiplying the number of Prefixes.”
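The shape of a basic contour described above (optional repeated Prefix, obligatory Root, Suffix only in certain cases) can be illustrated with a toy generator. This is a sketch under assumptions, not the book's rule (R1): the labels `RootA` and `RootB`, the flag saying which Root may take a Suffix, and the function name `basic_contours` are all hypothetical, and real Prefix, Root and Suffix configurations consist of specific pitch movements that the notes do not list.

```python
def basic_contours(roots, max_prefixes):
    """Enumerate contours of the form Prefix* Root (Suffix).

    roots maps a Root label to True if a Suffix may follow it
    (which Roots allow this is an assumption here).
    """
    contours = []
    for n_prefixes in range(max_prefixes + 1):
        for root, allows_suffix in roots.items():
            base = ["Prefix"] * n_prefixes + [root]
            contours.append(base)
            if allows_suffix:
                contours.append(base + ["Suffix"])
    return contours


# Hypothetical inventory: one Root that accepts a Suffix, one that does not.
ROOTS = {"RootA": True, "RootB": False}
for contour in basic_contours(ROOTS, max_prefixes=1):
    print(" + ".join(contour))
```

Raising `max_prefixes` lengthens the contours without changing the Root, which is one way to read the statement that maximum expansion comes from multiplying Prefixes.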
“Ladd's attempt to carry the generalizations one step further, to what may be called a 'phonological' level of intonation analysis”
“One major difference between his phonological and our phonetic description resides in the choice of descriptive units. Ladd and other phonologists (e.g. Gussenhoven, Liberman, Pierrehumbert) replace pitch movements by stationary pitch targets: H (high) and L (low)”
“two basic units, especially the H target, have inherent properties that can be modified by rule. Thus, for instance, the pitch peak inherent in H can become [+delayed], or the high pitch level of H can be 'downstepped', which results in a small local fall”
“the inventory of melodically important units is reduced to just two pitch targets, which are scaled in frequency (position in the speaker's pitch range) and in time (position in the syllable)”
“criticism against Ladd's approach to Dutch intonation is that the reduction of perceptually distinct pitch movements to just two abstract pitch targets undermines the identity of the intonation patterns. The latter differ categorically from each other because their Roots consist of essentially different pitch movements.”
“An important measure for the success of the phonological enterprise will be whether the proposed solutions are psycholinguistically and phonetically plausible.”
“Applications”
“7.1 Existing applications”
“7.1.1 Acquisition of intonation”
“explicit descriptions have become available, it has become possible to choose a different approach: viz. one in which the student is told explicitly what the rules are, in order to have them internalized. Such a strategy could be called a cognitive approach.”
“The acquisition of the intonation of a foreign language was until recently dependent on 'listen-and-repeat' drills. The disappointing result, as it was often felt, was that students talented in imitation, who would have been able to learn it anyhow, even without these drills, could profit from them, but the others would not.”
“The results of our investigation do not merely consist of a description of the physical events: it has also been our aim to give an account of the perceptual structure in F0 curves. Consequently, we thought it possible to provide the student with explicitly described perceptual targets. We have therefore tried to devise a teaching method that would make the student conscious of the same features of intonation as those that are relevant to the native ear.”
“A possible objection to such a method could be, that, in general, too much concentration on isolated events in a training programme might prevent the student from learning to produce the aimed-at activity in an integrated way”
“However, our investigations on intonation have revealed that the perceptually relevant pitch movements do constitute recombinable elements, to the effect that, within the sequential constraints of a pattern, fully acceptable pitch contours are obtained.”
“what we expected to be essential for the student: to be able to hear these elements so analytically that their perceptual targets could be established.”
“We expect that a confrontation with stylized pitch contours in a visual-feedback condition will turn out to give larger effects, since in that case, the student will immediately see the perceptually relevant movements as discrete events.”
“an instruction of a quarter of an hour in which subjects were made aware of the phenomenon of intonation by means of a number of auditorily and visually presented demonstrations” [Note: the experiment showed that this had some effect.]
“7.1.2 An intonable electrolarynx (for Dutch”
“7.1.3 An aid for the vocally handicapped”
“7.1.4 Application in linguistic and phonetic research”
“Our contribution to the study of pitch in speech constitutes a necessary, but only modest, first step towards tackling more central issues of intonation”
“Our analysis has only brought to light which variations of F0 can occur in speech; it has only to a limited extent answered the question under which conditions which pitch movements can take place, or the question under which conditions which basic patterns are used.”
“we know now how accentuation and the marking of prosodic boundaries are connected to particular melodic phenomena, we do not know which words should or may be accented in a given context, or whether or not a given syntactic boundary should or may be marked intonationally.”
“Yet, it is our conviction that the more central issues can hardly be approached fruitfully without the tools lower-level analysis has provided: it is useless to try to examine why or under what conditions potentially communicatively relevant events occur in intonation as long as the means to describe these events adequately are lacking. Quite a few authors have shown that they share this opinion by using our results in their own investigations.”
“We believe that the substantial improvement of the possibilities of giving an adequate auditory transcription of speech pitch is largely due to the fact that a set of suitable descriptive units have become available.”
“the transcriber should first make himself familiar with the inventory of the movements, and with the combination restrictions as recorded in the grammar. The familiarity should be so intense that the transcriber has learnt to recognize the various movements. In this, help can be offered in specific training, which has now become possible, thanks to the explicitness of the description”
“7.2 Potential applications”
“7.2.1 Second-language acquisition by Dutch students”
“We believe that on the basis of the IPO approach, with its emphasis on the perceptually relevant pitch movements, it should be possible to design intonation courses for foreign languages in which very explicit instructions are given about where to produce which kinds of pitch movement”
“of each of the target languages, on the other, can be compared in a feasible and reliable way. This may make it easier to take into account mother-tongue interferences that can be expected to occur.”
“7.2.2 Automatic stylization for use in feedback systems”
“we made the supposition that the use of stylized contours in the visual presentation of a teacher's model intonation and in the visual feedback of the student's imitations would give better results than can be obtained with only slightly smoothed F0 curves.”
“although correction of the intonation alone does not notably improve the intelligibility of the speech of the deaf, there is a tremendous positive interaction between segmental and suprasegmental correction, if applied together.”
“7.2.3 Synthesis by rule”
“a number of computer programs are available for making artificial pitch contours. These can be divided into two categories: viz. programs for expert users and programs for those people who, for some reason or other, want spoken output with intonation of their choice, but who are not phoneticians themselves”
“The former programs offer facilities ranging from full freedom in every respect, as is necessary for making close-copy stylizations, to freedom only with respect to the combinatory possibilities of the movements, which themselves have standard specifications.”
“In the latter programs, the intonation grammar (for either Dutch or British English) is built in, to the effect that all variants of all basic patterns can be made in fully standardized form, with the automatic exclusion of ungrammaticalities”
“investigators wanting to do psycholinguistic experiments on sentence comprehension, and fearing that (improper) intonation might constitute a disturbing factor, often avoided using stimuli with very large variations of F0. That is, in principle at least, no longer necessary when use is made of these programs”
“However, necessary input requirements for the latter programs are the choice of a basic pattern, the locations of vowel onsets of syllables that will be accented, of boundaries one wishes to mark, and of the end of voicing of syllables that are to receive a continuation rise or a final rise. Additionally, there are a great number of options for the generation of any of the variants of the basic patterns, as far as these are grammatically correct in the chosen melodic context.”
“Although the grammatical constraints are monitored by the program itself, non-expert researchers who want intoned spoken output still have to take care of quite a few detailed input requirements beyond their primary interest. This brings us to the point that, ultimately, one wants to have at one's disposal a system in which intonation can be generated entirely by rule.”
“in the foreseeable future, computer systems and other apparatus will communicate with their users partly in spoken language, rather than exclusively in written form on a screen.”
“It also has a heuristic value, since it shows the limits of one's knowledge, in the sense that incomplete knowledge will betray itself sooner or later in the generation of unacceptable output. We will elaborate on this by means of a few examples about boundary marking.”
“Apart from the problem of determining which boundaries should be marked, may be marked or should not be marked, much the same problem exists for accentuation. But, of late, this problem has been studied intensively, also in the Netherlands, and the results justify the expectation that in future text-to-speech systems the number of errors (incorrect accentuation and also incorrect de-accentuation) can be kept limited, and that such errors will not always be very disturbing to the listener.”
“develop office computer-systems with spoken input and output, using the German language. The assignment of pitch accents is partly based on a syntactic analysis of the message the system is going to produce as output. In order to enable the system to formulate a correct answer to the question spoken by the user, this input message is submitted to a semantic analysis. This offers the possibility of basing the pitch-accent assignment also on the content of the input message”
“7.2.4 Application in automatic speech recognition”
“prosodic analysis can be helpful in, roughly, three different respects:”
“1. The detection of stressed syllables may serve to provide 'islands of phonetic reliability' (p. 170) or 'anchors for reliable phonetic analysis' (p. 201): their segmental phonetic analysis can be trusted more than that of other syllables.”
“2. As far as the speech-recognition system makes an appeal to syntactic information, 'prosodic analysis offers an independent way of acoustically detecting some aspects of syntactic structure'”
“3. Prosodic features constitute cues to voicing, to locations of syllable nuclei (vowels), to occurrences of glottal stops, etc.”
“three important requirements should be met in order to make a prosodic analysis applicable”
“the technical facilities should be sufficiently refined to detect the sometimes very subtle effects of prosody on the speech signal”
“the analysis system must be instructed what to look for; in order to be able to recognize a specific contribution of any of the prosodic parameters, the underlying prosodic system of the language at issue should be sufficiently known.”
“it should be possible to evaluate the extent to which the prosodic analyser is successful as such, and the extent to which it contributes to the automatic speech-recognition process.”
“7.3 Conclusions”
“there is a tremendous amount of work still to be done.”
“For instance, the writing of a course for Dutch students of British English has not yet been started, although that was one of the main objectives from the very outset of our study of British English intonation”
“an experiment will have to be done to provide support for our conviction that an explicit, cognitive course of intonation will appear to be more helpful than the traditional listen-and-repeat method.”
“in order to be applicable in speech training for deaf children, it should be implemented in hardware, to make real-time operation possible.”
“Next, a training programme will have to be designed that will cope with many uncertainties, among others those resulting from the fact that congenitally deaf people do not know what pitch is.”
“although the primary aim of the analysis was not to collect precepts for the control of F0 in synthetic speech, yet our findings can successfully be used in the generation of artificial speech.”
“However, when it comes to the synthesis of entire texts, the repetition of standard contours, derived from only one basic pattern, may give rise to a lower acceptability than is measured in an experiment with isolated sentences, with a systematic alternation of basic patterns.”
“Therefore, efforts are still being invested in attempts to develop rules for variation of excursion, and for a feasible alternation of boundary markers”
“Moreover, experiments are being carried out to examine the conditions in which microintonation phenomena (including vowel-intrinsic pitch) may contribute to liveliness and naturalness in artificial intonation.”
“Conclusion”
Summary
“we would like to assess the main points that we have tried to make in establishing the feasibility of approaching pitch in speech from a mainly perceptual point of view.”
“This focussing on perception was initially inspired by the dissatisfaction we felt with the state of the art at the start of our programme.”
“These are to be found in the inadequacy of the methods used within a linguistic framework, based on the primacy of semantic functioning, the principle of distinctivity”
“top down versus bottom up, leading to our endeavour to reconcile them in our own attempt at modelling the listeners' way of processing intonational cues in speech”
“8.1 Top down”
Opening: many phenomena cannot be explained within a purely linguistic framework
“The broad field of prosodic cues allows a large amount of freedom to language users to alter or modify their intended vocal expressions in any number of dimensions, including mood, awareness of purpose to elicit reactions from a partner in a dialogue, to reach and convince an audience, to emphasize and contrast any chosen part of a message, none of which clearly can be accounted for within a purely linguistic framework. Such a narrow linguistic approach would seem to run the risk of seriously underdifferentiating important intonational cues.”
Development: the principle of distinctivity cannot differentiate the subtle shades of meaning involved on a prosodic level
“the principle of distinctivity, which proved to be such a helpful tool in clearly discriminating discrete differences in meaning on the phonological level, cannot easily be upheld: here the existence of a lexicon provides a reliable criterion to decide whether or not observed segmental differences give rise to differences in meaning of words; such a yardstick is not available in intonational matters. Indeed, the principle of distinctivity is incapable of differentiating the subtle shades of meaning involved on a prosodic level. Moreover, in order to account for attitudinal meaning, one needs a gradual scale instead of all-or-none decisions.”
Turn: from top down to bottom up
“We therefore decided, for the time being, to lay aside this top-down approach, and to start from the other end, working bottom up. This option in reality involved shelving the principle of distinctivity until later and concentrating on trying to discover structural properties of the phenomenal attributes of intonation in terms of melodic characteristics, to be established in perceptual experiments.”
Resolution: working bottom up, the units for describing intonation must be established first
“The reason for staying clear of the allegedly straight and narrow path of linguistic distinctivity is not so much the intricacy inherent in meaning as such, formidable as it is, in particular with respect to intonation, but rather the absence of a reliable tool with which to reveal the actual pitch phenomena that are supposed to represent the various shades of meaning. Therefore, the major reason for adopting our own way of dealing with intonation was that, at the start of our efforts, the basic groundwork, on which a reliable description of the characteristics of intonation would have to depend, was lacking”
“8.2 Bottom up”
“attempting to set up the necessary framework for acquiring reliable tools in describing intonational features”
“we were aware of the opposite danger to underdifferentiation: viz. that of overdifferentiating pitch cues that may be shown to obtain in the speech signal under laboratory conditions in a psychoacoustic setting”
“need to try to reduce the available acoustic data within a manageable framework”
“as compared with the segmental field, where there is a consensus about the descriptive entities, nothing so ready to hand is found in intonation”
“establish, by experimental means, a model of how listeners deal with pitch cues in speech”
“we termed our basic assumption. This implied that only those F0 changes would be regarded as possible candidates for a descriptive model of pitch for which a link could be established with commands to the vocal-cord mechanism, which as such are under the speaker's control.”
“The notion of this link was subsequently backed by studying, of necessity, the production aspect of intonation, including that of declination.”
Steps in building the model
“setting the task of mapping a model of the listener's behaviour with regard to pitch in speech”
“what is relevant for the listener has resulted from the outcome of listening experiments with the help of manipulated pitch contours”
“able to sustain our conviction that whatever information is ultimately carried by pitch is not confined to linguistic distinctivity”
“Through this bottom-up approach, we arrived at the perceptually relevant pitch movements on which to base the descriptive device we were looking for.”
“segmental phonetics: distinguish in phonetic transcription between 'broad' and 'narrow'”
“broad approach are insufficient to indicate specific renderings for which a narrow transcription would have to be made available”
“a broad framework would allow one to distinguish such categories as exclamation, question, surprise, etc.”
“A narrow approach, in its attempt to account for more detailed information, runs the risk of bringing to light small variations whose communicative significance remains unclear”
“In our experimental approach towards intonational phenomena, we found that there is a fairly large amount of freedom to opt for variants belonging to one and the same basic intonation pattern.”
“Since many of these variants have been shown to be perceptually distinct, they cannot be left out of the account.”
“There are two limiting conditions that have to be regarded.”
“On the one hand, there are the psychoacoustic thresholds and tolerances that have to be accounted for in any perceptually based study of pitch.”
“On the other hand, there are the clear cases of linguistic relevance that ipso facto deserve to be included.”
“8.3 Reconciling top down and bottom up”
“It is in the intermediate field between psychoacoustics and linguistics that we believe the mapping of intonational features can best be undertaken”
“The main problem to be solved was to make available a reliable descriptive tool that was so far missing”
“The other issue to be squarely faced was that of the relationship between phonetic and functional aspects.”
“Our solution to the dilemma was to make perceptual tolerances the decisive factor in setting up criteria for how to cut up the pitch continuum into discrete pitch movements. This could only be achieved by means of the basic assumption, the link with commands to the vocal apparatus.”
“It enabled us to come up eventually with a tool kit of pitch movements, being the discrete descriptive units, and a firm link with listeners' judgments about the acceptability of synthesized contours.”
“sticking to our tools of the experimental phonetician, but with a clear eye for the need to regard functional aspects, we have come up with a grammar of intonation, a model of possible and perceptually tested intonation patterns that are the core of the native speakers' intonational competence.”
“It allows them to make contact with the verbal content by means of well-formed intonational features afforded by the intonation grammar and impinging, as far as pitch accents are involved, on segments selected by both pragmatic and syntactic requirements.”
“There are a number of issues that have been decided on the way” 一路上有若干问题已经有了定论
“It is the pitch movements, rather than levels, which provide the cues to listeners for whatever segment in an utterance is supposed to be made salient by carrying a pitch accent” 正是音高运动,而不是音高水平,为听众提供了线索,让他们知道话语中哪个片段应当通过承载音高重音而得到突出。
“the proper way of dealing with the acoustic correlate of pitch, F0, should be rendered in terms of semitones rather than Hertz, due to the mechanism of the auditory apparatus” 由于听觉器官的机制,处理音高的声学相关物 F0 的恰当方式是用半音而不是赫兹来表示
“The phenomenological datum that pitch is heard as a continuum in spite of the presence of voiceless and therefore pitchless interludes is taken care of by means of the notion of pitch contours. The sequential regularities involved in such contours are generated by the intonation grammar.” 尽管存在清音因而无音高的间隔,音高仍被听作连续体,这一现象学事实是通过音高轮廓的概念来处理的。这种轮廓中涉及的顺序规律是由语调语法生成的。
“The concentration on pitch as one of the prosodic features at the exclusion of others has been a working hypothesis that has stood the test very well.” 只关注作为韵律特征之一的音高而排除其他特征,一直是一个工作假设,并且很好地经受住了检验。
“This is by no means intended to imply that a proper study of the contribution of temporal features, including pauses, can be overlooked. On the contrary, our efforts in working towards a fully automated text-to-speech conversion have taught us that a lot can and should be gained by just such a study. However, the absence of a reliable frame for rendering pitch has forced us to take this restricted approach.” 这绝不意味着可以忽视对时间特征(包括停顿)的贡献的适当研究。相反,我们在实现完全自动化的文本到语音转换方面所做的努力告诉我们,这样的研究可以而且应该收获很多。然而,由于缺乏可靠的呈现音高的框架,我们不得不采取这种受限的方法。
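The remark above about rendering F0 in semitones rather than Hertz can be illustrated with the standard logarithmic conversion, st = 12 · log2(f / f_ref). The sketch below is only an illustration, not taken from the book: the function name `hz_to_semitones` and the example frequencies are assumptions chosen for the demonstration.

```python
import math


def hz_to_semitones(f_hz: float, ref_hz: float) -> float:
    """Distance in semitones from ref_hz up to f_hz (negative if f_hz is lower)."""
    if f_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_hz / ref_hz)


# Two rises with different sizes in Hertz (50 Hz and 100 Hz) ...
low_voice_rise = hz_to_semitones(150.0, 100.0)
high_voice_rise = hz_to_semitones(300.0, 200.0)

# ... cover the same musical interval, because the frequency ratio is the same.
print(round(low_voice_rise, 2), round(high_voice_rise, 2))
```

On a semitone scale the two rises come out equal, which is why excursions from speakers with different pitch ranges become comparable; on a Hertz scale the second rise would look twice as large.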
“even more stringently, for what we do not yet know about the interface between linguistic parameters and phonetic output.” 更严格地说,对于语言参数和语音输出之间的接口,我们尚不了解的部分更是如此。
“that the relations between linguistic and phonetic aspects, or in broader terms between language and speech, constitute an exciting and perennial problem.” 语言和语音之间的关系,或者更广泛地说,语言和言语之间的关系,构成了一个令人兴奋且长期存在的问题。
“8.4 Integrating the linguistic code and the speech signal”集成语言代码和语音信号
“the problem of how to match considerations from the formal and functional points of view” 如何协调形式角度与功能角度的考虑这一问题
“the communicative function is of paramount importance in tackling the task of putting structure on the recalcitrant phenomena of pitch cues in speech” 在为言语中难以驾驭的音高线索现象赋予结构这一任务中,交际功能至关重要
“general position”
“speech as such is the most natural way in which linguistically encoded messages can be framed.” 语音本身是构建语言编码消息的最自然方式。
“As in every coding system, the resulting messages are prone to errors coming from three possible sources: the encoder, the decoder, and what can be indicated as noise.” 与每个编码系统一样,生成的消息很容易出现来自三个可能来源的错误:编码器、解码器以及可以称为噪声的因素。
“This noise can be either external, due to unfavourable communication conditions, for example noisy surroundings, or internal, as for example caused by inherent ambiguity of the linguistically encoded message. The way to overcome these disturbing influences is constituted by the introduction of redundant information, so that parts of the message that are obscured can still be decoded successfully by making use of the redundancies supplied in it.” 这种噪声可以是外部的,由不利的通信条件(例如嘈杂的环境)造成,也可以是内部的,例如由语言编码消息固有的歧义引起。克服这些干扰影响的方法是引入冗余信息,使得消息中被遮蔽的部分仍然可以利用其中提供的冗余成功解码。
“Natural speech is an extremely efficient code in this respect, since it is usually abundant in just this matter of supplying extra cues to secure communication.” 在这方面,自然语音是一种极其高效的编码,因为它通常恰恰富于提供额外线索来保障交流。
“Prosodic features as such are eminently capable of taking care of this aspect. After all, most language communication can also be brought about in written form in which prosodic cues are not involved.” 韵律特征本身就非常适合承担这一任务。毕竟,大多数语言交流也可以以不涉及韵律线索的书面形式进行。
“Intonation, as one of the constituents of speech prosody, can be removed from a speech message without seriously damaging the linguistic content of the message. Such is, for example, the case in speech which is artificially supplied with a monotone. Nevertheless, since natural speech is accompanied by pitch modifications, it stands to reason to study this phenomenon in an effort to impose a structure on the seemingly capricious pitch excursions of the vocal mechanism.” 语调作为语音韵律的组成部分之一,可以从语音消息中删除,而不会严重损害消息的语言内容。例如,人为赋予单一音调的语音就是这种情况。然而,由于自然语音伴随着音高变化,因此有理由研究这种现象,以努力为发声机制看似反复无常的音高偏移赋予一种结构。
“What the student of intonation is faced with, therefore, is the task of grasping the impact of how this extra information contained in the surface output of natural speech is brought about” 因此,语调研究者面临的任务是把握自然语音表面输出中包含的这种额外信息是如何产生的,以及它有何影响。
“It is our conviction that while intonation is always embedded in a linguistically couched frame, clauses, sentences and the like, it follows its own rules, which differ from one language to another.” 我们坚信,虽然语调总是嵌入在以语言形式表达的框架(从句、句子等)之中,但它遵循自己的规则,这些规则因语言而异。
“Nevertheless, these rules will always be determined to a certain extent by the same abstract properties inherent in all intonation systems. Hence it falls legitimately within the scope of linguistics.” 然而,这些规则在一定程度上总是由所有语调系统固有的相同抽象属性决定。因此,它理所当然地属于语言学的范围。
“On the other hand, since intonation is an observable property of speech, it stands to reason that its study falls within the field of phonetics.” 另一方面,由于语调是言语的可观察属性,因此它的研究理所当然地属于语音学领域。
“intonation highlights the background information that is provided by the linguistically encoded message” 语调突出了语言编码消息提供的背景信息
“Through the wear and tear of speech communication over the ages, intonation is eminently suited to fulfil this additional role of providing extra cues to safeguard efficient transmission of linguistic information contained in the verbal message” 历经世代言语交流的打磨,语调非常适合履行这种额外的作用,即提供额外的线索,以保障言语信息中所含语言信息的有效传输
“On the one hand, it enables listeners in noisy surroundings, where more speakers are involved, to focus on the message carried by one speaking source.” 一方面,它使听众在嘈杂的环境中(有多位说话者在场)能够专注于某一个声源所传达的信息。
“On the other hand, it will help in assessing the frame of mind in which speakers address them” 另一方面,它有助于听者判断说话者对其讲话时的心态
“a descriptive model in which the regularities that can be observed in studying what listeners make of intonational cues can be accounted for.” 一种描述性模型,其中可以解释在研究听众对语调线索的看法时观察到的规律。
主要思路:
“acknowledge the various layers that can be observed in the multitude of possible intonational functions that are concomitant with the linguistic message” 承认在与语言信息相伴的多种可能的语调功能中可以观察到的各个层次
“focus on the most rewarding aspect” 专注于最有价值的方面
“The resulting intonation grammar occupies the borderline between speech and language, since the linguistic background and the intonational capacity of foregrounding should be seen to match as closely as possible.” 由此产生的语调语法处于言语和语言的交界处,因为语言背景与语调的前景化能力应当尽可能紧密地匹配。
“However, since there is a great amount of freedom for language users to choose patterns of their own for pragmatic reasons, no obvious one-to-one relationships can be expected between linguistic background and phonetic foreground.” 然而,由于语言使用者出于语用原因有很大的自由选择自己的模式,因此不能指望语言背景和语音前景之间存在明显的一一对应关系。
“The study of this pragmatic aspect and the way it impinges on the actual choice of patterns and configurations being made is an interesting research objective for the near future. Such research will therefore have to concentrate on high-quality synthesis in fully automated text-to-speech systems.” 对这一语用方面及其如何影响模式和构型的实际选择的研究,是不久的将来一个有趣的研究目标。因此,此类研究必须集中于全自动文本转语音系统中的高质量合成。
“in spite of generally large tolerances, the selective acuity of the perceptual mechanism forces us to supply information over and above that of assigning proper placement of pitch accents.” 尽管容差通常很大,但感知机制的选择性敏锐度迫使我们提供除音高重音的正确位置之外的更多信息。
“In particular, additional information is required regarding direction, slope, and temporal alignment of pitch movements in quantitative terms. With this object in mind, we are once again back on phonetic ground, from which we started our research effort in the first place.” 特别是,需要以定量的方式提供有关音高运动的方向、斜率和时间对齐的附加信息。带着这个目标,我们再次回到了语音学的基础上,而我们的研究工作最初正是从这里出发的。
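As a rough sketch of what a quantitative specification of direction, slope and temporal alignment could look like, a single pitch movement can be stored as a small record. Everything in it is an assumption made for illustration: the class name `PitchMovement`, its fields, the choice of vowel onset as the alignment reference, and the numeric values are hypothetical and are not figures from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitchMovement:
    """One discrete pitch movement (hypothetical representation)."""
    excursion_st: float  # signed size in semitones: positive = rise, negative = fall
    duration_s: float    # how long the movement lasts, in seconds
    onset_s: float       # start time relative to an assumed reference (vowel onset)

    @property
    def direction(self) -> str:
        if self.excursion_st > 0:
            return "rise"
        if self.excursion_st < 0:
            return "fall"
        return "level"

    @property
    def slope_st_per_s(self) -> float:
        """Rate of change in semitones per second."""
        return self.excursion_st / self.duration_s


# Illustrative values only: a 5-semitone rise over 125 ms, starting 50 ms
# before the reference point, followed by a fall of the same size.
rise = PitchMovement(excursion_st=5.0, duration_s=0.125, onset_s=-0.05)
fall = PitchMovement(excursion_st=-5.0, duration_s=0.125, onset_s=0.10)
print(rise.direction, rise.slope_st_per_s)
print(fall.direction, fall.slope_st_per_s)
```

Keeping the excursion signed means direction and slope are both derived from the same two numbers, so the three properties named in the quotation cannot contradict each other in this representation.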