Beszédtudomány - Speech Science
Feed: https://ojs3.mtak.hu/index.php/besztud/issue/feed
Contact: Gráczi Tekla Etelka & Mády Katalin (besztud@nytud.hu)

The journal Beszédtudomány - Speech Science is the successor of the journal Beszédkutatás. The final issue of Beszédkutatás (No. 27, 2019) and the preceding online volumes are available at: https://ojs3.mtak.hu/index.php/beszkut/

https://ojs3.mtak.hu/index.php/besztud/article/view/11978
The realisation of the vocalic sequences 'iá' and 'ijá' in Hungarian pseudowords from the perspective of the acoustic correlate of tongue height
Authors: Kornélia Juhász (juhasz.kornelia8@gmail.com), Andrea Deme (deme.andrea@btk.elte.hu)

In this acoustic analysis we compare the realisation of iá /ia:/ and ijá /ija:/ in Hungarian pseudowords. We expect the orthographic representation to induce a contrast between these forms in the phonetic realisation, more particularly between the [j] that is not present in the orthographic form of the pseudoword (e.g. in iá /ia:/) and the [j] that is present in orthography (e.g. in ijá /ija:/). We suggest that the investigation of these realisations may serve as a basis for future analyses comparing (i) the epenthetic [j] appearing in hiatus and (ii) the [j] present in the assumed phonological representation of a word, since [j] is never marked in hiatus by orthography. We propose that, through orthographic facilitation, the setting of the present study forces speakers to maximally exaggerate any possible phonetic contrast between marked and unmarked [j] realisations (in otherwise identical phonetic contexts), which is analogous to the contrast between phonemic and non-phonemic [j]. The present study can therefore clarify whether any difference is to be expected when phonemic and non-phonemic [j] are compared.

The phonemic /j/ is claimed to be an approximant and a liquid, and is thus characterised by a more constricted vocal tract than, for example, the high vowel /i/. The epenthetic [j] of hiatus resolution, however, is considered to be a glide which, from a phonetic viewpoint, results from the acoustic transition between the articulatory/acoustic targets of /i/ and /a:/. On this basis, we expect the epenthetic [j] in the sequence iá /ia:/ to be articulated with a less constricted (more vowel-like) vocal tract than the phonemic /j/ in the sequence ijá /ija:/. To test this, we analyse the acoustic correlate of tongue height differences between the two [j] realisations, that is, we measure and analyse F1, expecting /j/ in ijá /ija:/ to show a narrower constriction in the oral cavity, reflected in a lower F1, than iá /ia:/.

We recorded [j] realisations in /ia:/- and /ija:/-shaped vocalic sequences in nonsense words, in two sibilant contexts, produced in isolation by 14 Hungarian female speakers. F1 frequencies were extracted automatically every 5 ms throughout the whole quasi-periodic phase of the signal in Praat. The resulting F1 curves were analysed with generalised additive mixed models (GAMMs), in which we modelled the effect of the normalised time point predictor on the dependent variable F1, adding vocalic sequence as a parametric term and a random smooth for each trajectory.
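The abstract does not include the extraction script itself. As a minimal sketch of the kind of measurement it describes (F1 sampled every 5 ms over a labelled vocalic interval), one could call Praat through the parselmouth Python interface; the file name, interval boundaries and formant-tracking settings below are illustrative assumptions, not the authors' actual setup.

    # Illustrative sketch only: F1 sampled every 5 ms with Praat (via parselmouth).
    # The file name, interval boundaries and formant settings are assumptions.
    import numpy as np
    import parselmouth

    def f1_track(wav_path, t_start, t_end, step=0.005, max_formant=5500.0):
        """Return (times, F1 in Hz) sampled every `step` seconds between t_start and t_end."""
        snd = parselmouth.Sound(wav_path)
        formant = snd.to_formant_burg(time_step=step,
                                      max_number_of_formants=5,
                                      maximum_formant=max_formant)  # 5500 Hz is a common ceiling for female voices
        times = np.arange(t_start, t_end, step)
        f1 = np.array([formant.get_value_at_time(1, t) for t in times])
        return times, f1

    # Example call (hypothetical file and segment boundaries):
    # times, f1 = f1_track("speaker01_ija.wav", t_start=0.213, t_end=0.512)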
Our results showed that, regardless of the sibilant context, there was a significant difference between iá /ia:/ and ijá /ija:/ in the transitional phase connecting the two targets (/i/ and /a:/): /ija:/ showed a lower F1 than /ia:/, which reflects a narrower constriction in the oral cavity in ijá /ija:/. We therefore concluded that speakers may differentiate [j] variants that are or are not marked in orthography, and it is possible that they apply this differentiation when producing the phonemic [j] and the epenthetic [j] that surfaces in hiatus resolution.

Published: 2024-08-07. Copyright (c)

https://ojs3.mtak.hu/index.php/besztud/article/view/11975
The question of the weakening of the target-language effect on the native language under an increasing number of native-language stimuli
Author: Kornélia Juhász (juhasz.kornelia8@gmail.com)

This acoustic analysis focuses on how an atonal L1 and a tonal L2 interact in the case of Hungarian learners of Mandarin Chinese. In particular, the experiment intends to shed light on whether the effect of L2 Chinese tonal patterns on L1 Hungarian intonation contours weakens over the course of the experiment, as the number of produced L1 utterances increases. It was hypothesised that at the beginning of the Hungarian L1 recordings the learners' production is primarily shaped by the L2-dominant bilingual language mode, and thus L1 Hungarian intonation patterns approximate the shape of the L2 Chinese tonal curves. As the learners produce more L1 utterances over the course of the recording, however, their production is hypothesised to approach the standard native L1 patterns gradually, due to the weakening of the L2 tonal effect. Since we expected L2 tonal effects to depend on the learners' L2 experience as well, we analysed two speaker groups with different levels of L2 experience. The effect of the L2 tones was analysed through the f0 curve and the duration of the vocalic section of monosyllabic utterances recorded with four different L1 tunes: declarative, imperative, and two interrogative intonation patterns. The data were analysed with GAMMs, in which the f0 change was modelled along the normalised duration of the vocalic section, as well as across the recording session by ordering the utterances by their ordinal number. Our results did not confirm a gradual weakening of the L2 effect on L1 intonation patterns; rather, they suggest that the sudden change between L1 and L2 induces a more dynamic excursion towards the L1 language mode, which is followed by a return to the L2-dominated language mode approximating L2 tonal patterns. In light of these results, the question arises whether longer recordings with more utterances would yield a different outcome regarding the weakening of the L2 effect on L1 intonation patterns. The results of the experiment also contribute to a deeper understanding of which acoustic features Hungarian native speakers enhance across repetitions of the same L1 sentence type in monosyllabic utterances.

Published: 2024-11-07. Copyright (c) 2024 HUN-REN Nyelvtudományi Kutatóközpont
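Neither this nor the previous abstract includes analysis code. As a minimal, assumption-laden sketch of the preprocessing step described here (an f0 contour extracted over a vocalic section and resampled onto a normalised time axis before GAMM fitting), Praat can again be driven from Python via parselmouth; the file name, segment boundaries, pitch range and the 20-point grid are illustrative, not the author's settings.

    # Illustrative sketch: f0 over a vocalic section, resampled onto a normalised time axis.
    # File name, boundaries, pitch floor/ceiling and the 20-point grid are assumptions.
    import numpy as np
    import parselmouth

    def normalised_f0(wav_path, t_start, t_end, n_points=20, floor=75.0, ceiling=500.0):
        snd = parselmouth.Sound(wav_path)
        pitch = snd.to_pitch(pitch_floor=floor, pitch_ceiling=ceiling)
        times = np.linspace(t_start, t_end, n_points)
        f0 = np.array([pitch.get_value_at_time(t) for t in times])  # NaN where unvoiced
        norm_time = np.linspace(0.0, 1.0, n_points)                  # 0..1 within the vocalic section
        return norm_time, f0

    # Example call (hypothetical utterance and vowel boundaries):
    # norm_time, f0 = normalised_f0("learner03_item12.wav", t_start=0.108, t_end=0.342)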
https://ojs3.mtak.hu/index.php/besztud/article/view/11324
A speech research experiment on exploring the types of the discourse marker 'hát'
Authors: Ákos Gocsál (gocsal@gmail.com), Anna Szeteli (anna.szeteli@gmail.com), Gábor Szente (szenteg@gmail.com), Gábor Alberti (alberti.gabor@pte.hu)

Although many believe that the frequently occurring Hungarian discourse marker hát ('well/so') is a superfluous filler, previous research has pointed out its multifunctional nature. It harmonizes the communicating minds, preparing the listener to receive new information. Hát therefore works like a semaphore and has a crucial role in human communication. This paper examines whether there are differences in duration parameters between háts expressing different meanings. In this study, 53 speakers (28 female, 25 male, all university students and speakers of standard Hungarian) read a text imitating spontaneous speech. In the utterances, hát appeared in ten different functions (h1 straightforward, h2 uncertain, h3 uneasy, h4 teasing, h5 resigning, h6 introductory, h7 summative, h8 evaluative, h9 sentence-final inferring, hf sentence-final confirming). The durations of the háts and of the following pauses (where hát was not in final position) were measured. The duration of háts with summative, straightforward, introductory or evaluative functions was shorter, while those expressing negative attitudes or uncertainty were longer. Teasing represented a separate category, as did the two sentence-final types. Differences were also found in the durations of the pauses following the háts: pauses following the uneasy háts were significantly longer than those following the straightforward and summative ones. The results confirm the significance of prosody in the different uses of hát. Possible applications of the results in speech technology and language teaching are raised, and it is also highlighted that hát is a natural element of the Hungarian language.

Published: 2024-11-07. Copyright (c) 2024 HUN-REN Nyelvtudományi Kutatóközpont

https://ojs3.mtak.hu/index.php/besztud/article/view/11863
Distinguishing between dysarthria types based on acoustic parameters
Authors: Bernadett Dam (dam.bernadett@stud.u-szeged.hu), Lívia Ivaskó (ivasko@hung.u-szeged.hu)

Dysarthria is a motor speech disorder resulting from neurological impairments. Because of the variability of the impairments and of the deviant speech characteristics, it is useful to categorize dysarthria into types. The current study gives an overview of the main types of dysarthria, describing the different underlying causes, some deviant speech characteristics arising from those impairments, the corresponding acoustic parameters, and some possible methods for measuring the most relevant acoustic features. Six main groups of acoustic parameters were identified that could help distinguish between the types of dysarthria. Since the properties of the acoustic signal are connected to the manner of articulation, which in turn depends on the neuromuscular system, a precise description of the acoustic features of dysarthric speech could provide valuable information to aid localization and differential diagnosis.

Published: 2024-11-07. Copyright (c)
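The abstract above does not list the six groups of acoustic parameters. Purely as an editorial illustration of the kind of measures often reported for dysarthric speech (an assumption, not necessarily the parameter set of this study), the following sketch computes jitter, shimmer and the harmonics-to-noise ratio with Praat via parselmouth; the file name and analysis settings are hypothetical.

    # Illustrative sketch: common voice-quality measures (jitter, shimmer, HNR) via Praat/parselmouth.
    # The choice of measures, the file name and the settings are editorial assumptions.
    import parselmouth
    from parselmouth.praat import call

    def voice_quality(wav_path, floor=75.0, ceiling=500.0):
        snd = parselmouth.Sound(wav_path)
        point_process = call(snd, "To PointProcess (periodic, cc)", floor, ceiling)
        jitter_local = call(point_process, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
        shimmer_local = call([snd, point_process], "Get shimmer (local)",
                             0, 0, 0.0001, 0.02, 1.3, 1.6)
        harmonicity = call(snd, "To Harmonicity (cc)", 0.01, floor, 0.1, 1.0)
        hnr_db = call(harmonicity, "Get mean", 0, 0)
        return {"jitter_local": jitter_local, "shimmer_local": shimmer_local, "hnr_db": hnr_db}

    # Example call on a hypothetical sustained-vowel recording:
    # print(voice_quality("patient07_sustained_a.wav"))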
https://ojs3.mtak.hu/index.php/besztud/article/view/12076
Fundamental frequency characteristics of young people with learning disabilities (mild intellectual disability) in spontaneous speech
Author: Julianna Jankovics (jankovicsjuli@gmail.com)

Disorder of intellectual development (intellectual disability) is a collective term defined by three factors: reduced intelligence, deficits in adaptive skills, and the appearance of symptoms before the age of 18. Individuals with intellectual disability often experience impairments in general cognitive functions such as thinking and spatial orientation, which significantly impact their language production and perception. This study examines the prosodic structure of the spontaneous speech of young adults with mild intellectual disability. The main hypotheses are: (1) in all types of spontaneous speech, the fundamental frequency of people with mild intellectual disability is higher; (2) the two genders differ in their prosodic characteristics, with women showing a higher average fundamental frequency and a wider vocal range and interval than men; (3) the four types of spontaneous speech differ in their prosodic characteristics.

The study involved 16 participants with mild intellectual disability (8 women and 8 men), with an average age of 19.5 years, and 16 mentally healthy control subjects (8 women and 8 men) of similar ages. The classification of mild intellectual disability was determined on the basis of the BNO (ICD) code and the IQ values recorded in expert committee documents.

Four types of audio recordings were made for the study, including a two-part interview, a picture description, and a narrative recall. The recordings were annotated in Praat, and scripts were used during the analysis to ensure accuracy. The scripts provided the average fundamental frequency (f0), the f0 minimum and the f0 maximum for each speech segment. In addition, the vocal range and the interval, i.e. the distance between the highest and the lowest fundamental frequency values, were calculated for each speech type and segment.

According to the results, the average fundamental frequency was higher in the speech of people with mild intellectual disability in all four types of recordings, and with respect to gender the average f0 of the women was higher, as expected. Furthermore, the prosodic characteristics of the individual speech types also differed.

Published: 2024-11-07. Copyright (c) 2024 HUN-REN Nyelvtudományi Kutatóközpont

https://ojs3.mtak.hu/index.php/besztud/article/view/11963
Towards decoding brain activity during passive listening of speech
Authors: Milán András Fodor (milanfodor@edu.bme.hu), Tamás Gábor Csapó (csapot@tmit.bme.hu), Frigyes Viktor Arthur (arthur@tmit.bme.hu)

The aim of the study is to investigate the complex mechanisms of speech perception and ultimately to decode the electrical changes occurring in the brain while listening to speech. We attempt to decode heard speech from intracranial electroencephalographic (iEEG) data using deep learning methods. The goal is to aid the advancement of brain-computer interface (BCI) technology for speech synthesis and, hopefully, to provide an additional perspective on the cognitive processes of speech perception.

This approach diverges from the conventional focus on speech production and instead investigates the neural representations of perceived speech. This angle opens up a complex perspective, potentially allowing us to study more sophisticated neural patterns. Leveraging the power of deep learning models, the research aims to establish a connection between these intricate neural activities and the corresponding speech sounds.

Although the approach has not achieved a breakthrough yet, the research sheds light on the potential of decoding neural activity during speech perception.
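The abstract does not describe the model itself. Purely as an editorial illustration of the general setup (frame-wise decoding of an acoustic representation of the heard speech from iEEG features), the following numpy sketch fits a linear ridge-regression baseline; the array shapes, the feature choice and the spectrogram target are assumptions, and this is not the deep learning architecture used in the study.

    # Illustrative baseline only: frame-wise linear (ridge) decoding of an acoustic
    # representation from iEEG features. Shapes and features are assumptions; this is
    # not the deep learning model of the study above.
    import numpy as np

    def fit_ridge(X, Y, alpha=1.0):
        """Closed-form ridge regression mapping X (frames x channels) to Y (frames x spectral bins)."""
        n_feat = X.shape[1]
        return np.linalg.solve(X.T @ X + alpha * np.eye(n_feat), X.T @ Y)

    # Toy example with random arrays standing in for time-aligned iEEG and spectrogram frames.
    rng = np.random.default_rng(0)
    X_train = rng.standard_normal((1000, 64))   # e.g. 64 electrode features per frame
    Y_train = rng.standard_normal((1000, 80))   # e.g. 80 mel-spectrogram bins per frame
    W = fit_ridge(X_train, Y_train, alpha=10.0)
    Y_pred = X_train @ W
    print(Y_pred.shape)                         # (1000, 80)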
Our current efforts can serve as a foundation, and we are optimistic about the potential of expanding and improving upon this work to move closer towards more advanced BCIs and a better understanding of the processes underlying perceived speech and its relation to spoken speech.

Published: 2024-11-07. Copyright (c)

https://ojs3.mtak.hu/index.php/besztud/article/view/11977
Revised annotation conventions in Hungarian speech corpora
Authors: Katalin Mády (mady.katalin@nytud.hu), Tekla Etelka Gráczi (graczi.tekla.etelka@nytud.hu), Anna Kohári (kohari.anna@nytud.hu), Péter Mihajlik (mihajlik@tmit.bme.hu)

This technical report presents the revised annotation conventions for one large and two smaller Hungarian speech corpora: the BEA Spoken Language Database, the Akaka Maptask Corpus and the Budapest Games Corpus. Annotations that rely on standard Hungarian orthography rather than on the actual, partly reduced phonetic realisations make it possible to run both linguistic and phonetic queries on a large amount of data. Since the vast majority of the recordings contain (semi-)spontaneous speech, non-lexical phenomena such as hesitations (filled pauses) and non-verbal events such as laughter are labelled. The frequency of occurrence of these phenomena is demonstrated on Release 1, a subset of the BEA database comprising speech samples from 115 speakers. Unsurprisingly, laughter and conversational grunts were more frequent in spontaneous speech when expressed in relative numbers. Hesitations occurred more often in semi-spontaneous speech than in read and spontaneous speech, showing that the task demanded a higher cognitive effort from the speakers. The majority of the questions was found in spontaneous speech, since the reading tasks did not include interrogatives.

Published: 2024-11-07. Copyright (c)
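The frequency counts reported above presumably come from querying the annotation tiers. As a minimal sketch of such a query, the following counts interval labels on one tier of a Praat TextGrid, assuming parselmouth's generic read() for Praat files; the file name, the tier index and the label strings ('hes', 'laugh') are assumptions, not the corpora's actual labelling scheme.

    # Illustrative sketch: counting non-lexical labels on one TextGrid tier via Praat/parselmouth.
    # File name, tier index and the label strings are assumptions, not the corpora's scheme.
    from collections import Counter
    import parselmouth
    from parselmouth.praat import call

    def count_labels(textgrid_path, tier=1):
        tg = parselmouth.read(textgrid_path)
        n_intervals = call(tg, "Get number of intervals", tier)
        labels = (call(tg, "Get label of interval", tier, i) for i in range(1, n_intervals + 1))
        return Counter(label for label in labels if label.strip())

    # Example on a hypothetical annotation file:
    # counts = count_labels("bea_release1_speaker001.TextGrid")
    # print(counts.get("hes", 0), counts.get("laugh", 0))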