Background: Transcutaneous vagus nerve stimulation (tVNS) has been reported to reorganize the sensory neural network in the central nervous system, similar to observations with traditional invasive vagus nerve stimulation. Very few studies have examined the impact of tVNS on the plasticity of the motor neural network. This exploratory study aimed to investigate the effect of tVNS on the motor neural network, as manifested in laryngeal muscle flexibility and activity during word-level pitch production.
Methods: Two volunteers participated in a single-subject design experiment. They were instructed to produce 128 Chinese lexical tones over 16 sessions. tVNS was applied in the second 8 sessions, after a stable baseline was achieved. Speech samples were collected with the CSL 4500b and measured in PRAAT.
Results: Of the four types of Chinese lexical tones, measured by fundamental frequency in Hz, three were produced more accurately after tVNS application, suggesting improved regulation of the targeted intrinsic laryngeal muscles. One lexical tone did not improve, as its regulation depends on muscles innervated by the superior laryngeal nerve, which was not targeted.
Conclusions: tVNS is a promising non-invasive technique that may modify the brain motor neural network, and it has the potential to facilitate the clinical treatment of motor speech disorders.
Transcutaneous vagus nerve stimulation, VNS, Lexical tone, Motor speech, Vagus nerve
Transcutaneous vagus nerve stimulation (tVNS) is a non-invasive technique that applies electrical currents through surface electrodes at selected locations to reorganize the brain neural network based on neuroplasticity [1-3]. This method overcomes limitations of traditional invasive vagus nerve stimulation, as it removes the need for surgically implanted stimulation devices. tVNS has been widely accepted as a means of treating a variety of health conditions such as pain, epilepsy, anxiety disorder, migraine, and autism spectrum disorders [9,10], and of improving associative memory.
tVNS was developed from the invasive procedure of vagus nerve stimulation (VNS), which is an FDA-approved therapy for treating individuals with depression and epilepsy. Studies have demonstrated that electrical stimulation of the vagus nerve can lead to cortical map plasticity. Engineer and colleagues [13-17] reported that repeatedly pairing tones with brief pulses of vagus nerve stimulation can reverse pathological neural activity in the brain sensory system. Porter and colleagues repeatedly paired vagus nerve stimulation with a specific movement and observed an increased representation of that movement in the primary motor cortex. Lundy, et al. tested laryngeal and vocal function in five individuals with a left-side vagal nerve stimulator implanted for intractable seizures, and they found that the left vocal fold demonstrated consistent abduction and adduction at 20 and 40 Hz respectively. Additionally, they observed that at 59 Hz and 83 Hz, four individuals showed continued immobility of the left vocal fold and torsion of the larynx to the left side, with the left arytenoid cartilage and aryepiglottic fold consistently torquing the hemi-larynx to the ipsilateral side. The results of the above-mentioned studies suggest that VNS paired with a specific movement could be a new approach for treating disorders related to abnormal movement representations.
Vagus nerve stimulation has been shown to improve sound signal perception in both animals and humans. For instance, Engineer, et al. examined whether VNS paired with tones could reverse brain changes in rats and found that the perceptual impairment in the rats was eliminated. When an electrical stimulation of 15 pulses at a frequency of 30 Hz was delivered to the left cervical vagus nerve in rats 300 times in a 2.5-hour period per day for 20 days, a reversal of brain changes was observed, which eliminated the noise-induced tinnitus. The authors proposed that vagus nerve stimulation restored neural activity and resulted in physiological and behavioral changes. To further examine the effect of VNS on speech sound perception, Engineer, et al. investigated whether VNS could reverse pathological primary auditory cortex plasticity in rats with noise-induced hearing disorder. Their results indicated that pairing VNS with speech sounds enhanced the auditory cortex response. Although it did not strengthen the response to novel speech sounds, the pairing of VNS and speech sounds provided a means to enhance speech sound processing in the central auditory system. The authors predicted that the use of VNS during speech therapy could improve outcomes for individuals with receptive language deficits.
However, traditional invasive VNS presents limitations in relation to cost effectiveness, risk to the patient, and the requirement of a medical provider to use the device [20-22]. In comparison, transcutaneous vagus nerve stimulation (tVNS) could produce effects that mimic those of traditional invasive VNS. Sharon, et al. investigated the short-term effects of tVNS in healthy people and found that tVNS led to significant changes in pupil dilation and EEG markers of arousal. The results supported the hypothesis that tVNS could elevate noradrenaline and other arousal-promoting neuromodulatory signaling, as observed in invasive VNS. In a case reported by Yuan, et al., a 28-year-old female patient with severe dysphagia was treated with tVNS, delivered as 50 ms bursts of biphasic rectangular-wave pulses followed by 2 seconds of rest, continued for 20 minutes, twice a day, 5 days a week for 6 weeks. Their tVNS targeted the pharyngeal branch and superior laryngeal branch of the vagus nerve. As a result, the researchers observed increased pharyngeal peristalsis due to the improved function of the constrictor muscles. Frangos and colleagues applied non-invasive electrical stimulation to the cymba conchae, a region of the external ear exclusively innervated by the auricular branch of the vagus nerve, in 12 healthy adults. The stimulation produced significant activation of the central vagal projections to the nucleus tractus solitarii in the medulla. Their results implied that tVNS could generate projections in the central nervous system consistent with those of traditional invasive VNS. Mansuri, et al. reported the adjuvant effects of tVNS in treating muscle tension dysphonia. In their study, Mansuri and colleagues compared the treatment results of voice therapy with and without tVNS among 20 women with muscle tension dysphonia. The group with tVNS demonstrated significantly positive outcomes in auditory perceptual assessment, acoustic voice analysis, the Vocal Tract Discomfort scale, and the assessment of musculoskeletal pain. In a speech category learning study, Llanos, et al. applied tVNS to 36 native English speakers to examine whether their learning performance was enhanced in the perception of four Mandarin Chinese lexical tones. The between-group analysis showed that the effects of tVNS emerged rapidly and were significantly different for the easy-to-learn group of lexical tones, defined as the tone height category.
Although the effects of tVNS on the sensory neural network in the central nervous system have been reported, very few studies have examined the impact of tVNS on the plasticity of the motor neural network. Among them, Cook and colleagues developed a motor-activated auricular vagus nerve stimulation system as a potential neurorehabilitation tool. They proposed that pairing vagus nerve stimulation with motor training accelerates cortical reorganization, and that this synchronous pairing may enhance motor recovery. The experimental results demonstrated that when tVNS was applied in 3-second trains of 25 Hz at a 500 μs pulse width and 0.1 mA over the buccinator muscle, three neonates showed improved oromotor function. Nevertheless, limited evidence has been reported in the literature for the effects of tVNS on motor speech, particularly on relevant disorders such as apraxia of speech and dysarthria.
The purpose of the current study is to investigate whether tVNS may improve the flexibility of the intrinsic laryngeal muscles and their motor activities, particularly during motor speech movement. Mandarin Chinese lexical tones were adopted because their production requires the laryngeal muscles to be flexible enough to adjust pitch at the word level at a fast rate, in an accurate direction, and within a proper range of movement. There are four lexical tones in Mandarin Chinese that help determine the meaning of a word. Based on the tone values from 1 to 5 proposed by Chao, the four lexical tones are high level (e.g., ma55, "mother"), high rise (e.g., ma35, "fiber"), low tone (e.g., ma214, "horse"), and high falling (e.g., ma51, "blame or curse"). The core vowel /a/ in the above examples bears the lexical tones. For convenience, these tones are named tone 1 (T1), tone 2 (T2), tone 3 (T3), and tone 4 (T4) respectively. According to Llanos, et al., tone categories present varying degrees of difficulty. They found that Mandarin Chinese tones are perceptually distinguishable by their variation in pitch height (T1 and T3) and pitch direction (T2 and T4). English speakers are sensitive to relative differences in pitch height. Thus, T1 and T3 are perceptually easier than T2 and T4. It is reasonable to assume that when tVNS is applied to the recurrent laryngeal nerve, a branch of the vagus nerve innervating the intrinsic laryngeal muscles including the thyroarytenoid, lateral cricoarytenoid, posterior cricoarytenoid, and interarytenoid muscles, these muscles will become more flexible and, as a result, their motor activities will be enhanced. To examine this assumption, two specific research questions were asked: 1) Will tVNS improve the imitation of Mandarin Chinese lexical tones among native English speakers? 2) If there is an improvement, is pitch height (T1 and T3) or pitch direction (T2 and T4) better produced, as measured by fundamental frequency in Hz?
The current study adopted a single-subject AB experimental design, with the A phase as the baseline condition without tVNS application and the B phase as the treatment condition with tVNS application. This study was approved by the Institutional Review Board at Fort Hays State University. Each volunteer reviewed, agreed to, and signed an informed consent form at the beginning of the experiment.
Two adult native English speakers volunteered to participate in the study, one male and one female, with an average age of 21. They did not present any known neurological conditions and had never been exposed to any level of Mandarin Chinese. Additionally, the two participants did not have a cardiac pacemaker, implanted defibrillator, or any other metallic or electronic implanted devices. They reported being healthy during the time of the study.
In this study, 77 Chinese characters were first selected from the Peanut passage, a fifth-grade reading passage officially used in elementary schools in China. Phonologically, each Chinese character can be represented by a Pinyin syllable with a specific lexical tone. Pinyin is an alphabetic system for the phonological representation of Chinese characters. Another 51 Pinyin syllables with lexical tones were added as stimuli for the purpose of phoneme and lexical distribution. Consequently, a total of 128 Mandarin Chinese lexical tones were determined. These stimuli were distributed evenly across 16 sets based on pitch height (T1 and T3) and pitch direction (T2 and T4). Each stimulus was presented only once in the experiment.
The iReliev® Model ET-5050 is a therapeutic wearable system approved by the FDA, which has been used for active treatment of the muscular system over the past 30 years. This system is a safe, non-invasive, and drug-free method of muscle conditioning and muscle strengthening. According to Yap, et al. (2020), previous research has not agreed on an optimal stimulation frequency and intensity, with frequencies ranging from 10 Hz to 50 Hz and intensities from 0.8 mA to 12 mA. In the current study, the iReliev® was set at TENS level three (P3), which delivers a continuous pulse rate of 60 Hz and a pulse width of 260 μs, 6 mA (20.8 microcoulombs per pulse) maximum. This setting provided a comfortable pulsing sensation according to the iReliev® manual and was determined based on the participants' self-reported sensation.
Participants joined 16 sessions of Mandarin Chinese lexical tone imitation, each lasting 20 minutes. Each session included pre-imitation, imitation training, and post-imitation of the target Mandarin Chinese lexical tones. Pre-imitation and post-imitation were audio recorded with the KayPentax Computerized Speech Laboratory (CSL Model 4500b). The first 8 sessions were the baseline condition without application of tVNS, with 4 sessions targeting pitch direction (T2 and T4) and 4 sessions targeting pitch height (T1 and T3). The second 8 sessions were the tVNS treatment condition, with 4 sessions targeting pitch direction (T2 and T4) and 4 sessions targeting pitch height (T1 and T3).
Participants were given a brief introduction to Mandarin Chinese lexical tones at the beginning of the baseline sessions, with a demonstration of the Chinese Pinyin "ma" in four lexical tones. After participants had developed basic knowledge of the lexical tones, they began to imitate the target lexical tones. In each session, 8 different stimuli were first presented and imitated by the participants while being recorded as the pre-imitation. During the next 20 minutes, participants imitated each of the 8 stimuli 20 times in 3 sets following modeling. At the end of each session, participants imitated the 8 stimuli again for the post-imitation recording.
In the treatment sessions, tVNS was applied to participants in addition to the same imitation task as in the baseline condition. The target skin locations for stimulation, where the iReliev® system was set up, were the middle of the tracheoesophageal groove on each side of the neck. According to Zemlin, both the left and right recurrent laryngeal nerves (RLNs) are vertical to the larynx. The right RLN loops behind the right common carotid and subclavian arteries at their junction and goes into the larynx, while the left RLN loops under and behind the aortic arch, runs in a groove located between the trachea and esophagus, and goes into the larynx. In the current study, two electrodes on adhesive pads were placed horizontally on each side of the neck, perpendicular to the RLNs, for 20 minutes in each treatment session.
Participants' imitation of lexical tones was recorded via the KayPentax Computerized Speech Laboratory (CSL Model 4500b) and segmented in the free software PRAAT for acoustic analysis. Each lexical tone, as determined by its fundamental frequency (f0), was measured in PRAAT. Both the sound waveform and the spectrum were used to determine the onset and offset of a specific lexical tone. More specifically, the four lexical tones measured as f0 in Hz were as follows:
To measure each lexical tone acoustically, the core vowel was first identified based on the formants in the spectrum window in PRAAT. Then, the first three cycles of this vowel were taken as the onset and the last three cycles as the offset. Tone 1 (T1), Tone 2 (T2), and Tone 4 (T4) were operationally defined as the difference between the mean f0 in Hz of the first three cycles of the onset and the last three cycles of the offset. Tone 3 (T3) was operationally defined as the difference between the mean f0 in Hz of the falling section and the rising section. The falling section was measured as the difference between the mean f0 in Hz of the first three cycles of the onset and the lowest f0 in Hz, either automatically generated by PRAAT or manually calculated from one cycle at the lowest point of the pitch contour. The rising section was measured as the difference between the lowest f0, determined as described above, and the mean f0 of the last three cycles of the offset.
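The operational definitions above can be illustrated with a minimal Python sketch. It assumes that cycle-by-cycle f0 values (in Hz) for the core vowel have already been extracted, for example from PRAAT. The function names are hypothetical, and the sign conventions (onset minus offset; each T3 section taken as a positive distance from the lowest f0) are assumptions, since the definitions only state "difference". The f0 values are invented for illustration and are not study data.

```python
from statistics import mean

def onset_offset_means(cycle_f0, n_cycles=3):
    """Return mean f0 (Hz) of the first and last n_cycles cycles of the core vowel."""
    if len(cycle_f0) < 2 * n_cycles:
        raise ValueError("need at least 2 * n_cycles cycle-level f0 values")
    return mean(cycle_f0[:n_cycles]), mean(cycle_f0[-n_cycles:])

def onset_offset_difference(cycle_f0):
    """T1, T2, T4 measure: onset mean minus offset mean (sign is an assumption)."""
    onset, offset = onset_offset_means(cycle_f0)
    return onset - offset

def t3_measure(cycle_f0):
    """T3 measure: falling section minus rising section.

    Assumed conventions: falling = onset mean - lowest f0,
    rising = offset mean - lowest f0 (both non-negative for a dipping contour).
    """
    onset, offset = onset_offset_means(cycle_f0)
    lowest = min(cycle_f0)
    falling = onset - lowest
    rising = offset - lowest
    return falling - rising

# Hypothetical cycle-level f0 values (Hz), not taken from the study.
t4_like = [220.0, 218.0, 216.0, 200.0, 180.0, 172.0, 170.0, 168.0]
t3_like = [200.0, 198.0, 196.0, 170.0, 150.0, 160.0, 180.0, 190.0, 200.0]

print(onset_offset_difference(t4_like))  # 48.0 (falling contour, positive value)
print(t3_measure(t3_like))               # 8.0
```

Under these assumed conventions, a level tone gives a value near zero, a falling tone a positive value, and a rising tone a negative value.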
Measurement reliability was examined by comparing the f0 measurements of three independent analysts. Intraclass correlation coefficient (ICC) estimates and their 95% confidence intervals (CI) were calculated using IBM SPSS Statistics (Version 28) based on a mean-rating (k = 3), absolute-agreement, two-way mixed-effects model. Moderate reliability was found, with an average-measures ICC of 0.677 and a 95% CI of [0.625, 0.723].
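For readers who want to see what an average-measures, absolute-agreement ICC computes, the following Python sketch uses the commonly cited mean-square form, (MSR − MSE) / (MSR + (MSC − MSE) / n), where MSR, MSC, and MSE are the mean squares for measured items, analysts, and residual error. It is an assumption that this matches the SPSS option named above; the SPSS output remains the authoritative value. The function name and the ratings are hypothetical, and the sketch does not compute confidence intervals.

```python
def icc_absolute_agreement_average(ratings):
    """Average-measures, absolute-agreement ICC from a two-way layout.

    ratings: list of rows, one row per measured item (e.g., a tone token),
    each row holding one f0 measurement per analyst.
    """
    n = len(ratings)        # number of measured items
    k = len(ratings[0])     # number of analysts
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)

# Hypothetical f0 differences (Hz) from three analysts, not study data.
identical = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0], [30.0, 30.0, 30.0]]
offset = [[10.0, 12.0, 14.0], [20.0, 22.0, 24.0], [30.0, 32.0, 34.0]]

print(icc_absolute_agreement_average(identical))  # perfect agreement
print(icc_absolute_agreement_average(offset))     # systematic offsets lower the ICC
```

Because the absolute-agreement form penalizes systematic differences between analysts, the second data set scores slightly below the first even though the analysts rank the items identically.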
Visual inspection aids for level, variability, and trend were generated with R (Version 4.2.2; R Core Team, 2022) following the guidelines for single-subject experimental design data analysis by Manolov, et al. The dual-criteria method and the split-middle method of trend estimation by Wolery and Harris were used as references. In addition, the percentage of data points exceeding the median (PEM) was generated in R, with values of 50% and above indicating an effect of the tVNS application.
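As a rough illustration of the PEM index, the Python sketch below counts the share of treatment-phase data points that fall beyond the baseline median in the expected direction of change. The study itself used R; this is not the authors' script. The function name, the `expect_increase` parameter, and the data values are hypothetical.

```python
from statistics import median

def pem(baseline, treatment, expect_increase=True):
    """Percentage of treatment-phase points exceeding the baseline median.

    expect_increase=True counts points above the median; False counts
    points below it, for measures where improvement means a decrease.
    """
    base_median = median(baseline)
    if expect_increase:
        exceeding = sum(1 for x in treatment if x > base_median)
    else:
        exceeding = sum(1 for x in treatment if x < base_median)
    return 100.0 * exceeding / len(treatment)

# Hypothetical onset-offset differences (Hz) for four sessions per phase.
baseline_sessions = [30.0, 32.0, 31.0, 29.0]
treatment_sessions = [44.0, 40.0, 30.0, 45.0]

print(pem(baseline_sessions, treatment_sessions))  # 75.0
```

With four sessions per phase, the index can only take the values 0%, 25%, 50%, 75%, or 100%, so it is best read alongside the visual inspection of level, variability, and trend.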
Table 1 shows the mean f0 for each of the four lexical tones for the two participants in both the baseline and treatment conditions. After tVNS, the mean f0 measure of T1 decreased in both participants, while that of T2 increased in both participants. T3 dropped markedly in both participants, as a result of a decreasing falling section and an increasing rising section.
Table 1: Statistics of Lexical Tones Measured with f0 in Hz by Both Participants: Mean (SD).
Both participants showed a notable increase in T4 after receiving tVNS compared to the baseline (Table 1).
The graphed data of lexical tones imitation were visually inspected for the level, variability, trend, and data points exceeding the median (PEM) determined by the fundamental frequency in Hz within and across the conditions. For each lexical tone by Participant 1 as shown in Figure 1, there were level changes and trend changes between the baseline (A phase) and treatment (B phase). More variability was observed during the B phase, particularly for T1, T2, and T4. Percentage of data points exceeding the median (PEM) generated by R was 100% for T1, 100% for T2, 100% for T3, and 75% for T4.
Figure 1: The Level, Variability, and Trend of Lexical Tones by Participant 1 before and after tVNS Application.
Note: A phase represents the baseline while B phase represents the treatment condition. The left diagrams demonstrate the level change between A and B phase for each lexical tone. The middle left diagrams demonstrate the variability between A and B phase for each lexical tone. The middle right diagrams demonstrate the trend change between A and B phase for each lexical tone. The right diagrams illustrate the data points exceeding the split-middle trend.
Figure 2 shows the level, variability, trend, and data points exceeding the median (PEM) for each lexical tone by Participant 2 before and after tVNS application. Level and trend changes were observed between the baseline (A phase) and treatment (B phase). Variability was reduced from A phase to B phase, particularly for T1, T2, and T4. In Participant 2, the percentage of data points exceeding the median (PEM) generated by R was 100% for T1, 75% for T2, 100% for T3, and % for T4.
Figure 2: The Level, Variability, and Trend of Lexical Tones by Participant 2 Before and After tVNS Application.
Note: A phase represents the baseline while B phase represents the treatment condition. The left diagrams demonstrate the level change between A and B phase for each lexical tone. The middle left diagrams demonstrate the variability between A and B phase for each lexical tone. The middle right diagrams demonstrate the trend change between A and B phase for each lexical tone. The right diagrams illustrate the data points exceeding the split-middle trend.
The aim of the current study was to investigate the effect of transcutaneous vagus nerve stimulation on the plasticity of the motor neural network, as manifested in intrinsic laryngeal muscle flexibility and enhanced motor activities. Mandarin Chinese lexical tones were adopted because their production requires fine motor movements of the laryngeal muscles to adjust pitch at the lexical level at a fast rate, in an accurate direction, and within a proper movement range. Comparing the imitation of Mandarin Chinese lexical tones by two native English speakers with and without tVNS application, the results demonstrated more accurate imitation after tVNS application in T1, T3, and T4, but not in T2. When examining pitch height (T1 and T3) and pitch direction (T2 and T4), pitch height was better imitated than pitch direction, similar to the pattern reported for perception.
Mandarin Chinese lexical T1 is a high level pitch, operationally measured as the difference between the mean f0 of the core vowel onset and offset. After receiving tVNS, a more level pitch imitation of T1 was observed in both participants, manifested in a reduced difference between the onset and offset mean f0, from 7.75 Hz to 3.14 Hz in Participant 1 and from 16.49 Hz to 14.89 Hz in Participant 2, as shown in Table 1. This indicated a better T1 imitation with a high and level pitch. Improvement after tVNS was also observed in T3. T3 is a pitch contour in Chinese lexical tone that first falls briefly then rises to a higher pitch. Operationally, T3 was measured as the mean f0 difference between the falling section and the rising section. The falling section is longer and lower in pitch contour compared to the rising section. Therefore, a decreasing mean f0 from the baseline to tVNS treatment represented a more accurate T3 imitation, which was demonstrated in both participants. In addition, T4 imitation improved during the treatment phase in both participants. T4 is a high falling pitch, measured operationally as the difference between the mean f0 of the onset and offset. Both participants increased this difference after tVNS application, from 31.15 Hz to 44.03 Hz in Participant 1 and from 48.02 Hz to 61.14 Hz in Participant 2. However, T2 was not observed to improve in either participant. T2 in Mandarin Chinese is a high rising pitch, operationally measured as the difference between the mean f0 of the onset and offset. The increased T2 mean f0 in both participants indicated the opposite pattern, moving away from the decrease expected for a more accurate T2.
Fundamental frequency measured in lexical tones reflects the biomechanical characteristics of human vocal folds and is determined by the laryngeal structure and related muscle forces. Physiologically, the activities of the cricothyroid and thyroarytenoid muscles are the primary mechanism in changing f0, with the superior laryngeal branch of the vagus nerve innervating the cricothyroid muscles and the recurrent laryngeal branch innervating the thyroarytenoid muscles.
According to Zemlin , the modification of thyroarytenoid muscles changes stiffness and the effective vibrating mass of vocal folds. More importantly, when the cricothyroid muscles contract, the thyroid cartilage tilts downward and forward, and the cricoid cartilage may tilt upward as well. As vocal folds are attached to the interior middle portion of the thyroid cartilage and the arytenoid cartilages, the contraction of cricothyroid muscles results in a lengthening of the vocal folds, therefore increasing the f0. In addition, the intrinsic muscles of the larynx, including posterior cricoarytenoid, lateral cricoarytenoid, and the arytenoid muscles, innervated by the recurrent laryngeal branch of the vagus nerve, are important in assisting in the shape of the glottis and the vocal folds to vary the degree of the pitch produced.
Transcutaneous vagus nerve stimulation applied in the current study seemed to affect the motor control of the laryngeal muscles, particularly the fine movement of the intrinsic muscles, in two native English speakers, so that they could better imitate Chinese lexical tones. After tVNS, the two participants were able to maintain a level pitch in T1, as a result of sustaining the shape of their vocal folds for the length of the target core vowel. While imitating T3 after tVNS, participants could quickly and flexibly adjust the falling then rising pitch on the core vowel to achieve the T3 pitch contour. This fine adjustment of pitch could be achieved by decreasing then increasing the medial compression through the actions of the lateral cricoarytenoid muscles. For T4, it is important to initiate phonation by adducting the vocal folds, then to relax the vocal folds immediately to decrease the medial compression so that the fundamental frequency drops. The application of tVNS appeared to help the participants improve control of the intrinsic laryngeal muscles to produce a more accurate T4.
For T2 production, a substantial pitch increase is needed through the primary mechanism of the cricothyroid and thyroarytenoid muscles [31,40]. Yet the cricothyroid muscles are innervated by the superior laryngeal branch of the vagus nerve, which was not targeted or stimulated via tVNS in the current study. Although activation of the lateral cricoarytenoid muscles, innervated by the recurrent laryngeal nerve, can adjust the medial compression to increase the fundamental frequency, it is considered only the "secondary or ancillary mechanism of fundamental frequency control".
The validity of the current study results was limited by the single-subject experimental design. With only two participants and a limited number of sessions for each lexical tone, the data analysis cannot provide strong evidence. Future studies could adopt more rigorous experimental designs that produce stronger, more generalizable evidence to further examine the effect of transcutaneous vagus nerve stimulation on the motor neural network. Another limitation of this study was the choice of electronic stimulator. The iReliev® Model ET-5050 offers limited adjustment of its frequency range and intensity levels, which prevented an ideal configuration in the experiment. More precise and adjustable electronic stimulators could be explored and trialed in future studies.
This study has provided preliminary findings on the effect of tVNS on motor speech production, adding evidence for the clinical application of tVNS. Transcutaneous vagus nerve stimulation is a promising non-invasive technique for modifying brain neural networks. The potential of tVNS to facilitate the treatment of motor speech disorders, such as hypokinetic dysarthria associated with Parkinson's disease, can be explored in the future.
The current study findings support transcutaneous vagus nerve stimulation as a technique for modifying the brain motor neural network. After tVNS application targeting the recurrent laryngeal nerve, the intrinsic laryngeal muscles innervated by the corresponding motor neural network became more flexible in regulating pitch production. This could be a stepping stone for future clinical trials of tVNS application.
All authors declare that there are no conflicts of interest.
The data that support the findings of this study are available in the supplementary material of this article.
All participants signed informed consent approved by the IRB at Fort Hays State University.