Introduction: Speech brain-computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by neurally decoding attempted speech into text. However, prior demonstrations have not achieved accuracies high enough to support communication of arbitrary word sequences from a large vocabulary. Here, we demonstrate the first microelectrode-based BCI that decodes attempted speaking movements into text using single-neuron activity.
Methods: Four 8×8 silicon microelectrode arrays were implanted in the left ventral precentral gyrus and inferior frontal gyrus of one participant with anarthria due to bulbar ALS, as part of the BrainGate2 pilot clinical trial. Neural activity (multiunit threshold crossings and spike band power) was temporally binned and smoothed on each electrode. A recurrent neural network (RNN) then converted this time series of neural activity into phoneme probabilities at each time step. The RNN was a 5-layer GRU with 512 units per layer, trained in TensorFlow. Finally, the phoneme probabilities were combined with a large-vocabulary language model (a custom, 130,000-word trigram model implemented in Kaldi) to decode the most likely sentence.
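The decoding architecture described above can be sketched in TensorFlow. This is a minimal illustration, not the study's actual implementation: the feature count (256 electrodes × 2 features per bin), the phoneme inventory size, and the extra silence/blank output classes are assumptions, and the real system also involved training details (e.g. the loss function and day-specific input layers) not shown here.

```python
import numpy as np
import tensorflow as tf

N_FEATURES = 512   # assumption: 4 arrays x 64 electrodes x 2 features (threshold crossings, spike band power)
N_PHONEMES = 39    # assumption: a CMU-style English phoneme inventory
N_CLASSES = N_PHONEMES + 2  # assumption: plus silence and blank output tokens

def build_decoder() -> tf.keras.Model:
    """5-layer GRU mapping binned neural features to per-bin phoneme probabilities."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(None, N_FEATURES)))  # variable-length trials
    for _ in range(5):
        model.add(tf.keras.layers.GRU(512, return_sequences=True))
    model.add(tf.keras.layers.Dense(N_CLASSES, activation="softmax"))
    return model

model = build_decoder()
# One simulated 2-second trial at 20 ms bins -> 100 time steps of smoothed features.
x = np.random.randn(1, 100, N_FEATURES).astype(np.float32)
probs = model(x).numpy()
print(probs.shape)  # (1, 100, 41): one probability distribution over phonemes per bin
```

In the full pipeline, these per-bin phoneme probabilities would then be rescored by the trigram language model to produce the most likely word sequence.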
Results: Our study participant achieved a 23.8% word error rate on a 130,000-word vocabulary, and a 9.1% word error rate on a 50-word vocabulary (2.7 times fewer errors than the prior state-of-the-art speech BCI). Additionally, our BCI decoded speech at 62 words per minute, which is 3.4 times faster than the prior record for any kind of BCI and begins to approach the speed of natural conversation (160 words per minute). We observed spatially intermixed tuning (eliminating the need to record from wide regions of cortex) and a detailed articulatory representation of phonemes, preserved after years of anarthria.
Conclusion: These results show a feasible path forward for using microelectrode-based speech BCIs to restore rapid communication to people who can no longer speak.