diff --git a/README.md b/README.md
index af7e788..fad938f 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-
-2024-12-19 |
-Event-based backpropagation on the neuromorphic platform SpiNNaker2 |
-Béna Gabriel, Wunderlich Timo, Akl Mahmoud, Vogginger Bernhard, Mayr Christian, Andres Gonzales Hector |
-Link |
-Neuromorphic computing aims to replicate the brain's capabilities for energy efficient and parallel information processing, promising a solution to the increasing demand for faster and more efficient computational systems. Efficient training of neural networks on neuromorphic hardware requires the development of training algorithms that retain the sparsity of spike-based communication during training. Here, we report on the first implementation of event-based backpropagation on the SpiNNaker2 neuromorphic hardware platform. We use EventProp, an algorithm for event-based backpropagation in spiking neural networks (SNNs), to compute exact gradients using sparse communication of error signals between neurons. Our implementation computes multi-layer networks of leaky integrate-and-fire neurons using discretized versions of the differential equations and their adjoints, and uses event packets to transmit spikes and error signals between network layers. We demonstrate a proof-of-concept of batch-parallelized, on-chip training of SNNs using the Yin Yang dataset, and provide an off-chip implementation for efficient prototyping, hyper-parameter search, and hybrid training methods. |
+2024-12-20 |
+How connection probability shapes fluctuations of neural population dynamics |
+Nils E. Greven, Jonas Ranft, Tilo Schwalger |
+Link |
+Mean-field models of neuronal populations in the brain have proven extremely useful to understand network dynamics and response to stimuli, but these models generally lack a faithful description of the fluctuations in the biologically relevant case of finite network size and connection probabilities $p<1$ (non-full connectivity). To gain insight into the different fluctuation mechanisms underlying the neural variability of populations of spiking neurons, we derive and analyze a stochastic mean-field model for finite-size networks of Poisson neurons with random, non-full connectivity, external noise and disordered mean inputs. We treat the quenched disorder of the connectivity by an annealed approximation that enables a reduction to a low-dimensional closed system of coupled Langevin equations for the mean and variance of the neuronal membrane potentials as well as a variable capturing finite-size fluctuations arising specifically in the case $p<1$. Comparing to microscopic simulations, we find that the mesoscopic model describes the fluctuations and nonlinearities well and outperforms previous mesoscopic models that neglected the recurrent noise effect caused by the non-full connectivity. This effect can be analytically understood by a softening of the effective nonlinearity and the multiplicative character of finite-size spiking noise. The mesoscopic theory shows that quenched disorder can stabilize the asynchronous state, and it correctly predicts large quantitative and non-trivial qualitative effects of connection probability on the variance of the population firing rate and its dependence on stimulus strength. Our theory thus elucidates how disordered connectivity shapes nonlinear dynamics and fluctuations of neural populations at the mesoscopic scale and showcases a useful mean-field method to treat non-full connectivity in finite-size, spiking neural networks. |
-2024-12-19 |
-Accessing the topological properties of human brain functional sub-circuits in Echo State Networks |
-Bach Nguyen, Tianlong Chen, Shu Yang, Bojian Hou, Li Shen, Duy Duong-Tran |
-Link |
-Recent years have witnessed an emerging trend in neuromorphic computing that centers around the use of brain connectomics as a blueprint for artificial neural networks. Connectomics-based neuromorphic computing has primarily focused on embedding human brain large-scale structural connectomes (SCs), as estimated from diffusion Magnetic Resonance Imaging (dMRI) modality, to echo-state networks (ESNs). A critical step in ESN embedding requires pre-determined read-in and read-out layers constructed by the induced subgraphs of the embedded reservoir. As \textit{a priori} set of functional sub-circuits are derived from functional MRI (fMRI) modality, it is unknown, till this point, whether the embedding of fMRI-induced sub-circuits/networks onto SCs is well justified from the neuro-physiological perspective and ESN performance across a variety of tasks. This paper proposes a pipeline to implement and evaluate ESNs with various embedded topologies and processing/memorization tasks. To this end, we showed that different performance optimums highly depend on the neuro-physiological characteristics of these pre-determined fMRI-induced sub-circuits. In general, fMRI-induced sub-circuit-embedded ESN outperforms simple bipartite and various null models with feed-forward properties commonly seen in MLP for different tasks and reservoir criticality conditions. We provided a thorough analysis of the topological properties of pre-determined fMRI-induced sub-circuits and highlighted their graph-theoretical properties that play significant roles in determining ESN performance. |
+2024-12-20 |
+Probing lactate exchange and compartmentation in Gray Matter via time-dependent diffusion-weighted MRS |
+Eloise Mougel, Marco Palombo, Julien Valette |
+Link |
+Lactate is crucial in the brain, as it is involved in neuronal activity and memory formation. This is thought to be related to the astrocyte-neuron lactate shuttle hypothesis (ANLS), which has been debated for 30 years, not least because it is difficult to measure lactate compartmentation and exchange in vivo. In particular, while ANLS requires transfer of lactate via the extracellular space, intracellular/extracellular exchange rate remains unknown. In this work, we propose to use time-dependent diffusion MRS to assess lactate exchange and lactate compartmentation in mouse brain gray matter in vivo. First, by comparing lactate time-dependent diffusivity and kurtosis with those of water and purely intracellular metabolites (which exhibit "fast" and "no" exchange signatures, respectively), we estimate that lactate exchange is slow (i.e. on the order of hundreds of milliseconds). Then, using biophysical models neglecting exchange, we estimate extracellular, neuronal and astrocytic lactate fractions compatible with the ANLS hypothesis. |
-2024-12-19 |
-AI-Powered Intracranial Hemorrhage Detection: A Co-Scale Convolutional Attention Model with Uncertainty-Based Fuzzy Integral Operator and Feature Screening |
-Mehdi Hosseini Chagahi, Md. Jalil Piran, Niloufar Delfan, Behzad Moshiri, Jaber Hatam Parikhan |
-Link |
-Intracranial hemorrhage (ICH) refers to the leakage or accumulation of blood within the skull, which occurs due to the rupture of blood vessels in or around the brain. If this condition is not diagnosed in a timely manner and appropriately treated, it can lead to serious complications such as decreased consciousness, permanent neurological disabilities, or even death.The primary aim of this study is to detect the occurrence or non-occurrence of ICH, followed by determining the type of subdural hemorrhage (SDH). These tasks are framed as two separate binary classification problems. By adding two layers to the co-scale convolutional attention (CCA) classifier architecture, we introduce a novel approach for ICH detection. In the first layer, after extracting features from different slices of computed tomography (CT) scan images, we combine these features and select the 50 components that capture the highest variance in the data, considering them as informative features. We then assess the discriminative power of these features using the bootstrap forest algorithm, discarding those that lack sufficient discriminative ability between different classes. This algorithm explicitly determines the contribution of each feature to the final prediction, assisting us in developing an explainable AI model. The features feed into a boosting neural network as a latent feature space. In the second layer, we introduce a novel uncertainty-based fuzzy integral operator to fuse information from different CT scan slices. This operator, by accounting for the dependencies between consecutive slices, significantly improves detection accuracy. |
+2024-12-20 |
+A Digital Phantom for 3D MR Spectroscopy Data Simulation |
+D. M. J. van de Sande, A. T. Gudmundson, S. Murali-Manohar, C. W. Davies-Jenkins, D. Simicic, G. Simegn, İ. Özdemir, S. Amirrajab, J. P. Merkofer, H. J. Zöllner, G. Oeltzschner, R. A. E. Edden |
+Link |
+Simulated data is increasingly valued by researchers for validating MRS and MRSI processing and analysis algorithms. However, there is no consensus on the optimal approaches for simulation models and parameters. This study introduces a novel 3D MRS digital brain phantom framework, providing a comprehensive and modular foundation for MRS and MRSI data simulation. We generate a digital brain phantom by combining anatomical and tissue label information with metabolite data from the literature. This phantom contains all necessary information for simulating spectral data. We integrate the phantom with a signal-based model to demonstrate its functionality and usability in generating various spectral datasets. Outputs are saved in the NIfTI-MRS format, enabling their use in downstream applications. We successfully implemented and tested the 3D MRS digital brain phantom framework using two different anatomical models at two resolutions. The resulting metabolite maps and spectral datasets demonstrate realistic data quality, flexibility based on user inputs, and reasonable computational efficiency. This innovative 3D digital brain phantom framework provides a clear and structured approach to simulating MRS and MRSI data. Its modular design establishes a strong, adaptable foundation for future advancements in MRS and MRSI simulation, allowing researchers to extend and refine the model to meet the field's evolving needs. |
-2024-12-19 |
-Dual Photonics Probing of Nano- to Submicron-Scale Structural Alterations in Human Brain Tissues or Cells and Chromatin or DNA with the Progression of Alzheimers Disease |
-Fatemah Alharthi, Ishmael Apachigawo, Dhruvil Solanki, Sazzad Khan, Himanshi Singh, Mohammad Moshahid Khan, Prabhakar Pradhan |
-Link |
-Understanding alterations in structural disorders in tissue or cells or building blocks, such as DNA or chromatin in the human brain, at the nano to submicron level provides us with efficient biomarkers for Alzheimers detection. Here, we report a dual photonics technique to detect nano- to submicron-scale alterations in brain tissues or cells and DNA or chromatin due to the early to late progression of Alzheimers disease in humans. Using a recently developed mesoscopic light transport technique, fine-focused nano-sensitive partial wave spectroscopy (PWS), we measure the degree of structural disorder in tissues. Furthermore, the chemical-specific inverse participation ratio technique (IPR) was used to measure the DNA or chromatin structural alterations. The results of the PWS and IPR experiments showed a significant increase in the degree of structural disorder at the nano to submicron scale at different stages of AD relative to their controls for both the tissue or cell and DNA cellular levels. The increase in the structural disorder in cells or tissues and DNA or chromatin in the nuclei can be attributed to higher mass density fluctuations in the tissue and DNA or chromatin damage in the nuclei caused by the rearrangements of macromolecules due to the deposition of the amyloid beta protein and damage in DNA or chromatin with the progress of AD. |
+2024-12-20 |
+MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems |
+Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba |
+Link |
+Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods. |
-2024-12-19 |
-TuneS: Patient-specific model-based optimization of contact configuration in deep brain stimulation |
-Anna Franziska Frigge, Lina Uggla, Elena Jiltsova, Markus Fahlström, Dag Nyholm, Alexander Medvedev |
-Link |
-Objective: The objective of this study is to develop and evaluate a systematic approach to optimize Deep Brain Stimulation (DBS) parameters, addressing the challenge of identifying patient-specific settings and optimal stimulation targets for various neurological and mental disorders. Methods: TuneS, a novel pipeline to predict clinically optimal DBS contact configurations based on predefined targets and constraints, is introduced. The method relies upon patient-specific models of stimulation spread and extends optimization beyond traditional neural structures to include automated, model-based targeting of streamlines. Results: Initial findings demonstrate that STN motor streamlines consistently receive a significant portion of the allocated stimulation volume, suggesting that a consistent portion of the stimulation should ideally focus on the STN motor streamlines. At the example of a small cohort of Parkinson's disease patients, the value of model-based contact predictions for assessing stimulation targets while observing constraints is demonstrated. Conclusion: TuneS shows promise as a research tool, enabling systematic assessment of DBS target effectiveness and facilitating constraint-aware optimization of stimulation parameters. Significance: The presented pipeline offers a pathway to improve patient-specific DBS therapies and contributes to the broader understanding of effective DBS targeting strategies. |
+2024-12-20 |
+Precision ICU Resource Planning: A Multimodal Model for Brain Surgery Outcomes |
+Maximilian Fischer, Florian M. Hauptmann, Robin Peretzke, Paul Naser, Peter Neher, Jan-Oliver Neumann, Klaus Maier-Hein |
+Link |
+Although advances in brain surgery techniques have led to fewer postoperative complications requiring Intensive Care Unit (ICU) monitoring, the routine transfer of patients to the ICU remains the clinical standard, despite its high cost. Predictive Gradient Boosted Trees based on clinical data have attempted to optimize ICU admission by identifying key risk factors pre-operatively; however, these approaches overlook valuable imaging data that could enhance prediction accuracy. In this work, we show that multimodal approaches that combine clinical data with imaging data outperform the current clinical data only baseline from 0.29 [F1] to 0.30 [F1], when only pre-operative clinical data is used and from 0.37 [F1] to 0.41 [F1], for pre- and post-operative data. This study demonstrates that effective ICU admission prediction benefits from multimodal data fusion, especially in contexts of severe class imbalance. |
-2024-12-19 |
-CAE-T: A Channelwise AutoEncoder with Transformer for EEG Abnormality Detection |
-Youshen Zhao, Keiji Iramina |
-Link |
-Electroencephalogram (EEG) signals are critical for detecting abnormal brain activity, but their high dimensionality and complexity pose significant challenges for effective analysis. In this paper, we propose CAE-T, a novel framework that combines a channelwise CNN-based autoencoder with a single-head transformer classifier for efficient EEG abnormality detection. The channelwise autoencoder compresses raw EEG signals while preserving channel independence, reducing computational costs and retaining biologically meaningful features. The compressed representations are then fed into the transformer-based classifier, which efficiently models long-term dependencies to distinguish between normal and abnormal signals. Evaluated on the TUH Abnormal EEG Corpus, the proposed model achieves 85.0% accuracy, 76.2% sensitivity, and 91.2% specificity at the per-case level, outperforming baseline models such as EEGNet, Deep4Conv, and FusionCNN. Furthermore, CAE-T requires only 202M FLOPs and 2.9M parameters, making it significantly more efficient than transformer-based alternatives. The framework retains interpretability through its channelwise design, demonstrating great potential for future applications in neuroscience research and clinical practice. The source code is available at https://github.com/YossiZhao/CAE-T. |
+2024-12-20 |
+The Role of Recurrency in Image Segmentation for Noisy and Limited Sample Settings |
+David Calhas, João Marques, Arlindo L. Oliveira |
+Link |
+The biological brain has inspired multiple advances in machine learning. However, most state-of-the-art models in computer vision do not operate like the human brain, simply because they are not capable of changing or improving their decisions/outputs based on a deeper analysis. The brain is recurrent, while these models are not. It is therefore relevant to explore what would be the impact of adding recurrent mechanisms to existing state-of-the-art architectures and to answer the question of whether recurrency can improve existing architectures. To this end, we build on a feed-forward segmentation model and explore multiple types of recurrency for image segmentation. We explore self-organizing, relational, and memory retrieval types of recurrency that minimize a specific energy function. In our experiments, we tested these models on artificial and medical imaging data, while analyzing the impact of high levels of noise and few-shot learning settings. Our results do not validate our initial hypothesis that recurrent models should perform better in these settings, suggesting that these recurrent architectures, by themselves, are not sufficient to surpass state-of-the-art feed-forward versions and that additional work needs to be done on the topic. |
-2024-12-19 |
-Diverging network architecture of the $\textit{C. elegans}$ connectome and signaling network |
-Sophie Dvali, Caio Seguin, Richard Betzel, Andrew M. Leifer |
-Link |
-The connectome describes the complete set of synaptic contacts through which neurons communicate. While the architecture of the $\textit{C. elegans}$ connectome has been extensively characterized, much less is known about the organization of causal signaling networks arising from functional interactions between neurons. Understanding how effective communication pathways relate to or diverge from the underlying structure is a central question in neuroscience. Here, we analyze the modular architecture of the $\textit{C. elegans}$ signal propagation network, measured via calcium imaging and optogenetics, and compare it to the underlying anatomical wiring measured by electron microscopy. Compared to the connectome, we find that signaling modules are not aligned with the modular boundaries of the anatomical network, highlighting an instance where function deviates from structure. An exception to this is the pharynx which is delineated into a separate community in both anatomy and signaling. We analyze the cellular compositions of the signaling architecture and find that its modules are enriched for specific cell types and functions, suggesting that the network modules are neurobiologically relevant. Lastly, we identify a "rich club" of hub neurons in the signaling network. The membership of the signaling rich club differs from the rich club detected in the anatomical network, challenging the view that structural hubs occupy positions of influence in functional (signaling) networks. Our results provide new insight into the interplay between brain structure, in the form of a complete synaptic-level connectome, and brain function, in the form of a system-wide causal signal propagation atlas. |
+2024-12-20 |
+Does the brain behave like a (complex) network? I. Dynamics |
+D. Papo, J. M. Buldú |
+Link |
+Graph theory is now becoming a standard tool in system-level neuroscience. However, endowing observed brain anatomy and dynamics with a complex network structure does not entail that the brain actually works as a network. Asking whether the brain behaves as a network means asking whether network properties count. From the viewpoint of neurophysiology and, possibly, of brain physics, the most substantial issues a network structure may be instrumental in addressing relate to the influence of network properties on brain dynamics and to whether these properties ultimately explain some aspects of brain function. Here, we address the dynamical implications of complex network, examining which aspects and scales of brain activity may be understood to genuinely behave as a network. To do so, we first define the meaning of networkness, and analyse some of its implications. We then examine ways in which brain anatomy and dynamics can be endowed with a network structure and discuss possible ways in which network structure may be shown to represent a genuine organisational principle of brain activity, rather than just a convenient description of its anatomy and dynamics. |
-2024-12-18 |
-Urban Scaling is hardwired in the human brain |
-Airton Deppman |
-Link |
-The emerging field of the Science of Cities has unveiled previously undiscovered facets of urban life. Contrary to the expectation of chaotic behaviour influenced solely by cultural and geographic factors, cities globally exhibit universal power-law trends in urban scaling. Leveraging recent advances in mathematical descriptions of urban dynamics, this study investigates the interplay among fundamental allometry, fractal dimension, and the number of close contacts within cities. Through a dynamic exploration of these factors, a causal relationship is established, shedding light on the intricate dynamics that shape urban environments. Remarkably, our analysis suggests that the Big Five personality traits play a pivotal role in determining the fractal aspects of urban life. This research contributes to the understanding of urban scaling and proposes a novel connection between human personality traits and the structural patterns observed in cities, opening avenues for further interdisciplinary exploration. |
+2024-12-20 |
+Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model |
+Xin Du, Shifan Ye, Qian Zheng, Yangfan Hu, Rui Yan, Shunyu Qi, Shuyang Chen, Huajin Tang, Gang Pan, Shuiguang Deng |
+Link |
+Large language models (LLMs) have been widely applied in various practical applications, typically comprising billions of parameters, with inference processes requiring substantial energy and computational resources. In contrast, the human brain, employing bio-plausible spiking mechanisms, can accomplish the same tasks while significantly reducing energy consumption, even with a similar number of parameters. Based on this, several pioneering researchers have proposed and implemented various large language models that leverage spiking neural networks. They have demonstrated the feasibility of these models, validated their performance, and open-sourced their frameworks and partial source code. To accelerate the adoption of brain-inspired large language models and facilitate secondary development for researchers, we are releasing a software toolkit named DarwinKit (Darkit). The toolkit is designed specifically for learners, researchers, and developers working on spiking large models, offering a suite of highly user-friendly features that greatly simplify the learning, deployment, and development processes. |
-2024-12-18 |
-Starting a Synthetic Biological Intelligence Lab from Scratch |
-Md Sayed Tanveer, Dhruvik Patel, Hunter E. Schweiger, Kwaku Dad Abu-Bonsrah, Brad Watmuff, Azin Azadi, Sergey Pryshchep, Karthikeyan Narayanan, Christopher Puleo, Kannathal Natarajan, Mohammed A. Mostajo-Radji, Brett J. Kagan, Ge Wang |
-Link |
-With the recent advancements in artificial intelligence, researchers and industries are deploying gigantic models trained on billions of samples. While training these models consumes a huge amount of energy, human brains produce similar outputs (along with other capabilities) with massively lower data and energy requirements. For this reason, more researchers are increasingly considering alternatives. One of these alternatives is known as synthetic biological intelligence, which involves training \textit{in vitro} neurons for goal-directed tasks. This multidisciplinary field requires knowledge of tissue engineering, bio-materials, digital signal processing, computer programming, neuroscience, and even artificial intelligence. The multidisciplinary requirements make starting synthetic biological intelligence research highly non-trivial and time-consuming. Generally, most labs either specialize in the biological aspects or the computational ones. Here, we propose how a lab focusing on computational aspects, including machine learning and device interfacing, can start working on synthetic biological intelligence, including organoid intelligence. We will also discuss computational aspects, which can be helpful for labs that focus on biological research. To facilitate synthetic biological intelligence research, we will describe such a general process step by step, including risks and precautions that could lead to substantial delay or additional cost. |
+2024-12-20 |
+Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings |
+Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh |
+Link |
+Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations. |
-2024-12-18 |
-Parameter-efficient Fine-tuning for improved Convolutional Baseline for Brain Tumor Segmentation in Sub-Saharan Africa Adult Glioma Dataset |
-Bijay Adhikari, Pratibha Kulung, Jakesh Bohaju, Laxmi Kanta Poudel, Confidence Raymond, Dong Zhang, Udunna C Anazodo, Bishesh Khanal, Mahesh Shakya |
-Link |
-Automating brain tumor segmentation using deep learning methods is an ongoing challenge in medical imaging. Multiple lingering issues exist including domain-shift and applications in low-resource settings which brings a unique set of challenges including scarcity of data. As a step towards solving these specific problems, we propose Convolutional adapter-inspired Parameter-efficient Fine-tuning (PEFT) of MedNeXt architecture. To validate our idea, we show our method performs comparable to full fine-tuning with the added benefit of reduced training compute using BraTS-2021 as pre-training dataset and BraTS-Africa as the fine-tuning dataset. BraTS-Africa consists of a small dataset (60 train / 35 validation) from the Sub-Saharan African population with marked shift in the MRI quality compared to BraTS-2021 (1251 train samples). We first show that models trained on BraTS-2021 dataset do not generalize well to BraTS-Africa as shown by 20% reduction in mean dice on BraTS-Africa validation samples. Then, we show that PEFT can leverage both the BraTS-2021 and BraTS-Africa dataset to obtain mean dice of 0.8 compared to 0.72 when trained only on BraTS-Africa. Finally, We show that PEFT (0.80 mean dice) results in comparable performance to full fine-tuning (0.77 mean dice) which may show PEFT to be better on average but the boxplots show that full finetuning results is much lesser variance in performance. Nevertheless, on disaggregation of the dice metrics, we find that the model has tendency to oversegment as shown by high specificity (0.99) compared to relatively low sensitivity(0.75). The source code is available at https://github.com/CAMERA-MRI/SPARK2024/tree/main/PEFT_MedNeXt |
+2024-12-19 |
+Event-based backpropagation on the neuromorphic platform SpiNNaker2 |
+Béna Gabriel, Wunderlich Timo, Akl Mahmoud, Vogginger Bernhard, Mayr Christian, Andres Gonzales Hector |
+Link |
+Neuromorphic computing aims to replicate the brain's capabilities for energy efficient and parallel information processing, promising a solution to the increasing demand for faster and more efficient computational systems. Efficient training of neural networks on neuromorphic hardware requires the development of training algorithms that retain the sparsity of spike-based communication during training. Here, we report on the first implementation of event-based backpropagation on the SpiNNaker2 neuromorphic hardware platform. We use EventProp, an algorithm for event-based backpropagation in spiking neural networks (SNNs), to compute exact gradients using sparse communication of error signals between neurons. Our implementation computes multi-layer networks of leaky integrate-and-fire neurons using discretized versions of the differential equations and their adjoints, and uses event packets to transmit spikes and error signals between network layers. We demonstrate a proof-of-concept of batch-parallelized, on-chip training of SNNs using the Yin Yang dataset, and provide an off-chip implementation for efficient prototyping, hyper-parameter search, and hybrid training methods. |
@@ -100,6 +100,41 @@
+2024-12-20 |
+Mamba-based Deep Learning Approaches for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography |
+Andrew H. Zhang, Alex He-Mo, Richard Fei Yin, Chunlin Li, Yuzhi Tang, Dharmendra Gurve, Nasim Montazeri Ghahjaverestan, Maged Goubran, Bo Wang, Andrew S. P. Lim |
+Link |
+Study Objectives: We investigate using Mamba-based deep learning approaches for sleep staging on signals from ANNE One (Sibel Health, Evanston, IL), a minimally intrusive dual-sensor wireless wearable system measuring chest electrocardiography (ECG), triaxial accelerometry, and temperature, as well as finger photoplethysmography (PPG) and temperature. Methods: We obtained wearable sensor recordings from 360 adults undergoing concurrent clinical polysomnography (PSG) at a tertiary care sleep lab. PSG recordings were scored according to AASM criteria. PSG and wearable sensor data were automatically aligned using their ECG channels with manual confirmation by visual inspection. We trained Mamba-based models with both convolutional-recurrent neural network (CRNN) and recurrent neural network (RNN) architectures on these recordings. Ensembling of model variants with similar architectures was performed. Results: Our best approach, after ensembling, attains a 3-class (wake, NREM, REM) balanced accuracy of 83.50%, F1 score of 84.16%, Cohen's $\kappa$ of 72.68%, and an MCC score of 72.84%; a 4-class (wake, N1/N2, N3, REM) balanced accuracy of 74.64%, F1 score of 74.56%, Cohen's $\kappa$ of 61.63%, and an MCC score of 62.04%; a 5-class (wake, N1, N2, N3, REM) balanced accuracy of 64.30%, F1 score of 66.97%, Cohen's $\kappa$ of 53.23%, and an MCC score of 54.38%. Conclusions: Deep learning models can infer major sleep stages from a wearable system without electroencephalography (EEG) and can be successfully applied to data from adults attending a tertiary care sleep clinic. |
+
+
+2024-12-20 |
+MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems |
+Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba |
+Link |
+Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods. |
+
+
+2024-12-20 |
+SODor: Long-Term EEG Partitioning for Seizure Onset Detection |
+Zheng Chen, Yasuko Matsubara, Yasushi Sakurai, Jimeng Sun |
+Link |
+Deep learning models have recently shown great success in classifying epileptic patients using EEG recordings. Unfortunately, classification-based methods lack a sound mechanism to detect the onset of seizure events. In this work, we propose a two-stage framework, SODor, that explicitly models seizure onset through a novel task formulation of subsequence clustering. Given an EEG sequence, the framework first learns a set of second-level embeddings with label supervision. It then employs model-based clustering to explicitly capture long-term temporal dependencies in EEG sequences and identify meaningful subsequences. Epochs within a subsequence share a common cluster assignment (normal or seizure), with cluster or state transitions representing successful onset detections. Extensive experiments on three datasets demonstrate that our method can correct misclassifications, achieving 5%-11% classification improvements over other baselines and accurately detecting seizure onsets. |
+
+
+2024-12-20 |
+Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings |
+Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh |
+Link |
+Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations. |
+
+
+2024-12-19 |
+LG-Sleep: Local and Global Temporal Dependencies for Mice Sleep Scoring |
+Shadi Sartipi, Mie Andersen, Natalie Hauglund, Celia Kjaerby, Verena Untiet, Maiken Nedergaard, Mujdat Cetin |
+Link |
+Efficiently identifying sleep stages is crucial for unraveling the intricacies of sleep in both preclinical and clinical research. The labor-intensive nature of manual sleep scoring, demanding substantial expertise, has prompted a surge of interest in automated alternatives. Sleep studies in mice play a significant role in understanding sleep patterns and disorders and underscore the need for robust scoring methodologies. In response, this study introduces LG-Sleep, a novel subject-independent deep neural network architecture designed for mice sleep scoring through electroencephalogram (EEG) signals. LG-Sleep extracts local and global temporal transitions within EEG signals to categorize sleep data into three stages: wake, rapid eye movement (REM) sleep, and non-rapid eye movement (NREM) sleep. The model leverages local and global temporal information by employing time-distributed convolutional neural networks to discern local temporal transitions in EEG data. Subsequently, features derived from the convolutional filters traverse long short-term memory blocks, capturing global transitions over extended periods. Crucially, the model is optimized in an autoencoder-decoder fashion, facilitating generalization across distinct subjects and adapting to limited training samples. Experimental findings demonstrate superior performance of LG-Sleep compared to conventional deep neural networks. Moreover, the model exhibits good performance across different sleep stages even when tasked with scoring based on limited training samples. |
+
+
2024-12-19 |
CAE-T: A Channelwise AutoEncoder with Transformer for EEG Abnormality Detection |
Youshen Zhao, Keiji Iramina |
@@ -134,41 +169,6 @@
Link |
A brain-computer interface (BCI) establishes a direct communication pathway between the human brain and a computer. It has been widely used in medical diagnosis, rehabilitation, education, entertainment, etc. Most research so far focuses on making BCIs more accurate and reliable, but much less attention has been paid to their privacy. Developing a commercial BCI system usually requires close collaborations among multiple organizations, e.g., hospitals, universities, and/or companies. Input data in BCIs, e.g., electroencephalogram (EEG), contain rich privacy information, and the developed machine learning model is usually proprietary. Data and model transmission among different parties may incur significant privacy threats, and hence privacy protection in BCIs must be considered. Unfortunately, there does not exist any contemporary and comprehensive review on privacy-preserving BCIs. This paper fills this gap, by describing potential privacy threats and protection strategies in BCIs. It also points out several challenges and future research directions in developing privacy-preserving BCIs. |
-
-2024-12-16 |
-Accurate, Robust and Privacy-Preserving Brain-Computer Interface Decoding |
-Xiaoqing Chen, Tianwang Jia, Dongrui Wu |
-Link |
-An electroencephalogram (EEG) based brain-computer interface (BCI) enables direct communication between the brain and external devices. However, EEG-based BCIs face at least three major challenges in real-world applications: data scarcity and individual differences, adversarial vulnerability, and data privacy. While previous studies have addressed one or two of these issues, simultaneous accommodation of all three challenges remains challenging and unexplored. This paper fills this gap, by proposing an Augmented Robustness Ensemble (ARE) algorithm and integrating it into three privacy protection scenarios (centralized source-free transfer, federated source-free transfer, and source data perturbation), achieving simultaneously accurate decoding, adversarial robustness, and privacy protection of EEG-based BCIs. Experiments on three public EEG datasets demonstrated that our proposed approach outperformed over 10 classic and state-of-the-art approaches in both accuracy and robustness in all three privacy-preserving scenarios, even outperforming state-of-the-art transfer learning approaches that do not consider privacy protection at all. This is the first time that three major challenges in EEG-based BCIs can be addressed simultaneously, significantly improving the practicalness of EEG decoding in real-world BCIs. |
-
-
-2024-12-13 |
-KenCoh: A Ranked-Based Canonical Coherence |
-Mara Sherlin D. Talento, Sarbojit Roy, Hernando C. Ombao |
-Link |
-In this paper, we consider the problem of characterizing a robust global dependence between two brain regions where each region may contain several voxels or channels. This work is driven by experiments to investigate the dependence between two cortical regions and to identify differences in brain networks between brain states, e.g., alert and drowsy states. The most common approach to explore dependence between two groups of variables (or signals) is via canonical correlation analysis (CCA). However, it is limited to only capturing linear associations and is sensitive to outlier observations. These limitations are crucial because brain network connectivity is likely to be more complex than linear and that brain signals may exhibit heavy-tailed properties. To overcome these limitations, we develop a robust method, Kendall canonical coherence (KenCoh), for learning monotonic connectivity structure among neuronal signals filtered at given frequency bands. Furthermore, we propose the KenCoh-based permutation test to investigate the differences in brain network connectivity between two different states. Our simulation study demonstrates that KenCoh is competitive to the traditional variance-covariance estimator and outperforms the later when the underlying distributions are heavy-tailed. We apply our method to EEG recordings from a virtual-reality driving experiment. Our proposed method led to further insights on the differences of frontal-parietal cross-dependence network when the subject is alert and when the subject is drowsy and that left-parietal channel drives this dependence at the beta-band. |
-
-
-2024-12-13 |
-CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information |
-Kaifan Zhang, Lihuo He, Xin Jiang, Wen Lu, Di Wang, Xinbo Gao |
-Link |
-Electroencephalogram (EEG) signals have attracted significant attention from researchers due to their non-invasive nature and high temporal sensitivity in decoding visual stimuli. However, most recent studies have focused solely on the relationship between EEG and image data pairs, neglecting the valuable ``beyond-image-modality" information embedded in EEG signals. This results in the loss of critical multimodal information in EEG. To address this limitation, we propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Expert Encoders for each modality to extract cross-modal information from the EEG modality. Then, it introduces a diffusion prior to map the EEG embedding space to the CLIP embedding space, followed by using a pretrained generative model, the proposed framework can reconstruct visual stimuli with high semantic and structural fidelity. Notably, the framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities. Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively. Code: https://github.com/XiaoZhangYES/CognitionCapturer. |
-
-
-2024-12-13 |
-Active Poisoning: Efficient Backdoor Attacks on Transfer Learning-Based Brain-Computer Interfaces |
-X. Jiang, L. Meng, S. Li, D. Wu |
-Link |
-Transfer learning (TL) has been widely used in electroencephalogram (EEG)-based brain-computer interfaces (BCIs) for reducing calibration efforts. However, backdoor attacks could be introduced through TL. In such attacks, an attacker embeds a backdoor with a specific pattern into the machine learning model. As a result, the model will misclassify a test sample with the backdoor trigger into a prespecified class while still maintaining good performance on benign samples. Accordingly, this study explores backdoor attacks in the TL of EEG-based BCIs, where source-domain data are poisoned by a backdoor trigger and then used in TL. We propose several active poisoning approaches to select source-domain samples, which are most effective in embedding the backdoor pattern, to improve the attack success rate and efficiency. Experiments on four EEG datasets and three deep learning models demonstrate the effectiveness of the approaches. To our knowledge, this is the first study about backdoor attacks on TL models in EEG-based BCIs. It exposes a serious security risk in BCIs, which should be immediately addressed. |
-
-
-2024-12-13 |
-User Identity Protection in EEG-based Brain-Computer Interfaces |
-L. Meng, X. Jiang, J. Huang, W. Li, H. Luo, D. Wu |
-Link |
-A brain-computer interface (BCI) establishes a direct communication pathway between the brain and an external device. Electroencephalogram (EEG) is the most popular input signal in BCIs, due to its convenience and low cost. Most research on EEG-based BCIs focuses on the accurate decoding of EEG signals; however, EEG signals also contain rich private information, e.g., user identity, emotion, and so on, which should be protected. This paper first exposes a serious privacy problem in EEG-based BCIs, i.e., the user identity in EEG data can be easily learned so that different sessions of EEG data from the same user can be associated together to more reliably mine private information. To address this issue, we further propose two approaches to convert the original EEG data into identity-unlearnable EEG data, i.e., removing the user identity information while maintaining the good performance on the primary BCI task. Experiments on seven EEG datasets from five different BCI paradigms showed that on average the generated identity-unlearnable EEG data can reduce the user identification accuracy from 70.01\% to at most 21.36\%, greatly facilitating user privacy protection in EEG-based BCIs. |
-
@@ -186,6 +186,20 @@
+2024-12-20 |
+MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems |
+Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba |
+Link |
+Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods. |
+
+
+2024-12-20 |
+Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings |
+Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh |
+Link |
+Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations. |
+
+
2024-12-17 |
Predicting Workload in Virtual Flight Simulations using EEG Features (Including Post-hoc Analysis in Appendix) |
Bas Verkennis, Evy van Weelden, Francesca L. Marogna, Maryam Alimardani, Travis J. Wiltshire, Max M. Louwerse |
@@ -241,20 +255,6 @@
Link |
Motor imagery (MI) based brain-computer interfaces (BCIs) enable the direct control of external devices through the imagined movements of various body parts. Unlike previous systems that used fixed-length EEG trials for MI decoding, asynchronous BCIs aim to detect the user's MI without explicit triggers. They are challenging to implement, because the algorithm needs to first distinguish between resting-states and MI trials, and then classify the MI trials into the correct task, all without any triggers. This paper proposes a sliding window prescreening and classification (SWPC) approach for MI-based asynchronous BCIs, which consists of two modules: a prescreening module to screen MI trials out of the resting-state, and a classification module for MI classification. Both modules are trained with supervised learning followed by self-supervised learning, which refines the feature extractors. Within-subject and cross-subject asynchronous MI classifications on four different EEG datasets validated the effectiveness of SWPC, i.e., it always achieved the highest average classification accuracy, and outperformed the best state-of-the-art baseline on each dataset by about 2%. |
-
-2024-12-12 |
-Motor Imagery Teleoperation of a Mobile Robot Using a Low-Cost Brain-Computer Interface for Multi-Day Validation |
-Yujin An, Daniel Mitchell, John Lathrop, David Flynn, Soon-Jo Chung |
-Link |
-Brain-computer interfaces (BCI) have the potential to provide transformative control in prosthetics, assistive technologies (wheelchairs), robotics, and human-computer interfaces. While Motor Imagery (MI) offers an intuitive approach to BCI control, its practical implementation is often limited by the requirement for expensive devices, extensive training data, and complex algorithms, leading to user fatigue and reduced accessibility. In this paper, we demonstrate that effective MI-BCI control of a mobile robot in real-world settings can be achieved using a fine-tuned Deep Neural Network (DNN) with a sliding window, eliminating the need for complex feature extractions for real-time robot control. The fine-tuning process optimizes the convolutional and attention layers of the DNN to adapt to each user's daily MI data streams, reducing training data by 70% and minimizing user fatigue from extended data collection. Using a low-cost (~$3k), 16-channel, non-invasive, open-source electroencephalogram (EEG) device, four users teleoperated a quadruped robot over three days. The system achieved 78% accuracy on a single-day validation dataset and maintained a 75% validation accuracy over three days without extensive retraining from day-to-day. For real-world robot command classification, we achieved an average of 62% accuracy. By providing empirical evidence that MI-BCI systems can maintain performance over multiple days with reduced training data to DNN and a low-cost EEG device, our work enhances the practicality and accessibility of BCI technology. This advancement makes BCI applications more feasible for real-world scenarios, particularly in controlling robotic systems. |
-
-
-2024-12-10 |
-Towards Predictive Communication with Brain-Computer Interfaces integrating Large Language Models |
-Andrea Caria |
-Link |
-This perspective article aims at providing an outline of the state of the art and future developments towards the integration of cutting-edge predictive language models with BCI. A synthetic overview of early and more recent linguistic models, from natural language processing (NLP) models to recent LLM, that to a varying extent improved predictive writing systems, is first provided. Second, a summary of previous BCI implementations integrating language models is presented. The few preliminary studies investigating the possible combination of LLM with BCI spellers to efficiently support fast communication and control are then described. Finally, current challenges and limitations towards the full integration of LLM with BCI systems are discussed. Recent investigations suggest that the combination of LLM with BCI might drastically improve human-computer interaction in patients with motor or language disorders as well as in healthy individuals. In particular, the pretrained autoregressive transformer models, such as GPT, that capitalize from parallelization, learning through pre-training and fine-tuning, promise a substantial improvement of BCI for communication with respect to previous systems incorporating simpler language models. Indeed, among various models, the GPT-2 was shown to represent an excellent candidate for its integration into BCI although testing was only perfomed on simulated conversations and not on real BCI scenarios. Prospectively, the full integration of LLM with advanced BCI systems might lead to a big leap forward towards fast, efficient and user-adaptive neurotechnology. |
-
@@ -530,74 +530,74 @@
-2024-12-19 |
-Modeling and Measurement of Lead Tip Heating in Implanted Wires with Loops |
-Lydia J Bardwell Speltz, Seung-Kyun Lee, Yunhong Shu, Matt A Bernstein |
-Link |
-Purpose: To theoretically and experimentally study implant lead tip heating caused by radiofrequency (RF) power deposition in different wire configurations that contain loop(s). Methods: Maximum temperature rise caused by RF heating was measured at 1.5T on 20 insulated, capped wires with various loop and straight segment configurations. The experimental results were compared with predictions from the previously reported simple exponential and the adapted transmission line models, as well as with a long-wavelength approximation. Results: Both models effectively predicted the trends in lead tip temperature rise for all the wire configurations, with the adapted transmission line model showing superior accuracy. For superior/inferior (S/I)-oriented wires, increasing the number of loops decreased the overall heating. However, when wires were oriented right/left (R/L) where the x-component of the electric field is negligible, additional loops increased the overall heating. Conclusion: The simple exponential and the adapted transmission line models previously developed for, and tested on, straight wires require no additional terms or further modification to account for RF heating in a variety of loop configurations. These results extend the usefulness of the models to manage implanted device lead tip heating and provide theoretical insight regarding the role of loops and electrical lengths in managing RF safety of implanted devices. |
+2024-12-20 |
+EF-Net: A Deep Learning Approach Combining Word Embeddings and Feature Fusion for Patient Disposition Analysis |
+Nafisa Binte Feroz, Chandrima Sarker, Tanzima Ahsan, K M Arefeen Sultan, Raqeebir Rab |
+Link |
+One of the most urgent problems is the overcrowding in emergency departments (EDs), caused by an aging population and rising healthcare costs. Patient dispositions have become more complex as a result of the strain on hospital infrastructure and the scarcity of medical resources. Individuals with more dangerous health issues should be prioritized in the emergency room. Thus, our research aims to develop a prediction model for patient disposition using EF-Net. This model will incorporate categorical features into the neural network layer and add numerical features with the embedded categorical features. We combine the EF-Net and XGBoost models to attain higher accuracy in our results. The result is generated using the soft voting technique. In EF-Net, we attained an accuracy of 95.33%, whereas in the Ensemble Model, we achieved an accuracy of 96%. The experiment's analysis shows that EF-Net surpasses existing works in accuracy, AUROC, and F1-Score on the MIMIC-IV-ED dataset, demonstrating its potential as a scalable solution for patient disposition assessment. Our code is available at https://github.com/nafisa67/thesis |
-2024-12-19 |
-MultiverSeg: Scalable Interactive Segmentation of Biomedical Imaging Datasets with In-Context Guidance |
-Hallee E. Wong, Jose Javier Gonzalez Ortiz, John Guttag, Adrian V. Dalca |
-Link |
-Medical researchers and clinicians often need to perform novel segmentation tasks on a set of related images. Existing methods for segmenting a new dataset are either interactive, requiring substantial human effort for each image, or require an existing set of manually labeled images. We introduce a system, MultiverSeg, that enables practitioners to rapidly segment an entire new dataset without requiring access to any existing labeled data from that task or domain. Along with the image to segment, the model takes user interactions such as clicks, bounding boxes or scribbles as input, and predicts a segmentation. As the user segments more images, those images and segmentations become additional inputs to the model, providing context. As the context set of labeled images grows, the number of interactions required to segment each new image decreases. We demonstrate that MultiverSeg enables users to interactively segment new datasets efficiently, by amortizing the number of interactions per image to achieve an accurate segmentation. Compared to using a state-of-the-art interactive segmentation method, using MultiverSeg reduced the total number of scribble steps by 53% and clicks by 36% to achieve 90% Dice on sets of images from unseen tasks. We release code and model weights at https://multiverseg.csail.mit.edu |
+2024-12-20 |
+Convolutional Deep Operator Networks for Learning Nonlinear Focused Ultrasound Wave Propagation in Heterogeneous Spinal Cord Anatomy |
+Avisha Kumar, Xuzhe Zhi, Zan Ahmad, Minglang Yin, Amir Manbachi |
+Link |
+Focused ultrasound (FUS) therapy is a promising tool for optimally targeted treatment of spinal cord injuries (SCI), offering submillimeter precision to enhance blood flow at injury sites while minimizing impact on surrounding tissues. However, its efficacy is highly sensitive to the placement of the ultrasound source, as the spinal cord's complex geometry and acoustic heterogeneity distort and attenuate the FUS signal. Current approaches rely on computer simulations to solve the governing wave propagation equations and compute patient-specific pressure maps using ultrasound images of the spinal cord anatomy. While accurate, these high-fidelity simulations are computationally intensive, taking up to hours to complete parameter sweeps, which is impractical for real-time surgical decision-making. To address this bottleneck, we propose a convolutional deep operator network (DeepONet) to rapidly predict FUS pressure fields in patient spinal cords. Unlike conventional neural networks, DeepONets are well equipped to approximate the solution operator of the parametric partial differential equations (PDEs) that govern the behavior of FUS waves with varying initial and boundary conditions (i.e., new transducer locations or spinal cord geometries) without requiring extensive simulations. Trained on simulated pressure maps across diverse patient anatomies, this surrogate model achieves real-time predictions with only a 2% loss on the test set, significantly accelerating the modeling of nonlinear physical systems in heterogeneous domains. By facilitating rapid parameter sweeps in surgical settings, this work provides a crucial step toward precise and individualized solutions in neurosurgical treatments. |
-2024-12-19 |
-Joint estimation of activity, attenuation and motion in respiratory-self-gated time-of-flight PET |
-Masoud Elhamiasl, Frederic Jolivet, Ahmadreza Rezaei, Michael Fieseler, Klaus Schäfers, Johan Nuyts, Georg Schramm, Fernando Boada |
-Link |
-Whole-body PET imaging is often hindered by respiratory motion during acquisition, causing significant degradation in the quality of reconstructed activity images. An additional challenge in PET/CT imaging arises from the respiratory phase mismatch between CT-based attenuation correction and PET acquisition, leading to attenuation artifacts. To address these issues, we propose two new, purely data-driven methods for the joint estimation of activity, attenuation, and motion in respiratory self-gated TOF PET. These methods enable the reconstruction of a single activity image free from motion and attenuation artifacts. The proposed methods were evaluated using data from the anthropomorphic Wilhelm phantom acquired on a Siemens mCT PET/CT system, as well as 3 clinical FDG PET/CT datasets acquired on a GE DMI PET/CT system. Image quality was assessed visually to identify motion and attenuation artifacts. Lesion uptake values were quantitatively compared across reconstructions without motion modeling, with motion modeling but static attenuation correction, and with our proposed methods. For the Wilhelm phantom, the proposed methods delivered image quality closely matching the reference reconstruction from a static acquisition. The lesion-to-background contrast for a liver dome lesion improved from 2.0 (no motion correction) to 5.2 (proposed methods), matching the contrast from the static acquisition (5.2). In contrast, motion modeling with static attenuation correction yielded a lower contrast of 3.5. In patient datasets, the proposed methods successfully reduced motion artifacts in lung and liver lesions and mitigated attenuation artifacts, demonstrating superior lesion to background separation. Our proposed methods enable the reconstruction of a single, high-quality activity image that is motion-corrected and free from attenuation artifacts, without the need for external hardware. |
+2024-12-20 |
+Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG |
+Hasan Md Tusfiqur Alam, Devansh Srivastav, Md Abdul Kadir, Daniel Sonntag |
+Link |
+Deep learning has advanced medical image classification, but interpretability challenges hinder its clinical adoption. This study enhances interpretability in Chest X-ray (CXR) classification by using concept bottleneck models (CBMs) and a multi-agent Retrieval-Augmented Generation (RAG) system for report generation. By modeling relationships between visual features and clinical concepts, we create interpretable concept vectors that guide a multi-agent RAG system to generate radiology reports, enhancing clinical relevance, explainability, and transparency. Evaluation of the generated reports using an LLM-as-a-judge confirmed the interpretability and clinical utility of our model's outputs. On the COVID-QU dataset, our model achieved 81% classification accuracy and demonstrated robust report generation performance, with five key metrics ranging between 84% and 90%. This interpretable multi-agent framework bridges the gap between high-performance AI and the explainability required for reliable AI-driven CXR analysis in clinical settings. |
-2024-12-19 |
-MUSTER: Longitudinal Deformable Registration by Composition of Consecutive Deformations |
-Edvard O. S. Grødem, Donatas Sederevičius, Esten H. Leonardsen, Bradley J. MacIntosh, Atle Bjørnerud, Till Schellhorn, Øystein Sørensen, Inge Amlien, Pablo F. Garrido, Anders M. Fjell |
-Link |
-Longitudinal imaging allows for the study of structural changes over time. One approach to detecting such changes is by non-linear image registration. This study introduces Multi-Session Temporal Registration (MUSTER), a novel method that facilitates longitudinal analysis of changes in extended series of medical images. MUSTER improves upon conventional pairwise registration by incorporating more than two imaging sessions to recover longitudinal deformations. Longitudinal analysis at a voxel-level is challenging due to effects of a changing image contrast as well as instrumental and environmental sources of bias between sessions. We show that local normalized cross-correlation as an image similarity metric leads to biased results and propose a robust alternative. We test the performance of MUSTER on a synthetic multi-site, multi-session neuroimaging dataset and show that, in various scenarios, using MUSTER significantly enhances the estimated deformations relative to pairwise registration. Additionally, we apply MUSTER on a sample of older adults from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. The results show that MUSTER can effectively identify patterns of neuro-degeneration from T1-weighted images and that these changes correlate with changes in cognition, matching the performance of state of the art segmentation methods. By leveraging GPU acceleration, MUSTER efficiently handles large datasets, making it feasible also in situations with limited computational resources. |
+2024-12-20 |
+Efficient MedSAMs: Segment Anything in Medical Images on Laptop |
+Jun Ma, Feifei Li, Sumin Kim, Reza Asakereh, Bao-Hiep Le, Dang-Khoa Nguyen-Vu, Alexander Pfefferle, Muxin Wei, Ruochen Gao, Donghang Lyu, Songxiao Yang, Lennart Purucker, Zdravko Marinov, Marius Staring, Haisheng Lu, Thuy Thanh Dao, Xincheng Ye, Zhi Li, Gianluca Brugnara, Philipp Vollmuth, Martha Foltyn-Dumitru, Jaeyoung Cho, Mustafa Ahmed Mahmutoglu, Martin Bendszus, Irada Pflüger, Aditya Rastogi, Dong Ni, Xin Yang, Guang-Quan Zhou, Kaini Wang, Nicholas Heller, Nikolaos Papanikolopoulos, Christopher Weight, Yubing Tong, Jayaram K Udupa, Cahill J. Patrick, Yaqi Wang, Yifan Zhang, Francisco Contijoch, Elliot McVeigh, Xin Ye, Shucheng He, Robert Haase, Thomas Pinetz, Alexander Radbruch, Inga Krause, Erich Kobler, Jian He, Yucheng Tang, Haichun Yang, Yuankai Huo, Gongning Luo, Kaisar Kushibar, Jandos Amankulov, Dias Toleshbayev, Amangeldi Mukhamejan, Jan Egger, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Shohei Fujita, Tomohiro Kikuchi, Benedikt Wiestler, Jan S. Kirschke, Ezequiel de la Rosa, Federico Bolelli, Luca Lumetti, Costantino Grana, Kunpeng Xie, Guomin Wu, Behrus Puladi, Carlos Martín-Isla, Karim Lekadir, Victor M. Campello, Wei Shao, Wayne Brisbane, Hongxu Jiang, Hao Wei, Wu Yuan, Shuangle Li, Yuyin Zhou, Bo Wang |
+Link |
+Promptable segmentation foundation models have emerged as a transformative approach to addressing the diverse needs in medical images, but most existing models require expensive computing, posing a big barrier to their adoption in clinical practice. In this work, we organized the first international competition dedicated to promptable medical image segmentation, featuring a large-scale dataset spanning nine common imaging modalities from over 20 different institutions. The top teams developed lightweight segmentation foundation models and implemented an efficient inference pipeline that substantially reduced computational requirements while maintaining state-of-the-art segmentation accuracy. Moreover, the post-challenge phase advanced the algorithms through the design of performance booster and reproducibility tasks, resulting in improved algorithms and validated reproducibility of the winning solution. Furthermore, the best-performing algorithms have been incorporated into the open-source software with a user-friendly interface to facilitate clinical adoption. The data and code are publicly available to foster the further development of medical image segmentation foundation models and pave the way for impactful real-world applications. |
-2024-12-19 |
-Dual Photonics Probing of Nano- to Submicron-Scale Structural Alterations in Human Brain Tissues or Cells and Chromatin or DNA with the Progression of Alzheimers Disease |
-Fatemah Alharthi, Ishmael Apachigawo, Dhruvil Solanki, Sazzad Khan, Himanshi Singh, Mohammad Moshahid Khan, Prabhakar Pradhan |
-Link |
-Understanding alterations in structural disorders in tissue or cells or building blocks, such as DNA or chromatin in the human brain, at the nano to submicron level provides us with efficient biomarkers for Alzheimers detection. Here, we report a dual photonics technique to detect nano- to submicron-scale alterations in brain tissues or cells and DNA or chromatin due to the early to late progression of Alzheimers disease in humans. Using a recently developed mesoscopic light transport technique, fine-focused nano-sensitive partial wave spectroscopy (PWS), we measure the degree of structural disorder in tissues. Furthermore, the chemical-specific inverse participation ratio technique (IPR) was used to measure the DNA or chromatin structural alterations. The results of the PWS and IPR experiments showed a significant increase in the degree of structural disorder at the nano to submicron scale at different stages of AD relative to their controls for both the tissue or cell and DNA cellular levels. The increase in the structural disorder in cells or tissues and DNA or chromatin in the nuclei can be attributed to higher mass density fluctuations in the tissue and DNA or chromatin damage in the nuclei caused by the rearrangements of macromolecules due to the deposition of the amyloid beta protein and damage in DNA or chromatin with the progress of AD. |
+2024-12-20 |
+Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game |
+Sebastian Niehaus, Ingo Roeder, Nico Scherf |
+Link |
+Decentralised learning enables the training of deep learning algorithms without centralising data sets, resulting in benefits such as improved data privacy, operational efficiency and the fostering of data ownership policies. However, significant data imbalances pose a challenge in this framework. Participants with smaller datasets in distributed learning environments often achieve poorer results than participants with larger datasets. Data imbalances are particularly pronounced in medical fields and are caused by different patient populations, technological inequalities and divergent data collection practices. In this paper, we consider distributed learning as a Stackelberg evolutionary game. We present two algorithms for setting the weights of each node's contribution to the global model in each training round: the Deterministic Stackelberg Weighting Model (DSWM) and the Adaptive Stackelberg Weighting Model (ASWM). We use three medical datasets to highlight the impact of dynamic weighting on underrepresented nodes in distributed learning. Our results show that the ASWM significantly favours underrepresented nodes by improving their performance by 2.713% in AUC. Meanwhile, nodes with larger datasets experience only a modest average performance decrease of 0.441%. |
-2024-12-19 |
-Pitfalls of topology-aware image segmentation |
-Alexander H. Berger, Laurin Lux, Alexander Weers, Martin Menten, Daniel Rueckert, Johannes C. Paetzold |
-Link |
-Topological correctness, i.e., the preservation of structural integrity and specific characteristics of shape, is a fundamental requirement for medical imaging tasks, such as neuron or vessel segmentation. Despite the recent surge in topology-aware methods addressing this challenge, their real-world applicability is hindered by flawed benchmarking practices. In this paper, we identify critical pitfalls in model evaluation that include inadequate connectivity choices, overlooked topological artifacts in ground truth annotations, and inappropriate use of evaluation metrics. Through detailed empirical analysis, we uncover these issues' profound impact on the evaluation and ranking of segmentation methods. Drawing from our findings, we propose a set of actionable recommendations to establish fair and robust evaluation standards for topology-aware medical image segmentation methods. |
+2024-12-20 |
+Probing lactate exchange and compartmentation in Gray Matter via time-dependent diffusion-weighted MRS |
+Eloise Mougel, Marco Palombo, Julien Valette |
+Link |
+Lactate is crucial in the brain, as it is involved in neuronal activity and memory formation. This is thought to be related to the astrocyte-neuron lactate shuttle hypothesis (ANLS), which has been debated for 30 years, not least because it is difficult to measure lactate compartmentation and exchange in vivo. In particular, while ANLS requires transfer of lactate via the extracellular space, intracellular/extracellular exchange rate remains unknown. In this work, we propose to use time-dependent diffusion MRS to assess lactate exchange and lactate compartmentation in mouse brain gray matter in vivo. First, by comparing lactate time-dependent diffusivity and kurtosis with those of water and purely intracellular metabolites (which exhibit "fast" and "no" exchange signatures, respectively), we estimate that lactate exchange is slow (i.e. on the order of hundreds of milliseconds). Then, using biophysical models neglecting exchange, we estimate extracellular, neuronal and astrocytic lactate fractions compatible with the ANLS hypothesis. |
-2024-12-19 |
-Accelerated Patient-Specific Calibration via Differentiable Hemodynamics Simulations |
-Diego Renner, Georgios Kissas |
-Link |
-One of the goals of personalized medicine is to tailor diagnostics to individual patients. Diagnostics are performed in practice by measuring quantities, called biomarkers, that indicate the existence and progress of a disease. In common cardiovascular diseases, such as hypertension, biomarkers that are closely related to the clinical representation of a patient can be predicted using computational models. Personalizing computational models translates to considering patient-specific flow conditions, for example, the compliance of blood vessels that cannot be a priori known and quantities such as the patient geometry that can be measured using imaging. Therefore, a patient is identified by a set of measurable and nonmeasurable parameters needed to well-define a computational model; else, the computational model is not personalized, meaning it is prone to large prediction errors. Therefore, to personalize a computational model, sufficient information needs to be extracted from the data. The current methods by which this is done are either inefficient, due to relying on slow-converging optimization methods, or hard to interpret, due to using `black box` deep-learning algorithms. We propose a personalized diagnostic procedure based on a differentiable 0D-1D Navier-Stokes reduced order model solver and fast parameter inference methods that take advantage of gradients through the solver. By providing a faster method for performing parameter inference and sensitivity analysis through differentiability while maintaining the interpretability of well-understood mathematical models and numerical methods, the best of both worlds is combined. The performance of the proposed solver is validated against a well-established process on different geometries, and different parameter inference processes are successfully performed. |
+2024-12-20 |
+Self-Supervised Radiograph Anatomical Region Classification -- How Clean Is Your Real-World Data? |
+Simon Langer, Jessica Ritter, Rickmer Braren, Daniel Rueckert, Paul Hager |
+Link |
+Modern deep learning-based clinical imaging workflows rely on accurate labels of the examined anatomical region. Knowing the anatomical region is required to select applicable downstream models and to effectively generate cohorts of high quality data for future medical and machine learning research efforts. However, this information may not be available in externally sourced data or generally contain data entry errors. To address this problem, we show the effectiveness of self-supervised methods such as SimCLR and BYOL as well as supervised contrastive deep learning methods in assigning one of 14 anatomical region classes in our in-house dataset of 48,434 skeletal radiographs. We achieve a strong linear evaluation accuracy of 96.6% with a single model and 97.7% using an ensemble approach. Furthermore, only a few labeled instances (1% of the training set) suffice to achieve an accuracy of 92.2%, enabling usage in low-label and thus low-resource scenarios. Our model can be used to correct data entry mistakes: a follow-up analysis of the test set errors of our best-performing single model by an expert radiologist identified 35% incorrect labels and 11% out-of-domain images. When accounted for, the radiograph anatomical region labelling performance increased -- without and with an ensemble, respectively -- to a theoretical accuracy of 98.0% and 98.8%. |
-2024-12-19 |
-{S$^3$-Mamba}: Small-Size-Sensitive Mamba for Lesion Segmentation |
-Gui Wang, Yuexiang Li, Wenting Chen, Meidan Ding, Wooi Ping Cheah, Rong Qu, Jianfeng Ren, Linlin Shen |
-Link |
-Small lesions play a critical role in early disease diagnosis and intervention of severe infections. Popular models often face challenges in segmenting small lesions, as it occupies only a minor portion of an image, while down\_sampling operations may inevitably lose focus on local features of small lesions. To tackle the challenges, we propose a {\bf S}mall-{\bf S}ize-{\bf S}ensitive {\bf Mamba} ({\bf S$^3$-Mamba}), which promotes the sensitivity to small lesions across three dimensions: channel, spatial, and training strategy. Specifically, an Enhanced Visual State Space block is designed to focus on small lesions through multiple residual connections to preserve local features, and selectively amplify important details while suppressing irrelevant ones through channel-wise attention. A Tensor-based Cross-feature Multi-scale Attention is designed to integrate input image features and intermediate-layer features with edge features and exploit the attentive support of features across multiple scales, thereby retaining spatial details of small lesions at various granularities. Finally, we introduce a novel regularized curriculum learning to automatically assess lesion size and sample difficulty, and gradually focus from easy samples to hard ones like small lesions. Extensive experiments on three medical image segmentation datasets show the superiority of our S$^3$-Mamba, especially in segmenting small lesions. Our code is available at https://github.com/ErinWang2023/S3-Mamba. |
+2024-12-20 |
+From General to Specific: Tailoring Large Language Models for Personalized Healthcare |
+Ruize Shi, Hong Huang, Wei Zhou, Kehan Yin, Kai Zhao, Yun Zhao |
+Link |
+The rapid development of large language models (LLMs) has transformed many industries, including healthcare. However, previous medical LLMs have largely focused on leveraging general medical knowledge to provide responses, without accounting for patient variability and lacking true personalization at the individual level. To address this, we propose a novel method called personalized medical language model (PMLM), which explores and optimizes personalized LLMs through recommendation systems and reinforcement learning (RL). Specifically, by utilizing self-informed and peer-informed personalization, PMLM captures changes in behaviors and preferences to design initial personalized prompts tailored to individual needs. We further refine these initial personalized prompts through RL, ultimately enhancing the precision of LLM guidance. Notably, the personalized prompts are hard prompts, which grants PMLM high adaptability and reusability, allowing it to directly leverage high-quality proprietary LLMs. We evaluate PMLM using real-world obstetrics and gynecology data, and the experimental results demonstrate that PMLM achieves personalized responses, and it provides more refined and individualized services, offering a potential way for personalized medical LLMs. |
-2024-12-19 |
-DAMPER: A Dual-Stage Medical Report Generation Framework with Coarse-Grained MeSH Alignment and Fine-Grained Hypergraph Matching |
-Xiaofei Huang, Wenting Chen, Jie Liu, Qisheng Lu, Xiaoling Luo, Linlin Shen |
-Link |
-Medical report generation is crucial for clinical diagnosis and patient management, summarizing diagnoses and recommendations based on medical imaging. However, existing work often overlook the clinical pipeline involved in report writing, where physicians typically conduct an initial quick review followed by a detailed examination. Moreover, current alignment methods may lead to misaligned relationships. To address these issues, we propose DAMPER, a dual-stage framework for medical report generation that mimics the clinical pipeline of report writing in two stages. In the first stage, a MeSH-Guided Coarse-Grained Alignment (MCG) stage that aligns chest X-ray (CXR) image features with medical subject headings (MeSH) features to generate a rough keyphrase representation of the overall impression. In the second stage, a Hypergraph-Enhanced Fine-Grained Alignment (HFG) stage that constructs hypergraphs for image patches and report annotations, modeling high-order relationships within each modality and performing hypergraph matching to capture semantic correlations between image regions and textual phrases. Finally,the coarse-grained visual features, generated MeSH representations, and visual hypergraph features are fed into a report decoder to produce the final medical report. Extensive experiments on public datasets demonstrate the effectiveness of DAMPER in generating comprehensive and accurate medical reports, outperforming state-of-the-art methods across various evaluation metrics. |
+2024-12-20 |
+MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Classification and Detection |
+Andrea Moglia, Elia Clement Nastasio, Luca Mainardi, Pietro Cerveri |
+Link |
+Problem: Pancreas radiological imaging is challenging due to the small size, blurred boundaries, and variability of shape and position of the organ among patients. Goal: In this work we present MiniGPT-Pancreas, a Multimodal Large Language Model (MLLM), as an interactive chatbot to support clinicians in pancreas cancer diagnosis by integrating visual and textual information. Methods: MiniGPT-v2, a general-purpose MLLM, was fine-tuned in a cascaded way for pancreas detection, tumor classification, and tumor detection with multimodal prompts combining questions and computed tomography scans from the National Institutes of Health (NIH) and Medical Segmentation Decathlon (MSD) datasets. The AbdomenCT-1k dataset was used to detect the liver, spleen, kidney, and pancreas. Results: MiniGPT-Pancreas achieved an Intersection over Union (IoU) of 0.595 and 0.550 for the detection of pancreas on NIH and MSD datasets, respectively. For the pancreas cancer classification task on the MSD dataset, accuracy, precision, and recall were 0.876, 0.874, and 0.878, respectively. When evaluating MiniGPT-Pancreas on the AbdomenCT-1k dataset for multi-organ detection, the IoU was 0.8399 for the liver, 0.722 for the kidney, 0.705 for the spleen, and 0.497 for the pancreas. For the pancreas tumor detection task, the IoU score was 0.168 on the MSD dataset. Conclusions: MiniGPT-Pancreas represents a promising solution to support clinicians in the classification of pancreas images with pancreas tumors. Future research is needed to improve the score on the detection task, especially for pancreas tumors. |
-2024-12-19 |
-VISA: Retrieval Augmented Generation with Visual Source Attribution |
-Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin |
-Link |
-Generation with source attribution is important for enhancing the verifiability of retrieval-augmented generation (RAG) systems. However, existing approaches in RAG primarily link generated content to document-level references, making it challenging for users to locate evidence among multiple content-rich retrieved documents. To address this challenge, we propose Retrieval-Augmented Generation with Visual Source Attribution (VISA), a novel approach that combines answer generation with visual source attribution. Leveraging large vision-language models (VLMs), VISA identifies the evidence and highlights the exact regions that support the generated answers with bounding boxes in the retrieved document screenshots. To evaluate its effectiveness, we curated two datasets: Wiki-VISA, based on crawled Wikipedia webpage screenshots, and Paper-VISA, derived from PubLayNet and tailored to the medical domain. Experimental results demonstrate the effectiveness of VISA for visual source attribution on documents' original look, as well as highlighting the challenges for improvement. Code, data, and model checkpoints will be released. |
+2024-12-20 |
+Development of a Large-scale Dataset of Chest Computed Tomography Reports in Japanese and a High-performance Finding Classification Model |
+Yosuke Yamagishi, Yuta Nakamura, Tomohiro Kikuchi, Yuki Sonoda, Hiroshi Hirakawa, Shintaro Kano, Satoshi Nakamura, Shouhei Hanaoka, Takeharu Yoshikawa, Osamu Abe |
+Link |
+Background: Recent advances in large language models highlight the need for high-quality multilingual medical datasets. While Japan leads globally in CT scanner deployment and utilization, the lack of large-scale Japanese radiology datasets has hindered the development of specialized language models for medical imaging analysis. Objective: To develop a comprehensive Japanese CT report dataset through machine translation and establish a specialized language model for structured finding classification. Additionally, to create a rigorously validated evaluation dataset through expert radiologist review. Methods: We translated the CT-RATE dataset (24,283 CT reports from 21,304 patients) into Japanese using GPT-4o mini. The training dataset consisted of 22,778 machine-translated reports, while the validation dataset included 150 radiologist-revised reports. We developed CT-BERT-JPN based on "tohoku-nlp/bert-base-japanese-v3" architecture for extracting 18 structured findings from Japanese radiology reports. Results: Translation metrics showed strong performance with BLEU scores of 0.731 and 0.690, and ROUGE scores ranging from 0.770 to 0.876 for Findings and from 0.748 to 0.857 for Impression sections. CT-BERT-JPN demonstrated superior performance compared to GPT-4o in 11 out of 18 conditions, including lymphadenopathy (+14.2%), interlobular septal thickening (+10.9%), and atelectasis (+7.4%). The model maintained F1 scores exceeding 0.95 in 14 out of 18 conditions and achieved perfect scores in four conditions. Conclusions: Our study establishes a robust Japanese CT report dataset and demonstrates the effectiveness of a specialized language model for structured finding classification. The hybrid approach of machine translation and expert validation enables the creation of large-scale medical datasets while maintaining high quality. |
diff --git a/data_store/papers_2024-12-24.json b/data_store/papers_2024-12-24.json
new file mode 100644
index 0000000..a432a4f
--- /dev/null
+++ b/data_store/papers_2024-12-24.json
@@ -0,0 +1,526 @@
+{
+ "Brain": {
+ "2412.16111v1": {
+ "title": "How connection probability shapes fluctuations of neural population dynamics",
+ "url": "http://arxiv.org/abs/2412.16111v1",
+ "authors": "Nils E. Greven, Jonas Ranft, Tilo Schwalger",
+ "update_time": "2024-12-20",
+ "abstract": "Mean-field models of neuronal populations in the brain have proven extremely useful to understand network dynamics and response to stimuli, but these models generally lack a faithful description of the fluctuations in the biologically relevant case of finite network size and connection probabilities $p<1$ (non-full connectivity). To gain insight into the different fluctuation mechanisms underlying the neural variability of populations of spiking neurons, we derive and analyze a stochastic mean-field model for finite-size networks of Poisson neurons with random, non-full connectivity, external noise and disordered mean inputs. We treat the quenched disorder of the connectivity by an annealed approximation that enables a reduction to a low-dimensional closed system of coupled Langevin equations for the mean and variance of the neuronal membrane potentials as well as a variable capturing finite-size fluctuations arising specifically in the case $p<1$. Comparing to microscopic simulations, we find that the mesoscopic model describes the fluctuations and nonlinearities well and outperforms previous mesoscopic models that neglected the recurrent noise effect caused by the non-full connectivity. This effect can be analytically understood by a softening of the effective nonlinearity and the multiplicative character of finite-size spiking noise. The mesoscopic theory shows that quenched disorder can stabilize the asynchronous state, and it correctly predicts large quantitative and non-trivial qualitative effects of connection probability on the variance of the population firing rate and its dependence on stimulus strength. Our theory thus elucidates how disordered connectivity shapes nonlinear dynamics and fluctuations of neural populations at the mesoscopic scale and showcases a useful mean-field method to treat non-full connectivity in finite-size, spiking neural networks."
+ },
+ "2412.16014v1": {
+ "title": "Probing lactate exchange and compartmentation in Gray Matter via time-dependent diffusion-weighted MRS",
+ "url": "http://arxiv.org/abs/2412.16014v1",
+ "authors": "Eloise Mougel, Marco Palombo, Julien Valette",
+ "update_time": "2024-12-20",
+ "abstract": "Lactate is crucial in the brain, as it is involved in neuronal activity and memory formation. This is thought to be related to the astrocyte-neuron lactate shuttle hypothesis (ANLS), which has been debated for 30 years, not least because it is difficult to measure lactate compartmentation and exchange in vivo. In particular, while ANLS requires transfer of lactate via the extracellular space, intracellular/extracellular exchange rate remains unknown. In this work, we propose to use time-dependent diffusion MRS to assess lactate exchange and lactate compartmentation in mouse brain gray matter in vivo. First, by comparing lactate time-dependent diffusivity and kurtosis with those of water and purely intracellular metabolites (which exhibit \"fast\" and \"no\" exchange signatures, respectively), we estimate that lactate exchange is slow (i.e. on the order of hundreds of milliseconds). Then, using biophysical models neglecting exchange, we estimate extracellular, neuronal and astrocytic lactate fractions compatible with the ANLS hypothesis."
+ },
+ "2412.15869v1": {
+ "title": "A Digital Phantom for 3D MR Spectroscopy Data Simulation",
+ "url": "http://arxiv.org/abs/2412.15869v1",
+ "authors": "D. M. J. van de Sande, A. T. Gudmundson, S. Murali-Manohar, C. W. Davies-Jenkins, D. Simicic, G. Simegn, \u0130. \u00d6zdemir, S. Amirrajab, J. P. Merkofer, H. J. Z\u00f6llner, G. Oeltzschner, R. A. E. Edden",
+ "update_time": "2024-12-20",
+ "abstract": "Simulated data is increasingly valued by researchers for validating MRS and MRSI processing and analysis algorithms. However, there is no consensus on the optimal approaches for simulation models and parameters. This study introduces a novel 3D MRS digital brain phantom framework, providing a comprehensive and modular foundation for MRS and MRSI data simulation. We generate a digital brain phantom by combining anatomical and tissue label information with metabolite data from the literature. This phantom contains all necessary information for simulating spectral data. We integrate the phantom with a signal-based model to demonstrate its functionality and usability in generating various spectral datasets. Outputs are saved in the NIfTI-MRS format, enabling their use in downstream applications. We successfully implemented and tested the 3D MRS digital brain phantom framework using two different anatomical models at two resolutions. The resulting metabolite maps and spectral datasets demonstrate realistic data quality, flexibility based on user inputs, and reasonable computational efficiency. This innovative 3D digital brain phantom framework provides a clear and structured approach to simulating MRS and MRSI data. Its modular design establishes a strong, adaptable foundation for future advancements in MRS and MRSI simulation, allowing researchers to extend and refine the model to meet the field's evolving needs."
+ },
+ "2412.15862v1": {
+ "title": "MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems",
+ "url": "http://arxiv.org/abs/2412.15862v1",
+ "authors": "Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba",
+ "update_time": "2024-12-20",
+ "abstract": "Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods.",
+ "code_url": "https://github.com/neu-spiral/markovtype"
+ },
+ "2412.15818v1": {
+ "title": "Precision ICU Resource Planning: A Multimodal Model for Brain Surgery Outcomes",
+ "url": "http://arxiv.org/abs/2412.15818v1",
+ "authors": "Maximilian Fischer, Florian M. Hauptmann, Robin Peretzke, Paul Naser, Peter Neher, Jan-Oliver Neumann, Klaus Maier-Hein",
+ "update_time": "2024-12-20",
+ "abstract": "Although advances in brain surgery techniques have led to fewer postoperative complications requiring Intensive Care Unit (ICU) monitoring, the routine transfer of patients to the ICU remains the clinical standard, despite its high cost. Predictive Gradient Boosted Trees based on clinical data have attempted to optimize ICU admission by identifying key risk factors pre-operatively; however, these approaches overlook valuable imaging data that could enhance prediction accuracy. In this work, we show that multimodal approaches that combine clinical data with imaging data outperform the current clinical data only baseline from 0.29 [F1] to 0.30 [F1], when only pre-operative clinical data is used and from 0.37 [F1] to 0.41 [F1], for pre- and post-operative data. This study demonstrates that effective ICU admission prediction benefits from multimodal data fusion, especially in contexts of severe class imbalance."
+ },
+ "2412.15734v1": {
+ "title": "The Role of Recurrency in Image Segmentation for Noisy and Limited Sample Settings",
+ "url": "http://arxiv.org/abs/2412.15734v1",
+ "authors": "David Calhas, Jo\u00e3o Marques, Arlindo L. Oliveira",
+ "update_time": "2024-12-20",
+ "abstract": "The biological brain has inspired multiple advances in machine learning. However, most state-of-the-art models in computer vision do not operate like the human brain, simply because they are not capable of changing or improving their decisions/outputs based on a deeper analysis. The brain is recurrent, while these models are not. It is therefore relevant to explore what would be the impact of adding recurrent mechanisms to existing state-of-the-art architectures and to answer the question of whether recurrency can improve existing architectures. To this end, we build on a feed-forward segmentation model and explore multiple types of recurrency for image segmentation. We explore self-organizing, relational, and memory retrieval types of recurrency that minimize a specific energy function. In our experiments, we tested these models on artificial and medical imaging data, while analyzing the impact of high levels of noise and few-shot learning settings. Our results do not validate our initial hypothesis that recurrent models should perform better in these settings, suggesting that these recurrent architectures, by themselves, are not sufficient to surpass state-of-the-art feed-forward versions and that additional work needs to be done on the topic."
+ },
+ "2412.15711v1": {
+ "title": "Does the brain behave like a (complex) network? I. Dynamics",
+ "url": "http://arxiv.org/abs/2412.15711v1",
+ "authors": "D. Papo, J. M. Buld\u00fa",
+ "update_time": "2024-12-20",
+ "abstract": "Graph theory is now becoming a standard tool in system-level neuroscience. However, endowing observed brain anatomy and dynamics with a complex network structure does not entail that the brain actually works as a network. Asking whether the brain behaves as a network means asking whether network properties count. From the viewpoint of neurophysiology and, possibly, of brain physics, the most substantial issues a network structure may be instrumental in addressing relate to the influence of network properties on brain dynamics and to whether these properties ultimately explain some aspects of brain function. Here, we address the dynamical implications of complex network, examining which aspects and scales of brain activity may be understood to genuinely behave as a network. To do so, we first define the meaning of networkness, and analyse some of its implications. We then examine ways in which brain anatomy and dynamics can be endowed with a network structure and discuss possible ways in which network structure may be shown to represent a genuine organisational principle of brain activity, rather than just a convenient description of its anatomy and dynamics."
+ },
+ "2412.15634v1": {
+ "title": "Darkit: A User-Friendly Software Toolkit for Spiking Large Language Model",
+ "url": "http://arxiv.org/abs/2412.15634v1",
+ "authors": "Xin Du, Shifan Ye, Qian Zheng, Yangfan Hu, Rui Yan, Shunyu Qi, Shuyang Chen, Huajin Tang, Gang Pan, Shuiguang Deng",
+ "update_time": "2024-12-20",
+ "abstract": "Large language models (LLMs) have been widely applied in various practical applications, typically comprising billions of parameters, with inference processes requiring substantial energy and computational resources. In contrast, the human brain, employing bio-plausible spiking mechanisms, can accomplish the same tasks while significantly reducing energy consumption, even with a similar number of parameters. Based on this, several pioneering researchers have proposed and implemented various large language models that leverage spiking neural networks. They have demonstrated the feasibility of these models, validated their performance, and open-sourced their frameworks and partial source code. To accelerate the adoption of brain-inspired large language models and facilitate secondary development for researchers, we are releasing a software toolkit named DarwinKit (Darkit). The toolkit is designed specifically for learners, researchers, and developers working on spiking large models, offering a suite of highly user-friendly features that greatly simplify the learning, deployment, and development processes."
+ },
+ "2412.15560v1": {
+ "title": "Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings",
+ "url": "http://arxiv.org/abs/2412.15560v1",
+ "authors": "Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh",
+ "update_time": "2024-12-20",
+ "abstract": "Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations."
+ },
+ "2412.15021v1": {
+ "title": "Event-based backpropagation on the neuromorphic platform SpiNNaker2",
+ "url": "http://arxiv.org/abs/2412.15021v1",
+ "authors": "B\u00e9na Gabriel, Wunderlich Timo, Akl Mahmoud, Vogginger Bernhard, Mayr Christian, Andres Gonzales Hector",
+ "update_time": "2024-12-19",
+ "abstract": "Neuromorphic computing aims to replicate the brain's capabilities for energy efficient and parallel information processing, promising a solution to the increasing demand for faster and more efficient computational systems. Efficient training of neural networks on neuromorphic hardware requires the development of training algorithms that retain the sparsity of spike-based communication during training. Here, we report on the first implementation of event-based backpropagation on the SpiNNaker2 neuromorphic hardware platform. We use EventProp, an algorithm for event-based backpropagation in spiking neural networks (SNNs), to compute exact gradients using sparse communication of error signals between neurons. Our implementation computes multi-layer networks of leaky integrate-and-fire neurons using discretized versions of the differential equations and their adjoints, and uses event packets to transmit spikes and error signals between network layers. We demonstrate a proof-of-concept of batch-parallelized, on-chip training of SNNs using the Yin Yang dataset, and provide an off-chip implementation for efficient prototyping, hyper-parameter search, and hybrid training methods."
+ }
+ },
+ "EEG": {
+ "2412.15947v1": {
+ "title": "Mamba-based Deep Learning Approaches for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography",
+ "url": "http://arxiv.org/abs/2412.15947v1",
+ "authors": "Andrew H. Zhang, Alex He-Mo, Richard Fei Yin, Chunlin Li, Yuzhi Tang, Dharmendra Gurve, Nasim Montazeri Ghahjaverestan, Maged Goubran, Bo Wang, Andrew S. P. Lim",
+ "update_time": "2024-12-20",
+ "abstract": "Study Objectives: We investigate using Mamba-based deep learning approaches for sleep staging on signals from ANNE One (Sibel Health, Evanston, IL), a minimally intrusive dual-sensor wireless wearable system measuring chest electrocardiography (ECG), triaxial accelerometry, and temperature, as well as finger photoplethysmography (PPG) and temperature. Methods: We obtained wearable sensor recordings from 360 adults undergoing concurrent clinical polysomnography (PSG) at a tertiary care sleep lab. PSG recordings were scored according to AASM criteria. PSG and wearable sensor data were automatically aligned using their ECG channels with manual confirmation by visual inspection. We trained Mamba-based models with both convolutional-recurrent neural network (CRNN) and the recurrent neural network (RNN) architectures on these recordings. Ensembling of model variants with similar architectures was performed. Results: Our best approach, after ensembling, attains a 3-class (wake, NREM, REM) balanced accuracy of 83.50%, F1 score of 84.16%, Cohen's $\\kappa$ of 72.68%, and a MCC score of 72.84%; a 4-class (wake, N1/N2, N3, REM) balanced accuracy of 74.64%, F1 score of 74.56%, Cohen's $\\kappa$ of 61.63%, and MCC score of 62.04%; a 5-class (wake, N1, N2, N3, REM) balanced accuracy of 64.30%, F1 score of 66.97%, Cohen's $\\kappa$ of 53.23%, MCC score of 54.38%. Conclusions: Deep learning models can infer major sleep stages from a wearable system without electroencephalography (EEG) and can be successfully applied to data from adults attending a tertiary care sleep clinic."
+ },
+ "2412.15862v1": {
+ "title": "MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems",
+ "url": "http://arxiv.org/abs/2412.15862v1",
+ "authors": "Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba",
+ "update_time": "2024-12-20",
+ "abstract": "Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods.",
+ "code_url": "https://github.com/neu-spiral/markovtype"
+ },
+ "2412.15598v1": {
+ "title": "SODor: Long-Term EEG Partitioning for Seizure Onset Detection",
+ "url": "http://arxiv.org/abs/2412.15598v1",
+ "authors": "Zheng Chen, Yasuko Matsubara, Yasushi Sakurai, Jimeng Sun",
+ "update_time": "2024-12-20",
+ "abstract": "Deep learning models have recently shown great success in classifying epileptic patients using EEG recordings. Unfortunately, classification-based methods lack a sound mechanism to detect the onset of seizure events. In this work, we propose a two-stage framework, SODor, that explicitly models seizure onset through a novel task formulation of subsequence clustering. Given an EEG sequence, the framework first learns a set of second-level embeddings with label supervision. It then employs model-based clustering to explicitly capture long-term temporal dependencies in EEG sequences and identify meaningful subsequences. Epochs within a subsequence share a common cluster assignment (normal or seizure), with cluster or state transitions representing successful onset detections. Extensive experiments on three datasets demonstrate that our method can correct misclassifications, achieving 5%-11% classification improvements over other baselines and accurately detecting seizure onsets."
+ },
+ "2412.15560v1": {
+ "title": "Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings",
+ "url": "http://arxiv.org/abs/2412.15560v1",
+ "authors": "Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh",
+ "update_time": "2024-12-20",
+ "abstract": "Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations."
+ },
+ "2412.15412v1": {
+ "title": "LG-Sleep: Local and Global Temporal Dependencies for Mice Sleep Scoring",
+ "url": "http://arxiv.org/abs/2412.15412v1",
+ "authors": "Shadi Sartipi, Mie Andersen, Natalie Hauglund, Celia Kjaerby, Verena Untiet, Maiken Nedergaard, Mujdat Cetin",
+ "update_time": "2024-12-19",
+ "abstract": "Efficiently identifying sleep stages is crucial for unraveling the intricacies of sleep in both preclinical and clinical research. The labor-intensive nature of manual sleep scoring, demanding substantial expertise, has prompted a surge of interest in automated alternatives. Sleep studies in mice play a significant role in understanding sleep patterns and disorders and underscore the need for robust scoring methodologies. In response, this study introduces LG-Sleep, a novel subject-independent deep neural network architecture designed for mice sleep scoring through electroencephalogram (EEG) signals. LG-Sleep extracts local and global temporal transitions within EEG signals to categorize sleep data into three stages: wake, rapid eye movement (REM) sleep, and non-rapid eye movement (NREM) sleep. The model leverages local and global temporal information by employing time-distributed convolutional neural networks to discern local temporal transitions in EEG data. Subsequently, features derived from the convolutional filters traverse long short-term memory blocks, capturing global transitions over extended periods. Crucially, the model is optimized in an autoencoder-decoder fashion, facilitating generalization across distinct subjects and adapting to limited training samples. Experimental findings demonstrate superior performance of LG-Sleep compared to conventional deep neural networks. Moreover, the model exhibits good performance across different sleep stages even when tasked with scoring based on limited training samples."
+ },
+ "2412.14522v1": {
+ "title": "CAE-T: A Channelwise AutoEncoder with Transformer for EEG Abnormality Detection",
+ "url": "http://arxiv.org/abs/2412.14522v1",
+ "authors": "Youshen Zhao, Keiji Iramina",
+ "update_time": "2024-12-19",
+ "abstract": "Electroencephalogram (EEG) signals are critical for detecting abnormal brain activity, but their high dimensionality and complexity pose significant challenges for effective analysis. In this paper, we propose CAE-T, a novel framework that combines a channelwise CNN-based autoencoder with a single-head transformer classifier for efficient EEG abnormality detection. The channelwise autoencoder compresses raw EEG signals while preserving channel independence, reducing computational costs and retaining biologically meaningful features. The compressed representations are then fed into the transformer-based classifier, which efficiently models long-term dependencies to distinguish between normal and abnormal signals. Evaluated on the TUH Abnormal EEG Corpus, the proposed model achieves 85.0% accuracy, 76.2% sensitivity, and 91.2% specificity at the per-case level, outperforming baseline models such as EEGNet, Deep4Conv, and FusionCNN. Furthermore, CAE-T requires only 202M FLOPs and 2.9M parameters, making it significantly more efficient than transformer-based alternatives. The framework retains interpretability through its channelwise design, demonstrating great potential for future applications in neuroscience research and clinical practice. The source code is available at https://github.com/YossiZhao/CAE-T.",
+ "code_url": "https://github.com/yossizhao/cae-t"
+ },
+ "2412.13028v1": {
+ "title": "Identification of Epileptic Spasms (ESES) Phases Using EEG Signals: A Vision Transformer Approach",
+ "url": "http://arxiv.org/abs/2412.13028v1",
+ "authors": "Wei Gong, Yaru Li",
+ "update_time": "2024-12-17",
+ "abstract": "This work introduces a new approach to Epileptic Spasms (ESES) detection based on EEG signals using Vision Transformers (ViT). Classic ESES detection approaches have usually relied on manual processing or conventional algorithms, suffering from small sample sizes, single-channel-based analyses, and low generalization abilities. In contrast, the proposed ViT model overcomes these limitations by using the attention mechanism to focus on the important features in multi-channel EEG data, which contributes to both better accuracy and efficiency. The model processes frequency-domain representations of EEG signals, such as spectrograms, as image data to capture long-range dependencies and complex patterns in the signal. The model demonstrates high performance with an accuracy of 97% without requiring intensive data preprocessing, thus rendering it suitable for real-time clinical applications on a large scale. The method represents a significant advance in the detection and analysis of neurological disorders such as ESES."
+ },
+ "2412.12428v1": {
+ "title": "Predicting Workload in Virtual Flight Simulations using EEG Features (Including Post-hoc Analysis in Appendix)",
+ "url": "http://arxiv.org/abs/2412.12428v1",
+ "authors": "Bas Verkennis, Evy van Weelden, Francesca L. Marogna, Maryam Alimardani, Travis J. Wiltshire, Max M. Louwerse",
+ "update_time": "2024-12-17",
+ "abstract": "Effective cognitive workload management has a major impact on the safety and performance of pilots. Integrating brain-computer interfaces (BCIs) presents an opportunity for real-time workload assessment. Leveraging cognitive workload data from immersive, high-fidelity virtual reality (VR) flight simulations enhances ecological validity and allows for dynamic adjustments to training scenarios based on individual cognitive states. While prior studies have predominantly concentrated on EEG spectral power for workload prediction, delving into inter-brain connectivity may yield deeper insights. This study assessed the predictive value of EEG spectral and connectivity features in distinguishing high vs. low workload periods during simulated flight in VR and Desktop conditions. EEG data were collected from 52 non-pilot participants conducting flight tasks in an aircraft simulation, after which they reported cognitive workload using the NASA Task Load Index. Using an ensemble approach, a stacked classifier was trained to predict workload using two feature sets extracted from the EEG data: 1) spectral features (Baseline model), and 2) a combination of spectral and connectivity features (Connectivity model), both within the alpha, beta, and theta band ranges. Results showed that the performance of the Connectivity model surpassed the Baseline model. Additionally, Recursive Feature Elimination (RFE) provided insights into the most influential workload-predicting features, highlighting the potential dominance of parietal-directed connectivity in managing cognitive workload during simulated flight. Further research on other connectivity metrics and alternative models (such as deep learning) in a large sample of pilots is essential to validate the possibility of a real-time BCI for the prediction of workload under safety-critical operational conditions.",
+ "code_url": "https://github.com/basverkennis/flight-sim-cognitive-workload-eeg-prediction"
+ },
+ "2412.11695v2": {
+ "title": "CiTrus: Squeezing Extra Performance out of Low-data Bio-signal Transfer Learning",
+ "url": "http://arxiv.org/abs/2412.11695v2",
+ "authors": "Eloy Geenjaar, Lie Lu",
+ "update_time": "2024-12-18",
+ "abstract": "Transfer learning for bio-signals has recently become an important technique to improve prediction performance on downstream tasks with small bio-signal datasets. Recent works have shown that pre-training a neural network model on a large dataset (e.g. EEG) with a self-supervised task, replacing the self-supervised head with a linear classification head, and fine-tuning the model on different downstream bio-signal datasets (e.g., EMG or ECG) can dramatically improve the performance on those datasets. In this paper, we propose a new convolution-transformer hybrid model architecture with masked auto-encoding for low-data bio-signal transfer learning, introduce a frequency-based masked auto-encoding task, employ a more comprehensive evaluation framework, and evaluate how much and when (multimodal) pre-training improves fine-tuning performance. We also introduce a dramatically more performant method of aligning a downstream dataset with a different temporal length and sampling rate to the original pre-training dataset. Our findings indicate that the convolution-only part of our hybrid model can achieve state-of-the-art performance on some low-data downstream tasks. The performance is often improved even further with our full model. In the case of transformer-based models we find that pre-training especially improves performance on downstream datasets, multimodal pre-training often increases those gains further, and our frequency-based pre-training performs the best on average for the lowest and highest data regimes."
+ },
+ "2412.11394v1": {
+ "title": "Privacy-Preserving Brain-Computer Interfaces: A Systematic Review",
+ "url": "http://arxiv.org/abs/2412.11394v1",
+ "authors": "K. Xia, W. Duch, Y. Sun, K. Xu, W. Fang, H. Luo, Y. Zhang, D. Sang, X. Xu, F-Y Wang, D. Wu",
+ "update_time": "2024-12-16",
+ "abstract": "A brain-computer interface (BCI) establishes a direct communication pathway between the human brain and a computer. It has been widely used in medical diagnosis, rehabilitation, education, entertainment, etc. Most research so far focuses on making BCIs more accurate and reliable, but much less attention has been paid to their privacy. Developing a commercial BCI system usually requires close collaborations among multiple organizations, e.g., hospitals, universities, and/or companies. Input data in BCIs, e.g., electroencephalogram (EEG), contain rich privacy information, and the developed machine learning model is usually proprietary. Data and model transmission among different parties may incur significant privacy threats, and hence privacy protection in BCIs must be considered. Unfortunately, there does not exist any contemporary and comprehensive review on privacy-preserving BCIs. This paper fills this gap, by describing potential privacy threats and protection strategies in BCIs. It also points out several challenges and future research directions in developing privacy-preserving BCIs."
+ }
+ },
+ "BCI": {
+ "2412.15862v1": {
+ "title": "MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems",
+ "url": "http://arxiv.org/abs/2412.15862v1",
+ "authors": "Elifnur Sunger, Yunus Bicer, Deniz Erdogmus, Tales Imbiriba",
+ "update_time": "2024-12-20",
+ "abstract": "Brain-Computer Interfaces (BCIs) help people with severe speech and motor disabilities communicate and interact with their environment using neural activity. This work focuses on the Rapid Serial Visual Presentation (RSVP) paradigm of BCIs using noninvasive electroencephalography (EEG). The RSVP typing task is a recursive task with multiple sequences, where users see only a subset of symbols in each sequence. Extensive research has been conducted to improve classification in the RSVP typing task, achieving fast classification. However, these methods struggle to achieve high accuracy and do not consider the typing mechanism in the learning procedure. They apply binary target and non-target classification without including recursive training. To improve performance in the classification of symbols while controlling the classification speed, we incorporate the typing setup into training by proposing a Partially Observable Markov Decision Process (POMDP) approach. To the best of our knowledge, this is the first work to formulate the RSVP typing task as a POMDP for recursive classification. Experiments show that the proposed approach, MarkovType, results in a more accurate typing system compared to competitors. Additionally, our experiments demonstrate that while there is a trade-off between accuracy and speed, MarkovType achieves the optimal balance between these factors compared to other methods.",
+ "code_url": "https://github.com/neu-spiral/markovtype"
+ },
+ "2412.15560v1": {
+ "title": "Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings",
+ "url": "http://arxiv.org/abs/2412.15560v1",
+ "authors": "Taketo Akama, Zhuohao Zhang, Pengcheng Li, Kotaro Hongo, Hiroaki Kitano, Shun Minamikawa, Natalia Polouliakh",
+ "update_time": "2024-12-20",
+ "abstract": "Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations when subjected to identical auditory sensory inputs. In these studies, the ability to predict cortical representations is probed by regressing from ANN representations to cortical representations. Building upon this concept, our approach reverses the direction of prediction: we utilize ANN representations as a supervisory signal to train recognition models using noisy brain recordings obtained through non-invasive measurements. Specifically, we focus on constructing a recognition model for music identification, where electroencephalography (EEG) brain recordings collected during music listening serve as input. By training an EEG recognition model to predict ANN representations (representations associated with music identification), we observed a substantial improvement in classification accuracy. This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli. It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition. Furthermore, it provides new insights into the relationship between auditory brain activity and ANN representations."
+ },
+ "2412.12428v1": {
+ "title": "Predicting Workload in Virtual Flight Simulations using EEG Features (Including Post-hoc Analysis in Appendix)",
+ "url": "http://arxiv.org/abs/2412.12428v1",
+ "authors": "Bas Verkennis, Evy van Weelden, Francesca L. Marogna, Maryam Alimardani, Travis J. Wiltshire, Max M. Louwerse",
+ "update_time": "2024-12-17",
+ "abstract": "Effective cognitive workload management has a major impact on the safety and performance of pilots. Integrating brain-computer interfaces (BCIs) presents an opportunity for real-time workload assessment. Leveraging cognitive workload data from immersive, high-fidelity virtual reality (VR) flight simulations enhances ecological validity and allows for dynamic adjustments to training scenarios based on individual cognitive states. While prior studies have predominantly concentrated on EEG spectral power for workload prediction, delving into inter-brain connectivity may yield deeper insights. This study assessed the predictive value of EEG spectral and connectivity features in distinguishing high vs. low workload periods during simulated flight in VR and Desktop conditions. EEG data were collected from 52 non-pilot participants conducting flight tasks in an aircraft simulation, after which they reported cognitive workload using the NASA Task Load Index. Using an ensemble approach, a stacked classifier was trained to predict workload using two feature sets extracted from the EEG data: 1) spectral features (Baseline model), and 2) a combination of spectral and connectivity features (Connectivity model), both within the alpha, beta, and theta band ranges. Results showed that the performance of the Connectivity model surpassed the Baseline model. Additionally, Recursive Feature Elimination (RFE) provided insights into the most influential workload-predicting features, highlighting the potential dominance of parietal-directed connectivity in managing cognitive workload during simulated flight. Further research on other connectivity metrics and alternative models (such as deep learning) in a large sample of pilots is essential to validate the possibility of a real-time BCI for the prediction of workload under safety-critical operational conditions.",
+ "code_url": "https://github.com/basverkennis/flight-sim-cognitive-workload-eeg-prediction"
+ },
+ "2412.11394v1": {
+ "title": "Privacy-Preserving Brain-Computer Interfaces: A Systematic Review",
+ "url": "http://arxiv.org/abs/2412.11394v1",
+ "authors": "K. Xia, W. Duch, Y. Sun, K. Xu, W. Fang, H. Luo, Y. Zhang, D. Sang, X. Xu, F-Y Wang, D. Wu",
+ "update_time": "2024-12-16",
+ "abstract": "A brain-computer interface (BCI) establishes a direct communication pathway between the human brain and a computer. It has been widely used in medical diagnosis, rehabilitation, education, entertainment, etc. Most research so far focuses on making BCIs more accurate and reliable, but much less attention has been paid to their privacy. Developing a commercial BCI system usually requires close collaborations among multiple organizations, e.g., hospitals, universities, and/or companies. Input data in BCIs, e.g., electroencephalogram (EEG), contain rich privacy information, and the developed machine learning model is usually proprietary. Data and model transmission among different parties may incur significant privacy threats, and hence privacy protection in BCIs must be considered. Unfortunately, there does not exist any contemporary and comprehensive review on privacy-preserving BCIs. This paper fills this gap, by describing potential privacy threats and protection strategies in BCIs. It also points out several challenges and future research directions in developing privacy-preserving BCIs."
+ },
+ "2412.11390v1": {
+ "title": "Accurate, Robust and Privacy-Preserving Brain-Computer Interface Decoding",
+ "url": "http://arxiv.org/abs/2412.11390v1",
+ "authors": "Xiaoqing Chen, Tianwang Jia, Dongrui Wu",
+ "update_time": "2024-12-16",
+ "abstract": "An electroencephalogram (EEG) based brain-computer interface (BCI) enables direct communication between the brain and external devices. However, EEG-based BCIs face at least three major challenges in real-world applications: data scarcity and individual differences, adversarial vulnerability, and data privacy. While previous studies have addressed one or two of these issues, simultaneous accommodation of all three challenges remains challenging and unexplored. This paper fills this gap, by proposing an Augmented Robustness Ensemble (ARE) algorithm and integrating it into three privacy protection scenarios (centralized source-free transfer, federated source-free transfer, and source data perturbation), achieving simultaneously accurate decoding, adversarial robustness, and privacy protection of EEG-based BCIs. Experiments on three public EEG datasets demonstrated that our proposed approach outperformed over 10 classic and state-of-the-art approaches in both accuracy and robustness in all three privacy-preserving scenarios, even outperforming state-of-the-art transfer learning approaches that do not consider privacy protection at all. This is the first time that three major challenges in EEG-based BCIs can be addressed simultaneously, significantly improving the practicalness of EEG decoding in real-world BCIs."
+ },
+ "2412.12215v1": {
+ "title": "Imagined Speech State Classification for Robust Brain-Computer Interface",
+ "url": "http://arxiv.org/abs/2412.12215v1",
+ "authors": "Byung-Kwan Ko, Jun-Young Kim, Seo-Hyun Lee",
+ "update_time": "2024-12-15",
+ "abstract": "This study examines the effectiveness of traditional machine learning classifiers versus deep learning models for detecting the imagined speech using electroencephalogram data. Specifically, we evaluated conventional machine learning techniques such as CSP-SVM and LDA-SVM classifiers alongside deep learning architectures such as EEGNet, ShallowConvNet, and DeepConvNet. Machine learning classifiers exhibited significantly lower precision and recall, indicating limited feature extraction capabilities and poor generalization between imagined speech and idle states. In contrast, deep learning models, particularly EEGNet, achieved the highest accuracy of 0.7080 and an F1 score of 0.6718, demonstrating their enhanced ability in automatic feature extraction and representation learning, essential for capturing complex neurophysiological patterns. These findings highlight the limitations of conventional machine learning approaches in brain-computer interface (BCI) applications and advocate for adopting deep learning methodologies to achieve more precise and reliable classification of detecting imagined speech. This foundational research contributes to the development of imagined speech-based BCI systems."
+ },
+ "2412.09933v1": {
+ "title": "Active Poisoning: Efficient Backdoor Attacks on Transfer Learning-Based Brain-Computer Interfaces",
+ "url": "http://arxiv.org/abs/2412.09933v1",
+ "authors": "X. Jiang, L. Meng, S. Li, D. Wu",
+ "update_time": "2024-12-13",
+ "abstract": "Transfer learning (TL) has been widely used in electroencephalogram (EEG)-based brain-computer interfaces (BCIs) for reducing calibration efforts. However, backdoor attacks could be introduced through TL. In such attacks, an attacker embeds a backdoor with a specific pattern into the machine learning model. As a result, the model will misclassify a test sample with the backdoor trigger into a prespecified class while still maintaining good performance on benign samples. Accordingly, this study explores backdoor attacks in the TL of EEG-based BCIs, where source-domain data are poisoned by a backdoor trigger and then used in TL. We propose several active poisoning approaches to select source-domain samples, which are most effective in embedding the backdoor pattern, to improve the attack success rate and efficiency. Experiments on four EEG datasets and three deep learning models demonstrate the effectiveness of the approaches. To our knowledge, this is the first study about backdoor attacks on TL models in EEG-based BCIs. It exposes a serious security risk in BCIs, which should be immediately addressed."
+ },
+ "2412.09854v1": {
+ "title": "User Identity Protection in EEG-based Brain-Computer Interfaces",
+ "url": "http://arxiv.org/abs/2412.09854v1",
+ "authors": "L. Meng, X. Jiang, J. Huang, W. Li, H. Luo, D. Wu",
+ "update_time": "2024-12-13",
+ "abstract": "A brain-computer interface (BCI) establishes a direct communication pathway between the brain and an external device. Electroencephalogram (EEG) is the most popular input signal in BCIs, due to its convenience and low cost. Most research on EEG-based BCIs focuses on the accurate decoding of EEG signals; however, EEG signals also contain rich private information, e.g., user identity, emotion, and so on, which should be protected. This paper first exposes a serious privacy problem in EEG-based BCIs, i.e., the user identity in EEG data can be easily learned so that different sessions of EEG data from the same user can be associated together to more reliably mine private information. To address this issue, we further propose two approaches to convert the original EEG data into identity-unlearnable EEG data, i.e., removing the user identity information while maintaining the good performance on the primary BCI task. Experiments on seven EEG datasets from five different BCI paradigms showed that on average the generated identity-unlearnable EEG data can reduce the user identification accuracy from 70.01\\% to at most 21.36\\%, greatly facilitating user privacy protection in EEG-based BCIs."
+ },
+ "2412.09015v1": {
+ "title": "Front-end Replication Dynamic Window (FRDW) for Online Motor Imagery Classification",
+ "url": "http://arxiv.org/abs/2412.09015v1",
+ "authors": "X. Chen, J. An, H. Wu, S. Li, B. Liu, D. Wu",
+ "update_time": "2024-12-12",
+ "abstract": "Motor imagery (MI) is a classical paradigm in electroencephalogram (EEG) based brain-computer interfaces (BCIs). Online accurate and fast decoding is very important to its successful applications. This paper proposes a simple yet effective front-end replication dynamic window (FRDW) algorithm for this purpose. Dynamic windows enable the classification based on a test EEG trial shorter than those used in training, improving the decision speed; front-end replication fills a short test EEG trial to the length used in training, improving the classification accuracy. Within-subject and cross-subject online MI classification experiments on three public datasets, with three different classifiers and three different data augmentation approaches, demonstrated that FRDW can significantly increase the information transfer rate in MI decoding. Additionally, FR can also be used in training data augmentation. FRDW helped win the national championship of the China BCI Competition in 2022.",
+ "code_url": "https://github.com/xinru2001/frdw"
+ },
+ "2412.09006v1": {
+ "title": "Motor Imagery Classification for Asynchronous EEG-Based Brain-Computer Interfaces",
+ "url": "http://arxiv.org/abs/2412.09006v1",
+ "authors": "Huanyu Wu, Siyang Li, Dongrui Wu",
+ "update_time": "2024-12-12",
+ "abstract": "Motor imagery (MI) based brain-computer interfaces (BCIs) enable the direct control of external devices through the imagined movements of various body parts. Unlike previous systems that used fixed-length EEG trials for MI decoding, asynchronous BCIs aim to detect the user's MI without explicit triggers. They are challenging to implement, because the algorithm needs to first distinguish between resting-states and MI trials, and then classify the MI trials into the correct task, all without any triggers. This paper proposes a sliding window prescreening and classification (SWPC) approach for MI-based asynchronous BCIs, which consists of two modules: a prescreening module to screen MI trials out of the resting-state, and a classification module for MI classification. Both modules are trained with supervised learning followed by self-supervised learning, which refines the feature extractors. Within-subject and cross-subject asynchronous MI classifications on four different EEG datasets validated the effectiveness of SWPC, i.e., it always achieved the highest average classification accuracy, and outperformed the best state-of-the-art baseline on each dataset by about 2%.",
+ "code_url": "https://github.com/why135724/SWPC"
+ }
+ },
+ "fMRI": {
+ "2412.14999v1": {
+ "title": "Accessing the topological properties of human brain functional sub-circuits in Echo State Networks",
+ "url": "http://arxiv.org/abs/2412.14999v1",
+ "authors": "Bach Nguyen, Tianlong Chen, Shu Yang, Bojian Hou, Li Shen, Duy Duong-Tran",
+ "update_time": "2024-12-19",
+ "abstract": "Recent years have witnessed an emerging trend in neuromorphic computing that centers around the use of brain connectomics as a blueprint for artificial neural networks. Connectomics-based neuromorphic computing has primarily focused on embedding human brain large-scale structural connectomes (SCs), as estimated from diffusion Magnetic Resonance Imaging (dMRI) modality, to echo-state networks (ESNs). A critical step in ESN embedding requires pre-determined read-in and read-out layers constructed by the induced subgraphs of the embedded reservoir. As \\textit{a priori} set of functional sub-circuits are derived from functional MRI (fMRI) modality, it is unknown, till this point, whether the embedding of fMRI-induced sub-circuits/networks onto SCs is well justified from the neuro-physiological perspective and ESN performance across a variety of tasks. This paper proposes a pipeline to implement and evaluate ESNs with various embedded topologies and processing/memorization tasks. To this end, we showed that different performance optimums highly depend on the neuro-physiological characteristics of these pre-determined fMRI-induced sub-circuits. In general, fMRI-induced sub-circuit-embedded ESN outperforms simple bipartite and various null models with feed-forward properties commonly seen in MLP for different tasks and reservoir criticality conditions. We provided a thorough analysis of the topological properties of pre-determined fMRI-induced sub-circuits and highlighted their graph-theoretical properties that play significant roles in determining ESN performance."
+ },
+ "2412.13798v1": {
+ "title": "ICA-based Resting-State Networks Obtained on Large Autism fMRI Dataset ABIDE",
+ "url": "http://arxiv.org/abs/2412.13798v1",
+ "authors": "Sjir J. C. Schielen, Jesper Pilmeyer, Albert P. Aldenkamp, Danny Ruijters, Svitlana Zinger",
+ "update_time": "2024-12-18",
+ "abstract": "Functional magnetic resonance imaging (fMRI) has become instrumental in researching brain function. One application of fMRI is investigating potential neural features that distinguish people with autism spectrum disorder (ASD) from healthy controls. The Autism Brain Imaging Data Exchange (ABIDE) facilitates this research through its extensive data-sharing initiative. While ABIDE offers data preprocessed with various atlases, independent component analysis (ICA) for dimensionality reduction remains underutilized. We address this gap by presenting ICA-based resting-state networks (RSNs) from preprocessed scans from ABIDE, now publicly available: https://github.com/SjirSchielen/groupICAonABIDE. These RSNs unveil neural activation clusters without atlas constraints, offering a perspective on ASD analyses that complements the predominantly atlas-based literature. This contribution provides a valuable resource for further research into ASD, potentially aiding in developing new analytical approaches.",
+ "code_url": "https://github.com/sjirschielen/groupicaonabide"
+ },
+ "2412.13237v1": {
+ "title": "Optimized two-stage AI-based Neural Decoding for Enhanced Visual Stimulus Reconstruction from fMRI Data",
+ "url": "http://arxiv.org/abs/2412.13237v1",
+ "authors": "Lorenzo Veronese, Andrea Moglia, Luca Mainardi, Pietro Cerveri",
+ "update_time": "2024-12-17",
+ "abstract": "AI-based neural decoding reconstructs visual perception by leveraging generative models to map brain activity, measured through functional MRI (fMRI), into latent hierarchical representations. Traditionally, ridge linear models transform fMRI into a latent space, which is then decoded using latent diffusion models (LDM) via a pre-trained variational autoencoder (VAE). Due to the complexity and noisiness of fMRI data, newer approaches split the reconstruction into two sequential steps, the first one providing a rough visual approximation, the second one improving the stimulus prediction via LDM endowed by CLIP embeddings. This work proposes a non-linear deep network to improve fMRI latent space representation, optimizing the dimensionality alike. Experiments on the Natural Scenes Dataset showed that the proposed architecture improved the structural similarity of the reconstructed image by about 2\\% with respect to the state-of-the-art model, based on ridge linear transform. The reconstructed image's semantics improved by about 4\\%, measured by perceptual similarity, with respect to the state-of-the-art. The noise sensitivity analysis of the LDM showed that the role of the first stage was fundamental to predict the stimulus featuring high structural similarity. Conversely, providing a large noise stimulus affected less the semantics of the predicted stimulus, while the structural similarity between the ground truth and predicted stimulus was very poor. The findings underscore the importance of leveraging non-linear relationships between BOLD signal and the latent representation and two-stage generative AI for optimizing the fidelity of reconstructed visual stimuli from noisy fMRI data."
+ },
+ "2412.10161v1": {
+ "title": "Data Integration with Fusion Searchlight: Classifying Brain States from Resting-state fMRI",
+ "url": "http://arxiv.org/abs/2412.10161v1",
+ "authors": "Simon Wein, Marco Riebel, Lisa-Marie Brunner, Caroline Nothdurfter, Rainer Rupprecht, Jens V. Schwarzbach",
+ "update_time": "2024-12-13",
+ "abstract": "Spontaneous neural activity observed in resting-state fMRI is characterized by complex spatio-temporal dynamics. Different measures related to local and global brain connectivity and fluctuations in low-frequency amplitudes can quantify individual aspects of these neural dynamics. Even though such measures are derived from the same functional signals, they are often evaluated separately, neglecting their interrelations and potentially reducing the analysis sensitivity. In our study, we present a fusion searchlight (FuSL) framework to combine the complementary information contained in different resting-state fMRI metrics and demonstrate how this can improve the decoding of brain states. Moreover, we show how explainable AI allows us to reconstruct the differential impact of each metric on the decoding, which additionally increases spatial specificity of searchlight analysis. In general, this framework can be adapted to combine information derived from different imaging modalities or experimental conditions, offering a versatile and interpretable tool for data fusion in neuroimaging.",
+ "code_url": "https://github.com/simonvino/fusl"
+ },
+ "2412.09554v1": {
+ "title": "Network Dynamics of Emotional Processing: A Structural Balance Theory Approach",
+ "url": "http://arxiv.org/abs/2412.09554v1",
+ "authors": "Sepehr Gourabi, Parinaz Khosravani, Shahrzad Nosrat, Roya Mohammadi, Masoud Lotfalipour",
+ "update_time": "2024-12-12",
+ "abstract": "Understanding emotional processing in the human brain requires examining the complex interactions between different brain regions. While previous studies have identified specific regions involved in emotion processing, a holistic network approach may provide deeper insights. We use Structural Balance Theory to investigate the stability and triadic structures of signed brain networks during resting state and emotional processing, specifically in response to fear-related stimuli. We hypothesized that imbalanced triadic interactions would be more prevalent during emotional processing, especially in response to fear-related stimuli, potentially reflecting the brain's adaptation to emotional challenges. By analyzing fMRI data from 138 healthy, right-handed participants, we found that emotional processing was marked by an increase in positive connections and a decrease in negative connections compared to the resting state. Our findings clearly show that balanced triads significantly decreased while imbalanced triads increased, indicating a shift toward instability in the brain's functional network during emotional processing. Additionally, the number of influential hubs was significantly lower during fear processing than in neutral conditions, suggesting a more centralized network and higher levels of network energy. These findings reveal the brain's remarkable adaptive capacity during emotional processing, demonstrating how network stability dynamically shifts through changes in balanced and imbalanced triads, hub tendencies, and energy dynamics. Our research illuminates a complex mechanism by which the brain flexibly reconfigures its functional network in response to emotional stimuli with potential implications for understanding emotional resilience and neurological disorders."
+ },
+ "2412.12149v1": {
+ "title": "MHSA: A Multi-scale Hypergraph Network for Mild Cognitive Impairment Detection via Synchronous and Attentive Fusion",
+ "url": "http://arxiv.org/abs/2412.12149v1",
+ "authors": "Manman Yuan, Weiming Jia, Xiong Luo, Jiazhen Ye, Peican Zhu, Junlin Li",
+ "update_time": "2024-12-11",
+ "abstract": "The precise detection of mild cognitive impairment (MCI) is of significant importance in preventing the deterioration of patients in a timely manner. Although hypergraphs have enhanced performance by learning and analyzing brain networks, they often only depend on vector distances between features at a single scale to infer interactions. In this paper, we deal with a more arduous challenge, hypergraph modelling with synchronization between brain regions, and design a novel framework, i.e., A Multi-scale Hypergraph Network for MCI Detection via Synchronous and Attentive Fusion (MHSA), to tackle this challenge. Specifically, our approach employs the Phase-Locking Value (PLV) to calculate the phase synchronization relationship in the spectrum domain of regions of interest (ROIs) and designs a multi-scale feature fusion mechanism to integrate dynamic connectivity features of functional magnetic resonance imaging (fMRI) from both the temporal and spectrum domains. To evaluate and optimize the direct contribution of each ROI to phase synchronization in the temporal domain, we design a dynamic adjustment strategy for the PLV coefficients, and the dynamic hypergraph is modelled based on a comprehensive temporal-spectrum fusion matrix. Experiments on the real-world dataset indicate the effectiveness of our strategy. The code is available at https://github.com/Jia-Weiming/MHSA."
+ },
+ "2412.05632v1": {
+ "title": "Biological Brain Age Estimation using Sex-Aware Adversarial Variational Autoencoder with Multimodal Neuroimages",
+ "url": "http://arxiv.org/abs/2412.05632v1",
+ "authors": "Abd Ur Rehman, Azka Rehman, Muhammad Usman, Abdullah Shahid, Sung-Min Gho, Aleum Lee, Tariq M. Khan, Imran Razzak",
+ "update_time": "2024-12-07",
+ "abstract": "Brain aging involves structural and functional changes and therefore serves as a key biomarker for brain health. Combining structural magnetic resonance imaging (sMRI) and functional magnetic resonance imaging (fMRI) has the potential to improve brain age estimation by leveraging complementary data. However, fMRI data, being noisier than sMRI, complicates multimodal fusion. Traditional fusion methods often introduce more noise than useful information, which can reduce accuracy compared to using sMRI alone. In this paper, we propose a novel multimodal framework for biological brain age estimation, utilizing a sex-aware adversarial variational autoencoder (SA-AVAE). Our framework integrates adversarial and variational learning to effectively disentangle the latent features from both modalities. Specifically, we decompose the latent space into modality-specific codes and shared codes to represent complementary and common information across modalities, respectively. To enhance the disentanglement, we introduce cross-reconstruction and shared-distinct distance ratio loss as regularization terms. Importantly, we incorporate sex information into the learned latent code, enabling the model to capture sex-specific aging patterns for brain age estimation via an integrated regressor module. We evaluate our model using the publicly available OpenBHB dataset, a comprehensive multi-site dataset for brain age estimation. The results from ablation studies and comparisons with state-of-the-art methods demonstrate that our framework outperforms existing approaches and shows significant robustness across various age groups, highlighting its potential for real-time clinical applications in the early detection of neurodegenerative diseases."
+ },
+ "2412.04936v1": {
+ "title": "Probing the contents of semantic representations from text, behavior, and brain data using the psychNorms metabase",
+ "url": "http://arxiv.org/abs/2412.04936v1",
+ "authors": "Zak Hussain, Rui Mata, Ben R. Newell, Dirk U. Wulff",
+ "update_time": "2024-12-06",
+ "abstract": "Semantic representations are integral to natural language processing, psycholinguistics, and artificial intelligence. Although often derived from internet text, recent years have seen a rise in the popularity of behavior-based (e.g., free associations) and brain-based (e.g., fMRI) representations, which promise improvements in our ability to measure and model human representations. We carry out the first systematic evaluation of the similarities and differences between semantic representations derived from text, behavior, and brain data. Using representational similarity analysis, we show that word vectors derived from behavior and brain data encode information that differs from their text-derived cousins. Furthermore, drawing on our psychNorms metabase, alongside an interpretability method that we call representational content analysis, we find that, in particular, behavior representations capture unique variance on certain affective, agentic, and socio-moral dimensions. We thus establish behavior as an important complement to text for capturing human representations and behavior. These results are broadly relevant to research aimed at learning human-aligned semantic representations, including work on evaluating and aligning large language models.",
+ "code_url": "https://github.com/zak-hussain/psychnorms"
+ },
+ "2412.02945v1": {
+ "title": "Detection of Multiple Influential Observations on Model Selection",
+ "url": "http://arxiv.org/abs/2412.02945v1",
+ "authors": "Dongliang Zhang, Masoud Asgharian, Martin A. Lindquist",
+ "update_time": "2024-12-04",
+ "abstract": "Outlying observations are frequently encountered in a wide spectrum of scientific domains, posing significant challenges for the generalizability of statistical models and the reproducibility of downstream analysis. These observations can be identified through influential diagnosis, which refers to the detection of observations that are unduly influential on diverse facets of statistical inference. To date, methods for identifying observations influencing the choice of a stochastically selected submodel have been underdeveloped, especially in the high-dimensional setting where the number of predictors p exceeds the sample size n. Recently we proposed an improved diagnostic measure to handle this setting. However, its distributional properties and approximations have not yet been explored. To address this shortcoming, the notion of exchangeability is revived, and used to determine the exact finite- and large-sample distributions of our assessment metric. This forms the foundation for the introduction of both parametric and non-parametric approaches for its approximation and the establishment of thresholds for diagnosis. The resulting framework is extended to logistic regression models, followed by a simulation study conducted to assess the performance of various detection procedures. Finally the framework is applied to data from an fMRI study of thermal pain, with the goal of identifying outlying subjects that could distort the formulation of statistical models using functional brain activity in predicting physical pain ratings. Both linear and logistic regression models are used to demonstrate the benefits of detection and compare the performances of different detection procedures. In particular, two additional influential observations are identified, which are not discovered by previous studies."
+ },
+ "2412.02424v1": {
+ "title": "Hierarchical feature extraction on functional brain networks for autism spectrum disorder identification with resting-state fMRI data",
+ "url": "http://arxiv.org/abs/2412.02424v1",
+ "authors": "Yiqian Luo, Qiurong Chen, Fali Li, Liang Yi, Peng Xu, Yangsong Zhang",
+ "update_time": "2024-12-03",
+ "abstract": "Autism spectrum disorder (ASD) is a pervasive developmental disorder of the central nervous system, which occurs most frequently in childhood and is characterized by unusual and repetitive ritualistic behaviors. Currently, diagnostic methods primarily rely on questionnaire surveys and behavioral observation, which may lead to misdiagnoses due to the subjective evaluation and measurement used in these traditional methods. With the advancement in medical imaging, MR imaging-based diagnosis has become an alternative and more objective approach. In this paper, we propose a Hybrid neural Network model for ASD identification, termed ASD-HNet, to hierarchically extract features on the functional brain networks based on resting-state functional magnetic resonance imaging data. This hierarchical method can better extract brain representations, improve the diagnostic accuracy, and help us better locate brain regions related to ASD. Specifically, features are extracted from three scales: local regions of interest (ROIs) scale, community-clustering scale, and the whole-communities scale. For the ROI scale, graph convolution is used to transfer features between ROIs. At the community cluster scale, functional gradients are introduced, the clustering algorithm K-Means is used to automatically cluster ROIs with similar functional gradients into several communities, and features of ROIs belonging to the same community are extracted to characterize these communities. At global information integration scale, we extract global features from community-scale brain networks to characterize the whole brain networks. We validate the effectiveness of our method using the public dataset of Autism Brain Imaging Data Exchange I (ABIDE I), and elucidate the interpretability of the method. Experimental results demonstrate that the proposed ASD-HNet can yield superior performance than compared methods."
+ }
+ },
+ "MEG": {
+ "2412.08896v1": {
+ "title": "LV-CadeNet: Long View Feature Convolution-Attention Fusion Encoder-Decoder Network for Clinical MEG Spike Detection",
+ "url": "http://arxiv.org/abs/2412.08896v1",
+ "authors": "Kuntao Xiao, Xiongfei Wang, Pengfei Teng, Yi Sun, Wanli Yang, Liang Zhang, Hanyang Dong, Guoming Luan, Shurong Sheng",
+ "update_time": "2024-12-12",
+ "abstract": "It is widely acknowledged that the epileptic foci can be pinpointed by source localizing interictal epileptic discharges (IEDs) via Magnetoencephalography (MEG). However, manual detection of IEDs, which appear as spikes in MEG data, is extremely labor intensive and requires considerable professional expertise, limiting the broader adoption of MEG technology. Numerous studies have focused on automatic detection of MEG spikes to overcome this challenge, but these efforts often validate their models on synthetic datasets with balanced positive and negative samples. In contrast, clinical MEG data is highly imbalanced, raising doubts on the real-world efficacy of these models. To address this issue, we introduce LV-CadeNet, a Long View feature Convolution-Attention fusion Encoder-Decoder Network, designed for automatic MEG spike detection in real-world clinical scenarios. Beyond addressing the disparity between training data distribution and clinical test data through semi-supervised learning, our approach also mimics human specialists by constructing long view morphological input data. Moreover, we propose an advanced convolution-attention module to extract temporal and spatial features from the input data. LV-CadeNet significantly improves the accuracy of MEG spike detection, boosting it from 42.31\\% to 54.88\\% on a novel clinical dataset sourced from Sanbo Brain Hospital Capital Medical University. This dataset, characterized by a highly imbalanced distribution of positive and negative samples, accurately represents real-world clinical scenarios."
+ },
+ "2412.04758v1": {
+ "title": "Measuring Goal-Directedness",
+ "url": "http://arxiv.org/abs/2412.04758v1",
+ "authors": "Matt MacDermott, James Fox, Francesco Belardinelli, Tom Everitt",
+ "update_time": "2024-12-06",
+ "abstract": "We define maximum entropy goal-directedness (MEG), a formal measure of goal-directedness in causal models and Markov decision processes, and give algorithms for computing it. Measuring goal-directedness is important, as it is a critical element of many concerns about harm from AI. It is also of philosophical interest, as goal-directedness is a key aspect of agency. MEG is based on an adaptation of the maximum causal entropy framework used in inverse reinforcement learning. It can measure goal-directedness with respect to a known utility function, a hypothesis class of utility functions, or a set of random variables. We prove that MEG satisfies several desiderata and demonstrate our algorithms with small-scale experiments."
+ },
+ "2411.19840v1": {
+ "title": "Neuroplasticity and Psychedelics: a comprehensive examination of classic and non-classic compounds in pre and clinical models",
+ "url": "http://arxiv.org/abs/2411.19840v1",
+ "authors": "Claudio Agnorelli, Meg Spriggs, Kate Godfrey, Gabriela Sawicka, Bettina Bohl, Hannah Douglass, Andrea Fagiolini, Hashemi Parastoo, Robin Carhart-Harris, David Nutt, David Erritzoe",
+ "update_time": "2024-11-29",
+ "abstract": "Neuroplasticity, the ability of the nervous system to adapt throughout an organism's lifespan, offers potential as both a biomarker and treatment target for neuropsychiatric conditions. Psychedelics, a burgeoning category of drugs, are increasingly prominent in psychiatric research, prompting inquiries into their mechanisms of action. Distinguishing themselves from traditional medications, psychedelics demonstrate rapid and enduring therapeutic effects after a single or few administrations, believed to stem from their neuroplasticity-enhancing properties. This review examines how classic psychedelics (e.g., LSD, psilocybin, N,N-DMT) and non-classic psychedelics (e.g., ketamine, MDMA) influence neuroplasticity. Drawing from preclinical and clinical studies, we explore the molecular, structural, and functional changes triggered by these agents. Animal studies suggest psychedelics induce heightened sensitivity of the nervous system to environmental stimuli (meta-plasticity), re-opening developmental windows for long-term structural changes (hyper-plasticity), with implications for mood and behavior. Translating these findings to humans faces challenges due to limitations in current imaging techniques. Nonetheless, promising new directions for human research are emerging, including the employment of novel positron-emission tomography (PET) radioligands, non-invasive brain stimulation methods, and multimodal approaches. By elucidating the interplay between psychedelics and neuroplasticity, this review informs the development of targeted interventions for neuropsychiatric disorders and advances understanding of psychedelics' therapeutic potential."
+ },
+ "2411.19800v1": {
+ "title": "On Monitoring Edge-Geodetic Sets of Dynamic Graph",
+ "url": "http://arxiv.org/abs/2411.19800v1",
+ "authors": "Zin Mar Myint, Ashish Saxena",
+ "update_time": "2024-11-29",
+ "abstract": "The concept of a monitoring edge-geodetic set (MEG-set) in a graph $G$, denoted $MEG(G)$, refers to a subset of vertices $MEG(G)\\subseteq V(G)$ such that every edge $e$ in $G$ is monitored by some pair of vertices $ u, v \\in MEG(G)$, where $e$ lies on all shortest paths between $u$ and $v$. The minimum number of vertices required to form such a set is called the monitoring edge-geodetic number, denoted $meg(G)$. The primary motivation for studying $MEG$-sets in previous works arises from scenarios in which certain edges are removed from $G$. In these cases, the vertices of the $MEG$-set are responsible for detecting these deletions. Such detection is crucial for identifying which edges have been removed from $G$ and need to be repaired. In real life, repairing these edges may be costly, or sometimes it is impossible to repair edges. In this case, the original $MEG$-set may no longer be effective in monitoring the modified graph. This highlights the importance of reassessing and adapting the $MEG$-set after edge deletions. This work investigates the monitoring edge-geodetic properties of graphs, focusing on how the removal of $k$ edges affects the structure of a graph and influences its monitoring capabilities. Specifically, we explore how the monitoring edge-geodetic number $meg(G)$ changes when $k$ edges are removed. The study aims to compare the monitoring properties of the original graph with those of the modified graph and to understand the impact of edge deletions."
+ },
+ "2411.09723v1": {
+ "title": "Towards Neural Foundation Models for Vision: Aligning EEG, MEG, and fMRI Representations for Decoding, Encoding, and Modality Conversion",
+ "url": "http://arxiv.org/abs/2411.09723v1",
+ "authors": "Matteo Ferrante, Tommaso Boccato, Grigorii Rashkov, Nicola Toschi",
+ "update_time": "2024-11-14",
+ "abstract": "This paper presents a novel approach towards creating a foundational model for aligning neural data and visual stimuli across multimodal representations of brain activity by leveraging contrastive learning. We used electroencephalography (EEG), magnetoencephalography (MEG), and functional magnetic resonance imaging (fMRI) data. Our framework's capabilities are demonstrated through three key experiments: decoding visual information from neural data, encoding images into neural representations, and converting between neural modalities. The results highlight the model's ability to accurately capture semantic information across different brain imaging techniques, illustrating its potential in decoding, encoding, and modality conversion tasks."
+ },
+ "2411.07994v1": {
+ "title": "Search for the X17 particle in $^{7}\\mathrm{Li}(\\mathrm{p},\\mathrm{e}^+ \\mathrm{e}^{-}) ^{8}\\mathrm{Be}$ processes with the MEG II detector",
+ "url": "http://arxiv.org/abs/2411.07994v1",
+ "authors": "The MEG II collaboration, K. Afanaciev, A. M. Baldini, S. Ban, H. Benmansour, G. Boca, P. W. Cattaneo, G. Cavoto, F. Cei, M. Chiappini, A. Corvaglia, G. Dal Maso, A. De Bari, M. De Gerone, L. Ferrari Barusso, M. Francesconi, L. Galli, G. Gallucci, F. Gatti, L. Gerritzen, F. Grancagnolo, E. G. Grandoni, M. Grassi, D. N. Grigoriev, M. Hildebrandt, F. Ignatov, F. Ikeda, T. Iwamoto, S. Karpov, P. -R. Kettle, N. Khomutov, A. Kolesnikov, N. Kravchuk, V. Krylov, N. Kuchinskiy, F. Leonetti, W. Li, V. Malyshev, A. Matsushita, M. Meucci, S. Mihara, W. Molzon, T. Mori, D. Nicol\u00f2, H. Nishiguchi, A. Ochi, W. Ootani, A. Oya, D. Palo, M. Panareo, A. Papa, V. Pettinacci, A. Popov, F. Renga, S. Ritt, M. Rossella, A. Rozhdestvensky, S. Scarpellini, P. Schwendimann, G. Signorelli, M. Takahashi, Y. Uchiyama, A. Venturini, B. Vitali, C. Voena, K. Yamamoto, R. Yokota, T. Yonemoto",
+ "update_time": "2024-11-12",
+ "abstract": "The observation of a resonance structure in the opening angle of the electron-positron pairs in the $^{7}$Li(p,e$^{+}$e$^{-}$) $^{8}$Be reaction was claimed and interpreted as the production and subsequent decay of a hypothetical particle (X17). Similar excesses, consistent with this particle, were later observed in processes involving $^{4}$He and $^{12}$C nuclei with the same experimental technique. The MEG II apparatus at PSI, designed to search for the $\\mu^+ \\rightarrow \\mathrm{e}^+ \\gamma$ decay, can be exploited to investigate the existence of this particle and study its nature. Protons from a Cockcroft-Walton accelerator, with an energy up to 1.1 MeV, were delivered on a dedicated Li-based target. The $\\gamma$ and the e$^{+}$e$^{-}$ pair emerging from the $^8\\mathrm{Be}^*$ transitions were studied with calorimeters and a spectrometer, featuring a broader angular acceptance than previous experiments. We present in this paper the analysis of a four-week data-taking in 2023 with a beam energy of 1080 keV, resulting in the excitation of two different resonances with Q-values of 17.6 MeV and 18.1 MeV. No significant signal was found, and limits at 90% C.L. on the branching ratios (relative to the $\\gamma$ emission) of the two resonances to X17 were set, $R_{17.6} < 1.8 \\times 10^{-6} $ and $R_{18.1} < 1.2 \\times 10^{-5} $."
+ },
+ "2411.03883v2": {
+ "title": "MEG: Medical Knowledge-Augmented Large Language Models for Question Answering",
+ "url": "http://arxiv.org/abs/2411.03883v2",
+ "authors": "Laura Cabello, Carmen Martin-Turrero, Uchenna Akujuobi, Anders S\u00f8gaard, Carlos Bobed",
+ "update_time": "2024-11-07",
+ "abstract": "Question answering is a natural language understanding task that involves reasoning over both explicit context and unstated, relevant domain knowledge. Large language models (LLMs), which underpin most contemporary question answering systems, struggle to induce how concepts relate in specialized domains such as medicine. Existing medical LLMs are also costly to train. In this work, we present MEG, a parameter-efficient approach for medical knowledge-augmented LLMs. MEG uses a lightweight mapping network to integrate graph embeddings into the LLM, enabling it to leverage external knowledge in a cost-effective way. We evaluate our method on four popular medical multiple-choice datasets and show that LLMs greatly benefit from the factual grounding provided by knowledge graph embeddings. MEG attains an average of +10.2% accuracy over the Mistral-Instruct baseline, and +6.7% over specialized models like BioMistral. We also show results based on Llama-3. Finally, we show that MEG's performance remains robust to the choice of graph encoder.",
+ "code_url": "https://github.com/lautel/meg"
+ },
+ "2410.23386v1": {
+ "title": "STIED: A deep learning model for the SpatioTemporal detection of focal Interictal Epileptiform Discharges with MEG",
+ "url": "http://arxiv.org/abs/2410.23386v1",
+ "authors": "Raquel Fern\u00e1ndez-Mart\u00edn, Alfonso Gij\u00f3n, Odile Feys, Elodie Juven\u00e9, Alec Aeby, Charline Urbain, Xavier De Ti\u00e8ge, Vincent Wens",
+ "update_time": "2024-10-30",
+ "abstract": "Magnetoencephalography (MEG) allows the non-invasive detection of interictal epileptiform discharges (IEDs). Clinical MEG analysis in epileptic patients traditionally relies on the visual identification of IEDs, which is time consuming and partially subjective. Automatic, data-driven detection methods exist but show limited performance. Still, the rise of deep learning (DL), with its ability to reproduce human-like abilities, could revolutionize clinical MEG practice. Here, we developed and validated STIED, a simple yet powerful supervised DL algorithm combining two convolutional neural networks with temporal (1D time-course) and spatial (2D topography) features of MEG signals inspired by current clinical guidelines. Our DL model enabled both temporal and spatial localization of IEDs in patients suffering from focal epilepsy with frequent and high amplitude spikes (FE group), with high performance metrics (accuracy, specificity, and sensitivity all exceeding 85%) when learning from spatiotemporal features of IEDs. This performance can be attributed to our handling of input data, which mimics established clinical MEG practice. Reverse engineering further revealed that STIED encodes fine spatiotemporal features of IEDs rather than their mere amplitude. The model trained on the FE group also showed promising results when applied to a separate group of presurgical patients with different types of refractory focal epilepsy, though further work is needed to distinguish IEDs from physiological transients. This study paves the way for incorporating STIED and DL algorithms into the routine clinical MEG evaluation of epilepsy."
+ },
+ "2410.20916v1": {
+ "title": "NeuGPT: Unified multi-modal Neural GPT",
+ "url": "http://arxiv.org/abs/2410.20916v1",
+ "authors": "Yiqian Yang, Yiqun Duan, Hyejeong Jo, Qiang Zhang, Renjing Xu, Oiwi Parker Jones, Xuming Hu, Chin-teng Lin, Hui Xiong",
+ "update_time": "2024-10-28",
+ "abstract": "This paper introduces NeuGPT, a groundbreaking multi-modal language generation model designed to harmonize the fragmented landscape of neural recording research. Traditionally, studies in the field have been compartmentalized by signal type, with EEG, MEG, ECoG, SEEG, fMRI, and fNIRS data being analyzed in isolation. Recognizing the untapped potential for cross-pollination and the adaptability of neural signals across varying experimental conditions, we set out to develop a unified model capable of interfacing with multiple modalities. Drawing inspiration from the success of pre-trained large models in NLP, computer vision, and speech processing, NeuGPT is architected to process a diverse array of neural recordings and interact with speech and text data. Our model mainly focuses on brain-to-text decoding, improving SOTA from 6.94 to 12.92 on BLEU-1 and 6.93 to 13.06 on ROUGE-1F. It can also simulate brain signals, thereby serving as a novel neural interface. Code is available at https://github.com/NeuSpeech/NeuGPT.",
+ "code_url": "https://github.com/neuspeech/neugpt"
+ },
+ "2410.19986v1": {
+ "title": "Resolving Domain Shift For Representations Of Speech In Non-Invasive Brain Recordings",
+ "url": "http://arxiv.org/abs/2410.19986v1",
+ "authors": "Jeremiah Ridge, Oiwi Parker Jones",
+ "update_time": "2024-10-25",
+ "abstract": "Machine learning techniques have enabled researchers to leverage neuroimaging data to decode speech from brain activity, with some amazing recent successes achieved by applications built using invasive devices. However, research requiring surgical implants has a number of practical limitations. Non-invasive neuroimaging techniques provide an alternative but come with their own set of challenges, the limited scale of individual studies being among them. Without the ability to pool the recordings from different non-invasive studies, data on the order of magnitude needed to leverage deep learning techniques to their full potential remains out of reach. In this work, we focus on non-invasive data collected using magnetoencephalography (MEG). We leverage two different, leading speech decoding models to investigate how an adversarial domain adaptation framework augments their ability to generalize across datasets. We successfully improve the performance of both models when training across multiple datasets. To the best of our knowledge, this study is the first ever application of feature-level, deep learning based harmonization for MEG neuroimaging data. Our analysis additionally offers further evidence of the impact of demographic features on neuroimaging data, demonstrating that participant age strongly affects how machine learning models solve speech decoding tasks using MEG data. Lastly, in the course of this study we produce a new open-source implementation of one of these models to the benefit of the broader scientific community."
+ }
+ },
+ "neuroAI": {
+ "2411.18526v1": {
+ "title": "NeuroAI for AI Safety",
+ "url": "http://arxiv.org/abs/2411.18526v1",
+ "authors": "Patrick Mineault, Niccol\u00f2 Zanichelli, Joanne Zichen Peng, Anton Arkhipov, Eli Bingham, Julian Jara-Ettinger, Emily Mackevicius, Adam Marblestone, Marcelo Mattar, Andrew Payne, Sophia Sanborn, Karen Schroeder, Zenna Tavares, Andreas Tolias",
+ "update_time": "2024-11-27",
+ "abstract": "As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain's representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety."
+ },
+ "2411.14633v1": {
+ "title": "Evaluating Representational Similarity Measures from the Lens of Functional Correspondence",
+ "url": "http://arxiv.org/abs/2411.14633v1",
+ "authors": "Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla",
+ "update_time": "2024-11-21",
+ "abstract": "Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundant classes of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research."
+ },
+ "2410.19315v1": {
+ "title": "A prescriptive theory for brain-like inference",
+ "url": "http://arxiv.org/abs/2410.19315v1",
+ "authors": "Hadi Vafaii, Dekel Galor, Jacob L. Yates",
+ "update_time": "2024-10-25",
+ "abstract": "The Evidence Lower Bound (ELBO) is a widely used objective for training deep generative models, such as Variational Autoencoders (VAEs). In the neuroscience literature, an identical objective is known as the variational free energy, hinting at a potential unified framework for brain function and machine learning. Despite its utility in interpreting generative models, including diffusion models, ELBO maximization is often seen as too broad to offer prescriptive guidance for specific architectures in neuroscience or machine learning. In this work, we show that maximizing ELBO under Poisson assumptions for general sequence data leads to a spiking neural network that performs Bayesian posterior inference through its membrane potential dynamics. The resulting model, the iterative Poisson VAE (iP-VAE), has a closer connection to biological neurons than previous brain-inspired predictive coding models based on Gaussian assumptions. Compared to amortized and iterative VAEs, iP-VAE learns sparser representations and exhibits superior generalization to out-of-distribution samples. These findings suggest that optimizing ELBO, combined with Poisson assumptions, provides a solid foundation for developing prescriptive theories in NeuroAI."
+ },
+ "2409.05771v1": {
+ "title": "Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models",
+ "url": "http://arxiv.org/abs/2409.05771v1",
+ "authors": "Emily Cheng, Richard J. Antonello",
+ "update_time": "2024-09-09",
+ "abstract": "Research has repeatedly demonstrated that intermediate hidden states extracted from large language models are able to predict measured brain response to natural language stimuli. Yet, very little is known about the representation properties that enable this high prediction performance. Why is it the intermediate layers, and not the output layers, that are most capable for this unique and highly general transfer task? In this work, we show that evidence from language encoding models in fMRI supports the existence of a two-phase abstraction process within LLMs. We use manifold learning methods to show that this abstraction process naturally arises over the course of training a language model and that the first \"composition\" phase of this abstraction process is compressed into fewer layers as training continues. Finally, we demonstrate a strong correspondence between layerwise encoding performance and the intrinsic dimensionality of representations from LLMs. We give initial evidence that this correspondence primarily derives from the inherent compositionality of LLMs and not their next-word prediction properties."
+ },
+ "2407.04117v2": {
+ "title": "Predictive Coding Networks and Inference Learning: Tutorial and Survey",
+ "url": "http://arxiv.org/abs/2407.04117v2",
+ "authors": "Bj\u00f6rn van Zwol, Ro Jefferson, Egon L. van den Broek",
+ "update_time": "2024-07-22",
+ "abstract": "Recent years have witnessed a growing call for renewed emphasis on neuroscience-inspired approaches in artificial intelligence research, under the banner of NeuroAI. A prime example of this is predictive coding networks (PCNs), based on the neuroscientific framework of predictive coding. This framework views the brain as a hierarchical Bayesian inference model that minimizes prediction errors through feedback connections. Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm that explains patterns of neural activity that BP cannot. Historically, IL has been more computationally intensive, but recent advancements have demonstrated that it can achieve higher efficiency than BP with sufficient parallelization. Furthermore, PCNs can be mathematically considered a superset of traditional feedforward neural networks (FNNs), significantly extending the range of trainable architectures. As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling that goes beyond traditional artificial neural networks. This work provides a comprehensive review and detailed formal specification of PCNs, particularly situating them within the context of modern ML methods. Additionally, we introduce a Python library (PRECO) for practical implementation. This positions PC as a promising framework for future ML innovations."
+ },
+ "2306.10168v3": {
+ "title": "Beyond Geometry: Comparing the Temporal Structure of Computation in Neural Circuits with Dynamical Similarity Analysis",
+ "url": "http://arxiv.org/abs/2306.10168v3",
+ "authors": "Mitchell Ostrow, Adam Eisen, Leo Kozachkov, Ila Fiete",
+ "update_time": "2023-10-29",
+ "abstract": "How can we tell whether two neural networks utilize the same internal processes for a particular computation? This question is pertinent for multiple subfields of neuroscience and machine learning, including neuroAI, mechanistic interpretability, and brain-machine interfaces. Standard approaches for comparing neural networks focus on the spatial geometry of latent states. Yet in recurrent networks, computations are implemented at the level of dynamics, and two networks performing the same computation with equivalent dynamics need not exhibit the same geometry. To bridge this gap, we introduce a novel similarity metric that compares two systems at the level of their dynamics, called Dynamical Similarity Analysis (DSA). Our method incorporates two components: Using recent advances in data-driven dynamical systems theory, we learn a high-dimensional linear system that accurately captures core features of the original nonlinear dynamics. Next, we compare different systems passed through this embedding using a novel extension of Procrustes Analysis that accounts for how vector fields change under orthogonal transformation. In four case studies, we demonstrate that our method disentangles conjugate and non-conjugate recurrent neural networks (RNNs), while geometric methods fall short. We additionally show that our method can distinguish learning rules in an unsupervised manner. Our method opens the door to comparative analyses of the essential temporal structure of computation in neural circuits.",
+ "code_url": "https://github.com/mitchellostrow/dsa"
+ },
+ "2305.11275v2": {
+ "title": "Explaining V1 Properties with a Biologically Constrained Deep Learning Architecture",
+ "url": "http://arxiv.org/abs/2305.11275v2",
+ "authors": "Galen Pogoncheff, Jacob Granley, Michael Beyeler",
+ "update_time": "2023-05-25",
+ "abstract": "Convolutional neural networks (CNNs) have recently emerged as promising models of the ventral visual stream, despite their lack of biological specificity. While current state-of-the-art models of the primary visual cortex (V1) have surfaced from training with adversarial examples and extensively augmented data, these models are still unable to explain key neural properties observed in V1 that arise from biological circuitry. To address this gap, we systematically incorporated neuroscience-derived architectural components into CNNs to identify a set of mechanisms and architectures that comprehensively explain neural activity in V1. We show drastic improvements in model-V1 alignment driven by the integration of architectural components that simulate center-surround antagonism, local receptive fields, tuned normalization, and cortical magnification. Upon enhancing task-driven CNNs with a collection of these specialized components, we uncover models with latent representations that yield state-of-the-art explanation of V1 neural activity and tuning properties. Our results highlight an important advancement in the field of NeuroAI, as we systematically establish a set of architectural components that contribute to unprecedented explanation of V1. The neuroscience insights that could be gleaned from increasingly accurate in-silico models of the brain have the potential to greatly advance the fields of both neuroscience and artificial intelligence."
+ },
+ "2302.07243v4": {
+ "title": "A Deep Probabilistic Spatiotemporal Framework for Dynamic Graph Representation Learning with Application to Brain Disorder Identification",
+ "url": "http://arxiv.org/abs/2302.07243v4",
+ "authors": "Sin-Yee Yap, Junn Yong Loo, Chee-Ming Ting, Fuad Noman, Raphael C. -W. Phan, Adeel Razi, David L. Dowe",
+ "update_time": "2024-11-09",
+ "abstract": "Recent applications of pattern recognition techniques on brain connectome classification using functional connectivity (FC) are shifting towards acknowledging the non-Euclidean topology and dynamic aspects of brain connectivity across time. In this paper, a deep spatiotemporal variational Bayes (DSVB) framework is proposed to learn time-varying topological structures in dynamic FC networks for identifying autism spectrum disorder (ASD) in human participants. The framework incorporates a spatial-aware recurrent neural network with an attention-based message passing scheme to capture rich spatiotemporal patterns across dynamic FC networks. To overcome model overfitting on limited training datasets, an adversarial training strategy is introduced to learn graph embedding models that generalize well to unseen brain networks. Evaluation on the ABIDE resting-state functional magnetic resonance imaging dataset shows that our proposed framework substantially outperforms state-of-the-art methods in identifying patients with ASD. Dynamic FC analyses with DSVB-learned embeddings reveal apparent group differences between ASD and healthy controls in brain network connectivity patterns and switching dynamics of brain states. The code is available at https://github.com/Monash-NeuroAI/Deep-Spatiotemporal-Variational-Bayes.",
+ "code_url": "https://github.com/Monash-NeuroAI/Deep-Spatiotemporal-Variational-Bayes"
+ },
+ "2301.09245v2": {
+ "title": "Towards NeuroAI: Introducing Neuronal Diversity into Artificial Neural Networks",
+ "url": "http://arxiv.org/abs/2301.09245v2",
+ "authors": "Feng-Lei Fan, Yingxin Li, Hanchuan Peng, Tieyong Zeng, Fei Wang",
+ "update_time": "2023-03-11",
+ "abstract": "Throughout history, the development of artificial intelligence, particularly artificial neural networks, has been open to and constantly inspired by the increasingly deepened understanding of the brain, such as the inspiration of neocognitron, which is the pioneering work of convolutional neural networks. Per the motives of the emerging field of NeuroAI, a great amount of neuroscience knowledge can help catalyze the next generation of AI by endowing a network with more powerful capabilities. As we know, the human brain has numerous morphologically and functionally different neurons, while artificial neural networks are almost exclusively built on a single neuron type. In the human brain, neuronal diversity is an enabling factor for all kinds of biological intelligent behaviors. Since an artificial network is a miniature of the human brain, introducing neuronal diversity should be valuable in terms of addressing those essential problems of artificial networks such as efficiency, interpretability, and memory. In this Primer, we first discuss the preliminaries of biological neuronal diversity and the characteristics of information transmission and processing in a biological neuron. Then, we review studies of designing new neurons for artificial networks. Next, we discuss what gains neuronal diversity can bring to artificial networks and exemplary applications in several important fields. Lastly, we discuss the challenges and future directions of neuronal diversity to explore the potential of NeuroAI."
+ },
+ "2212.04401v1": {
+ "title": "A Rubric for Human-like Agents and NeuroAI",
+ "url": "http://arxiv.org/abs/2212.04401v1",
+ "authors": "Ida Momennejad",
+ "update_time": "2022-12-08",
+ "abstract": "Researchers across cognitive, neuro-, and computer sciences increasingly reference human-like artificial intelligence and neuroAI. However, the scope and use of the terms are often inconsistent. Contributed research ranges widely from mimicking behaviour, to testing machine learning methods as neurally plausible hypotheses at the cellular or functional levels, or solving engineering problems. However, it cannot be assumed nor expected that progress on one of these three goals will automatically translate to progress in others. Here a simple rubric is proposed to clarify the scope of individual contributions, grounded in their commitments to human-like behaviour, neural plausibility, or benchmark/engineering goals. This is clarified using examples of weak and strong neuroAI and human-like agents, and discussing the generative, corroborate, and corrective ways in which the three dimensions interact with one another. The author maintains that future progress in artificial intelligence will need strong interactions across the disciplines, with iterative feedback loops and meticulous validity tests, leading to both known and yet-unknown advances that may span decades to come."
+ }
+ },
+ "medical": {
+ "2412.16134v1": {
+ "title": "EF-Net: A Deep Learning Approach Combining Word Embeddings and Feature Fusion for Patient Disposition Analysis",
+ "url": "http://arxiv.org/abs/2412.16134v1",
+ "authors": "Nafisa Binte Feroz, Chandrima Sarker, Tanzima Ahsan, K M Arefeen Sultan, Raqeebir Rab",
+ "update_time": "2024-12-20",
+ "abstract": "One of the most urgent problems is the overcrowding in emergency departments (EDs), caused by an aging population and rising healthcare costs. Patient dispositions have become more complex as a result of the strain on hospital infrastructure and the scarcity of medical resources. Individuals with more dangerous health issues should be prioritized in the emergency room. Thus, our research aims to develop a prediction model for patient disposition using EF-Net. This model will incorporate categorical features into the neural network layer and add numerical features with the embedded categorical features. We combine the EF-Net and XGBoost models to attain higher accuracy in our results. The result is generated using the soft voting technique. In EF-Net, we attained an accuracy of 95.33%, whereas in the Ensemble Model, we achieved an accuracy of 96%. The experiment's analysis shows that EF-Net surpasses existing works in accuracy, AUROC, and F1-Score on the MIMIC-IV-ED dataset, demonstrating its potential as a scalable solution for patient disposition assessment. Our code is available at https://github.com/nafisa67/thesis",
+ "code_url": "https://github.com/nafisa67/thesis"
+ },
+ "2412.16118v1": {
+ "title": "Convolutional Deep Operator Networks for Learning Nonlinear Focused Ultrasound Wave Propagation in Heterogeneous Spinal Cord Anatomy",
+ "url": "http://arxiv.org/abs/2412.16118v1",
+ "authors": "Avisha Kumar, Xuzhe Zhi, Zan Ahmad, Minglang Yin, Amir Manbachi",
+ "update_time": "2024-12-20",
+ "abstract": "Focused ultrasound (FUS) therapy is a promising tool for optimally targeted treatment of spinal cord injuries (SCI), offering submillimeter precision to enhance blood flow at injury sites while minimizing impact on surrounding tissues. However, its efficacy is highly sensitive to the placement of the ultrasound source, as the spinal cord's complex geometry and acoustic heterogeneity distort and attenuate the FUS signal. Current approaches rely on computer simulations to solve the governing wave propagation equations and compute patient-specific pressure maps using ultrasound images of the spinal cord anatomy. While accurate, these high-fidelity simulations are computationally intensive, taking up to hours to complete parameter sweeps, which is impractical for real-time surgical decision-making. To address this bottleneck, we propose a convolutional deep operator network (DeepONet) to rapidly predict FUS pressure fields in patient spinal cords. Unlike conventional neural networks, DeepONets are well equipped to approximate the solution operator of the parametric partial differential equations (PDEs) that govern the behavior of FUS waves with varying initial and boundary conditions (i.e., new transducer locations or spinal cord geometries) without requiring extensive simulations. Trained on simulated pressure maps across diverse patient anatomies, this surrogate model achieves real-time predictions with only a 2% loss on the test set, significantly accelerating the modeling of nonlinear physical systems in heterogeneous domains. By facilitating rapid parameter sweeps in surgical settings, this work provides a crucial step toward precise and individualized solutions in neurosurgical treatments.",
+ "code_url": "https://github.com/avishakumar21/nonlinear-fus-with-neural-operators"
+ },
+ "2412.16086v1": {
+ "title": "Towards Interpretable Radiology Report Generation via Concept Bottlenecks using a Multi-Agentic RAG",
+ "url": "http://arxiv.org/abs/2412.16086v1",
+ "authors": "Hasan Md Tusfiqur Alam, Devansh Srivastav, Md Abdul Kadir, Daniel Sonntag",
+ "update_time": "2024-12-20",
+ "abstract": "Deep learning has advanced medical image classification, but interpretability challenges hinder its clinical adoption. This study enhances interpretability in Chest X-ray (CXR) classification by using concept bottleneck models (CBMs) and a multi-agent Retrieval-Augmented Generation (RAG) system for report generation. By modeling relationships between visual features and clinical concepts, we create interpretable concept vectors that guide a multi-agent RAG system to generate radiology reports, enhancing clinical relevance, explainability, and transparency. Evaluation of the generated reports using an LLM-as-a-judge confirmed the interpretability and clinical utility of our model's outputs. On the COVID-QU dataset, our model achieved 81% classification accuracy and demonstrated robust report generation performance, with five key metrics ranging between 84% and 90%. This interpretable multi-agent framework bridges the gap between high-performance AI and the explainability required for reliable AI-driven CXR analysis in clinical settings.",
+ "code_url": "https://github.com/tifat58/irr-with-cbm-rag"
+ },
+ "2412.16085v1": {
+ "title": "Efficient MedSAMs: Segment Anything in Medical Images on Laptop",
+ "url": "http://arxiv.org/abs/2412.16085v1",
+ "authors": "Jun Ma, Feifei Li, Sumin Kim, Reza Asakereh, Bao-Hiep Le, Dang-Khoa Nguyen-Vu, Alexander Pfefferle, Muxin Wei, Ruochen Gao, Donghang Lyu, Songxiao Yang, Lennart Purucker, Zdravko Marinov, Marius Staring, Haisheng Lu, Thuy Thanh Dao, Xincheng Ye, Zhi Li, Gianluca Brugnara, Philipp Vollmuth, Martha Foltyn-Dumitru, Jaeyoung Cho, Mustafa Ahmed Mahmutoglu, Martin Bendszus, Irada Pfl\u00fcger, Aditya Rastogi, Dong Ni, Xin Yang, Guang-Quan Zhou, Kaini Wang, Nicholas Heller, Nikolaos Papanikolopoulos, Christopher Weight, Yubing Tong, Jayaram K Udupa, Cahill J. Patrick, Yaqi Wang, Yifan Zhang, Francisco Contijoch, Elliot McVeigh, Xin Ye, Shucheng He, Robert Haase, Thomas Pinetz, Alexander Radbruch, Inga Krause, Erich Kobler, Jian He, Yucheng Tang, Haichun Yang, Yuankai Huo, Gongning Luo, Kaisar Kushibar, Jandos Amankulov, Dias Toleshbayev, Amangeldi Mukhamejan, Jan Egger, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Shohei Fujita, Tomohiro Kikuchi, Benedikt Wiestler, Jan S. Kirschke, Ezequiel de la Rosa, Federico Bolelli, Luca Lumetti, Costantino Grana, Kunpeng Xie, Guomin Wu, Behrus Puladi, Carlos Mart\u00edn-Isla, Karim Lekadir, Victor M. Campello, Wei Shao, Wayne Brisbane, Hongxu Jiang, Hao Wei, Wu Yuan, Shuangle Li, Yuyin Zhou, Bo Wang",
+ "update_time": "2024-12-20",
+ "abstract": "Promptable segmentation foundation models have emerged as a transformative approach to addressing the diverse needs in medical images, but most existing models require expensive computing, posing a big barrier to their adoption in clinical practice. In this work, we organized the first international competition dedicated to promptable medical image segmentation, featuring a large-scale dataset spanning nine common imaging modalities from over 20 different institutions. The top teams developed lightweight segmentation foundation models and implemented an efficient inference pipeline that substantially reduced computational requirements while maintaining state-of-the-art segmentation accuracy. Moreover, the post-challenge phase advanced the algorithms through the design of performance booster and reproducibility tasks, resulting in improved algorithms and validated reproducibility of the winning solution. Furthermore, the best-performing algorithms have been incorporated into the open-source software with a user-friendly interface to facilitate clinical adoption. The data and code are publicly available to foster the further development of medical image segmentation foundation models and pave the way for impactful real-world applications.",
+ "code_url": "https://github.com/bowang-lab/medsam"
+ },
+ "2412.16079v1": {
+ "title": "Fair Distributed Machine Learning with Imbalanced Data as a Stackelberg Evolutionary Game",
+ "url": "http://arxiv.org/abs/2412.16079v1",
+ "authors": "Sebastian Niehaus, Ingo Roeder, Nico Scherf",
+ "update_time": "2024-12-20",
+ "abstract": "Decentralised learning enables the training of deep learning algorithms without centralising data sets, resulting in benefits such as improved data privacy, operational efficiency and the fostering of data ownership policies. However, significant data imbalances pose a challenge in this framework. Participants with smaller datasets in distributed learning environments often achieve poorer results than participants with larger datasets. Data imbalances are particularly pronounced in medical fields and are caused by different patient populations, technological inequalities and divergent data collection practices. In this paper, we consider distributed learning as a Stackelberg evolutionary game. We present two algorithms for setting the weights of each node's contribution to the global model in each training round: the Deterministic Stackelberg Weighting Model (DSWM) and the Adaptive Stackelberg Weighting Model (ASWM). We use three medical datasets to highlight the impact of dynamic weighting on underrepresented nodes in distributed learning. Our results show that the ASWM significantly favours underrepresented nodes by improving their performance by 2.713% in AUC. Meanwhile, nodes with larger datasets experience only a modest average performance decrease of 0.441%."
+ },
+ "2412.16014v1": {
+ "title": "Probing lactate exchange and compartmentation in Gray Matter via time-dependent diffusion-weighted MRS",
+ "url": "http://arxiv.org/abs/2412.16014v1",
+ "authors": "Eloise Mougel, Marco Palombo, Julien Valette",
+ "update_time": "2024-12-20",
+ "abstract": "Lactate is crucial in the brain, as it is involved in neuronal activity and memory formation. This is thought to be related to the astrocyte-neuron lactate shuttle hypothesis (ANLS), which has been debated for 30 years, not least because it is difficult to measure lactate compartmentation and exchange in vivo. In particular, while ANLS requires transfer of lactate via the extracellular space, intracellular/extracellular exchange rate remains unknown. In this work, we propose to use time-dependent diffusion MRS to assess lactate exchange and lactate compartmentation in mouse brain gray matter in vivo. First, by comparing lactate time-dependent diffusivity and kurtosis with those of water and purely intracellular metabolites (which exhibit \"fast\" and \"no\" exchange signatures, respectively), we estimate that lactate exchange is slow (i.e. on the order of hundreds of milliseconds). Then, using biophysical models neglecting exchange, we estimate extracellular, neuronal and astrocytic lactate fractions compatible with the ANLS hypothesis."
+ },
+ "2412.15967v1": {
+ "title": "Self-Supervised Radiograph Anatomical Region Classification -- How Clean Is Your Real-World Data?",
+ "url": "http://arxiv.org/abs/2412.15967v1",
+ "authors": "Simon Langer, Jessica Ritter, Rickmer Braren, Daniel Rueckert, Paul Hager",
+ "update_time": "2024-12-20",
+ "abstract": "Modern deep learning-based clinical imaging workflows rely on accurate labels of the examined anatomical region. Knowing the anatomical region is required to select applicable downstream models and to effectively generate cohorts of high quality data for future medical and machine learning research efforts. However, this information may not be available in externally sourced data or may contain data entry errors. To address this problem, we show the effectiveness of self-supervised methods such as SimCLR and BYOL as well as supervised contrastive deep learning methods in assigning one of 14 anatomical region classes in our in-house dataset of 48,434 skeletal radiographs. We achieve a strong linear evaluation accuracy of 96.6% with a single model and 97.7% using an ensemble approach. Furthermore, only a few labeled instances (1% of the training set) suffice to achieve an accuracy of 92.2%, enabling usage in low-label and thus low-resource scenarios. Our model can be used to correct data entry mistakes: a follow-up analysis of the test set errors of our best-performing single model by an expert radiologist identified 35% incorrect labels and 11% out-of-domain images. When accounted for, the radiograph anatomical region labelling performance increased -- without and with an ensemble, respectively -- to a theoretical accuracy of 98.0% and 98.8%."
+ },
+ "2412.15957v1": {
+ "title": "From General to Specific: Tailoring Large Language Models for Personalized Healthcare",
+ "url": "http://arxiv.org/abs/2412.15957v1",
+ "authors": "Ruize Shi, Hong Huang, Wei Zhou, Kehan Yin, Kai Zhao, Yun Zhao",
+ "update_time": "2024-12-20",
+ "abstract": "The rapid development of large language models (LLMs) has transformed many industries, including healthcare. However, previous medical LLMs have largely focused on leveraging general medical knowledge to provide responses, without accounting for patient variability and lacking true personalization at the individual level. To address this, we propose a novel method called personalized medical language model (PMLM), which explores and optimizes personalized LLMs through recommendation systems and reinforcement learning (RL). Specifically, by utilizing self-informed and peer-informed personalization, PMLM captures changes in behaviors and preferences to design initial personalized prompts tailored to individual needs. We further refine these initial personalized prompts through RL, ultimately enhancing the precision of LLM guidance. Notably, the personalized prompts are hard prompts, which grants PMLM high adaptability and reusability, allowing it to directly leverage high-quality proprietary LLMs. We evaluate PMLM using real-world obstetrics and gynecology data, and the experimental results demonstrate that PMLM achieves personalized responses, and it provides more refined and individualized services, offering a potential way for personalized medical LLMs."
+ },
+ "2412.15925v1": {
+ "title": "MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Classification and Detection",
+ "url": "http://arxiv.org/abs/2412.15925v1",
+ "authors": "Andrea Moglia, Elia Clement Nastasio, Luca Mainardi, Pietro Cerveri",
+ "update_time": "2024-12-20",
+ "abstract": "Problem: Pancreas radiological imaging is challenging due to the small size, blurred boundaries, and variability of shape and position of the organ among patients. Goal: In this work we present MiniGPT-Pancreas, a Multimodal Large Language Model (MLLM), as an interactive chatbot to support clinicians in pancreas cancer diagnosis by integrating visual and textual information. Methods: MiniGPT-v2, a general-purpose MLLM, was fine-tuned in a cascaded way for pancreas detection, tumor classification, and tumor detection with multimodal prompts combining questions and computed tomography scans from the National Institute of Health (NIH), and Medical Segmentation Decathlon (MSD) datasets. The AbdomenCT-1k dataset was used to detect the liver, spleen, kidney, and pancreas. Results: MiniGPT-Pancreas achieved an Intersection over Union (IoU) of 0.595 and 0.550 for the detection of pancreas on NIH and MSD datasets, respectively. For the pancreas cancer classification task on the MSD dataset, accuracy, precision, and recall were 0.876, 0.874, and 0.878, respectively. When evaluating MiniGPT-Pancreas on the AbdomenCT-1k dataset for multi-organ detection, the IoU was 0.8399 for the liver, 0.722 for the kidney, 0.705 for the spleen, and 0.497 for the pancreas. For the pancreas tumor detection task, the IoU score was 0.168 on the MSD dataset. Conclusions: MiniGPT-Pancreas represents a promising solution to support clinicians in the classification of pancreas images with pancreas tumors. Future research is needed to improve the score on the detection task, especially for pancreas tumors.",
+ "code_url": "https://github.com/elianastasio/minigpt-pancreas"
+ },
+ "2412.15907v1": {
+ "title": "Development of a Large-scale Dataset of Chest Computed Tomography Reports in Japanese and a High-performance Finding Classification Model",
+ "url": "http://arxiv.org/abs/2412.15907v1",
+ "authors": "Yosuke Yamagishi, Yuta Nakamura, Tomohiro Kikuchi, Yuki Sonoda, Hiroshi Hirakawa, Shintaro Kano, Satoshi Nakamura, Shouhei Hanaoka, Takeharu Yoshikawa, Osamu Abe",
+ "update_time": "2024-12-20",
+ "abstract": "Background: Recent advances in large language models highlight the need for high-quality multilingual medical datasets. While Japan leads globally in CT scanner deployment and utilization, the lack of large-scale Japanese radiology datasets has hindered the development of specialized language models for medical imaging analysis. Objective: To develop a comprehensive Japanese CT report dataset through machine translation and establish a specialized language model for structured finding classification. Additionally, to create a rigorously validated evaluation dataset through expert radiologist review. Methods: We translated the CT-RATE dataset (24,283 CT reports from 21,304 patients) into Japanese using GPT-4o mini. The training dataset consisted of 22,778 machine-translated reports, while the validation dataset included 150 radiologist-revised reports. We developed CT-BERT-JPN based on \"tohoku-nlp/bert-base-japanese-v3\" architecture for extracting 18 structured findings from Japanese radiology reports. Results: Translation metrics showed strong performance with BLEU scores of 0.731 and 0.690, and ROUGE scores ranging from 0.770 to 0.876 for Findings and from 0.748 to 0.857 for Impression sections. CT-BERT-JPN demonstrated superior performance compared to GPT-4o in 11 out of 18 conditions, including lymphadenopathy (+14.2%), interlobular septal thickening (+10.9%), and atelectasis (+7.4%). The model maintained F1 scores exceeding 0.95 in 14 out of 18 conditions and achieved perfect scores in four conditions. Conclusions: Our study establishes a robust Japanese CT report dataset and demonstrates the effectiveness of a specialized language model for structured finding classification. The hybrid approach of machine translation and expert validation enables the creation of large-scale medical datasets while maintaining high quality."
+ }
+ }
+}
\ No newline at end of file