Skip to content

randomrisk/NeuroAI-Daily-Arxiv

Repository files navigation

Updated on 2025-02-28

Brain

Publish Date Title Authors URL Abstract
2025-02-26 Multi-modal Contrastive Learning for Tumor-specific Missing Modality Synthesis Minjoo Lim, Bogyeong Kang, Tae-Eui Kam Link Multi-modal magnetic resonance imaging (MRI) is essential for providing complementary information about brain anatomy and pathology, leading to more accurate diagnoses. However, obtaining high-quality multi-modal MRI in a clinical setting is difficult due to factors such as time constraints, high costs, and patient movement artifacts. To overcome this difficulty, there is increasing interest in developing generative models that can synthesize missing target modality images from the available source ones. Therefore, we design a generative model for missing MRI that integrates multi-modal contrastive learning with a focus on critical tumor regions. Specifically, we integrate multi-modal contrastive learning, tailored for multiple source modalities, and enhance its effectiveness by selecting features based on entropy during the contrastive learning process. Additionally, our network not only generates the missing target modality images but also predicts segmentation outputs, simultaneously. This approach improves the generator's capability to precisely generate tumor regions, ultimately improving performance in downstream segmentation tasks. By leveraging a combination of contrastive, segmentation, and additional self-representation losses, our model effectively reflects target-specific information and generate high-quality target images. Consequently, our results in the Brain MR Image Synthesis challenge demonstrate that the proposed model excelled in generating the missing modality.
2025-02-26 Surface-Based Manipulation Ziqiao Wang, Serhat Demirtas, Fabio Zuliani, Jamie Paik Link Intelligence lies not only in the brain but in the body. The shape of our bodies can influence how we think and interact with the physical world. In robotics research, interacting with the physical world is crucial as it allows robots to manipulate objects in various real-life scenarios. Conventional robotic manipulation strategies mainly rely on finger-shaped end effectors. However, achieving stable grasps on fragile, deformable, irregularly shaped, or slippery objects is challenging due to difficulties in establishing stable force or geometric constraints. Here, we present surface-based manipulation strategies that diverge from classical grasping approaches, using with flat surfaces as minimalist end-effectors. By changing the position and orientation of these surfaces, objects can be translated, rotated and even flipped across the surface using closed-loop control strategies. Since this method does not rely on stable grasp, it can adapt to objects of various shapes, sizes, and stiffness levels, even enabling the manipulation the shape of deformable objects. Our results provide a new perspective for solving complex manipulation problems.
2025-02-26 Efficient 4D fMRI ASD Classification using Spatial-Temporal-Omics-based Learning Framework Ziqiao Weng, Weidong Cai, Bo Zhou Link Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder impacting social and behavioral development. Resting-state fMRI, a non-invasive tool for capturing brain connectivity patterns, aids in early ASD diagnosis and differentiation from typical controls (TC). However, previous methods, which rely on either mean time series or full 4D data, are limited by a lack of spatial information or by high computational costs. This underscores the need for an efficient solution that preserves both spatial and temporal information. In this paper, we propose a novel, simple, and efficient spatial-temporal-omics learning framework designed to efficiently extract spatio-temporal features from fMRI for ASD classification. Our approach addresses these limitations by utilizing 3D time-domain derivatives as the spatial-temporal inter-voxel omics, which preserve full spatial resolution while capturing diverse statistical characteristics of the time series at each voxel. Meanwhile, functional connectivity features serve as the spatial-temporal inter-regional omics, capturing correlations across brain regions. Extensive experiments and ablation studies on the ABIDE dataset demonstrate that our framework significantly outperforms previous methods while maintaining computational efficiency. We believe our research offers valuable insights that will inform and advance future ASD studies, particularly in the realm of spatial-temporal-omics-based learning.
2025-02-26 Integrating Biological and Machine Intelligence: Attention Mechanisms in Brain-Computer Interfaces Jiyuan Wang, Weishan Ye, Jialin He, Li Zhang, Gan Huang, Zhuliang Yu, Zhen Liang Link With the rapid advancement of deep learning, attention mechanisms have become indispensable in electroencephalography (EEG) signal analysis, significantly enhancing Brain-Computer Interface (BCI) applications. This paper presents a comprehensive review of traditional and Transformer-based attention mechanisms, their embedding strategies, and their applications in EEG-based BCI, with a particular emphasis on multimodal data fusion. By capturing EEG variations across time, frequency, and spatial channels, attention mechanisms improve feature extraction, representation learning, and model robustness. These methods can be broadly categorized into traditional attention mechanisms, which typically integrate with convolutional and recurrent networks, and Transformer-based multi-head self-attention, which excels in capturing long-range dependencies. Beyond single-modality analysis, attention mechanisms also enhance multimodal EEG applications, facilitating effective fusion between EEG and other physiological or sensory data. Finally, we discuss existing challenges and emerging trends in attention-based EEG modeling, highlighting future directions for advancing BCI technology. This review aims to provide valuable insights for researchers seeking to leverage attention mechanisms for improved EEG interpretation and application.
2025-02-26 Deep learning and classical computer vision techniques in medical image analysis: Case studies on brain MRI tissue segmentation, lung CT COPD registration, and skin lesion classification Anyimadu Daniel Tweneboah, Suleiman Taofik Ahmed, Hossain Mohammad Imran Link Medical imaging spans diverse tasks and modalities which play a pivotal role in disease diagnosis, treatment planning, and monitoring. This study presents a novel exploration, being the first to systematically evaluate segmentation, registration, and classification tasks across multiple imaging modalities. Integrating both classical and deep learning (DL) approaches in addressing brain MRI tissue segmentation, lung CT image registration, and skin lesion classification from dermoscopic images, we demonstrate the complementary strengths of these methodologies in diverse applications. For brain tissue segmentation, 3D DL models outperformed 2D and patch-based models, specifically nnU-Net achieving Dice of 0.9397, with 3D U-Net models on ResNet34 backbone, offering competitive results with Dice 0.8946. Multi-Atlas methods provided robust alternatives for cases where DL methods are not feasible, achieving average Dice of 0.7267. In lung CT registration, classical Elastix-based methods outperformed DL models, achieving a minimum Target Registration Error (TRE) of 6.68 mm, highlighting the effectiveness of parameter tuning. HighResNet performed best among DL models with a TRE of 7.40 mm. For skin lesion classification, ensembles of DL models like InceptionResNetV2 and ResNet50 excelled, achieving up to 90.44%, and 93.62% accuracies for binary and multiclass classification respectively. Also, adopting One-vs-All method, DL attained accuracies of 94.64% (mel vs. others), 95.35% (bcc vs. others), and 96.93% (scc vs. others), while ML models specifically Multi-Layer Perceptron (MLP) on handcrafted features offered interpretable alternatives with 85.04% accuracy using SMOTE for class imbalance correction on the multi-class task and 83.27% on the binary-class task. Links to source code are available on request.
2025-02-26 Few Shot Alternating GD and Minimization for Generalizable Real-Time MRI Silpa Babu, Sajan Goud Lingala, Namrata Vaswani Link This work introduces a novel near real-time (real-time after an initial short delay) MRI solution that handles motion well and is generalizable. Here, real-time means the algorithm works well on a highly accelerated scan, is zero-latency (reconstructs a new frame as soon as MRI data for it arrives), and is fast enough, i.e., the time taken to process a frame is comparable to the scan time per frame or lesser. We demonstrate its generalizability through experiments on 6 prospective datasets and 17 retrospective datasets that span multiple different applications -- speech larynx imaging, brain, ungated cardiac perfusion, cardiac cine, cardiac OCMR, abdomen; sampling schemes -- Cartesian, pseudo-radial, radial, spiral; and sampling rates -- ranging from 6x to 4 radial lines per frame. Comparisons with a large number of existing real-time and batch methods, including unsupervised and supervised deep learning methods, show the power and speed of our approach.
2025-02-26 Embodying mechano-fluidic memory in soft machines to program behaviors upon interactions Alberto Comoretto, Tanaya Mandke, Johannes T. B. Overvelde Link Soft machines display shape adaptation to external circumstances due to their intrinsic compliance. To achieve increasingly more responsive behaviors upon interactions without relying on centralized computation, embodying memory directly in the machines' structure is crucial. Here, we harness the bistability of elastic shells to alter the fluidic properties of an enclosed cavity, thereby switching between stable frequency states of a locomoting self-oscillating machine. To program these memory states upon interactions, we develop fluidic circuits surrounding the bistable shell, with soft tubes that kink and unkink when externally touched. We implement circuits for both long-term and short-term memory in a soft machine that switches behaviors in response to a human user and that autonomously changes direction after detecting a wall. By harnessing only geometry and elasticity, embodying memory allows physical structures without a central brain to exhibit autonomous feats that are typically reserved for computer-based robotic systems.
2025-02-26 Zero Echo Time Functional MRI in Humans Ali Caglar Özen, Shuai Liu, Serhat Ilbey, Michael Bock, Uzay Emir, Yen-Yu Ian Shih Link Motivation: Conventional echo planar imaging(EPI) based functional MRI(fMRI) uses the BOLD contrast to map activity changes in human brains. Introducing an efficient ZTE sequence for functional brain mapping can help address limitations of EPI and demonstrate the feasibility of using T1 related changes as a surrogate marker of brain activity. Goals: To test and optimize ZTE sequence for fMRI. Methods: A ZTE sequence with radial inside out spokes was used to prepare a dynamic imaging protocol that matches conventional EPI time course. Temporal SNR and sensitivity to susceptibility differences of ZTE were evaluated and the sequence was benchmarked against BOLD EPI in a task based visual fMRI study with healthy volunteers at 3T. Results: Phantom measurements confirmed sensitivity of the ZTE protocol to the oxygen concentration. Functional activation in primary visual cortex could be detected using ZTE. Resting state networks could also be identified using independent component analysis. Discussion: ZTE-based fMRI is proposed for mapping functional activation in human brain. ZTE is robust against susceptibility artefacts and significantly reduces acoustic noise. Radial sampling pattern allows for high undersampling rates to increase temporal resolution.
2025-02-26 Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs Yiheng Yang, Yujie Wang, Chi Ma, Lei Yu, Emmanuele Chersoni, Chu-Ren Huang Link Dense large language models(LLMs) face critical efficiency bottlenecks as they rigidly activate all parameters regardless of input complexity. While existing sparsity methods(static pruning or dynamic activation) address this partially, they either lack adaptivity to contextual or model structural demands or incur prohibitive computational overhead. Inspired by human brain's dual-process mechanisms - predictive coding (N400) for backbone sparsity and structural reanalysis (P600) for complex context - we propose CLADA, a \textit{\textbf{C}ognitive-\textbf{L}oad-\textbf{A}ware \textbf{D}ynamic \textbf{A}ctivation} framework that synergizes statistical sparsity with semantic adaptability. Our key insight is that LLM activations exhibit two complementary patterns: 1) \textit{Global statistical sparsity} driven by sequence-level prefix information, and 2) \textit{Local semantic adaptability} modulated by cognitive load metrics(e.g., surprisal and entropy). CLADA employs a hierarchical thresholding strategy: a baseline from offline error-controlled optimization ensures 40\%+ sparsity, dynamically adjusted by real-time cognitive signals. Evaluations across six mainstream LLMs and nine benchmarks demonstrate that CLADA achieves \textbf{~20\% average speedup with <2\% accuracy drop}, outperforming Griffin (5\%+ degradation) and TT (negligible speedup). Crucially, we establish the first formal connection between neurolinguistic event-related potential (ERP) components and LLM efficiency mechanisms through multi-level regression analysis ($R^2=0.17$ for sparsity-adaptation synergy). Requiring no retraining or architectural changes, CLADA offers a deployable solution for resource-aware LLM inference while advancing biologically-inspired AI design. Our code is available at \href{https://github.com/Oldify/CLADA}{CLADA}.
2025-02-26 Brain-inspired analogical mixture prototypes for few-shot class-incremental learning Wanyi Li, Wei Wei, Yongkang Luo, Peng Wang Link Few-shot class-incremental learning (FSCIL) poses significant challenges for artificial neural networks due to the need to efficiently learn from limited data while retaining knowledge of previously learned tasks. Inspired by the brain's mechanisms for categorization and analogical learning, we propose a novel approach called Brain-inspired Analogical Mixture Prototypes (BAMP). BAMP has three components: mixed prototypical feature learning, statistical analogy, and soft voting. Starting from a pre-trained Vision Transformer (ViT), mixed prototypical feature learning represents each class using a mixture of prototypes and fine-tunes these representations during the base session. The statistical analogy calibrates the mean and covariance matrix of prototypes for new classes according to similarity to the base classes, and computes classification score with Mahalanobis distance. Soft voting combines both merits of statistical analogy and an off-shelf FSCIL method. Our experiments on benchmark datasets demonstrate that BAMP outperforms state-of-the-art on both traditional big start FSCIL setting and challenging small start FSCIL setting. The study suggests that brain-inspired analogical mixture prototypes can alleviate catastrophic forgetting and over-fitting problems in FSCIL.

EEG

Publish Date Title Authors URL Abstract
2025-02-26 Integrating Biological and Machine Intelligence: Attention Mechanisms in Brain-Computer Interfaces Jiyuan Wang, Weishan Ye, Jialin He, Li Zhang, Gan Huang, Zhuliang Yu, Zhen Liang Link With the rapid advancement of deep learning, attention mechanisms have become indispensable in electroencephalography (EEG) signal analysis, significantly enhancing Brain-Computer Interface (BCI) applications. This paper presents a comprehensive review of traditional and Transformer-based attention mechanisms, their embedding strategies, and their applications in EEG-based BCI, with a particular emphasis on multimodal data fusion. By capturing EEG variations across time, frequency, and spatial channels, attention mechanisms improve feature extraction, representation learning, and model robustness. These methods can be broadly categorized into traditional attention mechanisms, which typically integrate with convolutional and recurrent networks, and Transformer-based multi-head self-attention, which excels in capturing long-range dependencies. Beyond single-modality analysis, attention mechanisms also enhance multimodal EEG applications, facilitating effective fusion between EEG and other physiological or sensory data. Finally, we discuss existing challenges and emerging trends in attention-based EEG modeling, highlighting future directions for advancing BCI technology. This review aims to provide valuable insights for researchers seeking to leverage attention mechanisms for improved EEG interpretation and application.
2025-02-26 Music Therapy based Stress Prediction using Homological Feature Analysis on EEG Signals Srikireddy Dhanunjay Reddy, Tharun Kumar Reddy Bollu Link Stress became a common factor in the busy daily routines of all academic and corporate working environments. Everyone checks for efficient stress-buster alternatives to calm down from work pressure. Instead of investing time in unnecessary efforts, this work shows the stress relief scenario of subjects by listening to Raag Darbari music notes as a simple add-on to their schedule. An innovative approach has been implemented on the MUSEI-EEG dataset using Topological Data Analysis (TDA) to analyze this stress relief study. This study reveals that persistent homological features can be robust biomarkers for classifying closely distributed subject data. The proposed TDA approach framework revealed homological features like birth-death rate and entropy efficacy in stress prediction using Electroencephalogram (EEG) signals with 86% average accuracy and 0.2 standard deviation.
2025-02-25 Error-related Potential driven Reinforcement Learning for adaptive Brain-Computer Interfaces Aline Xavier Fidêncio, Felix Grün, Christian Klaes, Ioannis Iossifidis Link Brain-computer interfaces (BCIs) provide alternative communication methods for individuals with motor disabilities by allowing control and interaction with external devices. Non-invasive BCIs, especially those using electroencephalography (EEG), are practical and safe for various applications. However, their performance is often hindered by EEG non-stationarities, caused by changing mental states or device characteristics like electrode impedance. This challenge has spurred research into adaptive BCIs that can handle such variations. In recent years, interest has grown in using error-related potentials (ErrPs) to enhance BCI performance. ErrPs, neural responses to errors, can be detected non-invasively and have been integrated into different BCI paradigms to improve performance through error correction or adaptation. This research introduces a novel adaptive ErrP-based BCI approach using reinforcement learning (RL). We demonstrate the feasibility of an RL-driven adaptive framework incorporating ErrPs and motor imagery. Utilizing two RL agents, the framework adapts dynamically to EEG non-stationarities. Validation was conducted using a publicly available motor imagery dataset and a fast-paced game designed to boost user engagement. Results show the framework's promise, with RL agents learning control policies from user interactions and achieving robust performance across datasets. However, a critical insight from the game-based protocol revealed that motor imagery in a high-speed interaction paradigm was largely ineffective for participants, highlighting task design limitations in real-time BCI applications. These findings underscore the potential of RL for adaptive BCIs while pointing out practical constraints related to task complexity and user responsiveness.
2025-02-25 EEGM2: An Efficient Mamba-2-Based Self-Supervised Framework for Long-Sequence EEG Modeling Jiazhen Hong, Geoffrey Mackellar, Soheila Ghane Link Deep learning has achieved significant progress in the development of electroencephalogram (EEG) foundation models, with Transformer-based architectures excelling at capturing long-range dependencies. However, their quadratic computational complexity presents challenges in memory efficiency, training, and inference speed, limiting their scalability and generalizability as a foundation model. In this paper, we propose EEGM2, a self-supervised framework based on structured state space duality (SSD) that overcomes these limitations. EEGM2 introduces three key innovations: (1) a reconstruction-based framework that captures both local and global EEG features through Mamba-2 structured state space models, (2) a spatiotemporal-aware loss function that enhances robustness to noise and preserves spectral information, and (3) a multi-branch receptive field input embedding strategy that improves cross-subject generalization and stability for EEG sequences of varying lengths. In comparison to traditional pretraining methods, on raw EEG or latent representation spaces, EEGM2 shows superior performance on long-sequence tasks, where conventional models struggle. Our experimental results on six EEG datasets validate that EEGM2 not only achieves state-of-the-art cross-domain accuracy but also reduces computational overhead, making it a more efficient solution for deployment on resource-constrained BCI devices.
2025-02-24 Deep Learning-Powered Electrical Brain Signals Analysis: Advancing Neurological Diagnostics Jiahe Li, Xin Chen, Fanqi Shen, Junru Chen, Yuxin Liu, Daoze Zhang, Zhizhang Yuan, Fang Zhao, Meng Li, Yang Yang Link Neurological disorders represent significant global health challenges, driving the advancement of brain signal analysis methods. Scalp electroencephalography (EEG) and intracranial electroencephalography (iEEG) are widely used to diagnose and monitor neurological conditions. However, dataset heterogeneity and task variations pose challenges in developing robust deep learning solutions. This review systematically examines recent advances in deep learning approaches for EEG/iEEG-based neurological diagnostics, focusing on applications across 7 neurological conditions using 46 datasets. We explore trends in data utilization, model design, and task-specific adaptations, highlighting the importance of pre-trained multi-task models for scalable, generalizable solutions. To advance research, we propose a standardized benchmark for evaluating models across diverse datasets to enhance reproducibility. This survey emphasizes how recent innovations can transform neurological diagnostics and enable the development of intelligent, adaptable healthcare solutions.
2025-02-23 SDA-DDA Semi-supervised Domain Adaptation with Dynamic Distribution Alignment Network For Emotion Recognition Using EEG Signals Jiahao Tang Link In this paper, we focus on the challenge of individual variability in affective brain-computer interfaces (aBCI), which employs electroencephalogram (EEG) signals to monitor and recognize human emotional states, thereby facilitating the advancement of emotion-aware technologies. The variability in EEG data across individuals poses a significant barrier to the development of effective and widely applicable aBCI models. To tackle this issue, we propose a novel transfer learning framework called Semi-supervised Domain Adaptation with Dynamic Distribution Alignment (SDA-DDA). This approach aligns the marginal and conditional probability distribution of source and target domains using maximum mean discrepancy (MMD) and conditional maximum mean discrepancy (CMMD). We introduce a dynamic distribution alignment mechanism to adjust differences throughout training and enhance adaptation. Additionally, a pseudo-label confidence filtering module is integrated into the semi-supervised process to refine pseudo-label generation and improve the estimation of conditional distributions. Extensive experiments on EEG benchmark databases (SEED, SEED-IV and DEAP) validate the robustness and effectiveness of SDA-DDA. The results demonstrate its superiority over existing methods in emotion recognition across various scenarios, including cross-subject and cross-session conditions. This advancement enhances the generalization and accuracy of emotion recognition, potentially fostering the development of personalized aBCI applications. The source code is accessible at https://github.com/XuanSuTrum/SDA-DDA.
2025-02-22 ZIA: A Theoretical Framework for Zero-Input AI Aditi De, NeuroBits Labs Link Zero-Input AI (ZIA) introduces a novel framework for human-computer interaction by enabling proactive intent prediction without explicit user commands. It integrates gaze tracking, bio-signals (EEG, heart rate), and contextual data (time, location, usage history) into a multi-modal model for real-time inference, targeting <100 ms latency. The proposed architecture employs a transformer-based model with cross-modal attention, variational Bayesian inference for uncertainty estimation, and reinforcement learning for adaptive optimization. To support deployment on edge devices (CPUs, TPUs, NPUs), ZIA utilizes quantization, weight pruning, and linear attention to reduce complexity from quadratic to linear with sequence length. Theoretical analysis establishes an information-theoretic bound on prediction error and demonstrates how multi-modal fusion improves accuracy over single-modal approaches. Expected performance suggests 85-90% accuracy with EEG integration and 60-100 ms inference latency. ZIA provides a scalable, privacy-preserving framework for accessibility, healthcare, and consumer applications, advancing AI toward anticipatory intelligence.
2025-02-22 Electrophysiological Investigation of Insect Pain Threshold Marc Josep Montagut Marques, Pan Minghao, Ryuichi Okada, Midori Sakura, Kayo Hirose, Shinjiro Umezu Link The question of whether insects experience pain has long been debated in neuroscience and animal behavior research. Increasing evidence suggests that insects possess the ability to detect and respond to noxious stimuli, exhibiting behaviors indicative of pain perception. This study investigates the relationship between pain stimuli and physiological responses in crickets (Gryllidae), focusing on heart rate (ECG) and brain wave (EEG) patterns. We applied a range of mechanical, chemical, thermal, and electrical stimuli to crickets, recording ECG and EEG data while employing a deep learning-based model to classify pain levels. Our findings revealed significant heart rate changes and EEG fluctuations in response to various stimuli, with the highest intensity stimuli inducing marked physiological stress. The AI-based analysis, utilizing AlexNet for EEG signal classification, achieved 90% accuracy in distinguishing between resting, low-pain, and high-pain states. While no social sharing of pain was observed through ECG measurements, these results contribute to the growing body of evidence supporting insect nociception and offer new insights into their physiological responses to external stressors. This research advances the understanding of insect pain mechanisms and demonstrates the potential for AI-driven analysis in entomological studies.
2025-02-22 Single-Channel EEG Tokenization Through Time-Frequency Modeling Jathurshan Pradeepkumar, Xihao Piao, Zheng Chen, Jimeng Sun Link We introduce TFM-Tokenizer, a novel tokenization framework tailored for EEG analysis that transforms continuous, noisy brain signals into a sequence of discrete, well-represented tokens for various EEG tasks. Conventional approaches typically rely on continuous embeddings and inter-channel dependencies, which are limited in capturing inherent EEG features such as temporally unpredictable patterns and diverse oscillatory waveforms. In contrast, we hypothesize that critical time-frequency features can be effectively captured from a single channel. By learning tokens that encapsulate these intrinsic patterns within a single channel, our approach yields a scalable tokenizer adaptable across diverse EEG settings. We integrate the TFM-Tokenizer with a transformer-based TFM-Encoder, leveraging established pretraining techniques from natural language processing, such as masked token prediction, followed by downstream fine-tuning for various EEG tasks. Experiments across four EEG datasets show that TFM-Token outperforms state-of-the-art methods. On TUEV, our approach improves balanced accuracy and Cohen's Kappa by 5% over baselines. Comprehensive analysis of the learned tokens demonstrates their ability to capture class-distinctive features, enhance frequency representation, and ability to encode time-frequency motifs into distinct tokens, improving interpretability.
2025-02-21 Parameter Estimation of the Network of FitzHugh-Nagumo Neurons Based on the Speed-Gradient and Filtering Aleksandra Rybalko, Alexander Fradkov Link The paper addresses the problem of parameter estimation (or identification) in dynamical networks composed of an arbitrary number of FitzHugh-Nagumo neuron models with diffusive couplings between each other. It is assumed that only the membrane potential of each model is measured, while the other state variable and all derivatives remain unmeasured. Additionally, potential measurement errors in the membrane potential due to sensor imprecision are considered. To solve this problem, firstly, the original FitzHugh-Nagumo network is transformed into a linear regression model, where the regressors are obtained by applying a filter-differentiator to specific combinations of the measured variables. Secondly, the speed-gradient method is applied to this linear model, leading to the design of an identification algorithm for the FitzHugh-Nagumo neural network. Sufficient conditions for the asymptotic convergence of the parameter estimates to their true values are derived for the proposed algorithm. Parameter estimation for a network of five interconnected neurons is demonstrated through computer simulation. The results confirm that the sufficient conditions are satisfied in the numerical experiments conducted. Furthermore, the algorithm's capabilities for adjusting the identification accuracy and time are investigated. The proposed approach has potential applications in nervous system modeling, particularly in the context of human brain modeling. For instance, EEG signals could serve as the measured variables of the network, enabling the integration of mathematical neural models with empirical data collected by neurophysiologists.

BCI

Publish Date Title Authors URL Abstract
2025-02-26 Integrating Biological and Machine Intelligence: Attention Mechanisms in Brain-Computer Interfaces Jiyuan Wang, Weishan Ye, Jialin He, Li Zhang, Gan Huang, Zhuliang Yu, Zhen Liang Link With the rapid advancement of deep learning, attention mechanisms have become indispensable in electroencephalography (EEG) signal analysis, significantly enhancing Brain-Computer Interface (BCI) applications. This paper presents a comprehensive review of traditional and Transformer-based attention mechanisms, their embedding strategies, and their applications in EEG-based BCI, with a particular emphasis on multimodal data fusion. By capturing EEG variations across time, frequency, and spatial channels, attention mechanisms improve feature extraction, representation learning, and model robustness. These methods can be broadly categorized into traditional attention mechanisms, which typically integrate with convolutional and recurrent networks, and Transformer-based multi-head self-attention, which excels in capturing long-range dependencies. Beyond single-modality analysis, attention mechanisms also enhance multimodal EEG applications, facilitating effective fusion between EEG and other physiological or sensory data. Finally, we discuss existing challenges and emerging trends in attention-based EEG modeling, highlighting future directions for advancing BCI technology. This review aims to provide valuable insights for researchers seeking to leverage attention mechanisms for improved EEG interpretation and application.
2025-02-26 Enhancing Subject-Independent Accuracy in fNIRS-based Brain-Computer Interfaces with Optimized Channel Selection Yuxin Li, Hao Fang, Wen Liu, Chuantong Cheng, Hongda Chen Link Achieving high subject-independent accuracy in functional near-infrared spectroscopy (fNIRS)-based brain-computer interfaces (BCIs) remains a challenge, particularly when minimizing the number of channels. This study proposes a novel feature extraction scheme and a Pearson correlation-based channel selection algorithm to enhance classification accuracy while reducing hardware complexity. Using an open-access fNIRS dataset, our method improved average accuracy by 28.09% compared to existing approaches, achieving a peak subject-independent accuracy of 95.98% with only two channels. These results demonstrate the potential of our optimized feature extraction and channel selection methods for developing efficient, subject-independent fNIRS-based BCI systems.
2025-02-25 Error-related Potential driven Reinforcement Learning for adaptive Brain-Computer Interfaces Aline Xavier Fidêncio, Felix Grün, Christian Klaes, Ioannis Iossifidis Link Brain-computer interfaces (BCIs) provide alternative communication methods for individuals with motor disabilities by allowing control and interaction with external devices. Non-invasive BCIs, especially those using electroencephalography (EEG), are practical and safe for various applications. However, their performance is often hindered by EEG non-stationarities, caused by changing mental states or device characteristics like electrode impedance. This challenge has spurred research into adaptive BCIs that can handle such variations. In recent years, interest has grown in using error-related potentials (ErrPs) to enhance BCI performance. ErrPs, neural responses to errors, can be detected non-invasively and have been integrated into different BCI paradigms to improve performance through error correction or adaptation. This research introduces a novel adaptive ErrP-based BCI approach using reinforcement learning (RL). We demonstrate the feasibility of an RL-driven adaptive framework incorporating ErrPs and motor imagery. Utilizing two RL agents, the framework adapts dynamically to EEG non-stationarities. Validation was conducted using a publicly available motor imagery dataset and a fast-paced game designed to boost user engagement. Results show the framework's promise, with RL agents learning control policies from user interactions and achieving robust performance across datasets. However, a critical insight from the game-based protocol revealed that motor imagery in a high-speed interaction paradigm was largely ineffective for participants, highlighting task design limitations in real-time BCI applications. These findings underscore the potential of RL for adaptive BCIs while pointing out practical constraints related to task complexity and user responsiveness.
2025-02-25 EEGM2: An Efficient Mamba-2-Based Self-Supervised Framework for Long-Sequence EEG Modeling Jiazhen Hong, Geoffrey Mackellar, Soheila Ghane Link Deep learning has achieved significant progress in the development of electroencephalogram (EEG) foundation models, with Transformer-based architectures excelling at capturing long-range dependencies. However, their quadratic computational complexity presents challenges in memory efficiency, training, and inference speed, limiting their scalability and generalizability as a foundation model. In this paper, we propose EEGM2, a self-supervised framework based on structured state space duality (SSD) that overcomes these limitations. EEGM2 introduces three key innovations: (1) a reconstruction-based framework that captures both local and global EEG features through Mamba-2 structured state space models, (2) a spatiotemporal-aware loss function that enhances robustness to noise and preserves spectral information, and (3) a multi-branch receptive field input embedding strategy that improves cross-subject generalization and stability for EEG sequences of varying lengths. In comparison to traditional pretraining methods, on raw EEG or latent representation spaces, EEGM2 shows superior performance on long-sequence tasks, where conventional models struggle. Our experimental results on six EEG datasets validate that EEGM2 not only achieves state-of-the-art cross-domain accuracy but also reduces computational overhead, making it a more efficient solution for deployment on resource-constrained BCI devices.
2025-02-18 Multi-View Contrastive Network (MCNet) for Motor Imagery Classification Ziwei Wang, Siyang Li, Xiaoqing Chen, Wei Li, Dongrui Wu Link Objective: An electroencephalography (EEG)-based brain-computer interface (BCI) serves as a direct communication pathway between the human brain and an external device. While supervised learning has been extensively explored for motor imagery (MI) EEG classification, small data quantity has been a key factor limiting the performance of deep feature learning. Methods: This paper proposes a knowledge-driven time-space-frequency based multi-view contrastive network (MCNet) for MI EEG decoding in BCIs. MCNet integrates knowledge from the time, space, and frequency domains into the training process through data augmentations from multiple views, fostering more discriminative feature learning of the characteristics of EEG data. We introduce a cross-view contrasting module to learn from different augmented views and a cross-model contrasting module to enhance the consistency of features extracted between knowledge-guided and data-driven models. Results: The combination of EEG data augmentation strategies was systematically investigated for more informative supervised contrastive learning. Experiments on four public MI datasets and three different architectures demonstrated that MCNet outperformed 10 existing approaches. Significance: Our approach can significantly boost EEG classification performance beyond designated networks, showcasing the potential to enhance the feature learning process for better EEG decoding.
2025-02-18 A Survey on Bridging EEG Signals and Generative AI: From Image and Text to Beyond Shreya Shukla, Jose Torres, Abhijit Mishra, Jacek Gwizdka, Shounak Roychowdhury Link Integration of Brain-Computer Interfaces (BCIs) and Generative Artificial Intelligence (GenAI) has opened new frontiers in brain signal decoding, enabling assistive communication, neural representation learning, and multimodal integration. BCIs, particularly those leveraging Electroencephalography (EEG), provide a non-invasive means of translating neural activity into meaningful outputs. Recent advances in deep learning, including Generative Adversarial Networks (GANs) and Transformer-based Large Language Models (LLMs), have significantly improved EEG-based generation of images, text, and speech. This paper provides a literature review of the state-of-the-art in EEG-based multimodal generation, focusing on (i) EEG-to-image generation through GANs, Variational Autoencoders (VAEs), and Diffusion Models, and (ii) EEG-to-text generation leveraging Transformer based language models and contrastive learning methods. Additionally, we discuss the emerging domain of EEG-to-speech synthesis, an evolving multimodal frontier. We highlight key datasets, use cases, challenges, and EEG feature encoding methods that underpin generative approaches. By providing a structured overview of EEG-based generative AI, this survey aims to equip researchers and practitioners with insights to advance neural decoding, enhance assistive technologies, and expand the frontiers of brain-computer interaction.
2025-02-18 An Innovative Brain-Computer Interface Interaction System Based on the Large Language Model Jing Jin, Yutao Zhang, Ruitian Xu, Yixin Chen Link Recent advancements in large language models (LLMs) provide a more effective pathway for upgrading brain-computer interface (BCI) technology in terms of user interaction. The widespread adoption of BCIs in daily application scenarios is still limited by factors such as their single functionality, restricted paradigm design, weak multilingual support, and low levels of intelligence. In this paper, we propose an innovative BCI system that deeply integrates a steady-state visual evoked potential (SSVEP) speller with an LLM application programming interface (API). It allows natural language input through the SSVEP speller and dynamically calls large models to generate SSVEP paradigms. The command prompt, blinking frequency, and layout position are adjustable to meet the user's control requirements in various scenarios. More than ten languages are compatible with the multilingual support of LLM. A variety of task scenarios, such as home appliance control, robotic arm operation, and unmanned aerial vehicle (UAV) management are provided. The task interfaces of the system can be personalized according to the user's habits, usage scenarios, and equipment characteristics. By combining the SSVEP speller with an LLM, the system solves numerous challenges faced by current BCI systems and makes breakthroughs in functionality, intelligence, and multilingual support. The introduction of LLM not only enhances user experience but also expands the potential applications of BCI technology in real-world environments.
2025-02-16 SSVEP-BiMA: Bifocal Masking Attention Leveraging Native and Symmetric-Antisymmetric Components for Robust SSVEP Decoding Yuxin Liu, Zhenxi Song, Guoyang Xu, Zirui Wang, Feng Wan, Yong Hu, Min Zhang, Zhiguo Zhang Link Brain-computer interface (BCI) based on steady-state visual evoked potentials (SSVEP) is a popular paradigm for its simplicity and high information transfer rate (ITR). Accurate and fast SSVEP decoding is crucial for reliable BCI performance. However, conventional decoding methods demand longer time windows, and deep learning models typically require subject-specific fine-tuning, leaving challenges in achieving optimal performance in cross-subject settings. This paper proposed a biofocal masking attention-based method (SSVEP-BiMA) that synergistically leverages the native and symmetric-antisymmetric components for decoding SSVEP. By utilizing multiple signal representations, the network is able to integrate features from a wider range of sample perspectives, leading to more generalized and comprehensive feature learning, which enhances both prediction accuracy and robustness. We performed experiments on two public datasets, and the results demonstrate that our proposed method surpasses baseline approaches in both accuracy and ITR. We believe that this work will contribute to the development of more efficient SSVEP-based BCI systems.
2025-02-13 Revisiting Euclidean Alignment for Transfer Learning in EEG-Based Brain-Computer Interfaces Dongrui Wu Link Due to the non-stationarity and large individual differences of EEG signals, EEG-based brain-computer interfaces (BCIs) usually need subject-specific calibration to tailor the decoding algorithm for each new subject, which is time-consuming and user-unfriendly, hindering their real-world applications. Transfer learning (TL) has been extensively used to expedite the calibration, by making use of EEG data from other subjects/sessions. An important consideration in TL for EEG-based BCIs is to reduce the data distribution discrepancies among different subjects/session, to avoid negative transfer. Euclidean alignment (EA) was proposed in 2020 to address this challenge. Numerous experiments from 10 different BCI paradigms demonstrated its effectiveness and efficiency. This paper revisits the EA, explaining its procedure and correct usage, introducing its applications and extensions, and pointing out potential new research directions. It should be very helpful to BCI researchers, especially those who are working on EEG signal decoding.
2025-02-12 CSSSTN: A Class-sensitive Subject-to-subject Semantic Style Transfer Network for EEG Classification in RSVP Tasks Ziyue Yang, Chengrui Chen, Yong Peng, Qiong Chen, Wanzeng Kong Link The Rapid Serial Visual Presentation (RSVP) paradigm represents a promising application of electroencephalography (EEG) in Brain-Computer Interface (BCI) systems. However, cross-subject variability remains a critical challenge, particularly for BCI-illiterate users who struggle to effectively interact with these systems. To address this issue, we propose the Class-Sensitive Subject-to-Subject Semantic Style Transfer Network (CSSSTN), which incorporates a class-sensitive approach to align feature distributions between golden subjects (BCI experts) and target (BCI-illiterate) users on a class-by-class basis. Building on the SSSTN framework, CSSSTN incorporates three key components: (1) subject-specific classifier training, (2) a unique style loss to transfer class-discriminative features while preserving semantic information through a modified content loss, and (3) an ensemble approach to integrate predictions from both source and target domains. We evaluated CSSSTN using both a publicly available dataset and a self-collected dataset. Experimental results demonstrate that CSSSTN outperforms state-of-the-art methods, achieving mean balanced accuracy improvements of 6.4\% on the Tsinghua dataset and 3.5\% on the HDU dataset, with notable benefits for BCI-illiterate users. Ablation studies confirm the effectiveness of each component, particularly the class-sensitive transfer and the use of lower-layer features, which enhance transfer performance and mitigate negative transfer. Additionally, CSSSTN achieves competitive results with minimal target data, reducing calibration time and effort. These findings highlight the practical potential of CSSSTN for real-world BCI applications, offering a robust and scalable solution to improve the performance of BCI-illiterate users while minimizing reliance on extensive training data. Our code is available at https://github.com/ziyuey/CSSSTN.

fMRI

Publish Date Title Authors URL Abstract
2025-02-26 Efficient 4D fMRI ASD Classification using Spatial-Temporal-Omics-based Learning Framework Ziqiao Weng, Weidong Cai, Bo Zhou Link Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder impacting social and behavioral development. Resting-state fMRI, a non-invasive tool for capturing brain connectivity patterns, aids in early ASD diagnosis and differentiation from typical controls (TC). However, previous methods, which rely on either mean time series or full 4D data, are limited by a lack of spatial information or by high computational costs. This underscores the need for an efficient solution that preserves both spatial and temporal information. In this paper, we propose a novel, simple, and efficient spatial-temporal-omics learning framework designed to efficiently extract spatio-temporal features from fMRI for ASD classification. Our approach addresses these limitations by utilizing 3D time-domain derivatives as the spatial-temporal inter-voxel omics, which preserve full spatial resolution while capturing diverse statistical characteristics of the time series at each voxel. Meanwhile, functional connectivity features serve as the spatial-temporal inter-regional omics, capturing correlations across brain regions. Extensive experiments and ablation studies on the ABIDE dataset demonstrate that our framework significantly outperforms previous methods while maintaining computational efficiency. We believe our research offers valuable insights that will inform and advance future ASD studies, particularly in the realm of spatial-temporal-omics-based learning.
2025-02-26 Zero Echo Time Functional MRI in Humans Ali Caglar Özen, Shuai Liu, Serhat Ilbey, Michael Bock, Uzay Emir, Yen-Yu Ian Shih Link Motivation: Conventional echo planar imaging(EPI) based functional MRI(fMRI) uses the BOLD contrast to map activity changes in human brains. Introducing an efficient ZTE sequence for functional brain mapping can help address limitations of EPI and demonstrate the feasibility of using T1 related changes as a surrogate marker of brain activity. Goals: To test and optimize ZTE sequence for fMRI. Methods: A ZTE sequence with radial inside out spokes was used to prepare a dynamic imaging protocol that matches conventional EPI time course. Temporal SNR and sensitivity to susceptibility differences of ZTE were evaluated and the sequence was benchmarked against BOLD EPI in a task based visual fMRI study with healthy volunteers at 3T. Results: Phantom measurements confirmed sensitivity of the ZTE protocol to the oxygen concentration. Functional activation in primary visual cortex could be detected using ZTE. Resting state networks could also be identified using independent component analysis. Discussion: ZTE-based fMRI is proposed for mapping functional activation in human brain. ZTE is robust against susceptibility artefacts and significantly reduces acoustic noise. Radial sampling pattern allows for high undersampling rates to increase temporal resolution.
2025-02-26 NeuroTree: Hierarchical Functional Brain Pathway Decoding for Mental Health Disorders Jun-En Ding, Dongsheng Luo, Anna Zilverstand, Feng Liu Link Analyzing functional brain networks using functional magnetic resonance imaging (fMRI) is crucial for understanding psychiatric disorders and addictive behaviors. While existing fMRI-based graph convolutional networks (GCNs) show considerable promise for feature extraction, they often fall short in characterizing complex relationships between brain regions and demographic factors and accounting for interpretable variables linked to psychiatric conditions. We propose NeuroTree to overcome these limitations, integrating a k-hop AGE-GCN with neural ordinary differential equations (ODEs). This framework leverages an attention mechanism to optimize functional connectivity (FC), thereby enhancing dynamic FC feature learning for brain disease classification. Furthermore, NeuroTree effectively decodes fMRI network features into tree structures, which improves the capture of high-order brain regional pathway features and enables the identification of hierarchical neural behavioral patterns essential for understanding disease-related brain subnetworks. Our empirical evaluations demonstrate that NeuroTree achieves state-of-the-art performance across two distinct mental disorder datasets and provides valuable insights into age-related deterioration patterns. These findings underscore the model's efficacy in predicting psychiatric disorders and elucidating their underlying neural mechanisms.
2025-02-26 Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation Xin Liu, Ziyue Zhang, Jingxin Nie Link Traditional psychological experiments utilizing naturalistic stimuli face challenges in manual annotation and ecological validity. To address this, we introduce a novel paradigm leveraging multimodal large language models (LLMs) as proxies to extract rich semantic information from naturalistic images through a Visual Question Answering (VQA) strategy for analyzing human visual semantic representation. LLM-derived representations successfully predict established neural activity patterns measured by fMRI (e.g., faces, buildings), validating its feasibility and revealing hierarchical semantic organization across cortical regions. A brain semantic network constructed from LLM-derived representations identifies meaningful clusters reflecting functional and contextual associations. This innovative methodology offers a powerful solution for investigating brain semantic organization with naturalistic stimuli, overcoming limitations of traditional annotation methods and paving the way for more ecologically valid explorations of human cognition.
2025-02-24 Simulation and Harmonic Analysis of k-Space Readout (SHAKER) John C. Bodenschatz, Daniel B. Rowe Link In the realm of neuroimaging research, the demand for efficient and accurate simulation tools for functional magnetic resonance imaging (fMRI) data is ever increasing. We present SHAKER, a comprehensive MATLAB package for simulating complex-valued fMRI time series data that will advance understanding and implementation of the MR signal equation and related physics principles to fMRI simulation. The core objective of the package is to provide researchers with a user-friendly MATLAB graphical user interface (GUI) tool capable of generating complex-valued fMRI time series data. This tool will allow researchers to input various parameters related to the MRI scan and receive simulated k-space data with ease, facilitating a deeper understanding of the intricacies of the generation and interpretation of fMRI data.
2025-02-24 A Survey of fMRI to Image Reconstruction Weiyu Guo, Guoying Sun, JianXiang He, Tong Shao, Shaoguang Wang, Ziyang Chen, Meisheng Hong, Ying Sun, Hui Xiong Link Functional magnetic resonance imaging (fMRI) based image reconstruction plays a pivotal role in decoding human perception, with applications in neuroscience and brain-computer interfaces. While recent advancements in deep learning and large-scale datasets have driven progress, challenges such as data scarcity, cross-subject variability, and low semantic consistency persist. To address these issues, we introduce the concept of fMRI-to-Image Learning (fMRI2Image) and present the first systematic review in this field. This review highlights key challenges, categorizes methodologies such as fMRI signal encoding, feature mapping, and image generator. Finally, promising research directions are proposed to advance this emerging frontier, providing a reference for future studies.
2025-02-23 Category-Selective Neurons in Deep Networks: Comparing Purely Visual and Visual-Language Models Zitong Lu, Yuxin Wang Link Category-selective regions in the human brain, such as the fusiform face area (FFA), extrastriate body area (EBA), parahippocampal place area (PPA), and visual word form area (VWFA), play a crucial role in high-level visual processing. Here, we investigate whether artificial neural networks (ANNs) exhibit similar category-selective neurons and how these neurons vary across model layers and between purely visual and vision-language models. Inspired by fMRI functional localizer experiments, we presented images from different categories (faces, bodies, scenes, words, scrambled scenes, and scrambled words) to deep networks and identified category-selective neurons using statistical criteria. Comparing ResNet and the structurally controlled ResNet-based CLIP model, we found that both models contain category-selective neurons, with their proportion increasing across layers, mirroring category selectivity in higher-level visual brain regions. However, CLIP exhibited a higher proportion but lower specificity of category-selective neurons compared to ResNet. Additionally, CLIP's category-selective neurons were more evenly distributed across feature maps and demonstrated greater representational consistency across layers. These findings suggest that language learning increases the number of category-selective neurons while reducing their selectivity strength, reshaping visual representations in deep networks. Our study provides insights into how ANNs mirror biological vision and how multimodal learning influences category-selective representations.
2025-02-21 Causal Modeling of fMRI Time-series for Interpretable Autism Spectrum Disorder Classification Peiyu Duan, Nicha C. Dvornek, Jiyao Wang, Lawrence H. Staib, James S. Duncan Link Autism spectrum disorder (ASD) is a neurological and developmental disorder that affects social and communicative behaviors. It emerges in early life and is generally associated with lifelong disabilities. Thus, accurate and early diagnosis could facilitate treatment outcomes for those with ASD. Functional magnetic resonance imaging (fMRI) is a useful tool that measures changes in brain signaling to facilitate our understanding of ASD. Much effort is being made to identify ASD biomarkers using various connectome-based machine learning and deep learning classifiers. However, correlation-based models cannot capture the non-linear interactions between brain regions. To solve this problem, we introduce a causality-inspired deep learning model that uses time-series information from fMRI and captures causality among ROIs useful for ASD classification. The model is compared with other baseline and state-of-the-art models with 5-fold cross-validation on the ABIDE dataset. We filtered the dataset by choosing all the images with mean FD less than 15mm to ensure data quality. Our proposed model achieved the highest average classification accuracy of 71.9% and an average AUC of 75.8%. Moreover, the inter-ROI causality interpretation of the model suggests that the left precuneus, right precuneus, and cerebellum are placed in the top 10 ROIs in inter-ROI causality among the ASD population. In contrast, these ROIs are not ranked in the top 10 in the control population. We have validated our findings with the literature and found that abnormalities in these ROIs are often associated with ASD.
2025-02-21 BP-GPT: Auditory Neural Decoding Using fMRI-prompted LLM Xiaoyu Chen, Changde Du, Che Liu, Yizhe Wang, Huiguang He Link Decoding language information from brain signals represents a vital research area within brain-computer interfaces, particularly in the context of deciphering the semantic information from the fMRI signal. Although existing work uses LLM to achieve this goal, their method does not use an end-to-end approach and avoids the LLM in the mapping of fMRI-to-text, leaving space for the exploration of the LLM in auditory decoding. In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). By using the brain representation that is extracted from the fMRI as a prompt, our method can utilize GPT-2 to decode fMRI signals into stimulus text. Further, we introduce the text prompt and align the fMRI prompt to it. By introducing the text prompt, our BP-GPT can extract a more robust brain prompt and promote the decoding of pre-trained LLM. We evaluate our BP-GPT on the open-source auditory semantic decoding dataset and achieve a significant improvement up to 4.61 on METEOR and 2.43 on BERTScore across all the subjects compared to the state-of-the-art method. The experimental results demonstrate that using brain representation as a prompt to further drive LLM for auditory neural decoding is feasible and effective. The code is available at https://github.com/1994cxy/BP-GPT.
2025-02-21 Explanations of Deep Language Models Explain Language Representations in the Brain Maryam Rahimi, Yadollah Yaghoobzadeh, Mohammad Reza Daliri Link Recent advances in artificial intelligence have given rise to large language models (LLMs) that not only achieve human-like performance but also share computational principles with the brain's language processing mechanisms. While previous research has primarily focused on aligning LLMs' internal representations with neural activity, we introduce a novel approach that leverages explainable AI (XAI) methods to forge deeper connections between the two domains. Using attribution methods, we quantified how preceding words contribute to an LLM's next-word predictions and employed these explanations to predict fMRI recordings from participants listening to the same narratives. Our findings demonstrate that attribution methods robustly predict brain activity across the language network, surpassing traditional internal representations in early language areas. This alignment is hierarchical: early-layer explanations correspond to the initial stages of language processing in the brain, while later layers align with more advanced stages. Moreover, the layers more influential on LLM next-word prediction$\unicode{x2014}$those with higher attribution scores$\unicode{x2014}$exhibited stronger alignment with neural activity. This work establishes a bidirectional bridge between AI and neuroscience. First, we demonstrate that attribution methods offer a powerful lens for investigating the neural mechanisms of language comprehension, revealing how meaning emerges from preceding context. Second, we propose using brain alignment as a metric to evaluate the validity of attribution methods, providing a framework for assessing their biological plausibility.

MEG

Publish Date Title Authors URL Abstract
2025-02-24 Forecasting Rare Language Model Behaviors Erik Jones, Meg Tong, Jesse Mu, Mohammed Mahfoud, Jan Leike, Roger Grosse, Jared Kaplan, William Fithian, Ethan Perez, Mrinank Sharma Link Standard language model evaluations can fail to capture risks that emerge only at deployment scale. For example, a model may produce safe responses during a small-scale beta test, yet reveal dangerous information when processing billions of requests at deployment. To remedy this, we introduce a method to forecast potential risks across orders of magnitude more queries than we test during evaluation. We make forecasts by studying each query's elicitation probability -- the probability the query produces a target behavior -- and demonstrate that the largest observed elicitation probabilities predictably scale with the number of queries. We find that our forecasts can predict the emergence of diverse undesirable behaviors -- such as assisting users with dangerous chemical synthesis or taking power-seeking actions -- across up to three orders of magnitude of query volume. Our work enables model developers to proactively anticipate and patch rare failures before they manifest during large-scale deployments.
2025-02-18 Brain-to-Text Decoding: A Non-invasive Approach via Typing Jarod Lévy, Mingfang Zhang, Svetlana Pinet, Jérémy Rapin, Hubert Banville, Stéphane d'Ascoli, Jean-Rémi King Link Modern neuroprostheses can now restore communication in patients who have lost the ability to speak or move. However, these invasive devices entail risks inherent to neurosurgery. Here, we introduce a non-invasive method to decode the production of sentences from brain activity and demonstrate its efficacy in a cohort of 35 healthy volunteers. For this, we present Brain2Qwerty, a new deep learning architecture trained to decode sentences from either electro- (EEG) or magneto-encephalography (MEG), while participants typed briefly memorized sentences on a QWERTY keyboard. With MEG, Brain2Qwerty reaches, on average, a character-error-rate (CER) of 32% and substantially outperforms EEG (CER: 67%). For the best participants, the model achieves a CER of 19%, and can perfectly decode a variety of sentences outside of the training set. While error analyses suggest that decoding depends on motor processes, the analysis of typographical errors suggests that it also involves higher-level cognitive factors. Overall, these results narrow the gap between invasive and non-invasive methods and thus open the path for developing safe brain-computer interfaces for non-communicating patients.
2025-02-18 From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production Mingfang Zhang, Jarod Lévy, Stéphane d'Ascoli, Jérémy Rapin, F. -Xavier Alario, Pierre Bourdillon, Svetlana Pinet, Jean-Rémi King Link Humans effortlessly communicate their thoughts through intricate sequences of motor actions. Yet, the neural processes that coordinate language production remain largely unknown, in part because speech artifacts limit the use of neuroimaging. To elucidate the unfolding of language production in the brain, we investigate with magnetoencephalography (MEG) and electroencephalography (EEG) the neurophysiological activity of 35 skilled typists, while they typed sentences on a keyboard. This approach confirms the hierarchical predictions of linguistic theories: the neural activity preceding the production of each word is marked by the sequential rise and fall of context-, word-, syllable-, and letter-level representations. Remarkably, each of these neural representations is maintained over long time periods within each level of the language hierarchy. This phenomenon results in a superposition of successive representations that is supported by a hierarchy of dynamic neural codes. Overall, these findings provide a precise computational breakdown of the neural dynamics that coordinate the production of language in the human brain.
2025-02-07 Estimated Roadway Segment Traffic Data by Vehicle Class for the United States: A Machine Learning Approach Brittany Antonczak, Meg Fay, Aviral Chawla, Gregory Rowangould Link The Highway Performance Monitoring System, managed by the Federal Highway Administration, provides essential data on average annual daily traffic across U.S. roadways, but it has limited representation of medium- and heavy-duty vehicles on non-interstate roads. This gap limits research and policy analysis on the impacts of truck traffic, especially concerning air quality and public health. To address this, we use random forest regression to estimate medium- and heavy-duty vehicle traffic volumes in areas with sparse data. This results in a more comprehensive dataset, which enables the estimation of traffic density at the census block level as a proxy for traffic-related air pollution exposure. Our high-resolution spatial data products, rigorously validated, provide a more accurate representation of truck traffic and its environmental and health impacts. These datasets are valuable for transportation planning, public health research, and policy decisions aimed at mitigating the effects of truck traffic on vulnerable communities exposed to air pollution.
2025-02-07 Shifting Attention to You: Personalized Brain-Inspired AI Models Stephen Chong Zhao, Yang Hu, Jason Lee, Andrew Bender, Trisha Mazumdar, Mark Wallace, David A. Tovar Link The integration of human and artificial intelligence represents a scientific opportunity to advance our understanding of information processing, as each system offers unique computational insights that can enhance and inform the other. The synthesis of human cognitive principles with artificial intelligence has the potential to produce more interpretable and functionally aligned computational models, while simultaneously providing a formal framework for investigating the neural mechanisms underlying perception, learning, and decision-making through systematic model comparisons and representational analyses. In this study, we introduce personalized brain-inspired modeling that integrates human behavioral embeddings and neural data to align with cognitive processes. We took a stepwise approach, fine-tuning the Contrastive Language-Image Pre-training (CLIP) model with large-scale behavioral decisions, group-level neural data, and finally, participant-level neural data within a broader framework that we have named CLIP-Human-Based Analysis (CLIP-HBA). We found that fine-tuning on behavioral data enhances its ability to predict human similarity judgments while indirectly aligning it with dynamic representations captured via MEG. To further gain mechanistic insights into the temporal evolution of cognitive processes, we introduced a model specifically fine-tuned on millisecond-level MEG neural dynamics (CLIP-HBA-MEG). This model resulted in enhanced temporal alignment with human neural processing while still showing improvement on behavioral alignment. Finally, we trained individualized models on participant-specific neural data, effectively capturing individualized neural dynamics and highlighting the potential for personalized AI systems. These personalized systems have far-reaching implications for the fields of medicine, cognitive research, human-computer interfaces, and AI development.
2025-02-06 Detecting Mild Traumatic Brain Injury with MEG Scan Data: One-vs-K-Sample Tests Jian Zhang, Gary Green Link Magnetoencephalography (MEG) scanner has been shown to be more accurate than other medical devices in detecting mild traumatic brain injury (mTBI). However, MEG scan data in certain spectrum ranges can be skewed, multimodal and heterogeneous which can mislead the conventional case-control analysis that requires the data to be homogeneous and normally distributed within the control group. To meet this challenge, we propose a flexible one-vs-K-sample testing procedure for detecting brain injury for a single-case versus heterogeneous controls. The new procedure begins with source magnitude imaging using MEG scan data in frequency domain, followed by region-wise contrast tests for abnormality between the case and controls. The critical values for these tests are automatically determined by cross-validation. We adjust the testing results for heterogeneity effects by similarity analysis. An asymptotic theory is established for the proposed test statistic. By simulated and real data analyses in the context of neurotrauma, we show that the proposed test outperforms commonly used nonparametric methods in terms of overall accuracy and ability in accommodating data non-normality and subject-heterogeneity.
2025-01-31 Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, Alwin Peng, Raj Agarwal, Cem Anil, Amanda Askell, Nathan Bailey, Joe Benton, Emma Bluemke, Samuel R. Bowman, Eric Christiansen, Hoagy Cunningham, Andy Dau, Anjali Gopal, Rob Gilson, Logan Graham, Logan Howard, Nimit Kalra, Taesung Lee, Kevin Lin, Peter Lofgren, Francesco Mosconi, Clare O'Hara, Catherine Olsson, Linda Petrini, Samir Rajani, Nikhil Saxena, Alex Silverstein, Tanya Singh, Theodore Sumers, Leonard Tang, Kevin K. Troy, Constantin Weisser, Ruiqi Zhong, Giulio Zhou, Jan Leike, Jared Kaplan, Ethan Perez Link Large language models (LLMs) are vulnerable to universal jailbreaks-prompting strategies that systematically bypass model safeguards and enable users to carry out harmful processes that require many model interactions, like manufacturing illegal substances at scale. To defend against these attacks, we introduce Constitutional Classifiers: safeguards trained on synthetic data, generated by prompting LLMs with natural language rules (i.e., a constitution) specifying permitted and restricted content. In over 3,000 estimated hours of red teaming, no red teamer found a universal jailbreak that could extract information from an early classifier-guarded LLM at a similar level of detail to an unguarded model across most target queries. On automated evaluations, enhanced classifiers demonstrated robust defense against held-out domain-specific jailbreaks. These classifiers also maintain deployment viability, with an absolute 0.38% increase in production-traffic refusals and a 23.7% inference overhead. Our work demonstrates that defending against universal jailbreaks while maintaining practical deployment viability is tractable.
2025-01-28 "Ownership, Not Just Happy Talk": Co-Designing a Participatory Large Language Model for Journalism Emily Tseng, Meg Young, Marianne Aubin Le Quéré, Aimee Rinehart, Harini Suresh Link Journalism has emerged as an essential domain for understanding the uses, limitations, and impacts of large language models (LLMs) in the workplace. News organizations face divergent financial incentives: LLMs already permeate newswork processes within financially constrained organizations, even as ongoing legal challenges assert that AI companies violate their copyright. At stake are key questions about what LLMs are created to do, and by whom: How might a journalist-led LLM work, and what can participatory design illuminate about the present-day challenges about adapting ``one-size-fits-all'' foundation models to a given context of use? In this paper, we undertake a co-design exploration to understand how a participatory approach to LLMs might address opportunities and challenges around AI in journalism. Our 20 interviews with reporters, data journalists, editors, labor organizers, product leads, and executives highlight macro, meso, and micro tensions that designing for this opportunity space must address. From these desiderata, we describe the result of our co-design work: organizational structures and functionality for a journalist-controlled LLM. In closing, we discuss the limitations of commercial foundation models for workplace use, and the methodological implications of applying participatory methods to LLM co-design.
2025-01-26 The Advanced Muon Facility: a proposed multi-purpose muon facility at Fermilab Sophie Middleton Link Charged lepton flavor violation (CLFV) is expected in a diverse set of new physics scenarios. The current generation of experiments probe CLFV in the muon sector in three complementary channels: $\mu^-N \rightarrow e^- N$ (Mu2e, COMET), $\mu^+ \rightarrow e^+ \gamma$ (MEG-II), and $\mu^+ \rightarrow e^+e^+e^-$s (Mu3e). These experiments aim to enhance existing limits by several orders-of-magnitude in the coming decade and offer discovery potential to many new physics models. The proposed Advanced Muon Facility (AMF) would be a multi-purpose muon facility based at Fermilab and introduces an innovative approach based on a muon storage ring to enable a full suite of muon CLFV experiments. AMF would host CLFV experiments with sensitivities orders-of-magnitude beyond the present era. In the event of a signal in these currently planned experiments, AMF would enable additional measurements to elucidate the nature of the new physics observed. The design and R$\&$D for AMF is in its infancy. This article outlines the motivations for AMF, detailing on-going R$\&$D efforts, and highlighting potential synergies with the proposed muon collider.
2025-01-28 Scaling laws for decoding images from brain activity Hubert Banville, Yohann Benchetrit, Stéphane d'Ascoli, Jérémy Rapin, Jean-Rémi King Link Generative AI has recently propelled the decoding of images from brain activity. How do these approaches scale with the amount and type of neural recordings? Here, we systematically compare image decoding from four types of non-invasive devices: electroencephalography (EEG), magnetoencephalography (MEG), high-field functional Magnetic Resonance Imaging (3T fMRI) and ultra-high field (7T) fMRI. For this, we evaluate decoding models on the largest benchmark to date, encompassing 8 public datasets, 84 volunteers, 498 hours of brain recording and 2.3 million brain responses to natural images. Unlike previous work, we focus on single-trial decoding performance to simulate real-time settings. This systematic comparison reveals three main findings. First, the most precise neuroimaging devices tend to yield the best decoding performances, when the size of the training sets are similar. However, the gain enabled by deep learning - in comparison to linear models - is obtained with the noisiest devices. Second, we do not observe any plateau of decoding performance as the amount of training data increases. Rather, decoding performance scales log-linearly with the amount of brain recording. Third, this scaling law primarily depends on the amount of data per subject. However, little decoding gain is observed by increasing the number of subjects. Overall, these findings delineate the path most suitable to scale the decoding of images from non-invasive brain recordings.

neuroAI

Publish Date Title Authors URL Abstract
2025-02-22 Brain-Model Evaluations Need the NeuroAI Turing Test Jenelle Feather, Meenakshi Khosla, N. Apurva Ratan Murty, Aran Nayebi Link What makes an artificial system a good model of intelligence? The classical test proposed by Alan Turing focuses on behavior, requiring that an artificial agent's behavior be indistinguishable from that of a human. While behavioral similarity provides a strong starting point, two systems with very different internal representations can produce the same outputs. Thus, in modeling biological intelligence, the field of NeuroAI often aims to go beyond behavioral similarity and achieve representational convergence between a model's activations and the measured activity of a biological system. This position paper argues that the standard definition of the Turing Test is incomplete for NeuroAI, and proposes a stronger framework called the ``NeuroAI Turing Test'', a benchmark that extends beyond behavior alone and \emph{additionally} requires models to produce internal neural representations that are empirically indistinguishable from those of a brain up to measured individual variability, i.e. the differences between a computational model and the brain is no more than the difference between one brain and another brain. While the brain is not necessarily the ceiling of intelligence, it remains the only universally agreed-upon example, making it a natural reference point for evaluating computational models. By proposing this framework, we aim to shift the discourse from loosely defined notions of brain inspiration to a systematic and testable standard centered on both behavior and internal representations, providing a clear benchmark for neuroscientific modeling and AI development.
2025-01-04 Asynchronous Hebbian/anti-Hebbian networks Henrique Reis Aguiar, Matthias H. Hennig Link Lateral inhibition models coupled with Hebbian plasticity have been shown to learn factorised causal representations of input stimuli, for instance, oriented edges are learned from natural images. Currently, these models require the recurrent dynamics to settle into a stable state before weight changes can be applied, which is not only biologically implausible, but also impractical for real-time learning systems. Here, we propose a new Hebbian learning rule which is implemented using plausible biological mechanisms that have been observed experimentally. We find that this rule allows for efficient, time-continuous learning of factorised representations, very similar to the classic noncontinuous Hebbian/anti-Hebbian learning. Furthermore, we show that this rule naturally prevents catastrophic forgetting when stimuli from different distributions are shown sequentially.
2024-11-27 NeuroAI for AI Safety Patrick Mineault, Niccolò Zanichelli, Joanne Zichen Peng, Anton Arkhipov, Eli Bingham, Julian Jara-Ettinger, Emily Mackevicius, Adam Marblestone, Marcelo Mattar, Andrew Payne, Sophia Sanborn, Karen Schroeder, Zenna Tavares, Andreas Tolias Link As AI systems become increasingly powerful, the need for safe AI has become more pressing. Humans are an attractive model for AI safety: as the only known agents capable of general intelligence, they perform robustly even under conditions that deviate significantly from prior experiences, explore the world safely, understand pragmatics, and can cooperate to meet their intrinsic goals. Intelligence, when coupled with cooperation and safety mechanisms, can drive sustained progress and well-being. These properties are a function of the architecture of the brain and the learning algorithms it implements. Neuroscience may thus hold important keys to technical AI safety that are currently underexplored and underutilized. In this roadmap, we highlight and critically evaluate several paths toward AI safety inspired by neuroscience: emulating the brain's representations, information processing, and architecture; building robust sensory and motor systems from imitating brain data and bodies; fine-tuning AI systems on brain data; advancing interpretability using neuroscience methods; and scaling up cognitively-inspired architectures. We make several concrete recommendations for how neuroscience can positively impact AI safety.
2024-11-21 Evaluating Representational Similarity Measures from the Lens of Functional Correspondence Yiqing Bo, Ansh Soni, Sudhanshu Srivastava, Meenakshi Khosla Link Neuroscience and artificial intelligence (AI) both face the challenge of interpreting high-dimensional neural data, where the comparative analysis of such data is crucial for revealing shared mechanisms and differences between these complex systems. Despite the widespread use of representational comparisons and the abundance classes of comparison methods, a critical question remains: which metrics are most suitable for these comparisons? While some studies evaluate metrics based on their ability to differentiate models of different origins or constructions (e.g., various architectures), another approach is to assess how well they distinguish models that exhibit distinct behaviors. To investigate this, we examine the degree of alignment between various representational similarity measures and behavioral outcomes, employing group statistics and a comprehensive suite of behavioral metrics for comparison. In our evaluation of eight commonly used representational similarity metrics in the visual domain -- spanning alignment-based, Canonical Correlation Analysis (CCA)-based, inner product kernel-based, and nearest-neighbor methods -- we found that metrics like linear Centered Kernel Alignment (CKA) and Procrustes distance, which emphasize the overall geometric structure or shape of representations, excelled in differentiating trained from untrained models and aligning with behavioral measures, whereas metrics such as linear predictivity, commonly used in neuroscience, demonstrated only moderate alignment with behavior. These insights are crucial for selecting metrics that emphasize behaviorally meaningful comparisons in NeuroAI research.
2024-10-25 A prescriptive theory for brain-like inference Hadi Vafaii, Dekel Galor, Jacob L. Yates Link The Evidence Lower Bound (ELBO) is a widely used objective for training deep generative models, such as Variational Autoencoders (VAEs). In the neuroscience literature, an identical objective is known as the variational free energy, hinting at a potential unified framework for brain function and machine learning. Despite its utility in interpreting generative models, including diffusion models, ELBO maximization is often seen as too broad to offer prescriptive guidance for specific architectures in neuroscience or machine learning. In this work, we show that maximizing ELBO under Poisson assumptions for general sequence data leads to a spiking neural network that performs Bayesian posterior inference through its membrane potential dynamics. The resulting model, the iterative Poisson VAE (iP-VAE), has a closer connection to biological neurons than previous brain-inspired predictive coding models based on Gaussian assumptions. Compared to amortized and iterative VAEs, iP-VAElearns sparser representations and exhibits superior generalization to out-of-distribution samples. These findings suggest that optimizing ELBO, combined with Poisson assumptions, provides a solid foundation for developing prescriptive theories in NeuroAI.
2024-09-09 Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models Emily Cheng, Richard J. Antonello Link Research has repeatedly demonstrated that intermediate hidden states extracted from large language models are able to predict measured brain response to natural language stimuli. Yet, very little is known about the representation properties that enable this high prediction performance. Why is it the intermediate layers, and not the output layers, that are most capable for this unique and highly general transfer task? In this work, we show that evidence from language encoding models in fMRI supports the existence of a two-phase abstraction process within LLMs. We use manifold learning methods to show that this abstraction process naturally arises over the course of training a language model and that the first "composition" phase of this abstraction process is compressed into fewer layers as training continues. Finally, we demonstrate a strong correspondence between layerwise encoding performance and the intrinsic dimensionality of representations from LLMs. We give initial evidence that this correspondence primarily derives from the inherent compositionality of LLMs and not their next-word prediction properties.
2024-07-22 Predictive Coding Networks and Inference Learning: Tutorial and Survey Björn van Zwol, Ro Jefferson, Egon L. van den Broek Link Recent years have witnessed a growing call for renewed emphasis on neuroscience-inspired approaches in artificial intelligence research, under the banner of NeuroAI. A prime example of this is predictive coding networks (PCNs), based on the neuroscientific framework of predictive coding. This framework views the brain as a hierarchical Bayesian inference model that minimizes prediction errors through feedback connections. Unlike traditional neural networks trained with backpropagation (BP), PCNs utilize inference learning (IL), a more biologically plausible algorithm that explains patterns of neural activity that BP cannot. Historically, IL has been more computationally intensive, but recent advancements have demonstrated that it can achieve higher efficiency than BP with sufficient parallelization. Furthermore, PCNs can be mathematically considered a superset of traditional feedforward neural networks (FNNs), significantly extending the range of trainable architectures. As inherently probabilistic (graphical) latent variable models, PCNs provide a versatile framework for both supervised learning and unsupervised (generative) modeling that goes beyond traditional artificial neural networks. This work provides a comprehensive review and detailed formal specification of PCNs, particularly situating them within the context of modern ML methods. Additionally, we introduce a Python library (PRECO) for practical implementation. This positions PC as a promising framework for future ML innovations.
2023-10-29 Beyond Geometry: Comparing the Temporal Structure of Computation in Neural Circuits with Dynamical Similarity Analysis Mitchell Ostrow, Adam Eisen, Leo Kozachkov, Ila Fiete Link How can we tell whether two neural networks utilize the same internal processes for a particular computation? This question is pertinent for multiple subfields of neuroscience and machine learning, including neuroAI, mechanistic interpretability, and brain-machine interfaces. Standard approaches for comparing neural networks focus on the spatial geometry of latent states. Yet in recurrent networks, computations are implemented at the level of dynamics, and two networks performing the same computation with equivalent dynamics need not exhibit the same geometry. To bridge this gap, we introduce a novel similarity metric that compares two systems at the level of their dynamics, called Dynamical Similarity Analysis (DSA). Our method incorporates two components: Using recent advances in data-driven dynamical systems theory, we learn a high-dimensional linear system that accurately captures core features of the original nonlinear dynamics. Next, we compare different systems passed through this embedding using a novel extension of Procrustes Analysis that accounts for how vector fields change under orthogonal transformation. In four case studies, we demonstrate that our method disentangles conjugate and non-conjugate recurrent neural networks (RNNs), while geometric methods fall short. We additionally show that our method can distinguish learning rules in an unsupervised manner. Our method opens the door to comparative analyses of the essential temporal structure of computation in neural circuits.
2023-05-25 Explaining V1 Properties with a Biologically Constrained Deep Learning Architecture Galen Pogoncheff, Jacob Granley, Michael Beyeler Link Convolutional neural networks (CNNs) have recently emerged as promising models of the ventral visual stream, despite their lack of biological specificity. While current state-of-the-art models of the primary visual cortex (V1) have surfaced from training with adversarial examples and extensively augmented data, these models are still unable to explain key neural properties observed in V1 that arise from biological circuitry. To address this gap, we systematically incorporated neuroscience-derived architectural components into CNNs to identify a set of mechanisms and architectures that comprehensively explain neural activity in V1. We show drastic improvements in model-V1 alignment driven by the integration of architectural components that simulate center-surround antagonism, local receptive fields, tuned normalization, and cortical magnification. Upon enhancing task-driven CNNs with a collection of these specialized components, we uncover models with latent representations that yield state-of-the-art explanation of V1 neural activity and tuning properties. Our results highlight an important advancement in the field of NeuroAI, as we systematically establish a set of architectural components that contribute to unprecedented explanation of V1. The neuroscience insights that could be gleaned from increasingly accurate in-silico models of the brain have the potential to greatly advance the fields of both neuroscience and artificial intelligence.
2024-11-09 A Deep Probabilistic Spatiotemporal Framework for Dynamic Graph Representation Learning with Application to Brain Disorder Identification Sin-Yee Yap, Junn Yong Loo, Chee-Ming Ting, Fuad Noman, Raphael C. -W. Phan, Adeel Razi, David L. Dowe Link Recent applications of pattern recognition techniques on brain connectome classification using functional connectivity (FC) are shifting towards acknowledging the non-Euclidean topology and dynamic aspects of brain connectivity across time. In this paper, a deep spatiotemporal variational Bayes (DSVB) framework is proposed to learn time-varying topological structures in dynamic FC networks for identifying autism spectrum disorder (ASD) in human participants. The framework incorporates a spatial-aware recurrent neural network with an attention-based message passing scheme to capture rich spatiotemporal patterns across dynamic FC networks. To overcome model overfitting on limited training datasets, an adversarial training strategy is introduced to learn graph embedding models that generalize well to unseen brain networks. Evaluation on the ABIDE resting-state functional magnetic resonance imaging dataset shows that our proposed framework substantially outperforms state-of-the-art methods in identifying patients with ASD. Dynamic FC analyses with DSVB-learned embeddings reveal apparent group differences between ASD and healthy controls in brain network connectivity patterns and switching dynamics of brain states. The code is available at https://github.com/Monash-NeuroAI/Deep-Spatiotemporal-Variational-Bayes.

medical

Publish Date Title Authors URL Abstract
2025-02-26 Neutron Beam Shaping by Ghost Projection Andrew M. Kingston, Alaleh Aminzadeh, Jeremy M. C. Brown, Filomena Salvemini, Joseph J. Bevitt, Ulf Garbe, David M. Paganin Link We present a method to shape a neutron beam and project any specified target image using a single universal patterned mask that is transversely displaced. The method relies on "ghost projection", which is a reversed form of classical ghost imaging. A set of sub-mask regions that combine to construct the required beam shape is computed; illumination of each region with the determined exposure time projects the shaped beam. We demonstrate this method experimentally, using the Dingo neutron imaging beamline at the OPAL nuclear research reactor (Australia). The ability to shape a neutron beam "on demand" allows selective dose delivery away from sensitive areas of samples, such as in cultural heritage artifacts. It also benefits irradiation techniques, e.g., in testing resilience of electronic components for space and defense technologies or neutron therapies.
2025-02-26 Deep learning and classical computer vision techniques in medical image analysis: Case studies on brain MRI tissue segmentation, lung CT COPD registration, and skin lesion classification Anyimadu Daniel Tweneboah, Suleiman Taofik Ahmed, Hossain Mohammad Imran Link Medical imaging spans diverse tasks and modalities which play a pivotal role in disease diagnosis, treatment planning, and monitoring. This study presents a novel exploration, being the first to systematically evaluate segmentation, registration, and classification tasks across multiple imaging modalities. Integrating both classical and deep learning (DL) approaches in addressing brain MRI tissue segmentation, lung CT image registration, and skin lesion classification from dermoscopic images, we demonstrate the complementary strengths of these methodologies in diverse applications. For brain tissue segmentation, 3D DL models outperformed 2D and patch-based models, specifically nnU-Net achieving Dice of 0.9397, with 3D U-Net models on ResNet34 backbone, offering competitive results with Dice 0.8946. Multi-Atlas methods provided robust alternatives for cases where DL methods are not feasible, achieving average Dice of 0.7267. In lung CT registration, classical Elastix-based methods outperformed DL models, achieving a minimum Target Registration Error (TRE) of 6.68 mm, highlighting the effectiveness of parameter tuning. HighResNet performed best among DL models with a TRE of 7.40 mm. For skin lesion classification, ensembles of DL models like InceptionResNetV2 and ResNet50 excelled, achieving up to 90.44%, and 93.62% accuracies for binary and multiclass classification respectively. Also, adopting One-vs-All method, DL attained accuracies of 94.64% (mel vs. others), 95.35% (bcc vs. others), and 96.93% (scc vs. others), while ML models specifically Multi-Layer Perceptron (MLP) on handcrafted features offered interpretable alternatives with 85.04% accuracy using SMOTE for class imbalance correction on the multi-class task and 83.27% on the binary-class task. Links to source code are available on request.
2025-02-26 Quantum Annealing Feature Selection on Light-weight Medical Image Datasets Merlin A. Nau, Luca A. Nutricati, Bruno Camino, Paul A. Warburton, Andreas K. Maier Link We investigate the use of quantum computing algorithms on real quantum hardware to tackle the computationally intensive task of feature selection for light-weight medical image datasets. Feature selection is often formulated as a k of n selection problem, where the complexity grows binomially with increasing k and n. As problem sizes grow, classical approaches struggle to scale efficiently. Quantum computers, particularly quantum annealers, are well-suited for such problems, offering potential advantages in specific formulations. We present a method to solve larger feature selection instances than previously presented on commercial quantum annealers. Our approach combines a linear Ising penalty mechanism with subsampling and thresholding techniques to enhance scalability. The method is tested in a toy problem where feature selection identifies pixel masks used to reconstruct small-scale medical images. The results indicate that quantum annealing-based feature selection is effective for this simplified use case, demonstrating its potential in high-dimensional optimization tasks. However, its applicability to broader, real-world problems remains uncertain, given the current limitations of quantum computing hardware.
2025-02-26 HDM: Hybrid Diffusion Model for Unified Image Anomaly Detection Zekang Weng, Jinjin Shi, Jinwei Wang, Zeming Han Link Image anomaly detection plays a vital role in applications such as industrial quality inspection and medical imaging, where it directly contributes to improving product quality and system reliability. However, existing methods often struggle with complex and diverse anomaly patterns. In particular, the separation between generation and discrimination tasks limits the effective coordination between anomaly sample generation and anomaly region detection. To address these challenges, we propose a novel hybrid diffusion model (HDM) that integrates generation and discrimination into a unified framework. The model consists of three key modules: the Diffusion Anomaly Generation Module (DAGM), the Diffusion Discriminative Module (DDM), and the Probability Optimization Module (POM). DAGM generates realistic and diverse anomaly samples, improving their representativeness. DDM then applies a reverse diffusion process to capture the differences between generated and normal samples, enabling precise anomaly region detection and localization based on probability distributions. POM refines the probability distributions during both the generation and discrimination phases, ensuring high-quality samples are used for training. Extensive experiments on multiple industrial image datasets demonstrate that our method outperforms state-of-the-art approaches, significantly improving both image-level and pixel-level anomaly detection performance, as measured by AUROC.
2025-02-26 MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis Daniel Rose, Chia-Chien Hung, Marco Lepri, Israa Alqassem, Kiril Gashteovski, Carolin Lawrence Link Differential Diagnosis (DDx) is a fundamental yet complex aspect of clinical decision-making, in which physicians iteratively refine a ranked list of possible diseases based on symptoms, antecedents, and medical knowledge. While recent advances in large language models have shown promise in supporting DDx, existing approaches face key limitations, including single-dataset evaluations, isolated optimization of components, unrealistic assumptions about complete patient profiles, and single-attempt diagnosis. We introduce a Modular Explainable DDx Agent (MEDDxAgent) framework designed for interactive DDx, where diagnostic reasoning evolves through iterative learning, rather than assuming a complete patient profile is accessible. MEDDxAgent integrates three modular components: (1) an orchestrator (DDxDriver), (2) a history taking simulator, and (3) two specialized agents for knowledge retrieval and diagnosis strategy. To ensure robust evaluation, we introduce a comprehensive DDx benchmark covering respiratory, skin, and rare diseases. We analyze single-turn diagnostic approaches and demonstrate the importance of iterative refinement when patient profiles are not available at the outset. Our broad evaluation demonstrates that MEDDxAgent achieves over 10% accuracy improvements in interactive DDx across both large and small LLMs, while offering critical explainability into its diagnostic reasoning process.
2025-02-26 Zero Echo Time Functional MRI in Humans Ali Caglar Özen, Shuai Liu, Serhat Ilbey, Michael Bock, Uzay Emir, Yen-Yu Ian Shih Link Motivation: Conventional echo planar imaging(EPI) based functional MRI(fMRI) uses the BOLD contrast to map activity changes in human brains. Introducing an efficient ZTE sequence for functional brain mapping can help address limitations of EPI and demonstrate the feasibility of using T1 related changes as a surrogate marker of brain activity. Goals: To test and optimize ZTE sequence for fMRI. Methods: A ZTE sequence with radial inside out spokes was used to prepare a dynamic imaging protocol that matches conventional EPI time course. Temporal SNR and sensitivity to susceptibility differences of ZTE were evaluated and the sequence was benchmarked against BOLD EPI in a task based visual fMRI study with healthy volunteers at 3T. Results: Phantom measurements confirmed sensitivity of the ZTE protocol to the oxygen concentration. Functional activation in primary visual cortex could be detected using ZTE. Resting state networks could also be identified using independent component analysis. Discussion: ZTE-based fMRI is proposed for mapping functional activation in human brain. ZTE is robust against susceptibility artefacts and significantly reduces acoustic noise. Radial sampling pattern allows for high undersampling rates to increase temporal resolution.
2025-02-26 Impact of deep learning model uncertainty on manual corrections to auto-segmentation in prostate cancer radiotherapy Viktor Rogowski, Angelica Svalkvist, Matteo Maspero, Tomas Janssen, Federica Carmen Maruccio, Jenny Gorgisyan, Jonas Scherman, Ida Häggström, Victor Wåhlstrand, Adalsteinn Gunnlaugsson, Martin P Nilsson, Mathieu Moreau, Nándor Vass, Niclas Pettersson, Christian Jamtheim Gustafsson Link Background: Deep learning (DL)-based organ segmentation is increasingly used in radiotherapy, yet voxel-wise DL uncertainty maps are rarely presented to clinicians. Purpose: This study assessed how DL-generated uncertainty maps impact radiation oncologists during manual correction of prostate radiotherapy DL segmentations. Methods: Two nnUNet models were trained by 10-fold cross-validation on 434 MRI-only prostate cancer cases to segment the prostate and rectum. Each model was evaluated on 35 independent cases. Voxel-wise uncertainty was calculated using the SoftMax standard deviation (n=10) and visualized as a color-coded map. Four oncologists performed segmentation in two steps: Step 1: Rated segmentation quality and confidence using Likert scales and edited DL segmentations without uncertainty maps. Step 2 ($\geq 4$ weeks later): Repeated step 1, but with uncertainty maps available. Segmentation time was recorded for both steps, and oncologists provided qualitative free-text feedback. Histogram analysis compared voxel edits across uncertainty levels. Results: DL segmentations showed high agreement with oncologist edits. Quality ratings varied: rectum segmentation ratings slightly decreased overall in step 2, while prostate ratings differed among oncologists. Confidence ratings also varied. Three oncologists reduced segmentation time with uncertainty maps, saving 1-2 minutes per case. Histogram analysis showed 50% fewer edits for step 2 in low-uncertainty areas. Conclusions: Presenting DL segmentation uncertainty information to radiation oncologists influences their decision-making, quality perception, and confidence in the DL segmentations. Low-uncertainty regions were edited less frequently, indicating increased trust in DL predictions. Uncertainty maps improve efficiency by reducing segmentation time and can be a valuable clinical tool, enhancing radiotherapy planning efficiency.
2025-02-26 Bridging Spectroscopy and Advanced Molecular Orientation Analysis with New 4+ Angle Polarization Toolbox in Quasar Callum Gassner, Jitraporn Vongsvivut, Meguya Ryu, Soon Hock Ng, Marko Toplak, Vijayakumar Anand, Pooja Takkalkar, Mary Louise Fac, Natalie A. Sims, Bayden R. Wood, Mark J. Tobin, Saulius Juodkazis, Junko Morikawa Link Anisotropy plays a critical role in governing the mechanical, thermal, electrical, magnetic, and optical properties of materials, influencing their behavior across diverse applications. Probing and quantifying this directional dependence is crucial for advancing materials science and biomedical research, as it provides a deeper understanding of structural orientations at the molecular level, encompassing both scientific and industrial benefits. This study introduces the "4+ Angle Polarization" widget, an innovative extension to the open-source Quasar platform (https://quasar.codes/), tailored for advanced multiple-angle polarization analysis. This toolbox enables precise molecular orientation analysis of complex microspectroscopic datasets through a streamlined workflow. Using polarized Fourier transform infrared (p-FTIR) spectroscopy, we demonstrate its versatility across various sample types, including polylactic acid (PLA) organic crystals, murine cortical bone, and human osteons. By overcoming the limitations of traditional two-angle methods, the widget significantly enhances the accuracy of structural and orientational analysis. This novel analytical tool expands the potential of multiple-angle p-FTIR techniques into advanced characterization of structural anisotropy in heterogeneous systems, providing transformative insights for materials characterization, biomedical imaging and beyond.
2025-02-26 On Aggregation Queries over Predicted Nearest Neighbors Carrie Wang, Sihem Amer-Yahia, Laks V. S. Lakshmanan, Reynold Cheng Link We introduce Aggregation Queries over Nearest Neighbors (AQNNs), a novel type of aggregation queries over the predicted neighborhood of a designated object. AQNNs are prevalent in modern applications where, for instance, a medical professional may want to compute "the average systolic blood pressure of patients whose predicted condition is similar to a given insomnia patient". Since prediction typically involves an expensive deep learning model or a human expert, we formulate query processing as the problem of returning an approximate aggregate by combining an expensive oracle and a cheaper model (e.g, a simple ML model) to compute the predictions. We design the Sampler with Precision-Recall in Target (SPRinT) framework for answering AQNNs. SPRinT consists of sampling, nearest neighbor refinement, and aggregation, and is tailored for various aggregation functions. It enjoys provable theoretical guarantees, including bounds on sample size and on error in approximate aggregates. Our extensive experiments on medical, e-commerce, and video datasets demonstrate that SPRinT consistently achieves the lowest aggregation error with minimal computation cost compared to its baselines. Scalability results show that SPRinT's execution time and aggregation error remain stable as the dataset size increases, confirming its suitability for large-scale applications.
2025-02-25 Rapidly Built Medical Crash Cart! Lessons Learned and Impacts on High-Stakes Team Collaboration in the Emergency Room Angelique Taylor, Tauhid Tanjim, Michael Joseph Sack, Maia Hirsch, Kexin Cheng, Kevin Ching, Jonathan St. George, Thijs Roumen, Malte F. Jung, Hee Rin Lee Link Designing robots to support high-stakes teamwork in emergency settings presents unique challenges, including seamless integration into fast-paced environments, facilitating effective communication among team members, and adapting to rapidly changing situations. While teleoperated robots have been successfully used in high-stakes domains such as firefighting and space exploration, autonomous robots that aid highs-takes teamwork remain underexplored. To address this gap, we conducted a rapid prototyping process to develop a series of seemingly autonomous robot designed to assist clinical teams in the Emergency Room. We transformed a standard crash cart--which stores medical equipment and emergency supplies into a medical robotic crash cart (MCCR). The MCCR was evaluated through field deployments to assess its impact on team workload and usability, identified taxonomies of failure, and refined the MCCR in collaboration with healthcare professionals. Our work advances the understanding of robot design for high-stakes, time-sensitive settings, providing insights into useful MCCR capabilities and considerations for effective human-robot collaboration. By publicly disseminating our MCCR tutorial, we hope to encourage HRI researchers to explore the design of robots for high-stakes teamwork.

About

Daily arxiv updates on NeuroAI related topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages