new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Apr 21

Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments

Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking keyboards is important for improving the quality of life for such individuals. Recent advances in neural networks of natural language pose new opportunities for re-thinking strategies and user interfaces for enhanced text-entry for AAC users. In this paper, we present SpeakFaster, consisting of large language models (LLMs) and a co-designed user interface for text entry in a highly-abbreviated form, allowing saving 57% more motor actions than traditional predictive keyboards in offline simulation. A pilot study with 19 non-AAC participants typing on a mobile device by hand demonstrated gains in motor savings in line with the offline simulation, while introducing relatively small effects on overall typing speed. Lab and field testing on two eye-gaze typing users with amyotrophic lateral sclerosis (ALS) demonstrated text-entry rates 29-60% faster than traditional baselines, due to significant saving of expensive keystrokes achieved through phrase and word predictions from context-aware LLMs. These findings provide a strong foundation for further exploration of substantially-accelerated text communication for motor-impaired users and demonstrate a direction for applying LLMs to text-based user interfaces.

  • 16 authors
·
Dec 3, 2023 2

Applying Dimensionality Reduction as Precursor to LSTM-CNN Models for Classifying Imagery and Motor Signals in ECoG-Based BCIs

Motor impairments, frequently caused by neurological incidents like strokes or traumatic brain injuries, present substantial obstacles in rehabilitation therapy. This research aims to elevate the field by optimizing motor imagery classification algorithms within Brain-Computer Interfaces (BCIs). By improving the efficiency of BCIs, we offer a novel approach that holds significant promise for enhancing motor rehabilitation outcomes. Utilizing unsupervised techniques for dimensionality reduction, namely Uniform Manifold Approximation and Projection (UMAP) coupled with K-Nearest Neighbors (KNN), we evaluate the necessity of employing supervised methods such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNNs) for classification tasks. Importantly, participants who exhibited high KNN scores following UMAP dimensionality reduction also achieved high accuracy in supervised deep learning (DL) models. Due to individualized model requirements and massive neural training data, dimensionality reduction becomes an effective preprocessing step that minimizes the need for extensive data labeling and supervised deep learning techniques. This approach has significant implications not only for targeted therapies in motor dysfunction but also for addressing regulatory, safety, and reliability concerns in the rapidly evolving BCI field.

  • 1 authors
·
Nov 22, 2023

Graph AI generates neurological hypotheses validated in molecular, organoid, and clinical systems

Neurological diseases are the leading global cause of disability, yet most lack disease-modifying treatments. We present PROTON, a heterogeneous graph transformer that generates testable hypotheses across molecular, organoid, and clinical systems. To evaluate PROTON, we apply it to Parkinson's disease (PD), bipolar disorder (BD), and Alzheimer's disease (AD). In PD, PROTON linked genetic risk loci to genes essential for dopaminergic neuron survival and predicted pesticides toxic to patient-derived neurons, including the insecticide endosulfan, which ranked within the top 1.29% of predictions. In silico screens performed by PROTON reproduced six genome-wide α-synuclein experiments, including a split-ubiquitin yeast two-hybrid system (normalized enrichment score [NES] = 2.30, FDR-adjusted p < 1 times 10^{-4}), an ascorbate peroxidase proximity labeling assay (NES = 2.16, FDR < 1 times 10^{-4}), and a high-depth targeted exome sequencing study in 496 synucleinopathy patients (NES = 2.13, FDR < 1 times 10^{-4}). In BD, PROTON predicted calcitriol as a candidate drug that reversed proteomic alterations observed in cortical organoids derived from BD patients. In AD, we evaluated PROTON predictions in health records from n = 610,524 patients at Mass General Brigham, confirming that five PROTON-predicted drugs were associated with reduced seven-year dementia risk (minimum hazard ratio = 0.63, 95% CI: 0.53-0.75, p < 1 times 10^{-7}). PROTON generated neurological hypotheses that were evaluated across molecular, organoid, and clinical systems, defining a path for AI-driven discovery in neurological disease.

  • 29 authors
·
Dec 13, 2025

SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting

Chronic Obstructive Pulmonary Disease (COPD), a major chronic respiratory disease with persistent airflow limitation, is a leading global cause of disability and mortality. Respiratory spirogram time series, routinely collected during pulmonary function tests (PFTs), play a critical role in the early detection of repsiratory diseases and in monitoring lung function over time. However, most current AI models for COPD diagnosis are limited to outputting classification results without providing a rationale for their diagnostic process, while current Large Language Models (LLMs) cannot understand spirograms yet, which severely limits their clinical trust and adoption. To tackle this challenge, we leverage a cohort of 234,028 individuals from the UK Biobank (UKB) to propose SpiroLLM, the first multimodal large language model that can understand spirogram. The model extracts morphological features from respiratory curves via a SpiroEncoder and aligns them with PFT numerical values in a unified latent space using a SpiroProjector, ultimately empowering a large language model to generate a comprehensive diagnostic report. Experimental results confirm that SpiroLLM achieved a diagnostic AUROC of 0.8980 (95% CI: 0.8820-0.9132). In a robustness test with missing core data, it maintained a 100% valid response rate, far surpassing the 13.4% of a text-only model and showcasing the superiority of its multimodal design. This work demonstrates the substantial potential of deeply fusing physiological signals with large language models, establishing a new paradigm for the next generation of interpretable and reliable clinical decision support tools.

  • 8 authors
·
Jul 21, 2025

High-density Electromyography for Effective Gesture-based Control of Physically Assistive Mobile Manipulators

Injury to the cervical spinal cord can cause quadriplegia, impairing muscle function in all four limbs. People with impaired hand function and mobility encounter significant difficulties in carrying out essential self-care and household tasks. Despite the impairment of their neural drive, their volitional myoelectric activity is often partially preserved. High-density electromyography (HDEMG) can detect this myoelectric activity, which can serve as control inputs to assistive devices. Previous HDEMG-controlled robotic interfaces have primarily been limited to controlling table-mounted robot arms. These have constrained reach capabilities. Instead, the ability to control mobile manipulators, which have no such workspace constraints, could allow individuals with quadriplegia to perform a greater variety of assistive tasks, thus restoring independence and reducing caregiver workload. In this study, we introduce a non-invasive wearable HDEMG interface with real-time myoelectric hand gesture recognition, enabling both coarse and fine control over the intricate mobility and manipulation functionalities of an 8 degree-of-freedom mobile manipulator. Our evaluation, involving 13 participants engaging in challenging self-care and household activities, demonstrates the potential of our wearable HDEMG system to profoundly enhance user independence by enabling non-invasive control of a mobile manipulator.

  • 4 authors
·
Dec 12, 2023

The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

Background:Speech patterns have emerged as potential diagnostic markers for conditions with varying etiologies. Machine learning (ML) presents an opportunity to harness these patterns for accurate disease diagnosis. Objective: This review synthesized findings from studies exploring ML's capability in leveraging speech for the diagnosis of neurological, laryngeal and mental disorders. Methods: A systematic examination of 564 articles was conducted with 91 articles included in the study, which encompassed a wide spectrum of conditions, ranging from voice pathologies to mental and neurological disorders. Methods for speech classifications were assessed based on the relevant studies and scored between 0-10 based on the reported diagnostic accuracy of their ML models. Results: High diagnostic accuracies were consistently observed for laryngeal disorders, dysarthria, and changes related to speech in Parkinsons disease. These findings indicate the robust potential of speech as a diagnostic tool. Disorders like depression, schizophrenia, mild cognitive impairment and Alzheimers dementia also demonstrated high accuracies, albeit with some variability across studies. Meanwhile, disorders like OCD and autism highlighted the need for more extensive research to ascertain the relationship between speech patterns and the respective conditions. Conclusion: ML models utilizing speech patterns demonstrate promising potential in diagnosing a range of mental, laryngeal, and neurological disorders. However, the efficacy varies across conditions, and further research is needed. The integration of these models into clinical practice could potentially revolutionize the evaluation and diagnosis of a number of different medical conditions.

  • 4 authors
·
Mar 3, 2025

Upper Limb Movement Recognition utilising EEG and EMG Signals for Rehabilitative Robotics

Upper limb movement classification, which maps input signals to the target activities, is a key building block in the control of rehabilitative robotics. Classifiers are trained for the rehabilitative system to comprehend the desires of the patient whose upper limbs do not function properly. Electromyography (EMG) signals and Electroencephalography (EEG) signals are used widely for upper limb movement classification. By analysing the classification results of the real-time EEG and EMG signals, the system can understand the intention of the user and predict the events that one would like to carry out. Accordingly, it will provide external help to the user. However, the noise in the real-time EEG and EMG data collection process contaminates the effectiveness of the data, which undermines classification performance. Moreover, not all patients process strong EMG signals due to muscle damage and neuromuscular disorder. To address these issues, this paper explores different feature extraction techniques and machine learning and deep learning models for EEG and EMG signals classification and proposes a novel decision-level multisensor fusion technique to integrate EEG signals with EMG signals. This system retrieves effective information from both sources to understand and predict the desire of the user, and thus aid. By testing out the proposed technique on a publicly available WAY-EEG-GAL dataset, which contains EEG and EMG signals that were recorded simultaneously, we manage to conclude the feasibility and effectiveness of the novel system.

  • 2 authors
·
Jul 18, 2022

A Machine Learning Approach for Identifying Anatomical Biomarkers of Early Mild Cognitive Impairment

Alzheimer's Disease (AD) is a progressive neurodegenerative disorder that primarily affects the aging population by impairing cognitive and motor functions. Early detection of AD through accessible methodologies like magnetic resonance imaging (MRI) is vital for developing effective interventions to halt or slow the disease's progression. This study aims to perform a comprehensive analysis of machine learning techniques for selecting MRI-based biomarkers and classifying individuals into healthy controls (HC) and unstable controls (uHC) who later show mild cognitive impairment within five years. The research utilizes MRI data from the Alzheimer's Disease Neuroinformatics Initiative (ADNI) and the Open Access Series of Imaging Studies 3 (OASIS-3), focusing on both HC and uHC participants. The study addresses the challenges of imbalanced data by testing classification methods on balanced and unbalanced datasets, and harmonizes data using polynomial regression to mitigate nuisance variables like age, gender, and intracranial volume. Results indicate that Gaussian Naive Bayes and RusBoost classifiers shows an optimal performance, achieving accuracies of up to 76.46% and 72.48% respectively on the ADNI dataset. For the OASIS-3 dataset, Kernel Naive Bayes and RusBoost yield accuracies ranging from 64.66% to 75.71%, improving further in age-matched datasets. Brain regions like the entorhinal cortex, hippocampus, lateral ventricle, and lateral orbitofrontal cortex are identified as significantly impacted during early cognitive decline. Despite limitations such as small sample sizes, the study's harmonization approach enhances the robustness of biomarker selection, suggesting the potential of this semi-automatic machine learning pipeline for early AD detection using MRI.

  • 6 authors
·
May 29, 2024

RISE Controller Tuning and System Identification Through Machine Learning for Human Lower Limb Rehabilitation via Neuromuscular Electrical Stimulation

Neuromuscular electrical stimulation (NMES) has been effectively applied in many rehabilitation treatments of individuals with spinal cord injury (SCI). In this context, we introduce a novel, robust, and intelligent control-based methodology to closed-loop NMES systems. Our approach utilizes a robust control law to guarantee system stability and machine learning tools to optimize both the controller parameters and system identification. Regarding the latter, we introduce the use of past rehabilitation data to build more realistic data-driven identified models. Furthermore, we apply the proposed methodology for the rehabilitation of lower limbs using a control technique named the robust integral of the sign of the error (RISE), an offline improved genetic algorithm optimizer, and neural network models. Although in the literature, the RISE controller presented good results on healthy subjects, without any fine-tuning method, a trial and error approach would quickly lead to muscle fatigue for individuals with SCI. In this paper, for the first time, the RISE controller is evaluated with two paraplegic subjects in one stimulation session and with seven healthy individuals in at least two and at most five sessions. The results showed that the proposed approach provided a better control performance than empirical tuning, which can avoid premature fatigue on NMES-based clinical procedures.

  • 7 authors
·
Jun 28, 2020

Right Prediction, Wrong Reasoning: Uncovering LLM Misalignment in RA Disease Diagnosis

Large language models (LLMs) offer a promising pre-screening tool, improving early disease detection and providing enhanced healthcare access for underprivileged communities. The early diagnosis of various diseases continues to be a significant challenge in healthcare, primarily due to the nonspecific nature of early symptoms, the shortage of expert medical practitioners, and the need for prolonged clinical evaluations, all of which can delay treatment and adversely affect patient outcomes. With impressive accuracy in prediction across a range of diseases, LLMs have the potential to revolutionize clinical pre-screening and decision-making for various medical conditions. In this work, we study the diagnostic capability of LLMs for Rheumatoid Arthritis (RA) with real world patients data. Patient data was collected alongside diagnoses from medical experts, and the performance of LLMs was evaluated in comparison to expert diagnoses for RA disease prediction. We notice an interesting pattern in disease diagnosis and find an unexpected misalignment between prediction and explanation. We conduct a series of multi-round analyses using different LLM agents. The best-performing model accurately predicts rheumatoid arthritis (RA) diseases approximately 95\% of the time. However, when medical experts evaluated the reasoning generated by the model, they found that nearly 68\% of the reasoning was incorrect. This study highlights a clear misalignment between LLMs high prediction accuracy and its flawed reasoning, raising important questions about relying on LLM explanations in clinical settings. LLMs provide incorrect reasoning to arrive at the correct answer for RA disease diagnosis.

  • 7 authors
·
Apr 9, 2025

The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up

We present the findings of "The Alzheimer's Disease Prediction Of Longitudinal Evolution" (TADPOLE) Challenge, which compared the performance of 92 algorithms from 33 international teams at predicting the future trajectory of 219 individuals at risk of Alzheimer's disease. Challenge participants were required to make a prediction, for each month of a 5-year future time period, of three key outcomes: clinical diagnosis, Alzheimer's Disease Assessment Scale Cognitive Subdomain (ADAS-Cog13), and total volume of the ventricles. The methods used by challenge participants included multivariate linear regression, machine learning methods such as support vector machines and deep neural networks, as well as disease progression models. No single submission was best at predicting all three outcomes. For clinical diagnosis and ventricle volume prediction, the best algorithms strongly outperform simple baselines in predictive ability. However, for ADAS-Cog13 no single submitted prediction method was significantly better than random guesswork. Two ensemble methods based on taking the mean and median over all predictions, obtained top scores on almost all tasks. Better than average performance at diagnosis prediction was generally associated with the additional inclusion of features from cerebrospinal fluid (CSF) samples and diffusion tensor imaging (DTI). On the other hand, better performance at ventricle volume prediction was associated with inclusion of summary statistics, such as the slope or maxima/minima of biomarkers. TADPOLE's unique results suggest that current prediction algorithms provide sufficient accuracy to exploit biomarkers related to clinical diagnosis and ventricle volume, for cohort refinement in clinical trials for Alzheimer's disease. However, results call into question the usage of cognitive test scores for patient selection and as a primary endpoint in clinical trials.

  • 96 authors
·
Feb 9, 2020

Anatomical Foundation Models for Brain MRIs

Deep Learning (DL) in neuroimaging has become increasingly relevant for detecting neurological conditions and neurodegenerative disorders. One of the most predominant biomarkers in neuroimaging is represented by brain age, which has been shown to be a good indicator for different conditions, such as Alzheimer's Disease. Using brain age for weakly supervised pre-training of DL models in transfer learning settings has also recently shown promising results, especially when dealing with data scarcity of different conditions. On the other hand, anatomical information of brain MRIs (e.g. cortical thickness) can provide important information for learning good representations that can be transferred to many downstream tasks. In this work, we propose AnatCL, an anatomical foundation model for brain MRIs that i.) leverages anatomical information in a weakly contrastive learning approach, and ii.) achieves state-of-the-art performances across many different downstream tasks. To validate our approach we consider 12 different downstream tasks for the diagnosis of different conditions such as Alzheimer's Disease, autism spectrum disorder, and schizophrenia. Furthermore, we also target the prediction of 10 different clinical assessment scores using structural MRI data. Our findings show that incorporating anatomical information during pre-training leads to more robust and generalizable representations. Pre-trained models can be found at: https://github.com/EIDOSLAB/AnatCL.

  • 4 authors
·
Aug 7, 2024

VIGMA: An Open-Access Framework for Visual Gait and Motion Analytics

Gait disorders are commonly observed in older adults, who frequently experience various issues related to walking. Additionally, researchers and clinicians extensively investigate mobility related to gait in typically and atypically developing children, athletes, and individuals with orthopedic and neurological disorders. Effective gait analysis enables the understanding of the causal mechanisms of mobility and balance control of patients, the development of tailored treatment plans to improve mobility, the reduction of fall risk, and the tracking of rehabilitation progress. However, analyzing gait data is a complex task due to the multivariate nature of the data, the large volume of information to be interpreted, and the technical skills required. Existing tools for gait analysis are often limited to specific patient groups (e.g., cerebral palsy), only handle a specific subset of tasks in the entire workflow, and are not openly accessible. To address these shortcomings, we conducted a requirements assessment with gait practitioners (e.g., researchers, clinicians) via surveys and identified key components of the workflow, including (1) data processing and (2) data analysis and visualization. Based on the findings, we designed VIGMA, an open-access visual analytics framework integrated with computational notebooks and a Python library, to meet the identified requirements. Notably, the framework supports analytical capabilities for assessing disease progression and for comparing multiple patient groups. We validated the framework through usage scenarios with experts specializing in gait and mobility rehabilitation. VIGMA is available at https://github.com/komar41/VIGMA.

  • 5 authors
·
Apr 24, 2025

Plant Disease Detection through Multimodal Large Language Models and Convolutional Neural Networks

Automation in agriculture plays a vital role in addressing challenges related to crop monitoring and disease management, particularly through early detection systems. This study investigates the effectiveness of combining multimodal Large Language Models (LLMs), specifically GPT-4o, with Convolutional Neural Networks (CNNs) for automated plant disease classification using leaf imagery. Leveraging the PlantVillage dataset, we systematically evaluate model performance across zero-shot, few-shot, and progressive fine-tuning scenarios. A comparative analysis between GPT-4o and the widely used ResNet-50 model was conducted across three resolutions (100, 150, and 256 pixels) and two plant species (apple and corn). Results indicate that fine-tuned GPT-4o models achieved slightly better performance compared to the performance of ResNet-50, achieving up to 98.12% classification accuracy on apple leaf images, compared to 96.88% achieved by ResNet-50, with improved generalization and near-zero training loss. However, zero-shot performance of GPT-4o was significantly lower, underscoring the need for minimal training. Additional evaluations on cross-resolution and cross-plant generalization revealed the models' adaptability and limitations when applied to new domains. The findings highlight the promise of integrating multimodal LLMs into automated disease detection pipelines, enhancing the scalability and intelligence of precision agriculture systems while reducing the dependence on large, labeled datasets and high-resolution sensor infrastructure. Large Language Models, Vision Language Models, LLMs and CNNs, Disease Detection with Vision Language Models, VLMs

  • 5 authors
·
Apr 29, 2025 1

SGUQ: Staged Graph Convolution Neural Network for Alzheimer's Disease Diagnosis using Multi-Omics Data

Alzheimer's disease (AD) is a chronic neurodegenerative disorder and the leading cause of dementia, significantly impacting cost, mortality, and burden worldwide. The advent of high-throughput omics technologies, such as genomics, transcriptomics, proteomics, and epigenomics, has revolutionized the molecular understanding of AD. Conventional AI approaches typically require the completion of all omics data at the outset to achieve optimal AD diagnosis, which are inefficient and may be unnecessary. To reduce the clinical cost and improve the accuracy of AD diagnosis using multi-omics data, we propose a novel staged graph convolutional network with uncertainty quantification (SGUQ). SGUQ begins with mRNA and progressively incorporates DNA methylation and miRNA data only when necessary, reducing overall costs and exposure to harmful tests. Experimental results indicate that 46.23% of the samples can be reliably predicted using only single-modal omics data (mRNA), while an additional 16.04% of the samples can achieve reliable predictions when combining two omics data types (mRNA + DNA methylation). In addition, the proposed staged SGUQ achieved an accuracy of 0.858 on ROSMAP dataset, which outperformed existing methods significantly. The proposed SGUQ can not only be applied to AD diagnosis using multi-omics data but also has the potential for clinical decision-making using multi-viewed data. Our implementation is publicly available at https://github.com/chenzhao2023/multiomicsuncertainty.

  • 7 authors
·
Oct 14, 2024

Determining the Difficulties of Students With Dyslexia via Virtual Reality and Artificial Intelligence: An Exploratory Analysis

Learning disorders are neurological conditions that affect the brain's ability to interconnect communication areas. Dyslexic students experience problems with reading, memorizing, and exposing concepts; however the magnitude of these can be mitigated through both therapies and the creation of compensatory mechanisms. Several efforts have been made to mitigate these issues, leading to the creation of digital resources for students with specific learning disorders attending primary and secondary education levels. Conversely, a standard approach is still missed in higher education. The VRAIlexia project has been created to tackle this issue by proposing two different tools: a mobile application integrating virtual reality (VR) to collect data quickly and easily, and an artificial intelligencebased software (AI) to analyze the collected data for customizing the supporting methodology for each student. The first one has been created and is being distributed among dyslexic students in Higher Education Institutions, for the conduction of specific psychological and psychometric tests. The second tool applies specific artificial intelligence algorithms to the data gathered via the application and other surveys. These AI techniques have allowed us to identify the most relevant difficulties faced by the students' cohort. Our different models have obtained around 90\% mean accuracy for predicting the support tools and learning strategies.

  • 6 authors
·
Jan 15, 2024

Deep Convolutional Neural Network based Classification of Alzheimer's Disease using MRI data

Alzheimer's disease (AD) is a progressive and incurable neurodegenerative disease which destroys brain cells and causes loss to patient's memory. An early detection can prevent the patient from further damage of the brain cells and hence avoid permanent memory loss. In past few years, various automatic tools and techniques have been proposed for diagnosis of AD. Several methods focus on fast, accurate and early detection of the disease to minimize the loss to patients mental health. Although machine learning and deep learning techniques have significantly improved medical imaging systems for AD by providing diagnostic performance close to human level. But the main problem faced during multi-class classification is the presence of highly correlated features in the brain structure. In this paper, we have proposed a smart and accurate way of diagnosing AD based on a two-dimensional deep convolutional neural network (2D-DCNN) using imbalanced three-dimensional MRI dataset. Experimental results on Alzheimer Disease Neuroimaging Initiative magnetic resonance imaging (MRI) dataset confirms that the proposed 2D-DCNN model is superior in terms of accuracy, efficiency, and robustness. The model classifies MRI into three categories: AD, mild cognitive impairment, and normal control: and has achieved 99.89% classification accuracy with imbalanced classes. The proposed model exhibits noticeable improvement in accuracy as compared to the state-fo-the-art methods.

  • 6 authors
·
Jan 8, 2021

Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports

Medical images and radiology reports are crucial for diagnosing medical conditions, highlighting the importance of quantitative analysis for clinical decision-making. However, the diversity and cross-source heterogeneity of these data challenge the generalizability of current data-mining methods. Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence (AGI) for computer vision, showcasing their potential in the biomedical domain. In this study, we evaluated the performance of the Gemini, GPT-4, and 4 popular large models for an exhaustive evaluation across 14 medical imaging datasets, including 5 medical imaging categories (dermatology, radiology, dentistry, ophthalmology, and endoscopy), and 3 radiology report datasets. The investigated tasks encompass disease classification, lesion segmentation, anatomical localization, disease diagnosis, report generation, and lesion detection. Our experimental results demonstrated that Gemini-series models excelled in report generation and lesion detection but faces challenges in disease classification and anatomical localization. Conversely, GPT-series models exhibited proficiency in lesion segmentation and anatomical localization but encountered difficulties in disease diagnosis and lesion detection. Additionally, both the Gemini series and GPT series contain models that have demonstrated commendable generation efficiency. While both models hold promise in reducing physician workload, alleviating pressure on limited healthcare resources, and fostering collaboration between clinical practitioners and artificial intelligence technologies, substantial enhancements and comprehensive validations remain imperative before clinical deployment.

  • 14 authors
·
Jul 8, 2024

Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading from Plain Radiographs

Knee osteoarthritis (OA) is one of the highest disability factors in the world. This musculoskeletal disorder is assessed from clinical symptoms, and typically confirmed via radiographic assessment. This visual assessment done by a radiologist requires experience, and suffers from moderate to high inter-observer variability. The recent literature has shown that deep learning methods can reliably perform the OA severity assessment according to the gold standard Kellgren-Lawrence (KL) grading system. However, these methods require large amounts of labeled data, which are costly to obtain. In this study, we propose the Semixup algorithm, a semi-supervised learning (SSL) approach to leverage unlabeled data. Semixup relies on consistency regularization using in- and out-of-manifold samples, together with interpolated consistency. On an independent test set, our method significantly outperformed other state-of-the-art SSL methods in most cases. Finally, when compared to a well-tuned fully supervised baseline that yielded a balanced accuracy (BA) of 70.9pm0.8% on the test set, Semixup had comparable performance -- BA of 71pm0.8% (p=0.368) while requiring 6 times less labeled data. These results show that our proposed SSL method allows building fully automatic OA severity assessment tools with datasets that are available outside research settings.

  • 4 authors
·
Mar 4, 2020

LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models

The prevalence of vision-threatening eye diseases is a significant global burden, with many cases remaining undiagnosed or diagnosed too late for effective treatment. Large vision-language models (LVLMs) have the potential to assist in understanding anatomical information, diagnosing eye diseases, and drafting interpretations and follow-up plans, thereby reducing the burden on clinicians and improving access to eye care. However, limited benchmarks are available to assess LVLMs' performance in ophthalmology-specific applications. In this study, we introduce LMOD, a large-scale multimodal ophthalmology benchmark consisting of 21,993 instances across (1) five ophthalmic imaging modalities: optical coherence tomography, color fundus photographs, scanning laser ophthalmoscopy, lens photographs, and surgical scenes; (2) free-text, demographic, and disease biomarker information; and (3) primary ophthalmology-specific applications such as anatomical information understanding, disease diagnosis, and subgroup analysis. In addition, we benchmarked 13 state-of-the-art LVLM representatives from closed-source, open-source, and medical domains. The results demonstrate a significant performance drop for LVLMs in ophthalmology compared to other domains. Systematic error analysis further identified six major failure modes: misclassification, failure to abstain, inconsistent reasoning, hallucination, assertions without justification, and lack of domain-specific knowledge. In contrast, supervised neural networks specifically trained on these tasks as baselines demonstrated high accuracy. These findings underscore the pressing need for benchmarks in the development and validation of ophthalmology-specific LVLMs.

  • 9 authors
·
Oct 2, 2024

Enhancing End Stage Renal Disease Outcome Prediction: A Multi-Sourced Data-Driven Approach

Objective: To improve prediction of Chronic Kidney Disease (CKD) progression to End Stage Renal Disease (ESRD) using machine learning (ML) and deep learning (DL) models applied to an integrated clinical and claims dataset of varying observation windows, supported by explainable AI (XAI) to enhance interpretability and reduce bias. Materials and Methods: We utilized data about 10,326 CKD patients, combining their clinical and claims information from 2009 to 2018. Following data preprocessing, cohort identification, and feature engineering, we evaluated multiple statistical, ML and DL models using data extracted from five distinct observation windows. Feature importance and Shapley value analysis were employed to understand key predictors. Models were tested for robustness, clinical relevance, misclassification errors and bias issues. Results: Integrated data models outperformed those using single data sources, with the Long Short-Term Memory (LSTM) model achieving the highest AUC (0.93) and F1 score (0.65). A 24-month observation window was identified as optimal for balancing early detection and prediction accuracy. The 2021 eGFR equation improved prediction accuracy and reduced racial bias, notably for African American patients. Discussion: Improved ESRD prediction accuracy, results interpretability and bias mitigation strategies presented in this study have the potential to significantly enhance CKD and ESRD management, support targeted early interventions and reduce healthcare disparities. Conclusion: This study presents a robust framework for predicting ESRD outcomes in CKD patients, improving clinical decision-making and patient care through multi-sourced, integrated data and AI/ML methods. Future research will expand data integration and explore the application of this framework to other chronic diseases.

  • 2 authors
·
Oct 1, 2024

Assessing and Enhancing Large Language Models in Rare Disease Question-answering

Despite the impressive capabilities of Large Language Models (LLMs) in general medical domains, questions remain about their performance in diagnosing rare diseases. To answer this question, we aim to assess the diagnostic performance of LLMs in rare diseases, and explore methods to enhance their effectiveness in this area. In this work, we introduce a rare disease question-answering (ReDis-QA) dataset to evaluate the performance of LLMs in diagnosing rare diseases. Specifically, we collected 1360 high-quality question-answer pairs within the ReDis-QA dataset, covering 205 rare diseases. Additionally, we annotated meta-data for each question, facilitating the extraction of subsets specific to any given disease and its property. Based on the ReDis-QA dataset, we benchmarked several open-source LLMs, revealing that diagnosing rare diseases remains a significant challenge for these models. To facilitate retrieval augmentation generation for rare disease diagnosis, we collect the first rare diseases corpus (ReCOP), sourced from the National Organization for Rare Disorders (NORD) database. Specifically, we split the report of each rare disease into multiple chunks, each representing a different property of the disease, including their overview, symptoms, causes, effects, related disorders, diagnosis, and standard therapies. This structure ensures that the information within each chunk aligns consistently with a question. Experiment results demonstrate that ReCOP can effectively improve the accuracy of LLMs on the ReDis-QA dataset by an average of 8%. Moreover, it significantly guides LLMs to generate trustworthy answers and explanations that can be traced back to existing literature.

  • 10 authors
·
Aug 15, 2024

An Explainable Diagnostic Framework for Neurodegenerative Dementias via Reinforcement-Optimized LLM Reasoning

The differential diagnosis of neurodegenerative dementias is a challenging clinical task, mainly because of the overlap in symptom presentation and the similarity of patterns observed in structural neuroimaging. To improve diagnostic efficiency and accuracy, deep learning-based methods such as Convolutional Neural Networks and Vision Transformers have been proposed for the automatic classification of brain MRIs. However, despite their strong predictive performance, these models find limited clinical utility due to their opaque decision making. In this work, we propose a framework that integrates two core components to enhance diagnostic transparency. First, we introduce a modular pipeline for converting 3D T1-weighted brain MRIs into textual radiology reports. Second, we explore the potential of modern Large Language Models (LLMs) to assist clinicians in the differential diagnosis between Frontotemporal dementia subtypes, Alzheimer's disease, and normal aging based on the generated reports. To bridge the gap between predictive accuracy and explainability, we employ reinforcement learning to incentivize diagnostic reasoning in LLMs. Without requiring supervised reasoning traces or distillation from larger models, our approach enables the emergence of structured diagnostic rationales grounded in neuroimaging findings. Unlike post-hoc explainability methods that retrospectively justify model decisions, our framework generates diagnostic rationales as part of the inference process-producing causally grounded explanations that inform and guide the model's decision-making process. In doing so, our framework matches the diagnostic performance of existing deep learning methods while offering rationales that support its diagnostic conclusions.

  • 6 authors
·
May 26, 2025 2

BRAINS: A Retrieval-Augmented System for Alzheimer's Detection and Monitoring

As the global burden of Alzheimer's disease (AD) continues to grow, early and accurate detection has become increasingly critical, especially in regions with limited access to advanced diagnostic tools. We propose BRAINS (Biomedical Retrieval-Augmented Intelligence for Neurodegeneration Screening) to address this challenge. This novel system harnesses the powerful reasoning capabilities of Large Language Models (LLMs) for Alzheimer's detection and monitoring. BRAINS features a dual-module architecture: a cognitive diagnostic module and a case-retrieval module. The Diagnostic Module utilizes LLMs fine-tuned on cognitive and neuroimaging datasets -- including MMSE, CDR scores, and brain volume metrics -- to perform structured assessments of Alzheimer's risk. Meanwhile, the Case Retrieval Module encodes patient profiles into latent representations and retrieves similar cases from a curated knowledge base. These auxiliary cases are fused with the input profile via a Case Fusion Layer to enhance contextual understanding. The combined representation is then processed with clinical prompts for inference. Evaluations on real-world datasets demonstrate BRAINS effectiveness in classifying disease severity and identifying early signs of cognitive decline. This system not only shows strong potential as an assistive tool for scalable, explainable, and early-stage Alzheimer's disease detection, but also offers hope for future applications in the field.

eliteresearch ELITE Research Lab
·
Nov 4, 2025 1

SSVEP-Based BCI Wheelchair Control System

A brain-computer interface (BCI) is a system that allows a person to communicate or control the surroundings without depending on the brain's normal output pathways of peripheral nerves and muscles. A lot of successful applications have arisen utilizing the advantages of BCI to assist disabled people with so-called assistive technology. Considering using BCI has fewer limitations and huge potential, this project has been proposed to control the movement of an electronic wheelchair via brain signals. The goal of this project is to help disabled people, especially paralyzed people suffering from motor disabilities, improve their life qualities. In order to realize the project stated above, Steady-State Visual Evoked Potential (SSVEP) is involved. It can be easily elicited in the visual cortical with the same frequency as the one is being focused by the subject. There are two important parts in this project. One is to process the EEG signals and another one is to make a visual stimulator using hardware. The EEG signals are processed in Matlab using the algorithm of Butterworth Infinite Impulse Response (IIR) bandpass filter (for preprocessing) and Fast Fourier Transform (FFT) (for feature extraction). Besides, a harmonics-based classification method is proposed and applied in the classification part. Moreover, the design of the visual stimulator combines LEDs as flickers and LCDs as information displayers on one panel. Microcontrollers are employed to control the SSVEP visual stimuli panel. This project is evaluated by subjects with different races and ages. Experimental results show the system is easy to be operated and it can achieve approximately a minimum 1-second time delay. So it demonstrates that this SSVEP-based BCI-controlled wheelchair has a huge potential to be applied to disabled people in the future.

  • 1 authors
·
Jul 12, 2023

Advancing Tabular Stroke Modelling Through a Novel Hybrid Architecture and Feature-Selection Synergy

Brain stroke remains one of the principal causes of death and disability worldwide, yet most tabular-data prediction models still hover below the 95% accuracy threshold, limiting real-world utility. Addressing this gap, the present work develops and validates a completely data-driven and interpretable machine-learning framework designed to predict strokes using ten routinely gathered demographic, lifestyle, and clinical variables sourced from a public cohort of 4,981 records. We employ a detailed exploratory data analysis (EDA) to understand the dataset's structure and distribution, followed by rigorous data preprocessing, including handling missing values, outlier removal, and class imbalance correction using Synthetic Minority Over-sampling Technique (SMOTE). To streamline feature selection, point-biserial correlation and random-forest Gini importance were utilized, and ten varied algorithms-encompassing tree ensembles, boosting, kernel methods, and a multilayer neural network-were optimized using stratified five-fold cross-validation. Their predictions based on probabilities helped us build the proposed model, which included Random Forest, XGBoost, LightGBM, and a support-vector classifier, with logistic regression acting as a meta-learner. The proposed model achieved an accuracy rate of 97.2% and an F1-score of 97.15%, indicating a significant enhancement compared to the leading individual model, LightGBM, which had an accuracy of 91.4%. Our study's findings indicate that rigorous preprocessing, coupled with a diverse hybrid model, can convert low-cost tabular data into a nearly clinical-grade stroke-risk assessment tool.

  • 3 authors
·
May 18, 2025

Decade of Natural Language Processing in Chronic Pain: A Systematic Review

In recent years, the intersection of Natural Language Processing (NLP) and public health has opened innovative pathways for investigating various domains, including chronic pain in textual datasets. Despite the promise of NLP in chronic pain, the literature is dispersed across various disciplines, and there is a need to consolidate existing knowledge, identify knowledge gaps in the literature, and inform future research directions in this emerging field. This review aims to investigate the state of the research on NLP-based interventions designed for chronic pain research. A search strategy was formulated and executed across PubMed, Web of Science, IEEE Xplore, Scopus, and ACL Anthology to find studies published in English between 2014 and 2024. After screening 132 papers, 26 studies were included in the final review. Key findings from this review underscore the significant potential of NLP techniques to address pressing challenges in chronic pain research. The past 10 years in this field have showcased the utilization of advanced methods (transformers like RoBERTa and BERT) achieving high-performance metrics (e.g., F1>0.8) in classification tasks, while unsupervised approaches like Latent Dirichlet Allocation (LDA) and k-means clustering have proven effective for exploratory analyses. Results also reveal persistent challenges such as limited dataset diversity, inadequate sample sizes, and insufficient representation of underrepresented populations. Future research studies should explore multimodal data validation systems, context-aware mechanistic modeling, and the development of standardized evaluation metrics to enhance reproducibility and equity in chronic pain research.

  • 1 authors
·
Dec 19, 2024