Medical Metaverse, Part 2: Artificial Intelligence Algorithms and Large Language Models in Psychiatry and Clinical Neurosciences
The medical metaverse and its emerging technologies, such as artificial intelligence (AI), are transforming medical education, neuropsychiatric practice, and clinical neurosciences (6–8). AI enables earlier detection of many diseases and facilitates the clinical management of physical and mental conditions (8). For example, AI algorithms (e.g., deep learning and machine learning) can learn to recognize specific patterns in pathological slides and brain images associated with various diseases and brain disorders, such as Parkinson’s disease and Alzheimer’s disease (9–12).
Large language models (LLMs) are expanding, and their use is becoming more standardized in medicine, including medical education, research, and health care (13, 14) (Figure 1). LLMs power software with AI natural language processing (NLP) and conversational capabilities, commonly called “chatbots” (13). Several clinical LLMs exist; the largest, GatorTron, was trained on more than 90 billion words of text, including more than 82 billion words of deidentified clinical text (13, 15). Florence and its successor Pahola are chatbots designed by the World Health Organization and the Pan American Health Organization that were introduced during the COVID-19 pandemic to foster healthier lifestyles and mental health (16). Woebot, Wysa, and Leora are other mental health–focused chatbots available to individuals who prefer interacting with conversational agents (17).
Chat Generative Pre-trained Transformer (ChatGPT) is a recent LLM application created by OpenAI that enables public users to ask a computer questions in natural, colloquial language (14). Most of the latest generation of NLP technologies are built on the “transformer” architecture, which processes text input and differentially weights the significance of each part of the prompt (4, 18, 19) (Figure 2). The most recent LLMs can generate meaningful, well-reasoned contextual information that may be indistinguishable from text produced by humans (5).
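The differential weighting described above is performed by the transformer’s attention mechanism. A minimal sketch of scaled dot-product self-attention in Python (using NumPy; the token count and embedding size are arbitrary illustrative choices, not details from the cited work):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: each row becomes a probability distribution
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Weight each value vector by how relevant its key is to each query.

    Q, K, V: arrays of shape (sequence_length, d_model).
    Returns the attended output and the attention-weight matrix.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise relevance between tokens
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy example: 3 "tokens" with 4-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(X, X, X)  # self-attention
print(w)  # row i shows how much token i attends to each other token
```

Each row of the weight matrix is the “differential weighting” of the prompt: tokens judged more relevant to a given position contribute more to that position’s output representation.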
The latest version, ChatGPT-4, draws on the wealth of information and documents available on the World Wide Web. Its algorithm retrieves online data and, within seconds, produces well-supported essays, reports, critical evaluations, and research articles (5). A recent study evaluating ChatGPT in medical education reported that this technology can achieve passing scores comparable to those of a third-year medical student on Steps 1 and 2 of the United States Medical Licensing Examination (USMLE) (20). Another study assessed its performance on Steps 1–3 of the USMLE and found that ChatGPT can score near the passing threshold for all three examinations without prior training or specialized support (21). In addition, when tested on a clinical toxicology case involving organophosphate poisoning, ChatGPT’s responses were appropriate and provided good explanations of the underlying clinical reasoning (22). These results suggest that ChatGPT could become a supportive learning tool in medical education and practice, among other clinical applications (20–22).
In a research context, ChatGPT can facilitate reviewing and writing scientific articles by drawing on the evidence retrievable through thousands of online search engines (23). This technology could transform scientific and medical writing by saving time and increasing efficiency (24). At the same time, there are growing concerns about ChatGPT’s potential role in plagiarism and its impact on academic research (5, 25). Moreover, ChatGPT-generated scientific papers may lack clinical reasoning and critical thinking (23), and ChatGPT cannot yet generate documents with original, logical, and customized text (i.e., personalized phrases) (25).
In addition, advances in AI, including ChatGPT, could assist in the design, development, and safety assessment of new drugs (26). These technologies have the potential to analyze chemical formulas and molecular algorithms, fostering the development of new biochemical compounds and formulations leading to the discovery of new medications (26).
AI in Psychiatry and Clinical Neurosciences
ChatGPT and similar AI applications are becoming important training tools for medical students and residents in neurology and can be beneficial for those in psychiatry and clinical neurosciences (7, 27–29). However, the use of technology in the general mental health field has largely been limited to brain imaging and other routine diagnostic screening tools (e.g., blood tests and urinalysis) (6–8).
Neurological Conditions
Some AI technologies are being integrated into clinical use for the rapid detection of disease, for disease management, and for the treatment of physical and neurological conditions (8, 12). For example, AI segmentation and quantification are used to assess neuroimaging data (12). Deep-learning and machine-learning algorithms can facilitate the interpretation of computed tomography scans among patients with traumatic brain injury or other neurological conditions (12, 30). AI algorithms can gather individualized patient data (e.g., genetic profiles, patient history, and response to interventions) to help tailor treatment plans, supporting a more personalized therapeutic approach (29). In addition, AI-supported technologies (e.g., wearables) can document interventional progress in real time, providing continuous physiological feedback (e.g., electrocardiogram [ECG] and digital phenotyping), which facilitates prompt and appropriate therapeutic adjustments when needed (31).
AI algorithms have also been integrated into the diagnosis, prognosis, and monitoring of motor neuron diseases, which typically involve progressive dysfunction of lower or upper motor neurons within the central nervous system (32, 33). One study used machine learning and lipidomics to distinguish primary lateral sclerosis (PLS) from amyotrophic lateral sclerosis (ALS): supervised machine-learning analysis of lipidome profiles accurately distinguished patients with PLS (specificity >88%) from patients with ALS and healthy control participants (34). A recent review highlighted the impact of machine learning and other AI advances for patients with motor neuron diseases (33).
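To illustrate the general shape of such a supervised analysis (this is not the cited study’s pipeline; the “lipid features,” group labels, and separability below are synthetic inventions for demonstration), a classifier can be trained on feature profiles and evaluated by specificity:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for lipidome feature profiles: two groups whose
# feature means differ (purely illustrative, not real patient data).
rng = np.random.default_rng(42)
n, d = 200, 10
X = np.vstack([rng.normal(0.0, 1.0, size=(n, d)),   # group 0 (e.g., "ALS")
               rng.normal(0.8, 1.0, size=(n, d))])  # group 1 (e.g., "PLS")
y = np.array([0] * n + [1] * n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Specificity = true negatives / (true negatives + false positives),
# i.e., how reliably group-0 cases are NOT mislabeled as group 1.
tn = np.sum((pred == 0) & (y_te == 0))
fp = np.sum((pred == 1) & (y_te == 0))
spec = tn / (tn + fp)
print(f"specificity: {spec:.2f}")
```

Reporting specificity, as the cited study does, matters clinically because it quantifies how rarely the model mislabels one diagnostic group as the other.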
The ability of LLMs to analyze large data sets facilitates the identification and compilation of relevant clinical evidence among patients with epilepsy, such as patient subgroups, seizure patterns, best interventional options, and other treatment parameters (35, 36). For example, LLMs can facilitate identification of individuals who are early candidates for resective epilepsy surgery. These analyses may mitigate issues such as surgical delays or procedural inadequacies and thus improve patient outcomes and save lives (37).
Neurodegenerative Diseases
LLMs may also have an important impact on the early detection and treatment of neurodegenerative diseases (12). Traditional diagnostic tools (e.g., patient history and brain scans) are inadequate for predicting which patients who meet criteria for mild cognitive impairment will eventually develop Alzheimer’s disease (38). AI technologies, however, can examine comprehensive sets of multimodal clinical data, including cognitive test results and neuropathology data from large cohorts, to generate improved predictive models of neurodegenerative diseases (39). One study used data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database to develop a model of continuous measurements (i.e., change in ADNI–Memory scores) of progression to Alzheimer’s disease. The results suggested that machine-learning algorithms stratify individuals by prognostic disease trajectory better than binary (yes-or-no) classification does, reducing misclassification (40).
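The contrast between continuous-trajectory modeling and binary classification can be sketched as follows (a toy illustration with fabricated data; the features, weights, and decline score merely stand in for baseline measurements and change in a memory score, and this is not the cited study’s model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

# Fabricated baseline features and a continuous decline score
# (illustrative stand-in for a cognitive-change measure; not real data).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
true_w = np.array([1.0, -0.5, 0.3, 0.0, 0.8])
decline = X @ true_w + rng.normal(scale=0.5, size=300)  # continuous outcome

# Binary framing discards information: threshold into "converter" yes/no.
converter = (decline > 0).astype(int)

reg = Ridge().fit(X, decline)                 # models the full trajectory
clf = LogisticRegression().fit(X, converter)  # models only the dichotomy

r2 = reg.score(X, decline)
print(f"R^2 of continuous model: {r2:.2f}")
```

The regressor places each individual on a continuum of predicted decline, so borderline cases near the threshold are not forced into an all-or-none label — the informational advantage the cited study attributes to continuous modeling.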
A more recent study described an innovative multivariable machine-learning model for differentiating abnormal neuroimaging profiles on 123I-ioflupane single-photon emission computed tomography images in the differential diagnosis of Parkinson’s syndrome, Parkinson’s disease, and dementia with Lewy bodies (41). The machine-learning method demonstrated high diagnostic accuracy (0.86–0.93) for differentiating each condition, compared with conventional methods based on specific binding ratios calculated from regions of interest on the scans (41).
Depression, Anxiety, and Suicide Prevention
Machine-learning analysis of ECG signals can enhance initial screening for major depressive disorder (42). In a study that used machine-learning algorithms to analyze polysomnographic data and ECG signals, the AI-assisted model was highly accurate (86.32%) and specific (86.49%) in predicting major depressive disorder, and the analysis suggested that gender was among the most important predictive factors (42).
Mental health chatbots are becoming increasingly valuable for individuals with depression and anxiety disorders (17). For example, Leora is a sophisticated version of ChatGPT capable of communicating with users about their mental health status and providing immediate help for those with minimal to mild symptoms of anxiety and depression. Chatbots like Leora may improve access, provide uninterrupted support, and triage individuals who are unwilling or unable to see mental health therapists (17). Importantly, integration of AI in psychiatry holds promise for assessments of early depression, interventions, and suicide prevention (17, 43, 44). For example, NLP algorithms can analyze online interactions (e.g., posts and discussions in social media platforms) to identify patterns and vulnerable emotional states associated with increased risk of self-harm (44).
Substance Use Disorders
Excessive alcohol consumption and associated mental and public health problems (e.g., substance use disorders, traffic accidents, and alcohol-related violence) could be reduced by emerging advances in AI technologies (45–47). For example, chatbots can facilitate alcohol education among consumers, empowering them to make better decisions about alcohol consumption (45). AI algorithms could also offer an alternative to traditional breathalyzers (which measure blood alcohol concentration) for identifying intoxicated individuals (46, 48): audio-based deep-learning algorithms can predict an individual’s intoxication status within seconds from speech recordings (46). Furthermore, preliminary evidence suggests that AI algorithms may assist in predicting the risks and outcomes of substance use (47, 49, 50).
Conclusions
In summary, LLMs may represent a revolution in medicine, including psychiatry and clinical neurosciences. However, the impact of AI on these fields remains uncertain. AI models appear promising in medical education, clinical training, and basic and translational research. Nonetheless, there is a lack of dissemination of AI-related resources among medical school faculty and clinicians and, consequently, insufficient integration of these technologies into medical curricula. These technologies also have a role in fields that rely on analyses of large data sets (e.g., medical records, biostatistics, and medical imaging). Taken together, these technologies have the potential to improve the delivery of mental health services and clinical outcomes.
The integration of AI innovations into mental health clinical practice may increase efficiency and reduce the clinical caseload of doctors, therapists, and ancillary health care professionals. These innovations could thus allow care providers to focus on higher-level functions requiring human judgment. AI models, such as ChatGPT, may function best as supporting tools, not as substitutes for physicians and other medical professionals. Furthermore, because the progression of AI, LLMs, and similar technologies seems unstoppable in the metaverse era, governmental and health care agencies need to develop appropriate guidelines, including strategies to mitigate potential risks (e.g., data breaches and unauthorized access) and undesirable outcomes. For example, overreliance on AI and digital automation technologies could reduce human empathy, creativity, reasoning, and emotional expression, affecting social skills and family and peer interactions.
References
1. Attention Is All You Need. Ithaca, NY, arXiv, 2017. http://arxiv.org/abs/1706.03762
2. Improving Language Understanding by Generative Pre-Training. Seattle, Semantic Scholar, 2018. https://www.semanticscholar.org/paper/Improving-Language-Understanding-by-Generative-Radford-Narasimhan/cd18800a0fe0b668a1cc19f2ec95b5003d0a5035
3. Language Models Are Unsupervised Multitask Learners. Seattle, Semantic Scholar, 2019. https://www.semanticscholar.org/paper/Language-Models-are-Unsupervised-Multitask-Learners-Radford-Wu/9405cc0d6169988371b2755e573cc28650d14dfe
4. ChatGPT: jack of all trades, master of none. Inf Fusion 2023; 99:101861
5. Intelligence or artificial intelligence? More hard problems for authors of Biological Psychology, the neurosciences, and everyone else. Biol Psychol 2023; 181:108590
6. The medical metaverse, part 1: introduction, definitions, and new horizons for neuropsychiatry. J Neuropsychiatry Clin Neurosci 2023; 35:A4, 1–3
7. AI and psychiatry: the ChatGPT perspective. Alpha Psychiatry 2023; 24:41–42
8. Artificial intelligence and psychiatry: an overview. Asian J Psychiatr 2022; 70:103021
9. Deep learning regressor model based on nigrosome MRI in Parkinson syndrome effectively predicts striatal dopamine transporter-SPECT uptake. Neuroradiology 2023; 65:1101–1109
10. Digital pathology and artificial intelligence. Lancet Oncol 2019; 20:e253–e261
11. Estimating explainable Alzheimer’s disease likelihood map via clinically-guided prototype learning. NeuroImage 2023; 273:120073
12. Diverse applications of artificial intelligence in neuroradiology. Neuroimaging Clin N Am 2020; 30:505–516
13. AI chatbots not yet ready for clinical use. Front Digit Health 2023; 5:1161098
14. The role of ChatGPT, generative language models, and artificial intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ 2023; 9:e46885
15. GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records. Ithaca, NY, arXiv, 2022. http://arxiv.org/abs/2203.03540
16. Using AI to Lead a Healthier Lifestyle. Geneva, World Health Organization, 2022. https://www.who.int/campaigns/Florence
17. Providing self-led mental health support through an artificial intelligence-powered chat bot (Leora) to meet the demand of mental health care. J Med Internet Res 2023; 25:e46448
18. Large Language Models Encode Clinical Knowledge. Ithaca, NY, arXiv, 2022. http://arxiv.org/abs/2212.13138
19. BioMedLM: A Domain-Specific Large Language Model for Biomedical Text. San Francisco, MosaicML, 2022. https://www.mosaicml.com/blog/introducing-pubmed-gpt
20. How does ChatGPT perform on the United States medical licensing examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ 2023; 9:e45312
21. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health 2023; 2:e0000198
22. ChatGPT in clinical toxicology. JMIR Med Educ 2023; 9:e46876
23. The future of medical education and research: is ChatGPT a blessing or blight in disguise? Med Educ Online 2023; 28:2181052
24. ChatGPT and the future of medical writing. Radiology 2023; 307:e223312
25. Generative artificial intelligence as a plagiarism problem. Biol Psychol 2023; 181:108621
26. Developing role for artificial intelligence in drug discovery in drug design, development, and safety assessment. Chem Res Toxicol 2022; 35:1925–1928
27. Artificial intelligence and the future of psychiatry. IEEE Pulse 2020; 11:2–6
28. Evaluating the limits of AI in medical specialisation: ChatGPT’s performance on the UK neurology specialty certificate examination. BMJ Neurol Open 2023; 5:e000451
29. Machine learning approaches for clinical psychology and psychiatry. Annu Rev Clin Psychol 2018; 14:91–118
30. AI-based decision support system for traumatic brain injury: a survey. Diagnostics 2023; 13:1640
31. Digital phenotyping: technology for a new science of behavior. JAMA 2017; 318:1215–1216
32. Motor neuron susceptibility in ALS/FTD. Front Neurosci 2019; 13:532
33. Implications of artificial intelligence algorithms in the diagnosis and treatment of motor neuron diseases: a review. Life 2023; 13:1031
34. Utilizing machine learning and lipidomics to distinguish primary lateral sclerosis from amyotrophic lateral sclerosis. Muscle Nerve 2023; 67:306–310
35. Are AI language models such as ChatGPT ready to improve the care of individuals with epilepsy? Epilepsia 2023; 64:1195–1199
36. Development of a natural language processing algorithm to extract seizure types and frequencies from the electronic health record. Seizure 2022; 101:48–51
37. Prospective validation of a machine learning model that uses provider notes to identify candidates for resective epilepsy surgery. Epilepsia 2020; 61:39–48
38. A deep learning model for early prediction of Alzheimer’s disease dementia based on hippocampal magnetic resonance imaging data. Alzheimers Dement 2019; 15:1059–1070
39. Building better biomarkers: brain models in translational neuroimaging. Nat Neurosci 2017; 20:365–377
40. Modelling prognostic trajectories of cognitive decline due to Alzheimer’s disease. Neuroimage Clin 2020; 26:102199
41. Diagnosis of Parkinson syndrome and Lewy-body disease using 123I-ioflupane images and a model with image features based on machine learning. Ann Nucl Med 2022; 36:765–776
42. Identification of major depression patients using machine learning models based on heart rate variability during sleep stages for pre-hospital screening. Comput Biol Med 2023; 162:107060
43. MHA: a multimodal hierarchical attention model for depression detection in social media. Health Inf Sci Syst 2023; 11:6
44. Artificial intelligence and suicide prevention: a systematic review. Eur Psychiatry 2022; 65:1–22
45. Using the Pan American Health Organization digital conversational agent to educate the public on alcohol use and health: preliminary analysis. JMIR Form Res 2023; 7:e43165
46. Audio-based deep learning algorithm to identify alcohol inebriation (ADLAIA). Alcohol 2023; 109:49–54
47. Analysis of substance use and its outcomes by machine learning, I: childhood evaluation of liability to substance use disorder. Drug Alcohol Depend 2020; 206:107605
48. The accuracy and promise of personal breathalysers for research: steps toward a cost-effective reliable measure of alcohol intoxication? Digit Health 2017; 3:2055207617746752
49. Analysis of substance use and its outcomes by machine learning, II: derivation and prediction of the trajectory of substance use severity. Drug Alcohol Depend 2020; 206:107604
50. Development and evaluation of a risk algorithm predicting alcohol dependence after early onset of regular alcohol use. Addiction 2023; 118:954–966