Comparison of quality, empathy and readability of physician responses versus chatbot responses to common cerebrovascular neurosurgical questions on a social media platform.

Publication/Presentation Date

8-1-2025

Abstract

BACKGROUND: Social media platforms are utilized by patients prior to scheduling formal consultations and also serve as a means of pursuing second opinions. Cerebrovascular pathologies require regular surveillance and specialized care. In recent years, chatbots have been trained to provide information on neurosurgical conditions. However, their ability to answer questions in vascular neurosurgery has not been evaluated in comparison to physician responses. Ours is a pilot study comparing the accuracy, completeness, empathy, and readability of responses provided by ChatGPT-3.5 (OpenAI, San Francisco) with standard specialist physician responses on social media.

METHODS: We identified the top 50 cerebrovascular questions and their verified physician responses from Reddit. These questions were entered into ChatGPT. Responses were anonymized and rated on a Likert scale by four independent reviewers for accuracy with respect to neurosurgical guidelines, completeness, and empathy. Readability was assessed using standardized indexes (Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, Simple Measure of Gobbledygook (SMOG) Index, Automated Readability Index, and Coleman-Liau Index).

RESULTS: Responses provided by ChatGPT had significantly higher ratings of completeness (median (IQR) 3 (2-3) vs. 2 (1-3)) and empathy (4 (3-5) vs. 2 (1-3)) compared to physician responses, respectively (p < 0.001). Accuracy of healthcare information did not differ significantly (4 (3-4) vs. 4 (3-4), p = 0.752). Physician responses had significantly higher ease of readability and lower grade-level readability compared to ChatGPT (p < 0.001).

CONCLUSION: Our results suggest greater empathy and completeness in the information provided by ChatGPT compared to physicians. However, these responses are written at readability levels above the health literacy of the average American population. Future research could explore incorporating chatbot responses into the drafting of physician responses to provide more balanced answers to healthcare questions.

Volume

255

First Page

108986

Last Page

108986

ISSN

1872-6968

Disciplines

Medicine and Health Sciences

PubMedID

40451125

Department(s)

Department of Surgery

Document Type

Article

Share

COinS