Validity and reliability of an instrument evaluating the performance of intelligent chatbot: the Artificial Intelligence Performance Instrument (AIPI).

Publication/Presentation Date

4-1-2024

Abstract

OBJECTIVES: To evaluate the reliability and validity of the Artificial Intelligence Performance Instrument (AIPI).

METHODS: Medical records of patients consulting in otolaryngology were evaluated by physicians and ChatGPT for differential diagnosis, management, and treatment. ChatGPT's performance was rated twice with the AIPI over a 7-day interval to assess test-retest reliability. Internal consistency was evaluated using Cronbach's α. Internal validity was evaluated by comparing the AIPI scores of the clinical cases rated by ChatGPT and two blinded practitioners. Convergent validity was measured by comparing the AIPI score with a modified version of the Ottawa Clinical Assessment Tool (OCAT). Interrater reliability was assessed using Kendall's tau.
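Two of the statistics named above can be computed directly from a matrix of rating data: Cronbach's α for internal consistency and Kendall's tau for interrater reliability. The following is a minimal sketch of such a computation, not the authors' analysis code; the item count, array shapes, and simulated ratings are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the study's actual analysis) of
# Cronbach's alpha and Kendall's tau on simulated AIPI-style rating data.
import numpy as np
from scipy.stats import kendalltau

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_cases, n_items) matrix of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1)        # per-item sample variance
    total_var = item_scores.sum(axis=1).var(ddof=1)    # variance of total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 45 cases rated on an assumed 8-item, 1-5 scale.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(45, 8)).astype(float)

print(f"Cronbach's alpha: {cronbach_alpha(ratings):.3f}")

# Interrater reliability: Kendall's tau between two raters' total scores
# (the second rater is simulated as a noisy copy of the first).
rater_a = ratings.sum(axis=1)
rater_b = rater_a + rng.normal(0, 2, size=45)
tau, p = kendalltau(rater_a, rater_b)
print(f"Kendall's tau: {tau:.3f} (p = {p:.3g})")
```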

RESULTS: Forty-five patients completed the evaluations (28 females). The AIPI Cronbach's α analysis suggested adequate internal consistency (α = 0.754). The test-retest reliability was moderate to strong for the individual items and the total score of the AIPI (r …).
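The test-retest figure reported above is a correlation between the two rating sessions taken 7 days apart; the truncated abstract does not preserve which coefficient was used. A minimal sketch assuming Spearman's rank correlation, with simulated session data:

```python
# Minimal test-retest sketch (assumption: Spearman's rank correlation;
# the abstract's coefficient values are truncated in the source record).
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
session_1 = rng.integers(1, 6, size=45).astype(float)   # day-0 ratings
session_2 = session_1 + rng.normal(0, 0.8, size=45)     # day-7 re-ratings

r, p = spearmanr(session_1, session_2)
print(f"Test-retest r = {r:.3f} (p = {p:.3g})")
```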

CONCLUSIONS: The AIPI is a valid and reliable instrument for assessing the performance of ChatGPT on ear, nose, and throat conditions. Future studies are needed to investigate the usefulness of the AIPI across medicine and surgery, and to evaluate its psychometric properties in these fields.

Volume

281

Issue

4

First Page

2063

Last Page

2079

ISSN

1434-4726

Disciplines

Medicine and Health Sciences

PubMedID

37698703

Department(s)

Department of Surgery, Division of Otolaryngology

Document Type

Article
