Can Patients Differentiate Responses by Orthopaedic Surgeons and ChatGPT Regarding Total Hip Arthroplasty?
Authors
Prasad A.K.
Lim P.L.
Blackburn A.Z.
Cote M.P.
Ensor D.
Succi M.D.
Alpaugh K.
Melnic C.M.
Sepucha K.R.
Bedair H.S.
Issue Date
2025
Type
Article
Abstract
INTRODUCTION: The digital age offers vast health information, boosting patient health literacy but also fostering challenges such as misinformation. Chatbots driven by large language models, such as Chat Generative Pre-trained Transformer 3 (ChatGPT 3), blur the boundaries between human- and artificial intelligence (AI)-generated content, potentially affecting healthcare literacy. This study aimed to assess the credibility patients perceive in healthcare descriptions written by ChatGPT 3 versus orthopaedic attendings.

METHODS: A cross-sectional survey of 160 patients assessed hip arthroplasty topics of increasing complexity: (1) "What is hip arthritis?"; (2) "What is a total hip replacement?"; (3) "What are the risks of undergoing a hip replacement?"; and (4) "What is the treatment of a periprosthetic joint infection of the hip?" Each patient was randomly assigned one question and compared four answers: one from ChatGPT and three from orthopaedic surgeons. Patients rated each answer from 1 to 7 on credibility (accuracy, authenticity, believability) and indicated their most trusted answer. Multilevel modeling accounted for varying intercepts, providing estimated scores for each source.

RESULTS: Across all four questions, ChatGPT matched or exceeded the performance of orthopaedic surgeons in the domains of accuracy, authenticity, and believability. In three questions, ChatGPT surpassed at least one surgeon in one domain, and in two questions it outperformed at least two surgeons in two or more domains. Multilevel modeling revealed that ChatGPT scored the highest across all three credibility domains. Notably, ChatGPT's answer was the most trusted response in two of the four questions (Questions #1 and #3).

CONCLUSIONS: These findings suggest that patients perceive ChatGPT as highly credible, particularly in terms of accuracy, authenticity, and believability, compared with orthopaedic surgeons. They underscore the potential of AI to improve healthcare literacy and patient decision making. Further research is warranted to explore the nuances of patient trust and preferences in AI-generated healthcare content.

LEVEL OF EVIDENCE: Level II, prospective comparative study.

Copyright © 2025 by the American Academy of Orthopaedic Surgeons.
Journal
The Journal of the American Academy of Orthopaedic Surgeons
