Introduction
Many forms of AI have been developed, and most are used without a second thought, even among academic writers. The list includes search engines, reference management tools, plagiarism detectors, language editing tools and data management software, among others. However, the emergence of large language models (LLMs), which underpin generative AI, has caused serious concern among researchers and developers alike. This editorial focuses on LLM-based AI, since it has the potential to raise serious ethical concerns in academic writing, including in surgical research. LLMs generate content with a natural conversational flow: they answer questions instantly, respond to examination questions and write poems, among many other capabilities. This inherent ability gives generative AI the potential to produce a scientific paper. In recent publications, some authors have listed AI as a co-author and others have admitted using AI to generate their manuscripts.1 Many LLM-based AI tools are currently available and the list is growing fast, with the likes of ChatGPT, Gemini, Elicit and JANE, among others, freely available. This has opened a Pandora’s box that we as a surgical research community must explore to identify the pitfalls and trade-offs in their use.
Many surgeons and surgical residents must be asking themselves some of these questions: What is artificial intelligence (AI)? How does it work? Is its use ethical and acceptable? And, further, is there guidance for safe usage? This editorial tries to address some of these questions while striving to offer an unbiased opinion on the matter. This is an important topic for practicing surgeons and trainees alike. While AI has multifaceted uses in the practice of surgery, we focus our attention on its use in the conduct, review and dissemination of research. AI has gained wide popularity in the research community. This adoption will not be without pitfalls, but we believe trade-offs also exist. First, however, we need to understand briefly what AI is and how it works.
AI was originally developed to take on real or perceived tasks that seemed too demanding for human capacity. However, throughout its development, AI has raised professional and ethical concerns among developers and, later, funders.2 These tensions underscore the ongoing ethical debates surrounding AI, particularly in sensitive fields like surgical research, where ensuring responsible application is crucial to balancing innovation with accountability.3 Given the importance of academic writing to medical practice, it is important that we as surgeons understand the basics of AI and develop a shared sense of how to navigate it, to ensure safe usage of the various available platforms. A better understanding of AI will allow surgeons to use new tools wisely for the benefit of their patients.4 It will also help surgeons and researchers to appreciate the potential that comes with AI, how they can contribute to it and how to use it safely.
In defining the generative AI currently in use, Kaplan and colleagues described it as a system with the ability to correctly interpret external data, to learn from such data, and to use those learnings to achieve specific goals and tasks through flexible adaptation.5 In their current form, Generative Pre-trained Transformers (GPTs) are a category of large language models (LLMs) that use deep learning techniques and are trained extensively on tremendous amounts of data.6 Generative AI can produce human-like text and creative content, such as music and images, as well as consolidate data from different sources for analysis.7 This capability can give researchers the perception that the responses come from human beings rather than machines, in effect passing the Turing test. It must be noted that AI-generated information is biased8 towards the data used to train the model. Unlike conventional search engines, generative AI produces a specific answer to each prompt. This might lead to over-reliance on AI, thereby eroding creativity, critical thinking and problem-solving skills.9 Biases can also arise from the habitual, uncritical acceptance of generative AI recommendations, a tendency known as automation bias.10
Despite all the benefits AI brings to academia, caution is needed to safeguard the integrity of human interaction and creativity in research. In academia, the “publish or perish” culture combined with AI tools may lead to a flood of fraudulent publications that strains the peer review process,11 unnecessary retractions and a loss of public confidence in medical professionals. At the heart of the ethical concerns lie the potential for plagiarism, authorship attribution issues and the need to maintain academic integrity.12 While AI is used to enhance productivity, there is a risk of compromising core academic values such as originality and innovation.13 AI-human collaboration is the key to addressing the challenges and seizing the opportunities created by generative AI. It is therefore wise to treat AI output as merely suggestive, with authors retaining responsibility for how they use AI-suggested content in academic communication.
AI Use in the Conduct of Surgical Research
Intersectionality Between AI and Research
The use of AI in research is not new; researchers have relied on it since the advent of search engines. Google Search, one of the most widely used search engines, including in our settings, has leveraged AI to enhance user experience and deliver more relevant results. AI-powered features such as RankBrain, natural language processing and machine learning enable Google to understand the context of search queries, personalize results and surface more relevant information. Additionally, AI-driven image and voice search capabilities allow users to search using images or voice commands. As AI technology continues to evolve, Google Search is poised to become even more powerful and sophisticated.
In recent years, the scientific community has seen an explosion of generative AI tools, such as Chat Generative Pre-Trained Transformer (ChatGPT) in 2022, Gemini in 2023, Claude and many more, that have shortened writing time and sped up grammar checking. Unlike search engines, these tools produce human-like text, generating whole paragraphs of information automatically. It is this automation that carries the potential for abuse when AI is treated as a co-researcher rather than an assistant. Despite this risk, there are several areas where the potential of AI can be harnessed to improve efficiency in the conduct of research. We therefore set out our argument on some of the pitfalls and trade-offs of using AI in surgical research.
AI Use in Idea Development and Research Design
The use of AI in idea generation has attracted less criticism among researchers. AI is valuable in brainstorming, spotting researchable gaps and suggesting hypotheses. It has been particularly efficient in research planning and design, where it can suggest thoughtful methodologies.14 By handling big data, AI can recognize under-researched areas, thereby directing researchers to areas to prioritize.15 AI also uses existing data to predict potential correlations and causal relationships, thereby assisting with the formulation of strong hypotheses. In research planning, AI offers guidance on methodologies appropriate for the proposed research question.16 However, since AI is just a tool that draws on the large volumes of existing data available to it, it is limited in its ability to account for local context and inherits the biases of its data sources, such as publication bias. Researchers’ creativity will still be needed to engage meaningfully with AI-generated suggestions and to localize research priorities and hypothesis setting in their work.
AI Use in Content Development and Structuring
Another area where we consider the use of AI to be widely acceptable among researchers is content development and the structuring of work. In this context, AI is useful for text expansion, autocompletion and predictive text during the writing process. In this way, AI holds the promise of significantly improving the quality of scientific writing while shortening writing time.17 Working probabilistically, AI can draw meaning from research findings and offer suggestions for expanding the discussion. These predictive capabilities can anticipate and suggest technical terms, thereby streamlining the writing process.18 Furthermore, AI tools can organize human-developed content into a logical, coherent flow. Using emotional tone analysis, AI can tailor the tone of the content to the target audience, which is especially useful for grant applications, where persuasive language is needed.19,20
AI Use in Literature Retrieval and Organization
Researchers have regarded AI as a summary of knowledge on a topic, since it does not critically review the content it produces. This aspect of AI can be used to do the manual work of identifying literature to inform the study being conducted. In doing so, however, AI has been found to produce fabricated references with mismatched authors and journals, and papers that cite them might be subject to desk rejection. Authors must take control of this stage of scientific writing by using AI only to probe for what is available; they must then verify each source and add critical reasoning to the paper.
AI Use in the Peer Review of Surgical Research
With the growing number of submissions to peer-reviewed journals, coupled with a shortage of well-trained reviewers and editors, there has been a shift towards incorporating AI into several stages of the peer review process.21 Conversations around its effectiveness, efficiency and degree of bias have arisen in the recent past. Its greatest use has been in detecting fabrication, falsification, plagiarism and image manipulation among submissions, a task that is often challenging in the peer review space.22 Interestingly, while generative AI is often biased towards the data used to train it, it has in several instances been proposed as a stop-gap measure for reducing bias in peer review: it can overcome the bias that comes with reviewers’ knowledge of the authors of the manuscripts they are reviewing, while also accommodating scenarios where there is a language barrier or mismatch between the manuscript and the reviewers.21 Its capacity to improve efficiency in peer review is further supported by a reported good agreement between comments raised by reviewers and those generated by GPT-4 for a subset of manuscripts, with about 35% overlap of comments between the two.23 In a feasibility study, it was similarly commended for its capacity to identify methodological flaws and assess the overall contribution of a manuscript to its field, in these cases providing insightful feedback on theoretical frameworks.24 A major barrier to its use in peer review, however, is the potential for breach of confidentiality, since input entered into a generative AI tool may be retained and used for further training of AI models and in generating responses.25
Recommendations on the Way Forward
Unsupervised, AI can generate academic papers that are extremely difficult to differentiate from human-generated content, producing what we would consider fraudulent content.26,27 It is important to acknowledge the use of AI in research in order to maintain transparency in scientific writing and allow scholars to critically appraise the evidence presented.28 In all of this, AI should serve only a supportive role, preserving human creativity and critical thinking in surgical research. AI-human collaboration holds the potential to overcome the real threats that AI currently poses.
Equity issues also arise from an access point of view when these seemingly useful tools become available only by subscription, thereby creating an imbalance in research generation. Likewise, AI carries a potential for extensive plagiarism and for accuracy problems, especially with reference citations. AI might also miss critical components of the design or required information. Currently available AI content detectors may be flawed, failing to distinguish between human- and AI-generated text. This is complicated by the availability of paraphrasing tools that allow AI-generated content to be rewritten without detection. Of note, however, plagiarism software such as Turnitin now has the capacity to determine whether a given text was generated by a large language model and further classifies it as either likely AI-generated text or likely AI-generated text that was possibly revised using an AI paraphrasing tool or a word spinner. Importantly, it also accounts for the possibility of false positives by not flagging detection scores under 20%, given the risk of incorrectly labelling human-written text as AI-generated. A loophole exists, however: since these detection models work only on long-form writing, work submitted in the form of bullet points or annotated bibliographies may bypass detection. It is therefore important that editors of surgical journals remain vigilant by ensuring that all submissions are run through AI detection software, with further scrutiny of the nature of AI use and a recommended threshold of 20% upheld for all pieces of literature.
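For editorial offices that wish to operationalize this screening rule, the logic can be sketched in a few lines of Python. This is a minimal illustration only, not a description of how Turnitin or any other detector works: the `Submission` record, its field names and the wording of the outcomes are all hypothetical, and the only figure taken from the discussion above is the 20% threshold.

```python
from dataclasses import dataclass

# Assumption: the 20% figure discussed in the text is used as the flagging threshold.
FLAG_THRESHOLD_PERCENT = 20.0


@dataclass
class Submission:
    """Hypothetical record for one manuscript; field names are illustrative."""
    title: str
    ai_score_percent: float  # detector-reported share of likely AI-generated text
    is_long_form: bool       # False for bullet points or annotated bibliographies


def triage(sub: Submission) -> str:
    """Return a suggested editorial action for one submission."""
    # Detectors are described as working only on long-form writing,
    # so other formats are routed to a human regardless of score.
    if not sub.is_long_form:
        return "manual review: detector unreliable for this format"
    # Scores under the threshold are not flagged, to limit false positives.
    if sub.ai_score_percent < FLAG_THRESHOLD_PERCENT:
        return "not flagged: score below threshold"
    return "flagged: ask authors to clarify AI use"
```

Under these assumptions, calling `triage(Submission("Example manuscript", 35.0, True))` would route the manuscript to the authors for clarification, whereas a bulleted submission would go to manual review whatever its score. A flag is a prompt for human scrutiny, not a verdict on the manuscript.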
Overall, it is evident that the use of AI in surgical research is becoming more common by the day. Continual effort is therefore needed to ensure that it is used ethically and with integrity, so that the benefits of AI are realized while the integrity of research is protected. To achieve this, clear guidelines need to be established to govern its use. Notably, preliminary findings from a survey conducted by Wiley indicate that 70% of researchers want publishers to provide guidelines for acceptable AI use, with about 67% expressing reservations about its use owing to the lack of guidelines or appropriate training. To the best of our knowledge, there is no ‘one-size-fits-all’ policy document that meets this need. We therefore defer to the African Journal Partnership Program’s guidance on the use of AI in scholarly publishing25 as well as an ethical framework designed for artificial intelligence in healthcare research to provide foundational guidance.29