Over the past few years, I’ve seen the same pattern repeated over and over in discussions about LLMs. Someone will say something about the capabilities of a given LLM, and then someone else will respond with a dismissive comment to the effect of, “Well, I just ask ChatGPT, Claude, Grok, etc., to do ABC . . .” and then the comment goes on to mention some embarrassing response from the LLM.
A lot of these responses stem from people using the free version of an LLM, which typically lags months behind the paid version, which in turn lags months behind the pro version of the same LLM, and all of which are periodically updated.
Therefore, many of the embarrassing responses that people find at one point either do not occur in the paid or pro versions, or disappear within a few months as the LLM gets updated.
For example, at least in the paid version of ChatGPT that I am using, hallucinated sources are now ancient history, and while I was impressed early on that it could translate classical Chinese at all, it can now do so well.
I think we are now reaching another one of those points where LLMs are demonstrating that they can do what many previously thought they would never be able to do.
Editing
Recently I have been doing a lot of editing. I first edit a paper in the “traditional” way, then I have ChatGPT check everything.
When I have a paper ready, I first copy the bibliography and ask ChatGPT to check it based on whatever bibliography system I am using. I do this a couple of times until the issues are resolved. Then I give it the entire paper to check.
In fact, at this point in the development of LLMs, I could just give ChatGPT the entire paper at the start and instruct it to “fix everything” and let me know if there are any issues that it can’t resolve. However, I prefer to make the edits it points out manually so that I can be aware of what changes I am making to the paper.
In checking to see if a bibliography is compliant with a given referencing system, like APA 7 or Chicago, ChatGPT also checks the information about the sources. So, for instance, I have had ChatGPT point out that in a bibliography I was editing, a work listed as being authored by someone was actually edited by that person.
Given the broad range of sources one can encounter while editing, it’s impossible for any single copyeditor to carry knowledge of every publication in their head. Therefore, the only way to catch issues like that one, and many others, is to manually search for information about each reference online. And in the past, that’s what I used to do when I had a question about a source.
As such, I can now see that the job of copyeditor has largely become redundant.
Research
I wrote a paper in 2018 for a conference, but didn’t have time to finalize it and submit it to a journal. Then a few years ago, I did more research on it, but did not have time to update the paper.
So, returning to this pre-LLM paper in the LLM age, I have spent the past few weeks remembering what I had done, conducting more research, and essentially rewriting most of the paper.
In the paper, I use digitized sources in French from the French National Library (Gallica), and some sources in classical Chinese that have been scanned. For both of these types of sources, I have now used LLMs to help “access” them.
I will, for instance, locate a French article in a journal, download it, extract the article’s pages from the journal, upload them to ChatGPT, and ask it to transcribe and translate each page.
Et voilà! Within minutes I have a document that I can quickly scan through in English and then check key passages in French.
This was possible to a certain degree in 2018 when I first wrote the article. At that time, some of the materials on the Gallica site had been OCRed, but there remained plenty of errors. So, you would have to copy that information, then correct it against the original, and then you could put it into Google Translate, which had its own limitations.
Gallica still has this OCR function, and the error rate is lower than it was back in 2018, but I find ChatGPT to still be superior. Its accuracy is either 100% or very close, and its translations are excellent. And it’s fast.
Meanwhile, as for the classical Chinese sources, ChatGPT can also transcribe them quite well, with accuracy certainly over 80%, which saves a lot of time. Further, its translations are also excellent, and definitely better than mine.
Indeed, I spent A LOT of time in 2018 transcribing passages and translating them, and while I got the translations “correct,” when I look at ChatGPT’s translations, I go, “Oh, right, that’s a better way to say it.”
As such, I really don’t see the need to translate anymore. Why should readers have to read my clumsy translations when ChatGPT can produce translations that are clear and accurate?
As I see it, my job is to find the passages that I want translated. In the case of classical Chinese texts, there is still work to be done in transcribing them, but after that, LLMs are superior.
For complicated or arcane texts, I can see a need to “intervene” in an LLM translation to be able to explain the texts to readers, but if that is not needed, then the LLM translations are fine.
Writing
I have really enjoyed the process of rewriting the paper. I work best in the morning, and I love that feeling when your brain is sharp and the caffeine from the morning coffee is kicking in, and you go over what you wrote and make changes.
Or that feeling when you struggle with a sentence as you try to figure out what exactly it is that you are trying to say and how that can best be worded . . . and then you find a solution.
However, now that I’m almost done with the paper, I suddenly wondered, “Could ChatGPT do what I just did?”
In finding materials on Gallica, I went through an “organic” process of searching for materials, and then finding references in those materials to other sources, etc.
What I then did was to ask ChatGPT to search for all of the articles and publications on topic XYZ on Gallica between the years xxxx–xxxx. In five minutes, it essentially located all of the articles that it took me quite a bit of time to locate (it’s hard to know how long because I didn’t try to find them all at one time), and located an additional one that I was unaware of.
I then asked ChatGPT to write a 7,000-word essay on the same topic I was writing about using the sources it had identified. It did so in just over two minutes. And parts of it were damn good.
Was it as “good” as my paper? No, but I think where it fell short had more to do with the limited number of sources I asked it to work with (I didn’t include the sources in classical Chinese, for instance) and the lack of detailed instructions in my prompt.
As such, what this showed me is that we are probably already at the point where LLMs can produce humanities papers that are as good as, and in some ways better than, the papers that many of us, myself included, produce.
Further, this isn’t some hypothetical claim. I researched and wrote a paper over a period of several years. I reached the point where I knew what the main sources were, and what arguments one could make from those sources, and in the limited experiment that I conducted with ChatGPT, it “got it” in roughly seven minutes.
You Know It’s There, but Don’t Look!
For copyediting, there is no question that LLMs are fully capable of replacing the work of human beings.
How about research and writing? Knowing what I now know, I could have researched and written my paper together with ChatGPT from the start, and the end result would not have been all that different from what I have written.
While my main argument in the paper is one that ChatGPT did not pick up on, 1) it could have if I had provided it with more sources and a better prompt, and 2) even without noticing what I argue, the ideas it proposed were still good enough to make a novel contribution.
Yes, some human input would still be necessary to get such a paper to the point where it would be ready for submission to a journal, but ChatGPT can get most of the job done on its own, if fed the appropriate sources and prompted effectively.
I remember when I was teaching in Brunei, a student once told me that when she went online, she saw web pages that were about certain topics considered “haram,” or forbidden, and that therefore, she didn’t click on them, although she was fully aware that they were there.
That is the point that I think we are approaching with LLMs. With every passing month, they are getting better at doing what we do, and in some ways, they are already superior to us, but to let them do what they can do is “haram,” so we have to keep telling ourselves the LLM equivalent of “don’t click on that link.”
LLMs as Co-Authors
In 2022–2023, there were some scholars who listed ChatGPT as a co-author. That created an uproar and led many publishers to ban that practice, arguing that LLMs cannot take responsibility for the accuracy, integrity, and originality of the work, as human authors can.
As LLMs continue to improve, I think that policy will have to be revisited.
While I suppose it is true that LLMs cannot take responsibility for their output, I can. I can take responsibility for verifying that an LLM’s translations are accurate, and I can take responsibility for verifying that its statements, and the works it cites to support those statements, are accurate.
In that sense, humans can co-author with an LLM, and at this point, I think that is going to be an inevitable development: we are quickly approaching a stage where, by insisting that human beings do the work, we will be publishing works that are inferior to what LLMs can produce.
The paper I am working on is not the only unfinished paper I have. To the contrary, I have a career’s worth of unfinished articles and book manuscripts. Why are they unfinished? In general, I would say that it is because I find the process of discovering information and the argument I want to make from that information much more enjoyable than the more tedious task of documenting that information and the argument.
LLMs can now do that tedious part. So, why not let them? I’ll take responsibility for their statements, translations, and the sources they cite.
This is the direction in which I can see us heading. Yes, there are many scholars who began prior to the LLM age who will continue to work as they always have, but I can’t see how anyone starting out in the LLM age can be convinced to do the same.
As for those in between. . . over time I think more and more will look at that “haram” link and ask, “Wait. Why exactly is it that I’m not supposed to click on it?”