Notes on AI and authorship
See also:
- Right now, there is no AI that can outperform humans at performing meaningful research and analysis (2024-12-12)
- Some thoughts on AI (2024-02-26)
da Silva 2024, Proposing authorship for artificial intelligence and large language models
- (This paper is very one-sided and even sensationalist in its language, but it’s useful as a summary of the “other” perspective!)
- The current and predominant school of thought in academic publishing, with a correspondingly rigorously implemented set of ethical policies, notes that classic authorship is a purely human endeavor. However, such rigid conceptual restrictions on authorship for artificial intelligence (AI), like large language models (LLMs), may be borne from fear, emerging perhaps from being intellectually threatened by AI/LLMs that might outperform humans.
- AI as an author – in the human sense of the word – is largely unrecognized, even banned, including by major status quo publishers and journals. Current existing limitations to the use of AI/LLMs serve primarily to protect human endeavor, i.e., anti-AI bias, representing a fearful response by some humans, who may perceive AI/LLMs as a threat to human endeavor, like authorship, original writing, expression, and editing, because AI is able to generate, in a tiny fraction of time, what takes some humans a lifetime to achieve.
- a patchwork of papers written honestly by humans, interlaced with AI/LLM-derived text that is not transparently declared by human users, making academic literature that is written by humans, humans+AI/LLMs, and only AI/LLMs indistinguishable.
- AI-authorship, defined here as the valid assignment of authorship to an AI/LLM, as well as its recognition as a form of authorship and “originality,” arises when a human and AI/LLM are viewed as “intellectual” equals.
- Artificial intelligence/large language models do not have the ability to engage in completely independent conduct, i.e., they do not independently generate ideas nor are they capable of independently writing text or academic papers. Rather, a human prompt is still required, and thus any paper that lists an AI/LLM as an author, within the AI-authorship model, will obligatorily also include a human who feeds the AI/LLM a prompt.
- There is an important subtlety here, though, and one which currently invalidates AI/LLMs from being considered authors in the current human authorship model, namely their inability to be held accountable
Bajohr 2024, Writing at a Distance: Notes on Authorship and Artificial Intelligence
- I propose the concept of causal authorship
- Primary authorship would be the still valid standard of conventional, “immediate” writing, in which an author more or less directly puts text on paper or onto a data carrier.
- Secondary authorship would thus be a first-order distance that duplicates the act of writing. […] the author’s contribution in these cases is to formulate a sequence of rules, the execution of which produces the work.
- tertiary authorship […] rules are no longer written in a program script whose application (to data) produces an output, but rather an ANN is trained on a large set of exemplary outputs that makes the “rules” that eventually lead to the final text.
- One can now plausibly speak of quaternary authorship. Since large language models are predominantly proprietary software—which is still in most cases too large to be trained from scratch by individual users—consumers are more often than not limited to the “factory settings” and cannot choose their own dataset. […] in the quaternary model, it is solely the input that counts and that is still under human control: “Write a poem in the style of Wallace Stevens.” “Promptology”—the efficient, even virtuosic formulation of such input prompts—is the main mode of operation of quaternary authorship.
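To make Bajohr's taxonomy concrete, here is a minimal Python sketch contrasting secondary authorship (the human writes the rules, and running them produces the text) with quaternary authorship (only the prompt remains under human control). This is my illustration, not anything from the paper: the quaternary half assumes the OpenAI Python client, and the lexicon, model name, and prompt are placeholders.

```python
# Secondary vs. quaternary authorship, after Bajohr's taxonomy.
import random

from openai import OpenAI  # assumed client for the quaternary example

# Secondary authorship: the human contribution is the rule itself;
# executing the rule produces the work.
WORDS = ["sea", "glass", "winter", "palm", "shadow"]  # illustrative lexicon

def rule_based_poem(n_lines: int = 3) -> str:
    """A hand-written generation rule; the work is authored via the rule."""
    rng = random.Random(13)
    return "\n".join(
        f"the {rng.choice(WORDS)} of {rng.choice(WORDS)}"
        for _ in range(n_lines)
    )

print(rule_based_poem())

# Quaternary authorship: model, weights, and training data are fixed
# "factory settings"; the only human-authored artifact is the prompt.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user",
               "content": "Write a poem in the style of Wallace Stevens."}],
)
print(response.choices[0].message.content)
```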
Murray 2024, Tools do not create: human authorship in the use of generative artificial intelligence
- The advent of generative AI tools like Midjourney, DALL-E, and Stable Diffusion has blurred this understanding, causing observers to believe these tools are the authors of the artworks they produce, even so far as to imagine that the artworks are “created” by the AI in the copyright sense of the word. Not so.
- The U.S. Copyright Office recently issued guidance on the copyrightability of works produced using generative AI tools. The Office has accepted the narrative that AI tools perform the steps of authorship, conceiving of the image and rendering it into existence, and denying copyright because randomly or automatically generated works lack human authorship. This interpretation of generative AI is fundamentally flawed.
- Contemporary visual generative AI systems can do extraordinary things, but as of yet not autonomously and not automatically. Generative AI systems are tools: highly complex, deeply technological tools to be sure, but tools nonetheless. And these tools require a human author or artist (the end-user of the generative AI system) to provide the inspiration and design, and often the instructions and directions on how to produce the image.
- An artist working with a generative AI tool is no different from an artist working with a digital or analog camera, or with Photoshop or another image-editing and image-rendering tool.
Liang et al 2024, Can large language models provide useful feedback on research papers? A large-scale empirical analysis
- We then conducted a prospective user study with 308 researchers from 110 US institutions in the field of AI and computational biology to understand how researchers perceive feedback generated by our GPT-4 system on their own papers.
- Overall, more than half (57.4%) of the users found GPT-4 generated feedback helpful/very helpful and 82.4% found it more beneficial than feedback from at least some human reviewers. While our findings show that LLM-generated feedback can help researchers, we also identify several limitations. For example, GPT-4 tends to focus on certain aspects of scientific feedback (e.g., ‘add experiments on more datasets’), and often struggles to provide in-depth critique of method design.
- Together our results suggest that LLM and human feedback can complement each other. While human expert review is and should continue to be the foundation of rigorous scientific process, LLM feedback could benefit researchers, especially when timely expert feedback is not available and in earlier stages of manuscript preparation before peer-review.
Ziems et al 2024, Can Large Language Models Transform Computational Social Science?
- We conclude that the performance of today’s LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
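As a rough illustration of use (1), a zero-shot annotation call might look like the sketch below. The stance labels, model name, and prompt wording are my own placeholders, not Ziems et al.'s setup; it assumes the OpenAI Python client.

```python
# Minimal sketch of an LLM as a zero-shot data annotator (use (1) above).
# Assumes the OpenAI Python client; labels and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = ["supportive", "opposed", "neutral"]  # hypothetical stance labels

def annotate_stance(text: str, topic: str) -> str:
    """Request a single zero-shot stance label for `text` toward `topic`."""
    response = client.chat.completions.create(
        model="gpt-4o",   # placeholder model name
        temperature=0,    # keep labels as stable as possible
        messages=[{
            "role": "user",
            "content": (
                f"Classify the stance of the text toward '{topic}' as one of "
                f"{LABELS}. Reply with the label only.\n\nText: {text}"
            ),
        }],
    )
    return response.choices[0].message.content.strip().lower()

# Labels like these would sit alongside human annotations on a mixed
# annotation team, not replace them.
print(annotate_stance("This law can't pass soon enough.", "the law"))
```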
Abdurahman et al 2024, Perils and opportunities in using large language models in psychological research
- GPTology—which we define as the hurried and unjustified application of LLMs either as “replacements” for human participants, or as an off-the-shelf “one-size-fits-all” method in psychological text analysis—can lead to a proliferation of low-quality research, especially if the convenience of using LLMs such as ChatGPT leads researchers to rely too heavily on them.
- While LLMs, especially fine-tuned ones, can achieve impressive performances on many tasks, the presence of a WEIRD bias, along with the opaque and often irreproducible nature of these models, particularly the proprietary ones, makes them a double-edged sword for psychological research.
- researchers must actively exercise caution and critically evaluate the limitations of these models before incorporating them into their research paradigms
- A commitment to rigor and replicability should guide the integration of AI into psychological research, not convenience.