An Australian Science Journalism Outlet Started Publishing AI-generated Articles — You Won't Believe What Happened Next
If you've been paying attention, you totally will.
When I decided to revive No Breakthroughs, I wanted it to return as a Science Media Watch: calling out the silly stuff publications do when reporting on science, the ways in which mainstream media often drools out press-release-backed nonsense about tampons causing cancer or allows water divining to grace its illustrious pages.
What I didn’t expect was that, just a week in, I’d find the call was coming from inside the house.
In early August, a journalist in the Science Journalists Association of Australia pointed out that Cosmos, one of Australia’s eminent science publications, had been using AI to generate explainer content around things like black holes, carbon sinks, what happens to our bodies after death and how to stay safe from cybercrime.
Cosmos is one of Australia's last remaining dedicated science publications in print and has been in circulation since July 2005. It's a publishing success story but has, like many publications, suffered from the rise of social media, Google algorithm changes taking a huge dump on publishers and general news avoidance.
In recent years, Cosmos was published by the Royal Institution of Australia but in June, facing extreme budgetary constraints, it was transferred to CSIRO Publishing, the independent publishing arm of Australia’s national science organisation.
Cosmos is, to my knowledge, the first science journalism outlet to use generative AI (at least openly) to produce entire articles. It’s definitely the first in Australia. So this is a big deal.
Cosmos places a small disclaimer at the bottom of each piece that says it was generated “by our custom AI service”. My understanding is that this is OpenAI’s GPT-4 with an extra layer on top to enhance the accuracy of the generated work (retrieval-augmented generation, or RAG, I believe, according to reporting by the ABC’s James Purtill) that draws on the Cosmos database of some 15,000 articles. There are obvious problems here: it’s built on GPT-4, which was trained on copyrighted works, and it’s also my understanding that journalists and contributors to Cosmos had no knowledge of the project. The ABC has reported that this is an “experiment”.
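For readers unfamiliar with the term, RAG means retrieving relevant articles from an archive and feeding them to the model alongside the question, so the output leans on real sources rather than only the model’s training data. Here’s a minimal sketch of the shape of such a pipeline. Cosmos hasn’t published its implementation, so every name and detail below is my own illustration, with a toy bag-of-words vector standing in for a learned embedding model:

```python
# Hypothetical sketch of a RAG pipeline (not Cosmos's actual code).
# Idea: rank the archive by similarity to the topic, then prepend the
# best matches to the prompt sent to the language model.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy stand-in: a bag-of-words vector. A real system would use a
    # learned embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(topic: str, archive: list[str], k: int = 3) -> list[str]:
    # Pick the k archive articles most similar to the topic.
    q = embed(topic)
    return sorted(archive, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(topic: str, archive: list[str]) -> str:
    context = "\n---\n".join(retrieve(topic, archive))
    return (f"Using only the sources below, write an explainer on: {topic}\n\n"
            f"Sources:\n{context}")

# The assembled prompt would then be sent to a model such as GPT-4.
```

The catch, as we’re about to see, is that retrieval narrows what the model sees; it doesn’t stop the model mangling, or parroting, what it sees.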
This is why it sucks, in bullet point form:
Contributors have sought clarity around the use of AI at Cosmos but, over the past week, have struggled to get in touch with the editorial team and the magazine’s publisher, CSIRO Publishing.
Freelance contributors, in particular, retain copyright over their works in Cosmos, which raises ethical issues about the use of their content in RAG.
The AI-generated articles are not well written, containing weird phrasing and inaccuracies, and similar-sounding articles are littered across the web. Whether this is plagiarism is hard to determine, but it’s clear some of these articles use language very similar to previously published work. This is a major problem that undermines science journalism and science communication.
Actually, let’s look at that last point more closely.
Take the “What happens to our bodies after death?” piece, published on July 26.
It discusses concepts like rigor mortis, but the only thing stiff about this piece is the writing (har har). It’s lifeless (haaaaaaaaaaaaarhaaha). And a number of weird turns of phrase are used (no joke here, just disappointment).
It describes autolysis as “self-breaking”. This is a bit of a mangled expression. There are very few places I could find that describe autolysis this way. You can find “self splitting” (from its Greek origins) and “self breaking down” or even “self destruction”, but not often straight-up self-breaking…
It suggests rigor mortis sets in as soon as “3 to 4 hours after death”. This is at odds with the literature. Even weirder, it’s at odds with the information provided in “The Science Briefing”, a podcast produced by the same publication. Yikes.
The piece also claims that, in life, the energy molecule ATP is used to relax muscles, but that ATP is depleted in death. True, ATP is depleted in death, but this destroys any nuance about the process occurring. ATP is used widely across the cells of the human body; the phrasing gives the impression it’s simply a muscle relaxant. Splitting hairs, perhaps.
It also states rigor mortis is noticeable in smaller muscles first, such as the eyelids, hands and face. Yes, definitely the eyelids early (after the heart), but hands are generally late to the rigor mortis party. There’s even an old physician’s rule describing this sequence of events: Nysten’s rule. The upper body does stiffen early, but not that early.
The article on carbon sinks, which is quite short, features a lot of phrases similar to those found elsewhere online. It’s futile to check this with an online plagiarism detector (it will mostly serve up Cosmos as the original), but Googling phrases does turn up similarities.
For instance, the Cosmos piece: “Peat bogs are another noteworthy carbon sink. They cover only 3% of the Earth's land surface yet store around 600 billion tonnes of carbon.”
And the Natural History Museum, UNEP and others: “Peatlands cover just 3% of Earth's land surface but store an estimated 600 billion tonnes of carbon worldwide.”
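You don’t need a commercial detector to see this kind of overlap; counting shared word n-grams does the job. A quick sketch (my own illustration, not a tool Cosmos or anyone else used):

```python
# Count the word trigrams two sentences share: a crude but revealing
# measure of overlapping phrasing. Purely illustrative.
import re

def trigrams(text: str) -> set[tuple[str, ...]]:
    words = re.findall(r"[a-z0-9%']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

cosmos = ("Peat bogs are another noteworthy carbon sink. They cover only 3% "
          "of the Earth's land surface yet store around 600 billion tonnes of carbon.")
museum = ("Peatlands cover just 3% of Earth's land surface but store an "
          "estimated 600 billion tonnes of carbon worldwide.")

shared = trigrams(cosmos) & trigrams(museum)
print(f"{len(shared)} shared three-word phrases:")
for phrase in sorted(shared):
    print(" ", " ".join(phrase))
```

Trigrams are short enough to catch reworded passages but long enough that coincidental matches are rare, which is why runs like “600 billion tonnes of carbon” light up.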
While this is factual information (note: it’s actually rather hard to find the most up-to-date figures on this), and so slightly harder to phrase any other way, it raises another important issue: for what purpose are these explainers being written?
What business case is there to flood Google with new explainers when that information already exists, except to try to boost your website’s ranking? And if you’re trying to boost your website’s ranking with AI-generated content… let me tell you, you’re tackling the problem in entirely the wrong way.
I've long been disturbed by the idea that news will be written by generative AI, given the myriad problems with the technology still to be solved. It still makes things up, it’s trained on copyrighted materials and, for the most part, it provides corpse-cold, boring-ass prose. Just the saddest, most dehumanising stuff.
I've seen the damage this can cause.
While I was Science Editor at CNET, formerly one of the leading tech websites in the US, a decision was made to begin using a proprietary AI to generate articles. These articles would target simple questions believed to enhance rankings on Google, such as "What is a credit card?", with each piece edited by a staff member.
CNET began this process with finance articles and told readers they were authored by "CNET Money staff". But after Gael Breton, an online marketing and SEO professional, spotted the byline referring to automation, it blew up in CNET's face. Other outlets scoured the AI-generated articles and found errors. In total, 77 pieces were published; more than 40 contained factual errors.
The backlash was swift. The internet was "horrified" according to one report. Another called it a "journalistic disaster". The editorial team, who were mostly blindsided by the decision, were not happy. CNET justified it by saying it was an experiment (damn, where have I heard that before), but ultimately paused the use of AI.
This story has repeated itself at other publications. Over and over. At Gizmodo. At Sports Illustrated. Every time, it has been met with derision and horror from the audience and from the wider media.
There likely will be a place for AI-generated content in newsrooms, perhaps as an idea generator or for quick feedback or some other bullshit an executive comes up with and thinks is worthwhile. Lots of organisations are already moving toward giving AI the green light as this type of assistant, desperately searching for ways to jam the square AI peg into the round efficiency hole, or perhaps trying to justify the last two years spent finding a use case for it.
The bottom line is: there’s really nothing AI can do for science journalism and science explainers right now that a human won’t be able to do better. An AI cannot (and, for my own sake, I hope it never will be able to) pick up the phone and speak to a scientist about the absolute cutting edge of research. And that’s what explainer content should be attempting to include! It should be providing information, old and new, not just listing off a set of facts.
This should, maybe, finally provide a lesson (but of course it won’t): if you’re going to incorporate AI into your publication, you must start by being upfront about how it’s going to be used. In science, if you run an experiment, you have to get ethical approval. You have to get consent from the people you’re running that experiment on.
Using GPT to write articles is an experiment (the publications admit this), so they should be upfront about it and give the audience and editorial teams a heads-up, at the very least! That’s informed consent, baby. It’s a key element of the scientific process.
Cosmos, as one of Australia’s oldest and most illustrious science publications, and CSIRO Publishing should abandon this experiment and focus on providing quality science journalism: the thing that made the magazine successful over the past 20 years.
A small update to this: I removed some of the explanation in the second point above (“It suggests rigor mortis sets in as soon as ‘3 to 4 hours after death’”) because I had re-used the figure provided by the Cosmos podcast. Rigor mortis can set in as quickly as an hour after death. Lots of things affect this, but that is the key point: there’s no definitive answer here, and AI fails to provide that nuance more than a human might ;)