Royal Society's Future of Scientific Publishing meeting / Jul 2025
I was a bit sleepy getting into the Royal Society Future of Scientific Publishing conference early this morning, but was quickly woken up by the dramatic passion on show as publishers, librarians, academics and funders all got together for a "frank exchange of views" at a meeting that didn't pull any punches!
These are my hot-off-the-press livenotes and only lightly edited; a more cleaned up version will be available from the RS in due course.
Mark Walport sets the scene
Sir Mark Walport was a delightful emcee for the proceedings of the day, and opened by stressing how important this moment is for the future of how we conduct science. Academic publishing faces a perfect storm: peer review is buckling under enormous volume, funding models are broken and replete with perverse incentives, and the entire system groans with inefficiency.
The Royal Society is the publisher of the world's oldest continuously published scientific journal, Philosophical Transactions (since 1665), and has convened this conference for academies worldwide. The overall question is: what is a scientific journal in 2025 and beyond? Walport traced the economic evolution of publishing: for centuries, readers paid through subscriptions (I hadn't realised that the early editions of the RS used to be sent for free to libraries worldwide until the current commercial model arrived about 80 years ago). Now, the pendulum has swung to open access, which creates perverse incentives that prioritize volume over quality. He called it a "smoke and mirrors" era where diamond open access models obscure who actually pays for the infrastructure of knowledge dissemination: is it the publishers, the governments, the academics, the libraries, or some combination of the above? The profit margins of the commercial publishers answer that question for me...
He then identified the transformative forces bearing down on the system:
- LLMs have entered the publishing ecosystem
- The proliferation of journals has created an attention economy rather than a knowledge economy
- Preprint archives are reshaping how research is shared quickly
The challenges ahead while dealing with these are maintaining metadata integrity, preserving the scholarly archive for the long term, and ensuring systematic access for meta-analyses that advance human knowledge.
Historical Perspectives: 350 Years of Evolution
The opening pair of speakers were unexpected: they brought a historical and linguistic perspective to the problem. I found both of these talks the highlights of the day! Firstly, Professor Aileen Fyfe drew upon her research from 350 years of the Royal Society archives. Back in the day, there was no real fixed entity called a "scientific journal". Over the centuries, everything from editorial practices to publication methods and means of dissemination has transformed repeatedly, so we shouldn't view the status quo as set in stone.
While the early days of science were essentially people writing letters to each other, the post-WWII era of journals marked the shift to "scale". The tools for distance communication (i.e. publishing collected issues) and universities switching from a teaching focus to today's research-centric publishing ecosystem were both key factors. University scientists produced 30% of published articles in 1900; by 2020, that figure exceeded 80%. This parallels the globalization of science itself in the past century; research has expanded well beyond its European origins to encompass almost all institutions and countries worldwide.
Amusingly, Prof Fyfe pointed out that a 1960 Nature editorial asked "How many more new journals?" even back then! The 1950s did bring some standardization efforts (nomenclature, units, symbols), though citation formats robustly seem to resist uniformity. English was also explicitly selected as the "default language for science", and peer review was formalised via papers like "Uniform requirements for manuscripts submitted to biomedical journals" (1979). US Congressional hearings with the NSF began distinguishing peer review from other evaluation methods.
All of this scale was then "solved" by financialisation after WWII. At the turn of the 20th century, almost no journals generated any profit (the Royal Society distributed its publications freely). By 1955, financial pressures and growing scale of submissions forced a reckoning, leading to more self-supporting models by the 1960s. An era of mergers and acquisitions among journals followed, reshaping the scientific information system.
Professor Vincent Larivière then took the stage to dispel some myths of English monolingualism in scientific publishing. While English offers some practical benefits, the reality at non-Anglophone institutions (like his own Université de Montréal) reveals that researchers spend significantly more time reading, writing, and processing papers as non-native language speakers, and often face higher rejection rates as a result of this. This wasn't always the case though; Einstein published primarily in German, not English!
He went on to note that today's landscape for paper language choices is more diverse than is commonly assumed. English represents only 67% of publications, a figure which itself has been inflated by non-English papers that are commonly published with English abstracts. Initiatives like the Public Knowledge Project have enabled growth in Indonesia and Latin America, for example. Chinese journals now publish twice the volume of English-language publishers, but are difficult to index, which makes Larivière's numbers even more interesting: a growing majority of the world is no longer publishing in English! I also heard this on my trip in 2023 to China with the Royal Society; the scholars we met had a sequence of Chinese-language journals they submitted to, often before "translating" the outputs to English journals.
All this leads us to believe that the major publishers' market share is smaller than commonly believed, which gives us reason for hope to change! Open access adoption worldwide currently varies fairly dramatically by per-capita wealth and geography, but reveals substantive green space for publishing beyond the major commercial publishers. Crucially, Larivière argued that research "prestige" is a socially constructed phenomenon, and not intrinsic to quality.
In the Q&A, Magdalena Skipper (Nature's Editor-in-Chief) noted that the private sector is reentering academic publishing (especially in AI topics). Fyfe noted the challenge of tracking private sector activities; e.g. varying corporate policies on patenting and disclosure mean they are hard to index. A plug from Coherent Digital noted they have catalogued 20 million reports from non-academic research; this is an exciting direction (we've got 30TB of grey literature on our servers, still waiting to be categorised).
What researchers actually need from STEM publishing
Our very own Bill Sutherland opened with a sobering demonstration of "AI poisoning" in the literature, referencing our recent Nature comment. He did the risky-but-catchy live generation of a plausible-sounding but entirely fabricated conservation study using an LLM, and noted how economically motivated rational actors might quite reasonably use these tools to advance their agendas via the scientific record. And recovering from this will be very difficult indeed once it mixes with real science.
Bill then outlined our emerging approach to subject-wide synthesis via:
- Systematic reviews: Slow, steady, comprehensive
- Rapid reviews: Sprint-based approaches for urgent needs
- Subject-wide evidence synthesis: Focused sectoral analyses
- Ultrafast bespoke reviews: AI-accelerated with human-in-the-loop
Going back to what journals are for in 2025, Bill then discussed how they were originally vehicles for exchanging information through letters, but now serve primarily as stamps of authority and quality assurance. In an "AI slop world," this quality assurance function becomes existentially important, but shouldn't necessarily be implemented in the current system of incentives. So then, how do we maintain trust when the vast majority of submissions may soon be AI-generated? (Bill and I scribbled down a plan on the back of a napkin for this; more on that soon!)
Early Career Researcher perspectives
Dr. Sophie Meekings then took the stage to discuss the many barriers facing early career researchers (ECRs). They're on short-term contracts, are dependent on other people's grant funding, and yet are the ones conducting the frontline research that drives scientific progress. And this is after years spent on poorly paid PhD stipends!
ECRs require:
- clear, accessible guidelines spelling out each publishing stage without requiring implicit knowledge of the "system"
- constructive, blinded peer review that educates rather than gatekeeps
- consistent authorship conventions like CRediT (Contributor Roles Taxonomy)
Dr. Meekings then noted how the precarious nature of most ECR positions creates cascading complications for individuals. When job-hopping between short-term contracts, who funds the publication of work from previous positions? How do ECRs balance completing past research with new employers' priorities? Eleanor Toye Scott also had this issue when joining my group a few years ago, as it took a significant portion of her time in the first year to finish up her previous publication from her last research contract.
If we're going to fix the system itself, then ECRs need better incentives for PIs to publish null results and exploratory work, the councils need to improve support for interdisciplinary research that doesn't fit traditional journal boundaries (as these frontiers between "conventional" sciences are where many ECRs will work), and recognition that ECRs often lack the networks for navigating journal politics where editors rule supreme.
Dr. Meekings summarized ECR needs with an excellent new acronym (SCARF) that drew a round of applause!
- Speed in publication processes
- Clarity in requirements and decisions
- Affordability of publication fees
- Recognition of contributions
- Fairness in review and credit
The audience Q&A was quite robust at this point. The first question was about how might we extend the evidence synthesis approach widely? Bill Sutherland noted that we are currently extending this to education working with Jenny Gibson. Interconnected datasets across subjects are an obvious future path for evidence datasets, with common technology for handling (e.g.) retracted datasets that can be applied consistently. Sadiq Jaffer and Alec Christie are supervising projects on evidence synthesis this summer on just this topic here in Cambridge.
Another question was why ECRs feel that double-blind review is important. Dr. Meekings noted that reviewers may not take ECR peer reviews as seriously, but this could be fixed by opening up peer review and assigning credit after the process is completed, not during. Interestingly, the panel all liked double-blind review, which is the norm in computer science but not in other science journals. Someone from the BMJ noted there exists a lot of research into blinding; they summarised it as: blinding doesn't work on the whole (people know who it is anyway), and open review doesn't cause any of the problems that people think it causes.
A really interesting comment from Mark Walport was that a grand scale community project could work for the future of evidence collation, but this critically depends on breaking down the current silos, since it doesn't work unless everyone makes their literature available. There was much nodding from the audience in support of this line of thinking.
Charting the future for scientific publishing
The next panel brought together folks from across the scientific publishing ecosystem, moderated by Clive Cookson of the Financial Times. This was a particularly frank and pointed panel, with lots of quite direct messages being sent between the representatives of libraries, publishers and funders!
Amy Brand (MIT Press) started by delivering a warning about conflating "open to read" with "open to train on". She pointed out that when MIT Press surveyed their authors, many of them raised concerns about the reinforcement of bias through AI training on scientific literature. While many of the authors acknowledged a moral imperative to make science available for LLM training, they also wanted a choice over whether their own work is used for this. She urged the community to pause and ask fundamental questions like "AI training, at what cost?" and "to whose benefit?". I did think she made a good point by drawing parallels with the early internet, where Brand pointed out that lack of regulation accelerated the decline of non-advertising-driven models. Her closing question asked: if search engines merely lead to AI-generated summaries, why serve the original content at all? This is something we discuss in our upcoming Aarhus paper on an Internet ecology.
Danny Kingsley from Deakin University Library then delivered a biting perspective as a representative of libraries. She said that libraries are "the ones that sign the cheques that keep the system running", which the rest of the panel all disagreed with in the subsequent discussion (they all claimed to be responsible, from the government to the foundations). Her survey of librarians was interesting; they all asked for:
- Transparent peer review processes
- Unified expectations around AI declarations and disclosures
- Licensing as open as possible, resisting the "salami slicing" of specific use. We also ran across this problem of overly precise restrictions on use while building our paper corpus for CE.
Kingsley had a great line that "publishers are monetizing the funding mandate", which Charlotte Deane later also said was the most succinct way she had heard to describe the annoyance we all have with the vast profit margins of commercial publishers. Kingsley highlighted this via the troubling practices of the IEEE and the American Chemical Society in charging to place papers in repositories under green open access. Her blunt assessment was that publishers are not negotiating in good faith. Her talk drew the biggest applause of the day by far.
After this, John-Arne Røttingen (CEO of the Wellcome Trust) emphasised that funders depend on scientific discourse as a continuous process of refutations and discussions. He expressed concern about overly depending on brand value as a proxy for quality, calling it eventually misleading even if it works sometimes in the short term. Key priorities for the WT are ensuring that reviewers have easy access to all literature, supporting evidence synthesis initiatives to translate research into impact, and controlling the open body of research outputs through digital infrastructure to manage the new scale. However, his challenge lies in maintaining sustainable financing models for all this research data; he noted explicitly that the Wellcome would not cover open access costs for commercial publishers.
Røttingen further highlighted the Global Biodata Coalition's (of which he was a member) concerns about US data resilience, and framed research infrastructure as "a global public good" requiring collective investment and fair financing across nations. Interestingly, he explicitly called out UNESCO as a weak force in global governance for this from the UN; I hadn't even realised that UNESCO was responsible for this stuff!
Finally, Prof Charlotte Deane from the EPSRC also discussed what a scientific journal is for these days. It's not for proofreading or typesetting anymore and (as Bill Sutherland also noted earlier), the stamp of quality is key. Deane argued that "research completion" doesn't happen until someone else can read it and reasonably verify the methods are sound; not something that can happen without more open access. Deane also warned of the existential threat of AI poisoning since "AI can make fake papers at a rate humans can't imagine. It won't be long before most of the content on the Internet will be AI generated".
The audience Q&A was very blunt here. Stefanie Haustein pointed out that we are pumping billions of dollars into the publishing industry, much of it into shareholder companies, and so we are losing a significant percentage of each dollar spent. There is enough money in the system, but it's very inefficiently deployed right now!
Richard Sever from openRxiv asked how we pay for this when major funders like the NIH have issued a series of unfunded open data mandates over recent years. John-Arne Røttingen noted that UNESCO is a very weak global body and not influential here, but that we need coalitions of the willing to build such open data approaches from the bottom up. Challenging the publisher hegemony can only be done as a pack, which led nicely onto the next session after lunch, where the founder of OpenAlex would be present!
Who are the stewards of knowledge?
After lunch (where sadly, the vegetarian options were terrible but luckily I had my trusty Huel bar!), we reconvened with a panel debating who the stewards of the scientific record should be. This brought together perspectives from commercial publishers (Elsevier), open infrastructure advocates (OpenAlex), funders (MRC), and university leadership (pro-VC of Birmingham).
Victoria Eva (SVP from Elsevier) opened by describing the "perfect storm" facing their academic publishing business, as they had 600k more submissions this year than the previous year. There was a high-level view on how their digital pipeline "aims to insert safeguards" throughout the publication process to maintain integrity. She argued in general terms for viewing GenAI through the separate lenses of trust and discoverability, and argued that Elsevier's substantial technological investments position them to manage both challenges well. I was predisposed to dislike excuses from staggeringly profitable commercial publishers, but I did find her answers on providing bulk access to their corpus unsatisfying. While she highlighted their growing open access base of papers, she also noted that the transition to open access cannot happen overnight (my personal translation is that this means slow-walking). She mentioned special cases in place for TDM in the Global South and healthcare access (presumably at the commercial discretion of Elsevier).
Jason Priem from OpenAlex (part of OurResearch) then offered a radically different perspective. I'm a huge fan of OpenAlex, as we use it extensively in the CE infrastructure. He disagreed with the conference framing of publishers as "custodians" or "stewards," noting that these evoke someone maintaining a static, old lovely house. Science isn't a static edifice but a growing ecosystem, with more scientists alive today than at any point in history. He instead proposed a "gardener" as a better metaphor; the science ecosystem needs to nourish growth rather than merely preserving what exists. Extending the metaphor, Priem contrasted French and English garden styles: French gardens constrain nature into platonic geometric forms, while English gardens embrace a more rambling style that better represents nature's inherent diversity. He argued that science needs to adopt the "English garden" approach and that we don't have an information overload problem but rather "bad filters" (to quote Clay Shirky).
Priem advocated strongly for open infrastructures since communities don't just produce papers: also software, datasets, abstracts, and things we don't envision yet. If we provide them with the "digital soil" (open infrastructure) then they will prosper. OpenAlex and Zenodo are great examples of how such open infrastructure hold up here. I use both all the time; I'm a huge fan of Jason's work and talk.
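To make the open-infrastructure point concrete: OpenAlex exposes the scholarly record through a free public REST API, no key required. Here is a minimal sketch of building a query URL against its `/works` endpoint; the endpoint and the `search`/`filter`/`per-page` parameters follow OpenAlex's documented API, but the specific query terms are just illustrative, and a real pipeline (like the CE corpus work mentioned above) would add paging and polite-pool headers.

```python
from urllib.parse import urlencode

# Public OpenAlex endpoint for scholarly works (no API key needed).
OPENALEX_WORKS = "https://api.openalex.org/works"

def build_works_query(search: str, from_year: int, open_access_only: bool = True) -> str:
    """Build an OpenAlex /works query URL using its filter syntax."""
    filters = [f"from_publication_date:{from_year}-01-01"]
    if open_access_only:
        # OpenAlex exposes open-access status as the is_oa filter.
        filters.append("is_oa:true")
    params = {"search": search, "filter": ",".join(filters), "per-page": 25}
    return f"{OPENALEX_WORKS}?{urlencode(params)}"

# Example: recent open-access works matching a (hypothetical) topic search.
url = build_works_query("evidence synthesis conservation", 2020)
print(url)
```

Fetching that URL with any HTTP client returns JSON with a `results` list of work records (titles, authorships, concepts), which is exactly the kind of "digital soil" Priem was describing.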
Patrick Chinnery from the Medical Research Council brought the funder perspective with some numbers: publishing consumes 1 to 2% of total research turnover funds (roughly £24 million for UKRI). He noted that during the pandemic, decision-makers were reviewing preprint data in real time to determine which treatments should proceed to clinical trials, and decisions had to be reversed after peer review revealed flaws. He emphasised the need for more real-time quality assurance in rapid decision-making contexts.
Adam Tickell from the University of Birmingham declared the current model "broken", noting that each attempt at reform fails to solve the basic problem of literature access (something I've faced myself). He noted that David Willetts (former UK Minister for Science) couldn't access paywalled material while minister of science in government (!), which significantly influenced subsequent government policy towards open access. Tickell was scathing about the oligopolies of Elsevier and Springer, arguing their profit margins are out of proportion with the public funding for science. He noted that early open access attempts from the Finch Report were well-intentioned but ultimately insufficient to break the hegemony. Perhaps an opportunity for a future UK National Data Library... Tickell closed his talk with an observation about the current crisis of confidence in science. This did make me think of a recent report on British confidence in science, which shows the British public still retains belief in scientific institutions. So at least we're doing better than the US in this regard for now!
The Q&A session opened with Mark Walport asking how Elsevier manages to publish so many articles. Victoria Eva from Elsevier responded that they receive 3.5m articles annually, with ~750k published. Eva mentioned something about "digital screening throughout the publication process" but acknowledged that this was a challenge due to the surge from paper mills. A suggestion of paying peer reviewers was raised from the audience but not substantively addressed. Stefanie Haustein once again made a great point from the audience about how Elsevier could let through AI-generated rats with giant penises with all this protection in place; clearly, some papers have been published by them with no humans ever reading them. This generated a laugh from the audience, and an acknowledgement from the Elsevier rep that they needed to invest more and improve.
How to make open infrastructure sustainable
My laptop power ran out at this point, but the next panel was an absolute treat as it had both Kaitlin Thaney and Jimmy Wales of Wikipedia fame on it!
Jimmy Wales made an interesting point: a key one of his "seven rules of trust" is to be personal, with human-to-human contact, and not to run too quickly to technological solutions. Rather than, for example, asking what percentage of academic papers shows evidence of language from ChatGPT, it's more fruitful to ask whether the science contained within the paper is good, not how it's written. There are many reasons why someone might have used ChatGPT (non-native speakers etc), but also many unrelated reasons why the science might be bad.
Kaitlin Thaney pointed out the importance of openness given the US assault on science: open data repositories can then be reasonably replicated elsewhere.
Ian Mulvaney pointed out that Nature claims to have invested $240m in research infrastructure, a figure that is a struggle for a medium-sized publisher (like his own BMJ) to match. Open infrastructure allows sharing and creation of value, making it possible for these smaller organisations to survive.
When it comes to policy recommendations, what did the panel have to say about a more trustworthy literature?
- The POSI principles came up as important levers.
- Kaitlin mentioned the FOREST framework funded by Arcadia and how they need to manifest in concrete infrastructure. There's an implicit reliance on infrastructure that you only notice when it's taken away! Affordability of open is a key consideration as well.
- Jimmy talked about open source software, and noted that what generally works is not one-size-fits-all. Some projects are run by companies (it's their main product and they sell services), and others by individuals. If we bring this back to policy, we need to look at preserving what's already working sustainably and supporting it. Don't try to find a general solution; adopt targeted, well-thought-through interventions instead.
I'm updating this as I go along but running out of laptop battery too!