A.I. Is Poised to Rewrite History. Literally.

During my 25 years as a magazine editor, my favorite part of the job has always been helping writers figure out what the story is: where to start it, where to end it, what’s important and new about it. So it was with no small amount of humility that, earlier this year, I sat in a Google corporate cafeteria along the West Side of Manhattan and watched as one of my longtime writers — Steven Johnson, the technology journalist and historian — received that kind of guidance from an A.I. instead of me.

Johnson, who has published popular histories about pirate attacks, the invention of modern policing and the birth of public health, had begun noodling on a possible book about the California gold rush of the mid-19th century, he explained. But he was still at the point where he didn’t know much more than that. “What’s my twist?” he said. “Literally, I don’t know.”

To figure it out, Johnson had loaded some of his sources into NotebookLM, an app for researchers and writers that he himself helped build, after becoming the editorial director of Google’s Labs division three years ago. Unlike most other A.I. tools, which draw their answers to questions from the mind-boggling infinitude of data they were trained on, NotebookLM draws only from files selected by the user, on the premise that most forms of research benefit from thoughtfully curating your source material.
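
NotebookLM’s internals are not public, but the source-constrained pattern Johnson describes can be sketched in a few lines of Python. The snippet below is only an illustration under that assumption: the build_grounded_prompt helper and the placeholder sources are hypothetical, and the assembled prompt would still have to be sent to whatever chat model you happen to use.

```python
# Illustrative sketch only: NotebookLM's internals are not public. This shows
# the general source-grounded pattern the article describes, in which the
# model is told to answer from a user-curated set of texts rather than from
# everything it absorbed in training.

def build_grounded_prompt(question: str, sources: dict[str, str], max_chars: int = 4000) -> str:
    """Assemble a prompt that restricts the model to the supplied sources."""
    excerpts = "\n\n".join(
        f"SOURCE: {name}\n{text[:max_chars]}" for name, text in sources.items()
    )
    return (
        "Answer the question using ONLY the sources below. "
        "If they do not contain the answer, say so rather than guessing.\n\n"
        f"{excerpts}\n\nQUESTION: {question}"
    )

# Hypothetical excerpts standing in for the user's uploaded files.
notebook = {
    "Discovery of the Yosemite (Bunnell)": "…text of the uploaded source…",
    "The Ahwahneechees": "…text of the uploaded source…",
}
prompt = build_grounded_prompt("What do these accounts say about Maria Lebrado?", notebook)
print(prompt)  # This string would then be sent to whatever chat model you use.
```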

Since the product’s worldwide release last year, Google and Johnson have been promoting its utility for all manner of tasks, whether it’s auto-generating minutes and takeaways from an audio recording of a meeting or encouraging a more licit use of A.I. among college students. NotebookLM’s most viral capability is an auto-generated podcast, which in a couple of minutes will spit out a detailed conversation between two freakishly realistic voices, drawing out the key concepts of the source material. But as an occasional author of history books myself, I was most interested in how A.I. — one of whose many superpowers is the ability to inhale large amounts of text in an instant and offer credible summaries of it — might transform the way history is written.

At Google that evening, as the sun went down over the Hudson, Johnson showed me the results of his experiments so far. He started his brainstorming process by giving NotebookLM excerpts from one of the finest existing histories on the Gold Rush, H.W. Brands’s “The Age of Gold.” He thought he might want to focus on the conflict between white gold-seekers and the Native American groups living in the Yosemite Valley in the 1850s, so he uploaded the text of an older source called “Discovery of the Yosemite,” by Lafayette Houghton Bunnell, who was part of the Mariposa Battalion, the militia unit that rode into the valley in 1851. Next, to bring in the Indigenous perspective, he went to public-domain websites and found two accounts about the people whom the battalion expelled from the valley: “The Ahwahneechees: A Story of the Yosemite Indians” and “Indians of the Yosemite Valley and Vicinity.”

Johnson started his conversation with NotebookLM with a little orientation, identifying himself as the author Steven Johnson, so that the A.I. (whose training allows it to understand almost the whole internet, as ChatGPT does, even if it constrains its answers to the uploaded sources) might get a sense for the kinds of books he writes. Then he started peppering it with questions: What in the two sources that focused on the Indigenous experience, he asked, was missing from the other two sources? When the model returned its summary, his eye was caught by the observation that “The Ahwahneechees,” by including short biographies of individual Yosemite Indians, “helps to humanize the people beyond being a tribal mass, which is something that ‘The Age of Gold’ and Bunnell’s book tend to do.” The A.I. listed some of their names, among them Maria Lebrado, granddaughter of the Yosemite chief Teneiya.

That piqued Johnson’s interest. He asked for more information about Lebrado, and the tool returned nearly 600 words of biography — that she was one of the 72 Native people forced to leave the valley by the battalion in March 1851; that she eventually married a Mexican man who ran a pack train down in the Central Valley; that she was “discovered” by a white historian in the 1920s and held up as the last of the original Yosemite Indians.

Right away, Johnson recognized that she would make a great character. He took note in particular of the fact that Lebrado returned to the valley near the end of her life. “I’m like, What’s the [expletive] structure of ‘Titanic’?” he joked. The book could open with what Johnson imagined was Lebrado’s emotional return to the valley at nearly 90 years old, before zooming back in time — to her childhood, to a broader cast of characters and the violent drama of the 1850s.

Johnson wasn’t sold on this idea yet. But he marveled at how A.I., operating with mostly open-source texts and a tiny amount of human labor, had delivered him a concept that he absolutely could use. “Everything I’ve just showed you is, like, 30 minutes of work,” he said.

Like most people who work with words for a living, I’ve watched the rise of large-language models with a combination of fascination and horror, and it makes my skin crawl to imagine one of them writing on my behalf. But there is, I confess, something seductive about the idea of letting A.I. read for me — considering how cruelly the internet-era explosion of digitized text now mocks nonfiction writers with access to more voluminous sources on any given subject than we can possibly process. This is true not just of present-day subjects but past ones as well: Any history buff knows that a few hours of searching online, amid the tens of millions of books digitized by Google, the endless trove of academic papers available on JSTOR, the newspaper databases that let you keyword-search hundreds of publications on any given day in history, can cough up months’ or even years’ worth of reading material. It’s impossible to read it all, but once you know it exists, it feels irresponsible not to read it.

What if you could entrust most of that reading to someone else … or something else? As A.I. becomes more capable of parsing large data sets, it seems inevitable that historians and other nonfiction writers will turn to it for assistance; in fact, as I discovered in surveying a wide variety of historians over the last few months, experiments with it are already far more common than I expected. But it also seems inevitable that this power to help search and synthesize historical texts will change the kinds of history books that are written. If history, per the adage, is written by the winners, then it’s not premature to wonder how the winners of the A.I. race might soon shape the stories that historians tell about the past.

Among the historians I spoke with, one of the more enthusiastic experimenters was Fred Turner, who teaches in the communication department at Stanford. I arrived at his office expecting to interview him about how A.I. fits into the long history of information technology, but we wound up spending much of our time discussing how ChatGPT has helped him with his latest book project, which revolves around the New York City art scene of the 1970s and 1980s.

He was at a stage that he called a “source outline” — meaning a document of 100 pages or so compiling all his research, “organized more or less in the arc of the book that I think it’s going to become.” From that, he was planning to write a proposal, but he figured he would ask ChatGPT first. In response, the chatbot rattled off a plausible, polished eight-chapter structure, one that surfaced useful connections in his research and also suggested a more streamlined narrative for the project.

Beyond that, he said, “what it did was give me my quite scholarly work read back to me through a middlebrow voice. It found the sort of average of my work, which was really interesting.” He went on: “It was almost as though I got to take the book to market and stand in the bookstore and hold it up in front of a whole bunch of interested but nonspecialist readers and have them tell me what was working for them and what wasn’t.”

There are a handful of scholars who are beginning to more formally incorporate the use of L.L.M.s into their work. One of them is Mark Humphries, a professor at Wilfrid Laurier University in Ontario, whose research projects involve enormous stores of digitized records from Canadian history. In one project, he and his students used A.I. to analyze tens of thousands of handwritten records about late 18th- and early 19th-century fur trading, in order to better understand the far-flung community of traders (now known collectively as “voyageurs”) who, with their families, explored and later settled much of what would eventually become Canada. “If you can pass those records to a large-language model and you can say, ‘Tell me who Alexander Henry’s trading partners were,’” Humphries said, “the neat thing is that it’s able to very quickly go through and not just search by name but do cross-referencing, where you’re able to find relationships.”

The goal is to find not just one-to-one transactions between specific voyageurs but chains of interconnection that would be hard for human researchers to make quickly. “It takes an A.I. 20 seconds,” Humphries said. “It might take a graduate student doing the same work weeks.”

To be sure, Humphries — like Steven Johnson — is swimming in the deep end of A.I. experimentation. Most historians I contacted were only dipping a wary toe in the water. When I reached out to Ada Ferrer, a Princeton professor and the author of the Pulitzer Prize-winning book “Cuba: An American History,” she wrote back an hour later to say that she had tried playing with A.I. only in recent weeks. “I was finishing up a book and feeling stuck on a title, so I asked ChatGPT,” she wrote. “I kept tinkering with my prompt, including different themes, making suggestions about tone and so on. In the end, it probably gave me about 20 title ideas.” (None were quite good enough, however, and her book remained untitled.)

Ferrer sounded a note that many other academic historians did: that their attitudes toward A.I. lived in the shadow of their students’ cheating with it, which made them reluctant to touch it even as it showed them just how powerful a tool it could be. “I am haunted by the fact that it would be hypocritical for me to use A.I. given how concerned I am with my students’ use of it,” said Jefferson Cowie of Vanderbilt, who won a Pulitzer Prize in 2023 for his book “Freedom’s Dominion: A Saga of White Resistance to Federal Power.” But he added, “I also know that a few people will be putting it to astoundingly creative use once we get a handle on it.”

The obvious question hanging over the future of A.I. history — and the technology’s utility for all manner of other things — comes down to the strides it needs to make on its accuracy problem. Charles C. Mann, author of “1491” and, most recently, “The Wizard and the Prophet,” told me that he experimented with various models while researching a new book project about the history of the American West, and they turned up some great leads, but he became disturbed at how easy it was for them to regurgitate bad information. Mann contrasted it with the rigor of a human editorial process: “I’m sure you’ve had that moment, as a journalist, where a smart editor has said, ‘Wait a second — does this make sense?’ And you say, ‘Oh, crap, it doesn’t.’ That’s what A.I. can’t do. It has no bullshit detector.”

In May of this year, The Times published bracing numbers about how, inexplicably, for all the strides in capability that L.L.M.s were otherwise making, their hallucination problem was getting worse: For example, on a benchmark test, OpenAI’s new o3 “reasoning” model delivered inaccuracies 33 percent of the time, more than twice the rate of its predecessor. To Johnson and his team at Google, the persistence of this problem validates the approach of NotebookLM: While the tool does occasionally misrepresent what’s in its sources (and passes along errors from those sources without much ability to fact-check them), constraining the research material does seem to cut down on the types of whole-cloth fabrications that still emerge from the major chatbots.

That said, Johnson also believes that the most exciting uses for L.L.M.s in nonfiction research are the ones in which the results will always be fact-checked. Rather than seeing A.I. as an undergrad research assistant, he’s envisioning it as more like a colleague from another department or perhaps a smart book agent or editor who can help him see the most interesting version of his own ideas. It should be noted that, for other historians, this is hardly an inspiring prospect. When I asked Stacy Schiff, the author of decorated biographies of Cleopatra and Véra Nabokov, about the notion of consulting A.I. on how to structure a piece of writing, she replied, “To turn to A.I. for structure seems less like a cheat than a deprivation, like enlisting someone to eat your hot fudge sundae for you.”

Photo illustration by enigmatriz: a digitally altered photo of a painting featuring a book, a globe, a compass and other printed materials.

How might A.I. change the way history is written and understood? To answer that question, it’s useful to think about L.L.M.s as merely the latest in a long series of shifts in the organizing of human knowledge. At least since the third century B.C., when Callimachus wrote his “Pinakes,” a series of books (now lost) cataloging the holdings of the famous library (now lost) in Alexandria, humanity has devised increasingly sophisticated systems for navigating pools of information too large for any one individual to take in.

Such systems inevitably have a double edge when it comes to scholarly research, a task where “efficiency” always risks being synonymous with cutting corners. The printed index in books, a device dating back at least to the year 1467, allowed scholars to find relevant material without reading each tome in full. From the perspective of human knowledge, was that a step toward utopia or dystopia? Even now, 558 years later, who’s to say? Innovations that cultivate serendipity — such as the Dewey Decimal System, by whose graces a trip into the stacks for one book often leads to a different, more salient discovery — must, almost by definition, be plagued by arbitrariness. Classify a book about the Mariposa Battalion with Brands’s “The Age of Gold” and other gold-rush titles (979.404), and it will acquire a very different set of neighbors than if it’s classified as a book about the Battalion’s victims (“Native populations, multiple tribes,” 973.0497).

The rise of computers and the internet was of course an unprecedented turning point in the history of tools for writing history — exponentially increasing the quantity of information about the past and, at the same time, our power to sift and search that information. Psychologically, digital texts and tools have thrown us into an era, above all, of “availability”: both in the colloquial sense of that word (everything’s seemingly available) and in the social-scientific sense of “availability bias,” whereby we can fool ourselves into thinking that we have a clear and complete picture of a topic, buffaloed by the sheer quantity of supporting facts that can spring up with a single, motivated search.

Even among academic historians, this availability has shifted incentives in a direction that A.I. is likely to push even further. In 2016, years before the L.L.M. explosion, the University of Pittsburgh historian Lara Putnam published an essay about the achievements but also the dangers of search-driven digital research. “For the first time, historians can find without knowing where to look,” she wrote, in a particularly trenchant paragraph. “Technology has exploded the scope and speed of discovery. But our ability to read accurately the sources we find, and evaluate their significance, cannot magically accelerate apace. The more far-flung the locales linked through our discoveries, the less consistent our contextual knowledge. The place-specific learning that historical research in a predigital world required is no longer baked into the process. We make rookie mistakes.”

Putnam’s essay wasn’t a jeremiad against digital tools — which can power what she memorably calls the “sideways glance,” the ability for a historian whose expertise lies squarely in one domain to get up to speed more quickly on other topics. Digital search has allowed historians to make genuine, powerful connections that wouldn’t have been made otherwise. But she worried about what was being lost, especially given that the pool of digitized sources, even as it keeps growing, remains stubbornly unrepresentative: biased toward the English language and toward wealthy nations over poor ones, but biased especially toward “official” sources (those printed rather than written, housed in institutional rather than smaller or less formal archives). “Gazing at the past through the lens of the digitizable,” Putnam notes, “makes certain phenomena prominent and others less so, renders certain people vividly visible and others vanishingly less so.”

When she and I chatted recently, Putnam compared this shift to Baumol’s cost disease — the phenomenon, noted by the economist William Baumol, that when technology makes certain workers more efficient, it winds up making other forms of labor more expensive and therefore harder to justify. In principle, Putnam notes, digital tools have no downside: Professional historians remain more than capable of carrying out time-consuming research in physical archives. But in practice, the different, faster, more connective kind of research was making the more traditional work seem too professionally “expensive” by comparison. Why spend a month camped out in some dusty repository, not knowing for sure that anything publishable will even turn up, when instead you can follow real, powerful intellectual trails through the seeming infinitude of sources accessible from the comfort of home?

Putnam hasn’t experimented with A.I. research herself, and her own fears run to the basic (and perhaps correct) nightmare scenario that the technology will destroy the history business the same way that it’s destroying the art of student essay-writing, i.e., by composing texts that are just plausible enough to make human work irrelevant. But even if that nightmare is averted and A.I. becomes the human historian’s collaborator instead of her replacement, it’s easy to see how A.I. summarization could transform nonfiction writing along lines analogous to what digital search has done. The individual sources would fade yet further into the background, as users trust tools like NotebookLM to offer cogent-seeming summaries of enormous troves of texts without much attention to their origins or agendas. What becomes staggeringly “cheap,” in such a world, is work that attempts to synthesize astonishing amounts of material, perhaps drawing on sources far beyond what a single human could process in a lifetime, ranging promiscuously across languages, borders and time periods, at a speed that would allow a single human to complete multiple such projects in a career.

The potential upsides of such a shift should not be discounted. At the moment, perhaps the greatest drama in the world of popular history is the question of whether the 89-year-old Robert Caro can finish the fifth and final volume of his Lyndon Johnson biography, a titanic undertaking that has consumed 50 years of his life. His personal papers, some of which he has already given to the New York Historical, so far constitute 150 linear feet of file boxes, and one wonders how many more prodigious biographies Caro could have produced over those 50 years if a chatbot had been able to synthesize some portion of that material for him. Then again, without a mania to touch every single source himself, and to pour the rigor of all that relentless reading into his prose, an A.I.-assisted Robert Caro would not have been Robert Caro at all.

In Steven Johnson’s experiments with gold-rush history, the most eye-opening result came when he floated a structural thought of his own to the NotebookLM chatbot. It drew on Johnson’s infatuation with what he calls the “long zoom” — a mode of storytelling that, like the famous Charles and Ray Eames documentary “Powers of Ten,” moves exponentially in time or scale.

“Just thought of a cool structure for this book,” he told the chatbot. “Maybe most of it is a series of chapters telling the story leading up to the Mariposa Battalion events” — meaning the invasion of Yosemite Valley — “but each chapter is a time scale. So Chapter 1 is called: a million years before; Chapter 2 is a thousand years before; Chapter 3 is a hundred years before, etc. etc.: a decade, a year, a month, a week, a day. And then the war starts in 1851. And then there’s a long epilogue. Sketch out what the content of each of those chapters could be? What part of the big picture story would we be telling in that segment?”

Johnson already knew why that structure could work: because the time scales that endowed the valley with the key elements of the drama (gold, the first humans, the specific Indigenous nations of the mid-19th century, the encroaching white settlers) stretched out over all those powers of 10. But it was astonishing to see that the A.I. perceived all of that, too, without Johnson having to tell it:

A Million Years Before: The geological formation of the Sierra Nevada, the creation of the gold deposits and the carving of the Yosemite Valley by glacial activity.
A Thousand Years Before: Early human migrations and the establishment of Indigenous cultures in the Sierra Nevada, focusing on their adaptation to the environment.
A Hundred Years Before: The lives of the ancestors of the Yosemite Indians, intertribal relationships, early contact with Europeans and the initial impacts of disease.
A Decade Before: The increasing American presence in California, the Mexican-American War, the annexation of California and the initial rumors of gold.
A Year Before: The discovery of gold at Sutter’s Mill, the rapid influx of prospectors, the establishment of mining camps and the first signs of conflict with Native American tribes. Introduction of key figures like James Savage and Chief Tenieya.
A Month Before: Escalating tensions, raids on trading posts, the formation of settler militias and failed attempts at negotiation leading to the Mariposa War.
A Week Before: The organization and movements of the Mariposa Battalion, the perspectives of the Yosemite Indians as conflict becomes imminent.
A Day Before: The final hours before the outbreak of the Mariposa War, focusing on the psychological states and expectations of individuals on both sides.

Earlier this spring, I went to visit Johnson at Google’s campus in Mountain View, Calif., for an update on his progress. Johnson had stuck with his cosmic powers-of-ten approach, and he now was trying to refine his concept for a first chapter, the one set a million years before 1851. He told me that he had become obsessed with the California oak, whose acorns were one of the main sources of subsistence that allowed the earliest Indigenous groups to settle in the valley.

When I sat down with Johnson in a conference room, he showed me NotebookLM’s new “mind map” feature, which auto-generates a conceptual tree from your source material, allowing you to explore thematically without having to write specific queries. He was interested in a phenomenon called “masting,” in which oaks sync up to drop large quantities of acorns in a single season, and he had added a group of sources on masting to the tool. Then he generated a mind map for them, which created forking thematic paths about the phenomenon: for example, the topic “Hypotheses for Masting” branched out to include “Predator Satiation Hypothesis,” “Pollination Efficiency Hypothesis” and more. Johnson saw the utility of all this in similar terms to how Putnam talked about the “sideways glance”: “I don’t need to be an expert on acorns, but I do want to have two paragraphs about acorns.”

Here at the Mountain View mothership, it was easier to understand NotebookLM as just one of a host of A.I. tools that the tech giant was developing, all swirling around the theme of creativity. In just a couple of months, the company’s DeepMind division would introduce Veo 3, which can create high-definition videos with realistic audio entirely from prompts written by users. Flow, a different video project within Google Labs, was helping teams of up-and-coming filmmakers create A.I. elements for their projects, giving them access to, say, a caliber of animation or exotic settings from around the world that they couldn’t have afforded otherwise.

Johnson and I were joined by Josh Woodward, the head of Google Labs, who told me that he saw A.I. as capable of aiding creative projects in two ways: First, it can “lower the barrier,” helping people undertake projects that they otherwise couldn’t, whether out of a lack of resources, lack of training or both; and second, it can “raise the ceiling,” helping established creators take their projects to new levels. In terms of “raising the ceiling,” the division was working with Hollywood talent (“you would know their names”) to use A.I. on particular shots in major films: “Of course there’s an economic dimension to it,” Woodward said, but these established creators were also seeing how the technology let them “explore so many more stories, so fast.”

These tools, along with those being developed elsewhere in Silicon Valley, clearly have the capacity to throw the creative economy into even greater turmoil. Beyond that, many of them — the mind-boggling video-generation engines, in particular — seem likely to accelerate the cultural changes that have made serious writing less and less relevant in the internet era. Perhaps it was naïve to even worry about A.I.’s competing with historians, when the typical user, amid a life increasingly consumed with other, nonverbal diversions, is satisfied to receive facts on demand in bite-size bursts.

Both Woodward and Johnson seemed aware of the destructive potential of their enterprise. In the tools it designs, Labs has tried to keep the interests of the human creators at the forefront. Johnson has been thinking all along about ways that historians and other authors could make money through the app. In our conversation in Mountain View, he put forward a possible new revenue stream: What if e-books of history came enhanced with a NotebookLM-like interface?

Imagine, he went on, that “there’s a linear version of the story with chapters,” but then the primary materials the author used to write the book also come bundled with it. That way, “instead of just a bibliography, you have a live collection of all the original sources” for a chatbot to explore: delivering timelines, “mind maps,” explanations of key themes, anything you can think to ask.

It is perhaps the most brain-breaking vision of A.I. history, in which an intelligent agent helps you write a book about the past and then stays attached to that book into the indefinite future, forever helping your audience to interpret it. From the perspective of human knowledge, is that utopia or dystopia? Who’s to say?

Source artwork for illustrations above: Corbis Historical/Getty Images; Hulton Fine Art Collection/Heritage Images, via Getty Images.
