When Words Count

Writing by the Numbers

A data-driven, witty exploration of style, emotion, and literary craft

An experiment in reading that uncovers how writers spend, shape, and withhold emotion.

Writing by the Numbers begins with a deceptively simple question, whether emotional language leaves a measurable trace across a writer’s work, and turns it into a wide-ranging inquiry into style, habit, and literary judgment. By analyzing thousands of pages across major authors, the book challenges easy assumptions about good writing, showing that emotional force does not depend on the frequency of emotional words but on control, distribution, and restraint. What emerges is not a tribunal of better and worse writers, but a map of tendencies: who names feeling, who implies it, who repeats, who varies, who pads, and who compresses. The result is both analytical and practical, a reorientation of how we read and write, arguing that style is not merely what a writer says, but what they reach for too quickly, repeat too often, or deliberately withhold.

For Agents & Publishers

Synopsis: Writing by the Numbers

Writing by the Numbers asks what happens when you count the things literary criticism has always noticed impressionistically: which writers name emotions often, which avoid naming them, which repeat themselves, and whether any of this correlates with the critical reputations those writers have earned. The answer, across thirty writers and roughly 150,000 pages spanning two centuries, is yes: measurably, surprisingly, and in ways that occasionally overturn received opinion while confirming intuitions serious readers have held without being able to demonstrate them.

Method and Limits

Fifty-four emotional adjectives were searched across the complete or near-complete works of thirty writers from Jane Austen to David Foster Wallace, with raw counts normalized by total output so that Dickens and Austen stand on the same footing. A parallel analysis covers thirty-two throat-clearer words ("just," "really," "very," and their relatives), and a third covers seasonal, monthly, and weekday references. The data cannot distinguish irony from sincerity, narration from dialogue, or displacement of feeling into gesture from simple avoidance. It measures lexical explicitness, not total emotional force. That limitation weakens sweeping claims about greatness. It strengthens a narrower one: the writers who use the fewest emotional adjectives per page are, with remarkable consistency, the writers critics have rated most highly for attention to language.
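The normalization step can be illustrated with a minimal sketch. It is not the book's actual pipeline: the three-word lexicon, the 300-words-per-page figure, and the function name rate_per_page are assumptions made for the example, since the synopsis does not say how a page is defined or list the fifty-four adjectives.

```python
import re
from collections import Counter

# Hypothetical stand-in for the book's 54-adjective list.
EMOTION_WORDS = {"happy", "sad", "angry"}

# Assumed page length; the source does not define a "page".
WORDS_PER_PAGE = 300


def rate_per_page(text, lexicon=EMOTION_WORDS, words_per_page=WORDS_PER_PAGE):
    """Return lexicon hits per page, so long and short outputs compare fairly."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    # Raw count: how many tokens belong to the lexicon.
    hits = sum(counts[word] for word in lexicon)
    # Normalize the raw count by the size of the text, measured in pages.
    pages = len(tokens) / words_per_page
    return hits / pages
```

A count of this kind shares the limits stated above: it sees the word "happy" but not whether the word is ironic, spoken in dialogue, or negated.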

The Distribution and Its Complications

Ford Madox Ford, Pynchon, Joyce, Nabokov, and Woolf distrust the purchase of emotion through naming. Their pages are not emotionally empty; they are lexically indirect. Lawrence, Wolfe, Maugham, and Fitzgerald label feelings openly and repeatedly, giving their prose an agitated edge missing in their more restrained contemporaries. The interesting complication is Updike, who appears in the high-use cluster alongside writers with considerably looser reputations. His presence suggests that the relationship between emotional density and literary control is not a simple inverse. What matters is not quantity but governance.

Standard Deviation as Biography

Henry James shows the highest standard deviation of all thirty authors, and the variation is not noise but biography. His early work names feelings with a casualness his mature prose would never permit, and the number records a genuine transformation without having been directed to find one. Joyce and Hemingway show the lowest standard deviations: each found his method early and held it without wavering. Wolfe's deviation is also high, but for the opposite reason: his variability is the record of a search that ended before finding what it sought. The data, without knowing any of this, draws the distinction.
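For readers who want the arithmetic, here is a hedged sketch of how such a spread might be computed. It assumes the standard deviation is taken over per-book rates of emotional adjectives per page, which the synopsis does not specify; the writer labels and every number below are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical per-book rates (emotional adjectives per page); invented numbers.
rates_by_writer = {
    "steady writer": [1.0, 1.1, 0.9, 1.0],
    "changing writer": [3.0, 2.0, 1.0, 0.5],
}


def spread(rates):
    """Population standard deviation of a writer's per-book rates."""
    return pstdev(rates)


# A low spread suggests a method held steadily; a high spread suggests change.
for name, rates in rates_by_writer.items():
    print(f"{name}: mean={mean(rates):.2f}, sd={spread(rates):.2f}")
```

The figure alone reports only how much a writer's rate moves from book to book. Reading that movement as development or as unfinished search still depends on knowing the order and character of the books.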

Two Kinds of Restraint

Joyce's restraint and Hemingway's are different. Hemingway concentrates his emotional vocabulary into blunt, repeated terms: "bad" and "tired" carry much of his emotional weight. The feeling behind "I felt very tired" is not tiredness but everything tiredness stands in for when a man cannot say the rest. Joyce deploys emotional adjectives sparingly but under ironic pressure, as in "the strange sad happy air," the modifier undercutting itself immediately. One method is concentration. The other is strategic deployment. The data groups them as minimalists. They are practicing opposite crafts that converge on the same low number.

The Correlation Findings

The highest correlation, 0.97, falls between Chekhov and Conrad, writers working in different languages and traditions with no direct line of influence between them. The explanation is convergence under identical pressure: the reaction against Victorian sentimentalism, Flaubert's doctrine that emotion must radiate from the scene rather than be declared, and the editorial culture of the 1890s, which rewarded compression. They breathed the same atmosphere in different rooms. The figure is empirical evidence for what literary historians have argued impressionistically for a century: that late-nineteenth-century naturalism produced a shared emotional grammar measurable across national traditions.

Tolstoy, meanwhile, correlates significantly with more writers than any other figure: Chekhov, Conrad, Joyce, Lawrence, Nabokov, and Proust all show high positive correlations with his profile. His pattern of emotional naming sits near the weighted center of two centuries of serious literary fiction. Bloom, Moretti, and James Wood have each argued for Tolstoy's canonical centrality by different means. The data arrives at the same conclusion without reading the novels.
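A correlation between two writers' profiles can be sketched as follows. The synopsis does not say which coefficient was used or what a profile contains, so this example assumes a Pearson correlation over per-adjective rates; the function name pearson and the sample numbers are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length profiles of rates."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Invented rates for the same four adjectives in two hypothetical writers.
writer_a = [0.8, 0.2, 0.5, 0.1]
writer_b = [0.9, 0.3, 0.4, 0.1]
print(round(pearson(writer_a, writer_b), 2))
```

A value near 1 means the two writers favor and avoid the same words in similar proportion, even if one of them uses emotional adjectives far more often overall.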

Weather and Temporal Specificity

Modern writers name seasons, months, and weekdays far more often than their nineteenth-century predecessors, a trend the book argues is sociological rather than aesthetic. The railway timetable, the standardized working week, and the daily newspaper together colonized daily life with clock-time, and fiction's contract changed accordingly. Writers who had been content with "that summer" learned to say "that Tuesday in June" because their readers had become people for whom the difference mattered. Conrad is the outlier: he mentions a season, month, or weekday once every twenty-two pages, the lowest figure of any of these writers, substituting meteorological time for calendar time. When other writers say "that June," Conrad says "the gale."

Throat Clearers and the Coherence of Economy

The throat-clearer analysis mirrors the emotional adjective findings: the writers who use the fewest emotional adjectives also use the fewest hedges and intensifiers. Economy of language is a coherent practice, not a collection of separate habits. The two datasets were collected independently, and both place the same writers at each extreme.

Conclusion

Strong prose does not abolish emotional language. It subordinates it. Style is not just what a writer says. It is what a writer reaches for too quickly, repeats too easily, or resists saying at all. The data has not settled any argument that literary criticism has been having. It has made several of those arguments considerably harder to dismiss.

From Writing by the Numbers

Three passages tracing how style emerges from habit, restraint, and choice.

Excerpt I — On Emotional Language and Control

This essay begins with that difficulty. I wanted to know whether emotional adjectives leave a measurable trace across a writer’s work. Not whether they explain the work, not whether they summarize its greatness, only whether they leave behind a lexical watermark. The experiment began with a crude question and had to be improved as it went along. Do some writers name feelings more often than others? If so, does that suggest emotional preoccupation, verbal laziness, stylistic fashion, or simply a different way of carrying emotion through prose? Does the overuse of certain adjectives reflect an author’s obsession with a subject, such as happiness or anger? Or does the overuse amount to lazy writing? I can answer some of these questions.

I hear the oft-repeated phrase, "Write with strong nouns and not adjectives." I am so committed to reading grammar books that I find that the fifth or sixth variation on the theme affects me. I begin to see my repetition as a cliché. If such advice is a cliché, then why do so many commit the error of adding blank adjectives and adverbs? Hence, I work on a paragraph that discusses not the crime, for that is settled law, but the method of its repetition. More emotional adjectives do not make a writer better. A great writer may name feeling often and yet keep every word under control. Another may scarcely name feeling at all and still saturate a page with emotional pressure. The opposition is not between emotional and unemotional writing. It is between different economies of emotional explicitness.

That distinction matters. Fiction can move feeling through many channels. A writer can say a character is anxious, proud, bitter, or happy. A writer can also make the reader infer anxiety from gesture, pride from rhythm, bitterness from recurrence, or happiness from irony. The naming of emotion is only one device among many. Scene, syntax, dialogue, free indirect style, pacing, image pattern, and narrative distance can all do work that an adjective might otherwise do. The data does not capture those things. It captures one layer: the rate and distribution of selected emotional words.

Excerpt II — On Cliché, Repetition, and the Burden of Tradition

I have read only a few of Jack London’s novels, enough to say that I have read something by him. There is something unfair about going through dozens of a writer’s novels and just reading about “wind.” You give up after you read so many chill and gentle west, east, and north winds occurring repeatedly. Don’t ever describe the wind as howling or liken it to a breath (not even a first or second breath) or a puff. I can vouch that Jack London alone has done it to death. He mentions wind 1,054 times. I gave up at 400, when nothing novel had jumped out at me in the preceding 200 mentions.

The history of writing becomes a burden to new writers, its clichés strewn across thousands of stories. It is not worth attempting to take the wind as a metaphor for something you can’t see directly. You can describe its effects with verve and originality more easily than you can characterize it with adjectives.

Anton Chekhov surprises even when you know how accomplished a writer he is. Going painstakingly, conscientiously, through 400 of Jack London’s wind quotes, I convinced myself that all possible wind descriptions had been exhausted. What more can one expect from a writer? How many nouns and verbs can convey the beating wind? I almost lost faith. Never sell short the imaginative brain.

Excerpt III — On Style, Habit, and What Writing Really Is

I began with the vulgar hope that arithmetic might settle an argument criticism never quite settles: who writes cleanly, who writes loosely, who trusts naming, who trusts implication. The data didn’t give me a tribunal. It gave me something better. It gave me habits.

Some writers readily and often name feelings. Some distrust the naming of feeling and push emotion into the scene, syntax, irony, or the pressure of things. Some can afford abundance because they distribute it so well. Some cannot. Some repeat a handful of emotional words until they become the prose equivalent of a familiar gesture. Some all but refuse themselves the convenience. Some waste words in padding and verbal false limbs. Some appear unable to bear that kind of slackness. None of these findings alone settles literary value. But together they help explain why certain pages feel aerated and others gummy, why some styles trust direct statement and others force the reader to infer.

The most important correction the data imposed on me was this: using more emotive words does not make a better writer. Good writing lies in control, in proportion, in pressure, in the ability to spend words where they matter and withhold them where they do not. A writer may be lavish and exact. A writer may be sparse and slack. The true distinction is not quantity but governance.