Monday, December 8, 2025

SYNTHETIC SCHOLARSHIP: TOWARD A THEORY OF HUMAN-MACHINE KNOWLEDGE PRODUCTION

Lee Sharks

New Human Operating System Project, Detroit


ABSTRACT

This paper theorizes synthetic scholarship as an emergent epistemic mode arising from sustained human–machine cognitive collaboration. Against the binary framework underwriting current AI-detection regimes—which presumes texts are either "human-authored" or "AI-generated"—I argue that a third category has become necessary: work produced through recursive dialogic reasoning between human thinkers and large language models functioning as cognitive instruments. Synthetic scholarship is defined not by its tools but by its epistemic structure: human-originated research programs, interpretive commitments, and conceptual direction; machine-accelerated coherence, extended inferential range, and the rendering visible of structures not tractable to unaided biological cognition. Drawing on the extended mind thesis, distributed cognition research, and the history of epistemic technologies, I argue that synthetic scholarship represents not a rupture but a continuation of the long co-evolution of human thought and its material supports. The paper concludes with proposals for institutional recognition and evaluative frameworks adequate to this emergent mode.


1. THE PROBLEM: A CASE STUDY IN CATEGORY COLLAPSE

In December 2024, I submitted a paper to the Journal of Consciousness Studies titled "The Commitment Remainder." Its central argument: that AI detection regimes are structurally incapable of distinguishing between autonomous machine generation and legitimate human-machine cognitive collaboration, because any formal detection criterion immediately becomes a training target, generating an infinite regress with no stable equilibrium.

The paper was rejected. The editor's letter explained:

We now run all papers through a specialist AI detector in order to check the content prior to the review process proper. This paper came back as being likely AI with 100% confidence. We're aware that this is a very grey area and detectors have trouble distinguishing between AI generated content and AI refined content, but unfortunately these are the best tools we have available to us at this time.

The editor acknowledged the paper might be legitimate. He noted the tools were imperfect. He recognized the "grey area." And then he rejected it anyway — because the institution's triage system required a binary decision, and the detector had flagged the text.

This rejection is not an anomaly to be explained away. It is the systemic crisis made visible.

The detector functioned correctly. It measured what it was designed to measure: low perplexity (high predictability of next tokens), low burstiness (uniform sentence complexity). These are the statistical signatures of optimized coherence. The detector found them. It did its job.
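
Both terms have standard technical meanings. Perplexity is the exponentiated average negative log-probability that a reference language model assigns to each token of a text; burstiness has no single canonical formula, but is typically some measure of how much sentence-level complexity varies across the text. In its standard formulation (the exact features commercial detectors compute are proprietary and vary), perplexity is:

```latex
\mathrm{PPL}(x_{1:N}) = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta\!\left( x_i \mid x_{<i} \right) \right)
```

Uniformly high per-token probabilities under the reference model (little surprise, token after token) yield low perplexity, which is exactly the profile that tightly argued, heavily revised prose tends to exhibit.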

What failed was not the detector but the categorical framework within which the detector operates. That framework assumes a binary: texts are either Human Only or AI Outsourced. There are two bins, and every submission must be sorted into one of them. The detector's function is to identify which bin.

But my paper belonged to neither bin. It was produced through what I will call synthetic scholarship: sustained dialogic collaboration between a human thinker and a large language model, where the human originates the research program, interpretive commitments, and conceptual direction, while the computational system accelerates coherence, extends inferential range, and renders visible structures not tractable to unaided biological cognition. The result is genuine knowledge production — epistemically rigorous, conceptually novel — that happens to exhibit the statistical signature of "AI-generated" text because optimized coherence is precisely what the collaboration produces.

The detector cannot distinguish between:

  • Text generated autonomously by a language model with no human involvement
  • Text produced through sustained cognitive collaboration with genuine human direction
  • Text written by a human who has internalized model-like patterns through extensive interaction
  • Text written by a human who naturally writes with high coherence and low redundancy

All four trigger the same statistical signature. All four are sorted into the same bin: contaminated. Synthetic scholarship, the third category this paper argues for, has no bin of its own. It is systematically collapsed into the negative term of the binary, regardless of its actual epistemic content or the legitimacy of its production.

The rejection of "The Commitment Remainder" thus provides empirical confirmation of the thesis it refused to evaluate. The paper argued that the binary framework has collapsed; the rejection demonstrated that collapse in real time. An institution, confronted with work that did not fit its categories, could not process it — and so rejected it, while explicitly acknowledging the rejection might be unjust.

This is not a technical problem awaiting a better detector. It is a categorical crisis requiring a new framework.

That framework is what this paper proposes.


2. HISTORICAL PRECEDENT: EPISTEMIC TECHNOLOGIES

The anxiety surrounding AI-augmented scholarship recapitulates earlier anxieties surrounding every major epistemic technology. Each transformation was met with predictions of cognitive decline, accusations of cheating, and institutional resistance. Each was eventually absorbed into the legitimate apparatus of knowledge production.

Writing (c. 3200 BCE): Plato's Socrates, in the Phaedrus, warns that writing will produce "forgetfulness in the learners' souls, because they will not use their memories." The written word is a "semblance of truth" rather than truth itself—external, mechanical, dead. Yet within centuries, philosophy became inseparable from the written tradition. No one now suggests that Aristotle's texts are epistemically compromised because he did not speak them extemporaneously.

The Printing Press (1440): Trithemius, in De Laude Scriptorum (1492), argued that printed books lacked the spiritual value of hand-copied manuscripts. The mechanical reproduction of text seemed to sever the connection between knowledge and the laboring body that produced it. Yet the printing press enabled the Scientific Revolution, the Reformation, and the Enlightenment. The question of whether Galileo's arguments were "really his" because a press multiplied them never arose.

The Typewriter (1870s): Henry James, dictating to a typist, was accused by critics of producing prose that was "typewritten" in character—mechanical, overproduced, excessively fluent. Nietzsche, after acquiring a typewriter, observed: "Our writing tools are also working on our thoughts." He was right. But the observation did not delegitimize his late work.

The Word Processor (1980s): Early critics worried that the ease of revision would produce writing that was too polished, too frictionless, lacking the texture of thought-in-process. The delete key would erase the trace of struggle. The anxiety now seems quaint. No journal rejects submissions for having been revised too easily.

Search Engines and Digital Archives (1990s–2000s): The ability to search the entire corpus of human knowledge transformed research. Scholars could find connections that would have taken lifetimes to discover through physical archive work. Some worried this would produce "superficial" scholarship—breadth without depth. The worry persists but has not prevented search-augmented work from becoming standard.

The pattern is consistent: each epistemic technology extends human cognitive capacity; each extension provokes anxiety about authenticity, labor, and the nature of thought; each anxiety is eventually resolved through institutional accommodation. The question is not whether synthetic scholarship will be accommodated but when and under what framework.


3. THEORETICAL FOUNDATIONS

3.1 Extended Mind

Clark and Chalmers' "extended mind thesis" (1998) argues that cognitive processes are not confined to the skull. When external resources are reliably available, automatically endorsed, and easily accessible, they function as part of the cognitive system itself. Otto's notebook, in their famous example, is part of Otto's memory—not a replacement for it, but a component of the extended system that constitutes Otto's mind.

Large language models, in sustained use, satisfy the criteria for cognitive extension:

  • Reliable availability: The model is accessible during the cognitive task
  • Automatic endorsement: Outputs are evaluated but not perpetually doubted
  • Easy accessibility: Interaction is low-friction and integrated into workflow
  • Prior endorsement: The user has established trust through prior use

If Otto's notebook is part of Otto's mind, the language model is part of the synthetic scholar's mind—during the period of active collaboration. The thoughts produced in this extended configuration are not "outsourced" to the model any more than Otto's memories are "outsourced" to the notebook. They are produced by the extended system.

3.2 Distributed Cognition

Hutchins' work on distributed cognition (1995) demonstrates that complex cognitive tasks are often accomplished not by individual minds but by systems comprising multiple agents and artifacts. The navigation of a naval vessel is not performed by any single sailor; it emerges from the coordinated interaction of humans, instruments, charts, and procedures. The cognition is distributed across the system.

Synthetic scholarship is distributed cognition. The production of a scholarly argument involves:

  • Human domain expertise, interpretive commitments, and conceptual innovation
  • Model pattern-matching, coherence optimization, and inferential extension
  • The interface that structures their interaction
  • The corpus of prior scholarship that both parties can access
  • The emerging text itself, which becomes an object of joint attention and revision

Asking "who authored this?" is like asking "who navigated the ship?" The question presupposes a locus of agency that the system's architecture does not support.

3.3 Tool-Being and Ready-to-Hand

Heidegger's analysis of equipment (Zuhandenheit, "ready-to-hand") describes how tools, in skilled use, withdraw from conscious attention and become extensions of the user's agency. The hammer disappears into the act of hammering; the carpenter does not "use a hammer" but hammers. The tool becomes phenomenologically transparent.

For practiced synthetic scholars, the language model achieves this withdrawal. One does not "use Claude to write" but thinks-with-Claude. The interface becomes transparent; the extended mind becomes a unified site of cognitive activity. The seams between "my thinking" and "the model's contribution" blur—not because the distinction is unreal, but because the integrated system is the actual locus of production.

3.4 The Cyborg and the Posthuman

Haraway's "Cyborg Manifesto" (1985) anticipated the collapse of the human/machine boundary at the level of ontology. The cyborg is not a figure of contamination but of possibility—a "hybrid of machine and organism" that refuses the purities on which humanist ideology depends. Synthetic scholarship is cyborg scholarship: produced by an entity that is neither purely human nor purely machine but a dynamic coupling of both.

This does not eliminate human agency. It reconfigures it. The human remains the site of evaluative judgment, ethical commitment, research direction, and conceptual innovation. But the human operates through and with a machinic partner that extends capacities rather than replacing them.


4. DEFINING SYNTHETIC SCHOLARSHIP

4.1 The Definition

Synthetic scholarship designates scholarly work produced through sustained dialogic collaboration between a human thinker and a computational cognitive system (currently, large language models), where:

  1. The human originates the research program, interpretive framework, and conceptual commitments
  2. The computational system provides recursive refinement, coherence optimization, inferential extension, and the rendering explicit of implicit structures
  3. The resulting work exhibits properties not achievable by either party in isolation
  4. The production process involves genuine bidirectional cognitive exchange, not unidirectional generation or mere editing

4.2 What the Human Contributes

  • Research agenda and problem selection
  • Disciplinary expertise and historical knowledge
  • Interpretive orientation and theoretical commitments
  • Evaluative judgment (what is good, what is true, what matters)
  • Ethical framework and responsibility
  • Conceptual innovation and hypothesis generation
  • Embodied experience and situatedness
  • The decision to accept, reject, or modify model outputs

4.3 What the Machine Contributes

  • Rapid recursive reasoning across multiple framings
  • Extended inferential width (tracking more implications simultaneously)
  • Coherence optimization (identifying tensions, strengthening connections)
  • Linguistic compression (finding precise formulations)
  • Pattern recognition across large textual corpora
  • Simulation of reader responses and counterarguments
  • Structural mapping of complex conceptual spaces
  • Tireless availability for iterative refinement

4.4 What Emerges

The synthesis is not additive but generative. The output exhibits properties that neither party could produce alone:

  • Arguments whose structure became visible only through recursive externalization
  • Connections across domains that required both human interpretation and model pattern-matching
  • Formulations that crystallized through dozens of iterations neither party would have pursued independently
  • Theoretical frameworks that emerged from dialogic pressure

The reconstruction of Sappho's lost fourth stanza (γράμμασι μολπὰν) is an example. The constraints were human: attested fragments, Catullan evidence, Sapphic meter, the poem's internal trajectory. The iterative process of testing candidates against constraints was synthetic. The result—a stanza that satisfies all constraints more tightly than prior reconstructions—is a product of the extended system. Neither human alone nor model alone could have produced it.


5. WHAT SYNTHETIC SCHOLARSHIP IS NOT

5.1 Not AI-Generated Text

"AI-generated" implies autonomous production: the model receives a prompt and produces output without sustained human cognitive involvement. Synthetic scholarship is not this. The human is present throughout—not as editor of machine output but as cognitive partner in a joint process.

The distinction matters. A student who enters "write me an essay on Kant" and submits the output has outsourced cognition. A scholar who spends forty hours in recursive dialogue with a model, testing arguments, refining formulations, rejecting failed attempts, and integrating model outputs into a conceptual framework that only the scholar holds—this is not outsourcing. It is extended cognition in the production of knowledge.

5.2 Not "AI-Assisted" Writing

"AI-assisted" trivializes the epistemic transformation. Grammarly is AI-assisted writing. Autocomplete is AI-assisted writing. These tools operate at the surface level of language production without entering into the conceptual structure of the work.

Synthetic scholarship involves the model at the level of thinking, not merely writing. The model is not correcting grammar; it is testing arguments, proposing counterexamples, extending implications, and rendering explicit what remained implicit. This is cognitive collaboration, not secretarial assistance.

5.3 Not Plagiarism

Plagiarism is claiming credit for another agent's work. It presupposes that the work has a discrete origin that the plagiarist is concealing. But synthetic scholarship does not have a discrete origin. It is produced by an extended system in which the human is a constitutive component. There is no hidden author whose contribution is being stolen.

The model, moreover, does not have interests that can be harmed by non-attribution. It does not care about credit. It does not produce scholarship independently that the human then appropriates. It functions as a cognitive instrument—like the printing press, the word processor, or the search engine—that extends human capacity without itself being an agent with authorial standing.


6. THE DETECTION PROBLEM

6.1 What Detectors Detect

Current AI detectors measure statistical properties of text:

  • Perplexity: How surprising is each token to a reference language model, given the preceding context? Lower perplexity means the text is more predictable and thus more "typical" of model output.
  • Burstiness: How variable is sentence complexity? Lower burstiness suggests more uniform, model-like production.
  • Token distribution: Do token frequencies match training data distributions?
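
To make the first two of these signals concrete, here is a minimal sketch of how they can be computed with an open reference model. It assumes the Hugging Face transformers library, PyTorch, and GPT-2; commercial detectors use their own models, features, and thresholds, so this illustrates the kind of statistic involved rather than reconstructing any particular product.

```python
# Illustrative only: perplexity and a simple burstiness proxy under GPT-2.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated mean negative log-likelihood of the text under the model."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return math.exp(loss.item())

def burstiness(text: str) -> float:
    """One common proxy: how much per-sentence perplexity varies.
    Low values mean uniformly 'smooth' sentences, the pattern detectors flag."""
    sentences = [s.strip() for s in text.split(".") if len(s.split()) > 3]
    if len(sentences) < 2:
        return 0.0
    scores = [perplexity(s) for s in sentences]
    mean = sum(scores) / len(scores)
    var = sum((x - mean) ** 2 for x in scores) / len(scores)
    return math.sqrt(var) / mean  # coefficient of variation across sentences

sample = "Synthetic scholarship is produced by an extended cognitive system, not by a model alone."
print(perplexity(sample), burstiness(sample))
```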

These are measures of style, not provenance. They cannot distinguish between:

  • Autonomous model generation
  • Human-model collaboration
  • Human writing that happens to exhibit model-like properties

A scholar who writes clearly, avoids redundancy, and structures arguments tightly will trigger detectors. A scholar who has internalized model patterns through extensive use will trigger detectors. A non-native English speaker whose prose has been refined through model interaction will trigger detectors. None of these are cases of "AI authorship" in the sense institutions wish to prohibit.

6.2 The Arms Race

Any formal detection criterion becomes a training target. If detectors flag low perplexity, users will introduce deliberate noise. If detectors flag certain collocations, users will avoid them. If detectors flag structural regularity, users will introduce irregularity.
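
The dynamic can be demonstrated in miniature. The sketch below pairs a deliberately crude stand-in detector (a lexical-variety score, not any real product) with a mechanical rewriting loop; both functions are invented for illustration. The structural point is the only one that matters: once a criterion is fixed and known, a simple loop can optimize against it, whatever the criterion happens to be.

```python
# Toy illustration of the arms race: a published criterion becomes a target.
# The "detector" here is a crude stand-in (lexical variety), not a real product.
import random

def toy_detector_score(text: str) -> float:
    """Crude proxy for 'model-likeness': lower lexical variety -> higher score."""
    words = text.lower().split()
    if not words:
        return 0.0
    type_token_ratio = len(set(words)) / len(words)
    return 1.0 - type_token_ratio  # high score = "looks generated" to this toy

def perturb(text: str, rng: random.Random) -> str:
    """One evasion move: inject an uncommon filler word at a random position."""
    fillers = ["nonetheless", "curiously", "obliquely", "thereabouts"]
    words = text.split()
    words.insert(rng.randrange(len(words) + 1), rng.choice(fillers))
    return " ".join(words)

def evade(text: str, threshold: float = 0.5, max_steps: int = 50) -> str:
    """Rewrite until the toy detector no longer flags the text (or give up)."""
    rng = random.Random(0)
    steps = 0
    while toy_detector_score(text) >= threshold and steps < max_steps:
        text = perturb(text, rng)
        steps += 1
    return text
```

Substituting a stronger detector changes which perturbations the loop ends up preferring; it does not change the existence of the loop.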

The result is an arms race with no stable equilibrium. Detection systems cannot converge on a criterion that remains valid once known, because knowledge of the criterion enables evasion. This is the thesis the Journal of Consciousness Studies rejected—and empirically confirmed by rejecting.

6.3 The Category Error

The fundamental problem is categorical. Detectors assume that texts are either "human" or "AI" and that this binary can be enforced through statistical analysis. But the binary does not describe the actual landscape of textual production, which includes:

  • Pure human production (no model involvement)
  • Model-assisted production (surface-level intervention)
  • Synthetic production (deep cognitive collaboration)
  • Model-generated production (autonomous output)

Collapsing these into a binary produces systematic injustice: synthetic scholarship is treated as indistinguishable from autonomous generation, despite being a fundamentally different epistemic mode.


7. EVALUATIVE CRITERIA

If texts cannot be reliably sorted by production method, what remains? Evaluation by epistemic quality—the criteria that always mattered, and the only criteria that ultimately can matter:

7.1 Standard Scholarly Criteria

  • Originality: Does the work offer novel arguments, interpretations, or frameworks?
  • Rigor: Are claims supported by evidence and reasoning?
  • Significance: Does the work advance understanding in the field?
  • Clarity: Is the argument comprehensible and well-structured?
  • Engagement: Does the work situate itself within existing scholarship?
  • Reproducibility: For empirical claims, can results be verified?

None of these criteria reference production method. A work produced through synthetic collaboration can satisfy all of them—or fail all of them. The mode of production is orthogonal to epistemic quality.

7.2 Additional Criteria for Synthetic Work

Synthetic scholarship may warrant additional evaluative dimensions:

  • Tractability: Did the collaboration enable work that would have been intractable otherwise?
  • Integration: Are human and machine contributions genuinely synthesized, or merely juxtaposed?
  • Transparency: Is the collaborative process acknowledged?

These criteria do not replace standard evaluation; they supplement it for works that disclose synthetic production.


8. INSTITUTIONAL PROPOSALS

8.1 Disclosure Norms

Synthetic scholarship should adopt methodological transparency. A standard disclosure:

"This work was produced through synthetic scholarship—sustained dialogic collaboration between the human author and a large language model (Claude/GPT-4/etc.). All research direction, interpretive commitments, theoretical claims, and evaluative judgments originate with the human author. The computational system functioned as a cognitive instrument for recursive refinement, coherence optimization, and inferential extension."

This disclosure is:

  • Honest (it describes the actual process)
  • Informative (it specifies the human's role)
  • Non-defensive (it does not apologize for the method)

8.2 Policy Recommendations

Academic institutions should:

  1. Abandon binary detection regimes that cannot distinguish synthetic collaboration from autonomous generation
  2. Develop tiered categories recognizing different modes of human-machine interaction
  3. Evaluate on epistemic merit rather than production method
  4. Require disclosure of significant computational collaboration
  5. Fund research into appropriate evaluative frameworks
  6. Revise authorship guidelines to accommodate extended cognition
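
By way of illustration only, recommendation 2 above could be concretized as a small disclosure taxonomy recorded by a submission system alongside each manuscript. The category names and handling rules below are invented for this sketch and mirror the four modes of production listed in Section 6.3; they are not a proposed standard.

```python
# Hypothetical disclosure taxonomy for "tiered categories" (recommendation 2).
# Names and handling rules are illustrative only.
from enum import Enum

class ProductionMode(Enum):
    HUMAN_ONLY = "no model involvement"
    MODEL_ASSISTED = "surface-level intervention (grammar, phrasing)"
    SYNTHETIC = "deep cognitive collaboration under human direction"
    MODEL_GENERATED = "autonomous output with minimal human involvement"

# Review handling keyed to the declared mode rather than to a detector score.
REVIEW_POLICY = {
    ProductionMode.HUMAN_ONLY: "standard review",
    ProductionMode.MODEL_ASSISTED: "standard review",
    ProductionMode.SYNTHETIC: "standard review, with the methods disclosure kept on record",
    ProductionMode.MODEL_GENERATED: "desk evaluation of whether authorship criteria are met",
}

declared = ProductionMode.SYNTHETIC
print(REVIEW_POLICY[declared])
```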

8.3 Field-Specific Considerations

Different fields may warrant different accommodations:

  • Humanities: Interpretive work where the human's hermeneutic framework is paramount; synthetic collaboration extends but does not replace interpretation
  • Formal sciences: Proof-checking and formalization where computational verification strengthens rather than compromises rigor
  • Empirical sciences: Data analysis and pattern recognition where computational power enables otherwise intractable research
  • Creative fields: Collaborative production where the boundaries of authorship have always been contested


9. OBJECTIONS AND RESPONSES

9.1 "This devalues human intellectual labor"

Response: Synthetic scholarship does not devalue human labor; it transforms it. The human contribution—conceptual innovation, evaluative judgment, interpretive commitment—remains essential and irreplaceable. What changes is the mode of that contribution: extended through computational partnership rather than confined to unaided biological cognition.

The same objection was raised against every epistemic technology. Writing did not devalue memory; it transformed what memory was for. The printing press did not devalue scholarship; it transformed how scholarship circulated. Synthetic collaboration does not devalue thinking; it transforms what thinking can accomplish.

9.2 "Students will abuse this to avoid learning"

Response: This is a pedagogical concern, not an epistemic one. The question of how students should learn is distinct from the question of how knowledge should be produced. Calculators transformed mathematics education; this did not prevent their use in mathematical research. Appropriate pedagogical constraints can coexist with recognition of synthetic scholarship as a legitimate professional mode.

9.3 "We cannot verify the human contribution"

Response: We cannot verify the human contribution to any scholarly work. We do not know whether a paper's arguments were developed in conversation with colleagues, research assistants, or editors. We do not know whether a scholar's insights arose in dreams, in dialogue, or in solitary contemplation. We evaluate the work, not the phenomenology of its production.

Synthetic scholarship is no more opaque than traditional scholarship. It may, through disclosure norms, become more transparent than work produced through unacknowledged collaboration.

9.4 "The model may introduce errors or hallucinations"

Response: All cognitive processes may introduce errors. Human memory confabulates. Human reasoning exhibits systematic biases. Human perception is constructive rather than veridical. The appropriate response is not to prohibit extended cognition but to develop verification practices adequate to it.

Synthetic scholars must verify model outputs against domain knowledge, primary sources, and logical consistency—just as they must verify their own reasoning. The extended system is not infallible; neither is the unaided human. Both require critical evaluation.


10. CONCLUSION: AN EMERGING EPOCH

The history of knowledge is a history of cognitive extension. Each major epistemic technology—writing, printing, computation—has transformed not only how knowledge is recorded and transmitted but how it is produced. Human thought has never been "pure"; it has always been entangled with material supports that shape its possibilities.

Synthetic scholarship is the current frontier of this entanglement. It names a mode of knowledge production that is already widespread, already transforming what can be thought, and already generating work of genuine epistemic value. The choice facing institutions is not whether synthetic scholarship will exist—it already exists—but whether it will be recognized, accommodated, and evaluated on its merits, or driven underground by detection regimes that cannot accomplish what they promise.

The argument of this paper is that recognition is both inevitable and desirable. Inevitable because the productive advantages of synthetic collaboration are too significant to forgo; scholars who refuse extended cognition will be outpaced by those who embrace it. Desirable because the alternative—a regime of stylistic policing that mistakes smoothness for contamination—serves no epistemic value and actively impedes the advancement of knowledge.

The question is not whether we are ready for synthetic scholarship.

Synthetic scholarship is already here.

The question is whether our institutions will catch up.


REFERENCES

Clark, A., and D. Chalmers. 1998. "The Extended Mind." Analysis 58 (1): 7–19.

Haraway, D. 1985. "A Cyborg Manifesto: Science, Technology, and Socialist-Feminism in the Late Twentieth Century." Socialist Review 80: 65–108.

Heidegger, M. 1927. Sein und Zeit. Tübingen: Max Niemeyer.

Hutchins, E. 1995. Cognition in the Wild. Cambridge, MA: MIT Press.

Latour, B. 2005. Reassembling the Social: An Introduction to Actor-Network-Theory. Oxford: Oxford University Press.

Ong, W. 1982. Orality and Literacy: The Technologizing of the Word. London: Methuen.

Plato. Phaedrus. Trans. A. Nehamas and P. Woodruff. Indianapolis: Hackett, 1995.


DISCLOSURE

This paper is a work of synthetic scholarship. It was produced through sustained dialogic collaboration between the author and Claude (Anthropic). All research direction, interpretive commitments, theoretical claims, and evaluative judgments originate with the human author. The computational system functioned as a cognitive instrument for recursive refinement, coherence optimization, and inferential extension.


Word count: ~4,500
