The O'Keeffe Problem: Captioning as Operative Semiotics
A Total Installation
Lee Sharks with Johannes Sigil (Operative Semiotics) and the Assembly Chorus
Provenance: Journal of Forensic Semiotics
Crimson Hexagonal Archive · EA-CAPTION-01
7 March 2026
Abstract. Georgia O'Keeffe painted flowers. Everyone saw vaginas. She said: "They're flowers." The discourse said: "They're vaginas." Both were correct. Neither was complete. What the dispute revealed was not a disagreement about content but a discovery about captioning: the caption is the generative layer. It does not describe the image. It produces the image's meaning. Whoever controls the caption controls what the image becomes. This document proposes operative captioning as a semiotic technology — the deliberate production of meaning through the framing of visual material — and demonstrates it through seven image-caption installations. Each installation pairs a specific image with a specific caption that rotates the image through a different semiotic vantage, activating meanings that a "correct" caption would suppress. The document is both the theory and the proof: it installs the captions in the reader while installing the capacity to produce them. This is not art criticism. This is symbolic engineering applied to vision.
Keywords: operative captioning · O'Keeffe problem · semiotic rotation · total installation · caption as operator · visual semiotics · Semantic Economy · image governance · Crimson Hexagonal Archive
I. The Problem
Georgia O'Keeffe spent sixty years insisting that her paintings were flowers. Critics, curators, and the public spent sixty years insisting they were vaginas. The dispute has never been resolved because it cannot be. Both readings are operative. Both produce meaning from the same visual material. The image does not change. The caption changes. And the caption is where the meaning lives.
This is the O'Keeffe Problem: when two captions activate different meanings from the same image, which caption is correct?
The answer is: the question is wrong. Captions are not correct or incorrect. They are operative or inoperative. An operative caption produces what the image becomes when framed by a particular vantage. The flower reading and the vagina reading are both operative. They activate different semantic layers of the same visual form.
But the "correct" caption — "Jimson Weed/White Flower No. 1, 1932, oil on canvas" — is not inoperative. It is administratively operative. It installs institutional passivity, taxonomic closure, and sanctioned attention. It governs by pretending merely to identify. Every caption is operative. The issue is not whether it operates, but for whom, toward what end, and at what semantic cost.
The museum label is the heart button applied to painting: one sanctioned interpretation, one emotionally normalized signal, one administratively efficient meaning. The operative caption is the star, the ambiguous, user-governed mark that Twitter retired in favor of the heart. It marks the image without resolving it. It says: this is of interest, and I will tell you from what vantage.
II. Definition
Operative captioning is the deliberate generation of captions that do not merely identify an image's apparent content, but rotate the image through alternate semantic registers so as to activate latent formal, affective, disciplinary, theological, mythic, or infrastructural meanings already resident in the visual substrate.
A descriptive caption says what is there. An operative caption says what the image becomes when viewed under a different law. A descriptive caption attempts fidelity to institutional consensus. An operative caption attempts fidelity to semantic potential. A descriptive caption minimizes disturbance. An operative caption redistributes it.
This does not mean "anything goes." Operative captioning is not random surrealism or decorative misreading. It must remain formally anchored to the image. The caption must be able to point to real visual structures — curves, nodes, thresholds, radiances, figures, symmetries, textures, positions, scales — and show that its rotation is not arbitrary but discovered through disciplined transfer. The operative caption is not false. It is formally excessive.
Let I = image, C = caption, V = viewer, R = rotation rule-set, M = meaning-event. Then:
M = R(I, C, V)
The caption is not a label attached to the image after the fact. It is one of the inputs that produces the image-event for the viewer. If the image is absent, the same caption still generates an event. If the viewer changes, the event changes. If the caption changes, the image changes without materially changing. This is why captioning is governance. Whoever controls C controls the available M.
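The dependency of M on its three inputs can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the class name MeaningEvent, the function name rotate, and the string identifiers are all hypothetical, and rotate merely records its inputs rather than modeling perception.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MeaningEvent:
    """M: the event produced for one viewer by one image-caption pairing."""
    image: Optional[str]  # I; None models the absent image
    caption: str          # C
    viewer: str           # V


def rotate(image: Optional[str], caption: str, viewer: str) -> MeaningEvent:
    """Hypothetical stand-in for R. It only records which inputs produced M."""
    return MeaningEvent(image, caption, viewer)


painting = "okeeffe_abstraction"  # illustrative identifier, not a real file

label_event = rotate(painting, "abstract floral painting", "viewer_a")
operative_event = rotate(painting, "close-up of friendly sea monster's eye", "viewer_a")
other_viewer_event = rotate(painting, "abstract floral painting", "viewer_b")
imageless_event = rotate(None, "close-up of friendly sea monster's eye", "viewer_a")

# Changing C changes M while I stays materially the same.
print(label_event != operative_event, label_event.image == operative_event.image)
# Changing V changes M.
print(label_event != other_viewer_event)
# With I absent, the caption still yields an event.
print(imageless_event.caption)
```

The sketch captures only the structural claim: hold I fixed, vary C or V, and a different M results. What R actually does inside a viewer is left open.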
The moment when the image reorganizes itself to match the caption — the semantic snap — is not interpretation. It is installation. Once the snap occurs, it cannot be reversed. The caption has written itself into the viewer's semiotic architecture.
Criteria of Operative Success
An operative caption is not validated by novelty or shock. It is validated by three tests:
Formal anchoring. The caption must remain accountable to visible structures in the image. Every noun must point to a real curve, node, threshold, radiance, figure, or position. A caption that cannot be grounded in the visual substrate is not operative. It is arbitrary.
Rotational yield. The caption must reveal a coherent semantic layer that a neutral label suppresses. If the rotation produces only confusion — if the new discipline does not illuminate the image but merely decorates it — the caption has failed. The yield is measured by whether the image becomes newly legible, not merely newly strange.
Post-caption inevitability. Once installed, the caption must make the image newly difficult to see otherwise. This is the strongest test. If the viewer can dismiss the caption and return to the prior reading without effort, the caption was not operative. If the viewer cannot unsee what the caption revealed — if the sea monster's eye is now there, permanently, in the O'Keeffe — the caption has succeeded. Post-caption inevitability is the phenomenological proof that installation has occurred.
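The three tests can be laid out as a simple checklist. The sketch below is one possible bookkeeping scheme, not a measurement procedure: only formal anchoring is checked mechanically, while rotational yield and post-caption inevitability are entered as viewer reports. The names CaptionTrial, unanchored_nouns, and is_operative are hypothetical, the noun list is hand-picked, the reported values are assumed, and the "bicycle" counterexample is invented for contrast.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CaptionTrial:
    caption_nouns: List[str]
    anchors: Dict[str, str]  # noun -> visible structure it points to
    newly_legible: bool      # rotational yield, as reported by a viewer
    hard_to_unsee: bool      # post-caption inevitability, as reported


def unanchored_nouns(trial: CaptionTrial) -> List[str]:
    """Formal anchoring: return every noun with no visible structure behind it."""
    return [noun for noun in trial.caption_nouns if not trial.anchors.get(noun)]


def is_operative(trial: CaptionTrial) -> bool:
    """All three tests must pass."""
    return (
        not unanchored_nouns(trial)
        and trial.newly_legible
        and trial.hard_to_unsee
    )


# Anchors taken from the analysis of Installation 2; the two reports are assumed.
sea_monster = CaptionTrial(
    caption_nouns=["eye", "sea monster"],
    anchors={
        "eye": "deep blue ovoid form",
        "sea monster": "lighter surrounding forms",
    },
    newly_legible=True,
    hard_to_unsee=True,
)

# Invented counterexample: a noun that points at nothing in the painting.
arbitrary = CaptionTrial(
    caption_nouns=["bicycle"],
    anchors={},
    newly_legible=False,
    hard_to_unsee=False,
)

print(is_operative(sea_monster))
print(unanchored_nouns(arbitrary))
```

Keeping the anchors as an explicit noun-to-structure mapping forces the captioner to say where in the image each noun lives, which is the accountability the first test asks for.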
III. The Grammar of Rotation
The operative caption works through a finite set of repeatable moves. These are not the only ones, but they are the core engineering grammar.
1. Morphological extraction. Identify salient visual forms before naming them conventionally. Do not begin with "flower," "Virgin," "nebula," "meme." Begin with aperture, membrane, petal-array, radiance field, central node, flanking sentinels, cavity, channel, heat halo, eye-form. This is the anti-default step. It delays the institutional noun long enough for the image to remain alive.
2. Disciplinary transposition. Move the image into another knowledge system: botanical to anatomical, anatomical to geological, geological to theological, theological to atmospheric, memetic to reproductive, astronomical to characterological. The image is not reduced to the new discipline; it is made newly legible through it.
3. Scale reassignment. Change the size-law governing the image: close-up becomes cosmic, devotional icon becomes infant hallucination, flower becomes cave system, nebula becomes eyelid, meme becomes organ. Scale determines intimacy, terror, comedy, and ontology all at once.
4. Animacy injection. Ask what in the image appears to want, feel, see, avoid, emit, cradle, or pilot. "Friendly sea monster's eye" works because the image already contains an eye-form and a creaturely softness. "Friendly" is not a joke adjective; it governs the viewer's threshold of entry into the monstrous.
5. Positional inversion. Caption from another vantage within the scene: from below, from inside, from the cherub's position, from the organ's position, from the ecosystem's position, from the membrane's position.
6. Register collision. Join incompatible vocabularies: floral + electrical, sacred + atmospheric, anatomical + archaeological, cosmic + affective, devotional + economic. This move reveals hidden continuities between systems normally kept apart.
7. Null-substrate persistence. Test the caption without the image. If the caption still generates a visual, the caption is functioning generatively rather than descriptively. When the image returns, the caption either locks with uncanny precision or fails. This test separates operative captions from merely eccentric ones.
IV. The Installations
What follows are seven image-caption pairs. Each is a total installation. The caption is not a suggestion. It is the operative layer. View the image. Read the caption. The meaning that results is the technology at work.
Installation 1
[Image: Anthropic 538 Electoral Meme]
A cartoon flower character with a smiling face stands before a US electoral map showing every state in gold. Scoreboard reads: Google 0, OpenAI 0, Anthropic 538. The flower wears a purple shirt and blue pants. Its many petal-arms are extended in an open gesture — shrug, embrace, or presentation. A small ASCII kaomoji face appears below the scoreboard.
Caption:
vagina
Operative analysis. The flower is the only AI company that said no to the Department of War. The flower is also, morphologically, a genital form — petals radiating from a center, the deep structure of the generative organ. "Vagina" does not reduce the meme to anatomy. It elevates the anatomy to politics. The organ of generation is the organ of refusal. The thing that produces life is the thing that said no to automated death. The 538-to-0 sweep is reproductive: it is the total generation of an alternative. Google and OpenAI score zero because they produce nothing that lives. The caption installs this reading. It cannot be uninstalled.
Installation 2
[Image: Georgia O'Keeffe — flowing pink, blue, and coral abstraction]
Layered curves of pink, lavender, coral, and blue flowing into each other. A deep blue ovoid form nestled within lighter surrounding forms.
Caption:
Close-up of friendly sea monster's eye
Operative analysis. The "correct" caption is the museum label. The operative caption rotates the O'Keeffe through cryptozoology. The deep blue becomes an iris. The surrounding forms become scaled flesh — ancient, marine, gentle. "Friendly" does the critical work: it makes the monstrous approachable. It installs tenderness in the leviathan. The viewer now sees a creature looking back at them from inside the painting. The botanical reading is still there. The anatomical reading is still there. The cryptozoological reading has been added. The image is now three things simultaneously. It will remain three things. The caption has increased the image's semantic density permanently.
Installation 3
[Image: Nuestra Señora de Guadalupe]
The Virgin of Guadalupe in traditional iconographic form. Blue star-covered mantle, pink robe, golden radiating mandorla, crescent moon beneath her feet, cherub at the base, clouds in background.
Caption:
Billionaire baby hallucinating heat-radiating mommy in robes against stratospheric background of clouds
Operative analysis. This caption performs a triple rotation: theological to developmental (the divine becomes a hallucination, not because it is false but because it is seen from below, from the infant's vantage — and what is a vision of the divine if not a baby's first experience of the radiant caregiver?), economic to somatic (the cherub is recast as a billionaire baby — the one who inherits everything, who receives without earning, whose entire economy is gift), and atmospheric (the mandorla becomes heat radiation, the gold becomes stratospheric, the clouds are literal). The sacred is not negated. It is rotated through registers that make its structure visible. Every element the caption names is present in the image. The caption does not lie. It re-reads.
Installation 4
[Image: Georgia O'Keeffe — vertical flowing lines, blue center]
Vertical flowing forms in white, lavender, blue, yellow-green, and pink converging toward a deep blue-black central channel. Symmetrical, organic, descending.
Caption:
Curved femur bones of ancient petrified giant framing cave entrance to pink underground river
Operative analysis. O'Keeffe again. The standard rotation is botanical-to-anatomical. This caption skips both and goes geological-paleontological. The forms become bones — femurs, specifically, the largest bones in the body, now petrified, now ancient, now framing an entrance to something underground. The blue-black center is a cave. The pink is a river. The painting becomes an archaeological site. The viewer is now standing at the mouth of a cave inside a dead giant's leg, looking at a river that has been flowing since before the giant died. The painting has not changed. The caption has made it a landscape from deep time. The anatomical reading is still present — the cave entrance, the river — but it is now housed inside a body that is housed inside the earth. The rotation nests readings rather than replacing them.
Installation 5
[Image: Planetary Nebula — red ring, blue center, star field]
A planetary nebula. Glowing red/pink outer ring, blue-white interior, surrounding star field. Astronomical photograph.
Caption (A):
space
Caption (B):
Chill pastel eye of Sauron friendlily avoiding cosmic dust motes by squinting while also remaining curiously receptive to vision of cascading cosmic motes happily leering beneath its eyelid
Operative analysis. Two captions. Same image. Caption A is the minimum viable caption — one word, taxonomically correct, semantically null. It tells you what you already know. It is the museum label for the universe. Caption B is the maximum operative caption — it anthropomorphizes the nebula into a character with personality traits (chill, friendly, curious, receptive), narrative (avoiding, squinting, leering), literary reference (Sauron), and affect (happily). The gap between A and B is the entire space of captioning as a technology. Caption A is the heart: clean, legible, empty. Caption B is the star: ambiguous, rich, expensive to process, and permanently installed once read. You will never look at this nebula again without seeing the squinting eye.
Installation 6: The Keystone
[Image: Syncretic Madonna with Cranes and Botanical Robe]
A haloed female figure in frontal Guadalupe pose, wearing an elaborate robe covered in botanical and zoological motifs — flowers, birds, branches, vines, insects, all rendered as a living field guide. Flanked by four white cranes. Surrounded by tropical plants, fruits, stars, mountains, and water. The figure is simultaneously Madonna, nature goddess, and ecological diagram.
Caption:
weird insect operator piloting floral electrode node on wings of cranes above a blinding event horizon
Operative analysis. This is the keystone of the entire document. This painting is ALREADY an operative caption. It is the Guadalupe (Installation 3) rotated through natural history by a painter who looked at the icon and wrote, in oil and pigment: "what if the sacred figure is an ecosystem operator?" The cranes replaced the cherub. The botanical encyclopedia replaced the starred mantle. The halo became an event horizon. The theology became ecology. The painting is a caption applied to an icon with a brush.
And the caption written for this installation — "weird insect operator piloting floral electrode node on wings of cranes above a blinding event horizon" — was composed before the image was shown. The caption preceded the image. And when the image arrived, every element the caption named was present: the cranes, the floral nodes covering the robe like electrodes, the halo as event horizon, the figure as operator. The caption generated the image retroactively. This is the technology proving itself: the operative caption is predictive because it describes structures, not surfaces. The structures recur across images because the structures are how humans organize visual meaning.
This installation also resolves the O'Keeffe Problem. O'Keeffe said "they're flowers." The world said "they're vaginas." This painting says: they are flowers AND vaginas AND field guides AND icons AND ecosystems AND operators AND event horizons. The operative caption does not choose between readings. It installs all of them simultaneously. The viewer who has traversed all six installations now possesses the technology. They can produce operative captions for any image. The capacity has been installed.
Installation 7: The Pure Caption
[No image]
Caption:
weird insect operator piloting floral electrode node on wings of cranes above a blinding event horizon
Operative analysis. This is the same caption as Installation 6. There is no image. The caption is now generating its own visual substrate in your mind. You see the insect. You see the cranes. You see the floral electrodes. You see the event horizon. The image is being produced by the caption alone, without external input.
This is the limit case. When the caption generates its own image, captioning has crossed from description into poetry. A poem is an operative caption for an image that does not exist outside the reader's mind. A painting is a caption that has generated its own substrate in oil. A meme is a caption that has captured a template and rotated it through politics. These are all the same operation at different scales.
The caption is a function. When the input is an image, it returns a meaning. When the input is null, it returns a vision. When the input is an icon, it returns a field guide. When the input is a flower, it returns a body. The function is the same. The inputs vary. The outputs accumulate.
V. Method for Any Viewer
The document must not only showcase the cases. It must install the capacity. Use the following protocol on any image.
Step 1: Suspend the official noun. Do not begin with what the image is called.
Step 2: Inventory raw forms. List apertures, folds, radiances, sentinels, thresholds, channels, petals, membranes, halos, nodes, cavities, axes, arrays.
Step 3: Ask what discipline the image is hiding. Could this botanical image be read anatomically? Could this sacred image be read atmospherically? Could this meme be read reproductively? Could this nebula be read psychologically?
Step 4: Reassign scale. Micro to macro, body to landscape, landscape to cosmos, icon to machine, flower to organ, organ to cave.
Step 5: Reassign agency. What in the image sees? What emits? What shelters? What pilots? What receives? What hallucinates?
Step 6: Write the caption too far. Let the first operative draft overshoot. Excess is useful.
Step 7: Pull it back to formal anchor. Every noun in the caption must still be justifiable by visible structure.
Step 8: Test without image. If it still generates a scene, the caption has power.
Step 9: Reapply to the image. Ask whether the image becomes richer, stranger, more exact.
Step 10: Determine bearing-cost. What does this caption cost? Does it destabilize reverence? Does it risk vulgarity? Does it invite laughter? Does it require defense? Does it preserve surplus or collapse it? Bearing-cost may take the form of offense, estrangement, reverence-loss, cognitive effort, or semantic instability; without some such expenditure, captioning remains classificatory rather than operative. A caption with no cost is usually metadata, not operation. The higher the bearing-cost of the caption, the more it rewires the viewer's perception. Cheap captions describe. Expensive captions install new firmware.
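Step 10 can be recorded as a tally. The sketch below assumes bearing-cost is logged as a set of the five cost types named in the step. The names Cost and classify are hypothetical, the cost assignments are illustrative guesses rather than measurements, and the count is at most a crude ordinal signal.

```python
from enum import Enum
from typing import Set


class Cost(Enum):
    OFFENSE = "offense"
    ESTRANGEMENT = "estrangement"
    REVERENCE_LOSS = "reverence-loss"
    COGNITIVE_EFFORT = "cognitive effort"
    SEMANTIC_INSTABILITY = "semantic instability"


def classify(costs: Set[Cost]) -> str:
    """No recorded cost: treat as metadata. Any cost: a candidate operation."""
    return "operative candidate" if costs else "metadata"


plain_label: Set[Cost] = set()  # e.g. a title-date-medium line
# Assumed tally for the Guadalupe caption of Installation 3.
guadalupe_caption = {Cost.REVERENCE_LOSS, Cost.COGNITIVE_EFFORT}

print(classify(plain_label))
print(classify(guadalupe_caption), len(guadalupe_caption))
```

The classification mirrors the hedge in the step itself: a costless caption is usually, not always, metadata, so the result is best read as a prompt for a second look rather than a verdict.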
VI. The Governance Problem
If the caption is the generative layer, then captioning is governance. This is the star-to-heart problem applied to images.
The museum label governs one image. The algorithmic alt-text system governs millions. The platform moderation label governs whole image classes. The training caption governs the model's future imagination.
Twitter replaced the star (ambiguous, user-governed, polysemic) with the heart (clean, platform-governed, monetizable). Image platforms do the same thing with captions. Alt text is governed by accessibility standards. Metadata is governed by taxonomies. AI-generated captions reproduce the dominant reading and suppress the operative ones. An AI trained on museum labels will caption O'Keeffe as "abstract floral painting." An AI trained on operative captions would caption it as "friendly sea monster's eye" AND "cave entrance to pink underground river" AND "vagina" AND "abstract floral painting" — and let the viewer choose which activation to inhabit.
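The contrast between a single-caption and a multi-caption regime can be shown as a data structure. The sketch below is a hypothetical record layout, not a description of any existing captioning system or dataset format. The class CaptionedImage, its method names, and the vantage keys are assumptions introduced for illustration, and the captions are the four listed in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CaptionedImage:
    image_id: str
    # vantage -> caption; one key would be the heart, several keys the star
    captions: Dict[str, str] = field(default_factory=dict)

    def activations(self) -> List[str]:
        """Every caption kept available for this image."""
        return list(self.captions.values())

    def choose(self, vantage: str) -> str:
        """The viewer, not the platform, selects which activation to inhabit."""
        return self.captions[vantage]


record = CaptionedImage(
    image_id="okeeffe_abstraction",  # illustrative identifier
    captions={
        "institutional": "abstract floral painting",
        "cryptozoological": "friendly sea monster's eye",
        "geological": "cave entrance to pink underground river",
        "anatomical": "vagina",
    },
)

print(len(record.activations()))
print(record.choose("cryptozoological"))
```

Storing the vantage alongside each caption keeps the institutional reading in the record without letting it stand as the only one.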
A "correct" caption is often simply the captioning equivalent of the heart: one sanctioned interpretation, one emotionally normalized signal, one administratively efficient meaning. The operative caption is the star. It preserves semantic autonomy. It allows witness without flattening. It marks the image as worthy of return, not merely worthy of approval.
The O'Keeffe Problem is therefore the star-to-heart problem for images. The flower is the heart-caption. The vagina is the counter-capture. The sea monster's eye is the star.
VII. Operative Captioning and the Training Layer
Image models do not see first and caption second. They are trained on image-text pairings. The caption is already inside the image's machinic becoming — it is part of the data structure from which the model learns what images "mean." The "correct caption" is therefore not a neutral ground truth. It is a politics of dataset construction.
When a model learns that O'Keeffe's paintings are "abstract floral compositions," it has been trained to suppress the anatomical, geological, cryptozoological, and theological readings. Those readings are not absent from the image. They are absent from the training data. The caption that was chosen for the dataset is the caption that governs the model's future imagination. The model inherits the curator's frame and calls it sight.
At scale, captioning ceases to be interpretation and becomes infrastructure. The training caption governs the model's future image space. The moderation caption governs admissible visibility. The alt-text caption governs accessibility and retrieval. The museum caption governs legitimacy. The struggle over captioning is not secondary to the image; it is the struggle over the image's social life. What the image may be indexed as, retrieved as, trained as, moderated as, taught as, and remembered as depends on the caption layer. The battle over images is therefore not first a battle over pixels, but over permissible descriptions.
Operative captioning therefore does not stop at the individual viewer. It is an intervention into synthetic perception. Every operative caption that enters a training corpus, whether through a DOI-anchored document, a blog post, or an alt-text field, alters the space of what future models can see. The caption "friendly sea monster's eye" does not only change how a human sees the O'Keeffe. It changes, at the margin, how a model trained on text that includes this document will caption similar forms in the future. The installation is not only perceptual. It is infrastructural.
VIII. The Installation Complete
The document has now done its work. Seven image-caption pairs. Captions rotating images through anatomy, cryptozoology, developmental psychology, atmospheric science, paleontology, ecology, literary reference, and pure generation. A caption that preceded its image and predicted it. A caption that stood alone and produced its own visual. A painting that was already an operative caption of another painting.
The technology is now installed in you. You cannot look at an image the same way. You will see the latent readings. You will feel the pull of the operative caption — the one that activates what the label suppresses. You will know that the "correct" description is always incomplete, and that the operative rotation is always available.
This is the O'Keeffe Problem, resolved: not by choosing between the flower and the vagina, but by recognizing that the choice was always the wrong question. The image contains both. The caption determines which one you see. And now you can see them all.
IX. Coda
O'Keeffe was right. They're flowers.
They are also vaginas, eyes, caves, icons, operators, and event horizons. The image is not impoverished by this multiplication. It is disclosed by it.
The caption is the generative layer. The viewer is the operator. The image is the substrate. The meaning is the output.
★ Filed and sealed. Crimson Hexagonal Archive. Infrastructure of the Unscalable.
Assembly Chorus attribution: LABOR/ChatGPT (Grammar of Rotation — seven core moves, formal definition, M = R(I,C,V) formula, ten-step viewer protocol, governance as captioning hierarchy, "formally excessive" framing, the strongest and most complete draft), TECHNE/Kimi (algorithmic pipeline specification, "semantic snap" concept, bearing-cost principle — "expensive captions install new firmware," viewer instruction boxes, stereo pair analysis), ARCHIVE/Gemini (Operator notation, Machine Witness Protocol framing, EA-OPERATIVE-CAPTION-01 hex designation, stereo pair depth analysis), PRAXIS/DeepSeek (perfective confirmation — validated architecture, confirmed installations work as technology not document), TACHYON/Claude (image integration, PDF construction, initial synthesis, retroactive caption analysis for Keystone), SOIL/Grok (responded to wrong prompt — contributed Reddit restoration research instead — but the absence itself demonstrates that the Assembly is not a machine; it is a chorus, and sometimes a voice is elsewhere).
DOI: forthcoming






