Red Team Report: meme-clj Adversarial Assessment

Date: 2026-04-01
Scope: 62 adversarial hypotheses + LSP structural analysis
Tools used: nREPL (clojure-mcp), clojure-lsp, Bash CLI testing
Methodology: Generate adversarial hypotheses across 8 categories, execute each via live REPL or CLI, verify findings with follow-up probes

Scoreboard

Category	Hypotheses	Refuted	Confirmed	Partial
Tokenizer (H1-H10)	10	7	2	1
Parser (H11-H20)	10	9	1	0
Roundtrip/Printer (H21-H30, H38-H44)	17	14	1	2
Error Handling (H31-H37)	7	4	0	3
Rewrite Engine (H45-H48)	4	2	2	0
Lang System (H49-H53)	5	2	2	1
Security/Robustness (H54-H62)	9	7	0	2
LSP Static Analysis	5 tasks	--	6 findings	5 info
Total	62 + LSP	45	8	9

Refutation rate: 73% (45 of 62 hypotheses) -- the codebase is well-defended against most adversarial inputs.

CONFIRMED FINDINGS -- Sorted by Severity

SEVERITY: HIGH (1)

F1. `#=`, `#<`, `#%` accepted as tagged literal prefixes -- produces dangerous Clojure output

Hypotheses: H6
Location: src/meme/scan/tokenizer.cljc (tag scanning), src/meme/parse/reader.cljc (tagged literal handling)

The tokenizer classifies #=foo, #<foo, #%foo as :tagged-literal tokens with tags =foo, <foo, %foo. The parser accepts them and the printer emits them verbatim in Clojure output:

meme input:  #=foo bar    ->  clj output: #=foo bar
meme input:  #<foo bar    ->  clj output: #<foo bar

Clojure's reader interprets these differently:

#= triggers the EvalReader -- potential code execution if *read-eval* is true
#< produces "Unreadable form" -- the Clojure output is unreadable
#% produces "No reader function for tag" -- the Clojure output fails

Impact: The meme->clj translation pipeline can produce Clojure text that either (a) cannot be read back, or (b) triggers eval-reader behavior when fed to Clojure's reader. This is a translation correctness bug that could become a security issue in pipelines that convert untrusted .meme input to .clj and then read/eval the output.

Fix: Reject #=, #<, #% at tokenizer or parser level with a clear error, matching Clojure's restrictions on dispatch characters.
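
One way to implement such a check is a small guard applied to the tag text before a :tagged-literal token is emitted. The sketch below is illustrative only: `reserved-dispatch-chars` and `validate-tag` are hypothetical names, not part of meme-clj, and the set covers only the three characters named in this finding.

```clojure
;; Hypothetical sketch: names and error shape are assumptions, not meme-clj's API.
(def reserved-dispatch-chars
  "Leading characters from finding F1 that must not start a tag."
  #{\= \< \%})

(defn validate-tag
  "Returns tag-text unchanged, or throws ex-info when it starts with a
  reserved dispatch character."
  [tag-text]
  (if (contains? reserved-dispatch-chars (first tag-text))
    (throw (ex-info (str "Invalid tagged literal prefix: #" (first tag-text))
                    {:type :invalid-tag :tag tag-text}))
    tag-text))

(validate-tag "inst")   ; returns "inst"
(try
  (validate-tag "=foo")
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))       ; returns {:type :invalid-tag, :tag "=foo"}
```

Rejecting on the first character keeps the check cheap and leaves ordinary tags such as #inst untouched.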

SEVERITY: MEDIUM (5)

F2. `::a::b`, `:::`, `::a/` accepted as valid auto-resolve keywords -- produces invalid Clojure

Hypotheses: H10, H37
Location: src/meme/scan/tokenizer.cljc (keyword scanning)

meme: ::a::b  ->  clj: ::a::b  ->  Clojure reader: "Invalid token: ::a::b"
meme: :::     ->  clj: :::     ->  Clojure reader: "Invalid token: :::"
meme: ::a/b/c ->  clj: ::a/b\n\n/c  ->  silently splits into two forms

The tokenizer does not validate keyword syntax beyond basic character scanning. Multiple consecutive colons, slashes, and other invalid patterns are accepted and propagated through to Clojure output that Clojure's own reader rejects.

Fix: Add keyword validation in the tokenizer: reject :::+, ::.*::, ::.*/.*/, etc.
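
The rejection rules can be sketched as a predicate over the raw keyword text. This is a minimal illustration, assuming the tokenizer has the full token string in hand; `valid-auto-keyword?` is a hypothetical name, and the checks cover only the malformed shapes listed in this finding, not the whole of Clojure's keyword grammar.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sketch: not meme-clj's actual validation.
(defn valid-auto-keyword?
  "True when s looks like ::name or ::alias/name, with no extra colons,
  no leading or trailing slash, and at most one slash."
  [s]
  (boolean
    (and (str/starts-with? s "::")
         (let [body (subs s 2)]
           (and (seq body)
                (not (str/starts-with? body ":"))
                (not (str/starts-with? body "/"))
                (not (str/includes? body "::"))
                (not (str/ends-with? body ":"))
                (not (str/ends-with? body "/"))
                (<= (count (filter #{\/} body)) 1))))))

(map valid-auto-keyword? ["::a" "::a/b" "::a::b" ":::" "::a/b/c" "::a/"])
;; returns (true true false false false false)
```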

F3. Nested `#()` anonymous function literals accepted -- forbidden by Clojure

Hypothesis: H17
Location: src/meme/parse/reader.cljc (:open-anon-fn handler)

meme: #(+(% #(*(% 2))))  ->  clj: #(+ %1 #(* %1 2))  ->  Clojure: "Nested #()s are not allowed"

Meme silently accepts nested #() and produces Clojure output that Clojure's reader rejects. The inner % params are conflated with outer params.

Fix: Track #() nesting depth in parser state and reject nested occurrences.
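
A minimal sketch of the tracking idea, written over a flat sequence of token types rather than meme-clj's real parser state. Only :open-anon-fn comes from the report; the other token names, the function name, and the ex-data shape are assumptions made for illustration.

```clojure
;; Hypothetical sketch: token names other than :open-anon-fn are assumptions.
(defn check-anon-fn-nesting
  "Walks a flat seq of token types, keeping a stack of open delimiters.
  Throws ex-info if :open-anon-fn appears while another #() is still open;
  returns true otherwise."
  [token-types]
  (reduce
    (fn [stack tok]
      (case tok
        :open-anon-fn (if (some #{:anon-fn} stack)
                        (throw (ex-info "Nested #() is not allowed"
                                        {:type  :nested-anon-fn
                                         :depth (count stack)}))
                        (conj stack :anon-fn))
        :open-paren   (conj stack :paren)
        :close-paren  (if (seq stack) (pop stack) stack)
        stack))
    []
    token-types)
  true)

;; Token types for #(+(% 2)): accepted.
(check-anon-fn-nesting
  [:open-anon-fn :symbol :open-paren :symbol :number :close-paren :close-paren])
```

A stack is used instead of a simple counter so that two sequential #() forms are accepted while a #() opened inside another is rejected, however many ordinary parentheses sit between them.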

F4. `resolve-lang` eagerly dereferences `@builtin` before checking user-langs

Hypothesis: H49
Location: src/meme/lang.cljc:resolve-lang

(let [user #?(:clj @user-langs :cljs nil)
      b    #?(:clj @builtin :cljs builtin)]  ;; always evaluated
  (or (get user n) (get b n) ...))

@builtin is a delay that loads EDN resources from the classpath. It is dereferenced unconditionally in the let binding, even when the user-lang map already contains the key. Because a delay caches its result, the load cost is paid once rather than on every call, but the first resolve-lang call always triggers the classpath I/O, and user-langs cannot be consulted in degraded environments where @builtin would throw.

Fix: Inline @builtin into the or expression so it short-circuits.
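
The short-circuit behavior can be demonstrated in isolation. The vars below are stand-ins with invented contents, not the real definitions in lang.cljc, and the sketch is JVM-only; the elided fallback branch from the original `or` is left out.

```clojure
;; Self-contained sketch: these vars stand in for the ones in lang.cljc;
;; their contents are invented for illustration.
(def user-langs (atom {:custom {:extension "cst"}}))

(def builtin
  ;; Stand-in for the delay that loads EDN resources from the classpath.
  (delay {:meme {:extension "meme"}}))

(defn resolve-lang
  "Looks up n in user-langs first; derefs builtin only on a miss."
  [n]
  (or (get @user-langs n)
      (get @builtin n)))

(resolve-lang :custom)
(realized? builtin)     ; false: the user hit short-circuited the `or`
(resolve-lang :meme)
(realized? builtin)     ; true: forced by the fallback lookup
```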

F5. `parse-form-base` is 237 lines -- largest function, maintenance hotspot

Hypothesis: LSP Task C
Location: src/meme/parse/reader.cljc:336-572

A single case dispatch handling every token type. Each branch is 5-15 lines, but the aggregate size makes review, modification, and isolated testing difficult. Most linters flag functions over 80 lines.

Recommendation: Extract thematic groups (dispatch forms, syntax-quote/unquote, metadata) into named private functions.

F6. `run-stages` in `core.cljc` is dead production code

Hypothesis: LSP Task A
Location: src/meme/core.cljc:94

run-stages is a public API function with zero references in any src/ file; it is used only in core_test.cljc. It is a thin wrapper around stages/run, which is the function meme->forms actually calls.

Recommendation: Remove or mark as ^:no-doc.

Note (2026-04-02): F6 (above) and F14 (below) were fixed in v2.0.0.

SEVERITY: LOW (8)

F7. `%` normalized to `%1` in `#()` -- notation not preserved

Hypothesis: H27

#(+(% %2)) -> #(+(%1 %2)). Semantically identical, but violates the syntactic transparency principle for this one case. Bare % preference is lost.

F8. Reference types (atom, ref, agent) produce non-roundtrippable output

Hypothesis: H42

Printer falls through to JVM's #object[...] representation. Cannot be re-parsed. Matches Clojure's own limitation.

F9. Unpaired surrogates `\uD800` replaced with `?`

Hypothesis: H9

Clojure preserves the surrogate char; meme replaces it. Behavioral divergence on invalid Unicode.

F10. Rewrite guard exceptions propagate unwrapped

Hypothesis: H48

A guard function that throws produces a bare Exception, not a structured ExceptionInfo with rewrite context. Confusing for guest language authors.
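
One possible shape for the structured error is a wrapper around the guard call. The sketch below is an assumption-laden illustration: `call-guard`, its arguments, and the ex-data keys are hypothetical and do not reflect the rewrite engine's actual internals.

```clojure
;; Hypothetical sketch: call-guard and its ex-data keys are assumptions.
(defn call-guard
  "Invokes guard on bindings; rethrows any failure as ex-info carrying
  the rule name and bindings, with the original exception as the cause."
  [rule-name guard bindings]
  (try
    (guard bindings)
    (catch Exception e
      (throw (ex-info (str "Guard for rule " rule-name " threw: " (ex-message e))
                      {:type     :rewrite-guard-error
                       :rule     rule-name
                       :bindings bindings}
                      e)))))

;; A guard that assumes ?x is a number fails on a string binding.
(try
  (call-guard 'positive-only (fn [b] (pos? (get b '?x))) {'?x "a"})
  (catch clojure.lang.ExceptionInfo e
    (:rule (ex-data e))))   ; returns positive-only
```

Passing the original exception as the cause keeps its stack trace available while the ex-data tells a guest language author which rule failed.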

F11. Extension collision between user langs is silent

Hypothesis: H50

Two langs with {:extension "test"} silently coexist; first registered wins in resolve-by-extension. No warning.
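
A collision check at registration time could look like the following. This is a sketch under assumptions: the registry is modeled as a plain map from lang name to a map with an :extension key, and `extension-collisions` is a hypothetical helper, not an existing meme-clj function.

```clojure
;; Hypothetical sketch: the registry shape and function name are assumptions.
(defn extension-collisions
  "Returns the names of langs in registry that already claim the
  :extension of new-lang."
  [registry new-lang]
  (for [[lang-name lang] registry
        :when (= (:extension lang) (:extension new-lang))]
    lang-name))

(extension-collisions {:a {:extension "test"} :b {:extension "other"}}
                      {:extension "test"})
;; returns (:a)
```

A caller could log a warning or throw when the result is non-empty, depending on how strict registration should be.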

F12. Pattern matching returns nil for non-keyword map keys

Hypothesis: H47

(match-pattern '{?k ?v} {"hello" 42}) -> nil. Limits rewrite rule expressiveness with string/integer map keys.

F13. Sequential `#_` discards hit 512 depth limit

Hypothesis: H56

512 sequential #_ x forms exceed the parser depth limit. Clojure handles 10,000+ iteratively. Unlikely in practice.

F14. `version` var in `core.cljc` is completely unused

Hypothesis: LSP Task A
Location: src/meme/core.cljc:105

Zero references anywhere. CLI reads version.txt directly.

Note (2026-04-02): Fixed in v2.0.0.

SEVERITY: INFO (6)

F15. `MemeRaw` and `MemeAutoKeyword` records leak from `meme->forms` API

Hypothesis: H52

Inputs like 0xFF, 0377, \u0041, ::foo return internal record types, not plain Clojure values. By design (preserves notation for printer), but surprising for API consumers expecting standard Clojure data.

F16. `raw-value` / `raw-text` accessors unused in production

Hypothesis: LSP Task A

All production code uses keyword access (:value, :raw) directly on MemeRaw records.

F17. `source-context` in `errors.cljc` could be private

Hypothesis: LSP Task A

Only consumed within errors.cljc itself.

F18. `stages.cljc` uses plain `ex-info` instead of `meme-error`

Hypothesis: LSP Task E

Pipeline config errors (nil source, missing tokens) bypass the standard error infrastructure.

F19. `requiring-resolve` creates invisible dependency

Hypothesis: LSP Task B

lang.cljc -> runtime/run.cljc dependency is runtime-only, not reflected in :require. Would fail silently if target moves.

F20. Dual `discard-sentinel` definitions

Hypothesis: LSP Task E

parse/reader.cljc uses identity-based (Object.) sentinel; rewrite/tree.cljc uses value-based ::discarded keyword. Intentionally separate but naming overlap could confuse.

What We FAILED to Break (Notable Defenses)

These results speak to the quality of the implementation:

Defense	Hypotheses Defeated
Depth limit at 512 -- clean error, no StackOverflow	H1, H14, H32
Thread-safe parsing -- per-invocation `volatile!` state	H62
Linear-time parsing -- no quadratic blowup on malformed input	H56
Syntactic transparency -- `'x`/`quote(x)`, `@x`/`deref(x)`, set ordering, numeric notation ALL preserved	H22, H23, H24, H26, H29, H30
Metadata roundtrip -- survives read->print->re-read	H21, H18
Formatter idempotency -- `format(format(x)) == format(x)`	H39
Flat/canon agreement at infinite width	H44
Width=1 produces valid output	H43
Error locations on ALL tested malformed inputs	H31
`:incomplete` flag accuracy -- correct for all tested cases	H33
No raw exception leaks -- all errors wrapped in ExceptionInfo	H37 (mostly)
Rewrite cycle detection at 100 iterations	H45
No splice-variable exponential backtracking	H46
Eval injection blocked by native parser design	H54
`::` keyword namespace not leaked at parse time	H55
CLI handles: missing files, empty input, binary garbage, shebangs	H57-H61
Double discard `#_ #_` works correctly	H12
nil/true/false as call heads	H13
Spacing rule `f(x)` vs `f (x)` strictly enforced	H16
No circular dependencies in source tree	LSP Task B

Architectural Observations

The codebase refuted 45 of 62 deliberately hostile hypotheses (73%). The most severe confirmed findings cluster around one theme: the front end is too permissive about what it accepts. The tokenizer admits invalid dispatch prefixes and malformed keywords (F1, F2), and the parser admits nested #() (F3). The printer faithfully propagates whatever the front end accepts, so garbage-in -> garbage-out through the full pipeline.

The single most impactful improvement would be tightening input validation for:

Dispatch character restrictions (#=, #<, #%)
Keyword syntax (:::, ::a::b, ::a/b/c)
Nested #() tracking

This would address F1, F2, and F3 -- three of the six MEDIUM+ findings, and the only ones that produce Clojure output the Clojure reader rejects or evaluates -- in the tokenizer and parser, before anything reaches the printer.

Security posture is strong. The native parser architecture provides inherent eval-injection protection (H54). The deferred :: keyword encoding avoids namespace leakage (H55). Thread safety is correct (H62). Resource exhaustion is bounded by the 512 depth limit. The only security-adjacent concern (F1, #= in output) requires a specific attack pattern: untrusted meme -> clj -> Clojure read with *read-eval* true. Since true is Clojure's default for *read-eval*, pipelines that read translated output from untrusted sources should bind it to false or use an EDN reader.

62 hypotheses tested. 8 confirmed. 9 partial. 45 refuted. 20 findings catalogued across 4 severity levels.
