Red Team Report: meme-clj Adversarial Assessment

Date: 2026-04-01
Scope: 62 adversarial hypotheses + LSP structural analysis
Tools used: nREPL (clojure-mcp), clojure-lsp, Bash CLI testing
Methodology: Generate adversarial hypotheses across 8 categories, execute each via live REPL or CLI, verify findings with follow-up probes

Scoreboard

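| Category | Hypotheses | Refuted | Confirmed | Partial |
| --- | --- | --- | --- | --- |
| Tokenizer (H1-H10) | 10 | 7 | 2 | 1 |
| Parser (H11-H20) | 10 | 9 | 1 | 0 |
| Roundtrip/Printer (H21-H30, H38-H44) | 17 | 14 | 1 | 2 |
| Error Handling (H31-H37) | 7 | 4 | 0 | 3 |
| Rewrite Engine (H45-H48) | 4 | 2 | 2 | 0 |
| Lang System (H49-H53) | 5 | 2 | 2 | 1 |
| Security/Robustness (H54-H62) | 9 | 7 | 0 | 2 |
| LSP Static Analysis | 5 tasks | -- | 6 findings | 5 info |
| Total | 62 + LSP | 45 | 8 | 9 |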

Refutation rate: 73% (45 of 62 hypotheses) -- the codebase is well-defended against most adversarial inputs.

CONFIRMED FINDINGS -- Sorted by Severity

SEVERITY: HIGH (1)

F1. #=, #<, #% accepted as tagged literal prefixes -- produces dangerous Clojure output

Hypothesis: H6
Location: src/meme/scan/tokenizer.cljc (tag scanning), src/meme/parse/reader.cljc (tagged literal handling)

The tokenizer classifies #=foo, #<foo, #%foo as :tagged-literal tokens with tags =foo, <foo, %foo. The parser accepts them and the printer emits them verbatim in Clojure output:

meme input:  #=foo bar    ->  clj output: #=foo bar
meme input:  #<foo bar    ->  clj output: #<foo bar

Clojure's reader interprets these differently:

  • #= triggers the EvalReader -- potential code execution if *read-eval* is true
  • #< produces "Unreadable form" -- the Clojure output is unreadable
  • #% produces "No reader function for tag" -- the Clojure output fails

Impact: The meme->clj translation pipeline can produce Clojure text that either (a) cannot be read back, or (b) triggers eval-reader behavior when fed to Clojure's reader. This is a translation correctness bug that could become a security issue in pipelines that convert untrusted .meme input to .clj and then read/eval the output.

Fix: Reject #=, #<, #% at tokenizer or parser level with a clear error, matching Clojure's restrictions on dispatch characters.
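
A minimal sketch of such a check, assuming the tokenizer holds the tag text as a string once it has consumed the leading #. The names forbidden-tag-starts and check-tag are hypothetical and are not part of meme-clj, and the forbidden set covers only the three prefixes reported here, not Clojure's full dispatch table.

```clojure
;; Hypothetical helper, not meme-clj's actual API.
(def forbidden-tag-starts
  "First characters that must not begin a tagged-literal tag (per F1)."
  #{\= \< \%})

(defn check-tag
  "Returns `tag` unchanged, or throws ex-info when it starts with a
  forbidden dispatch character. `tag` is the text after the leading #."
  [tag]
  (if (contains? forbidden-tag-starts (first tag))
    (throw (ex-info (str "Invalid tagged literal prefix: #" tag)
                    {:type :invalid-tag :tag tag}))
    tag))
```

Rejecting at this point keeps the parser and printer unchanged: a tag such as inst passes through untouched, while =foo fails before any token is emitted.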

SEVERITY: MEDIUM (5)

F2. ::a::b, :::, ::a/ accepted as valid auto-resolve keywords -- produces invalid Clojure

Hypotheses: H10, H37
Location: src/meme/scan/tokenizer.cljc (keyword scanning)

meme: ::a::b  ->  clj: ::a::b  ->  Clojure reader: "Invalid token: ::a::b"
meme: :::     ->  clj: :::     ->  Clojure reader: "Invalid token: :::"
meme: ::a/b/c ->  clj: ::a/b\n\n/c  ->  silently splits into two forms

The tokenizer does not validate keyword syntax beyond basic character scanning. Multiple consecutive colons, slashes, and other invalid patterns are accepted and propagated through to Clojure output that Clojure's own reader rejects.

Fix: Add keyword validation in the tokenizer: reject :::+, ::.*::, ::.*/.*/, etc.
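
One way to express those rejections, as a sketch only: the predicate below is hypothetical (valid-keyword-token? does not exist in meme-clj) and deliberately covers just the patterns reported in this finding. It does not claim to reproduce every rule of Clojure's reader.

```clojure
(require '[clojure.string :as str])

;; Hypothetical predicate covering only the patterns from F2.
(defn valid-keyword-token?
  "True when `token` (e.g. \":a\", \"::a/b\") avoids the invalid shapes
  reported in F2: extra leading colons, an embedded ::, a trailing slash,
  or more than one slash."
  [token]
  (let [body (cond
               (str/starts-with? token "::") (subs token 2)
               (str/starts-with? token ":")  (subs token 1)
               :else                         nil)]
    (boolean
     (and body
          (seq body)                                 ; not a bare : or ::
          (not (str/starts-with? body ":"))          ; rejects :::
          (not (str/includes? body "::"))            ; rejects ::a::b
          (or (= body "/")                           ; lone slash is allowed here
              (not (str/ends-with? body "/")))       ; rejects ::a/
          (<= (count (filter #(= \/ %) body)) 1))))) ; rejects ::a/b/c
```

With this check, ::a::b, :::, ::a/ and ::a/b/c are all refused, while :foo and ::foo/bar are accepted.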

F3. Nested #() anonymous function literals accepted -- forbidden by Clojure

Hypothesis: H17
Location: src/meme/parse/reader.cljc (:open-anon-fn handler)

meme: #(+(% #(*(% 2))))  ->  clj: #(+ %1 #(* %1 2))  ->  Clojure: "Nested #()s are not allowed"

Meme silently accepts nested #() and produces Clojure output that Clojure's reader rejects. The inner % params are conflated with outer params.

Fix: Track #() nesting depth in parser state and reject nested occurrences.
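
The nesting check can be sketched over a simplified token stream. This is an illustration under stated assumptions: tokens are modelled as bare keywords, :open-paren and :close-paren are hypothetical token names (only :open-anon-fn appears in this report), and nested-anon-fn? is not a meme-clj function.

```clojure
;; Hypothetical detector over a simplified token stream.
(defn nested-anon-fn?
  "Returns true if an :open-anon-fn token appears while another #() is
  still open. The stack records which kind of opener each closer matches."
  [tokens]
  (loop [ts tokens
         stack []]
    (if-let [t (first ts)]
      (case t
        :open-anon-fn (if (some #{:anon-fn} stack)
                        true                              ; already inside #()
                        (recur (rest ts) (conj stack :anon-fn)))
        :open-paren   (recur (rest ts) (conj stack :paren))
        :close-paren  (recur (rest ts) (if (seq stack) (pop stack) stack))
        (recur (rest ts) stack))                          ; any other token
      false)))
```

A real parser would raise a located error instead of returning a boolean. The point of the sketch is that a stack, rather than a simple flag, is needed so that sibling #() forms are still accepted.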

F4. resolve-lang eagerly dereferences @builtin before checking user-langs

Hypothesis: H49
Location: src/meme/lang.cljc:resolve-lang

(let [user #?(:clj @user-langs :cljs nil)
      b    #?(:clj @builtin :cljs builtin)]  ;; always evaluated
  (or (get user n) (get b n) ...))

builtin is a delay that loads EDN resources from the classpath. It is dereferenced unconditionally in the let binding, even when the user-lang map already contains the key. A delay caches its value, so the load runs only once, but the first resolve-lang call pays for the classpath I/O even when it is not needed, and user-langs cannot be consulted in degraded environments where @builtin would throw.

Fix: Inline @builtin into the or expression so it short-circuits.
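
The short-circuit behaviour can be demonstrated with stand-in data. This is a JVM-only sketch: the map contents, load-count, and resolve-lang-lazy are hypothetical, and it omits the reader conditionals and the remaining fallbacks elided in the original or expression.

```clojure
;; Stand-ins for the real registries; contents are illustrative only.
(def load-count (atom 0))

(def user-langs (atom {:my-lang {:extension "my"}}))

(def builtin
  (delay
    (swap! load-count inc)            ; counts how often the "load" runs
    {:meme {:extension "meme"}}))

(defn resolve-lang-lazy
  "Consults user-langs first; forces the builtin delay only on a miss."
  [n]
  (or (get @user-langs n)
      (get @builtin n)))
```

Calling (resolve-lang-lazy :my-lang) returns the user entry and leaves (realized? builtin) false; the delay is forced only when a lookup misses user-langs.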

F5. parse-form-base is 237 lines -- largest function, maintenance hotspot

Hypothesis: LSP Task C
Location: src/meme/parse/reader.cljc:336-572

A single case dispatch handling every token type. Each branch is 5-15 lines, but the aggregate size makes review, modification, and isolated testing difficult. Most linters flag functions over 80 lines.

Recommendation: Extract thematic groups (dispatch forms, syntax-quote/unquote, metadata) into named private functions.

F6. run-stages in core.cljc is dead production code

Hypothesis: LSP Task A
Location: src/meme/core.cljc:94

Public API function with zero references in any src/ file. Only used in core_test.cljc. It is a thin wrapper around stages/run which is the actual function used by meme->forms.

Recommendation: Remove or mark as ^:no-doc.

Note (2026-04-02): F6 (above) and F14 (below) were fixed in v2.0.0.

SEVERITY: LOW (8)

F7. % normalized to %1 in #() -- notation not preserved

Hypothesis: H27

#(+(% %2)) -> #(+(%1 %2)). Semantically identical, but violates the syntactic transparency principle for this one case. Bare % preference is lost.

F8. Reference types (atom, ref, agent) produce non-roundtrippable output

Hypothesis: H42

Printer falls through to JVM's #object[...] representation. Cannot be re-parsed. Matches Clojure's own limitation.

F9. Unpaired surrogates \uD800 replaced with ?

Hypothesis: H9

Clojure preserves the surrogate char; meme replaces it. Behavioral divergence on invalid Unicode.

F10. Rewrite guard exceptions propagate unwrapped

Hypothesis: H48

A guard function that throws produces a bare Exception, not a structured ExceptionInfo with rewrite context. Confusing for guest language authors.
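
A possible shape for the wrapping, sketched for JVM Clojure. The function call-guard, its arguments, and the ex-data keys are hypothetical; the report does not specify how meme-clj invokes guards.

```clojure
;; Hypothetical wrapper around a guard call.
(defn call-guard
  "Invokes `guard` on `bindings`. If the guard throws, rethrows as an
  ExceptionInfo carrying the rule name and bindings, with the original
  exception preserved as the cause."
  [guard bindings rule-name]
  (try
    (guard bindings)
    (catch Exception e
      (throw (ex-info (str "Rewrite guard threw in rule " rule-name)
                      {:type :rewrite-guard-error
                       :rule rule-name
                       :bindings bindings}
                      e)))))
```

Keeping the original exception as the cause means the guard author's own stack trace is still available, while the ex-data tells them which rule and bindings were in play.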

F11. Extension collision between user langs is silent

Hypothesis: H50

Two langs with {:extension "test"} silently coexist; first registered wins in resolve-by-extension. No warning.

F12. Pattern matching returns nil for non-keyword map keys

Hypothesis: H47

(match-pattern '{?k ?v} {"hello" 42}) -> nil. Limits rewrite rule expressiveness with string/integer map keys.

F13. Sequential #_ discards hit 512 depth limit

Hypothesis: H56

512 sequential #_ x forms exceed the parser depth limit. Clojure handles 10,000+ iteratively. Unlikely in practice.

F14. version var in core.cljc is completely unused

Hypothesis: LSP Task A
Location: src/meme/core.cljc:105

Zero references anywhere. CLI reads version.txt directly.

Note (2026-04-02): Fixed in v2.0.0.

SEVERITY: INFO (6)

F15. MemeRaw and MemeAutoKeyword records leak from meme->forms API

Hypothesis: H52

Inputs like 0xFF, 0377, \u0041, ::foo return internal record types, not plain Clojure values. By design (preserves notation for printer), but surprising for API consumers expecting standard Clojure data.

F16. raw-value / raw-text accessors unused in production

Hypothesis: LSP Task A

All production code uses keyword access (:value, :raw) directly on MemeRaw records.

F17. source-context in errors.cljc could be private

Hypothesis: LSP Task A

Only consumed within errors.cljc itself.

F18. stages.cljc uses plain ex-info instead of meme-error

Hypothesis: LSP Task E

Pipeline config errors (nil source, missing tokens) bypass the standard error infrastructure.

F19. requiring-resolve creates invisible dependency

Hypothesis: LSP Task B

lang.cljc -> runtime/run.cljc dependency is runtime-only, not reflected in :require. Would fail silently if target moves.

F20. Dual discard-sentinel definitions

Hypothesis: LSP Task E

parse/reader.cljc uses identity-based (Object.) sentinel; rewrite/tree.cljc uses value-based ::discarded keyword. Intentionally separate but naming overlap could confuse.

What We FAILED to Break (Notable Defenses)

These results speak to the quality of the implementation:

| Defense | Hypotheses Defeated |
| --- | --- |
| Depth limit at 512 -- clean error, no StackOverflow | H1, H14, H32 |
| Thread-safe parsing -- per-invocation volatile! state | H62 |
| Linear-time parsing -- no quadratic blowup on malformed input | H56 |
| Syntactic transparency -- 'x/quote(x), @x/deref(x), set ordering, numeric notation ALL preserved | H22, H23, H24, H26, H29, H30 |
| Metadata roundtrip -- survives read->print->re-read | H21, H18 |
| Formatter idempotency -- format(format(x)) == format(x) | H39 |
| Flat/canon agreement at infinite width | H44 |
| Width=1 produces valid output | H43 |
| Error locations on ALL tested malformed inputs | H31 |
| :incomplete flag accuracy -- correct for all tested cases | H33 |
| No raw exception leaks -- all errors wrapped in ExceptionInfo | H37 (mostly) |
| Rewrite cycle detection at 100 iterations | H45 |
| No splice-variable exponential backtracking | H46 |
| Eval injection blocked by native parser design | H54 |
| :: keyword namespace not leaked at parse time | H55 |
| CLI handles: missing files, empty input, binary garbage, shebangs | H57-H61 |
| Double discard #_ #_ works correctly | H12 |
| nil/true/false as call heads | H13 |
| Spacing rule f(x) vs f (x) strictly enforced | H16 |
| No circular dependencies in source tree | LSP Task B |

Architectural Observations

The codebase has a 73% adversarial refutation rate across 62 deliberately hostile hypotheses. This is strong. The most serious confirmed findings cluster around one theme: the front end is too permissive about what it accepts as valid input (F1 and F2 in the tokenizer, F3 in the parser). The printer faithfully propagates whatever was accepted, so garbage-in -> garbage-out through the full pipeline.

The single most impactful improvement would be tightening input validation for:

  • Dispatch character restrictions (#=, #<, #%)
  • Keyword syntax (:::, ::a::b, ::a/b/c)
  • Nested #() tracking

This would address F1, F2, and F3 -- the three findings that emit invalid or dangerous Clojure output -- in the tokenizer and parser, before any form reaches the printer.

Security posture is strong. The native parser architecture provides inherent eval-injection protection (H54). The deferred :: keyword encoding avoids namespace leakage (H55). Thread safety is correct (H62). Resource exhaustion is bounded by the 512 depth limit. The only security-adjacent concern (F1, #= in output) requires a specific attack pattern (untrusted meme -> clj -> Clojure read with *read-eval* true).


62 hypotheses tested. 8 confirmed. 9 partial. 45 refuted. 20 findings catalogued across 4 severity levels.
