Liking cljdoc? Tell your friends :D

avro-explain

avro-explain provides better Avro error messages, open-sourced courtesy of ShadowTraffic.

Example image

Some before/after examples:

oldnew
Expected int. Got VALUE_STRINGthe data type you generated didn't match your schema. Expected type integer, but found type string.
Unknown union branch not-a-typeexpected union type hints #{"boolean" "int"}, but found hints #{"not-a-type"} that don't match.
Expected record-start. Got VALUE_NUMBER_INTthe data you generated wasn't a map type to match the record schema.
Expected start-union. Got END_OBJECTthis record is missing required field x. This StackOverflow answer describes required fields and defaults: https://stackoverflow.com/a/63176493
Expected start-union. Got VALUE_STRINGyour schema included a union, but your data didn't include a required hint for its concrete type. This StackOverflow thread describes how to provide the type hint: https://stackoverflow.com/q/27485580

Rationale

Avro is a widely used serialization format, especially in the Apache Kafka ecosytem. Despite its prevalence, it can often produce cryptic errors when things go wrong, especially when used with Generic Data.

For example, if you were to incorrectly serialize data against a deeply nested schema, you might see an error message such as: Expected start-union. Got VALUE_NUMBER_INT

These error messages lack context to help you resolve them. Namely, you don't know:

  • Where the error is in your data
  • Where the mismatch is in your schema
  • What that means in plain English

avro-explain is a tiny Java library with only one method: org.clojars.mjdrogalis.avroexplain.ExplainAvro/explain().

When you encounter an error with Avro, invoke explain and pass it your schema and data. avro-explain will do a depth-first traversal to intercept the error and return more context to help you understand what's going on.

Usage

Add the Clojars repository to your project:

<repositories>
  <repository>
    <id>clojars</id>
    <url>https://repo.clojars.org/</url>
  </repository>
</repositories>

Then add the dependency:


<dependency>
  <groupId>org.clojars.mjdrogalis</groupId>
  <artifactId>avro-explain</artifactId>
  <version>0.1.0</version>
</dependency>

Then, invoke org.clojars.mjdrogalis.avroexplain.ExplainAvro/explain().

explain takes two arguments: a schema and a matching piece of data.

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

import org.clojars.mjdrogalis.avroexplain.ExplainAvro;
import org.clojars.mjdrogalis.avroexplain.Explanation;

import java.util.Map;
import java.util.HashMap;

public class Runner {

    public static void main(String[] args) {
        Schema schema = SchemaBuilder
            .record("PageView")
            .namespace("com.example.analytics")
            .fields()
                .name("userId").type().stringType().noDefault()
                .name("url").type().stringType().noDefault()
                .name("durationSeconds")
                    .type().unionOf().nullType().and().floatType().endUnion()
                    .nullDefault()
            .endRecord();

        Map<String, Object> data = new HashMap<>();
        data.put("userId", "user-123");
        data.put("url", "https://example.com/home");
        data.put("durationSeconds", 12.5f);

        Explanation exp = ExplainAvro.explain(schema, data);

        System.out.println("Reason: " + exp.getReason());
        System.out.println("Root cause: " + exp.getRootCause());

        System.out.println("Sub schema: " + exp.getSubSchema());
        System.out.println("Schema path: " + exp.getSchemaPath());

        System.out.println("Sub data: " + exp.getSubData());
        System.out.println("Data path: " + exp.getDataPath());
    }

}

explain will return a org.clojars.mjdrogalis.avroexplain.Explanation object, which has a handful of getter methods to construct a better error message:

  • reason: a type of problem encountered
  • root cause: a longer description of the problem
  • sub schema: the most specific part of your Avro schema that is mismatched with your data
  • schema path: a list of keys to navigate into your schema and produce the sub schema
  • sub data: the most specific part of your data that is mismatched with your schema
  • data path: a list of keys to navigate into your data and produce the sub data

For example, printing out context for the above error, we see:

Reason: missing-union-hint
Root cause: your schema included a union, but your data didn't include a required hint for its concrete type. This StackOverflow thread describes how to provide the type hint: https://stackoverflow.com/q/27485580
Sub schema: [null, float]
Schema path: [fields, 2, type]
Sub data: 12.5
Data path: [durationSeconds]

Project status

  • This code is used in a production setting at ShadowTraffic
  • Find a bug? Have an idea? I'm happily accepting pull requests/collaborators.

License

Copyright © 2025 Michael Drogalis

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.

Can you improve this documentation?Edit on GitHub

cljdoc is a website building & hosting documentation for Clojure/Script libraries

× close