What is "Modern" JSON Schema?

Welcome to the Modern JSON Schema (& APIs) blog! My name is Henry, and I'm the H. Andrews who co-authored all drafts of JSON Schema from 2017 to the present. With this blog, I want to help you use JSON Schema to the full extent of its capabilities!

I'm excited to show you how "modern" JSON Schema, meaning 2019-09, 2020-12, and later, can solve far more problems than "classical" JSON Schema (draft-07 and earlier). We'll also explore topics relevant to APIs in general and OpenAPI 3.1 in particular, and to all versions of JSON Schema. If you're using classical JSON Schema, earlier versions of OpenAPI, or other approaches to HTTP APIs, you'll find plenty to interest you here as well.

In this introductory post, you'll learn:

  • Why I'm making a distinction between "modern JSON Schema" and "classical JSON Schema"
  • Three areas of new functionality that modern JSON Schema has introduced

Why "modern" vs "classical" JSON Schema?

The JSON Schema landscape is pretty confusing right now. There are five versions in active use: draft-04, draft-06, draft-07, and drafts[1] 2019-09 and 2020-12. When you get into the complexities of IETF draft names vs meta-schema identifiers, it all becomes excessively complicated:

[Figure: a complex timeline of JSON Schema drafts with IETF and meta-schema identifiers, grouped into three eras as a simpler classification: "pre-historic" (2009-2012), "classical" (2013-2018), and "modern" (2019 onwards).]

Talking about all of these versions is confusing. Talking about eras, only two of which are in use, is much simpler!

Simplifying the conversation is important, but there's a bit more to the name. "Modern JSON Schema" is inspired by the successful branding of C++11 and later as "modern C++". C++11 made a long list of changes to the core language, just as JSON Schema 2019-09 significantly expanded JSON Schema's capabilities. "Modern C++" is used in discussions and job descriptions to help people understand it as a distinct and desirable skill compared to C++03 and earlier.

I want to see a similar shift happen with JSON Schema. We can build excitement around "modern JSON Schema" much more easily than around a pile of draft names and numbers. We can focus on what unifies recent versions and sets them apart from the past. We can create a healthy ecosystem where modern JSON Schema is in demand, but only if enough people understand why they should demand it and support it.

My hope for this blog is that it will further the community's understanding and generate that demand.

Shiny modern things!

So let's explore what makes modern JSON Schema "modern"! These three changes aren't the only differences between classical and modern JSON Schema, but each of them dramatically expands what JSON Schema can do.

Runtime keyword communication

This change is about making schemas more re-usable. How many of you tried to write a schema like this at some point?

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "allOf": [{"$ref": "https://example.com/schemas/sharedproperties"}],
  "additionalProperties": false
}

I did! But any instance that has any properties will fail validation against this schema. The "additionalProperties" keyword only considers "properties" and "patternProperties" in the same schema object. It can't see into the https://example.com/schemas/sharedproperties schema, so it doesn't matter what that schema contains.

Modern JSON Schema allows keywords to record their runtime behavior for other keywords to read. This allows keywords to build on the results of other keywords in ways that classical JSON Schema does not allow, such as by taking into account which properties were validated in a referenced schema.
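
As a minimal sketch of what this enables, here is the earlier schema rewritten for 2020-12, with "unevaluatedProperties" in place of "additionalProperties". The shared-properties URL is the same placeholder as before, and the sketch assumes an implementation that supports 2020-12:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "allOf": [{"$ref": "https://example.com/schemas/sharedproperties"}],
  "unevaluatedProperties": false
}
```

Because "unevaluatedProperties" can read the results recorded by keywords such as "properties" inside the referenced schema, an instance containing only properties declared there should pass, while an instance with any other property should fail.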

In future posts, we'll use a series of examples to look at how this works, and what advantages and drawbacks it brings. You'll learn how modern JSON Schema makes it easy to solve several important use cases that used to be difficult or impossible.

Annotations for application-aware behavior

Even though JSON Schema has standard "readOnly" and "writeOnly" annotation keywords, it is challenging to use them to validate API request and response bodies if you're using classical JSON Schema.

JSON Schema validates the structure of the JSON data. In the process of doing that it learns which fields are annotated with information such as "readOnly": true. But it doesn't understand whether it's running on a client or a server, nor does it know the request and response semantics of different HTTP methods. OpenAPI HTTP tooling understands these things, but it doesn't know which fields in the structure are read-only. If these two systems could collaborate, they could validate the usage of the JSON data in the context of HTTP by extending standard JSON Schema tools.
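
To make this concrete, here is a sketch of a schema that carries such annotations. The "$id" and the property names (a "user" object with "id" and "password") are assumptions chosen purely for illustration:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/schemas/user-request",
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": {"type": "string", "readOnly": true},
        "password": {"type": "string", "writeOnly": true}
      }
    }
  }
}
```

On their own, "readOnly" and "writeOnly" never cause validation to fail. They only attach information to /user/id and /user/password for an application to act on.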

Classical JSON Schema never defined what information made up an annotation, or how to report that information to applications such as OpenAPI tools. Modern JSON Schema defines five pieces of information that make up an annotation, along with machine-readable formats for reporting it. This allows OpenAPI tools to collaborate with standard JSON Schema implementations:

[Figure, left to right: JSON Schema validation (valid) and annotation (/user/id is readOnly, /user/password is writeOnly), then OpenAPI HTTP usage (PUT request on server; /user/id unchanged, therefore valid; /user/password changeable, therefore valid).]

Here we see JSON Schema validating a PUT request payload on the server, and reporting one read-only and one write-only field as annotations. For the purpose of this example, read-only fields can be present as long as they don't change the field's value. The OpenAPI tool checks that the read-only field is unchanged, and knows that the write-only field is allowed because a PUT request is a write.
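
The annotation report handed to the OpenAPI tool in this scenario might look roughly like the following, in the style of the 2020-12 "basic" output format. The schema URI is hypothetical, and real implementations may differ in detail:

```json
{
  "valid": true,
  "annotations": [
    {
      "keywordLocation": "/properties/user/properties/id/readOnly",
      "absoluteKeywordLocation": "https://example.com/schemas/user-request#/properties/user/properties/id/readOnly",
      "instanceLocation": "/user/id",
      "annotation": true
    },
    {
      "keywordLocation": "/properties/user/properties/password/writeOnly",
      "absoluteKeywordLocation": "https://example.com/schemas/user-request#/properties/user/properties/password/writeOnly",
      "instanceLocation": "/user/password",
      "annotation": true
    }
  ]
}
```

Each entry identifies the instance location, the schema location (both as a path and as an absolute URI), and the annotation's value. The keyword name is the last segment of the keyword location.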

This shows how modern JSON Schema annotations enable standardized collaboration among tools that was not previously possible! We'll explore the five parts of an annotation along with many scenarios where applications can leverage JSON Schema annotations to perform a wide variety of tasks.

Vocabularies: You (yes, you!) can extend JSON Schema

Vocabularies allow anyone to define JSON Schema keywords that can be used reliably with any compliant implementation. Classical JSON Schema allows extension keywords, but you can't guarantee that they'll be respected. Unrecognized extension keywords are silently ignored, which can result in invalid data being declared valid.

Modern JSON Schema introduces vocabularies, which allow you to define a group of keywords and identify them with a URI. Schema authors can then use that URI to tell implementations that they need to support the vocabulary in order to use the schema. If an implementation can't support it, then instead of failing validation, it refuses to run the schema and indicates which vocabularies it doesn't understand.
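
As a rough sketch, a vocabulary is declared in a meta-schema using the "$vocabulary" keyword. The example.com URIs below (a made-up "units" vocabulary and its meta-schema) are hypothetical; the json-schema.org URIs are the standard 2020-12 identifiers:

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.com/meta/with-units",
  "$dynamicAnchor": "meta",
  "$vocabulary": {
    "https://json-schema.org/draft/2020-12/vocab/core": true,
    "https://json-schema.org/draft/2020-12/vocab/applicator": true,
    "https://json-schema.org/draft/2020-12/vocab/validation": true,
    "https://example.com/vocab/units": true
  },
  "allOf": [
    {"$ref": "https://json-schema.org/draft/2020-12/schema"}
  ]
}
```

A schema would opt in by using "https://example.com/meta/with-units" as its "$schema". The value true marks a vocabulary as required, so an implementation that doesn't recognize it must refuse to process the schema; false would mark it as optional.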

This is just like finding out that you need to install a 3rd-party library to run a piece of software. Hopefully it will one day be easy to find vocabulary plugins when you need them. But even if you can't find support for a vocabulary, it's much better for an implementation to refuse to use the schema than it is to incorrectly claim that the data is valid! Getting the right error is the first step to fixing it.

We'll spend a whole series of posts understanding why you should care about vocabularies and what you need to know about them. We'll walk through examples of how to design and implement them, paying special attention to vocabularies that can be created and used without having to write custom code. This is an area that is still evolving, and I hope this blog will start conversations that help us finalize vocabularies in a way that makes them accessible to everyone who wants to create keywords.

Conclusion

In this article, you've learned that:

  • "Modern JSON Schema" is a way to simplify how we talk about JSON Schema and promote the expanded capabilities of its recent versions
  • Runtime keyword communication makes modern JSON Schema more re-usable
  • Modern JSON Schema's annotations and output formats allow applications to build on JSON Schema in a standardized way
  • Vocabularies allow anyone to create new keywords and ensure that they will be used correctly rather than being silently ignored

Thank you for reading this introductory post! Future posts will go deeper into these and other topics, including detailed examples. My goal is to keep posts short (less than 1000 words) and frequent, with the occasional longer deep dive.

In the next post we'll examine how the mental model for working with JSON Schema differs from working with object-oriented programming. Understanding this is fundamental to really getting comfortable with how JSON Schema works. Please click the "+Follow" button at the top of the page as you will not want to miss it!

I am very excited to be writing this blog, and have several articles near completion that you can expect to see within the next week. There is so much to say about JSON Schema and APIs, and I would love to hear what topics would interest you! Please let me know in the comments! 👇


  1. The "draft" label is by now mostly due to historical inertia, and will be dropped soon in favor of a release model with better compatibility guarantees. For modern JSON Schema I generally won't bother with the "draft" label. 2020-12 in particular is the base from which our new push towards stability will start.
