In most cases where I want to do some validation on JSON I find that I usually have a class/struct/object that represents the payload and I want to unpack JSON into it (or dump that class to JSON). Ultimately, there are already nice tools to do this (e.g. marshmallow on the Python side). So unless I'm crossing language boundaries, writing a separate JSON Schema and using that is more work, and I have to keep it up to date.
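For concreteness, this is roughly the pattern I mean - a minimal sketch using marshmallow, with made-up Payload fields:

    from dataclasses import dataclass
    from marshmallow import Schema, fields, post_load

    @dataclass
    class Payload:
        user_id: int
        email: str

    class PayloadSchema(Schema):
        user_id = fields.Int(required=True)
        email = fields.Email(required=True)

        @post_load
        def make_payload(self, data, **kwargs):
            return Payload(**data)

    schema = PayloadSchema()
    obj = schema.load({"user_id": 1, "email": "a@example.com"})  # JSON -> Payload (raises ValidationError on bad input)
    doc = schema.dump(obj)                                       # Payload -> JSON-able dict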
And in the cases where I want to cross language boundaries, at this point there's less and less of a compelling case to go with JSON rather than, e.g., flatbuffers/protobuf/thrift/capnproto, since you're writing a schema anyway.
Here's my gripe with random HN comments: this has nothing to do with the stability guarantees of the specification, unless I'm missing something in your comment. You basically just saw "JSON Schema" and thought "Aha, JSON Schema, here is why I don't like it at all", regardless of why this conversation was started.
To say something concrete about this: it's vital that JSON Schema have the ability to provide arbitrary, namespace-friendly metadata at the field level. This would allow the schema itself to be extended to specify serialization details - much like Kubernetes YAML can go far beyond specifying a deployment and be tagged with various kinds of behavior modifiers.
In theory this is already supported - ish - because unknown keys at any level of the spec are ignored… but how does one know that those keys will always be unknown? The specification should guarantee that certain keys - say, any with a period in the key name - will never be used by the core specification as it evolves. It's a bit unclear from the spec at https://json-schema.org/draft/2020-12/json-schema-core.html#... IMO. This is why this thread is important!
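To illustrate (the dotted key below is hypothetical, not anything the spec defines today): a schema carrying custom metadata next to the standard keywords. Current validators simply ignore the unknown key, but nothing promises the core spec won't claim a name like it later.

    from jsonschema import validate

    schema = {
        "type": "object",
        "properties": {
            "created_at": {
                "type": "string",
                "format": "date-time",
                "com.example/serializer": "iso8601-millis",  # custom, "namespaced" metadata (made up)
            }
        },
    }

    # The unknown keyword is ignored during validation.
    validate({"created_at": "2024-01-01T00:00:00.000Z"}, schema)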
And this is exactly why we are asking this sort of question - this kind of feedback is useful. Thanks. Some of the people working on the spec have had a similar idea, and it may end up being part of the spec.
Since you use Python, you might be interested to know that Pydantic can output JSON Schema. This is part of how FastAPI is able to generate OpenAPI specs (OpenAPI uses JSON Schema for a lot of its validation). The only time I've needed to use JSON Schema, I just used Pydantic models as my source of truth.
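Something like this (minimal sketch, Pydantic v1 API; v2 renames it to model_json_schema()):

    from pydantic import BaseModel

    class Item(BaseModel):
        name: str
        price: float

    # Emits a JSON Schema document describing Item, which OpenAPI tooling can embed.
    print(Item.schema_json(indent=2))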
I use pydantic, and then use the json schema output to generate typescript interface definitions. This gives me pretty good velocity by having a single ground truth.
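Roughly this pipeline - the file name is made up, and I'm assuming a JSON Schema to TypeScript generator such as the json-schema-to-typescript package on the other side:

    from pathlib import Path
    from pydantic import BaseModel

    class User(BaseModel):
        id: int
        name: str

    # Write the generated schema to disk, then run a JSON Schema -> TypeScript
    # generator over it to produce the interface definitions.
    Path("user.schema.json").write_text(User.schema_json(indent=2))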
The application I'm working on predates FastAPI, so I'm using CherryPy and modified its JSON tool to call parse_obj() and .json() on the model where needed.
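A rough sketch of the kind of glue I mean (the handler and model names are made up, and this uses the stock json_in tool rather than the actual modified one):

    import cherrypy
    from pydantic import BaseModel

    class Order(BaseModel):
        sku: str
        quantity: int

    class OrdersApi:
        @cherrypy.expose
        @cherrypy.tools.json_in()
        def create(self):
            order = Order.parse_obj(cherrypy.request.json)   # validate the parsed body
            cherrypy.response.headers["Content-Type"] = "application/json"
            return order.json()                              # serialize back via the model

    # cherrypy.quickstart(OrdersApi())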
Wonder in which language class members can have such JSON Schema requirements as "an integer which is a multiple of integer X" or "string which matches regexp R". Also not sure which other schema system is better than JSON Schema across many different features.
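For reference, those constraints expressed directly in JSON Schema (the values are arbitrary):

    from jsonschema import validate

    schema = {
        "type": "object",
        "properties": {
            "count": {"type": "integer", "multipleOf": 5},
            "code": {"type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$"},
        },
        "required": ["count", "code"],
    }

    validate({"count": 15, "code": "ABC-1234"}, schema)   # passes
    # validate({"count": 7, "code": "abc"}, schema)       # raises ValidationError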
They're saying that the tool doesn't fit their priorities for their use case. I suspect you might be trying to say that their use case isn't what the tool is designed for, but you've phrased it in a way that comes across as condescending toward them for not knowing what the tool even does. I'm not sure whether you're trying to convince them that the tool would in fact be superior for their use case, or whether you're unhappy with them for giving this feedback in a context you think isn't appropriate, but in either case I think it would be more effective to state your point directly rather than passive-aggressively.
Ok, direct statement: I've tried some alternatives to JSON Schema, e.g. Protobuf, and was left somewhat disappointed with them. JSON Schema so far looks pretty good to me in comparison - overall, that is, setting aside some specific cases, which are frankly rare.
> Wonder in which language class members can have such JSON Schema requirements as "an integer which is a multiple of integer X" or "string which matches regexp R".
Any language with a half-decent type system can represent that at the type level. And certainly any language worth knowing can trivially check whether an integer is a multiple of integer X or a string matches regexp R, which is what grandparent was getting at.
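For example, the same two checks written as hand-rolled validators on a pydantic model (X = 5 and the regex are arbitrary; pydantic v1 API):

    import re
    from pydantic import BaseModel, validator

    class Record(BaseModel):
        count: int
        code: str

        @validator("count")
        def count_is_multiple_of_five(cls, v):
            if v % 5 != 0:
                raise ValueError("count must be a multiple of 5")
            return v

        @validator("code")
        def code_matches_pattern(cls, v):
            if not re.fullmatch(r"[A-Z]{3}-[0-9]{4}", v):
                raise ValueError("code has the wrong format")
            return v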
We seem to be talking about different things. When doing deserialization, many wouldn't expect to add custom code to translate and validate the data - otherwise we'd be writing the whole deserialization ourselves, I guess.
Most validation has to be done in custom code after deserialization (e.g. maybe a schema can enforce that the "collection ID" is in the right format, but it can't enforce that that collection actually exists in the datastore). There's definitely value in having a library/code generator/etc. to mechanically do the bytes -> structured value deserialization, but the cost/benefit of doing more complex validations at that stage is very questionable IME.
For simple enough APIs it's possible to have all the validation in the form of JSON Schema. Sure, the id may be missing, but that needn't be a "client error" - the input is syntactically valid, and for some APIs that's enough - it will just cause the server to return an empty result set.
You can't accept a write if you're going to break a foreign key constraint when you insert the row. And I struggle to imagine a case where you want to do complex validation like "multiple of x" or "matches this regex" but don't want to actually check if the thing exists.
I used regex matching to check whether a particular string actually contains a time moment, in the form YYYY-MM-DD hh:mm:ss.xxx.
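Something along these lines - the exact pattern below is a reconstruction for illustration, not necessarily the one used:

    from jsonschema import validate

    schema = {
        "type": "string",
        "pattern": r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}$",
    }

    validate("2024-03-01 12:34:56.789", schema)   # passes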
If you're adding events, they may not have FKs; you just add them to the table while generating PKs on the fly. This is useful in many cases. There could be other approaches for different cases, I suspect.
You can generate JSON Schemas from other things you know. There are TypeScript-interface-to-JSON-Schema compilers, and you can probably find one that generates a schema from Python too.