Describe the bug
I can reliably cause datamodel-code-generator to error when processing a JSON Schema file that is otherwise valid. The same schema can be made to succeed by tweaking it very slightly, in a way that doesn't fundamentally alter it.
To Reproduce
Example schema:
{
  "$ref": "#/definitions/LogicalExpression",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "definitions": {
    "ValueExpression": {
      "title": "ValueExpression",
      "anyOf": [
        {
          "$ref": "#/definitions/ConditionalValueExpression"
        }
      ]
    },
    "ConditionalValueExpression": {
      "additionalProperties": false,
      "title": "ConditionalValueExpression",
      "properties": {
        "default": {
          "$ref": "#/definitions/ValueExpression"
        }
      },
      "type": "object"
    },
    "LogicalExpression": {
      "title": "LogicalExpression",
      "anyOf": [
        {
          "$ref": "#/definitions/ValueExpression"
        },
        {
          "type": "string"
        }
      ]
    }
  }
}
Used commandline:
$ datamodel-codegen --input schema2.json --output model.py --collapse-root-models --target-python-version=3.10
The input file type was determined to be: jsonschema
This can be specified explicitly with the `--input-file-type` option.
Traceback (most recent call last):
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\__main__.py", line 476, in main
    generate(
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\__init__.py", line 485, in generate
    results = parser.parse()
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\parser\base.py", line 1474, in parse
    body = code_formatter.format_code(body)
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\format.py", line 238, in format_code
    code = self.apply_black(code)
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\format.py", line 246, in apply_black
    return black.format_str(
  File "src\black\__init__.py", line 1204, in format_str
  File "src\black\__init__.py", line 1218, in _format_str_once
  File "src\black\parsing.py", line 98, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse for target version Python 3.10: 9:20: __root__: Union[, str] = Field(..., title='Expression')
Expected behavior
The command should complete successfully and generate classes in model.py. Interestingly, the schema can be adjusted very slightly, in a way that is functionally identical, and code generation then succeeds without any issue. If you just swap the order of the two types inside the anyOf in LogicalExpression (as below), code generation will succeed.
"LogicalExpression": {
  "title": "LogicalExpression",
  "anyOf": [
    {
      "type": "string"
    },
    {
      "$ref": "#/definitions/ValueExpression"
    }
  ]
}
I know the schema may not make sense and may look stupid, but I trimmed down a fairly large JSON schema to get a minimal example that reproduces the problem. And, as I said, it can be adjusted very trivially so that code generation succeeds.
Version:
Additional context
Note that while I say the schema can be adjusted so that it works, this isn't a feasible workaround for us. We are consuming a pretty large schema file that is generated automatically, so identifying and rearranging everything in the schema that would need to change to work around the issue would be really difficult.
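For illustration only, the reordering from the minimal example could in principle be scripted instead of done by hand. The sketch below moves `$ref` entries to the end of every `anyOf` list. The helper name `refs_last` is hypothetical. It also assumes the reorder that happens to work for the trimmed-down schema would carry over to the full one, and I have not verified that. It would not fix the underlying bug.

```python
from typing import Any


def refs_last(node: Any) -> Any:
    """Return a copy of `node` with `$ref` entries moved to the end of each anyOf list."""
    if isinstance(node, dict):
        out = {key: refs_last(value) for key, value in node.items()}
        any_of = out.get("anyOf")
        if isinstance(any_of, list):
            # sorted() is stable: non-$ref entries keep their relative order,
            # and so do the $ref entries that are pushed behind them.
            out["anyOf"] = sorted(
                any_of,
                key=lambda item: isinstance(item, dict) and "$ref" in item,
            )
        return out
    if isinstance(node, list):
        return [refs_last(item) for item in node]
    return node


# The LogicalExpression definition from the failing schema.
logical_expression = {
    "title": "LogicalExpression",
    "anyOf": [
        {"$ref": "#/definitions/ValueExpression"},
        {"type": "string"},
    ],
}

print(refs_last(logical_expression)["anyOf"])
```

Applied to a whole schema loaded with `json.load`, this would rewrite every `anyOf` it finds, so it changes more than strictly necessary and the result would need checking before it is fed to datamodel-codegen.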
And when I say it errors/succeeds: it looks like the code itself is generated; it's just that when the datamodel-codegen tool invokes black to reformat the code, it fails because the type Union[, str] isn't valid Python. So during the collapse process I guess types get flattened and removed, but we still end up with an empty slot in their place. And presumably when the order of the anyOf is reversed we end up with Union[str, ], which is syntactically fine.
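The syntax difference between the two forms can be checked with nothing but the standard library. This is a small sketch using `ast.parse`. The helper name `parses` is made up for illustration, and it only checks syntax, not what datamodel-codegen actually emits in the reversed case.

```python
import ast


def parses(annotation: str) -> bool:
    """Return True if `annotation` is a syntactically valid Python expression."""
    try:
        ast.parse(annotation, mode="eval")
    except SyntaxError:
        return False
    return True


print(parses("Union[, str]"))  # False: a leading comma is a syntax error
print(parses("Union[str, ]"))  # True: a trailing comma is allowed
```

So a dropped type in the first position breaks the generated module, while the same gap in the last position would go unnoticed by black.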