Colander serializes numbers & bools as strings. #80
Comments
If anyone else has the same problem, you can temporarily work around the issue by defining the following overrides:
Keep in mind that you'll have to put these in a custom package or module if you intend to reuse them for schemas in several Python modules.
All built-in colander types generate strings as cstruct values except mapping and sequence, by design. Deserialization converts them from strings to appropriate types.
My point is precisely that deserialization using colander converts them back to the appropriate types. The point of generating a standard format like JSON is precisely to inter-operate with other systems. Can you at least consider introducing an optional argument to the decoding process defaulting to the current behavior?
Apologies, no. Colander serializations are meant to be deserialized by colander. Colander deserializations are meant to be serialized by colander. Neither result is meant to be JSON.
First, if colander serialization is not meant to generate something that is interoperable with anything else, colander serialization is utterly useless because I can just pickle my Python objects. Second, colander is really close to offering that kind of support. Proof is that I can write ~20 lines of monkey patching to work around the only two issues I've found preventing it from being used this way.
Just to give you an idea: I'm working on a REST API using Pyramid and Cornice. Colander is by far the best choice to validate incoming JSON. For basic/intermediate use cases, it's also the best choice to format outgoing JSON. The only two things that are getting in the way of making it perfect for outgoing JSON are issues #60 and #80.
Note that by patching the result of the following 4 functions, I can generate proper JSON output with all my schemas:
I just apply the following code:
Although this works for me and I'll keep applying it for as long as necessary, I don't see the point of intentionally limiting serialization to producing Colander's proprietary serialization format. I think supporting this would be a major feature and a good selling point for Colander. Please reopen this issue.
I also use Colander to validate JSON, which it does very well. Being able to serialize objects into data structures whose boolean, integer and float values retain their data types would be a huge benefit. I understand that this isn't the intended usage of serialize and that I probably don't fully appreciate the implications of making the behavior of Andre's monkey-patch the default. However, I do wish this was the case, for what it's worth.
I'm also working on a REST API using Pyramid and Cornice. I was unpleasantly surprised when all my values were converted to strings. I'm considering using the aforementioned monkey-patch or just the plain json library. I can't find any other similar library: there is jsonschema, but it reminds me of XML. Maybe formencode?
While Colander is useful for the purpose that Cornice allows -- input validation for a web service view -- my experience is that you start wanting to use Schema objects to house serialization and deserialization logic. But that doesn't appear to be an intended use of the library. Still, the awesomeness of having a Schema object that manages validation, serializing and deserializing data structures between JSON and Python is so great, I too am using a version of the patch @AndreLouisCaron posted. Colander is just too useful with it... I also came across jsonschema, but it seems much less powerful and too focused on validating the (draft) JSON Schema spec. There doesn't seem to be a way to express that you have this custom Python type that needs to serialize/deserialize a certain way. OTOH it would be great to be able to pass off the schema to the client for client-side validation -- something Colander schemas can't do, but something like jsonschema dict-based schemas could.
This statement is kind of weird to me. Serialization/deserialization logic is the only thing the library allows you to do.
You can't technically pass the schema as an object in itself, but I find it really convenient to expose the schema in a Python package so that Python-based clients get the schema definition for free.
You're right. I should have been more specific. As the docs say, Colander is useful for validating and deserializing JSON (and other data). But since serialization is only meant to create data consumed by Colander, deserialization, as a separate step from validation, is less useful. You can add deserialization logic to a schema, but you can't use the schema to serialize the Python data structure back to JSON. And when I'm talking about "serialization logic," I'm thinking of custom SchemaNode types that e.g. transform NumPy data structures into JSON and back again. So without something like your patch, the serialization part of that logic ends up somewhere else, which I find less than ideal. When all that logic is contained in one place, I think it's easier to reason about and maintain. I don't need Colander to support serializing objects to a JSON-friendly format, but it would be great if there were a mechanism other than monkey-patching to control the serialization output.
Right. This is a great use case for Colander. Now that you can subclass schemas, it's even nicer. My desire to share the schema with a JavaScript client isn't something I expect Colander to provide, but I could see a third-party library consuming a Colander schema and outputting something like a JSON Schema document.
I've started evaluating colander as a replacement for Django forms, specifically for validating and deserializing incoming JSON, which it has been really awesome for so far. I've also been looking at it for serializing objects into outgoing JSON structures. However, this thread made me realize that I (along with several other people in this thread) may be misusing the library. Particularly this statement from @mcdonc:
Perhaps I'm misunderstanding you, but if this is true, you may want to update the docs, as it seems that this philosophy is at least somewhat in conflict with the first sentence on http://docs.pylonsproject.org/projects/colander/en/latest/:
Or at least it could use some clarification around the fact that while colander will happily deserialize XML/JSON/form post data, it is not intended for direct two-way interoperation with these formats (i.e. deserializing some incoming JSON, manipulating it, and then serializing it back into JSON). The docs led me to believe that this was the general purpose of the library. I didn't realize that it was strictly for sending data between endpoints that all use colander to serialize and deserialize the data.
@mikeocool This comment also confused me. In retrospect, however, I think the idea behind @mcdonc's comment is that Colander in itself was not meant to be directly tied to the serialization format (e.g. the output from However, the library seems to be tied to at least one specific serialization format that understands nothing but strings, lists and maps, which probably explains why Colander behaves the way it does. I wholeheartedly agree that if the library happily says that it's useful for processing data across multiple formats, it should either naturally support the formats it claims to support or be clear about its limitations.
What is the real reason why serialization can't be in JSON format? Is it deform? As far as I know, @iElectric is working on something like deform2.
Having used colander extensively, I see both sides of the argument:
I sort of like the suggestion by @offlinehacker, but the issue isn't really specific to JSON. Serialization to YAML or other formats would run into the same issue, so the parameter shouldn't be called "json=True". I propose a resolution based on a current experiment I'm doing: let's add new schema types, 'colander.NativeInteger' and 'colander.NativeBoolean'. These are like the standard Integer and Boolean schema types, but they serialize to 'int' and 'bool' respectively. Then we should document that when people want to use colander for JSON (de)serialization, they should use NativeInteger and NativeBoolean instead of Integer and Boolean. This advice would apply equally to other serialization formats.
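To make the "native types" idea concrete, here is a rough, standard-library-only sketch. The classes are hypothetical stand-ins that only borrow the names from the proposal; they are deliberately simplified, do not use colander's actual type interface, and are not an existing colander API. The point is only that a type which serializes to a real `bool` or `int` still round-trips, while producing JSON that other systems can read directly.

```python
import json


class StringBoolean:
    """Hypothetical stand-in for a type whose serialized form is a string."""

    def serialize(self, value):
        return "true" if value else "false"

    def deserialize(self, cstruct):
        # Accepts the string form as well as a native bool.
        return str(cstruct).lower() not in ("false", "0")


class NativeBoolean(StringBoolean):
    """Hypothetical variant that keeps a native bool when serializing."""

    def serialize(self, value):
        return bool(value)


class NativeInteger:
    """Hypothetical type that serializes to a native int."""

    def serialize(self, value):
        return int(value)

    def deserialize(self, cstruct):
        return int(cstruct)


# A toy "schema": field name -> type object (not a real colander schema).
schema = {"interested": NativeBoolean(), "age": NativeInteger()}
appstruct = {"interested": True, "age": 30}

cstruct = {key: schema[key].serialize(value) for key, value in appstruct.items()}
encoded = json.dumps(cstruct)

# Round trip: decode the JSON and deserialize each field again.
decoded = {
    key: schema[key].deserialize(value)
    for key, value in json.loads(encoded).items()
}
```

Because the choice is made per field, nothing has to guess afterwards which strings were "really" booleans or integers.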
I was a bit disappointed when I found out that colander did not do that. I thought I had found the ultimate data serialization tool for Python, but it turned out it doesn't do what I expected, e.g. convert data to and from JSON or XML. Now that I understand I have to write the logic to convert data to and from JSON/XML, I am not sure how colander will be useful to my app. I'm not criticizing the project in any way. All I am saying is that reading part of the documentation gave me expectations that were not fulfilled. I too wish colander would provide JSON serialization and deserialization.
Just want to throw what little weight I may have in support of this issue as I have been watching it since the days I too was working in Pyramid and Cornice.
When I use the `Bool` and `Int` types in my schemas like so:
And I use it to serialize some data like so:
Then `data['interested']` is `"true"` (a string).
Now, if I `json.dumps(data)`, I get strings in my JSON data instead of the boolean I requested. This is really annoying because there's no way to fix this reliably after the fact! At best, I can supply a custom `JSONEncoder`, but I can still accidentally convert strings that shouldn't be converted.
The same problem exists with numbers.
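A small illustration of why fixing this after the fact is unreliable, using only the standard library. The `data` dict is a made-up stand-in for serialized output in which every leaf value is a string (it is not the schema or data from this report), and `naive_fix` is a hypothetical post-processing pass, written as a plain function rather than a `JSONEncoder` subclass for brevity.

```python
import json

# Assumed example: serialized output where every leaf value is a string.
data = {"name": "true", "interested": "true", "zip_code": "02134"}

# The booleans are already lost by the time the data reaches json.dumps.
as_json = json.dumps(data)


def naive_fix(value):
    """Blindly convert strings that look like booleans or integers."""
    if isinstance(value, dict):
        return {key: naive_fix(item) for key, item in value.items()}
    if isinstance(value, list):
        return [naive_fix(item) for item in value]
    if value == "true":
        return True
    if value == "false":
        return False
    if isinstance(value, str) and value.isdigit():
        return int(value)
    return value


fixed = naive_fix(data)
fixed_json = json.dumps(fixed)
```

Here `interested` comes back as a real boolean, but `name` (a genuine string that happens to be "true") is converted too, and `zip_code` loses its leading zero when it is turned into an integer. Without the schema, the pass has no way to tell these cases apart.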