Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double quoted keys for custom struct json write #1477

Open
sjanel opened this issue Dec 9, 2024 · 7 comments
Open

Double quoted keys for custom struct json write #1477

sjanel opened this issue Dec 9, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@sjanel
Copy link
Collaborator

sjanel commented Dec 9, 2024

I noticed that for structs with custom json write, objects when used as keys are double quoted. For instance, in json_test, custom_struct fails with this test:

      using custom_struct_map = std::map<custom_struct, bool>;

      custom_struct_map obj_map{{custom_struct{"hello"}, true}, {custom_struct{"world"}, false}};

      expect(not glz::write_json(obj_map, s));
      expect(s == R"({"hello":true,"world":false})");

In json/write.hpp, when we write a pair of key value for custom write which calls string type of write, the condition for this if is false so quoted_t is added in a string which will itself be quoted as well, resulting in double quoted string:

      template <opts Opts, class Key, class Value, is_context Ctx>
      GLZ_ALWAYS_INLINE void write_pair_content(const Key& key, Value&& value, Ctx& ctx, auto&&... args)
      {
         if constexpr (str_t<Key> || char_t<Key> || glaze_enum_t<Key> || Opts.quoted_num) {
            write<JSON>::op<Opts>(key, ctx, args...);
         }
         else {
            write<JSON>::op<opt_false<Opts, &opts::raw_string>>(quoted_t<const Key>{key}, ctx, args...);
         }

In my application, I try to guess at the context by scanning the buffer for previous ", : or [ for instance to guess if need to add quotes or not, but it's not very practical.

I tried to figure out how in client code we could identify that we are writing for a key or a value. In Json, keys should always be quoted, but it's not true for values such as bool / int for instance. Maybe one solution could be to give additional hint in the context that we are writing a key or value in order for it to know if it needs to add the quotes or not.

WDYT ?

@sjanel sjanel added the bug Something isn't working label Dec 9, 2024
@stephenberry
Copy link
Owner

It seems like you are not wanting the custom_struct to be the key, but rather for the maps to be merged. You can get close to what you want with using custom_struct_map = std::vector<std::pair<custom_struct, bool>>;. One way of merging is to use a vector of pairs. However, this is still double quoting the keys, so I need to look into this issue in more detail.

@stephenberry
Copy link
Owner

Boy, this is tricky. I got a firm grasp on the problem now. You were write to put out the quoted_t logic, which is being applied to the custom serialization.

Normally Glaze tries to identify the type of JSON output based on the C++ type. In this case the C++ type is custom and therefore we can't draw conclusions about what type of JSON output it will generate. This seems to imply that we ought to have a way of tagging the type of the JSON that is being produced by our custom serializer. The tricky thing is that this creates two paths to JSON type handling, the path for the C++ concepts that make the value readable and writable in JSON (such as operator[key] for maps) and the path that indicates what kind of data is serialized separate from the C++ type. It would be a bad idea to look at the serialized data to determine the type (inefficient and complex). So, we need a way of tagging the type. But, custom serialization is intentionally built around C++ concepts.

I think it is best to use the glz::meta approach with glz::custom, because this separates the type metadata from the custom serialization. Within the glz::meta we already have terms like custom_write, but this is only to avoid ambiguous partial specialization. What we want is to be able to tag the glz::meta with something like string_type to indicate that a JSON string will be serialized.

If your custom_type just didn't add the quotes, there wouldn't be a problem, but I can understand how the quotes might be desirable in different contexts.

I think supporting glz::meta with type tagging would be a valid solution, but there is also the question if it should be another specialization, such as glz::indicate_type<custom_struct, glz::string_type>. If we used a specialization other than glz::meta then we could use it in conjunction with the to/from specializations and not have to rework the custom serialization into glz::custom.

I need to give this more thought, but I think a general solution would be broadly useful.

@sjanel
Copy link
Collaborator Author

sjanel commented Dec 10, 2024

Sorry about that :)

Would it be possible to authorize custom json write for strings only and adapt glaze code in write pair content accordingly? This way, we know that we always need to quote the value in glaze code, and client code in to<JSON override does not need to care about adding more quotes (they will be already added, if it's a key or a value). The code to fix the issue would not be so complicated nor expensive in terms of processing.

However, I agree that it would be a step back from the model you provided, that we could write for int / bool / float values. I think that the need to have custom json writing for non string values is quite rare anyway.

@stephenberry
Copy link
Owner

stephenberry commented Dec 10, 2024

I think an elegant approach might be to define a C++ type with known format association to indicate how this custom type will serialize data. Maybe it should be named something like mimic_type?

template <>
struct glz::indicate_type<custom_type>
{
    using type = std::string; // indicates that the format will appear like a std::string, but with custom read/write
};

@stephenberry
Copy link
Owner

I realize this can exist inside of the glz::meta, because glz::meta won't collide with a custom to/from if it only specifies a type behavior.

Hence I'm think of using the syntax:

template <>
struct glz::meta<custom_type>
{
   using mimic = std::string;
};

I feel like mimic is better than type here, but I'm still thinking through how this should be named.

@sjanel
Copy link
Collaborator Author

sjanel commented Dec 11, 2024

I like your solution of a mimic type. But I would rather see a value from an enum class instead, to avoid ambiguity for instance for strings (std::string_view, std::string). This enum would define the basic json types like string, integer, float and boolean. Actually we also need to take care about the null special value, which should not be quoted.

@stephenberry
Copy link
Owner

I can see merits of an enum class. I'm thinking about how this applies to other specifications beyond JSON. But, I think these could be special enumerations for different use cases.

I think I agree and I propose:

template <>
struct glz::meta<custom_type>
{
   static constexpr auto mimic = mimic_type::string;
};

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants