Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement tags #341

Open
Hattonuri opened this issue Jan 30, 2025 · 18 comments
Open

Implement tags #341

Hattonuri opened this issue Jan 30, 2025 · 18 comments

Comments

@Hattonuri
Copy link

Hattonuri commented Jan 30, 2025

Go has tags like that and helping with tag dispatching(for example for custom parsers)

type Person struct {
    FirstName    string     `json:"first_name"`
    LastName     string     `json:"last_name"`
    DOB          *time.Time `json:"dob,omitempty"`
    Address      *string    `json:",omitempty"`
    Weight       int        `json:"-"`
    HeightInches int
}

In cpp it could be made with something like that(tag means the same as https://www.fluentcpp.com/2016/12/08/strong-types-for-strong-interfaces/):

template<typename... T>
struct Tags : std::tuple<T...> {};

struct OmitEmpty {};

struct CustomNameForJson {
    const char* custom_name;
};


struct Person {
    rfl::Field<"firstName", std::string, Tags{OmitEmpty{}, CustomNameForJson{"jsonFirstName"}}> first_name;
};

Most important: tag values should not be in template parameters or else we will be unable to match the tags like HasTag (as value should be included in template). The reason is that the only solution would be to implement is_instance_of<> that will have problem with both occuring "typename" and "auto"

@liuzicheng1987
Copy link
Contributor

Hi,

the functionality covered in your example is already supported by the library:

Custom names using rfl::Rename:

https://rfl.getml.com/concepts/structs/

Ignoring fields using rfl::Skip:

https://rfl.getml.com/rfl_skip/

And omitting empty fields using std::optional:

https://rfl.getml.com/optional_fields/

The syntax is not quite as you suggested, but the functionality is there.

@liuzicheng1987
Copy link
Contributor

Also note the processors:

https://rfl.getml.com/concepts/processors/

@Hattonuri
Copy link
Author

Hattonuri commented Jan 31, 2025

the functionality covered in your example is already supported by the library:

Unfortunately, examples I gave were only showing simplicity of tags for custom customization.
For example if I have Clickhouse table, I might want to set different compressors for different columns, or when considering Postgres table we might want to set previous column names of a variable for compatibility. (Processors link also obviously does not solve those problems as they are set for all the structure)

And rfl::Rename/Skip and so on do help only in their one specific scenario without customization ability. For example I'm not even able to write my own Rename-like class because it will become nested struct field instead of the same field but annotated :(

@Hattonuri
Copy link
Author

And if we have tags we don't even need first custom name/rename/skip in Field template. Because anything that customizes the field should be implemented using tags as the only way for customization

@liuzicheng1987
Copy link
Contributor

@Hattonuri ,

I don't think I quite understand the advantage you are seeing here. Let's take a look:

A)

struct Person {
    rfl::Field<"firstName", std::string, Tags{OmitEmpty{}, CustomNameForJson{"jsonFirstName"}}> first_name;
};

B)

struct Person {
    rfl::Rename<"firstName", std::optional<std::string>> first_name;
};

C)

struct Person {
    std::optional<std::string> first_name;
};

rfl::json::write<rfl::SnakeCaseToCamelCase>(person);

I think B) and C) are clearly simpler and more user-friendly than A). But B) and C) are syntax that we already support.

The other problem is that you cannot initialize tuples as you have done in A and tuples cannot be template parameters.

It is also very possible to define your own custom classes:

https://rfl.getml.com/concepts/custom_classes/

https://rfl.getml.com/custom_parser/#rflreflector

I honestly don't see how the "tag" design would be an improvement.

@Hattonuri
Copy link
Author

In my example CustomNameForJson would override only name for a json parser. If we are talking about changing name for all parsers it could be done like

struct Person {
    rfl::Field<std::string, Tags{CustomName{"jsonFirstName"}}> first_name;
};

The other problem is that you cannot initialize tuples as you have done in A and tuples cannot be template parameters.

🤔. I tried myself, youre right here. But what you thing about something like that? https://godbolt.org/z/T4evPzcnG

I honestly don't see how the "tag" design would be an improvement.

For example, now we can't use implement something like this
https://clickhouse.com/docs/en/data-compression/compression-in-clickhouse#choosing-the-right-column-compression-codec
Where it has column Id Int32 CODEC(Delta, ZSTD), and ContentLicense LowCardinality(String), just because we don't have these classes in reflect(as Rename does)

@liuzicheng1987
Copy link
Contributor

liuzicheng1987 commented Feb 3, 2025

@Hattonuri ,

you can actually easily implement something like this yourself:

struct NoCompression{};

struct ZSTD{};

struct Cardinality{};

struct LowCardinality{};

template <class T, class... TagsType>
struct Tags {
     using ReflectionType = T;
     T value;
};

template <class Tag, class Head, class... Tail>
consteval bool contains_tag() {
     if constexpr (std::is_same<Head, Tag>()){
          return true;
    } else if constexpr (sizeof...(Tail) == 0){
       return false;
   } else {
       return contains_tag<Tag, Tail...>();
   }
}; 

template <class Tag, class T>
class ContainsTag;

template <class Tag, class T, class TagsType...>
class ContainsTag<Tag, Tags<T, TagsType...>> {
     constexpr bool value = contains_tag<Tag, TagsType...>();
};

template <class T, class U>
constexpr bool contains_tag_v =
    ContainsTag<std::remove_cvref_t<T>, std::remove_cvref_t<U>>::value;

This will work because of this:

https://rfl.getml.com/concepts/custom_classes/

You can use it like this:

struct MyStruct {
    Tags<int, ZSTD, LowCardinality> field;
};

contains_tag_v<ZSTD, decltype(my_struct.field)>;

@liuzicheng1987
Copy link
Contributor

@Hattonuri , if you like this syntax, I could make it a library feature. It doesn't seem like a lot work, particularly because I have basically already implemented it.

@liuzicheng1987
Copy link
Contributor

@Hattonuri , alternatively we could implement this as compile-time strings:

template <class T, rfl::internal::StringLiteral... StrTags>
struct Tags {
     using ReflectionType = T;
     T value;
};

template <rfl::internal::StringLiteral tag, rfl::internal::StringLiteral head, rfl::internal::StringLiteral... tail>
consteval bool contains_tag() {
     if constexpr (tag == head){
          return true;
    } else if constexpr (sizeof...(tail) == 0){
       return false;
   } else {
       return contains_tag<tag, tail...>();
   }
}; 

template <class Tag, class T>
class ContainsTag;

template <rfl::internal::StringLiteral tag, class T, rfl::internal::StringLiteral... StrTags)>
class ContainsTag<tag, Tags<T, StrTags...>> {
     constexpr bool value = contains_tag<Tag, StrTags...>();
};

template <rfl::internal::StringLiteral tag, class T>
constexpr bool contains_tag_v =
    ContainsTag<tag std::remove_cvref_t<T>>::value;

It would then be used like this:

struct MyStruct {
    Tags<int, "ZSTD", "LowCardinality"> field;
};

contains_tag_v<"ZSTD", decltype(my_struct.field)>;

Personally, I prefer the string-based solution.

@Hattonuri
Copy link
Author

Hattonuri commented Feb 4, 2025

you can actually easily implement something like this yourself:

I did it like that but changed typename into auto. Because when you want to parametrize compressor you'll have to match every typename for every compressor instead of matching by "Compressor" class(like in my link for godbolt upper here).

String tags have the same issue, but further disables you to pass multiple values for one tag. For example Compression codec and compression level which make things even harder :(

As for ReflectionType I tried it and it has the same issue as with avoiding it - you have to if constexpr every time for contains_tag_v to pass the value (when using rfl::to_view for example). So it does not help with Tags. In other things it's the same as in the godbolt link

@Hattonuri
Copy link
Author

Hattonuri commented Feb 4, 2025

Btw, i don't really understand why if we have custom class with defined reflection type reflection library does not itself you underlying type

(I see the point "when you want to iterate by real fields of a structure", but then you should not set reflection()+ReflectionType as override)

@liuzicheng1987
Copy link
Contributor

@Hattonuri , is your problem resolved? Can we close the issue?

@Hattonuri
Copy link
Author

I think that best solution here would still be to at least support automatic reflection() forwarding in rfl::to_view() and so on. Because for custom parsers reflection() does not help :(
Solution in my link seems like a crunch to get a working but shitty code

@liuzicheng1987
Copy link
Contributor

liuzicheng1987 commented Feb 4, 2025

@Hattonuri , I don't think this is possible. The problem is that views are pointers on the fields, but .reflection() might return an rvalue to which we cannot point.

I also don't get why the problem isn't solved...I think we came up with three different approaches...why do you think that reflection() does not help?

@Hattonuri
Copy link
Author

why do you think that reflection() does not help?

The struct with ReflectionType is the same as any with other naming considering tags as reflection library does not automatically extract reflect() value when using visitor(which in fact is all the logic we use from library when we write custom parser and should extract it ourselves).
If we look at https://godbolt.org/z/T4evPzcnG (same link as was above) we get exactly the same result as with Reflection. The reason that all the ReflectionType affects is logic written inside already-made parsers. This feature would affect anything other than the library itself if it was written in fields producer(which are visitors) instead of every consumer(which are parsers).

think we came up with three different approaches

Solution with strings is bad because it does not allow to set the tag value easily (for example zstd level for zstd compressor).
Also it makes possible for many collisions and adds the requirement for programming on strings (for example "zstd9" and checking for a tag which starts with "zstd" and parsing last digits for extraction).

Solution with many typenames is better than above but still not good because when you pass typename instead of value you'll happen with Tags<Codec<9>> and then trying to extract this 9 with much pain. You'll have to make instance_of<> trait for every combination of typename or auto which is 2^len(args in tag).

template<typename T, template<typename...> typename Spec>
concept is_specialization_of_v = is_specialization_of<T, Spec>::value;
` ``

here you can not pass template<typename_or_auto...> typename Spec

@Hattonuri
Copy link
Author

@Hattonuri , I don't think this is possible. The problem is that views are pointers on the fields, but .reflection() might return an rvalue to which we cannot point.

I dont think that this is a problem, because if .reflection() returns rvalue then we are not required to allow to_view method (because it is .... to view which should not have side effects)

@liuzicheng1987
Copy link
Contributor

liuzicheng1987 commented Feb 4, 2025

@Hattonuri , I cannot change the behaviour of the to_view function, for the stated reasons.

But you can write your own implementation that does exactly what you want:

template <class T>
concept has_reflection_method = requires(T t)
{
    { t->reflection() } ;
};

template <class T>
auto to_view(T& _t) {
  return rfl::to_view(_t).transform([](auto&& field) {
      if constexpr(has_reflection_method<typename decltype(field)::Type>) {
           return rfl::make_field<field.name_>(&field.value()->reflection());
     } else {
          return field;
     }
  });
}

https://rfl.getml.com/named_tuple/#monadic-operations-transform-and-and_then

You can use it like this:

const auto view = to_view(my_struct);

It will automatically extract the reflection types, just like you asked.

@Hattonuri
Copy link
Author

The last one is good, really. Thank you)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants