Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugins for serde_derive #1427

Closed
dtolnay opened this issue Nov 24, 2018 · 7 comments
Closed

Plugins for serde_derive #1427

dtolnay opened this issue Nov 24, 2018 · 7 comments

Comments

@dtolnay
Copy link
Member

dtolnay commented Nov 24, 2018

I would like to have serde_derive integrate better with use-case-specific logic in external crates, rather than adding further use-case-specific logic into serde_derive.

As an example, #1426 wants to prefix every variant of an enum with some constant prefix. One way would be to build this logic into serde_derive.

#[derive(Deserialize)]
#[serde(prefix_all = "NetworkResult_")]
enum NetworkResult {
    Success, // deserializes from "NetworkResult_Success"
}

As there can be a question of how common a use case is and whether that justifies the tradeoff in compile time and complexity in serde_derive, it would be nicer to have such logic provided outside of serde_derive. The precise mechanism would need to be designed.

Observe that with the stabilization of proc_macro_attribute in Rust 1.30, one could write an attribute macro that inserts serde attributes into a derive invocation.

// expands to #[serde(rename = "NetworkResult_Success")] on the variant
#[serde_prefix_all = "NetworkResult_"]
#[derive(Deserialize)]
enum NetworkResult {
    Success,
}

If necessary, serde_derive could provide some basic indirection to make the invocations feel less jarring.

extern crate serde_prefix_all;
use serde_prefix_all::prefix_all;

#[derive(Deserialize)]
#[serde(apply = "prefix_all", prefix = "NetworkResult_")] // expands to #[prefix_all(prefix = "NetworkResult_")]
enum NetworkResult {
    Success,
}

Part of this work would involve polishing up serde_derive_internals because many plugins would presumably want to parse serde attributes. For example in the prefix_all use case the plugin would need to be aware of serde(rename_all = "...") and serde(rename = "...") attributes and contain logic that respects them. So far serde_derive_internals has been an internal dependency with no attention to its public API.

We will also need to catalogue some known use cases that would be met by plugin support. Digging through support tickets or feature requests or Stack Overflow questions could help put together a compelling list.

And we should consult with the authors of crates like serde_with and serde_aux for buy-in once a design has been fleshed out.

Mentioning: @Phlosioneer

@Phlosioneer
Copy link

I think the first question is, if one were to write the #[serde_prefix_all(...)] attribute, how would it work? Really, it just needs a way to apply serde attributes to all the things in the struct/enum.

It'd be nice if it looked something like this...

use serde_derive_internals::ast::{Container, Data};
use proc_macro::TokenStream;
fn prefix_all(args: TokenStream, item: Container) -> Result<Container, String> {
  // Modify the Container here...
  if let Data::Enum(ref mut vars) = item.data {
    let prefix = parse_prefix(args);
    for variant in vars.iter_mut() {
      // TODO: It's impossible to tell at this stage whether a `serde(rename)` has happened,
      // so it's also impossible to accurately account for `serde(rename_all)`.
      let ser_name = variant.attrs.name.serialize_name();
      let de_name = variant.attrs.name.deserialize_name();
      let new_ser_name = apply_prefix(ser_name, prefix);
      let new_de_name = apply_prefix(de_name, prefix);
      // TODO: There's currently no way to create / modify a `Name`.
      variant.attrs.name = Name::new(new_ser_name, new_de_name);
    }
  }
}

fn parse_prefix(...) -> String { ... }

//...

#[derive(Serialize, Deserialize)]
#[serde(plugin = "prefix_all", prefix = "networkResult_")] // Everything after the "plugin = " is passed to the plugin function.
#[serde(rename_all = "camelCase")]
enum NetworkResult {
  Success,
  NetworkFailure,
  
  #[serde(rename = "unknown")]
  UnknownFailure,
}

This makes me realize that the rename and rename_all code is really messy. I'm going to try cleaning it up.

Phlosioneer added a commit to Phlosioneer/serde that referenced this issue Nov 26, 2018
It's now concentrated in the Name struct. Some clones were avoided,
too.

Partly in preparation for serde-rs#1427
@Phlosioneer
Copy link

Continuing the conversation from #1429...

I think asking users to write full-fledged proc_macro parsers is unreasonable. Perhaps we have different goals, but I'm pushing for a "plugin" that could be written by anyone using serde_derive, with just one function. Much like a manual implementation of Serialize or Deserialize.

I think it's possible, and very ergonomic, to treat a plugin as a single, post-processing function. It would have a chance to modify the Container after parsing the syntax tree, and before generating any code. The Container, Field, and Variant structs already have most of the functionality a user would need to reach in and change whatever they like about the Container.

The very small example I gave above would be completely impossible as a stand-alone proc_macro attribute, and requires almost no experience with writing proc_macros.

@hcpl
Copy link
Contributor

hcpl commented Dec 3, 2018

I am merely speculating about what is happening at all here, so please forgive me if I got this wrong. I want to clarify the situation around this issue to better communicate the needs and constraints of both parties.

In @dtolnay's case, what I am getting from #1429 (review) is that serde_derive_internals is somewhat deprecated due to implications of

it doesn't seem like I will have bandwidth to own the additional surface area of a public serde_derive_internals

and #1429 (comment)

someone with more bandwidth

being "I don't have enough time to extend/support this anymore, so I'm passing along my flag".

And from my experience of working in the serde_derive/src/internals/ directory (which was moved from serde_derive_internals in #1246) the code there could use a lot of refactoring, though admittedly right now I'm not able to put into words how exactly this can be done. Which also means serde_derive_internals needs the same amount of refactoring, to the point it might be useful to rewrite it from the ground up.

For @Phlosioneer, on the other hand, there exists a need to express certain code constructs (plugins were mentioned, but there could be more) using only serde_derive internal workings. For this to be possible, some foundation has to exist beforehand. It may not be very solid or bug-free at first, but can be improved in the future.

I don't have anything more to add, as @Phlosioneer has already explained their position in #1429 (comment) and #1427 (comment).

Again, I'm trying to understand the situation as a third-party, so if anything I wrote above is not true, please correct me!

@dtolnay
Copy link
Member Author

dtolnay commented Dec 3, 2018

Thanks @hcpl! Some clarifications follow.


serde_derive_internals is somewhat deprecated

The crate is not deprecated. It is simply whatever code serde_derive currently happens to be using to parse serde attributes. A while back we had a request to make that logic accessible. The request did not ask for any commitment to stability, any improved documentation, or any additional API surface area beyond what serde_derive uses, so it was unproblematic and easy to fulfill.

Everything about that is different from how we would want a plugin architecture to work.

  • If the API for writing plugins changes every week, plugin authors would be justifiably unhappy. Thus there is a commitment to relative stability.
  • If the API for writing plugins is not documented, it does not meet the bar for an official Serde feature. Thus there is a documentation burden.
  • Plugins require more functionality than what is currently exposed by serde_derive_internals. Thus there is a request for additional API surface area that would need design investment.

All of these are why #1429 is not viable. If some rename methods were all that stood between serde_derive_internals and a completed serde_derive plugin library then we would do it, but that is clearly not the case. Serde_derive_internals would require weeks more work to approach what is needed. The bottom line is that this feature is not a high priority for me at present so I will spend that time on other work instead.

Changes that are not accompanied by a commitment to stability, do not present a documentation burden, and do not add API surface area are the best kind. @hcpl's work on #1424 is an example of this.


Perhaps we have different goals, but I'm pushing for a "plugin" that could be written by anyone using serde_derive, with just one function. I think asking users to write full-fledged proc_macro parsers is unreasonable.

Indeed it is a goal that plugins should be writable by anyone using serde_derive with just one function (which is a procedural macro). To the extent that a parser is required, that sort of thing would need to be factored into a library -- I don't see why a full-fledged proc_macro parser would need to be reimplemented by every user. Maybe you could clarify what unreasonableness you are concerned about here.


I think it's possible, and very ergonomic, to treat a plugin as a single, post-processing function. It would have a chance to modify the Container after parsing the syntax tree, and before generating any code. The Container, Field, and Variant structs already have most of the functionality a user would need to reach in and change whatever they like about the Container.

(Emphasis mine.)

Nope -- the only way for serde_derive to invoke a plugin would be by expanding to code that invokes the plugin as a procedural macro. I describe this approach at the top of this thread. There is no way that I can see for a plugin to execute between when serde_derive parses a syntax tree and when serde_derive emits code.


The very small example I gave above would be completely impossible as a stand-alone proc_macro attribute

Please don't make arguments of this form. It is entirely possible to write a procedural macro that expands this:

#[serde_prefix_all = "NetworkResult_"]
#[derive(Deserialize)]
enum NetworkResult {
    Success,
}

into this:

#[derive(Deserialize)]
enum NetworkResult {
    #[serde(rename = "NetworkResult_Success")]
    Success,
}

That is exactly the approach proposed at the top of this thread. To be even more clear: *I don't know of any other way that the feature could be implemented.*

Later serde_derive could wrap this approach in some syntactic convenience that enables writing the plugin invocation below the derive(Deserialize) line, but it would still expand exactly as shown here. The top of this thread shows an example of how the syntactic convenience could work.


I am not really interested in spending more time on this feature right now, but I'll lay out the remaining steps. Ideally all of these would happen outside of the Serde repo without my input.

  1. Somebody with proc macro experience will need to implement something like serde_prefix_all as shown here. There will probably be a lot of code involved.
  2. Somebody will need to look at that implementation and factor out any code that seems reusable across different plugins into a supporting library (a kind of evolution of serde_derive_internals, but I would recommend working from scratch rather than reusing any of our code). There should be almost no code left in the actual plugin at this point, beyond the renaming logic, because almost everything will be reusable.
  3. (Optional.) Write a proc macro attribute that wraps the entry point into plugins, so that the plugin author's macro does not need to perform an explicit call into the supporting library for parsing. This is a small convenience that would save 2 lines of code from each plugin so I consider it optional.
  4. Come up with a diverse set of plugin use cases and implement those as well. This ensures the supporting library is sufficiently general purpose. Importantly, please consider plugins that don't just manipulate serde attributes. For example they may need to emit entire helper functions that become arguments to deserialize_with.
  5. At this point you have working and easy-to-use plugin functionality with zero code changes in serde_derive. Good job!
  6. Make a concrete proposal for how exactly serde_derive can make the invocations integrate more nicely. I propose one possibility at the top of this thread, but I also haven't thought about the design or considered alternatives. There is a lot of design work remaining.

Example of a plugin without the optional step:

#[proc_macro_attribute]
pub fn prefix_all(input: TokenStream) -> TokenStream {
    let container = serde_plugin::parse(input);

    /* logic */
}

Or with the optional step, which expands to the same thing:

#[serde_plugin]
pub fn prefix_all(container: Container) -> Result<?> {
    /* logic */
}

Invocation with hypothetical syntactic convenience:

extern crate serde_prefix_all;
use serde_prefix_all::prefix_all;

#[derive(Deserialize)]
#[serde(plugin = "prefix_all", prefix = "NetworkResult_")]
enum NetworkResult {
    Success,
}

// expands to:
#[prefix_all(prefix = "NetworkResult_")]
#[derive(Deserialize)]
enum NetworkResult {
    Success,
}

@Phlosioneer
Copy link

Alright. Thanks @hcpl for clearing things up. Thanks also to @dtolnay for taking the time to explain.

My intention was never to add undue burden to your maintenance; the reason why I did the rename factoring was:

...If some rename methods were all that stood between serde_derive_internals and a completed serde_derive plugin library then we would do it, but that is clearly not the case. Serde_derive_internals would require weeks more work to approach what is needed.

I think, adding the rename code above, it's a perfectly fine and completed-enough plugin library. It's a little rough, but we can just make clear to users that "This is advanced usage that can change at any time, use at your own risk". Just like early proc_macros; they were still useful, even without any API stability whatsoever. I don't think there's any need to make any commitments about serde_derive_internals, just like there were no commitments made when you separated it into its own crate.

The rename functionality was the only thing that a user with access to serde_derive_internals couldn't handle, because it was impossible to determine whether or not it had already been done. Everything else is in public fields, with info on the raw parsed stuff, and what processing was done on it (such as applying rename).

Please don't make arguments of this form. It is entirely possible to write a procedural macro that expands this

I'm sorry it sounded harsh. What I meant was, I couldn't see any way to write it as a proc_macro with just 10 lines of code, like above. There's a lot of boilerplate, parsing, and error handling that goes into proc_macros.

Nope -- the only way for serde_derive to invoke a plugin would be by expanding to code that invokes the plugin as a procedural macro. I describe this approach at the top of this thread. There is no way that I can see for a plugin to execute between when serde_derive parses a syntax tree and when serde_derive emits code.

Make a concrete proposal for how exactly serde_derive can make the invocations integrate more nicely. I propose one possibility at the top of this thread, but I also haven't thought about the design or considered alternatives. There is a lot of design work remaining.

I have a concrete implementation in mind, and I don't think it'll be too difficult. But it'll take time. And I want to do this. But I also don't want to make 5-15 hours of progress and then have you say no. Again, as I said above, it's perfectly ok if you don't want to maintain more, I just wanted to know one way or another before sinking the time; and when you closed #1429 with a very dismissive comment, I felt some hostility. I was wrong.

I think I'm going to go ahead and get a prototype made, and demonstrate the things that I'm talking about. Then we can talk about concrete implementation and design details, rather than what-ifs and maybes.

@Phlosioneer
Copy link

I am mistaken. What I wanted to do doesn't work.

Specifically, I thought const fn and proc macros could interact, like rust-lang/rfcs#2279. The plan was something like this:

const fn my_plugin(container: serde_derive_internals::ast::Container, params: HashMap<String, String>) {
  // ...
}

#[derive(Serialize, Deserialize)]
#[serde(plugin = "my_plugin", my_plugin::key = "value", ...)]
struct MyStruct {
  // ...
}

So then, unless we do build.rs shenanigans, the user would have to write a procedural macro of some kind, compiled separately. And that could then be set up to have basically the same interface, something like:

use serde_plugin::{serde_plugin, ast, attr};
use std::collections::HashMap;

#[proc_macro_attribute]
fn my_plugin(attr: TokenStream, item: TokenStream) -> TokenStream {
  serde_plugin(attr, item, my_plugin_iml)
}

// Can return a modified container, or a stand-alone token stream, or both. I'd make a proper
// type for this, so that `(None, None)` isn't an option; and it'd be best to have this in a `Result` form
// for errors and `?`.
fn my_plugin_impl(container: ast::Container, params: HashMap<String, String>) -> (Option<ast::Container>, Option<TokenStream>) {
  //...
}

/////////////////////
// And in userland:

#[my_plugin(key = "value", ...)]
#[derive(Serialize, Deserialize)]
#[serde(...)]
struct MyStruct {
  // ...
}

I'll try working on it. See how far I get. I'll follow your outlined steps

@dtolnay
Copy link
Member Author

dtolnay commented Feb 1, 2019

Closing because the next steps here are all outside of Serde, so this is not tracking anything actionable on our end. We can open separate issues as needed for any work that needs to happen on this in Serde.

@dtolnay dtolnay closed this as completed Feb 1, 2019
@serde-rs serde-rs locked and limited conversation to collaborators Jun 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

3 participants