Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bloated __init__ #89

Open
ocefpaf opened this issue Mar 10, 2016 · 8 comments
Open

Bloated __init__ #89

ocefpaf opened this issue Mar 10, 2016 · 8 comments

Comments

@ocefpaf
Copy link
Member

ocefpaf commented Mar 10, 2016

@rhattersley raised this in APIRUS:

If your function has this many parameters maybe formatting isn't the problem that needs solving?

I see a few way we can cut down the fat there. Most of those are expected to exist by the SGRID convention and should be initialized without any declaration in __init__(), even the optional ones can be tested for presence and made None inside the function.

That will remove the ability to create as SGRID object by hand, but I don't think that was the goal in the first place. What do you think @ayan-usgs?

@ayan-usgs
Copy link
Contributor

@ocefpaf,

I agree that many of those parameters can be grabbed either directly found from the SGRID conventions or reliably inferred. I don't think all those parameters are needed. During development, it was never thought that anyone would actually bother with manual creation of an SGRID object, but I leaned towards giving more flexibility to the users by allowing declaration of the parameters in __init__(). The core goal was always to have, as much as possible, an SGRID object created from the values within a netCDF dataset. I have yet to see anyone take advantage of that flexibility by instantiating an SGRID object using those parameters, so I would be totally cool with their removal.

@ChrisBarker-NOAA
Copy link
Contributor

That will remove the ability to create as SGRID object by hand

I absolutely, positively think it's important to be able to crate an SGRID object by hand!

Of course, that doesn't mean you need to be able to create a complete one with just the class initializer, you could initialize a partly completed one, and then add in the rest of it as separate operations -- in fact, that's what UGRID, and I'm sure SGRID do when loading from a netcdf file. (though there are some issues with consistency there -- order of operations might be important).

But: having the full signature inthe init serves as nice documentation, and makes it easier to copy, serialize, de-serialize, etc. And I actually see no downside -- python's optional parameters and keyword arguments make it easy to only use what you want, and it's clear what you are doing.

I do see:

                 *args,
                 **kwargs

at the end there -- what is that for? I don't see them used anywhere. I really don't like to put those in without a real need -- it just makes it easier to make mistakes like mistyping a parameter name.

I have yet to see anyone take advantage of that flexibility by instantiating an SGRID object using those parameters

not even in test code? then we've got some real holes in our tests :-)

@ocefpaf
Copy link
Member Author

ocefpaf commented Mar 11, 2016

I absolutely, positively think it's important to be able to crate an SGRID object by hand!

Why?

But: having the full signature inthe init serves as nice documentation

The SGRID specs should cover that.

I do see:

                 *args,
                 **kwargs

Those should be removed too.

The pysgrid module is a SGRID convention object that interprets the SGRID specs. The SGRID specs, as any other specs, are our (interoperability people 😉) way to keep modelers at bay. Modelers have their way own of constructing grids already built-in in their models and/or other tools.

IMO there is no reason for the spec parser (here pysgrid) to became a grid creator. That is not a feature anyone would be using and, the increase in code complexity, is not sustainable in the long run.

(We have 0 paid developers in pysgrid right now!)

@ChrisBarker-NOAA
Copy link
Contributor

I think we have a different definition of "crate by hand". What I mean is create an sgrid object, once you have the grid defined in some other way -- like from one of those ways of "constructing grids already built-in in their models and/or other tools."

I didn't mean features that actually help you compute what the grid should be -- conforming to boundary, what have you. Though, now that you mention it -- why not??

If nothing else, this is really useful for test data, like what's in pyugrid right now, constructing a UGrid object from python literals.

Python is dynamic, it would be very hard to make an SGrid object that the user could NOT populate by hand, but I think it should be easy and obvious how to do so.

IRIS is a good example of what NOT to do -- it was designed with the use case of loading data from a file, and then doing analysis, visualization, etc from that data. It works great for that. But if you have data that's not in a conforming file format, it is quite tricky to figure out how to get it into an IRIS Cube -- that may be primarily a documentation problem, but that kind of proves my point: if the constructor of the object has optional parameters for all its attributes, then it can be easy and obvious to figure out how to construct one.

I guess this comes down to whether you see these tools as one-way streets -- they are only for reading data -- but I think they should be, (and pyugrid, at least, is), a way to write data as well. For instance, these tools should help the modeling community create compliant files.

the spec parser (here pysgrid)

I don't think that pysgrid nor pyugrid should be thought of as a spec parser - that's actually a secondary feature. I started pyugrid not as a spec parser, but as object that embodies the DATA MODEL encoded in the UGRID spec -- NOT a parser for the file format. The file format parser was actually added later (though not much later...) and indeed, it really should be separated more -- i.e. a compliance checker requires parsing the spec, but not creating any data objects at all...

But: having the full signature in the init serves as nice documentation
The SGRID specs should cover that.

Ouch! the spec is a pain to read and understand, and it's focused on all the nitty gritty of netcdf attributes that define the roles of the components, etc -- it would be very hard to figure out how to construct an SGrid or UGrid object from those specs.

This is was confirmed just two days ago in a meeting -- both someone in my shop and a guy from NOAA COOPS said they looked at the UGRID spec and found it essentially impossible to figure out how to write a UGRID compliant file. OK, maybe not impossible, but enough of a challenge that they gave up and didn't bother.

It would be nice to have a document that cleanly described the data models, but even then, having docs somewhere else be necessary to use a class is not good for usability.

And I still totally fail to see the downside of having a large optional parameter list in the init anyway.

@rhattersley
Copy link

IRIS is a good example of what NOT to do ... if you have data that's not in a conforming file format, it is quite tricky to figure out how to get it into an IRIS Cube -- that may be primarily a documentation problem, but that kind of proves my point: if the constructor of the object has optional parameters for all its attributes, then it can be easy and obvious to figure out how to construct one.

Yet an Iris (no capitals - it's not an acronym) Cube does have optional parameters for all its attributes. 😕 If it's tricky to create a Cube, it suggests lots of optional parameters isn't the solution. In short, I guess I don't understand the point you're making, sorry. 😒

And I still totally fail to see the downside of having a large optional parameter list in the init anyway.

My original comment was not intended to be taken too seriously - hence the winking emoticon. So please let's not blow it out of proportion. Generally it's not a good idea to have lots of a parameters to a routine, as it's indicative of overly broad/multiple responsibilities (e.g. a God object). But, as I've tried to highlight with the emphasised words, it's not a black & white rule. Just something to think about and then move on.

@ChrisBarker-NOAA
Copy link
Contributor

Yet an Iris (no capitals - it's not an acronym) Cube does have optional parameters http://scitools.org.uk/iris/docs/latest/iris/iris/cube.html#iris.cube.Cube for all its attributes. [image: 😕] If it's tricky to create a Cube, it suggests lots of optional parameters isn't the solution. In short, I guess I don't understand the point you're making, sorry. 😒

You understand, it was just a poorly made point :-)

Generally it's not a good idea to have lots of a parameters to a routine, as it's indicative of overly broad/multiple responsibilities (e.g. a God object https://en.wikipedia.org/wiki/God_object).

I agree with you there, but then it's not about the signature of the
init, it's about all those attributes that need to be set for a fully
populated object. And I don't think there's any way to avoid that in this
case.

So let's put this to bed.

-CHB

@rhattersley
Copy link

😴

@ChrisBarker-NOAA
Copy link
Contributor

I think we're done here -- the Init_ has most (all?) of the core stuff needed to create an SGrid object. good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants