-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refined and documented error handling for functions #92
base: main
Are you sure you want to change the base?
Conversation
9a5aa1d
to
c71fe4e
Compare
c71fe4e
to
3d3a513
Compare
Some feedback along the way here. I like the general idea, and stas said so, too. I also appreciate that the performance impact in this patch is moved to actually calling functions. And I'm very much with you that doing this inspection on AST-processing time once would be really nice, which is why I prioritized writing #95. Also, I was in post-FOSDEM trauma, so that was a good way to spend some time at the airport. That said, I'd prefer a bit more rigor still. In particular, I'd prefer to make the inspection code paths a no-go. One idea I've toyed around with is something like class Function(object):
nargs = 0
kwargs = {}
def __new__(cls):
raise NotImplementedError which would allow us to do something like class OS(Function):
def __new__(cls):
return "windows" Using I also think we need to make functions take |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't get this out of my head.
I agree that we need to do something about error handling of functions, but I think we need to take a different path to that.
Firstly, error handling and how to implement functions is part of the non-existing spec for the resolver. The way I phrased it in a call with stas yesterday, "I wanted to check the spec, so I called the spec". Stas also said that he's only going to get into the resolver spec in Q2. We do intend to start filing issues on the resolver spec in the fluent/fluent repository this week, though. At least the part around named arguments with hyphens changes the implicit spec by the js runtime? Maybe the documentation in this patch is a good start for some parts of the resolver spec, actually.
My other major concern about the technical approach here is that the information around function arguments isn't accessible to editing tools. In a mozilla context, having bad arguments in the browser console is better than nothing, but having them in pontoon is what will actually matter. As we define our Fluent ecosystem, I also expect us to look into language server logic for editors, etc, and then developers writing patches will also have support, from errors to possibly even autocomplete.
I'd like to split this patch up in six things:
- File an issue on fluent/fluent to specify function behavior in the resolver spec. We should specify required globals, their arguments, but also error handling in selectors and formatters. We should also specify the role of binding specs. For example, can you unsupport a
some-arg
in a standard global. This should also cover how to document signatures to tools. One way might be semantic comments, but in Firefox I want something more global than that. - @stasm to review the documentation in this patch for specification material, and potential changes to the spec.
- File an issue with needed changes to https://github.com/projectfluent/fluent/blob/master/guide/functions.md.
- File a bug on compare-locales to actually support parameter checks. That should support the in-file specs above, but probably also adds to that via project configuration.
- Transform this issue to implement the spec and document how the bindings implement it.
- File issues on the js/rust resolvers where they don't adhere to the spec ;-)
I think this patch gives a lot of food for thought in how to spec this, but also in how one can do error handling. My problem is that I can't find out if we're doing the right thing here without a spec.
I agree that a spec would be useful here. I'm also aware that different languages are going to have to approach error handling quite differently. For example, in elm-fluent I implemented a very different approach, where essentially all errors in FTL files become compile time errors. I did this because:
I think it would be important for a resolver spec to be written in such a way that allows these different approaches to error handling - though I have no idea how you would do that! In the same way the Python implementation has to fit the expectations of Python developers, as well as the Fluent spec, and that means it will probably work differently even from other dynamically typed languages. It's going to depend on how things like exceptions are generally used, and what error handling strategies are normal, and what are well supported by other tools. For example in Python/Django I'm not terrified by runtime crashing errors, because I know it will be just reported to Sentry and the main Django process will carry on, and if I've got bugs in my Python code (such as a custom function passed to So, I'm kind of thinking that maybe the implementations should come first, and then the spec :-) - or the spec should be written in general terms. Otherwise, you risk creating a spec that becomes irrelevant because a whole bunch of programming languages will ignore it, or a spec that seriously limits the implementation possibilities, which might not be helpful at this stage for Fluent. After we have a few implementations (e.g. several dynamically types, several statically typed), we can refine the spec and change the behaviour of the implementations if necessary. |
Is there a reason for that? If we have a function:
it really seems a shame if we have to make it any more complicated than simply passing this function to the
What would be the advantage of doing this? From an implementation point of view The only thing we really need is a callable object, so in Python we should accept functions, bound methods, static methods, class methods, objects with I do like your idea of having explicitly named attributes for the arg count and keyword list, rather than a single attribute that combines both in a non obvious way. So you could have:
(Alternatively
Yes, good call. I was most worried about the compiler branch having to generate calls to functions with illegal keyword arguments, but as you say you can use |
We sadly have plenty of experience with systems like that, and the Mozilla l10n team gets it's fair share of hate each time that happens. Android for example likes to do this, but we also had parts of the Firefox Desktop release pipeline do that. Release managers and engineers expect to ship, and one language having one mistake in one string blocking the complete release pipeline is generally not OK. That should also hold true for folks writing stuff in elm, as everything works as expected, you just get an English string. Breaking builds on missing strings (which elm-fluent does by default?) is making development at scale really hard. And it only allows projects with waterfall engineering processes, too. Based on that experience, we designed Fluent to be infallible at build- and runtime, and to recover from localizer errors on a per-message basis. It's not really visible in our documentations, sadly. I'll lobby hard to specify it as such. Tooling on whether a particular project wants to ship with partial translations (or obsolete strings) should really be a separate step in the release pipeline, and not encoded in a Fluent implementation. |
For missing messages, elm-fluent has a command line option to allow fallback instead of error (error is the default), which is pretty much required for development workflow, but can also be used for deployment if wanted. This is the kind of decision that has to be taken at a team level, and also at the level of different kinds of errors. For the kind of workflow and teams I'm envisaging, developers will need to be able to edit FTL files (at least the English/source version), so they will be able to correct syntax errors and type errors in the FTL, especially with the help of a good compiler that points out problems very clearly (which elm-fluent does). On the other hand, in most projects, most developers could not be expected to provide missing messages in translations, so for projects that don't care about a few missing messages having the ability to ignore this error and provide a fallback makes sense. I imagine that in future, if elm-fluent (or a similar project) continues, it may gain extra configuration switches to tolerate certain kinds of errors. However, I'm also very convinced that the vast majority of Elm teams will appreciate the current behaviour. Whatever the spec might say, those additional switches will only get implemented if people actually want it to behave that way, and the defaults will align with what the community of users wants, not the spec. That's why I think that a one-size-fits-all approach to defining 'correct' error handling behaviour which doesn't take into account the way that different programming languages and teams approach these conditions is not very helpful.
What you are essentially saying here is the Fluent should be specified in such a way that it will only work nicely for the way that Mozilla (or other pretty large teams) would want it to work. I'm saying that it should be specified in such a way that it can also work well for smaller teams or teams with different priorities. Designing the spec to make it possible to recover from errors on a per-message basis is enough, IMO. |
I've updated this branch for master, and it also includes the fixes regarding accepting kwargs-with-hyphens. Thinking again about the spec issues, maybe one way to word things is to say something like this:
It is inevitable that in some implementations, certain types of error will be impossible. For example in elm-fluent, even if I were to make the compiler massively more tolerant, there are some errors that simply cannot be recovered from without completing changing the nature of the implementation (which would also result in huge performance penalties). To be specific, for the case of an entirely missing message (i.e. from all locales), with anything approaching the current implementation of elm-fluent you are going to get a compilation error. Changing this behaviour wouldn't be desirable either, because for an entirely missing message there will be no fallback, only some kind of error string, and in 99.9% of cases it will represent a typo by the developer. In bindings that are doing all message lookups at runtime, however, we could choose to recover from an entirely missing message. My point is that there will be variations in what errors can be recovered from across different implementations, and that some implementations will make certain errors impossible at runtime, and the spec should reflect that. |
Motivation for error handling changes and function signature checking - in short, we need to be more tolerant of translator errors, and less tolerant of developer errors.
We need a mechanism for disallowing FTL function calls to use disallowed keyword arguments e.g.
NUMBER(currency:"GBP")
etc. We could do this simply using a wrapper function that has different keyword args fromfluent_number
, but it would be nice to have a more generic mechanism.We want to tolerate translator errors regarding function arguments e.g. if they pass
currency
toNUMBER
(or a non-existing keyword arg) it should generate an error/warning, but theNUMBER
function should still be called with the other valid args. (Without this patch, we are currently incorrectly allowingcurrency
to be passed, and getting only error output from theNUMBER
call if a different mistake is made). Similarly surplus positional arguments should be trimmed.We want to avoid things like this, which is basically what the old code does:
When I have seen code like this before, it has usually been at the end of a long debugging session, and has involved some cursing of the authors. As I don't want to be the subject of the imprecations of future software developers, we need to avoid this (also, the Zen of Python thing about errors etc.)
We could make it catch just
TypeError
to try to catch errors in calling wrong arguments, but this only reduces the problem a bit - there could be aTypeError
from inside the function, and this solution doesn't address item 2. above.Essentially it's about not silencing developer errors. (while still being tolerant of FTL errors)
To support these 3, I've added a system of checking function arguments before we call them, and a well defined error strategy for the functions. In this way we can remove disallowed args and not fail, and we don't even call the function if we know it would throw
TypeError
(for too few positional args).For the compiler branch, the overhead for this checking can all be done up front. For the resolver, the patch as it is does the work lazily for each function call (and not until that point, unlike some other older versions of the compiler branch you may have seen, I've changed things quite a bit). We could potentially cache some of this for future calls to
format
, but that would require storing somewhere, and the only real candidate isFluentBundle
instances. We could perhaps just stick some extra things onto it.In addition, I've properly documented the expected behaviour of custom functions regarding error handling, and re-organized the docs a little bit.