Add error handling in IRunnable::run #344
Comments
What is the guaranteed threading model of I think the error type should be structured. Something like a polymorphic |
The handler is meant to delegate control flow decisions up the call stack to the place where there is sufficient context available for the decisions to be made, similar to exception handlers except that we return back to the place where the "exception" was raised afterward. This happens strictly within the current call stack in one thread. The way I see it now, if one were to invoke multithreading then the proposed mechanism is not being used correctly. We will document this. Currently we define the set of possible error types explicitly in a |
error callback discussion
How will the implementer of this callback know the context of the error? When considering exception handling, context is everything. I'm unconvinced that a generalized solution is usable. It'd probably become a lot of documentation about unrelated parts of the middleware and how they interact with the runnable error callback. I'd omit this callback. For errors where we need to consult a higher layer without aborting execution we should utilize a similar pattern but with callbacks injected on objects local to the errors being generated. So, if you had
Error Return
Any IRunnable::run return has to start with a notion of how we surface errors from below the LibCyphal middleware layer (e.g. the media layer implementation/port) |
A practical case is We have so far identified the following approaches:
The error type discussion is moved to #345 |
|
Currently, we pass media instances into the transport directly via a span or something. What I'm proposing as the first alternative is that we introduce a managing entity that holds the media instances and can handle their errors, where handling is to be understood as forwarding errors to an external handler:
class MediaGroup
{
public:
using ErrorHandler = cetl::function<void(std::size_t media_index, const Error& error)>;
void setErrorHandler(const ErrorHandler& eh);
...
private:
std::span<IMedia*> media_;
};
The handler is only meant to be invoked on errors originating from redundant things so I don't think there will be much documentation involved. If you get called with an IMedia from the CAN transport then it's pretty clear what the scope of the error is. If we don't agree on this, we have the next best thing, the third proposal.
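The forwarding behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `std::function` and `std::vector` stand in for `cetl::function` and `std::span`, and the `Error` struct, the `poll()` method, `pollAll()`, and the `GoodMedia`/`BadMedia` test doubles are hypothetical names that do not come from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real Error and IMedia types are defined by the library.
struct Error
{
    std::string what;
};

struct IMedia
{
    virtual ~IMedia() = default;
    // Hypothetical operation: returns an error on failure, nothing on success.
    virtual std::optional<Error> poll() = 0;
};

class MediaGroup
{
public:
    // std::function is used here in place of cetl::function (assumption).
    using ErrorHandler = std::function<void(std::size_t media_index, const Error& error)>;

    explicit MediaGroup(std::vector<IMedia*> media) : media_(std::move(media)) {}

    void setErrorHandler(const ErrorHandler& eh) { handler_ = eh; }

    // Polls every media instance. A failure of one member is forwarded to the
    // handler (if any) and does not prevent the remaining members from being polled.
    void pollAll()
    {
        for (std::size_t i = 0; i < media_.size(); ++i)
        {
            if (const auto err = media_[i]->poll())
            {
                if (handler_) { handler_(i, *err); }
            }
        }
    }

private:
    std::vector<IMedia*> media_;
    ErrorHandler         handler_;
};

// Test doubles for the usage example.
struct GoodMedia final : IMedia
{
    std::optional<Error> poll() override { return std::nullopt; }
};
struct BadMedia final : IMedia
{
    std::optional<Error> poll() override { return Error{"bus off"}; }
};

// Returns the indices of the media instances that reported an error.
std::vector<std::size_t> demoMediaGroup()
{
    GoodMedia  a;
    BadMedia   b;
    GoodMedia  c;
    MediaGroup group({&a, &b, &c});
    std::vector<std::size_t> failed;
    group.setErrorHandler([&failed](std::size_t index, const Error&) { failed.push_back(index); });
    group.pollAll();
    return failed;
}
```

In this sketch the handler learns which redundant interface failed from the index alone, which is the scoping argument made above.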
In this case we don't introduce additional classes but instead extend
class IMedia
{
public:
using ErrorHandler = cetl::function<void(std::size_t media_index, const Error& error)>;
virtual void setErrorHandler(const ErrorHandler& eh) noexcept = 0;
//fgsfds
};
I find this proposal superior compared to the first one. |
As do I. |
Okay. We are going to make The same approach will be used in Cyphal/UDP for The |
Notes based on the last dev call. The design objective is to expose detailed error information at the media layer and at the same time allow high-level handling at the higher layers. To do this, we could (this is not a proposal yet but an idea) do the following:
The |
We can address this by introducing a non-template polymorphic base for
First option
The second part of the solution is amended as follows:
struct TransientError
{
/// The reference to the error container loses validity after the return from the handler.
const unbounded_variant_base& error;
unbounded_variant<sizeof(void*)> culprit; ///< Pointer to the failed instance
};
/// The reference to the TransientError, including all nested references, loses validity after the return from the handler.
/// It is, therefore, necessary to handle the error directly in the handler.
using TransientErrorHandler = cetl::function<void(const TransientError& err)>;
void setTransientErrorHandler(const TransientErrorHandler& teh);
If the const reference causes problems with copyability, it can be replaced with a const pointer. The
Second option
The second part of the solution is left as-is. The
The second option is preferable because it doesn't put undue restrictions on copyability in the simple cases. One non-critical limitation here is that neither solution allows copying the error object for postponed processing unless you know its exact type. We could fix this shortcoming by introducing support for fallible copyability to
class unbounded_variant_copyable_base : public unbounded_variant_base
{
//...
/// Returns false if the footprint is not large enough, in which case no copy is done.
CETL_NODISCARD virtual bool copy(const unbounded_variant_base&) = 0;
//...
};
As a piece of syntactic sugar, would it not be nice to define an unbounded variant instantiation specifically designed for holding pointers:
/// May contain any raw pointer.
using any_raw_ptr = unbounded_variant<sizeof(void*)>;
/// May contain any raw pointer, unique_ptr, shared_ptr, and most other smart pointers.
using any_ptr = unbounded_variant<sizeof(void*) * 3>; |
@serges147 Here's the refined proposal that incorporates everything discussed above. Amend the definition of
class IRunnable
{
public:
/// We may enable PMR for this error type later if necessary.
/// If/when that happens, the footprint should probably be set to zero to minimize overhead.
using Error = cetl::unbounded_variant<sizeof(void*) * 8>; // TODO: the footprint may need to be enlarged to fit nested variants.
virtual expected<void, Error> run(const TimePoint now) = 0;
};
Next, we add new entities to
class ITransport
{
public:
using TransientErrorHandler = cetl::function<void(const RichAnyError&)>;
void setTransientErrorHandler(const TransientErrorHandler& teh);
};
If the transient error handler is not installed, errors bubble up through
It should be documented that the user must not attempt to modify the configuration of
Above we reference a nonexistent type named
struct RichAnyError final
{
AnyError error;
    /// Pointer to the entity that has caused this error for enhanced context; empty if not applicable.
cetl::unbounded_variant<sizeof(void*)> culprit;
};
Alternatively, we could perhaps extend Then |
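A rough sketch of how the refined proposal might behave, under explicit assumptions: `std::any` stands in for `cetl::unbounded_variant`, `std::optional<AnyError>` approximates `expected<void, Error>`, and `FakeMedia`, `MediaFailure`, `Transport`, and `demoTransient` are hypothetical names invented for illustration. It also assumes that `run` is where uninstalled-handler errors bubble up, which the text above does not spell out.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Assumption: std::any approximates the type-erased unbounded variant.
using AnyError = std::any;

struct RichAnyError final
{
    AnyError error;
    std::any culprit;  // pointer to the failed entity; empty if not applicable
};

struct MediaFailure { int code; };  // hypothetical concrete error type
struct FakeMedia { int status; };   // hypothetical media; status 0 means OK

class Transport
{
public:
    using TransientErrorHandler = std::function<void(const RichAnyError&)>;

    explicit Transport(std::vector<FakeMedia> media) : media_(std::move(media)) {}

    void setTransientErrorHandler(const TransientErrorHandler& teh) { handler_ = teh; }

    // Stand-in for run(): an empty optional means success.
    std::optional<AnyError> run()
    {
        for (FakeMedia& m : media_)
        {
            if (m.status == 0) { continue; }
            if (!handler_)
            {
                // No handler installed: the error bubbles up to the caller.
                return AnyError{MediaFailure{m.status}};
            }
            // Handler installed: report the error with its culprit and keep going.
            handler_(RichAnyError{MediaFailure{m.status}, &m});
        }
        return std::nullopt;
    }

private:
    std::vector<FakeMedia> media_;
    TransientErrorHandler  handler_;
};

struct Outcome
{
    int  reported;  // number of errors seen by the transient handler
    bool bubbled;   // whether run() returned an error
};

Outcome demoTransient(bool install_handler)
{
    Transport t({FakeMedia{0}, FakeMedia{7}, FakeMedia{0}});
    int       reported = 0;
    if (install_handler)
    {
        t.setTransientErrorHandler([&reported](const RichAnyError& e) {
            // The error is examined inside the handler; no reference is retained.
            const bool known_error = std::any_cast<MediaFailure>(&e.error) != nullptr;
            const bool has_culprit = std::any_cast<FakeMedia*>(&e.culprit) != nullptr;
            if (known_error && has_culprit) { ++reported; }
        });
    }
    const auto result = t.run();
    return Outcome{reported, result.has_value()};
}
```

With the handler installed the failing member is reported once and `run()` succeeds; without it the same failure is returned to the caller.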
Done in #361 |
PROBLEM
The design doc omits error handling for IRunnable::run. We want to return an expected from there (the CETL rendition of it) to communicate errors; however:
In scenarios involving redundant network interfaces, we want to be able to keep going if a redundant entity fails, so that the other members of the redundant group remain operational.
IRunnable is a very abstract interface and we don't want to tie it to the transport-specific error hierarchy because that creates very bad coupling between different components of the library. We seem to need a way to both retain the specific context-dependent error type (e.g., libcyphal::transport::AnyError if we're in the transport context) and erase the type to manage the coupling.
SOLUTION
1. Use an error notifier to manage redundant entities in a way that is somewhat similar to algebraic effects. When an error occurs in the context of a redundant entity group (like multiple IMedia or multiple transports), we don't return it immediately but inform the caller about this, allowing the caller to decide if we should go on or abort:
Behavior:
The error is passed to on_error, which can return the same error (or perhaps any other error, we don't care) or nothing.
If on_error returns nothing, we move on.
If on_error returns an error (which may be the same or distinct), we cease further processing and return said error.
Optionally, we can provide a type-erased unbounded_variant containing a typed pointer or even an untyped const void* pointing to the entity that triggered the error (e.g., an IMedia instance) for additional context. This is immaterial at this stage.
2. Use the unbounded variant (formerly cetl::any) to capture and erase the detailed error type. The error type is defined as cetl::unbounded_variant<sizeof(void*)*8>. The caller will be able to query for relevant error types that it can handle at runtime, propagating unhandleable errors upward to the application. This allows us to retain the detailed type via the unbounded variant type, and at the same time hide it from the abstract IRunnable interface:
@thirtytwobits please review
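The two parts of the solution could be combined roughly as in the sketch below. It is not the library's code: `std::any` is used as a stand-in for `cetl::unbounded_variant<sizeof(void*)*8>`, `std::optional` approximates the "error or nothing" return, and `TransportError`, `MemoryError`, `Member`, `runGroup`, `tolerateTransportErrors`, and `demoGroup` are hypothetical names chosen for illustration.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

// Assumption: std::any approximates the type-erased error container.
using AnyError = std::any;

// on_error receives the error and returns nothing (go on) or an error (abort).
using OnError = std::function<std::optional<AnyError>(const AnyError&)>;

// One member of a redundant group; returns an error on failure.
using Member = std::function<std::optional<AnyError>()>;

struct TransportError { int code; };  // hypothetical, tolerable
struct MemoryError {};                // hypothetical, fatal

// Processes each member in order. On failure the caller-supplied on_error decides
// whether to continue with the remaining members or abort with the returned error.
std::optional<AnyError> runGroup(const std::vector<Member>& members, const OnError& on_error, int& members_run)
{
    for (const Member& member : members)
    {
        ++members_run;
        if (const auto err = member())
        {
            if (auto verdict = on_error(*err)) { return verdict; }
        }
    }
    return std::nullopt;
}

// Example policy: query the erased type at runtime, swallow the errors we know
// how to tolerate, and propagate everything else upward unchanged.
std::optional<AnyError> tolerateTransportErrors(const AnyError& error)
{
    if (std::any_cast<TransportError>(&error) != nullptr) { return std::nullopt; }
    return error;
}

struct GroupOutcome
{
    int  members_run;
    bool aborted;
};

GroupOutcome demoGroup(bool second_member_fatal)
{
    const std::vector<Member> members = {
        []() -> std::optional<AnyError> { return std::nullopt; },
        [second_member_fatal]() -> std::optional<AnyError> {
            if (second_member_fatal) { return AnyError{MemoryError{}}; }
            return AnyError{TransportError{42}};
        },
        []() -> std::optional<AnyError> { return std::nullopt; },
    };
    int        members_run = 0;
    const auto result      = runGroup(members, tolerateTransportErrors, members_run);
    return GroupOutcome{members_run, result.has_value()};
}
```

A tolerable failure in the second member lets all three members run, while a fatal one stops processing after the second member and surfaces the error; the group itself never needs to know the concrete error types.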