Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to error handling #286

Open
stevenrhall opened this issue Jan 24, 2025 · 2 comments
Open

Improvements to error handling #286

stevenrhall opened this issue Jan 24, 2025 · 2 comments

Comments

@stevenrhall
Copy link

stevenrhall commented Jan 24, 2025

I'd like to point out some ways that error handling could be improved (I think substantially) fairly easily in cyipopt. If there's interest, I'd be willing to take a stab at a PR that accomplishes it. It's in some ways related to issue #210, that led to the adoption of CyIpoptEvaluationError to signal to cyipopt to return False from a callback evaluation.

There's a few places where there are constructs like this:

        if (np_iRow < 0).any() or (np_iRow >= m).any():
            msg = b"Invalid row indices returned from jacobianstructure"
            log(msg, logging.ERROR)
            return False

An error that triggers this log message also leads to an Unrecoverable Exception (status = -100) from Ipopt, which isn't a very good clue to what the underlying problem is. Simply replacing that code by

        if (np_iRow < 0).any() or (np_iRow >= m).any():
            msg = b"Invalid row indices returned from jacobianstructure"
            raise ValueError(msg)
            # return False        # oops --- this doesn't belong here of course

would be a real improvement in error handling for the user. It leads to an Ipopt message

EXIT: Stopping optimization at current point as requested by user.

followed by a standard error trace with a ValueError message indicating that the error is with jacobianstructure. There are about 10 instances where a log with level logging.ERROR (and some with level logging.DEBUG) could be replaced with a ValueError exception and result in better messaging.

There's also a fairly insidious bug in the hessiab_cb function. The try block has

    try:
        log(b"hessian_cb", logging.INFO)
        ret_val = True
        self = <object>user_data

        if values == NULL:
            ret_val = hessian_struct_cb(...)

        else:
            ret_val = hessian_value_cb(...)
    except:
        self.__exception = sys.exc_info()
    finally:
        return ret_val

The last part should probably be

    except:
        self.__exception = sys.exc_info()
        return True
    return ret_val

or maybe

    except:
        self._Problem__exception = sys.exc_info()
        return True
    return ret_val

which is a pattern seen in other callbacks. As is, the try-except block swallows an error in the Hessian evaluation silently, and doesn't re-raise it at all. I haven't worked out the details, but the result is that in problems where there's an error in the Hessian that raises an error, the problem of course doesn't converge, and there's no indication to the user that there was an issue.

General Principle. The general principle (I think) is this: The callback functions should always return True, unless specifically signalled by CyIpoptEvaluationError to return False, which is to say, Ipopt will deal with the error. When there's an error that the user doesn't specifically catch, or when cyipopt detects invalid inputs, an exception should be allowed to be stored in the __exception attribute, and then False returned to Ipopt by the intermediate callback to stop execution, and then the exception re-raised so the user can debug their code.

As I said, I'm willing to draft a PR, with tests, if there's interest in this.

@moorepants
Copy link
Collaborator

moorepants commented Jan 25, 2025

Improvements here are welcome. It would be helpful to first draft the failures that you want to address as tests so we can see clearly the issues.

Does this one not return?

       if (np_iRow < 0).any() or (np_iRow >= m).any():
            msg = b"Invalid row indices returned from jacobianstructure"
            raise ValueError(msg)
            return False

For example:

In [1]: def f():
   ...:     raise ValueError()
   ...:     return 'a'
   ...: 

In [2]: f()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 f()

Cell In[1], line 2, in f()
      1 def f():
----> 2     raise ValueError()
      3     return 'a'

ValueError: 

@stevenrhall
Copy link
Author

stevenrhall commented Jan 25, 2025

As an example, there are 12 tests in your test suite that use

def ensure_invalid_option(instance):
    # -12: Invalid option
    # Thrown in invalid Hessian because "hessian_approximation"
    # is not chosen as "limited-memory"
    ensure_solve_status(instance, -12)

def ensure_invalid_number(instance):
    # -13: Invalid Number Detected
    ensure_solve_status(instance, -13)

def ensure_unrecoverable_exception(instance):
    # -100: Unrecoverable Exception
    # *Should* be returned from errors in initialization
    ensure_solve_status(instance, -100)

Those tests indeed capture the actual behavior of cyipopt/Ipopt. That is, some errors in the definition of the hessian structure result in a status of -12, because Ipopt concludes there is no hessian provided when cyipopt thinks there is. But that's the actual behavior, not the desired behavior. In fact, if handled by raising the error instead of just logging it, Ipopt would end with status 5 (User Requested Stop) and then cyipopt would re-raise the ValueError, which I think is the desired behavior.

Here's a modification of the hs071 test case to demonstrate. Choose the case you want to run. Depending on the case, a divide by zero exception is raised in one of the 7 user-defined callbacks. All but three cases produce a user requested stop then raise a division by zero error (as I would expect), but cases 5, 6, and 7 with exceptions in the hessian callback silence the error and they are never raised; instead you get a max iteration error. That's the insidious error I talked about above:

hs071_exceptions.txt

It's instructive to run the cases and see the difference in behavior in case 4 (an exception in the jacobian) and case 6 (an exception in the hessian). Case 4 produces:

EXIT: Stopping optimization at current point as requested by user.
Traceback (most recent call last):
  File "/Users/steve/Documents/PycharmProjects/py-ipopt/py-ipopt/examples/hs071_exceptions.py", line 124, in <module>
    main()
  File "/Users/steve/Documents/PycharmProjects/py-ipopt/py-ipopt/examples/hs071_exceptions.py", line 113, in main
    x, info = nlp.solve(x0)
  File "cyipopt/cython/ipopt_wrapper.pyx", line 658, in ipopt_wrapper.Problem.solve
  File "cyipopt/cython/ipopt_wrapper.pyx", line 1071, in ipopt_wrapper.jacobian_cb
  File "cyipopt/cython/ipopt_wrapper.pyx", line 1032, in ipopt_wrapper.jacobian_value_cb
  File "/Users/steve/Documents/PycharmProjects/py-ipopt/py-ipopt/examples/hs071_exceptions.py", line 49, in jacobian
    value = 1 / 0
ZeroDivisionError: division by zero

while case 6 produces

EXIT: Maximum Number of Iterations Exceeded.
Solution of the primal variables: x=array([1.01695083, 4.35921839, 4.39578663, 1.24766744])

Solution of the dual variables: lambda=[-0.55014645  0.17613456]

Objective=16.79460486210927

Status code=-1

Getting back to the tests above, the right thing to do in the test is ensure_solve_status(instance, 5), except that we never get the status if the exception is raised. So the best we could do there is to use with pytest.raises(ValueError,match=...).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants