Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add priority emergency spells #1

Merged
merged 38 commits into from
Oct 18, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
8a9cb25
feat(osm-stop): add single and universal OSM stop spell
amusingaxl May 7, 2024
89a518a
feat(clip-breaker): add universal clip breaker spell
amusingaxl May 8, 2024
b002456
refactor(clip-breaker): add more error handling cases
amusingaxl May 8, 2024
5bc1ab3
refactor(osm-stop): add more error handling cases
amusingaxl May 8, 2024
ba2dfc9
refactor: remove parametrizable expiration
amusingaxl May 9, 2024
43b2d04
refactor: rename `_onSchedule` to `_emergencyActions`
amusingaxl May 9, 2024
fc6b1b9
refactor: rename test files
amusingaxl May 9, 2024
87387db
docs(osm-stop/universal): add documentation to ignored ilks
amusingaxl May 9, 2024
fa32048
feat(osm-top): add batching escape hatch
amusingaxl May 9, 2024
2160801
refactor(osm-stop): remove unused interface import
amusingaxl May 9, 2024
89dcc8c
feat(clip-breaker): add batching escape hatch
amusingaxl May 9, 2024
9587bef
refactor(osm-stop): add missing params to event payload
amusingaxl May 9, 2024
8a9382e
chore: update CI pipeline
amusingaxl May 9, 2024
1b9b6a9
refactor(osm-stop): tests to use factory to deploy the spell
amusingaxl May 9, 2024
69d1027
feat(clip-breaker): add spell to set the clip breaker for a single ilk
amusingaxl May 9, 2024
5d70199
feat(auto-line-wipe): add spell to wipe a single ilk
amusingaxl May 13, 2024
7cd769c
feat: add meaningful `done()` methods
amusingaxl May 16, 2024
206c15c
refactor: reorganize contracts and update natspec
amusingaxl May 17, 2024
d1bee94
refactor: improve universal spells structure and comments
amusingaxl May 17, 2024
6b421dd
refactor: rename "universal" spells to "multi"
amusingaxl May 22, 2024
85a7adc
feat: guard against empty ilks list checks
amusingaxl May 22, 2024
efec4e7
feat(auto-line-wipe): add wipe for multiple ilks
amusingaxl May 22, 2024
5b57421
feat(ddm-disable): add single DDM disable spell
amusingaxl May 22, 2024
1a375b5
refactor(ddm-disable): change `done()` expression for better readability
amusingaxl May 23, 2024
2eeec5b
refactor: fix typo in function name
amusingaxl Jun 25, 2024
cfafa6b
refactor(single-auto-line-wipe): remove `Wipe` event parameters
amusingaxl Jun 25, 2024
359fc0d
refactor(single-ddm-disable): remove unused function from interface
amusingaxl Jun 25, 2024
50493ac
refactor(base-emergency-spell): change tag to `extcodehash(address(th…
amusingaxl Jun 25, 2024
d594174
refactor: remove inline assembly to fetch the code hash
amusingaxl Jun 26, 2024
7ad2715
docs: fix typo in comment
amusingaxl Jul 29, 2024
ce34c28
docs: add README
amusingaxl Jul 29, 2024
0372ffb
refactor: remove unneeded status checks after revert
amusingaxl Sep 19, 2024
04fe1d7
refactor: make `nextCastTime` return `type(uint256).max`
amusingaxl Sep 19, 2024
bc06eeb
docs: add relevant chainlog key references to README
amusingaxl Sep 19, 2024
c810d13
refactor: address audit findings
amusingaxl Oct 15, 2024
a8e2beb
docs: update README
amusingaxl Oct 15, 2024
75c5a06
refactor: additional improvements
amusingaxl Oct 16, 2024
d9fc4ea
fix: apply suggestions from code review
amusingaxl Oct 17, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion .github/workflows/test.yml
Original file line number Diff line number Diff line change
@@ -1,9 +1,15 @@
name: test

on: workflow_dispatch
on:
push:
branches:
- master
- main
pull_request:

env:
FOUNDRY_PROFILE: ci
ETH_RPC_URL: ${{ secrets.ETH_RPC_URL }}

jobs:
check:
Expand Down
158 changes: 114 additions & 44 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,66 +1,136 @@
## Foundry
# Maker Protocol Emergency Spells

**Foundry is a blazing fast, portable and modular toolkit for Ethereum application development written in Rust.**
Pre-deployed spells to allow MakerDAO Governance to react faster in case of emergencies.

Foundry consists of:
## Motivation

- **Forge**: Ethereum testing framework (like Truffle, Hardhat and DappTools).
- **Cast**: Swiss army knife for interacting with EVM smart contracts, sending transactions and getting chain data.
- **Anvil**: Local Ethereum node, akin to Ganache, Hardhat Network.
- **Chisel**: Fast, utilitarian, and verbose solidity REPL.
In Maker's linguo, a spell is a bespoke smart contract to execute authorized actions in Maker Protocol on behalf on
Maker Governance.

## Documentation
Since most contracts in Maker Protocol follow a [simple, battle-tested authorization scheme][auth], with an "all or
nothing" approach, it means that every spell has _root_ access to every single of its components.

https://book.getfoundry.sh/
[auth]: https://github.com/makerdao/pe-checklists/blob/master/core/standards.md#permissions

## Usage
In order to mitigate the risks associated with that design decision, the spell process is quite "heavy", where
multiple trusted parties are involved, and with comprehensive [checklists][spell-checklists] that must be strictly
followed.

### Build
[spell-checklists]: https://github.com/makerdao/pe-checklists/tree/master/spell

```shell
$ forge build
```
With all the complexity and coordination effort, it is not a surprise that it takes a long time for a spell to be
successfully crafted, reviewed and handed over to Maker Governance. As per the [current process][spell-schedule], with
the involvement of at least 3 engineers from the different EAs in the Spell Team, not to mention the Governance
Facilitators and other key stakeholders, it takes at least 8 working days to deliver a regular spell.

### Test
[spell-schedule]: https://github.com/makerdao/pe-checklists/blob/master/spell/spell-crafter-mainnet-workflow.md#spell-coordination-schedule

```shell
$ forge test
```
For emergency spells, historically the agreed SLA had been 24h. That was somehow possible when there was a single
tight-knit team working on spells, however can be specially challenging with a more decentralized workforce, which is
scattered around the world. Even if it was still possible to meet that SLA, in some situations 24h might be too much
time.

### Format
This repository contains a couple of different spells performing emergency actions that can be pre-deployed to allow
MakerDAO Governance to quickly respond to incidents, without the need for dedicated engineers to chime in and craft a
bespoke spell in record time.

```shell
$ forge fmt
```
## Deployments

### Gas Snapshots
TBD.

```shell
$ forge snapshot
```
## Implemented Actions

### Anvil
| Description | Single ilk | Multi ilk |
| :---------- | :--------: | :-------: |
| Wipe `AutoLine` | :white_check_mark: | :white_check_mark: |
| Set `Clip` breaker | :white_check_mark: | :white_check_mark: |
| Disable `DDM` | :white_check_mark: | :x: |
| Stop `OSM` | :white_check_mark: | :white_check_mark: |

```shell
$ anvil
```
### Wipe `AutoLine`

### Deploy
No further debt can be generated from an ilk which is wiped from `MCD_IAM_AUTO_LINE`. It also prevents the debt ceiling
(`line`) for the affected ilk from being changed without Governance interference.

```shell
$ forge script script/Counter.s.sol:CounterScript --rpc-url <your_rpc_url> --private-key <your_private_key>
```
### Set `Clip` breaker

### Cast
Halts collateral auctions happening in the `MCD_CLIP_{ILK}` contract belonging to the specified ilks. Sets the breaker level to 3
to prevent both `kick()`, `redo()` and `take()`.

```shell
$ cast <subcommand>
```
### Disable `DDM`

### Help
Disables a Direct Deposit Module (`DIRECT_{ID}_PLAN`), preventing further debt from being generated from it.

```shell
$ forge --help
$ anvil --help
$ cast --help
```
### Stop `OSM`

Stops the specified Oracle Security Module (`PIP_{GEM}`) instances, preventing updates in their price feeds.

## Design

### Overview

Emergency spells are meant to be as ABI-compatible with regular spells as possible, to allow Governance to reuse any
existing tools, which will not increase the cognitive burden in an emergency situation.

Previous bespoke emergency spells ([see example][example-emergency-spell]) would perform an open-heart surgery in the
standard [`DssExec`][dss-exec] contract and include the emergency actions calls in the `schedule` function. This allows
any contract using the `Mom` architecture ([see example][example-mom]) to bypass the GSM delay.

The same restrictions to regulars spells still apply (i.e.: storage variables are not allowed).

The emergency spells in this repository build on that idea with a few modifications:

1. No [`DssExecLib`][dss-exec-lib] dependency: emergency actions are quite simple by nature, which makes the use of
`DssExecLib` superfluous.
1. No expiration time: contrary to regular spells, which are meant to be cast only once, emergency spells can be reused
if needed, so the expiration time is set so far away in the future that in practice the spell does not expire.
1. No separate [`DssAction`][dss-action]-like contract: regular spells delegate the execution of specific actions to a
`DssAction` contract that is deployed by the spell in its constructor. The exact reason for that design choice is
unknown to the authors, however we can speculate that the way the spell `tag` is derived<sup>[\[1\]](#fn-1)</sup>
requires a separate contract.
1. Casting is a no-op: while bespoke emergency spells would often conflate emergency actions with non-emergency ones,
pre-deployed emergency spells perform only emergency actions, turning `cast()` into a no-op, which exists only for
interface-compatibility purposes.
1. No `MCD_PAUSE` interaction: as its name might suggest, the main goal of `MCD_PAUSE` is to introduce a _pause_ (GSM
delay) between the approval of a spell and its execution. Emergency spells by definition bypass the GSM delay, so
there is no strong reason to `plan` them in `MCD_PAUSE` as regular spells.
Comment on lines +94 to +96

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this code review, I wonder how current "whitelisting"-based monitoring implemented by techops works and if it detects votes on a random non-whitelisted contract before it's been too late / spell is already scheduled or if they even depend on the event emitted by plan to detect that something is off

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me raise that question to them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From Techops:

We're "whitelisting" the spells in advance before they are voted through and get the hat. Otherwise any spell that is not whitelisted and that gets the hat, will trigger a P1 alert saying that aunothorized spell has the hat. Where we'll have to react by involving stakeholders and doing necessary actions required for blocking it from being cast.
Idea is that if that happens and malicious spell is scheduled, we can quickly get all the people online and put "concurrent" spell that would be executed before the malicious one and will cancel it. Or start marketing and informational campaign notifying the users about it. Or preparing some press-release and informing about our next actions. So whitelist is mainly to not any unauthorized spells be executed silently and notifying us about it.


[example-emergency-spell]: https://github.com/makerdao/spells-mainnet/blob/8b0e1c354a0add49f595eea01ca3a822e782ab0d/archive/2022-06-15-DssSpell/DssSpell.sol
[dss-exec]: https://github.com/makerdao/dss-exec-lib/blob/69b658f35d8618272cd139dfc18c5713caf6b96b/src/DssExec.sol
[dss-exec-lib]: https://github.com/makerdao/dss-exec-lib/blob/69b658f35d8618272cd139dfc18c5713caf6b96b/src/DssExecLib.sol
[dss-action]: https://github.com/makerdao/dss-exec-lib/blob/69b658f35d8618272cd139dfc18c5713caf6b96b/src/DssAction.sol
[example-mom]: https://etherscan.io/address/0x9c257e5Aaf73d964aEBc2140CA38078988fB0C10

<sub id="fn-1"><sup>\[1\]</sup> `tag` is meant to be immutable and [extract the `codehash` of the `action`
contract][spell-tag]. Notice that it would not be possible to get the `codehash` of the same contract in its
constructor.</sub>

[spell-tag]: https://github.com/makerdao/dss-exec-lib/blob/69b658f35d8618272cd139dfc18c5713caf6b96b/src/DssExec.sol#L75

Some types of emergency spells may come in 2 flavors:

1. Single ilk: applies the desired spell action for a single pre-defined ilk.
1. Multi ilk: applies the desired spell action for all applicable ilks.

Furthermore, this repo provides on-chain factories for single ilk emergency spells to make it easier to deploy for new
ilks.

### About the `done()` function

Conforming spells have a [`done`][spell-done] public storage variable which is `false` when the spell is deployed and
set to `true` when the spell is cast. This ensures a spell cannot be cast twice.

An emergency spell is not meant to be cast, but it can be scheduled multiple times. So instead of having `done` as a
storage variable, it becomes a getter function that will return:
- `false`: if the emergency spell can be scheduled in the current state, given it is lifted to the hat.
- `true`: if the desired effects of the spell can be verified or if there is anything that would prevent the spell from
being scheduled (i.e.: bad system config)

Generally speaking, `done` should almost always return `false` for any emergency spell. If it returns `true` it means it
has just been scheduled or there is most likely something wrong with the modules touched by it. The exception is the
case where the system naturally achieves the same final state as the spell being scheduled, in which it would be also
returned `true`.

In other words, if `done() == true`, it means that the actions performed by the spell are not applicable.

[spell-done]: https://github.com/makerdao/dss-exec-lib/blob/69b658f35d8618272cd139dfc18c5713caf6b96b/src/DssExec.sol#L43
135 changes: 135 additions & 0 deletions src/DssEmergencySpell.sol
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
// SPDX-FileCopyrightText: © 2024 Dai Foundation <www.daifoundation.org>
// SPDX-License-Identifier: AGPL-3.0-or-later
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU Affero General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Affero General Public License for more details.
//
// You should have received a copy of the GNU Affero General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.
pragma solidity ^0.8.16;
oddaf marked this conversation as resolved.
Show resolved Hide resolved

interface ChainlogLike {
function getAddress(bytes32 key) external view returns (address);
}

interface DssExec {
function action() external view returns (address);
function cast() external;
function description() external view returns (string memory);
function done() external view returns (bool);
function eta() external view returns (uint256);
function expiration() external view returns (uint256);
function log() external view returns (address);
function nextCastTime() external view returns (uint256);
function officeHours() external view returns (bool);
function pause() external view returns (address);
function schedule() external;
function sig() external view returns (bytes memory);
function tag() external view returns (bytes32);
}

interface DssAction {
function actions() external;
function description() external view returns (string memory);
function execute() external;
function nextCastTime(uint256 eta) external view returns (uint256);
function officeHours() external view returns (bool);
}

interface DssEmergencySpellLike is DssExec, DssAction {
function description() external view override(DssExec, DssAction) returns (string memory);
function officeHours() external view override(DssExec, DssAction) returns (bool);
}

abstract contract DssEmergencySpell is DssEmergencySpellLike {
/// @dev The chainlog contract reference.
ChainlogLike internal constant _log = ChainlogLike(0xdA0Ab1e0017DEbCd72Be8599041a2aa3bA7e740F);

// @dev The reference to the `pause` contract.
address public immutable pause = _log.getAddress("MCD_PAUSE");
/// @dev The chainlog address.
address public constant log = address(_log);
/// @dev In regular spells, `eta` is used to enforce the GSM delay.
/// For emergency spells, the GSM delay is not applicable.
uint256 public constant eta = 0;
/// @dev Keeping the same value as regular spells.
bytes public constant sig = abi.encodeWithSelector(DssAction.execute.selector);
/// @dev Emergency spells should not expire.
uint256 public constant expiration = type(uint256).max;
// @dev An emergency spell does not need to be cast, as all actions happen during the schedule phase.
// Notice that cast is usually not supposed to revert, so it is implemented as a no-op.
uint256 internal immutable _nextCastTime = type(uint256).max;
// @dev Office Hours is always `false` for emergency spells.
bool public constant officeHours = false;
// @dev `action` is expected to return a valid address.
// We also implement the `DssAction` interface in this contract.
address public immutable action = address(this);

/**
* @dev In regular spells, `tag` is an immutable variable with the code hash of the spell action.
* It specifically uses a separate contract for spell action because `tag` is immutable and the code hash of
* the contract being initialized is not accessible in the constructor.
* Since we do not have a separate contract for actions in Emergency Spells, `tag` has to be turned into a
* getter function instead of an immutable variable.
* @return The contract codehash.
*/
function tag() external view returns (bytes32) {
return address(this).codehash;
}

/**
* @notice Triggers the emergency actions of the spell.
* @dev Emergency spells are triggered when scheduled.
* This function maintains the name for compatibility with regular spells, however nothing is actually being
* scheduled. Emergency spells take effect immediately, so there is no need to call `pause.plot()`.
*/
function schedule() external {
_emergencyActions();
}
Comment on lines +93 to +95

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the chief-keeper block-scheme, I understand that it will try to call spell.schedule() in the loop, until "it is scheduled". On the technical level, "scheduled" is checked via spell.eta != 0. In case we don't modify eta, the keeper will call emergency spell in the loop in every single block and the whole interface "compliancy" wouldn't make much sense.

To solve this loop, we can either:

  • A. Always use factories and require deployment, even for multi-variations. Have fully compliment eta logic, expiration and one-time usage logic. I hardly can see when emergency spell with the same hat will need to be executed more than once
  • B. Set eta inside spell.schedule to non zero
    • Additional complexity: Setting eta will prevent spell from being directly reusable. We can potentially circumvent this by modifying it back to 0 either via cast or manually/permissionlessly via some other, e.g. reset() method
  • C. Modify keeper logic to determine that spell was already scheduled
    • Additional complexity: there no cross-compatible event or state variable that can be used for checking if "a spell" was scheduled. For example, if plot would've been called via emergency spell, keeper can check pause.plans[hash] != true instead of spell.eta != 0

All in all, I would recommend to make some changes to the base contract to prevent keeper loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at that line we have:
if spell.done() == False

Notice that all emergency spells have meaningful done() implementations, meaning that right after schedule() is called, it should return true, which would prevent the loop in the keeper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hardly can see when emergency spell with the same hat will need to be executed more than once

  • Governance might be required to disable a specific DDM more than once during its lifespan
  • Oracle malfunctioning might happen more than once during its lifespan
  • Temporarily disabling new collateral intake through auto-line might happen more than once for the same ilk

Copy link
Contributor Author

@amusingaxl amusingaxl Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A. Always use factories and require deployment, even for multi-variations. Have fully compliment eta logic, expiration and one-time usage logic. I hardly can see when emergency spell with the same hat will need to be executed more than once

The idea of having pre-deployed spells is to not require any non-engineer to ever provide inputs manually. Even in the cases where there are factories, they would be used by engineers to deploy the contracts and hand over only the addresses to other stakeholders.


/**
* @notice Implements the emergency actions to be triggered by the spell.
*/
function _emergencyActions() internal virtual;

/**
* @notice Returns `_nextCastTime`.
* @dev This function exists only to keep interface compatibility with regular spells.
*/
function nextCastTime() external view returns (uint256 castTime) {
return _nextCastTime;
}
Comment on lines +106 to +108

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: if we don't need nor expect cast to be triggered, would it make more sense to return type(uint256).max? It shouldn't make a difference in practice, since it will never be planned into pause contract (unless we change this logic based on the previous comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't take a deep dive into the chief keeper, just looked at the most important aspects of it.
If any cleanup logic depends on the spell cast() being called, we don't want to clean it.
However, looking again, since cast() is permissionless, it's not guaranteed that the keeper is the one to call it every time, so it would make little sense to have such dependency, so your suggestion makes sense.


/**
* @notice No-op.
* @dev This function exists only to keep interface compatibility with regular spells.
*/
function cast() external {}

/**
* @notice No-op.
* @dev This function exists only to keep interface compatibility with regular spells.
*/
function execute() external {}

/**
* @notice No-op.
* @dev This function exists only to keep interface compatibility with regular spells.
*/
function actions() external {}

/**
* @notice Returns `nextCastTime`, regardless of the input parameter.
* @dev This function exists only to keep interface compatibility with regular spells.
*/
function nextCastTime(uint256) external view returns (uint256 castTime) {
return _nextCastTime;
}
}
Loading