
Initial values for the hidden/cell state for LSTM and GRU models in Pytorch #1120

Open · wants to merge 25 commits into main
Conversation

@JanFSchulte (Contributor) commented Nov 11, 2024

This PR addresses #1074 and implements passing initial values for the hidden and cell states to GRU and LSTM models, which PyTorch supports. This first version implements this only for the PyTorch parser, but it should be possible to extend it to Keras and other parsers.

I have tested this for Vivado, Vitis, and Quartus. Thanks to Jovan, this is also implemented for oneAPI. Nothing has been done for Catapult yet.

Note that this currently only works in io_parallel. In io_stream I ran into conceptual issues and was unsure whether to treat these initial states as streamed inputs. This might be good enough for now; I can revisit io_stream if there are suggestions on how to tackle it.
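To illustrate what the PR exposes, here is a minimal pure-Python sketch of an LSTM recurrence where the initial hidden/cell states are optional arguments. This is not the hls4ml or PyTorch implementation; the function names, scalar state, and toy weights are all hypothetical, chosen only to show that nonzero `h0`/`c0` change the result relative to the default zero initialization.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, w_ih, w_hh, b):
    # One LSTM time step for scalar input/state; gate order: i, f, g, o.
    pre = [w_ih[k] * x + w_hh[k] * h_prev + b[k] for k in range(4)]
    i = sigmoid(pre[0])       # input gate
    f = sigmoid(pre[1])       # forget gate
    g = math.tanh(pre[2])     # candidate cell update
    o = sigmoid(pre[3])       # output gate
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def run_lstm(xs, h0=0.0, c0=0.0,
             w_ih=(0.5, 0.5, 0.5, 0.5),
             w_hh=(0.1, 0.1, 0.1, 0.1),
             b=(0.0, 0.0, 0.0, 0.0)):
    # h0/c0 are the optional initial states; defaults reproduce the
    # usual zero-initialized behavior.
    h, c = h0, c0
    for x in xs:
        h, c = lstm_cell_step(x, h, c, w_ih, w_hh, b)
    return h, c
```

With the defaults this matches zero-initialized states; passing nonzero `h0`/`c0` yields a different trajectory, which is the behavior being made configurable here.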

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change which adds functionality)

Tests

Tested in standalone scripts and in the pytests to ensure that model parsing and evaluation work both with and without passing these optional tensors.

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

@JanFSchulte JanFSchulte added the please test Trigger testing by creating local PR branch label Nov 11, 2024
@JanFSchulte JanFSchulte added this to the v1.1.0 milestone Jan 8, 2025
const typename CONFIG_T::weight_t RWO[CONFIG_T::n_out * CONFIG_T::n_out],
const typename CONFIG_T::bias_t BI[CONFIG_T::n_out], const typename CONFIG_T::bias_t BF[CONFIG_T::n_out],
const typename CONFIG_T::bias_t BC[CONFIG_T::n_out], const typename CONFIG_T::bias_t BO[CONFIG_T::n_out]) {
res_T hidden_state[CONFIG_T::n_out][CONFIG_T::n_timesteps + 1] hls_register;
Contributor
Shouldn't hidden_state be of type data2_T and cell_state of type data3_T?

Contributor Author

Indeed, fixed.

@jmitrevs (Contributor) commented Feb 7, 2025

We need to fix the datatypes for oneAPI before we merge

@JanFSchulte (Contributor Author)

pre-commit.ci autofix

@@ -47,4 +47,6 @@ def parse_rnn_layer(keras_layer, input_names, input_shapes, data_reader):
if layer['return_state']:
raise Exception('"return_state" of {} layer is not yet supported.'.format(keras_layer['class_name']))

layer['pass_initial_states'] = False
Contributor

Not quite, right? We find this out from the number of inputs. Also, it would be good to add this as an expected attribute to the RNN layers in the IR.
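A minimal sketch of the detection this comment suggests, assuming the parser already has the list of input tensor names (the function name and dict-based `layer` are hypothetical stand-ins for the parser's actual variables):

```python
def set_pass_initial_states(layer, input_names):
    # More than one input tensor means the model was traced with explicit
    # initial hidden/cell states; a single input means zero-initialized states.
    layer['pass_initial_states'] = len(input_names) > 1
    return layer
```

This would let `pass_initial_states` be derived from the parsed inputs rather than hardcoded to False.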

input_shapes = [output_shapes[str(node.args[0])]]
input_shapes = []
input_names = []
for i in node.args:
Contributor
Can we have more descriptive names for the variables i and y?
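One possible shape for the renamed loop, as a sketch only; `collect_rnn_inputs` is a hypothetical helper, and the real code works on torch.fx nodes, where each positional `node.args` entry references a producing node:

```python
def collect_rnn_inputs(node, output_shapes):
    # 'arg' rather than 'i': each positional argument of the traced node
    # is one input tensor (the data, then the optional h0/c0 states).
    input_names = []
    input_shapes = []
    for arg in node.args:
        input_names.append(str(arg))
        input_shapes.append(output_shapes[str(arg)])
    return input_names, input_shapes
```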

// SimpleRNN with pytorch biases
//----------------------

struct simpleRNN_pytorch_config {
Contributor
A minor annoyance is that this breaks the naming convention; I was expecting simple_rnn_...

Contributor

Actually, this is never used (and similarly not for Quartus). It seems that simpleRNN_config is used for all cases. Of course, that one also violates the naming convention.

Contributor

My suggestion is that we remove simpleRNN_pytorch_config and rename simpleRNN_config, though we could keep the two if we really wanted to. That would complicate the templates a touch for no real gain. (Truthfully, I was never much of a fan of the base config templates that get overridden.)

Contributor

JanFSchulte#9 takes the simple approach: it renames simpleRNN_config to simple_rnn_config and removes simpleRNN_pytorch_config altogether, if you agree with that approach.

@@ -235,6 +234,41 @@ void lstm_stack(data_T data[CONFIG_T::n_sequence * CONFIG_T::n_in], res_T res[CO
}
}

template <class data_T, class data2_T, class data3_T, class res_T, typename CONFIG_T>
Contributor

Any more descriptive names than data2_T and data3_T?

Contributor

I tried to put more descriptive names in the oneAPI version. Maybe we can go more in that direction for the other backends (and potentially standardize).

@vloncar (Contributor) left a comment

Looks good; I left some comments on cosmetics.

Labels: please test (Trigger testing by creating local PR branch)
3 participants