Distributed Arithmetic strategy for Dense, Conv1/2D, and EinsumDense #1191

calad0i · 2025-02-11T02:46:21Z

Description

This PR introduces a new strategy, distributed_arithmetic, for:

Dense (io parallel / stream)
Conv1/2D (io parallel / stream)
EinsumDense (io parallel)
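As a rough illustration of how the new strategy might be selected, the sketch below builds an hls4ml-style configuration dictionary by hand. The strategy name comes from this PR. The surrounding key layout (`Model`, `Precision`, `ReuseFactor`, `Strategy`) and the values shown are assumptions modelled on the usual hls4ml configuration format, not something this PR description confirms.

```python
# Hand-written, hls4ml-style configuration (assumed layout, for illustration only).
config = {
    "Model": {
        "Precision": "fixed<16,6>",            # assumed default precision string
        "ReuseFactor": 1,                       # assumed value
        "Strategy": "distributed_arithmetic",   # strategy name introduced by this PR
    }
}

print(config["Model"]["Strategy"])
```

Whether the strategy can also be set per layer is not stated here.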

With this strategy, all matmul-like operations in these layers are decomposed into optimized adder trees. The heavy lifting is offloaded to da4ml, where everything is JIT-compiled with numba. There, the constant matrix-vector multiplication (CMVM) problem is optimized with greedy common subexpression elimination. LUT consumption frequently drops by over 30% when WRAP is used as the overflow mode, and latency improves as well. DSP consumption is almost always 0 with this strategy.
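To make the idea concrete, here is a toy sketch of the general technique, not da4ml's actual algorithm or API. It expands a small non-negative integer matrix into shift-and-add terms, then greedily factors out the pair of terms shared by the most outputs. All function names are hypothetical. Unlike a real optimizer, this toy ignores signed digits and does not match pairs that differ only by a common shift.

```python
from collections import Counter
from itertools import combinations


def shift_add_terms(W):
    """Expand each row of a non-negative integer matrix into (symbol, shift) terms."""
    rows = []
    for row in W:
        terms = set()
        for i, w in enumerate(row):
            k = 0
            while w:
                if w & 1:  # bit k of the weight is set: contributes x_i << k
                    terms.add((f"x{i}", k))
                w >>= 1
                k += 1
        rows.append(terms)
    return rows


def greedy_cse(rows):
    """Repeatedly replace the most frequently shared pair of terms with a new symbol."""
    rows = [set(r) for r in rows]
    defs = {}  # new symbol -> (term_a, term_b), kept in creation order
    while True:
        counts = Counter(p for r in rows for p in combinations(sorted(r), 2))
        if not counts:
            break
        pair, n = counts.most_common(1)[0]
        if n < 2:  # no pair is shared by two or more outputs
            break
        name = f"t{len(defs)}"
        defs[name] = pair
        for r in rows:
            if pair[0] in r and pair[1] in r:
                r -= set(pair)
                r.add((name, 0))
    return rows, defs


def evaluate(rows, defs, x):
    """Evaluate the adder graph using only shifts and additions."""
    values = {f"x{i}": v for i, v in enumerate(x)}
    for name, (a, b) in defs.items():  # operands are always defined earlier
        values[name] = (values[a[0]] << a[1]) + (values[b[0]] << b[1])
    return [sum(values[s] << k for s, k in r) for r in rows]


def adder_count(rows, defs):
    """Two-input adders: one per intermediate, plus len - 1 per output sum."""
    return sum(max(len(r) - 1, 0) for r in rows) + len(defs)


W = [[5, 3], [5, 6], [7, 3]]
x = [3, -2]
plain = shift_add_terms(W)
shared, defs = greedy_cse(plain)
print(evaluate(shared, defs, x), adder_count(plain, {}), adder_count(shared, defs))
```

No multiplier appears anywhere in the evaluated graph, which is consistent with the near-zero DSP usage described above. The adder savings come entirely from reusing shared partial sums.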

This PR depends on the s-quark-pr and includes all changes made there (QEinsumDense is not available otherwise).

Type of change

New feature (non-breaking change which adds functionality)

Tests

Tests added to test_hgq_layers.py and test_einsum_dense.py. The EinsumDense test will NOT be triggered in the current configuration because it depends on Keras v3.

Checklist

No docs for now.

calad0i added 30 commits February 6, 2025 19:20

Add hint on import failure

de9ae54

import converter dependencies lazily

dd01ff7

keras v3 object based parser

c57417a

sequential and i/o tensor name parsing fix

8359d3e

support activation layers

4a98e42

consistent v2 weight reader behavior

1799bfa

add v3 conv handlers

911a726

add test

93f482a

pre-commit fix

7c06087

revert keras v2 converter

d4487c6

make reshape handler compatiable with keras v3

9d6d02d

general einsum support for io_parallel and latency

a53b57c

add tests for einsumdense

d034521

keras v3 converter clean-up

4be36b4

add symbolic quantized interval

282bdc7

preliminary bit-exact precision derivation opt pass

8b66a30

squark layer support start

b369b05

fix einsum_dense precision computation

9ce7306

add leftover

374b5a3

qdense fix

2dcd001

support batch_norm

cdecefb

support merge layers

908f14a

support bit-exact q_einsum and fix precision trace for multi inp layers

178a02f

add einsum test

a851865

declare all softmax attrs in layer class

d8fc9eb

fix lazy import in handler

d0cf465

cleanup einsum handler

2b51ec0

cleanup einsum handler

0966587

more granular control over softmax for vivado

29c82a3

properly propagate inv/exp_table_size

4af1529

calad0i added 30 commits February 6, 2025 19:21

switch to model opt

a557e9c

pooling layers

7b76371

handle stray inputs

935a8d6

fix pooling layer accum_t

6389175

bit-exact concatenate

09beb48

add comments

8f08fa6

skip non-bit-exact compatiable softmax in bit-exact pass

677d65a

fix activation matching, fix bw edge case handle

33132a9

fixes in bw inference

d839571

warn double quantizer

49616bf

fix softmax bw and table_size in non-bit-exact case

c007490

fix dependency

3284338

formatting after rebase

dda3f5b

add kif cache

a65bf6b

fix edge cases, allow fixedquantizer bw shrink when possible

a558b49

fix global pooling bw inference

f9e22d5

rm linear layer if FixedQuantizer presents

294a35b

add parallelization factor support for squaark QDense

30b1c3c

warn at bw inference overflow, instead of crash

2e15ac2

minor fix on weight type var detection

95530d8

preliminary da for io_parallel, conv/dense

d078a01

fix conv n_pixel inference, use result_t bw

8efe258

da support for dense/conv, io_parallel/serial

2d7c4de

fix dense in io_stream

be2d489

use element-wise bw data and drop hgq quantizer when possible

c0f7481

fix syntax err in fused fixed_point_quantizer

DA for EinsumDense

935b667

add tests

5a3dd80

update jenkins

129c7ac

hg quantizer fuse cond fix

15d95ec

handle _cpy\d+ names in get_input_layers

1f922f1
