Add FAST #35476

Open
wants to merge 103 commits into
base: main

10cd7a8
WIP
raghavanone Oct 5, 2023
77b58fd
Add config and modeling for Fast model
raghavanone Oct 7, 2023
ed1f4e0
Refactor modeling and add tests
raghavanone Oct 8, 2023
f53e255
More changes
raghavanone Oct 11, 2023
d1511dd
WIP
raghavanone Oct 13, 2023
f07268a
Add tests
raghavanone Oct 14, 2023
1e90463
Add conversion script
raghavanone Oct 15, 2023
087e6cd
Add conversion scripts, integration tests, image processor
raghavanone Oct 20, 2023
b1a323e
Fix style and copies
raghavanone Oct 31, 2023
67126d2
Add fast model to init
raghavanone Oct 31, 2023
39d1442
Add fast model in docs and other places
raghavanone Oct 31, 2023
cf539f0
Fix import of cv2
raghavanone Oct 31, 2023
110299d
Rename image processing method
raghavanone Oct 31, 2023
5aed765
Fix build
raghavanone Oct 31, 2023
3d28bb4
Fix Build
raghavanone Nov 1, 2023
8fb4df3
fix style and fix copies
raghavanone Nov 1, 2023
53ac622
Fix build
raghavanone Nov 1, 2023
0bbd05c
Fix build
raghavanone Nov 1, 2023
acd68e6
Fix Build
raghavanone Nov 1, 2023
403b388
Clean up docstrings
raghavanone Nov 1, 2023
b59b1e3
Fix Build
raghavanone Nov 1, 2023
aca107b
Fix Build
raghavanone Nov 1, 2023
539b0c6
Fix Build
raghavanone Nov 1, 2023
51ec119
Fix build
raghavanone Nov 1, 2023
e8fc511
Add test for image_processing_fast and add documentation tests
raghavanone Nov 1, 2023
1b77ee6
some refactorings
raghavanone Nov 1, 2023
6726b21
Fix failing tests
raghavanone Nov 5, 2023
1418902
Incorporate PR feedbacks
raghavanone Nov 5, 2023
e605f47
Incorporate PR feedbacks
raghavanone Nov 5, 2023
eec235d
Incorporate PR feedbacks
raghavanone Nov 5, 2023
afe6c82
Incorporate PR feedbacks
raghavanone Nov 5, 2023
95ce24d
Incorporate PR feedbacks
raghavanone Nov 5, 2023
dc4091d
Introduce TextNet
raghavanone Nov 8, 2023
0a5196e
Fix failures
raghavanone Nov 8, 2023
0b64238
Refactor textnet model
raghavanone Nov 8, 2023
3db7c35
Fix failures
raghavanone Nov 8, 2023
643f983
Add cv2 to setup
raghavanone Nov 8, 2023
9751a65
Fix failures
raghavanone Nov 8, 2023
6d7ab00
Fix failures
raghavanone Nov 8, 2023
a959cc4
Add CV2 dependency
raghavanone Nov 8, 2023
c357af0
Fix bugs
raghavanone Nov 8, 2023
ec90f5f
Fix build issue
raghavanone Nov 8, 2023
6d572da
Fix failures
raghavanone Nov 8, 2023
caba4e3
Remove textnet from modeling fast
raghavanone Nov 9, 2023
dc03eec
Fix readme
raghavanone Nov 9, 2023
192d40a
Fix build and other things
raghavanone Nov 9, 2023
c2298a5
Fix build
raghavanone Nov 9, 2023
b1f7562
some cleanups
raghavanone Nov 9, 2023
21f300f
some cleanups
raghavanone Nov 9, 2023
95b5241
Some more cleanups
raghavanone Nov 9, 2023
a0fd02a
Fix build
raghavanone Nov 9, 2023
e973f68
Incorporate PR feedbacks
raghavanone Nov 9, 2023
1f50a96
More cleanup
raghavanone Nov 9, 2023
e8036cf
More cleanup
raghavanone Nov 9, 2023
7bc0864
More cleanup
raghavanone Nov 9, 2023
59f890b
Fix build
raghavanone Nov 9, 2023
ea3444f
clean the branch a bit
jadechoghari Jan 1, 2025
7c00fc9
fix conflicts
jadechoghari Jan 1, 2025
ba0a2b1
fix processor + convert file
jadechoghari Jan 6, 2025
eb85fd4
remove textnet
jadechoghari Jan 11, 2025
127b2a6
Merge branch 'main' into add-fast
jadechoghari Jan 11, 2025
e62a18e
remove textnet since its merged
jadechoghari Jan 11, 2025
6f44d4c
fix convert file
jadechoghari Jan 11, 2025
7192f53
fix processor and conversion script
jadechoghari Jan 22, 2025
872300b
remove complex fuse conv logic
jadechoghari Feb 2, 2025
efed17a
add changes
jadechoghari Feb 3, 2025
9b98370
add changes
jadechoghari Feb 5, 2025
d65b61c
fix processor
jadechoghari Feb 13, 2025
90fb8ba
add dummy testing
jadechoghari Feb 13, 2025
207d875
fix batch testing
jadechoghari Feb 13, 2025
b9e382d
fix convert testing
jadechoghari Feb 13, 2025
db4496a
add correct image url
jadechoghari Feb 13, 2025
1f7f654
fix convert
jadechoghari Feb 13, 2025
7090cc1
add other convert fixes
jadechoghari Feb 13, 2025
1ddc69c
Merge branch 'main' into add-fast
jadechoghari Feb 13, 2025
654c83f
add stash:
jadechoghari Feb 13, 2025
1d94315
remove done TODOs
jadechoghari Feb 13, 2025
ad5e00c
fixup
jadechoghari Feb 14, 2025
1fcbb3a
add new
jadechoghari Feb 14, 2025
23d326b
add changes
jadechoghari Feb 14, 2025
91c9236
Merge branch 'main' into add-fast
jadechoghari Feb 14, 2025
c1a2f65
remove extra readme files
jadechoghari Feb 14, 2025
0c0b16f
add style
jadechoghari Feb 14, 2025
0ab1b4c
iterate on review
jadechoghari Feb 19, 2025
c27cf6e
add new init
jadechoghari Feb 20, 2025
a7eba4e
add year
jadechoghari Feb 20, 2025
7b755c5
remove config
jadechoghari Feb 20, 2025
c071f03
add custom fast loss
jadechoghari Feb 21, 2025
ec4fa2d
update copyright
jadechoghari Feb 21, 2025
6b1e832
refactor the loss
jadechoghari Feb 21, 2025
bf166ed
add template modular
jadechoghari Feb 21, 2025
faceed5
add modular for fast image processor
jadechoghari Feb 23, 2025
6dfbb34
remove archive config thign:
jadechoghari Feb 23, 2025
b216a49
fix imports
jadechoghari Mar 1, 2025
7cfb382
fix modular
jadechoghari Mar 1, 2025
7e7b759
remove not needed files
jadechoghari Mar 1, 2025
b8a7cc8
more fixes
jadechoghari Mar 1, 2025
4a36dee
fix loss test:
jadechoghari Mar 1, 2025
218b120
fix failing tests
jadechoghari Mar 1, 2025
7681a56
add scipy instead of cv2:
jadechoghari Mar 1, 2025
afacea3
improve convert
jadechoghari Mar 1, 2025
73a57bd
more changes
jadechoghari Mar 1, 2025
9a64c89
go back to cv2
jadechoghari Mar 2, 2025
Refactor textnet model
raghavanone committed Nov 9, 2023
commit 0b64238e29140628681c052ac0258f06bd89c06e
152 changes: 76 additions & 76 deletions src/transformers/models/textnet/configuration_textnet.py
@@ -34,42 +34,42 @@ class TextNetConfig(BackboneConfigMixin, PretrainedConfig):

     def __init__(
         self,
-        backbone_kernel_size=3,
-        backbone_stride=2,
-        backbone_dilation=1,
-        backbone_groups=1,
-        backbone_bias=False,
-        backbone_has_shuffle=False,
-        backbone_in_channels=3,
-        backbone_out_channels=64,
-        backbone_use_bn=True,
-        backbone_act_func="relu",
-        backbone_dropout_rate=0,
-        backbone_ops_order="weight_bn_act",
-        backbone_stage1_in_channels=[64, 64, 64],
-        backbone_stage1_out_channels=[64, 64, 64],
-        backbone_stage1_kernel_size=[[3, 3], [3, 3], [3, 3]],
-        backbone_stage1_stride=[1, 2, 1],
-        backbone_stage1_dilation=[1, 1, 1],
-        backbone_stage1_groups=[1, 1, 1],
-        backbone_stage2_in_channels=[64, 128, 128, 128],
-        backbone_stage2_out_channels=[128, 128, 128, 128],
-        backbone_stage2_kernel_size=[[3, 3], [1, 3], [3, 3], [3, 1]],
-        backbone_stage2_stride=[2, 1, 1, 1],
-        backbone_stage2_dilation=[1, 1, 1, 1],
-        backbone_stage2_groups=[1, 1, 1, 1],
-        backbone_stage3_in_channels=[128, 256, 256, 256],
-        backbone_stage3_out_channels=[256, 256, 256, 256],
-        backbone_stage3_kernel_size=[[3, 3], [3, 3], [3, 1], [1, 3]],
-        backbone_stage3_stride=[2, 1, 1, 1],
-        backbone_stage3_dilation=[1, 1, 1, 1],
-        backbone_stage3_groups=[1, 1, 1, 1],
-        backbone_stage4_in_channels=[256, 512, 512, 512],
-        backbone_stage4_out_channels=[512, 512, 512, 512],
-        backbone_stage4_kernel_size=[[3, 3], [3, 1], [1, 3], [3, 3]],
-        backbone_stage4_stride=[2, 1, 1, 1],
-        backbone_stage4_dilation=[1, 1, 1, 1],
-        backbone_stage4_groups=[1, 1, 1, 1],
+        kernel_size=3,
+        stride=2,
+        dilation=1,
+        groups=1,
+        bias=False,
+        has_shuffle=False,
+        in_channels=3,
+        out_channels=64,
+        use_bn=True,
+        act_func="relu",
+        dropout_rate=0,
+        ops_order="weight_bn_act",
+        stage1_in_channels=[64, 64, 64],
+        stage1_out_channels=[64, 64, 64],
+        stage1_kernel_size=[[3, 3], [3, 3], [3, 3]],
+        stage1_stride=[1, 2, 1],
+        stage1_dilation=[1, 1, 1],
+        stage1_groups=[1, 1, 1],
+        stage2_in_channels=[64, 128, 128, 128],
+        stage2_out_channels=[128, 128, 128, 128],
+        stage2_kernel_size=[[3, 3], [1, 3], [3, 3], [3, 1]],
+        stage2_stride=[2, 1, 1, 1],
+        stage2_dilation=[1, 1, 1, 1],
+        stage2_groups=[1, 1, 1, 1],
+        stage3_in_channels=[128, 256, 256, 256],
+        stage3_out_channels=[256, 256, 256, 256],
+        stage3_kernel_size=[[3, 3], [3, 3], [3, 1], [1, 3]],
+        stage3_stride=[2, 1, 1, 1],
+        stage3_dilation=[1, 1, 1, 1],
+        stage3_groups=[1, 1, 1, 1],
+        stage4_in_channels=[256, 512, 512, 512],
+        stage4_out_channels=[512, 512, 512, 512],
+        stage4_kernel_size=[[3, 3], [3, 1], [1, 3], [3, 3]],
+        stage4_stride=[2, 1, 1, 1],
+        stage4_dilation=[1, 1, 1, 1],
+        stage4_groups=[1, 1, 1, 1],
         hidden_sizes=[64, 64, 128, 256, 512],
         initializer_range=0.02,
         out_features=None,
@@ -78,55 +78,55 @@ def __init__(
     ):
         super().__init__(**kwargs)

-        self.backbone_kernel_size = backbone_kernel_size
-        self.backbone_stride = backbone_stride
-        self.backbone_dilation = backbone_dilation
-        self.backbone_groups = backbone_groups
-        self.backbone_bias = backbone_bias
-        self.backbone_has_shuffle = backbone_has_shuffle
-        self.backbone_in_channels = backbone_in_channels
-        self.backbone_out_channels = backbone_out_channels
-        self.backbone_use_bn = backbone_use_bn
-        self.backbone_act_func = backbone_act_func
-        self.backbone_dropout_rate = backbone_dropout_rate
-        self.backbone_ops_order = backbone_ops_order
+        self.kernel_size = kernel_size
+        self.stride = stride
+        self.dilation = dilation
+        self.groups = groups
+        self.bias = bias
+        self.has_shuffle = has_shuffle
+        self.in_channels = in_channels
+        self.out_channels = out_channels
+        self.use_bn = use_bn
+        self.act_func = act_func
+        self.dropout_rate = dropout_rate
+        self.ops_order = ops_order

-        self.backbone_stage1_in_channels = backbone_stage1_in_channels
-        self.backbone_stage1_out_channels = backbone_stage1_out_channels
-        self.backbone_stage1_kernel_size = backbone_stage1_kernel_size
-        self.backbone_stage1_stride = backbone_stage1_stride
-        self.backbone_stage1_dilation = backbone_stage1_dilation
-        self.backbone_stage1_groups = backbone_stage1_groups
+        self.stage1_in_channels = stage1_in_channels
+        self.stage1_out_channels = stage1_out_channels
+        self.stage1_kernel_size = stage1_kernel_size
+        self.stage1_stride = stage1_stride
+        self.stage1_dilation = stage1_dilation
+        self.stage1_groups = stage1_groups

-        self.backbone_stage2_in_channels = backbone_stage2_in_channels
-        self.backbone_stage2_out_channels = backbone_stage2_out_channels
-        self.backbone_stage2_kernel_size = backbone_stage2_kernel_size
-        self.backbone_stage2_stride = backbone_stage2_stride
-        self.backbone_stage2_dilation = backbone_stage2_dilation
-        self.backbone_stage2_groups = backbone_stage2_groups
+        self.stage2_in_channels = stage2_in_channels
+        self.stage2_out_channels = stage2_out_channels
+        self.stage2_kernel_size = stage2_kernel_size
+        self.stage2_stride = stage2_stride
+        self.stage2_dilation = stage2_dilation
+        self.stage2_groups = stage2_groups

-        self.backbone_stage3_in_channels = backbone_stage3_in_channels
-        self.backbone_stage3_out_channels = backbone_stage3_out_channels
-        self.backbone_stage3_kernel_size = backbone_stage3_kernel_size
-        self.backbone_stage3_stride = backbone_stage3_stride
-        self.backbone_stage3_dilation = backbone_stage3_dilation
-        self.backbone_stage3_groups = backbone_stage3_groups
+        self.stage3_in_channels = stage3_in_channels
+        self.stage3_out_channels = stage3_out_channels
+        self.stage3_kernel_size = stage3_kernel_size
+        self.stage3_stride = stage3_stride
+        self.stage3_dilation = stage3_dilation
+        self.stage3_groups = stage3_groups

-        self.backbone_stage4_in_channels = backbone_stage4_in_channels
-        self.backbone_stage4_out_channels = backbone_stage4_out_channels
-        self.backbone_stage4_kernel_size = backbone_stage4_kernel_size
-        self.backbone_stage4_stride = backbone_stage4_stride
-        self.backbone_stage4_dilation = backbone_stage4_dilation
-        self.backbone_stage4_groups = backbone_stage4_groups
+        self.stage4_in_channels = stage4_in_channels
+        self.stage4_out_channels = stage4_out_channels
+        self.stage4_kernel_size = stage4_kernel_size
+        self.stage4_stride = stage4_stride
+        self.stage4_dilation = stage4_dilation
+        self.stage4_groups = stage4_groups

         self.initializer_range = initializer_range
         self.hidden_sizes = hidden_sizes

         self.depths = [
-            len(self.backbone_stage1_out_channels),
-            len(self.backbone_stage2_out_channels),
-            len(self.backbone_stage3_out_channels),
-            len(self.backbone_stage4_out_channels),
+            len(self.stage1_out_channels),
+            len(self.stage2_out_channels),
+            len(self.stage3_out_channels),
+            len(self.stage4_out_channels),
         ]
         self.stage_names = ["stem"] + [f"stage{idx}" for idx in range(1, 5)]
         self._out_features, self._out_indices = get_aligned_output_features_output_indices(
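
Note on the rename in `configuration_textnet.py`: every constructor argument and attribute loses its `backbone_` prefix, so any config dictionary saved before this commit would no longer match the new argument names. The sketch below shows one way such a dictionary could be migrated. The helper `strip_backbone_prefix` is hypothetical and not part of this PR; it assumes that every key starting with `backbone_` is one of the renamed fields.

```python
# Hypothetical migration helper, not part of the PR: it maps pre-refactor
# TextNetConfig keys (with the "backbone_" prefix) to the post-refactor names.
PREFIX = "backbone_"


def strip_backbone_prefix(old_config: dict) -> dict:
    """Return a copy of old_config with a leading 'backbone_' removed from each key."""
    new_config = {}
    for key, value in old_config.items():
        new_key = key[len(PREFIX):] if key.startswith(PREFIX) else key
        if new_key in new_config:
            # Guard against an old and a new spelling of the same field colliding.
            raise ValueError(f"duplicate key after renaming: {new_key}")
        new_config[new_key] = value
    return new_config


# Values below are the defaults shown in the diff.
old = {
    "backbone_kernel_size": 3,
    "backbone_stage1_out_channels": [64, 64, 64],
    "hidden_sizes": [64, 64, 128, 256, 512],
}
new = strip_backbone_prefix(old)
print(new)
```

Keys that never carried the prefix, such as `hidden_sizes`, pass through unchanged.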
82 changes: 41 additions & 41 deletions src/transformers/models/textnet/modeling_textnet.py
@@ -363,63 +363,63 @@ class TextNetModel(TextNetPreTrainedModel):
     def __init__(self, config):
         super().__init__(config)
         self.first_conv = TextNetConvLayer(
-            config.backbone_in_channels,
-            config.backbone_out_channels,
-            config.backbone_kernel_size,
-            config.backbone_stride,
-            config.backbone_dilation,
-            config.backbone_groups,
-            config.backbone_bias,
-            config.backbone_has_shuffle,
-            config.backbone_use_bn,
-            config.backbone_act_func,
-            config.backbone_dropout_rate,
-            config.backbone_ops_order,
+            config.in_channels,
+            config.out_channels,
+            config.kernel_size,
+            config.stride,
+            config.dilation,
+            config.groups,
+            config.bias,
+            config.has_shuffle,
+            config.use_bn,
+            config.act_func,
+            config.dropout_rate,
+            config.ops_order,
         )
         stage1 = []
         for stage_config in zip(
-            config.backbone_stage1_in_channels,
-            config.backbone_stage1_out_channels,
-            config.backbone_stage1_kernel_size,
-            config.backbone_stage1_stride,
-            config.backbone_stage1_dilation,
-            config.backbone_stage1_groups,
+            config.stage1_in_channels,
+            config.stage1_out_channels,
+            config.stage1_kernel_size,
+            config.stage1_stride,
+            config.stage1_dilation,
+            config.stage1_groups,
         ):
             stage1.append(TestNetRepConvLayer(*stage_config))
         self.stage1 = nn.ModuleList(stage1)

         stage2 = []
         for stage_config in zip(
-            config.backbone_stage2_in_channels,
-            config.backbone_stage2_out_channels,
-            config.backbone_stage2_kernel_size,
-            config.backbone_stage2_stride,
-            config.backbone_stage2_dilation,
-            config.backbone_stage2_groups,
+            config.stage2_in_channels,
+            config.stage2_out_channels,
+            config.stage2_kernel_size,
+            config.stage2_stride,
+            config.stage2_dilation,
+            config.stage2_groups,
         ):
             stage2.append(TestNetRepConvLayer(*stage_config))
         self.stage2 = nn.ModuleList(stage2)

         stage3 = []
         for stage_config in zip(
-            config.backbone_stage3_in_channels,
-            config.backbone_stage3_out_channels,
-            config.backbone_stage3_kernel_size,
-            config.backbone_stage3_stride,
-            config.backbone_stage3_dilation,
-            config.backbone_stage3_groups,
+            config.stage3_in_channels,
+            config.stage3_out_channels,
+            config.stage3_kernel_size,
+            config.stage3_stride,
+            config.stage3_dilation,
+            config.stage3_groups,
         ):
             stage3.append(TestNetRepConvLayer(*stage_config))
         self.stage3 = nn.ModuleList(stage3)

         stage4 = []
         for stage_config in zip(
-            config.backbone_stage4_in_channels,
-            config.backbone_stage4_out_channels,
-            config.backbone_stage4_kernel_size,
-            config.backbone_stage4_stride,
-            config.backbone_stage4_dilation,
-            config.backbone_stage4_groups,
+            config.stage4_in_channels,
+            config.stage4_out_channels,
+            config.stage4_kernel_size,
+            config.stage4_stride,
+            config.stage4_dilation,
+            config.stage4_groups,
         ):
             stage4.append(TestNetRepConvLayer(*stage_config))
         self.stage4 = nn.ModuleList(stage4)
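
Note on the stage construction in `modeling_textnet.py`: each stage is built by zipping six parallel per-layer lists from the config, one tuple per layer. Python's `zip` silently stops at the shortest input, so a config with lists of unequal length would produce a shorter stage without an error. The sketch below illustrates the same pattern in plain Python with an explicit length check added. `RepConvSpec` and `build_stage_specs` are hypothetical names for illustration only; the PR itself passes each tuple directly to the layer class.

```python
from typing import List, NamedTuple


class RepConvSpec(NamedTuple):
    """Hypothetical record holding the six per-layer values that the diff zips together."""
    in_channels: int
    out_channels: int
    kernel_size: List[int]
    stride: int
    dilation: int
    groups: int


def build_stage_specs(in_channels, out_channels, kernel_size, stride, dilation, groups):
    """Zip parallel per-layer lists into one spec per layer, rejecting unequal lengths."""
    columns = [in_channels, out_channels, kernel_size, stride, dilation, groups]
    lengths = {len(column) for column in columns}
    if len(lengths) != 1:
        # Plain zip() would truncate to the shortest list instead of failing.
        raise ValueError(f"per-layer lists differ in length: {sorted(lengths)}")
    return [RepConvSpec(*row) for row in zip(*columns)]


# Stage 2 defaults as they appear in the config diff.
stage2 = build_stage_specs(
    in_channels=[64, 128, 128, 128],
    out_channels=[128, 128, 128, 128],
    kernel_size=[[3, 3], [1, 3], [3, 3], [3, 1]],
    stride=[2, 1, 1, 1],
    dilation=[1, 1, 1, 1],
    groups=[1, 1, 1, 1],
)
for spec in stage2:
    print(spec)
```

With the defaults above, the first layer of the stage carries the stride of 2 and the remaining three layers keep the resolution.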
@@ -481,11 +481,11 @@ def __init__(self, config):

         self.textnet = TextNetModel(config)
         self.num_features = [
-            config.backbone_out_channels,
-            config.backbone_stage1_out_channels[-1],
-            config.backbone_stage2_out_channels[-1],
-            config.backbone_stage3_out_channels[-1],
-            config.backbone_stage4_out_channels[-1],
+            config.out_channels,
+            config.stage1_out_channels[-1],
+            config.stage2_out_channels[-1],
+            config.stage3_out_channels[-1],
+            config.stage4_out_channels[-1],
         ]

         # initialize weights and apply final processing
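
Note on `num_features`: the list takes the stem's `out_channels` followed by the last entry of each stage's `out_channels` list, so it holds one channel count per entry in `stage_names`. The sketch below recomputes it from the default values in the config diff, using a plain dictionary in place of the real config object. The function name `channels_per_feature_map` is hypothetical.

```python
# Default values copied from the config diff; a plain dict stands in for the
# real config object so the sketch has no dependencies.
defaults = {
    "out_channels": 64,
    "stage1_out_channels": [64, 64, 64],
    "stage2_out_channels": [128, 128, 128, 128],
    "stage3_out_channels": [256, 256, 256, 256],
    "stage4_out_channels": [512, 512, 512, 512],
    "hidden_sizes": [64, 64, 128, 256, 512],
}


def channels_per_feature_map(cfg: dict) -> list:
    """Stem channels followed by the final out_channels entry of each of the four stages."""
    return [cfg["out_channels"]] + [cfg[f"stage{idx}_out_channels"][-1] for idx in range(1, 5)]


num_features = channels_per_feature_map(defaults)
stage_names = ["stem"] + [f"stage{idx}" for idx in range(1, 5)]
print(dict(zip(stage_names, num_features)))
```

For the defaults, the result equals the default `hidden_sizes`, which is a quick consistency check when the per-stage lists are edited by hand.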