Commit

[rust] add parser (#1619)
* feat: unpack gpac

* fix: linux ci

* fix: mac build

* fix: remove unused [no ci]

* fix: ignore config.h [no ci]

* temp commit, will drop this soon

* fix: install gpac

* fix: gpac

* fix: formatting

* fix: preproccessor directive

* fix: comment display version for now

* fix: display dlls code

* fix: bundle vcruntime in hardsubx windows

* fix: again

* fix: erros in ci

* fix: ci

* fix: add vcruntime in additional dependencies

* fix: try to copy vcruntime after build

* fix: space in runtime library

* fix: remove for now [no ci]

* fix: things in vcxproj

* fix: ci for leptonica sys

* fix: docs

* fix: copy dlls on post build event

* fix: copy vcruntime after build

* feat: add arguments through clap

* fix: type of some arguments

* fix: "-" and "--" in comments

* fix: format files

* fix: add argument parsing till mkvlang

* fix: one todo item

* chore: lint fixes

* fix: nocodec value

* fix: for nocodec

* fix: add cfg feature for hardsubx

* feat: complete till startcreditstext

* fix: add more notes, args: option affect processed

* feat: port all till network stuff

* fix: complete almost all argument parsing

* fix: error free code

* fix: complete params port

* fix: hardsubx erros

* feat: clean up main function

* fix: pr reviews

* fix: make input,output function better

* fix: variant not used warning

* fix: warnings

* fix: all clippy warnings

* feat: add tests

* feat: add tests

* chore: lint fixes

* fix: move unit tests to correct folder

* fix: remove unncessary files

* fix: make function for parse_args

* fix: review changes

* fix: Impl CcxOptions whenever I could

* fix: try to convert rust to c

* chore: push c code

* fix: add more rust to c conversions

* fix: use set methods for bitfield

* fix: errors

* fix: arguments parsing

* fix: all issues

* fix: many errors

* chore: lint fix

* fix: err

* fix: unsafe function error

* fix: unsafe warning

* fix: safety lint

* chore: add docs

* fix: windows build

* fix: function

* fix: dependencies

* fix: set_binary_mode

* chore: lint fix

* fix: set_binary_mode for windows

* fix: error

* fix: undefined reference error

* chore: remove comment

* fix: output field

* chore: fix lint

* fix: ru1, ru2, ru3

* fix: undef before

* fix: parameter and update deps

* chore: update vcpkg

* feat: add release-with-debug profile

* fix; uncomment code

* fix: update visual studio to 2022

* chore: update docs

* fix: use default vcpkg

* fix: caching logic on release ci

* fix: vcpkg caching

* fix: add setup vcpkg

* chore: remove unneccesary formatting

* fix: Always write 2 bytes for UTF-16BE

* fix: formatting

* feat: add rest of the notes to bring continuity

* fix: remove extra line

* fix: add hardsubx note

* fix: source code format error

* chore: lint fixes acc to rustfmt

* feat: add unit test ci

* fix: conversion of strings, add file queue handling

* fix: decoder cfg

* fix: update dependencies

* chore: lint fix

* chore: add safety doc

* fix: default value for CcxOptions

* fix(rust): default value for teletext

* fix: leptonica version for windows

* fix: format errors

* fix: workflow

* Revert "fix: leptonica version for windows"

This reverts commit 461ef55.

* fix: pin ffmpeg to 6 for mac

* fix(parser): default values and unwrap's

* fix(parser): hardsubx fixes

* chore(parse): lint fixes

* fix(windows): switch back to sdk 2019

* fix(workflow): windows workflow revert

* fix(windows): revert to old files which were working before

* fix(workflow): pin vcpkg packages

* chore(rust): downgrade leptonica

* fix(windows): move vcpkg.json to correct place

* fix(windows): improve vcxproj

* fix(windows): workflow

* fix(windows): workflow

* fix(windows): workflow clone from vcpkg everytime

* fix(workflow): error

* fix(workflow): don't skip building vcpkg

* fix: remove depth from vcpkg

* temporary commit

* fix(windows): pin gpac and use local vcpkg manifest properly

* fix(windows): install vcpkg dependencies manually

* fix(windows): update dll names

* fix(windows); dependencies copy

* fix(windows): don't continue on error for release

* fix(macos): build ffmpeg for mac workflow

* fix: move ffmpeg to current workspace

* fix: re-add profile for windows

* fix: pkg config for mac

* fix(mac): use ffmpeg@6 from brew

* fix(macos): there is no ffmpeg_prebuilt

* fix(macos): specify ffmpeg pkg config

* fix(macos): globally define pkg config

* fix(macos): add ffmpeg include and libs dir

* fix(macos): include ffmpeg headers in makefile

* fix: include ffmpeg libraries and include directories

* fix: try to manually specify ffmpeg header in rust

* fix: also include leptonica headres

* fix: leptonica name

* fix: test

* fix: string null when output_filename is empty

* fix: error

* fix: remove cflgas

* fix(mac): disable cmake ocr hardsubx

* chore: update gitignore

* fix: null if string is empty

* fix: allow --in

* chore: bump version to 1.0 in rust

* chore: add space to trigger sp

* fix: don't panic with rust

* fix: add double dashes to indicate parameters

* chore: update CHANGES.txt

* fix: test

* fix(workflow): update workflow name

* fix(rust): linux output_filename in sampleplatform

* fix(rust): parser default values

* fix(rust): exit with MalformedParameter instead of panic

* fix(decoder): revert always write 2 bytes

* chore(rust): format

* chore: update lock file

* fix(test): test lib_ccxr and rename to test

* fix(mac): remove failing cmake_ocr test

* fix: ci errors

* fix: feature related changes

* fix: trim down default features

* fix: don't check clippy for all features
prateekmedia authored Aug 10, 2024
1 parent 90204d4 commit 9340cc7
Showing 30 changed files with 6,114 additions and 1,072 deletions.
16 changes: 0 additions & 16 deletions .github/workflows/build_mac.yml
@@ -74,22 +74,6 @@ jobs:
         working-directory: build
       - name: Display version information
         run: ./build/ccextractor --version
-  cmake_ocr_hardsubx:
-    runs-on: macos-latest
-    steps:
-      - uses: actions/checkout@v4
-      - name: Install dependencies
-        run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac ffmpeg
-      - name: cmake
-        run: |
-          mkdir build && cd build
-          cmake -DWITH_OCR=ON -DWITH_HARDSUBX=ON ../src
-      - name: build
-        run: |
-          make -j$(nproc)
-        working-directory: build
-      - name: Display version information
-        run: ./build/ccextractor --version
   build_rust:
     runs-on: macos-latest
     steps:
2 changes: 1 addition & 1 deletion .github/workflows/format.yml
@@ -51,4 +51,4 @@ jobs:
         run: cargo fmt --all -- --check
       - name: clippy
         run: |
-          cargo clippy --all-features -- -D warnings
+          cargo clippy -- -D warnings
37 changes: 37 additions & 0 deletions .github/workflows/test_rust.yml
@@ -0,0 +1,37 @@
+name: Unit Test Rust
+on:
+  push:
+    paths:
+      - ".github/workflows/test.yml"
+      - "src/rust/**"
+    tags-ignore:
+      - "*.*"
+  pull_request:
+    types: [opened, synchronize, reopened]
+    paths:
+      - ".github/workflows/test.yml"
+      - "src/rust/**"
+jobs:
+  test_rust:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ./src/rust
+    steps:
+      - uses: actions/checkout@v4
+      - name: cache
+        uses: actions/cache@v3
+        with:
+          path: |
+            src/rust/.cargo/registry
+            src/rust/.cargo/git
+            src/rust/target
+          key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
+          restore-keys: ${{ runner.os }}-cargo-
+      - uses: actions-rs/toolchain@v1
+        with:
+          toolchain: stable
+          override: true
+      - name: Test main module
+        run: cargo test
+        working-directory: ./src/rust
3 changes: 3 additions & 0 deletions .gitignore
@@ -17,8 +17,10 @@ CVS
mac/ccextractor
linux/ccextractor
linux/depend
+windows/x86_64-pc-windows-msvc/**
windows/Debug/**
windows/Debug-OCR/**
+windows/release-with-debug/**
windows/Release/**
windows/Release-Full/**
windows/Release-OCR/**
@@ -154,3 +156,4 @@ windows/ccx_rust.lib
windows/*/debug/*
windows/*/CACHEDIR.TAG
windows/.rustc_info.json
+linux/configure~
3 changes: 2 additions & 1 deletion docs/CHANGES.TXT
@@ -1,5 +1,6 @@
-0.95 (to be released)
+1.0 (to be released)
-----------------
+- Breaking: Major argument flags revamp for CCExtractor (#1564 & #1619)
- New: Create a Docker image to simplify the CCExtractor usage without any environmental hustle (#1611)
- New: Add time units module in lib_ccxr (#1623)
- New: Add bits and levenshtein module in lib_ccxr (#1627)
2 changes: 1 addition & 1 deletion docs/FFMPEG.md
@@ -42,7 +42,7 @@ Note:If you installed ffmpeg on non-standard location, please change/update your

### On Windows:
#### Set preprocessor flag `ENABLE_FFMPEG=1`
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
4. In the Preprocessor Definitions dialog box, add `ENABLE_FFMPEG=1`. Choose OK to save your changes.
4 changes: 2 additions & 2 deletions docs/OCR.md
@@ -93,15 +93,15 @@ Download prebuild library of leptonica and tesseract from following link
https://drive.google.com/file/d/0B2ou7ZfB-2nZOTRtc3hJMHBtUFk/view?usp=sharing

put the path of libs/include of leptonica and tesseract in library paths.
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
2. Select Configuration properties in left panel(column) of property.
3. Select VC++ Directory.
4. In the right pane, in the right-hand column of the VC++ Directory property, open the drop-down menu and choose Edit.
5. Add path of Directory where you have kept uncompressed library of leptonica and tesseract.


Set preprocessor flag ENABLE_OCR=1
-1. In visual studio 2013 right click <Project> and select property.
+1. In visual studio 2022 right click <Project> and select property.
2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
4. In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
3 changes: 2 additions & 1 deletion mac/Makefile.am
@@ -249,6 +249,7 @@ GPAC_CPPFLAGS = $(shell pkg-config --cflags gpac)

ccextractor_CPPFLAGS =-I../src/lib_ccx/ -I../src/thirdparty/libpng/ -I../src/thirdparty/zlib/ -I../src/lib_ccx/zvbi/ -I../src/thirdparty/lib_hash/ -I../src/thirdparty/protobuf-c/ -I../src/thirdparty -I../src/ -I../src/thirdparty/freetype/include/
ccextractor_CPPFLAGS += $(GPAC_CPPFLAGS)
+ccextractor_CPPFLAGS += $(FFMPEG_CPPFLAGS)

ccextractor_LDADD=-lm -lpthread -ldl

@@ -271,7 +272,7 @@ if HARDSUBX_IS_ENABLED
ccextractor_CFLAGS += -DENABLE_HARDSUBX
ccextractor_CPPFLAGS+= ${libavcodec_CFLAGS}
ccextractor_CPPFLAGS+= ${libavformat_CFLAGS}
-ccextractor_CPPFLAGS+= ${libavutil_CFALGS}
+ccextractor_CPPFLAGS+= ${libavutil_CFLAGS}
ccextractor_CPPFLAGS+= ${libswscale_CFLAGS}
AV_LIB = ${libavcodec_LIBS}
AV_LIB += ${libavformat_LIBS}
4 changes: 4 additions & 0 deletions src/ccextractor.c
@@ -446,7 +446,11 @@ int main(int argc, char *argv[])
// If "ccextractor.cnf" is present, takes options from it.
// See docs/ccextractor.cnf.sample for more info.

+#ifndef DISABLE_RUST
+int compile_ret = ccxr_parse_parameters(api_options, argc, argv);
+#else
int compile_ret = parse_parameters(api_options, argc, argv);
+#endif

if (compile_ret == EXIT_NO_INPUT_FILES)
{
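The hunk above picks the argument parser at compile time: unless `DISABLE_RUST` is defined, `main` calls the Rust-backed `ccxr_parse_parameters`, otherwise it falls back to the C `parse_parameters`. The following is a minimal, self-contained sketch of that pattern only. `demo_options`, `demo_parse_c`, `demo_parse_rust` and `demo_parse` are hypothetical stand-ins, not CCExtractor symbols, and in the real tree the Rust entry point is an `extern` function linked in from the Rust library rather than a C stub.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ccx_s_options. */
struct demo_options
{
	int parsed_by_rust; /* 1 if the "Rust" path ran, 0 if the C path ran */
	int arg_count;      /* number of arguments after the program name */
};

/* Stand-in for the legacy C parser (parse_parameters in the commit). */
int demo_parse_c(struct demo_options *opt, int argc, char *argv[])
{
	(void)argv;
	opt->parsed_by_rust = 0;
	opt->arg_count = argc - 1;
	return 0;
}

#ifndef DISABLE_RUST
/* Stand-in for the Rust-backed parser (ccxr_parse_parameters in the commit).
 * Assumption: here it is a plain C stub so the sketch links on its own. */
int demo_parse_rust(struct demo_options *opt, int argc, char *argv[])
{
	(void)argv;
	opt->parsed_by_rust = 1;
	opt->arg_count = argc - 1;
	return 0;
}
#endif

/* Single call site; the preprocessor decides which parser is compiled in. */
int demo_parse(struct demo_options *opt, int argc, char *argv[])
{
	assert(opt != NULL && argv != NULL && argc >= 1);
#ifndef DISABLE_RUST
	return demo_parse_rust(opt, argc, argv);
#else
	return demo_parse_c(opt, argc, argv);
#endif
}
```

Compiling the sketch with `-DDISABLE_RUST` switches `demo_parse` to the C path without touching the call site, which is the property the `#ifndef` in `main` relies on.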
3 changes: 3 additions & 0 deletions src/lib_ccx/lib_ccx.h
@@ -161,6 +161,9 @@ extern void ccxr_init_basic_logger(struct ccx_s_options *opts);
void print_end_msg(void);

//params.c
+#ifndef DISABLE_RUST
+extern int ccxr_parse_parameters(struct ccx_s_options *opt, int argc, char *argv[]);
+#endif
int parse_parameters (struct ccx_s_options *opt, int argc, char *argv[]);
void print_usage (void);
int atoi_hex (char *s);
4 changes: 2 additions & 2 deletions src/lib_ccx/matroska.c
@@ -1362,8 +1362,8 @@ int matroska_loop(struct lib_ccx_ctx *ctx)
{
if (ccx_options.write_format_rewritten)
{
-mprint(MATROSKA_WARNING "You are using -out=<format>, but Matroska parser extract subtitles in a recorded format\n");
-mprint("-out=<format> will be ignored\n");
+mprint(MATROSKA_WARNING "You are using --out=<format>, but Matroska parser extract subtitles in a recorded format\n");
+mprint("--out=<format> will be ignored\n");
}

// Don't need generated input file
18 changes: 12 additions & 6 deletions src/lib_ccx/params.c
@@ -127,6 +127,13 @@ int parsedelay(struct ccx_s_options *opt, char *par)
return 0;
}

+void set_binary_mode()
+{
+#ifdef WIN32
+setmode(fileno(stdin), O_BINARY);
+#endif
+}
+
int append_file_to_queue(struct ccx_s_options *opt, char *filename)
{
if (filename[0] == '\0') // skip files with empty file name (ex : ./ccextractor "")
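The `set_binary_mode` helper added in this hunk gives the Windows-only `setmode` call a single home. The call matters because a text-mode `stdin` on Windows translates CR LF pairs and treats byte 0x1A as end of file, either of which corrupts a binary stream read from a pipe. Below is a minimal standalone sketch of the same idea. `demo_set_stdin_binary` is a hypothetical name, and unlike the commit it uses the underscore-prefixed CRT names (`_setmode`, `_fileno`, `_O_BINARY`), tests the compiler-defined `_WIN32` macro, and reports failure to the caller.

```c
#include <stdio.h>
#ifdef _WIN32
#include <fcntl.h> /* _O_BINARY */
#include <io.h>    /* _setmode */
#endif

/* Hypothetical helper: put stdin into binary mode where that concept exists.
 * Returns 0 on success (or when nothing needs doing), -1 on failure. */
int demo_set_stdin_binary(void)
{
#ifdef _WIN32
	/* _setmode returns the previous mode, or -1 on failure. */
	return _setmode(_fileno(stdin), _O_BINARY) == -1 ? -1 : 0;
#else
	/* POSIX streams do not distinguish text mode from binary mode. */
	return 0;
#endif
}
```

A caller would invoke this once, before the first read from `stdin`, when input is taken from `-` or `--stdin`.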
@@ -978,14 +985,14 @@ void print_usage(void)
mprint(" a .d extension. Each .png file will contain an image representing one caption\n");
mprint(" and named subNNNN.png, starting with sub0000.png.\n");
mprint(" For example, the command:\n");
-mprint(" ccextractor -out=spupng input.mpg\n");
+mprint(" ccextractor --out=spupng input.mpg\n");
mprint(" will create the files:\n");
mprint(" input.xml\n");
mprint(" input.d/sub0000.png\n");
mprint(" input.d/sub0001.png\n");
mprint(" ...\n");
mprint(" The command:\n");
-mprint(" ccextractor -out=spupng -o /tmp/output --12 input.mpg\n");
+mprint(" ccextractor --out=spupng -o /tmp/output --12 input.mpg\n");
mprint(" will create the files:\n");
mprint(" /tmp/output_1.xml\n");
mprint(" /tmp/output_1.d/sub0000.png\n");
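The usage text in this hunk describes the spupng layout: an `.xml` index next to a `.d` directory whose images are named `subNNNN.png`, starting at `sub0000.png`. The naming rule can be sketched as below. This is an illustration of the documented pattern only, not CCExtractor's implementation; `demo_spupng_image_path` is a hypothetical helper, and the four-digit zero padding is inferred from the `subNNNN` wording.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<base>.d/subNNNN.png" into buf, where base is
 * the output name without extension (e.g. "input" or "/tmp/output_1").
 * Returns 0 on success, -1 if formatting failed or the result was truncated. */
int demo_spupng_image_path(char *buf, size_t size, const char *base, unsigned index)
{
	int n = snprintf(buf, size, "%s.d/sub%04u.png", base, index);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With `base` set to `"input"` and `index` 0 the helper produces `input.d/sub0000.png`, matching the first file listed in the usage text.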
@@ -1245,9 +1252,8 @@ int parse_parameters(struct ccx_s_options *opt, int argc, char *argv[])
}
if (strcmp(argv[i], "-") == 0 || strcmp(argv[i], "--stdin") == 0)
{
-#ifdef WIN32
-setmode(fileno(stdin), O_BINARY);
-#endif
+set_binary_mode();
+
opt->input_source = CCX_DS_STDIN;
if (!opt->live_stream)
opt->live_stream = -1;
@@ -2934,7 +2940,7 @@ int parse_parameters(struct ccx_s_options *opt, int argc, char *argv[])
}
if (opt->write_format == CCX_OF_SPUPNG && opt->cc_to_stdout)
{
-print_error(opt->gui_mode_reports, "You cannot use -out=spupng with -stdout.\n");
+print_error(opt->gui_mode_reports, "You cannot use --out=spupng with -stdout.\n");
return EXIT_INCOMPATIBLE_PARAMETERS;
}

2 changes: 1 addition & 1 deletion src/lib_ccx/ts_tables.c
@@ -332,7 +332,7 @@ int parse_PMT(struct ccx_demuxer *ctx, unsigned char *buf, int len, struct progr
#ifndef ENABLE_OCR
if (ccx_options.write_format != CCX_OF_SPUPNG)
{
-mprint("DVB subtitles detected, OCR subsystem not present. Use -out=spupng for graphic output\n");
+mprint("DVB subtitles detected, OCR subsystem not present. Use --out=spupng for graphic output\n");
continue;
}
#endif