update operators readme

Cathy0908 · Dec 10, 2024 · 33b99f6
1 parent 3379e86
commit 33b99f6
Showing 3 changed files with 6 additions and 0 deletions.
diff --git a/configs/config_all.yaml b/configs/config_all.yaml
@@ -341,6 +341,10 @@ process:
       horizontal_flip: false                                  # flip frame image horizontally (left to right).
       vertical_flip: false                                    # flip frame image vertically (top to bottom).
       mem_required: '20GB'                                    # This operation (Op) utilizes deep neural network models that consume a significant amount of memory for computation, hence the system's available memory might constrain the maximum number of processes that can be launched
+  - video_extract_frames_mapper:                            # extract frames from video files according to specified methods
+      frame_sampling_method: 'all_keyframes'                  # sampling method for extracting frame images from the videos. Should be one of ["all_keyframes", "uniform"]. The former extracts all key frames and the latter extracts a specified number of frames uniformly from the video. Default: "all_keyframes".
+      frame_num: 3                                            # the number of frames to be extracted uniformly from the video. Only works when frame_sampling_method is "uniform". If it's 1, only the middle frame will be extracted. If it's 2, only the first and the last frames will be extracted. If it's larger than 2, in addition to the first and the last frames, other frames will be extracted uniformly within the video duration.
+      frame_dir: None                                         # Output directory to save extracted frames. If None, a default directory based on the video file path is used.
   - video_face_blur_mapper:                                 # blur faces detected in videos
       cv_classifier: ''                                       # OpenCV classifier path for face detection. By default, we will use 'haarcascade_frontalface_alt.xml'.
       blur_type: 'gaussian'                                   # type of blur kernel, including ['mean', 'box', 'gaussian']
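
The `frame_num` comment above describes how the "uniform" sampling method picks frames. A minimal Python sketch of that selection rule follows. `uniform_frame_indices` is a hypothetical helper written for illustration, not the operator's actual implementation, and it assumes frames are addressed by zero-based index and that the total frame count is known.

```python
from typing import List


def uniform_frame_indices(total_frames: int, frame_num: int) -> List[int]:
    """Pick frame indices following the rule in the `frame_num` comment.

    Hypothetical helper: the real operator may work on timestamps
    rather than indices.
    """
    if total_frames <= 0 or frame_num <= 0:
        return []
    # Assumption: asking for at least as many frames as exist returns all of them.
    if frame_num >= total_frames:
        return list(range(total_frames))
    # frame_num == 1: only the middle frame.
    if frame_num == 1:
        return [total_frames // 2]
    # frame_num >= 2: first and last frames, plus evenly spaced frames between.
    last = total_frames - 1
    step = last / (frame_num - 1)
    return [round(i * step) for i in range(frame_num)]


if __name__ == "__main__":
    print(uniform_frame_indices(101, 1))
    print(uniform_frame_indices(101, 2))
    print(uniform_frame_indices(101, 3))
```

With `frame_num: 3`, the value used in the config above, this sketch selects the first, middle, and last frames.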

diff --git a/docs/Operators.md b/docs/Operators.md
@@ -106,6 +106,7 @@ All the specific operators are listed below, each featured with several capabili
 | video_captioning_from_frames_mapper            | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | generate samples whose captions are generated based on an image-to-text model and sampled video frames. Captions from different frames will be concatenated to a single string | [code](../data_juicer/ops/mapper/video_captioning_from_frames_mapper.py)           | [tests](../tests/ops/mapper/test_video_captioning_from_frames_mapper.py)           |
 | video_captioning_from_summarizer_mapper        | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | Generate video captions by summarizing several kinds of generated texts (captions from video/audio/frames, tags from audio/frames, ...)                                        | [code](../data_juicer/ops/mapper/video_captioning_from_summarizer_mapper.py)       | [tests](../tests/ops/mapper/test_video_captioning_from_summarizer_mapper.py)       |
 | video_captioning_from_video_mapper             | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | generate samples whose captions are generated based on another model (video-blip) and sampled video frame within the original sample                                           | [code](../data_juicer/ops/mapper/video_captioning_from_video_mapper.py)            | [tests](../tests/ops/mapper/test_video_captioning_from_video_mapper.py)            |
+| video_extract_frames_mapper             | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | extract frames from video files according to specified methods                                           | [code](../data_juicer/ops/mapper/video_extract_frames_mapper.py)            | [tests](../tests/ops/mapper/test_video_extract_frames_mapper.py)            |
 | video_face_blur_mapper                         | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Blur faces detected in videos                                                                                                                                                  | [code](../data_juicer/ops/mapper/video_face_blur_mapper.py)                        | [tests](../tests/ops/mapper/test_video_face_blur_mapper.py)                        |
 | video_ffmpeg_wrapped_mapper                    | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Simple wrapper to run a FFmpeg video filter                                                                                                                                    | [code](../data_juicer/ops/mapper/video_ffmpeg_wrapped_mapper.py)                   | [tests](../tests/ops/mapper/test_video_ffmpeg_wrapped_mapper.py)                   |
 | video_remove_watermark_mapper                  | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Remove the watermarks in videos given regions                                                                                                                                  | [code](../data_juicer/ops/mapper/video_remove_watermark_mapper.py)                 | [tests](../tests/ops/mapper/test_video_remove_watermark_mapper.py)                 |

diff --git a/docs/Operators_ZH.md b/docs/Operators_ZH.md
@@ -105,6 +105,7 @@ Operators in Data-Juicer are divided into the following 5 types.
 | video_captioning_from_frames_mapper            | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | Generate samples whose captions are produced by an image-to-text model from frames sampled from the original sample's video. Captions from different frames are concatenated into a single string. | [code](../data_juicer/ops/mapper/video_captioning_from_frames_mapper.py)           | [tests](../tests/ops/mapper/test_video_captioning_from_frames_mapper.py)           |
 | video_captioning_from_summarizer_mapper        | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | Generate sample captions by summarizing texts generated in several different ways (captions from video/audio/frames, tags from audio/frames, ...) | [code](../data_juicer/ops/mapper/video_captioning_from_summarizer_mapper.py)       | [tests](../tests/ops/mapper/test_video_captioning_from_summarizer_mapper.py)       |
 | video_captioning_from_video_mapper             | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | Generate samples whose captions are produced by another auxiliary model (video-blip) from frames sampled from the video in the original sample. | [code](../data_juicer/ops/mapper/video_captioning_from_video_mapper.py)            | [tests](../tests/ops/mapper/test_video_captioning_from_video_mapper.py)            |
+| video_extract_frames_mapper             | ![Multimodal](https://img.shields.io/badge/Multimodal-F25922?style=plastic) ![GPU](https://img.shields.io/badge/GPU-F27649?style=plastic)                                                                                                                                                                                     | Extract frames from videos.                                           | [code](../data_juicer/ops/mapper/video_extract_frames_mapper.py)            | [tests](../tests/ops/mapper/test_video_extract_frames_mapper.py)            |
 | video_face_blur_mapper                         | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Blur faces in videos                                                                                                 | [code](../data_juicer/ops/mapper/video_face_blur_mapper.py)                        | [tests](../tests/ops/mapper/test_video_face_blur_mapper.py)                        |
 | video_ffmpeg_wrapped_mapper                    | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Simple wrapper to run an FFmpeg video filter                                                                         | [code](../data_juicer/ops/mapper/video_ffmpeg_wrapped_mapper.py)                   | [tests](../tests/ops/mapper/test_video_ffmpeg_wrapped_mapper.py)                   |
 | video_remove_watermark_mapper                  | ![Video](https://img.shields.io/badge/Video-F2B138?style=plastic)                                                                                                                                                                                                                                                             | Remove watermarks from given regions in videos                                                                       | [code](../data_juicer/ops/mapper/video_remove_watermark_mapper.py)                 | [tests](../tests/ops/mapper/test_video_remove_watermark_mapper.py)                 |