Source code for a Chinese image captioning method based on deep multimodal semantic fusion. The code runs on both GPU and CPU.
This code is released under the MIT License (refer to the LICENSE file for details).
The model is trained with TensorFlow, a popular Python framework for training deep neural networks. To install TensorFlow, please refer to Installing Tensorflow.
The code is written in Python; you also need to install the following Python dependencies:
- bottle==0.12.13
- ipdb==0.10.3
- matplotlib==2.1.0
- numpy==1.13.3
- Pillow==4.3.0
- scikit-image==0.13.1
- scipy==1.0.0
- jieba==0.38
For convenience, you can also use requirements.txt to install the Python dependencies:
pip install -r requirements.txt
To use the evaluation script: see coco-caption for the requirements.
Though the code can run on a CPU, we highly recommend using a GPU. To force CPU-only execution, run:
export CUDA_VISIBLE_DEVICES=""
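Equivalently, a script can hide the GPU from inside Python (a minimal sketch; the environment variable must be set before TensorFlow is imported, since devices are enumerated at import time):

# Hide all GPUs so TensorFlow falls back to CPU; set this before the import.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf  # TensorFlow now sees no GPU devices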
To generate training data for Flickr8k-CN, use the build_flickr8k_data.py script:
python build_flickr8k_data.py
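To illustrate what this preprocessing step involves (the file names and JSON fields below are hypothetical, not the script's actual interface), Chinese captions are segmented into words with jieba, which is among the dependencies:

# Illustrative sketch of caption preprocessing (paths and field names are
# hypothetical): segment each Chinese caption into words with jieba so the
# captioning model can operate on a word-level vocabulary.
import json
import jieba

with open("flickr8kcn_captions.json") as f:   # hypothetical input file
    annotations = json.load(f)

tokenized = []
for entry in annotations:
    words = list(jieba.cut(entry["caption"]))  # Chinese word segmentation
    tokenized.append({"image": entry["image"], "words": words})

with open("flickr8kcn_tokenized.json", "w") as f:
    json.dump(tokenized, f, ensure_ascii=False)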
We use Google Inception V3 as the single-label visual encoding network; see Inception for the instructions.
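For illustration, a pooled Inception V3 feature vector can be extracted as sketched below, assuming a TensorFlow build that ships tf.keras.applications (the repo itself follows the linked Inception instructions instead):

# Sketch of visual encoding with Inception V3: a 2048-d pooled feature
# vector per image. The image file name is a hypothetical example.
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image

model = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

img = image.load_img("example.jpg", target_size=(299, 299))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = model.predict(x)   # shape (1, 2048)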
To train the keyword prediction network, please run train_keyword.py on a GPU:
CUDA_VISIBLE_DEVICES=0 python train_keyword.py
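At its core, the keyword network is a multi-label classifier over a keyword vocabulary. The sketch below shows the general idea only; the layer sizes, vocabulary size, and loss are assumptions, not the repo's exact architecture:

# Minimal sketch of a keyword-prediction head: multi-label classification
# over a keyword vocabulary with a sigmoid cross-entropy loss, on top of
# 2048-d Inception features. All sizes here are assumptions.
import tensorflow as tf

NUM_KEYWORDS = 1000   # hypothetical keyword vocabulary size

inputs = tf.keras.Input(shape=(2048,))            # Inception V3 feature
hidden = tf.keras.layers.Dense(512, activation="relu")(inputs)
probs = tf.keras.layers.Dense(NUM_KEYWORDS, activation="sigmoid")(hidden)

model = tf.keras.Model(inputs, probs)
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(features, keyword_labels, ...)  # keyword_labels are multi-hot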
To train the multimodal caption generation network, use train.py:
python train.py
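As a rough schematic of such a network (a generic sketch under assumed hyperparameters, not the exact fusion architecture of this repo): the image features and predicted keywords are fused and condition an LSTM decoder over caption words.

# Generic sketch of a multimodal caption decoder. All sizes are assumed;
# image features and keyword probabilities are fused into one vector that
# initializes an LSTM emitting a distribution over caption words.
import tensorflow as tf

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 10000, 20, 512   # assumed hyperparameters

img_feat = tf.keras.Input(shape=(2048,))          # Inception V3 feature
kw_vec = tf.keras.Input(shape=(1000,))            # keyword probabilities
words = tf.keras.Input(shape=(MAX_LEN,))          # caption word ids

fused = tf.keras.layers.Dense(EMBED_DIM, activation="relu")(
    tf.keras.layers.Concatenate()([img_feat, kw_vec]))   # multimodal fusion

emb = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(words)
h = tf.keras.layers.LSTM(EMBED_DIM, return_sequences=True)(
    emb, initial_state=[fused, fused])            # condition on fused vector
out = tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax")(h)

model = tf.keras.Model([img_feat, kw_vec, words], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")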
Use server.py to load the trained models, and use client.py to request caption generation:
python server.py
python client.py
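The server/client pair communicates over a simple HTTP protocol. The sketch below shows one plausible shape of it using bottle, which is in requirements.txt (the /caption route and field names are assumptions; see server.py and client.py for the real interface):

# Server side (hypothetical route and response fields):
from bottle import post, request, run

@post("/caption")
def caption():
    image_bytes = request.body.read()          # raw image from the client
    # ... run the loaded captioning model on image_bytes ...
    return {"caption": "a placeholder caption"}  # bottle serializes dicts to JSON

run(host="localhost", port=8080)

# Client side, sending an image and printing the generated caption:
import json
import urllib.request

with open("example.jpg", "rb") as f:           # hypothetical test image
    req = urllib.request.Request("http://localhost:8080/caption", data=f.read())
print(json.loads(urllib.request.urlopen(req).read())["caption"])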
To monitor the training process with TensorBoard:
tensorboard --logdir="MODEL_PATH"