Preprocessing code for the k_politician dataset
-
run.py downloads the PDF files from the crawled addresses.
-
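The download step could look roughly like the sketch below. It is not the contents of run.py: the function names, the `pdf` output folder, and the idea of passing in a plain list of URLs are assumptions made for illustration, and only the Python standard library is used.

```python
import os
import urllib.request
from urllib.parse import unquote, urlparse


def pdf_filename(url: str) -> str:
    """Derive a local file name from the last path segment of a URL."""
    name = os.path.basename(unquote(urlparse(url).path))
    if not name:
        name = "download"  # hypothetical fallback when the URL has no file part
    if not name.lower().endswith(".pdf"):
        name += ".pdf"
    return name


def download_pdfs(urls, out_dir="pdf"):
    """Download each URL into out_dir, skipping files that already exist."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        target = os.path.join(out_dir, pdf_filename(url))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
        saved.append(target)
    return saved


# Example call (needs network access and real crawled URLs):
# download_pdfs(["http://example.com/files/profile.pdf"])
```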
pdf2jpg.py and pdf2image_2.py convert the PDF files to JPG images.
-
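A minimal sketch of a PDF to JPG conversion, assuming the third-party `pdf2image` package (which in turn needs Poppler installed). Whether the scripts here use that package is a guess based on the file name; the output naming scheme, the `jpg` folder, and the 200 dpi default are illustrative choices.

```python
import os


def page_jpg_name(pdf_path: str, page_index: int) -> str:
    """Build an output name such as 'report_p001.jpg' for a 0-based page index."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    return f"{stem}_p{page_index + 1:03d}.jpg"


def pdf_to_jpgs(pdf_path: str, out_dir: str = "jpg", dpi: int = 200):
    """Render every page of one PDF to a JPG file and return the saved paths."""
    # Imported here so the naming helper works without pdf2image installed.
    from pdf2image import convert_from_path

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i, page in enumerate(convert_from_path(pdf_path, dpi=dpi)):
        target = os.path.join(out_dir, page_jpg_name(pdf_path, i))
        # Pages come back as PIL images; JPEG has no alpha channel, so force RGB.
        page.convert("RGB").save(target, "JPEG")
        saved.append(target)
    return saved
```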
The image processing code cannot handle Korean characters in file names, so the files are renamed first.
file_name_first.py and file_name.py handle that renaming.
-
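One way to do such a renaming is sketched below, using only the standard library. This is an illustration, not the logic of file_name_first.py or file_name.py: the `img_00001` naming pattern and the `name_map.csv` mapping file are assumptions, added so the original Korean names can still be recovered afterwards.

```python
import csv
import os


def is_ascii(text: str) -> bool:
    """Return True when every character is plain ASCII."""
    return all(ord(ch) < 128 for ch in text)


def rename_to_ascii(folder: str, mapping_csv: str = "name_map.csv", prefix: str = "img"):
    """Rename files with non-ASCII names to '<prefix>_<n><ext>' and log the mapping."""
    renamed = []
    counter = 0
    for name in sorted(os.listdir(folder)):
        src = os.path.join(folder, name)
        if not os.path.isfile(src) or is_ascii(name):
            continue
        ext = os.path.splitext(name)[1]
        # Advance the counter until the new name is not already taken.
        while True:
            counter += 1
            new_name = f"{prefix}_{counter:05d}{ext}"
            if not os.path.exists(os.path.join(folder, new_name)):
                break
        os.rename(src, os.path.join(folder, new_name))
        renamed.append((name, new_name))
    # Keep the original names so labels can be restored later.
    with open(os.path.join(folder, mapping_csv), "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["original", "ascii"])
        writer.writerows(renamed)
    return renamed
```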
Then image_processing.py crops each image to the detected face region.
-
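A face crop of this kind might be sketched as follows, assuming OpenCV (`opencv-python`) and its bundled frontal-face Haar cascade. The detector, the 20% margin, and the choice to keep only the largest face are assumptions; image_processing.py may work differently.

```python
def padded_box(x, y, w, h, img_w, img_h, margin=0.2):
    """Grow a face box by `margin` on each side and clamp it to the image."""
    dx, dy = int(w * margin), int(h * margin)
    left, top = max(x - dx, 0), max(y - dy, 0)
    right, bottom = min(x + w + dx, img_w), min(y + h + dy, img_h)
    return left, top, right, bottom


def crop_largest_face(src_path, dst_path, margin=0.2):
    """Detect faces with a Haar cascade and save the largest one; return True on success."""
    import cv2  # imported here so padded_box works without OpenCV installed

    img = cv2.imread(src_path)
    if img is None:  # unreadable or missing file
        return False
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return False
    # Each detection is (x, y, w, h); keep the one with the largest area.
    x, y, w, h = (int(v) for v in max(faces, key=lambda f: f[2] * f[3]))
    left, top, right, bottom = padded_box(x, y, w, h, img.shape[1], img.shape[0], margin)
    return bool(cv2.imwrite(dst_path, img[top:bottom, left:right]))
```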
Finally, background_remove.py removes the background.
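
Background removal can be done in several ways, and this README does not say which one background_remove.py uses. The sketch below swaps in OpenCV's GrabCut, initialised with a rectangle slightly inside the image border, purely as an example; the function names, the 5% border, and the iteration count are assumptions.

```python
def center_rect(img_w, img_h, border=0.05):
    """Return (x, y, w, h) of a rectangle inset by `border` of each dimension."""
    x, y = int(img_w * border), int(img_h * border)
    return x, y, img_w - 2 * x, img_h - 2 * y


def remove_background(src_path, dst_path, border=0.05, iterations=5):
    """Black out pixels that GrabCut labels as background; return True on success."""
    import cv2  # imported here so center_rect works without OpenCV installed
    import numpy as np

    img = cv2.imread(src_path)
    if img is None:  # unreadable or missing file
        return False
    mask = np.zeros(img.shape[:2], np.uint8)
    # GrabCut needs two scratch arrays for its internal colour models.
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    rect = center_rect(img.shape[1], img.shape[0], border)
    cv2.grabCut(img, mask, rect, bgd_model, fgd_model, iterations, cv2.GC_INIT_WITH_RECT)
    # Keep pixels marked as definite or probable foreground.
    keep = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
    return bool(cv2.imwrite(dst_path, img * keep[:, :, None]))
```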