You can find the module containers for the Columbia University Vision Pipeline by following these links:
CU Grounding and Merging,
CU Object Detection,
CU Face/Flag/Landmark Recognition,
UIUC Text Pipeline,
USC Grounding [Optional]
Local development environment:
- Python 3.6.3
- Anaconda 3-4.4
- Tensorflow 1.12.0
Running Docker
$ INPUT=/host_input/
$ OUTPUT=/host_output/
$ # Create the folder columbia_vision_shared/ under ${OUTPUT}/WORKING/ for output files
$ mkdir -p ${OUTPUT}/WORKING/columbia_vision_shared/
$ # Run the required modules (CU_Object_Detection, CU_Face/Flag/Landmark_Recognition and UIUC_Text_Pipeline), or download their result files, into the shared directory: ${OUTPUT}/WORKING/columbia_vision_shared/
$ GPU_ID=[a single integer index to the GPU]
$ docker pull gaiaaida/grounding-merging
$ docker images
$ # The model folder columbia_visual_grounding_models/ can be found in the docker image directly or in the source code repository
$ # Map ${INPUT} and ${OUTPUT}/WORKING/columbia_vision_shared/ to the paths used in the code: /root/LDC and /root/shared
$ docker run -it -e CUDA_VISIBLE_DEVICES=${GPU_ID} --gpus ${GPU_ID} --name aida-grounding-merging -v columbia_visual_grounding_models/:/root/models -v ${INPUT}:/root/LDC:ro -v ${OUTPUT}/WORKING/columbia_vision_shared/:/root/shared gaiaaida/grounding-merging /bin/bash
Building Docker
$ docker build . --tag columbia-gm
$ docker run -it -e CUDA_VISIBLE_DEVICES=${GPU_ID} --gpus ${GPU_ID} --name grounding-merging -v columbia_visual_grounding_models/:/root/models -v ${INPUT}:/root/LDC:ro -v ${OUTPUT}/WORKING/columbia_vision_shared/:/root/shared columbia-gm /bin/bash
$ docker exec -it aida-gm /bin/bash
$ # python
$ docker build . --tag columbia-gm
$ docker run -itd --name aida-gm -p [HOST_PORT]:8082 -v columbia_visual_grounding_models/:/root/models -v ${INPUT}:/root/LDC:ro -v ${OUTPUT}/WORKING/columbia_vision_shared/:/root/shared columbia-gm /bin/bash
$ docker port aida-gm
$ docker exec -it aida-gm /bin/bash
# jupyter notebook --allow-root --ip= --port=8082 &
# Access jupyter on the host machine [HOST_URL]:[HOST_PORT].
$ docker exec -it aida-gm /bin/bash
$$ python ./
$$ echo expect to see [CU Visual_Features] files:
${OUTPUT}/WORKING/columbia_vision_shared/cu_grounding_matching_features/semantic_features_jpg.lmdb, semantic_features_keyframe.lmdb, instance_features_jpg.lmdb, instance_features_keyframe.lmdb
$$ echo expect to see [CU Grounding] file: ${OUTPUT}/WORKING/columbia_vision_shared/cu_grounding_results/grounding_dict.pickle
$$ python ./
$$ echo expect to see [CU Dictionary] files for USC: ${OUTPUT}/WORKING/columbia_vision_shared/cu_grounding_dict_files/entity2mention_dict.pickle, ${OUTPUT}/WORKING/columbia_vision_shared/cu_grounding_dict_files/id2mentions_dict.pickle
$$ python ./Graph
$$ echo expect to see [CU Merging] files: ${OUTPUT}/WORKING/columbia_vision_shared/cu_graph_merging_ttl/merged_ttl/
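After the three stages have run, the expected outputs listed in the echo lines above can be checked in one pass. The following is a minimal sketch, not part of the released code: the helper name `missing_outputs` is hypothetical, and the relative paths are copied from the echo lines.

```python
import os

# Expected outputs relative to the shared folder, copied from the echo lines.
EXPECTED_OUTPUTS = [
    "cu_grounding_matching_features/semantic_features_jpg.lmdb",
    "cu_grounding_matching_features/semantic_features_keyframe.lmdb",
    "cu_grounding_matching_features/instance_features_jpg.lmdb",
    "cu_grounding_matching_features/instance_features_keyframe.lmdb",
    "cu_grounding_results/grounding_dict.pickle",
    "cu_grounding_dict_files/entity2mention_dict.pickle",
    "cu_grounding_dict_files/id2mentions_dict.pickle",
    "cu_graph_merging_ttl/merged_ttl",
]


def missing_outputs(shared_dir):
    """Hypothetical helper: list expected outputs that are absent under shared_dir."""
    return [
        rel
        for rel in EXPECTED_OUTPUTS
        if not os.path.exists(os.path.join(shared_dir, rel))
    ]
```

Inside the container, `missing_outputs('/root/shared')` would be the natural call, since that is where the `docker run` command mounts the shared folder. An empty list means every expected file and folder is present.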
- Feature_Extraction.ipynb
- Visual_Grounding_mp.ipynb
- Graph_Merging.ipynb
These steps correspond to the "feature extraction", "visual grounding and instance matching", and "graph merging" parts.
[Optional Setting] Running only the Columbia University modules does not require the USC branch, so the merging steps for USC grounding can be commented out. If you want to merge the grounding results from USC, keep the USC merging steps in the code and follow these steps:
1. Generate the intermediate dictionary files to use as input for USC grounding.
2. Run the USC grounding branch to generate the USC grounding results (as a dictionary object).
3. Run our code; the system will then use both types of grounding results as input.
Grounding score threshold: 0.85
Sentence score threshold: 0.6
Number of parallel processes: processes_num = 32
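As an illustration of how these three settings might be used together, here is a minimal sketch. The values come from the settings above, but the comparison rule (both scores must reach their threshold) and the helper names `passes_thresholds` and `split_work` are assumptions made for this example, not the pipeline's actual logic.

```python
# Values from the settings above.
GROUNDING_THRESHOLD = 0.85
SENTENCE_THRESHOLD = 0.6
processes_num = 32


def passes_thresholds(grounding_score, sentence_score):
    """Assumed rule: keep a candidate only if both scores reach their thresholds."""
    return (
        grounding_score >= GROUNDING_THRESHOLD
        and sentence_score >= SENTENCE_THRESHOLD
    )


def split_work(items, n_parts):
    """Split items into at most n_parts contiguous chunks of near-equal size."""
    n_parts = max(1, min(n_parts, len(items)))
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks
```

Each chunk returned by `split_work(documents, processes_num)` could then be handed to one worker of a `multiprocessing.Pool(processes_num)`. Lowering `processes_num` is a reasonable adjustment on machines with fewer cores.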
Download the folder of the LDC corpus data.
File List:
[LDC] 3 files (from ISI, sorted by UIUC)
[LTF] file (from ISI)
Download Link:
Download the shared folders (/columbia_data_root/columbia_vision_shared/) and the visual grounding model files (/columbia_data_root/columbia_visual_grounding_models/) for visual grounding, instance matching and graph merging.
- Data Structure
columbia_data_root (equivalent to ${OUTPUT}/WORKING/)
├── columbia_vision_shared
│ ├── cu_objdet_results
│ ├── cu_grounding_matching_features
│ ├── cu_grounding_results
│ ├── uiuc_ttl_results
│ ├── uiuc_asr_files
│ ├── cu_grounding_dict_files
│ ├── cu_ttl_tmp
│ ├── cu_graph_merging_ttl
│ └── ...
└── columbia_visual_grounding_models
- Specify data paths
corpus_path = '/root/LDC/' # set /root/LDC/ as the corpus data path
working_path = '/root/shared/' # set /root/shared/ as the shared folder path
model_path = '/root/models/' # set /root/models/ as the model folder path
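If the shared folders are not downloaded but created from scratch, the layout shown in the data structure above can be prepared with a few lines of Python. This is a sketch under the assumption that empty folders are an acceptable starting point; the helper name `create_layout` is hypothetical, and the folder names are taken from the tree.

```python
import os

# Subfolders of columbia_vision_shared, as listed in the data structure tree.
SHARED_SUBDIRS = [
    "cu_objdet_results",
    "cu_grounding_matching_features",
    "cu_grounding_results",
    "uiuc_ttl_results",
    "uiuc_asr_files",
    "cu_grounding_dict_files",
    "cu_ttl_tmp",
    "cu_graph_merging_ttl",
]


def create_layout(data_root):
    """Hypothetical helper: create the folder skeleton under data_root."""
    shared = os.path.join(data_root, "columbia_vision_shared")
    for name in SHARED_SUBDIRS:
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(os.path.join(shared, name), exist_ok=True)
    os.makedirs(
        os.path.join(data_root, "columbia_visual_grounding_models"),
        exist_ok=True,
    )
    return shared
```

On the host, `data_root` corresponds to `${OUTPUT}/WORKING/`. Inside the container the same folders appear under `/root/shared/` and `/root/models/` through the volume mappings.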
File List:
[UIUC] 3 files (from UIUC)
[CU obj_det] files (from CU_obj)
[USC] files (from USC)
[CU clustering] files (from CU_face)
[Model] files (from CU_gm)
[CU Visual_Features] files (from CU_gm)
[CU Grounding] files (from CU_gm)
[CU Dictionary] files (from CU_gm)
[CU Merging] files (from CU_gm)
Not needed for testing.
[LDC] 4 files
parent_child_tab = corpus_path + 'docs/'
kfrm_msb = corpus_path + 'docs/masterShotBoundary.msb'
kfrm_path = corpus_path + 'data/video_shot_boundaries/representative_frames'
jpg_path = corpus_path + 'data/jpg/jpg/'
[UIUC] 2 files
video_asr_path = working_path + 'uiuc_asr_files/' + 'ltf_asr/'
video_map_path = working_path + 'uiuc_asr_files/' + 'map_asr/'
[CU obj_det] files
det_results_path_img = working_path + 'cu_objdet_results/' + 'det_results_merged_34a.pkl'
det_results_path_kfrm = working_path + 'cu_objdet_results/' + 'det_results_merged_34b.pkl'
[CU grounding_model] files
grounding_model_path = model_path + 'model_ELMo_PNASNET_VOA_norm'
matching_model_path = model_path + 'model_universal_no_recons_ins_only'
[CU Visual_Features] Common Space Embeddings (for grounding)
out_path_jpg = working_path + 'cu_grounding_matching_features/' + 'semantic_features_jpg.lmdb'
out_path_kfrm = working_path + 'cu_grounding_matching_features/' + 'semantic_features_keyframe.lmdb'
[CU Visual_Features] Instance Matching Features (for instance clustering)
out_path_jpg = working_path + 'cu_grounding_matching_features/' + 'instance_features_jpg.lmdb'
out_path_kfrm = working_path + 'cu_grounding_matching_features/' + 'instance_features_keyframe.lmdb'
[LDC] 5 files
parent_child_tab = corpus_path + 'docs/'
kfrm_msb = corpus_path + 'docs/masterShotBoundary.msb'
kfrm_path = corpus_path + 'data/video_shot_boundaries/representative_frames'
jpg_path = corpus_path + 'data/jpg/jpg/'
ltf_path = corpus_path + 'data/ltf/ltf/'
[UIUC] 4 files
txt_mention_ttl_path = working_path + 'uiuc_ttl_results/'
pronouns_path = working_path + 'uiuc_asr_files/' + 'pronouns.txt'
video_asr_path = working_path + 'uiuc_asr_files/' + 'ltf_asr/'
video_map_path = working_path + 'uiuc_asr_files/' + 'map_asr/'
[CU obj_det]
As mentioned before.
[CU Dictionary] files (for USC)
entity2mention_dict_path = working_path + 'cu_grounding_dict_files/' + 'entity2mention_dict.pickle'
id2mentions_dict_path = working_path + 'cu_grounding_dict_files/' + 'id2mentions_dict.pickle'
[CU Grounding] files
grounding_dict_path = working_path + 'cu_grounding_results/' + 'grounding_dict.pickle'
grounding_log_path = working_path + 'cu_grounding_results/' + 'log_grounding.txt'
[LDC] 3 files
parent_child_tab = corpus_path + 'docs/' # should be sorted
[CU Visual_Features] files
Generated by the module of Visual Feature: cu_grounding_matching_features/semantic_features_jpg.lmdb, semantic_features_keyframe.lmdb, instance_features_jpg.lmdb, instance_features_keyframe.lmdb
[CU Grounding] file
Generated by the module of Visual Grounding: grounding_dict_path = working_path + 'cu_grounding_results/' + version_folder + 'grounding_dict.pickle'
[USC] file
usc_dict_path = working_path + 'usc_grounding_dict/' + version_folder + 'uscvision_grounding_output_cu_format.pickle'
[UIUC] file
txt_mention_ttl_path = working_path + 'uiuc_ttl_results/' + uiuc_run_folder # 1/7th May
[CU clustering] 2 files
cu_ttl_tmp_path = working_path + 'cu_ttl_tmp/'
cu_ttl_path = cu_ttl_tmp_path + version_folder + 'm18/'
cu_ttl_ins_path = cu_ttl_tmp_path + version_folder + 'm18_i_c/'
[CU Merging] files
merged_graph_path = working_path + 'cu_graph_merging_ttl/' + 'merged_ttl/'
Get the data from [UIUC] and [CU obj_det]
Run the feature extraction program
Run the first part of the grounding program to generate the intermediate dict file for [USC]
Merge the intermediate file from [USC]
Run the grounding program and generate results (in parallel)
Get the updated cu_ttl from [CU clustering]
Run the merging program
Check grounding dict result
Check grounded entities and clusters
Check the prefix for entity (columbia or usc ..)
Check merged_ttl with the validator on the local server pineapple
Output: /columbia_vision_shared/merged_ttl/
CU grounding_dict file
'textual_features':array( [ ],
'type_rdf': rdflib.term.URIRef(''),
'f74850e12aef14b83bad4071dde1b2a6cb661':{ },
'IC00121KH.jpg.ldcc':{ },
'IC00121KI.jpg.ldcc':{ },
'IC00121KF.jpg.ldcc':{ },
'IC00121KK.jpg.ldcc':{ }
'textual_features':array( [ ],
'name':'Lavrov Lavrov ',
'sentence':'Lavrov Lavrov in the image'
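To inspect a grounding dictionary such as the one excerpted above, a short loader can list the fields stored under each top-level key. This is a minimal sketch: the helper names `load_grounding_dict` and `summarize` are hypothetical, and the assumption that top-level values are themselves dictionaries is inferred from the excerpt. Unpickling the real file will likely require `numpy` and `rdflib` to be installed, because the excerpt shows `array` and `rdflib.term.URIRef` values.

```python
import pickle


def load_grounding_dict(path):
    """Load a pickled grounding dictionary from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(grounding_dict, max_entries=5):
    """Return (key, sorted field names) for the first few top-level entries."""
    summary = []
    for key, value in list(grounding_dict.items())[:max_entries]:
        # Non-dict values are reported with an empty field list.
        fields = sorted(map(str, value.keys())) if isinstance(value, dict) else []
        summary.append((key, fields))
    return summary
```

Inside the container, `summarize(load_grounding_dict('/root/shared/cu_grounding_results/grounding_dict.pickle'))` gives a quick look at the first entries without printing the feature arrays.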
If you use our Docker container images or code in your research, please cite the following papers.
GAIA: A Fine-grained Multimedia Knowledge Extraction System. Manling Li, Alireza Zareian, Ying Lin, Xiaoman Pan, Spencer Whitehead, Brian Chen, Bo Wu, Heng Ji, Shih-Fu Chang, Clare Voss, Daniel Napierski and Marjorie Freedman. Proc. The 58th Annual Meeting of the Association for Computational Linguistics (ACL2020) Demo Track.
GAIA at SM-KBP 2019 - A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System. Manling Li, Ying Lin, Ananya Subburathinam, Spencer Whitehead, Xiaoman Pan, Di Lu, Qingyun Wang, Tongtao Zhang, Lifu Huang, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Bo Wu, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Yixiang Yao, Jennifer Chen, Eric Berquist, Kexuan Sun, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Pedro Szekely, T.K. Satish Kumar, Arka Sadhu, Ram Nevatia, Miguel Rodriguez, Yifan Wang, Yang Bai, Ali Sadeghian, Daisy Zhe Wang. Proc. Text Analysis Conference (TAC2019).
GAIA - A Multi-media Multi-lingual Knowledge Extraction and Hypothesis Generation System. Tongtao Zhang, Ananya Subburathinam, Ge Shi, Lifu Huang, Di Lu, Xiaoman Pan, Manling Li, Boliang Zhang, Qingyun Wang, Spencer Whitehead, Heng Ji, Alireza Zareian, Hassan Akbari, Brian Chen, Ruiqi Zhong, Steven Shao, Emily Allaway, Shih-Fu Chang, Kathleen McKeown, Dongyu Li, Xin Huang, Xujun Peng, Ryan Gabbard, Marjorie Freedman, Ali Sadeghian, Mayank Kejriwal, Ram Nevatia, Pedro Szekely, Ali Sadeghian and Daisy Zhe Wang. Proc. Text Analysis Conference (TAC2018).
Multi-level Multimodal Common Semantic Space for Image-Phrase Grounding. Hassan Akbari, Svebor Karaman, Surabhi Bhargava, Brian Chen, Carl Vondrick, and Shih-Fu Chang. Proc. International Conference on Computer Vision and Pattern Recognition (CVPR2019).
to: working_path = '/root/'
from: corpus_path = '/root/dryrun/'
to: corpus_path = '/root/LDC'
from: model/
to: model_path = models_root + 'columbia_visual_grounding_models/'
from: /objdet_results/, rpi_ttl/,raw_files/, tmp/, usc_dict/, /cu_ttl,/merged_ttl
to: /cu_objdet_results/, uiuc_ttl_results/,uiuc_asr_files/, cu_grounding_dict_files, usc_grounding_dict/, /cu_ttl_tmp,/cu_graph_merging_ttl
from: all_features/
to: cu_grounding_matching_features/