Qualitative result on scene description task #23
When we were working on this paper, we filtered and gathered the v1.0 version of the 3D-LLM dataset. Since the authors of 3D-LLM have updated their annotations on ScanNet, you may want to use their updated dataset for better performance. Also, since we only train on limited data, there will be hallucinations.
Thanks for the information. Can I improve the model on the scene captioning task by fine-tuning only on unified_3dllm_scene_description? I might also try building on the LL3DA generalist with other scene captioning datasets, such as those released with LEO and SceneVerse.
Yes, training on the latter datasets might help.
Can you tell me which model the above qualitative results are from? Is it fine-tuned on the dataset I'm trying to fine-tune on? There doesn't seem to be a script for it.
@ch3cook-fdu
Could you provide the code you used for parsing the annotations from 3d-llm-scene-description v1.0? I want to try the updated v2.0 annotations with LL3DA.
I found it here. Thanks for your work!
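For reference, a minimal sketch of what such a parsing step might look like, assuming the 3D-LLM scene description annotations are a JSON list of entries with `scene_id` and `answers` fields (the file name and field names here are assumptions, not the repo's actual schema):

```python
import json

def parse_scene_descriptions(path):
    """Parse 3D-LLM style scene description annotations into a
    {scene_id: [description, ...]} mapping.

    The input layout (a JSON list with `scene_id` and `answers`
    keys) is an assumption; adjust to the actual v2.0 schema.
    """
    with open(path, "r") as f:
        raw = json.load(f)

    annotations = {}
    for entry in raw:
        scene_id = entry["scene_id"]
        # each entry may carry one or more reference descriptions
        answers = entry.get("answers", [])
        annotations.setdefault(scene_id, []).extend(answers)
    return annotations

if __name__ == "__main__":
    # hypothetical file name for the updated v2.0 annotations
    anno = parse_scene_descriptions("3d_llm_scene_description_v2.json")
    print(len(anno), "scenes parsed")
```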
@ch3cook-fdu
The evaluation result after fine-tuning on the scene_description task is:
I set "CiDEr" as the criterion while fine-tuning, so CIDEr improved slightly, but the other metrics dropped significantly.
This is normal. Please dig into the definition of these metrics:
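For example, the captioning metrics involved here (CIDEr, BLEU, METEOR, ROUGE-L) can be inspected directly with the `pycocoevalcap` package; a minimal sketch, where the example captions are made up for illustration:

```python
# pip install pycocoevalcap
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.bleu.bleu import Bleu

# ground-truth references and predictions, keyed by a shared id;
# the captions below are made-up illustrations, not real outputs
gts = {"scene0612_00": ["the room contains a bed , a desk and a chair ."]}
res = {"scene0612_00": ["there is a bed next to a desk in the room ."]}

cider_score, _ = Cider().compute_score(gts, res)
bleu_scores, _ = Bleu(4).compute_score(gts, res)

print("CIDEr:", cider_score)
print("BLEU-1..4:", bleu_scores)
```

CIDEr rewards consensus n-grams weighted by TF-IDF, while BLEU/METEOR/ROUGE measure surface n-gram or sequence overlap, so optimizing one criterion alone can easily move the others in the opposite direction.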
Which metric would you recommend improving for the scene description task? Can you suggest any other approaches for it?
Thanks for your advice!
We split the task data manually and treat scene IDs larger than 600 as the validation set.
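A minimal sketch of such a split, assuming ScanNet-style scene IDs like `scene0612_00` and the annotation dictionary from the parsing sketch above (the data structure is hypothetical; the threshold of 600 comes from the comment above):

```python
def split_by_scene_id(annotations, threshold=600):
    """Split annotations into train/val by the numeric part of the
    ScanNet scene id, e.g. 'scene0612_00' -> 612.

    Scene ids greater than `threshold` go to the validation set,
    mirroring the manual split described above.
    """
    train, val = {}, {}
    for scene_id, descs in annotations.items():
        # 'scene0612_00' -> '0612' -> 612
        numeric_id = int(scene_id.split("_")[0].replace("scene", ""))
        (val if numeric_id > threshold else train)[scene_id] = descs
    return train, val
```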
Thanks for the information. However, in the default script
I've been running an evaluation on the unified_3dllm_scene_description dataset with the pretrained generalist checkpoint ll3da-opt-1.3b.pth.
An example result for scene0612_00 is below.
As shown, the predicted result is quite different from the ground-truth annotations.
Is this expected for the generalist model, or am I missing something?
I wonder if it could be much better after fine-tuning on the unified_3dllm_scene_description dataset.
I also noticed that the annotations for the scene description dataset are quite different from the original 3D-LLM annotations. For instance, the annotation for the above scene0612_00 is:
Could you tell me why they are different and how you processed each annotation for the scene description task?