Merged procthor code #39

Open: wants to merge 16 commits into base: main

@jordis-ai2 jordis-ai2 commented Jul 13, 2022

Merging procthor-rearrangement code into main.

Moved procthor-related modules as submodules of rearrange ones

Removed redundant procthor_rearrange_constants

Updated requirements and conda environment

Doc updates + removed ai2thor_rearrangement prefix for data paths in inv tasks

Typo

Update README.md

Fixed base dirs for ProcTHOR commands

Fixed base config vo ithor_mini_valid eval config

Update README.md

Added TOC entry for ProcTHOR pre-training

Doc update + bug fix in data path constant

Missing ctx in call to split_data
@jordis-ai2 jordis-ai2 marked this pull request as ready for review July 17, 2022 06:17
Contributor

@Lucaweihs Lucaweihs left a comment

Very nice :)! There are a huge number of new lines, so I've only given things a quick review; if there are any particular files/functions/lines you'd like me to look at in detail, let me know. I have two main suggested changes:

  1. There are a few cases where we seem to have copied a bunch of code from one place to another with some minor changes. Can we try to merge these?
  2. I'm not a big fan of the git-lfs dependency (indeed it caused a problem when I tried to grab this branch locally). Can we remove all the git-lfs tracked files and put them into their own, separate, repository? Then we can use prior.load_dataset(...) to load data from that repository and keep everything in this repo git-only.

.gitattributes Outdated
Comment on lines 1 to 6
data/2022procthor/mini_val_consolidated.pkl.gz filter=lfs diff=lfs merge=lfs -text
data/2022procthor/split_mini_val filter=lfs diff=lfs merge=lfs -text
data/2022procthor/split_mini_val/** filter=lfs diff=lfs merge=lfs -text
data/2022procthor/split_train filter=lfs diff=lfs merge=lfs -text
data/2022procthor/split_train/** filter=lfs diff=lfs merge=lfs -text
data/2022procthor/train_consolidated.pkl.gz filter=lfs diff=lfs merge=lfs -text
Contributor

Does this mean we have a dependency on git-lfs? Was this the issue we were talking about yesterday regarding how the prior library handles things?

Author

In this case, I was actually not planning on using prior to distribute the datasets, but rather following the current design of directly hosting the data in the repository (with the modification of using git-lfs to keep future changes in the data from piling up in the history). If I'm right, we just need to clone the repo and all the procthor data is available, as with the iTHOR data.

Contributor

Gotcha. I'd prefer we didn't introduce git-lfs as a dependency here, as it's yet another thing to download and install (and getting this repository working in a new environment is already quite a lot for people). In the prior package, I do some things behind the scenes so that a git-lfs binary is downloaded onto the user's machine in the background if they don't have it, which is why the git-lfs dependency isn't as much of an issue there.
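
As a rough illustration of the dependency concern, a setup script could check whether a git-lfs executable is on the PATH before relying on LFS-tracked files. This is only a minimal sketch using the standard library; the helper name has_git_lfs is hypothetical, and this is not a description of what prior does internally.

```python
import shutil


def has_git_lfs() -> bool:
    """Return True if a `git-lfs` executable is found on the PATH."""
    return shutil.which("git-lfs") is not None


if __name__ == "__main__":
    if has_git_lfs():
        print("git-lfs found; LFS-tracked files can be fetched on clone.")
    else:
        # Without git-lfs, a plain clone only contains small pointer files.
        print("git-lfs not found; LFS-tracked data would be missing.")
```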

Author

I see. I guess I wasn't realizing I was able to directly clone the repository precisely because I had already installed git-lfs when I pushed the datasets. I'm not super confident about how to properly use prior to distribute the data, but I guess the example for procthor-10k will do.

Author

@jordis-ai2 jordis-ai2 Jul 21, 2022

In order to keep things more consistent, wouldn't it make more sense to just keep everything in this repo, i.e. without git-lfs given its downsides? We will still have to explain how to install the data in a reachable path (e.g. via additional instructions in README) if we use prior.

Let me know if you're happy with the solution, and I'll add the dataset files to the repository. If prior is actually the preferred choice, then I would create a repo with all the datasets (also the 2021 and 2022 ithor ones) and install all the data via an invoke command calling prior.load_dataset, if that sounds reasonable.

Author

I went ahead and prepared an installer for the ProcTHOR dataset. If the design seems fine, we could port the regular iTHOR ones in a similar way.

README.md (resolved)
mp = mp.get_context("spawn")

# Includes types used in both open and pickup actions:
VALID_TARGET_TYPES = {
Contributor

Can we instead use rearrange.constants.REARRANGE_SIM_OBJECTS?

Author

That set actually includes 'Poster', 'RoomDecor', 'Window', 'Desktop', 'GarbageCan', 'Floor', 'BathtubBasin', 'PaperTowel', 'AppleSliced', 'Bed', 'PotatoSliced', 'Mirror', 'Dresser', 'Toaster', 'Desk', 'TargetCircle', 'StoveKnob', 'Sink', 'Television', 'ShowerHead', 'EggCracked', 'DiningTable', 'Shelf', 'SinkBasin', 'TVStand', 'Ottoman', 'ArmChair', 'LightSwitch', 'ShowerGlass', 'DeskLamp', 'VacuumCleaner', 'Curtains', 'CounterTop', 'StoveBurner', 'TowelHolder', 'Chair', 'Sofa', 'DogBed', 'BreadSliced', 'LettuceSliced', 'Painting', 'Bathtub', 'TomatoSliced', 'CoffeeTable', 'ToiletPaperHanger', 'GarbageBag', 'ShelvingUnit', 'Faucet', 'HandTowelHolder', 'HousePlant', 'SideTable', 'FloorLamp', 'Stool', 'CoffeeMachine', which are not potential targets given the action space, so I would not do that.
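
To make the distinction concrete, here is a small sketch of how a broader object-type set differs from a target set. Both sets are illustrative placeholders, not the repository's constants: 'Poster', 'Window', and 'Floor' come from the list above, while 'Mug' and 'Drawer' are assumed examples of pickupable or openable types.

```python
# Illustrative placeholders, NOT the repository's actual constants.
ALL_SIM_OBJECTS = {"Poster", "Window", "Floor", "Mug", "Drawer"}

# Assumption: types that can be picked up or opened under the action space.
VALID_TARGET_TYPES = {"Mug", "Drawer"}

# Types present in the broad set that could never be a target.
non_targets = ALL_SIM_OBJECTS - VALID_TARGET_TYPES
print(sorted(non_targets))
```

Reusing the broad set directly would therefore admit types that the agent's actions cannot affect.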

datagen/procthor_datagen/datagen_runner_train.py (outdated, resolved)
datagen/procthor_datagen/datagen_runner_valid.py (outdated, resolved)
@@ -0,0 +1,1256 @@
"""Include the Task and TaskSampler to train on a single unshuffle instance."""
Contributor

Doing a diff between this and rearrange/tasks.py suggests to me that we might (and probably should) use the same task for both projects and just have a few switches here and there that tell us what environment we should use and how we should load data.

Author

@jordis-ai2 jordis-ai2 Jul 22, 2022

I slightly modified the original task and subclassed for ProcTHOR. Not exactly the same task for both environments, but almost...
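
The general shape of that approach can be sketched as follows. The class and method names (BaseRearrangeTask, ProcTHORRearrangeTask, describe_reset) are hypothetical stand-ins chosen for illustration; they are not the names used in rearrange/tasks.py.

```python
class BaseRearrangeTask:
    """Hypothetical stand-in for the shared task logic."""

    def describe_reset(self, scene_spec):
        # Assumption: iTHOR scenes are identified by a scene name.
        return f"reset to named scene {scene_spec}"


class ProcTHORRearrangeTask(BaseRearrangeTask):
    """Hypothetical subclass overriding only the environment-specific part."""

    def describe_reset(self, scene_spec):
        # Assumption: ProcTHOR scenes are passed as house dictionaries.
        n_objects = len(scene_spec["objects"])
        return f"reset to house with {n_objects} top-level objects"


print(BaseRearrangeTask().describe_reset("FloorPlan1"))
print(ProcTHORRearrangeTask().describe_reset({"objects": [{}, {}]}))
```

Everything that does not depend on the environment stays in the base class, so the two projects share one implementation.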

rearrange/procthor_rearrange/utils.py (outdated, resolved)
jordis-ai2 and others added 4 commits July 22, 2022 11:51
@mariiak2021

Hi @jordis-ai2, @Lucaweihs, I'm looking forward to working on the ai2thor-rearrangement challenge using the ProcTHOR rooms dataset. Can you please help me understand how I can do so? Which branch can I use now? Could you also share an example script for the one-phase or two-phase rearrangement challenge using the ProcTHOR dataset splits? Thank you a lot!

@jordis-ai2
Author

jordis-ai2 commented Aug 3, 2022

Hi @mariiak2021. It's currently not possible to use the ProcTHOR-related code base, since the underlying 10k houses dataset isn't yet publicly available. We're hoping to release everything soon, though, so you will be able to start training your own models in a very short time 👍

@mariiak2021

Hi @jordis-ai2, thank you for the fast answer!
Do you mean the allenai/houses dataset? Is it based on the procthor-10k dataset from the prior package? If yes, can you please highlight the differences? It will help me a lot :)
Can you also tell me the expected release date for allenai/houses, if it's already known?
Thank you!

@jordis-ai2
Author

I think @mattdeitke or @Lucaweihs would have more accurate answers to these questions, but here's what I know: The currently missing dataset is composed of 10,000 ProcTHOR one- and two-room houses, out of which we used 2,500 for our rearrangement experiments. As far as I know, this dataset is not yet available in our upcoming distribution package. Once it is there and the ongoing PR is merged, our experiments should be repeatable. As mentioned above, this is planned for the next few weeks.

@mariiak2021

Hi @jordis-ai2, so sorry I missed your reply! Thanks a lot for the update. :) I will wait for the release, but in the meantime I played around with your code and created my own dataset from ProcTHOR-10K.

I wanted to use it within the rearrangement challenge, but ran into a problem with the instance_detections2D dictionary: it now returns only high-level objects, without children. To be more precise, the only entries I can see in instance_detections2D are those that were not under "objects" (e.g. "wall" or "door"). The objects affected by these lines:

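from collections import defaultdict

def fix_object_names(self, house):
    # Running count of how many times each assetId has been seen.
    known_assets = defaultdict(int)
    # Traverse top-level objects and their nested children.
    to_traverse = house["objects"][:]
    while to_traverse:
        cur_obj = to_traverse.pop()
        # Override the id with "<assetId>_<running index>".
        cur_obj["id"] = f'{cur_obj["assetId"]}_{known_assets[cur_obj["assetId"]]}'
        known_assets[cur_obj["assetId"]] += 1
        if "children" in cur_obj:
            to_traverse.extend(cur_obj["children"][:])

    return house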

don't appear at all in the instance_detections2D dictionary.
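
For reference, the effect of that override can be seen on a toy house. This is a self-contained sketch: the function mirrors the snippet above without self, and the toy house is invented for illustration ('Bed_18_2' and 'Bed|2|0' appear later in this thread, while 'Pillow_1' and 'Chair_3' are made-up names).

```python
from collections import defaultdict


def fix_object_names(house):
    # Running count of how many times each assetId has been seen.
    known_assets = defaultdict(int)
    to_traverse = house["objects"][:]
    while to_traverse:
        cur_obj = to_traverse.pop()
        cur_obj["id"] = f'{cur_obj["assetId"]}_{known_assets[cur_obj["assetId"]]}'
        known_assets[cur_obj["assetId"]] += 1
        if "children" in cur_obj:
            to_traverse.extend(cur_obj["children"][:])
    return house


# Toy house; asset names and ids are illustrative only.
toy_house = {
    "objects": [
        {
            "assetId": "Bed_18_2",
            "id": "Bed|2|0",
            "children": [{"assetId": "Pillow_1", "id": "Pillow|2|1"}],
        },
        {"assetId": "Chair_3", "id": "Chair|2|2"},
    ]
}

fixed = fix_object_names(toy_house)
top_level_ids = sorted(obj["id"] for obj in fixed["objects"])
child_id = fixed["objects"][0]["children"][0]["id"]
print(top_level_ids, child_id)
```

Both top-level objects and nested children lose their original pipe-separated ids, which matches the set of objects reported as missing from the detections.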
Could you suggest what the problem might be? Did I miss a necessary change somewhere?
Thank you!

@jordis-ai2
Author

jordis-ai2 commented Aug 15, 2022

Good job! In principle it should work just as well with those houses.

If I understand it correctly, the problem you describe is a mismatch between the objects available under metadata["objects"] (where, hopefully, the overridden IDs are actually shown) and metadata["instance_detections2D"].

I assume you initialized THOR with renderInstanceSegmentation=True (https://ai2thor.allenai.org/ithor/documentation/initialization/#initialization-renderinstancesegmentation).

I'm not sure which version of THOR you're using, but my guess is that instance segmentation might not be (fully) supported for ProcTHOR yet.

@mariiak2021

Hi @jordis-ai2,
To get instance_detections2D, I'm using renderInstanceSegmentation=True and reading controller.last_event.instance_detections2D, as specified here: https://ai2thor.allenai.org/ithor/documentation/environment-state/#event-instance_detections2d

In the metadata["objects"] the overridden IDs are shown: instead of 'id': 'Bed|2|0' we will see 'id': 'Bed_18_2_0' for example, which is the combination of 'assetId': 'Bed_18_2' + 0 as it's the only bed in the scene.

I have tried different versions of AI2-THOR, with different problems:

  1. Using THOR_COMMIT_ID = "90eac925dc750818890069e3131f899998dc58b4" and the procthor_reset function with the "CreateHouse" action, I get the following error:
RuntimeError: NullReferenceException: Object reference not set to an instance of an object. trace:   at UnityStandardAssets.Characters.FirstPerson.BaseFPSAgentController+<>c__DisplayClass283_0.<CreateHouse>b__2 (System.String id) [0x00003] in <6afd8f78be764eeba7be30f178fa1cb8>:0
  at System.Linq.Enumerable+WhereEnumerableIterator`1[TSource].GetCount (System.Boolean onlyIfCheap) [0x0001c] in <351e49e2a5bf4fd6beabb458ce2255f3>:0
  at System.Linq.Enumerable.Count[TSource] (System.Collections.Generic.IEnumerable`1[T] source) [0x00029] in <351e49e2a5bf4fd6beabb458ce2255f3>:0
  at UnityStandardAssets.Characters.FirstPerson.BaseFPSAgentController.CreateHouse (Thor.Procedural.Data.ProceduralHouse house) [0x00088] in <6afd8f78be764eeba7be30f178fa1cb8>:0
  at (wrapper managed-to-native) System.Reflection.MonoMethod.InternalInvoke(System.Reflection.MonoMethod,object,object[],System.Exception&)
  at System.Reflection.MonoMethod.Invoke (System.Object obj, System.Reflection.BindingFlags invokeAttr, System.Reflection.Binder binder, System.Object[] parameters, System.Globalization.CultureInfo culture) [0x00032] in <695d1cc93cca45069c528c15c9fdd749>:0
  2. Using THOR_COMMIT_ID = "391b3fae4d4cc026f1522e5acf60953560235971", "CreateHouse" is not available, so I simply use self.controller.reset(scene=house) instead. In this case I'm able to get the instance_detections2D dictionary, but only for general objects.
    For example, instead of getting the following keys:
window|2|1
Bed|2|0
wall|2|0.00|0.00|0.00|3.62
wall|2|0.00|0.00|7.23|0.00
room|2

I have only:

window|2|1
wall|2|0.00|0.00|0.00|3.62
wall|2|0.00|0.00|7.23|0.00
room|2

Here the bed is missing, since it was inside "objects", whose IDs were changed.
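
The comparison between the two key listings above can be expressed as a set difference. This sketch hard-codes the keys quoted in this comment; with a live controller, the observed set would presumably come from controller.last_event.instance_detections2D.keys().

```python
# Keys expected without the ID override (copied from the listing above).
expected_keys = {
    "window|2|1",
    "Bed|2|0",
    "wall|2|0.00|0.00|0.00|3.62",
    "wall|2|0.00|0.00|7.23|0.00",
    "room|2",
}

# Keys actually observed after the ID override.
observed_keys = {
    "window|2|1",
    "wall|2|0.00|0.00|0.00|3.62",
    "wall|2|0.00|0.00|7.23|0.00",
    "room|2",
}

# Detections that disappeared once the ids were overridden.
missing = expected_keys - observed_keys
print(missing)
```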

The funny thing is that if I don't use the fix_object_names function, instance_detections2D returns correct results.
The following code reproduces both the expected behavior (with the fix_object_names call commented out) and the incorrect one:

!pip install --extra-index-url https://ai2thor-pypi.allenai.org ai2thor==0+391b3fae4d4cc026f1522e5acf60953560235971 &> /dev/null
!pip install --upgrade ai2thor-colab &> /dev/null
import ai2thor
import ai2thor_colab

from ai2thor.controller import Controller
from ai2thor_colab import (
    plot_frames,
    show_objects_table,
    side_by_side,
    overlay,
    show_video
)

ai2thor_colab.start_xserver()

"AI2-THOR Version: " + ai2thor.__version__
controller = Controller()
!pip install prior &> /dev/null
import prior
houses = prior.load_dataset("procthor-10k")
from collections import defaultdict
house = houses["train"][12]
def fix_object_names(house):
    # Running count of how many times each assetId has been seen.
    known_assets = defaultdict(int)
    to_traverse = house["objects"][:]
    while to_traverse:
        cur_obj = to_traverse.pop()
        # Override the id with "<assetId>_<running index>".
        cur_obj["id"] = f'{cur_obj["assetId"]}_{known_assets[cur_obj["assetId"]]}'
        # cur_obj["name"] = f'{cur_obj["assetId"]}_{known_assets[cur_obj["assetId"]]}'
        known_assets[cur_obj["assetId"]] += 1
        if "children" in cur_obj:
            to_traverse.extend(cur_obj["children"][:])

    return house
# Comment out the next line to get the expected detections.
house = fix_object_names(house)
controller.reset(scene=house, renderInstanceSegmentation=True)
event = controller.step(action="RotateLeft")
detections = controller.last_event.instance_detections2D
for k in detections.keys():
    print(k)
# print(controller.last_event.metadata["objects"])
from PIL import Image
Image.fromarray(controller.last_event.frame)

Do you have any idea what I can do?

  • For example, is it possible to avoid the ID override? Can you explain why it is needed, and whether it is the IDs specifically that must be overridden?
  • Or can I somehow change the IDs of the objects passed to the instance_detections2D frame so that they match?
  • Or can I override the IDs first, when loading data from the .pkl files to set up rearrangements, and then restore them to their original values so that I can use the instance_detections2D frames?
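
The third option could be approached by recording the original IDs while overriding them, so they can be restored later. The following is only a sketch: override_ids and restore_ids are hypothetical helper names, and whether restoring the IDs actually makes the objects reappear in instance_detections2D has not been verified here.

```python
from collections import defaultdict


def override_ids(house):
    """Override ids as fix_object_names does; return a {new_id: old_id} map."""
    known_assets = defaultdict(int)
    new_to_old = {}
    to_traverse = house["objects"][:]
    while to_traverse:
        obj = to_traverse.pop()
        new_id = f'{obj["assetId"]}_{known_assets[obj["assetId"]]}'
        known_assets[obj["assetId"]] += 1
        new_to_old[new_id] = obj["id"]
        obj["id"] = new_id
        to_traverse.extend(obj.get("children", []))
    return new_to_old


def restore_ids(house, new_to_old):
    """Put back the original ids recorded by override_ids."""
    to_traverse = house["objects"][:]
    while to_traverse:
        obj = to_traverse.pop()
        obj["id"] = new_to_old.get(obj["id"], obj["id"])
        to_traverse.extend(obj.get("children", []))
    return house


# Toy house using the bed example from this thread.
house = {"objects": [{"assetId": "Bed_18_2", "id": "Bed|2|0"}]}
mapping = override_ids(house)
overridden_id = house["objects"][0]["id"]
restore_ids(house, mapping)
restored_id = house["objects"][0]["id"]
print(overridden_id, restored_id)
```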

@jordis-ai2
Author

First of all, if the new IDs are used for objects but not for detections without any error message, it seems to be a THOR issue, which will probably disappear once ProcTHOR is officially released.

Please keep in mind you don't really need to access instance segmentation to solve the rearrangement task. Indeed, in the official challenge, the only available visual inputs are RGB and depth. I.e., for the intended scope of this PR, you can just leave the current ID override on (this is how our models were trained and evaluated).

It's possible that you don't need to override object IDs any longer. With the original procthor-10k and supporting THOR build, some object IDs appeared more than once in the same scene, leading to undesired behavior.
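
One way to check whether a given house still needs the override is to look for repeated IDs before changing anything. This is a minimal sketch assuming the house layout used elsewhere in this thread (an "objects" list with optional "children"); the helper name duplicated_ids and the toy ids are hypothetical.

```python
from collections import Counter


def duplicated_ids(house):
    """Return the sorted object ids that occur more than once in a house."""
    counts = Counter()
    to_traverse = house["objects"][:]
    while to_traverse:
        obj = to_traverse.pop()
        counts[obj["id"]] += 1
        to_traverse.extend(obj.get("children", []))
    return sorted(obj_id for obj_id, n in counts.items() if n > 1)


# Toy house with one repeated id; the values are made up for illustration.
toy_house = {
    "objects": [
        {"id": "Chair|2|1"},
        {"id": "Table|2|0", "children": [{"id": "Chair|2|1"}]},
    ]
}
repeated = duplicated_ids(toy_house)
print(repeated)
```

An empty result would suggest the original IDs are already unique for that house.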

For the other questions, I'm afraid they're out of the scope of this PR.

I don't want to repeat myself, but, if you can wait for a few weeks, you will be able to use the official release, where things will be less likely to suddenly change. In any case, I think you're in the right direction 👍

@mariiak2021

Hi @jordis-ai2, thank you for your answers! I really need instance_detections2D to get 2D bounding boxes for training my model. :)
Anyway, I will wait for the official release. Thanks for the great job you are all doing!
