minimal example running on NCS #8
Comments
Happy days! I'm delighted you have got it working! I am currently collecting another dataset but hope to get it running on the NCS soon. |
I finally got a chance to test it out tonight. It looks like it is working but I had to remove an encoding layer as my images are 640x480 and a 512 patch size is too big. I changed it to look like this: Is that the correct way to do it? Also I uncommented the modelTester code (I love me some stats) and got the following error: It seems to be picking up the image size, not the patch size. How do I best mix patches with the testing code? |
yeah that (256,256) -> (127,127) all looks good. with respect to the (127,127) you're sadly hitting some hard-coded stuff i have in there... it's this bit of code, which is an explicit slice/reshape workaround for the size/shape of the 2d output being wrong. it's clumsy i know, but that could be made configurable (until there's a fix...) |
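As a rough illustration of what "configurable" could look like, here is a minimal NumPy sketch that takes the output size as a parameter instead of hard-coding (127,127). The function name `crop_to_output` and its arguments are hypothetical; this is not the repo's actual workaround code, just the slicing idea in isolation.

```python
import numpy as np

def crop_to_output(pred, out_hw=(127, 127)):
    """Slice a (H, W) or (H, W, C) prediction down to out_hw, keeping the top-left corner.

    Hypothetical helper: the default mirrors the (127, 127) size mentioned
    in the thread, but any size that fits inside `pred` can be passed.
    """
    out_h, out_w = out_hw
    if pred.shape[0] < out_h or pred.shape[1] < out_w:
        raise ValueError(
            f"prediction {pred.shape[:2]} is smaller than requested output {out_hw}"
        )
    return pred[:out_h, :out_w]

# Example: a 128x128 single-channel prediction cropped to the default size.
cropped = crop_to_output(np.zeros((128, 128, 1)))
print(cropped.shape)
```

Passing the size in from a flag or config value would keep the training and testing code agreeing on one shape, whatever that shape ends up being.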
ahh, would it be quicker for me to just crop the test images to 127,127 or will it work if I change the shape of the output? |
changing the code to match your size would probably be the quickest... |
I finally got a bit of time this morning and managed to get it working from start to NCS finish! Unfortunately the results were not great. I went back to train.py and uncommented the test code to see how well the training was working. There was an issue with the training network being set up for a certain patch size and the test network being used on the full image, so I turned off the patches and changed the image shape in data.py to resize to 239x319. The network topology and labels now match up: but when I run the training I get the following error: so 240x320 is 76800, but I cannot see anywhere in the code where the tensor is being set to 19200, and I am starting to realise that TensorFlow is difficult to debug, to say the least! Do you have any suggestions for seeing where this is getting set, or for debugging TensorFlow models? |
I thought it may have been the shape of my images, so I resized them to match the patch size. I also resized my labels to 64x64 with nearest neighbour interpolation. I get the same error despite the dumped shapes of the models being identical. So it works with the patch flag but not without. This means it is either something wrong with my labels or I'm missing something in xys_iterator. I tried tfdbg but it is hard to see what is going on.... |
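For reference, nearest neighbour resizing of a label map can be sketched in plain NumPy as below. This is only an illustration of the technique, assuming a 2D integer label array; `resize_nearest` is a hypothetical name and not a function from the repo or from data.py.

```python
import numpy as np

def resize_nearest(label, out_hw):
    """Resize a (H, W) label map to out_hw by picking source pixels, never blending them."""
    in_h, in_w = label.shape[:2]
    out_h, out_w = out_hw
    # For each output row/column, pick the source index it maps back to.
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return label[rows[:, None], cols]

# Example: a 4x4 label map shrunk to 2x2 keeps only values that already existed.
label = np.arange(16).reshape(4, 4)
small = resize_nearest(label, (2, 2))
print(small)
```

Because every output pixel is copied from a single source pixel, class ids are never averaged into values that do not correspond to any class, which is why this mode is the usual choice for labels.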
yeah, it's been a nightmare to debug... i've also made this repo now more complicated than it needs to be because i've been confounding two things: 1) running a patch batched model with fixed sized inference to run on the NCS and 2) training patch based and running on arbitrary sized output for my meta learning experiments; i should really move 2) into its own repo since it requires different things than 1) on the data pipeline.... but that's an aside... are you trying to run with an output of (239,319) on the NCS? i recall having a problem where i couldn't get anything over (127,127) as output on the stick... can you share a larger stack trace around the |
It seemed to compile and run with 239,319 but the output was a mess. I will resize it if I hit the same 127,127 limitation. Here is the stack trace and some additional information. As the expected size (19200) is 120 x 160 and the value being passed (76800) is 240 x 320, I think the output of a particular layer might be the wrong size. I used the slim model analyzer to get more info but still cannot see anything wrong.
$ ./train.py --run $RUN --steps $STEPS --train-steps 1000 --train-image-dir $DATADIR/train/ --test-image-dir $DATADIR/test/ --label-dir $DATADIR/labels/ --no-use-batch-norm --no-use-skip-connections --width 640 --height 480 --label-rescale 0.25
|
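One way to sanity-check the two element counts is to work them out from the flags in the command above. The sketch below assumes the rescale factor is applied to width and height independently; whether train.py uses `--label-rescale` exactly this way is an assumption, and `label_elements` is a hypothetical helper.

```python
def label_elements(width, height, rescale):
    """Return (scaled_width, scaled_height, element_count) for a rescaled label map."""
    w = round(width * rescale)
    h = round(height * rescale)
    return w, h, w * h

# 640x480 at 0.25 gives 160x120, which matches the expected 19200.
print(label_elements(640, 480, 0.25))
# 640x480 at 0.5 gives 320x240, which matches the 76800 being passed in.
print(label_elements(640, 480, 0.5))
```

Under that assumption, the expected 19200 lines up with `--label-rescale 0.25`, while the 76800 lines up with data at half resolution, so the mismatch may come from two different scale factors rather than from a single layer.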
thanks for waiting jono, i still haven't had a chance to look at this yet... hopefully this afternoon the planets will align for some free time :D |
No rush! I only get a chance to look at it at the weekend atm |
I am going to try rewriting the code over the weekend to work with my images. Is the NCS_POC still the latest version or should I be working off the master branch? |
Yeah. I still haven't merged it back yet, sorry (since it also needs some cleanup), but it demonstrates the things I needed to do. Good luck! |
Wow, super excited that you got this working on the NCS as well. At some point I'm going to try to get this running on our DepthAI platform (here) so that you can know the physical location of the bees in Cartesian coordinates (x,y,z), in centimeters, and so be able to map their 3D flight patterns. |
cc @squeakus
finally have a version of this network running on the NCS :)
( this image was calculated from the stick )
currently all code is on a hacky branch, ncs_poc, just to prove things work; have to clean up a fair bit and merge everything back to master. see this README for repro instructions