diff --git a/notebooks/3d_vision/FIG/3D vision.PNG b/notebooks/3d_vision/FIG/3D vision.PNG new file mode 100644 index 0000000..8df37a5 Binary files /dev/null and b/notebooks/3d_vision/FIG/3D vision.PNG differ diff --git a/notebooks/3d_vision/FIG/3D.jpeg b/notebooks/3d_vision/FIG/3D.jpeg new file mode 100644 index 0000000..57ccec5 Binary files /dev/null and b/notebooks/3d_vision/FIG/3D.jpeg differ diff --git a/notebooks/3d_vision/FIG/adv.PNG b/notebooks/3d_vision/FIG/adv.PNG new file mode 100644 index 0000000..2f22b6a Binary files /dev/null and b/notebooks/3d_vision/FIG/adv.PNG differ diff --git a/notebooks/3d_vision/FIG/pointnet.PNG b/notebooks/3d_vision/FIG/pointnet.PNG new file mode 100644 index 0000000..300bdfe Binary files /dev/null and b/notebooks/3d_vision/FIG/pointnet.PNG differ diff --git a/notebooks/3d_vision/index.ipynb b/notebooks/3d_vision/index.ipynb new file mode 100644 index 0000000..7b4cbc6 --- /dev/null +++ b/notebooks/3d_vision/index.ipynb @@ -0,0 +1,306 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "TanQk2rJZ0TO" + }, + "source": [ + "
With major recent developments in 3D sensors, large-scale 3D geometry datasets, and advances in deep 3D learning, 3D computer vision is playing an increasingly important role in many fields. This includes a varied range of cutting-edge applications spanning robotics, autonomous navigation and localization, 3D scene reconstruction, 3D scene understanding, 3D tracking and surveillance, digital city modeling, and virtual and augmented reality.
\n", + "The goal of this Research Topic is to present high-quality research on 3D computer vision that addresses challenging problems in 3D data processing, proposes new developments for reliable 3D data acquisition, and advances the state of the art in 3D applications.
\n", + "Some of the areas of focus are:
\n", + "In this notebook, we will focus on the first application.
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OtSNQfxEdrGP" + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hUVpctDiZ0TU" + }, + "source": [ + "\n", + "Show code that reads the ModelNet40 dataset and visualizes some of its samples.
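ModelNet40 ships its shapes as triangle meshes in the OFF format, so a loader has two steps: parse vertices/faces, then sample a fixed-size point cloud from the surface. A minimal sketch, assuming a plain OFF parser and an area-weighted sampler (`read_off` and `sample_points` are illustrative helper names, not from this repo; a toy tetrahedron stands in for a real ModelNet40 file):

```python
import io
import numpy as np

def read_off(f):
    """Parse a mesh in OFF format into (vertices, faces) arrays."""
    if f.readline().strip() != "OFF":
        raise ValueError("not an OFF file")
    n_verts, n_faces, _ = map(int, f.readline().split())
    verts = np.array([list(map(float, f.readline().split())) for _ in range(n_verts)])
    faces = np.array([list(map(int, f.readline().split()))[1:] for _ in range(n_faces)])
    return verts, faces

def sample_points(verts, faces, n=1024, seed=0):
    """Sample n points uniformly from the mesh surface (area-weighted, barycentric)."""
    rng = np.random.default_rng(seed)
    tri = verts[faces]                                   # (F, 3, 3) triangle corners
    # triangle areas from the cross product of two edge vectors
    areas = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    idx = rng.choice(len(faces), size=n, p=areas / areas.sum())
    u, v = rng.random(n), rng.random(n)
    flip = u + v > 1                                     # reflect to stay inside the triangle
    u[flip], v[flip] = 1 - u[flip], 1 - v[flip]
    t = tri[idx]
    return t[:, 0] + u[:, None] * (t[:, 1] - t[:, 0]) + v[:, None] * (t[:, 2] - t[:, 0])

# Toy example: a unit tetrahedron instead of a real ModelNet40 .off file.
off_text = "OFF\n4 4 6\n0 0 0\n1 0 0\n0 1 0\n0 0 1\n3 0 1 2\n3 0 1 3\n3 0 2 3\n3 1 2 3\n"
verts, faces = read_off(io.StringIO(off_text))
pts = sample_points(verts, faces, n=1024)
print(pts.shape)  # (1024, 3)

# To visualize a sample:
# import matplotlib.pyplot as plt
# ax = plt.figure().add_subplot(projection="3d")
# ax.scatter(*pts.T, s=1)
# plt.show()
```

Note that some files in the original ModelNet40 release fuse the counts onto the `OFF` header line; a robust loader should handle that case.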
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# pointnet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Yk3oVhZwZ0TW" + }, + "source": [ + "\n", + "From a data structure point of view, a point cloud is an unordered set of vectors. While most works in deep learning focus on regular input representations like sequences (in speech and language processing), images, and volumes (video or 3D data), not much work has been done in deep learning on point sets.
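PointNet handles this lack of ordering by applying the same weights to every point and then aggregating with a symmetric function (a max-pool), so the result cannot depend on point order. A toy NumPy sketch of that idea, with random untrained weights standing in for the learned MLP:

```python
import numpy as np

rng = np.random.default_rng(0)

# A shared per-point "MLP": identical weights applied to every point,
# followed by a symmetric max-pool that collapses the set to one vector.
W1, W2 = rng.standard_normal((3, 64)), rng.standard_normal((64, 128))

def global_feature(points):                 # points: (n, 3)
    h = np.maximum(points @ W1, 0)          # shared layer + ReLU
    h = np.maximum(h @ W2, 0)
    return h.max(axis=0)                    # max over the n points -> (128,)

cloud = rng.standard_normal((1024, 3))
shuffled = cloud[rng.permutation(len(cloud))]

# Reordering the points leaves the global feature unchanged.
print(np.allclose(global_feature(cloud), global_feature(shuffled)))  # True
```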
\n", + "PointNet Architecture:
\n", + "PointNet is a deep learning framework that directly consumes unordered point sets as input. A point cloud is represented as a set of 3D points $\{P_i \mid i = 1, \dots, n\}$, where each point $P_i$ is a vector of its $(x, y, z)$ coordinates plus extra feature channels such as color, normals, etc. For simplicity and clarity, unless otherwise noted, we use only the $(x, y, z)$ coordinates as the point channels.
\n", + "For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud. The network outputs $k$ scores, one for each of the $k$ candidate classes. For semantic segmentation, the input can be a single object for part region segmentation, or a sub-volume from a 3D scene for object region segmentation. The model outputs $n \times m$ scores, one for each of the $n$ points and each of the $m$ semantic subcategories.
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Show the code of the PointNet architecture and evaluate a pre-trained model on the ModelNet40 data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Introduce other useful point cloud models and link to their papers.
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "Reference the original 3D adversarial attack paper and introduce its procedure.
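Point-perturbation attacks of this kind typically take small gradient steps that raise a wrong class's score while bounding how far each point may move. A toy sketch of that loop, assuming a linear stand-in classifier with an analytic gradient (a real attack backpropagates through the trained point cloud network instead; all names and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in classifier: class scores are the average per-point linear
# response, so its gradient w.r.t. each point is available in closed form.
W = rng.standard_normal((3, 4))                      # 4 hypothetical classes

def scores(points):
    return (points @ W).mean(axis=0)                 # (4,) class scores

cloud = rng.standard_normal((256, 3))                # clean point cloud
orig_label = int(scores(cloud).argmax())
target = (orig_label + 1) % 4                        # an arbitrary wrong class

# Iterated FGSM-style loop: nudge every point toward a higher target-class
# score, clipping the per-coordinate perturbation to a small budget.
adv, step, budget = cloud.copy(), 0.01, 0.3
for _ in range(60):
    # gradient of (target score - original score) w.r.t. every point
    grad = np.broadcast_to(W[:, target] - W[:, orig_label], adv.shape)
    adv = adv + step * np.sign(grad)
    adv = cloud + np.clip(adv - cloud, -budget, budget)

print("label before:", orig_label, "after:", int(scores(adv).argmax()))
```

The same structure carries over to the real setting: only `scores` and the gradient computation change, and attacks often add a Chamfer or Hausdorff distance term instead of a plain coordinate clip to keep the perturbed cloud visually similar.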
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "List of papers:
\n", + "Labs and groups working on 3D Computer Vision:
\n", + "