diff --git a/notebooks/3d_vision/FIG/3D vision.PNG b/notebooks/3d_vision/FIG/3D vision.PNG new file mode 100644 index 0000000..8df37a5 Binary files /dev/null and b/notebooks/3d_vision/FIG/3D vision.PNG differ diff --git a/notebooks/3d_vision/FIG/3D.jpeg b/notebooks/3d_vision/FIG/3D.jpeg new file mode 100644 index 0000000..57ccec5 Binary files /dev/null and b/notebooks/3d_vision/FIG/3D.jpeg differ diff --git a/notebooks/3d_vision/FIG/adv.PNG b/notebooks/3d_vision/FIG/adv.PNG new file mode 100644 index 0000000..2f22b6a Binary files /dev/null and b/notebooks/3d_vision/FIG/adv.PNG differ diff --git a/notebooks/3d_vision/FIG/pointnet.PNG b/notebooks/3d_vision/FIG/pointnet.PNG new file mode 100644 index 0000000..300bdfe Binary files /dev/null and b/notebooks/3d_vision/FIG/pointnet.PNG differ diff --git a/notebooks/3d_vision/index.ipynb b/notebooks/3d_vision/index.ipynb new file mode 100644 index 0000000..7b4cbc6 --- /dev/null +++ b/notebooks/3d_vision/index.ipynb @@ -0,0 +1,306 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "TanQk2rJZ0TO" + }, + "source": [ + "
\n", + "\t\n", + "\t\t
\n", + "\t\t\t\t

\n", + "\t\t\t\t

\n", + "In the Name of God\n", + "

\n", + "
\n", + " Sharif University of Technology\n", + "
\n", + "Computer Engineering Department\n", + "

\n", + "Artificial Intelligence Course\n", + "
\n", + "\t\t\t
\n", + " MohammadHossein Rohban\n", + "
\n", + "Fall 2021\n", + "
\n", + "\t\t
\n", + "\t\t\t
\n", + "3D Computer Vision\n", + "
\n", + "\t\t
\n", + "\t\t
\n", + "Kimia Noorbakhsh\n", + "
\n", + "\t\t
\n", + "\t\t\n", + "\t\t
\n", + "\t\t\t

Table of Contents

\n", + "\t\t\t\n", + "\t\t
\n", + "\t
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cwCRBwuzZ0TT" + }, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "Introduction\n", + " \n", + "\t\t

\n", + "\t\t
\n", + "

With major recent developments in 3D sensors, large-scale 3D geometry datasets, and advances in deep 3D learning, 3D computer vision is playing an increasingly important role in many fields. This includes a varied range of cutting-edge applications spanning robotics, autonomous navigation and localization, 3D scene reconstruction, 3D scene understanding, 3D tracking and surveillance, digital city modeling, and virtual and augmented reality.

\n", + "

The goal of this notebook is to present high-quality research on 3D computer vision that addresses challenging problems in 3D data processing, proposes new developments for reliable 3D data acquisition, and advances the state of the art in 3D applications.

\n", + "

Some of the areas of focus are:

\n", + "
    \n", + "
  1. 3D analysis and modeling (3D deep learning and 3D machine learning, 3D recognition, 3D segmentation, 3D detection, 3D registration)\n", + "
  2. 3D acquisition (calibration, structure from motion and SLAM, computational photography, 3D reconstruction)\n", + "
  3. 3D applications (robotics, medical applications, sports applications, digital fabrication)\n", + "
\n", + "

In this notebook, we will focus on the first of these areas: 3D analysis and modeling.

" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OtSNQfxEdrGP" + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "hUVpctDiZ0TU" + }, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "3D Data Representation\n", + " \n", + "\t\t

\n", + "\t\t
\n", + "
    \n", + "
  1. 3D Point Clouds\n", + "
  2. 3D Models\n", + "
  3. 3D Mesh\n", + "
  4. Voxel-based models\n", + "
  5. Parametric Model (CAD)\n", + "
  6. Other representations\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

Here we show code that reads the ModelNet40 dataset and visualizes some of its samples.

" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# pointnet" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Yk3oVhZwZ0TW" + }, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "Deep Learning on Unsorted Sets (The rise of Pointnet)\n", + " \n", + "\t\t

\n", + "\t\t
\n", + "

From a data-structure point of view, a point cloud is an unordered set of vectors. While most work in deep learning focuses on regular input representations such as sequences (in speech and language processing), images, and volumes (video or 3D data), little work had been done on deep learning for point sets at the time PointNet was proposed.

\n", + "

PointNet Architecture:

\n", + "

PointNet is a deep learning framework that directly consumes unordered point sets as input. A point cloud is represented as a set of 3D points $\{P_i \mid i = 1, \dots, n\}$, where each point $P_i$ is a vector of its $(x, y, z)$ coordinates plus extra feature channels such as color, normals, etc. For simplicity and clarity, unless otherwise noted, we use only the $(x, y, z)$ coordinates as our points' channels.

\n", + "

For the object classification task, the input point cloud is either sampled directly from a shape or pre-segmented from a scene point cloud. The network outputs $k$ scores, one for each of the $k$ candidate classes. For semantic segmentation, the input can be a single object (for part region segmentation) or a sub-volume of a 3D scene (for object region segmentation); the model then outputs $n \times m$ scores, one for each of the $n$ points and each of the $m$ semantic subcategories.

" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Show the code of pointnet archtecture and evaluate a pre-trained model on the ModelNet40 data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

Other useful point cloud models build on the same ideas: PointNet++ (hierarchical feature learning on nested point neighborhoods), DGCNN (edge convolutions on dynamically recomputed k-NN graphs), and PointCNN (learned X-transformations on local point sets). Their papers are worth reading alongside the original PointNet.

" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "3D Adversarial Attacks\n", + " \n", + "\t\t

\n", + "\t\t
\n", + "

Deep models on point clouds are also vulnerable to adversarial examples. The original 3D adversarial attack paper, "Generating 3D Adversarial Point Clouds" (Xiang et al., CVPR 2019), extends the Carlini-Wagner formulation to point clouds: it optimizes a perturbation that either shifts existing points slightly or adds a small number of new points, minimizing a distance penalty (e.g., Chamfer or Hausdorff) while pushing the classifier's logits toward a chosen target class.

" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "Papers and Labs regarding 3D Computer vision\n", + " \n", + "\t\t

\n", + "\t\t
\n", + "

List of papers: PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation (Qi et al., CVPR 2017); PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space (Qi et al., NeurIPS 2017); Dynamic Graph CNN for Learning on Point Clouds (Wang et al., ACM TOG 2019); Generating 3D Adversarial Point Clouds (Xiang et al., CVPR 2019).

\n", + "


\n", + "

Labs and groups working on 3D Computer Vision:

\n", + "
    \n", + "
  1. IPL Lab (Sharif)\n", + "
  2. TUM Computer Vision Group\n", + "
  3. ETH Computer Vision Lab\n", + "
  4. 3D Scene Geometry\n", + "
  5. Center for Research in Computer Vision, University of Central Florida (UCF)\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dk_lolyJZ0TX" + }, + "source": [ + "

\n", + "
\n", + "
\n", + "\t\n", + "\t\t\n", + "References\n", + " \n", + "\t\t
\n", + "
    \n", + "
  • \n", + "https://towardsdatascience.com/how-to-represent-3d-data-66a0f6376afb\n", + "
\n", + "\t
\n", + "
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "XoZQaIqIZ0TX" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [], + "name": "ai_content_english_template.ipynb", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/notebooks/3d_vision/matadata.yml b/notebooks/3d_vision/matadata.yml new file mode 100644 index 0000000..175c1d6 --- /dev/null +++ b/notebooks/3d_vision/matadata.yml @@ -0,0 +1,25 @@ +title: 3D Computer Vision # shown on browser tab + +header: + title: 3D Computer Vision # title of your notebook + description: This notebook intruduces 3D Computer Vision and pointclouds, famous neural networks and adversarial attacks in this field. + +authors: + label: + position: top + content: + # list of notebook authors + - name: Kimia Noorbakhsh + role: Teacher Assistant + contact: + # list of contact information + - link: https://github.com/kimianoorbakhsh + icon: fab fa-github + # optionally add other contact information like + # - link: # contact link + # icon: # awsomefont tag for link (check: https://fontawesome.com/v5.15/icons) + +comments: + # enable comments for your post + label: false + kind: comments \ No newline at end of file diff --git a/notebooks/index.yml b/notebooks/index.yml index 7b516fc..889513f 100644 --- a/notebooks/index.yml +++ b/notebooks/index.yml @@ -1,3 +1,4 @@ - notebook: notebooks/logic_programming - notebook: notebooks/search_in_continuous_space - notebook: notebooks/hmm_speech_recognition +- notebook: notebooks/3d_vision \ No newline at end of file