sut-ai · kimianoorbakhsh · Oct 1, 2021
diff --git a/notebooks/3d_vision/FIG/3D vision.PNG b/notebooks/3d_vision/FIG/3D vision.PNG
diff --git a/notebooks/3d_vision/FIG/3D.jpeg b/notebooks/3d_vision/FIG/3D.jpeg
diff --git a/notebooks/3d_vision/FIG/adv.PNG b/notebooks/3d_vision/FIG/adv.PNG
diff --git a/notebooks/3d_vision/FIG/pointnet.PNG b/notebooks/3d_vision/FIG/pointnet.PNG
diff --git a/notebooks/3d_vision/index.ipynb b/notebooks/3d_vision/index.ipynb
@@ -0,0 +1,306 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "TanQk2rJZ0TO"
+   },
+   "source": [
+    "<div style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<div align=center>\n",
+    "\t\t\t\t<p></p>\n",
+    "\t\t\t\t<p></p>\n",
+    "In the Name of God\n",
+    "            <p></p>\n",
+    "<br>\n",
+    "            Sharif University of Technology\n",
+    "            <br>\n",
+    "Computer Engineering Department\n",
+    "            <p></p>\n",
+    "Artificial Intelligence Course\n",
+    "            <br />\n",
+    "\t\t\t<br />\n",
+    "            MohammadHossein Rohban\n",
+    "            <br />\n",
+    "Fall 2021\n",
+    "        </div>\n",
+    "\t\t<hr/>\n",
+    "\t\t\t<div align=center>\n",
+    "3D Computer Vision\n",
+    "        </div>\n",
+    "\t\t<br />\n",
+    "\t\t<div align=center>\n",
+    "Kimia Noorbakhsh\n",
+    "        </div>\n",
+    "\t\t<hr />\n",
+    "\t\t<style type=\"text/css\" scoped>\n",
+    "        p{\n",
+    "        border: 1px solid #a2a9b1;background-color: #f8f9fa;display: inline-block;\n",
+    "        };\n",
+    "        </style>\n",
+    "\t\t<div>\n",
+    "\t\t\t<h3>Table of Contents</h3>\n",
+    "\t\t\t<ul style=\"margin-right: 0;\">\n",
+    "\t\t\t\t<li>\n",
+    "                    <a href=\"#sec_intro\">\n",
+    "                        Introduction\n",
+    "                    </a>\n",
+    "                </li>\n",
+    "                <li>\n",
+    "\t\t\t\t\t<a href=\"#sec_1\">\n",
+    "                    3D Data Representation\n",
+    "                    </a>\n",
+    "\t\t\t\t</li>\n",
+    "                <li>\n",
+    "\t\t\t\t\t<a href=\"#sec_2\">\n",
+    "                Deep Learning on Unordered Sets (The Rise of PointNet)\n",
+    "                    </a>\n",
+    "\t\t\t\t</li>\n",
+    "                 <li>\n",
+    "\t\t\t\t\t<a href=\"#sec_3\">\n",
+    "                    3D Adversarial Attacks\n",
+    "                    </a>\n",
+    "\t\t\t\t</li>\n",
+    "                <li>\n",
+    "\t\t\t\t\t<a href=\"#sec_4\">\n",
+    "                    Papers and Labs on 3D Computer Vision\n",
+    "                    </a>\n",
+    "\t\t\t\t</li>\n",
+    "\t\t\t</ul>\n",
+    "\t\t</div>\n",
+    "\t</font>\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "cwCRBwuzZ0TT"
+   },
+   "source": [
+    "<p></p>\n",
+    "<br />\n",
+    "<div id=\"sec_intro\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "Introduction\n",
+    "        </font>\n",
+    "\t\t<p></p>\n",
+    "\t\t<hr>\n",
+    "<p>With major recent developments in 3D sensors, large-scale 3D geometry datasets, and advances in deep 3D learning, 3D computer vision is playing an increasingly important role in many fields. Its applications span robotics, autonomous navigation and localization, 3D scene reconstruction, 3D scene understanding, 3D tracking and surveillance, digital city modeling, and virtual and augmented reality.</p>\n",
+    "<p>Research in 3D computer vision addresses challenging problems in 3D data processing, develops reliable methods for 3D data acquisition, and advances the state of the art in 3D applications.</p>\n",
+    "<p>Some of the areas of focus are:</p>\n",
+    "<ol>\n",
+    "    <li>3D analysis and modeling (3D deep learning and 3D machine learning, 3D recognition, 3D segmentation, 3D detection, 3D registration)</li>\n",
+    "    <li>3D acquisition (calibration, structure from motion and SLAM, computational photography, 3D reconstruction)</li>\n",
+    "    <li>3D applications (robotics, medical applications, sports applications, digital fabrication)</li>\n",
+    "</ol>\n",
+    "<p>In this notebook, we will focus on the first area, 3D analysis and modeling.</p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "OtSNQfxEdrGP"
+   },
+   "source": [
+    "<img src=\"FIG/3D.jpeg\" width=\"800\" height=\"400\">"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "hUVpctDiZ0TU"
+   },
+   "source": [
+    "<p></p>\n",
+    "<br />\n",
+    "<div id=\"sec_1\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "3D Data Representation\n",
+    "        </font>\n",
+    "\t\t<p></p>\n",
+    "\t\t<hr>\n",
+    "<ol>\n",
+    "    <li>3D Point Clouds</li>\n",
+    "    <li>3D Models</li>\n",
+    "    <li>3D Mesh</li>\n",
+    "    <li>Voxel-based models</li>\n",
+    "    <li>Parametric Model (CAD)</li>\n",
+    "    <li>Other representations</li>\n",
+    "</ol>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<p>Show code that reads the ModelNet40 dataset and visualizes some of its samples.</p>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# TODO: load the ModelNet40 dataset and visualize a few samples"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Yk3oVhZwZ0TW"
+   },
+   "source": [
+    "<p></p>\n",
+    "<br />\n",
+    "<div id=\"sec_2\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "Deep Learning on Unordered Sets (The Rise of PointNet)\n",
+    "        </font>\n",
+    "\t\t<p></p>\n",
+    "\t\t<hr>\n",
+    "<p>From a data structure point of view, a point cloud is an unordered set of vectors. Most work in deep learning focuses on regular input representations such as sequences (in speech and language processing), images, and volumes (video or 3D data); before PointNet, comparatively little work had applied deep learning directly to point sets.</p>\n",
+    "<p><strong><span style=\"font-size: 24px;\">PointNet Architecture:</span></strong></p>\n",
+    "<p>PointNet is a deep learning framework that directly consumes unordered point sets as inputs. A point cloud is represented as a set of 3D points $\\{P_i \\mid i = 1, \\dots, n\\}$, where each point $P_i$ is a vector of its $(x, y, z)$ coordinates plus extra feature channels such as color and surface normal. For simplicity and clarity, unless otherwise noted, only the $(x, y, z)$ coordinates are used as each point&rsquo;s channels.</p>\n",
+    "<p>For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud. The network outputs $k$ scores, one for each of the $k$ candidate classes. For semantic segmentation, the input can be a single object for part region segmentation or a sub-volume from a 3D scene for object region segmentation. The model outputs $n \\times m$ scores, one for each of the $n$ points and each of the $m$ semantic subcategories.</p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<img src=\"FIG/pointnet.PNG\" width=\"800\" height=\"400\">"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Show the code of the PointNet architecture and evaluate a pre-trained model on the ModelNet40 data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<p>Introduce other useful point cloud models and link to their papers.</p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<p></p>\n",
+    "<br />\n",
+    "<div id=\"sec_3\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "3D Adversarial Attacks\n",
+    "        </font>\n",
+    "\t\t<p></p>\n",
+    "\t\t<hr>\n",
+    "<p>Cite the original 3D adversarial attack paper and describe its procedure.</p>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<img src=\"FIG/adv.PNG\" width=\"800\" height=\"400\">"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<p></p>\n",
+    "<br />\n",
+    "<div id=\"sec_4\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "Papers and Labs on 3D Computer Vision\n",
+    "        </font>\n",
+    "\t\t<p></p>\n",
+    "\t\t<hr>\n",
+    "<p>List of papers:</p>\n",
+    "<p><br></p>\n",
+    "<p>Labs and groups working on 3D Computer Vision:</p>\n",
+    "<ol>\n",
+    "    <li><a href=\"http://ipl.ce.sharif.edu/\" rel=\"noopener noreferrer\" target=\"_blank\">IPL Lab (Sharif)</a></li>\n",
+    "    <li><a href=\"https://vision.in.tum.de/research\" rel=\"noopener noreferrer\" target=\"_blank\">TUM Computer Vision Group</a></li>\n",
+    "    <li><a href=\"https://vision.ee.ethz.ch/\" rel=\"noopener noreferrer\" target=\"_blank\">ETH Computer Vision Lab</a></li>\n",
+    "    <li><a href=\"http://www.cs.cmu.edu/~abhinavg/affordances/\" rel=\"noopener noreferrer\" target=\"_blank\">3D Scene Geometry</a></li>\n",
+    "    <li><a href=\"https://www.crcv.ucf.edu/\" rel=\"noopener noreferrer\" target=\"_blank\">Center for Research in Computer Vision, University of Central Florida (UCF)</a></li>\n",
+    "</ol>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "dk_lolyJZ0TX"
+   },
+   "source": [
+    "<p></p>\n",
+    "<br/>\n",
+    "<div id=\"sec_refs\" style=\"direction:ltr;line-height:300%;\">\n",
+    "\t<font face=\"Arial\" size=5>\n",
+    "\t\t<font color=#888888 size=6>\n",
+    "References\n",
+    "        </font>\n",
+    "\t\t<hr>       \n",
+    "        <ul>\n",
+    "            <li>\n",
+    "https://towardsdatascience.com/how-to-represent-3d-data-66a0f6376afb\n",
+    "            </li>\n",
+    "            <li>\n",
+    "Reference 2\n",
+    "            </li>\n",
+    "        </ul>\n",
+    "\t</font>\n",
+    "</div>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "XoZQaIqIZ0TX"
+   },
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "collapsed_sections": [],
+   "name": "ai_content_english_template.ipynb",
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
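
Note on the PointNet section of index.ipynb: the text describes a network that consumes an unordered point set and produces $k$ class scores, or $n \times m$ per-point scores for segmentation. A minimal sketch of that idea follows, assuming a single shared layer per point, max pooling as the order-independent aggregation, and random untrained weights. The layer sizes and the helper names (`make_layer`, `dense`, `encode`, `classify`, `segment`) are illustrative assumptions, not the notebook's code; the published architecture is deeper and also includes input and feature transform networks.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative, untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, relu=False):
    """Apply one dense layer to a feature vector, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def encode(points, point_mlp):
    """Run the same layer on every point, then max-pool each channel over the set."""
    per_point = [dense(point_mlp, p, relu=True) for p in points]
    global_feature = [max(channel) for channel in zip(*per_point)]
    return per_point, global_feature


def classify(points, point_mlp, cls_head):
    """Return k class scores for the whole cloud."""
    _, global_feature = encode(points, point_mlp)
    return dense(cls_head, global_feature)


def segment(points, point_mlp, seg_head):
    """Return n rows of m scores; each point sees its own and the global feature."""
    per_point, global_feature = encode(points, point_mlp)
    return [dense(seg_head, local + global_feature) for local in per_point]


rng = random.Random(0)
n, k, m, width = 8, 4, 3, 16  # hypothetical sizes chosen for the demo
points = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
point_mlp = make_layer(3, width, rng)
cls_head = make_layer(width, k, rng)
seg_head = make_layer(2 * width, m, rng)

scores = classify(points, point_mlp, cls_head)
part_scores = segment(points, point_mlp, seg_head)
print(len(scores), len(part_scores), len(part_scores[0]))

# Reordering the input points leaves the class scores unchanged.
print(classify(points[::-1], point_mlp, cls_head) == scores)
```

Because the maximum of each channel does not depend on the order of the points, the class scores are identical for any permutation of the input, which is the property the section attributes to PointNet. A trained version would replace the random layers with learned ones, for example in PyTorch.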
diff --git a/notebooks/3d_vision/matadata.yml b/notebooks/3d_vision/matadata.yml
@@ -0,0 +1,25 @@
+title: 3D Computer Vision # shown on browser tab
+
+header:
+    title: 3D Computer Vision # title of your notebook
+    description: This notebook introduces 3D computer vision and point clouds, well-known neural networks, and adversarial attacks in this field.
+
+authors:
+    label: 
+        position: top
+    content:
+    # list of notebook authors
+    - name: Kimia Noorbakhsh
+      role: Teaching Assistant
+      contact:
+      # list of contact information
+      - link: https://github.com/kimianoorbakhsh
+        icon: fab fa-github
+      # optionally add other contact information like  
+      # - link: <change this> # contact link
+      #   icon: <change this> # Font Awesome tag for link (check: https://fontawesome.com/v5.15/icons)
+
+comments:
+    # enable comments for your post
+    label: false
+    kind: comments
diff --git a/notebooks/index.yml b/notebooks/index.yml
@@ -1,3 +1,4 @@
 - notebook: notebooks/logic_programming
 - notebook: notebooks/search_in_continuous_space
 - notebook: notebooks/hmm_speech_recognition
+- notebook: notebooks/3d_vision
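
Note on the ModelNet40 cell of index.ipynb: the notebook asks for code that reads the dataset. ModelNet40 is distributed as OFF mesh files, so a reader could start from something like the sketch below. The function name `read_off` is a hypothetical helper, the parser assumes no comment or blank lines inside the file, and it tolerates the header variant in which the counts are fused onto the first line (as in `OFF490 518 0`), which some ModelNet files are reported to use. The sample mesh is a hand-written tetrahedron, not dataset content.

```python
import io


def read_off(stream):
    """Parse an OFF mesh from a text stream and return (vertices, faces)."""
    header = stream.readline().strip()
    if not header.startswith("OFF"):
        raise ValueError("not an OFF file")
    # Counts normally sit on the second line, but may be fused onto the header.
    counts = header[3:].split() or stream.readline().split()
    n_vertices, n_faces = int(counts[0]), int(counts[1])
    vertices = [
        tuple(float(v) for v in stream.readline().split()[:3])
        for _ in range(n_vertices)
    ]
    faces = []
    for _ in range(n_faces):
        fields = stream.readline().split()
        size = int(fields[0])  # number of vertex indices in this face
        faces.append(tuple(int(i) for i in fields[1:1 + size]))
    return vertices, faces


# Hand-written tetrahedron used only to exercise the parser.
sample = """OFF
4 4 0
0 0 0
1 0 0
0 1 0
0 0 1
3 0 1 2
3 0 1 3
3 0 2 3
3 1 2 3
"""
vertices, faces = read_off(io.StringIO(sample))
print(len(vertices), "vertices,", len(faces), "faces")
```

With a real file, the same function can be called on `open(path)`, where `path` points to one of the dataset's `.off` files. The returned vertices can be treated directly as a coarse point cloud and drawn with a 3D scatter plot, or points can be sampled from the faces for a denser cloud.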