added 3D vision #8

Open
wants to merge 3 commits into
base: master
Binary file added notebooks/3d_vision/FIG/3D vision.PNG
Binary file added notebooks/3d_vision/FIG/3D.jpeg
Binary file added notebooks/3d_vision/FIG/adv.PNG
Binary file added notebooks/3d_vision/FIG/pointnet.PNG
306 changes: 306 additions & 0 deletions notebooks/3d_vision/index.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,306 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "TanQk2rJZ0TO"
},
"source": [
"<div style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<div align=center>\n",
"\t\t\t\t<p></p>\n",
"\t\t\t\t<p></p>\n",
"In the Name of God\n",
" <p></p>\n",
"<br>\n",
" Sharif University of Technology\n",
" <br>\n",
"Computer Engineering Department\n",
" <p></p>\n",
"Artificial Intelligence Course\n",
" <br />\n",
"\t\t\t<br />\n",
" MohammadHossein Rohban\n",
" <br />\n",
"Fall 2021\n",
" </div>\n",
"\t\t<hr/>\n",
"\t\t\t<div align=center>\n",
"3D Computer Vision\n",
" </div>\n",
"\t\t<br />\n",
"\t\t<div align=center>\n",
"Kimia Noorbakhsh\n",
" </div>\n",
"\t\t<hr />\n",
"\t\t<style type=\"text/css\" scoped>\n",
" p{\n",
" border: 1px solid #a2a9b1;background-color: #f8f9fa;display: inline-block;\n",
" };\n",
" </style>\n",
"\t\t<div>\n",
"\t\t\t<h3>Table of Contents</h3>\n",
"\t\t\t<ul style=\"margin-right: 0;\">\n",
"\t\t\t\t<li>\n",
" <a href=\"#sec_intro\">\n",
" Introduction\n",
" </a>\n",
" </li>\n",
" <li>\n",
"\t\t\t\t\t<a href=\"#sec_1\">\n",
" 3D Data Representation\n",
" </a>\n",
"\t\t\t\t</li>\n",
" <li>\n",
"\t\t\t\t\t<a href=\"#sec_2\">\n",
" Deep Learning on Unsorted Sets (The rise of PointNet)\n",
" </a>\n",
"\t\t\t\t</li>\n",
" <li>\n",
"\t\t\t\t\t<a href=\"#sec_3\">\n",
" 3D Adversarial Attacks\n",
" </a>\n",
"\t\t\t\t</li>\n",
" <li>\n",
"\t\t\t\t\t<a href=\"#sec_4\">\n",
" Papers and Labs regarding 3D Computer Vision\n",
" </a>\n",
"\t\t\t\t</li>\n",
"\t\t\t</ul>\n",
"\t\t</div>\n",
"\t</font>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cwCRBwuzZ0TT"
},
"source": [
"<p></p>\n",
"<br />\n",
"<div id=\"sec_intro\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"Introduction\n",
" </font>\n",
"\t\t<p></p>\n",
"\t\t<hr>\n",
"<p style='box-sizing: border-box; color: rgb(34, 34, 34); font-family: \"Roboto Regular\", \"Helvetica Neue\", Helvetica, Arial; font-size: 18px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;'><span style=\"box-sizing: border-box;\">With major recent developments in 3D sensors, large-scale 3D geometry datasets, and advances in deep 3D learning, 3D computer vision is playing an increasingly important role in many fields. This includes a varied range of cutting-edge applications spanning robotics, autonomous navigation and localization, 3D <span style=\"box-sizing: border-box;\">scene reconstruction</span>, 3D <span style=\"box-sizing: border-box;\">scene understanding</span>, 3D <span style=\"box-sizing: border-box;\">tracking and surveillance</span>, <span style=\"box-sizing: border-box;\">digital city modeling</span>, and virtual and <span style=\"box-sizing: border-box;\">augmented reality</span>.</span></p>\n",
"<p style='box-sizing: border-box; color: rgb(34, 34, 34); font-family: \"Roboto Regular\", \"Helvetica Neue\", Helvetica, Arial; font-size: 18px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;'><span style=\"box-sizing: border-box;\">The goal of this Research Topic is to present high-quality research on 3D computer vision that addresses challenging problems in 3D data processing, proposes new developments for reliable 3D data acquisition, and advances the state of the art in 3D applications.</span></p>\n",
"<p style='box-sizing: border-box; color: rgb(34, 34, 34); font-family: \"Roboto Regular\", \"Helvetica Neue\", Helvetica, Arial; font-size: 18px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;'>Some of the areas of focus are:</p>\n",
"<ol style='box-sizing: border-box; color: rgb(34, 34, 34); font-family: \"Roboto Regular\", \"Helvetica Neue\", Helvetica, Arial; font-size: 18px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;'>\n",
" <li style=\"box-sizing: border-box;\">3D analysis and modeling (3D deep learning and 3D machine learning, 3D recognition, 3D Segmentation, 3D detection, 3D registration)</li>\n",
" <li style=\"box-sizing: border-box;\">3D acquisition (calibration, structure from motion and SLAM, computational photography, 3D reconstruction)</li>\n",
" <li style=\"box-sizing: border-box;\">3D applications (robotics, medical applications, sports applications, digital fabrication)</li>\n",
"</ol>\n",
"<p style='box-sizing: border-box; color: rgb(34, 34, 34); font-family: \"Roboto Regular\", \"Helvetica Neue\", Helvetica, Arial; font-size: 18px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: 2; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-color: rgb(255, 255, 255); text-decoration-thickness: initial; text-decoration-style: initial; text-decoration-color: initial;'>In this notebook, we focus on the first area, 3D analysis and modeling.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "OtSNQfxEdrGP"
},
"source": [
"<img src=\"FIG/3D.jpeg\" width=\"800\" height=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hUVpctDiZ0TU"
},
"source": [
"<p></p>\n",
"<br />\n",
"<div id=\"sec_1\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"3D Data Representation\n",
" </font>\n",
"\t\t<p></p>\n",
"\t\t<hr>\n",
"<ol>\n",
" <li>3D Point Clouds</li>\n",
" <li>3D Models</li>\n",
" <li>3D Mesh</li>\n",
" <li>Voxel-based models</li>\n",
" <li>Parametric Model (CAD)</li>\n",
" <li>Other representations</li>\n",
"</ol>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>Show code that reads the ModelNet40 dataset and visualizes some of its samples.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# pointnet"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Yk3oVhZwZ0TW"
},
"source": [
"<p></p>\n",
"<br />\n",
"<div id=\"sec_2\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"Deep Learning on Unsorted Sets (The rise of PointNet)\n",
" </font>\n",
"\t\t<p></p>\n",
"\t\t<hr>\n",
"<p>From a data structure point of view, a point cloud is an unordered set of vectors. While most works in deep learning focus on regular input representations like sequences (in speech and language processing), images, and volumes (video or 3D data), not much work has been done in deep learning on point sets.</p>\n",
"<p><strong><span style=\"font-size: 24px;\">PointNet Architecture:</span></strong></p>\n",
"<p>PointNet is a deep learning framework that directly consumes unordered point sets as inputs. A point cloud is represented as a set of 3D points $\\{P_i \\mid i = 1, \\dots, n\\}$, where each point $P_i$ is a vector of its $(x, y, z)$ coordinates plus extra feature channels such as color and surface normal. For simplicity and clarity, unless otherwise noted, we use only the $(x, y, z)$ coordinates as each point&rsquo;s channels.</p>\n",
"<p>For the object classification task, the input point cloud is either directly sampled from a shape or pre-segmented from a scene point cloud. The network outputs $k$ scores, one for each of the $k$ candidate classes. For semantic segmentation, the input can be a single object for part region segmentation or a sub-volume from a 3D scene for object region segmentation. The model outputs $n \\times m$ scores, one for each of the $n$ points and each of the $m$ semantic subcategories.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"FIG/pointnet.PNG\" width=\"800\" height=\"400\">"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Show the code of the PointNet architecture and evaluate a pre-trained model on the ModelNet40 data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p>Introduce other useful point cloud models and link to their papers.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p></p>\n",
"<br />\n",
"<div id=\"sec_3\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"3D Adversarial Attacks\n",
" </font>\n",
"\t\t<p></p>\n",
"\t\t<hr>\n",
"<p>Cite the original 3D adversarial attack paper and introduce the procedure.</p>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"FIG/adv.PNG\" width=\"800\" height=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p></p>\n",
"<br />\n",
"<div id=\"sec_4\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"Papers and Labs regarding 3D Computer Vision\n",
" </font>\n",
"\t\t<p></p>\n",
"\t\t<hr>\n",
"<p>List of papers:</p>\n",
"<p><br></p>\n",
"<p>Labs and groups working on 3D Computer Vision:</p>\n",
"<ol>\n",
" <li><a href=\"http://ipl.ce.sharif.edu/\" rel=\"noopener noreferrer\" target=\"_blank\">IPL Lab (Sharif)</a></li>\n",
" <li><a href=\"https://vision.in.tum.de/research\" rel=\"noopener noreferrer\" target=\"_blank\">TUM Computer Vision Group</a></li>\n",
" <li><a href=\"https://vision.ee.ethz.ch/\" rel=\"noopener noreferrer\" target=\"_blank\">ETH Computer Vision Lab</a></li>\n",
" <li><a href=\"http://www.cs.cmu.edu/~abhinavg/affordances/\" rel=\"noopener noreferrer\" target=\"_blank\">3D Scene Geometry</a></li>\n",
" <li><a href=\"https://www.crcv.ucf.edu/\" rel=\"noopener noreferrer\" target=\"_blank\">Center for Research in Computer Vision, University of Central Florida (UCF)</a></li>\n",
"</ol>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dk_lolyJZ0TX"
},
"source": [
"<p></p>\n",
"<br/>\n",
"<div id=\"sec_refs\" style=\"direction:ltr;line-height:300%;\">\n",
"\t<font face=\"Arial\" size=5>\n",
"\t\t<font color=#888888 size=6>\n",
"References\n",
" </font>\n",
"\t\t<hr> \n",
" <ul>\n",
" <li>\n",
"https://towardsdatascience.com/how-to-represent-3d-data-66a0f6376afb\n",
" </li>\n",
" <li>\n",
"Reference 2\n",
" </li>\n",
" </ul>\n",
"\t</font>\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "XoZQaIqIZ0TX"
},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "ai_content_english_template.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
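Review suggestion for the "read the ModelNet40 dataset" placeholder cell in `notebooks/3d_vision/index.ipynb`: as far as I know, ModelNet40 ships its shapes as OFF mesh files, so a loader could start from something like the sketch below. The function names `parse_off` and `normalize_to_unit_sphere` are hypothetical, and the parser assumes a plain OFF file without comments or per-face colors. It also tolerates a header fused with the counts (for example `OFF490 518 0`), which some files are reported to contain. Visualization is left out on purpose.

```python
import math
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, ...]


def parse_off(text: str) -> Tuple[List[Vertex], List[Face]]:
    """Parse the text of an OFF mesh file into (vertices, faces).

    Assumption: no comment lines and no per-face color values.
    """
    tokens = text.split()
    if not tokens or not tokens[0].startswith("OFF"):
        raise ValueError("not an OFF file")
    # Handle a header fused with the vertex count, e.g. "OFF490 518 0".
    fused = tokens[0][3:]
    rest = ([fused] if fused else []) + tokens[1:]
    n_vertices, n_faces = int(rest[0]), int(rest[1])  # rest[2] is the edge count
    pos = 3
    vertices: List[Vertex] = []
    for _ in range(n_vertices):
        x, y, z = (float(t) for t in rest[pos:pos + 3])
        vertices.append((x, y, z))
        pos += 3
    faces: List[Face] = []
    for _ in range(n_faces):
        k = int(rest[pos])  # number of vertex indices in this face
        faces.append(tuple(int(t) for t in rest[pos + 1:pos + 1 + k]))
        pos += 1 + k
    return vertices, faces


def normalize_to_unit_sphere(points: List[Vertex]) -> List[Vertex]:
    """Center points at their centroid and scale them into the unit sphere."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(x - cx, y - cy, z - cz) for x, y, z in points]
    # Fall back to 1.0 so a degenerate (single-point) cloud does not divide by zero.
    scale = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered) or 1.0
    return [(x / scale, y / scale, z / scale) for x, y, z in centered]
```

In the notebook this would be called as `vertices, faces = parse_off(open(path).read())`, where `path` points at one `.off` file of the downloaded dataset. The vertices alone already form a coarse point cloud. Sampling points on the faces would give a denser one.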
25 changes: 25 additions & 0 deletions notebooks/3d_vision/matadata.yml
@@ -0,0 +1,25 @@
title: 3D Computer Vision # shown on browser tab

header:
title: 3D Computer Vision # title of your notebook
description: This notebook introduces 3D computer vision and point clouds, well-known neural networks, and adversarial attacks in this field.

authors:
label:
position: top
content:
# list of notebook authors
- name: Kimia Noorbakhsh
role: Teacher Assistant
contact:
# list of contact information
- link: https://github.com/kimianoorbakhsh
icon: fab fa-github
# optionally add other contact information like
# - link: <change this> # contact link
# icon: <change this> # Font Awesome tag for link (check: https://fontawesome.com/v5.15/icons)

comments:
# enable comments for your post
label: false
kind: comments
1 change: 1 addition & 0 deletions notebooks/index.yml
@@ -1,3 +1,4 @@
- notebook: notebooks/logic_programming
- notebook: notebooks/search_in_continuous_space
- notebook: notebooks/hmm_speech_recognition
- notebook: notebooks/3d_vision
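Review suggestion for the PointNet placeholder cell in `notebooks/3d_vision/index.ipynb`: the notebook text says PointNet consumes an unordered set of points and outputs $k$ class scores. A small dependency-free sketch could show why the order of points does not matter. The class `TinyPointNet` and its helpers are hypothetical names, the weights are random and untrained, and the sketch keeps only the core idea of a shared per-point layer followed by a channel-wise max over all points. The real architecture adds deeper layers and learned input and feature transforms, which are omitted here.

```python
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weights, bias)


def random_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Create an untrained fully connected layer with random parameters."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias


def linear(x: Sequence[float], layer: Layer) -> List[float]:
    """Apply one fully connected layer: W x + b."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


class TinyPointNet:
    """Illustrative, untrained stand-in for the PointNet classification path."""

    def __init__(self, n_features: int = 16, n_classes: int = 4, seed: int = 0):
        rng = random.Random(seed)
        self.point_layer = random_layer(3, n_features, rng)  # shared by all points
        self.head = random_layer(n_features, n_classes, rng)

    def global_feature(self, points: Sequence[Sequence[float]]) -> List[float]:
        # The same layer (with ReLU) is applied to every (x, y, z) point independently.
        per_point = [[max(0.0, v) for v in linear(p, self.point_layer)] for p in points]
        # Channel-wise max over points is a symmetric function:
        # reordering the points cannot change the result.
        return [max(channel) for channel in zip(*per_point)]

    def class_scores(self, points: Sequence[Sequence[float]]) -> List[float]:
        """Return k scores, one per candidate class."""
        return linear(self.global_feature(points), self.head)


if __name__ == "__main__":
    net = TinyPointNet(n_classes=4)
    cloud = [(0.0, 0.0, 0.0), (1.0, 0.5, -0.2), (-0.3, 0.8, 0.1)]
    shuffled = [cloud[2], cloud[0], cloud[1]]
    print(net.class_scores(cloud) == net.class_scores(shuffled))
```

Max pooling is used here because it is the aggregation PointNet is known for. Any other symmetric function, such as a sum or mean over points, would give the same order independence.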