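Recommended reading:
- Roundup of computer vision survey papers, 2020-2021
- Complete roundup of object tracking resources, 2019-2020 (papers, model code, leading labs)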

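CVPR2020 latest news and paper downloads (Papers/Codes/Project/PaperReading/Demos/live streams/paper-sharing sessions, etc.)

Official website: http://cvpr2020.thecvf.com/
Venue and dates: Seattle, Washington, June 14-19, 2020
Paper acceptance announced: February 24, 2020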

Table of contents

1. CVPR2020 accepted papers by category (continuously updated)
2. CVPR2020 Oral (continuously updated)
3. CVPR2020 paper interpretations
4. To do list
5. Related links

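1.CVPR2020 accepted papers (continuously updated)

Category roundups

20.CVPR 2020 paper roundup: action detection and action segmentation
19.CVPR 2020 paper roundup: action recognition
18.CVPR 2020 paper roundup: optical flow
17.CVPR 2020 paper roundup: image and video retrieval
16.CVPR 2020 paper roundup: remote sensing and aerial image processing and recognition
15.CVPR 2020 paper roundup: image quality assessment
14.CVPR 2020 paper roundup: image inpainting
13.CVPR 2020 paper roundup: image enhancement and image restoration
12.CVPR 2020 paper roundup: deraining, dehazing, and deblurring
11.CVPR 2020 paper roundup: medical image processing and recognition
10.CVPR 2020 paper roundup: image matting
9.CVPR 2020 paper roundup: image segmentation (complete edition)
8.CVPR 2020 paper roundup: panoptic segmentation and video object segmentation
7.CVPR 2020 paper roundup: super-resolution
6.CVPR 2020 paper roundup: object detection
5.CVPR 2020 paper roundup: face technology
4.CVPR 2020 paper roundup: object tracking
3.CVPR 2020 paper roundup: text in images
2.CVPR 2020 paper roundup: pedestrian detection and re-identification
1.CVPR 2020 paper roundup: instance segmentation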

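1. Object detection
2. Face recognition
3. Object tracking
4. 3D point clouds/3D reconstruction/3D detection/3D segmentation/depth estimation
5. Image recognition
6. Image processing
7. Image classification
8. Image segmentation
9. Pose estimation/action recognition
10. Video analysis
11. OCR
12. GAN
13. Few-shot/zero-shot
14. Weakly supervised/unsupervised/self-supervised
15. Pedestrian tracking/pedestrian detection/ReID
16. Neural networks/model acceleration/model compression
17. Super-resolution
18. Visual commonsense/datasets/other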

Object detection

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
Paper: https://arxiv.org/abs/1912.02424
Code: https://github.com/sfzhang15/ATSS
Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector
Paper: https://arxiv.org/abs/1908.01998
AugFPN: Improving Multi-scale Feature Learning for Object Detection
Paper: https://arxiv.org/abs/1912.05384
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
Paper: https://arxiv.org/abs/2003.11818
Code: https://github.com/ggjy/HitDet.pytorch
Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Paper: https://arxiv.org/abs/2003.08813
CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection
Paper: https://arxiv.org/abs/2003.09119
Code: https://github.com/KiveeDong/CentripetalNet

Face recognition

Towards Universal Representation Learning for Deep Face Recognition
Paper: https://arxiv.org/abs/2002.11841
Suppressing Uncertainties for Large-Scale Facial Expression Recognition
Paper: https://arxiv.org/abs/2002.10392
Code: https://github.com/kaiwang960112/Self-Cure-Network
Face X-ray for More General Face Forgery Detection
Paper: https://arxiv.org/pdf/1912.13458.pdf
Pose Agnostic Cross-spectral Hallucination via Disentangling Independent Factors
Paper: https://arxiv.org/abs/1909.04365
Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
Paper: https://arxiv.org/abs/2003.08061
Code: https://github.com/clks-wzz/FAS-SGTD
Learning Meta Face Recognition in Unseen Domains
Paper: https://arxiv.org/abs/2003.07733
Code: https://github.com/cleardusk/MFR

Object tracking

ROAM: Recurrently Optimizing Tracking Model
Paper: https://arxiv.org/abs/1907.12006

3D point clouds/3D reconstruction/3D detection/3D segmentation/depth estimation

3D point clouds & reconstruction

PF-Net: Point Fractal Network for 3D Point Cloud Completion
Paper: https://arxiv.org/abs/2003.00410
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
Paper: https://arxiv.org/abs/2002.10876
Code: https://github.com/liruihui/PointAugment/
Learning multiview 3D point cloud registration
Paper: https://arxiv.org/abs/2001.05119
C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
Paper: https://arxiv.org/abs/1912.07009
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
Paper: https://arxiv.org/abs/1911.11236
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
Paper: https://arxiv.org/abs/2002.12212
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
Paper: https://arxiv.org/abs/2003.01456
In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks
Paper: https://arxiv.org/pdf/1911.11924.pdf
Attentive Context Normalization for Robust Permutation-Equivariant Learning
Paper: https://arxiv.org/abs/1907.02545
Authors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi
PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
Paper: https://arxiv.org/abs/1911.10949
SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
Paper: https://arxiv.org/abs/1912.00036
Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
Paper: https://arxiv.org/abs/1912.06378
Code: https://github.com/alibaba/cascade-stereo
Unsupervised Learning of Intrinsic Structural Representation Points
Paper: https://arxiv.org/abs/2003.01661
Code: https://github.com/NolenChen/3DStructurePoints

3D reconstruction

Leveraging 2D Data to Learn Textured 3D Mesh Generation
Paper: https://arxiv.org/abs/2004.04180
ARCH: Animatable Reconstruction of Clothed Humans
Paper: https://arxiv.org/abs/2004.04572
Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
Paper: https://arxiv.org/abs/2004.03967

Image recognition

Image feature matching

Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task
Paper: https://arxiv.org/abs/1912.00623
Correspondence Networks with Adaptive Neighbourhood Consensus
Paper: https://arxiv.org/abs/2003.12059

Image captioning

Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Paper: https://arxiv.org/abs/2003.08897

Image processing

Learning to Shade Hand-drawn Sketches
Paper: https://arxiv.org/abs/2002.11812
Single Image Reflection Removal through Cascaded Refinement
Paper: https://arxiv.org/abs/1911.06634
Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
Paper: https://arxiv.org/abs/2002.11297
Deep Image Harmonization via Domain Verification
Paper: https://arxiv.org/abs/1911.13239
Code: https://github.com/bcmi/Image_Harmonization_Datasets
RoutedFusion: Learning Real-time Depth Map Fusion
Paper: https://arxiv.org/pdf/2001.04388.pdf
Neural Contours: Learning to Draw Lines from 3D Shapes
Paper: https://arxiv.org/abs/2003.10333
Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
Paper: https://arxiv.org/abs/2003.05863

Image classification

Self-training with Noisy Student improves ImageNet classification
Paper: https://arxiv.org/abs/1911.04252
Image Matching across Wide Baselines: From Paper to Practice
Paper: https://arxiv.org/abs/2003.01587
Towards Robust Image Classification Using Sequential Attention Models
Paper: https://arxiv.org/abs/1912.02184
Learning in the Frequency Domain
Paper: https://arxiv.org/abs/2002.12416
Learning from Web Data with Memory Module
Paper: https://arxiv.org/abs/1906.12028
Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks
Paper: https://arxiv.org/abs/1912.09393

Image segmentation

Semi-Supervised Semantic Image Segmentation with Self-correcting Networks
Paper: https://arxiv.org/abs/1811.07073
Deep Snake for Real-Time Instance Segmentation
Paper: https://arxiv.org/abs/2001.01629
CenterMask: Real-Time Anchor-Free Instance Segmentation
Paper: https://arxiv.org/abs/1911.06667
Code: https://github.com/youngwanLEE/CenterMask
SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks
Paper: https://arxiv.org/abs/2003.00678
PolarMask: Single Shot Instance Segmentation with Polar Representation
Paper: https://arxiv.org/abs/1909.13226
Code: https://github.com/xieenze/PolarMask
xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
Paper: https://arxiv.org/abs/1911.12676
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
Paper: https://arxiv.org/abs/2001.00309
Enhancing Generic Segmentation with Learned Region Representations
Paper: https://arxiv.org/abs/1911.08564

Pose estimation/action recognition

VIBE: Video Inference for Human Body Pose and Shape Estimation
Paper: https://arxiv.org/abs/1912.05656
Code: https://github.com/mkocabas/VIBE
Distribution-Aware Coordinate Representation for Human Pose Estimation
Paper: https://arxiv.org/abs/1910.06278
Code: https://github.com/ilovepose/DarkPose
4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
Paper: https://arxiv.org/abs/2002.12625
Optimal least-squares solution to the hand-eye calibration problem
Paper: https://arxiv.org/abs/2002.10838
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
Paper: https://arxiv.org/abs/2003.01060
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
Paper: https://arxiv.org/abs/2001.09691
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
Paper: https://arxiv.org/abs/1911.07524
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Paper: https://arxiv.org/abs/1911.04231
Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
Paper: https://arxiv.org/abs/2003.02824
G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
Paper: https://arxiv.org/abs/2003.11089
Deep Image Spatial Transformation for Person Image Generation
Paper: https://arxiv.org/abs/2003.00696
Code: https://github.com/RenYurui/Global-Flow-Local-Attention

Video analysis

Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Paper: https://arxiv.org/abs/2003.01455
Code: https://github.com/bbrattoli/ZeroShotVideoClassification
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Paper: https://arxiv.org/abs/2003.00387
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Paper: https://arxiv.org/abs/2003.00392
Object Relational Graph with Teacher-Recommended Learning for Video Captioning
Paper: https://arxiv.org/abs/2002.11566
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Paper: https://arxiv.org/abs/2002.11616
Blurry Video Frame Interpolation
Paper: https://arxiv.org/abs/2002.12259
Hierarchical Conditional Relation Networks for Video Question Answering
Paper: https://arxiv.org/abs/2002.10698
Action Modifiers: Learning from Adverbs in Instructional Video
Paper: https://arxiv.org/abs/1912.06617
Visual Grounding in Video for Unsupervised Word Translation
Paper: https://arxiv.org/abs/2003.05078
Code: https://github.com/gsig/visual-grounding
MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask (video analysis: optical flow estimation)
Paper: https://arxiv.org/abs/2003.10955
Code: https://github.com/microsoft/MaskFlownet
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (video prediction)
Paper: https://arxiv.org/abs/2003.12045
Code: https://ehsanik.github.io/forcecvpr2020

OCR

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
Paper: https://arxiv.org/abs/2002.10200
Code: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Paper: https://arxiv.org/abs/1911.06258

GAN

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
Paper: https://arxiv.org/abs/1911.12287
Code: https://github.com/giannisdaras/ylg
MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis
Paper: https://arxiv.org/abs/1903.06048
Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
Paper: https://arxiv.org/abs/1911.04636
PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
Paper: https://arxiv.org/abs/1909.06956

Few-shot/zero-shot

Improved Few-Shot Visual Classification
Paper: https://arxiv.org/pdf/1912.03432.pdf
Meta-Transfer Learning for Zero-Shot Super-Resolution
Paper: https://arxiv.org/abs/2002.12213
Instance Credibility Inference for Few-Shot Learning
Paper: https://arxiv.org/abs/2003.11853
Code: https://github.com/Yikai-Wang/ICI-FSL

Weakly supervised/unsupervised/self-supervised

Rethinking the Route Towards Weakly Supervised Object Localization
Paper: https://arxiv.org/abs/2002.11359
NestedVAE: Isolating Common Factors via Weak Supervision
Paper: https://arxiv.org/abs/2002.11576
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
Paper: https://arxiv.org/abs/1911.07450
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
Paper: https://arxiv.org/abs/2003.01460
ClusterFit: Improving Generalization of Visual Representations
Paper: https://arxiv.org/abs/1912.03330
Auto-Encoding Twin-Bottleneck Hashing
Paper: https://arxiv.org/abs/2002.11930
Learning Representations by Predicting Bags of Visual Words
Paper: https://arxiv.org/abs/2002.12247
A Characteristic Function Approach to Deep Implicit Generative Modeling
Paper: https://arxiv.org/abs/1909.07425
Unsupervised Learning of Intrinsic Structural Representation Points
Paper: https://arxiv.org/abs/2003.01661
Code: https://github.com/NolenChen/3DStructurePoints

Pedestrian tracking/pedestrian detection/ReID

Cross-modality Person re-identification with Shared-Specific Feature Transfer
Paper: https://arxiv.org/abs/2002.12489
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
Paper: https://arxiv.org/abs/2002.11927
The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
Paper: https://arxiv.org/abs/1912.06445

Neural networks/model compression/model acceleration

GhostNet: More Features from Cheap Operations
Paper: https://arxiv.org/abs/1911.11907
Code: https://github.com/iamhankai/ghostnet
Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions
Paper: https://arxiv.org/abs/2003.01826
GPU-Accelerated Mobile Multi-view Style Transfer
Paper: https://arxiv.org/abs/2003.00706
Bundle Adjustment on a Graph Processor
Paper: https://arxiv.org/abs/2003.03134
Code: https://github.com/joeaortiz/gbp
Holistically-Attracted Wireframe Parsing
Paper: https://arxiv.org/abs/2003.01663
AdderNet: Do We Really Need Multiplications in Deep Learning?
Paper: https://arxiv.org/abs/1912.13200
CARS: Continuous Evolution for Efficient Neural Architecture Search
Paper: https://arxiv.org/abs/1909.04977
Code: https://github.com/huawei-noah/CARS
Π-nets: Deep Polynomial Neural Networks
Paper: https://arxiv.org/abs/2003.03828
Explaining Knowledge Distillation by Quantifying the Knowledge
Paper: https://arxiv.org/abs/2003.03622

Super-resolution

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Paper: https://arxiv.org/abs/2002.11616
Closed-loop Matters: Dual Regression Networks for Single Image Super-Resolution
Paper: https://arxiv.org/abs/2003.07018
Code: https://github.com/guoyongcs/DRN

Visual commonsense/other

Visual Commonsense R-CNN
Paper: https://arxiv.org/abs/2002.12204
Code: https://github.com/Wangt-CN/VC-R-CNN
Scalable Uncertainty for Computer Vision with Functional Variational Inference
Paper: https://arxiv.org/abs/2003.03396
Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective
Paper: https://arxiv.org/abs/2002.10826
Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
Paper: https://arxiv.org/abs/2003.00287
Filter Grafting for Deep Neural Networks
Paper: https://arxiv.org/abs/2001.05868
Code: https://github.com/fxmeng/filter-grafting.git
12-in-1: Multi-Task Vision and Language Representation Learning
Paper: https://arxiv.org/abs/1912.02315
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Paper: https://arxiv.org/abs/2002.10638
Code: https://github.com/weituo12321/PREVALENT
Unbiased Scene Graph Generation from Biased Training
Paper: https://arxiv.org/abs/2002.11949
Towards Visually Explaining Variational Autoencoders
Paper: https://arxiv.org/abs/1911.07389
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
Paper: http://www.weixiushen.com/publication/cvpr20_BBN.pdf
Code: https://github.com/Megvii-Nanjing/BBN
High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
Paper: https://arxiv.org/abs/1905.13545
SAM: The Sensitivity of Attribution Methods to Hyperparameters
Paper: http://s.anhnguyen.me/sam_cvpr2020.pdf
Code: https://github.com/anguyen8/sam
Π-nets: Deep Polynomial Neural Networks
Paper: https://arxiv.org/abs/2003.03828
Towards Backward-Compatible Representation Learning
Paper: https://arxiv.org/abs/2003.11942
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
Paper: https://arxiv.org/abs/2003.07064
KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (dataset)
Paper: https://arxiv.org/abs/2002.12687

2.CVPR2020 Oral (continuously updated)

1. PolarMask: Single Shot Instance Segmentation with Polar Representation
Code: https://github.com/xieenze/PolarMask

2. Unbiased Scene Graph Generation from Biased Training
Code: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

3. Learning to Shade Hand-drawn Sketches
Code: https://github.com/qyzdao/ShadeSketch

4. SAM: The Sensitivity of Attribution Methods to Hyperparameters
Code: https://github.com/anguyen8/sam

5. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks

6. Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

7. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

8. AdderNet: Do We Really Need Multiplications in Deep Learning?

9. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

10. Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

11. Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
Code: https://github.com/clks-wzz/FAS-SGTD

12. Learning Meta Face Recognition in Unseen Domains
Code: https://github.com/cleardusk/MFR

13. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
Code: https://github.com/alibaba/cascade-stereo

14. BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
Code: https://github.com/Megvii-Nanjing/BBN

15. Towards Backward-Compatible Representation Learning

16. MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask
Code: https://github.com/microsoft/MaskFlownet

17. Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
Code: https://ehsanik.github.io/forcecvpr2020

18. StyleRig: Rigging StyleGAN for 3D Control over Portrait Images

19. Conditional Channel Gated Networks for Task-Aware Continual Learning

20. BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation

21. TITAN: Future Forecast using Action Priors

22. Learning Interactions and Relationships between Movie Characters

23. GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Code: https://github.com/taksau/GPS-Net

24. A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising
Code: https://github.com/Vandermode/NoiseModel

25. Controllable Person Image Synthesis with Attribute-Decomposed GAN
Project page: https://menyifang.github.io/projects/ADGAN/ADGAN.html

26. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

27. Learning to Optimize Non-Rigid Tracking

28. Self-Supervised Scene De-occlusion
Project page: https://xiaohangzhan.github.io/projects/deocclusion/

29. Robust 3D Self-portraits in Seconds

30. Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

31. Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization

32. Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

33. Deep White-Balance Editing

34. Tracking by Instance Detection: A Meta-Learning Approach
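3.CVPR2020 paper interpretations

15.Unsupervised visual commonsense feature learning: a small exploration of causality

More and more researchers are looking at how to apply causality from statistics to deep learning in order to improve robustness, interpretability, and so on. Most of this work, however, does not go deep into causal theory and mainly borrows a few of its concepts (such as counterfactuals). This paper aims to take one step further on that basis.
Paper: https://arxiv.org/abs/2002.12204
Code: https://github.com/Wangt-CN/VC-R-CNN

14.CVPR2020 | The newest and most complete open-source framework for scene graph generation (SGG), integrating the most comprehensive set of metrics to date (open-sourced)

The authors chose facebookresearch/maskrcnn-benchmark, a popular framework in 2019, as the foundation and built Scene-Graph-Benchmark.pytorch on top of it. The code is compatible with every detector model that maskrcnn-benchmark supports and, thanks to the solid engineering of the facebookresearch codebase, the SGG part is considerably more readable and easier to work with.
Paper: https://arxiv.org/abs/2002.11949
Code: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch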

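13.CVPR2020 | Megvii Research proposes a monocular 6DoF pose estimation algorithm based on a 3D keypoint voting network (open-sourced)

Paper: https://arxiv.org/abs/1911.04231
Code: https://github.com/ethnhe/PVN3D.git
Megvii Research proposes PVN3D, a 3D keypoint detection network based on Hough voting that learns per-point offsets to 3D keypoints and votes for those keypoints. It extends 2D-keypoint-based methods to 3D keypoints so as to make full use of the geometric constraints of rigid objects, which greatly improves the accuracy of 6DoF estimation. In evaluations on the two public datasets YCB-Video and LineMOD, the method achieves state-of-the-art performance by a large margin.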
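
The voting step can be illustrated with a minimal sketch. It assumes the per-point offsets are already available (in PVN3D they would come from the network) and simply averages the votes; the averaging rule, the function name `vote_keypoints`, and the toy data are assumptions of this example, not the paper's implementation, which may aggregate votes more robustly.

```python
def vote_keypoints(points, offsets):
    """Aggregate per-point votes into 3D keypoint estimates.

    points:  list of (x, y, z) coordinates on the object.
    offsets: offsets[i][k] is the predicted (dx, dy, dz) from point i
             to keypoint k (hypothetical network output).
    """
    num_keypoints = len(offsets[0])
    keypoints = []
    for k in range(num_keypoints):
        # Every point casts one vote: its own position plus its offset.
        votes = [
            tuple(p[d] + offsets[i][k][d] for d in range(3))
            for i, p in enumerate(points)
        ]
        # Assumption: average the votes; clustering would be more robust.
        keypoints.append(
            tuple(sum(v[d] for v in votes) / len(votes) for d in range(3))
        )
    return keypoints


# Toy data: two keypoints, three surface points, noise-free offsets.
true_keypoints = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
points = [(0.5, 0.5, 0.5), (0.2, 0.1, 0.9), (0.8, 0.3, 0.1)]
offsets = [
    [tuple(kp[d] - p[d] for d in range(3)) for kp in true_keypoints]
    for p in points
]
estimated = vote_keypoints(points, offsets)
```

With noise-free offsets the estimates coincide with the true keypoints; once 3D keypoints are known, a rigid pose can be fitted to them, which is where the geometric constraints of the rigid object come in.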

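12.Cross-modality person re-identification: the shared-specific feature transfer algorithm cm-SSFT

Paper: https://arxiv.org/abs/2002.12489
This work addresses infrared-RGB cross-modality person re-identification. The problem it tackles: most previous cross-modality ReID algorithms focus only on shared feature learning and rarely on modality-specific features, because specific features do not exist in the other modality. An infrared image, for example, has no color information, and a color image has no thermal information. Yet anyone who has worked on ReID knows that conventional ReID performs so well largely because it "overfits" to exactly this kind of specific information; clothing color, for instance, has always been an important cue. Starting from this observation, the paper tries to exploit specific features. The main idea is to use neighborhood information. Given an infrared query, when searching for color targets, first find some easy, high-confidence color samples (which are very likely positives for the infrared query) and give their color-specific features to the infrared query. The infrared query can then use this color information to search for harder color samples.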
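
The neighbor-based idea can be sketched as follows. This is a simplified illustration under stated assumptions: features are plain lists, similarity is cosine similarity on the shared features, and the borrowed color-specific feature is the mean over the top-k neighbors. The function names and toy vectors are hypothetical, and the paper's actual transfer mechanism is more elaborate.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def borrow_specific_features(query_shared, gallery_shared, gallery_specific, k=2):
    """Give an infrared query the color-specific features of its
    k most confident RGB neighbors (ranked by shared-feature similarity)."""
    ranked = sorted(
        range(len(gallery_shared)),
        key=lambda i: cosine(query_shared, gallery_shared[i]),
        reverse=True,
    )
    top = ranked[:k]
    dim = len(gallery_specific[0])
    # Assumption: a plain average over the selected neighbors.
    return [sum(gallery_specific[i][d] for i in top) / len(top) for d in range(dim)]


# Toy data: gallery samples 0 and 1 resemble the query, sample 2 does not.
query_shared = [1.0, 0.0]
gallery_shared = [[0.9, 0.1], [1.0, 0.2], [0.0, 1.0]]
gallery_specific = [[0.8, 0.2, 0.0], [0.6, 0.4, 0.0], [0.0, 0.0, 1.0]]
borrowed = borrow_specific_features(query_shared, gallery_shared, gallery_specific)
```

The query can then be matched against harder gallery samples using both its shared features and the borrowed specific features.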

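11.RandLA-Net: a new framework for semantic segmentation of large-scale 3D point clouds (open-sourced)

Paper: https://arxiv.org/abs/1911.11236
Code: https://github.com/QingyongHu/RandLA-Net
The paper proposes RandLA-Net, a network built on simple, efficient random downsampling and local feature aggregation. It achieves very strong results on large-scale point cloud segmentation datasets such as Semantic3D and SemanticKITTI and is also highly efficient (e.g. nearly 200 times faster than the graph-based method SPG).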
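
A minimal sketch of the two ingredients named above: random downsampling followed by aggregation of neighbor features so that information from dropped points is not lost. Channel-wise max pooling over the k nearest neighbors is used here purely as a stand-in, and the function name and parameters are assumptions of this example rather than the RandLA-Net code.

```python
import math
import random


def random_downsample(points, features, num_samples, k=3, seed=0):
    """Keep a random subset of points and give each kept point a feature
    aggregated from its k nearest neighbors in the full cloud."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(points)), num_samples)
    sampled_points, sampled_features = [], []
    for idx in chosen:
        # k nearest neighbors of the kept point (the point itself included).
        neighbours = sorted(
            range(len(points)),
            key=lambda j: math.dist(points[idx], points[j]),
        )[:k]
        # Assumption: channel-wise max pooling as the aggregation rule.
        pooled = [
            max(features[j][c] for j in neighbours)
            for c in range(len(features[0]))
        ]
        sampled_points.append(points[idx])
        sampled_features.append(pooled)
    return sampled_points, sampled_features


# Toy cloud: 8 points on a line, one feature channel equal to the x coordinate.
points = [(float(i), 0.0, 0.0) for i in range(8)]
features = [[float(i)] for i in range(8)]
kept_points, kept_features = random_downsample(points, features, num_samples=4)
```

Random sampling costs almost nothing per point, which is the source of the efficiency; the aggregation step is what compensates for the points that random sampling happens to discard.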

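10.Tencent releases a powerful few-shot object detection algorithm and publishes FSOD, a 1000-category few-shot detection training set

Paper: https://arxiv.org/abs/1908.01998
The paper proposes a new few-shot object detection algorithm whose novelties include Attention-RPN, a multi-relation detector, and a contrastive training strategy. It also builds FSOD, a few-shot detection dataset with 1000 categories. A model trained on FSOD can be transferred directly to detecting novel categories without fine-tuning.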

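9.CARS: Huawei proposes neural architecture search based on evolutionary algorithms and weight sharing, needing only half a day on a single GPU for CIFAR-10

Paper: https://arxiv.org/abs/1909.04977
Evolutionary neural architecture search suffers from the long training time of candidate networks. To address this, and drawing on ENAS and NSGA-III, the paper proposes continuous evolution architecture search (CARS), which makes maximal use of knowledge already learned, such as the architectures and parameters from the previous round of evolution. It first constructs a supernet for parameter sharing, derives sub-networks from the supernet, and then uses a non-dominated sorting strategy to select good networks of different sizes. The whole search takes only 0.5 GPU day.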
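
Non-dominated selection can be sketched in a few lines. The example below keeps only the first Pareto front over two objectives (error rate and model size, lower is better); the candidate names and numbers are made up for illustration, and the selection used in CARS is more elaborate than this.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical sub-networks: (error rate, parameters in millions).
candidates = {
    "net_a": (0.030, 4.2),
    "net_b": (0.028, 3.7),
    "net_c": (0.035, 2.4),
    "net_d": (0.036, 2.9),
}
front = non_dominated_front(candidates)
```

Here `net_b` and `net_c` survive: one is the most accurate, the other the smallest, and neither is beaten on both objectives at once. Selecting a front rather than a single winner is what yields good networks of different sizes.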

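8.Simplicity wins: PSOL (pseudo supervised object localization), a new SOTA for weakly supervised object localization

Paper: https://arxiv.org/abs/2002.11359
The paper proposes pseudo supervised object localization (PSOL) to address the problems of current weakly supervised object localization methods. It splits localization and classification into two independent networks, then uses Deep descriptor transformation (DDT) on the training set to generate pseudo ground-truth boxes for training, and reaches SOTA overall. The paper makes three main contributions: (1) weakly supervised object localization should be divided into two independent parts, class-agnostic object localization and object classification, which leads to the PSOL algorithm; (2) although the generated bounding boxes are biased, the paper argues they should still be optimized directly, without class labels, and this reaches SOTA; (3) across different datasets, PSOL transfers its localization ability well without fine-tuning.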

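7.ByteDance: anatomy-aware 3D human pose estimation in videos

Paper: https://arxiv.org/pdf/2002.10322.pdf
In this work, we propose a new solution for 3D human pose estimation in videos. Instead of directly regressing the 3D joint locations, we draw inspiration from human skeletal anatomy and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be completely derived. Our motivation is that the lengths of human bones remain consistent over time. This drives us to develop effective techniques that use global information across all frames of a video for high-accuracy bone length prediction. For the bone direction prediction network, we propose a fully convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units (e.g. LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed the 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates depth ambiguity in many challenging poses. Our full model outperforms the previous best results on the Human3.6M and MPI-INF-3DHP datasets, and comprehensive evaluation on these datasets validates the effectiveness of our model.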
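
To see why bone directions and bone lengths fully determine the joint positions, consider this small sketch: starting from the root, each joint is its parent's position plus the bone length times the unit bone direction. The skeleton layout, function name, and numbers below are illustrative assumptions, not the paper's skeleton definition.

```python
import math


def joints_from_bones(parents, directions, lengths, root=(0.0, 0.0, 0.0)):
    """Recover 3D joint positions from per-bone directions and lengths.

    parents[j] is the index of joint j's parent (-1 for the root);
    joints must be ordered so that every parent precedes its children.
    """
    joints = []
    for j, parent in enumerate(parents):
        if parent < 0:
            joints.append(tuple(root))
            continue
        dx, dy, dz = directions[j]
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        unit = (dx / norm, dy / norm, dz / norm)
        # Child joint = parent joint + bone length * unit bone direction.
        joints.append(
            tuple(joints[parent][d] + lengths[j] * unit[d] for d in range(3))
        )
    return joints


# Hypothetical 4-joint skeleton: root, spine, and two limbs off the spine.
parents = [-1, 0, 1, 1]
directions = [None, (0.0, 0.0, 2.0), (1.0, 0.0, 0.0), (0.0, -3.0, 0.0)]
lengths = [None, 0.5, 0.3, 0.2]
joints = joints_from_bones(parents, directions, lengths)
```

Because bone lengths are the same in every frame, they can be estimated once from the whole video, while only the directions need to be predicted per frame.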

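6.Microsoft Research Asia: X-raying Deepfake faces, a new model that exposes face-swapped images

Paper: https://arxiv.org/pdf/1912.13458.pdf
Microsoft Research Asia proposes a method that can "X-ray" an image without needing face-swapped image data or knowledge of the face-swapping algorithm, determining whether a face has been swapped and indicating the boundary of the swap. The new model, Face X-Ray, has two key properties: it generalizes to unseen face-swapping algorithms, and it provides an interpretable swap boundary. The key to these properties lies in the general pipeline of face-swapping algorithms. Most face-swapping algorithms can be divided into three stages: detection, manipulation, and blending. Unlike previous work, Face X-Ray aims to detect the artifacts introduced in the third stage.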
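
The blending stage and the boundary it leaves behind can be sketched as follows. The sketch assumes the usual soft-mask blend, `I = M * I_F + (1 - M) * I_B`, and defines the boundary map as `B = 4 * M * (1 - M)`, which is zero where the mask is fully foreground or background and peaks at 1 where the two images mix equally. Treat the exact formulas as assumptions of this example; images are tiny grayscale grids stored as nested lists.

```python
def blend(foreground, background, mask):
    """Blend two images with a soft mask (values in [0, 1])."""
    return [
        [m * f + (1.0 - m) * b for f, b, m in zip(f_row, b_row, m_row)]
        for f_row, b_row, m_row in zip(foreground, background, mask)
    ]


def boundary_map(mask):
    """Highlight the blending boundary: 0 inside and outside, 1 at M = 0.5."""
    return [[4.0 * m * (1.0 - m) for m in row] for row in mask]


# Toy 2x3 example: the middle column is where the two images are mixed.
mask = [[0.0, 0.5, 1.0], [0.0, 0.25, 1.0]]
foreground = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
background = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
blended = blend(foreground, background, mask)
xray = boundary_map(mask)
```

A real, unmanipulated image has no blending mask, so its boundary map is all zeros; a detector trained to predict this map does not need to know which face-swapping algorithm produced the forgery.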

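5.UDP: unbiased data processing for human pose estimation

Paper: https://arxiv.org/abs/1911.07524
UDP resolves the large statistical error in the standard encoding-decoding methods used by existing SOTA human pose estimation algorithms. It also resolves the misalignment of results caused by flip testing. The method is plug-and-play and effectively improves performance with essentially no increase in model complexity.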
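
The flip misalignment can be reproduced with a small coordinate example. The sketch assumes the commonly described cause: measuring image size by pixel count (`w`) instead of by the distance between the first and last pixel centers (`w - 1`) when mapping coordinates between resolutions. The function names and the 256-to-64 setting are assumptions for illustration only.

```python
def resize_x(x, src_width, dst_width, unbiased=True):
    """Map an x coordinate from a source image to a resized image."""
    if unbiased:
        # Scale by the span between first and last pixel centers.
        return x * (dst_width - 1) / (src_width - 1)
    # Biased variant: scale by the pixel counts.
    return x * dst_width / src_width


def flip_x(x, width):
    """Horizontally flip a coordinate in an image of the given width."""
    return (width - 1) - x


def flip_misalignment(x, src_width, dst_width, unbiased):
    """Difference between 'flip then resize' and 'resize then flip'."""
    flipped_then_resized = resize_x(flip_x(x, src_width), src_width, dst_width, unbiased)
    resized_then_flipped = flip_x(resize_x(x, src_width, dst_width, unbiased), dst_width)
    return flipped_then_resized - resized_then_flipped


# A keypoint at x = 100 in a 256-pixel-wide input, 64-pixel-wide heatmap.
biased_gap = flip_misalignment(100.0, 256, 64, unbiased=False)
unbiased_gap = flip_misalignment(100.0, 256, 64, unbiased=True)
```

Under the pixel-count convention the two orders disagree by a constant `1 - dst_width / src_width` (0.75 heatmap pixels in this setting), so averaging flipped and unflipped predictions blurs the result; under the `w - 1` convention the two orders agree.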

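4.Making composite images more realistic: SJTU proposes image harmonization via domain verification

Paper: https://arxiv.org/abs/1911.13239
In a composite image, the foreground and background were captured under different conditions (such as time of day, season, lighting, and weather), so there are obvious mismatches in brightness, color tone, and so on. Image harmonization aims to adjust the foreground of a composite image so that it is harmonious with the background. Traditional image harmonization methods generally transfer color information from the background or from other images onto the foreground, but this cannot guarantee that the adjusted foreground looks realistic and harmonious with the background. In recent years, a small number of works have tried deep learning for image harmonization, but paired composite and real images are extremely hard to obtain. Without such pairs, deep learning training lacks sufficiently strong supervision, and there is no ground truth for evaluating the harmonized results. To date no large-scale public image harmonization dataset has been available, so we built and released an image harmonization dataset consisting of four sub-datasets. We also propose the concept of domain verification and explore an image harmonization algorithm based on it.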

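3.PolarMask: a new approach to single-stage instance segmentation

Paper: https://arxiv.org/abs/1909.13226
PolarMask builds on FCOS and brings instance segmentation into the FCN framework. FCOS is essentially an FCN-style dense prediction detection framework whose performance is on par with anchor-based object detectors, which showed the community the potential of anchor-free methods. The next problem to solve is instance segmentation. The biggest contribution of this work is turning the more complex instance segmentation problem into a task that is as complex as object detection in terms of network design and computational cost, making the modeling of instance segmentation simple and efficient.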
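
The "polar representation" in the paper's title can be illustrated roughly as follows: a mask is described by a center point and the distances to its contour along rays at fixed, evenly spaced angles, so predicting a mask becomes regressing a handful of numbers per location, much like box regression. The function name, ray count, and values here are assumptions for illustration, not the PolarMask code.

```python
import math


def polar_to_contour(center, ray_lengths):
    """Convert a center and per-ray distances into contour points.

    Rays are spaced uniformly over 360 degrees, starting at angle 0.
    """
    cx, cy = center
    n = len(ray_lengths)
    contour = []
    for i, r in enumerate(ray_lengths):
        theta = 2.0 * math.pi * i / n
        contour.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return contour


# Four rays of length 2 around (10, 20) give a diamond-shaped contour.
contour = polar_to_contour((10.0, 20.0), [2.0, 2.0, 2.0, 2.0])
```

A bounding box is the special case of four distances (left, top, right, bottom); using more rays traces the object outline more finely while keeping the same dense-regression formulation.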

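2.Huawei's GhostNet surpasses Google's MobileNet (open-sourced)

Paper: https://arxiv.org/abs/1911.11907
The paper introduces a novel Ghost module designed to generate more feature maps from cheap operations. Starting from a set of intrinsic feature maps, the authors apply a series of linear transformations to generate, at low cost, many "ghost" feature maps that reveal the information underlying the intrinsic features. The Ghost module is plug-and-play: stacking Ghost modules yields the Ghost bottleneck, from which the lightweight network GhostNet is built. On the ImageNet classification task, GhostNet reaches 75.7% Top-1 accuracy at a similar computational cost, higher than MobileNetV3's 75.2%.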
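
The saving from a Ghost module can be estimated with a short back-of-the-envelope calculation. The sketch assumes an ordinary convolution produces only `n / s` intrinsic maps and the remaining maps come from cheap `d x d` per-channel linear operations; the layer sizes and the parameter names `s` and `d` are illustrative assumptions, and bias terms are ignored.

```python
def conv_flops(c_in, n_out, h_out, w_out, k):
    """Multiply-accumulate count of an ordinary k x k convolution."""
    return n_out * h_out * w_out * c_in * k * k


def ghost_flops(c_in, n_out, h_out, w_out, k, s=2, d=3):
    """Multiply-accumulate count of a Ghost module producing n_out maps.

    n_out // s intrinsic maps come from an ordinary convolution; each of
    the other (s - 1) * n_out // s maps costs one d x d per-channel filter.
    """
    intrinsic = n_out // s
    primary = conv_flops(c_in, intrinsic, h_out, w_out, k)
    cheap = (s - 1) * intrinsic * h_out * w_out * d * d
    return primary + cheap


# Hypothetical layer: 64 -> 128 channels, 28 x 28 output, 3 x 3 kernels.
ordinary = conv_flops(64, 128, 28, 28, 3)
ghost = ghost_flops(64, 128, 28, 28, 3, s=2, d=3)
speedup = ordinary / ghost
```

For this layer the ratio comes out just under `s = 2`: the cheap operations are not free, but their cost is small next to the ordinary convolution they replace.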

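1.Devi Parikh (Georgia Tech): multi-task vision and language representation learning

Paper: https://arxiv.org/abs/1912.02315
Much vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets that are often studied in isolation; however, the visually grounded language understanding skills required to succeed at these tasks overlap significantly. In this work, we investigate the relationships between vision-and-language tasks by developing a large-scale, multi-task training regime.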

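4.To do list

Keep CVPR2020 reproduction code up to date
Follow up on CVPR2020 paper-sharing sessions

5.Related links

Complete CVPR2019/2018/2017 resources for download (papers, code, etc.)
https://github.com/extreme-assistant/iccv2019

6.CVPR2020 contributors Wechat Group

To make it easier for everyone to communicate, Jishi (极市) has set up a contributors group and an authors WeChat group. To join, add the assistant on WeChat (cv-mart, with the note CVPR2020).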
