
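CVPR2020 Latest News and Paper Downloads (Papers/Codes/Project/PaperReading/Demos/Livestreams/Paper-Sharing Sessions, etc.)

Official site: http://cvpr2020.thecvf.com/
Dates: Seattle, Washington, June 14-19, 2020
Paper acceptance announcement: February 24, 2020

Related questions: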

Table of Contents

1. CVPR2020 Accepted Papers by Category (continuously updated)
2. CVPR2020 Oral (continuously updated)
3. CVPR2020 Paper Interpretations
4. To do list
5. Related works



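Papers by Category


Contents

1. Object Detection
2. Face Recognition
3. Object Tracking
4. 3D Point Cloud / 3D Reconstruction / 3D Detection / 3D Segmentation / Depth Estimation
5. Image Recognition
6. Image Processing
7. Image Classification
8. Image Segmentation
9. Pose Estimation / Action Recognition
10. Video Analysis
11. OCR
12. GAN
13. Few-Shot / Zero-Shot
14. Weakly Supervised / Unsupervised / Self-Supervised
15. Pedestrian Tracking / Pedestrian Detection / ReID
16. Neural Networks / Model Acceleration / Model Compression
17. Super-Resolution
18. Visual Commonsense / Datasets / Others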



  1. Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
    Paper: https://arxiv.org/abs/1912.02424
    Code: https://github.com/sfzhang15/ATSS

  2. Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector
    Paper: https://arxiv.org/abs/1908.01998

  3. AugFPN: Improving Multi-scale Feature Learning for Object Detection
    Paper: https://arxiv.org/abs/1912.05384

  4. Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection
    Paper: https://arxiv.org/abs/2003.11818
    Code: https://github.com/ggjy/HitDet.pytorch

  5. Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
    Paper: https://arxiv.org/abs/2003.08813

  6. CentripetalNet: Pursuing High-quality Keypoint Pairs for Object Detection
    Paper: https://arxiv.org/abs/2003.09119
    Code: https://github.com/KiveeDong/CentripetalNet



  1. Towards Universal Representation Learning for Deep Face Recognition
    Paper: https://arxiv.org/abs/2002.11841

  2. Suppressing Uncertainties for Large-Scale Facial Expression Recognition
    Paper: https://arxiv.org/abs/2002.10392
    Code: https://github.com/kaiwang960112/Self-Cure-Network

  3. Face X-ray for More General Face Forgery Detection
    Paper: https://arxiv.org/pdf/1912.13458.pdf

  4. Pose Agnostic Cross-spectral Hallucination via Disentangling Independent Factors
    Paper: https://arxiv.org/abs/1909.04365

  5. Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
    Paper: https://arxiv.org/abs/2003.08061
    Code: https://github.com/clks-wzz/FAS-SGTD

  6. Learning Meta Face Recognition in Unseen Domains
    Paper: https://arxiv.org/abs/2003.07733
    Code: https://github.com/cleardusk/MFR



  1. ROAM: Recurrently Optimizing Tracking Model
    Paper: https://arxiv.org/abs/1907.12006



  • 3D Point Cloud & Reconstruction
  1. PF-Net: Point Fractal Network for 3D Point Cloud Completion
    Paper: https://arxiv.org/abs/2003.00410

  2. PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
    Paper: https://arxiv.org/abs/2002.10876
    Code: https://github.com/liruihui/PointAugment/

  3. Learning multiview 3D point cloud registration
    Paper: https://arxiv.org/abs/2001.05119

  4. C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds
    Paper: https://arxiv.org/abs/1912.07009

  5. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds
    Paper: https://arxiv.org/abs/1911.11236

  6. Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
    Paper: https://arxiv.org/abs/2002.12212

  7. Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
    Paper: https://arxiv.org/abs/2003.01456

  8. In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks
    Paper: https://arxiv.org/pdf/1911.11924.pdf

  9. Attentive Context Normalization for Robust Permutation-Equivariant Learning
    Paper: https://arxiv.org/abs/1907.02545
    Authors: Weiwei Sun, Wei Jiang, Eduard Trulls, Andrea Tagliasacchi, Kwang Moo Yi

  10. PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
    Paper: https://arxiv.org/abs/1911.10949

  11. SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
    Paper: https://arxiv.org/abs/1912.00036

  12. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
    Paper: https://arxiv.org/abs/1912.06378
    Code: https://github.com/alibaba/cascade-stereo

  13. Unsupervised Learning of Intrinsic Structural Representation Points
    Paper: https://arxiv.org/abs/2003.01661
    Code: https://github.com/NolenChen/3DStructurePoints

  • 3D Reconstruction
  1. Leveraging 2D Data to Learn Textured 3D Mesh Generation
    Paper: https://arxiv.org/abs/2004.04180

  2. ARCH: Animatable Reconstruction of Clothed Humans
    Paper: https://arxiv.org/abs/2004.04572

  3. Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
    Paper: https://arxiv.org/abs/2004.03967



  • Image Feature Matching
  1. Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task
    Paper: https://arxiv.org/abs/1912.00623

  2. Correspondence Networks with Adaptive Neighbourhood Consensus
    Paper: https://arxiv.org/abs/2003.12059

  • Image Captioning
  1. Normalized and Geometry-Aware Self-Attention Network for Image Captioning
    Paper: https://arxiv.org/abs/2003.08897



  1. Learning to Shade Hand-drawn Sketches
    Paper: https://arxiv.org/abs/2002.11812

  2. Single Image Reflection Removal through Cascaded Refinement
    Paper: https://arxiv.org/abs/1911.06634

  3. Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data
    Paper: https://arxiv.org/abs/2002.11297

  4. Deep Image Harmonization via Domain Verification
    Paper: https://arxiv.org/abs/1911.13239
    Code: https://github.com/bcmi/Image_Harmonization_Datasets

  5. RoutedFusion: Learning Real-time Depth Map Fusion
    Paper: https://arxiv.org/pdf/2001.04388.pdf

  6. Neural Contours: Learning to Draw Lines from 3D Shapes
    Paper: https://arxiv.org/abs/2003.10333

  7. Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content
    Paper: https://arxiv.org/abs/2003.05863



  1. Self-training with Noisy Student improves ImageNet classification
    Paper: https://arxiv.org/abs/1911.04252

  2. Image Matching across Wide Baselines: From Paper to Practice
    Paper: https://arxiv.org/abs/2003.01587

  3. Towards Robust Image Classification Using Sequential Attention Models
    Paper: https://arxiv.org/abs/1912.02184

  4. Learning in the Frequency Domain
    Paper: https://arxiv.org/abs/2002.12416

  5. Learning from Web Data with Memory Module
    Paper: https://arxiv.org/abs/1906.12028

  6. Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks
    Paper: https://arxiv.org/abs/1912.09393



  1. Semi-Supervised Semantic Image Segmentation with Self-correcting Networks
    Paper: https://arxiv.org/abs/1811.07073

  2. Deep Snake for Real-Time Instance Segmentation
    Paper: https://arxiv.org/abs/2001.01629

  3. CenterMask : Real-Time Anchor-Free Instance Segmentation
    Paper: https://arxiv.org/abs/1911.06667
    Code: https://github.com/youngwanLEE/CenterMask

  4. SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks
    Paper: https://arxiv.org/abs/2003.00678

  5. PolarMask: Single Shot Instance Segmentation with Polar Representation
    Paper: https://arxiv.org/abs/1909.13226
    Code: https://github.com/xieenze/PolarMask

  6. xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
    Paper: https://arxiv.org/abs/1911.12676

  7. BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
    Paper: https://arxiv.org/abs/2001.00309

  8. Enhancing Generic Segmentation with Learned Region Representations
    Paper: https://arxiv.org/abs/1911.08564



  1. VIBE: Video Inference for Human Body Pose and Shape Estimation
    Paper: https://arxiv.org/abs/1912.05656
    Code: https://github.com/mkocabas/VIBE

  2. Distribution-Aware Coordinate Representation for Human Pose Estimation
    Paper: https://arxiv.org/abs/1910.06278
    Code: https://github.com/ilovepose/DarkPose

  3. 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras
    Paper: https://arxiv.org/abs/2002.12625

  4. Optimal least-squares solution to the hand-eye calibration problem
    Paper: https://arxiv.org/abs/2002.10838

  5. D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
    Paper: https://arxiv.org/abs/2003.01060

  6. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
    Paper: https://arxiv.org/abs/2001.09691

  7. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation
    Paper: https://arxiv.org/abs/1911.07524

  8. PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
    Paper: https://arxiv.org/abs/1911.04231

  9. Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation
    Paper: https://arxiv.org/abs/2003.02824

  10. G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
    Paper: https://arxiv.org/abs/2003.11089

  11. Deep Image Spatial Transformation for Person Image Generation
    Paper: https://arxiv.org/abs/2003.00696
    Code: https://github.com/RenYurui/Global-Flow-Local-Attention



  1. Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
    Paper: https://arxiv.org/abs/2003.01455
    Code: https://github.com/bbrattoli/ZeroShotVideoClassification

  2. Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
    Paper: https://arxiv.org/abs/2003.00387

  3. Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
    Paper: https://arxiv.org/abs/2003.00392

  4. Object Relational Graph with Teacher-Recommended Learning for Video Captioning
    Paper: https://arxiv.org/abs/2002.11566

  5. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
    Paper: https://arxiv.org/abs/2002.11616

  6. Blurry Video Frame Interpolation
    Paper: https://arxiv.org/abs/2002.12259

  7. Hierarchical Conditional Relation Networks for Video Question Answering
    Paper: https://arxiv.org/abs/2002.10698

  8. Action Modifiers: Learning from Adverbs in Instructional Video
    Paper: https://arxiv.org/abs/1912.06617

  9. Visual Grounding in Video for Unsupervised Word Translation
    Paper: https://arxiv.org/abs/2003.05078
    Code: https://github.com/gsig/visual-grounding

  10. MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask (video analysis: optical flow estimation)
    Paper: https://arxiv.org/abs/2003.10955
    Code: https://github.com/microsoft/MaskFlownet

  11. Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects (video prediction)
    Paper: https://arxiv.org/abs/2003.12045
    Code: https://ehsanik.github.io/forcecvpr2020



  1. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network
    Paper: https://arxiv.org/abs/2002.10200
    Code: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet

  2. Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
    Paper: https://arxiv.org/abs/1911.06258



  1. Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
    Paper: https://arxiv.org/abs/1911.12287
    Code: https://github.com/giannisdaras/ylg

  2. MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis
    Paper: https://arxiv.org/abs/1903.06048

  3. Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory
    Paper: https://arxiv.org/abs/1911.04636

  4. PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer
    Paper: https://arxiv.org/abs/1909.06956



  1. Improved Few-Shot Visual Classification
    Paper: https://arxiv.org/pdf/1912.03432.pdf

  2. Meta-Transfer Learning for Zero-Shot Super-Resolution
    Paper: https://arxiv.org/abs/2002.12213

  3. Instance Credibility Inference for Few-Shot Learning
    Paper: https://arxiv.org/abs/2003.11853
    Code: https://github.com/Yikai-Wang/ICI-FSL



  1. Rethinking the Route Towards Weakly Supervised Object Localization
    Paper: https://arxiv.org/abs/2002.11359

  2. NestedVAE: Isolating Common Factors via Weak Supervision
    Paper: https://arxiv.org/abs/2002.11576

  3. Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation
    Paper: https://arxiv.org/abs/1911.07450

  4. Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
    Paper: https://arxiv.org/abs/2003.01460

  5. ClusterFit: Improving Generalization of Visual Representations
    Paper: https://arxiv.org/abs/1912.03330

  6. Auto-Encoding Twin-Bottleneck Hashing
    Paper: https://arxiv.org/abs/2002.11930

  7. Learning Representations by Predicting Bags of Visual Words
    Paper: https://arxiv.org/abs/2002.12247

  8. A Characteristic Function Approach to Deep Implicit Generative Modeling
    Paper: https://arxiv.org/abs/1909.07425

  9. Unsupervised Learning of Intrinsic Structural Representation Points
    Paper: https://arxiv.org/abs/2003.01661
    Code: https://github.com/NolenChen/3DStructurePoints



  1. Cross-modality Person re-identification with Shared-Specific Feature Transfer
    Paper: https://arxiv.org/abs/2002.12489

  2. Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction
    Paper: https://arxiv.org/abs/2002.11927

  3. The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction
    Paper: https://arxiv.org/abs/1912.06445



  1. GhostNet: More Features from Cheap Operations
    Paper: https://arxiv.org/abs/1911.11907
    Code: https://github.com/iamhankai/ghostnet

  2. Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral
    Paper: https://arxiv.org/abs/2003.01826

  3. GPU-Accelerated Mobile Multi-view Style Transfer
    Paper: https://arxiv.org/abs/2003.00706

  4. Bundle Adjustment on a Graph Processor
    Paper: https://arxiv.org/abs/2003.03134
    Code: https://github.com/joeaortiz/gbp

  5. Holistically-Attracted Wireframe Parsing
    Paper: https://arxiv.org/abs/2003.01663

  6. AdderNet: Do We Really Need Multiplications in Deep Learning?
    Paper: https://arxiv.org/abs/1912.13200

  7. CARS: Continuous Evolution for Efficient Neural Architecture Search
    Paper: https://arxiv.org/abs/1909.04977
    Code: https://github.com/huawei-noah/CARS

  8. Π-nets: Deep Polynomial Neural Networks
    Paper: https://arxiv.org/abs/2003.03828

  9. Explaining Knowledge Distillation by Quantifying the Knowledge
    Paper: https://arxiv.org/abs/2003.03622



  1. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
    Paper: https://arxiv.org/abs/2002.11616

  2. Closed-loop Matters: Dual Regression Networks for Single Image Super-Resolution
    Paper: https://arxiv.org/abs/2003.07018
    Code: https://github.com/guoyongcs/DRN



  1. Visual Commonsense R-CNN
    Paper: https://arxiv.org/abs/2002.12204
    Code: https://github.com/Wangt-CN/VC-R-CNN

  2. Scalable Uncertainty for Computer Vision with Functional Variational Inference
    Paper: https://arxiv.org/abs/2003.03396

  3. Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective
    Paper: https://arxiv.org/abs/2002.10826

  4. Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs
    Paper: https://arxiv.org/abs/2003.00287

  5. Filter Grafting for Deep Neural Networks
    Paper: https://arxiv.org/abs/2001.05868
    Code: https://github.com/fxmeng/filter-grafting.git

  6. 12-in-1: Multi-Task Vision and Language Representation Learning
    Paper: https://arxiv.org/abs/1912.02315

  7. Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
    Paper: https://arxiv.org/abs/2002.10638
    Code: https://github.com/weituo12321/PREVALENT

  8. Unbiased Scene Graph Generation from Biased Training
    Paper: https://arxiv.org/abs/2002.11949

  9. Towards Visually Explaining Variational Autoencoders
    Paper: https://arxiv.org/abs/1911.07389

  1. BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
    Paper: http://www.weixiushen.com/publication/cvpr20_BBN.pdf
    Code: https://github.com/Megvii-Nanjing/BBN

  2. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
    Paper: https://arxiv.org/abs/1905.13545

  3. SAM: The Sensitivity of Attribution Methods to Hyperparameters
    Paper: http://s.anhnguyen.me/sam_cvpr2020.pdf
    Code: https://github.com/anguyen8/sam

  4. Π-nets: Deep Polynomial Neural Networks
    Paper: https://arxiv.org/abs/2003.03828

  5. Towards Backward-Compatible Representation Learning
    Paper: https://arxiv.org/abs/2003.11942

  6. On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
    Paper: https://arxiv.org/abs/2003.07064

  7. KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations (dataset)
    Paper: https://arxiv.org/abs/2002.12687



1. PolarMask: Single Shot Instance Segmentation with Polar Representation
Code: https://github.com/xieenze/PolarMask

2. Unbiased Scene Graph Generation from Biased Training
Code: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch

3. Learning to Shade Hand-drawn Sketches
Code: https://github.com/qyzdao/ShadeSketch

4. SAM: The Sensitivity of Attribution Methods to Hyperparameters
Code: https://github.com/anguyen8/sam

5. High Frequency Component Helps Explain the Generalization of Convolutional Neural Networks

6. Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

7. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

8. AdderNet: Do We Really Need Multiplications in Deep Learning?

9. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

10. Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

11. Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing
Code: https://github.com/clks-wzz/FAS-SGTD

12. Learning Meta Face Recognition in Unseen Domains
Code: https://github.com/cleardusk/MFR

13. Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching
Code: https://github.com/alibaba/cascade-stereo

14. BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition
Code: https://github.com/Megvii-Nanjing/BBN

15. Towards Backward-Compatible Representation Learning

16. MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask
Code: https://github.com/microsoft/MaskFlownet

17. Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
Code: https://ehsanik.github.io/forcecvpr2020

18. StyleRig: Rigging StyleGAN for 3D Control over Portrait Images

19. Conditional Channel Gated Networks for Task-Aware Continual Learning

20. BANet: Bidirectional Aggregation Network with Occlusion Handling for Panoptic Segmentation

21. TITAN: Future Forecast using Action Priors

22. Learning Interactions and Relationships between Movie Characters

23. GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Code: https://github.com/taksau/GPS-Net

24. A Physics-based Noise Formation Model for Extreme Low-light Raw Denoising
Code: https://github.com/Vandermode/NoiseModel

25. Controllable Person Image Synthesis with Attribute-Decomposed GAN
Project page: https://menyifang.github.io/projects/ADGAN/ADGAN.html

26. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

27. Learning to Optimize Non-Rigid Tracking

28. Self-Supervised Scene De-occlusion
Project page: https://xiaohangzhan.github.io/projects/deocclusion/

29. Robust 3D Self-portraits in Seconds

30. Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

31. Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization

32. Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

33. Deep White-Balance Editing

34. Tracking by Instance Detection: A Meta-Learning Approach



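More and more researchers are looking at how to apply causality from statistics to deep learning in order to improve its robustness, interpretability, and so on. Most of this work, however, does not go deep into causal theory and mostly borrows a few of its concepts (such as counterfactuals); this paper aims to go one step further on that basis.
Paper: https://arxiv.org/abs/2002.12204
Code: https://github.com/Wangt-CN/VC-R-CNN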

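The authors chose facebookresearch/maskrcnn-benchmark, a popular framework in 2019, as the foundation and built Scene-Graph-Benchmark.pytorch on top of it. The code is compatible with every detector model that maskrcnn-benchmark supports and, thanks to the quality of the facebookresearch codebase, greatly improves the readability and usability of the SGG part.
Paper: https://arxiv.org/abs/2002.11949
Code: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch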

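Paper: https://arxiv.org/abs/1911.04231
Code: https://github.com/ethnhe/PVN3D.git
Megvii Research proposes PVN3D, a 3D keypoint detection network based on Hough voting that learns point-wise offsets to 3D keypoints and votes for those keypoints. It extends 2D-keypoint-based methods to 3D keypoints so as to make full use of the geometric constraints of rigid objects, which greatly improves the accuracy of 6DoF estimation. Evaluated on two public datasets, YCB-Video and LineMOD, the method achieves state-of-the-art performance by a large margin.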
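
The point-wise voting idea can be illustrated with a minimal sketch. It assumes the per-point offsets are already predicted, and it aggregates votes with a plain average, which is a simplification chosen for illustration rather than the aggregation used in the paper. The function name `vote_keypoint` is hypothetical.

```python
from statistics import fmean


def vote_keypoint(points, offsets):
    """Estimate one 3D keypoint from per-point offset predictions.

    Each point casts a vote at (point + offset). Votes are averaged here,
    which is an assumption made for this sketch; a clustering step could
    be substituted to be robust against outlier votes.
    """
    votes = [tuple(p + o for p, o in zip(pt, off))
             for pt, off in zip(points, offsets)]
    return tuple(fmean(axis) for axis in zip(*votes))


# Toy data: three scene points whose offsets all point at the same keypoint.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
keypoint = (1.0, 1.0, 1.0)
offsets = [tuple(k - p for k, p in zip(keypoint, pt)) for pt in points]
estimate = vote_keypoint(points, offsets)
```

With several keypoints estimated this way, a rigid 6DoF pose could then be fitted between the estimated keypoints and their known positions on the object model.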

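Paper: https://arxiv.org/abs/2002.12489
This work focuses on infrared-RGB cross-modality person re-identification. The problem it addresses: most previous cross-modality ReID algorithms focus only on shared feature learning and rarely on modality-specific features, because specific features do not exist in the other modality. Infrared images, for example, carry no color information, and conversely color images carry no thermal information. Yet anyone who has worked on ReID knows that conventional ReID performs so well largely because it somewhat "overfits" to this specific information; clothing color, for instance, has always been an important cue in conventional ReID. Starting from this observation, the paper tries to exploit specific features. The main idea is to use neighbor information: given an infrared query, when searching color targets, first find some easy, high-confidence color samples (which are very likely positives of the infrared query) and transfer their color-specific features to the infrared query. The infrared query can then use this color information to search for harder color samples.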

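Paper: https://arxiv.org/abs/1911.11236
Code: https://github.com/QingyongHu/RandLA-Net
The paper proposes RandLA-Net, a network built on simple and efficient random downsampling and local feature aggregation. It achieves very good results on large-scale point cloud segmentation datasets such as Semantic3D and SemanticKITTI while being highly efficient (e.g. nearly 200 times faster than the graph-based method SPG).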
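
As a rough illustration of why random downsampling is cheap, the sketch below keeps a fixed fraction of points by uniform random selection, with no distance computations at all. The function name, the `ratio` parameter, and the toy point cloud are assumptions for illustration; the local feature aggregation step is not shown.

```python
import random


def random_downsample(points, ratio, seed=None):
    """Keep a uniformly random subset of the points.

    The cost does not depend on point-to-point distances, unlike sampling
    schemes that must compare points with each other.
    """
    rng = random.Random(seed)
    keep = max(1, int(len(points) * ratio))
    return rng.sample(points, keep)


# Toy point cloud of 100 points on a line, reduced to a quarter of its size.
cloud = [(float(i), 0.0, 0.0) for i in range(100)]
sampled = random_downsample(cloud, 0.25, seed=0)
```

Because random selection may drop informative points, a network using it needs a way to preserve local structure, which is the role the text assigns to local feature aggregation.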

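Paper: https://arxiv.org/abs/1908.01998
The paper proposes a new few-shot object detection algorithm whose novelties include Attention-RPN, a multi-relation detector, and a contrastive training strategy. It also builds FSOD, a few-shot detection dataset with 1000 categories; a model trained on FSOD can be applied directly to detecting novel categories without fine-tuning.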


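Paper: https://arxiv.org/abs/1909.04977
To address the overly long training of candidate networks when evolutionary algorithms are used for neural architecture search, and drawing on ENAS and NSGA-III, the paper proposes continuous evolution architecture search (CARS), which makes maximal use of learned knowledge such as the architectures and parameters from the previous round of evolution. It first builds a supernet for parameter sharing, generates subnets from the supernet, and then uses a non-dominated sorting strategy to select good networks of different sizes. The whole search takes only 0.5 GPU days.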
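
Non-dominated sorting, the selection step mentioned above, can be sketched generically as follows. This is the textbook procedure over objective tuples that are all minimized, not the specific variant used in CARS; the candidate values (error, model size) are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_sort(objectives):
    """Split candidate indices into successive Pareto fronts."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        # A candidate is in the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical subnets described by (validation error, size in M params).
candidates = [(0.30, 2.0), (0.25, 4.0), (0.28, 5.0), (0.35, 1.5), (0.40, 3.0)]
fronts = non_dominated_sort(candidates)
```

The first front holds the networks that offer the best available trade-offs between accuracy and size, which is how a single search can return good models at several sizes.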

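Paper: https://arxiv.org/abs/2002.11359
The paper proposes pseudo supervised object localization (PSOL) to address the problems of current weakly supervised object localization methods. It splits localization and classification into two independent networks, then uses Deep Descriptor Transformation (DDT) on the training set to generate pseudo ground truth for training, reaching SOTA overall. The paper makes three main contributions: (1) weakly supervised object localization should be divided into two independent parts, class-agnostic object localization and object classification, which leads to the PSOL algorithm; (2) although the generated bounding boxes are biased, the paper argues they should still be optimized directly without class labels, which ultimately reaches SOTA; (3) across datasets, PSOL transfers its localization ability well without fine-tuning.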

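Paper: https://arxiv.org/pdf/2002.10322.pdf
In this work, we propose a new solution for 3D human pose estimation in videos. Instead of directly regressing 3D joint locations, we draw inspiration from the anatomy of the human skeleton and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be fully derived. Our motivation is that the bone lengths of a human skeleton remain consistent over time. This drives us to develop effective techniques that exploit global information across all frames of a video for high-accuracy bone length prediction. For the bone direction prediction network, we propose a fully convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units (e.g. LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed the 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates depth ambiguity in many challenging poses. Our full model outperforms the previous best results on the Human3.6M and MPI-INF-3DHP datasets, and comprehensive evaluation on these datasets validates the effectiveness of our model.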
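
The claim that joint locations follow from bone directions and bone lengths can be made concrete with a small sketch. It assumes a kinematic tree given as a hypothetical `parents` list in which every parent precedes its children; the three-joint skeleton and its values are invented for illustration.

```python
import math


def joints_from_bones(root, parents, directions, lengths):
    """Recover 3D joint positions by walking the kinematic tree.

    parents[j] is the parent index of joint j (-1 for the root), and
    joints are ordered so that a parent always comes before its children.
    Each child sits at parent + length * unit(direction).
    """
    joints = [None] * len(parents)
    for j, parent in enumerate(parents):
        if parent < 0:
            joints[j] = tuple(root)
            continue
        d = directions[j]
        norm = math.sqrt(sum(c * c for c in d))
        unit = tuple(c / norm for c in d)
        joints[j] = tuple(p + lengths[j] * u
                          for p, u in zip(joints[parent], unit))
    return joints


# Toy chain: root -> joint 1 (straight up) -> joint 2 (along x).
parents = [-1, 0, 1]
directions = [None, (0.0, 2.0, 0.0), (3.0, 0.0, 0.0)]
lengths = [None, 0.5, 0.25]
joints = joints_from_bones((0.0, 0.0, 0.0), parents, directions, lengths)
```

Since the lengths are fixed per person, they can be estimated once from the whole video, while the directions change frame by frame.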

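Paper: https://arxiv.org/pdf/1912.13458.pdf
Microsoft Research Asia proposes a method that needs neither face-swapped image data nor knowledge of the face-swapping algorithm, yet can take an "X-ray" of an image to determine whether a face has been swapped and to indicate the swap boundary. The new model, Face X-Ray, has two key properties: it generalizes to unseen face-swapping algorithms, and it provides an interpretable swap boundary. The key to these properties lies in the general pipeline of face-swapping algorithms, most of which can be divided into three stages: detection, modification, and blending. Unlike previous work, Face X-Ray aims to detect the artifacts produced in the third stage.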

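Paper: https://arxiv.org/abs/1911.07524
UDP addresses the large statistical error in the standard encoding-decoding methods used by existing SOTA human pose estimation algorithms. It also resolves the misalignment of results caused by flip testing. The method is plug-and-play and effectively improves performance with almost no increase in model complexity.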

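Paper: https://arxiv.org/abs/1911.13239
In a composite image, the foreground and background were captured under different conditions (such as time of day, season, lighting, and weather), so they are visibly mismatched in brightness, color tone, and similar properties. Image harmonization aims to adjust the foreground of a composite image so that it is harmonious with the background. Traditional image harmonization methods generally transfer color information from the background or from other images onto the foreground, but this cannot guarantee that the adjusted foreground looks realistic and harmonious with the background. In recent years, a small number of works have tried deep learning for image harmonization, but paired composite and real images are extremely hard to obtain. Without such pairs, the training process lacks sufficiently strong supervision, and the harmonized results have no ground truth for evaluation. Since no large-scale public image harmonization dataset existed so far, we built and released an image harmonization dataset consisting of four sub-datasets. We also propose the concept of domain verification and explore an image harmonization algorithm based on it.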

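Paper: https://arxiv.org/abs/1909.13226
PolarMask builds on FCOS and brings instance segmentation into the FCN framework. FCOS is essentially an FCN-style dense prediction detection framework that performs on par with anchor-based object detectors, showing the field the potential of anchor-free methods. The next problem to solve is instance segmentation. The main contribution of this work is to turn the more complex instance segmentation problem into a task that is as simple as object detection in network design and computational complexity, making the modeling of instance segmentation simple and efficient.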
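
The polar representation named in the paper title can be sketched as a conversion from a center point and a set of ray lengths, taken at evenly spaced angles, to contour vertices. The function name and the ray count below are assumptions for illustration; in a real model the center and lengths would be network outputs.

```python
import math


def polar_to_contour(center, ray_lengths):
    """Convert a center and per-angle ray lengths into contour vertices.

    Ray k is assumed to point at angle 2*pi*k/n, where n is the number
    of rays, so a mask is described by n numbers plus the center.
    """
    cx, cy = center
    n = len(ray_lengths)
    contour = []
    for k, r in enumerate(ray_lengths):
        theta = 2 * math.pi * k / n
        contour.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return contour


# A circle of radius 5 around (10, 20), described by 36 equal rays.
contour = polar_to_contour((10.0, 20.0), [5.0] * 36)
```

Under this view, predicting a mask becomes a fixed-length regression per location, much like regressing the four sides of a bounding box in a detector.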

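Paper: https://arxiv.org/abs/1911.11907
The paper introduces a new Ghost module that aims to generate more feature maps through cheap operations. Starting from a set of intrinsic feature maps, the authors apply a series of linear transformations to generate, at low cost, many "ghost" feature maps that can reveal the information underlying the intrinsic features. The Ghost module is plug-and-play: stacking Ghost modules yields the Ghost bottleneck, from which the lightweight network GhostNet is built. On ImageNet classification, GhostNet reaches 75.7% Top-1 accuracy at a similar computational cost, higher than the 75.2% of MobileNetV3.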
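
A minimal pure-Python sketch of the Ghost module idea follows. It assumes the intrinsic feature maps are already computed by an ordinary convolution, and it uses a per-map 3x3 filter with zero padding as the cheap linear operation. The helper names and kernels are hypothetical and chosen only for illustration.

```python
def cheap_op(feature_map, kernel):
    """Apply one 3x3 kernel to a single feature map with zero padding.

    This stands in for a cheap per-channel linear transformation.
    """
    h, w = len(feature_map), len(feature_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * feature_map[yy][xx]
            out[y][x] = acc
    return out


def ghost_module(intrinsic_maps, kernels_per_map):
    """Concatenate intrinsic maps with ghost maps derived from them."""
    ghosts = [cheap_op(fm, k)
              for fm, kernels in zip(intrinsic_maps, kernels_per_map)
              for k in kernels]
    return intrinsic_maps + ghosts


IDENTITY = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
BOX = [[1.0 / 9.0] * 3 for _ in range(3)]

# Two intrinsic 2x2 maps, one cheap operation each -> four output maps.
intrinsic = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
output = ghost_module(intrinsic, [[IDENTITY], [BOX]])
```

The output has twice as many channels as the intrinsic maps, while the extra channels cost only one small filter each instead of a full convolution over all input channels.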

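Paper: https://arxiv.org/abs/1912.02315
Much vision-and-language research focuses on a small but diverse set of independent tasks and their supporting datasets, which are usually studied in isolation; yet the visually grounded language understanding skills needed to succeed at these tasks overlap significantly. In this work, we investigate the relationships between vision-and-language tasks by developing a large-scale, multi-task training regime.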



  • Timely updates of CVPR2020 reproduction code
  • Follow-up on CVPR2020 paper-sharing sessions



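6. CVPR2020 contributors WeChat Group

To make it easier for everyone to communicate, 极市 (Jishi) has set up a contributors group and an authors' WeChat group. To join, add the assistant on WeChat (cv-mart, with the note CVPR2020).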