Computer Vision Research Work

When we talk about “vision” capabilities, most people underestimate how much work the brain does to process the visible spectrum (light signals). What kind of processing happens inside our brain that allows us to understand color, depth, motion, speed, segments, objects, scenes, different kinds of art, drawings, culture, and so on? Until “computer vision” became a serious field in AI, only neurology researchers, surgeons, and brain specialists had some insight into these processes. Since 2012 (the AlexNet paper), new papers have appeared almost every month, and each one shows how far computer vision has come. This article is not only a chronology of computer vision; it is also for software engineers, computer scientists, AI engineers, and anyone who wants to understand how their phone performs certain computer vision tasks and appears intelligent.

SNo Research Name Short Description of Paper Month-Year Organization URL
1 DeconvNet Deconvolutional Networks for Feature Learning Nov 2010 KAIST Paper, Blog
2 Saliency Propagation A method for salient object detection that propagates saliency information through optimization Apr 2014 Chinese Academy of Sciences Paper
3 SDS Simultaneous Detection and Segmentation Jun 2014 UC Berkeley Paper
4 GoogLeNet Introduced the Inception module to increase network depth and width efficiently. Sep 2014 Google
5 VGGNet Used small 3x3 convolution filters to increase depth, achieving high accuracy. Sep 2014 Oxford University
6 FCN Fully Convolutional Networks for semantic segmentation Nov 2014 UC Berkeley Paper
7 HyperColumn Multi-scale CNN feature fusion Nov 2014 UC Berkeley Paper
8 DeepLab v1 Semantic Image Segmentation with Deep Convolutional Nets and CRFs Dec 2014 Google Paper
9 U-Net Convolutional network for biomedical image segmentation May 2015 University of Freiburg Paper, Blog
10 Highway Network Proposed highway layers to enable training of very deep networks. May 2015 IDSIA (Swiss AI Lab)
11 YOLO Series You Only Look Once: series of real-time object detection systems (v1-v4) Jun 2015 (v1) - Apr 2020 (v4) University of Washington, Darknet Paper, Blog
12 CRF-RNN Conditional Random Fields as Recurrent Neural Networks Jun 2015 University of Oxford Paper
13 MR-CNN & S-CNN Multi-Region CNN and Semantic Segmentation-Aware CNN for object detection Jun 2015 École des Ponts ParisTech Paper
14 DeepMask Learning to Segment Objects Candidates Jun 2015 Facebook AI Research Paper
15 LAPGAN Laplacian Pyramid of Generative Adversarial Networks for image generation Jun 2015 Facebook AI Research Paper
16 CUMedVision1 Medical Image Segmentation System 1 Sep 2015 Chinese University of Hong Kong Paper, Blog
17 SegNet Deep Convolutional Encoder-Decoder Architecture for Image Segmentation Oct 2015 University of Cambridge Paper
18 DilatedNet Multi-Scale Context Aggregation by Dilated Convolutions Nov 2015 Princeton University Paper
19 CAM Class Activation Mapping for identifying discriminative regions Dec 2015 MIT Paper
20 ParseNet Looking Wider to See Better for semantic segmentation Dec 2015 UNC Chapel Hill Paper
21 MNC Instance-aware Semantic Segmentation via Multi-task Network Cascades Dec 2015 Microsoft Research Paper
22 ResNet Introduced residual learning to address vanishing gradients in deep networks. Dec 2015 Microsoft Research
23 SqueezeNet AlexNet-level accuracy with 50x fewer parameters Feb 2016 UC Berkeley, Stanford Paper, Blog
24 SqueezeNet Designed to reduce model size while maintaining accuracy, using 1x1 convolutions. Feb 2016 DeepScale, UC Berkeley
25 Pre-activation ResNet Identity Mappings in Deep Residual Networks Mar 2016 Microsoft Research Paper
26 SharpMask Learning to Refine Object Segments Mar 2016 Facebook AI Research Paper
27 InstanceFCN Instance-sensitive Fully Convolutional Networks Mar 2016 Microsoft Research Paper
28 MultipathNet Multiple Path Aggregation Network Apr 2016 Facebook AI Research Paper
29 R-FCN Region-based Fully Convolutional Networks for object detection May 2016 Microsoft Research Paper
30 NoC Object Detection Networks on Convolutional Feature Maps May 2016 Microsoft Research Paper
31 DeepLab v2 Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution Jun 2016 Google Paper
32 DeepSim Deep Learning Approach for Image Quality Assessment Jun 2016 Tsinghua University Paper
33 DIS Deep Image Smoothing Jun 2016 University of Illinois Paper
34 V-Net Fully Convolutional Neural Network for volumetric medical image segmentation Jun 2016 University College London Paper
35 3D U-net Volumetric Segmentation with 3D U-net Jun 2016 University of Freiburg Paper
36 ENet Efficient Neural Network for Real-time Semantic Segmentation Jul 2016 University of Cambridge Paper
37 ResNet38 Wider or Deeper: Revisiting the ResNet Model Jul 2016 University of Adelaide Paper
38 DRRN Deep Recursive Residual Network for image super-resolution Jul 2016 National University of Singapore Paper
39 Multi-Channel Multi-Channel CNN for medical image analysis Jul 2016 University of California, San Diego Paper
40 GCN Graph Convolutional Networks for processing graph-structured data Sep 2016 University of Amsterdam Paper
41 M²FCN Multi-modal Fully Convolutional Networks for medical imaging Sep 2016 Chinese Academy of Sciences Paper, Blog
42 Graph CNN Graph Convolutional Neural Networks Sep 2016 University of Montreal Paper
43 Grad-CAM Gradient-weighted Class Activation Mapping Oct 2016 Georgia Tech Paper
44 ResNeXt Aggregated Residual Transformations for Deep Neural Networks Nov 2016 Facebook AI Research Paper
45 DRN Dilated Residual Networks Nov 2016 Princeton University Paper
46 RefineNet Multi-Path Refinement Networks for high-resolution semantic segmentation Nov 2016 University of Adelaide Paper
47 FractalNet Ultra-Deep Neural Networks without Residuals Nov 2016 University of Chicago, TTI-Chicago Paper
48 SSD Single Shot MultiBox Detector for real-time object detection Dec 2016 Google Paper, Blog
49 TDM Top-Down Modulation for object detection Dec 2016 Carnegie Mellon University Paper
50 FPN Feature Pyramid Networks for object detection Dec 2016 Facebook AI Research Paper
51 VoxResNet Deep Voxelwise Residual Networks Dec 2016 Chinese Academy of Sciences Paper
52 DSSD Deconvolutional Single Shot Detector Jan 2017 UNC Chapel Hill Paper
53 PolyNet Better Vision with More Complex Paths Mar 2017 Microsoft Research Paper
54 IGCNet Interleaved Group Convolutions Mar 2017 Microsoft Research Paper
55 DCN Deformable Convolutional Networks Mar 2017 Microsoft Research Asia Paper
56 IDW-CNN Image Dependent Warping CNN Mar 2017 Seoul National University Paper
57 FCIS Fully Convolutional Instance-aware Semantic Segmentation Mar 2017 Microsoft Research Asia Paper
58 Residual Attention Network Attention mechanism for image classification Apr 2017 Tsinghua University Paper
59 ResNet-DUC-HDC Dense Upsampling Convolution and Hybrid Dilated Convolution Apr 2017 Tsinghua University Paper
60 MobileNet Focused on efficient models for mobile and embedded devices using depthwise separable convolutions. Apr 2017 Google
61 G-RMI Google’s large scale object detection system Jun 2017 Google Research Paper
62 GraphSAGE Inductive Representation Learning on Large Graphs Jun 2017 Stanford University Paper
63 DPN Dual Path Networks combining ResNet and DenseNet Jul 2017 UCSD, Momenta Paper
64 ERFNet Efficient Residual Factorized ConvNet for real-time semantic segmentation Jul 2017 Universidad de Alcalá Paper
65 Suggestive Annotation Active Learning for medical image segmentation Jul 2017 ETH Zurich Paper
66 RetinaNet Focal Loss for Dense Object Detection Aug 2017 Facebook AI Research Paper, Blog
67 Hide-and-Seek Weakly-supervised object detection training strategy Aug 2017 Carnegie Mellon University Paper
68 C3 Cross-City Cascade for semantic segmentation Aug 2017 University of Oxford Paper
69 U-net+Res-net Combined U-net and Residual Network for medical segmentation Aug 2017 Technical University of Munich Paper
70 DenseVoxNet Dense Voxel Network for 3D medical image segmentation Sep 2017 Chinese University of Hong Kong Paper
71 Graph Attention Networks Self-attention for Graph Data Oct 2017 Université de Montréal Paper
72 Light-Head R-CNN Light-weight object detection architecture Nov 2017 Megvii Technology Paper
73 LayerCascade Instance segmentation via layer cascade Nov 2017 University of Washington Paper
74 3D U-net + ResNet Combined 3D U-net and ResNet for volumetric segmentation Nov 2017 Technical University of Munich Paper
75 Cascade R-CNN Multi-stage object detection refinement Dec 2017 UC San Diego Paper
76 StairNet Top-down semantic feature refinement Dec 2017 Seoul National University Paper
77 MaskLab Instance Segmentation by Refining Object Detection Jan 2018 Google Paper
78 RU-Net + R2U-Net Recurrent Residual U-Net variants Jan 2018 University of Dhaka Paper
79 AmoebaNet Evolutionary Architecture Search Feb 2018 Google Brain Paper
80 SqueezeNext Hardware-Aware Neural Network Design Feb 2018 UC Berkeley Paper
81 ENAS Efficient Neural Architecture Search Feb 2018 Google Brain Paper
82 DeepLab v3+ Encoder-Decoder with Atrous Separable Convolution Feb 2018 Google Paper, Blog
83 Group Normalization Alternative to Batch Normalization Mar 2018 Facebook AI Research Paper
84 ACoL Adversarial Complementary Learning for weakly supervised object localization Mar 2018 University of Technology Sydney Paper
85 BR²Net Boundary Refinement and Recurrent Network for semantic segmentation Mar 2018 Tsinghua University Paper
86 PANet Path Aggregation Network for Instance Segmentation Mar 2018 Chinese Academy of Sciences Paper
87 MorphNet Fast & Simple Resource-Constrained Structure Learning Apr 2018 Google Research Paper
88 ImageNet Rethinking Research on ImageNet training strategies Apr 2018 Facebook AI Research Paper
89 Attention U-net Attention Gates for Medical Image Segmentation Apr 2018 University College London Paper
90 MegNet Multi-Evidence Guidance for weakly supervised object detection Jun 2018 University of Technology Sydney Paper
91 H-DenseUNet Hybrid Densely Connected UNet for medical segmentation Jun 2018 Chinese University of Hong Kong Paper
92 PNASNet Progressive Neural Architecture Search Jul 2018 Google Brain Paper
93 ShuffleNetV2 Practical Guidelines for Mobile Network Design Jul 2018 Face++ Paper
94 BAM Bottleneck Attention Module Jul 2018 KAIST Paper
95 CBAM Convolutional Block Attention Module Jul 2018 KAIST Paper
96 NetAdapt Platform-Aware Neural Network Adaptation Jul 2018 MIT Paper
97 U-Net++ Nested U-Net Architecture Jul 2018 Arizona State University Paper
98 DU-Net Deformable U-Net for medical image segmentation Aug 2018 Shanghai Jiao Tong University Paper
99 DropBlock Structured dropout method for convolutional networks Oct 2018 Google Brain Paper
100 AutoDeepLab Neural Architecture Search for Semantic Image Segmentation Jan 2019 Google Research Paper
101 ESPNetv2 Efficient Spatial Pyramid of Dilated Convolutions Mar 2019 MIT Paper
102 SiamRPN++ Deep learning-based visual tracking framework that uses a spatial-aware sampling strategy to train Siamese trackers on deep backbones and aggregates features across different layers Mar 2019 Chinese Academy of Sciences Paper
103 Libra R-CNN Balanced learning framework for object detection that addresses sample level, feature level, and objective level imbalance Apr 2019 SenseTime Research Paper
104 FBNet Hardware-Aware Efficient ConvNet Design May 2019 Facebook AI Research Paper
105 SDN Selective Deep Network for efficient visual recognition May 2019 University of Texas Paper
106 MultiResUNet Multi-Resolution U-Net for medical image segmentation May 2019 Bangladesh University Paper
107 EfficientNet Scaled networks uniformly in depth, width, and resolution for better efficiency. May 2019 Google
108 ADL Attention-based Dropout Layer for weakly supervised object localization Jun 2019 KAIST Paper
109 ARMA Convolution Auto-Regressive Moving Average Graph Filtering Jun 2019 Università degli Studi di Modena Paper
110 Panoptic Segmentation Unified Scene Parsing Framework Jun 2019 Facebook AI Research Paper
111 CutMix Data augmentation method combining cut and mix images Aug 2019 Clova AI Research, NAVER Paper
112 SlowFast Two-pathway network for video recognition that captures both slow and fast motion patterns Aug 2019 Facebook AI Research Paper
113 EfficientDet Scalable object detection architecture using a weighted bidirectional feature pyramid network (BiFPN) and compound scaling Nov 2019 Google Research Paper
114 AdderNet Neural Networks with Only Addition Operations Dec 2019 Huawei Noah’s Ark Lab Paper
115 TPN Temporal Pyramid Network for action detection in videos Dec 2019 Microsoft Research Asia Paper
116 ATSS Adaptive Training Sample Selection for object detection Dec 2019 ByteDance AI Lab Paper
117 ACNe Attentive Context Normalization for robust permutation-equivariant learning Dec 2019 KAIST Paper
118 Cascade Cost Volume Cascade Cost Volume for stereo matching Dec 2019 Megvii Technology Paper
119 Yolact++ Real-time instance segmentation with improved mask quality and inference speed Jan 2020 University of California, Davis Paper
120 MCN Multi-task Collaboration Network Jan 2020 Microsoft Research Asia Paper
121 RandLA-Net Large-scale Point Cloud Semantic Segmentation Jan 2020 University of Oxford Paper
122 OccuSeg 3D instance segmentation approach that handles occlusions in point clouds Mar 2020 Stanford University Paper
123 GTAD Global Temporal Action Detection framework for temporal action localization Mar 2020 Sun Yat-sen University Paper
124 Attention-RPN Visual tracking framework with attention mechanism in Region Proposal Network Mar 2020 Chinese Academy of Sciences Paper
125 QSA + QNT Quantized Squeeze-and-Attention Networks Mar 2020 Tsinghua University Paper
126 UNet 3+ Full-Scale Connected UNet for medical image segmentation Mar 2020 Southern Medical University Paper
127 ROAM Recurrently Optimizing Tracking Model Mar 2020 ByteDance AI Lab Paper
128 PF-NET Point Fractal Network for 3D point cloud completion Mar 2020 Simon Fraser University Paper
129 Total3DUnderstanding 3D Scene Understanding Mar 2020 National University of Singapore Paper
130 SG-NN Scene Graph Neural Networks Mar 2020 Georgia Tech Paper
131 SEAN Semantic Region-Adaptive Normalization Mar 2020 ETH Zürich Paper
132 SAOL Self-Attention Object Localization Apr 2020 Seoul National University Paper
133 VGGNet For Covid19 Modified VGG architecture for COVID-19 detection Apr 2020 Multiple Institutions Paper
134 CentripetalNet Anchor-free object detection with point-based prediction Apr 2020 Megvii Technology Paper
135 PointAugment Auto-Augmentation for 3D Point Cloud Apr 2020 National University of Singapore Paper
136 PQ-Net Learning to Generate 3D Shapes Apr 2020 Stanford University Paper
137 Axial-DeepLab Stand-Alone Axial-Attention for Vision Models Apr 2020 Johns Hopkins University Paper
138 SipMask Spatial Information Preservation for Fast Instance Segmentation Apr 2020 Inception Institute of AI Paper
139 SCAN Learning to Classify Images without Labels Apr 2020 Facebook AI Research Paper
140 MutualNet Adaptive ConvNet via Mutual Learning Apr 2020 Microsoft Research Asia Paper
141 DETR End-to-End Object Detection with Transformers May 2020 Facebook AI Research Paper
142 C-Flow Conditional Normalizing Flows May 2020 ETH Zürich Paper
143 PerfectShape Shape completion using implicit functions May 2020 Stanford University Paper
144 UFO² Unified Framework for Object Detection May 2020 Carnegie Mellon University Paper
145 Refinement Network RGB-D Scene Understanding May 2020 Technical University Munich Paper
146 AssembleNet++ Video Recognition with Learnable Connectivity May 2020 Google Research Paper
147 WeightNet Revisiting Weight Networks May 2020 Microsoft Research Paper
148 YOLOv5 Improved version of YOLO with better speed-accuracy trade-off Jun 2020 Ultralytics Paper
149 UCTGAN Unsupervised Cartoon-to-Real Translation GAN for image translation between cartoon and real-world domains Jun 2020 Nanyang Technological University Paper
150 IF-Nets Implicit Function Neural Networks for 3D reconstruction Jun 2020 Max Planck Institute Paper
151 SketchGCN Sketch Recognition using Graph Convolutional Networks Jun 2020 University of British Columbia Paper
152 AABO Adaptive Anchor Box Optimization Jun 2020 Huawei Noah’s Ark Lab Paper
153 Polka Lines Line Detection using Polar Coordinates Jun 2020 Korea University Paper
154 Pose2Mesh 3D Human Pose and Mesh Recovery Jun 2020 Korea University Paper
155 SNE-RoadSeg Road Segmentation with Synthetic Data Jun 2020 Hong Kong University Paper
156 Deep Hough Transform Line Detection using Deep Learning Jun 2020 Chinese Academy of Sciences Paper
157 Non-Local Sparse Attention Efficient Attention Mechanism Jun 2020 Google Research Paper
158 Hit-Detector Hierarchical Trinity architecture for object detection combining different detection paradigms Jul 2020 ByteDance AI Lab Paper
159 Spectral 3D Computer Vision Graph Neural Network Library Jul 2020 Multiple Contributors Paper
160 TIDE Error Analysis Tool for Object Detection Jul 2020 Carnegie Mellon University Paper
161 SimAug Learning Robust Representations through Simulation Jul 2020 Carnegie Mellon University Paper
162 HOTR End-to-End Human-Object Interaction Detection Jul 2020 KAIST Paper
163 ReXNet Rethinking Channel Dimensions for Efficient Model Design Jul 2020 UC Berkeley Paper
164 Keep Eyes on the Lane Lane Detection with Deep Learning Jul 2020 Shanghai Jiao Tong University Paper
165 AdvPC Adversarial Point Cloud Defense Jul 2020 Tsinghua University Paper
166 PD-GAN Probabilistic Diverse GAN Jul 2020 University of Oxford Paper
167 FedDG Federated Domain Generalization Jul 2020 Carnegie Mellon University Paper
168 Dynamic RCNN Dynamic R-CNN for object detection with improved training and inference Aug 2020 ByteDance AI Lab Paper
169 Aug-FPN Augmented Feature Pyramid Network for object detection with improved multi-scale feature fusion Aug 2020 Tsinghua University Paper
170 Instant-teaching Self-training for Object Detection Aug 2020 ByteDance AI Lab Paper
171 Soft-IntroVAE Soft Introduction of Variational AutoEncoders Aug 2020 Tel Aviv University Paper
172 DiNTS Differentiable Neural Network Transform Search Aug 2020 Microsoft Research Paper
173 Eagle Eye Fast Sub-net Evaluation for Efficient Neural Network Training Aug 2020 MIT Paper
174 StyleMapGAN Exploiting Spatial Dimensions of Latent for Image Manipulation Aug 2020 KAIST Paper
175 TediGAN Text-Guided Diverse Image Generation Aug 2020 Microsoft Research Asia Paper
176 Auto-Exposure Fusion Automatic Exposure Fusion for Photography Aug 2020 ETH Zürich Paper
177 Vision Transformer Transformer architecture adapted for image recognition tasks Sep 2020 Google Research Paper
178 IDU Instance Depth Embedding for RGB-D salient object detection Sep 2020 Nankai University Paper
179 VideoMoCo Contrastive Learning for Video Understanding Sep 2020 Microsoft Research Asia Paper
180 MZSR Meta-Transfer Learning for Zero-Shot Super-Resolution Nov 2020 KAIST Paper
181 DeiT Data-efficient training of image transformers Dec 2020 Facebook AI Research Paper
182 Involution Inverting Convolution for Visual Recognition Dec 2020 Shanghai AI Lab Paper
183 Deep Learning on Semantic Segmentation Comprehensive Survey and Benchmark Dec 2020 Chinese Academy of Sciences Paper
184 LiteFlowNet3 Lightweight Optical Flow Estimation Dec 2020 Chinese University of Hong Kong Paper
185 PPDM Parallel Point Detection and Matching Dec 2020 ByteDance AI Lab Paper
186 RepVGG Making VGG-style ConvNets Great Again Jan 2021 MEGVII Technology Paper
187 PSConvolution Parameter-Sharing Convolution for Deep Learning Jan 2021 Tsinghua University Paper
188 PerPixel Classification Pixel-wise Classification Network Jan 2021 ETH Zürich Paper
189 PIPAL Perceptual Image Quality Assessment Jan 2021 Nanyang Technological University Paper
190 ArtGAN Artwork Synthesis with GAN Feb 2021 NVIDIA Research Paper
191 Synthetic to Real Domain Adaptation for Semantic Segmentation Feb 2021 ETH Zürich Paper
192 Spatial-Phase-Shallow-Learning Phase-Based Feature Learning Feb 2021 Peking University Paper
193 DARKGAN Dark Image Enhancement with GAN Feb 2021 Tsinghua University Paper
194 Deep Imbalance Regression Learning from Imbalanced Data Feb 2021 Carnegie Mellon University Paper
195 Room Classification GNN Graph Neural Network for Room Layout Feb 2021 Facebook Research Paper
196 Pyramid Vision Transformer Hierarchical Vision Transformer Feb 2021 KAIST Paper
197 Residual Attention Attention Mechanism for CNNs Feb 2021 Google Research Paper
198 Teachers do more than teach Multi-teacher approach for image-to-image translation Mar 2021 Tel Aviv University Paper
199 Vip-DeepLab Visual Parsing DeepLab for Panoptic Segmentation Mar 2021 Google Research Paper
200 HistoGAN Histological Image Generation with GAN Mar 2021 University of Oxford Paper
201 Anchor-Free Person Search End-to-End Person Search without Anchors Mar 2021 Chinese Academy of Sciences Paper
202 CBNetV2 Composite Backbone Network Mar 2021 Megvii Technology Paper
203 Kaleido-BERT Vision-Language Pre-training Mar 2021 Microsoft Research Asia Paper
204 Elastic Graph Neural Network Adaptive Graph Structure Learning Mar 2021 Stanford University Paper
205 Rank and Sort Loss Loss Function for Object Detection Mar 2021 ByteDance AI Lab Paper
206 EigenGAN Eigenvalue-Based GAN Architecture Mar 2021 MIT Paper
207 DetCo Unsupervised Detection Pre-training Mar 2021 Microsoft Research Asia Paper
208 MG-GAN Multi-Generator GAN Mar 2021 NVIDIA Research Paper
209 AdaAttN Adaptive Attention for Style Transfer Mar 2021 Microsoft Research Asia Paper
210 AirBERT Vision-Language Model for Aerial Images Mar 2021 Chinese Academy of Sciences Paper
211 DeepGCNs Deep Graph Convolutional Networks Mar 2021 KAUST Paper
212 Survey: Instance Segmentation Comprehensive review of instance segmentation methods Mar 2021 Multiple Institutions Paper
213 LoFTR Local Feature TRansformer for establishing dense correspondences between images Apr 2021 Zhejiang University Paper
214 Semantic Image Matting Matting with Semantic Guidance Apr 2021 ByteDance AI Lab Paper
215 EfficientNetV2 Improved EfficientNet Architecture Apr 2021 Google Research Paper
216 Closed-Loop Matters Dual Regression for Image Generation Apr 2021 University of Oxford Paper
217 Mobile-Former Mobile-Friendly Transformer Apr 2021 Microsoft Research Paper
218 GNeRF Generalizable Neural Radiance Fields Apr 2021 UC Berkeley Paper
219 DETR with Modulated Co-Attention Enhanced DETR Architecture Apr 2021 Facebook AI Research Paper
220 Adaptable GAN Encoders Flexible GAN Inversion Apr 2021 Adobe Research Paper
221 Conformer Local Features Meet Global Dependencies Apr 2021 Shanghai AI Lab Paper
222 VMNet Visual Manipulation Networks Apr 2021 Stanford University Paper
223 Battle of Network Structure Network Architecture Comparison Study Apr 2021 Google Research Paper
224 Efficient Person Search Fast Person Search Framework Apr 2021 University of Technology Sydney Paper
225 SLIDE Smart Learning on Large-Scale Data Apr 2021 Carnegie Mellon University Paper
226 SOTR Transformer for Set Operations Apr 2021 Tsinghua University Paper
227 CANet Class-Agnostic Segmentation Networks Apr 2021 UC Berkeley Paper
228 YOLOP Real-time Driving Perception May 2021 Huawei Noah’s Ark Lab Paper
229 InSeGAN Interactive Segmentation with GAN May 2021 Adobe Research Paper
230 GroupFormer Group-Based Attention May 2021 Microsoft Research Paper
231 Super Neuron Neural Architecture Enhancement May 2021 MIT Paper
232 SO-Pose Self-Occlusion Aware Pose Estimation May 2021 NVIDIA Research Paper
233 TxT Text-driven Text Generation May 2021 Google Research Paper
234 OS2D One-Stage 2D Object Detection May 2021 Yandex Research Paper
235 CodeNet Large-Scale Code Dataset May 2021 IBM Research Paper
236 Geometric Deep Learning Blueprint for designing architectures for geometric data May 2021 Imperial College London Paper
237 Oriented R-CNN Oriented Object Detection Jun 2021 Tongji University Paper
238 XVFI Video Frame Interpolation Jun 2021 KAIST Paper
239 Cross Domain Contrastive Learning Domain Adaptation via Contrastive Learning Jun 2021 Microsoft Research Paper
240 PointManifoldCut Data Augmentation for Point Clouds Jun 2021 Stanford University Paper
241 Distance IOU Loss Improved Loss Function for Object Detection Jun 2021 Tsinghua University Paper
242 ConvMLP Convolutional MLP Architecture Jul 2021 University of Oregon Paper
243 Graph-FPN Feature Pyramid Networks with Graph Neural Networks Jul 2021 Carnegie Mellon University Paper
244 WatchOut! Motion Blur Impact on DNNs Jul 2021 ETH Zürich Paper
245 ECA-Net Efficient Channel Attention Network Jul 2021 Tsinghua University Paper
246 ShiftAddNet Efficient Neural Network Training Aug 2021 MIT Paper
247 Deep Imitation Learning Survey of Imitation Learning Methods Aug 2021 DeepMind Paper
248 3DETR 3D Object Detection with Transformers Aug 2021 Facebook AI Research Paper
249 ByteTrack Multi-Object Tracking Framework Aug 2021 ByteDance AI Lab Paper
250 Neuron Merging Network Compression via Neuron Merging Sep 2021 Microsoft Research Paper
251 Focal Transformer Vision Transformer with Focal Attention Sep 2021 Microsoft Research Paper
252 Non-Deep Networks Alternative to Deep Neural Networks Sep 2021 MIT Paper
253 PytorchVideo Deep Learning Library for Video Understanding Sep 2021 Facebook AI Research Paper
254 HeadGAN Head Generation and Editing Oct 2021 Tel Aviv University Paper
255 StyleGAN3 Alias-Free Generative Network Oct 2021 NVIDIA Research Paper
256 MedMNIST Medical Image Dataset Collection Oct 2021 Stanford University Paper
257 TokenLearner Dynamic Token Selection in Vision Transformers Oct 2021 Google Research Paper
258 Temporal Fusion Transformer Multi-horizon Forecasting Oct 2021 Google Research Paper
259 NeuralProphet Neural Network based Time-Series Model Oct 2021 Stanford University Paper
260 MetNet-2 Weather Forecasting Model Oct 2021 Google Research Paper
261 Plan-then-generate Controlled Text Generation Nov 2021 Microsoft Research Paper
262 ProjectedGAN Improved GAN Image Quality Nov 2021 NVIDIA Research Paper
263 PHALP Pose and Human Analysis using Language Processing Nov 2021 Carnegie Mellon University Paper
264 Semantic Diffusion Guidance Controlled Image Generation Nov 2021 Stanford University Paper
265 GauGAN Text-to-Image Generation Nov 2021 NVIDIA Research Paper
266 NeatNet Neural Architecture Evolution Nov 2021 Google Research Paper
267 DenseULearn Dense Prediction with Uncertainty Nov 2021 ETH Zürich Paper
268 StyleNeRF Neural Radiance Fields with Style-based Generation Dec 2021 NVIDIA Research Paper
269 Colossal-AI Large-Scale Parallel Training System Dec 2021 UC Berkeley Paper
270 EditGAN Semantic Image Editing with GANs Dec 2021 NVIDIA Research Paper
271 PoolFormer Alternative to Attention-based Transformers Dec 2021 Sea AI Lab Paper
272 GLIP Grounded Language-Image Pre-training Dec 2021 Microsoft Research Paper
273 PixMix Data Augmentation Strategy Dec 2021 Google Research Paper
274 GANgealing GAN-based Image Alignment Dec 2021 MIT Paper
275 HiClass Hierarchical Classification Metrics Dec 2021 Microsoft Research Paper
276 MetaFormer General Architecture for Vision Dec 2021 Sea AI Lab Paper
277 SAVi Slot Attention for Video Understanding Dec 2021 DeepMind Paper
278 PARP Parameter Reduction Technique Dec 2021 MIT Paper
279 TransMix Data Augmentation for Transformers Dec 2021 Microsoft Research Paper
280 Stable Long Term Video SR Long-term Video Super Resolution Dec 2021 ETH Zürich Paper
281 Few-Shot Learner Few-Shot Learning Framework Dec 2021 Meta AI Research Paper
282 StyleSwin StyleGAN with Swin Transformer Dec 2021 Microsoft Research Paper
283 2 Stage U-net Two-Stage Medical Image Segmentation Dec 2021 Stanford University Paper
284 ELSA Efficient Long-term Semantic Aggregation Dec 2021 ETH Zürich Paper
285 GLIDE Text-Guided Image Generation Dec 2021 OpenAI Paper
286 AdaViT Adaptive Vision Transformers Jan 2022 Microsoft Research Paper
287 Exemplar Transformers Example-based Vision Transformers Jan 2022 Google Research Paper
288 RepMLNet Reprogrammable Multi-Layer Network Jan 2022 Tsinghua University Paper
289 Untrained Deep NN Deep Networks without Training Jan 2022 MIT Paper
290 JoJoGAN Just one Joint Training GAN Jan 2022 National University of Singapore Paper
291 PRIME Pre-trained Image Encoders Jan 2022 Google Research Paper
292 StyleGAN-V Video Generation with StyleGAN Jan 2022 NVIDIA Research Paper
293 SmoothNet Motion Smoothing Network Jan 2022 ETH Zürich Paper
294 PCACE Point Cloud Auto-Encoder Jan 2022 Tsinghua University Paper
295 Siamese CD Change Detection with Transformers Jan 2022 Wuhan University Paper
296 SASA Self-Attention Spatial Adaptivity Jan 2022 Carnegie Mellon University Paper
297 GCD Generalized Category Discovery Jan 2022 University of Oxford Paper
298 3D ConvNet Optimization Optimization Planning for 3D CNNs Jan 2022 Google Research Paper
299 SeamlessGAN Seamless Image Generation Jan 2022 Adobe Research Paper
300 HardBoost Hard Example Mining with Boosting Jan 2022 Tsinghua University Paper
301 Q-ViT Quantized Vision Transformer Jan 2022 Meta AI Research Paper
302 GeoFill Geometry-aware Image Inpainting Jan 2022 Adobe Research Paper
303 Detic Detector with Image Classes Jan 2022 UC Berkeley Paper
304 RelTR Relational Transformer Jan 2022 Microsoft Research Paper
305 ResiDualGAN Residual Dual GAN Architecture Jan 2022 NVIDIA Research Paper
306 You Only Cut Once Single-Shot Instance Segmentation Jan 2022 ByteDance AI Lab Paper
307 KFIoU Loss Kalman Filter IoU Loss Function Jan 2022 Tongji University Paper
308 StyleGAN3 Editing Image and Video Editing Framework Jan 2022 NVIDIA Research Paper
309 Block-NeRF City-scale Neural Radiance Fields using block-based decomposition Jan 2022 Waymo/Google Research Paper
310 SeMask Semantically Masked Transformers Feb 2022 NVIDIA Research Paper
311 SLIP Self-supervision with Language-Image Pre-training Feb 2022 UC Berkeley Paper
312 Deformable ViT Vision Transformer with Deformable Attention Feb 2022 Microsoft Research Paper
313 Lawin Transformer Lightweight Transformer for Segmentation Feb 2022 Nanjing University Paper
314 HyperionSolarNet Solar Panel Detection Network Feb 2022 Stanford University Paper
315 KerGNNs Kernel Graph Neural Networks Feb 2022 MIT Paper
316 gDNA Geometric DNA Networks Feb 2022 DeepMind Paper
317 HYDRA Hybrid Deep Learning Architecture Feb 2022 Microsoft Research Paper
318 DDU-Net Dense Dual-Path U-Net Feb 2022 Shanghai Jiao Tong University Paper
319 SPAMs Spatial Attention Modules Feb 2022 Google Research Paper
320 ReLICv2 Representation Learning with Image Consistency Feb 2022 Meta AI Research Paper
321 Momentum Capsules Dynamic Routing with Momentum Feb 2022 Google Research Paper
322 SAR Despeckling Transformer for SAR Image Denoising Feb 2022 Chinese Academy of Sciences Paper
323 VRT Video Restoration Transformer Feb 2022 ETH Zürich Paper
324 StyleGAN-XL Extra Large Scale StyleGAN Feb 2022 NVIDIA Research Paper
325 AlphaCode Code Generation AI System Feb 2022 DeepMind Paper
326 StyleGAN-Human Human image synthesis using StyleGAN Apr 2022 Microsoft Research Paper
327 How Do Vision Transformers Work? Analysis of internal mechanisms of Vision Transformers Jun 2022 Google Research Paper
328 FERV39k Facial Expression Recognition Dataset with 39k samples Jun 2022 South China University of Technology Paper
329 DaViT Data-efficient Vision Transformer Jul 2022 Microsoft Research Paper
330 BEVFormer Bird’s Eye View Transformer for autonomous driving Aug 2022 Shanghai AI Lab Paper
331 TensoRF Tensorial Radiance Fields for efficient 3D reconstruction Sep 2022 Zhejiang University Paper
332 WebFace260M Large-scale face recognition dataset Sep 2022 InsightFace Paper
333 Neighborhood Attention Transformer Local attention mechanism for vision tasks Oct 2022 Meta AI Research Paper
334 Barbershop Hair editing and synthesis framework Oct 2022 Adobe Research Paper
335 Visual Attention Network Novel attention mechanism for computer vision Nov 2022 Meta AI Research Paper
336 MaskGIT Masked Generative Image Transformer Nov 2022 Google Research Paper
337 CenterNet++ Improved CenterNet for object detection Nov 2022 University of Texas Paper
338 Patch-NetVLAD+ Enhanced visual place recognition using patch-based features Dec 2022 Oxford University Paper
339 PENCIL Probabilistic end-to-end noise correction Dec 2022 NTU Singapore Paper
340 CenterSnap Center-based 3D object pose estimation Dec 2022 Intel Labs Paper
341 AGCN Adaptive Graph Convolutional Network Dec 2022 Tsinghua University Paper
342 AutoAvatar Automated avatar generation from images Dec 2022 Tencent AI Lab Paper
343 Balanced MSE Balanced Mean Squared Error for imbalanced data Dec 2022 Carnegie Mellon University Paper
344 ReCLIP Improved CLIP with region-based features Dec 2022 Google Research Paper
345 EditGAN GAN-based image editing framework Dec 2022 NVIDIA Research Paper
346 HuMMan Human Motion and Manipulation dataset Dec 2022 Max Planck Institute Paper
347 BlobGAN Unsupervised part-aware image generation Dec 2022 MIT Paper
348 Deep Spectral Methods Spectral analysis for deep learning Dec 2022 MIT Paper
349 TransformNet Transformer-based architecture for geometry transformation Jan 2023 Carnegie Mellon University Paper
350 Mirror-YOLO YOLO variant using mirror augmentation for detection Jan 2023 Peking University Paper
351 Paying U-Attention to Textures U-Net based texture synthesis with attention Jan 2023 Adobe Research Paper
352 ZippyPoint Fast point cloud processing architecture Jan 2023 ETH Zürich Paper
353 InsetGAN for Full-Body Image Generation GAN-based full-body image synthesis Jan 2023 Max Planck Institute Paper
354 Mixed Differential Privacy Privacy-preserving vision model training Jan 2023 MIT Paper
355 L³U-Net Lightweight U-Net variant with enhanced learning Jan 2023 ETH Zürich Paper
356 RBGNet Residual Bidirectional Graph Network Jan 2023 Peking University Paper
357 TopFormer Top-down Transformer for vision tasks Jan 2023 Microsoft Research Paper
358 CLIP-GEN CLIP-guided image generation Jan 2023 OpenAI Paper
359 DANBO Dynamic Attention Network for Body Pose Jan 2023 Carnegie Mellon University Paper
360 KeypointNeRF NeRF with keypoint conditioning Jan 2023 Stanford University Paper
361 VOS (Visual Object Streaming) Efficient streaming framework for video object segmentation Feb 2023 ETH Zürich Paper
362 ScoreNet Score-based generative modeling for point cloud generation Feb 2023 UC Berkeley Paper
363 GroupViT Vision Transformer with dynamic grouping mechanism Feb 2023 NVIDIA Research Paper
364 TCTrack Temporal context-aware tracking framework Feb 2023 Chinese Academy of Sciences Paper
365 MLSeg Multi-level semantic segmentation framework Feb 2023 Stanford University Paper
366 StyleBabel Text-guided style transfer using BABEL embeddings Feb 2023 NVIDIA Research Paper
367 Mixed DualStyleGAN Dual-domain style transfer with mixed training Feb 2023 NVIDIA Paper
368 StyleT2I Style-based text-to-image generation Feb 2023 Microsoft Research Paper
369 SPAct Spatial-temporal action recognition Feb 2023 University of Oxford Paper
370 JIFF Joint Image and Feature Fusion Feb 2023 Stanford University Paper
371 C3-STISR Cross-Camera Stereo Image Super-Resolution Feb 2023 Tsinghua University Paper
372 IVY Integrated Vision System Feb 2023 Intel Research Paper
373 StyLandGAN Stylized landscape generation Feb 2023 NVIDIA Research Paper
374 NeuralFusion Neural fusion for 3D reconstruction using implicit representations Mar 2023 MIT Paper
375 COLA Contrastive learning approach for visual recognition Mar 2023 Stanford University Paper
376 VLP (Vision-Language Pre-training) Joint pre-training for vision and language tasks Mar 2023 Microsoft Research Paper
377 Level-K to Nash Equilibrium Game theoretic approach to vision problems Mar 2023 DeepMind Paper
378 HyperTransformer Hypernetwork-based transformer for vision tasks Mar 2023 Google Research Paper
379 GrainSpace Granular spatial representation learning Mar 2023 Carnegie Mellon University Paper
380 ROOD-MRI Robust out-of-distribution detection for medical imaging Mar 2023 MIT Paper
381 Bamboo Framework for efficient neural architecture search Mar 2023 Microsoft Research Paper
382 BigDetection Large-scale object detection framework Mar 2023 Facebook AI Research Paper
383 TransEditor Transformer-based image editing framework Mar 2023 Adobe Research Paper
384 Event Transformer Transformer architecture for event-based vision Mar 2023 Intel Labs Paper
385 MVSTER Multi-view Stereo Transformer Mar 2023 ETH Zürich Paper
386 CLIP-Art CLIP-based artistic image synthesis Mar 2023 DeepMind Paper
387 Sequencer Sequential modeling for vision tasks Mar 2023 Google Research Paper
388 GraphWorld Benchmark for graph neural networks Mar 2023 DeepMind Paper
389 F8Net Lightweight network for efficient feature extraction Apr 2023 Tsinghua University Paper
390 LatentFormer Transformer architecture for latent space manipulation Apr 2023 MIT Paper
