Computer Vision Research Work

When we talk about “vision” capabilities, most people underestimate how much work the brain does to process visual input (light signals). What kind of processing allows us to understand color, depth, motion, speed, segments, objects, scenes, different kinds of art, drawings, culture, and so on? Until “computer vision” became a serious field in AI, only neurology researchers, surgeons, and brain specialists had some insight into these processes. Since 2012 (the AlexNet paper), new papers have been published almost every month, and each one shows how far computer vision has come. This article is not only a chronology of computer vision research; it is also for software engineers, computer scientists, AI engineers, and anyone who wants to understand how their phone performs certain computer vision tasks and becomes intelligent.

SNo	Research Name	Short Description of Paper	Month-Year	Organization	URL
1	DeconvNet	Deconvolutional Networks for Feature Learning	Nov 2010	New York University	Paper, Blog
2	Saliency Propagation	A method for salient object detection that propagates saliency information through optimization	Apr 2014	Chinese Academy of Sciences	Paper
3	SDS	Simultaneous Detection and Segmentation	Jun 2014	UC Berkeley	Paper
4	GoogLeNet	Introduced the Inception module to increase network depth and width efficiently.	Sep 2014	Google
5	VGGNet	Used small 3x3 convolution filters to increase depth, achieving high accuracy.	Sep 2014	Oxford University
6	FCN	Fully Convolutional Networks for semantic segmentation	Nov 2014	UC Berkeley	Paper
7	HyperColumn	Multi-scale CNN feature fusion	Nov 2014	UC Berkeley	Paper
8	DeepLab v1	Semantic Image Segmentation with Deep Convolutional Nets and CRFs	Dec 2014	Google	Paper
9	U-Net	Convolutional network for biomedical image segmentation	May 2015	University of Freiburg	Paper, Blog
10	Highway Network	Proposed highway layers to enable training of very deep networks.	May 2015	IDSIA
11	YOLO Series	You Only Look Once: series of real-time object detection systems (v1-v4)	Jun 2015 (v1) - Apr 2020 (v4)	University of Washington, Darknet	Paper, Blog
12	CRF-RNN	Conditional Random Fields as Recurrent Neural Networks	Jun 2015	University of Oxford	Paper
13	MR-CNN & S-CNN	Multi-Region CNN and Semantic Segmentation-aware CNN for object detection	Jun 2015	École des Ponts ParisTech	Paper
14	DeepMask	Learning to Segment Objects Candidates	Jun 2015	Facebook AI Research	Paper
15	LAPGAN	Laplacian Pyramid of Generative Adversarial Networks for image generation	Jun 2015	Facebook AI Research	Paper
16	CUDMedVision1	Medical Image Segmentation System 1	Sep 2015	Chinese University of Hong Kong	Paper, Blog
17	SegNet	Deep Convolutional Encoder-Decoder Architecture for Image Segmentation	Oct 2015	University of Cambridge	Paper
18	DilatedNet	Multi-Scale Context Aggregation by Dilated Convolutions	Nov 2015	Princeton University	Paper
19	CAM	Class Activation Mapping for identifying discriminative regions	Dec 2015	MIT	Paper
20	ParseNet	Looking Wider to See Better for semantic segmentation	Dec 2015	UNC Chapel Hill	Paper
21	MNC	Instance-aware Semantic Segmentation via Multi-task Network Cascades	Dec 2015	Microsoft Research	Paper
22	ResNet	Introduced residual learning to make very deep networks easier to train.	Dec 2015	Microsoft Research
23	SqueezeNet	AlexNet-level accuracy with 50x fewer parameters	Feb 2016	UC Berkeley, Stanford	Paper, Blog
24	SqueezeNet	Designed to reduce model size while maintaining accuracy, using 1x1 convolutions.	Feb 2016	DeepScale, UC Berkeley
25	Pre-activation ResNet	Identity Mappings in Deep Residual Networks	Mar 2016	Microsoft Research	Paper
26	SharpMask	Learning to Refine Object Segments	Mar 2016	Facebook AI Research	Paper
27	InstanceFCN	Instance-sensitive Fully Convolutional Networks	Mar 2016	Microsoft Research	Paper
28	MultipathNet	Multiple Path Aggregation Network	Apr 2016	Facebook AI Research	Paper
29	R-FCN	Region-based Fully Convolutional Networks for object detection	May 2016	Microsoft Research	Paper
30	NOC	Networks on Convolutional feature maps for object detection	May 2016	Microsoft Research	Paper
31	DeepLab v2	Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution	Jun 2016	Google	Paper
32	DeepSim	Deep Learning Approach for Image Quality Assessment	Jun 2016	Tsinghua University	Paper
33	DIS	Deep Image Smoothing	Jun 2016	University of Illinois	Paper
34	V-Net	Fully Convolutional Neural Network for volumetric medical image segmentation	Jun 2016	Technical University of Munich	Paper
35	3D U-net	Volumetric Segmentation with 3D U-net	Jun 2016	University of Freiburg	Paper
36	ENet	Efficient Neural Network for Real-time Semantic Segmentation	Jul 2016	University of Cambridge	Paper
37	ResNet38	Wider or Deeper: Revisiting the ResNet Model	Jul 2016	University of Adelaide	Paper
38	DRRN	Deep Recursive Residual Network for image super-resolution	Jul 2016	National University of Singapore	Paper
39	Multi-Channel	Multi-Channel CNN for medical image analysis	Jul 2016	University of California, San Diego	Paper
40	GCN	Graph Convolutional Networks for processing graph-structured data	Sep 2016	University of Amsterdam	Paper
41	M²FCN	Multi-modal Fully Convolutional Networks for medical imaging	Sep 2016	Chinese Academy of Sciences	Paper, Blog
42	Graph CNN	Graph Convolutional Neural Networks	Sep 2016	University of Montreal	Paper
43	Grad-CAM	Gradient-weighted Class Activation Mapping	Oct 2016	Georgia Tech	Paper
44	ResNeXt	Aggregated Residual Transformations for Deep Neural Networks	Nov 2016	Facebook AI Research	Paper
45	DRN	Dilated Residual Networks	Nov 2016	Princeton University	Paper
46	RefineNet	Multi-Path Refinement Networks for high-resolution semantic segmentation	Nov 2016	University of Adelaide	Paper
47	FractalNet	Ultra-Deep Neural Networks without Residuals	Nov 2016	University of Chicago, TTI-Chicago	Paper
48	SSD	Single Shot MultiBox Detector for real-time object detection	Dec 2016	Google	Paper, Blog
49	TDM	Top-Down Modulation for object detection	Dec 2016	Carnegie Mellon University	Paper
50	FPN	Feature Pyramid Networks for object detection	Dec 2016	Facebook AI Research	Paper
51	VoxResNet	Deep Voxelwise Residual Networks	Dec 2016	Chinese Academy of Sciences	Paper
52	DSSD	Deconvolutional Single Shot Detector	Jan 2017	UNC Chapel Hill	Paper
53	PolyNet	A Pursuit of Structural Diversity in Very Deep Networks	Mar 2017	Chinese University of Hong Kong	Paper
54	IGCNet	Interleaved Group Convolutions	Mar 2017	Microsoft Research	Paper
55	DCN	Deformable Convolutional Networks	Mar 2017	Microsoft Research Asia	Paper
56	IDW-CNN	Image Dependent Warping CNN	Mar 2017	Seoul National University	Paper
57	FCIS	Fully Convolutional Instance-aware Semantic Segmentation	Mar 2017	Microsoft Research Asia	Paper
58	Residual Attention Network	Attention mechanism for image classification	Apr 2017	Tsinghua University	Paper
59	ResNet-DUC-HDC	Dense Upsampling Convolution and Hybrid Dilated Convolution	Apr 2017	Tsinghua University	Paper
60	MobileNet	Focused on efficient models for mobile and embedded devices using depthwise separable convolutions.	Apr 2017	Google
61	G-RMI	Google’s large scale object detection system	Jun 2017	Google Research	Paper
62	GraphSAGE	Inductive Representation Learning on Large Graphs	Jun 2017	Stanford University	Paper
63	DPN	Dual Path Networks combining ResNet and DenseNet	Jul 2017	UCSD, Momenta	Paper
64	ERFNet	Efficient Residual Factorized ConvNet for real-time semantic segmentation	Jul 2017	Universidad de Alcalá	Paper
65	Suggestive Annotation	Active Learning for medical image segmentation	Jul 2017	ETH Zurich	Paper
66	RetinaNet	Focal Loss for Dense Object Detection	Aug 2017	Facebook AI Research	Paper, Blog
67	Hide-and-Seek	Weakly-supervised object localization training strategy	Aug 2017	University of California, Davis	Paper
68	C3	Cross-City Cascade for semantic segmentation	Aug 2017	University of Oxford	Paper
69	U-net+Res-net	Combined U-net and Residual Network for medical segmentation	Aug 2017	Technical University of Munich	Paper
70	DenseVoxNet	Dense Voxel Network for 3D medical image segmentation	Sep 2017	Chinese University of Hong Kong	Paper
71	Graph Attention Networks	Self-attention for Graph Data	Oct 2017	Université de Montréal	Paper
72	Light-Head R-CNN	Light-weight object detection architecture	Nov 2017	Megvii Technology	Paper
73	LayerCascade	Instance segmentation via layer cascade	Nov 2017	University of Washington	Paper
74	3D U-net + ResNet	Combined 3D U-net and ResNet for volumetric segmentation	Nov 2017	Technical University of Munich	Paper
75	Cascade R-CNN	Multi-stage object detection refinement	Dec 2017	UC San Diego	Paper
76	StairNet	Top-down semantic feature refinement	Dec 2017	Seoul National University	Paper
77	MaskLab	Instance Segmentation by Refining Object Detection	Jan 2018	Google	Paper
78	RU-Net + R2U-Net	Recurrent Residual U-Net variants	Jan 2018	University of Dhaka	Paper
79	AmoebaNet	Evolutionary Architecture Search	Feb 2018	Google Brain	Paper
80	SqueezeNext	Hardware-Aware Neural Network Design	Feb 2018	UC Berkeley	Paper
81	ENAS	Efficient Neural Architecture Search	Feb 2018	Google Brain	Paper
82	DeepLab v3+	Encoder-Decoder with Atrous Separable Convolution	Feb 2018	Google	Paper, Blog
83	Group Normalization	Alternative to Batch Normalization	Mar 2018	Facebook AI Research	Paper
84	ACoL	Adversarial Complementary Learning for weakly supervised object localization	Mar 2018	University of Technology Sydney	Paper
85	BR²Net	Boundary Refinement and Recurrent Network for semantic segmentation	Mar 2018	Tsinghua University	Paper
86	PANet	Path Aggregation Network for Instance Segmentation	Mar 2018	Chinese Academy of Sciences	Paper
87	MorphNet	Fast & Simple Resource-Constrained Structure Learning	Apr 2018	Google Research	Paper
88	ImageNet Rethinking	Research on ImageNet training strategies	Apr 2018	Facebook AI Research	Paper
89	Attention U-net	Attention Gates for Medical Image Segmentation	Apr 2018	University College London	Paper
90	MegNet	Multi-Evidence Guidance for weakly supervised object detection	Jun 2018	University of Technology Sydney	Paper
91	H-DenseUNet	Hybrid Densely Connected UNet for medical segmentation	Jun 2018	Chinese University of Hong Kong	Paper
92	PNASNet	Progressive Neural Architecture Search	Jul 2018	Google Brain	Paper
93	ShuffleNetV2	Practical Guidelines for Mobile Network Design	Jul 2018	Face++	Paper
94	BAM	Bottleneck Attention Module	Jul 2018	KAIST	Paper
95	CBAM	Convolutional Block Attention Module	Jul 2018	KAIST	Paper
96	NetAdapt	Platform-Aware Neural Network Adaptation	Jul 2018	MIT	Paper
97	U-Net++	Nested U-Net Architecture	Jul 2018	Arizona State University	Paper
98	DU-Net	Deformable U-Net for medical image segmentation	Aug 2018	Shanghai Jiao Tong University	Paper
99	DropBlock	Structured dropout method for convolutional networks	Oct 2018	Google Brain	Paper
100	AutoDeepLab	Neural Architecture Search for Semantic Image Segmentation	Jan 2019	Google Research	Paper
101	ESPNetv2	Efficient Spatial Pyramid of Dilated Convolutions	Mar 2019	MIT	Paper
102	SiamRPN++	Siamese visual tracking framework that uses spatial-aware sampling and aggregates features across different layers	Mar 2019	Chinese Academy of Sciences	Paper
103	Libra R-CNN	Balanced learning framework for object detection that addresses sample level, feature level, and objective level imbalance	Apr 2019	SenseTime Research	Paper
104	FBNet	Hardware-Aware Efficient ConvNet Design	May 2019	Facebook AI Research	Paper
105	SDN	Selective Deep Network for efficient visual recognition	May 2019	University of Texas	Paper
106	MultiResUNet	Multi-Resolution U-Net for medical image segmentation	May 2019	Bangladesh University	Paper
107	EfficientNet	Scaled networks uniformly in depth, width, and resolution for better efficiency.	May 2019	Google
108	ADL	Attention-based Dropout Layer for weakly supervised object localization	Jun 2019	KAIST	Paper
109	ARMA Convolution	Auto-Regressive Moving Average Graph Filtering	Jun 2019	Università degli Studi di Modena	Paper
110	Panoptic Segmentation	Unified Scene Parsing Framework	Jun 2019	Facebook AI Research	Paper
111	CutMix	Data augmentation method combining cut and mix images	Aug 2019	Clova AI Research, NAVER	Paper
112	SlowFast	Two-pathway network for video recognition that captures both slow and fast motion patterns	Aug 2019	Facebook AI Research	Paper
113	EfficientDet	Scalable object detection architecture using weighted bidirectional feature network and compound scaling	Nov 2019	Google Research	Paper
114	AdderNet	Neural Networks with Only Addition Operations	Dec 2019	Huawei Noah’s Ark Lab	Paper
115	TPN	Temporal Pyramid Network for action detection in videos	Dec 2019	Microsoft Research Asia	Paper
116	ATSS	Adaptive Training Sample Selection for object detection	Dec 2019	ByteDance AI Lab	Paper
117	ACNe	Attentive Context Normalization for robust permutation-equivariant learning	Dec 2019	KAIST	Paper
118	Cascade Cost Volume	Cascade Cost Volume for stereo matching	Dec 2019	Megvii Technology	Paper
119	Yolact++	Real-time instance segmentation with improved mask quality and inference speed	Jan 2020	University of California, Davis	Paper
120	MCN	Multi-task Collaboration Network	Jan 2020	Microsoft Research Asia	Paper
121	RandLA-Net	Large-scale Point Cloud Semantic Segmentation	Jan 2020	University of Oxford	Paper
122	OccuSeg	3D instance segmentation approach that handles occlusions in point clouds	Mar 2020	Stanford University	Paper
123	GTAD	Global Temporal Action Detection framework for temporal action localization	Mar 2020	Sun Yat-sen University	Paper
124	Attention-RPN	Visual tracking framework with attention mechanism in Region Proposal Network	Mar 2020	Chinese Academy of Sciences	Paper
125	QSA + QNT	Quantized Squeeze-and-Attention Networks	Mar 2020	Tsinghua University	Paper
126	UNet 3+	Full-Scale Connected UNet for medical image segmentation	Mar 2020	Southern Medical University	Paper
127	ROAM	Recurrently Optimizing Tracking Model	Mar 2020	ByteDance AI Lab	Paper
128	PF-NET	Point Fractal Network for 3D point cloud completion	Mar 2020	Simon Fraser University	Paper
129	Total3DUnderstanding	3D Scene Understanding	Mar 2020	National University of Singapore	Paper
130	SG-NN	Scene Graph Neural Networks	Mar 2020	Georgia Tech	Paper
131	SEAN	Semantic Region-Adaptive Normalization	Mar 2020	ETH Zürich	Paper
132	SAOL	Self-Attention Object Localization	Apr 2020	Seoul National University	Paper
133	VGGNet For Covid19	Modified VGG architecture for COVID-19 detection	Apr 2020	Multiple Institutions	Paper
134	CentripetalNet	Anchor-free object detection with point-based prediction	Apr 2020	Megvii Technology	Paper
135	PointAugment	Auto-Augmentation for 3D Point Cloud	Apr 2020	National University of Singapore	Paper
136	PQ-Net	Learning to Generate 3D Shapes	Apr 2020	Stanford University	Paper
137	Axial-DeepLab	Stand-Alone Axial-Attention for Vision Models	Apr 2020	Johns Hopkins University	Paper
138	SipMask	Spatial Information Preservation for Fast Instance Segmentation	Apr 2020	Inception Institute of AI	Paper
139	SCAN	Learning to Classify Images without Labels	Apr 2020	Facebook AI Research	Paper
140	MutualNet	Adaptive ConvNet via Mutual Learning	Apr 2020	Microsoft Research Asia	Paper
141	DETR	End-to-End Object Detection with Transformers	May 2020	Facebook AI Research	Paper
142	C-Flow	Conditional Normalizing Flows	May 2020	ETH Zürich	Paper
143	PerfectShape	Shape completion using implicit functions	May 2020	Stanford University	Paper
144	UFO²	Unified Framework for Object Detection	May 2020	Carnegie Mellon University	Paper
145	Refinement Network	RGB-D Scene Understanding	May 2020	Technical University Munich	Paper
146	AssembleNet++	Video Recognition with Learnable Connectivity	May 2020	Google Research	Paper
147	WeightNet	Revisiting Weight Networks	May 2020	Microsoft Research	Paper
148	YOLOv5	Improved version of YOLO with better speed-accuracy trade-off	Jun 2020	Ultralytics	Paper
149	UCTGAN	Unsupervised Cartoon-to-Real Translation GAN for image translation between cartoon and real-world domains	Jun 2020	Nanyang Technological University	Paper
150	IF-Nets	Implicit Function Neural Networks for 3D reconstruction	Jun 2020	Max Planck Institute	Paper
151	SketchGCN	Sketch Recognition using Graph Convolutional Networks	Jun 2020	University of British Columbia	Paper
152	AABO	Adaptive Anchor Box Optimization	Jun 2020	Huawei Noah’s Ark Lab	Paper
153	Polka Lines	Line Detection using Polar Coordinates	Jun 2020	Korea University	Paper
154	Pose2Mesh	3D Human Pose and Mesh Recovery	Jun 2020	Korea University	Paper
155	SNE-RoadSeg	Road Segmentation with Synthetic Data	Jun 2020	Hong Kong University	Paper
156	Deep Hough Transform	Line Detection using Deep Learning	Jun 2020	Chinese Academy of Sciences	Paper
157	Non-Local Sparse Attention	Efficient Attention Mechanism	Jun 2020	Google Research	Paper
158	Hit-Detector	Hierarchical Trinity architecture for object detection combining different detection paradigms	Jul 2020	ByteDance AI Lab	Paper
159	Spectral 3D Computer Vision	Graph Neural Network Library	Jul 2020	Multiple Contributors	Paper
160	TIDE	Error Analysis Tool for Object Detection	Jul 2020	Carnegie Mellon University	Paper
161	SimAug	Learning Robust Representations through Simulation	Jul 2020	Carnegie Mellon University	Paper
162	HOTR	End-to-End Human-Object Interaction Detection	Jul 2020	KAIST	Paper
163	ReXNet	Rethinking Channel Dimensions for Efficient Model Design	Jul 2020	UC Berkeley	Paper
164	Keep Eyes on the Lane	Lane Detection with Deep Learning	Jul 2020	Shanghai Jiao Tong University	Paper
165	AdvPC	Adversarial Point Cloud Defense	Jul 2020	Tsinghua University	Paper
166	PD-GAN	Probabilistic Diverse GAN	Jul 2020	University of Oxford	Paper
167	FedDG	Federated Domain Generalization	Jul 2020	Carnegie Mellon University	Paper
168	Dynamic RCNN	Dynamic R-CNN for object detection with improved training and inference	Aug 2020	ByteDance AI Lab	Paper
169	Aug-FPN	Augmented Feature Pyramid Network for object detection with improved multi-scale feature fusion	Aug 2020	Tsinghua University	Paper
170	Instant-teaching	Self-training for Object Detection	Aug 2020	ByteDance AI Lab	Paper
171	Soft-IntroVAE	Soft Introduction of Variational AutoEncoders	Aug 2020	Tel Aviv University	Paper
172	DiNTS	Differentiable Neural Network Topology Search	Aug 2020	NVIDIA	Paper
173	EagleEye	Fast Sub-net Evaluation for Efficient Neural Network Pruning	Aug 2020	MIT	Paper
174	StyleMapGAN	Exploiting Spatial Dimensions of Latent for Image Manipulation	Aug 2020	KAIST	Paper
175	TediGAN	Text-Guided Diverse Image Generation	Aug 2020	Microsoft Research Asia	Paper
176	Auto-Exposure Fusion	Automatic Exposure Fusion for Photography	Aug 2020	ETH Zürich	Paper
177	Vision Transformer	Transformer architecture adapted for image recognition tasks	Sep 2020	Google Research	Paper
178	IDU	Instance Depth Embedding for RGB-D salient object detection	Sep 2020	Nankai University	Paper
179	VideoMoCo	Contrastive Learning for Video Understanding	Sep 2020	Microsoft Research Asia	Paper
180	MZSR	Meta-Transfer Learning for Zero-Shot Super-Resolution	Nov 2020	KAIST	Paper
181	DeiT	Data-efficient training of image transformers	Dec 2020	Facebook AI Research	Paper
182	Involution	Inverting Convolution for Visual Recognition	Dec 2020	Shanghai AI Lab	Paper
183	Deep Learning on Semantic Segmentation	Comprehensive Survey and Benchmark	Dec 2020	Chinese Academy of Sciences	Paper
184	LiteFlowNet3	Lightweight Optical Flow Estimation	Dec 2020	Chinese University of Hong Kong	Paper
185	PPDM	Parallel Point Detection and Matching	Dec 2020	ByteDance AI Lab	Paper
186	RepVGG	Making VGG-style ConvNets Great Again	Jan 2021	MEGVII Technology	Paper
187	PSConvolution	Parameter-Sharing Convolution for Deep Learning	Jan 2021	Tsinghua University	Paper
188	PerPixel Classification	Pixel-wise Classification Network	Jan 2021	ETH Zürich	Paper
189	PIPAL	Perceptual Image Quality Assessment	Jan 2021	Nanyang Technological University	Paper
190	ArtGAN	Artwork Synthesis with GAN	Feb 2021	NVIDIA Research	Paper
191	Synthetic to Real	Domain Adaptation for Semantic Segmentation	Feb 2021	ETH Zürich	Paper
192	Spatial-Phase-Shallow-Learning	Phase-Based Feature Learning	Feb 2021	Peking University	Paper
193	DARKGAN	Dark Image Enhancement with GAN	Feb 2021	Tsinghua University	Paper
194	Deep Imbalance Regression	Learning from Imbalanced Data	Feb 2021	Carnegie Mellon University	Paper
195	Room Classification GNN	Graph Neural Network for Room Layout	Feb 2021	Facebook Research	Paper
196	Pyramid Vision Transformer	Hierarchical Vision Transformer	Feb 2021	Nanjing University	Paper
197	Residual Attention	Attention Mechanism for CNNs	Feb 2021	Google Research	Paper
198	Teachers do more than teach	Multi-teacher approach for image-to-image translation	Mar 2021	Tel Aviv University	Paper
199	Vip-DeepLab	Visual Parsing DeepLab for Panoptic Segmentation	Mar 2021	Google Research	Paper
200	HistoGAN	Histological Image Generation with GAN	Mar 2021	University of Oxford	Paper
201	Anchor-Free Person Search	End-to-End Person Search without Anchors	Mar 2021	Chinese Academy of Sciences	Paper
202	CBNetV2	Composite Backbone Network	Mar 2021	Megvii Technology	Paper
203	Kaleido-BERT	Vision-Language Pre-training	Mar 2021	Microsoft Research Asia	Paper
204	Elastic Graph Neural Network	Adaptive Graph Structure Learning	Mar 2021	Stanford University	Paper
205	Rank and Sort Loss	Loss Function for Object Detection	Mar 2021	ByteDance AI Lab	Paper
206	EigenGAN	Eigenvalue-Based GAN Architecture	Mar 2021	MIT	Paper
207	DetCo	Unsupervised Detection Pre-training	Mar 2021	Microsoft Research Asia	Paper
208	MG-GAN	Multi-Generator GAN	Mar 2021	NVIDIA Research	Paper
209	AdaAttN	Adaptive Attention for Style Transfer	Mar 2021	Microsoft Research Asia	Paper
210	AirBERT	Vision-Language Model for Aerial Images	Mar 2021	Chinese Academy of Sciences	Paper
211	DeepGCNs	Deep Graph Convolutional Networks	Mar 2021	KAUST	Paper
212	Survey: Instance Segmentation	Comprehensive review of instance segmentation methods	Mar 2021	Multiple Institutions	Paper
213	LoFTR	Local Feature TRansformer for establishing dense correspondences between images	Apr 2021	Zhejiang University	Paper
214	Semantic Image Matting	Matting with Semantic Guidance	Apr 2021	ByteDance AI Lab	Paper
215	EfficientNetV2	Improved EfficientNet Architecture	Apr 2021	Google Research	Paper
216	Closed-Loop Matters	Dual Regression for Image Generation	Apr 2021	University of Oxford	Paper
217	Mobile-Former	Mobile-Friendly Transformer	Apr 2021	Microsoft Research	Paper
218	GNeRF	Generalizable Neural Radiance Fields	Apr 2021	UC Berkeley	Paper
219	DETR with Modulated Co-Attention	Enhanced DETR Architecture	Apr 2021	Facebook AI Research	Paper
220	Adaptable GAN Encoders	Flexible GAN Inversion	Apr 2021	Adobe Research	Paper
221	Conformer	Local Features Meet Global Dependencies	Apr 2021	Shanghai AI Lab	Paper
222	VMNet	Visual Manipulation Networks	Apr 2021	Stanford University	Paper
223	Battle of Network Structure	Network Architecture Comparison Study	Apr 2021	Google Research	Paper
224	Efficient Person Search	Fast Person Search Framework	Apr 2021	University of Technology Sydney	Paper
225	SLIDE	Smart Learning on Large-Scale Data	Apr 2021	Carnegie Mellon University	Paper
226	SOTR	Segmenting Objects with Transformers	Apr 2021	Tsinghua University	Paper
227	CANet	Class-Agnostic Segmentation Networks	Apr 2021	UC Berkeley	Paper
228	YOLOP	Real-time Driving Perception	May 2021	Huawei Noah’s Ark Lab	Paper
229	InSeGAN	Interactive Segmentation with GAN	May 2021	Adobe Research	Paper
230	GroupFormer	Group-Based Attention	May 2021	Microsoft Research	Paper
231	Super Neuron	Neural Architecture Enhancement	May 2021	MIT	Paper
232	SO-Pose	Self-Occlusion Aware Pose Estimation	May 2021	NVIDIA Research	Paper
233	TxT	Text-driven Text Generation	May 2021	Google Research	Paper
234	OS2D	One-Stage 2D Object Detection	May 2021	Yandex Research	Paper
235	CodeNet	Large-Scale Code Dataset	May 2021	IBM Research	Paper
236	Geometric Deep Learning	Blueprint for designing architectures for geometric data	May 2021	Imperial College London	Paper
237	Oriented R-CNN	Oriented Object Detection	Jun 2021	Tongji University	Paper
238	XVFI	Video Frame Interpolation	Jun 2021	KAIST	Paper
239	Cross Domain Contrastive Learning	Domain Adaptation via Contrastive Learning	Jun 2021	Microsoft Research	Paper
240	PointManifoldCut	Data Augmentation for Point Clouds	Jun 2021	Stanford University	Paper
241	Distance IOU Loss	Improved Loss Function for Object Detection	Jun 2021	Tsinghua University	Paper
242	ConvMLP	Convolutional MLP Architecture	Jul 2021	University of Oregon	Paper
243	Graph-FPN	Feature Pyramid Networks with Graph Neural Networks	Jul 2021	Carnegie Mellon University	Paper
244	WatchOut!	Motion Blur Impact on DNNs	Jul 2021	ETH Zürich	Paper
245	ECA-Net	Efficient Channel Attention Network	Jul 2021	Tsinghua University	Paper
246	ShiftAddNet	Efficient Neural Network Training	Aug 2021	MIT	Paper
247	Deep Imitation Learning	Survey of Imitation Learning Methods	Aug 2021	DeepMind	Paper
248	3DETR	3D Object Detection with Transformers	Aug 2021	Facebook AI Research	Paper
249	ByteTrack	Multi-Object Tracking Framework	Aug 2021	ByteDance AI Lab	Paper
250	Neuron Merging	Network Compression via Neuron Merging	Sep 2021	Microsoft Research	Paper
251	Focal Transformer	Vision Transformer with Focal Attention	Sep 2021	Microsoft Research	Paper
252	Non-Deep Networks	Alternative to Deep Neural Networks	Sep 2021	MIT	Paper
253	PytorchVideo	Deep Learning Library for Video Understanding	Sep 2021	Facebook AI Research	Paper
254	HeadGAN	Head Generation and Editing	Oct 2021	Tel Aviv University	Paper
255	StyleGAN3	Alias-Free Generative Network	Oct 2021	NVIDIA Research	Paper
256	MedMNIST	Medical Image Dataset Collection	Oct 2021	Stanford University	Paper
257	TokenLearner	Dynamic Token Selection in Vision Transformers	Oct 2021	Google Research	Paper
258	Temporal Fusion Transformer	Multi-horizon Forecasting	Oct 2021	Google Research	Paper
259	NeuralProphet	Neural Network based Time-Series Model	Oct 2021	Stanford University	Paper
260	MetNet-2	Weather Forecasting Model	Oct 2021	Google Research	Paper
261	Plan-then-generate	Controlled Text Generation	Nov 2021	Microsoft Research	Paper
262	ProjectedGAN	Improved GAN Image Quality	Nov 2021	NVIDIA Research	Paper
263	PHALP	Tracking people by Predicting Human Appearance, Location, and Pose	Nov 2021	UC Berkeley	Paper
264	Semantic Diffusion Guidance	Controlled Image Generation	Nov 2021	Stanford University	Paper
265	GauGAN	Text-to-Image Generation	Nov 2021	NVIDIA Research	Paper
266	NeatNet	Neural Architecture Evolution	Nov 2021	Google Research	Paper
267	DenseULearn	Dense Prediction with Uncertainty	Nov 2021	ETH Zürich	Paper
268	StyleNeRF	Neural Radiance Fields with Style-based Generation	Dec 2021	NVIDIA Research	Paper
269	Colossal-AI	Large-Scale Parallel Training System	Dec 2021	UC Berkeley	Paper
270	EditGAN	Semantic Image Editing with GANs	Dec 2021	NVIDIA Research	Paper
271	PoolFormer	Alternative to Attention-based Transformers	Dec 2021	Sea AI Lab	Paper
272	GLIP	Grounded Language-Image Pre-training	Dec 2021	Microsoft Research	Paper
273	PixMix	Data Augmentation Strategy	Dec 2021	Google Research	Paper
274	GANgealing	GAN-based Image Alignment	Dec 2021	MIT	Paper
275	HiClass	Hierarchical Classification Metrics	Dec 2021	Microsoft Research	Paper
276	MetaFormer	General Architecture for Vision	Dec 2021	Sea AI Lab	Paper
277	SAVi	Slot Attention for Video Understanding	Dec 2021	DeepMind	Paper
278	PARP	Parameter Reduction Technique	Dec 2021	MIT	Paper
279	TransMix	Data Augmentation for Transformers	Dec 2021	Microsoft Research	Paper
280	Stable Long Term Video SR	Long-term Video Super Resolution	Dec 2021	ETH Zürich	Paper
281	Few-Shot Learner	Few-Shot Learning Framework	Dec 2021	Meta AI Research	Paper
282	StyleSwin	StyleGAN with Swin Transformer	Dec 2021	Microsoft Research	Paper
283	2 Stage U-net	Two-Stage Medical Image Segmentation	Dec 2021	Stanford University	Paper
284	ELSA	Efficient Long-term Semantic Aggregation	Dec 2021	ETH Zürich	Paper
285	GLIDE	Text-Guided Image Generation	Dec 2021	OpenAI	Paper
286	AdaViT	Adaptive Vision Transformers	Jan 2022	Microsoft Research	Paper
287	Exemplar Transformers	Example-based Vision Transformers	Jan 2022	Google Research	Paper
288	RepMLNet	Reprogrammable Multi-Layer Network	Jan 2022	Tsinghua University	Paper
289	Untrained Deep NN	Deep Networks without Training	Jan 2022	MIT	Paper
290	JoJoGAN	One-Shot Face Stylization	Jan 2022	University of Illinois Urbana-Champaign	Paper
291	PRIME	Pre-trained Image Encoders	Jan 2022	Google Research	Paper
292	StyleGAN-V	Video Generation with StyleGAN	Jan 2022	NVIDIA Research	Paper
293	SmoothNet	Motion Smoothing Network	Jan 2022	ETH Zürich	Paper
294	PCACE	Point Cloud Auto-Encoder	Jan 2022	Tsinghua University	Paper
295	Siamese CD	Change Detection with Transformers	Jan 2022	Wuhan University	Paper
296	SASA	Self-Attention Spatial Adaptivity	Jan 2022	Carnegie Mellon University	Paper
297	GCD	Generalized Category Discovery	Jan 2022	University of Oxford	Paper
298	3D ConvNet Optimization	Optimization Planning for 3D CNNs	Jan 2022	Google Research	Paper
299	SeamlessGAN	Seamless Image Generation	Jan 2022	Adobe Research	Paper
300	HardBoost	Hard Example Mining with Boosting	Jan 2022	Tsinghua University	Paper
301	Q-ViT	Quantized Vision Transformer	Jan 2022	Meta AI Research	Paper
302	GeoFill	Geometry-aware Image Inpainting	Jan 2022	Adobe Research	Paper
303	Detic	Detector with Image Classes	Jan 2022	UC Berkeley	Paper
304	RelTR	Relational Transformer	Jan 2022	Microsoft Research	Paper
305	ResiDualGAN	Residual Dual GAN Architecture	Jan 2022	NVIDIA Research	Paper
306	You Only Cut Once	Single-Shot Instance Segmentation	Jan 2022	ByteDance AI Lab	Paper
307	KFIoU Loss	Kalman Filter IoU Loss Function	Jan 2022	Tongji University	Paper
308	StyleGAN3 Editing	Image and Video Editing Framework	Jan 2022	NVIDIA Research	Paper
309	Block-NeRF	City-scale Neural Radiance Fields using blocked-based decomposition	Jan 2022	Waymo/Google Research	Paper
310	SeMask	Semantically Masked Transformers	Feb 2022	NVIDIA Research	Paper
311	SLIP	Self-supervision with Language-Image Pre-training	Feb 2022	UC Berkeley	Paper
312	Deformable ViT	Vision Transformer with Deformable Attention	Feb 2022	Microsoft Research	Paper
313	Lawin Transformer	Lightweight Transformer for Segmentation	Feb 2022	Nanjing University	Paper
314	HyperionSolarNet	Solar Panel Detection Network	Feb 2022	Stanford University	Paper
315	KerGNNs	Kernel Graph Neural Networks	Feb 2022	MIT	Paper
316	gDNA	Generative Detailed Neural Avatars	Feb 2022	ETH Zürich	Paper
317	HYDRA	Hybrid Deep Learning Architecture	Feb 2022	Microsoft Research	Paper
318	DDU-Net	Dense Dual-Path U-Net	Feb 2022	Shanghai Jiao Tong University	Paper
319	SPAMs	Spatial Attention Modules	Feb 2022	Google Research	Paper
320	ReLICv2	Representation Learning via Invariant Causal Mechanisms	Feb 2022	DeepMind	Paper
321	Momentum Capsules	Dynamic Routing with Momentum	Feb 2022	Google Research	Paper
322	SAR Despeckling	Transformer for SAR Image Denoising	Feb 2022	Chinese Academy of Sciences	Paper
323	VRT	Video Restoration Transformer	Feb 2022	ETH Zürich	Paper
324	StyleGAN-XL	Extra Large Scale StyleGAN	Feb 2022	NVIDIA Research	Paper
325	AlphaCode	Code Generation AI System	Feb 2022	DeepMind	Paper
326	StyleGAN-Human	Human image synthesis using StyleGAN	Apr 2022	Microsoft Research	Paper
327	How Do Vision Transformers Work?	Analysis of internal mechanisms of Vision Transformers	Jun 2022	Google Research	Paper
328	FERV39k	Facial Expression Recognition Dataset with 39k samples	Jun 2022	South China University of Technology	Paper
329	DaViT	Dual Attention Vision Transformers	Jul 2022	Microsoft Research	Paper
330	BEVFormer	Bird’s Eye View Transformer for autonomous driving	Aug 2022	Shanghai AI Lab	Paper
331	TensoRF	Tensorial Radiance Fields for efficient 3D reconstruction	Sep 2022	Zhejiang University	Paper
332	WebFace260M	Large-scale face recognition dataset	Sep 2022	InsightFace	Paper
333	Neighborhood Attention Transformer	Local attention mechanism for vision tasks	Oct 2022	Meta AI Research	Paper
334	Barbershop	Hair editing and synthesis framework	Oct 2022	Adobe Research	Paper
335	Visual Attention Network	Large-kernel attention mechanism for vision backbones	Nov 2022	Tsinghua University	Paper
336	MaskGIT	Masked Generative Image Transformer	Nov 2022	Google Research	Paper
337	CenterNet++	Improved CenterNet for object detection	Nov 2022	University of Texas	Paper
338	Patch-NetVLAD+	Enhanced visual place recognition using patch-based features	Dec 2022	Oxford University	Paper
339	PENCIL	Probabilistic end-to-end noise correction	Dec 2022	NTU Singapore	Paper
340	CenterSnap	Center-based 3D object pose estimation	Dec 2022	Intel Labs	Paper
341	AGCN	Adaptive Graph Convolutional Network	Dec 2022	Tsinghua University	Paper
342	AutoAvatar	Automated avatar generation from images	Dec 2022	Tencent AI Lab	Paper
343	Balanced MSE	Balanced Mean Squared Error for imbalanced data	Dec 2022	Carnegie Mellon University	Paper
344	ReCLIP	Improved CLIP with region-based features	Dec 2022	Google Research	Paper
345	EditGAN	GAN-based image editing framework	Dec 2022	NVIDIA Research	Paper
346	HuMMan	Human Motion and Manipulation dataset	Dec 2022	Max Planck Institute	Paper
347	BlobGAN	Unsupervised part-aware image generation	Dec 2022	MIT	Paper
348	Deep Spectral Methods	Spectral analysis for deep learning	Dec 2022	MIT	Paper
349	TransformNet	Transformer-based architecture for geometry transformation	Jan 2023	Carnegie Mellon University	Paper
350	Mirror-YOLO	YOLO variant using mirror augmentation for detection	Jan 2023	Peking University	Paper
351	Paying U-Attention to Textures	U-Net based texture synthesis with attention	Jan 2023	Adobe Research	Paper
352	ZippyPoint	Fast point cloud processing architecture	Jan 2023	ETH Zürich	Paper
353	InsetGAN for Full-Body Image Generation	GAN-based full-body image synthesis	Jan 2023	Max Planck Institute	Paper
354	Mixed Differential Privacy	Privacy-preserving vision model training	Jan 2023	MIT	Paper
355	L³U-Net	Lightweight U-Net variant with enhanced learning	Jan 2023	ETH Zürich	Paper
356	RBGNet	Residual Bidirectional Graph Network	Jan 2023	Peking University	Paper
357	TopFormer	Top-down Transformer for vision tasks	Jan 2023	Microsoft Research	Paper
358	CLIP-GEN	CLIP-guided image generation	Jan 2023	OpenAI	Paper
359	DANBO	Dynamic Attention Network for Body Pose	Jan 2023	Carnegie Mellon University	Paper
360	KeypointNeRF	NeRF with keypoint conditioning	Jan 2023	Stanford University	Paper
361	VOS (Visual Object Streaming)	Efficient streaming framework for video object segmentation	Feb 2023	ETH Zürich	Paper
362	ScoreNet	Score-based generative modeling for point cloud generation	Feb 2023	UC Berkeley	Paper
363	GroupViT	Vision Transformer with dynamic grouping mechanism	Feb 2023	NVIDIA Research	Paper
364	TCTrack	Temporal context-aware tracking framework	Feb 2023	Chinese Academy of Sciences	Paper
365	MLSeg	Multi-level semantic segmentation framework	Feb 2023	Stanford University	Paper
366	StyleBabel	Text-guided style transfer using BABEL embeddings	Feb 2023	NVIDIA Research	Paper
367	Mixed DualStyleGAN	Dual-domain style transfer with mixed training	Feb 2023	NVIDIA	Paper
368	StyleT2I	Style-based text-to-image generation	Feb 2023	Microsoft Research	Paper
369	SPAct	Spatial-temporal action recognition	Feb 2023	University of Oxford	Paper
370	JIFF	Joint Image and Feature Fusion	Feb 2023	Stanford University	Paper
371	C3-STISR	Cross-Camera Stereo Image Super-Resolution	Feb 2023	Tsinghua University	Paper
372	IVY	Integrated Vision System	Feb 2023	Intel Research	Paper
373	StyLandGAN	Stylized landscape generation	Feb 2023	NVIDIA Research	Paper
374	NeuralFusion	Neural fusion for 3D reconstruction using implicit representations	Mar 2023	MIT	Paper
375	COLA	Contrastive learning approach for visual recognition	Mar 2023	Stanford University	Paper
376	VLP (Vision-Language Pre-training)	Joint pre-training for vision and language tasks	Mar 2023	Microsoft Research	Paper
377	Level-K to Nash Equilibrium	Game theoretic approach to vision problems	Mar 2023	DeepMind	Paper
378	HyperTransformer	Hypernetwork-based transformer for vision tasks	Mar 2023	Google Research	Paper
379	GrainSpace	Granular spatial representation learning	Mar 2023	Carnegie Mellon University	Paper
380	ROOD-MRI	Robust out-of-distribution detection for medical imaging	Mar 2023	MIT	Paper
381	Bamboo	Framework for efficient neural architecture search	Mar 2023	Microsoft Research	Paper
382	BigDetection	Large-scale object detection framework	Mar 2023	Facebook AI Research	Paper
383	TransEditor	Transformer-based image editing framework	Mar 2023	Adobe Research	Paper
384	Event Transformer	Transformer architecture for event-based vision	Mar 2023	Intel Labs	Paper
385	MVSTER	Multi-view Stereo Transformer	Mar 2023	ETH Zürich	Paper
386	CLIP-Art	CLIP-based artistic image synthesis	Mar 2023	DeepMind	Paper
387	Sequencer	Sequential modeling for vision tasks	Mar 2023	Google Research	Paper
388	GraphWorld	Benchmark for graph neural networks	Mar 2023	DeepMind	Paper
389	F8Net	Lightweight network for efficient feature extraction	Apr 2023	Tsinghua University	Paper
390	LatentFormer	Transformer architecture for latent space manipulation	Apr 2023	MIT	Paper


Dr. Hari Thapliyaal

Dr. Hari Thapliyal is a seasoned professional and prolific blogger with a multifaceted background that spans the realms of Data Science, Project Management, and Advait-Vedanta Philosophy. Holding a Doctorate in AI/NLP from SSBM (Geneva, Switzerland), Hari has earned Master's degrees in Computers, Business Management, Data Science, and Economics, reflecting his dedication to continuous learning and a diverse skill set. With over three decades of experience in management and leadership, Hari has proven expertise in training, consulting, and coaching within the technology sector. His extensive 16+ years in all phases of software product development are complemented by a decade-long focus on course design, training, coaching, and consulting in Project Management. In the dynamic field of Data Science, Hari stands out with more than three years of hands-on experience in software development, training course development, training, and mentoring professionals. His areas of specialization include Data Science, AI, Computer Vision, NLP, complex machine learning algorithms, statistical modeling, pattern identification, and extraction of valuable insights. Hari's professional journey showcases his diverse experience in planning and executing multiple types of projects. He excels in driving stakeholders to identify and resolve business problems, consistently delivering excellent results. Beyond the professional sphere, Hari finds solace in long meditation, often seeking secluded places or immersing himself in the embrace of nature.
