Important AI Paper List
Introduction
In almost all citation styles, it is hard to find the title of a research paper. Why? Because the contributors’ names come first, and those names are often hard to read for anyone who is not from the authors’ region. For example, an Indian reader finds names like “Vivek Ramaswami, Kartikeyan Karunanidhi” easy to read, but the same names are difficult for non-Indian readers, and vice versa. Giving credit to the creators is very important, but more than that we need to know what they have done. I know from experience that almost every researcher finds it difficult to keep track of good AI research papers. It is harder for me because I maintain this blog and want to reference the work across different webpages. Therefore I am creating a citation key made of the last name of the first researcher plus the year the paper was presented. Along with the key, I give the title of the paper and where it was presented. If a title looks interesting for your work, you can search for the paper on Google Scholar, Mendeley, Sci-Hub, or any other place you are familiar and comfortable with. After that you can download the paper and read it at your leisure. I hope you find this list useful for your work.
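As a rough illustration of the citation-key scheme described above, here is a minimal Python sketch. The function name `citation_keys` and the input format (a list of last-name and year pairs) are my own assumptions for illustration. The letter suffixes for repeated keys mirror entries such as [Chen2020a] and [Chen2020b] in the list that follows.

```python
from collections import Counter


def citation_keys(papers):
    """Build keys of the form LastNameYear for (last_name, year) pairs.

    When several papers share the same base key, a letter suffix
    (a, b, c, ...) is appended in order of appearance.
    This is an illustrative helper, not part of any library.
    """
    base_keys = [f"{last_name}{year}" for last_name, year in papers]
    totals = Counter(base_keys)   # how often each base key occurs overall
    seen = Counter()              # how often each base key has been used so far
    keys = []
    for base in base_keys:
        if totals[base] == 1:
            keys.append(base)
        else:
            suffix = chr(ord("a") + seen[base])
            seen[base] += 1
            keys.append(f"{base}{suffix}")
    return keys


if __name__ == "__main__":
    sample = [("Chen", 2020), ("Chen", 2020), ("Devlin", 2019)]
    for key in citation_keys(sample):
        print(f"[{key}]")
```

The sketch assumes fewer than 27 papers share one base key; beyond that the single-letter suffix would run past "z".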
Citations
Pretrained Language Models for Text Generation: A Survey
[Bahdanau2015]
Neural machine translation by jointly learning to align and translate. In ICLR, 2015.
[Bao2020]
PLATO-2: towards building an open-domain chatbot via curriculum learning. arXiv preprint arXiv:2006.16779, 2020.
[Brown2020]
Language models are few-shot learners. In NeurIPS, 2020.
[Chen2020a]
Distilling knowledge learned in BERT for text generation. In ACL, 2020.
[Chen2020b]
Few-shot NLG with pre-trained language model. In ACL, 2020.
[Conneau2019]
Cross-lingual language model pretraining. In NeurIPS, 2019.
[Devlin2019]
BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.
[Dong2019]
Unified language model pretraining for natural language understanding and generation. In NeurIPS, 2019.
[Fan2019]
Unsupervised pre-training for sequence to sequence speech recognition. CoRR, arXiv preprint arXiv:1910.12418, 2019.
[Gehring2017]
Convolutional sequence to sequence learning. In ICML, 2017.
[Gong2020]
Tablegpt: Few-shot table-to-text generation with table structure reconstruction and content matching. In COLING, 2020.
[Gu2020]
A tailored pre-training model for task-oriented dialog generation. arXiv preprint arXiv:2004.13835, 2020.
[Guan2020]
Survey on automatic text summarization and transformer models applicability. In CCRIS, 2020.
[Hendrycks2020]
Pretrained transformers improve out-of-distribution robustness. In ACL, 2020.
[Keskar2019]
CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019.
[Kryscinski2018]
Improving abstraction in text summarization. In EMNLP, 2018.
[Lan2020]
ALBERT: A lite BERT for self-supervised learning of language representations. In ICLR, 2020.
[Lewis2020]
BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In ACL, 2020.
[Li2019]
Generating long and informative reviews with aspect-aware coarse-to-fine decoding. In ACL, pages 1969–1979, 2019.
[Li2020]
Knowledge-enhanced personalized review generation with capsule graph neural network. In CIKM, pages 735–744, 2020.
[Li2021a]
TextBox: A unified, modularized, and extensible framework for text generation. In ACL, 2021.
[Li2021b]
Few-shot knowledge graph-to-text generation with pretrained language models. In Findings of ACL, 2021.
[Li2021c]
Knowledge-based review generation by coherence enhanced text planning. In SIGIR, 2021.
[Lin2020]
Pretraining multilingual neural machine translation by leveraging alignment information. In EMNLP, 2020.
[Liu2019]
Text summarization with pretrained encoders. In EMNLP, 2019.
[Mager2020]
GPT-too: A language-model-first approach for AMR-to-text generation. In ACL, 2020.
[Peters2018]
Deep contextualized word representations. In NAACL-HLT, 2018.
[Qiu2020]
Pre-trained models for natural language processing: A survey. arXiv preprint arXiv:2003.08271, 2020.
[Radford2019]
Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
[Raffel2020]
Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 2020.
[Ribeiro2020]
Investigating pretrained language models for graph-to-text generation. arXiv preprint arXiv:2007.08426, 2020.
[Ross2012]
Guide for conducting risk assessments. In NIST Special Publication, 2012.
[Rothe2020]
Leveraging pre-trained checkpoints for sequence generation tasks. TACL, 2020.
[Sanh2019]
Distilbert, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.
[See2017]
Get to the point: Summarization with pointer-generator networks. In ACL, 2017.
[Song2019]
MASS: masked sequence to sequence pre-training for language generation. In ICML, 2019.
[Sun2019a]
Contrastive bidirectional transformer for temporal representation learning. arXiv preprint arXiv:1906.05743, 2019.
[Sun2019b]
Videobert: A joint model for video and language representation learning. In ICCV, 2019.
[Vaswani2017]
Attention is all you need. In NIPS, 2017.
[Wada2018]
Unsupervised cross-lingual word embedding by multilingual neural language models. arXiv preprint arXiv:1809.02306, 2018.
[Wolf2019]
Transfertransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149, 2019.
[Xia2020]
XGPT: cross-modal generative pre-training for image captioning. arXiv preprint arXiv:2003.01473, 2020.
[Xu2020a]
Discourse-aware neural extractive text summarization. In ACL, 2020.
[Xu2020b]
Unsupervised extractive summarization by pre-training hierarchical transformers. In EMNLP, 2020.
[Yang2020a]
CSP: code-switching pre-training for neural machine translation. In EMNLP, 2020.
[Yang2020b]
TED: A pretrained unsupervised summarization model with theme modeling and denoising. In EMNLP (Findings), 2020.
[Zaib2020]
A short survey of pre-trained language models for conversational AI-A new age in NLP. In ACSW, 2020.
[Zeng2020]
Generalized conditioned dialogue generation based on pre-trained language model. arXiv preprint arXiv:2010.11140, 2020.
[Zhang2019a]
Pretraining-based natural language generation for text summarization. In CoNLL, 2019.
[Zhang2019b]
HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization. In ACL, 2019.
[Zhang2019c]
ERNIE: enhanced language representation with informative entities. In ACL, 2019.
[Zhang2020]
DIALOGPT: Large-scale generative pre-training for conversational response generation. In ACL, 2020.
[Zhao2020]
Knowledge-grounded dialogue generation with pretrained language models. In EMNLP, 2020.
[Zheng2019]
Sentence centrality revisited for unsupervised summarization. In ACL, 2019.
[Zhou2020]
Unified vision-language pre-training for image captioning and VQA. In AAAI, 2020.
Survey on Automatic Text Summarization and Transformer Models Applicability
[CohanA2018]
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 615–621.
[NenkovaA2007]
The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Transactions on Speech and Language Processing 4, 2 (2007).
[RadfordA]
Improving language understanding by generative pre-training. www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
[RasimMA2013]
Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689.
[RasimMA]
MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522.
[VaswaniA2017]
Attention is all you need. Advances in neural information processing systems (2017), 5998–6008.
[RaffelC2019]
Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683 (2019).
[BahdanauD2014]
Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 (2014).
[GunesE2004]
LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence 20, 1 (2004), 457–479.
[ZhangH]
Pretraining-Based Natural Language Generation for Text Summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 789–797.
[DevlinJ2019]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186.
[HowardJ]
Universal Language Model Fine-tuning for Text Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 328–339.
[ZhangJ2019]
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. arXiv:1912.08777 (2019).
[KaikhahK]
Text summarization using neural networks. In Proceeding of second conference on intelligent system. 40–44.
[XuK]
Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International conference on machine learning. 2048–2057.
[Chin-YewL]
ROUGE: A package for automatic evaluation of summaries. In Proceedings of ACL Workshop “Text Summarization Branches Out”. 8.
[M2019]
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv:1910.13461 (2019).
[Ch2011]
A statistical approach for automatic text summarization by extraction. In Proceedings of 2011 International Conference on Communication Systems and Network Technologies. 268–271.
[ConroyJM]
Text Summarization via Hidden Markov Models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 406–407.
[PetersM2018]
Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2227–2237.
[RushAM2015]
A neural attention model for abstractive sentence summarization. arXiv:1509.00685 (2015).
[VinyalsO2015]
Pointer networks. Advances in neural information processing systems (2015), 2692–2700.
[DragomirRR2004]
Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938.
[MihalceaR2004]
Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing. 404–411.
[NallapatiR2016]
Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv:1602.06023 (2016).
[OakR2015]
Extractive techniques for automatic document summarization: a survey. International Journal of Innovative Research in Computer and Communication Engineering 4, 3 (2016), 4158–4164.
[ParkerR2011]
English Gigaword. https://catalog.ldc.upenn.edu/LDC2011T07
[ChopraS2016]
Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 93–98.
[EvanS2008]
The New York Times Annotated Corpus. https://catalog.ldc.upenn.edu/LDC2008T19
[EdunovS2019]
Pre-trained language model representations for language generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 4052–4059.
[NarayanS2018]
Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1797–1807.
[PeterJ2017]
Get to the point: Summarization with pointer-generator networks. arXiv:1704.04368 (2017).
[GuptaV2010]
A Survey of Text Summarization Extractive Techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268.
[SanhV2019]
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arxiv.org/pdf/1910.01108 (2019).
[LiuY2019]
Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692 (2019).
[YanY2020]
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv:2001.04063 (2020).
[DaiZ]
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2978–2988.
[LanZ2019]
Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019).
[YangZ2019]
Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems (2019), 5754–5764.
CTRL: A Conditional Transformer Language Model For Controllable Generation
[Mart2016]
TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.
[Rohan2019]
Memory-efficient adaptive optimization for large-scale learning. arXiv preprint arXiv:1901.11150, 2019.
[Martin2017]
Wasserstein generative adversarial networks. In International conference on machine learning, pp. 214–223, 2017.
[Matthew2017]
Factsheets: Increasing trust in AI services through supplier’s declarations of conformity, August 2018. arXiv:1808.07261 [cs.CY].
[Mikel2017]
Unsupervised neural machine translation. arXiv preprint arXiv:1710.11041, 2017.
[Jimmy2016]
Layer normalization. CoRR, abs/1607.06450, 2016.
[Lo2019]
Findings of the 2019 conference on machine translation (WMT19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pp. 1–61, 2019.
[Yoshua2003]
A neural probabilistic language model. Journal of machine learning research, 3(Feb):1137–1155, 2003.
[Thorsten2007]
Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858–867, 2007.
[Miles2016]
Artificial intelligence and responsible innovation. In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence, pp. 543–554. Springer, 2016.
[Miles2019]
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation, February 2019. arXiv:1802.07228 [cs.AI].
[Isaac2019]
Tagged back-translation. arXiv preprint arXiv:1906.06442, 2019.
[Xi2016]
InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in neural information processing systems, pp. 2172–2180, 2016.
[Rewon2019]
Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509, 2019.
[Ronan2008]
A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, pp. 160–167. ACM, 2008.
[Ronan2011]
Natural language processing (almost) from scratch. Journal of machine learning research, 12(Aug):2493–2537, 2011.
[Ruth1987]
The consumption junction: A proposal for research strategies in the sociology of technology. In Wiebe E. Bijker, Thomas P. Hughes, and Trevor J. Pinch (eds.), The Social Construction of Technological Systems, pp. 261–280. MIT Press, Cambridge, MA, USA, 1987.
[Andrew2015]
Semi-supervised sequence learning. In Advances in neural information processing systems, pp. 3079–3087, 2015.
[Zihang2019]
Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019.
[Jacob2018]
Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[John2011]
Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121–2159, 2011.
[Matthew2017]
Searchqa: A new q&a dataset augmented with context from a search engine. arXiv preprint arXiv:1704.05179, 2017.
[Angela2018]
Hierarchical neural story generation. arXiv preprint arXiv:1805.04833, 2018.
[Angela2019]
ELI5: Long form question answering. arXiv preprint arXiv:1907.09190, 2019.
[Boris2019]
Stochastic gradient methods with layer-wise adaptive moments for training of deep networks. arXiv preprint arXiv:1905.11286, 2019.
[Ian2014]
Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
[Max2016]
Newsroom: A dataset of 1.3 million summaries with diverse extractive strategies. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 708–719, New Orleans, Louisiana, June 2018.
[Kaiming2016]
Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
[Karl2015]
Teaching machines to read and comprehend. In Advances in neural information processing systems, pp. 1693–1701, 2015.
[Ari2019]
The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.
[Jeremy2018]
Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
[Hakan2016]
Tying word vectors and word classifiers: A loss framework for language modeling. arXiv preprint arXiv:1611.01462, 2016.
[Melvin2017]
Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 5:339–351, 2017.
[Mandar2017]
Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551, 2017.
[David2017]
Self-censorship is not enough. Nature, 492(7429):345–347, December 2012. doi: 10.1038/492345a.
[Lukasz2017]
One model to learn them all. arXiv preprint arXiv:1706.05137, 2017.
[Łukasz2018]
Fast decoding in sequence models using discrete latent variables. arXiv preprint arXiv:1803.03382, 2018.
[Nitish2019]
Unifying question answering and text classification via span extraction. arXiv preprint arXiv:1904.09286, 2019.
[Diederik2014]
Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[Diederik2013]
Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
[Ryan2015]
Skip-thought vectors. In Advances in neural information processing systems, pp. 3294–3302, 2015.
[Catherine2016]
Domain control for neural machine translation. arXiv preprint arXiv:1612.06140, 2016.
[Wojciech2019]
Neural text summarization: A critical evaluation. arXiv preprint arXiv:1908.08960, 2019.
[Tom2019]
Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.
[Guillaume2019a]
Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291, 2019.
[Guillaume2019b]
Large memory layers with product keys. arXiv preprint arXiv:1907.05242, 2019.
[Hector2012]
The winograd schema challenge. In Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning, 2012.
[Patrick2019]
Unsupervised question answering by cloze translation. arXiv preprint arXiv:1906.04980, 2019.
[Minh-Thang2015]
Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114, 2015.
[Julian2015]
Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 43–52. ACM, 2015.
[Bryan6294]
Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pp. 6294.
[Bryan2018]
The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730, 2018.
[Stephen2017]
Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182, 2017.
[Tomas2013]
Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pp. 3111–3119, 2013.
[Margaret2019]
Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* ’19), January 2019. doi: 10.1145/3287560.3287596.
[Amit2019]
Filling gender & number gaps in neural machine translation with black-box context injection. arXiv preprint arXiv:1903.03467, 2019.
[Vinod2010]
Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, 2010.
[Ramesh2016]
Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv preprint arXiv:1602.06023, 2016.
[Matthew2018]
Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.
[Carol1979]
Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language, pp. 291–318, 1979.
[Shana1980]
Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics, 18(7-8):581–618, 1980.
[Ofir2016]
Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016.
[Alec2018]
Improving language understanding by generative pre-training. URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf, 2018.
[Alec2019]
Language models are unsupervised multitask learners. URL https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf, 2019.
[Nazneen2019]
Explain yourself! leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361, 2019.
[Pranav2016]
Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
[Alexander2015]
A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685, 2015.
[Evan2008]
The new york times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12):e26752, 2008.
[Thomas2019]
Answers unite! unsupervised metrics for reinforced summarization models. arXiv preprint arXiv:1909.01610, 2019.
[Abigail2017]
Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pp. 1073–1083, 2017.
[Rico2015]
Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015.
[Noam2018]
Adafactor: Adaptive learning rates with sublinear memory cost. arXiv preprint arXiv:1804.04235, 2018.
[Jack2013]
Developing a framework for responsible innovation. Research Policy, 42(9):1568–1580, November 2013. doi: 10.1016/j.respol.2013.05.008.
[Ilya2014]
Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112, 2014.
[Trieu2018]
A simple method for commonsense reasoning. arXiv preprint arXiv:1806.02847, 2018.
[Adam2016]
A machine comprehension dataset. arXiv preprint arXiv:1611.09830, 2016.
[Lav2019]
Pretrained AI models: Performativity, mobility, and change, September 2019. arXiv:1909.03290 [cs.CY].
[Curran2018]
GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.
[Sean2019]
Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319, 2019.
[Yonghui2016]
Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
[Stratos2019]
Sumqe: a bert-based summary quality estimation model. arXiv preprint arXiv:1909.00578, 2019.
[Zhilin2018]
Hotpotqa: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600, 2018.
[Rowan2019]
Defending against neural fake news. arXiv preprint arXiv:1905.12616, 2019.
[Fangxiaoyu2022]
Language-agnostic BERT Sentence Embedding - LaBSE
- BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding-based transfer learning.
- This paper explores BERT-based cross-lingual sentence embeddings.
- It combines the best methods for learning monolingual and cross-lingual representations, including masked language modeling (MLM) and translation language modeling (TLM).
- Introducing a pre-trained multilingual language model dramatically reduces the amount of parallel training data required to achieve good performance.
- The resulting model achieves high bi-text retrieval accuracy over 112 languages.
NLP Papers Available on my Google Drive
You can download these papers from link
- A brief introduction to boosting.pdf
- A Closer Look at Fermentors and Bioreactors.pdf
- A Comprehensive Survey on Graph Neural Networks.pdf
- A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection.pdf
- A dataset for detecting irony in Hindi-english code-mixed social media text.pdf
- A Framework for Document Specific Error Detection and Corrections in Indic OCR.pdf
- A lexicon-based approach for hate speech detection.pdf
- A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (.pdf
- A novel automatic satire and irony detection using ensembled feature selection and data mining.pdf
- A Pragmatic Analysis Of Humor In Modern Family.pdf
- A Selective Overview of Deep Learning.pdf
- A Sentiment Analyzer for Hindi Using Hindi Senti Lexicon.pdf
- A Survey of Code-switched Speech and Language Processing.pdf
- A Survey of the State of Explainable AI for Natural Language Processing.pdf
- A Survey on Explainable Artificial Intelligence (XAI) Toward Medical XAI.pdf
- A TENGRAM method based part-of-speech tagging of multi-category words in Hindi language.pdf
- A transformer-based approach to irony and sarcasm detection.pdf
- A2Text-net A novel deep neural network for sarcasm detection.pdf
- Adaptive glove and fasttext model for Hindi word embeddings.pdf
- AI and Ethics - Operationalising Responsible AI-PAPER.pdf
- AI4Bharat-IndicNLP Corpus Monolingual Corpora and Word Embeddings for Indic Languages.pdf
- ALBERT A Lite BERT for Self-supervised Learning of Language Representations.pdf
- an Analysis of Current Trends for Sanskrit As a Computer Programming Language.pdf
- An empirical, quantitative analysis of the differences between sarcasm and Irony.pdf
- An Image is Worth 16x16 Words Transformers for Image Recognition at Scale.pdf
- Analyzing_The_Expressive_Power_Of_Graph.pdf
- AnnCorra Annotating Corpora Guidelines For POS And Chunk Annotation For Indian Languages.pdf
- Approaches to Cross-Domain Sentiment Analysis A Systematic Literature Review.pdf
- Attention is all you need.pdf
- Automatic sarcasm detection A survey.pdf
- Automatic satire detection Are you having a laugh.pdf
- Bag of tricks for efficient text classification.pdf
- Baselines and bigrams Simple, good sentiment and topic classification.pdf
- BERT Explained - A list of Frequently Asked Questions.pdf
- BERT Pre-training of deep bidirectional transformers for language understanding.pdf
- BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories.pdf
- Carer Contextualized affect representations for emotion recognition.pdf
- CASCADE Contextual Sarcasm Detection in Online Discussion Forums.pdf
- Challenges in Deploying Machine Learning a Survey of Case Studies.pdf
- Clinical artificial intelligence quality improvement towards continual monitoring and updating of AI algorithms in healthcare.pdf
- CLUE based load balancing in replicated web server.pdf
- Clues for detecting irony in user-generated contents Oh…!! it_s so easy -).pdf
- Code Mixing A Challenge for Language Identification in the Language of Social Media.pdf
- Context-based Sarcasm Detection in Hindi Tweets.pdf
- Contextualized sarcasm detection on twitter.pdf
- Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis.pdf
- Data governance A conceptual framework, structured review, and research agenda.pdf
- Deep and Dense Sarcasm Detection.pdf
- Deep learning based unsupervised POS tagging for Sanskrit.pdf
- Detailed human avatars from monocular video.pdf
- Detecting Sarcasm is Extremely Easy -).pdf
- DIALOGPT Large-Scale Generative Pre-training for Conversational Response Generation.pdf
- DistilBERT, a distilled version of BERT smaller, faster, cheaper and lighter.pdf
- DRIFT Deep Reinforcement Learning for Functional Software Testing.pdf
- Drop A reading comprehension benchmark requiring discrete reasoning over paragraphs.pdf
- Dynamic routing between capsules.pdf
- Effect of speech coding on speaker identification.pdf
- Efficient estimation of word representations in vector space(2).pdf
- ELECTRA Pre-training Text Encoders as Discriminators Rather Than Generators.pdf
- Embedding Words as Distributions with a Bayesian Skip-gram Model.pdf
- Enriching Word Vectors with Subword Information.pdf
- Experience Grounds Language.pdf
- Exploiting emojis for sarcasm detection.pdf
- Exploiting Similarities among Languages for Machine Translation.pdf
- Exploring the fine-grained analysis and automatic detection of irony on Twitter(2).pdf
- Exploring the fine-grained analysis and automatic detection of irony on Twitter.pdf
- Exploring the impact of pragmatic phenomena on irony detection in tweets A multilingual corpus study.pdf
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.pdf
- Extensions to HMM-based Statistical Word Alignment Models.pdf
- Fairness_In_Machine_Learning_A_Survey.pdf
- Fake news detection of Indian and United States election data using machine learning algorithm.pdf
- Fake News Detection on Social Media.pdf
- FakeNewsNet A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social.pdf
- Faster R-CNN Towards Real-Time Object Detection with Region Proposal Networks.pdf
- FastText.zip Compressing text classification models.pdf
- Figurative messages and affect in Twitter Differences between #irony, #sarcasm and #not.pdf
- Forecasting COVID-19 Confirmed Cases in Major Indian Cities and Their Connectedness with Mobility and Weather-related Parameters.pdf
- From English To Foreign Languages Transferring Pre-trained Language Models.pdf
- FROM Pre-trained Word Embeddings TO Pre-trained Language Models - Focus on BERT.pdf
- Going deeper with convolutions.pdf
- Graph Machine Learning NeurIPS 2020 Papers.pdf
- Grouped Convolutional Neural Networks for Multivariate Time Series.pdf
- Grouped Functional Time Series Forecasting An Application to Age-Specific Mortality Rates.pdf
- Handbook of approximation algorithms and metaheuristics.pdf
- Harnessing context incongruity for sarcasm detection.pdf
- Harnessing Online News for Sarcasm Detection in Hindi Tweets.pdf
- Hidden Markov Models.pdf
- Hidden technical debt in machine learning systems.pdf
- Hotpotqa A dataset for diverse, explainable multi-hop question answering.pdf
- How multilingual is multilingual BERT.pdf
- How to avoid machine learning pitfalls a guide for academic researchers.pdf
- How to read a paper.pdf
- HuggingFace_s Transformers State-of-the-art Natural Language Processing.pdf
- Identifying machine learning techniques for classification of target advertising.pdf
- Identifying sarcasm in Twitter A closer look.pdf
- Improving Language Understanding by Generative Pre-Training.pdf
- Improving the learnability of classifiers for Sanskrit OCR corrections.pdf
- Indic sentiReview Natural language processing based sentiment analysis on major indian languages.pdf
- Interactive-and-Visual-Prompt-Engineering-for-adhoc-Task-Adaptation-LLM.pdf
- Investigations in computational sarcasm.pdf
- Irony detection in twitter The role of affective content.pdf
- Irony, Sarcasm and Parody in the American Sitcom Modern Family.pdf
- iSarcasm A Dataset of Intended Sarcasm.pdf
- K-means with Three different Distance Metrics.pdf
- Knowledge Representation in Sanskrit and Artificial Intelligence.pdf
- Learning Graph Search Heuristics.pdf
- Learning latent causal graphs via mixture oracles.pdf
- LearningSys_2015_paper_32.pdf
- Lexicon-Based Methods for Sentiment Analysis.pdf
- Lexicon-Based Sentiment Analysis in the Social Web.pdf
- LightGBM A highly efficient gradient boosting decision tree.pdf
- Linguistic Inquiry and Word Count LIWC2015.pdf
- Machine Learning in Automated Text Categorization.pdf
- Machine Learning within a Graph Database A Case Study on Link Prediction for Scholarly Data.pdf
- Machine Translation Approaches and Survey for Indian Languages.pdf
- Machine Translation of Bi-lingual Hindi-English (Hinglish) Text.pdf
- Merlion A Machine Learning Library for Time Series.pdf
- Mining of Massive Datasets.pdf
- MLP-Mixer An all-MLP Architecture for Vision.pdf
- Multi-modal sarcasm detection in Twitter with hierarchical fusion model.pdf
- Multi-rule based ensemble feature selection model for sarcasm type detection in Twitter.pdf
- Multimodal markers of irony and sarcasm.pdf
- Natural Language Inference Over.pdf
- Natural Language Processing - A Panian Perspective.pdf
- Natural language processing based features for sarcasm detection An investigation using bilingual social media texts.pdf
- NeuralProphet Explainable Forecasting at Scale.pdf
- On State-of-the-art of POS Tagger, Sandhi Splitter, Alankaar Finder and Samaas Finder for IndoAryan and Dravidian Languages.pdf
- Opinion mining and sentiment analysis.pdf
- Opinion-Based Entity Ranking (Author_s Draft).pdf
- Part-of-speech tagging from 97_ to 100_ Is it time for some linguistics.pdf
- PAVE Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models.pdf
- Real-time Sentiment Analysis of Hindi Tweets.pdf
- Reasoning with sarcasm by reading in-between.pdf
- Recent trends in deep learning based natural language processing Review Article.pdf
- RECEPTIVE FIELDS OF SINGLE NEURONES IN THE CAT _ S STRIATE CORTEX.pdf
- Recognition of consonant-vowel (CV) units under background noise using combined temporal and spectral preprocessing.pdf
- Representing social media users for sarcasm detection.pdf
- Retrospective Reader for Machine Reading Comprehension.pdf
- RoBERTa A Robustly Optimized BERT Pretraining Approach.pdf
- Robotics, AI, and.pdf
- ROC graphs Notes and practical considerations for researchers.pdf
- Sanskrit sandhi splitting using Seq2(Seq)22.pdf
- Sanskrit word segmentation using character-level recurrent and convolutional neural networks.pdf
- Sarc-M Sarcasm Detection in Typo-graphic Memes.pdf
- Sarcasm as contrast between a positive sentiment and negative situation.pdf
- Sarcasm Detection in Hindi sentences using Support Vector machine.pdf
- Sarcasm detection in tweets.pdf
- Sarcasm detection on twitter A behavioral modeling approach.pdf
- Sarcastic sentiment detection in tweets streamed in real time a big data approach.pdf
- Scalable linear algebra on a relational database system.pdf
- Scaling Large Production Clusters with Partitioned Synchronization.pdf
- Semantics-Aware BERT for Language Understanding.pdf
- Semi-supervised recognition of sarcastic sentences in twitter and Amazon.pdf
- SentencePiece A simple and language independent subword tokenizer and detokenizer for neural text processing.pdf
- Sentiment Analysis for Hindi Language.pdf
- Sentiment Analysis in a Resource Scarce Language Hindi.pdf
- Sentiment Analysis In Hindi.pdf
- Sentiment Analysis in Indian languages o Definition.pdf
- Sentiment Analysis of Hindi Review based on Negation and Discourse Relation.pdf
- Sentiment classification using machine learning techniques with syntax features.pdf
- Skillful writing of an awful research paper.pdf
- Social media and fake news in the 2016 election.pdf
- Sound classification using convolutional neural network and tensor deep stacking network.pdf
- Sparse, contextually informed models for irony detection Exploiting user communities, entities and sentiment.pdf
- SQuad 100,000 questions for machine comprehension of text.pdf
- ST4_Method_Random_Forest.pdf
- Statistical Methods in Natural Language Processing.pdf
- StructBERT Incorporating Language Structures into Pre-training for Deep Language Understanding.pdf
- Structural Studies on Small Amyloid Oligomers RT-6.pdf
- Superintelligence.pdf
- Systematic literature review of sentiment analysis on Twitter using soft computing techniques.pdf
- Text categorization with support vector machines Learning with many relevant features.pdf
- Text normalization of code mix and sentiment analysis.pdf
- The Differential Role of Ridicule in Sarcasm and Irony.pdf
- The highest form of intelligence Sarcasm increases creativity for both expressers and recipients.pdf
- The Modern Mathematics of Deep Learning.pdf
- The Paninian approach to natural language processing.pdf
- The perfect solution for detecting sarcasm in tweets #not.pdf
- Thumbs Up or Thumbs Down Semantic Orientation Applied to Unsupervised Classification of Reviews.pdf
- THU_NGN at SemEval-2018 Task 3 Tweet Irony Detection with Densely connected LSTM and Multi-task Learning.pdf
- TnT - A Statistical Part-of-Speech Tagger.pdf
- To BLOB or Not To BLOB Large Object Storage in a Database or a Filesystem.pdf
- Towards Demystifying Serverless Machine Learning Training.pdf
- Towards multimodal sarcasm detection (an obviously perfect paper).pdf
- Towards sub-word level compositions for sentiment analysis of Hindi-English code mixed text.pdf
- Triple-View Feature Learning for Medical Image Segmentation.pdf
- Twitter as a corpus for sentiment analysis and opinion mining.pdf
- Two improved continuous bag-of-word models.pdf
- Understanding Diffusion Models A Unified Perspective Introduction Generative Models.pdf
- Universal Sentence Encoder.pdf
- Unsupervised Irony Detection A Probabilistic Model with Word Embeddings.pdf
- UR-Funny A multimodal language dataset for understanding humor.pdf
- Use of Sanskrit for natural language processing.pdf
- Using TF-IDF to Determine Word Relevance in Document Queries.pdf
- Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval.pdf
- Very deep convolutional networks for large-scale image recognition.pdf
- We are IntechOpen , the world ‘ s leading publisher of Open Access books Built by scientists , for scientists TOP 1 _.pdf
- When BERT Plays the Lottery, All Tickets Are Winning.pdf
- XGBoost A scalable tree boosting system.pdf
- XLNet Generalized Autoregressive Pretraining for Language Understanding.pdf
AI Papers Available on my Google Drive
You can download these papers from my Google Drive.
- A Comprehensive Survey on Graph Neural Networks-PAPER.pdf
- A machine learning approach to predicting psychosis-PAPER.pdf
- A Selective Overview of Deep Learning-PAPER.pdf
- A Short introduction to boosting-PAPER.pdf
- A Survey of the State of Explainable AI for NLP-PAPER.pdf
- A Survey on Explainable AI (XAI) towards Medical XAI-PAPER.pdf
- AI and Ethics - Operationalising Responsible AI-PAPER.pdf
- Analyzing The Expressive Power of Graph Neural Network in a Spectral Perspective-PAPER.pdf
- Attention-Mechanism-Transformers-BERT-and-GPT-PAPER.pdf
- Can GPT-4 Perform Neural Architecture Search-PAPER.pdf
- Challenges in Deploying Machine Learning-PAPER.pdf
- Clinical AI quality improvement-PAPER.pdf
- Cramming-Training-a-Language-Model-On-A-Single-GPU-in-one-Day-PAPER.pdf
- DataGovernance-A conceptual framework, structured review-PAPER.pdf
- Detailed human avatar-PAPER.pdf
- DRIFT_26_CameraReadySubmission_NeurIPS_DRL-PAPER.pdf
- Dynamic Routing Between Capsules-PAPER.pdf
- Fairness in Machine Learning A Survey-PAPER.pdf
- Forecasting COVID-19 Confirmed Cases-PAPER.pdf
- Generalization Beyond Overfitting On Small Datasets-PAPER.pdf
- GPTrillion-Paper.pdf
- GPTs-are-GPTs-An-Early-Look-at-the-Labor-Market-Impact-Potential-of-Large-Language-Models-PAPER.pdf
- GraphMachine Learning NeurIPS 2020-PAPER.pdf
- Grouped Convolutional Neural Networks for Multivariate Time Series -PAPER.pdf
- Grouped functional time series forecasting An application to age-specific mortality rates-PAPER.pdf
- Hidden Technical Debt in Machine Learning Systems-PAPER.pdf
- Hidden technical debt in machine learning systems.pdf
- How to avoid machine learning pitfalls-PAPER.pdf
- How to Read a Paper-ARTC.pdf
- Identifying machine learning techniques for classification of target advertising-PAPER.pdf
- Introducing-GPTrillion-PAPER.pdf
- Large Object Storage in a Database or a Filesystem-PAPER.pdf
- Learning Graph Heuristic Search-PAPER.pdf
- Learning latent causal graphs via mixture oracles-PAPER.pdf
- LightGBM A Highly Efficient Gradient Boosting-PAPER.pdf
- Machine Learning within a Graph Database- A Case Study on Link Prediction for Scholarly Data-PAPER.pdf
- Merlion- A Machine Learning Library for Time Series-PAPER.pdf
- Model Evaluation, Model Selection, and Algorithm Selection-PAPER.pdf
- NeuralProphet-Explainable Forecasting at Scale-PAPER.pdf
- PAVE-Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models-PAPER.pdf
- Precise Zero-Shot Dense Retrieval without Relevance Labels-PAPER.pdf
- Randomforest-PAPER.pdf
- Receptive Fields of Single Neurones in the Cats Striate Cortex-PAPER.pdf
- Robotics, AI, and Humanity Science-PAPERS.pdf
- Scalable Linear Algebra on a Relational Database System-PAPER.pdf
- Scaling Large Production Clusters-PAPER.pdf
- Skillful writing of an awful research paper-GUIDE.pdf
- The Modern Mathematics of Deep Learning-PAPER.pdf
- Towards Demystifying Serverless Machine Learning Training-PAPER.pdf
- Triple-View Feature Learning for Medical Image Segmentation-PAPER.pdf
- Understanding Diffusion Models- A Unified Perspective-PAPER.pdf
- VeML-An-End-to-End-Machine-Learning-Lifecycle-for-Large-Scale-and-High-Dimensional-Data-PAPER.pdf
- Very Deep Convolutional Networks for Large Scale Image Recognition-PAPER.pdf
- XGBoost A Scalable Tree Boosting System-PAPER.pdf
- XGBoost Reliable Large-scale Tree Boosting System-PAPER.pdf
Recent Papers
- Detecting Signs of Disease from External Images of the Eye, 24 March 2022
- Deep-learning models for the detection and incidence prediction of chronic kidney disease and type 2 diabetes from retinal fundus images, 15 June 2021
- Detection of anaemia from retinal fundus images via deep learning, 23 December 2019
- Assessing Cardiovascular Risk Factors with Computer Vision, 19 February 2018
Author
Dr Hari Thapliyaal
dasarpai.com
linkedin.com/in/harithapliyal