Important AI Paper List

Introduction

In most citation styles, the title of a research paper is hard to spot. Why? The authors' names come first, and names are often hard to read for anyone outside the authors' own language community. For example, an Indian reader can easily read names like “Vivek Ramaswami, Kartikeyan Karunanidhi”, but the same names are difficult for non-Indian readers, and vice versa. Giving credit to the creators is very important, but more than that, we need to know what they have done. I know from experience that almost every researcher finds it difficult to keep track of good AI research papers. For me it is even harder, because I maintain this blog and want to reference the same work across different webpages. Therefore, I am creating a citation key that combines the last name of the first author with the year the paper was presented. Along with the key, I give the title of the paper and where it was presented. If a title looks interesting for your work, you can search for the paper on Google Scholar, Mendeley, Sci-Hub, or any other place you are familiar and comfortable with. After that, you can download and read the paper at your leisure. I hope you find this list useful for your work.
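The key scheme described above can be sketched in a few lines of Python. This is only an illustration, not the script used to build this list: the function names `base_key` and `citation_keys` are made up for the example, the years attached to the sample names are invented, and the a/b/c suffix for colliding keys is an assumption based on entries such as [Chen2020a] and [Chen2020b] below.

```python
from collections import Counter, defaultdict
from string import ascii_lowercase


def base_key(first_author: str, year: int) -> str:
    """Last name of the first author + year (hypothetical helper)."""
    # Assumes the last whitespace-separated token is the last name.
    last_name = first_author.split()[-1]
    return f"{last_name}{year}"


def citation_keys(papers):
    """Return one key per (first_author, year) pair, in input order.

    Keys that would collide get a letter suffix: a, b, c, ...
    (this sketch supports at most 26 collisions per key).
    """
    bases = [base_key(author, year) for author, year in papers]
    totals = Counter(bases)      # how often each base key occurs overall
    seen = defaultdict(int)      # how many times we have emitted it so far
    keys = []
    for base in bases:
        if totals[base] > 1:
            keys.append(base + ascii_lowercase[seen[base]])
            seen[base] += 1
        else:
            keys.append(base)
    return keys


# Sample names come from the introduction; the years are invented.
papers = [
    ("Vivek Ramaswami", 2020),
    ("Kartikeyan Karunanidhi", 2019),
    ("Vivek Ramaswami", 2020),
]
print(citation_keys(papers))
```

Keeping the suffix logic separate from `base_key` means a key only gets a letter when it actually collides, which matches how most entries in the list carry no suffix.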

Citations

Pretrained Language Models for Text Generation: A Survey

[Bahdanau2015]

Neural machine translation by jointly learning to align and translate. In ICLR, 2015.

[Bao2020]

PLATO-2: towards building an open-domain chatbot via curriculum learning. arXiv preprint arXiv:2006.16779, 2020.

[Brown2020]

Language models are few-shot learners. In NeurIPS, 2020.

[Chen2020a]

Distilling knowledge learned in BERT for text generation. In ACL, 2020.

[Chen2020b]

Few-shot NLG with pre-trained language model. In ACL, 2020.

[Conneau2019]

Cross-lingual language model pretraining. In NeurIPS, 2019.

[Devlin2019]

BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.

[Dong2019]

Unified language model pretraining for natural language understanding and generation. In NeurIPS, 2019.

[Fan2019]

Unsupervised pre-training for sequence to sequence speech recognition. CoRR, arXiv preprint arXiv:1910.12418, 2019.

[Gehring2017]

Convolutional sequence to sequence learning. In ICML, 2017.

[Gong2020]

Tablegpt: Few-shot table-to-text generation with table structure reconstruction and content matching. In COLING, 2020.

[Gu2020]

A tailored pre-training model for task-oriented dialog generation. arXiv preprint arXiv:2004.13835, 2020.

[Guan2020]

Survey on automatic text summarization and transformer models applicability. In CCRIS, 2020.

[Hendrycks2020]

Pretrained transformers improve out-of-distribution robustness. In ACL, 2020.

[Keskar2019]

CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019.

[Kryscinski2018]

Improving abstraction in text summarization. In EMNLP, 2018.

[Lan2020]

ALBERT: A lite BERT for self-supervised learning of language representations. In ICLR, 2020.

[Lewis2020]

BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In ACL, 2020.

[Li2019]

Generating long and informative reviews with aspect-aware coarse-to-fine decoding. In ACL, pages 1969–1979, 2019.

[Li2020]

Knowledge-enhanced personalized review generation with capsule graph neural network. In CIKM, pages 735–744, 2020.

[Li2021a]

TextBox: A unified, modularized, and extensible framework for text generation. In ACL, 2021.

[Li2021b]

Few-shot knowledge graph-to-text generation with pretrained language models. In Findings of ACL, 2021.

[Li2021c]

Knowledge-based review generation by coherence enhanced text planning. In SIGIR, 2021.

[Lin2020]

Pretraining multilingual neural machine translation by leveraging alignment information. In EMNLP, 2020.

[Liu2019]

Text summarization with pretrained encoders. In EMNLP, 2019.

[Mager2020]

GPT-too: A language-model-first approach for AMR-to-text generation. In ACL, 2020.

[Peters2018]

Deep contextualized word representations. In NAACL-HLT, 2018.

[Qiu2020]

Pre-trained models for natural language processing: A survey. arXiv preprint arXiv:2003.08271, 2020.

[Radford2019]

Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.

[Raffel2020]

Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 2020.

[Ribeiro2020]

Investigating pretrained language models for graph-to-text generation. arXiv preprint arXiv:2007.08426, 2020.

[Ross2012]

Guide for conducting risk assessments. In NIST Special Publication, 2012.

[Rothe2020]

Leveraging pre-trained checkpoints for sequence generation tasks. TACL, 2020.

[Sanh2019]

Distilbert, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.

[See2017]

Get to the point: Summarization with pointer-generator networks. In ACL, 2017.

[Song2019]

MASS: masked sequence to sequence pre-training for language generation. In ICML, 2019.

[Sun2019a]

Contrastive bidirectional transformer for temporal representation learning. arXiv preprint arXiv:1906.05743, 2019.

[Sun2019b]

Videobert: A joint model for video and language representation learning. In ICCV, 2019.

[Vaswani2017]

Attention is all you need. In NIPS, 2017.

[Wada2018]

Unsupervised cross-lingual word embedding by multilingual neural language models. arXiv preprint arXiv:1809.02306, 2018.

[Wolf2019]

Transfertransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149, 2019.

[Xia2020]

XGPT: cross-modal generative pre-training for image captioning. arXiv preprint arXiv:2003.01473, 2020.

[Xu2020a]

Discourse-aware neural extractive text summarization. In ACL, 2020.

[Xu2020b]

Unsupervised extractive summarization by pre-training hierarchical transformers. In EMNLP, 2020.

[Yang2020a]

CSP: code-switching pre-training for neural machine translation. In EMNLP, 2020.

[Yang2020b]

TED: A pretrained unsupervised summarization model with theme modeling and denoising. In EMNLP (Findings), 2020.

[Zaib2020]

A short survey of pre-trained language models for conversational AI-A new age in NLP. In ACSW, 2020.

[Zeng2020]

Generalized conditioned dialogue generation based on pre-trained language model. arXiv preprint arXiv:2010.11140, 2020.

[Zhang2019a]

Pretraining-based natural language generation for text summarization. In CoNLL, 2019.

[Zhang2019b]

HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization. In ACL, 2019.

[Zhang2019c]

ERNIE: enhanced language representation with informative entities. In ACL, 2019.

[Zhang2020]

DIALOGPT: Large-scale generative pre-training for conversational response generation. In ACL, 2020.

[Zhao2020]

Knowledge-grounded dialogue generation with pretrained language models. In EMNLP, 2020.

[Zheng2019]

Sentence centrality revisited for unsupervised summarization. In ACL, 2019.

[Zhou2020]

Unified vision-language pre-training for image captioning and VQA. In AAAI, 2020.

Survey on Automatic Text Summarization and Transformer Models Applicability

[CohanA2018]

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 615–621.

[NenkovaA2007]

The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Transactions on Speech and Language Processing 4, 2 (2007).

[RadfordA]

Improving language understanding by generative pre-training. www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf

[RasimMA2013]

Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689.

[RasimMA]

MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522.

[VaswaniA2017]

Attention is all you need. Advances in neural information processing systems (2017), 5998–6008.

[RaffelC2019]

Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683 (2019).

[BahdanauD2014]

Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 (2014).

[GunesE2004]

LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence 20, 1 (2004), 457–479.

[ZhangH]

Pretraining-Based Natural Language Generation for Text Summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 789–797.

[DevlinJ2019]

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186.

[HowardJ]

Universal Language Model Fine-tuning for Text Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 328–339.

[ZhangJ2019]

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. arXiv:1912.08777 (2019).

[KaikhahK]

Text summarization using neural networks. In Proceeding of second conference on intelligent system. 40–44.

[XuK]

Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International conference on machine learning. 2048–2057.

[Chin-YewL]

ROUGE: A package for automatic evaluation of summaries. In Proceedings of ACL Workshop “Text Summarization Branches Out”. 8.

[M2019]

BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv:1910.13461 (2019).

[Ch2011]

A statistical approach for automatic text summarization by extraction. In Proceedings of 2011 International Conference on Communication Systems and Network Technologies. 268–271.

[ConroyJM]

Text Summarization via Hidden Markov Models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 406–407.

[PetersM2018]

Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2227–2237.

[RushAM2015]

A neural attention model for abstractive sentence summarization. arXiv:1509.00685 (2015).

[VinyalsO2015]

Pointer networks. Advances in neural information processing systems (2015), 2692–2700.

[DragomirRR2004]

Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938.

[MihalceaR2004]

Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing. 404–411.

[NallapatiR2016]

Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv:1602.06023 (2016).

[OakR2015]

Extractive techniques for automatic document summarization: a survey. International Journal of Innovative Research in Computer and Communication Engineering 4, 3 (2016), 4158–4164.

[ParkerR2011]

English Gigaword. https://catalog.ldc.upenn.edu/LDC2011T07

[ChopraS2016]

Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 93–98.

[EvanS2008]

The New York Times Annotated Corpus. https://catalog.ldc.upenn.edu/LDC2008T19

[EdunovS2019]

Pre-trained language model representations for language generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 4052–4059.

[NarayanS2018]

Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1797–1807.

[PeterJ2017]

Get to the point: Summarization with pointer-generator networks. arXiv:1704.04368 (2017).

[GuptaV2010]

A Survey of Text Summarization Extractive Techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268.

[SanhV2019]

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arxiv.org/pdf/1910.01108 (2019).

[LiuY2019]

Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692 (2019).

[YanY2020]

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv:2001.04063 (2020).

[DaiZ]

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2978–2988.

[LanZ2019]

Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019).

[YangZ2019]

Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems (2019), 5754–5764.

CTRL: A Conditional Transformer Language Model For Controllable Generation

[Mart2016]

TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.

[Rohan2019]

Memory-efficient adaptive optimization for large-scale learning. arXiv preprint arXiv:1901.11150, 2019.

[Martin2017]

Wasserstein generative adversarial networks. In International conference on machine learning, pp. 214–223, 2017.

[Matthew2017]

Factsheets: Increasing trust in AI services through supplier’s declarations of conformity, August 2018. arXiv:1808.07261 [cs.CY].

[Mikel2017]

Unsupervised neural machine translation. arXiv preprint arXiv:1710.11041, 2017.

[Jimmy2016]

Layer normalization. CoRR, abs/1607.06450, 2016.

[Lo2019]

Findings of the 2019 conference on machine translation (wmt19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pp. 1–61, 2019.

[Yoshua2003]

A neural probabilistic language model. Journal of machine learning research, 3(Feb):1137–1155, 2003.

[Thorsten2007]

Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858–867, 2007.

[Miles2016]

Artificial intelligence and responsible innovation. In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence, pp. 543–554. Springer, 2016.

[Miles2019]

The malicious use of artificial intelligence: Forecasting, prevention, and mitigation, February 2019. arXiv:1802.07228 [cs.AI].

[Isaac2019]

Tagged back-translation. arXiv preprint arXiv:1906.06442, 2019.

[Xi2016]

Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in neural information processing systems, pp. 2172–2180, 2016.

[Rewon2019]

Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509, 2019.

[Ronan2008]

A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, pp. 160–167. ACM, 2008.

[Ronan2011]

Natural language processing (almost) from scratch. Journal of machine learning research, 12(Aug):2493–2537, 2011.

[Ruth1987]

The consumption junction: A proposal for research strategies in the sociology of technology. In Wiebe E. Bijker, Thomas P. Hughes, and Trevor J. Pinch (eds.), The Social Construction of Technological Systems, pp. 261–280. MIT Press, Cambridge, MA, USA, 1987.

[Andrew2015]

Semi-supervised sequence learning. In Advances in neural information processing systems, pp. 3079–3087, 2015.

[Zihang2019]

Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019.

[Jacob2018]

Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

[John2011]

Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121–2159, 2011.

[Matthew2017]

Searchqa: A new q&a dataset augmented with context from a search engine. arXiv preprint arXiv:1704.05179, 2017.

[Angela2018]

Hierarchical neural story generation. arXiv preprint arXiv:1805.04833, 2018.

[Angela2019]

Eli5: Long form question answering. arXiv preprint arXiv:1907.09190, 2019.

[Boris2019]

Stochastic gradient methods with layer-wise adaptive moments for training of deep networks. arXiv preprint arXiv:1905.11286, 2019.

[Ian2014]

Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.

[Max2016]

Newsroom: A dataset of 1.3 million summaries with diverse extractive strategies. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 708–719, New Orleans, Louisiana, June 2018.

[Kaiming2016]

Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.

[Karl2015]

Teaching machines to read and comprehend. In Advances in neural information processing systems, pp. 1693–1701, 2015.

[Ari2019]

The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.

[Jeremy2018]

Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.

[Hakan2016]

Tying word vectors and word classifiers: A loss framework for language modeling. arXiv preprint arXiv:1611.01462, 2016.

[Melvin2017]

Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 5:339–351, 2017.

[Mandar2017]

Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551, 2017.

[David2017]

Self-censorship is not enough. Nature, 492(7429):345–347,December 2012. doi: 10.1038/492345a.

[Lukasz2017]

One model to learn them all. arXiv preprint arXiv:1706.05137, 2017.

[Łukasz2018]

Fast decoding in sequence models using discrete latent variables. arXiv preprint arXiv:1803.03382, 2018.

[Nitish2019]

Unifying question answering and text classification via span extraction. arXiv preprint arXiv:1904.09286, 2019.

[Diederik2014]

Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

[Diederik2013]

Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.

[Ryan2015]

Skip-thought vectors. In Advances in neural information processing systems, pp. 3294–3302, 2015.

[Catherine2016]

Domain control for neural machine translation. arXiv preprint arXiv:1612.06140, 2016.

[Wojciech2019]

Neural text summarization: A critical evaluation. arXiv preprint arXiv:1908.08960, 2019.

[Tom2019]

Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.

[Guillaume2019a]

Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291, 2019.

[Guillaume2019b]

Large memory layers with product keys. arXiv preprint arXiv:1907.05242, 2019.

[Hector2012]

The winograd schema challenge. In Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning, 2012.

[Patrick2019]

Unsupervised question answering by cloze translation. arXiv preprint arXiv:1906.04980, 2019.

[Minh-Thang2015]

Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114, 2015.

[Julian2015]

Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 43–52. ACM, 2015.

[Bryan6294]

Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pp. 6294.

[Bryan2018]

The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730, 2018.

[Stephen2017]

Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182, 2017.

[Tomas2013]

Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pp. 3111–3119, 2013.

[Margaret2019]

Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* ’19), January 2019. doi: 10.1145/3287560.3287596.

[Amit2019]

Filling gender & number gaps in neural machine translation with black-box context injection. arXiv preprint arXiv:1903.03467, 2019.

[Vinod2010]

Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, 2010.

[Ramesh2016]

Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv preprint arXiv:1602.06023, 2016.

[Matthew2018]

Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.

[Carol1979]

Constraints on language mixing: intra sentential code-switching and borrowing in spanish/english. Language, pp. 291–318, 1979.

[Shana1980]

Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics, 18(7-8):581–618, 1980.

[Ofir2016]

Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016.

[Alec2018]

Improving language understanding by generative pre-training. URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/languageunsupervised/language understanding paper.pdf, 2018.

[Alec2019]

Language models are unsupervised multitask learners. URL https://d4mucfpksywv.cloudfront.net/better-language-models/language models are unsupervised multitask learners.pdf, 2019.

[Nazneen2019]

Explain yourself! leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361, 2019.

[Pranav2016]

Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.

[Alexander2015]

A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685, 2015.

[Evan2008]

The new york times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12):e26752, 2008.

[Thomas2019]

Answers unite! unsupervised metrics for reinforced summarization models. arXiv preprint arXiv:1909.01610, 2019.

[Abigail2017]

Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pp. 1073–1083, 2017.

[Rico2015]

Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015.

[Noam2018]

Adafactor: Adaptive learning rates with sublinear memory cost. arXiv preprint arXiv:1804.04235, 2018.

[Jack2013]

Developing a framework for responsible innovation. Research Policy, 42(9):1568–1580, November 2013. doi: 10.1016/j.respol.2013.05.008.

[Ilya2014]

Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112, 2014.

[Trieu2018]

A simple method for commonsense reasoning. arXiv preprint arXiv:1806.02847, 2018.

[Adam2016]

A machine comprehension dataset. arXiv preprint arXiv:1611.09830, 2016.

[Lav2019]

Pretrained AI models: Performativity, mobility, and change, September 2019. arXiv:1909.03290 [cs.CY].

[Curran2018]

Glue: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.

[Sean2019]

Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319, 2019.

[Yonghui2016]

Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.

[Stratos2019]

Sumqe: a bert-based summary quality estimation model. arXiv preprint arXiv:1909.00578, 2019.

[Zhilin2018]

Hotpotqa: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600, 2018.

[Rowan2019]

Defending against neural fake news. arXiv preprint arXiv:1905.12616, 2019.

[Fangxiaoyu2022]

Language-agnostic BERT Sentence Embedding - LaBSE

  • BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding-based transfer learning.
  • This paper explores BERT-based cross-lingual sentence embeddings.
  • It combines the best methods for learning monolingual and cross-lingual representations, including masked language modeling (MLM) and translation language modeling (TLM).
  • Introducing a pre-trained multilingual language model dramatically reduces the amount of parallel training data required to achieve good performance.
  • The resulting model achieves high bi-text retrieval accuracy over 112 languages.

NLP Papers Available on my Google Drive

You can download these papers from link

  1. A brief introduction to boosting.pdf
  2. A Closer Look at Fermentors and Bioreactors.pdf
  3. A Comprehensive Survey on Graph Neural Networks.pdf
  4. A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection.pdf
  5. A dataset for detecting irony in Hindi-english code-mixed social media text.pdf
  6. A Framework for Document Specific Error Detection and Corrections in Indic OCR.pdf
  7. A lexicon-based approach for hate speech detection.pdf
  8. A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (.pdf
  9. A novel automatic satire and irony detection using ensembled feature selection and data mining.pdf
  10. A Pragmatic Analysis Of Humor In Modern Family.pdf
  11. A Selective Overview of Deep Learning.pdf
  12. A Sentiment Analyzer for Hindi Using Hindi Senti Lexicon.pdf
  13. A Survey of Code-switched Speech and Language Processing.pdf
  14. A Survey of the State of Explainable AI for Natural Language Processing.pdf
  15. A Survey on Explainable Artificial Intelligence (XAI) Toward Medical XAI.pdf
  16. A TENGRAM method based part-of-speech tagging of multi-category words in Hindi language.pdf
  17. A transformer-based approach to irony and sarcasm detection.pdf
  18. A2Text-net A novel deep neural network for sarcasm detection.pdf
  19. Adaptive glove and fasttext model for Hindi word embeddings.pdf
  20. AI and Ethics - Operationalising Responsible AI-PAPER.pdf
  21. AI4Bharat-IndicNLP Corpus Monolingual Corpora and Word Embeddings for Indic Languages.pdf
  22. ALBERT A Lite BERT for Self-supervised Learning of Language Representations.pdf
  23. an Analysis of Current Trends for Sanskrit As a Computer Programming Language.pdf
  24. An empirical, quantitative analysis of the differences between sarcasm and Irony.pdf
  25. An Image is Worth 16x16 Words Transformers for Image Recognition at Scale.pdf
  26. Analyzing_The_Expressive_Power_Of_Graph.pdf
  27. AnnCorra Annotating Corpora Guidelines For POS And Chunk Annotation For Indian Languages.pdf
  28. Approaches to Cross-Domain Sentiment Analysis A Systematic Literature Review.pdf
  29. Attention is all you need.pdf
  30. Automatic sarcasm detection A survey.pdf
  31. Automatic satire detection Are you having a laugh.pdf
  32. Bag of tricks for efficient text classification.pdf
  33. Baselines and bigrams Simple, good sentiment and topic classification.pdf
  34. BERT Explained - A list of Frequently Asked Questions.pdf
  35. BERT Pre-training of deep bidirectional transformers for language understanding.pdf
  36. BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories.pdf
  37. Carer Contextualized affect representations for emotion recognition.pdf
  38. CASCADE Contextual Sarcasm Detection in Online Discussion Forums.pdf
  39. Challenges in Deploying Machine Learning a Survey of Case Studies.pdf
  40. Clinical artificial intelligence quality improvement towards continual monitoring and updating of AI algorithms in healthcare.pdf
  41. CLUE based load balancing in replicated web server.pdf
  42. Clues for detecting irony in user-generated contents Oh…!! it_s so easy -).pdf
  43. Code Mixing A Challenge for Language Identification in the Language of Social Media.pdf
  44. Context-based Sarcasm Detection in Hindi Tweets.pdf
  45. Contextualized sarcasm detection on twitter.pdf
  46. Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis.pdf
  47. Data governance A conceptual framework, structured review, and research agenda.pdf
  48. Deep and Dense Sarcasm Detection.pdf
  49. Deep learning based unsupervised POS tagging for Sanskrit.pdf
  50. Detailed human avatars from monocular video.pdf
  51. Detecting Sarcasm is Extremely Easy -).pdf
  52. DIALOGPT Large-Scale Generative Pre-training for Conversational Response Generation.pdf
  53. DistilBERT, a distilled version of BERT smaller, faster, cheaper and lighter.pdf
  54. DRIFT Deep Reinforcement Learning for Functional Software Testing.pdf
  55. Drop A reading comprehension benchmark requiring discrete reasoning over paragraphs.pdf
  56. Dynamic routing between capsules.pdf
  57. Effect of speech coding on speaker identification.pdf
  58. Efficient estimation of word representations in vector space(2).pdf
  59. ELECTRA Pre-training Text Encoders as Discriminators Rather Than Generators.pdf
  60. Embedding Words as Distributions with a Bayesian Skip-gram Model.pdf
  61. Enriching Word Vectors with Subword Information.pdf
  62. Experience Grounds Language.pdf
  63. Exploiting emojis for sarcasm detection.pdf
  64. Exploiting Similarities among Languages for Machine Translation.pdf
  65. Exploring the fine-grained analysis and automatic detection of irony on Twitter(2).pdf
  66. Exploring the fine-grained analysis and automatic detection of irony on Twitter.pdf
  67. Exploring the impact of pragmatic phenomena on irony detection in tweets A multilingual corpus study.pdf
  68. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.pdf
  69. Extensions to HMM-based Statistical Word Alignment Models.pdf
  70. Fairness_In_Machine_Learning_A_Survey.pdf
  71. Fake news detection of Indian and United States election data using machine learning algorithm.pdf
  72. Fake News Detection on Social Media.pdf
  73. FakeNewsNet A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social.pdf
  74. Faster R-CNN Towards Real-Time Object Detection with Region Proposal Networks.pdf
  75. FastText.zip Compressing text classification models.pdf
  76. Figurative messages and affect in Twitter Differences between #irony, #sarcasm and #not.pdf
  77. Forecasting COVID-19 Confirmed Cases in Major Indian Cities and Their Connectedness with Mobility and Weather-related Parameters.pdf
  78. From English To Foreign Languages Transferring Pre-trained Language Models.pdf
  79. FROM Pre-trained Word Embeddings TO Pre-trained Language Models - Focus on BERT.pdf
  80. Going deeper with convolutions.pdf
  81. Graph Machine Learning NeurIPS 2020 Papers.pdf
  82. Grouped Convolutional Neural Networks for Multivariate Time Series.pdf
  83. Grouped Functional Time Series Forecasting An Application to Age-Specific Mortality Rates.pdf
  84. Handbook of approximation algorithms and metaheuristics.pdf
  85. Harnessing context incongruity for sarcasm detection.pdf
  86. Harnessing Online News for Sarcasm Detection in Hindi Tweets.pdf
  87. Hidden Markov Models.pdf
  88. Hidden technical debt in machine learning systems.pdf
  89. Hotpotqa A dataset for diverse, explainable multi-hop question answering.pdf
  90. How multilingual is multilingual BERT.pdf
  91. How to avoid machine learning pitfalls a guide for academic researchers.pdf
  92. How to read a paper.pdf
  93. HuggingFace_s Transformers State-of-the-art Natural Language Processing.pdf
  94. Identifying machine learning techniques for classification of target advertising.pdf
  95. Identifying sarcasm in Twitter A closer look.pdf
  96. Improving Language Understanding by Generative Pre-Training.pdf
  97. Improving the learnability of classifiers for Sanskrit OCR corrections.pdf
  98. Indic sentiReview Natural language processing based sentiment analysis on major indian languages.pdf
  99. Interactive-and-Visual-Prompt-Engineering-for-adhoc-Task-Adaptation-LLM.pdf
  100. Investigations in computational sarcasm.pdf
  101. Irony detection in twitter The role of affective content.pdf
  102. Irony, Sarcasm and Parody in the American Sitcom Modern Family.pdf
  103. iSarcasm A Dataset of Intended Sarcasm.pdf
  104. K-means with Three different Distance Metrics.pdf
  105. Knowledge Representation in Sanskrit and Artificial Intelligence.pdf
  106. Learning Graph Search Heuristics.pdf
  107. Learning latent causal graphs via mixture oracles.pdf
  108. LearningSys_2015_paper_32.pdf
  109. Lexicon-Based Methods for Sentiment Analysis.pdf
  110. Lexicon-Based Sentiment Analysis in the Social Web.pdf
  111. LightGBM A highly efficient gradient boosting decision tree.pdf
  112. Linguistic Inquiry and Word Count LIWC2015.pdf
  113. Machine Learning in Automated Text Categorization.pdf
  114. Machine Learning within a Graph Database A Case Study on Link Prediction for Scholarly Data.pdf
  115. Machine Translation Approaches and Survey for Indian Languages.pdf
  116. Machine Translation of Bi-lingual Hindi-English (Hinglish) Text.pdf
  117. Merlion A Machine Learning Library for Time Series.pdf
  118. Mining of Massive Datasets.pdf
  119. MLP-Mixer An all-MLP Architecture for Vision.pdf
  120. Multi-modal sarcasm detection in Twitter with hierarchical fusion model.pdf
  121. Multi-rule based ensemble feature selection model for sarcasm type detection in Twitter.pdf
  122. Multimodal markers of irony and sarcasm.pdf
  123. Natural Language Inference Over.pdf
  124. Natural Language Processing - A Paninian Perspective.pdf
  125. Natural language processing based features for sarcasm detection An investigation using bilingual social media texts.pdf
  126. NeuralProphet Explainable Forecasting at Scale.pdf
  127. On State-of-the-art of POS Tagger, Sandhi Splitter, Alankaar Finder and Samaas Finder for IndoAryan and Dravidian Languages.pdf
  128. Opinion mining and sentiment analysis.pdf
  129. Opinion-Based Entity Ranking (Author_s Draft).pdf
  130. Part-of-speech tagging from 97_ to 100_ Is it time for some linguistics.pdf
  131. PAVE Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models.pdf
  132. Real-time Sentiment Analysis of Hindi Tweets.pdf
  133. Reasoning with sarcasm by reading in-between.pdf
  134. Recent trends in deep learning based natural language processing Review Article.pdf
  135. RECEPTIVE FIELDS OF SINGLE NEURONES IN THE CAT'S STRIATE CORTEX.pdf
  136. Recognition of consonant-vowel (CV) units under background noise using combined temporal and spectral preprocessing.pdf
  137. Representing social media users for sarcasm detection.pdf
  138. Retrospective Reader for Machine Reading Comprehension.pdf
  139. RoBERTa A Robustly Optimized BERT Pretraining Approach.pdf
  140. Robotics, AI, and.pdf
  141. ROC graphs Notes and practical considerations for researchers.pdf
  142. Sanskrit sandhi splitting using Seq2(Seq)22.pdf
  143. Sanskrit word segmentation using character-level recurrent and convolutional neural networks.pdf
  144. Sarc-M Sarcasm Detection in Typo-graphic Memes.pdf
  145. Sarcasm as contrast between a positive sentiment and negative situation.pdf
  146. Sarcasm Detection in Hindi sentences using Support Vector machine.pdf
  147. Sarcasm detection in tweets.pdf
  148. Sarcasm detection on Twitter A behavioral modeling approach.pdf
  149. Sarcastic sentiment detection in tweets streamed in real time a big data approach.pdf
  150. Scalable linear algebra on a relational database system.pdf
  151. Scaling Large Production Clusters with Partitioned Synchronization.pdf
  152. Semantics-Aware BERT for Language Understanding.pdf
  153. Semi-supervised recognition of sarcastic sentences in twitter and Amazon.pdf
  154. SentencePiece A simple and language independent subword tokenizer and detokenizer for neural text processing.pdf
  155. Sentiment Analysis for Hindi Language.pdf
  156. Sentiment Analysis in a Resource Scarce Language Hindi.pdf
  157. Sentiment Analysis In Hindi.pdf
  158. Sentiment Analysis in Indian languages o Definition.pdf
  159. Sentiment Analysis of Hindi Review based on Negation and Discourse Relation.pdf
  160. Sentiment classification using machine learning techniques with syntax features.pdf
  161. Skillful writing of an awful research paper.pdf
  162. Social media and fake news in the 2016 election.pdf
  163. Sound classification using convolutional neural network and tensor deep stacking network.pdf
  164. Sparse, contextually informed models for irony detection Exploiting user communities, entities and sentiment.pdf
  165. SQuad 100,000 questions for machine comprehension of text.pdf
  166. ST4_Method_Random_Forest.pdf
  167. Statistical Methods in Natural Language Processing.pdf
  168. StructBERT Incorporating Language Structures into Pre-training for Deep Language Understanding.pdf
  169. Structural Studies on Small Amyloid Oligomers RT-6.pdf
  170. Superintelligence.pdf
  171. Systematic literature review of sentiment analysis on Twitter using soft computing techniques.pdf
  172. Text categorization with support vector machines Learning with many relevant features.pdf
  173. Text normalization of code mix and sentiment analysis.pdf
  174. The Differential Role of Ridicule in Sarcasm and Irony.pdf
  175. The highest form of intelligence Sarcasm increases creativity for both expressers and recipients.pdf
  176. The Modern Mathematics of Deep Learning *.pdf
  177. The Paninian approach to natural language processing.pdf
  178. The perfect solution for detecting sarcasm in tweets #not.pdf
  179. Thumbs Up or Thumbs Down Semantic Orientation Applied to Unsupervised Classification of Reviews.pdf
  180. THU_NGN at SemEval-2018 Task 3 Tweet Irony Detection with Densely connected LSTM and Multi-task Learning.pdf
  181. TnT - A Statistical Part-of-Speech Tagger.pdf
  182. To BLOB or Not To BLOB Large Object Storage in a Database or a Filesystem.pdf
  183. Towards Demystifying Serverless Machine Learning Training.pdf
  184. Towards multimodal sarcasm detection (an obviously perfect paper).pdf
  185. Towards sub-word level compositions for sentiment analysis of Hindi-English code mixed text.pdf
  186. Triple-View Feature Learning for Medical Image Segmentation.pdf
  187. Twitter as a corpus for sentiment analysis and opinion mining.pdf
  188. Two improved continuous bag-of-word models.pdf
  189. Understanding Diffusion Models A Unified Perspective.pdf
  190. Universal Sentence Encoder.pdf
  191. Unsupervised Irony Detection A Probabilistic Model with Word Embeddings.pdf
  192. UR-Funny A multimodal language dataset for understanding humor.pdf
  193. Use of Sanskrit for natural language processing.pdf
  194. Using TF-IDF to Determine Word Relevance in Document Queries.pdf
  195. Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval.pdf
  196. Very deep convolutional networks for large-scale image recognition.pdf
  197. We are IntechOpen, the world's leading publisher of Open Access books Built by scientists, for scientists TOP 1 _.pdf
  198. When BERT Plays the Lottery, All Tickets Are Winning.pdf
  199. XGBoost A scalable tree boosting system.pdf
  200. XLNet Generalized Autoregressive Pretraining for Language Understanding.pdf

AI Papers Available on my Google Drive

These papers are available for download from my Google Drive.

  1. A Comprehensive Survey on Graph Neural Networks-PAPER.pdf
  2. A machine learning approach to predicting psychosis-PAPER.pdf
  3. A Selective Overview of Deep Learning-PAPER.pdf
  4. A Short introduction to boosting-PAPER.pdf
  5. A Survey of the State of Explainable AI for NLP-PAPER.pdf
  6. A Survey on Explainable AI (XAI) towards Medical XAI-PAPER.pdf
  7. AI and Ethics - Operationalising Responsible AI-PAPER.pdf
  8. Analyzing The Expressive Power of Graph Neural Network in a Spectral Perspective-PAPER.pdf
  9. Attention-Mechanism-Transformers-BERT-and-GPT-PAPER.pdf
  10. Can GPT-4 Perform Neural Architecture Search-PAPER.pdf
  11. Challenges in Deploying Machine Learning-PAPER.pdf
  12. Clinical AI quality improvement-PAPER.pdf
  13. Cramming-Training-a-Language-Model-On-A-Single-GPU-in-one-Day-PAPER.pdf
  14. DataGovernance-A conceptual framework, structured review-PAPER.pdf
  15. Detailed human avatar-PAPER.pdf
  16. DRIFT_26_CameraReadySubmission_NeurIPS_DRL-PAPER.pdf
  17. Dynamic Routing Between Capsules-PAPER.pdf
  18. Fairness in Machine Learning A Survey-PAPER.pdf
  19. Forecasting COVID-19 Confirmed Cases-PAPER.pdf
  20. Generalization Beyond Overfitting On Small Datasets-PAPER.pdf
  21. GPTrillion-Paper.pdf
  22. GPTs-are-GPTs-An-Early-Look-at-the-Labor-Market-Impact-Potential-of-Large-Language-Models-PAPER.pdf
  23. Graph Machine Learning NeurIPS 2020-PAPER.pdf
  24. Grouped Convolutional Neural Networks for Multivariate Time Series -PAPER.pdf
  25. Grouped functional time series forecasting An application to age-specific mortality rates-PAPER.pdf
  26. Hidden Technical Debt in Machine Learning Systems-PAPER.pdf
  27. Hidden technical debt in machine learning systems.pdf
  28. How to avoid machine learning pitfalls-PAPER.pdf
  29. How to Read a Paper-ARTC.pdf
  30. Identifying machine learning techniques for classification of target advertising-PAPER.pdf
  31. Introducing-GPTrillion-PAPER.pdf
  32. Large Object Storage in a Database or a Filesystem-PAPER.pdf
  33. Learning Graph Heuristic Search-PAPER.pdf
  34. Learning latent causal graphs via mixture oracles-PAPER.pdf
  35. LightGBM A Highly Efficient Gradient Boosting-PAPER.pdf
  36. Machine Learning within a Graph Database- A Case Study on Link Prediction for Scholarly Data-PAPER.pdf
  37. Merlion- A Machine Learning Library for Time Series-PAPER.pdf
  38. Model Evaluation, Model Selection, and Algorithm Selection-PAPER.pdf
  39. NeuralProphet-Explainable Forecasting at Scale-PAPER.pdf
  40. PAVE-Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models-PAPER.pdf
  41. Precise Zero-Shot Dense Retrieval without Relevance Labels-PAPER.pdf
  42. Randomforest-PAPER.pdf
  43. Receptive Fields of Single Neurones in the Cats Striate Cortex-PAPER.pdf
  44. Robotics, AI, and Humanity Science-PAPERS.pdf
  45. Scalable Linear Algebra on a Relational Database System-PAPER.pdf
  46. Scaling Large Production Clusters-PAPER.pdf
  47. Skillful writing of an awful research paper-GUIDE.pdf
  48. The Modern Mathematics of Deep Learning-PAPER.pdf
  49. Towards Demystifying Serverless Machine Learning Training-PAPER.pdf
  50. Triple-View Feature Learning for Medical Image Segmentation-PAPER.pdf
  51. Understanding Diffusion Models- A Unified Perspective-PAPER.pdf
  52. VeML-An-End-to-End-Machine-Learning-Lifecycle-for-Large-Scale-and-High-Dimensional-Data-PAPER.pdf
  53. Very Deep Convolutional Networks for Large Scale Image Recognition-PAPER.pdf
  54. XGBoost A Scalable Tree Boosting System-PAPER.pdf
  55. XGBoost Reliable Large-scale Tree Boosting System-PAPER.pdf

Recent Papers

  1. Detecting Signs of Disease from External Images of the Eye, 24 March 2022
  2. Deep-learning models for the detection and incidence prediction of chronic kidney disease and type 2 diabetes from retinal fundus images, 15 June 2021
  3. Detection of anaemia from retinal fundus images via deep learning, 23 December 2019
  4. Assessing Cardiovascular Risk Factors with Computer Vision, 19 February 2018

Author
Dr Hari Thapliyaal
dasarpai.com
linkedin.com/in/harithapliyal
