Important AI Paper List
Introduction
In almost all citation styles it is hard to pick out the title of a research paper. Why? Because the authors’ names come first, and names are often hard to read for anyone who is not a native of the authors’ region. For example, an Indian reader finds names like “Vivek Ramaswami, Kartikeyan Karunanidhi” easy to read, but the same names are difficult for non-Indian readers, and vice versa. Giving credit to the creators is very important, but more than that we need to know what they have done. I know from experience that almost every researcher finds it difficult to keep track of good AI research papers. For me it is even harder, because I need to maintain this blog and I want to reference these works across different webpages. Therefore I am creating a citation key made of the last name of the first researcher plus the year the paper was presented. Along with the key, I give the title of the paper and where it was presented. If a title looks interesting for your work, you can search for the paper on Google Scholar, Mendeley, Sci-Hub, or any other place you are familiar and comfortable with. After that you can download the paper and read it at your leisure. I hope you find this list useful for your work.
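The key scheme described above (last name of the first researcher plus year) can be sketched in a few lines of Python. This is only an illustration, not the script used to build this page: the function name `citation_keys` and the input format are assumptions, and the a/b/c suffix for repeated keys simply mirrors entries such as [Chen2020a] and [Chen2020b] in the list.

```python
from collections import Counter, defaultdict
from string import ascii_lowercase


def citation_keys(papers):
    """Build keys like 'Devlin2019' from (last_name, year) pairs.

    `papers` is a hypothetical input format: a list of
    (first_author_last_name, year) tuples in the order they are listed.
    When the same base key occurs more than once, a letter suffix
    (a, b, c, ...) is appended in order of appearance.
    """
    bases = [f"{last}{year}" for last, year in papers]
    totals = Counter(bases)      # how often each base key occurs overall
    seen = defaultdict(int)      # how many times we have emitted it so far
    keys = []
    for base in bases:
        if totals[base] > 1:
            # Supports up to 26 papers per base key (a..z).
            keys.append(base + ascii_lowercase[seen[base]])
            seen[base] += 1
        else:
            keys.append(base)
    return keys


papers = [("Devlin", 2019), ("Chen", 2020), ("Chen", 2020), ("Brown", 2020)]
print(citation_keys(papers))
```

Only keys that actually collide get a suffix, so a lone paper keeps the short form.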
Citations
Pretrained Language Models for Text Generation: A Survey
[Bahdanau2015]
Neural machine translation by jointly learning to align and translate. In ICLR, 2015.
[Bao2020]
PLATO-2: towards building an open-domain chatbot via curriculum learning. arXiv preprint arXiv:2006.16779, 2020.
[Brown2020]
Language models are few-shot learners. In NeurIPS, 2020.
[Chen2020a]
Distilling knowledge learned in BERT for text generation. In ACL, 2020.
[Chen2020b]
Few-shot NLG with pre-trained language model. In ACL, 2020.
[Conneau2019]
Cross-lingual language model pretraining. In NeurIPS, 2019.
[Devlin2019]
BERT: pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT, 2019.
[Dong2019]
Unified language model pretraining for natural language understanding and generation. In NeurIPS, 2019.
[Fan2019]
Unsupervised pre-training for sequence to sequence speech recognition. CoRR, arXiv preprint arXiv:1910.12418, 2019.
[Gehring2017]
Convolutional sequence to sequence learning. In ICML, 2017.
[Gong2020]
Tablegpt: Few-shot table-to-text generation with table structure reconstruction and content matching. In COLING, 2020.
[Gu2020]
A tailored pre-training model for task-oriented dialog generation. arXiv preprint arXiv:2004.13835, 2020.
[Guan2020]
Survey on automatic text summarization and transformer models applicability. In CCRIS, 2020.
[Hendrycks2020]
Pretrained transformers improve out-of-distribution robustness. In ACL, 2020.
[Keskar2019]
CTRL: A conditional transformer language model for controllable generation. arXiv preprint arXiv:1909.05858, 2019.
[Kryscinski2018]
Improving abstraction in text summarization. In EMNLP, 2018.
[Lan2020]
ALBERT: A lite BERT for self-supervised learning of language representations. In ICLR, 2020.
[Lewis2020]
BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In ACL, 2020.
[Li2019]
Generating long and informative reviews with aspect-aware coarse-to-fine decoding. In ACL, pages 1969–1979, 2019.
[Li2020]
Knowledge-enhanced personalized review generation with capsule graph neural network. In CIKM, pages 735–744, 2020.
[Li2021a]
TextBox: A unified, modularized, and extensible framework for text generation. In ACL, 2021.
[Li2021b]
Few-shot knowledge graph-to-text generation with pretrained language models. In Findings of ACL, 2021.
[Li2021c]
Knowledge-based review generation by coherence enhanced text planning. In SIGIR, 2021.
[Lin2020]
Pretraining multilingual neural machine translation by leveraging alignment information. In EMNLP, 2020.
[Liu2019]
Text summarization with pretrained encoders. In EMNLP, 2019.
[Mager2020]
GPT-too: A language-model-first approach for AMR-to-text generation. In ACL, 2020.
[Peters2018]
Deep contextualized word representations. In NAACL-HLT, 2018.
[Qiu2020]
Pre-trained models for natural language processing: A survey. arXiv preprint arXiv:2003.08271, 2020.
[Radford2019]
Language models are unsupervised multitask learners. OpenAI blog, 1(8):9, 2019.
[Raffel2020]
Exploring the limits of transfer learning with a unified text-to-text transformer. JMLR, 2020.
[Ribeiro2020]
Investigating pretrained language models for graph-to-text generation. arXiv preprint arXiv:2007.08426, 2020.
[Ross, 2012]
Guide for conducting risk assessments. In NIST Special Publication, 2012.
[Rothe2020]
Leveraging pre-trained checkpoints for sequence generation tasks. TACL, 2020.
[Sanh2019]
Distilbert, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.
[See2017]
Get to the point: Summarization with pointer-generator networks. In ACL, 2017.
[Song2019]
MASS: masked sequence to sequence pre-training for language generation. In ICML, 2019.
[Sun2019a]
Contrastive bidirectional transformer for temporal representation learning. arXiv preprint arXiv:1906.05743, 2019.
[Sun2019b]
Videobert: A joint model for video and language representation learning. In ICCV, 2019.
[Vaswani2017]
Attention is all you need. In NIPS, 2017.
[Wada2018]
Unsupervised cross-lingual word embedding by multilingual neural language models. arXiv preprint arXiv:1809.02306, 2018.
[Wolf2019]
Transfertransfo: A transfer learning approach for neural network based conversational agents. arXiv preprint arXiv:1901.08149, 2019.
[Xia2020]
XGPT: cross-modal generative pre-training for image captioning. arXiv preprint arXiv:2003.01473, 2020.
[Xu2020a]
Discourse-aware neural extractive text summarization. In ACL, 2020.
[Xu2020b]
Unsupervised extractive summarization by pre-training hierarchical transformers. In EMNLP, 2020.
[Yang2020a]
CSP: code-switching pre-training for neural machine translation. In EMNLP, 2020.
[Yang2020b]
TED: A pretrained unsupervised summarization model with theme modeling and denoising. In EMNLP (Findings), 2020.
[Zaib2020]
A short survey of pre-trained language models for conversational AI-A new age in NLP. In ACSW, 2020.
[Zeng2020]
Generalized conditioned dialogue generation based on pre-trained language model. arXiv preprint arXiv:2010.11140, 2020.
[Zhang2019a]
Pretraining-based natural language generation for text summarization. In CoNLL, 2019.
[Zhang2019b]
HIBERT: document level pre-training of hierarchical bidirectional transformers for document summarization. In ACL, 2019.
[Zhang2019c]
ERNIE: enhanced language representation with informative entities. In ACL, 2019.
[Zhang2020]
DIALOGPT: Large-scale generative pre-training for conversational response generation. In ACL, 2020.
[Zhao2020]
Knowledge-grounded dialogue generation with pretrained language models. In EMNLP, 2020.
[Zheng2019]
Sentence centrality revisited for unsupervised summarization. In ACL, 2019.
[Zhou2020]
Unified vision-language pre-training for image captioning and VQA. In AAAI, 2020.
Survey on Automatic Text Summarization and Transformer Models Applicability
[CohanA2018]
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 615–621.
[NenkovaA2007]
The pyramid method: Incorporating human content selection variation in summarization evaluation. ACM Transactions on Speech and Language Processing 4, 2 (2007).
[RadfordA]
Improving language understanding by generative pre-training. www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
[RasimMA2013]
Multiple documents summarization based on evolutionary optimization algorithm. Expert Systems with Applications 40, 5 (2013), 1675–1689.
[RasimMA]
MCMR: Maximum coverage and minimum redundant text summarization model. Expert Systems with Applications 38, 12 (2011), 14514–14522.
[VaswaniA2017]
Attention is all you need. Advances in neural information processing systems (2017), 5998–6008.
[RaffelC2019]
Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv:1910.10683 (2019).
[BahdanauD2014]
Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 (2014).
[GunesE2004]
LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence 20, 1 (2004), 457–479.
[ZhangH]
Pretraining-Based Natural Language Generation for Text Summarization. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). 789–797.
[DevlinJ2019]
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186.
[HowardJ]
Universal Language Model Fine-tuning for Text Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 328–339.
[ZhangJ2019]
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization. arXiv:1912.08777 (2019).
[KaikhahK]
Text summarization using neural networks. In Proceeding of second conference on intelligent system. 40–44.
[XuK]
Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International conference on machine learning. 2048–2057.
[Chin-YewL]
ROUGE: A package for automatic evaluation of summaries. In Proceedings of ACL Workshop “Text Summarization Branches Out”. 8.
[M2019]
BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv:1910.13461 (2019).
[Ch2011]
A statistical approach for automatic text summarization by extraction. In Proceedings of 2011 International Conference on Communication Systems and Network Technologies. 268–271.
[ConroyJM]
Text Summarization via Hidden Markov Models. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 406–407.
[PetersM2018]
Deep Contextualized Word Representations. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2227–2237.
[RushAM2015]
A neural attention model for abstractive sentence summarization. arXiv:1509.00685 (2015).
[VinyalsO2015]
Pointer networks. Advances in neural information processing systems (2015), 2692–2700.
[DragomirRR2004]
Centroid-based summarization of multiple documents. Information Processing & Management 40, 6 (2004), 919–938.
[MihalceaR2004]
Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing. 404–411.
[NallapatiR2016]
Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv:1602.06023 (2016).
[OakR2015]
Extractive techniques for automatic document summarization: a survey. International Journal of Innovative Research in Computer and Communication Engineering 4, 3 (2016), 4158–4164.
[ParkerR2011]
English Gigaword. https://catalog.ldc.upenn.edu/LDC2011T07
[ChopraS2016]
Abstractive sentence summarization with attentive recurrent neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 93–98.
[EvanS2008]
The New York Times Annotated Corpus. https://catalog.ldc.upenn.edu/LDC2008T19
[EdunovS2019]
Pre-trained language model representations for language generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. 4052–4059.
[NarayanS2018]
Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1797–1807.
[PeterJ2017]
Get to the point: Summarization with pointer-generator networks. arXiv:1704.04368 (2017).
[GuptaV2010]
A Survey of Text Summarization Extractive Techniques. Journal of Emerging Technologies in Web Intelligence 2, 3 (2010), 258–268.
[SanhV2019]
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arxiv.org/pdf/1910.01108 (2019).
[LiuY2019]
Roberta: A robustly optimized bert pretraining approach. arXiv:1907.11692 (2019).
[YanY2020]
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training. arXiv:2001.04063 (2020).
[DaiZ]
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2978–2988.
[LanZ2019]
Albert: A lite bert for self-supervised learning of language representations. arXiv:1909.11942 (2019).
[YangZ2019]
Xlnet: Generalized autoregressive pretraining for language understanding. Advances in neural information processing systems (2019), 5754–5764.
CTRL: A Conditional Transformer Language Model For Controllable Generation
[Mart2016]
Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.
[Rohan2019]
Memory-efficient adaptive optimization for large-scale learning. arXiv preprint arXiv:1901.11150, 2019.
[Martin2017]
Wasserstein generative adversarial networks. In International conference on machine learning, pp. 214–223, 2017.
[Matthew2017]
Factsheets: Increasing trust in AI services through supplier’s declarations of conformity, August 2018. arXiv:1808.07261 [cs.CY].
[Mikel2017]
Unsupervised neural machine translation. arXiv preprint arXiv:1710.11041, 2017.
[Jimmy2016]
Layer normalization. CoRR, abs/1607.06450, 2016.
[Lo2019]
Findings of the 2019 conference on machine translation (wmt19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pp. 1–61, 2019.
[Yoshua2003]
A neural probabilistic language model. Journal of machine learning research, 3(Feb):1137–1155, 2003.
[Thorsten2007]
Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858–867, 2007.
[Miles2016]
Artificial intelligence and responsible innovation. In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence, pp. 543–554. Springer, 2016.
[Miles2019]
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation, February 2018. arXiv:1802.07228 [cs.AI].
[Isaac2019]
Tagged back-translation. arXiv preprint arXiv:1906.06442, 2019.
[Xi2016]
Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in neural information processing systems, pp. 2172–2180, 2016.
[Rewon2019]
Generating long sequences with sparse transformers. arXiv preprint arXiv:1904.10509, 2019.
[Ronan2008]
A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning, pp. 160–167. ACM, 2008.
[Ronan2011]
Natural language processing (almost) from scratch. Journal of machine learning research, 12(Aug):2493–2537, 2011.
[Ruth1987]
The consumption junction: A proposal for research strategies in the sociology of technology. In Wiebe E. Bijker, Thomas P. Hughes, and Trevor J. Pinch (eds.), The Social Construction of Technological Systems, pp. 261–280. MIT Press, Cambridge, MA, USA, 1987.
[Andrew2015]
Semi-supervised sequence learning. In Advances in neural information processing systems, pp. 3079–3087, 2015.
[Zihang2019]
Transformer-xl: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019.
[Jacob2018]
Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
[John2011]
Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(Jul):2121–2159, 2011.
[Matthew2017]
Searchqa: A new q&a dataset augmented with context from a search engine. arXiv preprint arXiv:1704.05179, 2017.
[Angela2018]
Hierarchical neural story generation. arXiv preprint arXiv:1805.04833, 2018.
[Angela2019]
Eli5: Long form question answering. arXiv preprint arXiv:1907.09190, 2019.
[Boris2019]
Stochastic gradient methods with layer-wise adaptive moments for training of deep networks. arXiv preprint arXiv:1905.11286, 2019.
[Ian2014]
Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680, 2014.
[Max2016]
Newsroom: A dataset of 1.3 million summaries with diverse extractive strategies. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 708–719, New Orleans, Louisiana, June 2018.
[Kaiming2016]
Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
[Karl2015]
Teaching machines to read and comprehend. In Advances in neural information processing systems, pp. 1693–1701, 2015.
[Ari2019]
The curious case of neural text degeneration. arXiv preprint arXiv:1904.09751, 2019.
[Jeremy2018]
Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
[Hakan2016]
Tying word vectors and word classifiers: A loss framework for language modeling. arXiv preprint arXiv:1611.01462, 2016.
[Melvin2017]
Google’s multilingual neural machine translation system: Enabling zero-shot translation. Transactions of the Association for Computational Linguistics, 5:339–351, 2017.
[Mandar2017]
Triviaqa: A large scale distantly supervised challenge dataset for reading comprehension. arXiv preprint arXiv:1705.03551, 2017.
[David2017]
Self-censorship is not enough. Nature, 492(7429):345–347, December 2012. doi: 10.1038/492345a.
[Lukasz2017]
One model to learn them all. arXiv preprint arXiv:1706.05137, 2017.
[Łukasz2018]
Fast decoding in sequence models using discrete latent variables. arXiv preprint arXiv:1803.03382, 2018.
[Nitish2019]
Unifying question answering and text classification via span extraction. arXiv preprint arXiv:1904.09286, 2019.
[Diederik2014]
Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[Diederik2013]
Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
[Ryan2015]
Skip-thought vectors. In Advances in neural information processing systems, pp. 3294–3302, 2015.
[Catherine2016]
Domain control for neural machine translation. arXiv preprint arXiv:1612.06140, 2016.
[Wojciech2019]
Neural text summarization: A critical evaluation. arXiv preprint arXiv:1908.08960, 2019.
[Tom2019]
Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 7:453–466, 2019.
[Guillaume2019]
Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291, 2019.
[Guillaume2019]
Large memory layers with product keys. arXiv preprint arXiv:1907.05242, 2019.
[Hector2012]
The winograd schema challenge. In Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning, 2012.
[Patrick2019]
Unsupervised question answering by cloze translation. arXiv preprint arXiv:1906.04980, 2019.
[Minh-Thang2015]
Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114, 2015.
[Julian2015]
Image-based recommendations on styles and substitutes. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 43–52. ACM, 2015.
[Bryan6294]
Learned in translation: Contextualized word vectors. In Advances in Neural Information Processing Systems, pp. 6294.
[Bryan2018]
The natural language decathlon: Multitask learning as question answering. arXiv preprint arXiv:1806.08730, 2018.
[15Stephen2017]
Regularizing and optimizing lstm language models. arXiv preprint arXiv:1708.02182, 2017.
[Tomas2013]
Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems, pp. 3111–3119, 2013.
[Margaret7596]
Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* ’19), January 2019. doi: 10.1145/3287560.3287596.
[Amit2019]
Filling gender & number gaps in neural machine translation with black-box context injection. arXiv preprint arXiv:1903.03467, 2019.
[Vinod2010]
Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), pp. 807–814, 2010.
[Ramesh2016]
Abstractive text summarization using sequence-to-sequence rnns and beyond. arXiv preprint arXiv:1602.06023, 2016.
[Matthew2018]
Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.
[Carol1979]
Constraints on language mixing: intrasentential code-switching and borrowing in Spanish/English. Language, pp. 291–318, 1979.
[Shana1980]
Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics, 18(7-8):581–618, 1980.
[Ofir2016]
Using the output embedding to improve language models. arXiv preprint arXiv:1608.05859, 2016.
[Alec2018]
Improving language understanding by generative pre-training. URL https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf, 2018.
[Alec2019]
Language models are unsupervised multitask learners. URL https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf, 2019.
[Nazneen2019]
Explain yourself! leveraging language models for commonsense reasoning. arXiv preprint arXiv:1906.02361, 2019.
[Pranav2016]
Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250, 2016.
[Alexander2015]
A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685, 2015.
[Evan2008]
The new york times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12):e26752, 2008.
[Thomas2019]
Answers unite! unsupervised metrics for reinforced summarization models. arXiv preprint arXiv:1909.01610, 2019.
[Abigail2017]
Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), volume 1, pp. 1073–1083, 2017.
[Rico2015]
Neural machine translation of rare words with subword units. arXiv preprint arXiv:1508.07909, 2015.
[Noam2018]
Adafactor: Adaptive learning rates with sublinear memory cost. arXiv preprint arXiv:1804.04235, 2018.
[Jack008]
Developing a framework for responsible innovation. Research Policy, 42(9):1568–1580, November 2013. doi: 10.1016/j.respol.2013.05.008.
[Ilya2014]
Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pp. 3104–3112, 2014.
[Trieu2018]
A simple method for commonsense reasoning. arXiv preprint arXiv:1806.02847, 2018.
[Adam2016]
Newsqa: A machine comprehension dataset. arXiv preprint arXiv:1611.09830, 2016.
[Lav6008]
Pretrained AI models: Performativity, mobility, and change, September 2019. arXiv:1909.03290 [cs.CY].
[Curran2018]
Glue: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.
[Sean2019]
Neural text generation with unlikelihood training. arXiv preprint arXiv:1908.04319, 2019.
[Yonghui2016]
Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
[Stratos2019]
Sumqe: a bert-based summary quality estimation model. arXiv preprint arXiv:1909.00578, 2019.
[Zhilin2018]
Hotpotqa: A dataset for diverse, explainable multi-hop question answering. arXiv preprint arXiv:1809.09600, 2018.
[Rowan 2019]
Defending against neural fake news. arXiv preprint arXiv:1905.12616, 2019.
[Fangxiaoyu2022]
Language-agnostic BERT Sentence Embedding - LaBSE
- BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding-based transfer learning.
- This paper explores BERT-based cross-lingual sentence embeddings.
- It combines the best methods for learning monolingual and cross-lingual representations, including masked language modeling (MLM) and translation language modeling (TLM).
- Introducing a pre-trained multilingual language model dramatically reduces the amount of parallel training data required to achieve good performance.
- The resulting model achieves high bi-text retrieval accuracy over 112 languages.
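To make "bi-text retrieval accuracy" concrete, here is a minimal sketch of how such a score can be computed once sentence embeddings are available. It is not the LaBSE model or its evaluation code: the function names are hypothetical, and the small hand-written vectors stand in for real embeddings, under the assumption that row i of the source list is the translation of row i of the target list.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def bitext_retrieve(src_emb, tgt_emb):
    """For each source embedding, return the index of the most similar target."""
    return [
        max(range(len(tgt_emb)), key=lambda j: cosine(s, tgt_emb[j]))
        for s in src_emb
    ]


def retrieval_accuracy(src_emb, tgt_emb):
    """Fraction of source sentences whose nearest target is the aligned one."""
    pred = bitext_retrieve(src_emb, tgt_emb)
    hits = sum(1 for i, j in enumerate(pred) if i == j)
    return hits / len(pred)


# Toy stand-ins for sentence embeddings (assumed aligned: src[i] <-> tgt[i]).
src = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
tgt = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.2, 1.1]]

print(bitext_retrieve(src, tgt))
print(retrieval_accuracy(src, tgt))
```

With real embeddings the only change is where `src` and `tgt` come from; the nearest-neighbour search and the accuracy count stay the same.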
NLP Papers Available on my Google Drive
You can download these papers from link
- A brief introduction to boosting.pdf
- A Closer Look at Fermentors and Bioreactors.pdf
- A Comprehensive Survey on Graph Neural Networks.pdf
- A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection.pdf
- A dataset for detecting irony in Hindi-english code-mixed social media text.pdf
- A Framework for Document Specific Error Detection and Corrections in Indic OCR.pdf
- A lexicon-based approach for hate speech detection.pdf
- A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (.pdf
- A novel automatic satire and irony detection using ensembled feature selection and data mining.pdf
- A Pragmatic Analysis Of Humor In Modern Family.pdf
- A Selective Overview of Deep Learning.pdf
- A Sentiment Analyzer for Hindi Using Hindi Senti Lexicon.pdf
- A Survey of Code-switched Speech and Language Processing.pdf
- A Survey of the State of Explainable AI for Natural Language Processing.pdf
- A Survey on Explainable Artificial Intelligence (XAI) Toward Medical XAI.pdf
- A TENGRAM method based part-of-speech tagging of multi-category words in Hindi language.pdf
- A transformer-based approach to irony and sarcasm detection.pdf
- A2Text-net A novel deep neural network for sarcasm detection.pdf
- Adaptive glove and fasttext model for Hindi word embeddings.pdf
- AI and Ethics - Operationalising Responsible AI-PAPER.pdf
- AI4Bharat-IndicNLP Corpus Monolingual Corpora and Word Embeddings for Indic Languages.pdf
- ALBERT A Lite BERT for Self-supervised Learning of Language Representations.pdf
- an Analysis of Current Trends for Sanskrit As a Computer Programming Language.pdf
- An empirical, quantitative analysis of the differences between sarcasm and Irony.pdf
- An Image is Worth 16x16 Words Transformers for Image Recognition at Scale.pdf
- Analyzing_The_Expressive_Power_Of_Graph.pdf
- AnnCorra Annotating Corpora Guidelines For POS And Chunk Annotation For Indian Languages.pdf
- Approaches to Cross-Domain Sentiment Analysis A Systematic Literature Review.pdf
- Attention is all you need.pdf
- Automatic sarcasm detection A survey.pdf
- Automatic satire detection Are you having a laugh.pdf
- Bag of tricks for efficient text classification.pdf
- Baselines and bigrams Simple, good sentiment and topic classification.pdf
- BERT Explained - A list of Frequently Asked Questions.pdf
- BERT Pre-training of deep bidirectional transformers for language understanding.pdf
- BHAAV- A Text Corpus for Emotion Analysis from Hindi Stories.pdf
- Carer Contextualized affect representations for emotion recognition.pdf
- CASCADE Contextual Sarcasm Detection in Online Discussion Forums.pdf
- Challenges in Deploying Machine Learning a Survey of Case Studies.pdf
- Clinical artificial intelligence quality improvement towards continual monitoring and updating of AI algorithms in healthcare.pdf
- CLUE based load balancing in replicated web server.pdf
- Clues for detecting irony in user-generated contents Oh…!! it_s so easy -).pdf
- Code Mixing A Challenge for Language Identification in the Language of Social Media.pdf
- Context-based Sarcasm Detection in Hindi Tweets.pdf
- Contextualized sarcasm detection on twitter.pdf
- Convolutional MKL Based Multimodal Emotion Recognition and Sentiment Analysis.pdf
- Data governance A conceptual framework, structured review, and research agenda.pdf
- Deep and Dense Sarcasm Detection.pdf
- Deep learning based unsupervised POS tagging for Sanskrit.pdf
- Detailed human avatars from monocular video.pdf
- Detecting Sarcasm is Extremely Easy -).pdf
- DIALOGPT Large-Scale Generative Pre-training for Conversational Response Generation.pdf
- DistilBERT, a distilled version of BERT smaller, faster, cheaper and lighter.pdf
- DRIFT Deep Reinforcement Learning for Functional Software Testing.pdf
- Drop A reading comprehension benchmark requiring discrete reasoning over paragraphs.pdf
- Dynamic routing between capsules.pdf
- Effect of speech coding on speaker identification.pdf
- Efficient estimation of word representations in vector space(2).pdf
- ELECTRA Pre-training Text Encoders as Discriminators Rather Than Generators.pdf
- Embedding Words as Distributions with a Bayesian Skip-gram Model.pdf
- Enriching Word Vectors with Subword Information.pdf
- Experience Grounds Language.pdf
- Exploiting emojis for sarcasm detection.pdf
- Exploiting Similarities among Languages for Machine Translation.pdf
- Exploring the fine-grained analysis and automatic detection of irony on Twitter(2).pdf
- Exploring the fine-grained analysis and automatic detection of irony on Twitter.pdf
- Exploring the impact of pragmatic phenomena on irony detection in tweets A multilingual corpus study.pdf
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.pdf
- Extensions to HMM-based Statistical Word Alignment Models.pdf
- Fairness_In_Machine_Learning_A_Survey.pdf
- Fake news detection of Indian and United States election data using machine learning algorithm.pdf
- Fake News Detection on Social Media.pdf
- FakeNewsNet A Data Repository with News Content, Social Context, and Spatiotemporal Information for Studying Fake News on Social.pdf
- Faster R-CNN Towards Real-Time Object Detection with Region Proposal Networks.pdf
- FastText.zip Compressing text classification models.pdf
- Figurative messages and affect in Twitter Differences between #irony, #sarcasm and #not.pdf
- Forecasting COVID-19 Confirmed Cases in Major Indian Cities and Their Connectedness with Mobility and Weather-related Parameters.pdf
- From English To Foreign Languages Transferring Pre-trained Language Models.pdf
- FROM Pre-trained Word Embeddings TO Pre-trained Language Models - Focus on BERT.pdf
- Going deeper with convolutions.pdf
- Graph Machine Learning NeurIPS 2020 Papers.pdf
- Grouped Convolutional Neural Networks for Multivariate Time Series.pdf
- Grouped Functional Time Series Forecasting An Application to Age-Specific Mortality Rates.pdf
- Handbook of approximation algorithms and metaheuristics.pdf
- Harnessing context incongruity for sarcasm detection.pdf
- Harnessing Online News for Sarcasm Detection in Hindi Tweets.pdf
- Hidden Markov Models.pdf
- Hidden technical debt in machine learning systems.pdf
- Hotpotqa A dataset for diverse, explainable multi-hop question answering.pdf
- How multilingual is multilingual BERT.pdf
- How to avoid machine learning pitfalls a guide for academic researchers.pdf
- How to read a paper.pdf
- HuggingFace_s Transformers State-of-the-art Natural Language Processing.pdf
- Identifying machine learning techniques for classification of target advertising.pdf
- Identifying sarcasm in Twitter A closer look.pdf
- Improving Language Understanding by Generative Pre-Training.pdf
- Improving the learnability of classifiers for Sanskrit OCR corrections.pdf
- Indic sentiReview Natural language processing based sentiment analysis on major indian languages.pdf
- Interactive-and-Visual-Prompt-Engineering-for-adhoc-Task-Adaptation-LLM.pdf
- Investigations in computational sarcasm.pdf
- Irony detection in twitter The role of affective content.pdf
- Irony, Sarcasm and Parody in the American Sitcom Modern Family.pdf
- iSarcasm A Dataset of Intended Sarcasm.pdf
- K-means with Three different Distance Metrics.pdf
- Knowledge Representation in Sanskrit and Artificial Intelligence.pdf
- Learning Graph Search Heuristics.pdf
- Learning latent causal graphs via mixture oracles.pdf
- LearningSys_2015_paper_32.pdf
- Lexicon-Based Methods for Sentiment Analysis.pdf
- Lexicon-Based Sentiment Analysis in the Social Web.pdf
- LightGBM A highly efficient gradient boosting decision tree.pdf
- Linguistic Inquiry and Word Count LIWC2015.pdf
- Machine Learning in Automated Text Categorization.pdf
- Machine Learning within a Graph Database A Case Study on Link Prediction for Scholarly Data.pdf
- Machine Translation Approaches and Survey for Indian Languages.pdf
- Machine Translation of Bi-lingual Hindi-English (Hinglish) Text.pdf
- Merlion A Machine Learning Library for Time Series.pdf
- Mining of Massive Datasets.pdf
- MLP-Mixer An all-MLP Architecture for Vision.pdf
- Multi-modal sarcasm detection in Twitter with hierarchical fusion model.pdf
- Multi-rule based ensemble feature selection model for sarcasm type detection in Twitter.pdf
- Multimodal markers of irony and sarcasm.pdf
- N Atural L Anguage I Nference Over.pdf
- Natural Language Processing - A Panian Perspective.pdf
- Natural language processing based features for sarcasm detection An investigation using bilingual social media texts.pdf
- NeuralProphet Explainable Forecasting at Scale.pdf
- On State-of-the-art of POS Tagger, Sandhi Splitter, Alankaar Finder and Samaas Finder for IndoAryan and Dravidian Languages.pdf
- Opinion mining and sentiment analysis.pdf
- Opinion-Based Entity Ranking (Author_s Draft).pdf
- Part-of-speech tagging from 97_ to 100_ Is it time for some linguistics.pdf
- PAVE Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models.pdf
- Real-time Sentiment Analysis of Hindi Tweets.pdf
- Reasoning with sarcasm by reading in-between.pdf
- Recent trends in deep learning based natural language processing Review Article.pdf
- RECEPTIVE FIELDS OF SINGLE NEURONES IN THE CAT _ S STRIATE CORTEX.pdf
- Recognition of consonant-vowel (CV) units under background noise using combined temporal and spectral preprocessing.pdf
- Representing social media users for sarcasm detection.pdf
- Retrospective Reader for Machine Reading Comprehension.pdf
- RoBERTa A Robustly Optimized BERT Pretraining Approach.pdf
- Robotics , AI , and.pdf
- ROC graphs Notes and practical considerations for researchers.pdf
- Sanskrit sandhi splitting using Seq2(Seq)22.pdf
- Sanskrit word segmentation using character-level recurrent and convolutional neural networks.pdf
- Sarc-M Sarcasm Detection in Typo-graphic Memes.pdf
- Sarcasm as contrast between a positive sentiment and negative situation.pdf
- Sarcasm Detection in Hindi sentences using Support Vector machine.pdf
- Sarcasm detection in tweets.pdf
- Sarcasm detection on twitterA behavioral modeling approach.pdf
- Sarcastic sentiment detection in tweets streamed in real time a big data approach.pdf
- Scalable linear algebra on a relational database system.pdf
- Scaling Large Production Clusters with Partitioned Synchronization This paper is included in the Proceedings of the.pdf
- Semantics-Aware BERT for Language Understanding.pdf
- Semi-supervised recognition of sarcastic sentences in twitter and Amazon.pdf
- SentencePiece A simple and language independent subword tokenizer and detokenizer for neural text processing.pdf
- Sentiment Analysis for Hindi Language.pdf
- Sentiment Analysis in a Resource Scarce LanguageHindi.pdf
- Sentiment Analysis In Hindi.pdf
- Sentiment Analysis in Indian languages o Definition.pdf
- Sentiment Analysis of Hindi Review based on Negation and Discourse Relation.pdf
- Sentiment classification using machine learning techniques with syntax features.pdf
- Skillful writing of an awful research paper.pdf
- Social media and fake news in the 2016 election.pdf
- Sound classification using convolutional neural network and tensor deep stacking network.pdf
- Sparse, contextually informed models for irony detection Exploiting user communities, entities and sentiment.pdf
- SQuad 100,000 questions for machine comprehension of text.pdf
- ST4_Method_Random_Forest.pdf
- Statistical Methods in Natural Language Processing.pdf
- StructBERT Incorporating Language Structures into Pre-training for Deep Language Understanding.pdf
- Structural S tudies on S mall A myloid O ligomers RT-6.pdf
- Superintelligence.pdf
- Systematic literature review of sentiment analysis on Twitter using soft computing techniques.pdf
- Text categorization with support vector machines Learning with many relevant features.pdf
- Text normalization of code mix and sentiment analysis.pdf
- The Differential Role of Ridicule in Sarcasm and Irony The Differential Role of Ridicule in Sarcasm and Irony.pdf
- The highest form of intelligence Sarcasm increases creativity for both expressers and recipients.pdf
- The Modern Mathematics of Deep Learning *.pdf
- The Paninian approach to natural language processing.pdf
- The perfect solution for detecting sarcasm in tweets #not.pdf
- Thumbs Up or Thumbs Down Semantic Orientation Applied to Unsupervised Classification of Reviews.pdf
- THU_NGN at SemEval-2018 Task 3 Tweet Irony Detection with Densely connected LSTM and Multi-task Learning.pdf
- TnT - A Statistical Part-of-Speech Tagger.pdf
- To BLOB or Not To BLOB Large Object Storage in a Database or a Filesystem To BLOB or Not To BLOB Large Object Storage in a Dat.pdf
- Towards Demystifying Serverless Machine Learning Training.pdf
- Towards multimodal sarcasm detection (an obviously perfect paper).pdf
- Towards sub-word level compositions for sentiment analysis of Hindi-English code mixed text.pdf
- Triple-View Feature Learning for Medical Image Segmentation.pdf
- Twitter as a corpus for sentiment analysis and opinion mining.pdf
- Two improved continuous bag-of-word models.pdf
- Understanding Diffusion Models A Unified Perspective Introduction Generative Models.pdf
- Universal Sentence Encoder.pdf
- Unsupervised Irony Detection A Probabilistic Model with Word Embeddings.pdf
- UR-Funny A multimodal language dataset for understanding humor.pdf
- Use of Sanskrit for natural language processing.pdf
- Using TF-IDF to Determine Word Relevance in Document Queries.pdf
- Using Word Embeddings for Query Translation for Hindi to English Cross Language Information Retrieval.pdf
- Very deep convolutional networks for large-scale image recognition.pdf
- We are IntechOpen , the world ‘ s leading publisher of Open Access books Built by scientists , for scientists TOP 1 _.pdf
- When BERT Plays the Lottery, All Tickets Are Winning.pdf
- XGBoost A scalable tree boosting system.pdf
- XLNet Generalized Autoregressive Pretraining for Language Understanding.pdf
AI Papers Available on my Google Drive
You can download these papers from the linked Google Drive folder.
- A Comprehensive Survey on Graph Neural Networks-PAPER.pdf
- A machine learning approach to predicting psychosis-PAPER.pdf
- A Selective Overview of Deep Learning-PAPER.pdf
- A Short introduction to boosting-PAPER.pdf
- A Survey of the State of Explainable AI for NLP-PAPER.pdf
- A Survey on Explainable AI (XAI) towards Medical XAI-PAPER.pdf
- AI and Ethics - Operationalising Responsible AI-PAPER.pdf
- Analyzing The Expressive Power of Graph Neural Network in a Spectral Perspective-PAPER.pdf
- Attention-Mechanism-Transformers-BERT-and-GPT-PAPER.pdf
- Can GPT-4 Perform Neural Architecture Search-PAPER.pdf
- Challenges in Deploying Machine Learning-PAPER.pdf
- Clinical AI quality improvement-PAPER.pdf
- Cramming-Training-a-Language-Model-On-A-Single-GPU-in-one-Day-PAPER.pdf
- DataGovernance-A conceptual framework, structured review-PAPER.pdf
- Detailed human avatar-PAPER.pdf
- DRIFT_26_CameraReadySubmission_NeurIPS_DRL-PAPER.pdf
- Dynamic Routing Between Capsules-PAPER.pdf
- Fairness in Machine Learning A Survey-PAPER.pdf
- Forecasting COVID-19 Confirmed Cases-PAPER.pdf
- Generalization Beyond Overfitting On Small Datasets-PAPER.pdf
- GPTrillion-Paper.pdf
- GPTs-are-GPTs-An-Early-Look-at-the-Labor-Market-Impact-Potential-of-Large-Language-Models-PAPER.pdf
- GraphMachine Learning NeurIPS 2020-PAPER.pdf
- Grouped Convolutional Neural Networks for Multivariate Time Series -PAPER.pdf
- Grouped functional time series forecasting An application to age-specific mortality rates-PAPER.pdf
- Hidden Technical Debt in Machine Learning Systems-PAPER.pdf
- Hidden technical debt in machine learning systems.pdf
- How to avoid machine learning pitfalls-PAPER.pdf
- How to Read a Paper-ARTC.pdf
- Identifying machine learning techniques for classification of target advertising-PAPER.pdf
- Introducing-GPTrillion-PAPER.pdf
- Large Object Storage in a Database or a Filesystem-PAPER.pdf
- Learning Graph Heuristic Search-PAPER.pdf
- Learning latent causal graphs via mixture oracles-PAPER.pdf
- LightGBM A Highly Efficient Gradient Boosting-PAPER.pdf
- Machine Learning within a Graph Database- A Case Study on Link Prediction for Scholarly Data-PAPER.pdf
- Merlion- A Machine Learning Library for Time Series-PAPER.pdf
- Model Evaluation, Model Selection, and Algorithm Selection-PAPER.pdf
- NeuralProphet-Explainable Forecasting at Scale-PAPER.pdf
- PAVE-Lazy-MDP based Ensemble to Improve Recall of Product Attribute Extraction Models-PAPER.pdf
- Precise Zero-Shot Dense Retrieval without Relevance Labels-PAPER.pdf
- Randomforest-PAPER.pdf
- Receptive Fields of Single Neurones in the Cats Striate Cortex-PAPER.pdf
- Robotics, AI, and Humanity Science-PAPERS.pdf
- Scalable Linear Algebra on a Relational Database System-PAPER.pdf
- Scaling Large Production Clusters-PAPER.pdf
- Skillful writing of an awful research paper-GUIDE.pdf
- The Modern Mathematics of Deep Learning-PAPER.pdf
- Towards Demystifying Serverless Machine Learning Training-PAPER.pdf
- Triple-View Feature Learning for Medical Image Segmentation-PAPER.pdf
- Understanding Diffusion Models- A Unified Perspective-PAPER.pdf
- VeML-An-End-to-End-Machine-Learning-Lifecycle-for-Large-Scale-and-High-Dimensional-Data-PAPER.pdf
- Very Deep Convolutional Networks for Large Scale Image Recognition-PAPER.pdf
- XGBoost A Scalable Tree Boosting System-PAPER.pdf
- XGBoost Reliable Large-scale Tree Boosting System-PAPER.pdf
Recent Papers
- Detecting Signs of Disease from External Images of the Eye, 24 March 2022
- Deep-learning models for the detection and incidence prediction of chronic kidney disease and type 2 diabetes from retinal fundus images, 15 June 2021
- Detection of anaemia from retinal fundus images via deep learning, 23 December 2019
- Assessing Cardiovascular Risk Factors with Computer Vision, 19 February 2018
Author
Dr Hari Thapliyaal
dasarpai.com
linkedin.com/in/harithapliyal