Natural Language Processing
Revision as of 14:19, 21 December 2020 by Kourouklides
This page contains resources about Natural Language Processing, Text Mining, Speech Processing, Audio Signal Processing and Computational Linguistics.
Subfields and Concepts
- Vector Space Model (VSM)
- Latent Semantic Indexing
- Latent Semantic Analysis
- Latent Dirichlet Allocation (LDA)
- Part-of-speech tagging
- Sequence-to-Sequence (seq2seq) Model
- Dynamic Memory Network (a specific architecture of Artificial Neural Networks)
- Attention Mechanism
- RNN with attention
- Transformer (i.e. self-attention & FNN)
- Set Transformer
- Speaker Recognition
- Speaker Verification
- Speaker Identification / Speaker Diarization
- Speaker Segmentation
- Speaker Clustering
- Speech Synthesis / Text-to-Speech
- Speech Recognition / Voice Recognition / Speech-to-Text / Transcription
- Conversational Speech
- Voice Dictation
- Voice Commands
- Audio Captioning / Subtitling
- Automatic Lyrics Recognition
- Topic Model
- Text Preprocessing
- Tokenization
- Stemming
- Lemmatisation
- Word embeddings / Feature vectors / Word representations
- Sparse feature vectors
- Word2Vec Model
- Continuous Skip-gram
- Continuous Bag-of-Words (CBOW)
- GloVe
- FastText
- Bag-of-Words (BoW) Model
- N-grams
- Unigrams
- Bigrams
- Term Frequency - Inverse Document Frequency (TF-IDF)
- Sequence Tagging
- Natural Language Understanding (NLU)
- Natural Language Generation (NLG)
- Named-Entity Recognition (NER)
- (Natural Language) Semantic Analysis
- Sentiment Analysis
- Emotion Recognition
- Diacritization (e.g. in Hebrew or Arabic)
- Dialogue System / Conversational Agents
- Task-Oriented Dialogue System / Goal-Oriented Conversational Agent (usually built for speech input and output)
- Pipeline Systems
- Natural language understanding (NLU)
- Dialogue state tracking
- Dialogue policy learning
- Natural language generation (NLG)
- End-to-End trainable Systems
- Non-Task-Oriented Dialogue System / Chatbot (in the strict sense) / Question-Answering (QA) System
- Rule-based QA
- ML-based QA / Corpus-based QA
- Retrieval-based models (using Utterance selection)
- Generative models
- Visual Question-Answering (VQA)
- Question Generation
- Machine Translation (MT)
- Text summarization
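Several of the text-representation concepts listed above (Tokenization, N-grams, Bag-of-Words and TF-IDF) can be illustrated in a few lines. The following is a minimal sketch in plain Python, not a reference implementation: the function names are hypothetical, the tokenizer is deliberately naive, and the TF-IDF variant shown (relative term frequency times log(N / document frequency), without smoothing) is only one of several common definitions. Libraries such as NLTK, gensim or scikit-learn use their own conventions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters/digits (naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples (n=1: unigrams, n=2: bigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Bag-of-Words: map each term to its count, ignoring word order."""
    return Counter(tokens)


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    Assumed definition: tf = count / document length, idf = log(N / df),
    where df is the number of documents containing the term.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


if __name__ == "__main__":
    corpus = [tokenize("The cat sat on the mat."), tokenize("The dog barked.")]
    print(ngrams(corpus[0], 2))      # bigrams of the first document
    print(bag_of_words(corpus[0]))   # term counts of the first document
    print(tf_idf(corpus))            # terms shared by all documents get weight 0
```

Under this definition a term that occurs in every document receives zero weight, which is the intended effect of the inverse document frequency factor.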
Online Courses
Video Lectures
- Natural Language Processing with Deep Learning by Chris Manning and Richard Socher
- Natural Language Processing by Pushpak Bhattacharyya - NPTEL
- Natural Language Processing by Dragomir Radev
- Text Mining and Analytics by ChengXiang ("Cheng") Zhai
- Deep Learning for Natural Language Processing: Applications of Deep Neural Networks to Machine Learning Tasks by Jon Krohn
- Natural Language Processing - Coursera
- Neural Networks for NLP by Graham Neubig (slides)
Lecture Notes
- Advanced Language Processing by Michael Collins and Regina Barzilay - MIT OpenCourseWare
- Automatic Speech Recognition by James Glass and Victor Zue - MIT OpenCourseWare
- Speech Recognition by Bhuvana Ramabhadran, Markus Nussbaum-Thom, Michael Picheny and Stanley F. Chen
- Natural Language Processing with NLTK
- CS224U: Natural Language Understanding by Bill MacCartney and Christopher Potts
Books
Natural Language Processing
- Arumugam, R., & Shanmugamani, R. (2018). Hands-On Natural Language Processing with Python. Packt Publishing.
- Srinivasa-Desikan, B. (2018). Natural Language Processing and Computational Linguistics. Packt Publishing.
- Goldberg, Y. (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers.
- Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach. O'Reilly Media, Inc.
- Sarkar, D. (2016). Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from Your Data. Apress.
- Mihalcea, R., & Radev, D. (2011). Graph-based natural language processing and information retrieval. Cambridge University Press.
- Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc.
- Zhai, C. (2008). Statistical language models for information retrieval. Morgan and Claypool Publishers.
- Tiwary, U. S., & Siddiqui, T. (2008). Natural language processing and information retrieval. Oxford University Press.
- Manning, C. D., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.
Speech Processing
- Rabiner, L. R., & Schafer, R. W. (2011). Theory and applications of digital speech processing. Pearson.
- Gold, B., Morgan, N., & Ellis, D. (2011). Speech and audio signal processing: processing and perception of speech and music. 2nd Ed. John Wiley & Sons.
- Mitra, S. K., & Kuo, Y. (2010). Digital signal processing: a computer-based approach. 4th Ed. McGraw-Hill.
- Spanias, A., Painter, T., & Atti, V. (2006). Audio signal processing and coding. John Wiley & Sons.
- Huang, X., Acero, A., Hon, H. W., & Reddy, R. (2001). Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall.
- Quatieri, T. F. (2001). Discrete-time speech signal processing: principles and practice. Pearson.
- Holmes, J., & Holmes, W. (2001). Speech Synthesis and Recognition. 2nd Ed. CRC press.
- Jelinek, F. (1998). Statistical methods for speech recognition. MIT Press.
- Rabiner, L. R., & Schafer, R. W. (1978). Digital processing of speech signals. Prentice Hall.
Speech and Natural Language Processing (both)
- Jurafsky, D., & Martin, J. H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.
Scholarly articles
- Park, D. S., Chan, W., Zhang, Y., Chiu, C. C., Zoph, B., Cubuk, E. D., & Le, Q. V. (2019). SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. arXiv preprint arXiv:1904.08779.
- Deriu, J., Rodrigo, A., Otegi, A., Echegoyen, G., Rosset, S., Agirre, E., & Cieliebak, M. (2019). Survey on Evaluation Methods for Dialogue Systems. arXiv preprint arXiv:1905.04071.
- Das, A., Li, J., Zhao, R., & Gong, Y. (2018). Advancing connectionist temporal classification with attention modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4769-4773).
- Moriya, T., Ueno, S., Shinohara, Y., Delcroix, M., Yamaguchi, Y., & Aono, Y. (2018). Multi-task Learning with Augmentation Strategy for Acoustic-to-word Attention-based Encoder-decoder Speech Recognition. In Proceedings of Interspeech (pp. 2399-2403).
- Zeyer, A., Irie, K., Schlüter, R., & Ney, H. (2018). Improved Training of End-to-end Attention Models for Speech Recognition. In Proceedings of Interspeech (pp. 7-11).
- Toshniwal, S., Tang, H., Lu, L., & Livescu, K. (2017). Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition. In Proceedings of Interspeech (pp. 3532-3536).
- Kim, S., Hori, T., & Watanabe, S. (2017). Joint CTC-attention based end-to-end speech recognition using multi-task learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4835-4839).
- Chen, H., Liu, X., Yin, D., & Tang, J. (2017). A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsletter, 19(2), 25-35.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008).
- Mishra, A., & Jain, S. K. (2016). A survey on question answering systems with classification. Journal of King Saud University-Computer and Information Sciences, 28(3), 345-361.
- Yin, J., Jiang, X., Lu, Z., Shang, L., Li, H., & Li, X. (2016). Neural generative question answering. In Proceedings of the 25th International Joint Conference on Artificial Intelligence (pp. 2972-2978).
- Luong, M. T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. In Empirical Methods in Natural Language Processing (pp. 1412–1421).
- Wen, T. H., Gasic, M., Mrksic, N., Su, P. H., Vandyke, D., & Young, S. (2015). Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems. arXiv preprint arXiv:1508.01745.
- Sethu, V., Epps, J., & Ambikairajah, E. (2015). Speech based emotion recognition. In Speech and Audio Processing for Coding, Enhancement and Recognition (pp. 197-228). Springer.
- Blei, D. M. (2012). Probabilistic Topic Models. Communications of the ACM, 55(4), 77-84.
Tutorials
Software
Speech Recognition
- OpenSeq2Seq - Python with TensorFlow
- DeepSpeech - Python with TensorFlow
- SpeechRecognition - Python library for performing speech recognition, with support for several engines and APIs, online and offline
- Kaldi - C++
- CMUSphinx - Open Source Speech Recognition Toolkit
- DeepSpeech - Python
- wav2letter - C++ (and previously in Lua)
- Web Speech API (Mozilla) - Speech Synthesis (Text-to-Speech) and Speech Recognition (Asynchronous Speech Recognition)
- W3C Web Speech API - JavaScript
- wit.ai - free API using NLP for text and voice
- Google Cloud Speech-to-Text API - paid API
- Microsoft Bing Speech-to-Text API - paid API
- IBM Watson Speech to Text - paid API
- AssemblyAI - freeware API
Text-to-Speech (TTS) Synthesis
- gTTS (Google Text-to-Speech) - Python library and CLI tool to interface with Google Translate's text-to-speech API
- tacotron - Python and TensorFlow TTS
- Tacotron-2 - Python and TensorFlow TTS
- tacotron2 - Python and PyTorch TTS
- multi-speaker-tacotron - Python and TensorFlow TTS
- nv-wavenet - Python and C++ TTS
- WaveGlow - Python and PyTorch TTS
- tensorflow-wavenet - Python and TensorFlow TTS
- wavenet_vocoder - Python and PyTorch TTS
- The Festival Speech Synthesis System
- CMU Flite
Miscellaneous
- OpenSeq2Seq - Python with TensorFlow
- AllenNLP - Python with PyTorch
- NLTK - Python
- gensim - Python
- spaCy - Python
- nlpbuddy - Python
- MALLET - Java
- magnitude - Python
- SPEAR: A Speaker Recognition Toolkit based on Bob - Python
- SIDEKIT - Python library for Speaker, Language Recognition and Diarization
- OpenNMT - Implemented in LuaTorch, PyTorch and TensorFlow
- HTK - C
- ffmpeg-python - Python
- SoX - command line utility
- pysox - Python
- wave - Python
- TextBlob - Python
- ParlAI
- Moses
- sockeye - Python
- T2T or Tensor2Tensor - Python with TensorFlow
- marian - C++
- SRILM - C++
- srilm-python - Python
- LibROSA - Python
- StarSpace - C++ and Python
See also
Other Resources
- Computational Linguistics - Google Scholar Metrics (Top Publications)
- SQuAD - The Stanford Question Answering Dataset
- Awesome-NLP - GitHub
- seq2seq_learn - GitHub
- Word Embeddings: A Natural Language Processing Crash Course
- Complete Guide to Topic Modeling - NLP for hackers
- Quick Tutorial: Topic Modelling with LDA
- Smart Compose: Using Neural Networks to Help Write Emails - blog post
- Wit-Speech-API-Wrapper - GitHub
- python-speech-recognition - GitHub
- A Practitioner's Guide to Natural Language Processing (Part I) — Processing & Understanding Text - blog post
- The Ultimate Guide To Speech Recognition With Python - blog post
- Practical seq2seq - blog post
- Chatbots with Seq2Seq - blog post
- Pre-Processing in Natural Language Machine Learning - blog post
- How to Prepare Text Data for Machine Learning with scikit-learn - blog post
- Word2Vec and FastText Word Embedding with Gensim - blog post
- Introduction to NLP - Part 1: Overview - blog post
- Digital signal processing through speech, hearing, and Python - presentation
- MATLAB Functionality for Digital Speech Processing - presentation
- attract-repel - GitHub
- speaker-recognition - GitHub
- Microsoft Speaker Recognition API: Python Sample - GitHub
- Speaker-Recognition - GitHub
- speaker-diarization - GitHub
- VBDiarization - GitHub
- pyAudioAnalysis - GitHub
- seq2seq - GitHub
- Automatic_Speech_Recognition - GitHub
- neuralmonkey - GitHub
- nlpnet - GitHub
- nematus - GitHub
- speech-processing - GitHub
- Deep Learning for Natural Language Processing: Tutorials with Jupyter Notebooks - blog post
- Word morphing - blog post
- Tutorial: Asynchronous Speech Recognition in Python - blog post
- Web Speech API Specification - W3C Community Group
- Speech recognition with wit.ai - blog post
- LibriSpeech ASR corpus - dataset
- NLP For Topic Modeling & Summarization Of Legal Documents - blog post
- Automatic Speech Recognition (YouTube) - video
- You Shall Not Speak - blog post
- Simple Audio Recognition (TensorFlow) - blog post
- open-speech-recording (GitHub) - code
- Introduction to NLP - blog post
- Deep Learning for Speech Recognition (YouTube) - video
- Web Speech API (GitHub) - code
- Web Speech API: Add Speech to your Website
- Voice Driven Web Apps: Introduction to the Web Speech API - blog post
- Launching the Speech Commands Dataset - blog post
- Improving End-to-End Models For Speech Recognition - blog post
- Sentiment Analysis on Reddit News Headlines with Python’s Natural Language Toolkit (NLTK) - blog post with code
- Predicting Reddit News Sentiment with Naive Bayes and Other Text Classifiers - blog post with code
- Attention in NLP - blog post
- Building a Question-Answering System from Scratch— Part 1 - blog post
- Han Xiao - blog
- text-top-model (GitHub) - code for benchmarking text classification algorithms
- Text Analytics Techniques - blog
- Sentiment-Analysis-Twitter (GitHub) - code
- machine-translation (GitHub) - code
- Datasets on Linguistics (Kaggle)
- NLP-progress - Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks
- Attention-Based-BiLSTM-relation-extraction (GitHub) - code
- bert (GitHub) - code
- dialogflow - chatbot (by Google)
- SCOOT: Speech Communication Online Training - ISCA’s guide to online training resources in Speech Communication
- uis-rnn (GitHub) - code
- nlptutorial (GitHub) - code
- transformer (GitHub) - code
- transformer-tensorflow (GitHub) - code
- Creating an open speech recognition dataset for (almost) any language - blog post
- How To Create Natural Language Semantic Search For Arbitrary Objects With Deep Learning - blog post
- How To Create Data Products That Are Magical Using Sequence-to-Sequence Models - blog post
- Self-Attention Mechanisms in Natural Language Processing - blog post
- nmt (GitHub) - code
- Frontiers in Natural Language Processing
- Frontiers in Natural Language Processing Expert Responses
- Building a Conversational Chatbot for Slack using Rasa and Python (Part 1, Part 2)
- CMU-MOSI - dataset
- MOUD - dataset
- nlp_tasks (GitHub) - NLP Tasks and References
- 3 silver bullets of word embeddings in NLP - blog post
- seq2seq-chatbot (GitHub) - code
- Evaluating Text Output in NLP: BLEU at your own risk - blog post
- Introduction to Transformers Architecture - blog post
- word-embeddings-for-nmt (GitHub) - code
- How Do Task-Oriented Dialogue Systems Work and What Benefits They Bring for Business - blog post
- Deep Learning for Chatbots, Part 1 – Introduction - blog post
- Better Language Models and Their Implications - blog post