Natural Language Processing

From Ioannis Kourouklides

This page contains resources about Natural Language Processing, Text Mining, Speech Processing, Audio Signal Processing and Computational Linguistics.

Subfields and Concepts

  • Vector Space Model (VSM)
  • Latent Semantic Indexing
  • Latent Semantic Analysis
  • Latent Dirichlet Allocation (LDA)
  • Attention Mechanism
    • Transformer
    • Set Transformer
  • Speaker Recognition
  • Speaker Identification / Speaker Diarization
    • Speaker Segmentation
    • Speaker Clustering
  • Speech Synthesis / Text-to-Speech
  • Speech Recognition / Voice Recognition / Speech-to-Text / Transcription
    • Conversational Speech
    • Voice Dictation
    • Voice Commands
  • Audio Captioning / Subtitling
  • Automatic Lyrics Recognition
  • Topic Model
  • Text Preprocessing
    • Tokenization
    • Stemming
    • Lemmatisation
    • Word embeddings / Word vectors
      • Word2Vec Model
        • Continuous Skip-gram
        • Continuous Bag-of-Words (CBOW)
      • GloVe
      • FastText
    • Bag-of-Words (BoW) Model
    • N-grams
      • Unigrams
      • Bigrams
  • Term Frequency - Inverse Document Frequency (TF-IDF)
  • Sequence-to-Sequence (seq2seq) Model
  • Dynamic Memory Network (a specific architecture of Artificial Neural Networks)
  • Sequence Tagging
  • Natural Language Understanding (NLU)
  • Natural Language Generation (NLG)
  • Named-Entity Recognition (NER)
  • (Natural Language) Semantic Analysis
  • Sentiment Analysis
  • Emotion Recognition
  • Diacritization (e.g. in Hebrew or Arabic)
  • Chatbots
  • Question-Answering System
  • Machine Translation
  • Text summarization
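
Several of the text preprocessing concepts listed above (tokenization, n-grams, the Bag-of-Words model and TF-IDF) can be illustrated with a minimal sketch using only the Python standard library. The function names, the regex-based tokenizer and the particular TF-IDF variant (term count divided by document length, times log(N / df)) are assumptions chosen for illustration; libraries commonly use other tokenizers and smoothed weightings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Map each token to its raw count, ignoring word order."""
    return Counter(tokens)


def tf_idf(docs):
    """Compute one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    weights = []
    for toks in tokenized:
        counts = bag_of_words(toks)
        total = len(toks)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical two-document corpus, used only for illustration.
docs = ["The cat sat on the mat.", "The dog sat on the log."]
tokens = tokenize(docs[0])   # unigrams of the first document
bigrams = ngrams(tokens, 2)  # adjacent token pairs
weights = tf_idf(docs)
```

Under this variant, a term that occurs in every document (such as "the" here) gets weight zero, while terms unique to one document get a positive weight.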

Online Courses

Video Lectures

Lecture Notes

Books

Natural Language Processing

  • Arumugam, R., & Shanmugamani, R. (2018). Hands-On Natural Language Processing with Python. Packt Publishing.
  • Srinivasa-Desikan, B. (2018). Natural Language Processing and Computational Linguistics. Packt Publishing.
  • Goldberg, Y. (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers.
  • Silge, J., & Robinson, D. (2017). Text mining with R: A tidy approach. O'Reilly Media, Inc.
  • Sarkar, D. (2016). Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from Your Data. Apress.
  • Mihalcea, R., & Radev, D. (2011). Graph-based natural language processing and information retrieval. Cambridge University Press.
  • Bird, S., Klein, E., & Loper, E. (2009). Natural language processing with Python: analyzing text with the natural language toolkit. O'Reilly Media, Inc.
  • Zhai, C. (2008). Statistical language models for information retrieval. Morgan & Claypool Publishers.
  • Tiwary, U. S., & Siddiqui, T. (2008). Natural language processing and information retrieval. Oxford University Press.
  • Manning, C. D., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. MIT Press.

Speech Processing

  • Rabiner, L. R., & Schafer, R. W. (2011). Theory and applications of digital speech processing. Pearson.
  • Gold, B., Morgan, N., & Ellis, D. (2011). Speech and audio signal processing: processing and perception of speech and music. 2nd Ed. John Wiley & Sons.
  • Mitra, S. K., & Kuo, Y. (2010). Digital signal processing: a computer-based approach. 4th Ed. McGraw-Hill.
  • Spanias, A., Painter, T., & Atti, V. (2006). Audio signal processing and coding. John Wiley & Sons.
  • Huang, X., Acero, A., Hon, H. W., & Reddy, R. (2001). Spoken language processing: A guide to theory, algorithm, and system development. Prentice Hall.
  • Quatieri, T. F. (2001). Discrete-time speech signal processing: principles and practice. Pearson.
  • Holmes, J., & Holmes, W. (2001). Speech Synthesis and Recognition. 2nd Ed. CRC press.
  • Jelinek, F. (1998). Statistical methods for speech recognition. MIT Press.
  • Rabiner, L. R., & Schafer, R. W. (1978). Digital processing of speech signals. Prentice Hall.

Speech and Natural Language Processing (both)

  • Jurafsky, D., & Martin, J. H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall.

Scholarly articles

  • Park, D. S., Chan, W., Zhang, Y., Chiu, C. C., Zoph, B., Cubuk, E. D., & Le, Q. V. (2019). SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. arXiv preprint arXiv:1904.08779.
  • Das, A., Li, J., Zhao, R., & Gong, Y. (2018). Advancing connectionist temporal classification with attention modeling. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4769-4773).
  • Moriya, T., Ueno, S., Shinohara, Y., Delcroix, M., Yamaguchi, Y., & Aono, Y. (2018). Multi-task Learning with Augmentation Strategy for Acoustic-to-word Attention-based Encoder-decoder Speech Recognition. In Proceedings of Interspeech (pp. 2399-2403).
  • Zeyer, A., Irie, K., Schlüter, R., & Ney, H. (2018). Improved Training of End-to-end Attention Models for Speech Recognition. In Proceedings of Interspeech (pp. 7-11).
  • Toshniwal, S., Tang, H., Lu, L., & Livescu, K. (2017). Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition. In Proceedings of Interspeech (pp. 3532-3536).
  • Kim, S., Hori, T., & Watanabe, S. (2017). Joint CTC-attention based end-to-end speech recognition using multi-task learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4835-4839).
  • Wen, T. H., Gasic, M., Mrksic, N., Su, P. H., Vandyke, D., & Young, S. (2015). Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems. arXiv preprint arXiv:1508.01745.
  • Sethu, V., Epps, J., & Ambikairajah, E. (2015). Speech based emotion recognition. In Speech and Audio Processing for Coding, Enhancement and Recognition (pp. 197-228). Springer.
  • Blei, D. M. (2012). Probabilistic Topic Models. Communications of the ACM, 55(4), 77-84.

Tutorials

Software

Speech Recognition

Miscellaneous

See also

Other Resources