NLP (Natural Language Processing)

  • Lexical Processing
  • Semantic Analysis
  • Syntactic Analysis
  • Neural Network (NN)
  • Recurrent NN
  • Chatbot Project

Why natural language is hard for computers to parse

May is fun but June bores me.

Do "May" and "June" refer to months or to people?
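The ambiguity can be made concrete with a minimal sketch. It assumes two tiny, hypothetical word lists (`MONTHS` and `GIVEN_NAMES`, invented for illustration) and shows that a plain dictionary lookup returns both readings for each word, so context is needed to decide.

```python
# Hypothetical mini-lexicons, invented for this example; real systems
# rely on much larger resources and on the surrounding context.
MONTHS = {"may", "june", "april", "august"}
GIVEN_NAMES = {"may", "june", "april", "august", "bob"}


def candidate_readings(sentence):
    """Map each capitalized token to every lexicon category it could belong to."""
    readings = {}
    for raw in sentence.split():
        token = raw.strip(".,!?")
        if not token[:1].isupper():
            continue  # only consider capitalized words
        key = token.lower()
        labels = []
        if key in MONTHS:
            labels.append("month")
        if key in GIVEN_NAMES:
            labels.append("person")
        if labels:
            readings[token] = labels
    return readings


print(candidate_readings("May is fun but June bores me."))
# {'May': ['month', 'person'], 'June': ['month', 'person']}
```

Both tokens keep two candidate labels, which is exactly the problem: the lookup alone cannot pick one.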

https://www.toptal.com/machine-learning/google-nlp-tutorial

Natural Language Processing with TensorFlow 2 - Beginner's Course

https://www.freecodecamp.org/news/google-bert-nlp-machine-learning-tutorial

spaCy

Industrial-Strength Natural Language Processing

https://spacy.io/usage/models

spaCy · Industrial-strength Natural Language Processing in Python

GitHub - explosion/spaCy: 💫 Industrial-strength Natural Language Processing (NLP) in Python ⭐ 33k

Gensim (Topic Modeling for Humans)

Gensim is a Python library for topic modeling, document indexing and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.

https://github.com/parulsethi/gensim

https://radimrehurek.com/gensim

https://www.toptal.com/python/topic-modeling-python

Topic Modeling

Topic modeling is the problem of taking a list of human-language documents and finding out which of them cover similar topics.
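A rough sketch of the "which documents cover similar topics" idea, using only the standard library. This is not LDA or any Gensim algorithm; it only compares word-count vectors with cosine similarity. The stopword list, the helper names and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "on"}


def bag_of_words(text):
    """Count lowercase alphabetic words, ignoring stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def most_similar(docs):
    """For each document index, return the index of the most similar other document."""
    vectors = [bag_of_words(d) for d in docs]
    result = {}
    for i, vi in enumerate(vectors):
        scores = [(cosine(vi, vj), j) for j, vj in enumerate(vectors) if j != i]
        result[i] = max(scores)[1]
    return result


docs = [
    "The striker scored a goal in the football match",
    "The football team won the match with a late goal",
    "The central bank raised interest rates",
    "Interest rates and inflation worry the bank",
]
print(most_similar(docs))  # {0: 1, 1: 0, 2: 3, 3: 2}
```

The two football documents pair up, as do the two banking documents. A real topic model goes further and infers the topics themselves rather than only pairing documents.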

Text Similarity Methods

  • Normalized, metric, similarity and distance
  • (Normalized) similarity and distance
  • Metric distances
  • Shingles (n-gram) based similarity and distance
  • Levenshtein
  • Normalized Levenshtein
  • Weighted Levenshtein
  • Damerau-Levenshtein
  • Optimal String Alignment
  • Jaro-Winkler
  • Longest Common Subsequence
  • Metric Longest Common Subsequence
  • N-Gram
  • Shingle (n-gram) based algorithms
  • Q-Gram
  • Cosine similarity
  • Jaccard index
  • Sorensen-Dice coefficient
  • Overlap coefficient (i.e., Szymkiewicz-Simpson)

https://github.com/luozhouyang/python-string-similarity#python-string-similarity
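To give a feel for two of the methods listed above, here is a minimal standard-library sketch of Levenshtein distance and a shingle-based Jaccard index. It does not use the linked library. The function names are chosen for this example, and dividing by the longer string's length is one common normalization, assumed here.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a, b):
    """Distance divided by the longer length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def shingles(text, k=2):
    """Set of all character k-grams in text."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a, b, k=2):
    """Size of the shingle intersection divided by the size of the union."""
    sa, sb = shingles(a, k), shingles(b, k)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


print(levenshtein("kitten", "sitting"))  # 3
print(jaccard("night", "nacht"))         # 1 shared bigram ("ht") out of 7
```

Edit-distance measures compare strings character by character, while shingle-based measures compare sets of substrings, so the latter are less sensitive to where in the string a difference occurs.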

FlashText

FlashText replaces keywords in sentences or extracts keywords from sentences.

https://pypi.org/project/flashtext
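The two operations, extracting and replacing keywords, can be illustrated with a small standard-library sketch. `KeywordMatcher` is a hypothetical helper written for this note, not the FlashText API, and it uses a regular expression rather than whatever FlashText does internally.

```python
import re


class KeywordMatcher:
    """Hypothetical helper (not the FlashText API): maps keywords to canonical names."""

    def __init__(self, keyword_map):
        # keyword_map: {surface form: canonical name}; matching is case-insensitive.
        if not keyword_map:
            raise ValueError("keyword_map must not be empty")
        self._canonical = {k.lower(): v for k, v in keyword_map.items()}
        # Try longer keywords first so a longer phrase wins over its own prefix.
        alternatives = sorted(self._canonical, key=len, reverse=True)
        pattern = "|".join(re.escape(k) for k in alternatives)
        self._regex = re.compile(r"\b(?:" + pattern + r")\b", re.IGNORECASE)

    def extract(self, sentence):
        """Return the canonical name of every keyword found, in order of appearance."""
        return [self._canonical[m.group(0).lower()]
                for m in self._regex.finditer(sentence)]

    def replace(self, sentence):
        """Return the sentence with each keyword replaced by its canonical name."""
        return self._regex.sub(
            lambda m: self._canonical[m.group(0).lower()], sentence)


matcher = KeywordMatcher({"Big Apple": "New York", "Bay Area": "San Francisco"})
text = "I love Big Apple and Bay Area."
print(matcher.extract(text))  # ['New York', 'San Francisco']
print(matcher.replace(text))  # I love New York and San Francisco.
```

A regex alternation like this tends to slow down as the keyword list grows, which is the situation a dedicated keyword library is meant to handle.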

ML Kit Natural Language APIs

  • Language ID
  • On-device translation
  • Smart reply
  • Entity extraction

https://developers.google.com/ml-kit

Haystack

Haystack is the open-source Python framework by deepset for building custom apps with large language models (LLMs). It lets you quickly try out the latest models in natural language processing (NLP) while being flexible and easy to use. Its community of users and builders has helped shape Haystack into a complete framework for building production-ready NLP apps.

GitHub - deepset-ai/haystack ⭐ 25k

What is Haystack? | Haystack

Models

References

The Association for Computational Linguistics is the international organization that represents the field of NLP. The ACL website (http://www.aclweb.org) hosts many useful resources, including: information about international and regional conferences and workshops; the ACL Wiki with links to hundreds of useful resources; and the ACL Anthology, which contains most of the NLP research literature from the past 50+ years, fully indexed and freely downloadable.

https://www.freecodecamp.org/news/natural-language-processing-with-spacy-python-full-course

NLP - EXPLAINED! - YouTube

Convolution in NLP - YouTube