Member-only story
Top Machine Learning NLP Tools for Python
Introduction
The goal of this article is to give a quick overview with examples of the tools available to you to perform Natural Language Processing (NLP) tasks using Python. In this article, I will focus only on text data and not on audio or video processing.
This article is an extract of my previous article where I give a general overview of what NLP is, I recommend checking this article first if you are not familiar with NLP. I assume you have previous knowledge of Python and Machine Learning.
What is NLP?
In a nutshell, NLP is a field of Machine Learning focused on extracting insights from natural language. Your goal is to make computers understand our own language.
Some practical examples of NLP are speech recognition, translation, sentiment analysis, topic modeling, lexical analysis, entity extraction and much more.
Using all these tools and algorithms you can extract structured data from natural language, data that can be processed by computers. Furthermore, the output of NLP tasks if often a machine learning algorithm that will use this raw data to make predictions.
By combining many algorithms together, you can extract useful data that can be used in a wide range of scenarios such as:
- Fraud Detection
- Risk Intelligence
- Email Classification
- Sentiment Analysis
NLP Tools
In my previous article, I talked about the importance of curating the raw data. You probably heard that 80% of the time spent in Machine Learning is data preparation: data cleansing, data wrangling, feature engineering, etc. This is specially true for NLP since our main goal is to transform text into numbers that a computer can use.
In a nutshell, we can take raw text data, load it, clean it, tokenize it and finally perform feature engineering to transform text to numbers so it can be used for entity extraction or classification. Now, we are going to review some of…