A Code-First Intro to Natural Language Processing

Jun 24, 2021 | Data Science

Welcome to your journey into the fascinating world of Natural Language Processing (NLP)! This course was originally taught in the University of San Francisco’s Master of Science in Data Science program in summer 2019. It uses Python and Jupyter Notebooks, with libraries such as sklearn, nltk, pytorch, and fastai. Dive in and explore cutting-edge techniques while keeping it manageable and fun!

What You Will Learn

This course covers a wide range of topics essential for mastering NLP:

  • What is NLP?
    • A changing field
    • Resources
    • Tools
    • Python libraries
    • Example applications
    • Ethics issues
  • Topic Modeling with NMF and SVD
    • Stop words, stemming, lemmatization
    • Term-document matrix
    • Term Frequency-Inverse Document Frequency (TF-IDF)
    • Singular Value Decomposition (SVD)
    • Non-negative Matrix Factorization (NMF)
    • Truncated SVD, Randomized SVD
  • Sentiment classification with Naive Bayes, Logistic regression, and ngrams
    • Sparse matrix storage
    • Counters
    • the fastai library
    • Naive Bayes
    • Logistic regression
    • Ngrams
    • Logistic regression with Naive Bayes features, with trigrams
  • Regex (and re-visiting tokenization)
  • Language modeling and sentiment classification with deep learning
    • Language model
    • Transfer learning
    • Sentiment classification
  • Translation with RNNs
    • Review Embeddings
    • BLEU metric
    • Teacher Forcing
    • Bidirectional
    • Attention
  • Translation with the Transformer architecture
    • Transformer Model
    • Multi-head attention
    • Masking
    • Label smoothing
  • Bias and ethics in NLP
    • Bias in word embeddings
    • Types of bias
    • Attention economy
    • Drowning in fraudulent/fake info

Why is this Course Taught in a Unique Order?

This course adopts a top-down teaching method, which is quite different from the traditional bottom-up approach. Most courses start with the minute details and only later show how they fit together; our goal is to keep your motivation high and give you a sense of the big picture from the get-go. Think of baseball: instead of requiring kids to memorize all the rules first, we let them play! They learn progressively, and the framework becomes clearer over time.

Don’t be worried if some concepts seem confusing initially. We’re going to start using some “black boxes,” and later we’ll unravel the details! The key here is to focus on what things DO, rather than getting bogged down by technicalities right away.

Getting Started

Ready to dive in? Check out the resources:

Troubleshooting Tips

As you navigate through this course, you may encounter a few hurdles. Here are some common troubleshooting ideas:

  • If you’re having problems with Python libraries, ensure they’re correctly installed by running the following, replacing library_name with the name of the package:
    pip install library_name
  • For issues related to Jupyter Notebooks, try restarting your kernel or checking the notebook path to ensure it’s accessible.
  • If your code is resulting in errors, read the traceback carefully; it often contains hints about what went wrong.
  • Connect with others in your learning community – collaboration can often provide new insights.
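
One common source of install trouble is that the name you import is not always the name you install: sklearn is installed as scikit-learn, and PyTorch is imported as torch. Here is a minimal sketch of how you might check which of the course libraries Python can find. The names find_missing and COURSE_LIBRARIES are hypothetical helpers introduced for this example, and the install names are the usual ones on PyPI, so treat them as assumptions and check each project's install guide for your platform.

```python
from importlib.util import find_spec

# Import name -> name usually passed to `pip install`.
# This mapping is an assumption for illustration, not an official list.
COURSE_LIBRARIES = {
    "sklearn": "scikit-learn",
    "nltk": "nltk",
    "torch": "torch",
    "fastai": "fastai",
}


def find_missing(libraries):
    """Return {import_name: pip_name} for modules Python cannot locate."""
    return {
        import_name: pip_name
        for import_name, pip_name in libraries.items()
        if find_spec(import_name) is None  # None means the module was not found
    }


if __name__ == "__main__":
    missing = find_missing(COURSE_LIBRARIES)
    if missing:
        for import_name, pip_name in missing.items():
            print(f"{import_name} not found; try: pip install {pip_name}")
    else:
        print("All course libraries were found.")
```

Run it with the same Python environment that your Jupyter kernel uses. A library installed in a different environment will still show up as missing, which is itself a useful hint.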

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Final Note

So, are you ready to take your first steps in NLP? Here’s to learning!
