How is stemming different from lemmatization?

Path to a High-Paying AI Jobs: Key Interview Questions and Expert Answers

Mark Kara
5 min read · Sep 8, 2024

This article is part of the series Path to a High-Paying AI Jobs: Key Interview Questions and Expert Answers. The series index article lists key questions that are frequently asked in high-paying AI job interviews, with links to the expert answer articles.

Stemming vs Lemmatization

Stemming and lemmatization are both text normalization techniques used in Natural Language Processing (NLP) to reduce words to their base or root form. The goal of these processes is to treat different variations of a word (such as plurals, tenses, and derivations) as the same word, allowing algorithms to process text more effectively by reducing vocabulary size and improving the performance of downstream tasks like search, classification, or machine translation.

However, while they serve similar purposes, stemming and lemmatization differ significantly in how they transform words into their base forms.

Stemming:

Definition: Stemming is a rule-based process of chopping off the ends of words to reduce them to their root form, often without regard to whether the result is a valid word. The resulting root word, known as the “stem,” may not always be a dictionary word.
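To make the "chopping off the ends of words" idea concrete, here is a minimal sketch of a suffix-stripping stemmer. It is a toy illustration, not the Porter algorithm or any library's implementation: the `toy_stem` function, the `SUFFIXES` rule list, and the three-character minimum are all assumptions chosen for this example.

```python
# Toy suffix-stripping stemmer (illustrative only, NOT the Porter algorithm).
# SUFFIXES is a hypothetical rule list; rules are tried in the order given.
SUFFIXES = ("ing", "ies", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when enough of the word remains to be a plausible stem.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for w in ["jumping", "jumped", "jumps", "studies", "running"]:
    print(w, "->", toy_stem(w))
# jumping -> jump
# jumped -> jump
# jumps -> jump
# studies -> stud
# running -> runn
```

The first three words collapse to the same stem, which is the vocabulary reduction described earlier. The last two show the trade-off: "stud" and "runn" are not dictionary words, because the rules only look at the characters at the end of the word.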
