Term Frequency-Inverse Document Frequency (TF-IDF) is a statistical measure used to evaluate the importance of a word in a document relative to a collection of documents, known as a corpus. It combines two components: Term Frequency (TF), which measures how often a word appears in a document, and Inverse Document Frequency (IDF), which measures how rare a word is across the documents of the corpus.
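As an illustration, one common way to write the two components is sketched below. Many weighting variants exist, so this is one formulation rather than the only definition. The symbols are assumptions introduced here: f_{t,d} is the raw count of term t in document d, D is the corpus, and N is the number of documents in D.

```latex
\mathrm{tf}(t, d) = \frac{f_{t,d}}{\sum_{t'} f_{t',d}}, \qquad
\mathrm{idf}(t, D) = \log \frac{N}{\lvert \{ d \in D : t \in d \} \rvert}, \qquad
\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t, D)
```

Under this variant, a term that occurs in every document has idf = log(1) = 0, so its TF-IDF score is zero no matter how often it appears.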
A word's TF-IDF score rises the more often the word appears in a given document and falls the more documents in the corpus contain it. Very common words such as "the" therefore score low, while words concentrated in a few documents score high. This helps identify the words most relevant to a particular document, making TF-IDF useful in applications like information retrieval and text mining.
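The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes one common variant (TF as the term count divided by document length, IDF as the natural log of the number of documents over the number of documents containing the term), and the function name `tf_idf` and the toy corpus are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Score every term in every document.

    corpus: list of documents, each a list of tokens.
    Returns one {term: score} dict per document.
    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    n_docs = len(corpus)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Toy corpus (hypothetical data), tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cat chased the dog".split(),
]
scores = tf_idf(docs)

# Print the terms of the first document, highest score first.
for term, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:>4}  {score:.4f}")
```

In this toy corpus, "the" occurs in all three documents, so its IDF and therefore its score is zero even though it is the most frequent word in the first document. "mat" occurs only in the first document and so receives that document's highest score. Production libraries typically add smoothing and normalization, so their numbers will differ from this sketch.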