What is Zipf’s Law?

Zipf’s law is closely associated with the principle of least effort, which Zipf offered as its explanation. In natural language texts, it has been observed that, approximately:

  • The second most used word appears half as often as the most used word.
  • The third most used word appears one-third the number of times the most used word appears, and so on.
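
The pattern in the two points above can be checked with a few lines of Python. This is a minimal sketch of the idealized 1/rank relationship only; the function name `ideal_zipf_counts` and the starting count of 1200 are illustrative assumptions, not values from any real corpus.

```python
def ideal_zipf_counts(top_count, n_ranks):
    """Expected counts if the word at rank r occurs top_count / r times.

    This is the idealized pattern; real texts only approximate it.
    """
    return [top_count / rank for rank in range(1, n_ranks + 1)]


# Hypothetical corpus in which the most used word occurs 1200 times.
counts = ideal_zipf_counts(1200, 4)
print(counts)  # [1200.0, 600.0, 400.0, 300.0]
```

The second value is half of the first and the third is one-third of the first, matching the observations listed above.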

Zipf proposed that this distribution arises because we tend to reuse the words we are most comfortable with: we try to communicate as efficiently as possible while putting in the least amount of effort.

F. Auerbach, a German physicist, had earlier observed the same phenomenon in city populations: the second most populous city had about half the population of the most populous city. In 1932, Zipf observed a similar distribution of word frequencies in natural language text (English). He proposed a law based on his findings, and it came to be known as Zipf’s law. The same kind of relationship has been observed in corporation sizes, personal incomes, and other quantities.

Zipf’s Law

Zipf’s law is an empirical relationship formulated by George Zipf in the 1930s. It describes the relationship between the frequency of words in a language corpus and their rank in a frequency-sorted list. In this article, we dive into the concept of Zipf’s law and its application in natural language processing.

Table of Contents

  • What is Zipf’s Law?
  • Mathematical Formulation
  • Example of Zipf’s Law
  • Python Implementation of Zipf’s Law
  • Applications
  • Deviation from Zipf’s Law

Mathematical Formulation

Zipf’s Law can be understood intuitively by considering that in any language, there are a few extremely common words (e.g., “the,” “of,” “and”) that are used very frequently, while the vast majority of words are used relatively infrequently. This distribution of word frequencies follows a power-law distribution, where the frequency of a word is proportional to its rank raised to a negative power....
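
As a hedged sketch of the power-law statement above, the relationship can be written as follows. The symbols are assumptions introduced here for illustration: f(r) is the frequency of the word at rank r, C is a corpus-dependent constant, and s is the exponent.

```latex
\[
  f(r) = C \, r^{-s},
  \qquad
  \log f(r) = \log C - s \log r
\]
```

With s = 1 this gives f(2)/f(1) = 1/2 and f(3)/f(1) = 1/3, the idealized case. The logarithmic form explains why a rank-frequency plot on log-log axes looks roughly like a straight line with slope -s.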

Example of Zipf’s Law

Two friends were met by a bear. One climbed a tree, abandoning the other. The other played dead, and the bear left him unharmed....

Python Implementation of Zipf’s Law

The code segment demonstrates Zipf’s law by plotting the frequency of words against their ranks in a given text passage. The resulting plot typically shows a curve indicating the inverse relationship between word frequency and rank, as predicted by Zipf’s law. Let’s discuss the code in detail:...
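
The rank-frequency step described above can be sketched as follows. This is not the article's own code, only a minimal illustration under stated assumptions: the helper name `rank_frequency` is hypothetical, tokenization is a simple lowercase regular expression, and the sample passage is far too short to show a clean Zipfian curve.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Short sample passage, used purely for illustration.
sample = (
    "Two friends were met by a bear. One climbed a tree, abandoning the other. "
    "The other played dead, and the bear left him unharmed."
)

table = rank_frequency(sample)
for rank, word, count in table[:5]:
    print(rank, word, count)
```

To visualize the relationship, the ranks and counts from `table` could be passed to a plotting library and drawn on log-log axes. A much larger text is needed before the inverse relationship becomes visible.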

Applications

Zipf’s Law has a wide range of applications across various fields. Some key applications include:...

Deviation from Zipf’s Law

Deviations from Zipf’s Law are common and can be attributed to various factors. Here are some key points regarding deviations from the law:...