The benefits of using the internet are undeniable. Even today, when people around the world are questioning social media’s ability to poison discourse, we can say that the internet is the defining invention of our time. In seconds, you can communicate with anyone, anywhere in the world, who has internet access. You might think this would make the internet a microcosm of the world, with every demographic represented in proportion. If you actually look at languages on the internet, however, they tell a different story.
The internet has an English problem. Namely, the English language completely dominates the content that’s put on the world wide web. Why is that so? Are there simply more English speakers than anyone else online? We looked into the linguistic anomaly.
The Most-Used Languages On The Internet
There are two measures of language use on the internet, and they tell two different stories.
If you are looking at how many people use the internet, the top 10 languages online reflect the population of the world at least somewhat accurately.
|Most-Used Languages In The World|Percentage Of World|Most-Used Languages Online By User|Percentage Of Internet Users|
|---|---|---|---|
|1. English|15.0%|1. English|25.4%|
|2. Chinese|15.0%|2. Chinese|19.3%|
|3. Hindi|7.2%|3. Spanish|8.1%|
|4. Spanish|6.9%|4. Arabic|5.3%|
|5. French|3.8%|5. Portuguese|4.1%|
|6. Arabic|3.7%|6. Indonesian/Malaysian|4.1%|
|7. Russian|3.6%|7. French|3.2%|
|8. Bengali|3.6%|8. Japanese|2.9%|
|9. Portuguese|3.2%|9. Russian|2.6%|
|10. Indonesian|2.6%|10. German|2.2%|
While English-language users are indeed overrepresented, these numbers are not entirely surprising given that the internet is still at a relatively early stage of its existence. They are also liable to swing wildly over the coming decades. Between 2000 and 2018, the number of English-speaking users online grew by almost 650 percent. That may sound like a lot, but it pales in comparison to the 2,391 percent growth among Chinese users or the 8,616 percent among Arabic users.
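To make the arithmetic behind "overrepresented" and those growth figures concrete, here is a minimal Python sketch. It uses only numbers quoted above and assumes the right-hand percentages in the table are shares of internet users. The names `representation_ratio` and `growth_multiplier` are made up for illustration, and English growth is rounded to 650 percent because the text says "almost." Indonesian is left out because the two lists define it differently (Indonesian versus Indonesian/Malaysian).

```python
# Shares copied from the table above, in percent. The online column is
# assumed to be each language's share of internet users.
SHARES = {
    # language: (share of world speakers, share of internet users)
    "English": (15.0, 25.4),
    "Chinese": (15.0, 19.3),
    "Spanish": (6.9, 8.1),
    "Arabic": (3.7, 5.3),
    "Portuguese": (3.2, 4.1),
    "French": (3.8, 3.2),
    "Russian": (3.6, 2.6),
}

# Growth in users between 2000 and 2018, in percent, as quoted in the text.
# The English figure is "almost 650", so 650 is an approximation.
GROWTH_PERCENT = {"English": 650, "Chinese": 2391, "Arabic": 8616}


def representation_ratio(world_share: float, online_share: float) -> float:
    """Online share divided by world share; above 1 means overrepresented."""
    return online_share / world_share


def growth_multiplier(percent_growth: float) -> float:
    """Convert percent growth to a multiplier (650 percent growth is 7.5x)."""
    return 1 + percent_growth / 100


ratios = {lang: representation_ratio(w, o) for lang, (w, o) in SHARES.items()}

# Most overrepresented languages first.
for lang, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang:<12}{ratio:.2f}x its share of world speakers")

for lang, pct in GROWTH_PERCENT.items():
    print(f"{lang:<12}{growth_multiplier(pct):.2f}x as many users as in 2000")
```

On these figures, English users make up roughly 1.7 times their share of the world's speakers, while French and Russian come in below 1. Growth of 8,616 percent means Arabic-speaking users multiplied about 87-fold, compared with 7.5-fold for English.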
If you look at W3Techs’ statistics for how much content in each language is online, however, there’s a different trend.
|Most-Used Language Online|Percentage Of World|
Assuming these numbers are accurate, they’re a tad confusing. Here, Chinese — which is the second-largest language both online and in the world — doesn’t even make the top 10 (falling right behind Polish). And while you might expect these numbers to be slowly shifting, there hasn’t been much change over the past decade. This brings us to our next question.
Why Is So Much Of The Internet In English?
It’s not immediately obvious why there’s such a gap between the languages users speak and the content available online. It does make sense that English would have started out far ahead, since the internet’s early history is heavily rooted in the United States. It’s estimated that in the 1990s, 80 percent of content online was in English. The share of English content has fallen sharply since then, but it’s still holding strong, making up over half of everything on the internet. The lasting dominance of English may reveal a more structural bias toward the English language online.
Almost all of the biggest companies on the internet are English-first. Google, Amazon and Facebook are all based in the United States, so English is always their first priority. This isn’t to say that internet companies haven’t made strides toward including the world’s languages. Still, at each company there are people who have to decide which languages are worth investing resources in.
It’s possible that the trend of languages on the internet reflects a larger fact about languages in the world. English is the lingua franca for a huge number of countries. It has only about 360 million native speakers, but more people study it than any other language. To access the global marketplace, it’s practically a necessity to have some grasp of English.
Does It Matter How Much Of Each Language Is Online?
There’s a famous saying that history is written by the winners, and you could add that it is probably written in English. A Guardian article on the digital language divide showed that English is the language most commonly used to describe countries in Africa, Asia and Eastern Europe. While that’s only one aspect of the massive entity that is the internet, it shows how English users are defining the world online.
Having content written in a variety of languages allows for more voices to be heard online. Google can implement tools to translate web pages instantly, but that isn’t the same as having a multiplicity of perspectives.
How is it possible to fix this? When a language like Hindi — with over 180 million speakers of its standard variety in the world — represents less than 0.1 percent of the content online, it can seem like we’re stuck with this linguistic inequality. But that’s not necessarily true.
A study on the languages of the internet predicts that English is on its way out as the online lingua franca. The study says that the increasing number of Arabic, Hindi and Chinese speakers will eventually edge out the English majority. On the other hand, the study is from six years ago and the predictions don’t seem to be borne out by recent trends. Still, perhaps it’s just a slow process that will eventually lead to more language representation.
Some groups don’t have the patience to sit back and see whether representation will change naturally. In a plan for the future of the French language, French President Emmanuel Macron specifically called out the need for more French usage online. Very active multilingual users of Wikipedia are working to create more content across Wikipedia’s many languages. While it would take a very large cultural shift to change these percentages, it’s certainly possible these are the first steps in a bigger movement.
It’s hard to know whether the internet will end up being a good or bad thing for the world’s linguistic diversity. On the one hand, there are projects like Wikitongues that are trying to save languages from extinction by recording them. On the other, there are estimates that fewer than 5 percent of all languages will ever see regular use online. That 5 percent will cover a huge portion of the world’s people, but not all of them. The future of language, whether we all end up speaking a single language or the world is brought back from a mass language extinction, could very well rely on the ability of the internet to connect not just the lucky few, but the whole world.