31 December 2008

Christmas & New Year in Belgium: Dutch vs. French vs. English

For New Year's Eve, I analyzed the Google search volume for Christmas and New Year-related terms in Belgium's first language (Dutch), its second language (French) and today's international lingua franca (English). This meant Kerst/Kerstfeest/Kerstdag/Kerstman vs. Noël (Christmas), Nieuwjaar vs. Nouvelle Année/Nouvel An (New Year) and, for English, Christmas/Santa Claus/New Year:



For Christmas, French won clearly, while for the New Year Dutch narrowly beat French. English was of course behind, though not insignificant when taking both holidays together: Belgium is indeed becoming more and more international in its population and orientation. Why the two holidays showed opposite patterns for the two primary languages was unclear. Maybe the search volume just wasn't big enough. Anyway, let's have a look at the rankings for the different (groups of) terms across the 11 provinces, treating Brussels as a province for convenience's sake.


Yellow indicates a Dutch term or Dutch-speaking province while turquoise marks French terms and French-speaking provinces. The capital district of Brussels is bilingual. As expected, the Dutch terms came first in a Dutch-speaking province while the French ones did in a French-speaking province. The English terms were searched most in a Dutch-speaking province (Limburg) and Brussels. On average though, Antwerp won out, helped by its larger population; Walloon Brabant came in last due to its low population. In the bottom section of the table, I made a quick attempt to rework the ranking numbers taking population into consideration. Interestingly, nothing changed for the individual terms, but the average now had Luxembourg as the winner and East Flanders as the loser.
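A population adjustment like the one in the bottom section of the table can be sketched in a few lines of Python. This is only one plausible way to do it, not necessarily how the table was built: the province names, scores and populations below are made-up placeholders, and the per-100,000 scaling is an assumption.

```python
# Hypothetical numbers for illustration only; they are NOT the post's data.
search_score = {"Province A": 90, "Province B": 60, "Province C": 30}
population = {"Province A": 1_800_000, "Province B": 600_000, "Province C": 250_000}


def rank(scores):
    """Return the names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


def per_capita(scores, inhabitants, per=100_000):
    """Scale each score to a rate per `per` inhabitants (assumed adjustment)."""
    return {name: scores[name] / inhabitants[name] * per for name in scores}


raw_ranking = rank(search_score)
adjusted_ranking = rank(per_capita(search_score, population))

print("raw:     ", raw_ranking)
print("adjusted:", adjusted_ranking)
```

With these placeholder values the most populous province leads the raw ranking but drops to last once population is taken into account, which is the kind of reshuffle the averages showed.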

27 December 2008

Christmas vs. Halloween vs. Easter vs. Thanksgiving vs. Valentine's Day

My yummy Christmas dinner is almost digested ;-) Time to check on the old Yuletide's status in comparison with the other big—esp. in the US—holidays.



As expected, each holiday had its own seasonal peak and Christmas had the biggest one. Second was Halloween: everyone likes to dress up... Easter peaked in March-April (its date moves from year to year). Note that Canada's Thanksgiving Day in October was dwarfed by the US one in November. Valentine's Day was last. Let's have a look at the country rankings for each holiday:











The results were more or less as expected. A few observations: Halloween with its Celtic roots was also important in Ireland (no. 3) and even in Belgium (no. 6). As a Belgian, I actually have no clue why that would be; of course, I haven't lived there in decades. Why Easter registered so heavily in Australia (no. 1) and New Zealand (no. 3) was also kind of puzzling. That Thanksgiving registered at all outside the US and Canada might provide further proof of the supersized influence of esp. the US in the world. South African Googlers were the second-most interested in Valentine's.


Photo Copyright j_wijnands, CC Attribution-Noncommercial-Share Alike 2.0 Generic

23 December 2008

Top 5 non-social blog-memetrackers: Technorati vs. Blogdex vs. Techmeme vs. Daypop vs. Memeorandum

Today, a look at web sites that aggregate/rank blog memes, sometimes called memetrackers. However, I excluded the social media sites where the content is determined by people's votes, e.g., Digg. Some no longer exist but were influential in the past: Blogdex and Daypop. The current leading non-social memetrackers are memeorandum, Techmeme and Technorati.



Guess what: Technorati obliterated all the others. However, it is also a very popular blog search site which probably explains the distortion. Note how it saw a steep climb between 2004 and 2007, the latter being its zenith. Anyway, let's leave out Technorati and get more meaningful results:



First was Blogdex, now defunct. It displayed the opposite trend of Technorati: a steep decline. Second was Techmeme which only started in 2006 but quickly overpowered all others. Next came Daypop which experienced a continuous decline into oblivion like Blogdex, just not as precipitous. Memeorandum has declined somewhat from its late 2005 peak but still finished second. Let's have a look at the country rankings for each term, starting with Technorati:



The top 3 were Singapore, Malaysia and the Philippines. Then came Indonesia, and the US was only fifth. Interesting! How about Blogdex?



It was almost exclusively Googled by people in the US, Canada and the UK. Maybe blogs were just not common elsewhere in 2004-2005? Not sure about that. Next, Techmeme:



The US came first, followed by India and Canada. This memetracker has an IT tech focus which potentially explains India's rank. It was remarkable that Southeast Asia, so prominent with Technorati, didn't even figure in the top 10. What were the rankings for Daypop?



I saw the exact same pattern as for Blogdex (1. US, 2. Canada, 3. UK) with one exception: France joined in at no. 4. Again though, no other countries were really interested in Daypop. Finally: memeorandum.



This one had the same pattern as Blogdex: 1. US, 2. Canada, 3. UK, and almost nothing behind those.

21 December 2008

Guantanamo vs. Habeas Corpus vs. Rendition vs. Waterboarding vs. Wiretapping

Still-vice-president Cheney has been making some news of late with his blunt defense of the excesses of the Bush administration. I faced off a few terms, considering only US Googlers:



The prison camp in Guantanamo came out on top overall. The highest peaks however were noted for waterboarding. Wiretapping didn't register much. How were the state rankings?



The District of Columbia, the center of the federal government, came in first every time. Virginia, where the CIA among others is headquartered, made two appearances (Guantanamo and habeas corpus). The other DC neighbor, Maryland, was second for habeas corpus and wiretapping. Most surprising was the presence of South Dakota (rendition) and Utah (waterboarding).

17 December 2008

Highest death toll from recent conflicts: Congo, Sudan, Afghanistan, Iraq, Burundi and Somalia

Using the estimates of the Political Economy Research Institute (Univ. of Mass.), I established the countries with the top 5 deadliest recent or ongoing conflicts in the world; in order of numbers of dead, they are Congo (Kinshasa), Sudan, Afghanistan, Iraq and Burundi. How did they fare in a Google Insights for Search face-off?



Basically, Iraq overpowered all others, mostly due to the May 2004 peak. By the way, the top 3 countries interested in Iraq were Lebanon, the US and Uganda. To get more meaningful results, I then took Iraq out and added the next deadliest country: Somalia.



Afghanistan was first, then came both Congo and Sudan. The latter displayed the only peak of the graph, in February 2005. Burundi prompted little Googling.

Again using PERI's estimates, I gathered the no. of dead, the dead as % of population as well as the relative Google-popularity scores (I extrapolated Iraq):
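A note on the extrapolation: the relative scores come from two separate face-offs, so Iraq's score has to be expressed on the scale of the graph that left it out. A minimal sketch of one way to do that, assuming a term that appears in both graphs serves as the anchor, is shown here. The function name and all numbers are hypothetical and are not the values used in the table.

```python
def rescale_via_anchor(value_in_a, anchor_in_a, anchor_in_b):
    """Express a score from comparison A on the scale of comparison B.

    The anchor is a term that appears in both comparisons, so the ratio
    value/anchor measured in A can be carried over to B.
    """
    if anchor_in_a == 0:
        raise ValueError("the anchor score in comparison A must be non-zero")
    return value_in_a * anchor_in_b / anchor_in_a


# Hypothetical scores for illustration only; NOT the post's data.
iraq_in_first_graph = 80          # Iraq in the graph that includes it
afghanistan_in_first_graph = 10   # anchor term in that same graph
afghanistan_in_second_graph = 40  # anchor term in the graph without Iraq

iraq_on_second_scale = rescale_via_anchor(
    iraq_in_first_graph, afghanistan_in_first_graph, afghanistan_in_second_graph
)
print(iraq_on_second_scale)
```

Because the second graph's scale tops out at its own highest term, an extrapolated score can land far above 100, which is exactly what makes such a term an outlier in the plots that follow.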



First, I plotted no. of dead against dead as % of population, using the Google-popularity score to determine the size of the bubbles:



Note that the polynomial trend line is a pretty good fit. For most countries, a higher no. of dead meant a higher dead as % of population. Congo, however, was the obvious exception. Next, I bubble-plotted no. of dead against relative Google-popularity (bubbles determined by dead as % of population):



The trend line didn't fit well at all this time; but this was probably due to Iraq being an outlier because of extreme interest, comparatively speaking, in Anglo-Saxon countries that were also the most actively involved in the country. So maybe we should've excluded Iraq?



The correlation was now good, very similar to the first bubble graph. Congo was again a bit of an exception. Finally, I plotted relative Google-popularity against dead as % of population with the bubble size reflecting the no. of dead:



With the Iraq outlier left out, we obtained the best-fitting trend line of all the bubble graphs. In other words, Google-popularity and dead as % of population were highly correlated.
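For readers who want to reproduce this kind of check, here is a minimal sketch of how a single outlier can wreck the fit of a polynomial trend line. It assumes a second-degree polynomial fitted by ordinary least squares and uses R² as the measure of fit; the degree actually used in the graphs is not stated, and the data points are invented for illustration.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    # Sums of x^k (k = 0..4) and of x^k * y (k = 0..2).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix for the unknowns in the order (c, b, a).
    m = [
        [s[0], s[1], s[2], t[0]],
        [s[1], s[2], s[3], t[1]],
        [s[2], s[3], s[4], t[2]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("not enough distinct x values for a quadratic fit")
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def r_squared(xs, ys, coeffs):
    """Share of the variance in ys explained by the fitted curve."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented points for illustration only; NOT the post's data.
xs, ys = [0, 1, 2, 3, 4], [1, 2, 5, 10, 17]
outlier_x, outlier_y = 2, 60

r2_without = r_squared(xs, ys, fit_quadratic(xs, ys))
xs_all, ys_all = xs + [outlier_x], ys + [outlier_y]
r2_with = r_squared(xs_all, ys_all, fit_quadratic(xs_all, ys_all))

print("R^2 without outlier:", round(r2_without, 3))
print("R^2 with outlier:   ", round(r2_with, 3))
```

Whether dropping a point is justified is a separate question from whether it improves the fit; with only a handful of countries, any single point carries a lot of weight.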

14 December 2008

Bankruptcy vs. Inflation vs. Recession vs. 1930s vs. Layoffs

Today, I compared the Google-Popularity of some terms that come up a lot in this time of recession:



No. 1, bankruptcy, was huge in 2004-2005, then fell somewhat but stayed on top and was rising again this year. The second most Googled word was inflation. Recession was generally low but saw a huge, brief peak in January 2008. Toward the end of the year it had climbed higher again, even surpassing inflation. Note that 1930s actually had a higher search volume in 2004-early 2006 than in 2008. It always displayed a seasonal pattern with a low during the summer.

Let me give you a quick rundown of the country top 3 for each term:



The US appeared in four out of five top 3s. The exception, inflation, had South Africa as no. 1, followed by India and Singapore. India also figured frequently: three times. 1930s was more of an "Anglo-Saxon" phenomenon.

13 December 2008

Marriage vs. Divorce

I faced off marriage/wedding and divorce.



Marriage was much more Google-popular than divorce, no surprise there. Note the seasonal pattern: the summer was the peak season every year while December was the low point. How about we see which countries were most interested in wedded bliss?



The top 3 were South Africa, the US and Trinidad and Tobago. The latter was interesting: did a lot of Americans and Europeans perhaps get married in this sunny, tropical holiday destination? The antithesis then:



In the US, South Africa and Canada, more people used Google to inform themselves about divorce than in other countries. Trinidad and Tobago made it in the top 10 again: no. 8.

05 December 2008

Fidel Castro vs. Osama Bin Laden vs. Hugo Chavez vs. Mahmoud Ahmadinejad vs. Kim Jong-il

Today, a face-off of George Bush's "bogeymen" that are likely to outlast him in a position of power. First, I analyzed the US:



He may be the oldest and in bad health, but Castro was still tops together with Bin Laden. Chavez had a lot less web search volume and Kim was of little interest—or hardly anyone knows how to spell his name maybe? The peaks were pretty obviously linked to events such as Castro's illness and resignation, etc. I wondered if the graph would be much different when considering worldwide searches:



The only relevant difference lay in Bin Laden's reduction in Google-popularity. Let's look at the top 5 of countries that are interested in each "bogeyman":



Bin Laden's 5th country was Honduras: odd. Norway and Portugal were also not what you'd expect for Ahmadinejad and what's with Finland and Kim?

30 November 2008

United Kingdom, France, Germany, Spain, Italy, England, Scotland, Wales, Northern Ireland, (Great) Britain in the US

I investigated a little bit further along the lines of my "United Kingdom, France, Germany, Spain and Italy in English, French, German, Spanish and Italian" posts 1 and 2. This time, I faced off the same five European countries but only among US Googlers:



France was no. 1, followed by Italy. It was odd, however, that the UK, with which Americans share a language and close cultural and political ties, came in dead last. As mentioned before, France's recurring summer peak was due to the Tour de France. Let's have a look at the state rankings for interest in the five countries, beginning with the UK:



Among this smaller volume of searches, DC, New York and Illinois stood out. Mississippi came last. How about France?



The top 3 consisted of DC, New York and California. South Dakota came in dead last. Just like with the UK, the political and trade ties probably pushed DC to the top. Louisiana, which you would expect to be interested in France, didn't make the top 10. Next, Germany:



Michigan led for this country, followed by Wisconsin and DC. Could this have something to do with a larger share of German-Americans in the first two states? Louisiana thought the least about Germany. Then came Spain:



New York led, New Jersey was a close second and third was Florida. The pattern seemed to correlate with the distribution of Hispanics in the US. Oklahoma ended up last. Finally, Italy:



For Italy, it was New York, Connecticut and then New Jersey. South Dakota didn't care much for the Mediterranean "boot." I was still puzzled about the low score for the UK though: maybe "the United Kingdom" was a tad too bureaucratic a term, not what people type into the Google search box? I tested this theory by comparing the constituent parts of the UK:



England blew the competition, as well as the UK, out of the water: its Google-popularity was more than 13 times that of the UK. Northern Ireland barely registered. A caveat, however, became apparent when looking at the state rankings: the New England states made up the top 6; in other words, a lot of the hits may have been for the New England region rather than England.


One more test: I faced off Britain, Great Britain (also often used terms) and the UK:



Britain was tops, Great Britain was last. The term Great Britain—from which the automobile country code "GB" is derived—covers England, Wales and Scotland. The full name of the UK is actually the United Kingdom of Great Britain and Northern Ireland. Why is Britain "great"? Originally, "greater" distinguished it from "lesser" Britain, i.e., Brittany in France. Remember William the Conqueror and the French connections? Later, it was used to cement the union of England and Scotland. Britain in its proper use is then the geographical entity of the island containing Scotland, Wales and England. Of course, common usage conflates all this ;-)

27 November 2008

United Kingdom, France, Germany, Spain and Italy in English, French, German, Spanish and Italian (Part 2)

In part 1, I faced off the search volumes of five versions of the names of the five big European countries. Today, I compared each time four European countries plus the US as far as the Google-popularity of a fifth European country was concerned. I started with the UK:



The UK was most popular in Spain, twice as much as in the US and Italy. France and Germany closed the rankings. My guess would be that this was caused by the larger contingent of British tourists in Spain. Note that I put the US in place of the search-term country. After all, a country name being most popular in its own territory didn't need proving but a comparison with the search patterns in the US could provide a nice study in contrast. Next, France:



France was most popular by far in the UK—a sentiment that was not returned as we saw above... Americans were the least interested in France. All searchers though were most interested in the summer: Tour de France time! Just like with the UK, popularity is generally in decline. Then came Germany:



All Europeans were about equally interested in this country; the US lagged a bit behind. Italy narrowly won this one. The 2006 soccer World Cup was again clearly visible, esp. in soccer-crazy Spain and Italy. So, how about Spain?



The interest in this country was staggered perfectly, from the top score in the UK (returning the favor; see above under the UK) to the least interested Italians. Americans came out one step above Italy. I wonder whether the UK trend line showed their vacation season, beginning at Christmas and lasting through the summer. The French and to a lesser extent the Germans also vacation in Spain but mostly in the summer only. Strangely, Americans were less into Spain during the summer... Finally, Italy:



One more time, the Europeans more or less agreed while the US lagged somewhat behind. The summer 2006 peak was caused by the Azzurri (Italian national soccer team) winning the World Cup in Germany.

25 November 2008

United Kingdom, France, Germany, Spain and Italy in English, French, German, Spanish and Italian (Part 1)

Today I faced off the five most populous countries of the European Union but not against each other. Instead, I investigated the Google-popularity of their names in the five main languages. Note that I kept the order of the languages the same for each country: English, French, German, Spanish, Italian. Let's start with the United Kingdom:



As expected, the English version of the name was by far the most common but was also diminishing in volume at a rapid and sustained pace. Spanish came in second; Italian, French and German barely registered in comparison. Remember that the numbers on these Google Insights for Search graphs are relative, not absolute: they do not reflect real volume numbers but only how the different search terms differ in search volume. Next, France:



We had two complications here: the English and French versions of the name were identical and so were the Spanish and Italian ones. In both cases, I kept the two languages on the graph even though only one of the trend lines was visible, that line being the sum total of the two languages' search volume. That way, the language color codes didn't change from graph to graph. Anyway, France was king and it didn't matter whether the French or the English version was what drove this line because the other three languages were so small in volume that they would have been behind even if it were possible to split up the English and French searches. The Tour de France cycling extravaganza produced the annual summer peak (see also my August 9, 2008 post). How about Germany?



This time, I had solid evidence for the native tongue normally being on top: Deutschland bested Germany. Both were decreasing through time. The one time English won out was in June 2006 when even Italian almost caught up with German. The reason: the soccer World Cup took place in Germany that year, mostly in June. The smaller uptick later on was Euro 2008, the European soccer championship. Do we need any more proof that soccer is king in Germany? :-) This in contrast to France where cycling is king (see above). Next, Spain:



The pattern continued: the native language bested English and both declined through time. Italians had little affinity with Spain. Finally, I analyzed Italy:



One more time we had an "identical twin": Italia is both the Italian and the Spanish name for the country. Of course then, Italian + Spanish bested English. Just like with Spain, there was at least some search volume from the other languages too: I wonder if this might be due to both countries being favorite tourist destinations?
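A short aside on the relative numbers mentioned under the United Kingdom graph. As far as I understand the tool, the values in a comparison are scaled so that the highest point across all compared terms is 100. The sketch below shows that idea with invented volumes; the function name, the rounding and the figures are my assumptions, not Google's actual procedure.

```python
def to_relative_scale(series_by_term):
    """Rescale all series so the single highest value across all terms is 100."""
    peak = max(max(values) for values in series_by_term.values())
    if peak == 0:
        raise ValueError("at least one value must be greater than zero")
    return {
        term: [round(100 * value / peak) for value in values]
        for term, values in series_by_term.items()
    }


# Invented monthly volumes for illustration only; real volumes are not published.
volumes = {
    "term in language 1": [200, 400, 300],
    "term in language 2": [20, 40, 12],
}
relative = to_relative_scale(volumes)

# Ten times the traffic yields exactly the same graph.
ten_times = {term: [v * 10 for v in values] for term, values in volumes.items()}
same_shape = to_relative_scale(ten_times) == relative

print(relative)
print(same_shape)
```

This is why the graphs can show which language version dominates and how a term trends over time, but cannot say how many searches were actually made.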

Update: Don't forget to read part 2.