Wikileaks:Wikileaks Zeitgeist

The following Wikileaks Zeitgeist was generated on Sat Apr 7 14:37:37 EST 2007

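Normalization is performed by dividing the number of googlable "wikileaks" pages in the specified language or domain by the number of googlable "html" pages in that same language or domain. "html" was chosen because it is unlikely to have significant language or domain bias. The resulting ratios are then rescaled so that the smallest value in each table is 10, which is what the "Norm x 10" column shows.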
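
The two steps described above can be sketched in a few lines of Ruby. This is only an illustration of the arithmetic: the hit counts and the language codes 'aa' and 'bb' are invented, the helper names `ratio` and `rescale` are hypothetical, and the sketch rounds where the script at the end of this page truncates.

```ruby
# Hypothetical hit counts: [pages matching the term, pages matching "html"].
# The numbers are invented so that the arithmetic is easy to follow.
HITS = {
  'aa' => [512, 1024],
  'bb' => [64, 4096]
}

# Step 1: ratio of term pages to baseline pages for one language or domain.
def ratio(term_pages, baseline_pages)
  term_pages.to_f / baseline_pages
end

# Step 2: rescale so that the smallest ratio maps to 10, then round.
def rescale(ratios)
  smallest = ratios.values.min
  ratios.transform_values { |r| (r * 10.0 / smallest).round }
end

ratios = HITS.transform_values { |pair| ratio(*pair) }
scores = rescale(ratios)
scores.each { |code, score| puts "#{code}: #{score}" }
```

Here 'aa' has a ratio of 0.5 and 'bb' a ratio of 0.015625, so 'bb' becomes the reference value 10 and 'aa' scores 320.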

Based on these statistics, we are, per capita, one to two hundred times more interesting to Russians than the English-speaking average -- and what is with Slovenia?
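
The "one to two hundred times" figure can be checked against the "Norm x 10" values published in the tables on this page. The sketch below assumes that the English-speaking average is represented by the "en" language row or, for domains, by the .com row; that choice and the helper name `times_more` are assumptions made for illustration.

```ruby
# "Norm x 10" values copied from the language and domain tables on this page.
lang_norm   = { 'en' => 33, 'ru' => 3475, 'sl' => 7845 }
domain_norm = { 'com' => 960, 'ru' => 164141 }

# How many times larger one normalized score is than a reference score.
def times_more(score, reference)
  (score.to_f / reference).round
end

puts "Russian vs English pages:   #{times_more(lang_norm['ru'], lang_norm['en'])}x"
puts ".ru vs .com domains:        #{times_more(domain_norm['ru'], domain_norm['com'])}x"
puts "Slovenian vs English pages: #{times_more(lang_norm['sl'], lang_norm['en'])}x"
```

By language the Russian score is about 105 times the English one, by domain .ru is about 171 times .com, and Slovenian comes out at about 238 times English.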

"wikileaks" google pages by absolute language popularity

Lang Language Pages Norm x 10
en English 314000 33
ru Russian 74500 3475
sl Slovenian 33300 7845
es Spanish 18000 446
fr French 15300 267
hu Hungarian 9620 1640
ja Japanese 9390 226
nl Dutch 9360 348
pl Polish 3120 142
pt Portuguese 1450 80
de German 1180 16
zh-TW Chinese (Traditional) 1170 188
it Italian 991 37
iw Hebrew 760 156
bg Bulgarian 678 149
zh-CN Chinese (Simplified) 509 28
hr Croatian 302 64
no Norwegian 257 54
fi Finnish 231 49
ro Romanian 205 38
sv Swedish 185 32
ko Korean 141 16
da Danish 120 23
ar Arabic 50 10


Graph

Image:WL google pages by absolute language popularity.gif

"wikileaks" google pages by normalized language popularity

Lang Language Pages Norm x 10
sl Slovenian 33300 7845
ru Russian 74500 3475
hu Hungarian 9620 1640
es Spanish 18000 446
nl Dutch 9360 348
fr French 15300 267
ja Japanese 9390 226
zh-TW Chinese (Traditional) 1170 188
iw Hebrew 760 156
bg Bulgarian 678 149
pl Polish 3120 142
pt Portuguese 1450 80
hr Croatian 302 64
no Norwegian 257 54
fi Finnish 231 49
ro Romanian 205 38
it Italian 991 37
en English 314000 33
sv Swedish 185 32
zh-CN Chinese (Simplified) 509 28
da Danish 120 23
ko Korean 141 16
de German 1180 16
ar Arabic 50 10


Graph 1

Image:WL google pages by normalized language popularity.gif

Graph 2

Image:WL google pages by normalized language popularity excluded top 3.gif

"wikileaks" google pages by absolute domain popularity

TLD Description Pages Norm x 10
com commercial 62500 960
ru Russia 53100 164141
net network 32400 4583
nl Netherlands 11900 39408
hu Hungary 9320 94797
info information 4380 51688
de Germany (Deutschland) 3820 1051
pl Poland 2650 7916
ua Ukraine 1460 17520
jp Japan 940 413
es Spain (España) 908 3788
it Italy 907 3027
au Australia and territories 703 807
org organization 684 22
uk United Kingdom 545 214
il Israel 539 7020
bg Bulgaria 466 8158
tw Taiwan (Taiwan, Penghu, Kinmen, and Matsu) 296 3242
br Brazil 294 2573
se Sweden 270 2621
no Norway 251 2898
biz business 247 2465
ro Romania 197 1248
fi Finland 151 2041
edu educational 125 10
ca Canada 112 69
za South Africa (Zuid-Afrika) 109 1711
cn China 107 419
be Belgium 107 948
ch Switzerland (Confoederatio Helvetica) 103 133
tv Tuvalu (also sold as an abbreviation) 92 958
ar Argentina 76 1089
dk Denmark 75 800
fr France 57 51
ve Venezuela 39 603
mx Mexico 33 499
cc Cocos (Keeling) Islands 33 435
pt Portugal 27 360
kr South Korea 26 264
at Austria 21 108


Graph

Image:WL google pages by absolute domain popularity.gif

"wikileaks" google pages by normalized domain popularity

TLD Description Pages Norm x 10
ru Russia 53100 164141
hu Hungary 9320 94797
info information 4380 51688
nl Netherlands 11900 39408
ua Ukraine 1460 17520
bg Bulgaria 466 8158
pl Poland 2650 7916
il Israel 539 7020
net network 32400 4583
es Spain (España) 908 3788
tw Taiwan (Taiwan, Penghu, Kinmen, and Matsu) 296 3242
it Italy 907 3027
no Norway 251 2898
se Sweden 270 2621
br Brazil 294 2573
biz business 247 2465
fi Finland 151 2041
za South Africa (Zuid-Afrika) 109 1711
ro Romania 197 1248
ar Argentina 76 1089
de Germany (Deutschland) 3820 1051
com commercial 62500 960
tv Tuvalu (also sold as an abbreviation) 92 958
be Belgium 107 948
au Australia and territories 703 807
dk Denmark 75 800
ve Venezuela 39 603
mx Mexico 33 499
cc Cocos (Keeling) Islands 33 435
cn China 107 419
jp Japan 940 413
pt Portugal 27 360
kr South Korea 26 264
uk United Kingdom 545 214
ch Switzerland (Confoederatio Helvetica) 103 133
at Austria 21 108
ca Canada 112 69
fr France 57 51
org organization 684 22
edu educational 125 10


Graph 1

Image:WL google pages by normalized domain popularity.gif

Graph 2

Image:WL google pages by normalized domain popularity excluded top 5.gif

Excel spreadsheets of graphs and raw data

media:WL Zeigeist 2007-04.xls

CSV tables

Language Code, Language, Pages, Normalized Pages

"en", "English", 314000, 33
"ru", "Russian", 74500, 3475
"sl", "Slovenian", 33300, 7845
"es", "Spanish", 18000, 446
"fr", "French", 15300, 267
"hu", "Hungarian", 9620, 1640
"ja", "Japanese", 9390, 226
"nl", "Dutch", 9360, 348
"pl", "Polish", 3120, 142
"pt", "Portuguese", 1450, 80
"de", "German", 1180, 16
"zh-TW", "Chinese (Traditional)", 1170, 189
"it", "Italian", 991, 37
"iw", "Hebrew", 760, 156
"bg", "Bulgarian", 678, 149
"zh-CN", "Chinese (Simplified)", 509, 28
"hr", "Croatian", 302, 64
"no", "Norwegian", 257, 54
"fi", "Finnish", 231, 49
"ro", "Romanian", 205, 38
"sv", "Swedish", 185, 32
"ko", "Korean", 141, 16
"da", "Danish", 120, 23
"ar", "Arabic", 50, 10

Domain, Description, Pages, Normalized Pages

"com", " commercial", 62400, 958
"ru", "Russia", 53100, 164141
"net", " network", 32400, 4583
"nl", "Netherlands", 11900, 39408
"hu", "Hungary", 9320, 94797
"info", "information", 4380, 51688
"de", "Germany (Deutschland)", 3820, 1051
"pl", "Poland", 2650, 7916
"ua", "Ukraine", 1460, 17520
"jp", "Japan", 940, 413
"it", "Italy", 908, 3030
"es", "Spain (España)", 908, 3788
"au", "Australia and territories", 703, 807
"org", " organization", 685, 22
"uk", "United Kingdom", 545, 214
"il", "Israel", 539, 7020
"bg", "Bulgaria", 466, 8158
"tw", "Taiwan (Taiwan, Penghu, Kinmen, and Matsu)", 296, 3242
"br", "Brazil", 294, 2573
"se", "Sweden", 270, 2621
"no", "Norway", 251, 2898
"biz", " business", 247, 2476
"ro", "Romania", 197, 1248
"fi", "Finland", 151, 2041
"edu", " educational", 125, 10
"ca", "Canada", 112, 69
"za", "South Africa (Zuid-Afrika)", 109, 1711
"cn", "China", 107, 420
"be", "Belgium", 107, 948
"ch", "Switzerland (Confoederatio Helvetica)", 103, 133
"tv", "Tuvalu (also sold as an abbreviation)", 92, 958
"ar", "Argentina", 76, 1089
"dk", "Denmark", 75, 800
"fr", "France", 57, 51
"ve", "Venezuela", 39, 603
"mx", "Mexico", 33, 499
"cc", "Cocos (Keeling) Islands", 33, 435
"pt", "Portugal", 27, 360
"kr", "South Korea", 26, 264
"at", "Austria", 21, 108

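The rows above put a space after each comma and quote some descriptions that themselves contain commas, so a strict CSV parser may reject them. A minimal sketch of reading them back in Ruby without a CSV library follows; `parse_row` is a hypothetical helper, and the sample rows are copied from the tables above.

```ruby
# Split one row into fields, honouring double-quoted fields that contain commas.
def parse_row(line)
  fields = line.strip.scan(/"([^"]*)"|([^,\s][^,]*)/).map do |quoted, bare|
    quoted || bare.strip
  end
  code, label, pages, norm = fields
  [code, label, Integer(pages), Integer(norm)]
end

# Three rows copied from the language table above.
LANG_ROWS = <<~ROWS
  "en", "English", 314000, 33
  "ru", "Russian", 74500, 3475
  "sl", "Slovenian", 33300, 7845
ROWS

langs = LANG_ROWS.lines.map { |line| parse_row(line) }

# Re-sort by the normalized column, largest first.
by_norm = langs.sort_by { |_code, _label, _pages, norm| -norm }
puts by_norm.map(&:first).join(', ')   # prints: sl, ru, en

# A domain row whose description contains commas inside the quotes.
p parse_row('"tw", "Taiwan (Taiwan, Penghu, Kinmen, and Matsu)", 296, 3242')
```

Sorting the parsed rows by a different column reproduces the "absolute" and "normalized" orderings shown in the tables.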
Code

#!/usr/bin/env ruby
#author j a y @ w i k i l e a k s . o r g

require 'net/http'

class GComparitor
# output CSV instead of wiki tables?
  CSV = true

  TLDS = {
  'arpa' => "address and routing",
  'aero' => "air-transport industry",
  'biz' => " business",
  'cat' => " Catalan",
  'com' => " commercial",
  'coop' => "cooperatives",
  'edu' => " educational",
  'gov' => " governmental",
  'info' => "information",
  'int' => " international organizations",
  'jobs' => "company jobs",
  'mil' => " US Military",
  'mobi' => "mobile devices",
  'museum' => "museums",
  'name' => "individuals, by name",
  'net' => " network",
  'org' => " organization",
  'pro' => " professions",
  'travel' => "travel and travel-agency",
  'ac' => "Ascension Island",
  'ad' => "Andorra",
  'ae' => "United Arab Emirates",
  'af' => "Afghanistan",
  'ag' => "Antigua and Barbuda",
  'ai' => "Anguilla",
  'al' => "Albania",
  'am' => "Armenia",
  'an' => "Netherlands Antilles",
  'ao' => "Angola",
  'aq' => "Antarctica (south of 60° S)",
  'ar' => "Argentina",
  'as' => "American Samoa",
  'at' => "Austria",
  'au' => "Australia and territories",
  'aw' => "Aruba",
  'ax' => "Åland",
  'az' => "Azerbaijan",
  'ba' => "Bosnia and Herzegovina",
  'bb' => "Barbados",
  'bd' => "Bangladesh",
  'be' => "Belgium",
  'bf' => "Burkina Faso",
  'bg' => "Bulgaria",
  'bh' => "Bahrain",
  'bi' => "Burundi",
  'bj' => "Benin",
  'bm' => "Bermuda",
  'bn' => "Brunei Darussalam",
  'bo' => "Bolivia",
  'br' => "Brazil",
  'bs' => "Bahamas",
  'bt' => "Bhutan",
  'bv' => "Bouvet Island (Norwegian dependency; see .no)",
  'bw' => "Botswana",
  'by' => "Belarus",
  'bz' => "Belize",
  'ca' => "Canada",
  'cc' => "Cocos (Keeling) Islands",
  'cd' => "Democratic Republic of the Congo (formerly Zaire)",
  'cf' => "Central African Republic",
  'cg' => "Republic of the Congo",
  'ch' => "Switzerland (Confoederatio Helvetica)",
  'ci' => "Côte d'Ivoire",
  'ck' => "Cook Islands",
  'cl' => "Chile",
  'cm' => "Cameroon",
  'cn' => "China",
  'co' => "Colombia",
  'cr' => "Costa Rica",
  'cu' => "Cuba",
  'cv' => "Cape Verde",
  'cx' => "Christmas Island",
  'cy' => "Cyprus",
  'cz' => "Czech Republic",
  'de' => "Germany (Deutschland)",
  'dj' => "Djibouti",
  'dk' => "Denmark",
  'dm' => "Dominica",
  'do' => "Dominican Republic",
  'dz' => "Algeria",
  'ec' => "Ecuador",
  'ee' => "Estonia",
  'eg' => "Egypt",
  'er' => "Eritrea",
  'es' => "Spain (España)",
  'et' => "Ethiopia",
  'eu' => "European Union",
  'fi' => "Finland",
  'fj' => "Fiji",
  'fk' => "Falkland Islands",
  'fm' => "Federated States of Micronesia",
  'fo' => "Faroe Islands",
  'fr' => "France",
  'ga' => "Gabon",
  'gb' => "United Kingdom (see .uk)",
  'gd' => "Grenada",
  'ge' => "Georgia",
  'gf' => "French Guiana",
  'gg' => "Guernsey",
  'gh' => "Ghana",
  'gi' => "Gibraltar",
  'gl' => "Greenland",
  'gm' => "The Gambia",
  'gn' => "Guinea",
  'gp' => "Guadeloupe",
  'gq' => "Equatorial Guinea",
  'gr' => "Greece",
  'gs' => "South Georgia and South Sandwich Islands",
  'gt' => "Guatemala",
  'gu' => "Guam",
  'gw' => "Guinea-Bissau",
  'gy' => "Guyana",
  'hk' => "Hong Kong",
  'hm' => "Heard Island and McDonald Islands",
  'hn' => "Honduras",
  'hr' => "Croatia (Hrvatska)",
  'ht' => "Haiti",
  'hu' => "Hungary",
  'id' => "Indonesia",
  'ie' => "Ireland",
  'il' => "Israel",
  'im' => "Isle of Man",
  'in' => "India",
  'io' => "British Indian Ocean Territory",
  'iq' => "Iraq",
  'ir' => "Iran",
  'is' => "Iceland",
  'it' => "Italy",
  'je' => "Jersey",
  'jm' => "Jamaica",
  'jo' => "Jordan",
  'jp' => "Japan",
  'ke' => "Kenya",
  'kg' => "Kyrgyzstan",
  'kh' => "Cambodia (Khmer)",
  'ki' => "Kiribati",
  'km' => "Comoros",
  'kn' => "Saint Kitts and Nevis",
  'kr' => "South Korea",
  'kw' => "Kuwait",
  'ky' => "Cayman Islands",
  'kz' => "Kazakhstan",
  'la' => "Laos",
  'lb' => "Lebanon",
  'lc' => "Saint Lucia",
  'li' => "Liechtenstein",
  'lk' => "Sri Lanka",
  'lr' => "Liberia",
  'ls' => "Lesotho",
  'lt' => "Lithuania",
  'lu' => "Luxembourg",
  'lv' => "Latvia",
  'ly' => "Libya",
  'ma' => "Morocco",
  'mc' => "Monaco",
  'md' => "Moldova",
  'mg' => "Madagascar",
  'mh' => "Marshall Islands",
  'mk' => "Republic of Macedonia",
  'ml' => "Mali",
  'mm' => "Myanmar",
  'mn' => "Mongolia",
  'mo' => "Macau",
  'mp' => "Northern Mariana Islands",
  'mq' => "Martinique",
  'mr' => "Mauritania",
  'ms' => "Montserrat",
  'mt' => "Malta",
  'mu' => "Mauritius",
  'mv' => "Maldives",
  'mw' => "Malawi",
  'mx' => "Mexico",
  'my' => "Malaysia",
  'mz' => "Mozambique",
  'na' => "Namibia",
  'nc' => "New Caledonia",
  'ne' => "Niger",
  'nf' => "Norfolk Island",
  'ng' => "Nigeria",
  'ni' => "Nicaragua",
  'nl' => "Netherlands",
  'no' => "Norway",
  'np' => "Nepal",
  'nr' => "Nauru",
  'nu' => "Niue (Swedish and Dutch)",
  'nz' => "New Zealand",
  'om' => "Oman",
  'pa' => "Panama",
  'pe' => "Peru",
  'pf' => "French Polynesia and Clipperton Island",
  'pg' => "Papua New Guinea",
  'ph' => "Philippines",
  'pk' => "Pakistan",
  'pl' => "Poland",
  'pm' => "Saint-Pierre and Miquelon",
  'pn' => "Pitcairn Islands",
  'pr' => "Puerto Rico",
  'ps' => "Palestine (PA-controlled West Bank and Gaza Strip)",
  'pt' => "Portugal",
  'pw' => "Palau",
  'py' => "Paraguay",
  'qa' => "Qatar",
  're' => "Réunion",
  'ro' => "Romania",
  'ru' => "Russia",
  'rw' => "Rwanda",
  'sa' => "Saudi Arabia",
  'sb' => "Solomon Islands",
  'sc' => "Seychelles",
  'sd' => "Sudan",
  'se' => "Sweden",
  'sg' => "Singapore",
  'sh' => "Saint Helena",
  'si' => "Slovenia",
  'sj' => "Svalbard and Jan Mayen Islands (Norwegian dependencies; see .no)",
  'sk' => "Slovakia",
  'sl' => "Sierra Leone",
  'sm' => "San Marino",
  'sn' => "Senegal",
  'so' => "Somalia",
  'sr' => "Suriname",
  'st' => "São Tomé and Príncipe",
  'su' => "former Soviet Union (still in use)",
  'sv' => "El Salvador",
  'sy' => "Syria",
  'sz' => "Swaziland",
  'tc' => "Turks and Caicos Islands",
  'td' => "Chad",
  'tf' => "French Southern and Antarctic Lands",
  'tg' => "Togo",
  'th' => "Thailand",
  'tj' => "Tajikistan",
  'tk' => "Tokelau (also used as a free domain service to the public)",
  'tl' => "East Timor (old code .tp is still in use)",
  'tm' => "Turkmenistan",
  'tn' => "Tunisia",
  'to' => "Tonga",
  'tp' => "East Timor (now .tl)",
  'tr' => "Turkey",
  'tt' => "Trinidad and Tobago",
  'tv' => "Tuvalu (also sold as an abbreviation)",
  'tw' => "Taiwan (Taiwan, Penghu, Kinmen, and Matsu)",
  'tz' => "Tanzania",
  'ua' => "Ukraine",
  'ug' => "Uganda",
  'uk' => "United Kingdom",
  'um' => "United States Minor Outlying Islands",
  'us' => "United States of America (but see .gov, .mil, .edu etc)",
  'uy' => "Uruguay",
  'uz' => "Uzbekistan",
  'va' => "Vatican City State",
  'vc' => "Saint Vincent and the Grenadines",
  've' => "Venezuela",
  'vg' => "British Virgin Islands",
  'vi' => "U.S. Virgin Islands",
  'vn' => "Vietnam",
  'vu' => "Vanuatu",
  'wf' => "Wallis and Futuna",
  'ws' => "Samoa (formerly Western Samoa)",
  'ye' => "Yemen",
  'yt' => "Mayotte",
  'yu' => "Yugoslavia (now used for Serbia and Montenegro)",
  'za' => "South Africa (Zuid-Afrika)",
  'zm' => "Zambia",
  'zw' => "Zimbabwe",
  }

  LANGS = {
  'ar' => 'Arabic',
  'bg' => 'Bulgarian',
  'ca' => 'Catalan',
  'zh-CN' => 'Chinese (Simplified)',
  'zh-TW' => 'Chinese (Traditional)',
  'hr' => 'Croatian',
  'cs' => 'Czech',
  'da' => 'Danish',
  'nl' => 'Dutch',
  'en' => 'English',
  'et' => 'Estonian',
  'fi' => 'Finnish',
  'fr' => 'French',
  'de' => 'German',
  'el' => 'Greek',
  'iw' => 'Hebrew',
  'hu' => 'Hungarian',
  'is' => 'Icelandic',
  'id' => 'Indonesian',
  'it' => 'Italian',
  'ja' => 'Japanese',
  'ko' => 'Korean',
  'lv' => 'Latvian',
  'lt' => 'Lithuanian',
  'no' => 'Norwegian',
  'fa' => 'Persian',
  'pl' => 'Polish',
  'pt' => 'Portuguese',
  'ro' => 'Romanian',
  'ru' => 'Russian',
  'sr' => 'Serbian',
  'sk' => 'Slovak',
  'sl' => 'Slovenian',
  'es' => 'Spanish',
  'sv' => 'Swedish',
  'tr' => 'Turkish'
  }

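  NORMALIZER = "html" # baseline search term to normalize by

  # Returns Google's estimated hit count for term, optionally restricted to a
  # language and/or a site (here a TLD); nil when no positive count is found.
  def google term, lang, site
    term = "#{term}+site:#{site}" if site   # '+' is a URL-encoded space
    l = "&meta=lr%3Dlang_#{lang}" if lang
    res = Net::HTTP.get('www.google.com', "/search?q=#{term}&hl=en#{l}")
    if res.match(/of about <b>([0-9,. ]+)/)
      n = $1.gsub(/[,. ]/,'').to_i
      n if n > 0
    end
  end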

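  # Returns [hits, hits / baseline hits], or nil when either count is missing.
  def google_norm term, lang, site
     num = google term, lang, site
     norm = google NORMALIZER, lang, site
#     puts num, lang, site
     return nil unless num && norm   # never divide by a missing baseline
     [num, num.to_f / norm]
  end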
  def all_tlds term, lang
    pretty_norm TLDS.keys.map {|tld|
     num, normed = google_norm term, lang, tld
     desc = TLDS[tld]
     [tld, desc, num, normed] if num
    }.compact
  end
  def all_lang term, site
    pretty_norm LANGS.keys.map {|code|
     num, normed = google_norm term, code, site
     lang = LANGS[code]
     [code, lang, num, normed] if num
    }.compact
  end
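  # Prints the rows in l as CSV lines, or as a MediaWiki table when CSV is false.
  def wiki_print4 title, ta, tb, tc, td, l
    if CSV
      l.each {|a,b,c,d| puts "\"#{a}\", \"#{b}\", #{c}, #{d}"}
    else
      # The string must start on the same line as puts; otherwise a bare puts
      # prints an empty line and the table header is silently discarded.
      puts "==#{title}==
{| class=\"wikitable\" border=1
!#{ta} !! #{tb} !! #{tc} !! #{td}
|-"
      l.each {|a,b,c,d|
        puts "| #{a} || #{b} || #{c} || #{d}"
        puts "|-"
      }
      puts "|}"
    end
  end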
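  # Rescales the normalized ratios so that the smallest becomes 10 (if every
  # ratio exceeds 1.0, they are scaled against 1.0), truncating to integers.
  def pretty_norm l
    smallest = ([1.0] + l.map(&:last)).min
    l.map {|a,b,c,normed| [a,b,c,(normed * 10.0 / smallest).to_i]}
  end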

  def all term
    l = all_lang(term, nil)
    wiki_print4 "\"#{term}\" google pages by absolute language popularity", 'Lang', 'Language', 'Pages', 'Normed rank x 10', l.sort_by{|a,b,num,normed| -num}
    wiki_print4 "\"#{term}\" google pages by normalized language popularity", 'Lang', 'Language', 'Pages', 'Normed rank x 10', l.sort_by{|a,b,num,normed| -normed}
    l = all_tlds(term, nil)
    wiki_print4 "\"#{term}\" google pages by absolute domain popularity", 'TLD', 'Description', 'Pages', 'Normed rank x 10', l.sort_by{|a,b,num,normed| -num}
    wiki_print4 "\"#{term}\" google pages by normalized domain popularity", 'TLD', 'Description', 'Pages', 'Normed rank x 10', l.sort_by{|a,b,num,normed| -normed}
  end
end

GComparitor.new.all 'wikileaks'
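
The two fragile parts of the script, building the query string and pulling the count out of the result page, can be exercised without any network access. The sketch below is an illustration under stated assumptions: `search_path` and `parse_count` are hypothetical helper names, the sample HTML string is made up to match the pattern the script looks for, and the `hl=en` parameter assumes that the `hr=en` in the script was meant to select the English interface.

```ruby
require 'uri'

# Build the request path, letting the standard library do the URL encoding.
def search_path(term, lang: nil, site: nil)
  query = site ? "#{term} site:#{site}" : term
  params = { 'q' => query, 'hl' => 'en' }
  params['meta'] = "lr=lang_#{lang}" if lang   # same restriction the script uses
  "/search?#{URI.encode_www_form(params)}"
end

# Extract the estimated hit count, or nil if the page has no positive count.
def parse_count(html)
  match = html.match(/of about <b>([0-9,. ]+)/)
  return nil unless match
  count = match[1].gsub(/[,. ]/, '').to_i
  count.positive? ? count : nil
end

puts search_path('wikileaks', site: 'ru')
puts search_path('wikileaks', lang: 'sl')

# Made-up fragment in the shape the regular expression expects.
sample = 'Results 1 - 10 of about <b>74,500</b> for wikileaks'
puts parse_count(sample)
```

Keeping these two steps as separate functions means a change in the result page's wording only requires updating the regular expression in `parse_count`.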