As of December 5, 2006, we are no longer issuing new API keys for the SOAP Search API. Developers with existing SOAP Search API keys will not be affected.

Google SOAP Search API Reference

 Contents

    1. Overview

1.1 Search Requests
1.2 Cache Requests
1.3 Spelling Requests

    2. Search Request Format

2.1 Search Parameters
2.2 Query Terms
2.3 Automatic Filtering
2.4 Restricts
2.5 Input and Output Encodings
2.6 SafeSearch
2.7 Limitations

    3. Search Results Format

3.1 Search Response
3.2 Result Element
3.3 Directory Category
 1. Overview

This document explains in detail the semantics of the function calls you can make using the Google SOAP Search API service. In this document, you will learn:

You may also find the following files from the Google SOAP Search API developer kit to be helpful:

For comments or questions, please use the Google SOAP Search API discussion group.

1.1 Search Requests

Search requests submit a query string and a set of parameters to the Google SOAP Search API service and receive in return a set of search results. Search results are derived from Google's index of billions of web pages.

The details of the interactions involved with search requests are covered in the Search Request Format and Search Results Format sections of this document.

1.2 Cache Requests

Cache requests submit a URL to the Google SOAP Search API service and receive in return the contents of the URL when Google's crawlers last visited the page (if available).

Please note that Google is not affiliated with the authors of cached pages nor responsible for their content.

The return type for cached pages is base64 encoded text.
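
Because the cached page arrives as base64 encoded text, a client has to decode it before displaying or parsing it. The following is a minimal Python sketch, not part of the API itself: `decode_cached_page` is a hypothetical helper name, and it assumes the page bytes are UTF-8 (a real cached page may use another character encoding, so undecodable bytes are replaced rather than raising an error).

```python
import base64


def decode_cached_page(payload: str) -> str:
    """Decode the base64 text returned by a cache request into a string.

    Assumption: the page bytes are UTF-8. Bytes that are not valid UTF-8
    are replaced with U+FFFD instead of raising an error.
    """
    raw_bytes = base64.b64decode(payload)
    return raw_bytes.decode("utf-8", errors="replace")


# Illustrative payload built locally; a real one would come from the service.
sample_payload = base64.b64encode(b"<html><body>cached copy</body></html>").decode("ascii")
page_html = decode_cached_page(sample_payload)
```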

1.3 Spelling Requests

Spelling requests submit a query to the Google SOAP Search API service and receive in return a suggested spell correction for the query (if available). Spell corrections mimic the same behavior as found on Google's Web site.

Spelling requests are subject to the same query string limitations as any other search request. (The input string is limited to 2048 bytes and 10 individual words.)

The return type for spelling requests is a text string.

 2. Search Request Format
2.1 Search Parameters

This table lists all the valid name-value pairs that can be used in a search request and describes how these parameters will modify the search results.

Name
Description
key
Provided by Google, this is required for you to access the Google service. Google uses the key for authentication and logging.
q
(See Query Terms section for details on query syntax.)
start
Zero-based index of the first desired result.
maxResults
Number of results desired per query. The maximum value per query is 10. Note: If you do a query that doesn't have many matches, the actual number of results you get may be smaller than what you request.
filter
Activates or deactivates automatic results filtering, which hides very similar results and results that all come from the same Web host. Filtering tends to improve the end user experience on Google, but for your application you may prefer to turn it off. (See Automatic Filtering section for more details.)
restrict
Restricts the search to a subset of the Google Web index, such as a country like "Ukraine" or a topic like "Linux." (See Restricts for more details.)
safeSearch
A Boolean value which enables filtering of adult content in the search results. See SafeSearch for more details.
lr
Language Restrict - Restricts the search to documents within one or more languages.
ie
Input Encoding - this parameter has been deprecated and is ignored. All requests to the API should be made with UTF-8 encoding. (See Input and Output Encodings section for details.)
oe
Output Encoding - this parameter has been deprecated and is ignored. All requests to the API should be made with UTF-8 encoding. (See Input and Output Encodings for details.)
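
The name-value pairs in the table above can be gathered in one place before they are handed to whatever SOAP client you use. The sketch below only assembles a dictionary and does not contact any service. The function name, the Python argument names, and the default values are illustrative assumptions; only the dictionary keys come from the table, and the `restrict` key follows the `<restrict>` spelling used in the Restricts section.

```python
def build_search_params(key, query, start=0, max_results=10,
                        filter_results=True, restrict="",
                        safe_search=False, lr=""):
    """Assemble the search parameters into a dict keyed by parameter name.

    Defaults are assumptions for illustration, not documented defaults.
    """
    if max_results > 10:
        raise ValueError("maxResults cannot exceed 10 per query")
    if start < 0:
        raise ValueError("start is a zero-based index and cannot be negative")
    return {
        "key": key,
        "q": query,
        "start": start,
        "maxResults": max_results,
        "filter": filter_results,
        "restrict": restrict,
        "safeSearch": safe_search,
        "lr": lr,
        # ie and oe are deprecated and their values are ignored by the service.
        "ie": "utf-8",
        "oe": "utf-8",
    }


# "YOUR-KEY" is a placeholder, not a real key.
params = build_search_params("YOUR-KEY", "vacation london OR paris", start=10)
```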

2.2 Query Terms - <q>

Default Search

By default, Google searches for all of your search terms, as well as for relevant variations of the terms you've entered. There is no need to include "AND" between terms. Keep in mind that the order of the terms in the query will affect the search results.

Stop Words

Google ignores common words and characters such as "where" and "how," as well as certain single digits and single letters. Common words that are ignored are known as stop words. However, you can prevent Google from ignoring stop words by enclosing them in quotes, such as in the phrase "to be or not to be".

Special Characters

By default, all non-alphanumeric characters that are included in a search query are treated as word separators. The only exceptions are the following: double quote mark ( " ), plus sign ( + ), minus sign or hyphen ( - ), and ampersand ( & ). The ampersand character ( & ) is treated as another character in the query term in which it is included, while the remaining exception characters correspond to search features listed in the section below.

Special Query Terms

Google supports the use of several special query terms that allow the user or search administrator to access additional capabilities of the Google search engine.

Special Query Capability
Example Query
Description
Include Query Term Star Wars Episode +I If a common word is essential to getting the results you want, you can include it by putting a "+" sign in front of it.
Exclude Query Term bass -music You can exclude a word from your search by putting a minus sign ("-") immediately in front of the term you want to exclude from the search results.
Phrase Search "yellow pages" Search for complete phrases by enclosing them in quotation marks or connecting them with hyphens. Words marked in this way will appear together in all results exactly as entered.

Note: You may need to use a "+" to force inclusion of common words in a phrase.

Boolean OR Search vacation london OR paris Google search supports the Boolean OR operator. To retrieve pages that include either word A or word B, use an uppercase OR between terms.
Site Restricted Search admission site:www.stanford.edu If you know the specific web site you want to search but aren't sure where the information is located within that site, you can use Google to search only within a specific web site.

Do this by entering your query followed by the string "site:" followed by the host name.

Note: The exclusion operator ("-") can be applied to this query term to remove a web site from consideration in the search.

Note: Only one site: term per query is supported.

Date Restricted Search Star Wars daterange:2452122-2452234 If you want to limit your results to documents that were published within a specific date range, then you can use the "daterange:" query term to accomplish this. The "daterange:" query term must be in the following format:
daterange:<start_date>-<end_date>
where
<start_date> = Julian date indicating the start of the date range
<end_date> = Julian date indicating the end of the date range
The Julian date is calculated by the number of days since January 1, 4713 BC. For example, the Julian date for August 1, 2001 is 2452122.
Title Search (term) intitle:Google search If you prepend "intitle:" to a query term, Google search restricts the results to documents containing that word in the title. Note there can be no space between the "intitle:" and the following word.

Note: Putting "intitle:" in front of every word in your query is equivalent to putting "allintitle:" at the front of your query.

Title Search (all) allintitle: Google search Starting a query with the term "allintitle:" restricts the results to those with all of the query words in the title.
URL Search (term) inurl:Google search If you prepend "inurl:" to a query term, Google search restricts the results to documents containing that word in the result URL. Note there can be no space between the "inurl:" and the following word.

Note: "inurl:" works only on words, not URL components. In particular, it ignores punctuation and uses only the first word following the "inurl:" operator. To find multiple words in a result URL, use the "inurl:" operator for each word.

Note: Putting "inurl:" in front of every word in your query is equivalent to putting "allinurl:" at the front of your query.

URL Search (all) allinurl: Google search Starting a query with the term "allinurl:" restricts the results to those with all of the query words in the result URL.

Note: "allinurl:" works only on words, not URL components. In particular, it ignores punctuation. Thus, "allinurl: foo/bar" restricts the results to pages with the words "foo" and "bar" in the URL, but does not require that they be separated by a slash within that URL, that they be adjacent, or that they be in that particular word order. There is currently no way to enforce these constraints.

Text Only Search (all) allintext: Google search Starting a query with the term "allintext:" restricts the results to those with all of the query words in only the body text, ignoring link, URL, and title matches.
Links Only Search (all) allinlinks: Google search Starting a query with the term "allinlinks:" restricts the results to those with all of the query words in the URL links on the page.
File Type Filtering Google filetype:doc OR filetype:pdf The query prefix "filetype:" filters the results returned to include only documents with the extension specified immediately after. Note there can be no space between "filetype:" and the specified extension.

Note: Multiple file types can be included in a filtered search by adding more "filetype:" terms to the search query.

File Type Exclusion Google -filetype:doc -filetype:pdf The query prefix "-filetype:" filters the results to exclude documents with the extension specified immediately after. Note there can be no space between "-filetype:" and the specified extension.

Note: Multiple file types can be excluded in a filtered search by adding more "-filetype:" terms to the search query.

Web Document Info info:www.google.com The query prefix "info:" returns a single result for the specified URL if it exists in the index.

Note: No other query terms can be specified when using this special query term.

Back Links link:www.google.com The query prefix "link:" lists web pages that have links to the specified web page. Note there can be no space between "link:" and the web page URL.

Note: No other query terms can be specified when using this special query term.

Related Links related:www.google.com The query prefix "related:" lists web pages that are similar to the specified web page. Note there can be no space between "related:" and the web page URL.

Note: No other query terms can be specified when using this special query term.

Cached Results Page cache:www.google.com web The query prefix "cache:" returns the cached HTML version of the specified web document that the Google search crawled. Note there can be no space between "cache:" and the web page URL. If you include other words in the query, Google will highlight those words within the cached document.
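
The "daterange:" term in the table above needs Julian dates, which are awkward to work out by hand. The following Python sketch converts calendar dates and builds the term. The helper names are hypothetical, and the offset constant is an assumption chosen so that August 1, 2001 maps to 2452122, the example given in this document; the astronomical Julian Day Number at noon on that date is one higher.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal and the Julian date
# truncated at midnight. Chosen so that August 1, 2001 maps to 2452122,
# the example given in this document (an assumption about the convention).
_ORDINAL_TO_JULIAN = 1721424


def julian_date(day: date) -> int:
    """Return the integer Julian date for a Gregorian calendar date."""
    return day.toordinal() + _ORDINAL_TO_JULIAN


def daterange_term(start: date, end: date) -> str:
    """Build a "daterange:" query term covering start through end."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return f"daterange:{julian_date(start)}-{julian_date(end)}"


query = "Star Wars " + daterange_term(date(2001, 8, 1), date(2001, 11, 21))
```

With these dates, `query` reproduces the example query shown in the Date Restricted Search row.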

2.3 Automatic Filtering - <filter>

The <filter> parameter causes Google to filter out some of the results for a given search. This is done to enhance the user experience on Google.com, but for your application, you may prefer to turn filtering off in order to get the full set of search results.

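When enabled, filtering takes the following actions: it hides results that are very similar to other results, and it returns at most two results from any given Web host.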

2.4 Restricts - <lr> <restrict>

Google provides the ability to search a predefined subset of Google's web index. This is enabled by using the lr and restrict parameters.

<lr> - language restrict

To search for documents within a particular language, use the <lr> parameter with one of the values in the table below.

Language
<lr> value
Arabic lang_ar
Chinese (S) lang_zh-CN
Chinese (T) lang_zh-TW
Czech lang_cs
Danish lang_da
Dutch lang_nl
English lang_en
Estonian lang_et
Finnish lang_fi
French lang_fr
German lang_de
Greek lang_el
Hebrew lang_iw
Hungarian lang_hu
Icelandic lang_is
Italian lang_it
Japanese lang_ja
Korean lang_ko
Latvian lang_lv
Lithuanian lang_lt
Norwegian lang_no
Portuguese lang_pt
Polish lang_pl
Romanian lang_ro
Russian lang_ru
Spanish lang_es
Swedish lang_sv
Turkish lang_tr

<restrict> - Country and Topic Restricts

Google allows you to search for Web information within one or more countries, using an algorithm that considers the top level domain name of the server and the geographic location of the server IP address.

The automatic country sub-collections currently supported are listed below:

Country
<restrict> value
AD-CL
Andorra countryAD
United Arab Emirates countryAE
Afghanistan countryAF
Antigua and Barbuda countryAG
Anguilla countryAI
Albania countryAL
Armenia countryAM
Netherlands Antilles countryAN
Angola countryAO
Antarctica countryAQ
Argentina countryAR
American Samoa countryAS
Austria countryAT
Australia countryAU
Aruba countryAW
Azerbaijan countryAZ
Bosnia and Herzegowina countryBA
Barbados countryBB
Bangladesh countryBD
Belgium countryBE
Burkina Faso countryBF
Bulgaria countryBG
Bahrain countryBH
Burundi countryBI
Benin countryBJ
Bermuda countryBM
Brunei Darussalam countryBN
Bolivia countryBO
Brazil countryBR
Bahamas countryBS
Bhutan countryBT
Bouvet Island countryBV
Botswana countryBW
Belarus countryBY
Belize countryBZ
Canada countryCA
Cocos (Keeling) Islands countryCC
Congo, The Democratic Republic of the countryCD
Central African Republic countryCF
Congo countryCG
Switzerland countryCH
Cote D'ivoire countryCI
Cook Islands countryCK
Chile countryCL
Country
<restrict> value
CM-JO
Cameroon countryCM
China countryCN
Colombia countryCO
Costa Rica countryCR
Cuba countryCU
Cape Verde countryCV
Christmas Island countryCX
Cyprus countryCY
Czech Republic countryCZ
Germany countryDE
Djibouti countryDJ
Denmark countryDK
Dominica countryDM
Dominican Republic countryDO
Algeria countryDZ
Ecuador countryEC
Estonia countryEE
Egypt countryEG
Western Sahara countryEH
Eritrea countryER
Spain countryES
Ethiopia countryET
European Union countryEU
Finland countryFI
Fiji countryFJ
Falkland Islands (Malvinas) countryFK
Micronesia, Federated States of countryFM
Faroe Islands countryFO
France countryFR
France, Metropolitan countryFX
Gabon countryGA
United Kingdom countryUK
Grenada countryGD
Georgia countryGE
French Guiana countryGF
Ghana countryGH
Gibraltar countryGI
Greenland countryGL
Gambia countryGM
Guinea countryGN
Guadeloupe countryGP
Equatorial Guinea countryGQ
Greece countryGR
South Georgia and the South Sandwich Islands countryGS
Guatemala countryGT
Guam countryGU
Guinea-Bissau countryGW
Guyana countryGY
Hong Kong countryHK
Heard and Mc Donald Islands countryHM
Honduras countryHN
Croatia (local name: Hrvatska) countryHR
Haiti countryHT
Hungary countryHU
Indonesia countryID
Ireland countryIE
Israel countryIL
India countryIN
British Indian Ocean Territory countryIO
Iraq countryIQ
Iran (Islamic Republic of) countryIR
Iceland countryIS
Italy countryIT
Jamaica countryJM
Jordan countryJO
Country
<restrict> value
JP-PS
Japan countryJP
Kenya countryKE
Kyrgyzstan countryKG
Cambodia countryKH
Kiribati countryKI
Comoros countryKM
Saint Kitts and Nevis countryKN
Korea, Democratic People's Republic of countryKP
Korea, Republic of countryKR
Kuwait countryKW
Cayman Islands countryKY
Kazakhstan countryKZ
Lao People's Democratic Republic countryLA
Lebanon countryLB
Saint Lucia countryLC
Liechtenstein countryLI
Sri Lanka countryLK
Liberia countryLR
Lesotho countryLS
Lithuania countryLT
Luxembourg countryLU
Latvia countryLV
Libyan Arab Jamahiriya countryLY
Morocco countryMA
Monaco countryMC
Moldova countryMD
Madagascar countryMG
Marshall Islands countryMH
Macedonia, The Former Yugoslav Republic of countryMK
Mali countryML
Myanmar countryMM
Mongolia countryMN
Macau countryMO
Northern Mariana Islands countryMP
Martinique countryMQ
Mauritania countryMR
Montserrat countryMS
Malta countryMT
Mauritius countryMU
Maldives countryMV
Malawi countryMW
Mexico countryMX
Malaysia countryMY
Mozambique countryMZ
Namibia countryNA
New Caledonia countryNC
Niger countryNE
Norfolk Island countryNF
Nigeria countryNG
Nicaragua countryNI
Netherlands countryNL
Norway countryNO
Nepal countryNP
Nauru countryNR
Niue countryNU
New Zealand countryNZ
Oman countryOM
Panama countryPA
Peru countryPE
French Polynesia countryPF
Papua New Guinea countryPG
Philippines countryPH
Pakistan countryPK
Poland countryPL
St. Pierre and Miquelon countryPM
Pitcairn countryPN
Puerto Rico countryPR
Palestine countryPS
Country
<restrict> value
PT-ZR
Portugal countryPT
Palau countryPW
Paraguay countryPY
Qatar countryQA
Reunion countryRE
Romania countryRO
Russian Federation countryRU
Rwanda countryRW
Saudi Arabia countrySA
Solomon Islands countrySB
Seychelles countrySC
Sudan countrySD
Sweden countrySE
Singapore countrySG
St. Helena countrySH
Slovenia countrySI
Svalbard and Jan Mayen Islands countrySJ
Slovakia (Slovak Republic) countrySK
Sierra Leone countrySL
San Marino countrySM
Senegal countrySN
Somalia countrySO
Suriname countrySR
Sao Tome and Principe countryST
El Salvador countrySV
Syria countrySY
Swaziland countrySZ
Turks and Caicos Islands countryTC
Chad countryTD
French Southern Territories countryTF
Togo countryTG
Thailand countryTH
Tajikistan countryTJ
Tokelau countryTK
Turkmenistan countryTM
Tunisia countryTN
Tonga countryTO
East Timor countryTP
Turkey countryTR
Trinidad and Tobago countryTT
Tuvalu countryTV
Taiwan countryTW
Tanzania countryTZ
Ukraine countryUA
Uganda countryUG
United States Minor Outlying Islands countryUM
United States countryUS
Uruguay countryUY
Uzbekistan countryUZ
Holy See (Vatican City State) countryVA
Saint Vincent and the Grenadines countryVC
Venezuela countryVE
Virgin Islands (British) countryVG
Virgin Islands (U.S.) countryVI
Vietnam countryVN
Vanuatu countryVU
Wallis and Futuna Islands countryWF
Samoa countryWS
Yemen countryYE
Mayotte countryYT
Yugoslavia countryYU
South Africa countryZA
Zambia countryZM
Zaire countryZR

Google also has four topic restricts:

Topic
<restrict> value
U.S. Government unclesam
Linux linux
Macintosh mac
FreeBSD bsd

Combining the <lr> and <restrict> parameters:

Search requests which use the lr and restrict parameters support the Boolean operators identified in the table below (in order of precedence).

Note: If both lr and restrict parameters are used in a search request, the sub-collection strings will be combined together using "AND" logic.

Boolean Operator
Sample Usage
Description
Boolean NOT [ - ] -lang_fr Removes all results which are defined as part of the sub-collection immediately following the "-" operator.

The example restrict value would remove all results in French.

Boolean AND [ . ] linux.countryFR Returns results which are in the intersection of the results returned by the sub-collection to either side of the "." operator.

The example restrict value would return all results which are from both the "linux" subtopic and identified as being located in France.

Boolean OR [ | ] lang_en|lang_fr Returns results which are in either of the results returned by the sub-collection to either side of the "|" operator.

The example restrict value would return all results matching the query that are in either the French or English sub-collections.

Parentheses [ ( ) ] (linux).(-(countryUK|countryUS)) All terms within the innermost set of parentheses in a sub-collection string will be evaluated before terms outside the parentheses are evaluated. Use parentheses to adjust the order of term evaluation.

The example restrict value would return all results in the "linux" custom sub-collection that are not in either the United States or United Kingdom sub-collections.

Note: Spaces are not valid characters in the restrict parameter.
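
The operators in the table above can be composed programmatically, which avoids misplaced parentheses and stray spaces. This is a minimal Python sketch with hypothetical helper names; it only builds strings and makes no claim about how the service parses them beyond what the table states.

```python
def restrict_not(expr: str) -> str:
    """Boolean NOT: prefix the sub-collection with "-"."""
    return f"-{expr}"


def restrict_and(*exprs: str) -> str:
    """Boolean AND: join sub-collections with "."."""
    return ".".join(exprs)


def restrict_or(*exprs: str) -> str:
    """Boolean OR: join sub-collections with "|"."""
    return "|".join(exprs)


def group(expr: str) -> str:
    """Wrap an expression in parentheses to control evaluation order."""
    return f"({expr})"


def check_restrict(value: str) -> str:
    """Reject values containing whitespace, which the parameter disallows."""
    if any(ch.isspace() for ch in value):
        raise ValueError("spaces are not valid in the restrict parameter")
    return value


# Linux results that are in neither the UK nor the US sub-collection.
not_uk_or_us = restrict_not(group(restrict_or("countryUK", "countryUS")))
restrict_value = check_restrict(restrict_and(group("linux"), group(not_uk_or_us)))
```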

2.5 Input and Output Encodings - <ie>, <oe>

In order to support searching documents in multiple languages and character encodings the Google SOAP Search API performs all requests and responses in the UTF-8 encoding. The parameters <ie> and <oe> are required in client requests but their values are ignored. Clients should encode all request data in UTF-8 and should expect results to be in UTF-8.

2.6 SafeSearch - <safeSearch>

Many Google users prefer not to have adult sites included in their search results. Google's SafeSearch feature screens for sites that contain this type of information and eliminates them from search results. While no filter is 100% accurate, Google's filter uses advanced proprietary technology that checks keywords and phrases, URLs, and Open Directory categories.

If you have SafeSearch activated and still find websites containing offensive content in your results, please contact us and we'll investigate it.

2.7 Limitations

There are some important limitations you should be aware of. Some of these are because Google's infrastructure is currently optimized for end users. However, in the future we hope to vastly increase the limits for Google SOAP Search API developers.

Component
Limit
Search request length 2048 bytes
Maximum number of words in the query 10
Maximum number of site: terms in the query 1 (per search request)
Maximum number of results per query 10
Maximum value of <start> + <maxResults> 1000
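
Checking these limits on the client side avoids sending requests that are bound to be rejected. The Python sketch below is one possible check, under two assumptions that the table does not spell out: the byte limit is measured on the UTF-8 encoded query string, and words are counted by splitting on whitespace. The function and constant names are hypothetical.

```python
MAX_REQUEST_BYTES = 2048
MAX_QUERY_WORDS = 10
MAX_SITE_TERMS = 1
MAX_RESULTS_PER_QUERY = 10
MAX_START_PLUS_RESULTS = 1000


def check_limits(query: str, start: int = 0, max_results: int = 10) -> list:
    """Return a list of limit violations; the list is empty if none apply.

    Assumptions: bytes are counted on the UTF-8 encoded query and words are
    split on whitespace. The service may count either one differently.
    """
    problems = []
    if len(query.encode("utf-8")) > MAX_REQUEST_BYTES:
        problems.append("query exceeds 2048 bytes")
    words = query.split()
    if len(words) > MAX_QUERY_WORDS:
        problems.append("query has more than 10 words")
    # Count site: terms, including excluded ones written as -site:
    site_terms = [w for w in words if w.lstrip("-").startswith("site:")]
    if len(site_terms) > MAX_SITE_TERMS:
        problems.append("query has more than one site: term")
    if max_results > MAX_RESULTS_PER_QUERY:
        problems.append("maxResults exceeds 10")
    if start + max_results > MAX_START_PLUS_RESULTS:
        problems.append("start + maxResults exceeds 1000")
    return problems
```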

 3. Search Results Format

3.1 Search Response

Each time you issue a search request to the Google service, a response is returned back to you. This section describes the meanings of the values returned to you.

<documentFiltering> - A Boolean value indicating whether filtering was performed on the search results. This will be "true" only if (a) you requested filtering and (b) filtering actually occurred.

<searchComments> - A text string intended for display to an end user. One of the most common messages found here is a note that "stop words" were removed from the search automatically. (This happens for very common words such as "and" and "as.")

<estimatedTotalResultsCount> - The estimated total number of results that exist for the query. Note: The estimated number may be either higher or lower than the actual number of results that exist.

<estimateIsExact> - A Boolean value indicating that the estimate is actually the exact value.

<resultElements> - An array of <resultElement> items. This corresponds to the actual list of search results.

<searchQuery> - This is the value of <q> for the search request.

<startIndex> - Indicates the index (1-based) of the first search result in <resultElements>.

<endIndex> - Indicates the index (1-based) of the last search result in <resultElements>.

<searchTips> - A text string intended for display to the end user. It provides instructive suggestions on how to use Google.

<directoryCategories> - An array of <directoryCategory> items. This corresponds to the ODP directory matches for this search.

<searchTime> - Text, floating-point number indicating the total server time to return the search results, measured in seconds.
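
Note that the request parameter <start> is zero-based while <startIndex> and <endIndex> are 1-based, which is easy to get wrong when paging. The Python sketch below shows the arithmetic, assuming that <startIndex> is simply <start> shifted by one; the function names are hypothetical, and what the service reports for an empty page is not stated in this document.

```python
def expected_indices(start: int, returned_count: int):
    """Return the (startIndex, endIndex) pair expected for a page of results.

    start is the zero-based request parameter; the returned indices are
    1-based. Returns None for an empty page, since the document does not
    say what the service reports in that case.
    """
    if returned_count == 0:
        return None
    start_index = start + 1
    end_index = start + returned_count
    return (start_index, end_index)


def next_start(end_index: int) -> int:
    """Zero-based <start> value that requests the page after end_index."""
    return end_index
```

For example, a request with <start> of 10 that returns 7 results would be expected to report indices 11 through 17, and the following page would begin at a <start> of 17.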

3.2 Result Element

<summary> - If the search result has a listing in the ODP directory, the ODP summary appears here as a text string.

<URL> - The URL of the search result, returned as text, with an absolute URL path.

<snippet> - A text excerpt from the results page that shows the query in context as it appears on the matching results page. This is formatted HTML and usually includes <B> tags within it. Query terms will be highlighted in bold in the results, and line breaks will be included for proper text wrapping. If Google searched for stemmed variants of the query terms using its proprietary technology, those terms will also be highlighted in bold in the snippet. Note that the query term does not always appear in the snippet.

<title> - The title of the search result, returned as HTML.

<cachedSize> - Text (Integer + "k"). Indicates that a cached version of the <URL> is available; size is indicated in kilobytes.

<relatedInformationPresent> - Boolean indicating that the "related:" query term is supported for this URL.

<hostName> - When filtering occurs, a maximum of two results from any given host is returned. When this occurs, the second resultElement that comes from that host contains the host name in this parameter.

<directoryCategory> - See below.

<directoryTitle> - If the URL for this resultElement is contained in the ODP directory, the title that appears in the directory appears here as a text string. Note that the directoryTitle may be different from the URL's <title>.
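
Two of the fields above usually need post-processing before display: <snippet> and <title> are HTML, and <cachedSize> is text rather than a number. The following Python sketch uses only the standard library. The helper names are hypothetical, and treating an empty <cachedSize> as "no cached copy" is an assumption rather than documented behavior.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect character data and drop every tag."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities like &amp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def snippet_to_text(snippet_html: str) -> str:
    """Strip tags such as <b> and <br> and collapse runs of whitespace."""
    collector = _TextCollector()
    collector.feed(snippet_html)
    collector.close()  # flush any buffered trailing text
    return " ".join("".join(collector.parts).split())


def cached_size_kilobytes(cached_size: str):
    """Parse text such as "12k" into 12.

    Returns None for an empty value (assumed to mean no cached copy).
    """
    text = cached_size.strip()
    if not text:
        return None
    if not text.endswith("k"):
        raise ValueError(f"unexpected cachedSize format: {cached_size!r}")
    return int(text[:-1])
```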

3.3 Directory Category

<fullViewableName> - Text, containing the ODP directory name for the current ODP category.

<specialEncoding> - Specifies the encoding scheme of the directory information.