Google Search Results in Python

This Python package is meant to scrape and parse search results from Google, Bing, Baidu, Yandex, Yahoo, Home Depot, eBay and more, using SerpApi.

The following services are provided:

  • Search API
  • Search Archive API
  • Account API
  • Location API (Google only)

SerpApi provides a script builder to get you started quickly.

Installation

pip install google-search-results 

Quick start

from serpapi import GoogleSearch

search = GoogleSearch({
    "q": "coffee",
    "location": "Austin,Texas",
    "api_key": ""  # your secret API key
})
result = search.get_dict()

This example runs a search for "coffee" using your secret API key.

The SerpApi service (backend)

  • Searches Google using the query: q = "coffee"
  • Parses the messy HTML responses
  • Returns a standardized JSON response

The GoogleSearch class

  • Formats the request
  • Executes a GET HTTP request against the SerpApi service
  • Parses the JSON response into a dictionary
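
For instance, continuing the quick start above, a minimal sketch that walks the parsed dictionary (organic_results is the standard result list for a Google search):

from serpapi import GoogleSearch

search = GoogleSearch({
    "q": "coffee",
    "location": "Austin,Texas",
    "api_key": ""  # your secret API key
})
result = search.get_dict()

# print the title of each organic result
for organic_result in result.get("organic_results", []):
    print(organic_result["title"])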

Alternatively, you can search:

  • Bing using BingSearch class
  • Baidu using BaiduSearch class
  • Yahoo using YahooSearch class
  • DuckDuckGo using DuckDuckGoSearch class
  • eBay using EbaySearch class
  • Yandex using YandexSearch class
  • HomeDepot using HomeDepotSearch class
  • GoogleScholar using GoogleScholarSearch class
  • Youtube using YoutubeSearch class
  • Walmart using WalmartSearch class
  • Apple App Store using AppleAppStoreSearch class
  • Naver using NaverSearch class

Summary

  • Google Search Results in Python
    • Installation
    • Quick start
    • Summary
      • Google Search API capability
      • How to set SerpApi key
      • Example by specification
      • Location API
      • Search Archive API
      • Account API
      • Search Bing
      • Search Baidu
      • Search Yandex
      • Search Yahoo
      • Search eBay
      • Search Home Depot
      • Search Youtube
      • Search Google Scholar
      • Search Walmart
      • Search Apple App Store
      • Search Naver
      • Generic search with SerpApiClient
      • Search Google Images
      • Search Google News
      • Search Google Shopping
      • Google Search By Location
      • Batch Asynchronous Searches
      • Python object as a result
      • Pagination using iterator
      • Error management

      Google Search API capability

      from serpapi import GoogleSearch

      params = {
          "q": "coffee",
          "location": "Location Requested",
          "device": "desktop|mobile|tablet",
          "hl": "Google UI Language",
          "gl": "Google Country",
          "safe": "Safe Search Flag",
          "num": "Number of Results",
          "start": "Pagination Offset",
          "api_key": "Your SerpApi Key",
          "tbm": "nws|isch|shop",                 # to be matched
          "tbs": "custom to be search criteria",  # to be searched
          "async": "true|false",                  # allow async request
          "output": "json|html"                   # output format
      }

      # define the search
      search = GoogleSearch(params)

      # override an existing parameter
      search.params_dict["location"] = "Portland"

      # get the results in various formats:
      # as raw HTML
      html_results = search.get_html()
      # as a Python dictionary
      dict_results = search.get_dict()
      # as JSON using the json package
      json_results = search.get_json()
      # as a dynamic Python object
      object_result = search.get_object()

      See below for more hands-on examples.

      How to set SerpApi key

      You can get an API key here if you don’t already have one: https://serpapi.com/users/sign_up

      The SerpApi api_key can be set globally:

      GoogleSearch.SERP_API_KEY = "Your Private Key" 

      The SerpApi api_key can be provided for each search:
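
      A minimal sketch, passing the key alongside the query so that it applies to that one search (the query value is illustrative):

      from serpapi import GoogleSearch

      search = GoogleSearch({
          "q": "coffee",
          "api_key": "secret_api_key"  # key scoped to this search
      })
      result = search.get_dict()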

      Example by specification

      We love true open source, continuous integration and Test Driven Development (TDD). We are using RSpec to test our infrastructure around the clock to achieve the best Quality of Service (QoS).

      The directory test/ includes specification/examples.

      export API_KEY="your secret key" 
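
      The examples below can then read the key back from the environment rather than hard-coding it. A sketch using the standard os module:

      import os

      from serpapi import GoogleSearch

      search = GoogleSearch({
          "q": "coffee",                   # illustrative query
          "api_key": os.getenv("API_KEY")  # reads the key exported above
      })
      result = search.get_dict()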

      Location API

      from serpapi import GoogleSearch

      search = GoogleSearch({})
      location_list = search.get_location("Austin", 3)
      print(location_list)

      This prints the first 3 locations matching Austin (Texas, Texas, Rochester).

      Search Archive API

      The search results are stored in a temporary cache. The previous search can be retrieved from the cache for free.

      from serpapi import GoogleSearch

      # run an initial search; the query and key are assumed, as in the quick start
      search = GoogleSearch({
          "q": "coffee",
          "location": "Austin,Texas",
          "api_key": "secret_api_key"
      })
      search_result = search.get_dictionary()
      assert search_result.get("error") == None
      search_id = search_result.get("search_metadata").get("id")
      print(search_id)

      Now let’s retrieve the previous search from the archive.

      archived_search_result = GoogleSearch({}).get_search_archive(search_id, 'json')
      print(archived_search_result.get("search_metadata").get("id"))

      This retrieves the full search result from the archive and prints its id.

      Account API

      from serpapi import GoogleSearch

      search = GoogleSearch({"api_key": "secret_api_key"})
      account = search.get_account()
      print(account)

      This prints your account information.

      Search Bing

      from serpapi import BingSearch

      search = BingSearch({"q": "coffee"})
      data = search.get_dict()

      This code prints Bing search results for coffee as a Dictionary.

      Search Baidu

      from serpapi import BaiduSearch

      search = BaiduSearch({"q": "coffee"})
      data = search.get_dict()

      This code prints Baidu search results for coffee as a Dictionary. See https://serpapi.com/baidu-search-api for details.

      Search Yandex

      from serpapi import YandexSearch

      # Yandex takes its query via the "text" parameter
      search = YandexSearch({"text": "coffee"})
      data = search.get_dict()

      This code prints Yandex search results for coffee as a Dictionary.

      Search Yahoo

      from serpapi import YahooSearch

      # Yahoo takes its query via the "p" parameter
      search = YahooSearch({"p": "coffee"})
      data = search.get_dict()

      This code prints Yahoo search results for coffee as a Dictionary.

      Search eBay

      from serpapi import EbaySearch

      # eBay takes its query via the "_nkw" parameter
      search = EbaySearch({"_nkw": "coffee"})
      data = search.get_dict()

      This code prints eBay search results for coffee as a Dictionary.

      Search Home Depot

      from serpapi import HomeDepotSearch

      search = HomeDepotSearch({"q": "chair"})
      data = search.get_dict()

      This code prints Home Depot search results for chair as a Dictionary.

      Search Youtube

      from serpapi import YoutubeSearch

      # Youtube takes its query via the "search_query" parameter
      search = YoutubeSearch({"search_query": "chair"})
      data = search.get_dict()

      This code prints Youtube search results for chair as a Dictionary.

      Search Google Scholar

      from serpapi import GoogleScholarSearch

      search = GoogleScholarSearch({"q": "coffee"})
      data = search.get_dict()

      This code prints Google Scholar search results.

      Search Walmart

      from serpapi import WalmartSearch

      # Walmart takes its query via the "query" parameter
      search = WalmartSearch({"query": "coffee"})
      data = search.get_dict()

      This code prints Walmart search results.

      Search Apple App Store

      from serpapi import AppleAppStoreSearch

      # the App Store takes its query via the "term" parameter
      search = AppleAppStoreSearch({"term": "coffee"})
      data = search.get_dict()

      This code prints Apple App Store search results.

      Search Naver

      from serpapi import NaverSearch

      # Naver takes its query via the "query" parameter
      search = NaverSearch({"query": "coffee"})
      data = search.get_dict()

      This code prints Naver search results.

      Generic search with SerpApiClient

      from serpapi import SerpApiClient

      # the "engine" field selects the SerpApi backend
      # (google_scholar is an illustrative choice)
      query = {"q": "coffee", "engine": "google_scholar"}
      search = SerpApiClient(query)
      data = search.get_dict()

      This class enables interaction with any search engine supported by SerpApi.com.
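
      The same pattern works for every supported engine. A sketch, assuming "bing" is a valid engine value:

      from serpapi import SerpApiClient

      # same client class, different backend
      bing_search = SerpApiClient({"q": "coffee", "engine": "bing"})
      bing_data = bing_search.get_dict()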

      Search Google Images

      from serpapi import GoogleSearch

      # an image search (tbm=isch); the query is an assumption
      search = GoogleSearch({"q": "coffee", "tbm": "isch"})
      for image_result in search.get_dict()['images_results']:
          link = image_result["original"]
          try:
              print("link: " + link)
              # wget.download(link, '.')
          except:
              pass

      This code prints all the image links, and downloads the images if you un-comment the line with wget (Linux/OS X tool to download files).
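
      If wget is not available, the same download can be done with the Python standard library. A sketch using urllib, where the target filename is an illustrative choice:

      from urllib.request import urlretrieve

      # `link` is the image URL from the loop above
      urlretrieve(link, "image.jpg")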

      Search Google News

      from serpapi import GoogleSearch

      search = GoogleSearch({
          "q": "coffee",
          "tbm": "nws",    # news results
          "tbs": "qdr:d",  # last 24 hours
          "num": 10
      })

      for offset in [0, 1, 2]:
          search.params_dict["start"] = offset * 10
          data = search.get_dict()
          for news_result in data['news_results']:
              print(str(news_result['position'] + offset * 10) + " - " + news_result['title'])

      This script prints the first 3 pages of the news headlines for the last 24 hours.

      Search Google Shopping

      from serpapi import GoogleSearch

      search = GoogleSearch({
          "q": "coffee",
          "tbm": "shop",      # shopping results
          "tbs": "p_ord:rv",  # sort by review order
          "num": 100
      })
      data = search.get_dict()
      for shopping_result in data['shopping_results']:
          print(str(shopping_result['position']) + " - " + shopping_result['title'])

      This script prints all the shopping results, ordered by review order.

      Google Search By Location

      With SerpApi, we can build a Google search from anywhere in the world. This code looks for the best coffee shop for the given cities.

      from serpapi import GoogleSearch

      for city in ["new york", "paris", "berlin"]:
          location = GoogleSearch({}).get_location(city, 1)[0]["canonical_name"]
          search = GoogleSearch({
              "q": "best coffee shop",
              "location": location,
              "num": 1,
              "start": 0
          })
          data = search.get_dict()
          top_result = data["organic_results"][0]["title"]
          print(city + ": " + top_result)  # show the winner per city

      Batch Asynchronous Searches

      We offer two ways to boost your searches thanks to the async parameter.

      • Blocking — async=false — more compute intensive because the search needs to maintain many connections. (default)
      • Non-blocking — async=true — the way to go for large batches of queries (recommended)
      # operating system
      import os
      # regular expression library
      import re
      # safe queue (named Queue in Python 2)
      from queue import Queue
      # time utility
      import time
      # SerpApi search
      from serpapi import GoogleSearch

      # store searches
      search_queue = Queue()

      # SerpApi search
      search = GoogleSearch({
          "location": "Austin,Texas",
          "async": True,
          "api_key": os.getenv("API_KEY")
      })

      # loop through a list of companies
      for company in ['amd', 'nvidia', 'intel']:
          print("execute async search: q = " + company)
          search.params_dict["q"] = company
          result = search.get_dict()
          if "error" in result:
              print("oops error: ", result["error"])
              continue
          print("add search to the queue where id: ", result['search_metadata'])
          # add search to the search_queue
          search_queue.put(result)

      print("wait until all search statuses are cached or success")

      # pop searches off the queue until all are done
      while not search_queue.empty():
          result = search_queue.get()
          search_id = result['search_metadata']['id']

          # retrieve search from the archive - blocker
          print(search_id + ": get search from archive")
          search_archived = search.get_search_archive(search_id)
          print(search_id + ": status = " + search_archived['search_metadata']['status'])

          # check status
          if re.search('Cached|Success', search_archived['search_metadata']['status']):
              print(search_id + ": search done with q = " + search_archived['search_parameters']['q'])
          else:
              # not ready yet: requeue and wait 1s
              print(search_id + ": requeue search")
              search_queue.put(result)
              time.sleep(1)

      print('all searches completed')

      This code shows how to run searches asynchronously. The search parameters must include "async": True. This indicates that the client shouldn't wait for the search to be completed: the current thread that executes the search is non-blocking, which allows it to submit thousands of searches in seconds while the SerpApi backend does the processing work. The actual search result is deferred to a later call to the search archive using get_search_archive(search_id).

      In this example the non-blocking searches are persisted in a queue: search_queue. A loop through the search_queue fetches the individual search results. This process can easily be multithreaded to allow a large number of concurrent search requests. To keep things simple, this example only explores search results one at a time (single threaded).
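
      A sketch of that multithreaded variant, assuming the GoogleSearch instance from the example above and a list of ids collected from each submitted search's search_metadata. It polls several archived searches at once with the standard concurrent.futures pool:

      from concurrent.futures import ThreadPoolExecutor

      def fetch_status(search_id):
          # blocks until the archive call returns for this search
          archived = search.get_search_archive(search_id)
          return search_id, archived['search_metadata']['status']

      search_ids = []  # fill with result['search_metadata']['id'] values

      with ThreadPoolExecutor(max_workers=4) as pool:
          for search_id, status in pool.map(fetch_status, search_ids):
              print(search_id + ": " + status)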

      Python object as a result

      The search results can be automatically wrapped in a dynamically generated Python object. This solution offers a more dynamic, fully object-oriented approach than the regular dictionary / JSON data structure.

      from serpapi import GoogleSearch

      # query and key assumed, so the assertions below hold
      search = GoogleSearch({
          "q": "Coffee",
          "location": "Austin,Texas",
          "api_key": "secret_api_key"
      })
      r = search.get_object()
      assert isinstance(r.organic_results, list)
      assert r.organic_results[0].title
      assert r.search_metadata.id
      assert r.search_metadata.google_url
      assert r.search_parameters.q == "Coffee"
      assert r.search_parameters.engine == "google"

      Pagination using iterator

      Let’s collect links across multiple search results pages.

      import os

      from serpapi import GoogleSearch

      # pagination bounds: offsets 0 to 40 in pages of 10 results
      start = 0
      end = 40
      page_size = 10

      # basic search parameters
      parameter = {
          "q": "coca cola",
          "tbm": "nws",
          "api_key": os.getenv("API_KEY"),
          # optional pagination parameters
          # (the pagination method can also take them directly)
          "start": start,
          "end": end,
          "num": page_size
      }

      # urls collects the links found across pages
      urls = []

      # initialize a search
      search = GoogleSearch(parameter)

      # create a python generator using the parameters above
      pages = search.pagination()
      # or set custom parameters
      pages = search.pagination(start, end, page_size)

      # fetch one search result per iteration using a basic python for loop,
      # which invokes the python iterator under the hood
      for page in pages:
          print(f"Current page: {page['serpapi_pagination']['current']}")
          for news_result in page["news_results"]:
              print(f"Title: {news_result['title']}\nLink: {news_result['link']}\n")
              urls.append(news_result['link'])

      # check if the total number of pages is as expected
      # note: the exact number is variable depending on the search engine backend
      if len(urls) == (end - start):
          print("all search results count match!")
      if len(urls) == len(set(urls)):
          print("all search results are unique!")

      Examples to fetch links with pagination: test file, online IDE

      Error management

      SerpApi keeps error management simple.

      If it’s a backend error, a simple error message is returned as a string in the server response.

      from serpapi import GoogleSearch

      # query and key assumed, as in the quick start
      search = GoogleSearch({"q": "coffee", "api_key": "secret_api_key"})
      data = search.get_json()
      assert data.get("error") == None

      In some cases, there are more details available in the data object.

      If it’s a client error, then a SerpApiClientException is raised.
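
      A minimal sketch of handling it, assuming the exception class is importable from the package root:

      from serpapi import GoogleSearch
      # assumption: the exception is exported at the package root
      from serpapi import SerpApiClientException

      try:
          search = GoogleSearch({"q": "coffee", "api_key": "secret_api_key"})
          data = search.get_dict()
      except SerpApiClientException as e:
          print("client error: " + str(e))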

      Change log

      • simplify import
      • improve package for python 3.5+
      • add support for python 3.5 and 3.6
      • Change namespace: use "from serpapi import GoogleSearch" instead of "from lib."
      • Support for Bing and Baidu

      Conclusion

      SerpApi supports all the major search engines. Google has the most advanced support, with all the major services available: Images, News, Shopping, and more. To enable a type of search, the field tbm (to be matched) must be set to:

      • isch: Google Images API.
      • nws: Google News API.
      • shop: Google Shopping API.
      • any other Google service should work out of the box.
      • (no tbm parameter): regular Google search.

      The field tbs lets you customize the search even more.
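
      For example, the two fields can be combined. A sketch reusing the qdr:d (last 24 hours) filter from the news example above:

      from serpapi import GoogleSearch

      search = GoogleSearch({
          "q": "coffee",
          "tbm": "nws",    # Google News results
          "tbs": "qdr:d",  # restrict to the last 24 hours
          "api_key": "secret_api_key"
      })
      data = search.get_dict()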
