google_search_module

Retrieve Google results using Python

The program obtains the result links from the Google main search page, and each link is then crawled separately using Scrapy. This gives users more flexibility in obtaining various information from individual websites. At present, only the title and meta contents are scraped from each website. Another advantage is that it removes further dependency on Google HTML tag changes.
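To illustrate what scraping "the title and meta contents" from a single page can look like, here is a minimal sketch. It uses only the standard library `html.parser` as a stand-in, not the project's Scrapy spider, and the names `TitleMetaParser`, `extract_title_and_meta` and `sample_page` are hypothetical, chosen for this example only.

```python
from html.parser import HTMLParser


class TitleMetaParser(HTMLParser):
    """Collect the <title> text and the content of named <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Store e.g. meta["description"] = "..."
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title_and_meta(html):
    """Return the page title and named meta contents from an HTML string."""
    parser = TitleMetaParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "meta": parser.meta}


# Hypothetical page content standing in for one crawled website.
sample_page = """
<html><head>
<title> Example Page </title>
<meta name="description" content="A short summary of the page.">
<meta name="keywords" content="python, scraping">
</head><body><p>Hello</p></body></html>
"""

result = extract_title_and_meta(sample_page)
print(result["title"])
print(result["meta"].get("description"))
```

Because each website is parsed on its own, extracting more fields would only mean extending the per-page parser, which is presumably the flexibility described above.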

The script depends on Scrapy and yaml (for Unicode handling). Both can be installed using pip.
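A quick way to confirm that both dependencies are available before running the scripts is sketched below. It assumes the import names are `scrapy` and `yaml` (the latter is normally provided by the PyYAML package on pip); the helper `missing_modules` is a hypothetical name and not part of this project.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed import names for the two dependencies.
required = ["scrapy", "yaml"]
missing = missing_modules(required)

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```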

The scripts are divided into 2 parts. The main script to run is Python_Google_Search.py. Get_google_link_results.py is the Scrapy spider, which crawls either the Google search page or individual websites. The switch depends on the JSON setting file created.
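The switch between the two crawl modes could work roughly as follows. This is only a sketch under stated assumptions: the setting keys (`mode`, `urls`), the mode names and the functions `write_settings` and `load_crawl_plan` are all hypothetical, and the real JSON layout used by the project may differ.

```python
import json
import os
import tempfile


def write_settings(path, mode, urls):
    """Main-script side: record what the spider should crawl next."""
    with open(path, "w") as f:
        json.dump({"mode": mode, "urls": urls}, f)


def load_crawl_plan(path):
    """Spider side: read the settings and choose the parsing step."""
    with open(path) as f:
        settings = json.load(f)

    if settings["mode"] == "google_search":
        parse_step = "collect result links"
    elif settings["mode"] == "individual_sites":
        parse_step = "scrape title and meta contents"
    else:
        raise ValueError("Unknown mode: %r" % settings["mode"])
    return settings["urls"], parse_step


# Hypothetical settings file, written to a temporary directory.
settings_path = os.path.join(tempfile.mkdtemp(), "crawl_settings.json")

# First pass: crawl the search results page to gather links.
write_settings(settings_path, "google_search",
               ["https://www.google.com/search?q=python"])
print(load_crawl_plan(settings_path))

# Second pass: crawl each result link individually.
write_settings(settings_path, "individual_sites",
               ["https://example.com/a", "https://example.com/b"])
print(load_crawl_plan(settings_path))
```

Keeping the mode in a file lets a single spider serve both passes: the main script rewrites the settings between runs instead of maintaining two separate spiders.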

More information can be obtained from: http://wp.me/p4nnkg-1i

The GUI version requires the Extract_specified_txt_fr_files module (https://github.com/spidezad/Extract_specified_txt_fr_files). See the docs folder for the GUI display. For more details on the GUI, please see http://simplypython.wordpress.com/2014/07/12/scaping-google-results-using-python-gui-version/

