Java request to Google

jsoup: send a search request to Google


This example shows how to use jsoup to send a search request to Google.

Document doc = Jsoup
        .connect("https://www.google.com/search?q=mario")
        .userAgent("Mozilla/5.0")
        .timeout(5000)
        .get();

Unusual traffic from your computer network
Do not use this example to spam Google; you will get the message above from Google. See this Google answer.
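Spacing requests out is one simple precaution against generating that kind of traffic. The sketch below is a minimal fixed-interval throttle that uses only the JDK. The class name `RequestThrottle`, the method `reserve`, and the 2-second interval are illustrative assumptions, not part of jsoup, and a delay like this does not guarantee that Google will accept the requests.

```java
/** Hypothetical helper: enforces a minimum interval between outgoing requests. */
final class RequestThrottle {
    private final long minIntervalMillis;
    private long nextAllowedMillis = 0L; // earliest time the next request may start

    RequestThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Reserves the next request slot and returns how many milliseconds the
     * caller should wait before sending, given the current time.
     */
    synchronized long reserve(long nowMillis) {
        long wait = Math.max(0L, nextAllowedMillis - nowMillis);
        nextAllowedMillis = nowMillis + wait + minIntervalMillis;
        return wait;
    }

    public static void main(String[] args) throws InterruptedException {
        RequestThrottle throttle = new RequestThrottle(2000); // assumed interval
        for (String query : new String[] {"mario", "luigi"}) {
            Thread.sleep(throttle.reserve(System.currentTimeMillis()));
            // The Jsoup.connect(...).get() call for this query would go here.
            System.out.println("Ready to send query: " + query);
        }
    }
}
```

Passing the current time in as a parameter keeps the waiting logic separate from the clock, so it can be checked without actually sleeping.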

1. jsoup example

An example that sends the search query "mario" to Google, parses the search results, and extracts the domain name from each result link.

package com.example;

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class FunnyCrawler {

    private static Pattern patternDomainName;
    private Matcher matcher;
    // Dot-separated host labels followed by a 2-6 letter top-level domain
    private static final String DOMAIN_NAME_PATTERN =
            "([a-zA-Z0-9]([a-zA-Z0-9\\-]{0,61}[a-zA-Z0-9])?\\.)+[a-zA-Z]{2,6}";

    static {
        patternDomainName = Pattern.compile(DOMAIN_NAME_PATTERN);
    }

    public static void main(String[] args) {
        FunnyCrawler obj = new FunnyCrawler();
        Set<String> result = obj.getDataFromGoogle("mario");
        for (String temp : result) {
            System.out.println(temp);
        }
        System.out.println(result.size());
    }

    // Returns the first domain name found in the URL, or an empty string
    public String getDomainName(String url) {
        String domainName = "";
        matcher = patternDomainName.matcher(url);
        if (matcher.find()) {
            domainName = matcher.group(0).toLowerCase().trim();
        }
        return domainName;
    }

    private Set<String> getDataFromGoogle(String query) {
        Set<String> result = new HashSet<>();
        String request = "https://www.google.com/search?q=" + query + "&num=20";
        System.out.println("Sending request. " + request);

        try {
            // need http protocol, set this as a Google bot agent :)
            Document doc = Jsoup
                    .connect(request)
                    .userAgent("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")
                    .timeout(5000)
                    .get();

            // get all links
            Elements links = doc.select("a[href]");
            for (Element link : links) {
                String temp = link.attr("href");
                if (temp.startsWith("/url?q=")) {
                    // use the regex to keep only the domain name
                    result.add(getDomainName(temp));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        return result;
    }
}

Output

Sending request. https://www.google.com/search?q=mario&num=20
www.imdb.com
www.mariobatali.com
www.freemario.org
www.mariogames.be
mario.wikia.com
stabyourself.net
webcache.googleusercontent.com
www.youtube.com
www.huffingtonpost.com
www.mariowiki.com
mario.lancashire.gov.uk
amirulhafiz.deviantart.com
www.mariohugo.com
mariofoods.com
mario.nintendo.com
www.mario2u.com
www.botta.ch
en.wikipedia.org
www.mariotestino.com
www.hubmario.com
www.mariolemieux.org
pouetpu.pbworks.com
23
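The domain filtering step can be tried in isolation, without sending anything to Google. The sketch below applies a domain-name pattern to a few hand-written hrefs. The class name `DomainFilterDemo`, the sample hrefs, and the length quantifiers `{0,61}` and `{2,6}` in the pattern are assumptions made for illustration, modeled on the example's `DOMAIN_NAME_PATTERN`.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-alone version of the domain filter; needs no network access. */
final class DomainFilterDemo {
    // Assumed pattern: dot-separated labels followed by a 2-6 letter top-level domain.
    private static final Pattern DOMAIN = Pattern.compile(
            "([a-zA-Z0-9]([a-zA-Z0-9\\-]{0,61}[a-zA-Z0-9])?\\.)+[a-zA-Z]{2,6}");

    /** Collects the lower-cased domain of every href that starts with "/url?q=". */
    static Set<String> domains(List<String> hrefs) {
        Set<String> result = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (String href : hrefs) {
            if (!href.startsWith("/url?q=")) {
                continue; // skip navigation and other internal links
            }
            Matcher m = DOMAIN.matcher(href);
            if (m.find()) {
                result.add(m.group().toLowerCase());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written sample hrefs, not real Google output.
        List<String> hrefs = Arrays.asList(
                "/url?q=http://www.imdb.com/title/tt0338512/&sa=U",
                "/url?q=https://en.wikipedia.org/wiki/Mario&sa=U",
                "/url?q=http://www.imdb.com/name/nm0000001/&sa=U",
                "/search?q=mario&start=10");
        System.out.println(domains(hrefs)); // prints [www.imdb.com, en.wikipedia.org]
    }
}
```

A set is used so that several results from the same site collapse into one entry, which is why the example prints fewer domains than the number of links it visits.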


Google Search from Java Program Example



Some time back I was looking for a way to search Google from a Java program. I was surprised to find that Google once had a web search API, but it was deprecated long ago and there is now no standard way to do this. Basically, a Google search is an HTTP GET request in which the query is a parameter of the URL, and we have seen earlier that options such as Java HttpURLConnection or Apache HttpClient can send such a request. The harder part is parsing the HTML response and extracting the useful information from it. That's why I chose jsoup, an open source HTML parser that can also fetch HTML from a given URL. Below is a simple program that fetches Google search results and parses them to extract the result links.

package com.journaldev.jsoup;

import java.io.IOException;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchJava {

    public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

    public static void main(String[] args) throws IOException {
        // Taking search term input from console
        Scanner scanner = new Scanner(System.in);
        System.out.println("Please enter the search term.");
        String searchTerm = scanner.nextLine();
        System.out.println("Please enter the number of results. Example: 5 10 20");
        int num = scanner.nextInt();
        scanner.close();

        String searchURL = GOOGLE_SEARCH_URL + "?q=" + searchTerm + "&num=" + num;
        // without proper User-Agent, we will get 403 error
        Document doc = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();

        // below will print HTML data, save it to a file and open in browser to compare
        // System.out.println(doc.html());

        // If the Google search results HTML changes, the selector below
        // (links inside h3 elements with class "r") must be changed accordingly
        Elements results = doc.select("h3.r > a");

        for (Element result : results) {
            String linkHref = result.attr("href");
            String linkText = result.text();
            // href has the form /url?q=<target>&...; substring(6, ...) keeps the '=' before the target
            System.out.println("Text::" + linkText + ", URL::" + linkHref.substring(6, linkHref.indexOf("&")));
        }
    }
}
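The program concatenates the search term into the URL as typed, so a term containing spaces or `&` would produce a malformed query string. The sketch below shows one way to encode the term first with the JDK's `URLEncoder` (the `Charset` overload needs Java 10 or later). The class name `SearchUrlBuilder` and the method `build` are hypothetical names introduced for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the search URL with an encoded query. */
final class SearchUrlBuilder {
    static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

    static String build(String searchTerm, int num) {
        // URLEncoder applies form encoding: spaces become '+', '&' becomes %26.
        String encoded = URLEncoder.encode(searchTerm, StandardCharsets.UTF_8);
        return GOOGLE_SEARCH_URL + "?q=" + encoded + "&num=" + num;
    }

    public static void main(String[] args) {
        System.out.println(build("super mario", 20));
        // prints https://www.google.com/search?q=super+mario&num=20
        System.out.println(build("mario & luigi", 5));
        // prints https://www.google.com/search?q=mario+%26+luigi&num=5
    }
}
```

The returned string can be passed to `Jsoup.connect(...)` in place of the hand-built `searchURL`.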

[Image: Google search results in a browser]

Below is sample output from the program above. I saved the HTML to a file and opened it in a browser to confirm that the output is what we wanted. Compare the output with the image.

Please enter the search term.
journaldev
Please enter the number of results. Example: 5 10 20
20
Text::JournalDev, URL::=https://www.journaldev.com/
Text::Java Interview Questions, URL::=https://www.journaldev.com/java-interview-questions
Text::Java design patterns, URL::=https://www.journaldev.com/tag/java-design-patterns
Text::Tutorials, URL::=https://www.journaldev.com/tutorials
Text::Java servlet, URL::=https://www.journaldev.com/tag/java-servlet
Text::Spring Framework Tutorial ..., URL::=https://www.journaldev.com/2888/spring-tutorial-spring-core-tutorial
Text::Java Design Patterns PDF ..., URL::=https://www.journaldev.com/6308/java-design-patterns-pdf-ebook-free-download-130-pages
Text::Pankaj Kumar (@JournalDev) | Twitter, URL::=https://twitter.com/journaldev
Text::JournalDev | Facebook, URL::=https://www.facebook.com/JournalDev
Text::JournalDev - Chrome Web Store - Google, URL::=https://chrome.google.com/webstore/detail/journaldev/ckdhakodkbphniaehlpackbmhbgfmekf
Text::Debian -- Details of package libsystemd-journal-dev in wheezy, URL::=https://packages.debian.org/wheezy/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in wheezy ..., URL::=https://packages.debian.org/wheezy-backports/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in sid, URL::=https://packages.debian.org/sid/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in jessie, URL::=https://packages.debian.org/jessie/libsystemd-journal-dev
Text::Ubuntu – Details of package libsystemd-journal-dev in trusty, URL::=https://packages.ubuntu.com/trusty/libsystemd-journal-dev
Text::libsystemd-journal-dev : Utopic (14.10) : Ubuntu - Launchpad, URL::=https://launchpad.net/ubuntu/utopic/%2Bpackage/libsystemd-journal-dev
Text::Debian -- Details of package libghc-libsystemd-journal-dev in jessie, URL::=https://packages.debian.org/jessie/libghc-libsystemd-journal-dev
Text::Advertise on JournalDev | BuySellAds, URL::=https://buysellads.com/buy/detail/231824
Text::JournalDev | LinkedIn, URL::=https://www.linkedin.com/groups/JournalDev-6748558
Text::How to install libsystemd-journal-dev package in Ubuntu Trusty, URL::=https://www.howtoinstall.co/en/ubuntu/trusty/main/libsystemd-journal-dev/
Text::[global] auth supported = cephx ms bind ipv6 = true [mon] mon data ..., URL::=https://zooi.widodh.nl/ceph/ceph.conf
Text::UbuntuUpdates - Package "libsystemd-journal-dev" (trusty 14.04), URL::=https://www.ubuntuupdates.org/libsystemd-journal-dev
Text::[Journal]Dev'err - Cursus Honorum - Enjin, URL::=https://cursushonorum.enjin.com/holonet/m/23958869/viewthread/13220130-journaldeverr/post/last
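Each URL in the output starts with `=` because the prefix `/url?q=` is seven characters long, while `substring(6, ...)` begins at the sixth index, which is the `=` itself. The sketch below shows one way to strip the whole prefix, tolerate a missing `&`, and decode percent-escapes such as `%2B`. The class name `ResultLinkParser`, the method `targetUrl`, and the sample hrefs (including the `sa=U` parameter) are assumptions made for illustration. The `Charset` overload of `URLDecoder.decode` needs Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that extracts the target URL from a "/url?q=..." href. */
final class ResultLinkParser {
    private static final String PREFIX = "/url?q=";

    /** Returns the decoded target URL, or null if the href is not a result redirect. */
    static String targetUrl(String href) {
        if (href == null || !href.startsWith(PREFIX)) {
            return null;
        }
        int end = href.indexOf('&', PREFIX.length());
        // Without a trailing '&', the target runs to the end of the href.
        String raw = end >= 0
                ? href.substring(PREFIX.length(), end)
                : href.substring(PREFIX.length());
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hand-written sample hrefs, not captured from Google.
        System.out.println(targetUrl("/url?q=https://www.journaldev.com/&sa=U"));
        // prints https://www.journaldev.com/
        System.out.println(targetUrl(
                "/url?q=https://launchpad.net/ubuntu/utopic/%2Bpackage/libsystemd-journal-dev&sa=U"));
        // prints https://launchpad.net/ubuntu/utopic/+package/libsystemd-journal-dev
    }
}
```

Returning null for hrefs without the prefix lets the caller skip navigation links instead of failing with an index error.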

That's all for Google search in a Java program. Use it cautiously: if Google detects unusual traffic from your computer, chances are it will block you.
