Frequently Asked Questions
Why use the Google Trends API instead of the Google Trends Web interface?
There is no problem with just using the web interface. However, when working on a large-scale project that requires building a large dataset, this becomes very cumbersome. Manually researching and copying data from the Google Trends site is a labor- and time-intensive process. Using an API cuts this time and effort dramatically.
Are there any limitations to using the Pytrends Google Trends API?
Yes, there are. Before you begin, you should be aware of a few things:
1) Search terms and topics are two different things.
2) The results are disproportionate.
3) There are keyword length limitations.
4) All data is relative, not absolute.
5) The categories are unreliable at best.
6) You can only provide five entries per chart.
What do Google Trends values actually denote?
According to Google Trends, the values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, a value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where there was not enough data for the term.
Explore search data at scale
Google Trends is a public platform that you can use to analyze interest over time for a given topic, search term, or even company.
Pytrends is an unofficial Google Trends API that provides different methods to download reports of trending results from Google Trends. The Python package can be used to automate processes such as quickly fetching data for further analysis.
In this article, I will share some insights on what you can do with Pytrends and how to do basic data pulls, providing snippets of Python code along the way. I will also answer some FAQs about Google Trends and, most importantly, address the limitations of using the API and the data.
What data can you pull with the Google Trends API?
For any particular keyword you provide to the API, you can pull the following data:
- Interest Over Time
- Historical Hourly Interest
- Interest by Region
- Related Topics
- Related Queries
- Trending Searches
- Top Charts
- Keyword Suggestions
We will explore the different methods available in the API for pulling this data in a bit, along with what the syntax for each of these methods looks like.
What parameters can you specify in your queries?
There are two objects that you can specify parameters for:
- optionsObject
- callback
The callback is an optional function, where the first parameter is an error and the second parameter is the result. If no callback is provided, then a promise is returned.
const googleTrends = require('google-trends-api');
googleTrends.apiMethod(optionsObject, [callback])
The optionsObject is an object with the following options keys:
- keyword (required) — Target search term(s) (string or array)
- startTime — Start of the time period of interest (new Date() object). If startTime is not provided, January 1, 2004 is assumed, as this is the oldest available Google Trends data.
- endTime — End of the time period of interest (new Date() object). If endTime is not provided, the current date is selected.
- geo — Location of interest (string, or array if you wish to provide separate locations for each keyword).
- hl — Preferred language (string; defaults to English).
- timezone — Timezone (number; defaults to the time zone difference, in minutes, from UTC to the current locale, based on host system settings).
- category — The category to search within (number; defaults to all categories).
- property — Google property to filter on (enumerated string: ‘images’, ‘news’, ‘youtube’ or ‘froogle’, the latter relating to Google Shopping results). Defaults to a web search.
- resolution — Granularity of the geo search (enumerated string: ‘COUNTRY’, ‘REGION’, ‘CITY’, ‘DMA’). resolution is specific to the interestByRegion method.
- granularTimeResolution — Boolean that dictates whether the results should be given in a finer time resolution (if the span between startTime and endTime is less than one day, this should be set to true).
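If you are following along in Python with Pytrends rather than the Node.js google-trends-api, roughly the same parameters are passed to build_payload. Here is a minimal sketch; the keyword, timeframe, geo, and category values are purely illustrative:

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

#hl and tz play the role of the hl and timezone options above
pytrend = TrendReq(hl='en-US', tz=360)

#kw_list, cat, timeframe, geo and gprop roughly correspond to
#keyword, category, startTime/endTime, geo and property
pytrend.build_payload(
    kw_list=['Facebook'],               #keyword (illustrative)
    cat=0,                              #category (0 = all categories)
    timeframe='2020-01-01 2021-01-01',  #startTime and endTime as one string
    geo='US',                           #geo
    gprop=''                            #property ('' = web search)
)

df = pytrend.interest_over_time()
df.head()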
Are there any limitations to using the Pytrends Google Trends API?
Yes, there are. Before you begin, you should be aware of a few things:
1. Search terms and topics are two different things
Search terms and Topics are measured differently, so relatedTopics will not work with comparisons that contain both Search terms and Topics.
This leads to duplicate entries.
This is something easily observable in the Google Trends UI, which sometimes offers several topics for the same phrase.
2. Disproportionate results
When using the interestByRegion method, a higher value means a higher proportion of all queries, not a higher absolute query count.
So a small country where 80% of the queries are for “Google” will get twice the score of a giant country where only 40% of the queries are for that term.
3. Keyword Length Limitations
Google returns a response with code 400 when a keyword is > 100 characters.
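As a simple precaution (this check is not part of Pytrends, just an illustrative guard), you can validate keyword length before sending the request:

#illustrative guard: Google rejects keywords longer than 100 characters with HTTP 400
keyword = 'your search term here'
if len(keyword) > 100:
    raise ValueError('Keyword exceeds the 100-character limit accepted by Google Trends')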
4. All data is relative, not absolute
The data Google Trends shows you is relative, not absolute. Forbes Baxter Associates explains this neatly:
Look at the chart for searches in 2019. When you see the red line on the chart reaching 100 in about June, it doesn’t mean there were 100 searches for that term in June. It means that was the most popular search in 2019 and that it hit its peak in June.
5. The categories are unreliable at best.
There are some top-level categories, but they are not always representative of real interest in a topic.
There are cases where the categories and the data don't reflect real-life behavior, which may be due to a lack of understanding on the searcher's part, falsely attributed intent, or an algorithm bug.
Another limitation is that you can only pick one category. If you need to compare data across more than one category because of discrepancies between them, consolidating, visualizing, and analyzing the data becomes a challenge.
6. You can only provide five entries per chart.
This can be really annoying. If you are using the API for professional purposes, such as analyzing a particular market, this makes the reporting really challenging.
Most markets have more than five competitors in them. Most topics have more than five keywords in them. Comparisons need context in order to work.
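One common workaround (not something Pytrends does for you, just a sketch that assumes every batch uses the same timeframe and that the keyword names below are illustrative) is to keep a single reference keyword in every batch of five and rescale each batch against it, so more than five terms can be compared approximately:

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

#illustrative terms: one shared reference plus more than five others
reference = 'Google'
others = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Microsoft', 'Tesla', 'IBM']

frames = []
#four new terms per request leaves room for the reference term (five max per payload)
for i in range(0, len(others), 4):
    batch = [reference] + others[i:i + 4]
    pytrend.build_payload(batch, timeframe='today 12-m')
    df = pytrend.interest_over_time().drop(columns='isPartial')
    #rescale so the reference term peaks at 100 in every batch,
    #making the batches roughly comparable
    df = df * (100 / df[reference].max())
    #keep the reference column only once
    frames.append(df if i == 0 else df.drop(columns=reference))

combined = pd.concat(frames, axis=1)
combined.head()

The rescaling only works reasonably well if the reference term has stable, non-zero interest in every batch; it is an approximation, not a substitute for absolute volumes.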
What API Methods are available with the Google Trends API?
The following API methods are available:
autoComplete
Returns the results from the “Add a search term” input box in the Google Trends UI.
#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# Get Google Keyword Suggestions
keywords = pytrend.suggestions(keyword='Facebook')
df = pd.DataFrame(keywords)
df.head(5)
dailyTrends
Daily Search Trends highlights searches that jumped significantly in traffic among all searches over the past 24 hours and updates hourly.
These trends highlight the specific queries that were searched and the absolute number of searches made.
Twenty daily trending search results are returned, and a retroactive search for up to 15 days back can also be performed.
#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

#get today's trending topics
trendingtoday = pytrend.today_searches(pn='US')
trendingtoday.head(20)
You can also get the topics that were trending historically, for instance for a particular year.
# Get Google Top Charts
df = pytrend.top_charts(2020, hl='en-US', tz=300, geo='GLOBAL')
df.head()
interestOverTime
Numbers represent search interest relative to the highest point on the chart for the given region and time.
If you use multiple keywords for comparison, the return data will also contain an average result for each keyword.
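For example, here is a minimal sketch of pulling interest over time for a couple of terms (the keywords and timeframe are illustrative):

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

#provide your search terms
kw_list = ['Facebook', 'Apple']
pytrend.build_payload(kw_list, timeframe='today 12-m')

#one column per keyword, plus an isPartial flag for incomplete periods
overtimedf = pytrend.interest_over_time()
overtimedf.head()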
You can also check regional interest for multiple search terms.
#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

#provide your search terms
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

#search interest per region
#run model for keywords (can also be competitors)
pytrend.build_payload(kw_list, timeframe='today 1-m')

# Interest by Region
regiondf = pytrend.interest_by_region()

#looking at rows where all values are not equal to 0
regiondf = regiondf[(regiondf != 0).all(1)]

#drop all rows that have null values in all columns
regiondf.dropna(how='all', axis=0, inplace=True)

#visualise
regiondf.plot(figsize=(20, 12), y=kw_list, kind='bar')
You can also get historical interest by specifying a time period.
#historical interest
historicaldf = pytrend.get_historical_interest(
    kw_list,
    year_start=2020, month_start=10, day_start=1, hour_start=0,
    year_end=2021, month_end=10, day_end=1, hour_end=0,
    cat=0, geo='', gprop='', sleep=0)

#visualise
#plot a timeseries chart
historicaldf.plot(figsize=(20, 12))

#plot separate graphs, using the provided keywords
historicaldf.plot(subplots=True, figsize=(20, 12))
This has to be my favorite one, as it enables super cool additional projects such as forecasting, calculating share of search (if using competitors as input), and other mini-projects.
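As a rough share-of-search sketch (assuming the historicaldf and kw_list from the blocks above, and that the keywords are competitors), you can divide each term's interest by the combined total per time period:

#share of search: each keyword's interest as a percentage of the combined interest
share = historicaldf[kw_list].div(historicaldf[kw_list].sum(axis=1), axis=0) * 100
share.plot(figsize=(20, 12))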
interestByRegion
This allows examining search term popularity based on location during the specified time frame.
Values are calculated on a scale from 0 to 100, where 100 is the location with the most popularity as a fraction of total searches in that location, a value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where the term was less than 1% as popular as the peak.
#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

#create model
pytrend = TrendReq()

#provide your search terms
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

#get interest by region for your search terms
pytrend.build_payload(kw_list=kw_list)
df = pytrend.interest_by_region()
df.head(10)
realtimeTrends
Realtime Search Trends highlight stories that are trending across Google surfaces within the last 24 hours and are updated in real-time.
#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd
from pytrends.request import TrendReq

pytrend = TrendReq()

# Get realtime Google Trends data
df = pytrend.trending_searches(pn='united_states')
df.head()
relatedQueries
Users searching for your term also searched for these queries. The following metrics are returned:
- Top — The most popular search queries. Scoring is on a relative scale where a value of 100 is the most commonly searched query, 50 is a query searched half as often, and a value of 0 is a query searched for less than 1% as often as the most popular query.
- Rising — Queries with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these queries are new and had few (if any) prior searches.
Check out the full code in the Colab link.
#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd
from pytrends.request import TrendReq
from google.colab import files

#build model
pytrend = TrendReq()

#provide your search terms
kw_list = ['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']
pytrend.build_payload(kw_list=kw_list)

#get related queries
related_queries = pytrend.related_queries()
related_queries.values()

#build lists dataframes
top = list(related_queries.values())[0]['top']
rising = list(related_queries.values())[0]['rising']

#convert lists to dataframes
dftop = pd.DataFrame(top)
dfrising = pd.DataFrame(rising)

#join two data frames
joindfs = [dftop, dfrising]
allqueries = pd.concat(joindfs, axis=1)

#rename duplicate column names
cols = pd.Series(allqueries.columns)
for dup in allqueries.columns[allqueries.columns.duplicated(keep=False)]:
    cols[allqueries.columns.get_loc(dup)] = ([dup + '.' + str(d_idx) if d_idx != 0 else dup
                                              for d_idx in range(allqueries.columns.get_loc(dup).sum())])
allqueries.columns = cols

#rename to proper names
allqueries.rename({'query': 'top query', 'value': 'top query value',
                   'query.1': 'related query', 'value.1': 'related query value'},
                  axis=1, inplace=True)

#check your dataset
allqueries.head(50)

#save to csv
allqueries.to_csv('allqueries.csv')

#download from Colab
files.download("allqueries.csv")
relatedTopics
Users searching for your term also searched for these topics. The following metrics are returned:
- Top — The most popular topics. Scoring is on a relative scale where a value of 100 is the most commonly searched topic, a value of 50 is a topic searched half as often, and a value of 0 is a topic searched for less than 1% as often as the most popular topic.
- Rising — Related topics with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these topics are new and had few (if any) prior searches.
The syntax here is the same as above; only the two rows that mention related_queries change:
# Related Topics, returns a dictionary of dataframes
related_topic = pytrend.related_topics()
related_topic.values()
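To drill into the result (a sketch assuming the related_topic dictionary from the block above), you can pull the 'top' dataframe for the first keyword in your payload:

#related_topics() returns a dictionary keyed by keyword,
#each value holding 'top' and 'rising' dataframes
first_kw = list(related_topic.keys())[0]
top_topics = related_topic[first_kw]['top']
top_topics.head()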
What to look out for next…
Hope you enjoyed this exploration.
You can find all of the code compiled into one Colab notebook below ( ⬇️ Scroll down to the bottom of the page to view 🚀)
Thanks for reading. Check out these resources created by brilliant people:
- Google Trends API exploration: Google Trends API for Python
- Script as a function for getting daily search data using Pytrends: pytrends/dailydata.py at master · GeneralMills/pytrends