The Ultimate Guide to PyTrends: the Google Trends API (with Python code examples)


Frequently Asked Questions

Why use the Google Trends API instead of the Google Trends Web interface?

There is nothing wrong with using the web interface; however, for a large-scale project that requires building a large dataset, it quickly becomes cumbersome. Manually researching and copying data from the Google Trends site is a tedious, time-intensive process. Using an API cuts this time and effort dramatically.

Are there any limitations to using the Pytrends Google Trends API?

Yes, there are. Before you begin you must be aware of these few things:
1) Search terms and topics are two different things.
2) The results are disproportionate.
3) There are keyword length limitations.
4) All data is relative, not absolute.
5) The categories are unreliable at best.
6) You can only provide five entries per chart.

What do Google Trends values actually denote?

According to Google Trends, values are calculated on a scale from 0 to 100, where 100 is the location with the highest popularity as a fraction of total searches in that location. A value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where there was not enough data for the term.
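
A quick way to internalize this scale is a hypothetical sketch (not real pytrends output) of how raw search shares would be rescaled so the peak becomes 100. The numbers below are made up for illustration.

```python
# Hypothetical illustration (not real pytrends output): a relative
# 0-100 scale rescales raw proportions so the largest becomes 100.
raw_share = {'US': 0.008, 'UK': 0.004, 'DE': 0.002}  # made-up search shares

peak = max(raw_share.values())
scaled = {region: round(share / peak * 100) for region, share in raw_share.items()}

print(scaled)  # {'US': 100, 'UK': 50, 'DE': 25}
```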

Explore search data at scale

Google Trends is a public platform that you can use to analyze interest over time for a given topic, search term, or even a company.

Pytrends is an unofficial Google Trends API that provides different methods to download reports of trending results from Google Trends. The Python package can be used to automate processes such as quickly fetching data for later analysis.

In this article, I will share some insights on what you can do with Pytrends and how to do basic data pulls, providing snippets of Python code along the way. I will also answer some FAQs about Google Trends and, most importantly, address the limitations of the API and its data.

What data can you pull with the Google Trends API? 

Related to a particular keyword you provide to the API, you can pull the following data:

  • Interest Over Time
  • Historical Hourly Interest
  • Interest by Region
  • Related Topics
  • Related Queries
  • Trending Searches
  • Top Charts
  • Keyword Suggestions

We will explore the different methods available in the API for pulling this data in a bit, along with what the syntax for each of these methods looks like.

What parameters can you specify in your queries?

There are two objects that you can specify parameters for (the example below uses the Node.js google-trends-api package):

  • optionsObject
  • callback 

The callback is an optional function, where the first parameter is an error and the second parameter is the result. If no callback is provided, then a promise is returned.

const googleTrends = require('google-trends-api');

googleTrends.apiMethod(optionsObject, [callback])

The optionsObject is an object with the following options keys:

  • keyword (required) — target search term(s), a string or an array of strings
  • startTime — start of the time period of interest (a new Date() object). If startTime is not provided, January 1, 2004 is assumed, as this is the oldest available Google Trends data
  • endTime — end of the time period of interest (a new Date() object). If endTime is not provided, the current date is used
  • geo — location of interest (a string, or an array if you wish to provide separate locations for each keyword)
  • hl — preferred language (string, defaults to English)
  • timezone — timezone (number, defaults to the host system's offset from UTC in minutes)
  • category — the category to search within (number, defaults to all categories)
  • property — Google property to filter on; defaults to web search (enumerated string [‘images’, ‘news’, ‘youtube’ or ‘froogle’], the latter relating to Google Shopping results)
  • resolution — granularity of the geo search (enumerated string [‘COUNTRY’, ‘REGION’, ‘CITY’, ‘DMA’]); resolution is specific to the interestByRegion method
  • granularTimeResolution — boolean that dictates whether the results should be given in a finer time resolution (if startTime and endTime are less than one day apart, this should be set to true)
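
Note that the options above belong to the Node.js google-trends-api package (as the require statement shows). In Pytrends, the rough equivalents are passed to build_payload; the mapping below is a sketch, with the keyword 'Python' chosen arbitrarily.

```python
# Rough Pytrends equivalents of the Node.js options above (sketch).
payload_kwargs = {
    'kw_list': ['Python'],                 # ≈ keyword
    'cat': 0,                              # ≈ category (0 = all categories)
    'timeframe': '2020-01-01 2020-12-31',  # ≈ startTime / endTime
    'geo': 'US',                           # ≈ geo
    'gprop': '',                           # ≈ property ('' = web search)
}

# With a live connection this would be used as:
#   pytrend = TrendReq(hl='en-US', tz=360)  # hl ≈ hl, tz ≈ timezone
#   pytrend.build_payload(**payload_kwargs)
print(payload_kwargs['timeframe'])
```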

Are there any limitations to using the Pytrends Google Trends API? 

Yes, there are. Before you begin you must be aware of these few things:

1. Search terms and topics are two different things

Search terms and Topics are measured differently, so relatedTopics will not work with comparisons that contain both Search terms and Topics.

This leads to duplicate entries. 

This is something easily observable in the Google Trends UI, which sometimes offers several topics for the same phrase.

2. Disproportionate results

When using the interestByRegion method, a higher value means a higher proportion of all queries, not a higher absolute query count.

So a small country where 80% of the queries are for “Google” will get twice the score of a giant country where only 40% of the queries are for that term.
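
The arithmetic behind that example, with toy numbers:

```python
# Toy numbers showing why a small country can outscore a huge one:
# the score reflects the share of all queries, not the absolute count.
small = {'total_queries': 1_000, 'google_queries': 800}          # 80% share
large = {'total_queries': 1_000_000, 'google_queries': 400_000}  # 40% share

small_share = small['google_queries'] / small['total_queries']
large_share = large['google_queries'] / large['total_queries']

print(small_share / large_share)  # 2.0 — twice the score despite 500x fewer searches
```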

3. Keyword Length Limitations

Google returns a response with code 400 when a keyword is > 100 characters.
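
A simple defensive check saves a wasted request. The helper below is not part of Pytrends, just a sketch you could run before building a payload.

```python
# Guard against the 100-character limit before querying (hypothetical
# helper, not part of pytrends).
MAX_KEYWORD_LEN = 100

def validate_keywords(kw_list):
    """Raise if any keyword would trigger Google's HTTP 400 response."""
    too_long = [kw for kw in kw_list if len(kw) > MAX_KEYWORD_LEN]
    if too_long:
        raise ValueError(f'Keyword(s) over {MAX_KEYWORD_LEN} characters: {too_long}')
    return kw_list

validate_keywords(['Facebook', 'Apple'])  # passes silently
```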

4. All data is relative, not absolute

The data Google Trends shows you is relative, not absolute. Forbes Baxter Associates explains this neatly:

Look at the chart for searches in 2019. When you see the red line on the chart reaching 100 in about June, it doesn’t mean there were 100 searches for that term in June. It means that was the most popular search in 2019 and that it hit its peak in June.

5. The categories are unreliable at best.

There are some top-level categories, but they are not representative of the real interest and data. 

There are cases where the categories and the data don't reflect real-life behavior; this may be due to a lack of understanding on the searcher's part, falsely attributed intent, or an algorithm bug.

Another limitation is that you can only pick one category. But if you need to choose more than one due to a discrepancy between the data in the two categories, then this becomes a challenge for the next steps in data consolidation, visualization, and analysis.

6. You can only provide five entries per chart.

This can be really annoying. If you are using the API for professional purposes, such as analyzing a particular market, this makes the reporting really challenging. 

Most markets have more than five competitors in them. Most topics have more than five keywords in them. Comparisons need context in order to work. 
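
One common workaround (not built into Pytrends) is to query the terms in batches of four plus one shared "anchor" keyword, then rescale each batch against the anchor's values. The brand names below are arbitrary examples.

```python
# Split an arbitrarily long keyword list into five-term groups that
# share one anchor keyword (a common workaround, not a pytrends feature).
def batch_with_anchor(keywords, anchor, batch_size=4):
    """Return pytrends-sized keyword groups, each led by the anchor term."""
    return [[anchor] + keywords[i:i + batch_size]
            for i in range(0, len(keywords), batch_size)]

brands = ['Adidas', 'Puma', 'Reebok', 'Asics', 'New Balance', 'Fila']
batches = batch_with_anchor(brands, anchor='Nike')
print(batches)
```

Because the anchor appears in every batch, its scores provide a common denominator for rescaling the other terms onto one comparable scale.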

What API Methods are available with the Google Trends API?

The following API methods are available:

autoComplete

Returns the results from the “Add a search term” input box in the Google Trends UI.

#install pytrends
!pip install pytrends
#import the libraries
import pandas as pd
from pytrends.request import TrendReq
pytrend = TrendReq()
# Get Google Keyword Suggestions
keywords = pytrend.suggestions(keyword='Facebook')
df = pd.DataFrame(keywords)
df.head(5)

dailyTrends 

Daily Search Trends highlights searches that jumped significantly in traffic among all searches over the past 24 hours and updates hourly. 

These trends highlight specific queries that were searched, and an absolute number of searches made. 

20 daily trending search results are returned. Here, a retroactive search for up to 15 days back can also be performed. 

#install pytrends
!pip install pytrends
#import the libraries
import pandas as pd
from pytrends.request import TrendReq
pytrend = TrendReq()

#get today's trending topics
trendingtoday = pytrend.today_searches(pn='US')
trendingtoday.head(20)

You can also get the topics that were trending historically, for instance for a particular year.

# Get Google Top Charts
df = pytrend.top_charts(2020, hl='en-US', tz=300, geo='GLOBAL')
df.head()


interestOverTime

Numbers represent search interest relative to the highest point on the chart for the given region and time. 

If you use multiple keywords for comparison, the return data will also contain an average result for each keyword.
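
The per-keyword averages can be recomputed with pandas. Here is a minimal sketch on a mock frame shaped like the interest_over_time() result (real data would come from pytrend.interest_over_time() after build_payload; the values here are invented):

```python
import pandas as pd

# Mock frame shaped like an interest_over_time() result: one column per
# keyword, indexed by date (values invented for illustration).
data = pd.DataFrame(
    {'Facebook': [80, 100, 90], 'Apple': [40, 50, 60]},
    index=pd.date_range('2021-01-03', periods=3, freq='W'),
)

print(data.mean())  # average relative interest per keyword
```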

You can check the regional interest for multiple search terms.

#import the libraries
import pandas as pd                        
from pytrends.request import TrendReq
pytrend = TrendReq()

#provide your search terms
kw_list=['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

#search interest per region
#run model for keywords (can also be competitors)
pytrend.build_payload(kw_list, timeframe='today 1-m')

# Interest by Region
regiondf = pytrend.interest_by_region()
#looking at rows where all values are not equal to 0
regiondf = regiondf[(regiondf != 0).all(1)]

#drop all rows that have null values in all columns
regiondf.dropna(how='all',axis=0, inplace=True)

#visualise
regiondf.plot(figsize=(20, 12), y=kw_list, kind ='bar')

You can also get historical interest by specifying a time period. 

#historical interest
historicaldf = pytrend.get_historical_interest(
    kw_list,
    year_start=2020, month_start=10, day_start=1, hour_start=0,
    year_end=2021, month_end=10, day_end=1, hour_end=0,
    cat=0, geo='', gprop='', sleep=0)

#visualise
#plot a timeseries chart
historicaldf.plot(figsize=(20, 12))

#plot separate graphs, using the provided keywords
historicaldf.plot(subplots=True, figsize=(20, 12))

This has to be my favorite one, as it enables additional projects such as forecasting, calculating share of search (if using competitors as input), and other mini-projects.
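
As a taste of the share-of-search idea: divide each keyword's interest by the row total so every period sums to 100%. A sketch on invented numbers, assuming a frame shaped like historicaldf (one column per keyword):

```python
import pandas as pd

# Share of search: each keyword's fraction of combined interest per
# period (numbers invented; a real frame would come from pytrends).
df = pd.DataFrame({'Facebook': [60, 80], 'Apple': [20, 20], 'Google': [20, 100]})

share = df.div(df.sum(axis=1), axis=0) * 100
print(share)  # rows now sum to 100
```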

interestByRegion 

This allows examining search term popularity based on location during the specified time frame.

Values are calculated on a scale from 0 to 100, where 100 is the location with the highest popularity as a fraction of total searches in that location. A value of 50 indicates a location that is half as popular, and a value of 0 indicates a location where the term was less than 1% as popular as the peak.

#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd                        
from pytrends.request import TrendReq

#create model
pytrend = TrendReq()

#provide your search terms
kw_list=['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']

#get interest by region for your search terms
pytrend.build_payload(kw_list=kw_list)
df = pytrend.interest_by_region()
df.head(10)

realtimeTrends

Realtime Search Trends highlight stories that are trending across Google surfaces within the last 24 hours and are updated in real-time. 

#install pytrends
!pip install pytrends
#import the libraries
import pandas as pd                        
from pytrends.request import TrendReq
pytrend = TrendReq()
# Get realtime Google Trends data
df = pytrend.realtime_trending_searches(pn='US')
df.head()

relatedQueries

Users searching for your term also searched for these queries. The following metrics are returned:

  • Top — The most popular search queries. Scoring is on a relative scale where a value of 100 is the most commonly searched query, 50 is a query searched half as often, and a value of 0 is a query searched for less than 1% as often as the most popular query.
  • Rising — Queries with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these queries are new and had few (if any) prior searches.

Check out the full code in the Colab link.

#install pytrends
!pip install pytrends

#import the libraries
import pandas as pd                        
from pytrends.request import TrendReq
from google.colab import files

#build model
pytrend = TrendReq()

#provide your search terms
kw_list=['Facebook', 'Apple', 'Amazon', 'Netflix', 'Google']
pytrend.build_payload(kw_list=kw_list)


#get related queries
related_queries = pytrend.related_queries()
related_queries.values()

#extract the top and rising dataframes for the first keyword

top = list(related_queries.values())[0]['top']
rising = list(related_queries.values())[0]['rising']

#copy into fresh dataframes

dftop = pd.DataFrame(top)
dfrising = pd.DataFrame(rising)

#join two data frames
joindfs = [dftop, dfrising]
allqueries = pd.concat(joindfs, axis=1)

#rename duplicate column names so the two frames can be told apart

cols=pd.Series(allqueries.columns)
for dup in allqueries.columns[allqueries.columns.duplicated(keep=False)]: 
    cols[allqueries.columns.get_loc(dup)] = ([dup + '.' + str(d_idx) 
                                     if d_idx != 0 
                                     else dup 
                                     for d_idx in range(allqueries.columns.get_loc(dup).sum())]
                                    )
allqueries.columns=cols

#rename to proper names

allqueries.rename({'query': 'top query', 'value': 'top query value', 'query.1': 'related query', 'value.1': 'related query value'}, axis=1, inplace=True) 

#check your dataset
allqueries.head(50)

#save to csv
allqueries.to_csv('allqueries.csv')

#download from Colab
files.download("allqueries.csv")

relatedTopics

Users searching for your term also searched for these topics. The following metrics are returned:

  • Top — The most popular topics. Scoring is on a relative scale where a value of 100 is the most commonly searched topic, a value of 50 is a topic searched half as often, and a value of 0 is a topic searched for less than 1% as often as the most popular topic.
  • Rising — Related topics with the biggest increase in search frequency since the last time period. Results marked “Breakout” had a tremendous increase, probably because these topics are new and had few (if any) prior searches.

The syntax here is the same as above, changing only the two rows where related_queries is mentioned:

# Related Topics, returns a dictionary of dataframes
related_topic = pytrend.related_topics()
related_topic.values()
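
related_topics() returns a dictionary keyed by keyword, where each value holds 'top' and 'rising' DataFrames. Here is a mocked sketch of unpacking that shape (the rows and column values are illustrative, not real results):

```python
import pandas as pd

# Mocked result matching the {keyword: {'top': df, 'rising': df}} shape
# that related_topics() returns (values invented for illustration).
related_topic = {
    'Facebook': {
        'top': pd.DataFrame({'topic_title': ['Facebook'], 'value': [100]}),
        'rising': pd.DataFrame({'topic_title': ['Meta'], 'value': [250]}),
    }
}

top = related_topic['Facebook']['top']
rising = related_topic['Facebook']['rising']
print(top['topic_title'].tolist(), rising['topic_title'].tolist())
```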

What to look out for next… 

Hope you enjoyed this exploration. 

You can find all of the code compiled into one Colab notebook below ( ⬇️ Scroll down to the bottom of the page to view 🚀)

Thanks for reading. Check out these resources created by brilliant people: