Wikidata Query Service in Python

We'll use SPARQLWrapper to query Wikidata through the Wikidata Query Service (WDQS), and pandas (which is probably already installed) to work with the results:

pip install sparqlwrapper

SPARQLWrapper needs to be pointed at WDQS's SPARQL endpoint ("https://query.wikidata.org/bigdata/namespace/wdq/sparql" or its shorter alias "https://query.wikidata.org/sparql"):

In [38]:
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd
In [42]:
sparql = SPARQLWrapper("https://query.wikidata.org/sparql")
In [46]:
# From https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Cats
sparql.setQuery("""
SELECT ?item ?itemLabel
WHERE
{
    ?item wdt:P31 wd:Q146 .
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
In [47]:
# json_normalize lives at the top level since pandas 1.0; the pd.io.json path is deprecated
results_df = pd.json_normalize(results['results']['bindings'])
results_df[['item.value', 'itemLabel.value']].head()
Out[47]:
                                 item.value itemLabel.value
0  http://www.wikidata.org/entity/Q25393350           Tomba
1  http://www.wikidata.org/entity/Q25471040           Pixel
2  http://www.wikidata.org/entity/Q27190410       Gladstone
3  http://www.wikidata.org/entity/Q27739753    Sister Cream
4  http://www.wikidata.org/entity/Q27744042             Bob
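The `.value` suffix on the column names comes from the SPARQL JSON results format, in which every bound variable is a small dict with `type` and `value` keys. As a rough sketch of what the flattening does, the block below builds a tidier frame by hand and pulls the bare QID out of each entity URI. It runs offline on a hand-written sample that mirrors the first two rows above; `bindings_to_frame` and the `qid` column are illustrative names, not part of SPARQLWrapper or pandas:

```python
import pandas as pd

# Hand-written sample in the SPARQL JSON results shape (two rows from the output above).
sample_results = {
    "head": {"vars": ["item", "itemLabel"]},
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q25393350"},
                "itemLabel": {"type": "literal", "xml:lang": "en", "value": "Tomba"},
            },
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q25471040"},
                "itemLabel": {"type": "literal", "xml:lang": "en", "value": "Pixel"},
            },
        ]
    },
}


def bindings_to_frame(results):
    """Keep only each binding's 'value', one column per query variable."""
    variables = results["head"]["vars"]
    rows = [
        # A variable can be unbound in a given row (e.g. with OPTIONAL), hence .get()
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in results["results"]["bindings"]
    ]
    return pd.DataFrame(rows, columns=variables)


tidy_df = bindings_to_frame(sample_results)
# The QID is the last path segment of the entity URI.
tidy_df["qid"] = tidy_df["item"].str.rsplit("/", n=1).str[-1]
print(tidy_df[["qid", "itemLabel"]])
```

The same function should accept the `results` dict returned by `sparql.query().convert()`, since it relies only on the `head` and `results` keys of that format.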