Explanation why one is not able to use PyOData (from SAP) to query Seattle’s open data
Here I demonstrate why trying to use PyOData library (from SAP) to query the Seattle open data’s OData endpoint is going to give you nightmares.
The PyOData library seems to make the assumption that the XML atom title and href have the same value (which is the case with the Northwind example dataset), however, Seattle’s open data OData endpoint assigns a unique SODA identifier as the href and use the atom title to document the dataset name.
This misalignment is not picked up by the PyOData parser and thus whey attempting to query the endpoint of a specific dataset, the title of the Seattle open data dataset will be used instead of the unique identifier that is logged in the XML href field. The result is an HTTP 400 response code.
Resolution
Use the SODA API endpoint instead. It returns JSON as well, which is my preferred data format.
Using PyOData to query Northwind data
No problem here, we are able to query the OData endpoint and query the Customers
entity.
svc = get_pyodata_service_northwind()
print(type(svc))
print()
entity_proxy = getattr(svc.entity_sets, "Customers")
print(type(entity_proxy))
print()
for item in entity_proxy.get_entities().execute():
print(f"{item.CustomerID}: {item.CompanyName} - {item.ContactName}")
Output:
<class 'pyodata.v2.service.Service'>
<class 'pyodata.v2.service.EntitySetProxy'>
ALFKI: Alfreds Futterkiste - Maria Anders
ANATR: Ana Trujillo Emparedados y helados - Ana Trujillo
ANTON: Antonio Moreno Taquería - Antonio Moreno
AROUT: Around the Horn - Thomas Hardy
BERGS: Berglunds snabbköp - Christina Berglund
BLAUS: Blauer See Delikatessen - Hanna Moos
BLONP: Blondesddsl père et fils - Frédérique Citeaux
BOLID: Bólido Comidas preparadas - Martín Sommer
BONAP: Bon app - Laurence Lebihan
BOTTM: Bottom-Dollar Markets - Elizabeth Lincoln
BSBEV: B's Beverages - Victoria Ashworth
CACTU: Cactus Comidas para llevar - Patricio Simpson
CENTC: Centro comercial Moctezuma - Francisco Chang
CHOPS: Chop-suey Chinese - Yang Wang
COMMI: Comércio Mineiro - Pedro Afonso
CONSH: Consolidated Holdings - Elizabeth Brown
DRACD: Drachenblut Delikatessen - Sven Ottlieb
DUMON: Du monde entier - Janine Labrune
EASTC: Eastern Connection - Ann Devon
ERNSH: Ernst Handel - Roland Mendel
[error] Using PyOData to query Seattle Open Data
Attempting to do the same throws error because the entity_proxy attempts to use the atom title field value (e.g., “Building Certificates of Occupancy”), but the dataset is exposed at the endpoint using the unique ID (e.g., “axkr-2p68”).
When trying to query the entity using the unique ID, the query also failed because the PyOData parser picked up the atom xml title instead of the href for the identifier for each dataset.
svc = get_pyodata_service_spd()
print(type(svc))
print()
entity_proxy = getattr(svc.entity_sets, "Building Certificates of Occupancy")
print(type(entity_proxy))
print()
try:
entity_proxy.get_entities().execute()
except BaseException as e:
print("HTTP error code 400~~")
print()
try:
entity_proxy = getattr(svc.entity_sets, "axkr-2p68")
print(type(entity_proxy))
print()
except AttributeError as e:
print("Unable to get 'axkr-2p68'~~")
print("Error msg: EntitySet axkr-2p68 not defined in...")
print()
Output:
<class 'pyodata.v2.service.Service'>
<class 'pyodata.v2.service.EntitySetProxy'>
HTTP error code 400~~
Unable to get 'axkr-2p68'~~
Error msg: EntitySet axkr-2p68 not defined in...
What is wrong?
Here is a quick look at the XML.
In the Northwind XML we see that the atom title and the href matches.
However, for the Seattle open data OData endpoint, the atom title and the href mismatches.
get_first_500_char_of_northwind_odata_xml()
'<?xml version="1.0" encoding="utf-8"?><service xml:base="https://services.odata.org/V2/northwind/Northwind.svc/" xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom"><workspace><atom:title>Default</atom:title><collection href="Categories"><atom:title>Categories</atom:title></collection><collection href="CustomerDemographics"><atom:title>CustomerDemographics</atom:title></collection><collection href="Customers"><atom:title>Customers</atom:title></collection><collection href'
get_first_500_char_of_spd_odata_xml()
'<?xml version="1.0" encoding="UTF-8"?>\n<service xml:base="https://data.seattle.gov/OData.svc/" xmlns="http://www.w3.org/2007/app" xmlns:atom="http://www.w3.org/2005/Atom"><workspace><atom:title>Default</atom:title><collection href="axkr-2p68"><atom:title>Building Certificates of Occupancy</atom:title></collection><collection href="9gb2-237x"><atom:title>Monthly Tow Statistics 2021-Current</atom:title></collection><collection href="SDOTDW"><atom:title>SDOT Drawbridge Status</atom:title></collecti'
Function definitions
import pyodata
import requests
def get_pyodata_service_northwind() -> pyodata.v2.service.Service:
"""Returns the Service object from PyOData for the example Northwind dataset.
"""
SERVICE_URL_NORTHWIND = 'http://services.odata.org/V2/Northwind/Northwind.svc/'
HTTP_LIB_NORTHWIND = requests.Session()
return pyodata.Client(SERVICE_URL_NORTHWIND, HTTP_LIB_NORTHWIND)
def get_pyodata_service_spd() -> pyodata.v2.service.Service:
"""Return the Service object from PyOData for the Seattle Open Data example.
One of the example entities >> https://data.seattle.gov/OData.svc/33kz-ixgy
"""
SERVICE_URL_SPD = "https://data.seattle.gov/OData.svc"
HTTP_LIB_SPD = requests.Session()
return pyodata.Client(SERVICE_URL_SPD, HTTP_LIB_SPD)
def get_first_500_char_of_spd_odata_xml():
url_example = "https://data.seattle.gov/OData.svc"
response = requests.get(url_example)
return response.text[:500]
def get_first_500_char_of_northwind_odata_xml():
url_example = "http://services.odata.org/V2/Northwind/Northwind.svc/"
response = requests.get(url_example)
return response.text[:500]
def get_northwind_example_entity_url_last_segment():
northwind_service = get_pyodata_service_northwind()
northwind_entity_sets = northwind_service.entity_sets
northwind_category_entity = getattr(northwind_entity_sets, "Categories")
northwind_category_entity_url_last_segment = northwind_category_entity.last_segment
return northwind_category_entity_url_last_segment
def get_spd_example_entity_url_last_segment():
spd_service = get_pyodata_service_spd()
spd_entity_sets = spd_service.entity_sets
spd_example_entity = getattr(spd_entity_sets, "Building Certificates of Occupancy")
spd_example_entity_url_last_segment = spd_example_entity.last_segment
return spd_example_entity_url_last_segment