Using the Taxonomy from Python

This is a tutorial on how to use the Taxonomy API from Python. We will be using the Requests library to make HTTP requests. Start by installing it, and then, in a new script, import it:

import requests

The GraphQL API is arguably the easiest API to use. You may want to have a look at the GraphQL documentation along with this tutorial.

We will send our requests to a server at the following address:

host = "https://taxonomy.api.jobtechdev.se"

To display results, we will also import the pretty-printing module pprint:

import pprint
pp = pprint.PrettyPrinter(width=41, compact=True)

Querying versions using GraphQL

The taxonomy has several versions. Let's start by querying what versions there are:

response = requests.get(host + "/v1/taxonomy/graphql", params={"query": """query MyQuery {
  versions {
    id
  }
}"""})

version_data = response.json()["data"]["versions"]
print("Last three versions:")
pp.pprint(version_data[-3:])

Output:

Last three versions:
[{'id': 20}, {'id': 21}, {'id': 22}]
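A GraphQL server conventionally reports query problems in a top-level "errors" list, sometimes together with a successful HTTP status, so indexing straight into ["data"] can fail in a confusing way. The following is a minimal sketch of a check, assuming the Taxonomy API follows that convention. The helper name unwrap_graphql is made up for this tutorial and is not part of Requests or the API, and the failing payload is invented sample data.

```python
def unwrap_graphql(payload):
    """Return payload["data"], or raise if the server reported errors.

    Assumes the conventional GraphQL response shape: an optional
    top-level "errors" list next to "data". Hypothetical helper.
    """
    errors = payload.get("errors")
    if errors:
        messages = "; ".join(
            e.get("message", str(e)) if isinstance(e, dict) else str(e)
            for e in errors
        )
        raise RuntimeError(f"GraphQL query failed: {messages}")
    return payload["data"]


# A payload shaped like the response above, and an invented failing one.
ok = {"data": {"versions": [{"id": 20}, {"id": 21}, {"id": 22}]}}
bad = {"errors": [{"message": "example error"}]}

versions = unwrap_graphql(ok)["versions"]

try:
    unwrap_graphql(bad)
    failed = False
except RuntimeError:
    failed = True
```

With a live response this could be called as unwrap_graphql(response.json()), preferably after response.raise_for_status() so that HTTP-level failures are caught first.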

Let's pick out the latest version:

latest_version = max([x["id"] for x in version_data])

pp.pprint(latest_version)

Output:

22
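The slice version_data[-3:] used earlier relies on the server returning versions in ascending order, which this tutorial does not confirm. A small sketch that sorts by id first, so that neither the slice nor the choice of latest version depends on response order. The sample list reuses the ids from the output above, deliberately shuffled.

```python
# Ids taken from the output above, shuffled on purpose; a live script
# would use response.json()["data"]["versions"] instead.
version_data = [{"id": 21}, {"id": 22}, {"id": 20}]

# Sort by id so slicing and "latest" do not depend on server order.
sorted_versions = sorted(version_data, key=lambda v: v["id"])
last_two = sorted_versions[-2:]
latest_version = sorted_versions[-1]["id"]

print(last_two)
print(latest_version)
```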

Querying concepts using GraphQL

If we want to, we can modify our query to include concept data from every version. To keep things concise, we will limit ourselves to a single version, 21, and pick only two concepts. Note that we pass the variables as a separate JSON-encoded map.

import json

query = """query MyQuery($version:VersionRef) {
  versions(from: $version, to: $version) {
    id
    concepts(limit: 2) {
      id
      preferred_label
    }
  }
}"""

variables = json.dumps({"version": "21"})

response = requests.get(
    host + "/v1/taxonomy/graphql",
    params={"query": query, "variables": variables})

pp.pprint(response.json())

Output:

{'data': {'versions': [{'concepts': [{'id': 'ghs4_JXU_BYt',
                                      'preferred_label': 'Planeringsarkitekt/Fysisk '
                                                         'planerare'},
                                     {'id': 'GPNi_fJR_B2B',
                                      'preferred_label': 'Inredningsdesigner'}],
                        'id': 21}]}}
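Since every request in this tutorial passes the query text and, optionally, a JSON-encoded variables map as URL parameters, that step can be wrapped in a small function. This is only a sketch: graphql_params is a hypothetical helper name, not something the Taxonomy API or Requests provides.

```python
import json


def graphql_params(query, variables=None):
    """Build the params dict for a GraphQL GET request.

    Variables are JSON-encoded into a single string, as in the
    request above. Hypothetical helper for this tutorial.
    """
    params = {"query": query}
    if variables is not None:
        params["variables"] = json.dumps(variables)
    return params


query = """query MyQuery($version:VersionRef) {
  versions(from: $version, to: $version) {
    id
  }
}"""

params = graphql_params(query, {"version": "21"})
print(params["variables"])
```

The result can be passed straight to Requests, for example requests.get(host + "/v1/taxonomy/graphql", params=params).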

We could also have approached this by directly querying the concepts:

query = """query MyQuery($version: VersionRef) {
  concepts(version: $version, limit: 2) {
    id
    preferred_label
  }
}"""

response = requests.get(
    host + "/v1/taxonomy/graphql",
    params={"query": query, "variables": variables})

pp.pprint(response.json())

Output:

{'data': {'concepts': [{'id': 'ghs4_JXU_BYt',
                        'preferred_label': 'Planeringsarkitekt/Fysisk '
                                           'planerare'},
                       {'id': 'GPNi_fJR_B2B',
                        'preferred_label': 'Inredningsdesigner'}]}}

Which concepts are returned can differ from query to query.
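The two queries also wrap their concepts differently: one nests them under each version, the other returns them at the top level. To compare the results it can help to flatten both into plain lists first. A minimal sketch, using the two outputs shown above as literal sample data rather than live responses:

```python
# The two responses shown above, written out as literals.
nested = {"data": {"versions": [{
    "id": 21,
    "concepts": [
        {"id": "ghs4_JXU_BYt",
         "preferred_label": "Planeringsarkitekt/Fysisk planerare"},
        {"id": "GPNi_fJR_B2B",
         "preferred_label": "Inredningsdesigner"},
    ],
}]}}
direct = {"data": {"concepts": [
    {"id": "ghs4_JXU_BYt",
     "preferred_label": "Planeringsarkitekt/Fysisk planerare"},
    {"id": "GPNi_fJR_B2B",
     "preferred_label": "Inredningsdesigner"},
]}}

# Flatten the per-version lists into one list of concepts.
nested_concepts = [
    concept
    for version in nested["data"]["versions"]
    for concept in version["concepts"]
]
direct_concepts = direct["data"]["concepts"]

# Compare by id, ignoring order.
same_ids = ({c["id"] for c in nested_concepts}
            == {c["id"] for c in direct_concepts})
print(same_ids)
```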

Querying concept types using GraphQL

To see which concept types exist in the latest version, we can run:

query = """query MyQuery {
  concept_types {
    id
    label_en
  }
}"""

response = requests.get(host + "/v1/taxonomy/graphql", params={"query": query})
concept_types = response.json()["data"]["concept_types"]
n = len(concept_types)
print(f"There are {n} concept types in the latest version")
print("Here are some:")
pp.pprint(concept_types[0:3])

We only print three concept types.

Output:

There are 52 concept types in the latest version
Here are some:
[{'id': 'occupation-experience-year',
  'label_en': 'Occupation experience '
              '(time)'},
 {'id': 'forecast-occupation',
  'label_en': 'Forecast occupation'},
 {'id': 'occupation-field',
  'label_en': 'Occupation field'}]

We can also look at the concept types for a specific version:

query = """query MyQuery($version:VersionRef) {
  concept_types(version: $version) {
    id
    label_en
  }
}"""

variables = json.dumps({"version": "1"})
response = requests.get(host + "/v1/taxonomy/graphql", params={"query": query, "variables": variables})
n = len(response.json()["data"]["concept_types"])
print(f"There are {n} concept types in the first version.")

Output:

There are 36 concept types in the first version.
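Counts alone do not show which concept types differ between two versions. One way to find out is to compare the sets of ids. The sketch below assumes the response shape used in this section. Both payloads are invented for illustration: the ids come from the sample output earlier in this section, but which of them exist in version 1 is not shown in this tutorial.

```python
def concept_type_ids(payload):
    """Collect the concept type ids from a concept_types response."""
    return {ct["id"] for ct in payload["data"]["concept_types"]}


# Invented sample payloads, NOT real API results.
latest_payload = {"data": {"concept_types": [
    {"id": "occupation-experience-year",
     "label_en": "Occupation experience (time)"},
    {"id": "forecast-occupation", "label_en": "Forecast occupation"},
    {"id": "occupation-field", "label_en": "Occupation field"},
]}}
first_payload = {"data": {"concept_types": [
    {"id": "occupation-field", "label_en": "Occupation field"},
]}}

# Ids present in the later payload but missing from the earlier one.
added = sorted(concept_type_ids(latest_payload)
               - concept_type_ids(first_payload))
print(added)
```

With live data, the two payloads would be the response.json() results of the two concept_types queries above.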