Notebook 04 — Wikibase Data Model

Project: Linked Open Exhibition — NFDI4Culture / Hochschule Hannover (BIM-126-02)
AI attribution: GitHub Copilot (Claude Sonnet 4.6)
Requires: .env file with WB_URL, WB_USER, WB_PASSWORD

Purpose: Define and upload the minimal exhibition data model to the project Wikibase instance. Creates item classes (Q items) and properties (P items) needed by Notebook 05.


Background: Wikibase vs Wikidata

Wikidata is the public, community-maintained knowledge base hosted by the Wikimedia Foundation. Wikibase is the underlying open-source software — you can run your own Wikibase instance with your own data model and properties, independent of Wikidata.

The project instance is at: https://wikibase.wbworkshop.tibwiki.io/

Data model alignment

The model uses WB4R (Wikibase 4 Research) as a reference (see WB4R/wikibase_generic_model.csv) and maps its entries to CIDOC-CRM classes and properties where a counterpart is listed.

Item classes (Q items)

| Label | Description | WB4R class |
| --- | --- | --- |
| Exhibition | A temporary display of artworks | event |
| Exhibition Catalogue | A publication accompanying an exhibition | bibliographic_work |
| Sprengel Museum Hannover | The art museum in Hannover | place |

Properties (P items)

| Label | Data type | WB4R ref | CIDOC-CRM |
| --- | --- | --- | --- |
| instance of | Item | P1 | E55 Type |
| title | Monolingual text | P4 | E35 Title |
| start date | Time | P10 | P82a |
| end date | Time | P11 | P82b |
| location | Item | P38 | P53 |
| GND ID | External identifier | P102 | |
| DNB IDN | External identifier | | |
| exhibition catalogue | Item | P46 | |
| image | Local media | | |
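The property table can also be held as plain Python data, which makes it easy to see which properties still lack a CIDOC-CRM mapping. This is a minimal sketch: `PropertySpec` and `MODEL` are hypothetical names, and placing P102 and P46 in the WB4R column (rather than the CIDOC-CRM column) is an assumption about the table layout. The WB4R references are IDs in the reference model, not the local P-IDs that this instance assigns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertySpec:
    """One row of the property table (hypothetical helper type)."""
    label: str
    datatype: str
    wb4r_ref: Optional[str] = None    # ID in the WB4R reference model, if listed
    cidoc_crm: Optional[str] = None   # CIDOC-CRM class or property, if listed


# Values copied from the table; column assignment of P102/P46 is an assumption.
MODEL = [
    PropertySpec("instance of", "Item", "P1", "E55 Type"),
    PropertySpec("title", "Monolingual text", "P4", "E35 Title"),
    PropertySpec("start date", "Time", "P10", "P82a"),
    PropertySpec("end date", "Time", "P11", "P82b"),
    PropertySpec("location", "Item", "P38", "P53"),
    PropertySpec("GND ID", "External identifier", "P102"),
    PropertySpec("DNB IDN", "External identifier"),
    PropertySpec("exhibition catalogue", "Item", "P46"),
    PropertySpec("image", "Local media"),
]

# Properties without a CIDOC-CRM counterpart in the table
unmapped = [p.label for p in MODEL if p.cidoc_crm is None]
print(unmapped)
```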

Note on image (P12): Uses the localMedia datatype: files are hosted on the project's own MediaWiki instance, not Wikimedia Commons. If P12 was previously created with the wrong commonsMedia type, delete it in Wikibase and re-run this notebook to recreate it correctly. The recreated property receives a new ID (P13 in the run recorded below), which Notebook 05 picks up from the saved property map.
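One way to detect a wrongly typed property before deleting it is to compare its datatype against the expected one. The sketch below only inspects a response dictionary shaped like the result of `wbgetentities` with `props=datatype`; `check_datatype` is a hypothetical helper, and the sample dictionary is hand-written for illustration, not output from the project instance.

```python
def check_datatype(api_response, pid, expected):
    """Classify a property as 'ok', 'missing', or 'mismatch: <actual>'."""
    entity = api_response.get("entities", {}).get(pid)
    # wbgetentities marks non-existent IDs with a "missing" key
    if entity is None or "missing" in entity:
        return "missing"
    actual = entity.get("datatype")
    return "ok" if actual == expected else f"mismatch: {actual}"


# Hand-written sample mirroring the scenario described in the note
sample = {"entities": {"P12": {"type": "property", "id": "P12", "datatype": "commonsMedia"}}}
print(check_datatype(sample, "P12", "localMedia"))  # mismatch: commonsMedia
```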

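Idempotency: Before creating any property or item, the notebook checks whether an entity with the same English label already exists. The check fetches the labels of P1 to P25 and Q1 to Q25 and compares them case-insensitively. If a match is found, the existing ID is reused and nothing is overwritten. Entities with higher IDs are not covered by this check, so increase max_n in the helper cell if the instance grows beyond that range.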
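The helper cell further down looks up labels over a fixed ID range. As a sketch of an alternative, the label check could query the `wbsearchentities` API module and accept only an exact, case-insensitive label match. `pick_exact_match` and `find_entity_by_label` are hypothetical names, the sample response is hand-written, and the network function is untested against the project instance.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def pick_exact_match(search_response, label):
    """Return the ID of the first hit whose label equals `label`, ignoring case."""
    wanted = label.casefold()
    for hit in search_response.get("search", []):
        if hit.get("label", "").casefold() == wanted:
            return hit["id"]
    return None


def find_entity_by_label(api_url, label, entity_type="property", language="en"):
    """Search by label via wbsearchentities; returns an ID or None."""
    query = urlencode({
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": entity_type,   # "item" or "property"
        "limit": 10,
        "format": "json",
    })
    with urlopen(f"{api_url}?{query}", timeout=15) as resp:
        return pick_exact_match(json.load(resp), label)


# Hand-written sample; "title of series" is an invented near-match
sample = {"search": [
    {"id": "P40", "label": "title of series"},
    {"id": "P2", "label": "title"},
]}
print(pick_exact_match(sample, "Title"))  # P2
```

Requiring an exact match matters because `wbsearchentities` performs prefix search, so a query for "title" can also return longer labels that merely start with it.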

import os, json
from pathlib import Path
from dotenv import load_dotenv
from wikibaseintegrator import WikibaseIntegrator, wbi_login
from wikibaseintegrator.wbi_config import config as wbi_config
from wikibaseintegrator.wbi_enums import WikibaseDatatype

env_path = Path("../../.env")
load_dotenv(dotenv_path=env_path)

WB_URL  = os.getenv("WB_URL", "https://wikibase.wbworkshop.tibwiki.io")
WB_USER = os.getenv("WB_USER")
WB_PASS = os.getenv("WB_PASSWORD")

if not WB_USER or not WB_PASS:
    raise EnvironmentError("Set WB_USER and WB_PASSWORD in your .env file.")

wbi_config["MEDIAWIKI_API_URL"]   = f"{WB_URL}/w/api.php"
wbi_config["SPARQL_ENDPOINT_URL"] = f"{WB_URL}/query/sparql"
wbi_config["WIKIBASE_URL"]        = WB_URL

login_instance = wbi_login.Login(user=WB_USER, password=WB_PASS)
wbi = WikibaseIntegrator(login=login_instance)

print(f"Connected to: {WB_URL}")
Connected to: https://wikibase.wbworkshop.tibwiki.io
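The instance URL is written with a trailing slash in the background section, while the default in the code has none. If WB_URL in the .env file ends with a slash, the f-strings above would produce paths such as `//w/api.php`. Whether the server tolerates that is not something this notebook establishes, so a small normalisation step may be worthwhile. A minimal sketch, with `build_endpoints` as a hypothetical helper:

```python
def build_endpoints(base_url):
    """Derive the three config URLs from a base URL, ignoring trailing slashes."""
    base = base_url.rstrip("/")
    return {
        "MEDIAWIKI_API_URL": f"{base}/w/api.php",
        "SPARQL_ENDPOINT_URL": f"{base}/query/sparql",
        "WIKIBASE_URL": base,
    }


endpoints = build_endpoints("https://wikibase.wbworkshop.tibwiki.io/")
print(endpoints["MEDIAWIKI_API_URL"])  # https://wikibase.wbworkshop.tibwiki.io/w/api.php
```

The returned dictionary uses the same keys as `wbi_config`, so it could be applied with `wbi_config.update(build_endpoints(WB_URL))`.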

Helper — find or create a property

import requests as req
from wikibaseintegrator.wbi_exceptions import MWApiError

def _fetch_entity_labels(prefix, max_n=20):
    """Fetch en labels for {prefix}1..{prefix}{max_n} via wbgetentities. Returns label→id dict."""
    ids = [f"{prefix}{i}" for i in range(1, max_n + 1)]
    resp = req.get(
        wbi_config["MEDIAWIKI_API_URL"],
        params={
            "action": "wbgetentities",
            "ids": "|".join(ids),
            "props": "labels",
            "languages": "en",
            "format": "json",
        },
        timeout=15,
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    label_to_id = {}
    for eid, entity in resp.json().get("entities", {}).items():
        if "missing" in entity:
            continue
        label = entity.get("labels", {}).get("en", {}).get("value", "")
        if label:
            label_to_id[label.lower()] = eid
    return label_to_id

# Build lookup maps once (re-run this cell to refresh)
_prop_lookup = _fetch_entity_labels("P", max_n=25)
_item_lookup = _fetch_entity_labels("Q", max_n=25)
print(f"Known properties: {_prop_lookup}")
print(f"Known items:      {_item_lookup}")


def get_or_create_property(label, description, datatype):
    """Return existing property PID or create a new one."""
    existing = _prop_lookup.get(label.lower())
    if existing:
        print(f"  Property already exists: {existing} — {label}")
        return existing

    prop = wbi.property.new(datatype=datatype)
    prop.labels.set(language="en", value=label)
    prop.descriptions.set(language="en", value=description)
    prop.write(summary="Bot: create exhibition data model property")
    pid = prop.id
    _prop_lookup[label.lower()] = pid
    print(f"  Created property: {pid} — {label}")
    return pid


def get_or_create_item(label, description):
    """Return existing item QID or create a new one."""
    existing = _item_lookup.get(label.lower())
    if existing:
        print(f"  Item already exists: {existing} — {label}")
        return existing

    item = wbi.item.new()
    item.labels.set(language="en", value=label)
    item.descriptions.set(language="en", value=description)
    item.write(summary="Bot: create exhibition data model item")
    qid = item.id
    _item_lookup[label.lower()] = qid
    print(f"  Created item: {qid} — {label}")
    return qid
Known properties: {'instance of': 'P1', 'title': 'P2', 'start date': 'P3', 'end date': 'P4', 'location': 'P5', 'gnd id': 'P6', 'dnb idn': 'P7', 'exhibition catalogue': 'P8'}
Known items:      {'sanboxitem': 'Q1', 'exhibition': 'Q3', 'exhibition catalogue': 'Q4', 'sprengel museum hannover': 'Q5', 'niki de saint phalle - die grotte': 'Q6', 'niki de saint phalle - the grotto': 'Q7', 'feministische avantgarde': 'Q8', 'grethe jürgens': 'Q9', 'lillien grupe - realität(en)?': 'Q10', 'love you for infinity': 'Q11', 'on lies, secrets and silence': 'Q13', 'peter heber - über das sterben': 'Q14', 'verfemt - gehandelt': 'Q15', 'bastian hoffmann - radical negation': 'Q16', 'how to be an artist like me': 'Q17', 'nahaufnahmen': 'Q18', 'peter tuma - aufkommende unruhe': 'Q19', 'porträt einer sammlung - sprengel museum hannover': 'Q20', 'subjective evidence': 'Q21', 'adrian sauer: truth table': 'Q22', 'christian retschlag': 'Q23', 'fotografien der fotografie': 'Q24', 'kunst und künstler*innen in hannover im nationalsozialismus': 'Q25'}
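The lookup above covers only IDs 1 to 25, and the recorded output already lists items up to Q25, so entities created later would not be found. `wbgetentities` accepts at most 50 IDs per request for regular accounts, so a wider range has to be split into batches. The sketch below shows only the batching step; `chunked_ids` is a hypothetical helper, and each batch would still need to be sent in its own request.

```python
def chunked_ids(prefix, start, stop, size=50):
    """Split the ID range prefix+start .. prefix+stop into batches of at most `size`."""
    ids = [f"{prefix}{i}" for i in range(start, stop + 1)]
    return [ids[i:i + size] for i in range(0, len(ids), size)]


batches = chunked_ids("Q", 1, 120)
print(len(batches), batches[-1][0], batches[-1][-1])  # 3 Q101 Q120
```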

Step 1 — Create properties

PROPERTIES = [
    ("instance of",         "the type or class this item is an instance of",           WikibaseDatatype.ITEM),
    ("title",               "the title of a work or event",                             WikibaseDatatype.MONOLINGUALTEXT),
    ("start date",          "date when the event or period began",                      WikibaseDatatype.TIME),
    ("end date",            "date when the event or period ended",                      WikibaseDatatype.TIME),
    ("location",            "place where an event was held",                            WikibaseDatatype.ITEM),
    ("GND ID",              "identifier in the GND (Gemeinsame Normdatei) authority file", WikibaseDatatype.EXTERNALID),
    ("DNB IDN",             "unique record identifier in the DNB catalogue",            WikibaseDatatype.EXTERNALID),
    ("exhibition catalogue","publication accompanying an exhibition",                   WikibaseDatatype.ITEM),
    ("image",               "cover image file in the project MediaWiki instance",       WikibaseDatatype.LOCALMEDIA),
]

prop_map = {}
print("Creating / verifying properties:")
for label, desc, dtype in PROPERTIES:
    pid = get_or_create_property(label, desc, dtype)
    prop_map[label] = pid

print("\nProperty map:", prop_map)
Creating / verifying properties:
  Property already exists: P1 — instance of
  Property already exists: P2 — title
  Property already exists: P3 — start date
  Property already exists: P4 — end date
  Property already exists: P5 — location
  Property already exists: P6 — GND ID
  Property already exists: P7 — DNB IDN
  Property already exists: P8 — exhibition catalogue
  Created property: P13 — image

Property map: {'instance of': 'P1', 'title': 'P2', 'start date': 'P3', 'end date': 'P4', 'location': 'P5', 'GND ID': 'P6', 'DNB IDN': 'P7', 'exhibition catalogue': 'P8', 'image': 'P13'}

Step 2 — Create item classes

ITEMS = [
    ("Exhibition",                 "a temporary display of artworks or cultural objects"),
    ("Exhibition Catalogue",       "a publication accompanying an exhibition, documenting exhibited works"),
    ("Sprengel Museum Hannover",   "art museum in Hannover, Germany, specialising in modern and contemporary art"),
]

item_map = {}
print("Creating / verifying items:")
for label, desc in ITEMS:
    qid = get_or_create_item(label, desc)
    item_map[label] = qid

print("\nItem map:", item_map)
Creating / verifying items:
  Item already exists: Q3 — Exhibition
  Item already exists: Q4 — Exhibition Catalogue
  Item already exists: Q5 — Sprengel Museum Hannover

Item map: {'Exhibition': 'Q3', 'Exhibition Catalogue': 'Q4', 'Sprengel Museum Hannover': 'Q5'}

Step 3 — Save the property/item map to JSON

Notebook 05 reads this file to look up the local P-IDs of the properties and the local Q-IDs of the item classes.

MAP_PATH = Path("../wikibase_property_map.json")

combined_map = {"properties": prop_map, "items": item_map}
MAP_PATH.write_text(json.dumps(combined_map, indent=2, ensure_ascii=False), encoding="utf-8")

print(f"Saved property/item map → {MAP_PATH.resolve()}")
print(json.dumps(combined_map, indent=2))
Saved property/item map → C:\git\linked-open-exhibition\catalogues\wikibase_property_map.json
{
  "properties": {
    "instance of": "P1",
    "title": "P2",
    "start date": "P3",
    "end date": "P4",
    "location": "P5",
    "GND ID": "P6",
    "DNB IDN": "P7",
    "exhibition catalogue": "P8",
    "image": "P13"
  },
  "items": {
    "Exhibition": "Q3",
    "Exhibition Catalogue": "Q4",
    "Sprengel Museum Hannover": "Q5"
  }
}
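A consumer of this file could load it and fail early if an expected property is absent. The following is a minimal sketch of such a loader, not the code actually used in Notebook 05; `load_entity_map` and its `required_properties` parameter are hypothetical.

```python
import json
from pathlib import Path


def load_entity_map(path, required_properties=()):
    """Read the saved map and return (properties, items) as two dicts."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    props = data.get("properties", {})
    items = data.get("items", {})
    # Fail early rather than writing statements with an unknown property
    missing = [name for name in required_properties if name not in props]
    if missing:
        raise KeyError(f"Missing properties in map: {missing}")
    return props, items


# Example call (path as written by the cell above):
# props, items = load_entity_map("../wikibase_property_map.json", ["title", "image"])
```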

Next step: Run 05_wikibase_upload.ipynb to create one Wikibase item per exhibition from the CSV.