A prototype publication of a fictional 'exhibition catalogue' based on a Wikidata-based collection of seventeenth-century paintings from the Bavarian State Painting Collections. The prototype shows how, with a computational publishing pipeline, different distributed linked open data (LOD) sources can be brought together in a multi-format computational publication — allowing for asynchronous collaborative working. Distributed LOD sources include: Wikidata/Wikibase, Nextcloud, Thoth, Semantic Kompakkt, TIB AV Portal, and more.
Prototype series: Baroque TOC
Coordinated by Simon Worthington - NFDI4Culture @Open Science Lab, TIB, Hannover
Publishers catalogue demo: ScholarLed. A catalogue of ScholarLed presses built on a Quarto / Jupyter Notebook model for computational publishing. The publication is automatically updated daily to reflect any new books added by the publishers.
Proof of concept #1 - Computational Publication: Computational Publishing for Collections - ADA CP Prototype #1 - Nov 22
Proof of concept #2 - To be confirmed, completion for end of April 2023. This contains all parts fully rendered: Cover, colophon, essay, collection, graph, TIB AV Portal, Semantic Kompakkt
semanticClimate: To be confirmed - customised research paper readers made for regional climate change action plans, based on IPCC reports and sourcing content from open research repositories.
FSCI Summer School - publishing from collections class: To be confirmed, July 2023
About the prototype
Publication type: Use case - An exhibition catalogue
We are creating a demonstration prototype: An exhibition catalogue about a baroque painting collection.
Objectives:
Write an exhibition catalogue essay using AI tools
Review the 'catalogue essay and AI tools' as open peer review
Create the parts of the catalogue:
Cover
Colophon
Essay
Collection
What is the collection?
The catalogue uses part of a Wikidata-based collection of Baroque paintings from Bavarian collections. See: 17C Bavarian painting.
We focus on the Baroque period: Bavarian Collections, 1590-1750 query link
We make a small collection of paintings - 9 in this case.
How are we using computational publishing and what is the prototype experiment?
Creating a publication from different distributed (federated) remote sources using linked open data.
Showing how asynchronous work can be carried out by a team working on a single publication - this is the power of the TOC part! Which in more advanced domains becomes package management.
Learning points
Workflow activities that will be covered to create the exhibition catalogue:
Real-time collaborative editing,
Creating a Wikidata query of a collection,
Displaying a painting catalogue sample collection from Wikidata LOD query for a multi-format publication.
Editing a Jupyter Notebook in MyBinder,
Embedding media objects: video (TIB AV Portal) and Semantic Kompakkt,
Using GitHub
Accessing API content for colophon
Editing Wikidata collection queries in Jupyter Notebooks
Asynchronous collective working and making a publication from multiple remote Linked Open Data (LOD) sources, and
Rendering a multi-format publication with CSS styling.
Software (open-source)
Over 2023/24 the computational components will be added to the ADA Semantic Publishing Pipeline, as well as introducing the Vivliostyle Create Book markdown renderer and moving from Quarto to the Jupyter Book computational book platform – https://github.com/NFDI4Culture/ada
Worthington, S. (2022). Designing an Open Peer Review Process for Open Access Guides. Community-Led Open Publication Infrastructures for Monographs (COPIM). https://doi.org/10.21428/785a6451.e0245b43
Activity: Editing Jupyter Notebooks and accessing video
Objective: Running and editing Jupyter Notebooks in MyBinder and retrieving video and 3D models as embeds.
External LOD and media used: TIB AV Portal, and Semantic Kompakkt
3D view size: we can make the initial view bigger by adding width and height attributes to the embed, e.g.:
<iframe
width="1200" height="630"
Download Notebook
Render some videos and 3D models in the Quarto book. Pass along video IDs and 3D model links using a HedgeDoc pad and chat to the person doing the Quarto render. The rendering and final display will take less than 10 minutes (hopefully): a. the code needs to be added to the main repo; b. rendered locally; c. uploaded to GitHub; d. time for GitHub Pages to finish loading.
Cloning a repository - use GitHub Desktop, Visual Studio Code, or any other tool to copy the repository to your local machine
IMPORTANT!: Turning on GitHub Pages.
First, go to the Settings tab in your repository; second, in the left menu go down to Pages; third, select 'main' and '/docs', and save. In a few minutes this will turn on your GitHub Pages website. Congrats!
The last step is to add the GitHub Pages URL to the front-end information panel of your repository.
Navigate back to the Code view of your repo. At the top right you can add your GitHub Pages URL to the About information of your repository. Open the About area by clicking on the cog icon. Then in the dialog window tick 'Use your GitHub Pages website', and save.
For general purposes use the manual install. The Docker install is for when you are running multiple environments on your computer or carrying out long term development.
This repository also contains Docker Compose and Dockerfiles for running the various applications in Docker containers.
Run docker-compose up -d --build to start the containers and docker-compose down to stop the containers.
The jupyterlab container runs a stand-alone version of JupyterLab on http://localhost:8888. This can be used to edit any Jupyter Notebook files in the repository. The JupyterLab instance runs with the password 'jupyterlab'.
The nginx container runs Nginx webserver and displays the static site that Quarto renders. This runs at http://localhost:1337.
The quarto container starts an Ubuntu 22.04 container, installs prerequisites like Python, downloads and installs Quarto, and then adds Python modules like jupyter, matplotlib, and pandas. It then runs in the background so Quarto can be called on to render the qmd and ipynb files into the site/book like so:
docker exec -it quarto quarto render
There's a known issue with Quarto running in the Docker container on macOS due to the amd64 emulation of Docker Desktop on arm64 macOS. See discussion at quarto-dev/quarto-cli#3308. This shouldn't occur in any other environment running Docker.
Visual Studio Code - Quarto install
This is not covered at present as there are conflicts with other approaches.
Troubleshooting
Raise an issue in the GitHub project if you need support.
If you are using GitHub Desktop, then just do the following steps:
Close GitHub Desktop and all other applications with open files to your current directory path.
Move the whole directory, as mentioned above, to the new directory location. (NB: the directory has to be moved in its entirety.)
Open GitHub Desktop and click on the blue (!) "repository not found" icon. A dialog will open with a "Locate..." button, which opens a popup allowing you to point its path to the new location.
Editing Quarto
Visual Studio Code is one option as an editor, but you can use any editor suite that you like.
Load the whole repository folder into your editor.
The key file to edit in Quarto is _quarto.yml. This file contains the main configurations for your publication.
If you are working on a fork, the first thing you need to do is edit the repository address on line 19 - this will point the GitHub icon in your publication to your own GitHub repo.
Note: You will use the left-hand 'Query helper' GUI, where you can type in names of items. Sometimes you need to enter a term twice to get the correct item to appear.
Item names are in bold below with their corresponding identifiers.
Activity: Making SPARQL Queries in Wikidata of Collections
Keywords: Collections, Wikidata, SPARQL
If you are not familiar with creating a Wikidata SPARQL query then see the section on how to create a query: Activity: Create a Wikidata query
The activity goal
Create a sample SPARQL query of a cultural collection of a museum and save a link to the query on your Wikiversity user page.
Note: Two collections are needed. The painting collection is easier to add to the Jupyter Notebooks because you are editing existing data; the second, general collection is used to explore more options - assistance will be needed to move the second query into the Jupyter Notebooks, as such queries can be complex.
First one of a painting collection, and then,
A second one - of any type of media or artifact from a collection: books, sculpture, photography, etc.
Date: Line 37 - FILTER((1590 <= ?inceptionyear) && (?inceptionyear < 1750)): here change the from and to dates.
Number of items: Line 40 - LIMIT 9: Change the number of items and run preview.
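The two edits above can be sketched in plain Python; the variable names here are illustrative, but the FILTER and LIMIT lines match the ones quoted from the paintings query.

```python
# Illustrative sketch: the date range and item limit as they appear in
# the SPARQL query text. Only these two lines of the query need changing.
date_from, date_to, limit = 1590, 1750, 9

filter_line = f"FILTER(({date_from} <= ?inceptionyear) && (?inceptionyear < {date_to}))"
limit_line = f"LIMIT {limit}"

print(filter_line)  # → FILTER((1590 <= ?inceptionyear) && (?inceptionyear < 1750))
print(limit_line)   # → LIMIT 9
```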
Now copy the query URL and save the link to your Wikiversity page. To copy the link you need to have previewed the query; at the bottom right is a short-link copy button (if this doesn't work, copy the URL from your browser's address bar). Later we will add the new query to the Notebook paintings.ipynb.
Query 2: Make your own collection query
Note: We want images in our query; using the image grid view allows you to preview images.
Resources for finding collections: Data models and information
The links will help you find collections and see what types of artwork or media are listed on Wikidata.
Use the Examples button top left and then edit an example, or use DuckDuckGo or Google to search for queries to edit.
Save your query link on your Wikiversity user homepage.
In the next section, 'Transferring a Wikidata SPARQL query to a Jupyter Notebook', you will see how to move your queries to Jupyter Notebooks.
Transferring a Wikidata SPARQL query to a Jupyter Notebook
Query 1: A painting collection - can already be rendered in the Notebook and by Quarto.
Query 2: Your own collection query - you will need assistance in the class to render this, as the outputted SPARQL metadata results need to be processed by some Python code and this quickly becomes complicated.
Query 1: A painting collection
For this collection you simply edit the existing Notebook for the painting collection, 'paintings.ipynb', renaming it paintings_1xx.ipynb (with xx as your initials) and transferring across the new values that you have already created. See the section 'Making SPARQL Queries in Wikidata of Collections'.
These are the values being changed: Collection, Date range, and Number of items.
All you need to do is paste two items into the main cell of the Notebook:
The SPARQL URL - see the green text indicated in Figure 1
Paste in the complete body of the SPARQL query - see the orange text in Figure 1. Note: all of your query is pasted between the triple apostrophes at top and bottom:
'''
your query
'''
Then run the Notebook and, when you are done, save the Notebook file and any other files edited.
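Putting the two pasted items together, a minimal sketch of the main cell might look like this; the endpoint URL is the standard Wikidata SPARQL service, and the query body is a shortened placeholder, not the full paintings query.

```python
# Sketch of the two pasted items in the main cell of paintings_1xx.ipynb.
# Replace the placeholder body with your full SPARQL query.
endpoint_url = "https://query.wikidata.org/sparql"

query = '''
SELECT ?item ?itemLabel ?image WHERE {
  # ... your full SPARQL query body goes here ...
}
'''

# The whole query sits between the triple apostrophes.
print(query.strip().splitlines()[0])  # → SELECT ?item ?itemLabel ?image WHERE {
```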
Note: Change the notebook name in the _quarto.yml TOC so that your new file is included in the publication. Change paintings.ipynb to paintings_1xx.ipynb.
Note: Ensure requirements.txt has been added and run.
To render the output in Quarto, run the Quarto commands in your terminal - 'quarto preview' and/or 'quarto render'.
When you're finished, you can upload the results to your GitHub repository.
Query 2: Your custom collection query
Because SPARQL queries are varied and complex, the process of moving a query into a Jupyter Notebook is not simple and involves knowledge of SPARQL and Python. The challenge is that the SPARQL metadata output has to be parsed by Python for presentation in the publication, and to do this custom code needs to be written.
To help with this obstacle a support service is in place to consult and add the Python code to your Notebooks.
When you have completed the steps below and have a Notebook ready, raise a ticket in the GitHub Project and assign it to support, and we will review the Notebook.
This will prepare the Notebook to output the SPARQL results only. The additional rendering will need to be added by support.
Create a new Notebook in your Quarto publication 'collection-name.ipynb'.
Back in the custom collection query, at the bottom right there is a <code> button - click on it and a preview window will launch. At the top right click on the Python tab; this will present the query as Python code. Now copy and paste the code into your Notebook.
Next, we want to copy the URL of your query and also paste it into your notebook. This will allow users to review your original query. The URL is found at the bottom right of the query page, as 'copy short link'. If this does not work, just copy the long address from your browser address bar.
Paste your query link into the Notebook as a comment, which means having a # at the start of the line, like so, underneath the lines shown here:
import sys
from SPARQLWrapper import SPARQLWrapper, JSON
# query link - https://w.wiki/6YJi based on https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples#Paintings_by_Gustav_Klimt
To finish your Notebook editing, insert a Markdown field above your code cell and add a description, e.g.,
The collection Notebook only contains the SPARQL query and needs additional Python added to parse the metadata output.
Save and run your Notebook. The result will be the SPARQL unformatted output.
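The unformatted output is SPARQL's standard JSON results format. As a hedged sketch (the binding names and values here are hypothetical), this is roughly the shape the custom Python code will later have to parse:

```python
import json

# Hypothetical sample of the SPARQL JSON results format returned by the
# Wikidata endpoint; the real binding names depend on your query's SELECT line.
raw = '''
{
  "head": {"vars": ["item", "itemLabel"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q12345"},
     "itemLabel": {"type": "literal", "value": "Example painting"}}
  ]}
}
'''

data = json.loads(raw)
# Flatten each binding into a plain dict of values for display.
rows = [{k: v["value"] for k, v in b.items()}
        for b in data["results"]["bindings"]]
print(rows[0]["itemLabel"])  # → Example painting
```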
Add your Notebook to the Quarto TOC by adding the file name of your collection to the TOC in the file _quarto.yml, under the chapter header on line 12. Save and render the Quarto publication to see the results.
To complete this part raise a ticket in the GitHub project assigning to support.
This concludes this part of the process.
Publishing Tasks
These are the small tasks needed to get ready to publish. There will be an additional Flight Check round after this before completing publishing.
Add a cover image
Update Imprint (colophon)
Add Essay
Change Style
Update the readme information
Tidy up the TOC
Add a cover image: Add image - instructions from Quarto help.
You need to make a cover image and add a link to it in the _quarto.yml configuration file. The image can be any size; we recommend 2,560 x 1,600 pixels, which is the Amazon marketing cover image size.
Upload your image to the top level of repository.
Then link it in the file _quarto.yml at the cover-image: key, for example:
cover-image: cover.jpg
See example _quarto.yml cover link. File can be in repo or online.
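As a sketch, the relevant lines in _quarto.yml look like this (the title is a placeholder; cover-image is the standard Quarto book option):

```yaml
book:
  title: "My Exhibition Catalogue"   # placeholder title
  cover-image: cover.jpg             # image file at the top level of the repo
```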
Update Imprint (colophon): file name - colophon.ipynb
The colophon is made of two parts: the first part, which you will edit using the instructions below, and the second part, which comes automatically from the Thoth.pub book metadata system.
To get a DOI you need to register one at Zenodo. Make a Zenodo deposit for the book: make a pre-release and get a DOI. You only need the minimal information to start with, and the fields can be changed later. See Zenodo help - search for 'Reserve DOI'.
Add a Markdown cell after the Notebook metadata cell. Here is an example you can copy. Add and fill these fields:
Fork title
Author
ORCID
Date
DOI
Repository URL
Add Essay
Paste and edit Markdown here: collection003.qmd. The essay is for illustration purposes only. Feel free to use Wikipedia or some AI content. Any content must be cited and openly licensed.
Update the readme information: see the files README.md and index.qmd. Add minimum publication information, which can be a copy of your colophon. See the example file.
Tidy up the TOC
Remove other Notebooks from the chapter list in _quarto.yml and add any other Notebooks or files to the chapter list. Remember to render and push to complete the publishing.
# put a hash in front of chapters to omit them
mynewchapter.qmd (or .ipynb)
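A sketch of the chapters list in _quarto.yml; the file names are placeholders:

```yaml
book:
  chapters:
    - index.qmd
    - mynewchapter.qmd       # your new chapter, .qmd or .ipynb
    # - paintings.ipynb      # hash-commented chapters are omitted from the book
```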
Parts of the book → Files to modify or add
Quick reference file look-up.
Note: _quarto.yml is where all book configurations are done for Quarto.
Push back to the repo via GitHub Desktop, or GitHub in VS Code.
Add a summary when committing to main.
And push.
Review and Flight Check
These items will be covered in the class.
Flight check: README; LICENCE; and add a CONTRIBUTE file. Check other parts.
Make a GitHub Release.
Complete the Zenodo deposit and release: we will add a PDF and an interoperable form - in this case a repository zip file downloaded from the GitHub release.