Revision as of 11:47, 12 May 2022
Imagining librarianship & experiments with document conversion |
---|
Location: At Varia (Gouwstraat 3, Rotterdam), and online |
Date: November 24th, 2021 |
Time: 16:00-19:00 CET |
Pad: https://pad.simonbrowne.biz/p/pls-meeting-4 |
Context
PDF (Portable Document Format) is a highly popular digital file format for ebooks. In this workshop, we created, queried and embedded metadata in a PDF by using tools such as Pandoc, ExifTool and of course Calibre, "the swiss army knife of document conversion".
Activities
After some catching up on the contexts of our projects, we discussed the plan for today:
- a tour of Calibre
- hybrid publishing workflows
- embedding metadata in PDFs
- making digital files (EPUB, PDF) with pandoc
- converting between file formats in Calibre (.docx > .epub)
The first half of the workshop involved taking a close look at Calibre and hybrid publishing workflows using plain text file formats such as HTML and Markdown.
After inspecting metadata in PDFs using ExifTool, we followed a tutorial (originally written by Roel Roscam-Abbing) which shows how to embed metadata in PDFs using a Calibre plugin. This plugin, as well as many others that extend Calibre's functionality, can easily be added to Calibre's main toolbar. Note that this is only possible in Calibre itself; Calibre-web does not support this or any other plugin.
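Inspecting a PDF's metadata with ExifTool from a terminal can be sketched as follows. The file name `book.pdf` and the helper `show_pdf_metadata` are hypothetical, and the tags listed are common PDF fields that a given file may or may not contain.

```shell
# Print a few common metadata tags of a PDF with ExifTool.
# Prints a message instead if ExifTool or the file is missing.
show_pdf_metadata() {
  if ! command -v exiftool >/dev/null 2>&1; then
    echo "exiftool is not installed"
    return 0
  fi
  if [ ! -f "$1" ]; then
    echo "no such file: $1"
    return 0
  fi
  # Each -TagName option limits the output to that tag.
  exiftool -Title -Author -Subject -Keywords "$1"
}

# book.pdf is a hypothetical file name.
show_pdf_metadata "book.pdf"
```

Running `exiftool book.pdf` without any tag options lists every tag ExifTool can read from the file.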
Our workshop was documented on a pad using Markdown to create structure. Markdown is a lightweight markup language that can be useful in hybrid publishing, where one input (plain text) may have many outputs (file formats). From a single document it is possible to create a variety of files, including EPUB, PDF, HTML and even Wikitext, the syntax MediaWiki uses.
Pandoc's Markdown supports YAML metadata blocks. When making standalone output such as HTML or PDF, the initial metadata block should contain a title:
---
title: my new document
---
After this, it uses a simple syntax to make headings, paragraphs, bold and italic, lists (ordered and unordered), hyperlinks, and many more elements that can easily be converted to multiple file formats. This is part of a Markdown publishing workflow, whereby content is gathered and structured in plain text documents. These are usually a source Markdown document with the extension .md and a stylesheet (in CSS, for example) with the file extension .css.
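As a minimal sketch of such a pair of files, the snippet below writes a small Markdown source and a stylesheet into a temporary directory. The file names, the sample text and the CSS rules are all made up for illustration.

```shell
# Work in a temporary directory so no existing files are overwritten.
workdir="$(mktemp -d)"

# A Markdown source: YAML metadata block first, then the content.
cat > "$workdir/example.md" <<'EOF'
---
title: my new document
---

# A heading

A paragraph with **bold** and *italic* text, and a [hyperlink](https://example.org).

- an unordered list item
- another item

1. an ordered list item
2. another item
EOF

# A stylesheet that controls how the converted document looks.
cat > "$workdir/stylesheet.css" <<'EOF'
body { font-family: serif; line-height: 1.4; }
h1 { font-size: 2em; }
EOF

echo "wrote $workdir/example.md and $workdir/stylesheet.css"
```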
We began by catching up on our projects, recording notes in a pad.
We then exported the pad to a plain text format by running curl in a terminal:
curl https://pad.simonbrowne.biz/p/pls-meeting-4/export/txt -o pls-meeting-4.md
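A slightly more defensive variant of this export is sketched below. The helper names `pad_export_url` and `export_pad` are hypothetical; the URL pattern is taken from the command above, and `-f`, `-sS` and `-L` are standard curl options.

```shell
# Build the plain-text export URL for a pad, following the pattern used above.
pad_export_url() {
  # $1: base URL of the pad server, $2: pad name
  echo "$1/p/$2/export/txt"
}

# Download a pad as a Markdown file named after the pad.
export_pad() {
  # -f:  fail on HTTP errors instead of saving an error page
  # -sS: hide the progress meter but still show errors
  # -L:  follow redirects
  curl -fsSL "$(pad_export_url "$1" "$2")" -o "$2.md"
}

# Usage (requires network access):
# export_pad https://pad.simonbrowne.biz pls-meeting-4
```

With `-f`, a missing pad or a server error makes curl exit with a non-zero status, rather than quietly writing an error page into the .md file.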
This saves the pad as a plain text file. From that Markdown file and a CSS stylesheet, Pandoc can make a PDF, using WeasyPrint as its PDF rendering engine:
pandoc --pdf-engine=weasyprint -c stylesheet.css -s pls-meeting-4.md -o pls-meeting-4.pdf
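The same source can be sent to other output formats as well. The sketch below is one way to do that, assuming Pandoc is installed; the helper names `output_name` and `convert_all` are hypothetical, and the formats simply mirror the ones mentioned earlier (EPUB, HTML, Wikitext).

```shell
# Derive an output file name by swapping the .md extension for another one.
output_name() {
  # $1: source file, $2: new extension
  echo "${1%.md}.$2"
}

# Convert one Markdown source to EPUB, standalone HTML and MediaWiki markup.
convert_all() {
  src="$1"
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc is not installed"
    return 0
  fi
  # Pandoc infers EPUB and HTML from the output file extension.
  pandoc -s "$src" -o "$(output_name "$src" epub)"
  pandoc -s -c stylesheet.css "$src" -o "$(output_name "$src" html)"
  # For MediaWiki markup, name the output format explicitly with -t.
  pandoc "$src" -t mediawiki -o "$(output_name "$src" wiki)"
}

# Usage:
# convert_all pls-meeting-4.md
```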