Make library card catalogue PDFs with Python scripts

From Parallel Library Services

Dependencies

  • git, software for managing development and versioning repositories of code
  • an activated python virtual environment
  • scripts cloned from a git repository
  • installation of calibrestekje, a python-bindings library
  • a valid metadata.db file as produced by an existing installation of Calibre

Getting started

If you don't have git on your system, you can install it via instructions on this website:

https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

After installing git, you can clone repositories and work on their files on your own computer. There are many other things you can do with git, for example push and pull changes, check version history, and "fork" projects or work on them collaboratively.

First, clone the git repository and change to the new bootleg/ directory:

git clone https://git.xpub.nl/simoon/bootleg.git
cd bootleg/

In the bootleg/ directory, create and activate a Python virtual environment. Once it is activated, the terminal prompt is prefixed with (venv), which indicates that the virtual environment is active.

python3 -m venv venv
source venv/bin/activate
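
If the prompt does not change in your shell, you can also check from Python itself whether a virtual environment is active. This is a minimal sketch; the helper name in_virtualenv is made up for this example and is not part of the bootleg scripts.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when this interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same python3 you plan to use for the scripts.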

Then, install dependencies in the python environment.

pip install reportlab calibrestekje pillow markdown html5lib
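
To confirm that the packages landed in the active environment, a small check like the one below may help. It is only a sketch: the names REQUIRED and missing_modules are hypothetical, and the list assumes the usual import names for these packages (pillow is imported as PIL).

```python
from importlib.util import find_spec

# Import names for the packages installed above.
# Assumption: pillow is imported as "PIL", the others under their own names.
REQUIRED = ["reportlab", "calibrestekje", "PIL", "markdown", "html5lib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("not installed in this environment:", ", ".join(missing))
    else:
        print("all dependencies found")
```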

lesssimplelayout.py is a script that makes double-sided A6 cards from the contents of a metadata.db file. You can use sqlite to query the database and to decide which metadata calibrestekje should look up for each book, for example the book's title, its timestamp (when it was uploaded) and the authors listed in the catalog.

Read more about how to work with sqlite here:

https://www.sqlite.org/cli.html
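
The same kind of query can be run from Python with the standard sqlite3 module. The sketch below lists titles, timestamps and authors. The table and column names (books, authors, books_authors_link) are an assumption based on the usual Calibre schema, so check them against your own file with .schema in the sqlite command line first. The function name list_books is made up for this example.

```python
import sqlite3


def list_books(conn, limit=5):
    """Return (id, title, timestamp, authors) rows for the first few books.

    Assumes Calibre-style tables: books, authors and books_authors_link.
    """
    query = """
        SELECT books.id, books.title, books.timestamp,
               GROUP_CONCAT(authors.name, ', ') AS authors
        FROM books
        LEFT JOIN books_authors_link ON books_authors_link.book = books.id
        LEFT JOIN authors ON authors.id = books_authors_link.author
        GROUP BY books.id
        ORDER BY books.id
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()
```

With a real library you could open the file read-only, for instance with sqlite3.connect("file:metadata.db?mode=ro", uri=True), and pass that connection to list_books. Opening read-only avoids changing the catalogue by accident.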

Working with the scripts

readfrompad.py

This requires Python 3. To run it, type

TODO: clean up script

something

simplelayout.py

This is a script adapted from one written by Michael Murtaugh during a prototyping tutorial. We were excited by the idea of making an analog interface for the bootleg library, using Calibrestekje to query the database and Reportlab to make PDFs. The script was made to produce one double-sided card for each book in the library. Running two lines produces a 1098-page PDF in an instant, a thrill when you are used to a much slower layout process.

Calibrestekje works with an existing Calibre database, which means you need a previously installed version of Calibre. If you don't have Calibre installed yet, install it before trying out this recipe. The contents of the PDF depend entirely on what is in the metadata.db file. Alternatively, you can install Calibre and have it create an empty but valid metadata.db file, which can then be written to using calibredb, a tool that comes with the Calibre package. There is a handy guide on how to work with the Calibre database on the examples page for Calibrestekje.

Make sure you have a valid metadata.db file in the same bootleg/ directory. Calibre usually creates this file the first time it runs, at a path similar to /home/myusername/calibre/metadata.db on Debian and other Unix-like systems. The file is usually kept in the same directory as the contents of the Calibre book collection, which is called Calibre Library in a default installation, or whatever name was given at the time.
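
Before running the script, it can be worth checking that the file you copied really is a SQLite database. The sketch below only looks at the standard SQLite file header, so it cannot tell whether the file is a complete Calibre library. The helper name looks_like_sqlite is made up for this example.

```python
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path) -> bool:
    """Return True if path exists and begins with the SQLite header."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


if __name__ == "__main__":
    print("metadata.db looks usable:", looks_like_sqlite("metadata.db"))
```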

from reportlab.lib.pagesizes import A6, landscape
from reportlab.pdfgen import canvas
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from calibrestekje import Book, Publisher, init_session

# defining width x height with Reportlab
pagewidth, pageheight = landscape(A6)

# defining the output file with Reportlab
doc = SimpleDocTemplate("newsimpletext.pdf", pagesize=landscape(A6),
                        rightMargin=18, leftMargin=18,
                        topMargin=0, bottomMargin=18)

content = []
styles = getSampleStyleSheet()

# initiate session with calibrestekje
session = init_session("sqlite:///metadata.db")

for book in session.query(Book).all():

    print(book.title)
    print(book.authors)
    print(book.timestamp)
    print(book.path)
    
    # c.drawString(10,pageheight-10, book.title)
    # c.showPage()

    # todo - read the path of images, and pull them from metadata.db
    # f = open('{}/cover.jpg'.format(book.path),"rb")

    # create a paragraph and append content to it - e.g. book.title, book.authors etc
    ptitle = Paragraph('<font size=12>{}</font>'.format(book.title), styles["Italic"])
    ptime = Paragraph('<font size=12>{}</font>'.format(book.timestamp), styles["BodyText"])
    
    # todo - add a cover image with the book path produced above
    # pcover = Paragraph('<font size=12>{}/cover.jpg</font>'.format(book.path), styles["Italic"])
    # cover= Image('{}/cover.jpg'.format(book.path))

    content.append(ptitle)
    content.append(ptime)

    # todo - ditto above re book cover images and book paths
    # content.append(pcover)
    # content.append(cover)
    # content.append(Image(f))
    
 
    # content.append(PageBreak())
    content.append(Spacer(3, 12))

    # set a trace with ipdb for debugging
    # import ipdb; ipdb.set_trace()

    # use list comprehensions to find and glue lists of multiple authors
    format_string = '<font size=12>{}</font>'
    all_authors = [author.name for author in book.authors]
    glued_together = format_string.format(", ".join(all_authors))

    # ALTERNATIVE WAY... (without list comprehensions)
    # builds the same string as glued_together; shown for comparison only
    first = True
    author_text = ""
    for author in book.authors:
        if not first:
            author_text += ", "
        author_text += author.name
        first = False
    author_text = "<font size=12>{}</font>".format(author_text)

    # 2020
    #if all_authors==['John Markoff']:
    #	import ipdb; ipdb.set_trace()

    p = Paragraph(glued_together, styles["Normal"])
    content.append(p)
    content.append(PageBreak())
    content.append(Spacer(1, 12))

    # this script only outputs one side of a card

doc.build(content)
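
One thing to watch for: Reportlab's Paragraph reads its text as XML-like markup, so a title or author name containing characters such as & or < may not render as intended. A possible way around this, sketched below with a hypothetical helper called card_markup, is to escape the text before wrapping it in the font tag used in the script.

```python
from xml.sax.saxutils import escape


def card_markup(text, size=12):
    """Wrap text in the <font> markup used above, escaping &, < and >."""
    # str() lets this accept values such as book.timestamp as well as strings.
    return '<font size={}>{}</font>'.format(size, escape(str(text)))


if __name__ == "__main__":
    print(card_markup("Cats & Dogs"))
```

In the script, this would replace the inline format calls, for example Paragraph(card_markup(book.title), styles["Italic"]).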