Make library card catalogue PDFs with Python scripts

From Parallel Library Services

Latest revision as of 23:00, 9 November 2021

Dependencies

  • git, software for managing development and versioning repositories of code
  • an activated python virtual environment
  • scripts cloned from a git repository
  • installation of calibrestekje, a python-bindings library
  • a valid metadata.db file as produced by an existing installation of Calibre

Getting started

If you don't have git on your system, you can install it via instructions on this website:

https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

After installing git, you can clone repositories, and work on files on your computer. There are many other things you can do with git, for example push/pull changes, check version history and "fork" projects or work on them collaboratively.

First, clone the git repository and change to the new bootleg/ directory:

git clone https://git.xpub.nl/simoon/bootleg.git
cd bootleg/

In the bootleg/ directory, create and activate a python virtual environment. Once activated, you'll notice the prompt in the terminal has changed to be prefaced by (venv), which indicates that the virtual environment is active.

python3 -m venv venv
source venv/bin/activate

Then, install dependencies in the python environment.

pip install reportlab calibrestekje pillow markdown html5lib
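
Before moving on, it can help to confirm that the packages installed above are visible from inside the virtual environment. The following is a minimal sketch, not part of the bootleg repository: the helper name missing_modules is hypothetical, and the module list assumes the usual import names (pillow is imported as PIL).

```python
from importlib.util import find_spec

# Import names for the packages installed above. Note that the pillow
# package is imported under the name "PIL".
REQUIRED_MODULES = ["reportlab", "calibrestekje", "PIL", "markdown", "html5lib"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

If anything is reported as missing, check that the (venv) prefix is showing in the prompt and run the pip command again.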

lesssimplelayout.py is a script that makes double-sided A6 cards from the contents of a metadata.db file. You can use sqlite to query the database and to decide which metadata fields calibrestekje reads for each book, for example the book's title, its timestamp (when it was uploaded) and the authors listed in the catalog.

Read more about how to work with sqlite here:

https://www.sqlite.org/cli.html
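
The same kind of query can also be run from Python with the built-in sqlite3 module. This is a minimal sketch under assumptions: it expects a books table with title and timestamp columns, which is what book.title and book.timestamp in the script suggest, and the helper name list_books is hypothetical. The demonstration builds a small stand-in database in memory so that it runs without a Calibre installation.

```python
import sqlite3


def list_books(connection, limit=5):
    """Return (title, timestamp) rows from the books table, ordered by title."""
    cursor = connection.execute(
        "SELECT title, timestamp FROM books ORDER BY title LIMIT ?", (limit,)
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Stand-in database with an assumed, simplified schema.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, timestamp TEXT)"
    )
    demo.executemany(
        "INSERT INTO books (title, timestamp) VALUES (?, ?)",
        [("Second Book", "2020-02-01"), ("A First Book", "2020-01-01")],
    )
    for title, timestamp in list_books(demo):
        print(title, timestamp)
```

To try it on a real collection, replace the stand-in connection with sqlite3.connect("metadata.db"), ideally on a copy of the file.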

Working with the scripts

readfrompad.py

This requires Python 3. To run it, type

TODO: clean up script

something

simplelayout.py

This is a script adapted from one written by Michael Murtaugh during a prototyping tutorial. We were excited by the idea of making an analog interface for the bootleg library, using Calibrestekje to query the database and Reportlab to make PDFs. The script was made to produce one double-sided card for each book in the library. Running two lines produces a 1098-page PDF in an instant, a thrill when you are used to a much slower layout process.

Calibrestekje works with an existing Calibre database, which means you need a previously installed version of Calibre. If you don't have Calibre installed yet, install it before trying out this recipe. The contents of the PDF depend entirely on what is in the metadata.db file. Alternatively, you can install Calibre and have it create an empty but valid file, which can then be written to using calibredb, a tool that comes with the Calibre package. There is a handy guide to working with the Calibre database on the examples page for Calibrestekje: https://calibrestekje.readthedocs.io/en/latest/examples.html

Make sure you have a valid metadata.db file in the bootleg/ directory. Calibre usually produces this file the first time you run it, in a path similar to /home/myusername/calibre/metadata.db on Debian and Unix-like systems. The file is usually kept in the same directory as the contents of the Calibre book collection, which is called Calibre Library in a default installation, or whatever name was given at the time.
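
One quick way to check that a file is at least a SQLite database is to look at its first bytes, since every SQLite database file starts with the same 16-byte header string. This is a minimal sketch with a hypothetical helper name, looks_like_sqlite. It only confirms the file type, not that the file contains the tables Calibre expects.

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if `path` is an existing file that starts with the SQLite header."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC


if __name__ == "__main__":
    if looks_like_sqlite("metadata.db"):
        print("metadata.db looks like a SQLite database.")
    else:
        print("metadata.db is missing or is not a SQLite database.")
```

Run it from the bootleg/ directory, where the script below expects to find metadata.db.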

from reportlab.lib.pagesizes import A6, landscape
from reportlab.pdfgen import canvas
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, PageBreak
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from calibrestekje import Book, Publisher, init_session

# defining width x height with Reportlab
pagewidth, pageheight = landscape(A6)

# defining the output file with Reportlab
doc = SimpleDocTemplate("newsimpletext.pdf", pagesize=landscape(A6),
                        rightMargin=18, leftMargin=18,
                        topMargin=0, bottomMargin=18)

content = []
styles = getSampleStyleSheet()

# initiate session with calibrestekje
session = init_session("sqlite:///metadata.db")

for book in session.query(Book).all():

    print(book.title)
    print(book.authors)
    print(book.timestamp)
    print(book.path)
    
    # c.drawString(10,pageheight-10, book.title)
    # c.showPage()

    # todo - read the path of images, and pull them from metadata.db
    # f = open('{}/cover.jpg'.format(book.path),"rb")

    # create a paragraph and append content to it - e.g. book.title, book.authors etc
    ptitle = Paragraph('<font size=12>{}</font>'.format(book.title), styles["Italic"])
    ptime = Paragraph('<font size=12>{}</font>'.format(book.timestamp), styles["BodyText"])
    
    # todo - add a cover image with the book path produced above
    # pcover = Paragraph('<font size=12>{}/cover.jpg</font>'.format(book.path), styles["Italic"])
    # cover= Image('{}/cover.jpg'.format(book.path))

    content.append(ptitle)
    content.append(ptime)

    # todo - ditto above re book cover images and book paths
    # content.append(pcover)
    # content.append(cover)
    # content.append(Image(f))
    
 
    # content.append(PageBreak())
    content.append(Spacer(3, 12))

    # set a trace with ipdb for debugging
    # import ipdb; ipdb.set_trace()

    # use list comprehensions to find and glue lists of multiple authors
    format_string = '<font size=12>{}</font>'
    all_authors = [author.name for author in book.authors]
    glued_together = format_string.format(", ".join(all_authors))

    # ALTERNATIVE WAY... (without list comprehensions)
    first = True
    author_text = ""
    for author in book.authors:
        if not first:
            author_text += ", "
        author_text += author.name
        first = False
    author_text = "<font size=12>{}</font>".format(author_text)

    # 2020
    # if all_authors == ['John Markoff']:
    #     import ipdb; ipdb.set_trace()

    p = Paragraph(glued_together, styles["Normal"])
    content.append(p)
    content.append(PageBreak())
    content.append(Spacer(1, 12))

    # this script only outputs one side of a card

doc.build(content)
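
Reportlab's Paragraph reads its text as XML-like markup, so a title or author name containing characters such as & or < may be misread or raise an error while the PDF is built. The following is a minimal sketch of one way to guard against that by escaping the text before it is placed inside the font tag. The helper names card_markup and authors_markup are hypothetical and are not part of the bootleg scripts.

```python
from xml.sax.saxutils import escape


def card_markup(text, size=12):
    """Wrap `text` in a font tag, escaping &, < and > so they are kept as literal text."""
    return "<font size={}>{}</font>".format(size, escape(text))


def authors_markup(names, size=12):
    """Join author names with commas and wrap the result like card_markup does."""
    return card_markup(", ".join(names), size)


if __name__ == "__main__":
    print(card_markup("Q&A <draft>"))
    print(authors_markup(["First Author", "Second Author"]))
```

In the script above, the strings returned by these helpers could be passed to Paragraph in place of the '<font size=12>{}</font>'.format(...) expressions.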