WordMord: No Annotation is Alone

Location: Online at https://meet.greenhost.net/ParallelLibraryServices
Date: January 26th, 2022
Time: 16:00-19:00 CET
Pad: https://pad.simonbrowne.biz/p/pls-meeting-6
Tools: Tesseract, ImageMagick
Guests: Angeliki Diakrousi, WordMord

--- Work(d)session---

Context

The workshop "WordMord: No Annotation is Alone" was driven by the practices of the WordMord collective research group. With the urgent understanding that words can kill, WordMord works with annotation as an activist strategy.

Guests

In this workshop, we were joined by Angeliki Diakrousi and some of the WordMord team.

The PDF format is technically quite fixed (unlike EPUB), and our plan was to make annotative interventions upon the rigidity of such fixed formats in order to challenge established protocols. Rooted in intersectional feminist methodologies, WordMord questions the language and the legal systems around the use of the term 'femicide' in the Greek context.

A short description of a WordMord intervention on a particular legal PDF document:

"PDF as a democratic means of digitally publishing a criminal code, makes the law accessible to everyone. It is a static tool for sharing institutional knowledge that does not easily allow for editing and commenting on its content. Each digital tool contains a narrative as a guidance mechanism with specific technical and ideological constraints. The way of using a PDF is universally accepted and stored in collective memory. Similarly, the original content of this PDF is determined by entrenched decisions, based on the perpetuation of old habits, and in turn determines dominant social, institutional and ideological behaviours."

Tools

We used tools such as Python3, Tesseract, and ImageMagick to go through the experimental workflow the WordMord team has been using.

In preparation, participants were asked to bring formal, bureaucratic or legal PDFs that become obstacles for them or their community, e.g.:

  • constitutional laws and bylaws
  • penal codes (e.g. the city, copyright)
  • house rules
  • legal forms
  • terms and conditions
  • manuals

Activities

We began by reading together an excerpt from Chapter Four ("The Weather", pp. 113-120) of Christina Sharpe's "In the Wake: On Blackness and Being", which is about what she calls "Black Annotation" and "Black Redaction", part of the "wake work" of Black survival.

WORDMORD - NO ANNOTATION IS ALONE

http://wordmord-ur.la/

WordMord believes that the violence of language is not eradicated by merely deleting/erasing words, but rather by transversing their violent imposition through specific practices that trouble and disrupt grammatical consistency, semantic norms, ‘correct’ pronunciation, ‘proper’ bodily posture. The rupture of linguistic limits suggests the possibility of experiencing language in its materiality.

WordMord poses questions on the relationship between language, technology, trauma and violence. The collective artistic research will evolve through workshops, presentations and artworks. Through collaborations with artists, performers, linguists, lawyers, programmers, activists and groups working with feminist algorithmic and computational practices, WordMord seeks to shape an online rhizomatic space as an active feminist archive. At the same time, the project will provide tools and methods towards a poetically subversive meta/para/re-writing of derogatory narratives and consequently of trauma and violence. WordMord seeks to connect art with queer feminist activism and emancipated life.

WordMord's initial research group consists of: Vassiliea Stylianidou aka Franck-Lee Alli-Tis, Angeliki Diakrousi, Christina Karagianni, Oýto Άrognos aka Stylianos Benetos, Mounologies: Eleni Diamantouli and Anna Delimpasi. At a later stage Cristina Cochior and Manetta Berends joined the group to contribute linguistic coding practices. It started in collaboration with the #CNMFPP in 2019.

Formats of WordMord

  • Weekly meetings (sharing our situated embodied languages and knowledges)
  • Workshops
    • Let's assemble our Wordy Arms_an open laboratory, 2020 at Haus der Statistik Berlin, within the context of Glitter and Griff
    • WordComminutes_a workshop on crashing language, 2021 at Eight
    • "Dear [neutral] language, (...)" with Allison Parrish at Varia
  • Collective projects and subgroups (tentacles/threads)

Tentacles: groups:

  • onlania
  • The Comminuters_musicgroup. A band in the making (στα σκαριά)

Tentacles: projects in process:

  • Genealogy of queer feminist artistic and theoretical methodologies towards the deconstruction of patriarchal language
  • Wordlist
  • Manyfesto by onlania
  • Καμία επισημείωση δεν είναι μόνη / No annotation is alone
  • Para-dictionary/Lexikon

Re(d)action

https://parallel-library.simonbrowne.biz/calibre/read/49/pdf#page=126

Redaction Reaction Readaction

OuNuPo: https://issue.xpub.nl/05/ (especially Chapter 4 - Natasha Berting How Bias Spreads from the Canon to the Web + Erase / Replace)

Annotate the web

https://web.hypothes.is/

Hypothes.is is a 501(c) open-source software project that aims to collect comments about statements made in any web-accessible content, and filter and rank those comments to assess each statement's credibility. [from Wikipedia]


XPPL: https://issue.xpub.nl/06/

https://w-i-t-m.net/images/xppl_interface.jpg (especially Annotations interface by Angeliki https://pzwiki.wdka.nl/mediadesign/User:Angeliki/X-LIB/Annotations)

No Annotation is Alone

No Annotation is Alone - Tool

[Instructions made by WordMord]

Install

MAC: first install homebrew and Python3

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
$ brew install python3

WINDOWS: https://www.python.org/downloads/windows/

Create a Python virtual environment & activate it

$ python3 -m venv venv
$ . venv/bin/activate

Install tesseract-ocr

LINUX:

$ sudo apt-get install tesseract-ocr

MAC:

$ brew install tesseract-lang
$ tesseract --list-langs

If the language you want is not there check languages here: https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html

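Install the languages you want (for example Greek, language code ell, on LINUX):

$ sudo apt-get install tesseract-ocr-ell

Install ImageMagick

LINUX:

$ sudo apt install imagemagick

MAC:

$ brew install imagemagick

Install reportlab (inside the activated virtual environment)

$ pip install reportlab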
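
Before moving on, it can help to confirm that the installs above worked. The following is a minimal sketch, not part of the WordMord scripts: it only checks whether the `tesseract` and `convert` commands are on the PATH and whether `reportlab` can be imported. The function name `check_setup` is a hypothetical helper, and on newer ImageMagick versions the command may be called `magick` instead of `convert`.

```python
import importlib.util
import shutil

# Command-line tools used in this workflow (names as typed in the steps above).
REQUIRED_COMMANDS = ["tesseract", "convert"]
# Python packages installed with pip in the steps above.
REQUIRED_MODULES = ["reportlab"]


def check_setup(commands=REQUIRED_COMMANDS, modules=REQUIRED_MODULES):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {}
    for name in commands:
        # shutil.which returns None when the command is not on the PATH.
        status[name] = shutil.which(name) is not None
    for name in modules:
        # find_spec returns None when the module cannot be imported.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_setup().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it inside the activated virtual environment, otherwise reportlab may be reported as missing even though it is installed there.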


Annotate the PDF

Download the scripts from here: http://wordmord-ur.la/tools/

Convert the PDF into PNG images:

$ convert [name0fyourFile].pdf [name0fyourFile].png

This converts every page of the PDF into a PNG file. For example, if image.pdf is a 2-page PDF, running the above command will produce two PNG files called

image-0.png
image-1.png
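
When a PDF has many pages, it is useful to know which PNG names to expect and to process them in page order (a plain alphabetical sort would put image-10.png before image-2.png). The sketch below illustrates the naming pattern described above. The helper names are hypothetical, and the single-page case (no numeric suffix) is an assumption about ImageMagick's behaviour rather than something stated in the workshop notes.

```python
import re
from pathlib import Path


def expected_png_names(pdf_name, page_count):
    """List the PNG names that `convert file.pdf file.png` is expected to produce."""
    stem = Path(pdf_name).stem
    if page_count == 1:
        # Assumption: a single-page PDF yields one file without a page suffix.
        return [f"{stem}.png"]
    return [f"{stem}-{index}.png" for index in range(page_count)]


def page_number(png_name):
    """Read the page index from a name such as 'image-10.png' (0 if absent)."""
    match = re.search(r"-(\d+)\.png$", png_name)
    return int(match.group(1)) if match else 0


def in_page_order(png_names):
    """Sort page images numerically by their page index."""
    return sorted(png_names, key=page_number)
```

For example, `expected_png_names("image.pdf", 2)` gives the two names listed above, and `in_page_order` can be applied to a directory listing before running OCR on each page.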

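To convert only one page of a PDF to PNG:

  • add the page number (numbered from 0) in square brackets after .pdf

simon: I just tested this and it only produced this output:

   no matches found: image.pdf[0]

This message most likely comes from the shell rather than from ImageMagick: zsh (the default shell on recent macOS) reads unquoted square brackets as a filename pattern. Putting the file name in quotes avoids it:

$ convert 'image.pdf[0]' image.png

Run OCR on the PNG to produce an hOCR file:

$ tesseract -l [language] [name0fyourFile].png [name0fyourFile] hocr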
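
The "no matches found" message above is typical of a shell treating `[0]` as a filename pattern. One way around it is to call the tools from Python without a shell, passing each argument as a separate list item. This is a minimal sketch under that assumption: the function names are hypothetical, the file names are the example ones from above, and `ell` is the Tesseract code for Greek.

```python
import shlex
import subprocess


def convert_page_command(pdf_path, page_index, png_path):
    """Build the ImageMagick call for one page (pages are numbered from 0)."""
    return ["convert", f"{pdf_path}[{page_index}]", png_path]


def ocr_command(png_path, output_base, language):
    """Build the Tesseract call; the 'hocr' config writes <output_base>.hocr."""
    return ["tesseract", "-l", language, png_path, output_base, "hocr"]


def run(command):
    """Run a command without a shell, so '[0]' is never expanded as a pattern."""
    subprocess.run(command, check=True)


def as_shell_line(command):
    """Show the safely quoted form of a command, for typing into a terminal."""
    return shlex.join(command)
```

Calling `run(convert_page_command("image.pdf", 0, "image-0.png"))` followed by `run(ocr_command("image-0.png", "image-0", "ell"))` would perform both steps for the first page. `as_shell_line` shows how the same command needs to be quoted when typed by hand.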

Make a new searchable pdf:

$ python3 hocrtransform-invisible-PDF.py -i [name0fyourFile].png [HocrFile].hocr [NewPDF].pdf

hOCR

hOCR is an open standard of data representation for formatted text obtained from optical character recognition (OCR). The definition encodes text, style, layout information, recognition confidence metrics and other information using Extensible Markup Language (XML) in the form of Hypertext Markup Language (HTML) or XHTML. [from Wikipedia]
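
To make the description above concrete, here is a small sketch that reads the words, bounding boxes and confidence values out of an hOCR file with Python's standard library. It assumes the conventions Tesseract commonly uses (each word in a `span` with class `ocrx_word`, and a `title` attribute such as `bbox 36 92 96 116; x_wconf 95`); the class and function names are hypothetical, and words containing nested tags are only handled approximately.

```python
from html.parser import HTMLParser


def parse_title(title):
    """Split an hOCR title such as 'bbox 36 92 96 116; x_wconf 95' into fields."""
    fields = {}
    for part in title.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        if tokens[0] == "bbox" and len(tokens) == 5:
            fields["bbox"] = tuple(int(value) for value in tokens[1:])
        elif tokens[0] == "x_wconf" and len(tokens) == 2:
            fields["confidence"] = float(tokens[1])
    return fields


class HocrWordParser(HTMLParser):
    """Collect text, bounding box and confidence for each ocrx_word element."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._current = None  # the word being read, or None outside a word

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "ocrx_word" in (attributes.get("class") or "").split():
            self._current = {"text": "", **parse_title(attributes.get("title") or "")}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.words.append(self._current)
            self._current = None


def extract_words(hocr_text):
    """Return a list of dicts, one per recognised word."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words
```

Reading a file produced by Tesseract with `extract_words(open("image.hocr", encoding="utf-8").read())` gives one entry per word, which makes it easier to find the words to annotate and to spot low-confidence recognitions.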

View the text (produced as hOCR) within the PDF (from https://pzwiki.wdka.nl/mediadesign/Optical_character_recognition_with_Tesseract)

using hocrjs

We will follow the user script instructions, using Tampermonkey.

  • open Firefox
  • go to the Firefox add-ons page and search for Tampermonkey
  • install it
  • browse to unpkg.com/hocrjs/dist/hocr.user.js
    • click "Install". This installs the script in your browser's Tampermonkey
    • click the Tampermonkey icon and go to the "Dashboard". hocr-viewer should be enabled

View the hOCR in Firefox

  • change the extension of your hOCR file from .hocr to .html
  • open the .html file in Firefox

Edit the hOCR file with a text editor, replace the words you want to annotate, and save the result as [editedHocrFile].hocr
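
The same replacement can be scripted instead of done by hand. The sketch below is an illustration, not the WordMord tool: it swaps the text of recognised words while leaving the surrounding tags (and therefore the bounding boxes) untouched. It assumes words sit in `span` elements with class `ocrx_word` and contain plain text only; the function name and the replacement dictionary are hypothetical.

```python
import html
import re

# Matches one hOCR word element whose content is plain text (no nested tags).
WORD_PATTERN = re.compile(
    r"(<span\b[^>]*\bclass=['\"]ocrx_word['\"][^>]*>)([^<]*)(</span>)"
)


def annotate_hocr(hocr_text, replacements):
    """Replace the text of matching words, keeping tags and coordinates intact."""

    def swap(match):
        opening, word, closing = match.groups()
        key = word.strip()
        if key in replacements:
            # Escape the new text so characters like '<' or '&' stay valid markup.
            return opening + html.escape(replacements[key]) + closing
        return match.group(0)

    return WORD_PATTERN.sub(swap, hocr_text)
```

A typical use would be to read the original .hocr file, call `annotate_hocr(text, {"word": "w/o/r/d"})` with your own replacements, and write the result to [editedHocrFile].hocr before running the hocrtransform scripts.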

Make a new searchable pdf with an annotated hocr:

$ python3 hocrtransform-invisible-PDF.py -i [name0fyourFile].png [editedHocrFile].hocr [NewAnnotatedPDF].pdf

Make a new PDF in which the annotation replaces the initial content:

$ python3 hocrtransform-visible-pdf.py -i [name0fyourFile].png [editedHocrFile].hocr [TransformedPDF].pdf


Support

The development of the tools presented is supported by the Creative Industries Fund NL.