
krop: A tool to crop PDF files

What is krop?

krop is a simple graphical tool to crop the pages of PDF files.

How about a brief guided tour & screenshots?

  • krop should work on any recent Linux distribution; see how to install krop. I don't know whether krop can be made to work on Windows or Mac with a sufficient amount of tinkering; please let me know if you succeed.
  • It is written in Python and relies on PyQt, python-poppler-qt5 and PyPDF2 for its functionality.
  • It is free software, released under GPLv3+ in the sole hope that you or someone else may find it useful.
  • A unique feature of krop, at least to my knowledge, is its ability to automatically split pages into subpages to fit the limited screen size of devices such as eReaders. This is particularly useful if your eReader does not support convenient scrolling. (In fact, I initially wrote krop to be able to read mathematical papers on my Nook.)
  • Please report bugs or feature requests at https://github.com/arminstraub/krop where the source code for krop is hosted.

What krop fails to do

Unfortunately, there is no simple way to eliminate unnecessary or invisible parts of a PDF file. krop only adjusts which parts of a PDF are displayed; the original content is still there in the file and will, for instance, show up when editing the file in Inkscape. As a result, krop is not suited for

  • censoring a PDF document or
  • decreasing the size of a PDF file.

That being said, since version 0.5.0, you may have some success in decreasing the size of the PDF (and even censoring some parts) using the option to use Ghostscript to optimize the final PDF.

Changelog

 krop 0.6.0 2020/06/09 Armin Straub
+Fixed aspect ratios, like letter size, can be chosen for selections.
+Selections (or a grid of selections) for the full page can now be created using the context menu or by pressing Insert (or Shift+Insert). Also introduced the command line option --grid to create a grid of selections on the initial page.
+Auto trimming margins can now inspect all pages.
+New option for whether to include pages without selections in the output.
+Added the command line options --optimize (thanks to Ondrej Tichacek for suggesting this feature) and --exceptions.
+Implemented several keyboard shortcuts including Shift+Arrow to move current selection and Delete to remove it.
*Keep track of current selection and highlight it visually.
*Don't fail on PDFs that are encrypted with an empty password.
*Remember window geometry and fit in view setting.
 
 krop 0.5.1 2018/10/27 Armin Straub
*Replace and extend the README file with a Markdown version (thanks to Eduardo Montenegro for doing this).
+Add a manpage.
*Fix a Qt5 related bug when selecting filename for saving (thanks to Lin-Buo-Ren for reporting this).
 
 krop 0.5.0 2018/02/11 Armin Straub
+Support PyQt5.
+Use Ghostscript to optionally optimize the final PDF (thanks to Mathias Rav for the idea and code).
 

Source code

The source code for krop is now hosted on GitHub: https://github.com/arminstraub/krop

Previously, I posted snapshots of the releases here as tarballs. You can download such snapshots from GitHub. For instance, to download krop-0.6.0.tar.gz, the tarball for the most recent release:

$ wget -O krop-0.6.0.tar.gz https://github.com/arminstraub/krop/archive/v0.6.0.tar.gz

Install krop

krop is available in the official repositories of several Linux distributions, where it is ready to install with a single command.

Note that the version of krop in these repositories may not always be the most recent. In that case, or if your distribution doesn't include krop, you can follow the instructions in the next section to install the most recent version directly.

Install krop directly

To get right to it, the following two commands install krop on Kubuntu 16.04/18.04/20.04 (and similar systems based on Debian/Ubuntu):

$ sudo apt install python3-poppler-qt5 python3-pypdf2 python3-pip
$ pip3 install https://github.com/arminstraub/krop/archive/v0.6.0.tar.gz --user

In general, we need to install the Python libraries that krop uses for its functionality: PyQt, python-poppler-qt5 (or the older python-poppler-qt4) as well as PyPDF2 (or the older pyPdf). This is what the first of the two commands does.
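
To confirm that these libraries are in place before installing krop itself, a quick import check can help. The following is a minimal sketch, not part of krop: it assumes the Python 3 module names PyQt5, popplerqt5 and PyPDF2 (adjust them if you use the older variants), and have_module is a helper name made up for this illustration.

```shell
#!/bin/sh
# have_module is a hypothetical helper: it succeeds if python3 can import
# the given module, and fails otherwise (including when python3 is missing).
have_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

# Module names assumed for the Python 3 / Qt5 variants of the libraries.
for module in PyQt5 popplerqt5 PyPDF2; do
    if have_module "$module"; then
        echo "found:   $module"
    else
        echo "missing: $module"
    fi
done
```

Any module reported as missing needs to be installed through your distribution's package manager (or pip) first.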

The following then automatically downloads and installs the latest version of krop:

$ pip3 install https://github.com/arminstraub/krop/archive/v0.6.0.tar.gz --user

This assumes that pip3 is installed on your system. Depending on your distribution and whether you use Python 2, you may need to replace pip3 with pip.

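Prefix this command with sudo and omit --user if you wish to install krop system-wide. (By the way, on newer systems --user can usually be omitted, since recent versions of pip default to a user installation when the system-wide location is not writable.)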

If installed as above, the krop binary should then be in a location like /home/user/.local/bin/krop. (If the /home/user/.local/bin/ directory didn't exist before then, at least on Ubuntu 20.04, you need to log out and back in before your system is able to find krop. In that case, the pip3 install command probably printed a warning like: WARNING: The script krop is installed in '/home/user/.local/bin' which is not on PATH. This is resolved when logging back in because the default ~/.profile script adds that directory to the PATH if it exists.)
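
If you would rather not log out and back in, you can check and extend the PATH by hand. This is a minimal sketch, assuming that pip3 placed the script in ~/.local/bin as described above; on_path is a helper name invented for this example, and the change only lasts for the current shell session.

```shell
#!/bin/sh
# on_path is a hypothetical helper: it succeeds if directory $1 is listed
# in the colon-separated string $2 (which defaults to $PATH).
on_path() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

bindir="$HOME/.local/bin"   # assumed location of the krop script
if on_path "$bindir"; then
    echo "$bindir is already on PATH"
else
    # Only affects the current shell session.
    PATH="$bindir:$PATH"
    export PATH
    echo "added $bindir to PATH for this session"
fi
```

For the change to take effect in your interactive shell, source the snippet (for instance with the . command) instead of running it as a separate script.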

You can uninstall krop using:

$ pip3 uninstall krop

Hints for advanced usage

  • Basic keyboard shortcuts are supported: you can navigate the PDF file using PageUp/PageDown and Home/End, you can create and delete selections using Insert/Delete. You can also move the current selection using the arrow keys while pressing Shift.
  • If you are cropping a PDF file with many pages, then you may have some exceptional pages which need to be cropped in a different way than the other pages. In that case, the option Exceptions under Selections apply to will be useful to you.
  • If you press Trim Margins on a page without selections, then krop will automatically create a region for the full page with the margins trimmed.
  • You can use command line arguments in addition to (or, to a degree, instead of) the graphical interface. Run krop --help to get a list of all possible arguments. For instance, to automatically undo printing 4 pages onto a single page:
    $ krop --go --grid=2x2 file.pdf
    
    Omit the --go to open the GUI with the 2x2 grid of selections pre-created. To additionally trim each of these pages:
    $ krop --go --grid=2x2 --trim --trim-use=all file.pdf
    
    Prefix these commands with xvfb-run (in the package xvfb on Debian/Ubuntu) if you are not running an X server.
  • You can add entries to the default aspect ratios offered in the dropdown boxes for Current Selection and Fit screen of device by editing the configuration file. The location of this file may differ from system to system, but a good place to start looking is ~/.config/arminstraub.com/krop.conf. In that file, edit the entries for SelAspectRatiosDefaults and DeviceTypesDefaults and change the header to SelAspectRatios and DeviceTypes (that is, remove Defaults from the header). Don't forget to adjust the counter size= after adding new entries.
  • If you run into the error multiple definitions in dictionary while cropping a file, then this is because pyPdf is too strict. You can either upgrade to PyPDF2 or proceed as indicated in this bug report.
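
The command line hints above can be combined to process several files in one go. The following is a rough sketch rather than a recommended workflow: crop_one and DRY_RUN are names made up for this example, and treating an empty DISPLAY variable as "no X server is running" is an assumption that may not hold on every setup. Only the krop options shown above are used.

```shell
#!/bin/sh
# crop_one is a hypothetical helper: it runs krop on one file, or only
# prints the command line when DRY_RUN=1.
crop_one() {
    set -- krop --go --grid=2x2 --trim --trim-use=all "$1"
    # Assumption: an empty DISPLAY means no X server, so wrap in xvfb-run.
    [ -n "${DISPLAY:-}" ] || set -- xvfb-run "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Default to a dry run; set DRY_RUN=0 to really crop the files.
DRY_RUN=${DRY_RUN:-1}
for f in ./*.pdf; do
    [ -e "$f" ] || continue   # no PDF files in the current directory
    crop_one "$f"
done
```

Review the printed commands first, then run the script again with DRY_RUN=0 in the environment to crop every PDF in the current directory.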

TODOs: roadmap and wishlist

 Roadmap (ideas planned for one of the next releases)
+Allow translations (once UI has stabilized)
+Support PySide in addition to PyQt
+Support pikepdf in addition to or instead of pyPdf/PyPDF2
+Expose further Ghostscript features (like exporting to PDF 1.7)
+Save and later reuse regions for cropping
 
 Wishlist (ideas that are likely difficult or currently impractical)
+Preserve metadata, table of contents and bookmarks when cropping
+Option to overlay pages in order to improve making selections