Get Rewarded! We will reward you with up to €50 credit on your account for every tutorial that you write and we publish!

Paperless-ng on Docker for Brother devices

profile picture
Author
Markus Hupfauer
Published
2021-03-25
Time to read
8 minutes reading time

About the author- Markus Hupfauer

Introduction

paperless-ng is a fork of Paperless, an application that can be used to index your scanned documents. At the time of writing this tutorial, paperless-ng is not 100% done, so it is likely you will need to regularly update your docker image for the latest features.

Please note that this should not be run available on the public internet. Either limit the IP access to your origin IP or workaround this via VPN (there is already a great pfSense tutorial and hcloud private networks.

Prerequisites

This tutorial will assume there is already a VPN connection available to the server and it can be reached via its private hcloud network IP.

** paperless-ng host **

  • Hostname: dms.yourdomain.com
  • OS: Debian 10
  • Size: CPX11
  • IP address: 10.0.1.1

** Storage box **

  • Hostname: uXXXXX.your-storagebox.de
  • Username: uXXXXX

Step 1 - Preparation of paperless-ng host

Install the paperless-ng host at the closest datacenter to your location and provide the following cloud-init script to kick start the installation.

This will update the base-system and install docker, docker-compose and required libraries and tools in order to talk to the scanner and storagebox.

#cloud-config
package_upgrade: true
packages:
  - cifs-utils
  - sane-utils
  - imagemagick
runcmd:
  - curl -fsSL https://get.docker.com -o get-docker.sh
  - sudo sh get-docker.sh
  - sudo curl -L "https://github.com/docker/compose/releases/download/1.28.5/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
  - sudo chmod +x /usr/local/bin/docker-compose

Step 1.1 - Mount storagebox

Create a credential file /root/.storageboxcred:

username=uXXXXX
password=<STORAGEBOX-PASSWORD>

To auto-mount the storage box on boot add the following line at the end of /etc/fstab:

//uXXXXX.your-storagebox.de/backup /mnt/storagebox cifs credentials=/root/.storageboxcred,uid=root,forceuid,gid=root,forcegid,file_mode=0777,dir_mode=0777  0 0

Create the mount folder mkdir /mnt/storagebox, mount the storagebox with mount -a and create a folder called dms within using mkdir /mnt/storagebox/dms.

Step 1.2 - Prepare paperless-ng Docker

Create a folder in your root directory to place the docker-compose.yml and docker-compose.env files into, and navigate to it. This tutorial will use /root/paperless-ng.

Create docker-compose.env and insert the following lines:

COMPOSE_PROJECT_NAME=paperless
PAPERLESS_OCR_LANGUAGE=deu
PAPERLESS_OCR_CLEAN=clean

If you plan on using paperless-ng only for english documents you may remove PAPERLESS_OCR_LANGUAGE. If you want to use other languages please specify them in a comma separated list according to ISO 639.2 specification.

PAPERLESS_OCR_CLEAN will clean up images to yield nicer results. Not required but works fine on CPX11 machines so why not?

Create docker-compose.yml accordingly:

version: "3.4"
services:
  broker:
    image: redis:6.0
    restart: unless-stopped

  db:
    image: postgres:13
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: paperless
      POSTGRES_USER: paperless
      POSTGRES_PASSWORD: paperless

  webserver:
    image: jonaswinkler/paperless-ng:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
    ports:
      - 80:8000
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000"]
      interval: 30s
      timeout: 10s
      retries: 5
    volumes:
      - /mnt/storagebox/dms/data:/usr/src/paperless/data
      - /mnt/storagebox/dms/media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - /var/lib/paperless-ng/consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBHOST: db


volumes:
  pgdata:

Consumption directory

For security reasons create a service user useradd -r brscan-skey and add it to the scanner group via gpasswd brscan-skey scaner. To properly map the user and group to the containers we require the user_id and group_id of the brscan-skey user within the docker-compose.env file. To accomplish this we run this one-liner:

printf "USERMAP_UID=$(id -u brscan-skey)\nUSERMAP_GID=$(id -g brscan-skey)" >> docker-compose.env

Next we create the consumer folder for the paperless-ng instance to monitor: mkdir /var/lib/paperless-ng && mkdir /var/lib/paperless-ng/consume. The folder is still owned by the root user so we have to change the owner and group to the brscan-skey user/group and fix the permissions.

chown $(id -u brscan-skey):$(id -u brscan-skey) -R /var/lib/paperless-ng/
chmod 700 -R /usr/share/paperless-ng/

Step 1.3 - Prepare Brother environment

Brother Linux drivers are available as .deb packages for most of the devices. Just visit their homepage and find your driver.

As paperless-ng does not print we only need the scan drivers and the brscan-skey utility tool. This utility enables initiating the scan directly at the printer device (if your model supports this).

Required files:

  • Scanner driver 64bit (deb package)
  • Scan-key-tool 64bit (deb package)

Download them and upload them to the machine or get the exact download path with CTRL+J (Chrome) and wget on the machine.

First install for example brscan4 via dpkg -i --force-all brscan4-0.4.10-1.amd64.deb and then install brscan-skey via dpkg -i --force-all brscan-skey-0.3.1-2.amd64.deb. Verify dependencies of installation via apt install -f.

Next we delete all entries from the no longer working brother udev rules nano /etc/udev/rules.d/60-brother-libsane-type1.rules and insert:

ATTR{idVendor}=="04f9", MODE="0666", GROUP="scanner", ENV{libsane_matched}="yes", SYMLINK+="scanner-%k"

The scanner prerequisites are now satisfied and we can configure the sane-service via brsaneconfig4 -a name=FRIENDLY-NAME model=MODEL-NAME ip=xx.xx.xx.xx. For MODEL-NAME chose the exact model name e.g. MFC-L2710DW.

Note that the printer needs to have a static IP configured, either via settings or DHCP lease.

For example this can look like: brsaneconfig4 -a name=Scanner -model=MFC-L2710DW -ip=10.1.1.40. To verify installation run brscan-skey && brscan-skey -l and scanimage -L. This should yield something like:

root@dms:~/paperless-ng# scanimage -L
device `brother4:net1;dev0' is a Brother Scanner MFC-L2710DW

Next we create a custom service to autostart brscan-skey on boot. Create a file nano /etc/systemd/system/brscan-skey.service and paste:

[Unit]
Description=Brother Scan SKey Service

[Service]
Type=forking
Restart=always
User=brscan-skey
ExecStart=/opt/brother/scanner/brscan-skey/brscan-skey
ExecStop=/opt/brother/scanner/brscan-skey/brscan-skey --terminate

[Install]
WantedBy=multi-user.target

Reload systemctl daemon systemctl daemon-reload. Then enable systemctl enable brscan-skey.service and start systemctl start brscan-skey.service our newly created service.

Next we have to specify the interface from which the printer is reachable, in my case enp7s0 (can vary depending on what kind of hcloud machine type you are running on -> check via ip a). nano /opt/brother/scanner/brscan-skey/brscan-skey.config and append the following line eth=enp7s0.

After this we create our own scan script to merge multiple sites into one multi site .tiff document. nano /opt/brother/scanner/brscan-skey/script/scantofile_v2.sh

#! /bin/sh

TMP=$(mktemp -d)
DSTDIR='/var/lib/paperless-ng/consume'
NAME=scan_$(date "+%d.%m.%Y-%H:%M:%S").tiff

scanimage -L
scanimage --device-name="brother4:net1;dev0" --resolution 200 --format=png --batch=$TMP"/out-%d.png" -x 210 -y 297

cd $TMP
convert out-*.png $NAME

mv $NAME $DSTDIR

rm -rf $TMP

NOTE: Replace brother4:net1;dev0 with the appropriate device name displayed when scanimage -L is called. If you scan something else than A4 please change -x and -y parameters accordingly.

Resolution of 200dpi is more than enough for Tesseract OCR.

This script scans all images on the document feeder (the automatic thingy that takes a stack of pages and scans them one by one) as .png files to a /tmp directory, converts them to a multi page .tiff, moves that to the consume directory and deletes the temp directory.

In order for brscan-skey to call our script we need to change the config again nano /opt/brother/scanner/brscan-skey/brscan-skey.config and replace FILE="bash /opt/brother/scanner/brscan-skey/script/scantofile.sh" with FILE="bash /opt/brother/scanner/brscan-skey/script/scantofile_v2.sh".

Step 2 - Initialize paperless-ng

Everything is prepared by now. So we first reboot the machine with reboot and login again in order to apply udev rules properly and also validate our new service.

systemctl status brscan-skey.service should yield:

root@dms:~/paperless-ng# systemctl status brscan-skey.service
● brscan-skey.service - Brother Scan SKey Service
   Loaded: loaded (/etc/systemd/system/brscan-skey.service; enabled; vendor preset: enabled)
   Active: active (running) since XXXX
  Process: 558 ExecStart=/opt/brother/scanner/brscan-skey/brscan-skey (code=exited, status=0/SUCCESS)
 Main PID: 567 (brscan-skey-exe)
    Tasks: 4 (limit: 2296)
   Memory: 20.7M
   CGroup: /system.slice/brscan-skey.service
           └─567 /opt/brother/scanner/brscan-skey/brscan-skey-exe

Mar 05 20:40:54 dms systemd[1]: Starting Brother Scan SKey Service...
Mar 05 20:40:54 dms systemd[1]: Started Brother Scan SKey Service.

Change into the paperless-ng directory cd ~/paperless-ng and pull the docker images docker-compose pull. Next we will initialize the postgres database and create a superuser for it. docker-compose run --rm webserver createsuperuser.

Follow the prompts within the console, you may use a made up e-mail (i.e. admin@admin.invalid. Always use .invalid). Complete initialization with docker-compose up -d.

You can now login at http://IP/. As this tutorial assumes you utilize a secure network (i.e. a secured VPN connection) there is no https in order to have the machine completely unreachable from the internet.

Conclusion

Documents scanned directly from the device will now be placed in /var/lib/paperless-ng/consume by our scanner script /opt/brother/scanner/brscan-skey/script/scantofile_v2.sh from our custom service /etc/systemd/system/brscan-skey.service.

License: MIT
Want to contribute?

Get Rewarded: Get up to €50 in credit! Be a part of the community and contribute. Do it for the money. Do it for the bragging rights. And do it to teach others!

Report Issue
Try Hetzner Cloud

Get 20€ free credit!

Valid until: 31 December 2024 Valid for: 3 months and only for new customers
Get started
Want to contribute?

Get Rewarded: Get up to €50 credit on your account for every tutorial you write and we publish!

Find out more