2. piw-master

The piw-master script is intended to be run on the database and file-server machine. Running piw-slave on the same machine as piw-master is not recommended. The database specified in the configuration must exist and must have been configured with the piw-initdb script. It is strongly recommended that you run piw-master as an ordinary unprivileged user, although it will need write access to the output directory.

2.1. Synopsis

piw-master [-h] [--version] [-c FILE] [-q] [-v] [-l FILE] [-d DSN]
                [-o PATH] [--dev-mode] [--debug TASK] [--pypi-xmlrpc URL]
                [--pypi-simple URL] [--pypi-json URL] [--status-queue ADDR]
                [--control-queue ADDR] [--import-queue ADDR]
                [--log-queue ADDR] [--slave-queue ADDR] [--file-queue ADDR]
                [--web-queue ADDR] [--builds-queue ADDR] [--db-queue ADDR]
                [--fs-queue ADDR] [--stats-queue ADDR]

2.2. Description

-h, --help

Show this help message and exit

--version

Show program’s version number and exit

-c FILE, --configuration FILE

Specify a configuration file to load instead of the defaults at:

  • /etc/piwheels.conf

  • /usr/local/etc/piwheels.conf

  • ~/.config/piwheels/piwheels.conf

-q, --quiet

Produce less console output

-v, --verbose

Produce more console output

-l FILE, --log-file FILE

Log messages to the specified file

-d DSN, --dsn DSN

The connection string for the database to use; this database must be initialized with piw-initdb and the user must not be a PostgreSQL superuser (default: postgres:///piwheels)

-o PATH, --output-path PATH

The path under which the website should be written; must be writable by the current user

--dev-mode

Run the master in development mode; this reduces some timeouts and tweaks some defaults

--debug TASK

Set logging to debug level for the named task; can be specified multiple times to debug many tasks

--pypi-xmlrpc URL

The URL of the PyPI XML-RPC service (default: https://pypi.org/pypi)

--pypi-simple URL

The URL of the PyPI simple API (default: https://pypi.org/simple)

--pypi-json URL

The URL of the PyPI JSON API (default: https://pypi.org/pypi)

--status-queue ADDR

The address of the queue used to report status to monitors (default: ipc:///tmp/piw-status); this is usually an ipc address

--control-queue ADDR

The address of the queue a monitor can use to control the master (default: ipc:///tmp/piw-control); this is usually an ipc address

--import-queue ADDR

The address of the queue used by piw-import, piw-add, piw-remove, and piw-rebuild (default: ipc:///tmp/piw-import); this should always be an ipc address

--log-queue ADDR

The address of the queue used by piw-logger (default: ipc:///tmp/piw-logger); this should always be an ipc address

--slave-queue ADDR

The address of the queue used to talk to piw-slave (default: tcp://*:5555); this is usually a tcp address

--file-queue ADDR

The address of the queue used to transfer files to piw-slave (default: tcp://*:5556); this is usually a tcp address

--web-queue ADDR

The address of the queue used to request web page updates (default: inproc://web)

--builds-queue ADDR

The address of the queue used to store pending builds (default: inproc://builds)

--db-queue ADDR

The address of the queue used to talk to the database server (default: inproc://db)

--fs-queue ADDR

The address of the queue used to talk to the file-system server (default: inproc://fs)

--stats-queue ADDR

The address of the queue used to send statistics to the collator task (default: inproc://stats)
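
The queue defaults above use three address prefixes (ipc, tcp and inproc), and two of the queues are documented as needing an ipc address. As a rough illustration, the following sketch checks a set of queue addresses against that guidance. The function names and the IPC_ONLY set are hypothetical helpers written for this example; they are not part of piwheels.

```python
# Defaults copied from the option descriptions above.
DEFAULT_QUEUES = {
    "status-queue": "ipc:///tmp/piw-status",
    "control-queue": "ipc:///tmp/piw-control",
    "import-queue": "ipc:///tmp/piw-import",
    "log-queue": "ipc:///tmp/piw-logger",
    "slave-queue": "tcp://*:5555",
    "file-queue": "tcp://*:5556",
    "web-queue": "inproc://web",
    "builds-queue": "inproc://builds",
    "db-queue": "inproc://db",
    "fs-queue": "inproc://fs",
    "stats-queue": "inproc://stats",
}

# Queues described above as "should always be an ipc address".
IPC_ONLY = {"import-queue", "log-queue"}


def transport(address):
    """Return the prefix before '://' (for example 'ipc' or 'tcp')."""
    scheme, separator, _ = address.partition("://")
    if not separator:
        raise ValueError(f"not a queue address: {address!r}")
    return scheme


def check_queues(queues):
    """Return a list of warnings for queues that break the ipc-only guidance."""
    return [
        f"{name} should be an ipc address, not {address}"
        for name, address in sorted(queues.items())
        if name in IPC_ONLY and transport(address) != "ipc"
    ]
```

Calling check_queues(DEFAULT_QUEUES) returns an empty list, since the defaults already follow the guidance.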

2.3. Deployment

A typical deployment of the master service on a Raspbian server goes something like this (each step assumes you start as root):

  1. Install the pre-requisite software:

    # apt install postgresql apache2 python3-configargparse python3-zmq \
                  python3-voluptuous python3-cbor2 python3-requests \
                  python3-sqlalchemy python3-psycopg2 python3-chameleon \
                  python3-simplejson python3-urwid python3-geoip python3-pip
    # pip3 install "piwheels[monitor,master,logger]"
    

    If you wish to install directly from the git repository:

    # apt install git
    # pip3 install "git+https://github.com/piwheels/piwheels#egg=piwheels[monitor,master,logger]"
    
  2. Set up the (unprivileged) piwheels user and the output directory:

    # groupadd piwheels
    # useradd -g piwheels -m piwheels
    # mkdir /var/www/piwheels
    # chown piwheels:piwheels /var/www/piwheels
    
  3. Set up the configuration file:

    /etc/piwheels.conf
    [master]
    dsn=postgresql:///piwheels
    output-path=/var/www/piwheels
    
  4. Set up the database:

    # su - postgres
    $ createuser piwheels
    $ createdb -O postgres piwheels
    $ piw-initdb
    
  5. Set up the web server:

    • Point the document root to the output path (/var/www/piwheels above, but it can be anywhere your piwheels user has write access to; naturally you want to make sure your web-server’s user only has read access to the location).

    • Set up SSL for the web server (e.g. with Let’s Encrypt; the dehydrated utility is handy for getting and maintaining the SSL certificates). This part isn’t optional; getting pip to install things from an unencrypted source involves a lot of pain.

    • See below for an example Apache configuration

  6. Start the master running (it’ll take quite a while to populate the list of packages and versions from PyPI on the initial run so get this going before you start bringing up build slaves):

    # su - piwheels
    $ piw-master -v
    
  7. Deploy some build slaves; see piw-slave for deployment instructions.
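
Before starting the master, it can be useful to confirm two of the requirements stated above: that the process is not running as root, and that the configured output path is writable by the current user. The following is a minimal sketch of such a check, assuming the INI layout shown in step 3. The preflight function and its messages are hypothetical, and it only inspects the configuration text it is given (piw-master itself may also take these settings from the command line or from other configuration files).

```python
import configparser
import os


def preflight(conf_text, euid=None):
    """Return a list of problems; an empty list means every check passed."""
    if euid is None:
        euid = os.geteuid()
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)

    problems = []
    if euid == 0:
        problems.append("running as root; switch to the piwheels user")

    # fallback=None also covers a missing [master] section.
    output_path = parser.get("master", "output-path", fallback=None)
    if output_path is None:
        problems.append("no output-path set in the [master] section")
    elif not os.path.isdir(output_path):
        problems.append(f"{output_path} is not a directory")
    elif not os.access(output_path, os.W_OK):
        problems.append(f"{output_path} is not writable by this user")
    return problems
```

For example, running preflight(open("/etc/piwheels.conf").read()) as the piwheels user should return an empty list once steps 2 and 3 are complete.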

2.4. Example httpd configuration

The following is an example Apache configuration similar to that used on the production piwheels master. The port 80 (http) server configuration should look something like this:

/etc/apache2/sites-available/000-default.conf
<VirtualHost *:80>
    ServerName www.example.org
    ServerAlias example.org
    RedirectMatch 302 ^(.*) https://www.example.org$1
</VirtualHost>

Note

Obviously, you will want to replace all instances of “example.org” with your own server’s domain.

On the port 443 (https) side of things, you want the “full” configuration which should look something like this, assuming your output path is /var/www/piwheels:

/etc/apache2/sites-available/default-ssl.conf
<IfModule mod_ssl.c>
    <VirtualHost _default_:443>
        ServerName www.example.org
        ServerAlias example.org
        ServerAdmin webmaster@example.org
        DocumentRoot /var/www/piwheels

        ErrorLog ${APACHE_LOG_DIR}/ssl_error.log
        CustomLog ${APACHE_LOG_DIR}/ssl_access.log combined
        # Send Apache log records to piw-logger for transfer to piw-master
        CustomLog "|/usr/local/bin/piw-logger --drop" combined

        SSLEngine On
        SSLCertificateFile /var/lib/dehydrated/certs/example.org/fullchain.pem
        SSLCertificateKeyFile /var/lib/dehydrated/certs/example.org/privkey.pem

        <Directory /var/www/piwheels>
            Options -Indexes +FollowSymlinks
            AllowOverride None
            Require all granted
            <IfModule mod_rewrite.c>
                RewriteEngine On
                RewriteRule ^project/?$ /packages.html [L,R=301]
                RewriteRule ^p/(.*)/?$ /project/$1 [L,R=301]
            </IfModule>
            <IfModule mod_headers.c>
                Header set Access-Control-Allow-Origin "*"
            </IfModule>
            ErrorDocument 404 /404.html
            DirectoryIndex index.html
        </Directory>

        <Directory /var/www/piwheels/logs/>
            Options +MultiViews
            MultiviewsMatch Any
            RemoveType .gz
            AddEncoding gzip .gz
            <IfModule mod_filter.c>
                FilterDeclare gzip CONTENT_SET
                FilterProvider gzip INFLATE "! req('Accept-Encoding') =~ /gzip/"
                FilterChain gzip
            </IfModule>
        </Directory>
    </VirtualHost>
</IfModule>
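
The two RewriteRule patterns in the configuration above can be sanity-checked outside Apache. The sketch below re-implements them with Python's re module, assuming the usual per-directory behaviour in which the pattern is matched against the request path with the leading slash removed. The redirect_target helper is hypothetical and only mirrors the pattern matching, not the rest of mod_rewrite.

```python
import re

# (pattern, substitution) pairs mirroring the two RewriteRule lines.
RULES = [
    (re.compile(r"^project/?$"), r"/packages.html"),
    (re.compile(r"^p/(.*)/?$"), r"/project/\1"),
]


def redirect_target(relative_path):
    """Return the redirect location for a path, or None if no rule matches.

    relative_path is the request path without its leading slash, as seen
    by rules declared inside a <Directory> block.
    """
    for pattern, substitution in RULES:
        match = pattern.search(relative_path)
        if match:
            # The [L] flag stops processing at the first matching rule.
            return match.expand(substitution)
    return None
```

With these rules, redirect_target("p/numpy") gives "/project/numpy", while a path such as "simple/numpy/" is left alone.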

Several important things to note:

  • A CustomLog line pipes log entries to the piw-logger script which buffers entries and passes them to piw-master for insertion into the database (which in turn is used to generate statistics for the homepage and the project pages)

  • Only index.html is allowed as a directory index; no directory listings are generated (they can be enormous, and remember the master is expected to be deployable on a Raspberry Pi)

  • A couple of mod_rewrite rules handle redirections from legacy paths and provide a friendlier root for the /project/ path

  • The build logs are stored in pre-compressed gzip archives, and the server is configured to serve them verbatim to clients which provide an Accept-Encoding: gzip header. For clients which do not (e.g. curl(1)), the server unpacks the log transparently

  • An example configuration for the SSL certificate locations is given which assumes dehydrated is being used to maintain them
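
The handling of pre-compressed build logs described above can be illustrated in a few lines. This is only a sketch of the behaviour, not of how Apache implements it: the stored bytes are returned untouched when the client's Accept-Encoding header mentions gzip, and decompressed otherwise. The serve_log function is a hypothetical name used for illustration.

```python
import gzip


def serve_log(stored_gzip_bytes, accept_encoding=""):
    """Return (headers, body) for a log that is stored gzip-compressed."""
    # Mirrors the filter condition in the configuration: a plain
    # substring match for "gzip" in the Accept-Encoding header.
    if "gzip" in accept_encoding:
        return {"Content-Encoding": "gzip"}, stored_gzip_bytes
    # Clients that did not ask for gzip receive the unpacked log.
    return {}, gzip.decompress(stored_gzip_bytes)
```

A client sending "Accept-Encoding: gzip, deflate" therefore receives the file exactly as stored on disk, and the server does no decompression work for it.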

2.5. Example database configuration

The following sections detail various setups for the database server. The simplest is the first, the combined configuration in which the machine hosting the master service also hosts the database.

The later sections detail separating the master and database hosts, and assume your master server is accessible at the IPv6 address 1234:abcd::1 and your database server is at the IPv6 address 1234:abcd::2. Replace addresses accordingly.

2.5.1. Combined configuration

This is effectively covered in the prior deployment section. The default DSN, dsn=postgresql:///piwheels, can either be left implicit or specified explicitly in /etc/piwheels.conf.

The only thing to be aware of, particularly if you are deploying on a Pi, is that the calculation of the build queue is quite a big query. Assuming you are targeting all packages on PyPI (as the production piwheels instance does), you should never consider running the combined database+master on a machine (or VM) with fewer than 4 cores and 4 GB of RAM, preferably more. If deploying a combined master+database on a Pi, use a Pi 4 with 8 GB of RAM.

2.5.2. Separate configuration

If you wish to deploy your PostgreSQL database on a separate server, you will first need to ensure that server can accept remote connections from the master server. A simple (but less secure) means of configuring this is to “trust” connections made from the master’s IP address to the piwheels database by the piwheels user. This can be accomplished by adding the last line below to pg_hba.conf:

/etc/postgresql/ver/main/pg_hba.conf
# Database administrative login by Unix domain socket
local   all             postgres                                peer

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     peer
# IPv4 local connections:
host    all             all             127.0.0.1/32            md5
# IPv6 local connections:
host    all             all             ::1/128                 md5
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     peer
host    replication     all             127.0.0.1/32            md5
host    replication     all             ::1/128                 md5
host    piwheels        piwheels        1234:abcd::1/128        trust

Then restart the PostgreSQL server:

# systemctl restart postgresql

Then, on the master, use the following DSN in /etc/piwheels.conf:

/etc/piwheels.conf
[master]
dsn=postgresql://piwheels@[1234:abcd::2]/piwheels
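
The square brackets in the DSN above keep the colons of the IPv6 address from being read as a port separator, and the /128 suffix in the pg_hba.conf entry restricts the rule to exactly one address. The following sketch uses only the Python standard library to show both points. The helper names are hypothetical, and urlsplit is a generic URL parser, so this is an approximation of how a PostgreSQL client reads the DSN rather than the real parsing code.

```python
import ipaddress
from urllib.parse import urlsplit


def describe_dsn(dsn):
    """Split a URL-style DSN into user, host and database name."""
    parts = urlsplit(dsn)
    return {
        "user": parts.username,
        "host": parts.hostname,  # brackets around IPv6 are removed
        "database": parts.path.lstrip("/"),
    }


def hba_entry_allows(client_address, hba_address):
    """Check whether a client address falls inside a pg_hba ADDRESS range."""
    network = ipaddress.ip_network(hba_address)
    return ipaddress.ip_address(client_address) in network
```

For the remote DSN, describe_dsn reports the host as 1234:abcd::2, whereas the local DSN postgresql:///piwheels has no host at all, which is why it connects over the local UNIX socket. The pg_hba.conf entry matches the master at 1234:abcd::1 and nothing else.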

Warning

Never provide remote access to the PostgreSQL superuser, postgres. Install the piwheels package directly on the database server and run the piw-initdb script locally. This will also require creating an /etc/piwheels.conf on the database server that uses a typical “local” DSN like dsn=postgresql:///piwheels.

2.5.3. SSH tunnelling

A more secure (but rather more complex) option is to create a persistent SSH tunnel from the master to the database server which forwards the UNIX socket for the database back to the master as the unprivileged piwheels user.

Firstly, on the master, generate an SSH key-pair for the piwheels user and copy the public key to the database server:

# su - piwheels
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/piwheels/.ssh/id_rsa):
Created directory '/home/piwheels/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/piwheels/.ssh/id_rsa
Your public key has been saved in /home/piwheels/.ssh/id_rsa.pub
...
$ ssh-copy-id piwheels@1234:abcd::2

Note

This assumes that you temporarily permit password-based login for the piwheels user on the database server.

Secondly, set up a systemd(1) service to maintain the tunnel:

/etc/systemd/system/piwheelsdb-tunnel.service
[Unit]
Description=A secure tunnel for the piwheelsdb connection
After=local-fs.target network.target

[Service]
User=piwheels
Group=piwheels
RuntimeDirectory=postgresql
RuntimeDirectoryPreserve=restart
ExecStart=/usr/bin/ssh -NT \
  -o BatchMode=yes \
  -o ExitOnForwardFailure=yes \
  -o StreamLocalBindUnlink=yes \
  -L /run/postgresql/.s.PGSQL.5432:/run/postgresql/.s.PGSQL.5432 \
  piwheels@1234:abcd::2
RestartSec=5
Restart=on-failure

[Install]
WantedBy=multi-user.target

# systemctl daemon-reload
# systemctl enable piwheelsdb-tunnel.service
# systemctl start piwheelsdb-tunnel.service

At this point, you should be able to switch back to the piwheels user and connect to the piwheels database (note, however, that because the tunnel is owned by the unprivileged piwheels user, only that user can reach the remote database through it):

# su - piwheels
$ psql piwheels
psql (13.5 (Debian 13.5-0+deb11u1))
Type "help" for help.

piwheels=>
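
A script-friendly complement to the interactive psql check is to confirm that the forwarded socket exists at the expected path. The sketch below is a hypothetical helper, not part of piwheels. It only verifies that a UNIX socket is present, not that PostgreSQL is answering on the far end of the tunnel.

```python
import os
import stat


def is_unix_socket(path):
    """Return True if path exists and is a UNIX domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing file, missing directory, or no permission to look.
        return False
    return stat.S_ISSOCK(mode)
```

On the master, is_unix_socket("/run/postgresql/.s.PGSQL.5432") should return True while the tunnel service is running.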

Note

This method requires no alteration of pg_hba.conf on the database server; the default should be sufficient. As far as the database server is concerned the local piwheels user is simply accessing the database via the local UNIX socket.

2.6. Automatic start

If you want the master to start on every boot-up, you can define a systemd unit for it:

/etc/systemd/system/piwheels-master.service
[Unit]
Description=The piwheels master service
After=local-fs.target network.target

[Service]
Type=notify
Restart=on-failure
User=piwheels
NoNewPrivileges=true
TimeoutStartSec=3m
TimeoutStopSec=5m
ExecStart=/usr/local/bin/piw-master -v
ExecStartPost=-chmod g+w /tmp/piw-status /tmp/piw-control

[Install]
WantedBy=multi-user.target

# systemctl daemon-reload
# systemctl enable piwheels-master
# systemctl start piwheels-master
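
The Type=notify setting in the unit above means systemd waits for the service to announce that it is ready rather than assuming so as soon as the process starts. For readers unfamiliar with the mechanism, the following is a generic sketch of the sd_notify datagram protocol. It is not taken from the piwheels source, and the sd_notify function here is a hypothetical stand-alone helper.

```python
import os
import socket


def sd_notify(message, env=os.environ):
    """Send a notification such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is unset, which is the case when
    the process was not started by systemd with Type=notify.
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), path)
    return True
```

This also explains the generous TimeoutStartSec in the unit: systemd keeps the service in its "activating" state until the readiness message arrives or the timeout expires.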

2.7. Upgrades

The master checks that build slaves have the same version number as itself and rejects any that do not. It also checks that the version number in the database’s configuration table matches its own, and fails if it does not. Re-run the piw-initdb script as the PostgreSQL superuser to upgrade the database between versions (downgrades are not supported, so take a backup first!).