MS Azure cloud setup with Pulumi (Alan Turing Institute)

Note

This guide explains how to run a full setup of the Seshat Django app on Azure with Pulumi. It is based on this guide.

It assumes that you have access to a Seshat database dump, including all the spatial data. You can access it through the project’s Google Drive if you have access to it.

It assumes that you are located in the UK and have access to the Azure subscription that the Seshat project is under.

It assumes that your data tables are populated with the shape data. If they are not, you can populate them with the instructions in spatialdb.rst.

Warning

The setup is currently only partially automated with Pulumi. As you’ll see below, subsequent manual steps are required that involve SSH-ing into the created VM.

This setup guide is specific to Azure, but can be followed outside The Alan Turing Institute with a credited Azure subscription.

Prerequisites

The following instructions assume you have the following software installed:

  • Python 3 (in order to use venv)

  • Pulumi

  • Azure CLI
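As a quick sanity check before continuing, the sketch below reports which of these tools are missing from your PATH. The binary names python3, pulumi and az are assumptions about how the tools are installed on your machine, and missing_tools is a helper name of our own.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the three prerequisites listed above (binary names are assumptions).
missing=$(missing_tools python3 pulumi az)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found."
fi
```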

Step 1: Log in to Azure

Log in and check that the subscription you will use appears in the list of subscriptions that is printed out, then set it as the active subscription:

$ az login
$ az account set --subscription "<subscription-id>"
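If you want to confirm that the switch took effect, one option is to compare the active subscription ID against the one you expect. This is a minimal sketch built on az account show; the helper name subscription_is_active is our own and not part of the Azure CLI.

```shell
# Succeed only if the active Azure subscription has the expected ID.
subscription_is_active() {
    expected_id=$1
    # `az account show` describes the currently active subscription;
    # --query id --output tsv reduces the output to the bare ID.
    active_id=$(az account show --query id --output tsv) || return 1
    [ "$active_id" = "$expected_id" ]
}

# Usage:
#   subscription_is_active "<subscription-id>" && echo "Subscription is active"
```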

Step 2: Create a virtual environment for Pulumi

You can use either Conda or Python’s built-in venv module to create a virtual environment (you could also re-use the environment you set up for Seshat development and install the requirements there).

Create the environment:

$ conda create --name seshat_pulumi

Activate the environment:

$ conda activate seshat_pulumi

Install the requirements:

$ pip install -r pulumi/requirements.txt
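The commands above use Conda. If you prefer Python’s built-in venv module, a roughly equivalent sketch is shown below. The function name and the .venv-pulumi directory in the usage comment are hypothetical; only the requirements path comes from this guide.

```shell
# Create a venv, activate it in the current shell, and install requirements.
setup_pulumi_venv() {
    venv_dir=$1
    requirements=$2
    python3 -m venv "$venv_dir" || return 1
    # `.` is the POSIX spelling of `source`.
    . "$venv_dir/bin/activate" || return 1
    pip install -r "$requirements"
}

# Usage (run from the repository root):
#   setup_pulumi_venv .venv-pulumi pulumi/requirements.txt
```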

Step 3: Set up a Pulumi stack

We assume here that you’ll use our provided Pulumi setup (located in the /pulumi directory in this repository).

Important

In the provided setup in the /pulumi directory, we have already set up the Pulumi stack for you.

In the included setup, we:

  • Chose a sensible project name: seshat-dev

  • Chose the stack name seshat

  • Chose UKSouth location

  • Made custom edits to the config files for the Seshat app

To set up this Pulumi stack, run the following commands:

$ pulumi stack init seshat
$ pulumi stack select seshat

Step 4: Configure Pulumi

You will need to provide the following configuration values:

  • sshPublicKey: The public key that will be used to SSH into the VM. This guide assumes it is stored at ~/.ssh/id_rsa.pub.

The following command will set the sshPublicKey configuration value:

$ pulumi config set --secret sshPublicKey "$(cat ~/.ssh/id_rsa.pub)"
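Note that if ~/.ssh/id_rsa.pub does not exist, the command substitution above yields an empty string and the config value is set to nothing. A hedged sketch of a guard is shown below; read_public_key is a hypothetical helper, not part of Pulumi.

```shell
# Print the public key at the given path, or fail with a message on stderr.
read_public_key() {
    key_file=$1
    if [ ! -s "$key_file" ]; then
        echo "No public key at $key_file; generate one with ssh-keygen first." >&2
        return 1
    fi
    cat "$key_file"
}

# Usage: only set the config value if the key could be read.
#   key=$(read_public_key ~/.ssh/id_rsa.pub) && \
#       pulumi config set --secret sshPublicKey "$key"
```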

Step 5: Deploy the app

To deploy the app, run the following command:

$ pulumi up

Manual steps

The Pulumi setup is only partially automated. The following steps are required to complete the setup:

  • SSH into the created VM

  • Set up the database

  • Run the Django app

Manual step 1: SSH into the created VM

First, we want to get the public IP address of the VM:

$ pulumi stack output

This will output the public IP address of the VM. Make a note of this IP address as you will need it to SSH into the VM.
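pulumi stack output also accepts the name of a single output, which can be handy for scripting. The sketch below wraps that call. The output name depends on what the Pulumi program in /pulumi exports, so public_ip in the usage comment is a hypothetical placeholder; check the plain pulumi stack output listing for the real name.

```shell
# Print the value of one named stack output (for example the VM's IP address).
vm_ip() {
    output_name=$1
    pulumi stack output "$output_name"
}

# Usage (public_ip is a hypothetical output name):
#   ssh -i ~/.ssh/id_rsa "webadmin@$(vm_ip public_ip)"
```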

In order to SSH into the VM, run the following command:

$ ssh -i ~/.ssh/id_rsa webadmin@<VM IP address>

Manual step 2: Set up the database

Once we’ve logged into the VM, we need to set up the database. In this and the following steps, we create the database, add PostGIS to it, set a password for the superuser, update Postgres to use md5, and restore the database from the dump.

To create the database, we need to open psql:

$ sudo -u postgres psql

Then, create the database:

CREATE DATABASE <seshat_db_name>;

Exit out of psql using \q.

Next, we need to add PostGIS to the database by opening psql again, this time connected to the new database:

$ sudo -u postgres psql -d <seshat_db_name>

Then, add PostGIS to the database:

CREATE EXTENSION postgis;

Exit out of psql using \q.
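The same two statements can also be run non-interactively with psql’s -c option, which avoids opening and closing psql twice. This is a minimal sketch under the assumption that the database name is a plain lowercase identifier, since it is interpolated unquoted into the SQL; create_postgis_db is a hypothetical helper name.

```shell
# Create a database and enable PostGIS in it, without an interactive session.
create_postgis_db() {
    db_name=$1
    # Same statement as above, passed on the command line.
    sudo -u postgres psql -c "CREATE DATABASE $db_name;" || return 1
    # IF NOT EXISTS makes the second step safe to re-run.
    sudo -u postgres psql -d "$db_name" -c "CREATE EXTENSION IF NOT EXISTS postgis;"
}

# Usage:
#   create_postgis_db <seshat_db_name>
```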

Manual step 3: Secure the database

Choose a password for Postgres. At Turing we have an Azure Key Vault set up under the project subscription where this can be saved (the one we have set up can be reused).

In order to add the password for the superuser, open psql:

$ sudo -u postgres psql

Then, add the password for the superuser:

ALTER USER postgres WITH PASSWORD '<db_password>';

Exit out of psql using \q.

Update Postgres to use md5 authentication by editing pg_hba.conf:

$ sudo nano /etc/postgresql/16/main/pg_hba.conf

[Screenshot: pg_hba.conf.png]

In order for the changes to take effect, reload Postgres:

$ sudo systemctl reload postgresql
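Because the pg_hba.conf edit is only shown as an image, the exact lines to change are not spelled out in the text. As an assumption on our part, switching to md5 commonly means replacing the peer method with md5 on the "local" entries. The sketch below prints such a rewritten copy so that you can compare it with your file before editing anything; peer_to_md5 is a hypothetical helper and the choice of lines is a guess, not the guide’s confirmed change.

```shell
# Print a copy of a pg_hba.conf-style file in which the `peer` method is
# replaced by `md5` on lines starting with "local". All other lines,
# including comments, pass through unchanged.
peer_to_md5() {
    sed -E 's/^(local[[:space:]].*[[:space:]])peer[[:space:]]*$/\1md5/' "$1"
}

# Usage: work on a copy, inspect the difference, then edit the real file.
#   sudo cp /etc/postgresql/16/main/pg_hba.conf /tmp/pg_hba.conf.copy
#   sudo chmod a+r /tmp/pg_hba.conf.copy
#   peer_to_md5 /tmp/pg_hba.conf.copy | diff /tmp/pg_hba.conf.copy -
```

If the local entry for the postgres user is switched to md5 in this way, sudo -u postgres psql will prompt for the password set above once Postgres has been reloaded.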

Manual step 4: Restore the database from the dump

Hint

This step assumes that you have access to the Seshat database dump.

You can access it through the project’s Google Drive.

In order to restore the database from the dump, run the following command:

$ sudo psql -U postgres <seshat_db_name> < ~/seshat.dump

[Optional] manual step 4.1: Update the database with the latest shape data

If you need to update the database with the latest shape datasets, you can do so by following the instructions in spatialdb.rst.

You can first upload the data files required to the VM using scp:

$ scp -i ~/.ssh/id_rsa path/to/datafile webadmin@<VM IP address>:location_on_vm/datafile

Manual step 5: Collect static files

$ cd seshat
$ source venv/bin/activate
$ python manage.py collectstatic

Manual step 6: Add the IP address to the allowed hosts

First, open seshat/settings/local.py and add the created IP address to ALLOWED_HOSTS.

ALLOWED_HOSTS = ['<public IP>']

Then, ensure local (test site) settings are set:

$ export DJANGO_SETTINGS_MODULE=seshat.settings.local

Manual step 7: Run the Django app

$ sudo ufw allow 8000
$ cd seshat
$ source venv/bin/activate
$ export DJANGO_SETTINGS_MODULE=seshat.settings.local
$ gunicorn seshat.wsgi:application --config gunicorn.conf.py

Now, you should be able to reach the app on the publicly exposed IP on port 8000: http://<public IP>:8000/.
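If you would rather check from a terminal than a browser, the sketch below prints just the HTTP status code of a request. It only uses standard curl options; http_status is a hypothetical helper name.

```shell
# Print the HTTP status code returned by a URL (000 if curl cannot connect).
http_status() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Usage (run from your own machine while Gunicorn is running on the VM):
#   http_status "http://<public IP>:8000/"
```

A 200 or a redirect suggests the app is up. A 400 usually means the host is missing from ALLOWED_HOSTS, and 000 means the connection failed, for example because port 8000 is not open.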

[WIP] Manual step 8: Set up Nginx to work with Gunicorn

Untested instructions

You can test the app by running it with Gunicorn:

$ gunicorn seshat.wsgi:application --config gunicorn.conf.py

Visiting the public IP address in your browser should now show the Seshat app on port 8000: http://<public IP>:8000/.

Kill the Gunicorn process with Ctrl+C.

Create a systemd socket for Gunicorn:

$ sudo nano /etc/systemd/system/gunicorn.socket

Inside the file, add the following:

[Unit]
Description=gunicorn socket

[Socket]
ListenStream=/run/gunicorn.sock

[Install]
WantedBy=sockets.target

Next, create a systemd service file for Gunicorn. The service filename should match the socket filename with the exception of the extension:

$ sudo nano /etc/systemd/system/gunicorn.service

Inside the file, add the following:

[Unit]
Description=gunicorn daemon
Requires=gunicorn.socket
After=network.target

[Service]
User=webadmin
Group=webadmin
WorkingDirectory=/home/webadmin/seshat
ExecStart=/home/webadmin/seshat/venv/bin/gunicorn \
          --access-logfile - \
          --workers 3 \
          --bind unix:/run/gunicorn.sock \
          seshat.wsgi:application --config gunicorn.conf.py

[Install]
WantedBy=multi-user.target

You can now start and enable the Gunicorn socket. This will create the socket file at /run/gunicorn.sock now and at boot. When a connection is made to that socket, systemd will automatically start the gunicorn.service to handle it:

$ sudo systemctl start gunicorn.socket
$ sudo systemctl enable gunicorn.socket

You can check the status of the Gunicorn socket and Gunicorn with:

$ sudo systemctl status gunicorn.socket
$ sudo systemctl status gunicorn
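To see socket activation in action, you can send a request through the socket itself, which should cause systemd to start gunicorn.service if it is not already running. This is a minimal sketch using curl’s --unix-socket option; socket_http_status is a hypothetical helper name, and like the rest of this step it is untested against the real VM.

```shell
# Send one HTTP request through a Unix socket and print the status code.
socket_http_status() {
    socket_path=$1
    curl -s -o /dev/null -w '%{http_code}' --unix-socket "$socket_path" http://localhost/
}

# Usage (on the VM):
#   socket_http_status /run/gunicorn.sock
#   sudo systemctl status gunicorn   # should now report the service as active
```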

Next, we need to set up Nginx to pass web requests to the socket file:

$ sudo nano /etc/nginx/sites-available/seshat

Add the following configuration:

server {
    listen 80;
    server_name <public IP>;

    location = /favicon.ico { access_log off; log_not_found off; }

    location / {
        include proxy_params;
        proxy_pass http://unix:/run/gunicorn.sock;
    }

    location /static/ {
        autoindex on;
        alias /home/webadmin/seshat/seshat/staticfiles/;
    }

}

Edit the main Nginx configuration so that Nginx can access all the files in our project:

$ sudo nano /etc/nginx/nginx.conf

At the top of the file, the user should be changed from www-data to webadmin:

user webadmin;

Then, link the file to the sites-enabled directory:

$ sudo ln -s /etc/nginx/sites-available/seshat /etc/nginx/sites-enabled

Check the Nginx configuration:

$ sudo nginx -t

If the test is successful, restart Nginx:

$ sudo systemctl restart nginx

You should now be able to access the Seshat app on the public IP address http://<public IP>/.