Hire the author: Maina K

All code referenced in this django-tenant-schemas tutorial can be found here.

Introduction

Recently I was tasked with converting an existing single-tenant API to a multi-tenant one. The API was built with Django REST Framework and used PostgreSQL as its persistence layer. After a few minutes of scouring the web for a solution, I settled on django-tenant-schemas.

django-tenant-schemas is a Django package that creates client-specific schemas within a single Postgres database. It also routes each request to the right schema so that data isolation is maintained; in other words, every client can only access the data associated with their account. In short, it helps you convert a single-tenant API to a multi-tenant one with minimal changes to your existing codebase.

Glossary

  • Single Tenant – A software architecture in which a single instance of the software runs on a server and serves a single tenant.
  • Multi-Tenancy – A software architecture in which a single instance of the software runs on a server and serves multiple tenants.
  • Postgres Schemas – Named collections of tables. They are analogous to directories at the operating system level, except that schemas cannot be nested.
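
To make the schema idea concrete, here is a toy model in plain Python (no database involved). It assumes the usual Postgres behaviour where an unqualified table name resolves to the first schema on the search path that contains it. The schema names, table names and rows are invented for illustration.

```python
# Toy model of Postgres schemas: each schema is a named collection of tables.
# Schema names, table names and rows are invented for illustration.
database = {
    "public": {"tenant_client": ["Pollsmaster", "Acme"]},
    "pollsmaster": {"polls_poll": ["Best editor?"]},
    "acme": {"polls_poll": ["Best language?"]},
}


def resolve_table(table, search_path):
    """Return (schema, rows) for the first schema on the path that holds the table."""
    for schema in search_path:
        tables = database.get(schema, {})
        if table in tables:
            return schema, tables[table]
    raise LookupError(f"relation {table!r} does not exist")


# The same unqualified table name resolves differently per search path.
print(resolve_table("polls_poll", ["pollsmaster", "public"]))
print(resolve_table("polls_poll", ["acme", "public"]))
# Tables that only exist in the public schema are still reachable.
print(resolve_table("tenant_client", ["acme", "public"]))
```

Each tenant sees its own polls_poll table because its schema comes first on the path, while shared tables are found in public.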

Let’s get started

Prerequisites

  • Basic knowledge of how REST APIs work
  • Knowledge of Django and Django Rest Framework
  • Understanding of the basic functionality of Git and GitHub.
  • A running Postgres DB instance

Throughout this walk-through, you will use an existing single-tenant API as the ‘test subject’. It is based on the polls app from the popular Django beginner tutorial. You will use the REST API rendition of the app and add multi-tenancy on top of its existing functionality. Find the final code in this GitHub repository.

Setting up PostgreSQL

Download and install Postgres on your local machine, then set it up as the default persistence layer for the API in place of Django's default, SQLite.
Add the code below to your settings.py file.

import os  # needed for os.getenv below

DATABASES = {
    'default': {
        'ENGINE': 'tenant_schemas.postgresql_backend',
        'NAME': os.getenv('DB_NAME', 'pollsapi'),
        'USER': os.getenv('DB_USER', 'postgres'),
        'PASSWORD': os.getenv('DB_PASSWORD', 'postgres'),
        'HOST': os.getenv('HOST', 'localhost'),
        'PORT': os.getenv('PORT', 5432),
    }
}

Ignore the ENGINE value for now; it is explained later in this tutorial. The remaining keys and their values should be familiar. As a security precaution, keep the database credentials out of your source code by putting them in an env file. You should have a .env file in your root folder that resembles:

SECRET_KEY=<your-very-secret-key>
DB_NAME=pollsapi
DB_USER=<your-db-user>
DB_PASSWORD=<your-db-password>
HOST=localhost
DEBUG=True
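
The article does not say how the .env file gets loaded into the process environment. Packages such as python-dotenv are commonly used for this. Purely as an illustration, the sketch below shows a minimal standard-library loader; load_env_file and env_int are hypothetical helper names, not part of Django or django-tenant-schemas.

```python
import os


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a file into os.environ, keeping existing values."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip("'\"")
            # Variables already set in the real environment take precedence.
            os.environ.setdefault(key, value)
            loaded[key] = value
    return loaded


def env_int(name, default):
    """Read an integer setting; os.getenv returns a string when the variable is set."""
    return int(os.getenv(name, default))
```

Under these assumptions you would call load_env_file() near the top of settings.py, before DATABASES is defined, and could read the port with env_int('PORT', 5432) so that it is always an integer.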

Django Tenant Schemas

As explained earlier, django-tenant-schemas is a Django-based Python package that does most of the heavy lifting when restructuring our database architecture from single-tenant to multi-tenant. Install it by running:

pip install django-tenant-schemas

Most of the steps that follow are already covered in the django-tenant-schemas documentation, so we will not delve into the details; feel free to peruse the documentation at your leisure. Make the following edits to the settings.py file:

DATABASE_ENGINE

Set the ENGINE key of the default database (in the DATABASES setting shown earlier) to tenant_schemas.postgresql_backend so that django-tenant-schemas can automatically create schemas in our database for each tenant.

DATABASE_ROUTERS

You also need to define the DATABASE_ROUTERS setting in settings.py so that each app is synced to the correct schemas. Add the code below:

DATABASE_ROUTERS = (
    'tenant_schemas.routers.TenantSyncRouter',
)

MIDDLEWARE

In the default django-tenant-schemas implementation you should also add tenant_schemas.middleware.TenantMiddleware to the top of the MIDDLEWARE list (named MIDDLEWARE_CLASSES in Django versions before 1.10), so that each request is set to use the correct schema. Your MIDDLEWARE list should resemble:

MIDDLEWARE = [
    'tenant_schemas.middleware.TenantMiddleware',
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

SHARED_APPS/TENANT_APPS/INSTALLED_APPS

It is important to specify which apps are shared and accessed publicly (SHARED_APPS) and which are specific to tenants (TENANT_APPS). This lets django-tenant-schemas store each app's data in the appropriate schema. INSTALLED_APPS is the default Django apps list and should remain as is, with one change: tenant_schemas should be placed at the top of the list:

INSTALLED_APPS = [
    'tenant_schemas',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'rest_framework.authtoken',
    'pollsapi.polls',
]

SHARED_APPS

SHARED_APPS is a new tuple/list that tells django-tenant-schemas which apps are intended for public use. The tuple/list should resemble:

SHARED_APPS = (
    'tenant_schemas',
    'django.contrib.contenttypes',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'rest_framework.authtoken',
)

TENANT_APPS

Finally, TENANT_APPS is a tuple/list that indicates which apps are accessible only to tenants; individuals without a defined schema will not be able to access the apps on this list. When existing tenants use these apps, their data is routed to and saved in their respective schemas. The tuple/list should resemble:

TENANT_APPS = (
    'django.contrib.contenttypes',
    'pollsapi.polls',
)
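
As a rough mental model of what the SHARED_APPS and TENANT_APPS split achieves, the sketch below decides, per schema, whether an app's tables should be created there. It is a simplified illustration written for this article, not the library's actual router code; the app labels mirror the settings above, and the function name and the assumption that the public schema is called "public" are mine.

```python
PUBLIC_SCHEMA = "public"  # assumed name of the shared schema

# App labels mirroring the SHARED_APPS and TENANT_APPS settings.
SHARED_APP_LABELS = {
    "tenant_schemas", "contenttypes", "admin", "auth", "sessions",
    "messages", "staticfiles", "rest_framework", "authtoken",
}
TENANT_APP_LABELS = {"contenttypes", "polls"}


def allow_migrate(schema_name, app_label):
    """Decide whether an app's tables belong in the given schema."""
    if schema_name == PUBLIC_SCHEMA:
        return app_label in SHARED_APP_LABELS
    return app_label in TENANT_APP_LABELS


# Tables for the polls app are created in tenant schemas only.
print(allow_migrate("public", "polls"))
print(allow_migrate("pollsmaster", "polls"))
# contenttypes appears in both lists, so it is created in every schema.
print(allow_migrate("public", "contenttypes"), allow_migrate("pollsmaster", "contenttypes"))
```

Listing an app in both settings, as done with django.contrib.contenttypes, gives every schema its own copy of that app's tables.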

Tenant App

Create the tenant app by running the following command from your pollsapi root folder:

django-admin startapp tenant

Proceed to the models.py file in the newly created tenant app and add the following code:

import os
import uuid

from django.db import models
from tenant_schemas.models import TenantMixin


class Client(TenantMixin):
    # Fields that must be supplied when a tenant is created
    REQUIRED_FIELDS = ('tenant_name', 'paid_until', 'schema_name', 'on_trial')

    tenant_name = models.CharField(max_length=100, unique=True, null=False, blank=False)
    # Identifier that clients send to select this tenant
    tenant_uuid = models.UUIDField(default=uuid.uuid4, null=False, blank=False)
    paid_until = models.DateField()
    on_trial = models.BooleanField()
    created_on = models.DateField(auto_now_add=True)
    domain_url = models.URLField(blank=True, null=True, default=os.getenv('DOMAIN'))

    # default true, schema will be automatically created and synced when it is saved
    auto_create_schema = True

The above is the blueprint that defines our tenant. By default, django-tenant-schemas uses subdomains to detect the tenant and route the request accordingly. In our case, since we want a single fixed URL for all our tenants, we will instead identify the tenant by a unique identifier (UUID) sent in the request headers. Next, you will add the custom middleware responsible for routing our HTTP requests to the right tenant.

Custom Middleware

In the tenant app, create a file called middleware.py and add the following code (the dateutil import is provided by the python-dateutil package):

from datetime import date

from dateutil.relativedelta import relativedelta
from django.core.exceptions import ObjectDoesNotExist, ValidationError
from tenant_schemas.middleware import BaseTenantMiddleware
from tenant_schemas.utils import get_public_schema_name


class RequestIDTenantMiddleware(BaseTenantMiddleware):
    def get_tenant(self, model, hostname, request):
        # Fetch the public tenant, creating it the first time it is needed.
        try:
            public_schema = model.objects.get(schema_name=get_public_schema_name())
        except ObjectDoesNotExist:
            public_schema = model.objects.create(
                domain_url=hostname,
                schema_name=get_public_schema_name(),
                tenant_name=get_public_schema_name().capitalize(),
                paid_until=date.today() + relativedelta(months=+1),
                on_trial=True)

        # Without an X-Request-ID header the request is served by the public tenant.
        x_request_id = request.META.get('HTTP_X_REQUEST_ID')
        if x_request_id is None:
            return public_schema

        # An unknown or malformed UUID also falls back to the public tenant.
        try:
            return model.objects.get(tenant_uuid=x_request_id)
        except (ObjectDoesNotExist, ValidationError):
            return public_schema

The above class inherits from django-tenant-schemas's BaseTenantMiddleware and overrides the get_tenant method, where we add our own logic for retrieving a tenant. We check the request headers for a UUID sent in a header called X-Request-ID. If one is found, we query the tenant model for a matching tenant. If no tenant is found, we return the public tenant, which lets the user access the apps that are open to the public.
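
The lookup logic can be tried out without Django. The sketch below is a framework-free approximation: it relies on the fact that Django exposes an HTTP header such as X-Request-ID under the key HTTP_X_REQUEST_ID in request.META, and it stands in for the database with a plain dictionary. The function names and the sample tenants are hypothetical.

```python
import uuid


def header_to_meta_key(header_name):
    """Map an HTTP header name to the key Django uses in request.META."""
    return "HTTP_" + header_name.upper().replace("-", "_")


def resolve_tenant(meta, tenants_by_uuid, public_tenant):
    """Pick the tenant named by X-Request-ID, falling back to the public tenant."""
    raw_value = meta.get(header_to_meta_key("X-Request-ID"))
    if raw_value is None:
        return public_tenant
    try:
        tenant_id = uuid.UUID(str(raw_value))
    except ValueError:
        # A malformed UUID is treated the same as an unknown tenant.
        return public_tenant
    return tenants_by_uuid.get(tenant_id, public_tenant)


# Sample data standing in for rows of the tenant table.
pollsmaster_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
tenants = {pollsmaster_id: "pollsmaster"}

print(resolve_tenant({"HTTP_X_REQUEST_ID": str(pollsmaster_id)}, tenants, "public"))
print(resolve_tenant({"HTTP_X_REQUEST_ID": "not-a-uuid"}, tenants, "public"))
print(resolve_tenant({}, tenants, "public"))
```

Falling back to the public tenant for malformed or unknown identifiers is one possible policy; rejecting such requests outright would be a stricter alternative.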

We need to change settings.py so that it uses our custom middleware instead of the default one. django-tenant-schemas also requires a setting that points to our tenant model. In your settings.py, add the variable below:

TENANT_MODEL = 'tenant.Client'

Then replace the default tenant middleware in the MIDDLEWARE list with your newly created custom middleware. Change:

'tenant_schemas.middleware.TenantMiddleware',

to

'pollsapi.tenant.middleware.RequestIDTenantMiddleware',

We also add the newly created tenant app to both the SHARED_APPS tuple/list and the INSTALLED_APPS list:

SHARED_APPS = (
    'tenant_schemas',
    'pollsapi.tenant',
    ...
)

INSTALLED_APPS = [
    'tenant_schemas',
    .....
    'pollsapi.tenant'
]

Running migrations

We can now run migrations on our API to populate our database with the required tables. Run:

python manage.py makemigrations

The terminal response should resemble:

WARNINGS:
?: (tenant_schemas.W003) Your default storage engine is not tenant aware.
	HINT: Set settings.DEFAULT_FILE_STORAGE to 'tenant_schemas.storage.TenantFileSystemStorage'
Migrations for 'tenant':
  pollsapi/tenant/migrations/0001_initial.py
    - Create model Client

You can ignore the default storage warning; we will not be covering it, and it does not affect the functionality we want. You can now proceed to apply the migrations. django-tenant-schemas provides its own migrate_schemas command, which ensures that migrations run on the correct schemas as specified in the settings.py file. Thus, instead of running the usual migrate command, you should run:

NOTE: Never use migrate as it would sync all your apps to public!

python manage.py migrate_schemas

Since we have not created a tenant yet, the command above should create the public schema only. We shall create a tenant next.

Creating a tenant

We will create a custom Django management command to make creating a tenant less tedious. In the tenant app, create a Python package folder named management. Inside it, create another package named commands (each folder needs an __init__.py file). In commands, create a file called client.py and add the code below:

from django.core import exceptions
from django.core.management.base import BaseCommand
from django.utils.text import capfirst

from pollsapi.tenant.models import Client


class Command(BaseCommand):
    help = 'Create a client'

    def add_arguments(self, parser):
        # Expose every required field as an optional command-line flag.
        for field_name in Client.REQUIRED_FIELDS:
            parser.add_argument(
                '--%s' % field_name,
                help='Specifies the %s for the client.' % field_name,
            )

    def handle(self, *args, **options):
        user_data = {}
        for field_name in Client.REQUIRED_FIELDS:
            field = Client._meta.get_field(field_name)
            user_data[field_name] = options[field_name]
            # Keep prompting until the field holds a valid value.
            while user_data[field_name] is None:
                message = self._get_input_message(field)
                user_data[field_name] = self.get_input_data(field, message)
        # create() saves the tenant, which also creates its schema
        # because auto_create_schema is True.
        Client.objects.create(**user_data)
        if options['verbosity'] >= 1:
            self.stdout.write("Client created successfully.")

    def get_input_data(self, field, message, default=None):
        """
        Override this method if you want to customize data inputs or
        validation exceptions.
        """
        raw_value = input(message)
        if default and raw_value == '':
            raw_value = default
        try:
            val = field.clean(raw_value, None)
        except exceptions.ValidationError as e:
            self.stderr.write("Error: %s" % '; '.join(e.messages))
            val = None
        return val

    @staticmethod
    def _get_input_message(field, default=None):
        return '%s%s%s: ' % (
            capfirst(field.verbose_name),
            " (leave blank to use '%s')" % default if default else '',
            ' (%s.%s)' % (
                field.remote_field.model._meta.object_name,
                field.m2m_target_field_name() if field.many_to_many else field.remote_field.field_name,
            ) if field.remote_field else '',
        )

The above code lets us create a tenant in much the same way we would create a Django superuser. Let's create our first tenant. Run the command below and fill in the fields as prompted:

python manage.py client

It should look similar to this:

Tenant name: Pollsmaster
Paid until: 2020-05-30
Schema name: pollsmaster
On trial: False

If the command succeeds, a new tenant and its schema are created, and the migrations for that schema are run.
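
Because the schema name becomes a Postgres identifier, it is worth checking it before creating a tenant. The sketch below is an assumption about what a reasonable check looks like, not the library's own validator: it allows only letters, digits and underscores, enforces the 63-character limit Postgres applies to identifiers, and rejects the pg_ prefix that Postgres reserves for system schemas. Both function names are hypothetical.

```python
import re

# Assumed rule: letters, digits and underscores only, at most 63 characters.
_IDENTIFIER_RE = re.compile(r"[_a-zA-Z0-9]{1,63}")
# Postgres reserves schema names starting with "pg_" for system use.
_RESERVED_RE = re.compile(r"pg_", re.IGNORECASE)


def is_valid_schema_name(name):
    """Return True if the name is safe to use as a tenant schema name."""
    return bool(_IDENTIFIER_RE.fullmatch(name)) and not _RESERVED_RE.match(name)


def suggest_schema_name(tenant_name):
    """Derive a candidate schema name from a display name such as 'Polls Master'."""
    candidate = re.sub(r"[^a-z0-9_]+", "_", tenant_name.lower()).strip("_")[:63]
    if not is_valid_schema_name(candidate):
        raise ValueError("cannot derive a schema name from %r" % tenant_name)
    return candidate


print(is_valid_schema_name("pollsmaster"))
print(is_valid_schema_name("polls master"))
print(suggest_schema_name("Polls Master"))
```

A helper like suggest_schema_name could be used to pre-fill the Schema name prompt, though the command in this tutorial asks for it explicitly.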

Finalizing on the tenant app

You need to add a few things to the tenant app before you can test the functionality. In the tenant app, create a serializers.py file and add the code below:

from rest_framework import serializers


class ClientSerializer(serializers.Serializer):
    tenant_uuid = serializers.UUIDField()
    tenant_name = serializers.CharField()

In the views.py file add the following:

from rest_framework import status, viewsets
from rest_framework.exceptions import NotFound, ValidationError
from rest_framework.permissions import AllowAny
from rest_framework.response import Response

from pollsapi.tenant.models import Client
from pollsapi.tenant.serializers import ClientSerializer


class ClientViewSet(viewsets.ViewSet):
    permission_classes = (AllowAny,)
    serializer_class = ClientSerializer

    def create(self, request):
        # Handles POST requests: looks up a tenant by name and returns its UUID.
        client = request.data or {}
        tenant_name = client.get('tenant_name')
        if tenant_name is None:
            raise ValidationError('A tenant name is mandatory.')
        try:
            tenant = Client.objects.get(tenant_name=tenant_name)
        except Client.DoesNotExist:
            # Respond with 404 instead of an unhandled server error.
            raise NotFound('No tenant with that name exists.')
        serializer = self.serializer_class(tenant, many=False)
        return Response(serializer.data, status=status.HTTP_200_OK)

Finally, add a urls.py file to the tenant app with the code below:

from django.urls import include, path
from rest_framework.routers import DefaultRouter

from pollsapi.tenant.views import ClientViewSet

router = DefaultRouter(trailing_slash=False)
# `basename` is the keyword in Django REST Framework 3.9+; older releases used `base_name`.
router.register(r'client', ClientViewSet, basename='clients')

urlpatterns = [
    path('', include(router.urls)),
]

Testing everything out

Start up the server and head over to your favorite API testing tool. I’ll be using Insomnia.

Send a POST request to the endpoint localhost:8000/client with the tenant name in the request body, for example {"tenant_name": "Pollsmaster"}, to retrieve your tenant UUID. If successful, you should receive a response containing the tenant name and its UUID.

Copy the tenant_uuid; we'll use it to make requests to the polls app.

Head over to the create polls endpoint, localhost:8000/polls/, and add a header named X-Request-ID whose value is the tenant_uuid.

Sending the request above should create a new poll. For a comparative demonstration, create a second tenant, create several polls for each, and observe the data isolation in practice.
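
If you prefer a script to a GUI client, the two requests can be sketched with the Python standard library. This is an illustration under assumptions: the build_request and send helpers are hypothetical, the server is assumed to be running at localhost:8000, and the poll payload field is a placeholder because the fields your polls serializer expects are not shown in this article.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # development server address used in this article


def build_request(path, payload, tenant_uuid=None):
    """Build a JSON POST request, optionally scoped to a tenant via X-Request-ID."""
    headers = {"Content-Type": "application/json"}
    if tenant_uuid is not None:
        headers["X-Request-ID"] = str(tenant_uuid)
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=body, headers=headers, method="POST")


def send(request):
    """Send the request and decode the JSON response (requires the server to be running)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Step 1: look up the tenant UUID by name. No tenant header is sent,
# so the request is handled in the public schema.
lookup = build_request("/client", {"tenant_name": "Pollsmaster"})

# Step 2 (uncomment with the server running): create a poll in that tenant's schema.
# The "question" field is a placeholder for whatever your polls serializer expects.
# tenant = send(lookup)
# poll = build_request("/polls/", {"question": "Best editor?"}, tenant["tenant_uuid"])
# print(send(poll))
```

Repeating step 2 with a second tenant's UUID and then listing polls per tenant is a quick way to see the isolation at work.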

Learning Strategy

While researching this project, I started by trying to understand what exactly multi-tenancy was. Once I had a high-level understanding, the next step was deciding which approach to use to add multi-tenancy to the existing API. In that regard, this article from Microsoft on the different implementations of multi-tenancy proved invaluable. Settling on an approach meant weighing the pros and cons, including the maintainability of the database, how much of the existing codebase would need restructuring, and, most importantly, how the API would scale as the number of tenants grew.

In retrospect….

In-depth research before writing the first line of code was the biggest takeaway from the entire process. It was important to understand beforehand how extensively the codebase would have to be altered and whether a third-party package was necessary. It was also important to have an implementation plan, that is, a literal step-by-step plan for bringing the full implementation to fruition.

In conclusion

This particular project took longer than I had predicted. My conservative estimate was 10 hours; it took twice that.

I hope this short example excites the reader into delving deeper into multi-tenancy architecture. Exploring the inner workings of Postgres was exciting and eye-opening. The example above can be extended to include tenant-specific authentication and authorization.

You can also find my other blogs, on pytest, on Apache and on SendGrid.

Tools for further study

Microsoft multi-tenancy article

PostgreSQL schemas documentation

Django Tenant Schemas documentation

Citations

The featured image is courtesy of Wikimedia Commons and can be found here.

Get the code from GitHub on this link
