Multi-Tenant Data Isolation & Row Level Security

Over the past 1 and a half years, I was involved in designing and developing a multi-tenant Treasury Management system. In this article, I will share our approaches to the data isolation aspect of our multi-tenant solution and the learnings from it.

Background & problem regarding Data isolation

Before going into the problem which I am going to focus on today, I must give some background into our architecture for storage and data in our system first. When it comes to data partitioning for SaaS systems, at the extreme far right end, we have the approach of using dedicated databases for each tenant (silo model) and on the other side of the spectrum is the shared database model (pool model).

_{Image source – https://learn.microsoft.com/en-us/azure/architecture/guide/multitenant/approaches/storage-data}

For obvious reasons such as reduced management overhead and lower cost, it was decided that our solution would be in the shared database model and with that comes the drawback of lower levels of inherent data isolation. In most cases with the shared database model, the data isolation depends on developers implementing the correct WHERE clauses in every SQL statement but this is of course error prone. We wanted an approach to enforce the data isolation from a separate layer and in comes the concept of Row Level Security (RLS).

interface TransactionRepository : JpaRepository<Transaction, Long> {

    // trivial approach
    fun findAllByTenantId(): List<Transaction>
    
    // what we want to achieve
    override fun findAll(): List<Transaction>
}

Trivial Approach vs Expected Approach for Spring Repository Method Calls

Solution – ROW Level Security (RLS)

In general terms, Row Level Security (in RDBMS) refers to mechanisms that allow controlling access to rows in a database table based on some context (example tenant_id). To my knowledge, this feature has been around on PostgreSQL DB since 2020 and this is available in SQL Server, in Amazon Aurora for PostgreSQL and RDS for PostgreSQL at the time of writing.

ALTER TABLE gtms_payment.TRANSACTION ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_policy ON gtms_payment.TRANSACTION
    USING (business_group_code = current_setting('app.current_tenant')::VARCHAR);

A sample RLS policy on PostgreSQL

going ahead with RLS

Our standard database solution throughout the organization was PostgreSQL (for use cases like ours) and this helped in our decision to go ahead with RLS as the solution but if you are trying to solve a similar problem with RLS do take note that not all of the relational databases support Row Level Security yet so you might have to be aware of being locked into a subset of databases that support RLS. However, switching to another relational DB that supports RLS should not be too difficult as there is little dependency on the DB type on the code level itself when using RLS.

Implementation

Going into the implementation details, once RLS is enabled on the DB, we should set the parameters (in our case app.current_tenant) for the DB to run the policy on for each query. In our case, we do that on the DataSource implementation as follows. In our flow, we maintain the Tenant information on Spring Security Context (again which I hope to cover in a separate blog) and the getTenantId() which is abstracted out here is retrieving the Tenant Id from the Spring Security Context.

class TenantAwareHikariDataSource : HikariDataSource() {

    override fun getConnection(): Connection? {
        val connection = super.getConnection()
        return createConnection(connection)
    }

    override fun getConnection(username: String, password: String): Connection? {
        val connection = super.getConnection(username, password)
        return createConnection(connection)
    }

    private fun createConnection(connection: Connection): Connection? {
        connection.createStatement().use { sql ->
            sql.execute("SET app.current_tenant = '${getTenantId()}'")
        }
        return connection
    }
}

Tenant-aware Hikari Datasource

Note that if the parameters required for the RLS policy are not set, we will run into errors as follows.

[2024-08-25 22:53:50] [42704] ERROR: unrecognized configuration parameter "app.current_tenant"

Supporting use cases That Does Not Require a tenant id

In all multi-tenant systems, there will always be some use cases such as tenant onboarding where you will have to run DB queries without a tenant id. So can we support such use cases? The answer is Yes!. In PostgreSQL functionality, the schema owner bypasses RLS policies. Hence in our solution we have, we maintain two sets of DB configs (Tenant DB Config and Non Tenant DB Config) and two sets of Spring Repositories tied to each DB config with Tenant Config as primary.

Thus in the few use cases that need to run queries without a Tenant Id, we use the Non-Tenant Configs. We do have to use PostgreSQL schema owner user for the database change management tool (in our case Liquibase) as well.

PROS and CONS of RLS Approach to Sum things up

PROS

RLS provides data isolation on the DB layer hence it provides a stronger isolation than that of the common WHERE clause approach.
Most of the data isolation logic (Tenant-aware data source, Tenant and Non-Tenant db configs etc) can be on a separate library and shared across different micro-services. Hence developers can worry less of tenant data isolation while developing application-level business logic

CONS

Since not all relational databases support RLS yet, there will be a DB lock-in as explained in previous sections

Bonus

Though we have talked about how to use TenantId in TenantAwareHikariDataSource, we have not touched on how to set and reset TenantId. One of the ways to do this, specially in the case of REST API calls, is to use an Interceptor.

class TenantInfoInterceptor : HandlerInterceptor {
    override fun preHandle(request: HttpServletRequest, response: HttpServletResponse,
                           handler: Any): Boolean {
        // set TenantId
        return super.preHandle(request, response, handler)
    }

    override fun afterCompletion(request: HttpServletRequest, response: HttpServletResponse, handler: Any, ex: Exception?) {
        // reset TenantId
        super.afterCompletion(request, response, handler, ex)
    }
}

Furthermore, in cases of Async thread usage, we need a mechanism to pass the TenantId to async threads. A TaskDecorator is used for this purpose.

@Configuration
class AsyncConfiguration : AsyncConfigurer {
    override fun getAsyncExecutor(): TaskExecutor {
        val executor = ThreadPoolTaskExecutor()
        executor.maxPoolSize = 50
        executor.queueCapacity = 250
        executor.setThreadNamePrefix("ContextAwareTask-")
        executor.setWaitForTasksToCompleteOnShutdown(true)
        executor.setTaskDecorator(AsyncTaskDecorator())
        executor.initialize()
        return executor
    }
}

class AsyncTaskDecorator : TaskDecorator {
    override fun decorate(task: Runnable): Runnable {
        return Runnable {
            try {
                // set TenantId
                task.run()
            } finally {
                // reset TenantId
            }
        }
    }
}

Furthermore, in cases where events/messages are passed through from one service to another, we may have to rely on metadata from event (in our case AWS SQS message metadata) to set the TenantId if needed.

Hope you all have fun with RLS! 🙂

References

~ Rajind Ruparathna

Featured image credits: TheDigitalWay from Pixabay

Multi-Tenant Data Isolation & Row Level Security

Background & problem regarding Data isolation

Solution – ROW Level Security (RLS)

Other Solutions – SpRing Post Filter, Hibernate FILTERS with SPRING ASPECTS

going ahead with RLS

Implementation

Supporting use cases That Does Not Require a tenant id

PROS and CONS of RLS Approach to Sum things up

PROS

CONS

Bonus

References

Published by Rajind Ruparathna

One thought on “Multi-Tenant Data Isolation & Row Level Security”

Leave a comment Cancel reply

Background & problem regarding Data isolation

Solution – ROW Level Security (RLS)

Other Solutions – SpRing Post Filter, Hibernate FILTERS with SPRING ASPECTS

going ahead with RLS

Implementation

Supporting use cases That Does Not Require a tenant id

PROS and CONS of RLS Approach to Sum things up

PROS

CONS

Bonus

References

Share this:

Related

Published by Rajind Ruparathna

One thought on “Multi-Tenant Data Isolation & Row Level Security”

Leave a comment Cancel reply