Storage Area Isolation per Microservice

Context

Microservices are being adopted. The overall application data has been decentralized into smaller pieces that align with the microservice boundaries. Microservices can directly access the internals of other microservices through a shared database.

Problem

Independently updating and deploying microservices is not possible.
- Microservices can depend on the internal data models of other services.

Solution

Isolate the storage area for each microservice. Each microservice should have its internal private storage area that no other microservice is able to access.

Using a shared storage area for microservices might promise consistency in data integration through ACID transactions, but it comes at the price of introducing coupling between microservices. If the independence of microservices is desired, sharing the storage area between microservices should be avoided at all costs.

Instead, each microservice should have its private and isolated storage area. The degree of isolation can vary:

- An isolated area (e.g. schema) per microservice in a shared database setting (clustered or not).
- An extra database per microservice (individually clustered or not).

From the microservice's point of view, accessing a database works the same regardless of which degree of isolation is chosen. However, a database per microservice offers the opportunity to independently choose the storage solution that fits the context best, e.g., a time series database. Sharing the database infrastructure might save operational resources, but it might become a single point of failure and limits the choice of technologies. In a shared database, the temptation to directly access another microservice's data instead of going through its API is also higher than when this requires connecting to a different database with potentially unknown connection details.
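As a minimal sketch of why the isolation degree can stay invisible to service code, the two degrees can be modeled as different connection settings handed to the same service. All names here (`StorageConfig`, the host names, the `_svc` user suffix) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageConfig:
    """Connection details handed to a service at deployment time."""
    host: str
    database: str
    schema: str
    user: str


def schema_per_service(service: str) -> StorageConfig:
    # One shared server and database; isolation via a private schema.
    return StorageConfig(
        host="shared-db.internal",  # hypothetical host name
        database="app",
        schema=service,
        user=f"{service}_svc",
    )


def database_per_service(service: str) -> StorageConfig:
    # A dedicated server and database for each service.
    return StorageConfig(
        host=f"{service}-db.internal",  # hypothetical host name
        database=service,
        schema="public",
        user=f"{service}_svc",
    )


def describe(cfg: StorageConfig) -> str:
    # Service code only sees a StorageConfig, never the isolation degree.
    return f"{cfg.user}@{cfg.host}/{cfg.database}#{cfg.schema}"
```

In this sketch, switching a service from a private schema to a private database only changes which factory produces its configuration; the code that consumes the configuration stays the same.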

Enforcing storage area isolation through access privileges can also strengthen the security of the system, since compromising one microservice does not give access to the data of other microservices.

In order to isolate storage areas per microservice, the data model needs to be partitioned, and explicit responsibility for each part of the data should be mapped to exactly one microservice. Transactions spanning multiple microservices should be avoided.
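The partitioning described above can be sketched with two services that each own a private store. This is an illustrative sketch only: the service names, table layouts, and methods are assumptions, an in-memory SQLite database stands in for each private storage area, and a direct method call stands in for a remote API call.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Hypothetical owner of all customer data."""

    def __init__(self) -> None:
        # Private storage area: no other service holds this connection.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def register(self, name: str) -> int:
        cur = self._db.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        self._db.commit()
        return cur.lastrowid

    def get_customer(self, customer_id: int) -> Optional[dict]:
        row = self._db.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class OrderService:
    """Hypothetical owner of all order data."""

    def __init__(self, customers: CustomerService) -> None:
        self._customers = customers  # stands in for a remote API client
        self._db = sqlite3.connect(":memory:")  # separate private store
        self._db.execute(
            "CREATE TABLE orders ("
            "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, item TEXT NOT NULL)"
        )

    def place_order(self, customer_id: int, item: str) -> int:
        # Customer data is reached through the owning service's API,
        # never by querying its tables.
        if self._customers.get_customer(customer_id) is None:
            raise ValueError(f"unknown customer {customer_id}")
        cur = self._db.execute(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)", (customer_id, item)
        )
        self._db.commit()
        return cur.lastrowid
```

Note that the order store keeps only the customer identifier. There is no foreign key or transaction across the two stores, so referential consistency has to be handled at the application level.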

Maturity

Proposed, requires evaluation.

Sources of Evidence

L6:

decentralization in conceptual models and storage backends
- every service has its own independent storage subsystem that is isolated from other services
Context: DIMMER platform for IoT: decentralized data (conceptual + storage backend)
- different storage backends to store their meta-data (decentralized data storage)
- Historical Datastore Service: uses timeseries DB
- Service Catalog Services: uses document-based DB to store JSON documents

L8:

principles that influenced the microservice style: data isolation

L14:

Context: otto.de
Shared nothing principle includes not sharing databases
- => not have negative impact on each other

L17:

Database migration and data splitting
one hand: all non-consultants just connected to legacy db / existing db cluster
- partially reduce benefits of microservices
- not always able to split the existing data
other hand: consultants recommended splitting data
- each microservice has own private database

L19:

Each service has own type of data storage

L20:

Context: Survey about pains and gains
Primary pain at development time: managing distributed storage and application testing
- distributed and heterogeneous storage hampering work of developers
- pain to ensure data consistency => eventual consistency to mitigate pain where it makes sense, not easy to ensure
- complexity in implementing transactions over distributed data stores
- complexity of building queries combining data stored in distributed and heterogeneous data stores
Pattern database per service: often discussed (pattern as gain)
- each microservice should be equipped with own database
- storage isolation as gain
- freely choose db technology (SQL vs NoSQL), and structure data in it
- enforces independence among microservices

L24:

Database per Service pattern in 3 of 4 open source applications

L25:

Context: Data Storage Patterns
The Database-per-Service Pattern
- Each microservice has private database
- Easiest approach for implementing microservices
- Often used for migration from existing monolith
- (+) Scalability: DB can be scaled in a DB cluster at a later point in time
- (+) Independent deployment: schema changes don't affect other microservices
- (+) Security mechanism: Microservices cannot access and corrupt all data
- (-) effort to split data
- (-) data consistency
The Database Cluster Pattern
- (+) Scalability: allows moving DB to dedicated hardware
- consistency => microservices have a sub-set of DB tables that can be accessed only from a single microservice
- or each microservice has private DB schema
- Microservice view: identical to Shared Database Server pattern
  - DB is accessed in the same way: DB seen as single DB
  - but internally difference
  - (+) implementation easiness
- Can be used for data replication between microservices
- RECOMMENDATION: implementations with huge data traffic
- (-) increased complexity: cluster architecture
- (-) risk of failure: another distributed mechanism
The Shared Database Server Pattern
- similar to Database Cluster Pattern, but all access a single shared DB
- all reported usages access data concurrently without data isolation approach
- (+) simplicity of migration from monolithic application: schema reuse without changes; also no code changes in data access layer
- (+) data consistency
- (-) lack of data isolation
- (-) limited scalability

L30:

Shared DB discouraged
- might lead to tight coupling & loss of autonomy

L32:

Context: report of migration to microservices
choose most suitable DB for each service
- only small portion required relational DB,
- others NoSQL DB
  - more amenable to schema evolution
- schema evolution also needs addressing for RDBMS
  - schema of microservice considerably smaller
  - under full control of team
  - existing schema migration tools can handle it
  - => easier to achieve zero-downtime releases than before!

L34:

Comparison to SOA
- Microservice: DB per unit
- SOA: Shared DB

L40:

Partitioning of monolith allows segmentation of application data
- avoid having monolithic data storage
- required by decentralized data governance that microservices impose
- known as sharding pattern
Dependencies among microservices and data shards are complex
- centralized data storage can help to maintain consistency
Data must be kept consistent with respect to eventual replicas / centralized data store
- process vulnerable to exposure, DoS, and alteration attacks

L41:

microservice a building block with dedicated persistence tool
SOA: doesn't require self-containment with data and persistence tools
Context: FX market
- Infrastructure with Docker, including databases
  - ports open to outside clients
- allows to develop APIs and eliminating direct database queries

L46:

DB schema a microservice uses should be part of that microservice
CRM example
- at DB level: will share same database including coupling at that level
- nature of current systems to have domain model represented in a common database schema
  - => provision business functionality quickly and efficiently
  - "Almost all business system packages are architected this way"
In these environments: independent microservices down to DB level is wishful thinking
- since very costly
- without providing immediate business value
- introducing complexity overwhelming any advantage of agility or scalability

L54:

DB can be broken into smaller DBs, each containing what is beneficial to certain type of customers
- decreases load on one DB, distributed among smaller ones
- Shared DB among different services => can be designed as a service by its own.
Challenging to split DB into smaller independent units
- especially when relying on unstructured data repositories and possibility of error occurrences
- data recovery mechanisms required to avoid data corruption

L55:

each microservice might have private db
- => hard to implement business transactions to maintain consistency over multiple DBs

L58:

data needs to be replicated across respective DBs from services, different DB engines => challenge!
Context: Synapse as cross-db replication system
- each service can develop their data structures independently in their own DB
- DBs may differ in schema, indexes, layouts, engines,...
- but Synapse lets them integrate subsets of data with others

L59:

centralized data store => propagates problems when you change the data structure
- need to test all code that uses that data
- if data volume increases: difficult to put the changes in one place
- central shared DB has become outdated
when DB is big, data structure changes often, data flows at high volume
- want data encapsulated within service
- API model should be independent from data schema

L61:

Example Netflix: to tolerate failures, recover fast, and gracefully degrade, each microservice manages its own layer of persistence

LN41:

Microservices may store their data in different databases
=> increases attack surface
protect data in shared storage environment via encryption
stringent access control to data

LN42:

Figure A: database per service

LN43:

possible to select db engine per microservice => database per service pattern
- more freedom to select tools
- use relational vs. NoSQL db => what provider suits best for use case of application
- more dbs => harder to manage and organization might not have much knowledge about the database
designing and scaling database easier if there are fewer tables and microservice has full control of data and schemas
some microservices can even use no database
- e.g. persist to disk instead
one database per service approach
- service owns data
- other services go thru API => loose coupling
  - change db schema and API can stay the same
one database for all services
- problematic as database schema is tightly coupled
- one db => other services have access to data that should be available by API
- loss of modularity => directly query the db instead of making a service call
- not recommended
=> split monolithic database during migration into multiple databases, only accessible by service handling its business context
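The point that the database schema can change while the API stays the same can be sketched as follows. `CatalogService`, its table, and the migration are hypothetical, and an in-memory SQLite database stands in for the service's private store.

```python
import sqlite3


class CatalogService:
    """Hypothetical service; its public API is get_price(sku) -> float."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")  # private to this service
        self._db.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, price REAL)")
        self._uses_cents = False

    def add_product(self, sku: str, price: float) -> None:
        if self._uses_cents:
            self._db.execute(
                "INSERT INTO products (sku, price_cents) VALUES (?, ?)",
                (sku, round(price * 100)),
            )
        else:
            self._db.execute(
                "INSERT INTO products (sku, price) VALUES (?, ?)", (sku, price)
            )
        self._db.commit()

    def migrate_to_cents(self) -> None:
        # Internal schema change: prices move to an integer cents column.
        self._db.executescript(
            """
            ALTER TABLE products ADD COLUMN price_cents INTEGER;
            UPDATE products SET price_cents = CAST(ROUND(price * 100) AS INTEGER);
            """
        )
        self._uses_cents = True

    def get_price(self, sku: str) -> float:
        # The API contract is unchanged by the migration.
        column = "price_cents" if self._uses_cents else "price"
        row = self._db.execute(
            f"SELECT {column} FROM products WHERE sku = ?", (sku,)
        ).fetchone()
        if row is None:
            raise KeyError(sku)
        return row[0] / 100 if self._uses_cents else row[0]
```

Callers of `get_price` see the same values before and after `migrate_to_cents`. Had they queried the `products` table directly, the migration would have required changing them as well.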

LN44:

Microservices take the "isolation thru loose coupling" concept one step further:
=> shared nothing principle and strict data owning
- each service can be isolated
- only able to access information it needs
- only able to access services it needs
- better security: restrict access to only the data the service needs
- each microservice maintains individual db, in their example: individual SQLite database

LM43:

Context: SLR findings about microservices in DevOps
S24 discusses
- pattern use of a database per service
  - implemented by separate sets of tables per function, schema per service, and database per service
- pattern shared database for multiple services
  - implemented by single database for group of microservices
- usually microservices grouped according to business context and use the shared database
S12 and S33 reported design patterns for data management, example: database per service
Table 10 lists patterns shared database, database per service
Performance issues due to selection of inappropriate databases
- e.g. shared db over database per service, or spreading service requests across multiple databases, poorly established connection pool to the database, etc.
S24, S33: Database per service
- considered effective to achieve loosely coupled MSA
S24: Shared database
- suitable where multiple microservices need to access persistent data owned by other services

Interview A:

Stateless idea => DB per microservice
- end up with many databases for one single product => overhead

Interview B:

Monolithic DB
- problem of dependence moved to DB layer
- easy transactions
- make no sense in microservices
- will not lead to success! fact!
strongly distributed DB
- one DB per service
- which access patterns do I use? => data replication?
weird other constructs as deploying all data with service instead of thinking about data replication

Interview C:

Solution for consistency: central database (not really microservices)
- seen most often in practice for important parts
- services lose independence, can't manage themselves anymore
  - too many people want different things of DB (scalability)
  - single point of failure even if scaled (is still the bottleneck)
- might be an option if microservices used for main reason of maintainability, and not independence
- global schema can be partially partitioned
- this kind of degenerated microservices seems to work well sometimes
will end up with central DB if transactions are something holy in culture
- loss of independence
- worse maintainability

Interview D:

vertical integration: request down to data or wherever I get my answer and back
- distance within microservice
- should not be neglected, is often forgotten in integration debate
- is done regularly
- still a few remote hops, at latest down to data store
Need to compensate for failures within a service
Need to move fast with microservices counteracts shared database
- tight coupling because I can't change the schema without affecting other services
- basic idea: be independent => why do shared DB then?
Potentially use view on data
- already a step better
  - need to think what a sensible view for the receiver is
  - potentially adapt view with schema migration to answer same questions as before
  - same applies for GraphQL: never couple to schema => has to offer logical interface API schema
- but usually raw data accessible => table schema becomes the API => break consumer APIs
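The idea of offering a view and adapting it during schema migration can be sketched with SQLite. The table and view names are hypothetical, and the name split assumes every name contains exactly one space; the point is only that the consumer query stays untouched.

```python
import sqlite3


def read_directory(db: sqlite3.Connection) -> list:
    # Consumer query: written against the view, never against the tables.
    return db.execute(
        "SELECT id, name FROM customer_directory ORDER BY id"
    ).fetchall()


db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO customers VALUES (1, 'Ada Lovelace');
    CREATE VIEW customer_directory AS SELECT id, name FROM customers;
    """
)
before = read_directory(db)

# Schema migration by the owning team: the table is restructured and the
# view is redefined so that it still answers the same question.
db.executescript(
    """
    DROP VIEW customer_directory;
    CREATE TABLE customers_v2 (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL
    );
    INSERT INTO customers_v2
        SELECT id,
               substr(name, 1, instr(name, ' ') - 1),
               substr(name, instr(name, ' ') + 1)
        FROM customers;
    DROP TABLE customers;
    CREATE VIEW customer_directory AS
        SELECT id, first_name || ' ' || last_name AS name FROM customers_v2;
    """
)
after = read_directory(db)
```

Here `before` and `after` hold the same rows although the underlying table changed. A consumer that had selected from `customers` directly would have broken at the `DROP TABLE` step, which is the "table schema becomes the API" problem noted above.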
