Native Apps At The Client & Cloud

Srinivasan Sundara Rajan




Relational to NoSQL - Migration

A primer on database design for Google Bigtable

NoSQL Databases & the Bigtable Revolution
In computing, NoSQL is a term for database management systems that differ from classic relational database management systems in some way. These data stores may not require fixed table schemas, usually avoid join operations, and typically scale horizontally. Academic papers typically refer to these databases as structured storage, a term that would include classic relational databases as a subset.

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.

Google's Bigtable is one of the successful implementations of a NoSQL database, so it is prudent to weigh the advantages of NoSQL databases rather than dismiss them. In this context, looking at common enterprise applications, which are typically modeled in relational databases, from the point of view of NoSQL, and Google Bigtable in particular, is useful when implementing new scalable solutions for the enterprise in the cloud.

Bigtable Data Model vs. Relational Data Model
Bigtable's database design and data model differ significantly from those of traditional relational databases in several respects. The table below provides a quick comparison of the two.

| Bigtable NoSQL Data Model | Relational Data Model |
| --- | --- |
| Uses a multi-dimensional sorted map as its data structure; each value in the map is identified by a key combination of (row, column, timestamp) | Only two dimensions: rows and columns |
| Each row/column combination can store multiple versions | Each row/column combination can store only one version at any point in time |
| A table can have an unbounded number of columns | Tables typically have a fixed number of columns |
| Arbitrary "columns" on a row-by-row basis | Columns are fixed and apply to all rows |
| Column keys are grouped into sets called column families, which form the basic unit of access control | No concept of column families |
| No multi-row transactions; only single-row transactions are supported | Multi-row transactions are supported |
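To make the first rows of the comparison concrete, the sorted map can be imitated with a small in-memory class. This is only a sketch of the idea, not the Bigtable API: the class name `ToySortedMap`, its methods, and the sample keys are all hypothetical, and timestamps are plain integers for readability.

```python
from collections import defaultdict


class ToySortedMap:
    """Toy model of a (row, column, timestamp) -> value map.

    Hypothetical illustration only; this is not the Bigtable client API.
    """

    def __init__(self):
        # (row_key, column_key) -> list of (timestamp, value), oldest first
        self._cells = defaultdict(list)

    def put(self, row, column, timestamp, value):
        # Every write adds a version instead of overwriting the cell.
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0])

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self._cells.get((row, column), [])
        if timestamp is not None:
            versions = [v for v in versions if v[0] <= timestamp]
        return versions[-1][1] if versions else None

    def columns(self, row):
        """Columns present on one row; rows need not share the same columns."""
        return sorted(c for (r, c) in self._cells if r == row)


table = ToySortedMap()
table.put("row-1", "family_a:status", 1, "draft")
table.put("row-1", "family_a:status", 2, "published")
table.put("row-2", "family_b:note", 1, "only this row has a note column")

print(table.get("row-1", "family_a:status"))      # latest version
print(table.get("row-1", "family_a:status", 1))   # value as of timestamp 1
print(table.columns("row-2"))
```

Both versions of the `status` cell remain readable, and `row-2` carries a column that `row-1` does not have, which is exactly what a fixed relational schema would not allow without a schema change.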

 

Case Study: Migrating a Relational DB Model to NoSQL/Bigtable
The following enterprise scenario shows how a relational database design can be re-expressed in a NoSQL/Bigtable design.

In this scenario, an enterprise stores information about its employees, and a typical relational model would use the following tables. This is a sample representation meant to explain the design principles of Bigtable; it does not exactly represent an employee in a real enterprise, which may have many more attributes.

  • Employee Base Table (basic attributes of the employee)
  • Employee Educational Qualifications (child table with room to store 1:N qualification details)
  • Employee Address (child table with room to store multiple addresses for work, home, etc.)
  • Employee History (history of changes for the employee in the organization over a period)

The diagram below sketches this data model in a traditional relational database design.
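As a rough textual counterpart to the four tables listed above, the relational design could look like the following SQLite schema, driven from Python. The table names, column names, and sample rows are assumptions made for illustration (the attribute names are borrowed from the lists in this article); they are not taken from the original diagram.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    dob         TEXT,
    photo       BLOB,
    start_date  TEXT
);
CREATE TABLE employee_qualification (
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    degree      TEXT NOT NULL,
    institution TEXT,
    marks       REAL,
    passed_year INTEGER
);
CREATE TABLE employee_address (
    employee_id  INTEGER NOT NULL REFERENCES employee(employee_id),
    address_type TEXT NOT NULL,  -- e.g. 'work' or 'home'
    door TEXT, street TEXT, city TEXT, state TEXT, country TEXT
);
CREATE TABLE employee_history (
    employee_id    INTEGER NOT NULL REFERENCES employee(employee_id),
    job_title      TEXT NOT NULL,
    effective_date TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data.
conn.execute("INSERT INTO employee (employee_id, name) VALUES (1001, 'A. Example')")
conn.executemany(
    "INSERT INTO employee_history VALUES (?, ?, ?)",
    [(1001, "Engineer", "2008-01-01"), (1001, "Senior Engineer", "2010-06-01")],
)

# Reading an employee together with the job history requires a join.
rows = conn.execute(
    """SELECT e.name, h.job_title
       FROM employee e JOIN employee_history h USING (employee_id)
       ORDER BY h.effective_date"""
).fetchall()
print(rows)
```

Each 1:N relationship needs its own child table, and assembling the full picture of one employee means joining across them.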

The same ER model can be designed in a Bigtable/NoSQL model with the following salient features.

  • Row key: Employee ID
  • Column Family 1: Basic, with columns (Name, DOB, Photo, Start Date)
  • Column Family 2: Address, with columns (Door, Street, City, State, Country)
  • Column Family 3: Education, with a variable set of columns
      • (High School Degree, High School Institution, High School Marks, High School Passed Year)
      • Variable columns (Graduate Degree, Graduate Institution, Graduate Marks, Graduate Passed Year)
      • Variable columns (Masters Degree, Masters Institution, Masters Marks, Masters Passed Year)
  • Column Family 4: History, with a Job Title column that is multi-versioned
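The design above can be sketched as a nested in-memory structure. This is an illustration of the layout only, not the Bigtable client API: the helper names `put` and `latest`, the lowercase column names, the integer timestamps, and all sample values are hypothetical.

```python
# Illustrative in-memory layout only; this is not the Bigtable client API.
# Shape: row key -> column family -> column -> list of (timestamp, value).
employees = {}


def put(row_key, family, column, value, timestamp):
    """Add one version of a cell, keeping versions ordered oldest to newest."""
    versions = (
        employees.setdefault(row_key, {})
        .setdefault(family, {})
        .setdefault(column, [])
    )
    versions.append((timestamp, value))
    versions.sort(key=lambda version: version[0])


def latest(row_key, family, column):
    """Return the newest value stored in a cell."""
    return employees[row_key][family][column][-1][1]


# Employee 1001 (all values are made-up placeholders).
put("1001", "basic", "name", "A. Example", 1)
put("1001", "address", "city", "Example City", 1)
put("1001", "education", "high_school_degree", "Example Diploma", 1)
put("1001", "education", "graduate_degree", "Example B.Sc.", 1)
put("1001", "history", "job_title", "Engineer", 1)
put("1001", "history", "job_title", "Senior Engineer", 2)  # new version, old one kept

# Employee 1002 also has a masters degree; 1001 simply lacks those columns.
put("1002", "basic", "name", "B. Example", 1)
put("1002", "education", "masters_degree", "Example M.Sc.", 1)

print(latest("1001", "history", "job_title"))
print(employees["1001"]["history"]["job_title"])
print(sorted(employees["1002"]["education"]))
```

The whole employee lives under one row key, the Education family holds a different set of columns per row, and the job history is stored as versions of a single cell rather than as rows in a child table.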

The diagram below gives a pictorial representation of the data model under the Bigtable/NoSQL model. Even in this small example, we see a lot of flexibility in data design and storage compared to a relational database.
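The migration step itself, folding one employee's relational rows into a single wide row, might be sketched as follows. The function name, the dictionary field names (including the `level` field used to prefix education columns), and the placeholder timestamp `0` are assumptions for illustration, and the address family is left out for brevity.

```python
def to_bigtable_row(employee, qualifications, history):
    """Fold one employee's relational rows into a single wide row.

    Returns (row_key, families), where families maps
    family -> column -> list of (timestamp, value) versions.
    Hypothetical sketch; field names are assumptions.
    """
    families = {"basic": {}, "education": {}, "history": {}}

    # Base table row -> columns in the "basic" family (missing values are skipped).
    for column in ("name", "dob", "start_date"):
        if employee.get(column) is not None:
            families["basic"][column] = [(0, employee[column])]

    # Each child qualification row becomes a group of prefixed columns.
    for q in qualifications:
        for attr in ("degree", "institution", "marks", "passed_year"):
            families["education"][f"{q['level']}_{attr}"] = [(0, q[attr])]

    # History rows collapse into versions of one job_title cell,
    # using the effective date as the version timestamp.
    ordered = sorted(history, key=lambda h: h["effective_date"])
    families["history"]["job_title"] = [
        (h["effective_date"], h["job_title"]) for h in ordered
    ]
    return str(employee["employee_id"]), families


row_key, families = to_bigtable_row(
    {"employee_id": 1001, "name": "A. Example", "start_date": "2008-01-01"},
    [{"level": "graduate", "degree": "Example B.Sc.", "institution": "Example University",
      "marks": 80, "passed_year": 2007}],
    [{"job_title": "Senior Engineer", "effective_date": "2010-06-01"},
     {"job_title": "Engineer", "effective_date": "2008-01-01"}],
)
print(row_key)
print(families["history"]["job_title"])
print(sorted(families["education"]))
```

An employee with no masters degree simply produces no masters columns, so no NULL-filled placeholders are needed.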

Summary
This simple Employee entity may not be an ideal candidate for a NoSQL database like Bigtable; however, it shows how a relational database design needs to be viewed in a NoSQL world. This design approach is more applicable to unstructured content.
