Cloud Bigtable is Google’s NoSQL big data database service. You have probably heard of MySQL or PostgreSQL, and of SQL. Well, here now is NoSQL. What does NoSQL mean?
Think first of a relational database as offering you tables in which every row has the same set of columns, and the database engine enforces that rule and other rules you specify for each table.
That’s called the database schema. An enforced schema is a big help for some applications and a huge pain for others. Some applications call for a much more flexible approach: for example, a NoSQL schema.
In other words, for these applications not all the rows might need to have the same columns. And in fact, the database might be designed to take advantage of that by sparsely populating the rows. That’s part of what makes a NoSQL database what it is. Which brings us to Bigtable.
Your databases in Bigtable are sparsely populated tables that can scale to billions of rows and thousands of columns, allowing you to store petabytes of data.
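To make the idea of sparsely populated rows concrete, here is a minimal sketch in plain Python. It is not the Bigtable API; the `table` structure, the `write_cell` helper, and the row and column names are all hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a sparsely populated table:
# each row key maps only to the columns that row actually has.
table = defaultdict(dict)


def write_cell(row_key, column, value):
    """Store a value for one column of one row; other columns stay absent."""
    table[row_key][column] = value


# Two rows with different sets of columns (names are illustrative only).
write_cell("user#1001", "profile:name", "Ada")
write_cell("user#1001", "profile:email", "ada@example.com")
write_cell("user#1002", "profile:name", "Grace")
write_cell("user#1002", "stats:last_login", "2024-01-15")

# Rows do not need to share the same columns; a missing cell is simply not stored.
for row_key, cells in sorted(table.items()):
    print(row_key, sorted(cells))
```

Notice that nothing forces the second row to carry an email column, which is the flexibility a relational schema would not allow.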
GCP fully manages the service, so you don’t have to configure and tune it. It’s ideal for data that has a single lookup key.
Some application developers think of Bigtable as a persistent hash table. Cloud Bigtable is ideal for storing large amounts of data with very low latency. It supports high throughput, both read and write, so it’s a great choice for both operational and analytical applications, including Internet of Things (IoT), user analytics, and financial data analysis.
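The "single lookup key" idea can be sketched with a toy key-value store in plain Python. This is an illustration and not Bigtable's implementation: the `SortedKeyStore` class, its methods, and the `sensor-7#...` key format are hypothetical. The sketch also keeps keys in sorted order, since Bigtable stores rows sorted by row key, so reading all rows that share a key prefix is shown as well.

```python
import bisect


class SortedKeyStore:
    """Toy store: every row is reached through one lookup key, keys kept sorted."""

    def __init__(self):
        self._keys = []  # row keys in sorted order
        self._rows = {}  # row key -> value

    def put(self, key, value):
        """Insert or overwrite the value stored under a row key."""
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, key):
        """Hash-table style lookup by the single row key; None if absent."""
        return self._rows.get(key)

    def scan_prefix(self, prefix):
        """Return (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._rows[key]))
        return result


# Illustrative IoT readings keyed by "<device id>#<timestamp>".
store = SortedKeyStore()
store.put("sensor-7#2024-01-15T11:00", 22.1)
store.put("sensor-7#2024-01-15T10:00", 21.5)
store.put("sensor-9#2024-01-15T10:00", 19.8)

print(store.get("sensor-9#2024-01-15T10:00"))
print(store.scan_prefix("sensor-7#"))
```

Putting the device id first in the key is a design choice made for this sketch: it keeps all readings from one device next to each other, so they can be read back together.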
Cloud Bigtable is offered through the same open-source API as HBase, which is the native database for the Apache Hadoop project.
Having the same API enables portability of applications between HBase and Bigtable. Given that you could manage your own Apache HBase installation, you might ask yourself,
Why should I choose Cloud Bigtable?
Here are a few reasons why you might.
First, scalability. If you manage your own HBase installation, scaling past a certain rate of queries per second is going to be tough, but with Bigtable you can just increase your machine count, which doesn’t even require downtime.
Also, Cloud Bigtable handles administration tasks like upgrades and restarts transparently.
All data in Cloud Bigtable is encrypted both in flight and at rest. You can even use IAM permissions to control who has access to Bigtable data.
One last reference point. Bigtable is actually the same database that powers many of Google’s core services, including Search, Analytics, Maps, and Gmail.
As Cloud Bigtable is part of the GCP ecosystem, it can interact with other GCP services and third-party clients. From an application API perspective, data can be read from and written to Cloud Bigtable through a data service layer like Managed VMs, the HBase REST server, or a Java server using the HBase client.
Typically, this will be to serve data to applications, dashboards and data services. Data can also be streamed in through a variety of popular stream processing frameworks, like Cloud Dataflow Streaming, Spark Streaming and Storm.
If streaming is not an option, data can also be read from and written to Cloud Bigtable through batch processes like Hadoop MapReduce, Dataflow, or Spark. Often, summarized or newly calculated data is written back to Cloud Bigtable or to a downstream database.
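The batch pattern just described (read rows, compute a summary, write the result back) can be sketched in plain Python. This is only an illustration: the in-memory `table`, the `summarize` and `write_back` helpers, and the `summary#` key prefix are hypothetical, and a real job would run in a framework such as Dataflow or Spark against the actual service.

```python
from statistics import mean

# Hypothetical stand-in for a table: row key -> {column: value}.
table = {
    "sensor-1#2024-01-15T10:00": {"metrics:temp": 20.0},
    "sensor-1#2024-01-15T11:00": {"metrics:temp": 22.0},
    "sensor-2#2024-01-15T10:00": {"metrics:temp": 18.5},
}


def summarize(rows):
    """Group raw readings by the sensor id in the row key and average them."""
    by_sensor = {}
    for row_key, cells in rows.items():
        if row_key.startswith("summary#"):
            continue  # skip summary rows written by an earlier run
        sensor_id = row_key.split("#", 1)[0]
        by_sensor.setdefault(sensor_id, []).append(cells["metrics:temp"])
    return {sensor: mean(values) for sensor, values in by_sensor.items()}


def write_back(rows, summary):
    """Write one summary row per sensor back into the same table."""
    for sensor, avg in summary.items():
        rows[f"summary#{sensor}"] = {"metrics:avg_temp": avg}


summary = summarize(table)
write_back(table, summary)
print(table["summary#sensor-1"])
```

Writing summaries under their own key prefix keeps them apart from the raw readings, so the job can be rerun without averaging its own output.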