Compare Bigtable to BigQuery - vaquarkhan/Apache-Kafka-poc-and-notes GitHub Wiki

Compare Bigtable to BigQuery

  • [Instructor] I mentioned earlier that I would compare the BigQuery and Bigtable services, because they're easy to confuse. So, let's do that now. BigQuery is a mature product. It's one of the core products on Google Cloud Platform. I would say that 100% of my customers that use Google Cloud Platform use it, because it is innovative, it is useful, it allows people to leverage their SQL skills, it has flexible pricing, and it can be run interactively or in batch if you have variable workloads. An example of that would be a genomics customer that wants to run some what-if analysis overnight. I also didn't show you, but I'll tell you, there are a number of third-party tools available for BigQuery.

There are connectors to visualization tools like Tableau and BIME. There are connectors to ETL and load tools that allow you to do loads via tools rather than scripts. The ones that I've used are Talend and a few of the others out there. So, there's a large ecosystem around this product. It's an exciting product, it's an evolving product, and it's one that you should definitely take a look at and use. Almost any company is going to need to use BigQuery, I think. Bigtable is more specialized.

Bigtable is raw NoSQL storage. At the time of this recording, it's a Beta product as well, so it may evolve and change. And its pricing is based more on storage than on queries. Now, that being said, you saw when we ran through the quick start that it has native support for HBase, and I expect it'll have support for other libraries and frameworks as we go along. The idea with Bigtable is that it's a cheap way to store logging information. So, it could be logs coming out of your network, out of machines on your network.

I find that's really interesting in IoT scenarios where you have tens of thousands, maybe even hundreds of thousands of devices, and you want something more sophisticated than just file storage in Google Cloud Storage. You want a table abstraction over the top of that. But you don't really need to pay for a relational database, because you're getting what is called behavioral data out of these devices. Some of it is meaningful, and some of the commands coming through the device are important, but a lot of it may just be noise. So, you don't need to have transactional consistency, for example. You don't need to have redundancy across the data; you just need a log file you can query against. So, they're different products, even though they have similar names, with very different pricing, very different accessibility, and very different uses. BigQuery is the one that I'm using in all of my client scenarios. Bigtable, I'm using mostly in IoT and logging scenarios.
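
To make the "log file you can query against" idea a bit more concrete, here is a minimal Python sketch of a table abstraction over device logs. It is an in-memory stand-in, not the Bigtable or HBase client API. The `LogTable` class, the `device_id#timestamp` row-key scheme, and the sample device names are all assumptions made up for illustration. It only shows the general shape: rows sorted by key, reads done as key-range scans, and no transactions.

```python
import bisect


class LogTable:
    """In-memory stand-in for a sorted, row-keyed log table (not the Bigtable API)."""

    def __init__(self):
        self._keys = []  # row keys, kept in sorted order
        self._rows = {}  # row key -> {column name: value}

    @staticmethod
    def _key(device_id, ts):
        # Hypothetical key scheme: device id, then a zero-padded timestamp,
        # so one device's events sort together in time order.
        return f"{device_id}#{ts:013d}"

    def write(self, device_id, ts, **columns):
        key = self._key(device_id, ts)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        # Last write wins; there is no transaction or cross-row consistency.
        self._rows[key] = dict(columns)

    def scan(self, device_id, start_ts=0, end_ts=10**13 - 1):
        """Yield (row key, row) pairs for one device with start_ts <= ts <= end_ts."""
        lo = bisect.bisect_left(self._keys, self._key(device_id, start_ts))
        hi = bisect.bisect_right(self._keys, self._key(device_id, end_ts))
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


# Usage: write a few events, then read back one device's recent history.
table = LogTable()
table.write("thermostat-7", 1000, temp="21.5")
table.write("thermostat-7", 2000, temp="21.7")
table.write("thermostat-8", 1500, temp="19.0")
recent = list(table.scan("thermostat-7", start_ts=1500))
```

In this sketch the only query pattern is "give me a key range", which is roughly why noisy, high-volume device data fits this kind of store, while ad hoc SQL analysis is the sort of work I described for BigQuery.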