News

Google Switching to Standard SQL for BigQuery Analytics Service

Google is abandoning its homegrown SQL variant as the recommended default query language for its BigQuery service in favor of a new standard-compliant dialect in the works for the managed data warehouse designed for Big Data analytics.

Just moved to beta, BigQuery's revamped SQL dialect will replace the existing dialect. The new dialect is compliant with the SQL 2011 standard, sporting extensions to support the querying of nested and repeated data.

BigQuery, which is part of the growing serverless computing trend, is described as a "fully managed, petabyte scale, low-cost analytics data warehouse" with a pay-as-you-go pricing model. The new standard-based SQL dialect may open it up to a new universe of developers.

"If you're familiar with the SQL standard or have used another standard-compliant SQL engine, you'll feel right at home with the beta of BigQuery's revamped SQL dialect," the company said in a blog post today.

Querying BigQuery
[Click on image for larger view.] Querying BigQuery (source: Google)

Google highlighted the following features of the new dialect:

  • More advanced query planning and optimization: BigQuery now provides more robust decorrelation, which allows you to write complex subqueries in any clause of your SQL statement (SELECT, FROM, WHERE and so on).
  • A richer type system with fully composable types: In addition to the existing data types BigQuery users are used to, we've added dates, times, arrays and structs, as well as additional support for timestamps.
  • Extended JOIN support: BigQuery now supports Theta JOIN, which offers the ability to use inequalities in your join key comparisons, as well as arbitrary expressions as JOIN conditions.

The newness of the standard-based dialect is reflected in its Web site, which earlier today said it's in the alpha stage, while today's news release says it's in beta. "This is an alpha release of standard SQL, which may be changed in backward-incompatible ways and is not recommended for production use. It is not subject to any SLA or deprecation policy," the Web site warned (it has since been updated to say "beta").

Today's blog post repeated the warning that the new dialect needs maturing before being put into real-world use.

"While we think the updated dialect is a wonderful addition, there's no requirement that users switch, and for production use cases, we recommend users remain on the legacy SQL dialect," the company said. "After we have a few more miles on the new dialect, we plan to launch it to general availability and recommend it as the default language for all projects."

Until then -- no time-frame was provided -- Google announced other features added to BigQuery that developers can play around with, including enhanced identity and access management (IAM), also in beta.

"Now that BigQuery supports Standard SQL, you'll have more and more teams in your company requesting access to BigQuery projects," Google said. "Earlier this year we announced Cloud IAM for Cloud Platform. We're now making IAM available for BigQuery as well, in beta. This feature is currently being rolled out, so if you don't see it today you can expect it enabled on your projects this month -- BigQuery roles will be made available in Google Cloud's 'IAM & Admin' control panel.

Yet another new feature is time-based table partitioning, which Google said "makes it easy and cost-effective for you to manage your data and write queries that span multiple days, months or years. You can now create tables with time-based partitions -- you load the data, and BigQuery will automatically put it in the right partition."

Together, the new features will make BigQuery more compatible with typical, traditional Big Data workflows, Google said.

About the Author

David Ramel is an editor and writer for Converge360.