Hive 3 support in Presto

The Hive community is centered around a few different Hive distributions, one of them being Hortonworks Data Platform (HDP). Even after the Cloudera-Hortonworks merger, there is strong interest in HDP 3, which features Hive 3. Presto is ready for the game.

In this post, we summarize which Hive 3 features Presto already supports and the work that went into Presto to get there. We also outline the next steps.

Introduction

There are several Hive versions in active use by the Hive community: 0.x, 1.x, 2.x and 3.x. The Hive 3 major release brings a number of interesting features, including:

  • support for Hadoop Erasure Coding (EC), allowing much better HDFS storage capacity utilization without reducing data availability,
  • update to ORC ACID transactional tables - they no longer need to be bucketed,
  • transactional tables for all file formats (“insert-only” except for ORC),
  • materialized views,
  • new bucketing function, offering a better data distribution and less data skew,
  • new timestamp semantics and timestamp-related changes in file formats,
  • and a lot more (we skip features and changes that are not interesting from Presto’s perspective).
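
To make the Erasure Coding point above concrete, here is a minimal back-of-the-envelope sketch comparing classic 3x replication with a Reed-Solomon RS(6,3) layout (6 data cells plus 3 parity cells), which is one of the policies HDFS ships. Which policy a given cluster uses is an assumption here, and the class and field names are purely illustrative; they are not part of any Hadoop, Hive or Presto API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedundancyScheme:
    """Illustrative model of a storage redundancy scheme (hypothetical helper)."""
    name: str
    data_units: int   # units holding user data
    extra_units: int  # additional replicas or parity units

    @property
    def raw_bytes_per_user_byte(self) -> float:
        # Total stored units divided by the units that carry user data.
        return (self.data_units + self.extra_units) / self.data_units

    @property
    def tolerated_losses(self) -> int:
        # Both replication and Reed-Solomon coding survive the loss of
        # as many units as there are extra (replica or parity) units.
        return self.extra_units


# Assumed schemes for the comparison.
replication_3x = RedundancyScheme("3x replication", data_units=1, extra_units=2)
rs_6_3 = RedundancyScheme("Reed-Solomon RS(6,3)", data_units=6, extra_units=3)

for scheme in (replication_3x, rs_6_3):
    print(
        f"{scheme.name}: {scheme.raw_bytes_per_user_byte:.1f}x raw storage, "
        f"survives the loss of {scheme.tolerated_losses} units"
    )
```

Under these assumptions, the erasure-coded layout needs half the raw capacity of 3x replication while tolerating one more lost unit, which is the trade-off the first bullet refers to.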

It’s no surprise that many people want to try out all these features and run Hive 3, either as the Apache project’s official release or as part of HDP 3.

Hive 3 in Presto

The Presto community expressed interest in using Presto with Hive 3, both in the project’s issues and on Slack.

You spoke, we listened. Actually – we, the community, spoke and listened.

Through collaboration among Starburst, Qubole and the wider Presto community, Presto is gradually improving its compatibility with Hive 3:

Upcoming improvements already being worked on include:

Try it out

The amazing Presto community is working hard on getting Hive 3 support fully integrated in the Presto project, and a lot has already been accomplished. Chances are that all you need is already included in the latest release. If you need one of the upcoming improvements, watch the pull requests linked above and the roadmap issue, join Slack, and stay tuned for upcoming release announcements. In the meantime, you can try out the features today by running the 323-e release of Starburst Presto.