Presto Community Broadcast

3: Running two Presto distributions and Kafka headers as Presto columns

Show notes

Presto nation, we want to hear from you! If you have a question or pull request that you would like us to feature on the show, please join the Presto Slack and let us know in the #presto-community-broadcast channel. Otherwise, you can message Manfred Moser or Brian Olsen directly. Also, feel free to reach out to us on Twitter: Brian is @bitsondatadev and Manfred is @simpligility.

In this week’s concept, Manfred explains what an SPI (service provider interface) is and covers the connector architecture of Presto, Starburst, and custom connectors.

This week’s pull request, https://github.com/prestosql/presto/pull/4462, comes from user Sven Pfennig. Sven works for Syncier GmbH, and as part of his role there he gets to contribute to open source projects such as Presto. Thanks, Sven! We jump into a quick setup of a Kafka broker using the Kafka quickstart tutorial. I then use the kafkacat tool to show off the support for Kafka headers that Sven’s pull request adds, and discuss why this is beneficial.

Here’s the crazy SELECT statement I used to decode the binary values of the foo header into UTF-8 text, returned as the foo column:

SELECT 
   _message, 
   reduce(element_at(_headers,'foo'), '', (s, c) -> s || from_utf8(c), s -> s) AS foo 
FROM kafka.default.pcb 
WHERE contains(map_keys(_headers), 'foo');
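
If the reduce call is hard to read, the same decoding can probably be written with transform and array_join instead. This is only a sketch and not what was run on the show. It assumes that _headers has the type map(varchar, array(varbinary)), as described in the Kafka connector documentation, and it reuses the pcb topic and the foo header key from the statement above. The second statement is a Kafka-free check of the expression on a hand-built array.

```sql
-- Decode every binary value stored under the 'foo' header key to text,
-- then concatenate the decoded pieces into a single varchar column.
SELECT
   _message,
   array_join(
      transform(element_at(_headers, 'foo'), v -> from_utf8(v)),
      ''
   ) AS foo
FROM kafka.default.pcb
WHERE contains(map_keys(_headers), 'foo');

-- Same expression on literal data, no Kafka topic needed.
-- to_utf8 builds the varbinary values that a header would carry.
SELECT array_join(
   transform(ARRAY[to_utf8('ba'), to_utf8('r')], v -> from_utf8(v)),
   ''
) AS foo;
-- returns 'bar'
```

Both versions produce one string per message. The array exists because a Kafka message can carry several headers with the same key, and the connector groups their values together.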

An alternative tutorial that uses the TPC-H dataset can be found on the Presto site: https://prestosql.io/docs/current/connector/kafka-tutorial.html

This week’s question was accidentally cut off: I had mapped Shift + R to toggle streaming/recording, and this cut the broadcast when I typed the R in FROM.

Release Notes discussed: https://prestosql.io/docs/current/release/release-344.html

Manfred’s Training - SQL at any scale https://www.simpligility.com/2020/10/join-me-for-presto-first-steps/ https://learning.oreilly.com/live-training/courses/presto-first-steps/0636920462859/

Blogs

https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-creating-a-single-point-of-access-to-multiple-postgres-servers-using-starburst-presto

https://postgresconf.org/conferences/postgres-webinar-series/program/proposals/live-demo-unlock-data-in-postgres-servers-to-query-it-with-other-data-sources-like-hive-kafka-other-dbmss-and-more

https://blog.bigdataboutique.com/2020/09/presto-meets-elasticsearch-our-elasticsearch-connector-for-presto-video-mbywtm

Upcoming events

Latest training from David, Dain, and Martin:
https://prestosql.io/blog/2020/07/15/training-advanced-sql.html
https://prestosql.io/blog/2020/07/30/training-query-tuning.html
https://prestosql.io/blog/2020/08/13/training-security.html
https://prestosql.io/blog/2020/08/27/training-performance.html

Presto Summit Series - Real world usage:
https://prestosql.io/blog/2020/05/15/state-of-presto.html
https://prestosql.io/blog/2020/06/16/presto-summit-zuora.html
https://prestosql.io/blog/2020/07/06/presto-summit-arm-td.html
https://prestosql.io/blog/2020/07/22/presto-summit-pinterest.html

Recent Podcasts:
https://www.contributor.fyi/presto
https://www.dataengineeringpodcast.com/presto-distributed-sql-episode-149/

If you want to learn more about Presto yourself, you should check out the O’Reilly book Presto: The Definitive Guide. You can download the free PDF or buy the book online.

Music for the show is from the Megaman 6 Game Play album by Krzysztof Słowikowski.