Build Your Own Open Data Lakehouse with Presto and Iceberg: A Hands-on Guide | Ranjan Kumar

Have you ever wanted to spin up your own open data lakehouse locally using open-source tools? I recently put together a hands-on walkthrough to show you how to do just that using Presto and Iceberg.

## The Goal: Keep it Simple and Reproducible
I wanted the guide to be simple, reproducible, and easy to test, so that you can follow along and set up your own open data lakehouse.

## The Tech Stack
The guide uses Presto as the query engine, Iceberg as the table format, MinIO for S3-compatible object storage, and OLake to bring data in from the source database. All four are open source, and together they give you a flexible and scalable data lakehouse.

## The Process
The guide takes you through a step-by-step process of setting up your own open data lakehouse. From running containers to configuring the environment and querying Iceberg tables with Presto, I’ve got you covered.
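To give a feel for the last step, here is a minimal sketch of querying an Iceberg table through Presto's HTTP statement endpoint using only the Python standard library. The coordinator address `localhost:8080`, the catalog name `iceberg`, the schema `default`, the user `demo`, and the table `orders` are all assumptions for illustration; use whatever your own setup defines.

```python
import json
import urllib.request

# Assumption: where the Presto coordinator is reachable from your machine.
PRESTO_URL = "http://localhost:8080"


def build_statement_request(sql, catalog="iceberg", schema="default", user="demo"):
    """Build the POST request that submits one SQL statement to Presto."""
    return urllib.request.Request(
        url=f"{PRESTO_URL}/v1/statement",
        data=sql.encode("utf-8"),
        headers={
            "X-Presto-User": user,
            "X-Presto-Catalog": catalog,  # hypothetical catalog name
            "X-Presto-Schema": schema,    # hypothetical schema name
        },
        method="POST",
    )


def run_query(sql, **kwargs):
    """Submit a statement and follow nextUri links until no pages remain."""
    rows = []
    with urllib.request.urlopen(build_statement_request(sql, **kwargs)) as resp:
        page = json.load(resp)
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if next_uri is None:
            return rows
        with urllib.request.urlopen(next_uri) as resp:
            page = json.load(resp)


if __name__ == "__main__":
    # Hypothetical table name; replace it with a table you have loaded.
    print(run_query("SELECT count(*) FROM orders"))
```

If you prefer a client library or the Presto CLI, the same query works there too; the point is only that once the containers are up, the Iceberg tables are plain SQL away.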

## What I Learned
One thing that stood out during the setup was how fast and cheap it was. I used a small dataset for the demo, but you can easily push the limits and create your own benchmarks to test how the system performs under real conditions.
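If you want to build your own benchmarks, a small timing harness is enough to get started. The sketch below is not from the guide; it simply times any callable that runs a query and summarizes the results. The `run` argument is a placeholder for whatever function you use to execute a query against Presto.

```python
import statistics
import time


def time_query(run, repeats=5):
    """Call run() `repeats` times and return wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to a few headline numbers."""
    return {
        "runs": len(timings),
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    # Stand-in workload; swap in a function that executes your query.
    print(summarize(time_query(lambda: sum(range(100_000)))))
```

Reporting the median alongside the minimum and maximum helps separate a cold first run from steady-state performance.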

## Flexibility is Key
The guide uses MySQL as the starting point, but you can easily plug in Postgres or other sources. This flexibility is what makes open-source tools so powerful.

## Take the Next Step
If you’ve been trying to build a lakehouse stack yourself, this guide should give you a good start. Check out the blog and let me know whether you’d like me to go deeper, either with a detailed series that tests different query engines or by sharing my benchmarks in a later thread.

If you have Presto/Iceberg benchmarks of your own, please share them as well.

**Read the full guide and watch the video walkthrough here:** [link](https://olake.io/blog/building-open-data-lakehouse-with-olake-presto)
