Here are 394 public repositories matching this topic...
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
- Updated Feb 20, 2026
- Java
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
- Updated Feb 20, 2026
- Java
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
- Updated Feb 20, 2026
- Scala
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
- Updated Dec 23, 2025
- Rust
A native Rust library for Delta Lake, with bindings into Python
- Updated Feb 19, 2026
- Rust
Real-time analytics on Postgres tables
- Updated Dec 3, 2025
- Rust
Apache Kafka® compatible broker with S3, PostgreSQL, SQLite, Apache Iceberg and Delta Lake
- Updated Feb 20, 2026
- Rust
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
- Updated Jan 28, 2025
- Scala
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
- Updated Feb 12, 2026
- Java
An open protocol for secure data sharing
- Updated Feb 20, 2026
- Scala
Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
- Updated Feb 1, 2026
- Python
Analytical database for data-driven Web applications 🪶
- Updated Feb 25, 2025
- Rust
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows, and utilities for Data Products.
- Updated Oct 7, 2025
- Python
Amazon SageMaker Local Mode Examples
- Updated Apr 29, 2025
- Python
Quick start: pip install jsoniq ⛈️ RumbleDB 2.0.0 "Lemon Ironwood" 🌳 for Apache Spark | Run queries on your large-scale, messy datasets (JSON, text, CSV, Parquet, Delta...) | Data Lakehouse with Updates, Scripting, Declarative Machine Learning and more
- Updated Feb 17, 2026
- Java
Sample project to demonstrate data engineering best practices
- Updated Feb 24, 2024
- Python
- Updated Sep 25, 2025
- Python
Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi
- Updated Dec 15, 2023
- Dockerfile