Awesome -- I plan on attending!
-
Here is the signup link: https://lu.ma/tmwuz4lg
-
Thanks again @gruuya for making this happen
-
Recap of the event

We successfully wrapped up the first-ever Apache DataFusion Meetup in Europe on September 27, 2024, marking a significant milestone for the community. The initial idea for the event came from @alamb on [May 2, 2024](#10342), and shortly after, @gruuya took full responsibility for bringing it to life. From coordinating speakers to handling logistics, @gruuya ensured everything ran smoothly and [drafted the event details by June 12, 2024](#11431). His dedication was truly remarkable: he personally looked after the participants, from picking speakers up at the airport to driving them to their hotels and making sure they were well taken care of throughout the event. His hands-on approach and commitment played a key role in making the meetup run seamlessly and creating a memorable experience for everyone. A special thanks to @gruuya for his tireless efforts.

This was a major moment for the DataFusion community in Europe, bringing together leading figures from the project to share their knowledge and advancements with enthusiasts. The energy in the room and the exchange of ideas truly demonstrated the vibrancy and growth of the data ecosystem in Belgrade, which has a solid future ahead. Throughout the day, a series of compelling talks covered a wide range of topics, from the core principles of DataFusion to cutting-edge innovations and real-world applications.

Venue & Participation
Talks

The talks kicked off with @alamb, who provided an in-depth introduction to the origins and goals of Apache DataFusion. He started by describing DataFusion as LLVM for data systems, enabling innovation in data-intensive systems. @alamb highlighted DataFusion’s architecture, built with industrial best practices, and its ability to compete with tightly integrated systems. Finally, he touched on the Rust-based implementation and ongoing optimizations that ensure DataFusion remains highly performant, especially in multi-core environments.

Next, @mildbyte, Principal Engineer at EDB, delivered a highly technical talk on caching optimization using DataFusion at EDB. @mildbyte explained how EDB uses DataFusion to optimize query caching, which leads to significant performance improvements. These optimizations are crucial for managing large-scale data systems and show how EDB leverages DataFusion’s capabilities effectively.

@ozankabak, co-founder and CEO of Synnada, spoke about the challenges of building data-intensive applications, referring to the Data Chasm: a complex landscape with many moving parts that makes it difficult to manage data efficiently. He explained how DataFusion helps break down these barriers, allowing for a more streamlined approach to data processing. @ozankabak highlighted Synnada's contributions to the DataFusion project, including their work on unified data processing, which builds on top of DataFusion to simplify data workflows.

@gruuya, senior staff engineer at EDB and the hero of the day who gathered all of us for this amazing event, gave a talk focused on database replication using the FDAP (Flight, DataFusion, Arrow, and Parquet) stack. @gruuya explained how this powerful combination of open-source tools enables efficient and scalable data replication, particularly in analytic environments. By leveraging Apache Arrow for in-memory data processing and Flight for fast network data transfer, the FDAP stack ensures low-latency communication between distributed databases. DataFusion handles real-time query execution across replicated data, while Parquet optimizes storage and performance, making this stack a highly efficient solution for large-scale database replication.

@karlovnv from Tarantool followed, sharing insights on how his team is pushing the limits of big data. @karlovnv showcased their work on real, massive datasets, such as handling a 3,000-column dataset, processing 70TB of data in RAM, and doing all of this really, really fast (quicker than 10ms for a fraud detection use case!). His talk demonstrated how DataFusion plays a key role in enabling these high-performance workloads.

@findepi from SDF wrapped up the talks with a detailed exploration of types and functions in the context of Apache Arrow vs DataFusion. He explained how types are handled in Arrow and DataFusion. @findepi's insights shed light on the potential improvements that could further enhance DataFusion’s handling of data types.

After the talks, we headed to Docker (accompanied by container jokes), where the conversation continued in a more relaxed setting. It was a great way to unwind and keep sharing ideas. The success of the meetup made it clear: we should do this again to exchange more war stories and insights.

Closing remarks

Even though it was the first time, the Apache DataFusion Belgrade Meetup turned out to be a great success!
What could be improved?
-
Hi all,
I'm pleased to announce a (first?) European DataFusion meetup, in Belgrade, Serbia. Some details:
Date: Friday September 27th, 2024
Time: 17:00-20:30 (CET)
Location: Ušće Tower 2, Bulevar Mihajla Pupina 4 (11th floor @ Microsoft Development Center Serbia)
Format: 15-min talks, with free-form discussion before and after
A bit more info here: https://docs.google.com/document/d/1wlWKFRQocLGL7Rhu3BiI8geIWsowd-ZXVozIczZqrqQ/edit