EuroPython 2024

AI & ML (6 videos)
Bioinformatics (1 video)
Compute (13 videos)
Databases (3 videos)
Developer Experience (90 videos)
Diversity, Equity, and Inclusion (1 video)
Health (1 video)
Keynote (1 video)
Multimedia (1 video)
Natural Language Processing (1 video)
Observability (4 videos)
Performance Engineering (2 videos)
Security (7 videos)
Sustainability (1 video)

AI & ML

Unlocking Mixture of Experts : From 1 Know-it-all to group of Jedi Masters — Pranjal Biyani

The presentation explores the Mixture of Experts (MoE) architecture, a dynamic modeling technique that is driving advancements in generative AI. It discusses the motivation for MoE, its mathematical intuition, the core architecture and components, as well as the challenges and solutions in training and deploying MoE models.
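
The core routing idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the speaker's code: the names `moe_forward`, `gate_weights`, and `top_k` are hypothetical, the gate is a single linear layer, and each expert is any callable that maps a vector to a vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route x to the top_k highest-scoring experts and mix their outputs.

    gate_weights holds one weight row per expert (hypothetical layout).
    """
    # Gate: one linear score per expert.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Sparse routing: only the top_k experts are evaluated at all.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Renormalise the gate scores over the chosen experts only.
    weights = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out, chosen
```

Because only `top_k` experts run per input, compute per input stays roughly constant as more experts are added, which is the usual motivation for sparse routing.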

Lies, damned lies and large language models — Jodie Burchell

The talk explores the phenomenon of hallucinations in large language models (LLMs), discussing their causes, types, and ways to measure and mitigate them. It emphasizes the complexities involved in evaluating LLM performance, particularly in the context of factual accuracy, and highlights the importance of critical thinking when it comes to interpreting model capabilities and limitations.

Scikit-LLM: Beginner Friendly NLP Using LLMs — Iryna Kondrashchenko, Oleh Kostromin

This talk introduces Scikit-LLM, a Python library that simplifies the use of large language models (LLMs) for natural language processing (NLP) tasks such as text classification, text summarization, and named entity recognition. It covers zero-shot, few-shot, and dynamic few-shot prompting, and demonstrates how Scikit-LLM integrates LLMs into the familiar scikit-learn API, making NLP tasks more accessible to beginners.

Mastering Generative AI: Tools and Techniques with VS Code, GitHub, Azure — Leo Yao

The presentation covers the benefits of building generative AI applications, demonstrates the setup and integration of large language models like GPT-4 using Azure and Python, and explores advanced techniques like prompt engineering, response evaluation, and integration with other tools for optimizing the performance of the AI applications.

Earth Observation through Large Vision Models — Mayank Khanduja

The talk explores the use of large vision models for Earth observation, discussing the applications of satellite data in various domains like urban planning, agriculture, and disaster management. The speaker also highlights the challenges faced in working with satellite data and demonstrates how to fine-tune and use vision-language models, object detection models, and super-resolution models to address these challenges.

NLP Application in Cases of Violence Against Women — Deborah Foroni

This talk presents a novel NLP application to analyze video testimonies of violence against women in Brazil. The speaker discusses the challenges of working with unstructured data, using speech recognition and topic modeling techniques to extract insights from the personal accounts, and highlights the potential for using technical skills to address social issues.

Bioinformatics

Deciphering the mysteries of human genomes — Anna Přistoupilová

This talk explores the complexities of the human genome and the challenges in deciphering rare genetic diseases. It highlights the advancements in DNA sequencing technologies, the bioinformatics workflows used to identify disease-causing variants, and the potential for personalized treatments, while emphasizing the importance of collaborative research efforts to improve clinical diagnoses and patient care.

Compute

SPy (Static Python) lang: fast as C, Pythonic as Python — Antonio Cuni

The talk presents SPy (Static Python), a compiler for a variant of Python that aims to be as fast as C while maintaining the Pythonic feel of the language. The key ideas behind SPy are formalizing the distinction between import-time and runtime logic, enforcing type annotations, and splitting the semantics of operations into a 'blue' (static) and 'red' (dynamic) phase to enable aggressive optimization.

One analysis a day keeps anomalies away! — Madalina Ciortan

This talk provides a comprehensive overview of anomaly detection, covering key concepts, challenges, and practical solutions. The speaker highlights the versatility of unsupervised anomaly detection in multivariate time series data, showcasing various techniques, libraries, and evaluation methods to help attendees navigate this complex field.

Automate Your Kitchen with Python & Applied AI — Sena Sahin

The talk presents an AI-powered kitchen automation project that helps users decide what to eat based on the contents of their fridge. The project utilizes object detection, recipe recommendation, and user feedback to provide personalized meal suggestions and a shopping list, addressing common pain points in meal planning.

Aggregating data in Django using database views — Mikuláš Poul

This talk explores aggregating data in Django using database views, a powerful feature that can simplify complex data processing and reporting tasks. The speaker introduces the Django PG Views Redux library, which provides seamless integration of database views and materialized views into Django applications, offering performance and maintenance benefits over traditional Django ORM-based aggregation.

GeoPandas 1.0 and beyond — Martin Fleischmann

The talk provides an overview of the GeoPandas 1.0 release and its future developments. The speaker discusses the project's history, key features, performance improvements, and upcoming plans, highlighting the project's growth and the contributions of the wider community.

How we sped up NumPy’s string operations for NumPy 2.0 — Lysandros Nikolaou

This talk discusses the performance improvements made to NumPy's string operations in the recently released NumPy 2.0. The speaker describes the process of optimizing string operations by implementing custom C functions that operate directly on the underlying data buffers, resulting in significant speed-ups compared to the previous approach of relying on Python string functions.

I reverse engineered a work of art, and this is what I learned — Yair Galler

The speaker, a long-time technologist, shares his journey of reverse-engineering a work of art made with strings, exploring algorithms, performance optimization, and color theory to create a system that can generate instructions for recreating the art piece. The talk highlights the speaker's problem-solving approach, lessons learned, and the final result, which was successfully implemented in the physical world.

Fine-tuning large models on local hardware — Benjamin Bossan

The talk discusses the challenges of fine-tuning large language models on local hardware due to memory constraints. The speaker introduces the Hugging Face PEFT library, which provides parameter-efficient fine-tuning methods such as LoRA, and shows how these combine with quantization to reduce the memory footprint and enable training of these models on local machines.

State-of-the-art image generation for the masses with Diffusers — Sayak Paul

This talk introduces the state-of-the-art image and video generation capabilities of the Diffusers library, an open-source Python library maintained by Hugging Face. The speaker highlights the flexibility and ease of use of Diffusers, showcasing how it enables text-to-image, image-to-image, and even text-to-video generation with just a few lines of code.

How we used vectorization for 1000x Python speedups (no C or Spark needed!)

The presenters discuss how they used vectorization techniques to achieve 1000x Python speedups without resorting to C or Spark. They present several practical examples and building blocks that demonstrate how to leverage libraries like NumPy and SciPy to optimize code performance, especially for financial and data-intensive applications.

Is it me or Python memory management? — Yuliia Barabash, Laysa Uchoa

The presentation discusses the importance of understanding Python's memory management, including concepts like reference counting, garbage collection, and generations, to optimize application performance. The speakers provide insights into how Python's internal mechanisms interact with hardware components like CPU and memory units to manage memory and ensure efficient execution of Python programs.
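
Both mechanisms named in the summary can be observed from the standard library. The sketch below assumes CPython (reference counts are an implementation detail), and the `Node` class and helper names are made up for illustration.

```python
import gc
import sys
import weakref


class Node:
    """Tiny object that can point at another Node to form a cycle."""

    def __init__(self):
        self.other = None


def refcount_delta():
    """Return how much one extra name raises an object's reference count."""
    obj = Node()
    before = sys.getrefcount(obj)
    alias = obj  # a second strong reference to the same object
    after = sys.getrefcount(obj)
    return after - before


def cycle_is_collected():
    """Show that a reference cycle survives `del` but not gc.collect()."""
    gc.disable()  # keep automatic collection from interfering with the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # a <-> b cycle: counts never drop to zero
        probe = weakref.ref(a)   # lets us check liveness without a strong ref
        del a, b
        alive_before = probe() is not None
        gc.collect()             # the cyclic collector finds the unreachable pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        gc.enable()
```

Reference counting frees most objects immediately; the generational collector exists mainly for cycles like the one above.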

Forecasting the future with EarthPT — Mike Smith

This talk presents two large observation models, EarthPT and AstroPT, that leverage diverse Earth observation and astronomy data to make accurate predictions about future satellite observations and astronomical phenomena. The models demonstrate the potential of large autoregressive models to learn meaningful emergent properties from time series data, opening up new applications in fields like climate change monitoring and astronomy.

Python in Parallel: Sub-Interpreters vs. NoGIL vs. Multiprocessing — Samet Yaslan

The talk discusses the current limitations of parallel processing in Python, including the Global Interpreter Lock (GIL), and the upcoming changes in Python to address these limitations. The speaker explores two main approaches: the 'free threading' or 'no-GIL' proposal, and the parallel sub-interpreters feature, providing insights on their performance and trade-offs.

Databases

Python on the Rocks: Crafting a Smooth Blend with RocksDB — Ria Bhatia

The talk explores the use of RocksDB, a high-performance, low-latency database, and its integration with Python. It delves into the internals of RocksDB's architecture, including its log-structured merge-tree (LSM-tree) data structure and optimization techniques like indexing and Bloom filters, to provide insights into the database's performance and capabilities.
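
To make the Bloom filter idea concrete, here is a minimal pure-Python sketch. It is not RocksDB's implementation; the class name, bit-array size, and SHA-256-based hashing are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting the key with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: bytes):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: bytes):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))
```

In an LSM-tree store, a filter like this lets a read skip on-disk files that definitely do not contain the requested key.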

chDB: The Blazing Fast SQL Engine for Data Science — Auxten Wang

chDB is a blazing-fast SQL engine for data science, built on top of the powerful ClickHouse database. Featuring serverless architecture, seamless integration with Python, and support for a wide range of data formats, chDB empowers data scientists to quickly and efficiently analyze large datasets without the overhead of managing a traditional database infrastructure.

Exploring Apache Iceberg: A Modern Data Lake Stack — Gowthami Bhogireddy

The talk explores the use of Apache Iceberg, a modern data lake stack, at Bloomberg to ingest and manage large volumes of financial data. It covers the key features of Iceberg, such as schema evolution, time travel, and efficient query execution, and demonstrates how the speaker's team has integrated Iceberg into their data engineering architecture.

Developer Experience

Demystifying AsyncIO: Building Your Own Event Loop in Python — Arthur Pastel

This talk provides an in-depth exploration of AsyncIO, showcasing the speaker's journey in understanding and building an event loop from scratch. The presentation covers the fundamentals of AsyncIO, including coroutines, futures, and file descriptors, ultimately demonstrating the implementation of a functional FastAPI server.
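
The scheduling core of an event loop is small enough to sketch. The version below is an assumption-laden toy, not the speaker's loop: it round-robins coroutines and leaves out futures, timers, and file-descriptor polling. `Pause`, `worker`, and `run_all` are hypothetical names.

```python
from collections import deque


class Pause:
    """Awaitable that hands control back to the loop exactly once."""

    def __await__(self):
        yield


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        await Pause()  # cooperative yield point


def run_all(coros):
    """Run coroutines round-robin until every one has finished."""
    ready = deque(coros)
    while ready:
        coro = ready.popleft()
        try:
            coro.send(None)   # resume until the next await that yields
        except StopIteration:
            continue          # this coroutine is done, drop it
        ready.append(coro)    # otherwise reschedule at the back of the queue
```

Running two workers shows the interleaving: each `await Pause()` gives the other coroutine a turn.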

FastAPI Internals — Marcelo Trylesinski

The presentation explores the internals of the FastAPI framework, covering the data flow from the client to the server, the application, and back to the client. It delves into the ASGI specification, middleware, routing, dependencies, and data validation, providing insights into the framework's architecture and design choices.
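
For orientation, this is roughly what the ASGI layer underneath looks like: an application is an async callable taking `scope`, `receive`, and `send`. The sketch uses no framework; `call_app` is a hypothetical stand-in for a server, written only so the app can be exercised without one.

```python
import asyncio


async def app(scope, receive, send):
    """Bare ASGI HTTP application: answer every request with plain text."""
    assert scope["type"] == "http"
    body = f"Hello from {scope['path']}".encode()
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": body})


def call_app(path):
    """Drive the app once with a fake server and return the messages it sent."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent
```

Middleware in this model is simply another callable with the same signature that wraps `app`.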

Demystify Python Types for PEP 729 — Kir Chou

The talk explores the foundations of Python's type system, delving into type theory, gradual typing, and the evolution of Python's type checkers. It highlights the need for a unified approach to type governance, as proposed in PEP 729, to address the challenges of maintaining consistency and enabling efficient adoption of new type specifications across the Python ecosystem.

Invent with PyScript — Nicholas Tollervey, Joshua Lowe

The talk showcases the evolution of PyScript, a platform for running Python in the browser, and introduces Invent, a new app creation framework built on top of PyScript. The presenters highlight how Invent aims to make coding more accessible and engaging for beginners, drawing inspiration from tools like HyperCard and Visual Basic.

Embracing Python, AI, and Heuristics: Optimal Paths for Impactful Software — Carol Willing

This talk explores how embracing Python, AI, and heuristics can lead to optimal paths for impactful software development. The speaker shares her experiences and insights on navigating the dynamic world of technology, leveraging Python's strengths, demystifying AI, and utilizing heuristics to make informed decisions and tackle complex problems.

How to deliver 3x faster with effective API design — Michal Cyprian

The presentation discusses how the speaker's team at Kiwi.com, a global travel tech company, addressed the challenges of delivering changes faster in a client-server architecture with multiple clients (iOS, Android, and web). The key solutions explored are the adoption of the Backend for Frontend (BFF) pattern and the implementation of a server-driven UI approach to simplify client-side development and enable faster production deployments.

Writing Python like it's Rust - more robust code with type hints — Jakub Beránek

The talk discusses how the author writes Python code in a more robust and Rust-like manner, focusing on the use of type hints, data classes, and sound API design to improve code understanding, maintainability, and reliability. The key ideas presented are leveraging type hints extensively, embracing data classes, and making it harder to misuse code through techniques like creating separate types for different valid states.
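
One way to read "separate types for different valid states" is sketched below. The connection example and the names `Disconnected` and `Connected` are hypothetical, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disconnected:
    """A client that has not connected yet; it has no send() at all."""

    host: str

    def connect(self) -> "Connected":
        # Placeholder session id; a real client would negotiate one.
        return Connected(host=self.host, session_id=1)


@dataclass(frozen=True)
class Connected:
    """Only this state exposes send(), so misuse fails type checking."""

    host: str
    session_id: int

    def send(self, payload: str) -> str:
        return f"[{self.host}#{self.session_id}] {payload}"
```

Calling `send` on a `Disconnected` value is flagged by a type checker before the program runs, instead of surfacing as a runtime "not connected" error.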

Building Scalable Multimodal Search Applications with Python — Zain Hasan

This talk discusses building scalable multimodal search applications using Python. The speaker presents how to leverage vector databases and multimodal models to enable cross-modal retrieval and reasoning, with applications in e-commerce and retrieval-augmented generation.

EuroPython 2024 — Lightning talks Wednesday

This video features a series of lightning talks at the EuroPython 2024 conference. The talks cover a wide range of topics, including creating a Python-based sailing game, making Cron expressions more readable, using AI to search for celebrity lookalikes, and the art of puzzle solving.

Deconstructing the text embedding models — Kacper Łukawski

The presentation explores the internals of text embedding models, focusing on the tokenization process and the limitations of these models in handling real-world data. The speaker discusses strategies for fine-tuning and improving the performance of these models, such as word injection fine-tuning, to address issues like handling non-English data and incorporating domain-specific terminology.

Impersonation in Data Engineering: No More Credentials in Your Code! — Marian Špilka

The presentation discusses a secure and efficient approach to managing credentials and access in a data engineering environment, using techniques like identity and access management, application default credentials, workload identity federation, and impersonation. The proposed solution aims to provide a seamless developer experience, ensure data security, and enable easy onboarding for new team members.

From Pandas to production: ELT with dlt — Violetta Mishechkina, Adrian Brudaru

The video discusses the differences between the machine learning and data engineering perspectives on the ETL/ELT process. It introduces dlt, an open-source Python library that aims to simplify data loading and transformation, addressing the challenges faced when moving from a local Pandas-based workflow to a production-ready data pipeline.

From Text to Context: How We Introduced a Modern Hybrid Search — Ansgar Gruene, Dharin Shah

This presentation discusses how the team at GetYourGuide introduced a modern hybrid search system that combines the power of text-based search and semantic search to improve the search experience for their customers. The team shares their approach to training and evaluating different models, the architectural changes they made, and the results and learnings from their implementation.

How to sell a big refactor or rewrite to the business? — Ivett Ördög

The talk explores the challenges and potential benefits of undertaking large-scale refactoring or rewriting of software systems. The speaker presents case studies and insights on how to effectively sell such initiatives to the business by focusing on incremental delivery of customer value and creating optionality for future development.

Keeping your projects nice and clean — Jan Musílek

This talk discusses the importance of maintaining clean and readable code in Python projects, emphasizing the use of automated tools like code formatters and linters to enforce consistent code style and best practices. The speaker highlights the benefits of adopting a consistent code style, including improved collaboration, reduced bugs, and saved time, and provides practical recommendations for implementing and enforcing these practices in a project.

DFD(Documentation-First Development) with FastAPI — Taehyun Lee

This presentation introduces the concept of Documentation-First Development (DFD) using the FastAPI web framework. The speaker highlights the importance of well-documented APIs, the challenges of managing documentation, and how FastAPI's features can enhance the development experience by automatically generating API documentation based on the code.

Animations from first principles — Rodrigo Girão Serrão

This talk provides a step-by-step introduction to creating simple animations from first principles using Python and the Pygame library. The presenter demonstrates how to draw pixels, shapes, and morph between different parameterized shapes, ultimately creating a rotating and contracting/expanding spiral animation with color changes.

Intellectual Property Law 101 — Anwesha Das

This talk provides a comprehensive overview of intellectual property law, covering key concepts such as trademarks, patents, and copyrights, and their relevance to the technology industry. The speaker also shares a personal anecdote about her journey in bridging the gap between law and technology, highlighting the importance of understanding these legal frameworks in the context of software development and open-source communities.

Event Sourcing in production — Borjan Tchakaloff

This talk explores the practical aspects of implementing an event-sourcing system in production, covering common challenges, patterns, and best practices. The speaker shares insights from their experience using event sourcing and domain-driven design in a Python-based gathering service, highlighting strategies for managing projections, domain evolution, and concurrency handling.
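
As a rough illustration of a projection, the sketch below folds an append-only event list into a read model. The banking domain and all names (`Deposited`, `Withdrawn`, `project_balances`) are invented for the example and are not from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    account: str
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    account: str
    amount: int


def project_balances(events):
    """Replay events in order to rebuild the current balance per account."""
    balances = {}
    for event in events:
        current = balances.get(event.account, 0)
        if isinstance(event, Deposited):
            balances[event.account] = current + event.amount
        elif isinstance(event, Withdrawn):
            balances[event.account] = current - event.amount
    return balances
```

The events are the source of truth; the dictionary is disposable and can be rebuilt at any time by replaying the log.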

Deadcode - a tool to find and fix unused (dead) Python code — Albertas Gimbutas

The presentation introduces a Python tool called 'Deadcode' that aims to find and fix unused (dead) Python code. The tool provides several features, including tunable options to reduce false positives, more comprehensive detection rules, and the ability to automatically remove unused code.

Enhancing Decorators with Type Annotations: Techniques and Best Practices — Koudai Aono

This talk explores techniques and best practices for enhancing decorators with type annotations, including the use of type protocols, parameter specification, and the new type annotation syntax introduced in Python 3.12. The speaker covers several practical examples and demonstrates how these features can improve the type safety and flexibility of decorator-based code.

EuroPython 2024 — CPython Core Development Panel

The panel discussion covered the challenges and efforts involved in the core development of the CPython interpreter, including managing backwards compatibility, improving performance, and addressing issues with the introduction of new features like free threading. The panelists discussed the complexity of the codebase, the need for better documentation and signaling to the community, and the importance of maintaining a balance between progress and stability.

Accelerating Python with Rust: The PyO3 Revolution — Roshan R Chandar

The talk discusses how the PyO3 framework can be used to accelerate Python applications by integrating Rust code. It covers the advantages of using Rust, such as memory safety and performance, and provides examples of real-world projects that have benefited from the PyO3 approach.

Data pipelines with Celery: modular, signal-driven and manageable — Marin Aglić Čuvić

The talk explores the use of Celery, a distributed task queue framework, for building modular, signal-driven, and manageable data pipelines. The speaker discusses the pros and cons of using Celery, the challenges in data processing, and presents several use cases demonstrating how Celery's signal-driven architecture can be leveraged to improve the flexibility and maintainability of data pipelines.

Learning to code in the age of AI — Sheena O'Connell

The talk explores the importance of developing fundamental coding skills and problem-solving abilities, rather than relying on AI tools to simply generate code. It emphasizes the collaborative, iterative, and exploratory nature of software development, which requires human understanding and precision.

Designing Config Files: The Conflicting Needs of Programmers and Users — Steven Pool

The talk explores strategies for designing more flexible and user-friendly configuration files, including the use of hierarchical structures, runtime overrides, and pushing complexity into the config files themselves. The speaker shares practical tips and techniques for making configuration files more maintainable, debuggable, and accessible to non-technical users.
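
Hierarchical configs with runtime overrides usually come down to a recursive merge. The sketch below is one plausible reading, not the speaker's code; `deep_merge` and the sample keys are hypothetical.

```python
def deep_merge(base, override):
    """Return a new dict where override wins, merging nested dicts key by key."""
    merged = dict(base)  # shallow copy so the inputs are left untouched
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


defaults = {"db": {"host": "localhost", "port": 5432}, "debug": False}
runtime = {"db": {"host": "db.internal"}, "debug": True}
config = deep_merge(defaults, runtime)
```

Here `config["db"]` keeps the default port while taking the overridden host, so users only need to state what differs from the defaults.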

From built-in concurrency primitives to large scale distributed computing — Jakub Urban

This talk provides an introduction to Python's built-in concurrency primitives, such as concurrent.futures, and how they can be used to scale to large-scale distributed computing using frameworks like Dask and Ray. The speaker covers the concepts of concurrency and parallelism, Python's built-in tools, and how to integrate these with asynchronous programming, as well as practical examples and considerations for scaling out to distributed environments.
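
The built-in starting point looks like the sketch below; `square` and `parallel_squares` are placeholder names, and a thread pool is used only because it is the simplest executor to demonstrate.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def parallel_squares(numbers, max_workers=4):
    """Apply square() to each input through an executor, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() schedules one task per item and yields results in input order.
        return list(pool.map(square, numbers))
```

Swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` keeps the calling code unchanged, which is what makes the executor interface a convenient stepping stone toward distributed schedulers.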

Is RAG all you need? A look at the limits of retrieval augmented generation — Sara Zanzottera

The talk explores the limits of retrieval-augmented generation (RAG) systems, discussing their strengths and weaknesses, evaluation strategies, and ways to improve them. The speaker covers topics such as the core components of RAG, common failure modes, and techniques like using multiple retrievers, self-correcting, and multi-hop reasoning to enhance the performance of these systems.

The PyArrow revolution in Pandas — Reuven M. Lerner

The talk discusses the PyArrow revolution in Pandas. PyArrow is the Python interface to Apache Arrow, a project that aims to provide a stable and efficient data processing infrastructure for various languages and frameworks. The speaker highlights how PyArrow can significantly improve the performance and memory usage of common Pandas operations, such as reading and writing files, as well as providing a more robust and flexible data type system.

The role of C++ in the Python ecosystem: the case of the Qt framework — Cristián Maureira-Fredes

This talk explores the role of C++ in the Python ecosystem, focusing on the Qt framework. The speaker highlights how C++ is more closely integrated with Python than one might think, and discusses the efforts of the Qt for Python (PySide) project to make C++ more accessible and Python-friendly.

Why should we all be hyped about inclusive leadership? — Tereza Iofciu

This talk explores the concept of inclusive leadership, highlighting the need for leaders to be self-aware, empathetic, and adaptable in managing diverse teams. The speaker emphasizes the importance of inclusive leadership as a competitive advantage in today's rapidly changing and diverse work environments.

Enterprise Python: Software That Lives Long And Prosper — Alvaro Duran

This talk explores how the rise of Python in enterprise software development has challenged long-held assumptions about the need for 'serious' programming languages. The speaker highlights Python's strengths, such as its simplicity, flexibility, and the emergence of 'citizen developers', as key factors that have made it an increasingly viable choice for building enterprise-grade applications.

Rapid Prototyping & Proof of Concepts: Django is all we need — Radoslav Georgiev

This talk explores the power of rapid prototyping and proof-of-concepts using Django, a mature and reliable Python web framework. The speaker shares insights on how to leverage Django's strengths to push product development forward, emphasizing the importance of understanding the 'why' behind each prototyping effort and striking a balance between comfort and exploration of new technologies.

FastUI - panacea or pipe dream? — Samuel Colvin

The talk explores the challenges of building web applications in Python and proposes FastUI as a potential solution to unify the rendering of common UI elements like forms, tables, and navigation. The speaker discusses the trade-offs between fine-grained control and abstraction in existing frameworks and presents FastUI as a framework that aims to provide a consistent contract between the front-end and back-end, enabling developers to leverage existing front-end technologies while simplifying the development process.

The Art of the Pull Request — Ben Lomax

The talk discusses the art of crafting high-quality pull requests (PRs) to improve code quality, review efficiency, and team collaboration. The speaker presents practical tips, such as keeping PRs small, creating atomic commits, separating refactoring and functional changes, providing context, and reviewing one's own PR, all aimed at optimizing the review process for the benefit of both the author and the reviewer.

Don't fix bad data, do this instead — Martina Ivanicova

The talk highlights the challenges of fixing bad data quality and proposes an alternative approach that focuses on protecting critical data points, using integration tests to detect data contract changes, and establishing data ownership and collaboration processes. The speaker emphasizes the importance of addressing data quality issues proactively and collaboratively, rather than relying solely on technical solutions.

Shipping ready-to-run Python apps without the need to install Python — Marc-André Lemburg

The talk discusses a tool called eGenix PyRun that allows packaging Python applications into a single executable file, eliminating the need to install Python on the target system. The tool is open-source, supports various Python versions, and provides a way to create self-contained, OS-independent Python applications.

A Tour of Synchronization Primitives in Python — Zach Muncaster

This talk provides a comprehensive overview of synchronization primitives in Python, including semaphores, events, locks, and barriers. The speaker explores various concurrency problems and demonstrates how these primitives can be used to solve them, emphasizing the importance of determinism and avoiding race conditions in concurrent programming.
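
Two of the primitives named above can be shown side by side. This is a small sketch with hypothetical function names: a `Lock` makes a shared counter deterministic, and a `Semaphore` caps how many threads are inside a section at once.

```python
import threading
import time


def count_with_lock(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads without losing updates."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(increments):
            with lock:  # the read-modify-write is now atomic
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


def peak_concurrency(limit=2, num_threads=6):
    """Return the highest number of threads seen inside the guarded section."""
    semaphore = threading.Semaphore(limit)
    lock = threading.Lock()
    active = 0
    peak = 0

    def work():
        nonlocal active, peak
        with semaphore:  # at most `limit` threads get past this line
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)  # simulate work while holding the slot
            with lock:
                active -= 1

    threads = [threading.Thread(target=work) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak
```

Without the lock, the final count would depend on thread timing; with it, the result is the same on every run.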

How to Build a Python-to-C++ Compiler out of Spare Parts - and Why — Xavier Thompson

The talk presents a Python-to-C++ compiler called Typon that aims to provide Python semantics with a focus on concurrency and parallelism. The compiler leverages C++ features to implement Python's dynamic nature, while offering structured concurrency primitives like 'fork' and 'sync' to enable efficient parallel execution.

EuroPython 2024 — Lightning talks Thursday

The video showcases a series of lightning talks at the EuroPython 2024 conference, where speakers from various Python communities around the world invite attendees to their local and regional conferences. The talks cover a wide range of topics, including conference announcements, community updates, and demonstrations of Python-based tools and technologies.

DBT & Python - How to write reusable and testable pipelines — Florian Stefan

The talk presents lessons learned from implementing production data pipelines using DBT on Snowflake. It covers the key concepts of DBT, such as sources, SQL models, Python models, and materializations, and demonstrates how to write testable and reusable pipelines using data tests, unit tests for SQL models, and unit tests for Python models.

The Catch in Rye: Seeding Change and Lessons Learned — Armin Ronacher

The talk discusses the speaker's experience in building a Python packaging tool called Rye and the challenges faced in the Python packaging ecosystem. It highlights the need for a more unified and standardized approach to Python packaging, with the goal of providing a seamless developer experience.

PEP 639 - Towards licensing standardization in Python packaging — Karolina Surma

The talk discusses the challenges and proposed solutions for standardizing licensing information in Python packaging. It highlights the need for a more structured and machine-readable approach to specifying licenses, and the efforts to incorporate the SPDX standard into the Python packaging ecosystem.

The rise of the YAML engineer — Matthieu Caneill

The presentation explores the rise of the YAML engineer and the benefits of using declarative programming with YAML to build data platforms. The speaker highlights how YAML allows engineers to describe the desired state of their infrastructure, making data systems more ubiquitous and metadata more trackable in Git, the source of truth.

Containerize your Python apps like it's 2024 — Jan Smitka

The talk discusses best practices for containerizing Python applications, focusing on creating production-ready Docker images. The speaker covers topics such as choosing the right base image, leveraging Docker's cache, using multi-stage builds, and improving security and performance of the Docker images.

Unlock the Power of Dev Containers: Consistent Environments in Seconds! — Thomas Fraunholz

This talk explores the benefits of using Dev Containers for consistent and reproducible Python development environments. The speaker demonstrates how Dev Containers can simplify the setup and sharing of development environments, addressing challenges such as package management, interpreter compatibility, and system-level dependencies.

RPA, TDD, and Embedded: A world glued together with Python! — Javier Alonso

This talk explores the use of Robotic Process Automation (RPA), Test-Driven Development (TDD), and embedded systems, all tied together with Python. The speaker demonstrates how the Robot Framework, an open-source automation framework, can be used to streamline testing and integration for embedded systems, making the process more accessible and readable for both technical and non-technical team members.

Python’s Journey: From Upstream to Enterprise — Lumír Balhar

This talk explores the journey of Python from its upstream development to its integration and maintenance in enterprise-level Linux distributions like Fedora. The speaker highlights the challenges and efforts involved in ensuring a seamless and secure transition of Python versions across different Linux ecosystems, benefiting both developers and end-users.

EuroPython 2024 — Sponsor Highlight & Recruitment Fair

The video highlights the sponsor companies and their recruitment opportunities at the EuroPython 2024 conference. The sponsors, including Pydantic, ClickHouse, DLT, Kiwi.com, Numberly, and Opel, present their companies, open positions, and the benefits of working with them, inviting attendees to explore their booths and apply for the available roles.

PySyft: Data Science on data you are not allowed to see — Valerio Maggio

This talk introduces PySyft, an open-source project developed by the nonprofit organization OpenMined to enable remote data science on private data without directly accessing it. The talk discusses the importance of data in machine learning, the challenges of accessing private data, and how PySyft combines privacy-preserving technologies so that researchers can answer questions about AI algorithms without seeing the underlying data.

Many ways to be a Python contributor — Paolo Melchiorre

The video discusses the various ways to contribute to the Python community, from writing code to organizing local meetups. The speaker shares his personal journey of overcoming his initial hesitation and becoming an active contributor, highlighting the benefits of community involvement and the opportunities it offers for personal and professional growth.

GraalPy - Fast Python Implementation — Štěpán Šindelář, Tim Felgentreff

GraalPy is a high-performance, compatible implementation of Python that seamlessly integrates with Java, offering features like JIT compilation, standalone binaries, and Java interoperability. The presentation covers GraalPy's compatibility, performance, and use cases, highlighting its potential as an alternative to the standard CPython implementation.

An alternative view on the OpenAPI documentation. — Maxim Danilov

The presentation offers an alternative view on the OpenAPI documentation, highlighting the complexities and challenges of managing and generating consistent API documentation across multiple frameworks and services. The speaker proposes a unified approach to collecting and merging YAML files from individual endpoints, leveraging HTTP OPTIONS requests and standard docstrings to simplify the documentation generation process.

CompiledPoem.py: Teaching about diversity and Python through poem — Soraya Roberta

The talk presents a novel approach to teaching computational thinking and Python programming through the lens of poetry, particularly in the context of underrepresented communities in Brazil. The speaker, Soraya Roberta, discusses her experiences developing the 'CompiledPoem.py' project, which integrates poetry writing, computational concepts, and social awareness to engage students and address academic challenges in computer science education.

Why communication is the best skill you can develop as a programmer — Miriam Forner

This talk explores the importance of effective communication as a key skill for programmers. The speaker highlights the value of clear, empathetic, and consistent communication throughout the software development process, from task assignment to code reviews and collaboration.

Python Unplugged: Mining for Hidden Batteries — Torsten Zielke

The talk explores the hidden 'batteries' within the Python standard library, showcasing various built-in tools and techniques that can be leveraged for data fetching, cleaning, and processing, without relying on external libraries. The presenter demonstrates how these tools can simplify common data-related tasks, making Python a powerful and versatile language for a wide range of applications.
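
As a rough illustration of the idea (not taken from the talk itself), the sketch below cleans and aggregates a small CSV using only standard-library modules. The inline data, the column names, and the `load_rows` helper are assumptions made for the example.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical inline data standing in for a fetched CSV file.
RAW = """city,temp_c
Prague,21.5
Brno,
Prague,23.0
Ostrava,19.0
"""


def load_rows(text):
    """Parse CSV text and drop rows with a missing temperature."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"city": row["city"], "temp_c": float(row["temp_c"])}
        for row in reader
        if row["temp_c"]  # skip empty cells instead of failing on float("")
    ]


rows = load_rows(RAW)
mean_temp = statistics.mean(r["temp_c"] for r in rows)
city_counts = Counter(r["city"] for r in rows)
```

Here `csv`, `statistics`, and `collections.Counter` cover parsing, cleaning, and counting without any third-party dependency.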

Mutation Testing in Python with Cosmic Ray — Austin Bingham

The presentation covers the theory and practice of mutation testing in Python, including an introduction to the concept, the challenges involved, and the Cosmic Ray tool for performing mutation testing. The speaker discusses the goals of mutation testing, examples of mutation operators, the complexities of determining what to mutate and running the tests, and the features and future work for the Cosmic Ray project.

Caching for Jupyter Notebooks — Lauris Jullien

This talk provides an overview of caching techniques for Jupyter Notebooks, focusing on strategies, developer experience, and storage options. The speaker discusses how caching can significantly improve the performance of data science workflows and provides practical examples and recommendations for implementing caching in Jupyter Notebooks.

Mastering Design Patterns: Crafting Elegant Solutions with a Confidence — Petr Balogh

This talk provides an in-depth overview of design patterns, covering their purpose, benefits, and the three main categories: creational, structural, and behavioral patterns. The speaker delves into the details of several specific design patterns, such as Singleton, Abstract Factory, Adapter, and Observer, highlighting their structure, use cases, and implementation examples.
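
To make one of the named patterns concrete, here is a minimal Observer sketch. It is an illustrative assumption rather than the speaker's example; the `Subject` class and its method names are hypothetical.

```python
from typing import Callable


class Subject:
    """Holds a list of callbacks and notifies each one about events."""

    def __init__(self):
        self._observers: list[Callable[[str], None]] = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def unsubscribe(self, callback):
        self._observers.remove(callback)

    def notify(self, event):
        # Iterate over a copy so observers may unsubscribe while being notified.
        for callback in list(self._observers):
            callback(event)


received = []
subject = Subject()
subject.subscribe(received.append)
subject.notify("saved")
subject.unsubscribe(received.append)
subject.notify("deleted")  # nobody is listening any more
```

The subject knows nothing about its observers beyond the fact that they are callable, which is what keeps the two sides decoupled.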

When and how to start coding with kids — Anna-Lena Popkes

This talk provides a comprehensive overview of when and how to start coding with kids, covering the importance of coding for children's development, the brain's maturation process, and specific age-appropriate tools and programming languages. The presenter shares practical tips and strategies to make coding an engaging and enriching experience for children of different ages.

Creating Your Own Extensions for JupyterLab — Daniel Goldfarb

The video provides a comprehensive tutorial on creating custom extensions for JupyterLab, covering topics such as building a basic extension, adding commands, creating widgets, and styling them. The speaker demonstrates step-by-step how to develop an extension, showcasing its capabilities and the various features available in the JupyterLab ecosystem.

Healthy code for healthy teams (or the other way around) — Mai Giménez

The talk discusses the importance of building a sustainable and maintainable codebase for research and development teams. It emphasizes the need for diverse, kind, and empathetic teams united by a clear mission, empowered to make decisions, and driven by open communication.

Behind the Scenes of an Ads Prediction System — Bunmi Akinremi

This presentation provides an in-depth exploration of the data and machine learning ecosystem behind an ads prediction system. The speaker delves into the key concepts, design flow, and ethical considerations involved in creating a highly-tuned and effective ad recommendation system that balances the interests of advertisers, platforms, and users.

Async Await: Mastering Python's Time-Bending Tricks — Bojan Miletic

This talk provides a concise and practical introduction to using the Async/Await features in Python, demonstrating how they can significantly improve the performance of I/O-bound tasks with minimal code changes. The speaker showcases simple functions and techniques that can be easily integrated into existing projects, allowing developers to leverage the power of asynchronous programming without delving into the underlying complexities.
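
A minimal sketch of the effect described, assuming `asyncio.sleep` as a stand-in for real network or disk I/O. The `fetch` coroutine and its delays are hypothetical.

```python
import asyncio
import time


async def fetch(name, delay):
    """Pretend to do I/O by sleeping without blocking the event loop."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # gather() runs the three coroutines concurrently and keeps result order.
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.2),
        fetch("c", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
```

Run one after another, the three waits would take about 0.6 seconds; run concurrently they overlap, so the total stays close to the longest single wait.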

Move the Python ecosystem to the stable ABI — Victor Stinner

The talk discusses the efforts to move the Python ecosystem towards a stable Application Binary Interface (ABI), which would allow C extensions to be built once and work across different Python versions. The speaker, Victor Stinner, is a Python core developer who has been working on improving the Python C API and addressing the challenges of maintaining compatibility as the language evolves.

Cython and the Limited API — David Woods

The video discusses Cython, a tool that generates C code from Python-like code, and the limited API, a restricted set of functions in the Python C API that provides a stable binary interface. The speaker covers the benefits and limitations of using the limited API with Cython, as well as practical information on how to use it in your projects.

Tales from the abyss: some of the most obscure CPython bugs — Pablo Galindo Salgado

The talk explores a collection of obscure and bizarre bugs encountered in the development of the CPython interpreter. The speaker highlights the challenges of maintaining a large and complex codebase, where even small changes can lead to unexpected and hard-to-reproduce issues, and emphasizes the importance of understanding the underlying language and system behavior to effectively address these problems.

EuroPython 2024 — Sprint orientation

The video provides an overview of the EuroPython 2024 Sprint orientation, where various project maintainers present their initiatives and invite attendees to participate in the upcoming Sprint weekend. The video highlights the diverse range of projects, from Python core development to web frameworks, syntax highlighting tools, and electronics with MicroPython and CircuitPython.

Pytest Design Patterns — Miloslav Pojman

This talk discusses design patterns for testing Python applications using the Pytest framework. The speaker covers techniques for isolating dependencies, managing authorization, and testing high-level application logic, with a focus on maintainable and understandable tests.

Test java and C applications with python — Roberto Polli

The presenter discusses how Python can be used to test Java and C applications, addressing organizational and technical challenges in maintaining legacy code. Key benefits include reducing the effort of writing C code, speeding up test execution, and enabling broader participation in the testing process.

EuroPython 2024 — Closing Session

The closing session of EuroPython 2024 celebrated the success of the conference, highlighting the diversity and inclusion efforts, the record number of proposals, and the tireless work of the organizers, volunteers, and sponsors. The session acknowledged the contributions of the speakers, tutors, and remote participants, and looked forward to the upcoming Sprint weekend and the next edition of EuroPython in 2025.

The truth about objects — Naomi Ceder

The talk explores the concept of objects in Python, covering the basics of how everything in Python is an object, including constants, functions, and modules. It also delves into the more advanced and sometimes surprising aspects of classes, such as monkey patching and metaclasses, while cautioning against their overuse in production code.
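
The points above can be illustrated with a short sketch. The `greet`, `Duck`, and `loud_speak` names are hypothetical and not taken from the talk.

```python
import math


def greet(name):
    return f"Hello, {name}"


# Literals, functions, and modules are all instances of some class.
kinds = [type(42).__name__, type(greet).__name__, type(math).__name__]

# Functions carry attributes like any other object.
greet.calls = 0


class Duck:
    def speak(self):
        return "Quack"


def loud_speak(self):
    return "QUACK!"


duck = Duck()
before = duck.speak()
Duck.speak = loud_speak  # monkey patching: rebinding a method at runtime
after = duck.speak()     # existing instances see the patched method
```

Because the lookup goes through the class at call time, the patch affects instances created before it, which is one reason the technique is easy to misuse.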

Adventures in not writing tests — Andy Fundinger

This talk explores the use of the Hypothesis library for property-based testing, which can help reduce the time and effort required to write effective tests. The speaker demonstrates how Hypothesis can generate edge cases and handle complex data structures, allowing developers to focus on describing the expected behavior rather than manually creating test cases.

Live coding music with PyREPL in Python 3.13 — Łukasz Langa

The video demonstrates live coding music using PyREPL in Python 3.13, showcasing features like custom MIDI output, sequencers, and transposition. The presenter also discusses the potential for contributing to the development of Python's interactive shell and the advantages of the new REPL in Python 3.13 over previous versions.

From Diamonds to Mixins: Demystifying Multiple Inheritance in Python — Ariel Ortiz

The talk covers the concept of multiple inheritance in Python, including the diamond problem and its resolution using the method resolution order. It also discusses the use of the super() function, mixins, and alternatives to multiple inheritance such as composition and the interface segregation principle.
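
A small diamond hierarchy shows how the method resolution order and `super()` interact. The class names are hypothetical and chosen only for this sketch.

```python
class Base:
    def hello(self):
        return ["Base"]


class Left(Base):
    def hello(self):
        return ["Left"] + super().hello()


class Right(Base):
    def hello(self):
        return ["Right"] + super().hello()


class Child(Left, Right):
    def hello(self):
        return ["Child"] + super().hello()


# The MRO linearizes the diamond so that Base appears exactly once.
mro_names = [cls.__name__ for cls in Child.__mro__]

# super() follows the MRO of the instance, so Left delegates to Right here.
call_chain = Child().hello()
```

Note that `super()` inside `Left` does not mean "my parent class": for a `Child` instance it means "the next class in `Child.__mro__`", which is `Right`.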

Automatic trusted publishing with PyPI — Facundo Tuesca

Trusted publishing is a secure way to upload Python packages to PyPI without manually managing API tokens. The talk covers how trusted publishing works, the security benefits it provides, and its implementation using open standards like OpenID Connect.

Neurodiversity in the IT industry. Why do YOU need to know more about it? — Amelia Walter-Dzikowska

The talk explores neurodiversity in the IT industry, highlighting the importance of understanding and accommodating the unique needs and strengths of neurodiverse individuals. It discusses the challenges faced by neurodiverse professionals and provides practical recommendations for creating inclusive work environments that benefit all employees.

Lessons learned from maintaining open-source Python projects — Bernat Gabor

This talk provides valuable insights and practical advice for maintaining open-source Python projects, covering topics such as defining the role of a maintainer, establishing community guidelines, managing contributions, and ensuring project sustainability. The speaker shares personal experiences and best practices to help attendees navigate the challenges and rewards of open-source project maintenance.

Start strong! — Honza Král

The talk covers best practices for starting a new Python project, including managing dependencies, automating code formatting and linting, setting up continuous integration, and preparing a project for publication on PyPI. The presenter emphasizes the importance of automating these tasks to save time and reduce friction in the development process, allowing developers to focus on solving the actual problem at hand.

Edges of Python: Three Radical Python Hacks for Fun and Profit — Elvis Pranskevichus

The talk covers three Python hacks for building efficient and maintainable applications. The first hack demonstrates how to create a synchronous API on top of an asynchronous implementation, the second hack introduces a persistent data structure called HAMT for efficient schema migrations, and the third hack recommends using single dispatch for cleaner and more performant code.
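
The third hack can be sketched with `functools.singledispatch` from the standard library. The `describe` function is a hypothetical example, not code from the talk.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback used when no more specific implementation is registered."""
    return f"object: {value!r}"


@describe.register
def _(value: int):
    return f"int: {value}"


@describe.register
def _(value: list):
    # Recursion goes back through the dispatcher for each element.
    return "list of " + ", ".join(describe(v) for v in value)


int_text = describe(3)
str_text = describe("x")
list_text = describe([1, "a"])
```

Each type gets its own small function instead of one long `isinstance` chain, and new types can be registered without touching the existing implementations.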

MLtraq: Track your ML/AI experiments at hyperspeed — Michele Dallachiesa

The video discusses the importance of efficient experiment tracking in machine learning and AI development. It presents a new framework called MLtraq that aims to provide faster and more flexible experiment tracking compared to existing solutions like MLflow and Weights & Biases.

Those annotations can have things other than typing?! — Mattijs Ugen

The presentation explores the use of Python annotations beyond their primary purpose of type annotations. It discusses various alternative use cases, such as for documentation, input validation, and data parsing, while acknowledging the potential challenges in maintaining compatibility with the typing module.
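
One way to attach non-typing information to an annotation is `typing.Annotated`. The sketch below uses it for input validation; the `Range` marker and the `validate` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Range:
    """Hypothetical metadata marker: allowed inclusive bounds."""

    low: int
    high: int


def set_volume(level: Annotated[int, Range(0, 10)]) -> int:
    return level


def validate(func, **kwargs):
    """Check keyword arguments against any Range metadata, then call func."""
    hints = get_type_hints(func, include_extras=True)
    for name, value in kwargs.items():
        # Annotated types expose their extra arguments as __metadata__.
        for meta in getattr(hints.get(name), "__metadata__", ()):
            if isinstance(meta, Range) and not (meta.low <= value <= meta.high):
                raise ValueError(f"{name}={value} outside [{meta.low}, {meta.high}]")
    return func(**kwargs)


ok_value = validate(set_volume, level=5)
```

Type checkers still see `level` as a plain `int`, while runtime code can read the extra metadata, so the two uses do not compete for the same slot.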

The Imposter Staff Engineer’s Journey to Leadership — Manivannan Selvaraj

The speaker shares his personal journey with imposter syndrome, a common feeling among high-performers, and provides practical strategies to overcome it, such as being vulnerable, finding mentors, and documenting one's achievements. His insights and relatable anecdotes offer valuable lessons on building self-confidence and embracing the challenges of leadership roles.

Fundamentals of Retrieval Augmented Generation — Catalin Hanga

The presentation discusses the fundamentals of Retrieval Augmented Generation (RAG), a technique that addresses the limitations of language models by leveraging a database of documents to enhance their performance. It covers the key steps involved in the RAG process, including semantic search, document retrieval, and prompt augmentation, as well as the evaluation metrics used to assess the system's performance.
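
The retrieval and prompt-augmentation steps can be sketched without any model at all. This toy version assumes a bag-of-words count vector in place of a real embedding model and stops before the generation step; the documents and function names are hypothetical.

```python
import math
import re
from collections import Counter

DOCS = [
    "EuroPython 2024 took place in Prague.",
    "pgvector adds vector similarity search to PostgreSQL.",
    "Cosmic Ray is a mutation testing tool for Python.",
]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    query = embed(question)
    return sorted(docs, key=lambda d: cosine(query, embed(d)), reverse=True)[:k]


def build_prompt(question, docs):
    """Augment the question with retrieved context before calling a model."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("Which tool does mutation testing for Python?", DOCS)
```

In a real system the `embed` function would call an embedding model and the documents would live in a vector store, but the flow of search, retrieve, and augment stays the same.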

How I used pgvector and PostgreSQL® to find pictures of me at a party — Tibs

The speaker discusses how they used PostgreSQL and the pgvector extension to efficiently store and search for photos of themselves from a company event. They highlight the benefits of using a general-purpose database like PostgreSQL for this task, as well as the potential limitations and future improvements in this area.

Navigating Tech Leadership: Challenges and Strategies — Çağıl Uluşahin Sönmez

The talk explores the challenges and strategies of navigating tech leadership roles, including team structures, individual contributor and leadership responsibilities, common problems, and the role of leadership in addressing diversity in the industry. The speaker shares personal experiences and insights on transitioning to managerial positions and the importance of communication, facilitation, and supporting team wellness as key aspects of effective tech leadership.

Streamlining Testing in a Large Python Codebase — Jimmy Lai

This talk discusses strategies for streamlining testing in a large Python codebase, including parallel execution, caching, skipping unnecessary computations, and using modern runners. The speaker shares how these techniques helped their team reduce test execution time and increase test coverage, ultimately improving developer experience and code quality.

You are sharing your code wrong (and what to do about it) — Jeremiah Paige

The talk discusses the importance of properly sharing and distributing code, emphasizing the need to package code effectively to ensure a smooth installation and usage experience for users. The speaker outlines a three-step process of wrapping, delivering, and unwrapping code to optimize the sharing process and reduce the burden on users.

Building Event-Driven Python service using FastStream and AsyncAPI — Abhinand C

This talk presents the use of FastStream and AsyncAPI to build an event-driven Python service, highlighting the benefits of event-driven architecture and the challenges it poses, such as debugging and unit testing. The speaker demonstrates how to integrate FastStream with FastAPI and AsyncAPI to create a scalable and documented event-driven system.

Tackling Thread Safety in Python — Jothir Adithyan, Adarsh Divakaran

This talk discusses the challenges of thread safety in Python, including race conditions, synchronization primitives, and techniques to make a program thread-safe. The presenters demonstrate examples of thread-unsafe code and provide solutions using locks, semaphores, and other synchronization mechanisms to ensure consistent and predictable behavior in a multi-threaded environment.
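
A minimal sketch of the lock-based fix, assuming a shared counter as the contended resource. The `Counter` class and `run` helper are hypothetical, not the presenters' code.

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read-modify-write; the lock serializes it.
        with self._lock:
            self.value += 1


def run(counter, n_threads=8, n_increments=10_000):
    """Increment the counter from several threads and return the final value."""

    def worker():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


total = run(Counter())
```

Without the lock, two threads can read the same old value and both write back the same result, silently losing an update; with it, the final count always equals the number of increments.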

Diversity, Equity, and Inclusion

Effective Strategies for Disability Inclusion in Open Source Communities — Brayan Kai Mwanyumba

The talk discusses effective strategies for disability inclusion in open source communities, emphasizing the importance of transformational leadership, creating a compelling vision for social change, and implementing a disability mainstreaming action plan. The speaker provides a step-by-step approach, including setting up a disability mainstreaming committee, formulating an action plan, implementing the plan, and monitoring and evaluating progress.

Health

EuroPython 2024 — Lightning talks Friday

The speaker discusses how using self-care techniques like taking regular breaks, using eye protection, and maintaining good posture can help programmers improve their health and become more effective in their work. The talk aims to inspire the audience to prioritize their well-being and adopt healthier habits to support their programming careers.

Keynote

EuroPython 2024 — Opening Session

The EuroPython 2024 conference, organized by the EuroPython Society, is a volunteer-run event that brings together the Python community in Europe. The conference offers a diverse range of activities, events, and opportunities for participants to connect, learn, and contribute to the Python ecosystem.

Multimedia

Multimedia processing with FFMpeg and Python — Michał Rokita

The talk provides an overview of using FFmpeg, a powerful multimedia processing tool, with Python. The speaker demonstrates various use cases, from simple video trimming to complex video stream processing, and highlights the advantages of using the FFmpeg Python library over the command-line interface.

Natural Language Processing

Representation is King: The Journey to Quality Dialog Embeddings — Adam Zíka

The presentation explores the journey to building quality dialog embeddings, highlighting the importance of representation learning. The speaker demonstrates how fine-tuning a pre-trained sentence transformer model on custom data can significantly improve the performance of a predictive model for dialog actions, outperforming a standard transformer encoder.

Observability

Pydantic Logfire — Uncomplicated Observability — Samuel Colvin

Pydantic Logfire is a powerful observability tool that simplifies the process of logging and tracing in Python applications. The presentation showcases Logfire's ability to provide structured and hierarchical logging, SQL-based data exploration, and seamless integration with open-source observability standards like OpenTelemetry.

Autoinstrumentation Adventures: enhancing Python apps with OpenTelemetry — Israel Blancas

This talk explores the use of OpenTelemetry, an open-source observability framework, to enhance Python applications with automated instrumentation and distributed tracing. The speaker demonstrates how OpenTelemetry can provide visibility into complex application architectures, enabling developers to quickly identify and resolve performance issues and failures.

Python Observability Perfected: Advanced Techniques with OpenTelemetry — Anton Caceres

The talk explores the evolution of observability in software development, from primitive techniques like LEDs and logs to the more advanced approach of OpenTelemetry. It highlights the benefits of OpenTelemetry, a standard protocol for collecting and analyzing telemetry data, and demonstrates how it can be integrated into Python applications to provide a comprehensive view of system performance and behavior.

A Tale of Scaling Observability — Toomas Ormisson

The talk presents the challenges and key principles involved in scaling observability at Wise, a rapidly growing global financial technology company. It covers the evolution of their telemetry systems, the transition to open-source solutions like Prometheus, Loki, and Tempo, and the focus on making the observability platform available, accessible, and scalable to support the company's growth and mission.

Performance Engineering

Profile, Optimize, Repeat: One Core Is All You Need™ — Valentin Nieper, Jonathan Striebel

The talk covers profiling and optimizing code for speed and memory usage, starting with a simple example and progressively improving it through techniques like vectorization, tiling, and low-level C++ extensions. The speakers emphasize the importance of profiling before scaling out, and provide a range of tools and strategies for optimizing code on a single core.

PEP 683: Immortal Objects - A new approach for memory managing — Vinícius Gubiani Ferreira

This talk presents PEP 683, a new approach for memory management in Python called 'Immortal Objects'. It discusses the issues of CPU cache invalidation, data race conditions, and copy-on-write that this PEP aims to address, as well as the challenges and trade-offs in its implementation.

Security

Zero Trust APIs with Python — Jose Haro Peralta

The talk discusses the importance of securing APIs and presents a zero-trust security model for APIs. It provides examples of common vulnerabilities, such as pagination attacks, SQL injection, and mass assignment, and demonstrates how to address these issues through design-time and runtime testing.

How to destroy the world using Python and a synthetic virus — Helena Gómez Pozo, Marina Moro López

The presentation explores the use of Python and synthetic biology tools, such as CRISPR-Cas9, to create and modify a synthetic virus with the potential to destroy the world. The speakers emphasize the importance of responsible use of these powerful technologies and highlight their potential clinical applications in areas like gene therapy and cancer treatment.

Best practices for securely consuming open source in Python — Ciara Carey

This talk discusses the importance of securely consuming open-source software in Python, highlighting the growing threat of supply chain attacks and the need for a comprehensive framework to manage open-source dependencies. The speaker introduces the Secure Software Supply Chain Consumption Framework (S2C2F) and outlines practical strategies and tools to implement its best practices, such as using artifact management, pinning dependencies, and automating vulnerability scanning.

logger.info(f"Don't Give all your {secrets} away") — Tamar Galer

The talk discusses the importance of being cautious about what is logged in software applications, as sensitive information like API tokens and credentials can inadvertently be exposed. The speaker introduces a tool called 'Masker Logger' that helps detect and mask sensitive data in logs, leveraging the open-source Git Secrets tool and the Aho-Corasick algorithm for efficient pattern matching.
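
The general idea of masking secrets before they reach a log can be sketched with a standard `logging.Filter`. This is not the tool's implementation: it swaps in a single hypothetical regular expression where the talk describes Aho-Corasick matching over known secret patterns.

```python
import io
import logging
import re

# Hypothetical pattern: values following "token=" or "password=".
SECRET_RE = re.compile(r"(token|password)=\S+", re.IGNORECASE)


class MaskingFilter(logging.Filter):
    """Rewrite each record so secret values are replaced with asterisks."""

    def filter(self, record):
        # Render the final message first, then mask it and clear the args
        # so the handler does not try to format it a second time.
        record.msg = SECRET_RE.sub(r"\1=****", record.getMessage())
        record.args = ()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(MaskingFilter())

logger = logging.getLogger("masking_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

secrets = "token=abc123"
logger.info(f"Don't give all your {secrets} away")
output = stream.getvalue()
```

Attaching the filter to the handler rather than to individual call sites means the masking applies even when a developer forgets that a variable holds a credential.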

Counting down for CRA - updates and expectations — Cheuk Ting Ho, Deb Nicholson

It’s happening: TUF joins PyPI (Warehouse) — Kairo de Araujo, Lukas Pühringer

The talk discusses the integration of The Update Framework (TUF) into the Python Package Index (PyPI), also known as Warehouse. It highlights the benefits of TUF in enhancing the security of Python package distribution, including its ability to manage trust at scale, reduce the impact of compromises, and support incident recovery.

Which LLM said that? - watermarking generated text — Adam Kaczmarek

The presentation discusses the challenges and potential solutions for watermarking generated text to prevent the misuse of AI-generated content. The speaker explores the limitations of current watermarking approaches and the need for more robust and secure methods to ensure the integrity of AI-generated text.

Sustainability

EuroPython 2024 — Open Source Sustainability Panel

This panel discussion explores the sustainability of open-source projects, covering funding, community engagement, and managing the tension between business and community interests. The speakers share their experiences and insights on building successful and long-lasting open-source initiatives, emphasizing the importance of transparency, alignment, and fostering a healthy community.