DEEM brings together researchers and practitioners at the intersection of applied machine learning, data management, and systems research, with the goal of discussing the arising data management issues in ML application scenarios. The DEEM workshop will be held on Friday, June 5th, in conjunction with SIGMOD/PODS 2026. The workshop will be held in person in Bengaluru.
The workshop solicits regular research papers (8 pages plus unlimited references) describing preliminary or completed research results, as well as short papers (up to 4 pages) such as reports on applications and tools, preliminary results, interesting use cases, problems, datasets, benchmarks, visionary ideas, and descriptions of system components and tools related to end-to-end ML pipelines. Submissions should follow the same guidelines as SIGMOD, i.e., use the sigconf template for the ACM proceedings format.
Follow us on Twitter @deem_workshop or Bluesky @deem-workshop.bsky.social, or contact the organizers via email. We also provide archived websites of previous editions of the workshop: DEEM 2017, DEEM 2018, DEEM 2019, DEEM 2020, DEEM 2021, DEEM 2022, DEEM 2023, DEEM 2024, and DEEM 2025.
Imagine a personal assistant that, with user permission, persistently remembers moments from daily life—answering questions like “When and where did I see this lady?” or offering personalized suggestions like “You might enjoy The Little Prince—it relates to the statue you liked in Lyon.” Realizing this vision requires overcoming major challenges: capturing visual memories under hardware constraints (e.g., memory, battery, thermal limits, bandwidth), extracting meaningful personalization signals from noisy, task-agnostic visual histories, and supporting real-time question answering and recommendations under tight latency requirements. In this talk, we present our early work toward this goal. Pensieve, our memory-based QA system, improves accuracy by 11% over state-of-the-art multimodal RAG baselines. VisualLens infers user interests from casual photos, outperforming leading recommendation systems by 5–10%. We also share initial results on efficient, event-triggered memory capture and compression. Our work points to a broad landscape of research opportunities in building richer, more context-aware personal assistants capable of learning from and reasoning over users’ visual experiences.
Applying Machine Learning (ML) in real-world scenarios is a challenging task. In recent years, the main focus of the data management community has been on creating systems and abstractions for the efficient training of ML models on large datasets. However, model training is only one of many steps in an end-to-end ML application, and a number of orthogonal data management problems arise from the large-scale use of ML and the increased adoption of large language models (LLMs).
For example, data preprocessing and feature extraction workloads may be complicated and require simultaneous execution of relational and linear algebraic operations. Next, model selection may involve searching many combinations of model architectures, features, and hyper-parameters to find the best-performing model. After model training, the resulting model may have to be deployed and integrated into business workflows and require lifecycle management using metadata and lineage. As a further complication, the resulting system may have to take into account a heterogeneous audience, ranging from domain experts without programming skills to data engineers and statisticians who develop custom algorithms. Many such challenges are human- or engineer-centered (e.g., monitoring ML pipelines, leveraging LLMs for domain-specific tasks at scale), and DEEM uniquely encourages submissions on such topics.
Additionally, the importance of incorporating ethics and legal compliance into machine-assisted decision-making is being broadly recognized. Critical opportunities for improving data quality and representativeness, controlling for bias, and allowing humans to oversee and impact computational processes are missed if we do not consider the lifecycle stages upstream from model training and deployment. DEEM welcomes research on providing system-level support to data scientists who wish to develop and deploy responsible machine learning methods.
DEEM aims to bring together researchers and practitioners at the intersection of applied machine learning, data management, and systems research, with the goal of discussing the data management issues that arise in ML application scenarios.
We invite submissions in the following two tracks:
The Microsoft CMT service was used for managing the peer-reviewing process for this conference. This service was provided for free by Microsoft and they bore all expenses, including costs for Azure cloud services as well as for software development and support.
Liane Vogel
Matteo Interlandi
Bojan Karlaš
Stefan Grafberger