April 13, 2023, 1:15 a.m.

## Faculty Seminar: J. Marcinkowski - Towards Multiset Semantics Database Theory: How I Learned to Stop Worrying and Love Linear Algebra, April 21, 13:00, room 25

We invite employees, doctoral students and students to a faculty seminar to be held on **Friday, April 21**, at **13:00** (Institute of Computer Science, room 25).

**Prof. Jerzy Marcinkowski** will give a lecture entitled: **Towards Multiset Semantics Database Theory: How I Learned to Stop Worrying and Love Linear Algebra. **

__Abstract:__

For the last ten years, or maybe more, I have been a Database Theorist.

Like all (or at least most of) the Database Theorists I modelled the real-world-database relations, and real-world answers to the queries, as sets. If my query was "Exists y Cat(y) and Owns(x,y)", and my database happened to contain the tuples "Cat(Tibby)" and "Owns(Andreas, Tibby)" then -- clearly -- the constant "Andreas" was an element of the answer set.

I studied this model, using the methodology of mathematics. My toolbox was simple. It comprised some specific Database Theory concepts (like Chase) and some simple Computation Theory tricks (including maybe some automata-based arguments). Using this toolbox I constructed reasonings, which were occasionally complicated, but always elementary. It was fun.

But then, in early 2021, someone asked me what would happen to my results concerning Query Determinacy (whatever it is) if I considered a more realistic model, in which database relations, and answers to the queries, are multisets rather than sets. So, if my database knew that Andreas actually has six cats, the answer (multi)set would contain the constant "Andreas" with multiplicity six.

We started to think about it (with my very clever students, Jarosław Kwiecień and Piotr Ostropolski-Nalewaja) and soon we realized that this minor change in the assumptions about the model leads to a major change in the tools we need to study it. This felt to us like entering a completely new world. Suddenly the language and the methods from Linear Algebra appear so naturally that it is fair to say that they impose themselves on the researchers.

Clearly, we were not the first Database Theorists to set foot on this new continent. People have attempted to re-examine old Database Theory results, in the multiset-semantics scenario, for about 30 years now. But such attempts were few. And this is (I guess) exactly because in order to make any progress here one needs to learn totally new tools.

To our surprise we learned that this continent had also been visited before by the Mathematicians. They had constructed a number of structures there, some of them of some use for us. That's a bit frustrating, because they use different terms to call some notions, and it is not always straightforward to understand how to translate their own results to our language.

My talk will be a story by a traveller astonished by what he saw during his first short trip to the little known continent of Multiset-Semantics Database Theory.

To give you a glimpse of the techniques that appear there I will present some results and arguments from our PODS 2022 paper "Determinacy of Real Conjunctive Queries. The Boolean Case" (with Kwiecień and Ostropolski-Nalewaja). But I will also mention the adventures of other travellers to this new land, with particular emphasis on the famous Conjunctive Query Containment Problem.

Like all (or at least most of) the Database Theorists I modelled the real-world-database relations, and real-world answers to the queries, as sets. If my query was "Exists y Cat(y) and Owns(x,y)", and my database happened to contain the tuples "Cat(Tibby)" and "Owns(Andreas, Tibby)" then -- clearly -- the constant "Andreas" was an element of the answer set.

I studied this model, using the methodology of mathematics. My toolbox was simple. It comprised some specific Database Theory concepts (like Chase) and some simple Computation Theory tricks (including maybe some automata-based arguments). Using this toolbox I constructed reasonings, which were occasionally complicated, but always elementary. It was fun.

But then, in early 2021, someone asked me what would happen to my results concerning Query Determinacy (whatever it is) if I considered a more realistic model, in which database relations, and answers to the queries, are multisets rather than sets. So, if my database knew that Andreas actually has six cats, the answer (multi)set would contain the constant "Andreas" with multiplicity six.

We started to think about it (with my very clever students, Jarosław Kwiecień and Piotr Ostropolski-Nalewaja) and soon we realized that this minor change in the assumptions about the model leads to a major change in the tools we need to study it. This felt to us like entering a completely new world. Suddenly the language and the methods from Linear Algebra appear so naturally that it is fair to say that they impose themselves on the researchers.

Clearly, we were not the first Database Theorists to set foot on this new continent. People have attempted to re-examine old Database Theory results, in the multiset-semantics scenario, for about 30 years now. But such attempts were few. And this is (I guess) exactly because in order to make any progress here one needs to learn totally new tools.

To our surprise we learned that this continent had also been visited before by the Mathematicians. They had constructed a number of structures there, some of them of some use for us. That's a bit frustrating, because they use different terms to call some notions, and it is not always straightforward to understand how to translate their own results to our language.

My talk will be a story by a traveller astonished by what he saw during his first short trip to the little known continent of Multiset-Semantics Database Theory.

To give you a glimpse of the techniques that appear there I will present some results and arguments from our PODS 2022 paper "Determinacy of Real Conjunctive Queries. The Boolean Case" (with Kwiecień and Ostropolski-Nalewaja). But I will also mention the adventures of other travellers to this new land, with particular emphasis on the famous Conjunctive Query Containment Problem.

(fot. public domain, screen from Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb)