The Power BI data model is one of the most important aspects of a good Power BI environment. The data model is the way Power BI can navigate to achieve a good result. Creating a good and efficient data model will bring benefits in terms of performance, DAX encoding, and bug fixing.
A right or wrong data model is difficult to define. A data model is highly dependent on the quality of the data input. This can vary from company to company and the result is that there is no one hard truth about a correct data model.
To set up a Power BI data model, it is important to make well-considered choices. To support you in making one of these choices, this article will explain, using understandable examples, why you should always avoid a many-to-many relationship.
What is a many-to-many relationship (*: *)
Many-to-many means that in the many column (fact table) a value occurs more than once. In the one column (often the dimension) the value also occurs more than once. Although this form of relationship cannot always be prevented, it is recommended to avoid it where possible. In this form of relationship, Power BI will compare everything with everything and therefore multiply the number of queries in the background. With many-to-many comparisons, this relationship can quickly cause performance issues.

Performance Analyzer and Dax Studio
To answer the main question of this article ‘What does a many-to-many relationship do underwater?’ it is necessary to first get acquainted with the external application DAX Studio.
DAX Studio is a tool for writing, executing, and analysing DAX queries. In addition, it has the option to analyse Power BI metadata using SQL select statements. DAX studio is friendly to use in both large and small Power BI environments. It provides insight and supports the development process and is therefore a must have for every Power BI developer.
One of the many options of DAX studio is ‘traces’. These are mechanisms for analysing DAX queries. The ‘tracings’ consist of three possibilities:
All queries:
The All-Queries trace provides an overview of all rotated queries, not just queries sent from DAX Studio, such as the Query Plan and Server Timings functions.
Query Plan:
Query Plan traces the SSAS Tabular server. At the time of writing, this trace returns a difficult to read format. It requires special expertise to understand this tracing and therefore it is not relevant to this article. Microsoft does indicate that updates are coming for the readability and comprehensibility of this trace. Hopefully, this tracing will add more in the future.
Server Timings:
Server Timings is important for our analysis. This option provides the necessary overview of the number of queries underwater, as well as the elapsed time and CPU percentage.
Performance analyzer:
To be able to use Server Timings, you first run the Performance Analyzer in Power BI Desktop. The Performance Analyzer returns the DAX query that Power BI uses when executing the job. Please note that this DAX query is different from a DAX measure.

Case
To provide a good insight into the performance of a many-to-many relationship, it is best to compare this with the performance of the one-to-many relationship.
One-to-many means that in the many column (fact table) a value can occur several times. In the one-column (often the dimension), the value occurs only once. This is the most common relationship and will provide the best performance. After all, Power BI only needs to look up and match one value. In principle, this relationship is always advised.
To illustrate the explanation below, two semantic models have been created. Both are identical to each other, with the only difference being the relationship of cardinality from one-to-many to many-to-many.
The data is a simple dimension table with category {A, B, C} and a fact table with multiple recurring values {A, B, C} plus an amount column.

In addition, there is a simple table visualization where amount is broken down by category.

What does a many-to-many relationship do?
The process of gaining insight into the many-to-many relationship starts with running the Performance Analyzer. After connecting to DAX-Studio, the Server Timing can be run with the query from the Performance Analyzer.
Although the Semantic Models are identical, the Server Timing for the one-to-many relationship only runs one scan query, while the many-to-many relationship returns three scan queries. This difference can be explained in the only difference between the semantic models, the cardinality.

One-to-many
In the case of the one-to-many relationship, Power BI returns a single SQL select statement. It selects the Dimension [Category], because this is indicated in the visual and thus counts as visual context. In addition, it selects the sum of the Fact Table [quantity], because this is determined in the corresponding measure. To allocate the sum over the corresponding categories, a ‘left outer join’ is used, in which the corresponding category from the Fact Table [Category] is linked to the Dimension [Category]. By using a ‘left outer join’, null values are also visible.
In the case of the one-to-many relationship, Power BI generated a single query, copying only the Dimension [Category] in memory.

Many-to-many
The many-to-many relationship gives a different picture. In the case of the many-to-many relationship, Power BI is instructed to allocate everything to everything.
It starts by copying the entire dimension table into memory. After all, the assignment is all-on-everything. The second query is a working memory copy of the fact table where it applies a ‘where’ statement to Fact Table [Category] with all values from the dimension table as a filter context. It then determined the fact table based on the values from the dimension table. As a final query, it determines the sum.
From the above analysis it can be concluded that a many-to-many relationship need to be avoided at all times, because a many-to-many also takes the fact table into account. In our example, we have very little data, and this will not cause any problems, but in a large data model with many-to-many relationships, in addition to the dimension table, the fact table is also copied to memory. This must be prevented at all times for the same reason that no iteration function is put on a fact table.

The above analysis also explains why the ALL function does not always work in a many-to-many relationship.
All function does not work with many-to-many relationship.
Suppose the following calculation is used:
CALCULATE (SUM(‘FACT'[Amount]), ALL (‘FACT’))
One-to-many
In the case of the one-to-many relationship, Server Timing will show an additional query step. The complete copy of the dimension table, as it also appears in the many-to-many relationship in the previous example. This makes sense because many-to-many basically equals the ‘ALL’ function.

Many-to-many
In the case of many-to-many, nothing changes, except that the calculation does not work in this form. The query, where the table dimension is copied entirely into working memory, already exists. After all, the query in question is necessary to realize the follow-up query, the copy of the table of facts. Both queries are necessary to ensure the many-to-many relationship. So, Power BI cannot rerun this query in the case of a many-to-many relationship.

If the ALL functionality is placed on the dimension table instead of the fact table, then the many-to-many relationship will first execute the three queries to realize the many-to-many relationship and then remove the filters. As a result, it adds one more query, by copying the table dimension completely.

Conclusion
Although both semantic models are identical, a small difference such as relationship cardinality can already have a significant impact. In the example above, the one-to-many relationship only runs one or two queries, while the many-to-many relationship already runs three to four queries. The larger the dataset and the associated volume, the greater the impact and negative consequences. Always try to avoid a many-to-many relationship!

