Logic partition is a way for Azure Cosmos DB to scale out for your application’s DB operation performance. Each logic partition is a subset of your container (which is an Azure construct to hold data of the same type in DB). Data belonging to one logic partition will be stored in the same physical partition, yet a physical partition may very well be storing multiple logical partitions. In reality, developers only consider designing logical partitions, while the physical partition and the mapping from logical to physical are entirely managed by Azure.
The partition is determined by partition key, which is a property you choose from the properties of the items to be stored. For example, if a container ContactContainer is to store contact information, and each of the contact have properties first name, last name, phone number, email address, and id, all these five properties are candidate partition keys. As long as the property value are serialized to string or numeric value, the property can be used as partition key. Note that char property is not valid partition key.
Partition key is specified when the container is created, and cannot be changed. This means you need to be very careful and think in advance when designing your container, otherwise you’ll have to create a new container before copying all your existing data to it.
How do we choose partition key, then? To quote Azure documentation, a partition key should:
- Be a property that has a value which does not change. If a property is your partition key, you can’t update that property’s value.
- Have a high cardinality. In other words, the property should have a wide range of possible values.
- Spread request unit (RU) consumption and data storage evenly across all logical partitions. This ensures even RU consumption and storage distribution across your physical partitions.
It’s also important to keep in mind the fourth optional rule: it better be a property you use frequently in query.
To understand why these rules matter, it’s helpful to understand how logic partitions work on physical partitions:
When a container is created, it’s hosted on a physical partition, which physically stores all your data and deals with all the incoming requests. Soon, with the data size growing, a physical partition is not going to very performant.
Instead of scaling up (more expensive machines), it makes more sense to scale out using more physical partitions (more machines). When a new physical partition is allocated to you, Cosmos will move about half of the data without splitting a logical partition to the new physical partition. In other words, Cosmos will take data in some logical partitions and leave data in other logical partitions alone. This process is not exposed to the applications or developers. Developers only uses logical partitions as a indicator to Cosmos.
This is why you cannot change your partition key once the container is created: it would fundamentally change how the data are stored physically. If you really need to do it, you migrate your data to a new container, which will be distributed differently on physical partitions from your old container.
This also explains why high cardinality is required: because data of a same partition key value will be stored in one physical partition. If most of your data belong to one logical partition, they will likely reach the upper size or rate limit of a physical partition.
Why even distribution is desired? Well, turns out Cosmos divide your RU quotas evenly to each physical partition. So if one logical logical partition is getting hit a lot more often than others, the physical partition gets overwhelmed.
Lastly, why we prefer partition key to be a property used frequently in query? The answer is that the Cosmos knows exactly where to find the logical partition in the query. Without a partition key, it will have to go through each logical partition to locate the target data.
At this point, my understanding is that artificial UID is perhaps the optimal choice for partition key in almost all scenarios. It never changes, has wide value range, and is evenly distributed. This is effectively treating each item as a logical partition. There is no logical partition number limit, so don’t worry about that.
On a side note, if you are using other partition key together with id, note that id is only unique within its logical partition, not in the whole container.