Azure Cosmos DB offers features that facilitate data availability and resiliency. However, there is another, equally important aspect of working with distributed data sources: controlling who can access them and how. The access control mechanisms applicable to Azure Cosmos DB are the subject of this article.
From a security standpoint, the recommended approach to access control is based on two principles. The first one, referred to as defense in depth, involves protecting a target workload by using a combination of different security mechanisms, including authentication, authorization, encryption, logging and auditing, and network-level protection. The second one, known as the principle of least privilege, requires that the level of access granted matches precisely the privileges that authorized users and applications need to carry out their tasks.
In regard to Azure Cosmos DB, its first line of defense is established on the network layer. It is important to point out that, unlike other Azure data store offerings such as Azure SQL Database, Cosmos DB does not block any inbound connections on the network level by default. However, you have the option of enabling IP Access Control. Once enabled, it becomes necessary to explicitly allow access based on one or more of the following settings:
- Allow access to Azure portal - this setting controls connectivity on the data plane from Azure portal.
- Allow access to Azure services - this setting controls connectivity from Azure services, such as Azure Functions or Azure Stream Analytics. Note that this is simply an on/off switch (corresponding to the 0.0.0.0 firewall rule), which means that it is not possible to restrict connectivity to arbitrarily chosen Azure services only.
- Add my current IP - this setting controls connectivity on the data plane from the computer which you are using to run the browser window displaying Azure portal.
- IP (single IPv4 or CIDR range) - this setting controls connectivity on the data plane from specific IPv4 addresses or ranges of IPv4 addresses that you designate.
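To illustrate how the last setting behaves, the following minimal Python sketch evaluates a client address against a list of single IPv4 addresses and CIDR ranges. It only models the matching logic and is not how Cosmos DB implements its firewall. The function name `is_allowed` and the sample addresses (taken from ranges reserved for documentation) are assumptions made for this example.

```python
import ipaddress
from typing import Sequence


def is_allowed(client_ip: str, allowed_entries: Sequence[str]) -> bool:
    """Return True if client_ip matches any allowed IPv4 address or CIDR range."""
    address = ipaddress.ip_address(client_ip)
    for entry in allowed_entries:
        # A single address such as "203.0.113.7" is parsed as a /32 network.
        network = ipaddress.ip_network(entry, strict=False)
        if address in network:
            return True
    return False


# Hypothetical allow list: one single address and one CIDR range.
ALLOWED = ["203.0.113.7", "198.51.100.0/24"]

print(is_allowed("198.51.100.42", ALLOWED))  # True: inside the /24 range
print(is_allowed("192.0.2.1", ALLOWED))      # False: matches no entry
```

Any address that does not match an entry is rejected, which mirrors the explicit-allow behavior described above.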
Besides network-level security, you also control access on the data plane by relying on the Cosmos DB data access model. This model is based on the use of keys and is somewhat similar to the access control mechanism used by Azure Storage. Cosmos DB supports two types of keys:
- Master keys grant permissions for administrative purposes. They allow you to delegate privileges to manage the database account, its databases, database users, and user permissions.
- Resource tokens grant data access to individual users based on the permissions assigned to these users for database resources, including collections, documents, stored procedures, triggers, and user-defined functions (UDFs). Configuring these permissions requires knowledge of the value of the master key.
There are two auto-generated master keys per database account, which provide full access on the account level. You can regenerate either of them on demand. Having two keys allows you to periodically change one of them while using the other, ensuring uninterrupted data plane access. Once the first key has been changed, you can start using it instead, which allows you to change the second one without disruption to data plane operations.
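The rotation order described above can be sketched as a small simulation. This is a toy model only: `AccountKeys` and `rotate_both` are hypothetical names that stand in for a database account and for the rotation procedure, and no Azure API is called. The point is to show that the key the application uses stays valid after every step.

```python
import secrets
from typing import List, Tuple


class AccountKeys:
    """Toy stand-in for a database account that holds two master keys."""

    def __init__(self) -> None:
        self.keys = {
            "primary": secrets.token_urlsafe(32),
            "secondary": secrets.token_urlsafe(32),
        }

    def regenerate(self, name: str) -> None:
        # Replaces one key; the other key is left untouched.
        self.keys[name] = secrets.token_urlsafe(32)

    def is_valid(self, key: str) -> bool:
        return key in self.keys.values()


def rotate_both(account: AccountKeys, in_use: str) -> Tuple[str, List[bool]]:
    """Rotate both keys and record whether the application's key was valid after each step."""
    other = "secondary" if in_use == "primary" else "primary"
    current_key = account.keys[in_use]
    checks = []

    account.regenerate(other)            # 1. change the key that is not in use
    checks.append(account.is_valid(current_key))

    current_key = account.keys[other]    # 2. switch the application to the new key
    checks.append(account.is_valid(current_key))

    account.regenerate(in_use)           # 3. change the key that was used before
    checks.append(account.is_valid(current_key))

    return other, checks


now_in_use, checks = rotate_both(AccountKeys(), "primary")
print(now_in_use, checks)
```

Regenerating the key that is currently in use before switching the application would break the sequence, which is why the order of the steps matters.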
In addition to the two account keys that provide full account access, there are also two read-only keys. Note that the read-only keys do not allow you to read permissions set on the resource level.
Resource tokens provide granular access to individual resources. They are generated dynamically in response to POST, GET, or PUT requests targeting these resources. In addition to the resource-level granularity, they also have a limited lifespan. By default, they are valid for one hour, although you can extend their validity to up to five hours. There are two levels of permissions: read-only and full.
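The lifespan rules can be expressed in a few lines of Python. The sketch below is illustrative only and assumes the limits stated above (a one-hour default and a five-hour maximum). The names `token_expiry` and `is_expired` are hypothetical helpers, not part of any Cosmos DB SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_TTL = timedelta(hours=1)  # default validity, as stated in the text
MAX_TTL = timedelta(hours=5)      # maximum validity, as stated in the text


def token_expiry(issued_at: datetime, requested_ttl: Optional[timedelta] = None) -> datetime:
    """Return the expiry time of a token, applying the default and the upper limit."""
    ttl = DEFAULT_TTL if requested_ttl is None else requested_ttl
    if ttl <= timedelta(0) or ttl > MAX_TTL:
        raise ValueError("token lifetime must be positive and at most five hours")
    return issued_at + ttl


def is_expired(expires_at: datetime, now: datetime) -> bool:
    """A token is no longer usable once its expiry time has been reached."""
    return now >= expires_at


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expires = token_expiry(issued)  # falls back to the one-hour default
print(is_expired(expires, issued + timedelta(minutes=30)))  # False
print(is_expired(expires, issued + timedelta(hours=2)))     # True
```

A client holding an expired token has to go back to whatever component issued it and request a new one.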
Typically, the usage of resource tokens involves a middle-tier service that brokers connection requests between users and applications and the Cosmos DB instance. That service knows the value of at least one of the two master keys that provide full access to the target Cosmos DB account. The middle-tier service is responsible for validating the identity of the user or application. Once the identity is validated, the service requests a resource token corresponding to that identity and passes the token to the user or application. At that point, the user or application accesses the target resource directly, sending a hash-based message authentication code (HMAC) along with its request. The hash is calculated based on the resource token and is used by Cosmos DB to prevent request tampering.
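To show why an HMAC protects against request tampering, here is a generic Python sketch that uses only the standard library. It is not the Cosmos DB wire format: the layout of the signed message (verb, resource path, and date joined by newlines), the sample secret, and the function names `sign_request` and `verify_request` are simplifying assumptions made for this illustration.

```python
import hashlib
import hmac


def sign_request(secret: bytes, verb: str, resource_path: str, date: str) -> str:
    """Compute an HMAC-SHA256 over the request elements (hypothetical message layout)."""
    message = "\n".join([verb.lower(), resource_path, date.lower()]).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, verb: str, resource_path: str, date: str, signature: str) -> bool:
    """Recompute the HMAC on the receiving side and compare in constant time."""
    expected = sign_request(secret, verb, resource_path, date)
    return hmac.compare_digest(expected, signature)


# Hypothetical values: in the flow above, the secret would be derived from the token.
secret = b"example-secret-not-a-real-token"
date = "Mon, 01 Jan 2024 12:00:00 GMT"
signature = sign_request(secret, "GET", "dbs/sales/colls/orders", date)

# The untouched request verifies; a request with an altered path does not.
print(verify_request(secret, "GET", "dbs/sales/colls/orders", date, signature))    # True
print(verify_request(secret, "GET", "dbs/sales/colls/invoices", date, signature))  # False
```

Because the signature depends on both the secret and the request elements, anyone who changes the request without knowing the secret cannot produce a matching value.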
All Cosmos DB databases, media attachments, and backups are encrypted. This provides protection at rest and is not configurable. In addition, all communication targeting Cosmos DB is protected in transit by enforcing SSL/TLS 1.2. This applies to both client-to-service communication and service-to-service communication (in the case of geo-replicated deployments).
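Although the service enforces the protocol version on its side, a client can apply the same minimum locally. The following sketch uses Python's standard `ssl` module to build a context that refuses anything older than TLS 1.2. The helper name `make_tls12_context` is an assumption for this example, and the code is not tied to any Cosmos DB SDK.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Create a client-side TLS context that rejects protocol versions below TLS 1.2."""
    context = ssl.create_default_context()  # certificate and hostname checks are enabled
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
print(context.minimum_version)
```

Such a context can be passed to any standard-library HTTPS client that accepts an `ssl.SSLContext`.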
Cosmos DB also provides support for diagnostics logging, which captures all requests to database accounts and their individual resources. Logs can be stored in Azure Storage or redirected to Azure Event Hubs and Log Analytics.
Last, but not least, you can use Role-Based Access Control (RBAC) to delegate management of Cosmos DB instances. Besides Owner, Contributor, and Reader, which are not resource type-specific, the built-in roles applicable in this case include Cosmos DB Account Reader and DocumentDB Account Contributor.
This concludes our overview of Cosmos DB access control mechanisms. In our upcoming articles we will explore how to supplement these mechanisms by using the availability and resiliency features provided by Cosmos DB.