
provider "databricks"
- data “local_sensitive_file” “aad_token_fle”
- data "databricks_current_user" "me"azurerm and databricks. As you can see, there is an additional option specified for key_vault within the azurerm provider and that is purge_soft_delete_on_destroy. Since the infrastructure showcased here was never intended to be used in real production workflows, by leveraging this option we make sure that key_vault content is purged along with its deletion.“null_resource” . “get_aad_token_for_dbx” resource is used for Azure CLI call that retrieves a token including flag — resource 2ff814a6–3304–4ab8–85cb-cd0e6f879c1d, which is Azure’s programmatic ID for Databricks workspace. The output of Azure CLI call is piped to jq command: jq -r .accessToken > %s which outputs token value to %s. The whole CLI call and a pipe is wrapped within the native Terraform format() command, which maps %s to the name of the file that is used for keeping the AAD token. As specified in variables.tf file, it is aad_token.txt.data block that is specified here as ”local_sensitive_file” “aad_token_file”.provider “databricks” block, where we specify its content as a token, while also taking care of redundant file formatting that might have got into the file, depending on our operating system of choice. This trimming is done via the trimsuffix function.terraform plan or terraform apply command. If the resource group is not modified in any way in the message received after running one of those commands, your definition completely matches the definition of an already existing resource group. Otherwise, you can either modify your code or overwrite the definition of the resource group.account_kind and is_hns_enabled in azurerm_storage_account resource must be specified to StorageV2 and true, respectively.resource_group_name and its location, we just take advantage of the fact that those attributes are specified in the state and we just provide a reference to them here.<storage-name>/<container-name> will map to the top level of the container. From within Terraform code, containers can be created using azurerm_storage_data_lake_gen2_filesystem resource.azurerm provider and not the databricks provider — the latter is for interacting with the workspace itself and requires the workspace to be created first. Parameter sku controls whether you’ll be using a standard or premium workspace. The comparison of their functionalities and pricing can be found here: https://www.databricks.com/product/azure-pricing.azurerm provider, since object_id parameter is set to data.azurerm_client_config.current.object_id. In order to create an access policy for a different user, group or service principal, change this value.depends_on option in azurerm_key_vault_access_policy is set to [azurerm_key_vault.kv]. This is done to make sure that the policy is created after the underlying Key Vault has already been provisioned.resource_id and dns_name as a keyvault_metadata in databricks_secret_scope resource. But please remember, that in order for this setup to work, you need to authenticate to the databricks provider in a way that provides access to both Databricks Workspace and Azure Key Vault, as described in the section Providers. 
The Cosmos DB account is created by specifying kind = "GlobalDocumentDB" and setting the appropriate tags on it. The throughput is set within the containers; by changing this value you decide how much computing power will be reserved for your workloads, so it also controls the cost of the resource.

Within the databricks_cluster resource you can also add Spark configuration properties, which can be leveraged for customizing the runtime experience. Here, spark_conf is used to provide the access key to the ADLS2 account. Please notice that the format command is used first, in order to reference azurerm_storage_account.adls2.name within the string, and that the key value itself is then assigned to it. Since this assignment is done by referencing properties of objects created earlier on, the storage key is never exposed in the code itself; it is evaluated at runtime, whenever you run terraform apply.

Installing libraries on the cluster is very easy to do from the script thanks to the databricks_library resource. We just need to specify the source of the library (maven) and its coordinates, along with the appropriate cluster id.

Notice that the secret names end with the suffix authorization-key. This is not done by accident. In fact, the Feature Store library expects that the access keys will have this particular suffix, regardless of what else is in the name. We find this peculiar, especially since you always reference the secret name in the code without this suffix; the Feature Store library simply hides the suffix handling from the user. For more info, please refer to the original docs: https://learn.microsoft.com/en-us/azure/databricks/machine-learning/feature-store/fs-authentication#--authentication-for-looking-up-features-from-online-stores-with-served-mlflow-models. You can also have a look at the Python code for this connection in the notebooks uploaded to the repo along with the Terraform code.

A Databricks access token is provisioned with the databricks_token resource. Once it's provisioned, let's insert it into Key Vault just as we've done for the other credentials.

Notebooks are uploaded to the workspace with the databricks_notebook resource. For dealing with paths, we'll leverage the powerful format function available in Terraform. The source specifies the local path to the file and path is the target path on the Databricks workspace. We can use ${path.module} for getting the filesystem path to where our code sits. For building the path on the Databricks workspace, we can leverage data.databricks_current_user.me.home, which is an attribute of the data block we specified earlier on. By leveraging it, the beginning of the path will be equal to our user's home path on the Databricks workspace.

Finally, the workflow is defined with the databricks_job resource. Basically, it's the same setup as what was done for the interactive cluster, all within one resource. The only real difference in the code is the addition of the notebook_task, which specifies the path to the notebook containing the Python code that will be run within the workflow. Once the Terraform code is successfully run, the task specification for this workflow can be inspected in the Databricks UI.
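For orientation, a single-task job along these lines could be sketched as follows, assuming the multi-task job syntax of the Databricks Terraform provider. The job name, cluster sizing and the databricks_notebook label "this" are illustrative placeholders rather than the repository's actual values.

```hcl
# Sketch only: a job that runs the uploaded notebook on its own job cluster.
resource "databricks_job" "feature_store_job" {
  name = "feature-store-workflow" # placeholder job name

  task {
    task_key = "run_notebook"

    new_cluster {
      spark_version = "13.3.x-scala2.12" # example LTS runtime
      node_type_id  = "Standard_DS3_v2"  # example Azure node type
      num_workers   = 1
    }

    notebook_task {
      # Assumes the notebook was uploaded via a databricks_notebook resource labelled "this"
      notebook_path = databricks_notebook.this.path
    }
  }
}
```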
To preview the changes, run terraform plan. To provision the infrastructure, run terraform apply. Its output looks much like the one from terraform plan; however, you're also now asked whether you want to go forward with the deployment. If you are, type yes, press enter, and wait a bit for Terraform to do its job and provision all of the resources.
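In short, the deployment boils down to the two standard commands already mentioned above:

```
terraform plan   # preview the execution plan
terraform apply  # provision the resources; confirm with "yes" when prompted
```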
And that's it: a Databricks Feature Store deployed on Azure using Terraform as IaC.