Azure Blob Streaming Configuration

This document describes how to configure an Azure Blob connection with real-time event monitoring and data streaming.


To enable DDR (Streaming) for an existing Azure Blob scan, follow these steps:

Prerequisites

  1. Existing Azure Blob connection: An Azure Blob scan configuration must already exist.

    • If an Azure Blob scan has not yet been created, follow this guide to create a new Azure Blob scan and ensure the necessary credentials are configured.

Steps to Enable Data Streaming

1. Select an Existing Scan Configuration

  1. Go to the Scan configurations page in the product UI.

  2. Find the existing Azure Blob scan configuration and select Edit Configuration from the options menu.

2. Enable Data Streaming

  1. Within the Edit Azure Blob Scan Configuration page, toggle Data Streaming to ON.

  2. Copy the Webhook URL provided, as you will use it later in the Azure Portal.

3. Configure Azure Event Grid Subscription

  1. In the Azure Portal, navigate to Storage Accounts and open the storage account used by the scan configuration.

  2. In the left-hand menu, select Events and click Create Event Subscription.

  3. In the Create Event Subscription window, fill in the details:

    1. Give it a Name.

    2. Select endpoint type Web Hook.

    3. Click Configure an endpoint.

    4. Paste the Webhook URL copied in step 2 into Subscriber Endpoint and click Confirm Selection.

  4. Go to the Filters tab at the top.

  5. In the Subject Filters section, enter the correct path format for the subscription:

    • Use the following pattern: /blobServices/default/containers/{connectionDetails.ContainerName}/blobs/{connectionDetails.FolderPath}

    • For example, if the container is mycontainer and the folder path is accuracy test/repository1, the path will look like: /blobServices/default/containers/mycontainer/blobs/accuracy test/repository1

    Make sure to replace {connectionDetails.ContainerName} and {connectionDetails.FolderPath} with the actual container name and folder path from the scan configuration.

  6. Click Create to complete the Event Subscription setup.
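If you prefer to script this step, the same Event Grid subscription can be created with the Azure CLI. This is a minimal sketch with placeholder resource names; substitute your own resource group, storage account, container path, and the Webhook URL copied in step 2:

    # Resolve the storage account resource ID (names here are placeholders).
    STORAGE_ID=$(az storage account show \
      --name mystorageaccount \
      --resource-group my-resource-group \
      --query id --output tsv)

    # Create the webhook subscription with the same subject filter as above.
    # Event Grid performs a validation handshake against the endpoint.
    az eventgrid event-subscription create \
      --name gv-blob-streaming \
      --source-resource-id "$STORAGE_ID" \
      --endpoint-type webhook \
      --endpoint "https://<your-webhook-url>" \
      --subject-begins-with "/blobServices/default/containers/mycontainer/blobs/accuracy test/repository1"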

4. Assign Required Azure Permissions

Ensure the following roles are assigned on the Azure Storage Account:

  • EventGrid Data Contributor

  • EventGrid EventSubscription Contributor

  • EventGrid TopicSpaces Publisher

For details on assigning these roles, refer to this documentation.
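If you manage access with the Azure CLI, the same roles can be assigned with az role assignment create. A minimal sketch; the assignee object ID and resource names are placeholders for the identity used by your deployment:

    # Scope the assignments to the storage account (placeholder names).
    STORAGE_ID=$(az storage account show \
      --name mystorageaccount \
      --resource-group my-resource-group \
      --query id --output tsv)

    # Assign each required Event Grid role on the storage account scope.
    for ROLE in "EventGrid Data Contributor" \
                "EventGrid EventSubscription Contributor" \
                "EventGrid TopicSpaces Publisher"; do
      az role assignment create \
        --assignee "<principal-object-id>" \
        --role "$ROLE" \
        --scope "$STORAGE_ID"
    done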

5. Create Azure Event Hub

  1. Navigate to Event Hubs in the Azure Portal and click Create.

  2. In the Create Namespace window, fill in the details:

    1. Give it a Name.

    2. Select your subscription and resource group.

    3. Select a location.

    4. Pricing tier: Standard.

    5. Throughput Units: 1.

  3. Click Review + Create, then Create after validation.

  4. After the namespace is created, click the + Event Hub button.

  5. In the Create Event Hub window, fill in a name, click Review + Create, and then Create after validation. Save the name of the Event Hub created in this step, as it will be used in section 7, step 9 to replace {eventHubName}.

  6. Configure an access policy:

    1. In the Event Hubs namespace window, go to Settings / Shared access policies and click the + Add button.

    2. Fill in the details in the new tab: set LogicAppsListenerPolicy as the name, select the Listen policy, and click Save.

    3. Click the newly created policy, then copy and save the Connection string – primary key. This will be needed in section 7, step 8.
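The namespace, Event Hub, and listener policy can also be created with the Azure CLI. A minimal sketch with placeholder names:

    # Create the namespace (Standard tier, as in the portal steps above).
    az eventhubs namespace create \
      --name gv-streaming-ns \
      --resource-group my-resource-group \
      --location westeurope \
      --sku Standard

    # Create the Event Hub inside the namespace.
    az eventhubs eventhub create \
      --name gv-streaming-hub \
      --namespace-name gv-streaming-ns \
      --resource-group my-resource-group

    # Listen-only shared access policy for the Logic App trigger.
    az eventhubs eventhub authorization-rule create \
      --name LogicAppsListenerPolicy \
      --eventhub-name gv-streaming-hub \
      --namespace-name gv-streaming-ns \
      --resource-group my-resource-group \
      --rights Listen

    # Print the connection string needed for the API connection in section 7.
    az eventhubs eventhub authorization-rule keys list \
      --name LogicAppsListenerPolicy \
      --eventhub-name gv-streaming-hub \
      --namespace-name gv-streaming-ns \
      --resource-group my-resource-group \
      --query primaryConnectionString --output tsv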

6. Configure Azure Storage Diagnostic settings

  1. In the Azure Portal, navigate to Storage Accounts and open your Storage Account.

  2. In the left-hand menu, select Monitoring / Diagnostic settings and click blob.

  3. In the Diagnostic settings window, click the + Add diagnostic setting button.

  4. In the Create Diagnostic setting window, fill in the details:

    1. Give it a Name.

    2. Under Category groups, select allLogs.

    3. Under Destination details, select Stream to an event hub and choose the newly created Event Hub Namespace and Event Hub.

    4. Click Save.
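The diagnostic setting can also be created with the Azure CLI. A minimal sketch with placeholder names; blob diagnostics attach to the blobServices/default sub-resource of the storage account, and the namespace's default RootManageSharedAccessKey rule is used here for the destination:

    # Resolve the blob service resource and the namespace authorization rule.
    STORAGE_ID=$(az storage account show \
      --name mystorageaccount \
      --resource-group my-resource-group \
      --query id --output tsv)
    RULE_ID=$(az eventhubs namespace authorization-rule show \
      --name RootManageSharedAccessKey \
      --namespace-name gv-streaming-ns \
      --resource-group my-resource-group \
      --query id --output tsv)

    # Stream the allLogs category group to the Event Hub.
    az monitor diagnostic-settings create \
      --name gv-blob-diagnostics \
      --resource "$STORAGE_ID/blobServices/default" \
      --event-hub gv-streaming-hub \
      --event-hub-rule "$RULE_ID" \
      --logs '[{"categoryGroup": "allLogs", "enabled": true}]'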

7. Configure Azure Logic Apps

  1. Go to Azure Logic Apps in the Azure Portal and click the Add button.

  2. In the Create Logic App window, select Workflow Service Plan.

  3. In the Create Logic App (Workflow Service Plan) window, fill in the details:

    1. Select your subscription and resource group.

    2. Give the logic app a name.

    3. Select a region.

    4. The pricing plan should be WS1.

    5. In the Monitoring tab, select No for Application Insights.

    6. Click the Review + create button.

  4. Click Create after validation.

  5. In the newly created logic app, go to Workflows / Workflows and click the + Add button.

  6. In the new workflow tab, fill in a name, select State type: Stateful, and click Create.

  7. In the created workflow, go to Developer / Designer and click Add a trigger, then search for "Event hub" and select "When events are available in Event Hub".

  8. Configure the API connection:

    1. Click the trigger, set "Temp" as the Event Hub Name, and then click Change connection.

    2. Click Add New and fill in the details: enter any name for the connection and use the Connection string – primary key saved in section 5, step 6.

    3. On the Change Connection tab, click Details and copy the Name from the connection details. Save this Name, as it will be used in step 9 to replace {connectionName}.

    4. Click Save in the workflow designer window.

  9. In the workflow navigation tab, go to Developer / Code, replace the content with the code below, and click Save:

      {
          "definition": {
              "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
              "actions": {
                  "Filter_Records": {
                      "type": "Query",
                      "inputs": {
                          "from": "@triggerBody()?['ContentData']?['records']",
                          "where": "@and(not(empty(item()?['uri'])),or(contains(item()?['uri'], '{FolderPath}/'),contains(item()?['uri'], '{FolderPath}?')))"
                      },
                      "runAfter": {}
                  },
                  "Condition": {
                      "type": "If",
                      "expression": "@greater(length(body('Filter_Records')), 0)",
                      "actions": {
                          "HTTP-copy": {
                              "type": "Http",
                              "inputs": {
                                  "uri": "{WebhookUrl}",
                                  "method": "POST",
                                  "headers": {
                                      "Content-Type": "application/json"
                                  },
                                  "body": {
                                      "event": "@setProperty(triggerBody(),'ContentData',setProperty(triggerBody()?['ContentData'],'records',body('Filter_Records')))"
                                  }
                              },
                              "runAfter": {}
                          }
                      },
                      "else": {},
                      "runAfter": {
                          "Filter_Records": [
                              "Succeeded"
                          ]
                      }
                  }
              },
              "contentVersion": "1.0.0.0",
              "outputs": {},
              "triggers": {
                  "When_events_are_available_in_Event_Hub": {
                      "type": "ApiConnection",
                      "inputs": {
                          "host": {
                              "connection": {
                                  "referenceName": "{connectionName}"
                              }
                          },
                          "method": "get",
                          "path": "/@{encodeURIComponent('{eventHubName}')}/events/batch/head",
                          "queries": {
                              "contentType": "application/json",
                              "consumerGroupName": "$Default",
                              "maximumEventsCount": 50
                          }
                      },
                      "recurrence": {
                          "interval": 30,
                          "frequency": "Second"
                      },
                      "splitOn": "@triggerBody()"
                  }
              }
          },
          "kind": "Stateful"
      }
      

After pasting the code, replace the placeholders with real values:

  • Replace {FolderPath} with the path to the streaming folder. For example, to receive events from a folder named "StreamingFolder" inside the folder "Personal" of the container "DocumentsShare", the path should be "DocumentsShare/Personal/StreamingFolder".

  • Replace {WebhookUrl} with the webhook URL provided in the application in the scan configuration window.

  • Replace {eventHubName} with the Azure Event Hub name created in section 5.

  • Replace {connectionName} with the connection name saved in section 7, step 8.

Troubleshooting

If you experience any issues with the configuration, ensure that:

  1. The Webhook URL is correct and matches the configuration in Azure.

  2. The required Azure permissions are correctly assigned.

  3. Steps 7.8 and 7.9 were executed properly and all the placeholders were replaced with real values.

  4. You can also check whether the trigger failed by navigating to the Logic App configured in the previous steps, then opening the workflow's Trigger History. If you see any failed triggers, inspect the error details to identify the issue.
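As a quick connectivity check, you can POST to the Webhook URL with curl. The payload below is a placeholder, not the real event format; this only verifies that the endpoint is reachable from your network:

    # Connectivity probe with a dummy JSON body (placeholder payload).
    curl -i -X POST "https://<your-webhook-url>" \
      -H "Content-Type: application/json" \
      -d '{"test": true}'
    # Any HTTP response confirms reachability; a timeout suggests a
    # firewall, DNS, or proxy issue between Azure and the endpoint.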

Next Steps

After configuring the event subscription:

  • Documents may be uploaded to the configured path.

  • The events triggered by these uploads will be processed by the Data Streaming setup, and the results will appear in the Getvisibility dashboard.
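To generate a test event, upload a file into the monitored path, for example with the Azure CLI (account, container, and file names are placeholders):

    # Upload a test document into the monitored container and folder.
    az storage blob upload \
      --account-name mystorageaccount \
      --container-name mycontainer \
      --name "accuracy test/repository1/test-document.docx" \
      --file ./test-document.docx \
      --auth-mode login
    # The workflow trigger polls every 30 seconds, so the result should
    # appear in the Getvisibility dashboard shortly after.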

