1 of 100

Public Docs

Getting started

What is DSPM?

Data Security Posture Management (DSPM) is a methodology that allows organisations to locate and classify sensitive data across their data stores, assess the risks associated with this data’s exposure based on its sensitivity, and implement access controls to this data.

Key Components of DSPM Tools

Data Discovery and Classification Automatically identifying and categorising data across an organisation’s data stores, particularly those in the cloud, based on sensitivity and regulatory requirements.
Risk Assessment Evaluating the security risks associated with both structured and unstructured data, considering factors such as exposure, access patterns and potential impact if breached.
Access and Permissions Management Ensuring that access to data is strictly controlled and that permissions are granted based on the principle of least privilege.
Data Protection Recommending security measures such as encryption, data masking and access controls to protect sensitive data.
Monitoring and Alerting Continuously monitoring data access and usage to detect suspicious activities or policy violations, with real-time alerts to facilitate rapid response.

Use DSPM in your company

Mark to create

Choose how to run DSPM

Mark to create

Quick start

Keycloak configuration (each part ends with a whats next) + Log in to an existing account
Set up an integration (configure a data source)
Configure taxonomy → link to a full taxonomy setup

Deployment guide

The deployment guide will help you install, configure, and manage your deployment environment. It covers everything from initial setup to advanced configurations and troubleshooting.

It includes:

Prerequisites for a successful installation.
Step-by-step instructions for installing and upgrading K3S.
Setting up Rancher and Fleet agents with an HTTP proxy.
Guide to installing Synergy/Focus/Enterprise using Helm without Rancher.
Guide for configuring Keycloak.
Troubleshooting guide.

Installation

Understand installation requirements

Getvisibility products use Kubernetes under the hood, and we have very specific hardware requirements. It's crucial to meet the minimum resource requirements defined for containers, as failing to do so can lead to various problems:

Resource Starvation: If a container requests more CPU or memory resources than are actually available on the cluster, it can lead to resource starvation. This means other containers may not get the resources they need to run correctly, causing performance degradation or even crashes.
Throttling: Kubernetes imposes resource limits for containers, and if a container's requested resources exceed its limits, Kubernetes may throttle or terminate the container to prevent it from consuming excessive resources, resulting in performance degradation.
Out of Memory or CrashLoopBackOff Errors: Oversubscribing memory resources can lead to containers running out of memory, causing them to terminate abruptly or enter a constant restart loop, commonly referred to as a "CrashLoopBackOff" error.
Performance Degradation: When requested CPU resources are larger than allocated, it can lead to performance issues as containers compete for CPU time, potentially slowing down critical processes and making the application unresponsive.
Difficult Troubleshooting: Misallocation of resources, whether it's too little or too much, can be challenging to identify and correct. This can lead to extended troubleshooting efforts and downtime as administrators attempt to resolve resource-related issues.

To ensure a stable and efficient Kubernetes deployment of our product, it's essential to accurately configure resource requests and limits for containers based on their actual requirements. This prevents resource-related problems and ensures smooth operation within the Kubernetes cluster.

Configuring Rancher and Fleet agent to run behind an HTTP proxy

How to configure Rancher and Fleet agent.

This is applicable when there is a cluster showing as “unavailable“ after the user configured a proxy on the server.

Replace $PROXY_IP with the IP:PORT of the corporate proxy server and $NODE_IP with the IP or CIDR of the server running Kubernetes.

Run env on the user’s server to determine what is the proxy IP. Ensure that the following line is checked:
Open the file /etc/systemd/system/k3s.service.env and append the following lines:

It is important to use correct IP addresses in the place of placeholders $PROXY_IP and $NODE_IP below.

Restart k3s:

Go to the Rancher dashboard Cluster Management > Clusters and click on Edit Config for the cluster:

a. Go to Advanced Options:

b. Configure the following Agent Environment Variables and press Save:

Remember to use correct IP addresses in the place of placeholders $PROXY_IP and $NODE_IP below.

Run the command:

6. Type letter “i“ to insert text and on the env section, type the following lines:

Example:

Save by pressing Esc and then typing "wq"

Do the same on the fleet-agent by running the command:

Repeat Step 6.
After applying all the changes, wait for the cluster to show as Online on Rancher.

Configure Dashboard

In order for the connectors to support proxy settings, you will need to enable it in the configuration page:

Install Synergy/Focus/Enterprise using Rancher

Make sure you have already configured a license through the License Manager and that the end user has installed K3s and run the registration command as described in K3S installation.

Prerequisites

Please check K3S installationfor installation requirements.

Installation

Go to Rancher dashboard and click on the customer cluster that by now should be Active:

Go to Apps > Charts and install the GetVisibility Essentials Helm chart:

2.1. Click on Enable ElasticSearch:

2.2. Configure the UTC hour (0-23) that backups should be performed at:

Go to Apps > Charts and install the GetVisibility Monitoring Helm chart and Install into Project: Default.

Go to the global menu Continuous Delivery > Clusters and click on Edit config for the cluster:

a. For Synergy: add 2 labels product=synergy environment=prod and press Save.

b. For Focus: add 2 labels product=focus environment=prod and press Save.

c. For Enterprise: add 2 labels product=enterprise environment=prod and press Save.

d. For DSPM with the Agent: add 2 labels product=ultimate environment=prod and press Save.

e. For DSPM without the Agent: add 2 labels product=dspm environment=prod and press Save.

Uploads to Rancher

Rancher manages clusters through its control plane. Managed clusters send data to Rancher's central management servers. This includes "always-on" data, exchanged with Rancher whenever the cluster has Internet access, and "on-demand" data, which should be explicitly requested by GetVisibility Support via the Rancher UI.

Always-On Data sent to Rancher includes:

Cluster Metadata:
- Information about the cluster
- Nodes list and metadata (IP address, hostname, cluster role, etc.)
- K3s version
Health and Monitoring Data:
- CPU and RAM usage on each cluster node
- Current Metrics (via Prometheus)
- Fleet agent heartbeat

On-Demand Data:

Cluster Metadata:
- Resource allocation (which Kubernetes resource runs on which node)
- Current cluster-level Alerts
- Current cluster-level Events

kubectl Commands Output:

Rancher allows running kubectl exec into running containers, but this feature is blocked by our WAF. Support needs SSH access or screen-sharing with the customer to execute these commands.

None of those categories are critical for operation, and access to Rancher can be disabled after deployment.

Upgrade K3s

This document outlines the steps to install and update K3s servers and how to deploy and backup Focus services.

K3s Installation - Client

Please refer to for the installation details.

Troubleshooting

K3s on RHEL/CentOS/Oracle Linux

firewalld/fapolicyd

It is recommended to disable and :

nm-cloud-setup

If enabled, it is required to disable

Networking

K3s uses Flannel to allow pod to pod communication between different hosts, is a lightweight provider of layer 3 network fabric that implements the Kubernetes Container Network Interface (CNI). It is what is commonly referred to as a CNI Plugin.

Flannel supports multiple backends for encapsulating packets. By default K3s uses Virtual Extensible LAN (VXLAN), which runs a Layer 2 network on top of a Layer 3 infrastructure. VXLAN uses in-kernel VXLAN to encapsulate the packets using UDP on port 8472.

During one of our HA setups () we noticed after running tcpdump -leni any -w output.pcap the UDP packets were not arriving at the destination host and we had to change the Flannel backend from VXLAN to host-gw which uses IP routes to pod subnets via node IPs.

To use host-gw backend you need to execute the following steps in all the nodes:

Configuring Rancher and Fleet agent to run behind a HTTP proxy if cluster was previously registered

This article is applicable when there is a cluster showing as “unavailable“ after the user configured a proxy on the server.

If you have a cluster which hasn’t been registered yet (registration command has not been run yet), then refer to .

Replace $PROXY_IP with the IP:PORT of the corporate proxy server and $NODE_IP with the IP or CIDR of the server running Kubernetes.

Estimate hardware capacity needs

A VM or server with the following specifications:

16 x CPU cores (x86_64 processor with speed of 2.2 GHz or more). The CPU must support the instructions SSE4.1 SSE4.2 AVX AVX2 FMA
64GB RAM
700GB Free SSD disk. K3s will be installed in /var/lib/rancher so space should be allocated there. We also need 10-20 GB free space at / and /var.

Administration guide

Customer Support Portal

How to access the Customer Support Portal and submit a ticket

To access and use the Portal, please follow the below steps:

Access the Portal by visiting
If an account has not yet been created (this is usually sent via email upon first contact with ), select the 'Sign Up' option located in the top right corner of the screen.
When the email is received, use the URL provided in the email to set a new password. After setting your password, the 'Login

Pattern matching

The pattern matching functionality allows users to identify particular pieces of information in a document. This is implemented by using Regular Expressions (RegEx) that will try match content.

Pattern Matching page

Select Administration and then Pattern Matching:

A selection of pre-configured patterns available:

If these patterns are detected during a scan they will be presented using the “Keyword Hits” to the user. On the Pattern Matching page, users have access to various Filters and Options:

Import custom TLS certificate

Using GV-Essentials chart

You can configure a custom TLS certificates during gv-essentials chart installation.

Go to TLS Certificate tab, click on Use Custom TLS Certificate and paste in the content of the certificate and private key in PEM format:

Use Self-signed Certificate

If you don’t want to import a custom certificate leave Use Custom TLS Certificate disabled, a self-signed certificate will then be auto generated instead.

Sources

GQL Quick Guide

Basic information on the Getvisibility Query Language

What is GQL

GQL: Query language
Based on: Apache Lucene
Supports: Boolean, term, and range queries
Use: For custom queries without hard coding

Querying:

Choose terms from specific dataset: Files, Trustees, Activity, Management
Apply operations like AND, OR, =, !=, >, <, >=, <= to filter data
Form queries, e.g., flow=classification AND risk>=1.

Examples:

Simple: dataAttributeName=HR
Complex: complianceTag=PII AND dataAttributeName=HR AND (dataAttributeName= Record OR dataAttributeName=Legal) AND (detectorHits="Health Insurance" OR detectorHits="Compliance report")

Aggregation (Analytics):

Use in widgets for counters, charts, maps
Aggregate terms for complex visualisations

Critical & Sensitive Classification Attribute Modification

Where to find the risk calculation rules.

To view the Critical and Sensitive Classification rule configuration from the Dashboard Click on Administration > Detectors > Attributes Detectors

Here the Critical and Sensitive attributes configuration can be viewed.

The Critical & Sensitive rules for Risk calculation can be re-configured by clicking the pencil icon beside the rule.

The Risk rules are based on a GQL query which can changed by clicking on the pencil icon on the right or by importing a JSON file using the “Import from file” function.

Once the edits are saved, the Sensitive and Critical Flags will update automatically. For the Risks to be recalculated, a rescan is needed.

How to Check AI Mesh Version

To check the AI Mesh version from the Dashboard click on Administration > AI Mesh

In the top right of this screen the AI Mesh Version can be seen.

If more information on the AI Mesh is required or if tailoring is needed please contact Support.

Webhooks

How to configure a Webhook.

A webhook is a way for one application to automatically send information to another application when something specific happens. For E.g. getting an instant message when a new email is received. It helps different apps talk to each other in real-time.

In DSPM+, the webhook service makes it possible to subscribe to documents after cataloguing/classification stages. When a document passes Cataloguing or Classification, based on GQL provided in webhook and callback URL is sent to the target system (client system). Similarly in EDC webhook can be used to send information to client system based on activity of the users.

Flow:

Is Customer Data Saved by Getvisibility?

No, file content is never saved. The classification server maintains a registry of file names and their properties but not the content. There is also an anonymization mechanism built into the Classification software that reduces file content to a mathematical number that is used throughout the platform.

More specifically when a Data Source is added to the platform the following occours:

Data source scanned and general meta data is read.
This provides file path and permissions on the files.
The files are then sent to the OCR service to read the content.
The read content is then passed through the AI Mesh.
Through the process, customer data is not stored on disk and is only ever held in memory.
There is no long-term storage of data.

Enterprise setup

Authentication

Single Sign-on (SSO)

The platform supports a wide range of Single Sign-On (SSO) protocols and providers, enabling seamless authentication across various services and applications. Here are the primary SSO protocols and some of the identity providers that Keycloak can integrate with:

SSO Protocols

OpenID Connect (OIDC): A modern, widely adopted protocol based on OAuth 2.0 for client authentication and authorization. It's used by many identity providers for secure and flexible user authentication.

Using Azure AD as Keycloak Identity Provider

You need Azure Admin permission to complete this integration.

Azure app configuration

Create new Azure app

Create a new App registration from selecting support for Multiple organizations when asked.

Find App registration in search.

Click New registration.

Fill in details as shown below.

Give the application a name and write down Application (client) ID as it will be needed later.

Configure a new secret

Next, go to your App Registrations’ Certificates & secrets to create a New client secret. Copy the Value of the secret to somewhere at had as it is needed later in the configuration.

Adding Keycloak IdP

In Keycloak, create a new IdP by selecting Microsoft from the drop down

Populate Client ID (this is Application (client) ID in Azure) and Client Secret (this is Value from Azure) using values obtained in previous steps.

Finally copy Redirect URI from Keycloak and add Redirect ID UI link in Azure App.

Test the functionality

Open up a new Incognito mode in a browser and use

Enable 2FA

Two-factor authentication (2FA) enhances security by requiring users to provide two forms of identification before they are granted access. This method adds a layer of protection to the standard username and password method, making it significantly more challenging for potential intruders to gain unauthorised access.

Implementing 2FA in Keycloak helps organizations bolster their defences against data breaches and unauthorized access, which is crucial for protecting sensitive data in today’s digital landscape.

How to configure it?

Going to the 'Authentication' tab, clicking on the browser

Implementation

TODO

Configuring Taxonomies & Labels

How to define a custom taxonomy

To access the Taxonomy screen click on Policy Centre > Compliance Hub.

Once in the screen the default Tags are visible.

To add a label click on the + on the top left of the list of Tags.

In the pop up enter a name for the new Label and optionally a Tag alias.

The Tag alias is usually used for multi-lingual deployments if not all users speak English.

Integrations

Multi-Language Machine Learning

Access global data security excellence with multi-language machine learning (ML) from GetVisibility

Getvisibility tackles unclassified data protection across multiple languages. We do this using a cutting-edge in-house Data Science team who are forging a global AI-driven solution. Here at Getvisibility we have intro- duced eleven new ML languages, including Arabic, Chinese, Spanish, and more, providing comprehensive multilingual data insights to our customers. Elevate your Data Security Posture Management (DSPM) with tailored AI that breaks language barriers for informed decisions when fortifying an organisation's data security defences.

Google Drive Auto-labelling

Cloud Data classification with Google Drive (GDrive) Auto-labelling

At Getvisibility, we understand the modern challenges of data management. With our leadership in Data Security Posture Management (DSPM), we're transforming the way organisations comprehend, classify, and protect their data.

Getvisibility is a DSPM solution that can conveniently connect, discover, classify, and enable the protection of unstructured data in an organisation's data repositories. Our latest update includes connectors for GDrive, a real-time file storage and synchronisation service that is a product within Google Workspace, which has over 9 million paying organisations. Our GDrive connectors provide an easy setup for file scanning to begin and for insights on an organisations data to be delivered at speed.

Utilising Getvisibility's cutting-edge Machine Learning (ML) classification to label files in GDrive represents a significant step in managing your sensitive data, regardless of its origin. Step into the future of data protection with Getvisibility by applying high precision tailored artificial intelligence (AI) coupled with Google Drive's native file labelling to significantly enhance the security of your Google Drive data, automatically and at scale.

With our Google Drive Auto-Labelling feature, you no longer need to manually tag your files. Let our high precision, bespoke artificial intelligence (AI) mechanisms, integrated with Google Drive's native file labelling, classify and protect every document in your GDrive, automatically and at scale.

Addressing the modern Data Security challenges

Remote working, regulatory compliance, constant pressure of cyber attacks bring forward challenges encompassing interoperability, scalability, and governance. These complications can escalate to severe security breaches, including threats like intellectual property theft, both from internal and external sources. It's essential to counter these data security concerns with a robust DSPM solution. Getvisibility's Tailored & Narrow AI, charged by Large Language Model (LLM), aligns perfectly with distinct business needs for precise data analysis. Our state-of-the-art AI system can:

Minimise data handling costs by pinpointing only essential data to keep.
Provide reports on data at risk.
Seamlessly integrate with DLP platforms.
Automatically tag files.

Benefits tailored for you:

Enhanced Data Security: Every file, irrespective of size, is labelled, solidifying its traceability and protection.
Time-saving Mechanism: Move past the era of manual classification. Entrust our machine learning and witness your files being labelled in no time.

Dive deeper with GetVisibilty's GDrive auto-labelling

Why settle for the ordinary? Experience unmatched efficiency and security with our innovative solution. For a comprehensive understanding of how Getvisibility can redefine your organisation's data security landscape, reach out to us or explore our website.

For More Information:

Scan with Getvisibility

Configure detectors

Configure data sources

Scan Configuration Fields

Description of the fields in the Scan Configuration popup

The below screenshot shows the fields that appear in the Scan Configuration screen.

Please note that not all of these fields are available for all Data Sources.

Name

Set a unique name so that the Data Source is easy to identify.

Credentials

This a dropdown to select the credentials that have already been configured for the Data Source.

Geographic Location

This is to indicate the physical location of the server the data sits on.

Path

This only needs to be defined for a specific location needs to be scanned.

If left blank the entire Data Source will be scanned.

Data Owner

This is the person that is to be the person responsible for the data.

This setting is optional.

If the Data streaming check box is not visible it may be because the license for DDR is not present.

To learn more about getting a license for DDR please reach out to the Getvisibility Enablement Team.

Confluence Cloud

How to configure Atlassian Confluence Cloud connection to scan it.

URLS for whitelisting if a proxy is in place

https://{your-domain}/wiki/api/v2 https://{your-domain}/wiki/rest/api *.atlassian.com *.atlassian.net

Generating an API token

Log in to
Click Create API token
From the dialog that appears, enter a memorable and concise Label for the token and click Create
Click Copy to clipboard, and save it somewhere secure. It isn't possible to view the token after closing the creation dialog

Configuring Confluence Cloud connector in Dashboard

Navigate to Administration -> Data Sources -> Confluence Cloud -> New scan

Enter the details
- Name: Give a name to the scan to identify it later
- Username: The email address for the Atlassian account you used to create the token

Save the configuration
Once the configuration is saved, click on the icon on the right and select Start trustee scan to begin the trustee scanning

The scan results can be viewed under Dashboard -> Access Governance

Click on the icon on the right and select Start file scan to begin the files scanning

The results can be viewed under Dashboard -> Enterprise Search

LDAP

How to configure LDAP connection to gather permissions and access rights for groups, users, and other entities (Trustees) on an LDAP server.

Configuring LDAP connector in Dashboard

Navigate to Administration -> Data Sources -> LDAP -> New scan

Enter the details of the LDAP server to scan
- Name: Give a name to the scan to identify it later
- Username: The user must be an admin level and have access to all the LDAP utilities to be scanned. The username should be entered in the format [email protected]
- Password: Password for the admin user
Save the configuration
Once the configuration is saved, click on the icon on the right and select Start trustee scan to begin scanning

The scan results can be viewed under Dashboard -> Access Governance

SMB

How to configure SMB/CIFS connection for scanning

Configuring SMB connector in Dashboard

Navigate to Administration -> Data Sources -> SMB -> New scan

ChatGPT

How to configure ChatGPT connection for scanning.

API key

Owners can generate an API key in the . Note that the correct Organization must be selected when creating a key, corresponding to the administered workspace. Do not select the owner's personal organization.

Create a new API key:

Exchange Online

This document provides information for about creating a Exchange Connector app, which is required for Focus product to connect to customer's Exchange Online accounts.

Registering an Azure App

Gmail

This document provides information on how to configure Gmail connection for Focus product.

Create OAuth2 Credentials

Create a Project in Google Cloud Console:
- Go to the
- Create a new project or select an existing project
Enable the Gmail:
- In the Google Cloud Console, navigate to the "APIs & Services" > "Library"
- Search for "Gmail API" and click on it
Create OAuth 2.0 Credentials:
- In the Google Cloud Console, navigate to the "APIs & Services" > "Credentials" tab
- Click "Create credentials" and select "Service account"

Delegate domain-wide authority to your service account

From your domain's , go to Main menu menu > Security > Access and data control > API controls

In the Domain wide delegation pane, select Manage Domain Wide Delegation

Click Add new

In the Client ID field, enter the client ID obtained from the service account creation steps above
In the OAuth Scopes field, enter a comma-delimited list of the scopes required for the application

Use the below scopes:
For scanning
- https://www.googleapis.com/auth/admin.directory.user.readonly
- https://www.googleapis.com/auth/gmail.readonly
For tagging

Scanning

Scanning process and statuses

To review Scans and their status go to Data Sources in the Administration drop-down.

The scanning process discovers and analyses files across all configured data sources. It operates in three steps:

1) Discovery

Data Source Permissions

How to find the list of permissions granted for a Data Source

The required permissions for scanning are documented by Data Source.

For more information please review the list here.

To Check the configured permissions for a Data Source Navigate to Administration > Data Sources and click on the hamburger menu.

In the dropdown click permissions.:

The example below shows the permissions for SharePoint Online.

Scan Scheduler

How to set a specific schedule for a scan.

When a Data Source is added to Getvisibility for scanning, the scan begins automatically.

If a rescan is needed this can be configured by clicking on Administration > Data Source > (the Data Source that needs Rescan e.g. One Drive) > Hamburger menu > Rescan Scheduler.

The default configuration is Does Not Repeat.

By clicking the drop-down menu other options can be choosen:

Daily

In this option both the time zone and time of day can be chosen

Weekly

With this option as well as the above configuration a specific or multiple days of the week

Monthly

This gives the option to pick a specific day or days each month to run the rescan.

Types of Scan

The two types of scan are Trustee Scan and File Scan

Trustee Scan

This scan provides the list of Users and Groups on a Data Source

File Scan

This scan provides information about files and folders on a Data Source including structure and metadata.

Once both scans are completed the data is processed and the two sets are combined to show who has access to what files.

Scan Analytics

Scan Analytics shows in-depth information gathered during the scan.

There are two ways to access Scan Analytics, either via the main Analytics Dashboard or via the Data Sources page.

Analytics Dashboards

To access the Analytics Dashboards click on the link on the Getvisibilty homepage.

Supported Languages for ML Classifiers

Listed below are the languages supported by the ML (Machine Learning) classifiers, grouped by language pack.

Rescan Files

When a targeted rescan is needed it is possible to scan individual files or a specific selection.

Reasons for a rescan can include:

Ensuring that recent changes to files are reflected in the UI.

If new patterns have been added to Pattern Matching.

If new rules have been added in Controls Orchestration.

Files can be sent for rescan individually by clicking on the hamburger menu for that file and click on “send to classification pipeline.

There is also an option to reclassify multiple files at once by selecting them using the tickboxes on the left of the screen.

Once the required files are selected the option to rescan appears on the bottom right of the screen.

What is DDR?

A brief description of DDR

Getvisibility's Data Detection and Response (DDR) solution is designed to protect sensitive data by providing near real-time detection and response capabilities. It ensures that data across user environments are constantly monitored and any potential threats are flagged immediately. DDR focuses on data-centric security, ensuring organisations have visibility and control over their critical information assets.

Key Features of DDR:

Real-Time Monitoring: DDR continuously identifies data activities, including access, modification, sharing, deletion, and other activities to identify suspicious and malicious events.
Automated Response: DDR sends instant alerts for quick remediation.
Risk Mitigation: It ensures regulatory compliance with Privacy Compliance standards like GDPR, HIPAA, PCI-DSS, CCPA and other standards.
AI-Powered Insights: DDR leverages proprietary Getvisibility’s AI-mesh models to analyse data context for the best accuracy.
Data Intelligence: It provides dashboards with visibility into sensitive data and risks to your data.

How DDR Works:

Data Analysis: DDR identifies all data across unstructured data environments and then classifies the data based on its content and context.
Risks Analysis: It evaluates user access, permissions, sharing and data location to identify risks related to your data.
Policy Enforcement: DDR applies predefined and custom security policies to protect data based on its classification and sensitivity.
Incident Response:

How to Configure DDR Rules

Create Scan Configuration

To configure DDR rules, follow these steps:

Access the Getvisibility DDR dashboard using your credentials.
Under the DDR tab, select Create Scan Configuration to connect to the data sources to be monitored.

Define Scopes: Specify the data sources that will be connect to.
Verify Configuration: Ensure that at least one data source is successfully connected. A green checkmark will confirm the completion.

Check for Incoming Events

Once the scan configuration is complete:

Go to Administration > Live Events Streaming to view real-time events.

Monitor Event Activity: Filter events by source, user name, action type (create, update, delete), and event type.

Overview Page

The Overview Page provides a comprehensive view of DDR's performance:

Event Statistics: Displays the number of events by source, such as Google Drive, SharePoint, OneDrive, and Box.
Data Source Activity: Visualizes active data sources and the volume of events generated by each.
Event Timeline: Shows when events occurred, helping identify peak activity periods and anomalies.

Open Risks

The Open Risks section highlights detected threats, categorised by risk type:

Public Exposure: Identifies sensitive files accessible to external users via public links.
External Sharing: Detects files shared outside the organisation, potentially exposing sensitive information.
Internal Over-Sharing: Flags data with excessive permissions within the organisation.

For each risk, DDR provides detailed insights, including the file path, user activity, and recommended remediation steps.

Monitoring New Files via DDR Streaming

Getvisibility DDR continuously monitors new files generated through streaming and provides real-time insights

Filter by Streaming: Under Enterprise Search, use the filter scanTrigger=streaming.
View File Details: DDR displays:
1. File Path:

Supported Data Sources

Below is a list of Data Sources that Getvisibility DDR (Streaming) currently supports:

AWS IAM
AWS S3
Azure AD

Configuring a HA K3s cluster

Our K3s HA setup consists of 4 homogeneous nodes (3 master nodes + 1 worker node) and can withstand a single-node failure with a very short failover disruption (between 3 to 6 minutes).

With our HA setup we can achieve a monthly uptime of 99.9% (a maximum of 43m of downtime every month).

Prerequisites

Please refer to K3S installation for the node specs of the product you’ll be installing.

The minimum spec allowed for a HA node is 8 CPUs, 32GB of RAM and 500GB of free SSD disk space. All nodes should also have the same spec and OS.

Networking

Internal

We recommend running the K3s nodes in a 10Gb low latency private network for the maximum security and performance.

K3s needs the following ports to be accessible by all other nodes running in the same cluster:

The ports above should not be publicly exposed as they will open up your cluster to be accessed by anyone. Make sure to always run your nodes behind a firewall/security group/private network that disables external access to the ports mentioned above.

All nodes in the cluster must have:

Domain Name Service (DNS) configured
Network Time Protocol (NTP) configured
Software Update Service - access to a network-based repository for software update packages
Fixed private IPv4 address

External

The following port must be publicly exposed in order to allow users to access Synergy or Focus product:

The user must not access the K3s nodes directly, instead, there should be a load balancer sitting between the end user and all the K3s nodes (master and worker nodes):

The load balancer must operate at Layer 4 of the OSI model and listen for connections on port 443. After the load balancer receives a connection request, it selects a target from the target group (which can be any of the master or worker nodes in the cluster) and then attempt to open a TCP connection to the selected target (node) on port 443.

The load balancer must have health checks enabled which are used to monitor the health of the registered targets (nodes in the cluster) so that the load balancer can send requests to healthy nodes only.

The recommended health check configuration is:

Timeout: 10 seconds
Healthy threshold: 3 consecutive health check successes
Unhealthy threshold: 3 consecutive health check failures
Interval: 30 seconds

Public

Please refer to for the list of urls you need to enable in your corporate proxy in order to connect to our private registries.

Configuring K3s nodes

We need 3 master nodes and at least 1 worker node to run K3s in HA mode.

The nodes must be homogeneous, having the same number of CPUs, RAM and disk space.

1st master node

To get started launch a server node using the cluster-init flag:

Check for your first master node status, it should have the Ready state:

Use the following command to copy the TOKEN that will used to join the other nodes to the cluster:

Don’t also forget to copy the private IP address of the 1st master node which will be used by the other nodes to join the cluster.

2nd master node

SSH into the 2nd server to join it to the cluster:

Replace K3S_TOKEN with the contents of the file /var/lib/rancher/k3s/server/node-token from the 1st master node installation.
Set --node-name to master2
Set --server to the private static IP address of the 1st master node.

Check the node status:

3rd master node

SSH into the 3rd server to join it to the cluster:

Replace K3S_TOKEN with the contents of the file /var/lib/rancher/k3s/server/node-token from the 1st master node installation.
Set --node-name to master3
Set --server to the private static IP address of the 1st master node.

Check the node status:

1st worker node

SSH into the 4th server to join it to the cluster:

Replace K3S_TOKEN with the contents of the file /var/lib/rancher/k3s/server/node-token from the 1st master node installation.
Set --node-name to worker1
Set --server to the private static IP address of the 1st master node.

Joining additional worker nodes

You may create as many additional worker nodes as you want.

SSH into the server to join it to the cluster:

Replace K3S_TOKEN with the contents of the file /var/lib/rancher/k3s/server/node-token from the 1st master node installation.
Update --node-name with your worker node name(Ex: worker2 , worker3 etc..)
Set --server to the private static IP address of the 1st master node.

Check the node status:

Register HA K3s Cluster to Rancher

You may run the registration command that you generated using Rancher UI or through license manager. You should see all master and worker nodes in your cluster through the Machine Pools on the Rancher dashboard:

Install Helm charts

GV Essentials

Go to Apps > Charts and install the GetVisibility Essentials Helm chart:

If you are installing Focus or Enterprise click on Enable ElasticSearch.
Configure the UTC hour (0-23) that backups should be performed at:

Click on High Available and set:
1. MinIO Replicas to 4
2. MinIO Mode to distributed

GV Monitoring

Go to Apps > Charts and install the GetVisibility Monitoring Helm chart and Install into Project: Default.

Click on High Available and set:
1. Prometheus replicas to 2
2. Loki replicas to 2

Configure Fleet labels

Go to the global menu Continuous Delivery > Clusters and click on Edit config for the cluster:

For Synergy: add 3 labels product=synergy environment=prod high_available=true and press Save.
For Focus: add 3 labels product=focus environment=prod high_available=true

Air Gap Installation

Single Node Installation

Install K3s

Make sure you have /usr/local/bin configured in your PATH: export PATH=$PATH:/usr/local/bin). All the commands must be executed as root user.

The commands have been tested on Ubuntu Server 20.04 LTS, SUSE Linux Enterprise Server 15 SP4 and RHEL 8.6.

For RHEL, K3s needs the following package to be installed: k3s-selinux (repo rancher-k3s-common-stable) and its dependencies container-selinux (repo rhel-8-appstream-rhui-rpms) and policycoreutils-python-utils (repo rhel-8-baseos-rhui-rpms). Also, firewalld nm-cloud-setup.service and nm-cloud-setup.timer must be disabled and the server restarted before the installation, for more information.

The steps below you guide you through the air-gap installation of , a lightweight Kubernetes distribution created by Rancher Labs:

Extract the downloaded file: tar -xf gv-platform-$VERSION.tar
Prepare K3s for air-gap installation:

Install K3s:

Wait for the 30s and check if K3s is running with the command: kubectl get pods -A and systemctl status k3s.service

Import Docker images

The steps below will manually deploy the necessary images to the cluster.

Import Docker images locally:

Install Helm charts

The following steps guide you through the installation of the dependencies required by Focus and Synergy.

Replace $VERSION with the version that is present in the bundle that has been downloaded. To check all the charts that have been download run ls charts.

Install Getvisibility Essentials and set the daily UTC backup hour (0-23) for performing backups.

Install Monitoring CRD:

Install Monitoring:

Check all pods are Running with the command: kubectl get pods -A

Install Focus/Synergy Helm Chart

Replace the following variables:

$VERSION with the version that is present in the bundle that has been downloaded
$RESELLER with the reseller code (either getvisibility or forcepoint)
$PRODUCT with the product being installed (synergy

In case if you expirience 404 error for accessing to Keycloak or UI and use 1.26 (default) version of K3s ensure that treafik patch is applied

Install custom artifact bundles

Models and other artifacts, like custom agent versions or custom consul configuration can be shipped inside auto deployable bundles. These bundles are docker images that contain the artifacts to be deployed alongside scripts to deploy them. To create a new bundle or modify an existing one follow this guide first: . The list of all the available bundles is inside the bundles/ directory on the models-ci project on github.

link to an internal confuence

After the model bundle is published, for example images.master.k3s.getvisibility.com/models:company-1.0.1 You’ll have to generate a public link to this image by running the k3s-air-gap Publish ML models GitHub CI task. The task will ask you for the docker image URL.

We are still using the s repo because the bundles were only used to deploy custom models at first.

Once the task is complete you’ll get a public URL to download the artifact on the summary of the task. After that you have to execute the following commands.

Replace the following variables:

$URL with the URL to the model bundle provided by the task
$BUNDLE with the name of the artifact, in this case company-1.0.1

Now you’ll need to execute the artifact deployment job. This job will unpack the artifacts from the docker image into a MinIO bucket inside the on premise cluster and restart any services that use them.

Replace the following variables:

$GV_DEPLOYER_VERSION with the version of the model deployer available under charts/
$BUNDLE_VERSION with the version of the artifact, in this case company-1.0.1

You should be able to verify that everything went alright by locating the ml-model job that was launched. The logs should look like this:

In addition you can enter the different services that consume these artifacts to check if they have been correctly deployed. For example for the models you can open a shell inside the classifier containers and check the /models directory or check the models-data bucket inside MinIO. Both should contain the expected models.

Multiple Node Installation (High Availability)

Prerequisites

Firewall Rules for Internal Communication

We recommend running the K3s nodes in a 10Gb low latency private network for the maximum security and performance.

K3s needs the following ports to be accessible (Inbound and Outbound) by all other nodes running in the same cluster:

All nodes in the cluster must have:

Domain Name Service (DNS) configured
Network Time Protocol (NTP) configured
Fixed private IPv4 address
Globally unique node name (use --node-name when installing K3s in a VM to set a static node name)

Firewall Rules for External Communication

The following port must be publicly exposed in order to allow users to access Synergy or Focus product:

The user must not access the K3s nodes directly, instead, there should be a load balancer sitting between the end user and all the K3s nodes (master and worker nodes):

The load balancer must operate at Layer 4 of the OSI model and listen for connections on port 443. After the load balancer receives a connection request, it selects a target from the target group (which can be any of the master or worker nodes in the cluster) and then attempts to open a TCP connection to the selected target (node) on port 443.

The recommended health check configuration is:

Timeout: 10 seconds
Healthy threshold: 3 consecutive health check successes
Unhealthy threshold: 3 consecutive health check failures
Interval: 30 seconds

VM Count

At least 4 machines are required to provide high availability of the Getvisibility platform. The HA setup supports a single-node failure.

Install K3s

Make sure you have /usr/local/bin configured in your PATH: export PATH=$PATH:/usr/local/bin). All the commands must be executed as root user.

The commands have been tested on Ubuntu Server 20.04 LTS, SUSE Linux Enterprise Server 15 SP4 and RHEL 8.6.

The steps below you guide you through the air-gap installation of , a lightweight Kubernetes distribution created by Rancher Labs:

Create at least 4 VMs with the same specs
Extract the downloaded file: tar -xf gv-platform-$VERSION.tar to all the VMs
Create a local DNS entry private-docker-registry.local across all the nodes resolving to the master1 node:

Prepare the K3s for air-gap installation files:

Update the registries.yaml file across all the nodes.

Install K3s in the 1st master node: To get started launch a server node using the cluster-init flag:

Check for your first master node status, it should have the Ready state:

Use the following command to copy the TOKEN from this node that will be used to join the other nodes to the cluster:

Also, copy the IP address of the 1st master node which will be used by the other nodes to join the cluster.

Install K3s in the 2nd master node:

Run the following command and assign the contents of the file: /var/lib/rancher/k3s/server/node-token from the 1st master node to the K3S_TOKEN variable.

Set --node-name to “master2”

Set --server to the IP address of the 1st master node

Check the node status:

Install K3s in the 3rd master node:

Run the following command and assign the contents of the file: /var/lib/rancher/k3s/server/node-token from the 1st master node to the K3S_TOKEN variable.

Set --node-name to “master3”

Set --server to the IP address of the 1st master node

Check the node status:

Install K3s in the 1st worker node: Use the same approach to install K3s and to connect the worker node to the cluster group. The installation parameter would be different in this case. Run the following command: Set --node-name to “worker1” (where n is the nth number of the worker node)

Check the node status:

Deploy Private Docker Registry and Import Docker images

Extract and Import the Docker images locally to the master1 node

Install gv-private-registry helm chart in the master1 node: Replace $VERSION with the version that is present in the bundle that has been downloaded. To check all the charts that have been download run ls charts.

Tag and push the docker images to the local private docker registry deployed in the master1 node:

Install Helm charts

The following steps guide you through the installation of the dependencies required by Focus and Synergy.

Perform the following steps in the master1 Node

Replace $VERSION with the version that is present in the bundle that has been downloaded. To check all the charts that have been download run ls charts.

Install Getvisibility Essentials and set the daily UTC backup hour (0-23) for performing backups. If you are installing Focus or Enterprise append --set eck-operator.enabled=true to the command in order to enable (BROKEN LINK TO ELASTIC SEARCH)

Install Monitoring CRD:

Install Monitoring:

Check all pods are Running with the command:

Install Focus/Synergy Helm Chart

Replace the following variables:

$VERSION with the version that is present in the bundle that has been downloaded
$RESELLER with the reseller code (either getvisibility or forcepoint)
$PRODUCT with the product being installed (synergy

Install Kube-fledged

Perform the following steps in the master1 node

Install gv-kube-fledged helm chart. Replace $VERSION with the version that is present in the bundle that has been downloaded. To check all the charts that have been download run ls charts.

Create and deploy imagecache.yaml

Install custom artifacts

Models and other artifacts, like custom agent versions or custom consul configuration can be shipped inside auto deployable bundles. The procedure to install custom artifact bundles on an HA cluster is the same as in the single node cluster case. Take a look at the guide for single-node clusters above.

Upgrade

View current values in config file for each chart

Before upgrading each chart, you can check the settings used in the current installation with helm get values <chartname>.
If the current values are different from the defaults, you will need to change the parameters of the helm upgrade command for the chart in question.
For example, if the backup is currently set to run at 2 AM instead of the 1 AM default, change --set backup.hour=1 to --set backup.hour=2

Focus/Synergy/Enterprise Helm Chart

To upgrade Focus/Synergy/Enterprise you must:

Download the new bundle
Import Docker images
Install Focus/Synergy/Enterprise Helm Chart

LINK TO INTERNAL CONFLUENCE

Import Docker images only to the Master1 node
In the case of HA deployment, Recreate and redeploy the imagecache.yaml file : Perform the 2nd Step

GetVisibility Essentials Helm Chart

To upgrade the GV Essential chart you must:

Download the new bundle
Import Docker images
Run the command from Install Getvisibility Essentials under Install Helm charts section

Import Docker images only to the Master1 node
In the case of HA deployment, Recreate and redeploy the imagecache.yaml file : Perform the 2nd Step

Install custom artifacts

Models and other artifacts, like custom agent versions or custom consul configuration can be shipped inside auto deployable bundles. The procedure to upgrade custom artifact bundles is the same as the installation one, take a look at the guides above for single-node and multi-node installations.

DDR Supported Events

A comprehensive list of the supported event types by Data Source for DDR

When DDR (aka streaming) is enabled and events start coming in from the data source there are two types of events:

Informational

Examples would be Read, View, etc.

No actions are taken when these events are detected.

Modification Events:

These are events that alter the file or the file permissions. Examples would include creating a file or user, changing a file name etc.

When these types of events are detected a scan or rescan of the item will occur so that it can be classified.

AWS IAM

Events that Trigger (Re)Scan:

Create Events:

CreateUser - A new user account is created.
CreateGroup - A new user group is created.
CreateRole - A new role is created with specific permissions.

Update Events:

UpdateUser - Modifications are made to an existing user.
UpdateGroup - Changes are made to a group, such as adding or removing members.
UpdateRole - A role is updated with new permissions or settings.
AttachUserPolicy - A policy is attached to a user, modifying access rights.

Delete Events:

DeleteUser - A user account is deleted.
DeleteGroup - A group is deleted along with its associated permissions.
DeleteRole - A role is deleted from IAM.

Other Processed Events:

Informational Events:

ConsoleLogin - A user logs in through the AWS console.
SignInFailure - A login attempt fails.
SignInSuccess - A login attempt is successful.
FederatedLogin - A user logs in via federated authentication.

List of Processed AWS S3 Events

Events that Trigger (Re)Scan:

Create Events:

s3:ObjectCreated: - A new object is uploaded to an S3 bucket.
s3:ObjectCreated:Post – A new object is uploaded to an S3 bucket by an HTTP POST operation.
s3:ObjectCreated:CompleteMultipartUpload – An object was created after a multipart upload operation.
s3:ObjectCreated:Copy – A new object is created by an S3 copy operation.

Update Events:

s3:ObjectRestore:Post – A restore request for an archived object is initiated.
s3:ObjectRestore:Delete – A restore request for an archived object is deleted.
s3:ObjectAcl:Put – Access control settings for an object are updated.
s3:ObjectTagging:Put – Tags for an object are added or modified.

Delete Events:

s3:ObjectRemoved:Delete – An object is deleted from an S3 bucket.
s3:ObjectRemoved:DeleteMarkerCreated – A delete marker is created for an object, marking it as deleted.
s3:LifecycleExpiration:Delete – An object is removed due to lifecycle rules.
s3:LifecycleExpiration:DeleteMarkerCreated – A delete marker is created due to lifecycle rules.

Other Processed Events:

Informational Events:

s3:ReducedRedundancyLostObject - An object stored in Reduced Redundancy Storage is lost.
s3:LifecycleTransition – An object is transitioned to a different storage class based on lifecycle rules.
s3:Replication:OperationFailedReplication – The replication operation for an object failed.
s3:Replication:OperationNotTracked – The replication operation for an object is not tracked.

Azure Blob

Events that Trigger (Re)Scan:

Create Events:

Microsoft.Storage.BlobCreated - A new blob is created or content is updated in a storage container.
Microsoft.Storage.DirectoryCreated - A new directory is created in a storage container.

Update Events:

Microsoft.Storage.BlobRenamed - A blob is renamed within a container.
Microsoft.Storage.DirectoryRenamed - A directory is renamed within a container.

Delete Events:

Microsoft.Storage.BlobDeleted - A blob is deleted from a storage container.
Microsoft.Storage.DirectoryDeleted - A directory is deleted from a storage container.

Other Processed Events:

Microsoft.EventGrid.SubscriptionValidationEvent - A subscription validation event.
Microsoft.Storage.BlobTierChanged - The storage tier of a blob is modified.
GetBlobServiceProperties - Retrieves properties of the Blob service.
GetContainerProperties - Retrieves properties of a storage container.