GQL
About
GQL (Getvisibility Query Language) is a query language designed to enhance the flexibility and efficiency of querying data through the DSPM+, DDC, and EDC platforms. It enables users to craft custom queries without the need for hard coding, significantly simplifying the process of filtering through and analysing data. Based on Apache Lucene query language, GQL supports boolean, term, and range queries. This flexibility allows the language to seamlessly integrate with the platform’s Analytics software to produce elegant and insightful visualisations. Once mastered, GQL offers maximum flexibility, enabling both broad and precise data analysis. This adaptability ensures that users can leverage the full potential of the DSPM solution, whether they're conducting comprehensive overviews or detailed investigations.
Usage
Terms
There are separate sets of terms used for the different datasets within the DSPM+, DDC, and EDC platforms. Each of the datasets allow for unique GQL terms relating to this data:
Files: Unstructured data discovered and classified on-prem and in the cloud file storage locations. GQL term examples:
path, ingestedAt, flow
Trustees: Users and groups that are discovered in on-prem and in cloud IAM systems. GQL term examples:
type, isAdmin, outdatedPassword
Activity: User activities tracked by the endpoint classification platform. GQL term examples:
recipients, operation, agentId
Management: Administrative data from individual classification endpoints. GQL term examples:
lastSeen, status, os
For the full sets of terms, see tables below.
Operations
Operations are performed on or between terms to help filter data. The available operations are:
AND
Combines queries to match items meeting all conditionsOR
Matches items meeting any listed conditions()
Groups queries to clarify operation order=
Equal to!=
Not equal to>
Greater than<
Less than>=
Greater than or equal to<=
Less than or equal toEXISTS
NOT_EXISTS
Formation
Queries are formed using terms, their values, and operations. They can be as simple as a query looking for High Risk HR Data:
To complex queries specifying Health, Safety, and Compliance Documents as a data asset in DSPM:
The UI will give suggestions as you type to help out.
You should experiment with GQL queries across various platform interfaces. See what works and what doesn't. Get creative and let the real-time suggestions assist you. Remember, you can save the queries you create as bookmarks for future use.
Select the star
Enter a description, select Accept
The bookmark is saved
Scroll down to see all of your saved bookmarks
Dates
Queries can be created that incorporate dates. These can include exact dates and times or ranges. Date types include: createdAt
, lastModifiedAt
, and ingestedAt
.
GQL will provide suggestions for common time intervals such as minutes, days, months, and years.
Once a date type has been selected and an operation associated with it, a date interface will be presented to the user. Simply search for and select the appropriate date to create the query.
Date ranges
If a specific range of dates are needed, for example, all files created in May 2022, the following method should be used.
This method will search for files whose creation dates are greater than or equal to midnight on the 1st May 2022 and less than midnight on the 1st of June 2022.
Type
createdAt>=
and select the first date
Select
AND
Type
createdAt<
and select the closing date
Hit enter or the search icon and the query will the filter the results
This method can be used with any date data type. It can be as granular as seconds or as broad as years.
Aggregation
When creating or editing widgets such as counters, charts, or maps in the Analytics boards you will have the ability to aggregate some of the terms in the datasets. For example: you can use counts to show critical shared files, group by file type when displaying classification results, or use multiple groupings to create more complex visualisations.
While not strictly part of GQL yet, they are useful to know as it will help in constructing more descriptive visualisations.
GQL glossaries
GQL Term: Used in the query
Label: Displayed in the interface
Type: Data type of the term
Aggregation: Grouping types that are available to that term, only in the Analytics boards
Files Dataset
OUTDATED - new columns added
Unstructured data discovered and classified on-prem and in the cloud file storage locations.
fileId
id
STRING
The internal Id of the document
fileType
File Type
STRING
The type of the document
Can be grouped
path
Path
STRING
The path of the document
modelVersion
Version
STRING
contentLength
Content length
BYTES
The size of the document in bytes
count,
sum,
average,
min, max,
median,
Can be grouped
risk
Risk
NUMBER
The document risk factor. low=0,
medium=1,
high=2
category
Category
STRING
The ML category of the document
Can be grouped
categoryConfidence
Category confidence
DOUBLE
The ML category confidence of the document
subCategory
Sub category
STRING
The ML sub category of the document
subCategoryConfidence
Sub category confidence
DOUBLE
The ML sub category confidence of the document
source
Source
STRING
The source of the document
Can be grouped
createdAt
Created at
DATE
The document creation date
min, max;
Can be grouped
lastModifiedAt
Last modified at
DATE
The document last modified date
min, max;
Can be grouped
ingestedAt
Ingested at
DATE
Data document passed through the ML pipeline
min, max;
Can be grouped
flow
Flow
STRING
The document current flow stage in the ML pipeline. Classified, Catalogued, etc…
Can be grouped
classification
Classification
STRING
The ML classification of the document
Can be grouped
classificationConfidence
Classification confidence
DOUBLE
The ML classification confidence of the document
configurationIds
Configuration Id
STRING
The scan configuration id of the document
connectorId
Connector name
STRING
Name of the scan connector
Can be grouped
classifierResult
Classifier result
NUMBER
The classifier result of the document
pii
Pii
BOOLEAN
The document Pii flag
piiConfidence
Pii confidence
DOUBLE
The Pii confidence of the document
pi
Pi
BOOLEAN
The document Pi flag
piConfidence
Pi confidence
DOUBLE
The Pi confidence of the document
sensitive
Sensitive
BOOLEAN
The document sensitive flag
manual
Manual Classification
BOOLEAN
The flag for manually classified files
critical
Critical
BOOLEAN
The document critical flag
modifiedAtMilli
Last modified date milliseconds
DATE
The document last modified date in milliseconds
createdAtMilli
Created date milliseconds
DATE
The document created date in milliseconds
md5
Document hash
STRING
The hash value of the document
Can be grouped
keywordHits
Keyword Hits
STRING
The keyword hits of the document
Can be grouped
trusteeName
Trustee Name
STRING
The name of an owner of the document
Can be grouped
trusteeLoginName
Trustee Login Name
STRING
The login name of the owner of the document
signatureConfidence
Signature Confidence
DOUBLE
The signature confidence of the document
dataAttributeName
Data Attribute Name
STRING
The data attribute or ML Model hits of the document
Can be grouped
distributionTag
Distribution Tag Name
STRING
The distribution tag of the document
Can be grouped
keyword
Keyword
STRING
Keyword of the document
Can be grouped
complianceTag
Compliance Tag
STRING
Compliance Tag of the document
Can be grouped
location
Location
STRING
To get Documents by connection location
Can be grouped
language
Language
STRING
The document language
Can be grouped
externalSharedLink
External Shared Link
BOOLEAN
The document sharing status
Can be grouped
sourceSpecificLabelsA ttributes
Source Specific Labels
STRING
The document source specific labels
ownerId
Owner
Identifier
STRING
The document owner identifier
Can be grouped
Trustees dataset
Users and groups that are discovered in on-prem and in cloud IAM systems
type
type
STRING
User/Group
Can be grouped
source
source
STRING
The type of the connector
Can be grouped
name
name
STRING
Login name of the trustee
Can be grouped
displayName
displayName
STRING
Name of the trustee
Can be grouped
isEnabled
isEnabled
BOOLEAN
if the trustee is enabled
isAdmin
isAdmin
BOOLEAN
if trustee is an admin
outdatedPassword
outdatedPassword
BOOLEAN
The trustee has outdated password
lastLoginAt
lastLoginAt
DATE
The last time trustee logged in
min, max,
median, average
Can be grouped,
lastModifiedAt
lastModifiedAt
DATE
The last time trustee was modified
min, max,
median, average
createdAt
createdAt
DATE
The time trustee was created
min, max, median, average
connectorId
connectorId
STRING
Configuration Id of the trustee
isActive
isActive
BOOLEAN
if trustee is active
Activity dataset
OUTDATED - new columns added
User activities tracked by the endpoint classification platform
recipients
Email Recipients
STRING
The recipients of the email
senderEmail
Email Sender
STRING
The sender of the email
operation
Operation Type
STRING
The type of the operation performed
Can be grouped
eventTime
Event Time
DATE
The time when the event occurred
min, max
Can be grouped
ipAddress
IP Address
STRING
The IP address of the machine where the activity was performed
Can be grouped
hostName
Host Name
STRING
The identification of the agent who performed the activity
Can be grouped
department
Department
STRING
The department of the user who performed the activity
Can be grouped
agentId
Agent
STRING
Unique identifier of the machine
user
User
STRING
The username of the individual who performed the activity
Can be grouped
contentLength
File Size
BYTES
The size of the file involved in the activity
sum, average, min, max, median
Can be grouped
mimeType
File Type
STRING
The MIME type of the file
Can be grouped
fileName
File Name
STRING
The name of the file
Can be grouped
creationTime
Created At
DATE
The time when the file involved in the activity was created
min, max
Can be grouped
lastModificationTime
Last Modified At
DATE
The last time the file involved in the activity was changed
min, max
Can be grouped
tags
Tags
STRING
Classification tags
Can be grouped
Management dataset
Administrative data from individual classification endpoints
lastSeen
Last Seen
DATE
The last time the device was observed to be online
min, max
Can be grouped
hostName
Host Name
STRING
The identification of the agent who performed the activity
Can be grouped
domain
Domain
STRING
Shows the Active Directory domain name, if applicable
Can be grouped
ipAddress
IP Address
STRING
Shows the IP address last recorded when the device was active
Can be grouped
status
Online Status
STRING
Shows whether the device is currently online or offline
user
User Name
STRING
Displays the name of the last user who logged into the device
Can be grouped
version
Agent Version
STRING
The version of the agent software currently installed on the device
Can be grouped
os
OS
STRING
Indicates the operating system of the device, either Windows or Mac
Can be grouped
deviceId
Device ID
STRING
Displays the ID of the device
department
Department
STRING
Displays the department the agent belongs to
Can be grouped
Last updated
Was this helpful?