Pattern matching
The pattern matching functionality allows users to identify particular pieces of information in a document. It is implemented using Regular Expressions (RegEx) that try to match content.
Select Administration and then Pattern Matching:
A selection of pre-configured patterns is available:
If these patterns are detected during a scan, they are presented to the user as “Keyword Hits”. On the Pattern Matching page, users have access to various filters and options:
Filters and options explained:
Search: Enter text here to filter patterns by name
Classification: Filter by classification tags associated with patterns
Compliance: Filter by compliance tags associated with patterns
Distribution: Filter by distribution tags associated with patterns
Categories: Filter by file categories associated with patterns
Subcategories: Filter by file subcategories associated with patterns
Enabled: Filter by patterns that have been enabled or disabled
Published: Filter by patterns that have been published or unpublished
Add New Pattern: Create a custom pattern
Publish: Push changes to the pattern matching system so that they take effect
Clear filters: Remove all previously selected filters
Options for creating a pattern explained:
Pattern Name: identifies the RegEx when it is found by the software
Regular Expression: the sequence to be matched
Enabled: whether the pattern will be searched for by the software
Hide RegEx in UI: obfuscates the regular expression
Tag Overrides: when the RegEx is found these tags will be written to the file
Classifications: security levels
Compliance: regulations that apply to data
Distribution: policies on how data should be distributed
Category: data grouping
Subcategory: data subgrouping
Cancel: exit without saving
Create: save pattern information and exit
RegEx: Regular Expression, a sequence or pattern that is searched for in text. Ex-ID uses Java RegEx notation.
Rules: Instructions for Ex-ID about what to do when a RegEx is detected in a file.
Pattern: The RegEx and rules associated with its detection.
Pattern Name: Used to identify the pattern when it is detected.
Classification: Tags that help secure documents and other files. e.g. Public, Internal, and Confidential.
Compliance: Tags that help organisations conform to certain regulatory regimes. By applying compliance tags such as GDPR/PII to a RegEx such as one for Social Security numbers, organisations can identify all related documents.
Distribution: Tags that specify how files should be moved either within or outside an organisation.
Category: From Getvisibility’s ML model. These are groupings of information based on their use. e.g. Finance, HR, or Technical Documents.
Subcategory: From Getvisibility’s ML model. These are sub-groupings of information based on their particular use. e.g. CV (resume), Code, or Sales Agreement.
Publish: The action of pushing the enabled patterns into use. Because some parts of the system need to be restarted to take on a new pattern matching configuration, users can choose when to apply the configuration so as not to disrupt the workflow of others.
Unpublished: A pattern that has been created or edited but has not yet been pushed to the pattern matching system.
Published: A pattern that is currently part of the pattern matching configuration.
Disabled: A pattern that is currently part of the pattern matching configuration but is not to be detected.
Enabled: An active pattern. One that is part of the configuration and will be used by the pattern matching system.
Hide RegEx: Ex-ID allows for RegEx notations to be obfuscated for security and intellectual property reasons.
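Because Ex-ID uses Java RegEx notation, it can help to try an expression against sample text with the standard java.util.regex classes before entering it in the Regular Expression field. The following is a minimal sketch only: the PatternCheck class, the findHits helper, and the Social Security number expression are assumptions made for illustration. They are not part of Ex-ID and not one of its pre-configured patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for trying out an expression locally; not part of Ex-ID.
class PatternCheck {

    // Illustrative expression for a US Social Security number written as 123-45-6789.
    // This is an assumption for the example, not a pre-configured Ex-ID pattern.
    // The doubled backslashes are Java string-literal escaping only.
    static final String SSN_REGEX = "\\b\\d{3}-\\d{2}-\\d{4}\\b";

    // Returns every substring of the text that the expression matches.
    static List<String> findHits(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(text);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String sample = "Employee SSN: 123-45-6789, extension 4567.";
        // Prints [123-45-6789]; the four-digit extension is not a hit.
        System.out.println(findHits(SSN_REGEX, sample));
    }
}
```

Outside a Java string literal the same expression is written with single backslashes, as \b\d{3}-\d{2}-\d{4}\b. The word boundaries (\b) keep the expression from matching inside longer digit runs, which is one way to reduce false hits during a scan.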