Pattern matching

The pattern matching functionality allows users to identify particular pieces of information in a document. It is implemented using Regular Expressions (RegEx) that are matched against the document's content.
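As a rough illustration of what matching content with a RegEx means, the sketch below uses the standard `java.util.regex` classes, since Ex-ID uses Java RegEx notation. The class name `PatternMatchDemo`, the helper `findHits`, and the Social Security number expression are assumptions made for this example; they are not taken from Ex-ID or from its pre-configured patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: shows Java RegEx matching, not Ex-ID's internal scanner.
final class PatternMatchDemo {
    // Hypothetical expression for a US Social Security number written as NNN-NN-NNNN.
    static final Pattern SSN_LIKE = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Returns every substring of the text that matches the pattern, in order of appearance.
    static List<String> findHits(Pattern pattern, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Employee 123-45-6789 replaced 987-65-4321; ticket 12-345 is unrelated.";
        // Prints [123-45-6789, 987-65-4321]
        System.out.println(findHits(SSN_LIKE, text));
    }
}
```

Note that a backslash is doubled only inside a Java string literal; the expression itself is `\b\d{3}-\d{2}-\d{4}\b`.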

Pattern Matching page

Select Administration and then Pattern Matching:

A selection of pre-configured patterns is available:

If these patterns are detected during a scan, they are presented to the user as “Keyword Hits”. On the Pattern Matching page, users have access to various filters and options:

Filters and options explained:

  • Search: Enter text here to filter patterns by name

  • Classification: Filter by classification tags associated with patterns

  • Compliance: Filter by compliance tags associated with patterns

  • Distribution: Filter by distribution tags associated with patterns

  • Categories: Filter by file categories associated with patterns

  • Subcategories: Filter by file subcategories associated with patterns

  • Enabled: Filter by patterns that have been enabled or disabled

  • Published: Filter by patterns that have been published or unpublished

  • Add New Pattern: Create a custom pattern

  • Publish: Push changes to the pattern matching system so that they take effect

  • Clear filters: Remove all previously selected filters


Create a New Pattern

Options for creating a pattern explained:

  • Pattern Name: identifies the RegEx when it is found by the software

  • Regular Expression: the sequence to be matched

  • Enabled: whether the pattern will be searched for by the software

  • Hide RegEx in UI: obfuscates the regular expression

  • Tag Overrides: when the RegEx is found, these tags will be written to the file

  • Classifications: security levels

  • Compliance: regulations that apply to data

  • Distribution: policies on how data should be distributed

  • Category: data grouping

  • Subcategory: data subgrouping

  • Cancel: exit without saving

  • Create: save pattern information and exit
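To make the form fields concrete, here is a minimal sketch of how a pattern definition could be modelled and checked before saving. Everything in it is hypothetical: the class `PatternDefinition`, its field names, the mask string, and the validation step are assumptions for illustration, and the document does not say whether Ex-ID validates the expression on Create. The tag fields (Classifications, Compliance, Distribution, Category, Subcategory) are omitted for brevity.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical model of the "Create a New Pattern" form; not Ex-ID's actual data structure.
final class PatternDefinition {
    final String name;            // Pattern Name
    final String regex;           // Regular Expression, in Java RegEx notation
    final boolean enabled;        // Enabled
    final boolean hideRegexInUi;  // Hide RegEx in UI

    PatternDefinition(String name, String regex, boolean enabled, boolean hideRegexInUi) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("Pattern Name is required");
        }
        if (regex == null) {
            throw new IllegalArgumentException("Regular Expression is required");
        }
        try {
            // Compiling the expression is a simple way to reject invalid syntax early.
            Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            throw new IllegalArgumentException("Invalid regular expression: " + e.getDescription(), e);
        }
        this.name = name;
        this.regex = regex;
        this.enabled = enabled;
        this.hideRegexInUi = hideRegexInUi;
    }

    // What a UI could show for this pattern: a mask when the RegEx is hidden.
    String displayRegex() {
        return hideRegexInUi ? "********" : regex;
    }
}
```

For example, `new PatternDefinition("US SSN", "\\d{3}-\\d{2}-\\d{4}", true, true).displayRegex()` returns the mask rather than the expression, while an expression such as `(unclosed` is rejected with an `IllegalArgumentException`.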


Glossary of Pattern Matching terms

RegEx: Regular Expression, a sequence or pattern that is searched for in text. Ex-ID uses Java RegEx notation.

Rules: Instructions for Ex-ID about what to do when a RegEx is detected in a file.

Pattern: The RegEx and rules associated with its detection.

Pattern Name: Used to identify the pattern when it is detected.

Classification: Tags that help secure documents and other files, e.g. Public, Internal, and Confidential.

Compliance: Tags that help organisations conform to certain regulatory regimes. By applying compliance tags such as GDPR/PII to a RegEx such as one for Social Security numbers, organisations can identify all related documents.

Distribution: Tags that specify how files should be moved, either within or outside an organisation.

Category: From Getvisibility’s ML model. These are groupings of information based on their use. e.g. Finance, HR, or Technical Documents.

Subcategory: From Getvisibility’s ML model. These are sub-groupings of information based on their particular use. e.g. CV (resume), Code, or Sales Agreement.

Publish: The action of pushing the enabled patterns to be used. As some parts of the system need to be restarted in order to take on a new pattern matching configuration, users can choose when to enact the configuration so as not to impact the workflow of others.

Unpublished: A pattern that has been created or edited but has not yet been pushed to the pattern matching system.

Published: A pattern that is currently part of the pattern matching configuration.

Disabled: A pattern that is currently part of the pattern matching configuration but will not be detected.

Enabled: An active pattern. One that is part of the configuration and will be used by the pattern matching system.

Hide RegEx: Ex-ID allows for RegEx notations to be obfuscated for security and intellectual property reasons.
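The Enabled and Published definitions above can be read as two independent flags on each pattern. The sketch below is one simplified interpretation, assuming a pattern is searched for only when it is both published and enabled. The names `PublishSketch`, `PatternState`, and `detectable` are hypothetical, and the real system may track edits to already-published patterns differently.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified reading of the glossary; not Ex-ID's actual publishing logic.
final class PublishSketch {
    // Hypothetical view of a pattern's two flags.
    static final class PatternState {
        final String name;
        final boolean enabled;    // Enabled vs. Disabled
        final boolean published;  // Published vs. Unpublished

        PatternState(String name, boolean enabled, boolean published) {
            this.name = name;
            this.enabled = enabled;
            this.published = published;
        }
    }

    // Names of the patterns that would be searched for: those both published and enabled.
    static List<String> detectable(List<PatternState> patterns) {
        List<String> names = new ArrayList<>();
        for (PatternState p : patterns) {
            if (p.published && p.enabled) {
                names.add(p.name);
            }
        }
        return names;
    }
}
```

Under this reading, a newly created pattern that is enabled but unpublished is not searched for until Publish is selected, and a published pattern that is disabled stays in the configuration without producing hits.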
