Guide for missing suggestions


Last updated 12 months ago


Prerequisites

Check your VPN connection. Then verify the agent configuration and make sure that:

  • textForwarding is enabled for all plugins

  • confidenceSuggestionThreshold is set to 0.1 (the default is 0.6)

  • fileTextEventDebounceMilis is set to 3000 (the default is 15000; lowering it gives faster suggestions)
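
The checks above can be sketched as a small script. This is a minimal sketch, assuming the agent configuration has already been loaded into a dictionary. The layout used here (a top-level `plugins` map with one `textForwarding` flag per plugin) and the function name `check_agent_config` are hypothetical; only the three setting names and their values come from this guide.

```python
# Recommended troubleshooting values from the prerequisites above.
RECOMMENDED = {
    "confidenceSuggestionThreshold": 0.1,
    "fileTextEventDebounceMilis": 3000,
}


def check_agent_config(config):
    """Return a list of human-readable problems found in the config.

    ASSUMPTION: `config` is a dict with the two numeric settings at the top
    level and a hypothetical "plugins" map of
    {plugin_name: {"textForwarding": bool}}.
    """
    problems = []
    # textForwarding must be enabled for every plugin.
    for plugin, settings in config.get("plugins", {}).items():
        if not settings.get("textForwarding", False):
            problems.append(f"textForwarding is disabled for plugin '{plugin}'")
    # Compare the numeric settings against the recommended values.
    for key, expected in RECOMMENDED.items():
        actual = config.get(key)
        if actual != expected:
            problems.append(f"{key} is {actual!r}, expected {expected!r}")
    return problems


if __name__ == "__main__":
    sample = {
        "confidenceSuggestionThreshold": 0.6,
        "fileTextEventDebounceMilis": 3000,
        "plugins": {
            "word": {"textForwarding": True},
            "excel": {"textForwarding": False},
        },
    }
    for problem in check_agent_config(sample):
        print(problem)
```

An empty list means the configuration matches the prerequisites, so the cause of the missing suggestions lies elsewhere.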

Brief architecture overview

When troubleshooting missing suggestions, it helps to understand the sequence of services involved in generating a suggestion. A request passes through the following services, in order:

  • agent: Desktop app. The initial component that handles user requests or queries. The agent communicates with agent-edge through gRPC. (responsible: delta team)

  • agent-edge: A back-end proxy service that serves as an intermediary layer between the agent and the other back-end components. The agent-edge talks to the text-classification-pipeline using Kafka. (responsible: delta team or any Java BE engineer)

  • text-classification-pipeline: A critical part of the process that performs various tasks, including data extraction, transformation, and API calls. The text-classification-pipeline talks to the classifier over plain HTTP. (responsible: OT team or any Java BE engineer)

  • classifier: Responsible for classifying or identifying relevant suggestions based on user input. (responsible: data-science team)

Requests are processed sequentially through these services. If missing suggestions are observed, the root cause of the issue can often be found somewhere along this sequence.
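
For quick reference while troubleshooting, the chain described above can be written down as data. This is only an illustrative sketch: the `Stage` class and the helper names are hypothetical, while the service names, transports, and responsible teams are taken from the overview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    talks_to_next_via: str  # transport to the next stage; "" for the last one
    responsible: str


# Order and transports as described in the overview above.
PIPELINE = [
    Stage("agent", "gRPC", "delta team"),
    Stage("agent-edge", "Kafka", "delta team or any Java BE engineer"),
    Stage("text-classification-pipeline", "HTTP", "OT team or any Java BE engineer"),
    Stage("classifier", "", "data-science team"),
]


def describe_path():
    """Render the request path as a single line, for example for a ticket."""
    parts = []
    for stage in PIPELINE:
        parts.append(stage.name)
        if stage.talks_to_next_via:
            parts.append(f"--{stage.talks_to_next_via}-->")
    return " ".join(parts)


def responsible_for(stage_name):
    """Look up who to inform when a given stage is the suspected culprit."""
    for stage in PIPELINE:
        if stage.name == stage_name:
            return stage.responsible
    raise KeyError(stage_name)


if __name__ == "__main__":
    print(describe_path())
    print(responsible_for("classifier"))
```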

Localizing the Problem

Extracting Kafka Event Identifier

To troubleshoot and localize the problem regarding missing suggestions, we'll begin by inspecting the Kafka event associated with the classification process. Follow these steps to access the relevant Kafka event:

  1. To trigger the classification process, open Microsoft Word and write some sample text.

  2. Connect to the Kafka UI using kubectl: Ensure you have kubectl configured and have access to your Kubernetes cluster. Use the following command to connect to the Kafka UI.

    kubectl port-forward svc/gv-kafka-ui 8080:80 --kubeconfig <path-to-kubeconfig-yaml>

    This command will create a local port forwarding to the Kafka UI.

  3. Navigate through the Kafka topics and find the Text topic: Open your web browser and go to http://localhost:8080. This takes you to the Kafka UI web interface. In the Kafka UI, navigate to the topic where text messages related to the classification process are stored. This topic may have a name related to the process or to content classification. Once you have located the appropriate topic, click on it to view the list of messages stored within.

  4. Locate the message with the Text you wrote: Browse through the list of messages in the Kafka topic and find the message that corresponds to the text you wrote in Microsoft Word. This message will typically contain metadata and content related to the classification process. Within the selected Kafka message, look for an identifier (id field) in the event. This ID is crucial for tracking and troubleshooting the specific event related to the missing suggestions.
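
If the topic holds many messages, step 4 can be sketched in code. This is a minimal sketch, assuming you have copied or exported the messages from the Kafka UI as JSON. The function name `find_event_id` is hypothetical, and the only field it relies on is the `id` field mentioned above. Because this guide does not name the field that holds the text, the search runs over the whole serialized payload.

```python
import json


def find_event_id(messages, sample_text):
    """Return the `id` of the first message whose payload contains sample_text.

    ASSUMPTION: each message is a JSON string (or an already-parsed dict)
    that carries an `id` field. No particular text field name is assumed;
    the whole payload is searched instead.
    """
    for raw in messages:
        event = json.loads(raw) if isinstance(raw, str) else raw
        # Serialize without ASCII escaping so non-English text still matches.
        if sample_text in json.dumps(event, ensure_ascii=False):
            return event.get("id")
    return None


if __name__ == "__main__":
    exported = [
        '{"id": "a1", "text": "unrelated document"}',
        {"id": "b2", "content": {"body": "my sample sentence"}},
    ]
    print(find_event_id(exported, "my sample sentence"))
```

Use a plain sample sentence without quotes or line breaks, since those characters are escaped in the serialized payload and would not match.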

Finalization

Now that you have the ID from the Kafka event associated with the text classification process, you can proceed to find the corresponding event in the "text-classification-results" Kafka topic.

  1. Find the event in the "text-classification-results" Kafka topic: Access the "text-classification-results" Kafka topic, where the processed classification events are stored. Use the ID you obtained from the Kafka event associated with the text you wrote in Microsoft Word to search for the corresponding event. This ID is not a unique identifier for the event, so if several events share it, pick the one closest in time to the original text event.

  2. If the event is NOT found, the root cause of the missing suggestions likely lies within the text-classification-pipeline. Check the flink-taskmanager logs and inform the responsible team.

  3. If the event is found:

    1. If the successCode is a non-zero value, the problem lies within the classifier component. Inform the responsible team.

    2. Otherwise, the classification process handled the text successfully, and the issue is probably further downstream.

  4. Access the agent-edge logs: Search the agent-edge logs using the extracted ID as the search criterion. Once you have located a log entry that corresponds to the extracted ID, examine its content. Look for an entry that resembles the following format:

    [<ID>] Classification received. With key: {}. partition: {}

    1. If no matching entry is found: Wait a minute first, because the classification may not be ready yet. If you still cannot find entries in the agent-edge logs that match the specified format, or if the entries contain errors or anomalies, the issue is likely related to the agent-edge component. Investigate the agent-edge logs further to identify errors that may be affecting the classification process.

    2. If a matching entry is found: If the agent-edge logs contain entries in the specified format and no exceptions, the agent-edge received the classification event successfully and appears to be functioning correctly.

  5. If none of the checks above reveals a problem, the root cause lies within the agent, or it is a more complex configuration issue.
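
The decision steps above can be summarized in a short sketch. It is illustrative only and not part of the product. The function names are hypothetical, and it assumes each result event has been parsed into a dict with `id`, `successCode`, and a `timestamp` value. Only `id` and `successCode` are named in this guide; `timestamp` is an assumption used to pick the closest event by time.

```python
import re
from datetime import datetime


def has_classification_log(log_lines, event_id):
    """True if any agent-edge log line matches "[<ID>] Classification received."."""
    pattern = re.compile(r"\[" + re.escape(event_id) + r"\] Classification received\.")
    return any(pattern.search(line) for line in log_lines)


def closest_result(results, event_id, sent_at):
    """Pick the result with a matching id whose timestamp is closest to sent_at.

    ASSUMPTION: each result is a dict with "id", "successCode" and a
    hypothetical "timestamp" key holding a datetime.
    """
    candidates = [r for r in results if r["id"] == event_id]
    if not candidates:
        return None
    return min(candidates, key=lambda r: abs((r["timestamp"] - sent_at).total_seconds()))


def localize(result, edge_log_found):
    """Map the findings to the component to investigate, following steps 2 to 5."""
    if result is None:
        return "text-classification-pipeline"  # step 2: event not found
    if result["successCode"] != 0:
        return "classifier"  # step 3.1: non-zero successCode
    if not edge_log_found:
        return "agent-edge"  # step 4.1: no matching log entry
    return "agent or configuration"  # step 5


if __name__ == "__main__":
    sent_at = datetime(2024, 1, 1, 12, 0, 0)
    results = [
        {"id": "x", "timestamp": datetime(2024, 1, 1, 11, 0, 0), "successCode": 1},
        {"id": "x", "timestamp": datetime(2024, 1, 1, 12, 0, 5), "successCode": 0},
    ]
    logs = ["[x] Classification received. With key: k. partition: 0"]
    result = closest_result(results, "x", sent_at)
    print(localize(result, has_classification_log(logs, "x")))
```

Before treating a missing log entry as an agent-edge problem, wait a minute and search again, as noted in step 4.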