Configuring and Scheduling the Crawler

To set or edit the Crawler configuration and scheduling:

  1. Go to Admin > Applications.
  2. Scroll through the list or use the filter to find the application.
  3. Select the Edit icon on the application row.
  4. Select Next until you reach the Crawler settings page.

    Note

    The entry fields vary by application type.

    Resource Collection Cluster - Select an existing virtual appliance cluster to associate with this application. Select the + to create a new cluster.

  5. In the Calculate Resource Size field, select when Data Access Security calculates the size of resources:

    • Never
    • Always
    • Second crawl and on (default)
  6. Schedule a task.

  7. Set the Crawl Scope by:

Crawler Regex Exclusion Example

The following are examples of crawler Regex exclusions:

Exclude all shares that start with one or more share names:

  • Starting with \\server_name\shareName

    Regex: \\\\server_name\\shareName$

  • Starting with \\server_name\shareName or \\server_name\OtherShareName

    Regex: \\\\server_name\\(shareName|OtherShareName)$
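The two exclusion patterns above can be tried out locally before they are saved. The following is a minimal Python sketch, not part of the product. It assumes that the crawler excludes a path when the pattern matches it, and that Python's `re` module treats these constructs the same way as the crawler's regex engine. The helper name `is_excluded` and the sample share `Finance` are hypothetical.

```python
import re

# Patterns copied from the examples above. Raw strings keep the backslashes intact.
EXCLUDE_ONE = r"\\\\server_name\\shareName$"
EXCLUDE_EITHER = r"\\\\server_name\\(shareName|OtherShareName)$"


def is_excluded(pattern: str, path: str) -> bool:
    """Assumption: a path is excluded when the pattern matches anywhere in it."""
    return re.search(pattern, path) is not None


sample_paths = [
    r"\\server_name\shareName",
    r"\\server_name\OtherShareName",
    r"\\server_name\Finance",  # hypothetical share that should stay in scope
]

for path in sample_paths:
    print(path, is_excluded(EXCLUDE_ONE, path), is_excluded(EXCLUDE_EITHER, path))
```

Because `$` anchors the match at the end of the path, only the share root itself matches in this sketch, not the paths beneath it.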

Include ONLY shares that start with one or more share names:

  • Starting with \\server_name\shareName

    Regex: ^(?!\\\\server_name\\shareName($|\\.*)).*

  • Starting with \\server_name\shareName or \\server_name\OtherShareName

    Regex: ^(?!\\\\server_name\\(shareName|OtherShareName)($|\\.*)).*
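The include-only patterns work by negation: the negative lookahead `(?!...)` makes the pattern match, and therefore exclude, every path that does not start with the listed share. The sketch below illustrates this in Python under the same assumptions as before: a path is excluded when the pattern matches it, and Python's `re` module behaves like the crawler's regex engine for these constructs. The helper `is_excluded` and the sample paths are hypothetical.

```python
import re

# Patterns copied from the examples above.
INCLUDE_ONLY_ONE = r"^(?!\\\\server_name\\shareName($|\\.*)).*"
INCLUDE_ONLY_EITHER = r"^(?!\\\\server_name\\(shareName|OtherShareName)($|\\.*)).*"


def is_excluded(pattern: str, path: str) -> bool:
    """Assumption: a path is excluded when the pattern matches it."""
    return re.search(pattern, path) is not None


sample_paths = [
    r"\\server_name\shareName",          # the share root: kept
    r"\\server_name\shareName\reports",  # a folder under the share: kept
    r"\\server_name\shareName2",         # a different share with a similar name
    r"\\server_name\Finance",            # an unrelated share
]

for path in sample_paths:
    status = "excluded" if is_excluded(INCLUDE_ONLY_ONE, path) else "kept"
    print(f"{path}: {status}")
```

The `($|\\.*)` group is what keeps `shareName2` out of scope: after the share name, the path must either end or continue with a backslash.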

Narrow down the selection:

  • Include ONLY the C$ drive shares: \\server_name\C$

    Regex: ^(?!\\\\server_name\\C\$($|\\.*)).*

  • Include ONLY one folder under a share: \\server_name\share$\folderA

    Regex: ^(?!\\\\server_name\\share\$($|\\folderA$|\\folderA\\.*)).*

  • Include ONLY the administrative drive shares (single-letter shares such as C$ or D$)

    Regex: ^(?!\\\\server_name\\[a-zA-Z]\$($|)).*
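The three narrowing patterns can be checked the same way. This Python sketch uses the patterns exactly as written above and assumes that a path stays in scope when the pattern does not match it, and that Python's `re` module matches these constructs like the crawler's engine. The helper `is_kept` and the sample paths are hypothetical.

```python
import re

# Patterns copied from the examples above.
ONLY_C_DRIVE = r"^(?!\\\\server_name\\C\$($|\\.*)).*"
ONLY_FOLDER_A = r"^(?!\\\\server_name\\share\$($|\\folderA$|\\folderA\\.*)).*"
ONLY_DRIVE_SHARES = r"^(?!\\\\server_name\\[a-zA-Z]\$($|)).*"


def is_kept(pattern: str, path: str) -> bool:
    """Assumption: a path stays in the crawl when the exclusion pattern does not match."""
    return re.search(pattern, path) is None


checks = [
    (ONLY_C_DRIVE, r"\\server_name\C$\Users"),
    (ONLY_C_DRIVE, r"\\server_name\D$"),
    (ONLY_FOLDER_A, r"\\server_name\share$"),          # share root, needed to reach folderA
    (ONLY_FOLDER_A, r"\\server_name\share$\folderA\docs"),
    (ONLY_FOLDER_A, r"\\server_name\share$\folderB"),
    (ONLY_DRIVE_SHARES, r"\\server_name\D$"),
    (ONLY_DRIVE_SHARES, r"\\server_name\Finance"),
]

for pattern, path in checks:
    print(f"{path}: {'kept' if is_kept(pattern, path) else 'excluded'}")
```

In the folder example, the share root is deliberately left in scope by the `$` alternative; only its other children are matched for exclusion.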

Notes

  • To match a literal backslash or $ sign, add a backslash before it as an escape character.

  • To match any one of several alternatives in a single expression, separate them with a pipe character |.
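Escaping by hand is easy to get wrong, so one option is to generate the pattern from the plain share paths. The sketch below is an illustration only: `include_only_pattern` is a hypothetical helper, not a product feature, and it assumes the crawler accepts the same escaping that Python's `re.escape` produces for backslashes and `$`.

```python
import re


def include_only_pattern(*share_paths: str) -> str:
    """Build an include-only pattern shaped like the examples above (hypothetical helper)."""
    # re.escape adds the backslash before each literal backslash and $ sign.
    alternatives = "|".join(re.escape(path) for path in share_paths)
    # The pipe character joins the escaped shares into a single expression.
    return rf"^(?!({alternatives})($|\\.*)).*"


escaped = re.escape(r"\\server_name\C$")
print(escaped)

pattern = include_only_pattern(r"\\server_name\C$", r"\\server_name\shareName")
print(pattern)
```

The generated pattern wraps the alternatives in one extra group compared with the hand-written examples, which does not change what it matches.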

Excluding Top-Level Resources

Use the top-level exclusion screen to select top-level roots to exclude from the crawl. This setting is configured per application.

To exclude top-level resources from the crawl process:

  1. Go to Admin > Applications.
  2. Find the application to configure and select the dropdown list menu on the application line. Select Exclude Top Level Resources to open the configuration panel.
  3. Select the Run Task button to start a short detection scan of the current top-level resources. If the top-level resource list changes in the application while you are on this screen, select the Run Task button again to retrieve the updated structure.
  4. Once the task is triggered, you can view its status in Settings > Task Management > Tasks, if you have access to the task page.
  5. When the task has completed, select Refresh to update the page with the list of top-level resources.
  6. Select the top-level resource list and choose top-level resources to exclude.
  7. Select Save to save the change.
  8. To refresh the list of top-level resources, run the task again. Running the task does not clear the list of top-level resources selected for exclusion.
