What to do if Events are not Collected

NetApp 7-mode

  1. In the relevant vfiler context on the NetApp, run the following command, specifying whitebox_cifs_policy or whitebox_nfs_policy depending on the application type:

    fpolicy show <policy_name>

  2. Verify that the Activity Monitor server is connected as an FPolicy server.
  3. If the FPolicy server is registered, simulate some activity, run the command again, and look at the counters at the end of the command output. They should increase.
  4. If they don’t increase, there might be something wrong with the definition of the included volumes. If the name of the included volume is wrong, no events will be sent by NetApp.
  5. If the Activity Monitor is not registered as an FPolicy server, stop the Activity Monitor service, wait 60 seconds, and start the service again.

    In some cases, it takes a while for NetApp to de-register the FPolicy server after an error.

  6. Run the command again and make sure the FPolicy server is registered.
  7. If the FPolicy server is not registered, verify the following:

    • The Activity Monitor service is running with a domain user who is a local administrator on the server running the Activity Monitor
    • The user running the Activity Monitor service is a member of the ‘Backup Operators’ local group on the filer/vfiler
    • The activity monitor server is in the same domain as the server running the Activity Monitor service
    • The clock of the server running the Activity Monitor and the NetApp clock are accurate to within 5 minutes. A larger difference might cause the RPC Kerberos authentication process to fail
    • There is no firewall between the NetApp and the server running the Activity Monitor, and the Windows Firewall is off on the server running the Activity Monitor
  8. If all the prerequisites are met, look for errors in the Activity Monitor that indicate it cannot connect to the FPolicy server, and look for messages in the NetApp log that indicate the FPolicy server is trying to register and failing, or is disconnecting after a while.
  9. If there are authentication failures to the ONTAP API in the Activity Monitor or Permissions Collector logs:

    • Make sure all the prerequisites listed in the Permissions section were configured correctly
    • If the Activity Monitor seems to connect successfully to the NetApp, but disconnects a few seconds later, check whether the NetApp filer and the Activity Monitor server use the same SMB version.

      SMB1 is no longer supported on most modern systems, so the SMB1 methods described below may not work on every server. Use SMB2 or higher when possible.

      • Run the following command on the NetApp side to see if SMB2 is enabled:

        options cifs.smb2.enable
      • If it is not enabled, run the following command to enable it:

        options cifs.smb2.enable on
      • If for some internal reason you cannot or are not allowed to enable SMB2, use one of the following methods to enable SMB1 on the Windows side:

        • On Windows Server 2012 up to 2016, you can use the following PowerShell command:

          Set-SmbServerConfiguration -EnableSMB1Protocol $true
        • On Windows Server 2016 or lower, you can check the registry value SMB1 under HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters. If it exists and is set to 0, SMB1 is disabled. To enable it, set the value to 1.
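
The five-minute clock tolerance listed under step 7 can be checked with a short script once both clock readings are in hand. The following is a minimal sketch, not part of the product: the function name clock_skew_ok and the sample timestamps are hypothetical, and it assumes both readings have already been converted to the same time zone.

```python
from datetime import datetime, timedelta

# Tolerance from the prerequisite in step 7: clocks within 5 minutes.
MAX_SKEW = timedelta(minutes=5)


def clock_skew_ok(server_time: datetime, filer_time: datetime,
                  max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True when the two readings differ by no more than max_skew."""
    return abs(server_time - filer_time) <= max_skew


if __name__ == "__main__":
    # Hypothetical readings in the same time zone; replace with real values
    # taken from the Activity Monitor server and from the NetApp.
    server = datetime(2024, 1, 15, 10, 0, 0)
    filer = datetime(2024, 1, 15, 10, 3, 30)
    print(clock_skew_ok(server, filer))
```

A result of False suggests correcting time synchronization on one side before investigating other causes of Kerberos authentication failures.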

NetApp Cluster Mode

If not all events are collected, perform the following:

  1. Run the command:

    fpolicy show-engine
  2. Locate the line which represents the FPolicy engine for the Vserver you are analyzing and verify that the IP address of the FPolicy server matches the IP address of the server where the Activity Monitor is installed and that the Server Status is connected.
  3. If the Server Status is disconnected, run the following command:

    fpolicy show-engine -node <node-name> -instance

    This will indicate the reason for the disconnection.

  4. If the disconnect reason is TCP failure, make sure the port configured in the Application configuration matches the port configured in the FPolicy configuration, and that the IP address of the external-engine configuration is the same as the IP address of the server running the Activity Monitor.
  5. Verify that there is no firewall between the Activity Monitor server and the cluster nodes and that the Windows Firewall is off on the Activity Monitor server.

    The firewall should only be off during troubleshooting. Turning off the firewall should not be a permanent solution. Refer to Physical Filer 7-Mode Communications Requirements for more information.

  6. If you see Authentication Failures to the ONTAP API in the Activity Monitor or Permissions Collector logs, check for the following:

    1. All the prerequisites in the Permissions section were configured correctly.
    2. The letter case of the domain configured in the application matches the domain value configured for the user in the Permissions section.
    3. The username configured in the Permissions section is the same as the username in Active Directory and as the user defined in the Application configuration.
  7. Make sure the NetApp internal firewall is not blocking communications with the Activity Monitor. If it is, run the following commands to allow communication with the Activity Monitor:

    system services firewall policy clone -vserver <vserver_name> -policy data -destination-policy fp_siq1 -destination-vserver <vserver_name>
    system services firewall policy create -vserver <vserver_name> -policy fp_siq -service http -allow-list <am_server_ip_address_with_mask>
  8. If the Crawler hits unexpected “access denied” errors or misses entire shares because of “access denied” errors, this might be related to a known NetApp bug, which is documented in their knowledgebase (you need a NetApp account to see the entire entry):

    https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Backups_failing_even_though_user_is_part_of_BUILTIN%5CBackup_Operators_group_for_ONTAP_9

    • The bug affects Data ONTAP 9.x, and according to the document should be fixed in version 9.4. It “causes backup intent permissions to be incorrectly checked”. This means the Backup Operators membership used to gain access to the filesystem doesn’t work, and “access denied” errors are sent back.

    • Fortunately, there’s a workaround provided in the knowledgebase entry, which is to “disable fake open capability” by running the following commands on the NetApp console or over an SSH connection to the management interface (replace SVM01 with the relevant Vserver):

    set diag
    cifs options modify -vserver SVM01 -is-fake-open-enabled false
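
When the disconnect reason in step 4 is a TCP failure, it can help to confirm that something is listening on the configured port of the Activity Monitor server. The following is a minimal sketch under stated assumptions: the function name port_reachable is hypothetical, and the host and port must be taken from your own configuration. It tests reachability only from the machine where the script runs, so a success does not prove that the cluster nodes can reach the port.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False
```

Call the function with the IP address of the Activity Monitor server and the port configured in the FPolicy external-engine configuration. A result of False points to a stopped service, a wrong port, or a firewall in the path.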