Configuring Data Extraction

The data extraction configuration defines what to extract. You configure it by creating an Extract YAMLConfig and importing it into IdentityIQ. You can use the IdentityIQ console to generate extract configurations and then modify them as needed. See IIQ Console Commands.

Data Extract yields image objects that contain imageFields with data in them, as well as metadata about the image, such as the object type it represents and the timing of the image. The ObjectSelector is used to select objects for extraction. Administrators define key data in the extract configuration, including:

  • Types of objects to extract and transform

  • What attributes to include or exclude for each object type

  • Filter criteria for selecting objects

  • Name of the queue to publish to

  • Name of the transform configuration

Note: You can exclude any property or attribute of any object you have defined from Data Extract.

Note: Attributes that are known to be encrypted or secret cannot be extracted by default. If your implementation is storing an Extended Attribute defined by your organization and if you define that attribute as part of a Data Extract YAMLConfig, then it will be extracted as normal. If an encrypted attribute were included, it would be extracted in its encrypted state unless you take extra steps to decrypt it. See Using Data Extract with Sensitive Data.

Major classes matching the objects found in the Debug Object browser can be extracted. Best practice is to exclude AccessHistory classes and Intercepted Deletes from extraction, although this is not enforced by IdentityIQ. The following objects are not extractable in any circumstance:

  • AuthenticationAnswer

  • RemoteLoginToken

  • SAMLToken

Make sure that every entry in extractedObjects in the Extract YAMLConfig has a corresponding imageConfigDescriptor, and that each descriptor has a valid objectClassName.

The extract configuration may look like this example:

extractedObjects:
  identity:
    resultLimit: 10
  role:
  certification:
    filterString: phase == Certification.Phase.End
transformConfigurationName: ExtractTransformConfig
messageDestination: extractedObjectsQueue

transformConfigurationName – the name of the configuration object describing the transformation of the objects extracted.

messageDestination – the name of a queue to place the message into. You may use a script if you want to dynamically decide what queue to write to. It is important to correctly define a message destination; without something reading from the queue, it will fill up and negatively impact performance.

Note: IdentityIQ supports ActiveMQ only. Whether you use embedded or external ActiveMQ, IdentityIQ does not define an expiry for message destinations. Be sure to configure a consumer to read from them before you start running data extracts. The internal processing queues used by the tasks do have expiration times, set after the first run, to ensure that failed executions do not block subsequent runs. If a timeout event occurs during a data extract task, the task fails, allowing the cause to be fixed and the data to be re-extracted.

extractedObjects – a map of values. Each key is the string name of an object type described in the transformation configuration, for example, identity, role, or certification. If the key is all that is present, the data extract process looks up the classname in the transformation configuration object to determine which object to read from the database.
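For illustration, a minimal configuration that relies entirely on this lookup might list only the object type names, with no properties under them. This is a sketch that reuses the names from the example above; it assumes that identity and role are both described in ExtractTransformConfig.

```yaml
# Keys only: the classname for each object type is resolved
# from the transformation configuration (assumed here to be ExtractTransformConfig).
extractedObjects:
  identity:
  role:
transformConfigurationName: ExtractTransformConfig
messageDestination: extractedObjectsQueue
```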

The value can contain an object with the following properties:

  • filterString – a normal IdentityIQ filter string to be used for the queries

    • You can use a filter string to pull specific audit table rows based on a particular action. The usage looks like this:

      className:

      filterString: property == "<filter value>"

      Example <filter value> would be name.startsWith("B")

  • deleteTransformFormat – indicates this YAMLConfig is interested in intercepted deleted objects (optional)

    • You can set the output for the transform to brief, full, or none. Brief captures cannot be compared; full captures can.

    • Full captures a full copy of what is intercepted including data such as ID, name, created date, modified date, and acknowledgement of child records.

    • Brief contains only the ID and name.

    • None gives no indication that a record was deleted.
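As a rough sketch of how these properties might sit together, the fragment below adds filterString and deleteTransformFormat to an identity entry. The property names and the filter come from the descriptions above. The lowercase spelling of the deleteTransformFormat value and the choice of the identity object type are assumptions made for illustration, so check them against your own transform configuration.

```yaml
extractedObjects:
  identity:
    # Extract only identities whose name starts with "B" (filter taken from the example above).
    filterString: 'name.startsWith("B")'
    # Assumed spelling of the value; the documented options are brief, full, or none.
    deleteTransformFormat: brief
transformConfigurationName: ExtractTransformConfig
messageDestination: extractedObjectsQueue
```

The filter is wrapped in single quotes so that the double quotes inside it are unambiguously read as part of the string.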