Getting the Most Performance From Your IDM Deployment

jake.feasel · July 24, 2021, 4:00am

IDM is a highly configurable product, capable of solving a large number of provisioning-related business problems. There are features available to store arbitrary data structures, sync with external systems, validate data, track usage, gather data directly from users, and more. As with anything, however, features come with a cost. Having things enabled that you don't really need will slow your system down. For that reason, it's important to review your IDM configuration to make sure that you have only enabled the features you need.

To help with that effort, here are some important areas of IDM configuration to evaluate. The key to understanding performance in IDM is understanding what exactly it's doing "under the hood" for any one request. To help shine some light on that process, let's consider the details for a request to create a single managed/user record based on the default, out-of-the-box configuration of IDM. These are the high-level steps that IDM will take:

Authentication: establishing the security context for the request. authentication.json

https://backstage.forgerock.com/docs/idm/7/security-guide/auth-session-modules.html
- Loop through each of the defined auth modules one at a time until one reports a success.
- Different auth modules have different costs associated with them. STATIC_USER is the fastest one, since it doesn't involve calls out to any external system (not even the repo). Carefully consider the order of the auth modules; frequently used modules should be placed before slower, less-frequently used modules.
- Internal accounts (such as openidm-admin) are configured to use INTERNAL_USER by default in IDM 6.5. Consider changing this to STATIC_USER to avoid the lookup in the repo; this is how IDM 7 is configured by default.
- By default, every authentication will result in a session being created and returned in the form of a cookie. If you are not saving the cookie for subsequent requests (as is often the case when a proxy is in front of IDM) then this session creation overhead is wasted. Avoid it by passing in X-OpenIDM-NoSession: true.
- IDM will use the subject identified by the auth module to query the queryOnResource endpoint for that module, looking for details about the subject (including the userRoles property) in order to construct a complete security context. It will also include whatever roles are declared in the defaultUserRoles. If you have a role that applies to all authenticated users (for example, internal/role/openidm-authorized), then it is much faster to use defaultUserRoles for this than it would be to maintain it in storage for every user.
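
As a rough illustration of the ordering advice above, here is a small Python sketch that reorders the authModules array of an authentication.json-style document so that cheaper modules are tried first. It assumes the modules live under serverAuthContext/authModules and each carries a "name"; the ASSUMED_COST ranking and the reorder_auth_modules helper are hypothetical, and only the placement of STATIC_USER first comes from the text above.

```python
import copy

# Hypothetical relative cost ranking (lower = cheaper). STATIC_USER first
# follows the text above; the rest of the ordering is an assumption.
ASSUMED_COST = {"STATIC_USER": 0, "INTERNAL_USER": 1, "MANAGED_USER": 2}


def reorder_auth_modules(auth_config, cost=ASSUMED_COST):
    """Return a copy of the config with auth modules sorted cheapest-first."""
    result = copy.deepcopy(auth_config)
    modules = result["serverAuthContext"]["authModules"]
    # list.sort() is stable, so modules with equal cost keep their order;
    # unknown module names are pushed to the end.
    modules.sort(key=lambda module: cost.get(module["name"], len(cost)))
    return result


# Sample data only; a real authentication.json has more fields per module.
sample = {
    "serverAuthContext": {
        "authModules": [
            {"name": "MANAGED_USER", "enabled": True},
            {"name": "INTERNAL_USER", "enabled": True},
            {"name": "STATIC_USER", "enabled": True},
        ]
    }
}
reordered = reorder_auth_modules(sample)
print([m["name"] for m in reordered["serverAuthContext"]["authModules"]])
```

Cost is only half of the picture: how often each module actually succeeds in your deployment should also drive the ranking you pass in.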
Audit: logging various events throughout the system. audit.json

https://backstage.forgerock.com/docs/idm/7/audit-guide/
- By default, IDM will generate comprehensive audit data for various topics. Creating these audit records adds overhead to the whole request. The amount of overhead varies by the handlers and the topics that are enabled, and if there is any filtering being done.
- Creating audit events blocks the request that triggered them. Therefore, consider the responsiveness of whatever system you are using to write these audit events. Prioritize using an audit handler that has as little delay as possible. Syslog, Json StdOut, and JMS should all be very fast handlers. CSV and JSON file handlers are fairly fast, depending on the disk I/O available to IDM.
- Avoid the repo audit handler. Using it will increase the load on the repo, reduce the number of pool connections available to other requests, and generally burden this performance-critical system.
- Aggressively filter out all events you don't actually care to monitor.
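
A quick way to check the repo-handler advice above against a configuration is to scan it. The sketch below assumes audit.json holds an eventHandlers array whose entries have a "class" and a "config" with "name" and an optional "enabled" flag, and that the repo handler's class name contains RepositoryAuditEventHandler. Both the layout and the class name are my assumptions about the stock file, and enabled_repo_audit_handlers is a hypothetical helper.

```python
def enabled_repo_audit_handlers(audit_config):
    """Return names of enabled handlers whose class looks like the repo handler."""
    names = []
    for handler in audit_config.get("eventHandlers", []):
        config = handler.get("config", {})
        is_repo = "RepositoryAuditEventHandler" in handler.get("class", "")
        # Assumption: a handler without an explicit "enabled" flag is active.
        if is_repo and config.get("enabled", True):
            names.append(config.get("name", "<unnamed>"))
    return names


# Sample data only, shaped like the assumed audit.json layout.
sample = {
    "eventHandlers": [
        {"class": "org.forgerock.audit.handlers.json.JsonAuditEventHandler",
         "config": {"name": "json", "topics": ["access", "activity"]}},
        {"class": "org.forgerock.openidm.audit.impl.RepositoryAuditEventHandler",
         "config": {"name": "repo", "enabled": True, "topics": ["access"]}},
    ]
}
print(enabled_repo_audit_handlers(sample))
```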
Authorization: whether or not the specific request is allowed. router.json

https://backstage.forgerock.com/docs/idm/7/security-guide/idm-authorization.html#router-authz-js
- The first filter within router.json calls out to the default router-authz authorization implementation. This implementation has fairly low overhead and probably doesn't need much tuning. However, in extreme cases (especially when no direct user interaction with IDM is required), you could remove this filter to gain a slight performance improvement.
- Consider removing unnecessary authorization checks with access.js. Ensure the rules listed there tightly match your needs—no more, no less.
Policy: whether or not the data provided meets the requirements. router.json, managed.json and policy.json

https://backstage.forgerock.com/docs/idm/7/objects-guide/configuring-default-policy.html
- In the case of a managed/user, the default policy rules come from the schema declared in managed.json.
- In terms of performance, the main default policies that you need to consider are those which involve queries to the repo. These include: "valid-username", "unique", and "no-internal-user-conflict". Depending on the version of IDM you're using, one of these will be declared for the "userName" property. Keeping this enabled means that the policy service will perform a query to the repository to check to see if the provided userName for the new managed/user you're creating conflicts with an existing managed/user. This query adds significant overhead to the request, depending on how fast that query responds. Avoid adding new properties which use these policies, if possible.
- Consider replacing the unique policy with a constraint in your repository instead. Databases are typically much faster at performing these kinds of checks than a separate application (such as IDM) is capable of. Keep in mind that adding this constraint will be repo-specific; PostgreSQL vs. MySQL vs. DS will all have different ways of accomplishing this.
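
To find where the repo-querying policies named above are declared, you can walk the managed object schema. This is a minimal sketch, assuming managed.json has an objects array and that each property lists its policies as {"policyId": ...} entries under schema/properties; repo_backed_policy_properties is a hypothetical helper and the sample data is illustrative.

```python
# Policy IDs the text above identifies as involving a query to the repo.
REPO_BACKED_POLICIES = {"valid-username", "unique", "no-internal-user-conflict"}


def repo_backed_policy_properties(managed_config, object_name="user"):
    """Map property name -> list of repo-querying policy IDs declared on it."""
    found = {}
    for obj in managed_config.get("objects", []):
        if obj.get("name") != object_name:
            continue
        properties = obj.get("schema", {}).get("properties", {})
        for prop_name, prop in properties.items():
            ids = [policy.get("policyId") for policy in prop.get("policies", [])]
            hits = [policy_id for policy_id in ids if policy_id in REPO_BACKED_POLICIES]
            if hits:
                found[prop_name] = hits
    return found


# Sample data only; a real schema declares many more properties.
sample = {
    "objects": [
        {"name": "user",
         "schema": {"properties": {
             "userName": {"policies": [{"policyId": "valid-username"},
                                       {"policyId": "cannot-contain-characters"}]},
             "mail": {"policies": [{"policyId": "valid-email-address-format"}]},
         }}}
    ]
}
print(repo_backed_policy_properties(sample))
```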
Managed Object Service: changes made to the incoming data prior to saving it. managed.json

https://backstage.forgerock.com/docs/idm/7/objects-guide/managed-objects.html
- The main performance considerations here are around the default scripts that execute for the request, as well as the declarative features that are enabled for the request.
- Review every script (onCreate/onUpdate/postDelete ...) to ensure you need what the script offers. The main change you may want to consider is within the default onCreate script; if you are using 6.5 or earlier, there is logic within setDefaultFields that adds a default value for authzRole. Since this is a relationship property, this default value will result in a separate create request to the repository. If you expect that every user should have this value, you can avoid the overhead of maintaining it by simply using the defaultUserRoles property to grant it dynamically within authentication.json (as described in the Authentication step above). IDM 7 has changed the default configuration to align with this recommendation.
- Use of conditional roles. By default, managed/users in IDM have roles and authzRoles properties that are configured with "conditionalAssociationField": "condition". What this means is that during this create request, IDM will query all resourceCollection entries (such as managed/role and internal/role), and will look for a condition property within each record returned. It is expected that this property will contain a filter string (for example /title eq "Manager"). IDM will process this filter against the incoming managed/user content to see if it matches. If so, then IDM will update the request payload so that the managed/user will be created with that relationship assigned. Each query to the repository to search for these potentially related objects incurs an overhead cost. Therefore, if you do not require this type of conditional assignment of relationships then you should disable this behavior. Disable it by removing all "conditionalAssociationField" entries within the schema for managed/user, and also setting "conditionalAssociation": false for members within managed/role. Consider also doing the same for internal/roles within internal.json.
- Use of metadata tracking. By default, IDM has the "meta" attribute enabled for managed/user. This feature tracks changes made to the user, specifically to support the use of various self-service features (such as progressive profile completion, privacy and consent, and terms and conditions acceptance). This metadata is tracked as an additional relationship value for each user, which means it is stored separately from the user itself, and is therefore maintained with separate create and update requests to the repository. If you are not using the self-service features which require this, then you should edit the schema and remove the meta block.
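
The two declarative changes described above (disabling conditional association and removing meta) can be applied mechanically. The sketch below is one way to do it, assuming the meta block sits at the top level of the managed/user object definition. The strip_optional_features helper is hypothetical, and it is deliberately broader than the text: it sets "conditionalAssociation" to false wherever it appears, not only for members within managed/role.

```python
import copy


def strip_optional_features(managed_config):
    """Return a copy without conditional association settings or the user meta block."""
    def walk(node):
        if isinstance(node, dict):
            node.pop("conditionalAssociationField", None)
            if "conditionalAssociation" in node:
                node["conditionalAssociation"] = False
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    result = copy.deepcopy(managed_config)
    for obj in result.get("objects", []):
        if obj.get("name") == "user":
            obj.pop("meta", None)  # assumed location of the meta block
    walk(result)
    return result


# Sample data only, heavily trimmed compared to a real managed.json.
sample = {
    "objects": [
        {"name": "user",
         "meta": {"property": "_meta", "resourceCollection": "internal/usermeta"},
         "schema": {"properties": {"roles": {"items": {"resourceCollection": [
             {"path": "managed/role", "conditionalAssociationField": "condition"}]}}}}},
        {"name": "role",
         "schema": {"properties": {"members": {"items": {"resourceCollection": [
             {"path": "managed/user", "conditionalAssociation": True}]}}}}},
    ]
}
cleaned = strip_optional_features(sample)
print(sorted(cleaned["objects"][0].keys()))
```

Review the result before deploying it; if you do use conditional roles or the self-service features, these are exactly the settings you need to keep.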
Repository: persisting the data in your chosen repository. repo.ds.json or repo.jdbc.json and datasource.jdbc-default.json

https://backstage.forgerock.com/docs/idm/7/objects-guide/repo-config.html
- Repository choice and configuration are hugely important to the performance of IDM; it's probably the single most significant environmental detail. Consider the fact that for this single request to create a user, we have already made at least five blocking, sequential queries to various parts of the repository. By tuning it differently, that number can be reduced. But the fact remains that a slow repository will weaken performance in IDM.
- General performance considerations for whatever repository you choose will apply here. Be sure to give your repository ample resources, particularly in terms of disk I/O, memory, and network connectivity. Carefully follow the appropriate guidance for sizing and monitoring your repository.
- Both JDBC and DS repositories have a connection pool available to tune. For JDBC, this is configured within datasource.jdbc-default.json. For DS, it is within repo.ds.json, under /ldapConnectionFactories/bind/connectionPoolSize. Running out of available connections to the repo will cause requests to back up; therefore it is important that your pool size is large enough to support the load. Expand the connection pool so that you can be sure that this does not happen. When doing so, ensure that IDM and the repo have enough memory available to support connection pool sizes you specify. Tuning IDM so that unnecessary requests to the repo are not occurring (such as this guide describes) will also dramatically improve the health of your connection pool.
- Care should be taken for how your data is stored in the repository. Consider the choices presented within the Generic and Explicit Object Mappings documentation. Rarely will the default object mapping configuration perform optimally for your needs. Consider these details for the various specific backends:
- PostgreSQL: Within the provided default_schema_optimization.pgsql there are different types of indexes for various properties of managed objects. You should take this script as a starting point for your specific data structure needs. Remove all indexes for attributes you don't need to search for. Add indexes for attributes you will need to search for. Unless you rely heavily on the Admin UI to search for users, consider removing the various _gin indexes that are included within that script.
- PostgreSQL: If you have multiple managed object types that have the same indexed properties, create distinct generic object tables for each object type. Doing this will allow the indexes to perform much more precisely.
- PostgreSQL: Because PostgreSQL has native support for JSON attributes, there is little benefit to using the "explicit" object mapping. In fact, it may end up being slower to use "explicit" than "generic" object mappings with PostgreSQL, due to the cost of constructing and deconstructing the JSON representation of the data.
- MySQL and other non-PostgreSQL JDBC repositories: Consider using an explicit object mapping for managed user data, as well as any other large set of data you are maintaining within the repo. This will give you the most control possible for these database types, in terms of indexes and storage maintenance. If you do not want to use an explicit object mapping, be sure the searchable properties listed within the generic mapping are as precise as possible for your needs. Every property that you declare as searchable will result in a separate record in the properties table. Maintaining all of those extra property records adds overhead for each change (create or update).
- DS: The best advice for this at the moment is to make sure you are using one appropriately sized primary ForgeRock® Directory Server (DS) instance (version 7) as your repository, and that it is configured with the idm-repo setup profile. As with the general advice for JDBC, you should prefer the use of explicitly mapped objects over generic objects. Accordingly, any changes made to your object model may require corresponding indexes.
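
For a first guess at connection pool size, one option (not something the guidance above prescribes) is Little's Law: average busy connections equal the request rate times the repo calls per request times the mean duration of each call. The sketch below is only a starting point under that assumption; the function names, the headroom factor, and the sample numbers are all hypothetical, and real sizing should be confirmed with load tests and metrics.

```python
import math


def estimated_busy_connections(requests_per_second, repo_calls_per_request, mean_call_seconds):
    """Little's Law: average concurrency = arrival rate x time in system."""
    return requests_per_second * repo_calls_per_request * mean_call_seconds


def suggested_pool_size(busy_connections, headroom=2.0):
    """Round up after applying a headroom multiplier for bursts (assumed factor)."""
    return math.ceil(busy_connections * headroom)


# Illustrative numbers: 200 creates/s, five sequential repo calls per create
# (the "at least five" mentioned above), 10 ms per call.
busy = estimated_busy_connections(200, 5, 0.01)
print(round(busy, 2), suggested_pool_size(busy))
```

The same arithmetic shows why trimming unnecessary repo calls helps the pool: halving the calls per request halves the average number of busy connections at the same throughput.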
Synchronization: automatically pushing changes to target systems. sync.json

https://backstage.forgerock.com/docs/idm/7/synchronization-guide/index.html#preface
- After your managed/user is created in the repo, any mappings to target systems will be triggered. Each mapping will be processed sequentially in the order declared in sync.json. Unless you have configured queued sync, how quickly this sync process completes depends on how fast each mapping can be completed. Using queued sync can help with this problem, but be aware that it isn't free: it involves an extra write operation to a table in the repo, and then subsequent tuning for the threads that handle the queue. It is therefore only a useful option for dealing with those systems which are particularly slow to respond.
- If you have declared a correlation query, be sure that the target system is able to perform that query quickly. You may have to adjust indexes in that system as well.
Notifications: letting users know when things happen on their account. notificationFactory.json

https://backstage.forgerock.com/docs/idm/7/audit-guide/notification-config.html#notification-config
- This feature is mainly designed to support a very particular user experience that is showcased in the end-user UI. Basically, changes to a user will result in an extra "notification" record created for that user, for them to review within the end-user UI. These notifications are maintained as relationships associated with the user.
- If you are not actually using this feature, you should disable it either by editing notificationFactory.json and setting "enabled": false, or you can simply delete notificationFactory.json. Doing either will save your connection pool from making another request, potentially blocking more important connections. Also, disabling this will keep unimportant data from filling up your relationships table (which could cause other queries to be slow).
- The default configuration doesn't actually create notifications during a managed/user "create" request, but instead only for an update. Therefore changing this won't help create request performance, but it will definitely help updates.
Filtering the response: ensuring that unauthorized data is not leaked. router.json

https://backstage.forgerock.com/docs/idm/7/security-guide/idm-authorization.html#idm-authorization
- The last filter in router.json uses the "relationshipFilter" script. It looks through the data included in the response and prevents data related to the base object from being exposed to users who are not authorized to see it. This could potentially occur when a user with low privileges makes a REST call to an object that they have access to read and also requests data related to that object (see View Relationships Over REST for examples on how to do that). If the user doesn't have access to view the related data, then it needs to be excluded from the response; this filter performs that exclusion.
- In IDM 6.5, the default condition for this filter was not very precise; as a result, there was a lot of extra work being done with each request. This resulted in a noticeable performance impact. Tuning the condition to align with 7.0 will reduce this impact. If you are sure that there is no authorization risk for your particular data and authorization model, then you can go a step further and remove this filter entirely.

Performance Tests

Performance tuning is a notoriously tricky task. As you can see from the list above, there are a lot of options to consider for each step. There are so many variables at play with a whole deployment that it is almost impossible to assign a specific cost to a particular feature. However, it is possible to gauge a relative impact. One of the most important tools for doing so is the metrics service available to IDM. This will reveal critical details regarding how much time IDM is spending in any one area.

You can enable metrics by editing metrics.json and setting "enabled": true. There is also an easy way to view the metrics within the Admin UI - just add the "Dropwizard Table with Graph" widget to a dashboard, and you'll see it update live.
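
Once metrics are enabled, it can help to rank timers by the total time they account for rather than by their mean alone. The sketch below assumes you have already fetched the metrics as JSON (I believe the REST endpoint is openidm/metrics/api, but treat that as an assumption) and that the response has a "result" list whose timer entries carry "_id", "count", and "mean". The rank_timers_by_total_time helper and the sample entries are hypothetical.

```python
def rank_timers_by_total_time(metrics_response, top=5):
    """Return (metric id, count * mean) pairs, largest total first."""
    totals = []
    for entry in metrics_response.get("result", []):
        # Skip entries that are not timers (no count or no mean).
        if "count" in entry and "mean" in entry:
            totals.append((entry["_id"], entry["count"] * entry["mean"]))
    totals.sort(key=lambda pair: pair[1], reverse=True)
    return totals[:top]


# Placeholder entries, not real measurements.
sample = {
    "result": [
        {"_id": "example.timer.a", "count": 10, "mean": 2.0},
        {"_id": "example.timer.b", "count": 5, "mean": 10.0},
        {"_id": "example.counter", "count": 3},
    ]
}
for metric_id, total in rank_timers_by_total_time(sample):
    print(metric_id, total)
```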

Let's try taking some of the details from the above list and seeing how configuration changes can have a significant impact. We will use internal IDM metrics as well as external timers to make the evaluations.

The baseline configuration for my tests will be IDM 7.0.1 using PostgreSQL 11 as the repository. Note that since I am focusing on IDM configuration changes, the particular choice of repository doesn't actually matter too much for these tests. For a more comprehensive performance tuning exercise, repo choice and tuning would take on a much more significant role.

For my environment, I unzipped IDM 7.0.1 and ran this to initialize PostgreSQL:

docker run --network host -d --name postgres -e POSTGRES_PASSWORD=password postgres:11 && \
sleep 4 && \
psql -U postgres -h localhost < db/postgresql/scripts/createuser.pgsql && \
psql -U openidm -h localhost < db/postgresql/scripts/openidm.pgsql && \
psql -U postgres -h localhost openidm < db/postgresql/scripts/default_schema_optimization.pgsql

Next, I started up IDM by using this Dockerfile:

FROM gcr.io/forgerock-io/idm/pit1:7.0.1

RUN rm /opt/openidm/conf/repo.ds.json && \
    cp /opt/openidm/db/postgresql/conf/datasource.jdbc-default.json \
    /opt/openidm/db/postgresql/conf/repo.jdbc.json \
    /opt/openidm/conf

COPY conf/* /opt/openidm/conf/

And these docker commands:

docker build -t idm_perf:latest .
docker run --network host -m 2g -e OPENIDM_REPO_HOST=`hostname` -e OPENIDM_REPO_PORT=5432 idm_perf:latest

Using this environment, IDM 7.0.1 and PostgreSQL 11 were running on my Linux laptop within Docker. IDM was given 2 GB of memory. It was available locally at http://localhost:8080.

To create managed/users, I was using Apache JMeter. It was configured to create 40,000 users using 10 active threads. The requests to create each one are equivalent to this curl command:

curl -H 'X-OpenIDM-Username: openidm-admin' \
     -H 'X-OpenIDM-Password: openidm-admin' \
     -H 'X-OpenIDM-NoSession: true' \
     -H 'Content-type: application/json' \
     --data '{"userName": "user.a1", "cn": "rick sutter", "telephoneNumber": "6669876987", "givenName": "rick", "description": "Just another user", "sn": "sutter", "mail": "rick@example.com", "password": "Th3Password", "accountStatus": "active"}' \
    http://localhost:8080/openidm/managed/user?_action=create
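
If you would rather script the load than use JMeter, the same request can be built in Python. This is a minimal sketch of the curl command above using only the standard library; build_create_user_request is a hypothetical helper, and varying only userName per request is my assumption about how the test plan generated distinct users.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/openidm"


def build_create_user_request(index):
    """Build (but do not send) a create request like the curl command above."""
    payload = {
        "userName": f"user.a{index}",  # assumed to be the only varying field
        "cn": "rick sutter",
        "telephoneNumber": "6669876987",
        "givenName": "rick",
        "description": "Just another user",
        "sn": "sutter",
        "mail": "rick@example.com",
        "password": "Th3Password",
        "accountStatus": "active",
    }
    return urllib.request.Request(
        f"{BASE_URL}/managed/user?_action=create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-OpenIDM-Username": "openidm-admin",
            "X-OpenIDM-Password": "openidm-admin",
            "X-OpenIDM-NoSession": "true",
            "Content-type": "application/json",
        },
        method="POST",
    )


request = build_create_user_request(1)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it to a running IDM instance; the sketch stops short of that so it can be run without one.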

Remember not to put much stock in the particular numbers in this report; many other factors can also make a tremendous impact, including the version of IDM you have, the particular repository, the compute resources, and the network connectivity with external systems. The numbers here merely show the relative impact of particular configuration changes.

Baseline

This is the environment you get when you start the default IDM project with no changes at all, besides using an out-of-the-box PostgreSQL 11 server as the repository (created with the out-of-the-box repository setup scripts).

The baseline creates per second reported from JMeter is 173.9/s.

Key metrics:

PATH                                    ACTION             COUNT    MIN     MAX      MEAN    STDDEV

internal.role                           queryCollection    40000    0.5     13.75    1.32    1.13
internal.user                           queryCollection    40000    0.58    44.86    1.52    1.65
internal.usermeta                       create             40000    2.24    35.1     7.48    3.07
managed.role                            queryCollection    40000    0.64    29.79    1.65    1.47
managed.user                            queryCollection    40000    0.76    30.81    1.72    1.55
managed.user                            create             40000    11.42   120.59   27.17   9.17
filter.scripted.on-request.policyFilter js                 80000    0.27    67.11    4.81    5.95

Removing the valid-username policy

This is the baseline with the valid-username policy removed. Note that the PostgreSQL repository uses a database constraint to perform a similar uniqueness check, so removing this policy has little downside.

Creates per second reported from JMeter is 210.3/s.

This is a 21% improvement from the baseline.
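
The percentage is simply the relative change in throughput against the baseline. For readers who want to repeat the comparison with their own JMeter numbers, here is the arithmetic as a small sketch; improvement_pct is a hypothetical helper name.

```python
def improvement_pct(baseline_rate, new_rate):
    """Percentage increase in throughput relative to the baseline."""
    return (new_rate - baseline_rate) / baseline_rate * 100.0


# Baseline 173.9 creates/s versus 210.3 creates/s without valid-username.
print(round(improvement_pct(173.9, 210.3), 1))
```

This evaluates to roughly 20.9, which rounds to the 21% quoted above.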

Key metrics:

PATH                                    ACTION             COUNT    MIN     MAX      MEAN    STDDEV

internal.role                           queryCollection    40000    0.52    18.94    1.72    1.62
internal.user                           queryCollection    0
internal.usermeta                       create             40000    2.29    40.15    8.06    3.75
managed.role                            queryCollection    40000    0.69    38.29    2.03    1.78
managed.user                            queryCollection    0
managed.user                            create             40000    10.66   109.04   31.3    12.61
filter.scripted.on-request.policyFilter js                 80000    0.26    46.79    3.56    4.4

Note that the metrics are no longer reporting queries for internal/user and managed/user, due to the policy change.

Conditional Roles Disabled

Baseline environment with conditional roles disabled. If you know you aren't using conditional roles, then you can make IDM faster by configuring it not to look for them.

Creates per second reported from JMeter is 185.4/s.

This is a 7% improvement from the baseline.

Key metrics:

PATH                                    ACTION             COUNT    MIN     MAX      MEAN    STDDEV

internal.role                           queryCollection    0
internal.user                           queryCollection    40000    0.62    43.38    2.02    2.6
internal.usermeta                       create             40000    2.48    43.55    8.34    4.24
managed.role                            queryCollection    0
managed.user                            queryCollection    40000    0.7     38.55    2.33    2.31
managed.user                            create             40000    10.22   110.65   27.27   11.74
filter.scripted.on-request.policyFilter js                 80000    0.27    81.17    5.85    7.89

Note that the metrics are no longer reporting queries for internal/role and managed/role, now that conditional roles have been disabled.

Meta Removed

Baseline environment with meta removed. If you aren't using the associated self-service features, there is no need to maintain this extra data.

Creates per second reported from JMeter is 203.5/s.

This is a 17% improvement from the baseline.

Key metrics:

PATH                                    ACTION             COUNT    MIN     MAX      MEAN    STDDEV

internal.role                           queryCollection    40000    0.54    35.83    2.56    2.48
internal.user                           queryCollection    40000    0.59    37.88    2.92    2.87
internal.usermeta                       create             0
managed.role                            queryCollection    40000    0.65    34.62    3.13    2.96
managed.user                            queryCollection    40000    0.78    39.95    3.49    3.53
managed.user                            create             40000    4.8     100.88   17.16   10.16
filter.scripted.on-request.policyFilter js                 80000    4.1     111.22   18.09   13.05

Combined Changes

In order to demonstrate how much of a cumulative impact these various individual changes can make, let's combine the three above (removing valid-username, conditional roles and meta).

Creates per second reported from JMeter is 343.4/s.

This is roughly a 97% improvement from the baseline.

Key metrics:

PATH                                    ACTION             COUNT    MIN     MAX      MEAN    STDDEV

internal.role                           queryCollection    0
internal.user                           queryCollection    0
internal.usermeta                       create             0
managed.role                            queryCollection    0
managed.user                            queryCollection    0
managed.user                            create             40000    2.93    94.03    9.76    8.69
filter.scripted.on-request.policyFilter js                 40000    2.03    55.02    7.03    7.06

That's quite an improvement!
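
Throughput is not the only thing worth comparing between runs; the per-metric rows can be diffed as well. The sketch below parses rows in the plain-text layout used in the tables above, treating rows with a zero count as having no timing columns, and then compares the mean managed.user create time between the baseline and the combined run. parse_metric_rows is a hypothetical helper, and only two rows from each table are included to keep it short.

```python
def parse_metric_rows(text):
    """Parse 'PATH ACTION COUNT [MIN MAX MEAN STDDEV]' rows into dicts."""
    rows = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] == "PATH":
            continue  # skip blank lines and the header
        path, action, count = parts[0], parts[1], int(parts[2])
        stats = [float(value) for value in parts[3:7]]
        # Rows with a zero count carry no timing columns; pad with None.
        stats += [None] * (4 - len(stats))
        rows[(path, action)] = dict(zip(("min", "max", "mean", "stddev"), stats), count=count)
    return rows


BASELINE = """
PATH           ACTION           COUNT  MIN    MAX     MEAN   STDDEV
internal.role  queryCollection  40000  0.5    13.75   1.32   1.13
managed.user   create           40000  11.42  120.59  27.17  9.17
"""
COMBINED = """
PATH           ACTION           COUNT  MIN    MAX     MEAN   STDDEV
internal.role  queryCollection  0
managed.user   create           40000  2.93   94.03   9.76   8.69
"""

baseline = parse_metric_rows(BASELINE)
combined = parse_metric_rows(COMBINED)
key = ("managed.user", "create")
print(baseline[key]["mean"], "->", combined[key]["mean"])
```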

Summary

The above results (nearly a 100% improvement in simple create throughput) clearly show how much of an impact various features can have on IDM performance. I want to emphasize that these are not the only changes you should focus on; there are potential performance gains to be had at each step of the request life-cycle. These improvements were made to an extremely simple IDM 7 environment; depending on the complexity of your deployment, your mileage may vary considerably.

Carefully review your configuration for some of these easy (or even not-so-easy) wins. Also be sure to review your metrics - they will help you identify problem areas in your system that you may be able to address through configuration.

Bonus: disabling filters in router.json

For a little something extra, here are a few more slightly "risky" configuration changes worth considering.

If you are sure that the content of the data being passed into your system is well-defined, you can remove the policy filter altogether and see a decent performance improvement. This would make sense if all of your data passes through another interface (such as an ICF connector or an API gateway) that is doing the necessary data structure validation.

Likewise, if you are sure that you don't have to worry about information leakage to users that are unauthorized to see related data, you can remove the "relationshipFilter" entry in router.json.

To demonstrate how much of an impact these two filters can make, I re-ran the above "combined" test again, but this time with them removed.

Creates per second reported from JMeter is 662.1/s.

This is a 93% improvement from the combined result above, and altogether they provide roughly a 280% improvement over the baseline configuration.
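
For anyone wanting to reproduce this last test, removing the two filters amounts to dropping entries from the filters array in router.json. The sketch below does that by matching the script source of each filter. It assumes each filter keeps its script under onRequest, onResponse, or onFailure with a "source" string that names the module, as in require('policyFilter'); the sample sources are my recollection of the stock file rather than a verified copy, and drop_filters_using is a hypothetical helper.

```python
import copy


def drop_filters_using(router_config, module_names):
    """Return a copy of the router config without filters whose script mentions a module."""
    def mentions(filter_entry):
        for hook in ("onRequest", "onResponse", "onFailure"):
            source = filter_entry.get(hook, {}).get("source", "")
            if any(name in source for name in module_names):
                return True
        return False

    result = copy.deepcopy(router_config)
    result["filters"] = [f for f in result.get("filters", []) if not mentions(f)]
    return result


# Sample data only; sources are assumed, and real filters carry more fields.
sample = {
    "filters": [
        {"onRequest": {"type": "text/javascript",
                       "source": "require('router-authz').testAccess()"}},
        {"onRequest": {"type": "text/javascript",
                       "source": "require('policyFilter').runFilter()"}},
        {"onResponse": {"type": "text/javascript",
                        "source": "require('relationshipFilter').filterResponse()"}},
    ]
}
trimmed = drop_filters_using(sample, ["policyFilter", "relationshipFilter"])
print(len(sample["filters"]), "->", len(trimmed["filters"]))
```

As the text stresses, only do this if validation and authorization are genuinely handled elsewhere; the helper makes the change easy, not safe.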