Explaining index-entry-limit in ForgeRock Directory Services

A few years ago, I explained the various resource limits in OpenDJ, the open source LDAP and REST directory server. A few months ago, someone read the post and asked on Twitter about the index-entry-limit.

The index-entry-limit is probably the least understood parameter in the OpenDJ directory server, as was the AllIDThreshold in Sun Directory Server (and its siblings: Netscape Directory, Red Hat Directory, Oracle DSEE…). So before I dive into explaining what this parameter is, how it’s used and how it can be tuned, let me start by answering the question: how does index-entry-limit relate to other administrative limits?

Answer: It doesn’t! The index-entry-limit is an internal limit and does not really limit the results returned to clients. It just limits the resources consumed when processing indexes.

A Directory Server is a very specialized data-store based on the LDAP standard, and its primary goal was to be able to search and return user information such as email addresses or names and phone numbers, very quickly and for a large number of different clients. For that, the directory servers were designed to favor reads over writes, and read optimization was achieved through the use of indexes.

In LDAP, a search request (which can be used to read an entry or search for one or more through the whole database) contains a search filter. The filter may be simple or complex, and composed of one or more attribute value assertions.

A simple filter can be “(sn=Smith)”. Complex filters combine operators and different attributes: “(&(objectclass=Person)(|(sn=Smith)(cn=Smith)))” finds a person whose surname is Smith or whose common name is Smith.

When the ForgeRock Directory Server / OpenDJ receives a search request, it processes it in two phases. In the first phase, it analyzes the search filter to identify which attributes are indexed, and then uses these indexes to build a list of possible candidates to return. If there are no indexed attributes or the list is too large, the server decides that the list is actually the whole database. Such a search request is tagged as “unindexed” and the server verifies that the authenticated user has the “unindexed-search” privilege before continuing. In the second phase, it reads all the candidates from the database, and assesses the full filter against each one to decide whether to return the entry to the client or not (subject to access controls).
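To make the two phases more concrete, here is a minimal sketch in Python. It is a toy model and not the actual server code: the `ENTRIES` data, the `SN_INDEX` dictionary and the function names are all made up for illustration, only equality on a single attribute is handled, and access controls are left out.

```python
# Toy model of the two-phase search. Names and data are hypothetical.
ENTRIES = {
    1: {"objectClass": ["person"], "sn": ["Smith"], "cn": ["John Smith"]},
    2: {"objectClass": ["person"], "sn": ["Lee"], "cn": ["Ann Lee"]},
    3: {"objectClass": ["person"], "sn": ["Smith"], "cn": ["Jane Smith"]},
}

# Hypothetical equality index for sn only: normalized value -> entry IDs.
SN_INDEX = {"smith": {1, 3}, "lee": {2}}


def candidates_for(attr, value):
    """Phase 1: return a set of candidate IDs, or None if no index helps."""
    if attr == "sn":
        return SN_INDEX.get(value.lower(), set())
    return None  # attribute is not indexed in this toy model


def search(attr, value, has_unindexed_privilege=False):
    ids = candidates_for(attr, value)
    if ids is None:
        # Unindexed search: the candidate list is the whole database,
        # and the user needs the privilege to go any further.
        if not has_unindexed_privilege:
            raise PermissionError("unindexed search not allowed")
        ids = set(ENTRIES)
    # Phase 2: read each candidate and evaluate the filter against it.
    wanted = value.lower()
    return sorted(
        i for i in ids
        if wanted in (v.lower() for v in ENTRIES[i].get(attr, []))
    )
```

In this sketch, `search("sn", "Smith")` only reads the two candidates found in the index, while `search("cn", "Ann Lee", has_unindexed_privilege=True)` has to read every entry.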

ForgeRock DS / OpenDJ implements attribute indexes as reversed indexes. This means that for a specific attribute, we keep a pair made of each unique value and a list of the entries that contain that value. Because maintaining a large list of entries for each value of all indexed attributes may have a big cost, both in terms of memory usage and disk I/O (remember that when you add an entry to the directory, the indexes of all of its indexed attributes need to be updated), we introduced a limit on the number of entries that an index record can contain: the index-entry-limit. For example, if the number of entries that contain the objectClass person exceeds the limit, then we mark the key as “full” and we consider that the list of candidates is actually the whole set of directory entries. This saves us from updating and reading a very long record, allocating lots of data, only to end up iterating through almost all entries. You might ask, so why have an index for objectClass then? Well, in a directory server that contains millions of users, there are in fact very few entries that are not persons. These entries will have their objectClass values indexed, and searching for those entries will be very efficient thanks to the index.
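Here is a small Python sketch of such a reversed index with an entry limit. Again, this is only an illustration of the idea under my own assumptions: the `LimitedIndex` class and the `FULL` marker are invented names, and the real on-disk format in the server is different.

```python
FULL = object()  # hypothetical marker for a key that went over the limit


class LimitedIndex:
    """Toy reversed index: value -> set of entry IDs, capped by a limit."""

    def __init__(self, entry_limit=4000):
        self.entry_limit = entry_limit
        self.records = {}

    def add(self, value, entry_id):
        key = value.lower()
        ids = self.records.setdefault(key, set())
        if ids is FULL:
            return  # key is already full: nothing left to maintain
        ids.add(entry_id)
        if len(ids) > self.entry_limit:
            # Stop tracking individual entries for this value.
            self.records[key] = FULL

    def lookup(self, value):
        """Return a set of candidate IDs, or FULL meaning 'all entries'."""
        return self.records.get(value.lower(), set())
```

With a limit of 3, adding five entries with the value person turns that key into `FULL`, while a single groupOfNames entry stays in a precise, one-element record. That is the objectClass case described above, just at a much smaller scale.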

The index-entry-limit is a limit on the number of entries that are contained in a single index record, per value of an attribute index. Its default value is 4000, which works for most medium to large scale deployments. So, why is it a configurable parameter, and when should you change it?

Because ForgeRock DS is used in many different environments with various use cases, and with a wide range of directory sizes (some of our customers have over 100 million entries in a directory service), we know that one size doesn’t fit all. But the default value works for most index usages. Also, the index-entry-limit can be set for each individual index, or for the whole backend (and this value applies to all indexes that don’t have a specific value). It is highly recommended that you only try to change the index-entry-limit of specific indexes, and not the backend default value.

In no case should you increase the index-entry-limit to a value close to the total number of entries in the directory. This will undermine the performance of both searches and updates, and significantly increase the footprint of the data stored on disk.

There are a few known cases where the index-entry-limit value should be changed (and equally, cases where increasing the value will only consume more resources for no performance gain). Also keep in mind that when you change the index-entry-limit, you need to rebuild the indexes for which the limit was changed. So it’s not something that you want to do too often. And definitely not something that you need to adjust constantly.

Groups. When the server starts, it issues an internal search to find all group entries and cache them for better performance. The search is based on the ObjectClass attribute. If there are more than 4000 groups of one kind (the search is for GroupOfNames, GroupOfUniqueNames, GroupOfEntries, DynamicGroup and ds-virtual-static-group), the search will be unindexed and can take a long time to complete. In that case, you should increase the index-entry-limit for the ObjectClass attribute, to a value just above the number of groups.

Members (or uniqueMembers). If you have more than 4000 static groups, and you know that some users are likely to be members of more than 4000 groups, then you should also increase the index-entry-limit for the member attribute (or uniqueMember) to a value just above the maximum number of groups a user can be in, especially if you have enabled the Referential Integrity Plugin (which removes a user from groups when the user’s entry is deleted).
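If you are not sure what that maximum is, you can estimate it from an export of your groups. The Python sketch below assumes you have already loaded the groups into a dictionary that maps each group DN to its list of member DNs. The function name is hypothetical, and lowercasing is only a rough stand-in for real DN normalization.

```python
from collections import Counter


def max_groups_per_member(groups):
    """groups maps a group DN to the list of its member DNs."""
    counts = Counter()
    for members in groups.values():
        # Count each member once per group, ignoring case differences.
        counts.update({m.lower() for m in members})
    return max(counts.values(), default=0)
```

The value returned is the size of the largest record the member index would have to hold, so the limit for that index should sit just above it.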

Another typical use case for increasing the index-entry-limit is when you have millions of entries, and an attribute doesn’t have a flat distribution of values. Think about the surname of users. In a large population, there are probably more “Smith” or “Lee” than “Washington”. Within 10M users, would there be more than 4000 “Lee”? If that’s possible, and the server receives searches with filters such as “(sn=Lee)”, then you should consider increasing the limit for the sn attribute.
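A quick way to answer the “more than 4000 Lee?” question before loading the data is to count the values yourself. This Python sketch assumes you have the attribute values in a plain list (for example extracted from an LDIF export); the function name is made up, and lowercasing only approximates the case-ignore matching rule.

```python
from collections import Counter


def values_over_limit(values, entry_limit=4000):
    """Return the values that appear in more entries than the limit."""
    counts = Counter(v.lower() for v in values)
    return {v: n for v, n in counts.items() if n > entry_limit}
```

Every value it returns would be a key that goes over the limit in the equality index, so it shows whether the limit for that attribute is worth revisiting.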

Backendstat is the tool you want to use to verify the state of the index and whether some records have reached the index-entry-limit. For some attributes, such as ObjectClass, it is normal that the limit is reached. For others, such as sn, it’s probably something you want to check regularly.

The backendstat tool requires exclusive access to the database, and thus can only run against a server that is stopped (or a backup).

To list the indexes, use backendstat list-indexes:

$ backendstat list-indexes -b dc=example,dc=com -n userRoot

Index Name                                        Raw DB Name                                                          Type               Record Count
dn2id                                             /dc=com,dc=example/dn2id                                             DN2ID              10002
id2entry                                          /dc=com,dc=example/id2entry                                          ID2Entry           10002
referral                                          /dc=com,dc=example/referral                                          DN2URI             0
id2childrencount                                  /dc=com,dc=example/id2childrencount                                  ID2ChildrenCount   3
state                                             /dc=com,dc=example/state                                             State              18
uniqueMember.uniqueMemberMatch                    /dc=com,dc=example/uniqueMember.uniqueMemberMatch                    MatchingRuleIndex  0
mail.caseIgnoreIA5SubstringsMatch:6               /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6               MatchingRuleIndex  31232
mail.caseIgnoreIA5Match                           /dc=com,dc=example/mail.caseIgnoreIA5Match                           MatchingRuleIndex  10000
aci.presence                                      /dc=com,dc=example/aci.presence                                      MatchingRuleIndex  0
member.distinguishedNameMatch                     /dc=com,dc=example/member.distinguishedNameMatch                     MatchingRuleIndex  0
givenName.caseIgnoreMatch                         /dc=com,dc=example/givenName.caseIgnoreMatch                         MatchingRuleIndex  8605
givenName.caseIgnoreSubstringsMatch:6             /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6             MatchingRuleIndex  19629
telephoneNumber.telephoneNumberSubstringsMatch:6  /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6  MatchingRuleIndex  73235
telephoneNumber.telephoneNumberMatch              /dc=com,dc=example/telephoneNumber.telephoneNumberMatch              MatchingRuleIndex  10000
ds-sync-hist.changeSequenceNumberOrderingMatch    /dc=com,dc=example/ds-sync-hist.changeSequenceNumberOrderingMatch    MatchingRuleIndex  0
ds-sync-conflict.distinguishedNameMatch           /dc=com,dc=example/ds-sync-conflict.distinguishedNameMatch           MatchingRuleIndex  0
entryUUID.uuidMatch                               /dc=com,dc=example/entryUUID.uuidMatch                               MatchingRuleIndex  10002
sn.caseIgnoreMatch                                /dc=com,dc=example/sn.caseIgnoreMatch                                MatchingRuleIndex  10000
sn.caseIgnoreSubstringsMatch:6                    /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6                    MatchingRuleIndex  32217
cn.caseIgnoreMatch                                /dc=com,dc=example/cn.caseIgnoreMatch                                MatchingRuleIndex  10000
cn.caseIgnoreSubstringsMatch:6                    /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6                    MatchingRuleIndex  86040
objectClass.objectIdentifierMatch                 /dc=com,dc=example/objectClass.objectIdentifierMatch                 MatchingRuleIndex  6
uid.caseIgnoreMatch                               /dc=com,dc=example/uid.caseIgnoreMatch                               MatchingRuleIndex  10000

Total: 23

To check the status of the indexes and see which keys are full (i.e. exceeded the index-entry-limit), use backendstat show-index-status. Warning, this may take a long time.

$ backendstat show-index-status -b dc=example,dc=com -n userRoot
Index Name                                        Raw DB Name                                                          Valid  Confidential  Record Count  Over Entry Limit  95%  90%  85%
uniqueMember.uniqueMemberMatch                    /dc=com,dc=example/uniqueMember.uniqueMemberMatch                    true   false         0             0                 0    0    0
mail.caseIgnoreIA5SubstringsMatch:6               /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6               true   false         31232         12                0    0    0
mail.caseIgnoreIA5Match                           /dc=com,dc=example/mail.caseIgnoreIA5Match                           true   false         10000         0                 0    0    0
aci.presence                                      /dc=com,dc=example/aci.presence                                      true   false         0             0                 0    0    0
member.distinguishedNameMatch                     /dc=com,dc=example/member.distinguishedNameMatch                     true   false         0             0                 0    0    0
givenName.caseIgnoreMatch                         /dc=com,dc=example/givenName.caseIgnoreMatch                         true   false         8605          0                 0    0    0
givenName.caseIgnoreSubstringsMatch:6             /dc=com,dc=example/givenName.caseIgnoreSubstringsMatch:6             true   false         19629         0                 0    0    0
telephoneNumber.telephoneNumberSubstringsMatch:6  /dc=com,dc=example/telephoneNumber.telephoneNumberSubstringsMatch:6  true   false         73235         0                 0    0    0
telephoneNumber.telephoneNumberMatch              /dc=com,dc=example/telephoneNumber.telephoneNumberMatch              true   false         10000         0                 0    0    0
ds-sync-hist.changeSequenceNumberOrderingMatch    /dc=com,dc=example/ds-sync-hist.changeSequenceNumberOrderingMatch    true   false         0             0                 0    0    0
ds-sync-conflict.distinguishedNameMatch           /dc=com,dc=example/ds-sync-conflict.distinguishedNameMatch           true   false         0             0                 0    0    0
entryUUID.uuidMatch                               /dc=com,dc=example/entryUUID.uuidMatch                               true   false         10002         0                 0    0    0
sn.caseIgnoreMatch                                /dc=com,dc=example/sn.caseIgnoreMatch                                true   false         10000         0                 0    0    0
sn.caseIgnoreSubstringsMatch:6                    /dc=com,dc=example/sn.caseIgnoreSubstringsMatch:6                    true   false         32217         0                 0    0    0
cn.caseIgnoreMatch                                /dc=com,dc=example/cn.caseIgnoreMatch                                true   false         10000         0                 0    0    0
cn.caseIgnoreSubstringsMatch:6                    /dc=com,dc=example/cn.caseIgnoreSubstringsMatch:6                    true   false         86040         0                 0    0    0
objectClass.objectIdentifierMatch                 /dc=com,dc=example/objectClass.objectIdentifierMatch                 true   false         6             4                 0    0    0
uid.caseIgnoreMatch                               /dc=com,dc=example/uid.caseIgnoreMatch                               true   false         10000         0                 0    0    0
Total: 18
Index: /dc=com,dc=example/mail.caseIgnoreIA5SubstringsMatch:6
Over index-entry-limit keys: [.com] [@examp] [ample.] [com] [e.com] [exampl] [le.com] [m] [mple.c] [om] [ple.co] [xample]
Index: /dc=com,dc=example/objectClass.objectIdentifierMatch
Over index-entry-limit keys: [inetorgperson] [organizationalperson] [person] [top]
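If you want to check this regularly, the tail of that output is easy to process with a script. Here is a minimal Python sketch that extracts the over-limit keys per index from text shaped like the output above. It assumes the “Index:” and “Over index-entry-limit keys:” lines look exactly as shown here, which may differ between versions, and it would be confused by a key that itself contains a closing bracket.

```python
import re


def over_limit_keys(report):
    """Map each index name to the list of keys reported over the limit."""
    result = {}
    current = None
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("Index: "):
            current = line[len("Index: "):]
        elif line.startswith("Over index-entry-limit keys:") and current:
            # Keys are printed between square brackets.
            result[current] = re.findall(r"\[([^\]]*)\]", line)
    return result
```

Saving the command output to a file and passing its content to `over_limit_keys` gives a dictionary you can compare from one run to the next, for instance to notice when a surname first goes over the limit in the sn index.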

I hope this long article will help you better understand and tune your ForgeRock Directory Servers for search performance. Please let me know how it goes.