Most LDAP servers can be configured to return an unlimited number of entries for a search, but depending on the size of the directory, the result can exceed your available memory. Moreover, if you want to write portable code, you should not depend on the server being willing to return unlimited entries. For instance, AD’s LDAP generally defaults to a maximum of 1,000 entries.
Because LDAP paging isn’t very difficult to use, there’s little reason not to use it. Adding paging only marginally reduces performance while putting less stress on the LDAP server(s). Personally, I recommend using it as a general practice, even where it isn’t strictly necessary.
The Python LDAP module (python-ldap) supports paging, though it isn’t well documented. I found two examples: this one and this one. Both had their pluses, but neither explained much about what was going on. I melded them together, added comments, and streamlined things a bit. Hopefully this will help you get the mojo…
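The mechanics of paging can be sketched without a server. The toy below is not python-ldap code: "fake_paged_search", "fetch_all" and "ENTRIES" are hypothetical names, and the cookie is just a stringified offset, whereas a real server hands back an opaque value you should never interpret. It only illustrates the handshake: send an empty cookie first, then keep resending the search with whatever cookie came back until the server returns an empty one.

```python
# Toy, in-memory stand-in for a server that honours paged results.
# Nothing here talks to a real directory; all names are hypothetical.
ENTRIES = ['uid=user%d' % i for i in range(7)]


def fake_paged_search(page_size, cookie):
    """Return (page, next_cookie); an empty next_cookie means no more pages."""
    start = int(cookie) if cookie else 0
    page = ENTRIES[start:start + page_size]
    end = start + len(page)
    next_cookie = str(end) if end < len(ENTRIES) else ''
    return page, next_cookie


def fetch_all(page_size):
    """Client side loop: resend the search with the latest cookie until it is empty."""
    results = []
    cookie = ''  # the first request always carries an empty cookie
    while True:
        page, cookie = fake_paged_search(page_size, cookie)
        results.extend(page)
        if not cookie:
            break
    return results
```

The real script follows the same loop, with the page size and cookie carried inside the paged results control instead of plain function arguments.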
    from __future__ import print_function

    import sys

    import ldap
    import ldap.controls

    # If you're talking to LDAP, you should be using LDAPS for security!
    LDAPSERVER = 'ldaps://ldap.somecompany.com'
    BASEDN = 'cn=users,dc=somecompany,dc=com'
    LDAPUSER = 'uid=someuser,dc=somecompany,dc=com'
    LDAPPASSWORD = 'somepassword'
    PAGESIZE = 1000
    ATTRLIST = ['uid', 'shadowLastChange', 'shadowMax', 'shadowExpire']
    SEARCHFILTER = 'uid=*'

    # Ignore server side certificate errors (assumes using LDAPS and
    # self-signed cert). Not necessary if not LDAPS or it's signed by
    # a real CA.
    ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_ALLOW)
    # Don't follow referrals
    ldap.set_option(ldap.OPT_REFERRALS, 0)

    l = ldap.initialize(LDAPSERVER)
    l.protocol_version = 3  # Paged results only apply to LDAP v3
    try:
        l.simple_bind_s(LDAPUSER, LDAPPASSWORD)
    except ldap.LDAPError as e:
        sys.exit('LDAP bind failed: %s' % e)

    # Initialize the LDAP controls for paging. Note that we pass ''
    # for the cookie because on first iteration, it starts out empty.
    # (This is the pre-2.4 Python LDAP API; see UPDATE 2 and UPDATE 3.)
    lc = ldap.controls.SimplePagedResultsControl(
        ldap.LDAP_CONTROL_PAGE_OID, True, (PAGESIZE, ''))


    # This is essentially a placeholder callback function. You would do your real
    # work inside of this. Really this should be all abstracted into a generator...
    def process_entry(dn, attrs):
        """Process an entry. The two arguments passed are the DN and
        a dictionary of attributes."""
        print(dn, attrs)


    # Do searches until we run out of "pages" to get from
    # the LDAP server.
    while True:
        # Send search request
        try:
            # If you leave out the ATTRLIST it'll return all attributes
            # which you have permissions to access. You may want to adjust
            # the scope level as well (perhaps "ldap.SCOPE_SUBTREE", but
            # it can reduce performance if you don't need it).
            msgid = l.search_ext(BASEDN, ldap.SCOPE_ONELEVEL, SEARCHFILTER,
                                 ATTRLIST, serverctrls=[lc])
        except ldap.LDAPError as e:
            sys.exit('LDAP search failed: %s' % e)

        # Pull the results from the search request
        try:
            rtype, rdata, rmsgid, serverctrls = l.result3(msgid)
        except ldap.LDAPError as e:
            sys.exit('Could not pull LDAP results: %s' % e)

        # "rdata" is a list of tuples of the form (dn, attrs), where dn is
        # a string containing the DN (distinguished name) of the entry,
        # and attrs is a dictionary containing the attributes associated
        # with the entry. The keys of attrs are strings, and the associated
        # values are lists of strings.
        for dn, attrs in rdata:
            process_entry(dn, attrs)

        # Look through the returned controls and find the page controls.
        # This will also have our returned cookie which we need to make
        # the next search request.
        pctrls = [
            c for c in serverctrls
            if c.controlType == ldap.LDAP_CONTROL_PAGE_OID
        ]
        if not pctrls:
            print('Warning: Server ignores RFC 2696 control.', file=sys.stderr)
            break

        # Ok, we did find the page control, yank the cookie from it and
        # insert it into the control for our next search. If however there
        # is no cookie, we are done!
        est, cookie = pctrls[0].controlValue
        if not cookie:
            break
        lc.controlValue = (PAGESIZE, cookie)
As a final note, one of the documents I found said the paged controls did not work with OpenLDAP. That’s not what I found – pretty much the exact code above worked without issue with OpenLDAP.
UPDATE:
A GitHub “Gist” for the above can be found here.
UPDATE 2:
Users of Python LDAP 2.4 should check out the comment by Ilya Rumyantsev, which gives a forward/backward compatible set of code snippets, since the API has changed a bit. Many thanks to Ilya for the update.
UPDATE 3:
Below, I took Ilya’s updates and merged them in, with some minor enhancements to check the Python LDAP version on the fly. My next step is to convert this to a generator function, which would be better than using a callback. The issue with going to a generator is handling the errors, which means throwing exceptions in some sane fashion…
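As a rough idea of where the generator version could go, here is a minimal sketch. It assumes the Python LDAP 2.4+ API only (the control exposes "cookie" and "controlType" attributes), and "paged_search" is a hypothetical name, not something the module provides. It sidesteps the error-handling question in the simplest way I can think of: nothing is caught inside the generator, so any exception surfaces in the caller's loop.

```python
def paged_search(conn, base, scope, filterstr, attrlist, page_control):
    """Yield (dn, attrs) for every entry, fetching one page at a time.

    Assumes the python-ldap 2.4+ API, where the paging control exposes
    "cookie" and "controlType" attributes. Errors are not caught here:
    any exception raised by conn propagates to whoever is iterating.
    """
    while True:
        msgid = conn.search_ext(base, scope, filterstr, attrlist,
                                serverctrls=[page_control])
        rtype, rdata, rmsgid, serverctrls = conn.result3(msgid)
        for dn, attrs in rdata:
            yield dn, attrs
        # Find the paging control the server sent back, if any.
        pctrls = [c for c in serverctrls
                  if c.controlType == page_control.controlType]
        if not pctrls or not pctrls[0].cookie:
            return  # server ignored the control, or this was the last page
        page_control.cookie = pctrls[0].cookie

# Hypothetical usage with a bound connection "l" and a 2.4+ style control:
#   lc = SimplePagedResultsControl(criticality=False, size=PAGESIZE, cookie='')
#   try:
#       for dn, attrs in paged_search(l, BASEDN, ldap.SCOPE_ONELEVEL,
#                                     SEARCHFILTER, ATTRLIST, lc):
#           process_entry(dn, attrs)
#   except ldap.LDAPError as e:
#       sys.exit('LDAP search failed: %s' % e)
```

Passing the connection and control in as arguments keeps the generator free of globals, and lets the caller decide whether a failure mid-stream should abort or be retried.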
    #!/usr/bin/python
    from __future__ import print_function

    import sys
    from distutils.version import LooseVersion

    import ldap
    from ldap.controls import SimplePagedResultsControl

    # Check if we're using the Python "ldap" 2.4 or greater API
    LDAP24API = LooseVersion(ldap.__version__) >= LooseVersion('2.4')

    # If you're talking to LDAP, you should be using LDAPS for security!
    LDAPSERVER = 'ldaps://ldap.somecompany.com'
    BASEDN = 'cn=users,dc=somecompany,dc=com'
    LDAPUSER = 'uid=someuser,dc=somecompany,dc=com'
    LDAPPASSWORD = 'somepassword'
    PAGESIZE = 1000
    ATTRLIST = ['uid', 'shadowLastChange', 'shadowMax', 'shadowExpire']
    SEARCHFILTER = 'uid=*'


    def create_controls(pagesize):
        """Create an LDAP control with a page size of 'pagesize'."""
        # Initialize the LDAP controls for paging. Note that we pass ''
        # for the cookie because on first iteration, it starts out empty.
        # Note you may want to set "criticality=True" if you must have
        # paging.
        if LDAP24API:
            return SimplePagedResultsControl(criticality=False,
                                             size=pagesize, cookie='')
        else:
            return SimplePagedResultsControl(ldap.LDAP_CONTROL_PAGE_OID, False,
                                             (pagesize, ''))


    def get_pctrls(serverctrls):
        """Lookup an LDAP paged control object from the returned controls."""
        # Look through the returned controls and find the page controls.
        # This will also have our returned cookie which we need to make
        # the next search request.
        if LDAP24API:
            return [c for c in serverctrls
                    if c.controlType == SimplePagedResultsControl.controlType]
        else:
            return [c for c in serverctrls
                    if c.controlType == ldap.LDAP_CONTROL_PAGE_OID]


    def set_cookie(lc_object, pctrls, pagesize):
        """Push latest cookie back into the page control."""
        if LDAP24API:
            cookie = pctrls[0].cookie
            lc_object.cookie = cookie
            return cookie
        else:
            est, cookie = pctrls[0].controlValue
            lc_object.controlValue = (pagesize, cookie)
            return cookie


    # This is essentially a placeholder callback function. You would do your real
    # work inside of this. Really this should be all abstracted into a generator...
    def process_entry(dn, attrs):
        """Process an entry. The two arguments passed are the DN and
        a dictionary of attributes."""
        print(dn, attrs)


    # Ignore server side certificate errors (assumes using LDAPS and
    # self-signed cert). Not necessary if not LDAPS or it's signed by
    # a real CA.
    ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_ALLOW)
    # Don't follow referrals
    ldap.set_option(ldap.OPT_REFERRALS, 0)

    l = ldap.initialize(LDAPSERVER)
    l.protocol_version = 3  # Paged results only apply to LDAP v3
    try:
        l.simple_bind_s(LDAPUSER, LDAPPASSWORD)
    except ldap.LDAPError as e:
        sys.exit('LDAP bind failed: %s' % e)

    # Create the page control to work from
    lc = create_controls(PAGESIZE)

    # Do searches until we run out of "pages" to get from
    # the LDAP server.
    while True:
        # Send search request
        try:
            # If you leave out the ATTRLIST it'll return all attributes
            # which you have permissions to access. You may want to adjust
            # the scope level as well (perhaps "ldap.SCOPE_SUBTREE", but
            # it can reduce performance if you don't need it).
            msgid = l.search_ext(BASEDN, ldap.SCOPE_ONELEVEL, SEARCHFILTER,
                                 ATTRLIST, serverctrls=[lc])
        except ldap.LDAPError as e:
            sys.exit('LDAP search failed: %s' % e)

        # Pull the results from the search request
        try:
            rtype, rdata, rmsgid, serverctrls = l.result3(msgid)
        except ldap.LDAPError as e:
            sys.exit('Could not pull LDAP results: %s' % e)

        # "rdata" is a list of tuples of the form (dn, attrs), where dn is
        # a string containing the DN (distinguished name) of the entry,
        # and attrs is a dictionary containing the attributes associated
        # with the entry. The keys of attrs are strings, and the associated
        # values are lists of strings.
        for dn, attrs in rdata:
            process_entry(dn, attrs)

        # Get cookie for next request
        pctrls = get_pctrls(serverctrls)
        if not pctrls:
            print('Warning: Server ignores RFC 2696 control.', file=sys.stderr)
            break

        # Ok, we did find the page control, yank the cookie from it and
        # insert it into the control for our next search. If however there
        # is no cookie, we are done!
        cookie = set_cookie(lc, pctrls, PAGESIZE)
        if not cookie:
            break

    # Clean up
    l.unbind()

    # Done!
    sys.exit(0)
UPDATE 4:
It turns out that the Python “ldap” module does not follow “StrictVersion” versioning in its “__version__” string. I have updated the “UPDATE 3” code to use “LooseVersion” comparisons.
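One caveat for newer interpreters: “distutils” was removed from the standard library in Python 3.12, so the “LooseVersion” import will fail there. The sketch below is a substitute technique, not what the script above uses: it compares only the leading numeric components of the version string. The helper names “version_tuple” and “has_ldap24_api” are my own, and the sample version strings are illustrative only.

```python
import re


def version_tuple(version):
    """Turn '2.4.10' or '2.4.25.post1' into a tuple of its leading integers."""
    match = re.match(r'\d+(?:\.\d+)*', version)
    if match is None:
        raise ValueError('no leading version number in %r' % (version,))
    return tuple(int(part) for part in match.group(0).split('.'))


def has_ldap24_api(version):
    """True when the given version string is 2.4 or newer."""
    # Compare major/minor numerically so that '2.10' sorts after '2.4'.
    return version_tuple(version)[:2] >= (2, 4)

# Hypothetical usage in place of the LooseVersion check:
#   LDAP24API = has_ldap24_api(ldap.__version__)
```

Comparing integer tuples rather than strings avoids the classic trap where '2.10' sorts before '2.4'.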
UPDATE 5:
I updated the above code to default to “criticality=False” for the paging control. With “criticality=True”, an LDAP service that doesn’t support paging will yield a potentially confusing “Critical extension is unavailable” error.
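If you would rather know up front whether the server supports paging, one option is to ask its root DSE, which normally advertises the controls it understands in the “supportedControl” attribute. This is a sketch under assumptions, not something the script above does: “supports_paging” is a hypothetical helper, it assumes the root DSE is readable by the bound user, and it assumes the server returns the attribute under exactly that name.

```python
PAGED_RESULTS_OID = '1.2.840.113556.1.4.319'  # OID of the RFC 2696 paging control
SCOPE_BASE = 0  # same numeric value as ldap.SCOPE_BASE


def supports_paging(conn):
    """Return True if the server's root DSE lists the paged results control.

    "conn" is expected to behave like a bound python-ldap connection.
    """
    results = conn.search_s('', SCOPE_BASE, '(objectClass=*)',
                            ['supportedControl'])
    for dn, attrs in results:
        for value in attrs.get('supportedControl', []):
            # Attribute values may come back as bytes, depending on versions.
            if isinstance(value, bytes):
                value = value.decode('ascii')
            if value == PAGED_RESULTS_OID:
                return True
    return False
```

With a check like this you could set “criticality=True” only when paging is advertised, and fall back to an unpaged search otherwise.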
Note that I ultimately need to fix the exception handling: for whatever reason, the exception object passed back doesn’t have a reasonable “__str__()” method, and the message is left in the “desc” key.
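Until that is fixed properly, a small helper can pull the readable text out of the exception. This is a minimal sketch, assuming the exception's first argument is a dictionary carrying “desc” and, optionally, a more detailed “info” entry; “ldap_error_message” is a hypothetical name of my own.

```python
def ldap_error_message(exc):
    """Build a readable message from an LDAP style exception.

    Assumes exc.args[0] may be a dict with "desc" and optionally "info";
    falls back to str(exc) for anything else.
    """
    detail = exc.args[0] if exc.args else None
    if isinstance(detail, dict):
        parts = [detail.get('desc'), detail.get('info')]
        text = ': '.join(str(p) for p in parts if p)
        if text:
            return text
    return str(exc)

# Hypothetical usage in the script above:
#   except ldap.LDAPError as e:
#       sys.exit('LDAP bind failed: %s' % ldap_error_message(e))
```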