title: Oracle Internet Directory with 10 million entries
link: https://onemoretech.wordpress.com/2009/02/11/oracle-internet-directory-with-10-million-entries/
author: lembobro
description:
post_id: 381
created: 2009/02/11 16:19:15
created_gmt: 2009/02/11 16:19:15
comment_status: open
post_name: oracle-internet-directory-with-10-million-entries
status: publish
post_type: post

Oracle Internet Directory with 10 million entries

This is an old (circa A.D. 2000) article on O’Reilly’s site, Putting Oracle Internet Directory Through Its Paces.

Our directory implementation would consist of a shallow, wide tree with a single parent node… The telephone numbers form the leaf nodes of the tree, upon which various object classes would hang. A single access-control attribute at the parent node allows public access to all leaf nodes…

For our first deployment, however, we opted to run both the LDAP server and database on a single 6-CPU Sun Enterprise 4500 server with dual fiber links to an EMC Symmetrix storage system…

We then loaded the directory with 10 million test nodes (one node = one phone number) and, over the course of a month, proceeded to bombard the server with 330 million LDAP queries. Outside of a five minute loss of service, the OID server performed admirably, producing an average response time of 130 milliseconds…
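To put the quoted figures in perspective, here is a rough back-of-envelope calculation of the average load they imply. It is only a sketch: the article says "a month", so the 30-day period is my assumption, the function names are made up for illustration, and the concurrency figure comes from applying Little's law to the averages rather than from anything the article reports.

```python
# Back-of-envelope load estimate from the figures quoted in the article.
# Assumption (not from the article): the "month" is taken as 30 days.

SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(total_queries: int, days: int = 30) -> float:
    """Average queries per second over the whole test period."""
    return total_queries / (days * SECONDS_PER_DAY)


def average_in_flight(qps: float, avg_response_seconds: float) -> float:
    """Little's law: mean concurrency = arrival rate * mean response time."""
    return qps * avg_response_seconds


if __name__ == "__main__":
    qps = average_qps(330_000_000)
    print(f"average load: {qps:.1f} queries/second")
    print(f"average requests in flight: {average_in_flight(qps, 0.130):.1f}")
```

Under those assumptions the test averaged roughly 127 queries per second. An average says nothing about peaks, so the busiest periods could have been considerably heavier.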

The author (who is still an Oracle consultant) states towards the end of the article that LDAP had never before been deployed on such a massive (10 million entries) scale*. The largest LDAP implementation I am personally acquainted with has around 1 to 2 million entries, with the data actually distributed over several directory servers using superior references and smart referrals as the “glue” to bind them together. If I remember correctly, that environment originally used Novell NDS but moved over to OpenLDAP to keep pace with the performance requirements of the client system (a basic white pages application).

*Sun has a “customer success” story from 2006 claiming a deployment of its directory server for a customer with 230 million registered users, but it’s not clear that there are an equal number of entries in that directory, or even that they’re stored in a single “partition” or directory instance.

For really big directories people usually go with a non-LDAP heavyweight RDBMS application. Although it is expensive and complex compared to OpenLDAP, the Oracle Internet Directory solution has the dual benefit of using an RDBMS back end while still providing an LDAP v3 protocol compliant “listener” that can be addressed by standard LDAP client apps. Whether you’d necessarily choose an LDAP protocol client to serve an app whose data consists of a million or more records is another matter entirely. Think I/O, bandwidth and the inefficiencies of LDAP protocol operations as the number of entries increases: it doesn’t do any good to have millions of entries in your directory if the network connection to it gets saturated when someone tries to retrieve even a fraction of them. What it buys for Oracle is the ability to massively scale service pieces like their OAS (Oracle Application Server) Portal that are heavily dependent on LDAP for configuration and user information.
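The bandwidth concern is easy to put numbers on. The sketch below is purely illustrative: the entry size, the fraction retrieved and the link speed are all hypothetical values I picked for the example, and it ignores LDAP protocol overhead, server-side I/O and any size limits the server might enforce.

```python
# Rough transfer-time estimate for pulling a slice of a large directory.
# All inputs are hypothetical; none of them come from the article.


def transfer_seconds(entries: int, bytes_per_entry: int, link_mbit_per_s: float) -> float:
    """Seconds needed to move `entries` results over the link, ignoring overhead."""
    total_bits = entries * bytes_per_entry * 8
    return total_bits / (link_mbit_per_s * 1_000_000)


if __name__ == "__main__":
    # One tenth of a 10-million-entry directory, assuming 1 KiB per entry
    # and a 100 Mbit/s link dedicated entirely to this one client.
    seconds = transfer_seconds(1_000_000, 1024, 100)
    print(f"{seconds:.0f} seconds to transfer")
```

Even with these generous assumptions a single client retrieving a tenth of the entries ties up the link for well over a minute, which is the kind of saturation the paragraph above warns about.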

Of course one thing everyone needs to understand about scaling an LDAP directory, or any other application for that matter, is that hardware really does matter. There is no substitute for lots of cores and gobs of RAM. Don’t even think about running a single 100,000+ entry LDAP directory on anything less than 4 fast (2 to 3 GHz) cores and 8 GB RAM, no matter what vendor’s directory product you’ve chosen. Anyone who tells you different is only telling you what you want to hear, not what you need to know.

P.S. If you’re planning to use LDAP groups in a directory of any real size, you’re going to be in for a rude awakening on the scalability side, unless you can resist the temptation to use them for anything more than a few thousand members. Once your LDAP groups go over 5,000 members you’re going to hit a very hard wall. This is why I always recommend grouping users by some attribute value (like country, postalcode, title, departmentnumber, companyname, etc).
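A minimal sketch of what grouping by attribute value can look like on the client side: instead of reading a static group and its thousands of member values, the application builds a search filter that selects users by the attributes they carry. The helper names, the `inetOrgPerson` default and the sample values are my own choices for illustration; the escaping follows the rules for filter values in RFC 4515.

```python
# Sketch: select a "group" of users by attribute value rather than by
# membership in a static LDAP group. Helper names and sample values are
# illustrative only.

# Characters that must be escaped inside a filter value (RFC 4515).
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(value: str) -> str:
    """Escape special characters so the value is matched literally."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def attribute_group_filter(object_class: str = "inetOrgPerson", **attrs: str) -> str:
    """Build an AND filter matching entries that carry every given attribute value."""
    clauses = [f"(objectClass={escape_filter_value(object_class)})"]
    for name, value in sorted(attrs.items()):
        clauses.append(f"({name}={escape_filter_value(value)})")
    return "(&" + "".join(clauses) + ")"


if __name__ == "__main__":
    print(attribute_group_filter(departmentNumber="4200", title="Engineer"))
    # prints (&(objectClass=inetOrgPerson)(departmentNumber=4200)(title=Engineer))
```

The resulting string can be handed to whatever LDAP client library you use as the search filter. For this to perform well the attributes involved generally need to be indexed on the server, which is worth checking for your particular directory product.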

Copyright 2004-2019 Phil Lembo