http://www.developer.com/

Introduction to LDAP (Lightweight Directory Access Protocol)


April 25, 2003

This is Chapter 1: Introduction to LDAP from the book LDAP Programming, Management and Integration (ISBN: 1-93011-040-5), written by Clayton Donley, published by Manning Publications Co.

© Copyright Manning Publications Co. All rights reserved.

Introduction to LDAP

1.1 What LDAP is
1.2 What LDAP is not
1.3 Current applications
1.4 Brief history
1.5 LDAP revisions and other standards
1.6 Directory management
1.7 Directory integration
1.8 Integration and federation via virtual directory technology
1.9 Why this book?
1.10 Summary

In this chapter, we introduce the Lightweight Directory Access Protocol (LDAP) and attempt to answer the following questions:

  • What is LDAP? Who needs it? How is it used?
  • What are directory services? Where do they fit in the grand scheme of things? Which ones exist? What is their relation to LDAP?
  • What are common issues in planning and deploying directory services?
  • Where do metadirectories, provisioning tools, and virtual directories fit with LDAP?
  • What standards organizations and industry consortia are responsible for further development of directory services and LDAP standards?

1.1 WHAT LDAP IS

LDAP is a standard that computers and networked devices can use to access common information over a network. The ability to provide network access to data in itself does not make LDAP stand out from dozens of other protocols defined for data access, such as Hypertext Transfer Protocol (HTTP). As you will see in this chapter and those following, a number of features and vendor efforts make LDAP very well-suited for access and updates to many types of common information.

For example, information about employees might be stored in a directory so that people and applications can locate their contact information. Such contact information might include email addresses and fax numbers, or even additional data that can be used to unambiguously identify employees when they attempt to access enterprise applications.

1.1.1 Directory services and directory servers

A directory is simply a collection of information. For example, the telephone book is a directory used by virtually everyone to find telephone numbers.

Directory services provide access to the information in a directory. A simple directory service that most people use from time to time is the directory assistance offered by most telephone companies. By dialing a telephone number, anyone can receive instant access to information in the telephone directory.

In the computer world, directories exist everywhere. The Unix password file can be considered a directory of computer accounts. The Domain Name System (DNS) acts as a directory service providing information about network hosts.

Computer applications often have their own directories. The Apache web server can store usernames and passwords in a data file, which is thus a directory of users. Customer information stored in a database can also be considered directory information if it is of a common nature with applications outside a single program or system.

Directory servers are applications that primarily act as directory services, providing information from a directory to other applications or end users. This functionality is most applicable in client/server environments, where the service may be located remotely from the calling application or system. For example, on Unix or Linux computers running the Network Information Service (NIS), the ypserv program can be considered a directory server.

1.1.2 LDAP and directory services

LDAP provides client-server access to directories over a computer network and is therefore a directory service. In addition to offering the ability to search and read information, it defines a way to add, update, and delete information in a directory.
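As a rough illustration of the update side of the protocol, the following sketch uses JNDI to describe a change to an entry. It only builds the list of modifications; the apply method would send them over an existing connection and is not called here. The class name, attribute values, and helper methods are assumptions made for this example. In JNDI, adding and deleting whole entries are handled by createSubcontext and destroySubcontext, which are not shown.

```java
import javax.naming.NamingException;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.DirContext;
import javax.naming.directory.ModificationItem;

class UpdateSketch {
    // Describes two changes to one entry; nothing is sent to a server here.
    static ModificationItem[] buildChanges(String phone, String extraMail) {
        return new ModificationItem[] {
            // Replace every existing telephoneNumber value with the new one.
            new ModificationItem(DirContext.REPLACE_ATTRIBUTE,
                                 new BasicAttribute("telephoneNumber", phone)),
            // Add one more value to the mail attribute.
            new ModificationItem(DirContext.ADD_ATTRIBUTE,
                                 new BasicAttribute("mail", extraMail))
        };
    }

    // Sends the changes as a single modify request over an open connection.
    static void apply(DirContext dc, String dn, ModificationItem[] changes)
            throws NamingException {
        dc.modifyAttributes(dn, changes);
    }

    // Human-readable summary of one change, for logging or inspection.
    static String describe(ModificationItem item) {
        int op = item.getModificationOp();
        String verb = op == DirContext.ADD_ATTRIBUTE ? "add"
                    : op == DirContext.REPLACE_ATTRIBUTE ? "replace"
                    : "remove";
        return verb + " " + item.getAttribute().getID();
    }

    public static void main(String[] args) {
        for (ModificationItem item : buildChanges("+1 555 0100", "bob@example.com")) {
            System.out.println(describe(item));
        }
    }
}
```

Keeping the description of a change separate from the call that applies it makes the change easy to inspect before it reaches a server.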

Two general types of directory server software implement the LDAP standards:

  • Stand-alone LDAP servers
  • LDAP gateway servers

Stand-alone LDAP servers focus exclusively on LDAP as their only access mechanism; their proprietary internal data stores are tuned for LDAP access. These are typically what people mean when they use the words LDAP server.

Instead of being tied to a local data store, LDAP gateway servers translate between LDAP and some other native network protocol or application program interface (API) to provide access to directory information that is more directly available via other means. One example is the original use of LDAP: to gateway to other directory services supporting the X.500 standards. Another more modern example of such an LDAP gateway is a server that provides LDAP access to information residing in Oracle database tables.

Figure 1.1 illustrates the two types of services that can be used to provide LDAP-enabled directory services.

Figure 1.1
LDAP directories and LDAP gateways are different types of products that provide LDAP-enabled directory services.

The examples throughout this book will not address one type of server over the other—the idea behind LDAP is that it shouldn't matter where the end data is stored, as long as the client and server can use LDAP to communicate that information in a standard way understood by both sides.

In addition, we will focus primarily on accessing and managing information and services through the LDAP protocol. Each directory server product is installed and configured differently, usually in ways that are well-documented in product manuals. It would be of little use to duplicate such information, because installation and configuration of the software is relatively trivial.

1.1.3 Other directory services

LDAP is not alone in providing computerized directory services. It is also not the first or even the most completely defined directory service.

Other directory services that have been popular in the past, and that are still in use in many organizations, include those based on standards such as X.500, WHOIS, NIS, PH/QI, and various proprietary directories from companies such as Novell, Banyan, and others.

X.500 is a set of standards that originated in the late 1980s, with significant updates as late as 2001. The standards are extensive and cover everything from access to replication. In many respects, X.500 is more mature as a protocol than LDAP, including such technologies as multimaster replication and access control, but its relative complexity has made it less popular for access. However, it is still very popular, and a number of vendors sell servers that support these standards. These vendors tend to focus on X.500-based protocols for interoperability between servers, while exposing the data using an LDAP gateway.

WHOIS was an early attempt at a simple protocol for Internet-accessible white pages. The services supporting this protocol took a simple string and returned free-form text in response. A WHOIS server could be written on most operating systems in a short amount of time, but lack of standard data representation made it difficult to do anything but display the results as they arrived. Unfortunately, this limitation makes programmatic use of the resulting data in applications other than white pages very difficult.

NIS, originally called Yellow Pages (YP), was Sun's remote procedure call (RPC)-based operating system directory. Most Unix-based servers support some variant of this protocol. It offered a relatively simple replication model and access protocol, as well as the ability to discover servers on a local network, and its creation was driven by the growth in client-server computing, where users might exist on a number of servers. However, it was not well-suited for wide area networks (WANs), offered little in the way of security, and was not easily extensible for storing additional information in existing maps.

PH/QI was very popular at about the time HTTP became widely used. It was a multipurpose client-server directory service developed by Paul Pomes at the University of Illinois at Urbana-Champaign (UIUC). It was especially popular at universities in North America and was used to store not only white pages information, but also information that could be used for security, such as logins and credentials. One of the earliest applications to take advantage of the Common Gateway Interface (CGI) that shipped with the original National Center for Supercomputing Applications (NCSA) HTTP server was a gateway that presented an HTML interface to a PH server. Some mail applications, such as Eudora, were also able to perform PH queries for address books. LDAP's acceptance in the industry curtailed any serious move to PH/QI; in addition, the service was somewhat limited. The protocol was relatively simple and text-based; it was easy to access programmatically but designed to run on a central server, limiting its scalability and scope.

Banyan was an early leader in MS-DOS/Windows operating system directories, but it didn't fare well as Microsoft and Novell became more directory-aware. Banyan eventually changed its name to ePresence and is currently one of the larger integrators focused on directory services.

Novell based the proprietary directory service for its Netware Network Operating System (NOS) on the X.500 standards. Netware's directory has long been regarded as one of the more solid operating system directories, and Novell has a long history of directory integration in its products. As LDAP picked up steam, Novell separated the NOS from the directory and created eDirectory; it is now a popular LDAP-enabled directory service with the broadest platform support of any directory services vendor's product.

1.2 WHAT LDAP IS NOT

LDAP is an access protocol that shares data using a particular information model. The data to which it provides access may reside in a database, in memory, or just about anywhere else the LDAP server may access. It is important that the data be presented to an LDAP client in a way that conforms to LDAP's information model.

LDAP is being used for an increasing number of applications. Most of these applications are appropriate—but some aren't. To get a better idea what LDAP should and shouldn't be used for, we begin this section with an overview of LDAP limitations that make it a bad choice for certain types of applications.

LDAP is not:

  • A general replacement for relational databases
  • A file system for very large objects
  • Optimal for very dynamic objects
  • Useful without applications

1.2.1 LDAP is not a relational database

LDAP is not a relational database and does not provide protocol-level support for relational integrity, transactions, or other features found in an RDBMS. Applications that require rollback when any one of multiple operations fails cannot be implemented with the current version of LDAP, although some vendors implement such functionality when managing their underlying datafiles. LDAP breaks a number of database normalization rules. For example, first normal form (1NF) states that fields with repeating values must be placed into separate tables; instead, LDAP supports multi-valued data fields.
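To make the point about multi-valued fields concrete, here is a minimal sketch that builds an entry in memory with JNDI's attribute classes. No server is involved, and the names and phone numbers are invented for the example. A single telephoneNumber attribute carries two values, where a normalized relational design would use a separate table of phone numbers.

```java
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;

class MultiValuedSketch {
    // Builds an in-memory entry; the values are made up for illustration.
    static Attributes buildEntry() {
        Attributes entry = new BasicAttributes(true); // true = ignore case of attribute names
        entry.put("cn", "Bob Smith");

        // One attribute, two values: no second "table" is needed.
        Attribute phones = new BasicAttribute("telephoneNumber");
        phones.add("+1 555 0100");
        phones.add("+1 555 0199");
        entry.put(phones);
        return entry;
    }

    // Number of values stored under the given attribute, or 0 if it is absent.
    static int valueCount(Attributes entry, String attrId) {
        Attribute attr = entry.get(attrId);
        return attr == null ? 0 : attr.size();
    }

    public static void main(String[] args) {
        Attributes entry = buildEntry();
        System.out.println("telephoneNumber values: " + valueCount(entry, "telephoneNumber"));
        System.out.println("cn values: " + valueCount(entry, "cn"));
    }
}
```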

Some LDAP server vendors proclaim that directories are somehow faster than relational databases. In some cases, this is true. In other cases, databases are both faster and more scalable. Nothing inherent in the LDAP protocol makes it in any way faster than other data access mechanisms, such as Open Database Connectivity (ODBC). Everything depends on how the underlying data store is tuned.

LDAP lacks features found in relational databases even in cases where LDAP sits on top of a relational data store, as is true with Oracle and IBM directory server products. The LDAP protocol currently has no standard for transmitting the type of information necessary to take advantage of the powerful relational and transactional capabilities present in the underlying data store.

1.2.2 LDAP is not a file system for very large objects

LDAP provides a hierarchical way of naming information that looks remarkably like that found in most file systems. Many people see this aspect of LDAP as an indication that it might be a great way to centrally store files to make them accessible over a network.
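The resemblance to a file system path can be seen by taking a distinguished name apart. The sketch below uses the JDK's LdapName class to find the leaf component and the parent of a name, much as you would split a file path into a file name and its containing folder. The sample name is invented, and the helper methods are assumptions made for this example; they expect a name with at least one component.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;

class DnHierarchySketch {
    // Parses a distinguished name, turning the checked exception into an unchecked one.
    static LdapName parse(String dn) {
        try {
            return new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    // The leftmost component names the entry itself, like a file name in a path.
    // LdapName numbers components from the right, so the leaf is at index size-1.
    static String leafOf(String dn) {
        LdapName name = parse(dn);
        return name.getRdn(name.size() - 1).toString();
    }

    // Everything except the leaf names the parent, like the containing folder.
    static String parentOf(String dn) {
        LdapName name = parse(dn);
        return name.getPrefix(name.size() - 1).toString();
    }

    // Number of components, comparable to the depth of a path.
    static int depthOf(String dn) {
        return parse(dn).size();
    }

    public static void main(String[] args) {
        String dn = "uid=bsmith,ou=People,dc=example,dc=com";
        System.out.println("leaf:   " + leafOf(dn));
        System.out.println("parent: " + parentOf(dn));
        System.out.println("depth:  " + depthOf(dn));
    }
}
```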

In fact, LDAP is not a great way to do network file sharing. Although it allows information (including binary data) to be transmitted and stored, it does not have the locking, seeking, and advanced features found in most modern file-sharing protocols. Figure 1.2 shows some of the disadvantages of using LDAP in this manner.

Figure 1.2
LDAP is not a network file system. Here you see that if you stored a large file using LDAP, clients would need to read the entire file via LDAP rather than page through the applicable sections. If either client died in midtransfer, it would need to start again from scratch.

The Network File System (NFS) and similar file-sharing protocols have this advanced functionality and are well-tested and accepted for use on local intranets. Web protocols such as HTTP and the File Transfer Protocol (FTP) are more appropriate when you're providing Internet access to data on local file systems.

In a similar vein, LDAP is often only marginally useful to store serialized objects, large structured documents (such as XML), and similar types of data in the directory. Because the LDAP server may not know how to parse these blobs of data, it will not be able to search on attributes within them.

For example, if you store XML documents in the directory, you will not be able to search for all XML documents in the directory that implement a particular document type unless you also store the document's type in the directory. Such a process involves duplicating information already stored in the XML document.

Without storing this metadata, the XML document is an opaque object that can only be stored and retrieved in full. By contrast, a good file-based XML parser has the ability to seek through parts of the XML document and retrieve or manipulate only those sections that are pertinent to the current operation. This situation may be changing as LDAP vendors become increasingly XML savvy and begin supporting such functionality as XPath searching.

Note that because the LDAP protocol is separate from the data to which it provides access, it is possible for a particular LDAP server to be extended to handle particular types of objects more intelligently. For example, the server might include an XML parser that indexes XML documents for easier search and retrieval with LDAP. We'll explore this process briefly in the context of attribute syntax and matching rules in chapter 2.

1.2.3 LDAP is not optimal for very dynamic objects

Generally speaking, LDAP is not the place to store very dynamic information. For example, there are a number of reasons it would be unwise to write extensive audit logs to an LDAP entry each time a user accesses a system.

First, most LDAP servers optimize for search performance at considerable cost in write performance. Updating a single attribute in some LDAP environments takes longer than a comparable update to a well-designed database.

Second, even with high write performance, LDAP as a protocol does not have facilities to ensure that a set of transactions will happen in the right order. This complicates even the simplest updates to dynamic information involving multiple applications or threads. Even a simple counter can get corrupted when two applications try to update it simultaneously.
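The counter problem can be shown without a server at all. In the sketch below, a small in-memory map stands in for a directory entry (it is not LDAP, and the loginCount attribute is hypothetical). Two applications each read the counter, add one, and write the result back. Because neither knows about the other's read, one of the two increments is lost.

```java
import java.util.HashMap;
import java.util.Map;

class LostUpdateSketch {
    // Stand-in for a directory entry that holds a counter; not a real LDAP server.
    private final Map<String, Integer> entry = new HashMap<>();

    LostUpdateSketch(int initial) {
        entry.put("loginCount", initial);
    }

    int read() {
        return entry.get("loginCount");
    }

    void write(int value) {
        entry.put("loginCount", value);
    }

    // Replays the interleaving step by step so the outcome is deterministic.
    static int simulate() {
        LostUpdateSketch directory = new LostUpdateSketch(5);
        int seenByA = directory.read();   // application A reads 5
        int seenByB = directory.read();   // application B also reads 5
        directory.write(seenByA + 1);     // A writes 6
        directory.write(seenByB + 1);     // B overwrites with 6; A's increment is lost
        return directory.read();
    }

    public static void main(String[] args) {
        System.out.println("Counter after two increments: " + simulate());
    }
}
```

Two increments were requested, yet the counter ends at 6 rather than 7. A database would typically guard against this with a transaction or a row lock, which is the kind of facility the protocol does not define.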

Finally, even if a particular server supports tuning for updates and adds proprietary protocol extensions for better locking and multiapplication updates, using these special features may forfeit a major benefit of LDAP: the ability of application developers to use LDAP without having to take note of the server implementation being used.

1.2.4 LDAP is not useful without applications

LDAP lacks an SQL-like general reporting language of the kind found with most general-purpose databases. Such reporting languages can often be used to generate sophisticated reports from a database. Because directories are used for more generally useful information, such as account information usable by many applications, this lack of report generation support is insignificant.

Lack of generalized report generation makes it even more important that LDAP directories be built around the notion that applications will be using them. In addition, it's important that LDAP directory services be designed and deployed with full cooperation from the application developers who will use the service.

Although it lacks a general report-generation language, LDAP offers a number of powerful APIs. Many of these APIs are based on well-documented industry standards whose wide acceptance has been one of the strongest drivers of early LDAP adoption. Unlike databases, directories using LDAP have a wire protocol that can be used without using special vendor drivers, making directories important for information that can benefit many applications that otherwise have nothing in common.

Thanks to the ease with which these APIs can be used, a large number of applications now provide native support for LDAP where it makes sense. You can find some of these LDAP-enabled applications, such as those providing shared address book or white pages functionality, on the Internet and in nearly all modern email and web browser software.

LDAP is now mature technology used by a wide variety of applications for many critical purposes. These applications include everything from authentication, authorization, and management of application and operating system users to routing of billions of email messages around the world. New applications are developed every day that ensure that LDAP's importance will continue to grow.

1.3 CURRENT APPLICATIONS

As we just discussed, successful directory services depend on application support. In this section we begin to examine the types of applications that normally leverage LDAP-enabled directories.

1.3.1 White pages

One of the first uses of enterprise directories was to provide electronic shared address books, called white pages (see figure 1.3). LDAP has long been used to provide access to information that enables white pages functionality. In fact, white pages applications are the most widely deployed and visible LDAP-enabled applications.



Figure 1.3
This screen from the Outlook Express email client is an example of a white pages application.

Both Netscape and Internet Explorer have built-in support for searching LDAP directories and presenting the results in the form of an address book. Most email applications released in the past few years provide this same functionality, although some still support their own proprietary standards to remain compatible with legacy workgroup-oriented directories. Figure 1.4 shows how such a client might talk to a directory to retrieve this information.



Figure 1.4
An address book client talks directly to an LDAP server.

A quick chat with most corporate intranet webmasters would reveal that the most frequently accessed application on an intranet is usually a corporate contact database. Everyone from the mailroom clerk to the CEO needs to be able to locate their peers; therefore, it is the simplest application available to demonstrate the power and simplicity provided by directory access.

Web-based white pages applications are useful for extending LDAP information to points beyond an intranet environment when firewalls or a lack of installed clients prevent pure LDAP communication. Figure 1.5 shows how a web server might act as a gateway for white pages requests from an end-user's web browser.



Figure 1.5
The same directory shown in figure 1.4, with a web application rather than the end-user's client communicating via LDAP.

Most people already have an LDAP-enabled browser or email client, or can access white pages via a web interface. This simplifies deployment and allows for more widespread access.

In fact, creating an application that can search for information in LDAP is not particularly difficult. The following is a full code listing in Java using the Java Naming and Directory Interface (JNDI) for a program that can search for information in an LDAP-enabled directory service:

import javax.naming.directory.*;
import javax.naming.*;
import java.util.Properties;

public class SearchLDAP {
  public static void main(String[] args) {
    // An empty base with object scope reads the server's root entry.
    String base = "";
    String filter = "(objectclass=*)";

    // Tell JNDI to use the LDAP provider and where to find the server.
    Properties env = new Properties();
    env.put(DirContext.INITIAL_CONTEXT_FACTORY,
            "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(DirContext.PROVIDER_URL, "ldap://localhost:389");

    try {
        DirContext dc = new InitialDirContext(env);
        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.OBJECT_SCOPE);

        // Run the search and print each entry that comes back.
        NamingEnumeration<SearchResult> ne = dc.search(base, filter, sc);
        while (ne.hasMore()) {
          SearchResult sr = ne.next();
          System.out.println(sr.toString() + "\n");
        }

        // Close the connection only after all results have been read.
        dc.close();
    } catch (NamingException nex) {
        System.err.println("Error: " + nex.getMessage());
    }
  }
}

The results of this code are not pretty, but they show how easy it is to tie LDAP into a new or existing application for white pages or other lookup functionality.

Another benefit of using a web-based white pages application is that whereas most browsers and email clients enable LDAP searches, a web-based application can offer a point of self-administration for contact information. Information such as phone numbers and mailing addresses can be managed using a simple interface that is integrated with the search tools. This approach makes it easy for someone to change his or her information quickly when necessary.

1.3.2 Authentication and authorization

It is virtually impossible to discuss user access and system security today without LDAP being part of the conversation. Although it isn't as visible to the casual user, LDAP is emerging as the de facto way to access the identity information and credentials needed to support authentication. Authentication is the process of validating the identity of a user (or any other object, such as an application).

This process allows identity information to be managed and distributed much more easily than via traditional means. Information stored in an LDAP-enabled data store can be segmented for simpler management while presenting a unified view to applications and authentication services.

Using LDAP also has the benefit of reusing identity information. This approach offers a significant advantage over authentication processes that use an operating system or proprietary mechanism. For example, using LDAP allows both Unix- and Windows-based servers running a particular application to authenticate users in the same manner and from the same repository. In effect, application development time is reduced, authentication code is relatively static between platforms, and the administrative cost of managing two identity repositories is removed. Figure 1.6 shows how an application might use LDAP to authenticate a user.

Figure 1.6
Bob Smith uses a browser to access information on a protected web server. The web server first binds to the LDAP directory to authenticate him.

After authenticating, it is possible to use other available information about the authenticated user (such as department, company, age, and so on) to determine whether he or she is authorized to perform a particular action on resources within a particular computing environment or application.
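The two steps, authentication by binding and authorization from attributes, might be sketched with JNDI as follows. This is only an outline under several assumptions: the server URL and user name are placeholders, the departmentNumber attribute is one possible choice for the department, and a real deployment would normally protect the password with an encrypted connection. The authenticate method needs a running server and is not exercised in the main method.

```java
import java.util.Hashtable;
import javax.naming.AuthenticationException;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

class AuthSketch {
    // Connection settings for a simple (name and password) bind as the given user.
    static Hashtable<String, String> bindEnvironment(String url, String userDn, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, userDn);
        env.put(Context.SECURITY_CREDENTIALS, password);
        return env;
    }

    // Authentication: the user is accepted if the server accepts the bind.
    static boolean authenticate(String url, String userDn, String password) throws NamingException {
        // An empty password would be treated as an anonymous bind, so reject it up front.
        if (password == null || password.isEmpty()) {
            return false;
        }
        try {
            DirContext dc = new InitialDirContext(bindEnvironment(url, userDn, password));
            dc.close();
            return true;
        } catch (AuthenticationException e) {
            return false; // wrong name or password
        }
    }

    // Authorization: decide from the user's attributes, here the department.
    static boolean isAuthorized(Attributes user, String requiredDepartment) {
        Attribute dept = user.get("departmentNumber");
        return dept != null && dept.contains(requiredDepartment);
    }

    // In-memory stand-in for attributes read from the directory after the bind.
    static Attributes sampleUser() {
        Attributes user = new BasicAttributes(true);
        user.put("cn", "Bob Smith");
        user.put("departmentNumber", "Finance");
        return user;
    }

    public static void main(String[] args) {
        System.out.println("Finance report allowed: " + isAuthorized(sampleUser(), "Finance"));
        System.out.println("Sales report allowed:   " + isAuthorized(sampleUser(), "Sales"));
    }
}
```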

We will cover the use of LDAP as an authentication and authorization resource in chapter 13. This discussion will include more sophisticated authentication mechanisms, single sign-on issues, and many other related security concerns.

1.3.3 Personalization

Once a person has been identified through authentication, it is useful to personalize the user's experience based on their identity and preferences. In some cases, personalization may simply mean placing the current user's name at the top of a web page. A more sophisticated use might be to pull the customer's location information from the directory to prepopulate an order form.

In a complex web environment with a variety of features, LDAP-enabled directories are a useful place to store information about users' preferences. For example, you might allow users to choose a particular product line as their primary interest when a site covers a large number of products.

Capturing this information and enabling access to it via LDAP allows a variety of applications to customize users' experiences based on their interests. Doing so offers an important benefit: personalized content can be consistent between multiple applications.

LDAP has been gaining wide acceptance as a place to store and retrieve personalization information in enterprise applications. For example, most enterprise portals support LDAP as a means of obtaining the information needed for personalization.

1.3.4 Roaming profiles

Closely related in many respects to personalization, but focused more on operational preferences than content preferences, is the concept of roaming profiles. Roaming profiles allow users to authenticate to an application on any machine and get an identical environment. This is done by storing a considerable number of individual configuration options in a directory.

In addition to enabling roaming, directory-based security also offers the potential to lock down certain configuration items or create organizational or group defaults. In environments with less-sophisticated users, doing so makes it possible to update user configurations without a system administrator needing to make a trip to each cubicle or spend time on the phone walking a user through complicated steps within an application.

Few stand-alone applications provide roaming profiles. Part of the reason is that most applications vary widely in their configuration. Thus each application may require additional information in the directory server to enable storage of that application's configuration values.

This requirement showcases a common conflict between application developers, who often want to change schema to meet their applications' needs, and system administrators, who realize that changes in schema require a great deal of administrative effort. The challenge is deciding where to draw the line between generally useful information that belongs in a directory and application-specific information that belongs elsewhere. We will discuss this conflict further in chapter 2.

1.3.5 Public Key Infrastructure

Traditional authentication and encryption systems use secret keys. Generally speaking, a secret key system requires both ends of a communication to know a secret password that will be used to hide the communication. The right secret password produces a legible message, which both protects the message in transit and proves that the message must have been written by the other party, because they were the only other ones with knowledge of the secret. This approach works well as long as the secret isn't compromised and you communicate with few enough people that you can remember a shared secret with each one.

Public key technology changes all this and makes the process more scalable. In this system, two keys are produced. One key, called the private key, is still secret. However, unlike the secret key in a shared-secret system, the private key is never shared with anyone. Instead, a second key called the public key is distributed. A public key can be placed in a digitally signed container called a digital certificate. Such certificates are commonly used to distribute public keys.
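The division of labor between the two keys can be tried out with the JDK's standard security classes. In this sketch the private key signs a message and the public key, the part that would be published in a certificate, checks the signature. The choice of RSA with a 2048-bit key and the SHA256withRSA signature algorithm is an assumption made for the example, not something prescribed by this chapter.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

class PublicKeySketch {
    // Generates a matching private/public key pair.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only the holder of the private key can produce this signature.
    static byte[] sign(PrivateKey key, String message) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(message.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Anyone holding the public key can check the signature.
    static boolean verify(PublicKey key, String message, byte[] signature) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(message.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs one text and checks the signature against another.
    static boolean roundTrip(String signedText, String checkedText) {
        KeyPair pair = newKeyPair();
        byte[] signature = sign(pair.getPrivate(), signedText);
        return verify(pair.getPublic(), checkedText, signature);
    }

    public static void main(String[] args) {
        System.out.println("Same message verifies:    " + roundTrip("order 42", "order 42"));
        System.out.println("Altered message verifies: " + roundTrip("order 42", "order 43"));
    }
}
```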

A successful deployment of public key infrastructure is highly dependent on a well-designed directory services infrastructure. An LDAP-enabled directory answers the question of where to store and locate digital certificates. Centrally storing digital certificates in a directory allows people and applications to find certificates on demand for business partners and peers with whom they need to communicate securely.

In addition to helping you locate certificates for encryption, directories let you find a list of certificates that have been revoked prior to their expiration time. These certificate revocation lists (CRLs) are commonly stored in LDAP-enabled directories.

This book is not specifically about Public Key Infrastructure (PKI), but PKI is one common application that uses directories. We discuss the use of directories with PKI in much more detail in chapter 13.

1.3.6 Message delivery

On the Internet, messages are routed based on the fully qualified host name to the right of the at sign (@). Such routing is typically done by using the DNS to identify the IP address associated with the human-readable fully qualified host name.

Once a message has been routed to the correct machine, it is delivered on that machine based on the username to the left of the @. Many mail systems now support the use of LDAP to determine how to deliver a message.

The delivery process can include advanced operations, such as locating the exact mail drop for the user in a cluster of mail servers. However, the most common usage is for allowing full-name email aliases and implementing email lists.
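A mail system resolving a full-name alias would typically turn the recipient address into an LDAP search filter. The sketch below builds such a filter and escapes the characters that have special meaning in filters. The attribute names mail and mailAlternateAddress are assumptions here; the attributes actually consulted depend on the mail server and its schema.

```java
class MailAliasFilterSketch {
    // Escapes characters that have special meaning inside an LDAP search filter value.
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\5c"); break;
                case '*':  out.append("\\2a"); break;
                case '(':  out.append("\\28"); break;
                case ')':  out.append("\\29"); break;
                case '\0': out.append("\\00"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Matches an entry whose primary address or alias equals the recipient address.
    static String aliasFilter(String recipient) {
        String safe = escape(recipient);
        return "(|(mail=" + safe + ")(mailAlternateAddress=" + safe + "))";
    }

    public static void main(String[] args) {
        System.out.println(aliasFilter("bob.smith@example.com"));
    }
}
```

Escaping matters because the recipient address comes from outside the organization; without it, an address containing an asterisk or parenthesis could change the meaning of the search.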

As mentioned in section 1.3.3, directories can help you target mailings based on information associated with identities. In an LDAP directory, users are often placed together in groups, either as a list of users or as a dynamic specification (such as all users in department A). These groups can be used for authorization, personalization, and even mailing lists.
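The difference between the two kinds of groups can be sketched in a few lines. A static group is checked by looking for the user in a stored list, while a dynamic group is checked by testing the user's own attributes against a rule. The member and departmentNumber attribute names and the sample names are assumptions for this example, and the comparison of names is deliberately simplified to exact string matching.

```java
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;

class GroupSketch {
    // Static group: an explicit list of member names stored on the group entry.
    static Attribute sampleMemberList() {
        Attribute member = new BasicAttribute("member");
        member.add("uid=bsmith,ou=People,dc=example,dc=com");
        member.add("uid=jdoe,ou=People,dc=example,dc=com");
        return member;
    }

    // In-memory stand-in for a user entry with a department.
    static Attributes sampleUser(String department) {
        Attributes user = new BasicAttributes(true);
        user.put("departmentNumber", department);
        return user;
    }

    // Membership in a static group is a lookup in the stored list.
    // Simplification: names are compared as exact strings.
    static boolean inStaticGroup(Attribute memberList, String userDn) {
        return memberList.contains(userDn);
    }

    // Membership in a dynamic group is decided by a rule over the user's attributes,
    // here "all users in the given department".
    static boolean inDynamicGroup(Attributes user, String department) {
        Attribute dept = user.get("departmentNumber");
        return dept != null && dept.contains(department);
    }

    public static void main(String[] args) {
        System.out.println(inStaticGroup(sampleMemberList(), "uid=bsmith,ou=People,dc=example,dc=com"));
        System.out.println(inDynamicGroup(sampleUser("A"), "A"));
    }
}
```

A static list must be edited whenever someone joins or leaves, whereas a dynamic rule follows changes to the users' own entries automatically.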

We discuss group schemas in chapter 2. Examples of managing groups appear in chapter 7.

1.4 BRIEF HISTORY

The previous section makes it obvious that there are a wide variety of uses for LDAP-enabled directory services. Many of these uses first came about with earlier standards—particularly X.500, which we mentioned briefly earlier in this chapter. In this section we will take a quick look at how LDAP came to its latest incarnation.

1.4.1 X.500 and DAP

LDAP is a TCP/IP-based client/server directory access protocol originally based on a subset of the X.500 Directory Access Protocol (DAP). X.500 is a comprehensive set of standards from the ITU Telecommunication Standardization Sector (ITU-T) that describes all aspects of a global directory service. X.500, like many standards, has gone through many revisions; work is still in progress to update it further. As shown in figure 1.7, a client originally talked to an X.500 server using the DAP protocol.

Figure 1.7
The X.500 client uses DAP to communicate with the X.500 Directory System Agent (DSA).

Designed to be the standard directory service for the Open Systems Interconnection (OSI) world, X.500's fortune has risen and fallen over the years, but it still has a substantial following. Early on, X.500 was accepted by many large information technology (IT) organizations as the direction for global directory services. Although early products had their problems, they also showed a great deal of promise. Many large companies and universities implemented pilot projects, usually involving the hosting of white pages.

One big issue arose very quickly with X.500: the fact that its access protocol required an OSI protocol stack and complex binary encoding of structures represented in a language called Abstract Syntax Notation One (ASN.1). Most desktop computers at the time were ill equipped to deal with DAP.

As Internet Protocol (IP) became the dominant networking standard, DAP's OSI origins made it less attractive. Many of the organizations piloting X.500 directories had already adopted IP and were looking for a protocol with less baggage for client access. Even worse, X.500's complexity and the lack of freely available standards documents or easy-to-use APIs made it difficult to develop clients without paying fees to the ITU-T.

As we've stated since the beginning of this chapter, even the best directory is useless when applications are not available to take advantage of it. Several white pages applications were available, but an electronic phone book is often not enough to justify the expense of collecting and cleansing all the information necessary to make a directory truly useful.

1.4.2 A new standard is born

In 1991, after a few false starts with other potential standards, LDAP was brought forth as a lightweight means for accessing the DAP-enabled directories of the X.500 world. The first set of standards, LDAPv2, was eventually defined and accepted by the Internet Engineering Task Force (IETF), the standards body responsible for many important Internet standards, as RFCs 1777 and 1778.

These standards provided basic authentication, search, and compare operations, as well as additional operations for changing the directory. From the start, LDAP made X.500 more accessible, as intended. Figure 1.8 shows an X.500 server being accessed by an LDAP gateway service that is forwarding requests from an LDAP client.

Figure 1.8
The X.500 client goes away, replaced by an LDAP client talking to an LDAP server. Here, the LDAP server acts as a gateway between LDAP-aware clients and DAP-aware X.500 DSAs.

Almost as important as the protocol itself was the release of a standard API and the production of a client development kit. For the first time, it was possible to access these servers programmatically without wandering knee-deep into an arcane protocol.

1.4.3 LDAP goes solo

As time went by, some people began to wonder what made X.500 so special in the first place. The University of Michigan, which had developed the reference implementation of LDAP, released a stand-alone server called Slapd that would allow the LDAP server to present data from a local data store rather than simply act as a gateway to an X.500 server.

Slapd was followed by a second service called Slurpd, which read the changes from one server and replicated those changes via the LDAP protocol to other Slapd servers. Figure 1.9 shows a typical stand-alone LDAP environment.




Figure 1.9
An LDAP client talks to a Slapd server. X.500 is no longer involved.

At this point, Netscape hired most of the original developers of the University of Michigan Slapd server to develop the Netscape Directory Server. Netscape, which was riding high with an incredible share of the Internet browser market, decided that networks would require directories and that LDAP, not X.500, should be the standard. Nearly 40 other companies announced support at that time, bringing LDAP the focus and support it needed to become the de facto standard for directory services.

1.4.4 LDAPv3

LDAP may have gained acceptance as a stand-alone service, but it was far from complete. Due primarily to its reliance on X.500 servers to provide the server-to-server communications, access control, and other functionality, LDAP was still only a skeleton of a full directory service by the mid-1990s.

Many interested parties pushed forward with the development of the next generation of the LDAP standards. In December 1997, the new version was published as RFCs 2251 to 2256. These new specifications covered items including the protocol itself, mandatory and optional schema, and LDAP URLs. A set of standard authentication mechanisms and a standard for session encryption were added to the list of core specifications in 2000. Figure 1.10 shows the core specifications that make up the LDAP standard.

Figure 1.10
The IETF has been the primary standards body for most of the existing LDAPv3 specifications. This figure shows a list of published RFCs that are considered the core LDAP standards.
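As one concrete example of what these specifications define, an LDAP URL packs a server, a base DN, the attributes to return, a search scope, and a filter into a single string. The sketch below splits such a URL into its parts. The function name and the example hosts are our own, the defaults (port 389, base scope, a match-everything filter) follow the LDAP URL format as we understand it, and extensions and bracketed IPv6 hosts are deliberately ignored.

```python
from urllib.parse import unquote


def parse_ldap_url(url: str) -> dict:
    """Split an ldap:// URL into host, port, base DN, attributes, scope, filter."""
    scheme, sep, rest = url.partition("://")
    if not sep or scheme.lower() != "ldap":
        raise ValueError("not an ldap:// URL")
    hostport, _, tail = rest.partition("/")
    host, _, port = hostport.partition(":")
    # After the slash comes DN?attributes?scope?filter, each part optional.
    fields = tail.split("?") + [""] * 4
    dn, attrs, scope, flt = (unquote(f) for f in fields[:4])
    return {
        "host": host,
        "port": int(port) if port else 389,
        "dn": dn,
        "attributes": attrs.split(",") if attrs else [],
        "scope": scope or "base",
        "filter": flt or "(objectClass=*)",
    }
```

For instance, `parse_ldap_url("ldap://ldap.example.com/dc=example,dc=com?cn,mail?sub?(uid=jdoe)")` yields a subtree search under `dc=example,dc=com` for the `cn` and `mail` attributes of entries matching `(uid=jdoe)`.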

1.5 LDAP REVISIONS AND OTHER STANDARDS

LDAPv3 was considered a great leap forward in several key areas, but it takes more than a protocol to make a directory service successful. It is now up to several standards bodies and industry consortia to enhance the LDAP core specifications and build a framework that allows directories from different vendors to interoperate, or at least share some of the most crucial information in a standard way, and play a more pivotal role in e-business. Figure 1.11 shows some of the many standards bodies and industry consortia that shape directory standards and define best practices in deployment and management.

Figure 1.11
Many industry consortia and standards bodies are involved with LDAP and related standards, but most have a narrow focus.

1.5.1 Replication and access control

Version 3 of the LDAP protocol was greatly improved from version 2, but lacked two important items: replication and access control. The IETF has created workgroups to deliver these missing pieces and others, as shown in figure 1.12.

Figure 1.12
IETF workgroups are trying to fill in the gaps left after the initial publication of LDAPv3.

Lack of a standard replication process has since become an interoperability nightmare as each LDAP server vendor implemented its own proprietary solution. Many products use simple LDAP protocol operations to distribute data as shown in figure 1.13. However, even those solutions using the LDAP protocol sometimes require proprietary controls or attributes.




Figure 1.13
Supplier-to-consumer replication exists in some products using the LDAP protocol. Unfortunately, most need to use proprietary attributes or controls to get around current limitations in the specifications.
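The supplier-to-consumer idea can be sketched with nothing more than a numbered change log that the consumer replays. This is a simplified illustration using in-memory dictionaries in place of real servers; the tuple layout and the function name are assumptions for the example, not any vendor's format.

```python
def replay_changes(changelog, consumer, last_applied=0):
    """Apply supplier changes numbered above last_applied to a consumer copy.

    changelog: list of (change_number, operation, dn, attributes) tuples.
    consumer:  dict mapping DN -> dict of attributes (the replica).
    Returns the highest change number applied.
    """
    for number, op, dn, attrs in sorted(changelog, key=lambda c: c[0]):
        if number <= last_applied:
            continue  # already replicated in an earlier pass
        if op == "add":
            consumer[dn] = dict(attrs)
        elif op == "modify":
            consumer.setdefault(dn, {}).update(attrs)
        elif op == "delete":
            consumer.pop(dn, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        last_applied = number
    return last_applied
```

The consumer remembers the returned change number and passes it back on the next run, so each change is applied once. Anything beyond this, such as schema or access control data, is where the proprietary attributes and controls tend to appear.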

Many parties recognized that replication was critical to obtaining scalability, redundancy, and other important benefits. To resolve this issue, the Lightweight Directory Update Protocol (LDUP) working group was created within the IETF. At the time of this writing, the group has completed draft documents detailing requirements, a model for meeting those requirements, conflict resolution processes, and a protocol specification. The use of replication is discussed further in chapter 6.

In addition to the supplier-consumer model of replication available in most existing directory servers, LDUP was chartered with allowing for multiple directory masters for the same information, which is shown in figure 1.14. It also documents a process for resolving conflicts that may arise when different and potentially conflicting changes are made independently to the same entry on each master. In addition, LDUP defines a protocol that can be used for both supplier-initiated and consumer-initiated replication.




Figure 1.14
Multimaster replication will allow changes to the same directory tree in multiple directories.
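To see why conflict resolution needs rules, consider two masters that each changed the same entry. The sketch below uses a simple last-writer-wins policy per attribute. That policy, the tuple layout, and the function name are our own simplifying assumptions; the LDUP drafts define their own procedures, which this does not reproduce.

```python
def merge_entry(local, remote):
    """Merge two versions of one entry, attribute by attribute.

    Each version maps attribute name -> (value, timestamp, server_id).
    The later timestamp wins; server_id breaks ties so that every
    master reaches the same result regardless of merge order.
    """
    merged = dict(local)
    for name, candidate in remote.items():
        current = merged.get(name)
        # Compare (timestamp, server_id); the value itself is not compared.
        if current is None or candidate[1:] > current[1:]:
            merged[name] = candidate
    return merged
```

The tie-breaker matters: without it, two masters that exchange changes made at the same instant could each keep their own value and never converge.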

Security was further along in some respects. The Simple Authentication and Security Layer (SASL), originally developed for the Internet Message Access Protocol (IMAP), was added as a core LDAP standard early on as a way to negotiate an appropriate type of client and/or server authentication and even session encryption.

Developing a standard for access control has proven to be much more time consuming and has produced fewer results. As shown in figure 1.15, such a standard will allow a server to determine if an authenticated entity should be able to read or update a particular entry or an entire portion of the directory.




Figure 1.15
LDAP access control standards will include a mechanism for determining in advance whether an operation will be permitted.
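A minimal sketch of such a decision follows: given a list of rules, determine whether an authenticated entity may perform an operation on an entry or on anything beneath a portion of the tree. The rule format and function names are hypothetical, since no standard existed to copy, and the DN handling is naive (it ignores escaped commas, for example).

```python
def normalize_dn(dn):
    """Lowercase a DN and trim spaces around its components (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def is_under(dn, suffix):
    """True if dn equals suffix or sits anywhere beneath it in the tree."""
    dn, suffix = normalize_dn(dn), normalize_dn(suffix)
    return dn == suffix or dn.endswith("," + suffix)


def is_permitted(rules, subject, target, right):
    """Return True if any rule grants `right` on `target` to `subject`.

    rules: list of (subject_dn, target_suffix, rights) where rights is a
    set such as {"read", "write"}. Anything not granted is denied.
    """
    return any(
        normalize_dn(subject) == normalize_dn(rule_subject)
        and is_under(target, suffix)
        and right in rights
        for rule_subject, suffix, rights in rules
    )
```

Because the check takes only the subject, the target, and the requested right, a client could ask the question before attempting the operation, which is the "in advance" capability the figure refers to.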

The task of creating such a standard fell into the hands of the LDAP extensions (LDAPEXT) workgroup within the IETF. This workgroup was formed to handle any extensions needed to the LDAPv3 standards outside of replication. As this book is being written, most activities of the LDAPEXT workgroup have been moved to individual submissions and will likely become an informational RFC rather than a full standard. Some aspects of access control may be pursued as part of the interoperability requirements for replication.

To understand why access control might be bundled with the replication workgroup, think about the fact that any replication of information outside a vendor's products will render that data insecure—other vendors will not know the access control rules of the source data. Any practical solution for replication is dependent on a standard for access control. We will look at access control further in chapter 13 when we discuss directory security in more detail.

1.5.2 Directory Enabled Networking

As computer networks evolve to support more variety and depth of services, the complexity of network management increases accordingly. Most network devices, including routers and switches, have traditionally been configured using command-line shells. Although this configuration enables relatively consistent management of a single device, it does nothing to simplify the coordination of configurations across large numbers of devices. Such coordination is critical when you're enabling guaranteed quality of service and other offerings that span multiple devices.

Directory Enabled Networking (DEN) provides a way for devices to configure themselves based on information in a directory. Originally an initiative from Microsoft and Cisco, DEN is now part of the CIM defined by the DMTF.

CIM is a set of object-oriented, implementation-neutral schemas that represents logical and physical attributes of hardware and software. The DMTF, rather than being protocol architects like the IETF, focused primarily on creating common object definitions that allow two CIM-aware applications to store and use information consistently.

Contrary to popular belief, CIM and DEN are not LDAP-specific information models, but are instead "meta" models that can be specialized for a number of environments, of which LDAP is one. XML is an example of another way that CIM objects can be represented.

Momentum behind DEN as the killer application that would drive directories has died down to an extent over the last few years, and most of the work around directories has moved to identity management solutions. In this book, we will not focus on DEN as a specific application due to the current lack of software and hardware that can truly exploit this technology.

1.5.3 XML and directories

The eXtensible Markup Language (XML) is an industry standard language used to define structured documents. It offers a set of common tags for defining data about data, or metadata. This metadata can be used to describe particular document types. Instances of documents implementing these types can then be shared and used by XML-aware applications.

The Directory Services Markup Language (DSML) is an XML document type that can be used to create structured documents representing information in a directory service. The information represented in DSML can include both directory entries and schema information. DSMLv2 extends the specification to cover the representation of directory operations. Documents conforming to these standards can be exchanged using non-directory protocols like HTTP, as shown in figure 1.16. Many new services that support DSML are becoming available from both large vendors (Sun and Microsoft) and startups.




Figure 1.16
Here a DSML-enabled application talks to a DSML service that acts as an intermediary between an LDAP server and the DSML-enabled application.
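To make the idea concrete, the sketch below renders one directory entry as a DSML document using only the Python standard library. The element names and namespace reflect our recollection of the first version of DSML and should be checked against the specification before use; the function name is our own.

```python
import xml.etree.ElementTree as ET

DSML_NS = "http://www.dsml.org/DSML"  # assumed DSMLv1 namespace; verify against the spec
ET.register_namespace("dsml", DSML_NS)


def entry_to_dsml(dn, object_classes, attributes):
    """Build a DSMLv1-style document holding a single directory entry."""
    def tag(name):
        return f"{{{DSML_NS}}}{name}"

    root = ET.Element(tag("dsml"))
    entries = ET.SubElement(root, tag("directory-entries"))
    entry = ET.SubElement(entries, tag("entry"), {"dn": dn})
    # Object classes get their own element, separate from other attributes.
    oc = ET.SubElement(entry, tag("objectclass"))
    for value in object_classes:
        ET.SubElement(oc, tag("oc-value")).text = value
    for name, values in attributes.items():
        attr = ET.SubElement(entry, tag("attr"), {"name": name})
        for value in values:
            ET.SubElement(attr, tag("value")).text = value
    return ET.tostring(root, encoding="unicode")
```

The resulting string is ordinary XML, so any XML-aware application can consume it without knowing anything about the LDAP protocol.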

DSML is most useful in applications that are already XML enabled. These include most modern application servers. DSML is especially useful in cases where direct access to the directory would normally not be permitted. For example, consider a situation in which a firewall is blocking all traffic except HTTP. To get around this limitation, a DSML encoding of a directory entry can be transmitted over the HTTP protocol for interpretation and presentation. Such a situation is shown in figure 1.17. Emerging standards like Simple Object Access Protocol (SOAP) make it clear that LDAP will not be the only standard for sharing directory information in the future.




Figure 1.17
DSML is useful for sharing directory information across firewalls that might limit direct access to directories.

1.6 DIRECTORY MANAGEMENT

Despite the importance of well-defined standards, a lack of them is rarely the reason a directory services-related project fails. Rather, the biggest headache with most new directory deployments is proper management of information in the directory. In the days when enterprise directories were used primarily for storing white pages information, it was often adequate to simply import information into the directory periodically from other, more authoritative data sources. Due to the lack of sophisticated management tools, there wasn't much choice.

Today, directory management tools for users and groups are much more sophisticated. In addition to giving a central administrator the ability to change information about objects in a directory, these tools typically allow for delegation of administrative duties and even user self-management, where appropriate.

This ability to distribute administration works well in intranet and Internet environments, but it is especially critical in extranet environments where multiple organizations are working together, potentially using the same applications and data. In such environments, the segmentation of administration and access is very important (see figure 1.18).




Figure 1.18
Directories can be segmented such that administration can be delegated to business partners. Such separation may be logical rather than physical.

For example, a car manufacturer with just-in-time manufacturing facilities needs to give its business partners access to certain systems in its extranet. Access to applications on the extranet is controlled based on the identities of users at each of its distributors and component suppliers. Tracking by identity offers audit trails, which will deter a random individual from anonymously ordering unauthorized parts.

The problem is that, in addition to the company's own employees, an extranet environment that includes suppliers and distributors may have hundreds of thousands, if not millions, of users. Trying to manage all these users centrally would be an incredible effort.

By segmenting users by company and other means, you can push administration of identities to primary contacts within each of the business partners, thereby reducing administrative overhead. Aside from reducing administration costs, this approach also ensures better accuracy by pushing identity management closer to the identities being managed.

Information that is not related to identities and groups can still be difficult to manage with off-the-shelf products. This is the case primarily because little attention has been paid to other advanced uses of directories, such as DEN, which require management of more exotic information.

In chapter 7, we will look at managing all types of directory entries, complete with example applications to reduce manual data entry and allow some degree of user self-management.

1.7 DIRECTORY INTEGRATION

Many organizations spend months designing the schema, entry naming, and other related aspects of an enterprise directory service without considering the need for integration with existing information repositories. What usually results is a well-designed, standards-based directory service that contains stale information and is nearly useless.

Meanwhile, legacy data stores that contain mission-critical information continue to thrive because they contain fresh information, although in a way that is often inconvenient to access from new applications and nearly impossible to access from off-the-shelf applications without substantial custom development. Figure 1.19 shows how this typical scenario plays out.




Figure 1.19
Data in legacy systems is nearly always more useful than data in poorly integrated new systems.

By designing and implementing an appropriate level of directory integration between legacy data stores and the new directory service, you can dramatically increase the value of the new directory (see figure 1.20).




Figure 1.20
Some level of directory integration is important in increasing the value of applications using new directory services.

Directory integration is far more complicated than simply synchronizing everything from a legacy data store into a newly created directory. It demands that you evaluate the needs of applications that depend on both new and legacy data stores. In many cases, there are both new and legacy applications that use the respective data stores, and very often these applications need access to some of the same information.

Without any directory integration, it is often difficult to get more than a small group of pioneers to quickly adopt the new applications. A new application may have substantially better functionality, but without the proper data it will be difficult to move the masses that use the legacy applications to the new environment. This issue is demonstrated in figure 1.21.




Figure 1.21
It is difficult to move the masses to new applications based around a standards-based directory when important information still resides only in a legacy directory.

By using integration techniques, such as synchronization, you can create a high degree of interoperability between the two environments. This approach, shown in figure 1.22, provides the necessary data flow between the two directories, offering a relatively easy migration path to the new environment. It also ensures that the information in both environments is consistent.




Figure 1.22
Synchronization is often necessary to offer a migration path from legacy to new applications or interoperability where legacy applications will not be migrated.

Consolidating these two environments can vastly simplify management. For example, you may find a way for a Unix-based system to use the same directory as your white pages application to store password information.

However, not every connected data store is a candidate for consolidation. Take, for example, a human resources application that relies on a set of database tables to store information. It may not make sense from an application functionality perspective for that particular application's data store to be consolidated into an enterprise directory. Some of the information may fit better in relational databases for the reasons we stated in section 1.2.1, whereas other information may not be a good candidate for synchronization because of privacy concerns. So, instead of attempting to directly replicate everything from human resources into the directory, you need a form of intelligent synchronization.

In the area of identity management, directory integration almost always seems like a great idea in theory. For example, the management of users' computer accounts in a particular organization from hire to fire demonstrates the value of synchronization and other advanced integration technology.

Today, it is often necessary to touch multiple data repositories to commit a single change uniformly to all the places that store information about a person. These changes are usually performed by different application and system administrators. In more mature environments, changes may be synchronized with scripts to facilitate this process. When administrators do not coordinate their changes, or if an automated synchronization script fails, the data repositories are no longer synchronized, and at least one of the repositories will contain stale data.

If this stale data is simply a telephone number, the impact is probably minimal. However, if an account must be deleted or suspended due to an employee's termination, the data repository with stale data is at risk from the terminated employee. If the stale data resides in an enterprise directory that is used for authenticating and authorizing users to all non-legacy systems and applications, this one failed change can potentially put the organization's entire intranet at risk. Proper directory integration is key to reducing these types of risks. For this reason, it is important to spend an adequate amount of time planning for integration.
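The termination scenario is easy to check for mechanically. The sketch below assumes, purely for illustration, that the human resources system is the authority on employment status and that each repository can list the user identifiers with live accounts. The data shapes and function name are our own.

```python
def find_stale_accounts(hr_records, repositories):
    """List (repository, user_id) pairs still active after HR termination.

    hr_records:   dict user_id -> status, "active" or "terminated".
    repositories: dict repository name -> set of user_ids with live accounts.
    """
    terminated = {uid for uid, status in hr_records.items() if status == "terminated"}
    return sorted(
        (repo, uid)
        for repo, accounts in repositories.items()
        for uid in accounts & terminated
    )
```

A report like this only detects the problem after the fact. Proper integration aims to make the list empty by propagating the termination to every repository as part of the same change.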

A general integration planning process entails identifying which data elements exist in each existing data source, selecting those that should be shared, and mapping between the source and destination schema (see figure 1.23).




Figure 1.23
Multiple data repositories typically store information about a person. Deciding which attributes come from where and mapping them to a normalized schema is an important part of any directory integration process. Note that the word normalized here should not be confused with database normalization rules.
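The outcome of that planning can be captured as a simple table that serves as both the selection and the mapping. In the sketch below, the source names and source attribute names are hypothetical, and the destination names are common directory attribute names chosen for illustration. Any source attribute absent from the table is treated as not shared.

```python
ATTRIBUTE_MAP = {
    # source name -> {source attribute: directory attribute}  (all hypothetical)
    "hr": {"EMPLID": "employeeNumber", "LAST_NAME": "sn", "FIRST_NAME": "givenName"},
    "phone_db": {"extension": "telephoneNumber"},
}


def map_record(source, record):
    """Keep only the attributes selected for sharing and rename them."""
    mapping = ATTRIBUTE_MAP[source]
    return {mapping[name]: value for name, value in record.items() if name in mapping}
```

For example, an HR record containing a salary field passes through `map_record` with the salary silently dropped, which is one way to keep private data out of the directory.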

This process and ways of implementing it are described in detail in chapter 7.

1.7.1 Integration via metadirectories

We cannot emphasize enough that the consolidation of all data repositories into a single enterprise directory within even the smallest of organizations is not likely to happen in our lifetimes. Even if it were possible to rewrite every legacy application to use a single standard, different directory and database software is better for different tasks. As shown in figure 1.24, this leads to many different environments within an organization that have different variations of the same user.




Figure 1.24
Different applications have different data repository requirements. It is not likely that a single data store could accommodate all of them.

In the past few years, a new breed of applications called metadirectories has come to market to remove some of the burden associated with directory integration. Although it may sound like yet another directory, a metadirectory is really a sophisticated directory integration toolkit.

You can use metadirectories to connect and join information between data sources, including directories, databases, and files. The connection process usually involves identifying changes in each data source. Such a connection may be real-time monitoring of changes using a direct access method into the connected data store, an occasional scan of a file-based list of changes, or a review of a full export from the connected data store.
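The last of these approaches, reviewing a full export, amounts to comparing the current export with the previous one. The following is a minimal sketch under the assumption that each export has been loaded into a dictionary keyed by a stable record identifier; the function name is our own.

```python
def diff_exports(previous, current):
    """Compare two full exports keyed by record identifier.

    Returns (added, removed, changed): records only in `current`, identifiers
    only in `previous`, and records whose attributes differ.
    """
    added = {key: current[key] for key in current.keys() - previous.keys()}
    removed = sorted(previous.keys() - current.keys())
    changed = {
        key: current[key]
        for key in current.keys() & previous.keys()
        if current[key] != previous[key]
    }
    return added, removed, changed
```

This is the least efficient of the three connection styles, because the whole data set is read each time, but it works with any data store that can produce an export.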

The join process is much more complicated and usually involves several steps. Its most important job is determining that an object in one data source is the same as an object in a second data source. This aggregation of information from multiple data sources is one of the most important features of a metadirectory and the heart of the join process. Other tasks performed by a metadirectory may include unification or mapping of schema and object names, filtering unwanted information, and custom processing and transformation of data. Figure 1.25 shows a relatively logical view of how a metadirectory might work to provide a linkage between key enterprise information repositories.




Figure 1.25
Metadirectories provide advanced integration capabilities between different types of data stores.
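The core of the join can be sketched in a few lines if we assume that every data source carries one attribute that identifies the same object everywhere. That assumption rarely holds so neatly in practice, and real products offer far richer matching rules; the function name and the precedence policy here are our own.

```python
def join_sources(sources, join_key):
    """Aggregate records that describe the same object across data sources.

    sources:  dict source name -> list of record dicts.
    join_key: attribute assumed to identify the same object everywhere.
    Earlier sources win when two sources supply the same attribute.
    """
    joined = {}
    for name, records in sources.items():
        for record in records:
            key = record.get(join_key)
            if key is None:
                continue  # cannot be matched automatically; needs manual review
            entry = joined.setdefault(key, {})
            for attribute, value in record.items():
                # setdefault keeps the value from the earliest source.
                entry.setdefault(attribute, value)
    return joined
```

Listing the most authoritative source first gives it precedence, which is a crude stand-in for the per-attribute business rules a metadirectory would apply.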

With careful planning, you can create an environment in which users can be created at a single point. Then, the metadirectory service will instantiate a subset of the users' information in other connected data stores automatically, or with very little manual intervention. The actual point of instantiation may be managed by another type of software that handles the workflow needed by this process. Such software is called provisioning software.

For example, if PeopleSoft, white pages, and an Oracle database all use a telephone number, you would like that telephone number to be entered once and propagated to the other data stores. Metadirectories must also handle environments where both Oracle and PeopleSoft would be able to master new changes depending on business rules.

Metadirectories are also proving to be popular in extranet environments where two or more organizations have their own directories and want to share a portion of them with business partners or vendors. Figure 1.26 shows an extranet environment where the addition of Joe Distributor might be propagated to the manufacturer using metadirectory technology.

Figure 1.26
A user is entered into the distributor directory. The metadirectory detects a change and propagates it to an appropriate location within the manufacturer directory.

It is beyond the scope of this book to offer an in-depth look at metadirectory products. However, directory integration is critical, and some of the functionality provided by metadirectory products can be performed with a general scripting language. We discuss such techniques in detail in chapter 6.

1.8 INTEGRATION AND FEDERATION VIA VIRTUAL DIRECTORY TECHNOLOGY

Usually, metadirectories involve the creation of a new, physical directory, the contents of which are based on an aggregation of multiple information sources. One emerging alternative to metadirectory technology is virtual directory technology, sometimes called directory federation technology. This technology attempts to provide real-time directory access to other types of data stores, such as relational databases and memory-based application components. To visualize this process a bit more easily, think of the virtual directory as a kind of proxy server: the application speaks LDAP to the virtual directory software, and the virtual directory software grabs the data directly from the legacy data store by speaking its native tongue. Figure 1.27 shows a directory-enabled application accessing a virtual directory service that is providing data from existing directories, databases, and application components.




Figure 1.27
Virtual directories (sometimes called directory federators) accept directory requests and transform them into requests for potentially non-directory information.

Virtual directory technology is not as easy as it may sound. Each underlying data store has its own query language and information model. The virtual directory must find ways to optimize queries and map between LDAP and non-directory information models.
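A very small slice of that mapping problem is shown below: translating a single LDAP equality filter into a parameterized SQL query and shaping the rows like directory entries. The table, the column names, and the attribute mapping are hypothetical, and the sketch ignores everything that makes the real problem hard, such as compound filters, substring matches, and search scope.

```python
import re
import sqlite3

# Hypothetical mapping from LDAP attribute names to columns of a "people" table.
COLUMNS = {"uid": "login", "cn": "full_name", "mail": "email"}


def filter_to_sql(ldap_filter):
    """Translate a single equality filter such as (uid=jdoe) into SQL."""
    match = re.fullmatch(r"\((\w+)=([^()*]+)\)", ldap_filter)
    if not match or match.group(1) not in COLUMNS:
        raise ValueError(f"unsupported filter: {ldap_filter}")
    column = COLUMNS[match.group(1)]
    # The column name comes from our own table above; the value is bound
    # as a parameter so that user input never becomes part of the SQL text.
    sql = f"SELECT login, full_name, email FROM people WHERE {column} = ?"
    return sql, (match.group(2),)


def search(connection, ldap_filter):
    """Run the translated query and shape each row like a directory entry."""
    sql, params = filter_to_sql(ldap_filter)
    return [
        {"uid": login, "cn": name, "mail": email}
        for login, name, email in connection.execute(sql, params)
    ]
```

With an open `sqlite3` connection to a database containing the assumed `people` table, `search(connection, "(uid=jdoe)")` returns the matching rows as dictionaries keyed by LDAP attribute name.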

At this time, virtual directory technology is in its infancy, as metadirectories were a few years ago. However, it is emerging as another useful tool for providing a unified view of information to LDAP-enabled applications. It is the only way to view information in many kinds of existing repositories using directory protocols in real time.

1.9 WHY THIS BOOK?

People who have worked with directories know that installing and configuring most directory server software is generally the easiest part of a directory deployment. Writing simple applications to query the directory and use the results is also quite easy, once you understand the basics. Trouble begins to brew when it becomes necessary to keep the information in the directory up to date through both front-end data management and back-end integration with other data sources. This book focuses on making your directory deployments more successful through advanced application and interdirectory integration.

Consider that every element of data stored in a directory must be placed into the directory at some point. You can leverage the data that already exists in other repositories, someone can enter it into the directory through an administrative interface, or the data can be generated by an application. In many environments, all these tasks may need to happen to create a suitable directory service. Figure 1.28 shows some of these different techniques for moving information into the directory.

Figure 1.28
Data in directories is synchronized with existing data stores, managed through administration applications, and/or generated in some way.

For new and experienced directory service managers charged with deploying or managing a directory service, these management and integration issues are clearly the biggest challenge. Not having the right information, or having stale versions of data, dilutes the value of the directory to all applications that leverage it.

Directory management involves having the right tools and tying in the right information from other, often authoritative, sources of data. In this book, we'll focus on practical solutions to common directory management problems. We will look at Perl code for administration interfaces, directory synchronization, and directory migration. The entire second part of the book is devoted to this topic.

Directory-enabled applications let you use all the information you've been collecting in directories. After all, why collect data if nobody wants to use it? We'll look at ways to leverage LDAP in a variety of application environments with source code in Java. You'll find that such application integration is key to having a useful and important directory that people want to keep current.

With the information in this book, you'll have information flowing through your directories with much less perspiration. Servers that support the LDAP standard can provide a wide variety of functionality to a properly enabled application. This book aims to help you manage your LDAP directories and enable your applications, both new and existing, to support these directories.

1.10 SUMMARY

The LDAP standard for accessing directory services is important to software developers and system administrators. It can be used through LDAP-enabled applications and various APIs.

A number of different directory services have come into existence in the past few decades; LDAP was derived from another popular standard called X.500. These directory services provide everything from white pages to application security.

Management and application integration are the two biggest issues people tend to encounter when deploying directory services. You can address these issues many ways, as the second and third parts of this book explain.

The IETF has been the driving force behind the core LDAP specifications and many enhancements. Its most important current work is related to replication and access control. Other industry consortia and standards bodies are important in developing LDAP server and application interoperability guidelines, as well as standards that represent data from the LDAP information model in XML.

Metadirectories provide synchronized integration between multiple data repositories, and virtual directories provide real-time integration between applications and existing data via directory protocols. Provisioning tools allow for manual management of the information in directories. Each of these types of tools plays an important role in a well-rounded directory service.

In the remainder of part 1, we will focus on the LDAP standards in more detail, and discuss how to use LDAP tools to communicate with a directory server.

About the Author

Clayton Donley, the co-author of a number of open-source LDAP modules for Perl and Apache, is an independent consultant based in the Chicago area. His clients include Netscape, GTE, and ABN-AMRO. Prior to going independent, he spent seven years in various information technology roles working for Motorola in both the Chicago area and the Asia-Pacific region.

Source of this material

This is Chapter 1: Introduction to LDAP from the book LDAP Programming, Management and Integration (ISBN:1-93011-040-5) written by Clayton Donley, published by Manning Publications Co..
