Performance Improvement: Session State, Page 2
Solutions for Session State Data
If you've got to maintain data, it has to be stored somewhere. The question is where. Each location has its own advantages and disadvantages, and the following sections look at them for each of the storage options.
The easiest answer for storing session state in ASP.NET is in process. This is, in fact, the default configuration for ASP.NET. If you don't do anything else, you'll get a session object that lets you store any kind of session information you want. The benefits are that it's easy and it's fast: you can store practically any object you want in the session, and retrieval doesn't require anything special. The disadvantages are that the data isn't shared between servers and that it's vulnerable to application pool recycling, which throws away all session state data. Because it's in process, it also counts against the maximum address space that the process can have. If you've got a lot of session state data, this can be a problem.
Out of Process (Memory)
If you can't live with the memory constraints of being in the same process, you can use the out-of-process, in-memory state provider that ASP.NET ships with. It works like a persisted (i.e., database) provider except that the data isn't actually persisted; it's still held in memory, just in a separate process. This can be important if the worker process is running out of memory, or if you want your session data to survive application pool recycling. The benefit is that it's still pretty fast because the information never has to be written to disk. The disadvantages are that all of the objects placed into the session must be serializable and that, because the sessions aren't saved anywhere, the session data is lost if the session state service is stopped (by, say, a server crash).
The next step in the progression to protect the session state is to persist it to a database. This addresses the problem of a failure, but at a relatively high cost. All of the operations to get to the session are now database operations, so they're a few orders of magnitude slower than accessing an in-memory object. A persisted session state does protect against a server failure and reboot, but it does so at a very significant performance cost. As with the out-of-process session state, all of the objects must be serializable.
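The serialization requirement shared by the out-of-process and database options can be illustrated with a small, language-neutral sketch. It uses Python's pickle module as a stand-in for .NET serialization; the helper name can_leave_process and the sample values are hypothetical and are not part of ASP.NET. Plain data round-trips without trouble, while a value that wraps a live resource (a thread lock here) cannot be serialized and so could only ever live in an in-process session.

```python
import pickle
import threading


def can_leave_process(value):
    """Return True if the value survives a serialize/deserialize round trip."""
    try:
        pickle.loads(pickle.dumps(value))
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        # The value (or something it references) cannot be serialized.
        return False


# Plain data: fine for an out-of-process or database-backed store.
cart = {"sku": "A-100", "quantity": 2}

# Holds a live OS-level resource: only usable while it stays in process.
live = {"lock": threading.Lock()}

print(can_leave_process(cart))
print(can_leave_process(live))
```

A check like this is mostly useful during development, before moving an application from in-process storage to one of the other options.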
The final option for managing session state data is to transmit it to the client as described earlier in this document. This is really only suitable for a small amount of data, because cookies have size limits and are transmitted on each request whether or not they're needed. This can have a big impact on the amount of time it takes the site to respond to a user's click. The data is also exposed to the tampering and snooping we discussed earlier unless encryption (or signing) is used. While sending data to the client may be required for some types of information, like a user's identity or the session identifier, it's not a silver bullet for the problems that persisted session management creates. However, when used in moderation it can help limit the amount of data that must be managed in persisted storage.
Types of Session State Data
Session state is information about the user and their experience with the system. There are a handful of types of information that fall into session state. The first two are authentication information (who the user is) and session identification (a unique identifier). These form the basis for identifying the user both when they're logged in and when they're anonymous. With that foundation, you can maintain profile information about the user, which is typically persisted somewhere, and user-entered data, which may or may not be persisted.
The final kind of information that is typically kept in session state is user cache information. That is information about the user which is just cached and can be regenerated if necessary. It's only being held as an optimization to prevent load on the system. Let's look at each of these kinds of information in turn.
Depending upon whether you're using HTTP authentication or forms-based authentication, the authentication problem may be easy or difficult. When you use HTTP authentication for your site, every request to the server includes the information necessary to authenticate the user. That is to say, their username and password are transmitted with each request. Using Kerberos for HTTP authentication is slightly different, as we'll get to in a moment. Using HTTP authentication for a web site means that either the authentication information sent to the server must be inherently encrypted and unbreakable, or the transport should always be encrypted.
This is why basic authentication (which is unencrypted) should only be used over HTTPS. Similarly, the NTLM authentication that most systems use should only be used over HTTPS. Although NTLM authentication is encrypted, the encryption mechanisms were designed decades ago and aren't really strong enough to stand against the brute force attacks that can be leveled against them with today's computing power. Kerberos uses a slightly different approach.
Kerberos-based authentication doesn't actually contain the user's credentials. Instead, it's a cryptographic package that includes a server's certification that the user is who they say they are. In other words, the username appears, but instead of the user's password there's a note from another server stating that they are who they say they are and how long the web server servicing the request should trust that claim. This completely eliminates the need to retransmit the user's password over and over again, which limits the extent of the exposure should the package be intercepted.
This is essentially the technique that is used in forms authentication. When the user authenticates via a form, ASP.NET creates an encrypted token that, when read back in, confirms the user's identity. Rather than containing the user's password, the package is encrypted by the server, so if the server decrypts the package successfully it knows that it was the one that created the package and thus that it hasn't been tampered with.
The difference between Kerberos and forms-based authentication is that Kerberos is transmitted as a part of the authentication header, whereas the forms-based authentication token is transmitted down to the client as a cookie, which the browser returns to the server with each request. Kerberos also relies on a third-party server for the certification of who the user is, where forms-based authentication relies on the site itself having previously validated the user's credentials.
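The idea of a server-issued token that proves its own origin can be sketched in a few lines. This is an illustration under stated assumptions, not ASP.NET's implementation: it signs the payload with an HMAC rather than encrypting it (so the username stays readable to the client), and the names SERVER_KEY, issue_token, and read_token are hypothetical. What it shares with the approach described above is that the password never appears in the token, the token expires, and only the holder of the server key can produce a token that validates.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical secret; in practice it would be kept out of source code.
SERVER_KEY = b"hypothetical-secret-known-only-to-the-server"


def issue_token(username, lifetime_seconds=1200, now=None):
    """Create a token naming the user and an expiry time, signed by the server."""
    now = time.time() if now is None else now
    payload = json.dumps({"user": username, "expires": now + lifetime_seconds}).encode()
    signature = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode()
            + "."
            + base64.urlsafe_b64encode(signature).decode())


def read_token(token, now=None):
    """Return the username if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SERVER_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # tampered with, or not issued by this server
    claims = json.loads(payload)
    if claims["expires"] < now:
        return None  # trust window has passed
    return claims["user"]
```

A token from issue_token("alice") would be sent to the browser as a cookie value and passed to read_token on each later request; altering any part of the payload makes the signature check fail.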
In some cases, anonymous access is needed for the site and yet it's useful to keep track of some of the user's settings. In this case it's not authentication that's being transmitted but the tamper resistant session key. This key is designed to allow the system to determine which session is in use and is designed to be tamper resistant so a user couldn't just change the cookie and snoop in on someone else's session.
ASP.NET does an effective job of managing session IDs internally, so this isn't generally something that an ASP.NET developer needs to worry about.
User Profile Data
One could argue whether user profile data is a part of session data or not. Most systems that you'll integrate with will have a way of storing profile data for a user. There's no question that this kind of data needs to be persisted somehow, because user profile information is expected to remain around for a long time. Things like a name, an email address, and other information are just a part of knowing who the user is.
User Entered Data
User-entered data, like data on page one of a multi-page form, is data that's necessary for a small amount of time and then is no longer needed. The biggest issue here is remembering to clear the values when they are no longer needed, such as when the form is complete. Failure to clear these values out can mean that the amount of data being carried around in the user's session gets larger and larger until it becomes unwieldy.
Depending upon the sensitivity and volume of the information being entered, this kind of information is ideal to push to the client. If it's pushed via additional hidden fields on the form, the process is relatively straightforward when filling out a multi-page form because the transitions will be HTTP posts. However, this can get complex if the user is allowed to go to pages in different orders. Writing these values to cookies is an acceptable workaround, provided that the information isn't too sensitive to be stored locally on the user's computer. Sensitive data can still be pushed to the user's machine, but only if encryption is managed as well.
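One way to make the cleanup hard to forget is to keep all of a form's values under a common key prefix and remove them in the same step that hands them off. The sketch below assumes the session behaves like a dictionary; FORM_PREFIX, save_page, and complete_form are hypothetical names used only for illustration.

```python
FORM_PREFIX = "signup."  # hypothetical key prefix for one multi-page form


def save_page(session, page, fields):
    """Store one page's fields in the session under the form's prefix."""
    for name, value in fields.items():
        session[f"{FORM_PREFIX}{page}.{name}"] = value


def complete_form(session):
    """Collect every value belonging to the form and remove it from the session."""
    collected = {}
    # Copy the matching keys first so the session can be modified while looping.
    for key in [k for k in session if k.startswith(FORM_PREFIX)]:
        collected[key[len(FORM_PREFIX):]] = session.pop(key)
    return collected
```

After complete_form runs, unrelated session entries are untouched and nothing from the form is left behind to accumulate.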
The largest area of data that ends up in session state tends to be user cache information. For instance, you could cache user-specific promotions, pricing, or menu options. These items could be regenerated if necessary, but it's more efficient to hold on to the assembled form for use on the next request. Because it can simply be rebuilt after a server failure, this kind of data doesn't need to be protected from one at all.
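Treating this data as disposable usually comes down to a "use it if it's there, rebuild it if it isn't" pattern. The following is a minimal sketch of that pattern, again assuming a dictionary-like session; get_cached and build_menu_for are hypothetical names, and the menu contents are made up for illustration.

```python
def get_cached(session, key, regenerate):
    """Return the cached value, rebuilding and storing it only when it is missing."""
    if key not in session:
        session[key] = regenerate()
    return session[key]


def build_menu_for(user):
    """Stand-in for an expensive per-user lookup (hypothetical)."""
    return [f"{user}: Home", f"{user}: Orders"]


session = {}
menu = get_cached(session, "menu", lambda: build_menu_for("alice"))

# Simulate losing the session (for example, after a recycle): the next
# request pays the cost of rebuilding, but nothing is permanently lost.
session.clear()
menu_again = get_cached(session, "menu", lambda: build_menu_for("alice"))
```

Because every read goes through the same function, it doesn't matter whether the value is missing because it was never built or because the session store was lost.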