Enterprise Development Planning
The natural career progression for most developers runs from uncomplicated procedural programming and desktop applications that use only the resources of the local computer ("monolithic applications") to object-oriented programming and wide-ranging enterprise software systems spread across possibly thousands of computers in multiple physical locations. Accordingly, enterprise development introduces developers and architects to obstacles they won't find in desktop development. Much like a human being, software is said to have "matured" when it can do more and has become more reliable and more robust.
So, take a look at what separates the enterprise from the desktop. I recently spent several months prototyping, developing, testing, and deploying an enterprise software system aimed at Fortune 500 businesses; here are some concerns I had to address up front:
- Scalability Requirements (handling hundreds of concurrent users)
- DBMS Neutrality (must work with any OLEDB data source)
- Distributed Architecture Requirements
- Security Requirements (including accessibility/visibility of data by users)
- Unicode Support Requirements
- API Requirements (for third-party application integration)
- Software Configuration/Settings
- Deployment Requirements (Setup, Web UI vs. Thick "Fat" client)
- Architect Experience/Knowledge (Distributed Software Architectures)
- Developer Experience/Knowledge (OOP, COM, .NET, SQL, and so forth)
- Development Methodologies/Practices/Tools
This is not a comprehensive checklist for enterprise development by any means, but it serves the purpose of showing that enterprise development is difficult and requires some forethought. This list does not include schedule, budget, or resource constraints—which usually add more complexity to the project.
As you may have noticed, I listed architect experience separately from developer experience. I did this because I have seen projects with brilliant developers fail because the person who architected the system did a poor job. It is much easier to overcome a lack of development experience than a lack of architecture experience. Good architects are rare and possess more of the soft skills that, in many cases, can only be learned, not taught. Gaps in development experience are less troublesome and can usually be remedied by books, training, or mentoring. Although a discussion of hiring and interviewing practices is outside the scope of this article, it suffices to mention that getting the right people for the job can make or break a project.
Now, let me enumerate the list in more detail, starting from the top. My goal is to cover a wide range of topics, thus I will abstain from going into any real depth on the particulars. I would recommend consulting Google or Amazon.com for resources specific to any topic discussed hereafter.
Scalability Requirements
If you're building a software system that could, for any conceivable reason, end up needing to support many concurrent users, you're going to want your system to be as scalable as possible. Don't make the mistake of assuming the number of users will never grow. Even if your system starts off supporting a small user base, you probably will not want to redesign the system a year later when that same group has 500 people, all depending on your system to do their jobs. Here are some development tactics that will help you achieve scalability for applications of any size:
- High Availability State Servers: If you are building a Web application, make sure you are using a state server that scales. In other words, you must be able to manage state across a Web farm (sometimes referred to as a "web cluster"). Instead of using the default in-process state handlers provided by ASP.NET, PHP, ColdFusion and other Web development application platforms, look into a state server that uses a database, an out-of-process state server, or a third-party state server. This will give you the ability to restart your Web server without losing session state data. In some environments, this is not just important, but required by the nature of the data being handled and the potential length of user sessions. When session state is being managed on a separate machine, keep in mind issues such as network latency and security. An ideal setup is for the state server machine to be on a private network accessible by the Web server. You'll also want to make use of connection pooling if you use a database as your state server.
- Database Connections: The rule with database connections is, "Acquire Late, Release Early." That is, don't open your database connections until absolutely required, and close them as soon as possible. You might also hear this rule stated as "Get in, then get out!" with regard to retrieving data. Avoid server-side cursors and other operations that require keeping a database connection open. There's also a chance you'll make friends with a DBA or two when developing with this mindset.
- Distributed Architecture: Typically, the more you can spread out the computing load across machines, the more you can scale. Ultimately, each application and network topology contributes to a unique environment that determines requirements. Having said that, you will generally be best equipped to scale if the components from each logical tier of your application can be configured to run either on the same machine or on a different machine for each tier. I will expand on this further in a section devoted to this topic.
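As a rough illustration of the "Acquire Late, Release Early" rule above, here is a minimal Python sketch using the standard-library sqlite3 module. The database file, the orders table, and the function names are assumptions invented for this example; a real system would use its own DBMS driver and, typically, a connection pool.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Hypothetical database file, used only for this illustration.
DB_PATH = os.path.join(tempfile.mkdtemp(), "demo.db")


def setup_demo_data():
    """Create a small table so the example has something to read."""
    with closing(sqlite3.connect(DB_PATH)) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, total REAL)"
        )
        conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (25.5,)])
        conn.commit()


def fetch_order_totals():
    """Open the connection only when the query is about to run."""
    # Acquire late: nothing is opened before this point.
    with closing(sqlite3.connect(DB_PATH)) as conn:
        rows = conn.execute("SELECT total FROM orders ORDER BY id").fetchall()
    # Release early: the connection is already closed here, so any slow
    # processing below works on plain in-memory data.
    return [total for (total,) in rows]


def build_report():
    """Do the (potentially slow) work without holding a connection."""
    totals = fetch_order_totals()
    return {"count": len(totals), "sum": sum(totals)}


setup_demo_data()
report = build_report()
```

The point of the sketch is that build_report never holds a connection while it computes; only fetch_order_totals touches the database, and only for the duration of one query.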
DBMS Neutrality
Every enterprise-class system I've built or used stored its data in a relational database. Depending on the market or industry you're targeting with the software you're building, you might be fortunate enough to support only a single DBMS. If you're not so lucky, and few of us are, you're going to need a strategy for supporting multiple database systems with a single code base. What helps you do this?
- Avoid ODBC: ODBC Drivers have limitations dealing with large binary data types. Use OLEDB drivers instead. If you're going to use ODBC anyway, consider providing a tool that will allow you to switch drivers. You never know what data types you could need in the future.
- Decouple your SQL: Most developers have heard the rule against hard-coding SQL into application code. So what should you do instead? One strategy is to store all of the DDL and DML in a separate file accessible by your application. Ideally, this will be an XML (.config) or INI file. This file can contain all your SQL statements for every DBMS you support. You can also use a set of characters as a replacement token in your SQL file so that you can have parameterized queries. For example, in "SELECT Field1, Field2 FROM MyTable WHERE Field3 = '~XYZ~'", ~XYZ~ is the token replaced by a value you supply at run time. If that value originates from user input, bind it as a true query parameter or escape it rather than pasting it into the statement unchanged. I've done this before, and the result was an XML file about 200K in size that contained over 500 queries. I spent about two hours developing a utility that helped me manage the XML query file during development. This approach also allows you to fix SQL bugs without recompiling any of your code. Don't forget to wrap your important database actions in transactions.
- Avoid Stored Procedures: This is controversial advice, no doubt, but I'm definitely not the only one advising this. When the majority of your SQL is ad hoc, using stored procedures will not provide a significant performance or security benefit. Additionally, if you have to support more than one DBMS (for example, SQL Server and Oracle), you'll have to translate all of your stored procedure T-SQL to PL/SQL. Sometimes it makes sense to include specific T-SQL or PL/SQL functions in your SQL statements, like CONVERT() or TO_DATE() to handle dates, but use them sparingly. Also note that I have used the word 'avoid' with regard to stored procedures. Some situations might require their use and you just have to bite the bullet.
- Research DBMS Differences: Familiarize yourself (or your team) with the differences that exist among the popular database systems. For example, SQL Server's auto-incrementing (Identity) fields have no direct equivalent in Oracle releases before 12c. There, you'll have to use Oracle sequences along with insert triggers to automatically get a primary key value inserted. Knowing these idiosyncrasies will allow you to make design accommodations for them early.
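The external query file described above might be sketched as follows in Python. This is only one possible shape, not necessarily the layout the original utility used: the XML element names, the query name, the tasks table, and the load_queries function are all assumptions for illustration. Note one deliberate variation: instead of pasting the run-time value into the SQL text, the sketch converts each ~TOKEN~ into a named bound parameter, which sidesteps quoting problems.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical query file contents; in practice this would be a file on
# disk shipped alongside the application.
QUERY_XML = """
<queries>
  <query name="find_by_status">
    <sql dbms="sqlite">SELECT id, name FROM tasks WHERE status = '~STATUS~' ORDER BY id LIMIT 10</sql>
    <sql dbms="sqlserver">SELECT TOP 10 id, name FROM tasks WHERE status = '~STATUS~' ORDER BY id</sql>
  </query>
</queries>
"""

# Matches ~NAME~ together with the optional single quotes around it.
TOKEN = re.compile(r"'?~(\w+)~'?")


def load_queries(xml_text, dbms):
    """Return {query name: SQL} for one DBMS, with tokens as named parameters."""
    root = ET.fromstring(xml_text)
    queries = {}
    for query in root.findall("query"):
        for sql in query.findall("sql"):
            if sql.get("dbms") == dbms:
                # '~STATUS~' becomes :STATUS, a named placeholder.
                queries[query.get("name")] = TOKEN.sub(r":\1", sql.text.strip())
    return queries


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tasks (name, status) VALUES (?, ?)",
    [("backup", "open"), ("deploy", "done"), ("audit", "open")],
)

queries = load_queries(QUERY_XML, "sqlite")
# The value is bound by the driver, never concatenated into the SQL string.
rows = conn.execute(queries["find_by_status"], {"STATUS": "open"}).fetchall()
conn.close()
```

Switching the target DBMS then comes down to passing a different dbms value to load_queries, while the calling code keeps referring to queries by name.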
Distributed Architecture Requirements
Until the last five years or so, developing, testing, and deploying distributed applications was one of the most complicated things you could attempt to do (some might argue that it still is). You have to worry about the reliability of the network and the performance of communicating with remote servers compared to local machine processes. During the initial stages of system design, you have some choices to make with regard to your architecture.
- Require client-side database drivers? With a Web UI, the answer is an obvious no (with the possible exception of embedded ActiveX controls). For a thick client, the answer is not so clear cut. If you don't want to require client-side database drivers, you'll need a way to instantiate objects remotely on a machine that has the database drivers, and then pass them to the client. This can be done with technologies such as .NET Remoting, DCOM (COM+), Java RMI, or CORBA. The distributed technology you choose will most likely depend on your development platform.
- Which Protocol/Port? Most distributed technologies allow you to use different ports (or channels) to pass serialized objects back and forth. Should the application be able to work over the Internet or only within an intranet? Does your application need to bypass firewalls?
- Deployment Risks: Most distributed objects must be registered somehow on client machines. This has the potential to make deployment more complex. You'll need a strategy to keep the deployment risks minimal.
- Interoperability Requirements: Discussing interoperability could take up an entire article (or book) by itself, so I'll summarize by mentioning that you want to use the simplest data types possible at your endpoints. Avoid returning types such as DataSet from your endpoints if interoperability is a goal of your system.
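To make the interoperability advice above concrete, here is a small Python sketch of an endpoint that exposes only strings and numbers rather than a platform-specific object. The Customer class, its fields, and the to_wire and from_wire helpers are hypothetical names chosen for this example, and JSON is just one convenient platform-neutral format.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class Customer:
    """Hypothetical internal domain object used only for illustration."""
    customer_id: int
    name: str
    signed_up: date


def to_wire(customer):
    """Flatten the object into strings and numbers any client can parse."""
    payload = asdict(customer)
    # Dates travel as ISO 8601 text instead of a platform-specific type.
    payload["signed_up"] = customer.signed_up.isoformat()
    return json.dumps(payload, sort_keys=True)


def from_wire(text):
    """Rebuild the internal object on the receiving side."""
    data = json.loads(text)
    return Customer(
        data["customer_id"],
        data["name"],
        date.fromisoformat(data["signed_up"]),
    )


original = Customer(42, "Acme Corp", date(2005, 3, 14))
wire = to_wire(original)
restored = from_wire(wire)
```

Because the wire format contains nothing but primitive values, a client written for another platform needs only a JSON parser to consume it.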
Security Requirements
Commonly, software security means authentication and authorization. Make the user prove who they are (authenticate) and then check their privileges (authorize). These two ideas, along with user roles and groups, have been around for a long time and are pretty much assumed to be in place for any enterprise software package. What is emerging more recently, however, is smart, flexible data security. Organizations want to be able to define (authorize) who gets to see the many pieces of data they collect. To help them achieve this with your software, here are some ideas:
- Support Active Directory: Not only do users have enough user name/password combinations to remember already but, if your system requires managing its own user names and passwords, you're potentially requiring someone to spend several hours inputting names and, essentially, duplicating existing data. Try to avoid this.
- Don't Require Active Directory: Don't forget that some networks aren't running Active Directory. Have a configurable backup plan for managing user information.
- Simple and Extendable: Keep the security model simple and extendable. One of the most easily maintainable security models I've implemented consisted of defining all actions that could be taken against entities in the system (View, Edit, Create, Delete, and so on) and then allowing the admin user to define whether those actions could be taken (or not) for each entity. A "Role" then consisted of the combined settings for allowable actions upon entities. Users then could be added to roles. If for some reason we wanted to add another entity to the system, it was simply a matter of adding a row for the entity to the correct database table. All actions for the entity were already defined.
The other point to be made here concerns the word "data" in flexible data security. Allowing admin users to define wildcard strings to be used in data retrieval (SQL statements) provides the utmost power. For example, an admin could express the requirement that users in the operations role should only see rows where a particular field starts with the letters 'OPS'.
- Maintain Consistent Standards: If administrators of your system are network administrators also, they're going to be used to things like denied permissions overriding approved permissions. Make sure your security works the same way to avoid confusion.
- Use Application Roles: Using application roles for your database connections helps further security by allowing you to define permissions for everyone with a single account. It's also possible to define database-object level permission with an application role to avoid costly accidents involving deletion of data. Create a standard user application role and deny delete permissions on any tables that standard users should never need to delete information from.
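The action-and-entity security model described above, combined with the convention that a denied permission overrides a granted one, might look roughly like this in Python. The Role class, the is_permitted function, and the sample roles and users are assumptions made up for this sketch; the system described in this article kept the equivalent settings in database tables.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role is the combined allow/deny settings for actions on entities."""
    name: str
    allowed: set = field(default_factory=set)
    denied: set = field(default_factory=set)

    def allow(self, entity, *actions):
        self.allowed.update((entity, action) for action in actions)

    def deny(self, entity, *actions):
        self.denied.update((entity, action) for action in actions)


def is_permitted(roles, entity, action):
    """Deny wins over allow; anything not explicitly allowed is refused."""
    if any((entity, action) in role.denied for role in roles):
        return False
    return any((entity, action) in role.allowed for role in roles)


# Hypothetical roles: staff may view, edit, and create invoices;
# auditors may view them but are explicitly denied editing.
staff = Role("staff")
staff.allow("invoice", "view", "edit", "create")

auditors = Role("auditors")
auditors.allow("invoice", "view")
auditors.deny("invoice", "edit")

# Users are simply added to one or more roles.
user_roles = {"alice": [staff], "bob": [staff, auditors]}
```

Here bob belongs to both roles, so the explicit deny from auditors overrides the edit permission that staff would otherwise grant, which matches what network administrators tend to expect.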