Saturday, April 4, 2009

Installing and Configuring Apache 2.2, Tomcat 6.0, and Open BlueDragon on Windows 2003 Server

File this one under "you can't win 'em all," or "hell has frozen over," or take your pick (though I refuse to file this under "hat eating" because it isn't a change of opinion, it's just a necessity in this case). Since I'm having to go through and set up Apache 2.2, Tomcat 6.0, and Open BlueDragon on a Windows 2003 Server, I figured I'd make lemonade and write up a step-by-step for those of you stuck in the same situation. So here's how to install all this very cool stuff on a very uncool operating system.


(If you prefer to read this in Google Docs, here's the link.)

All these steps should also work on Windows XP or Vista, which is handy for development purposes if you're on a Windows box, but again I can't really verify. If you're using XP I strongly, strongly encourage you to at a minimum install Apache as your web server. The version of IIS that comes with XP only allows you to use one web site, which for web developers simply isn't acceptable and in my opinion shields people from how things actually work in the real world. Sermon over.

Mac users--you will probably be able to follow along nearly line-by-line on this as well. It seems that the Mac install of Apache is nearly identical to the one on Windows (Apache on Linux is a bit of a different beast), and by default all the proxying stuff we're going to be using is enabled on the Mac. Not sure that's the greatest idea, but that's the way it seems to be on the Mac.

Before digging in, note that there's a LOT going on here, and that you'll reach a couple of points along the way where you may think, "It's working! I don't need to read the rest!" And if that particular setup works for you, that may be true. Realize, however, that even though later steps may supersede earlier ones, this is by design to give people a bit of a crash course in the various options involved and how they all work together. And you might want to read to the end to change a couple of settings so your server doesn't crash all the time. Seriously.

Also note that if you're here to learn how to use Tomcat and OpenBD with IIS, I can't help you. I have to draw the line somewhere. In all seriousness though, if someone has tips on getting IIS hooked into Tomcat I'd love to point people to it; I just don't personally have the experience to help people with that and won't be digging into IIS in this sort of setup. Ever. If you absolutely MUST embark on this fool's errand, here's some instructions that may help. Break a leg (or your registry).

So grab a cup of tea, coffee, beer, curare, or whatever floats your boat and let's do this thing.

Installing Apache 2.2

First, ensure that IIS is not installed, or at a minimum that it's not running and that you have the startup for the IIS service set to "disabled." This is probably obvious, but Apache will be your web server running on port 80, so if IIS is also running on port 80 you'll have a bit of a problem. I personally recommend uninstalling IIS altogether because you won't need it, and if you need other IIS-related services like FTP, there are much better options out there such as Filezilla Server.

Next, head over to http://httpd.apache.org/download.cgi and grab the "Win32 Binary MSI Installer." If you need SSL, make sure and grab the one that has OpenSSL included.


 




With Apache downloaded, run the installer, entering the appropriate information as you go. Note that for the rest of this how-to guide I'm assuming you installed Apache at C:\Program Files\Apache Software Foundation\Apache2.2, which is the default. I'll refer to this using {APACHE_HOME} for short, but where the full path is specified, if you installed Apache elsewhere you'll have to make the necessary adjustments.

At the end of the installation process you'll have Apache running as a Windows service, serving up HTTP traffic over port 80. Hit http://localhost just to make sure it's working. If you see Apache's default "It works!" page, you're golden.

If you don't see that, uninstall and give it another shot. Next we'll need Java.

Installing Java

Since Tomcat relies on Java to run, grab the latest Java SE Development Kit (JDK) from Sun's web site. As I'm writing this the latest version is JDK 6 Update 13. Make sure to get the JDK, not just the JRE. Download the installer, run it, and accept all the defaults or put Java in your preferred location; it makes no difference since we'll be pointing Tomcat directly to Java anyway. As long as you don't get any errors as you go, this step is done.

You may already have Java on your machine. That's fine, as long as it's Java 6 Update 10 or later since previous versions of Java had the classloader bug that made CFC performance not so great. If you have an older version now's a good time to upgrade.

Also a general note about Java since in my experience there seems to be some confusion in this area. You can have as many different versions of Java on your machine as you want. They don't conflict, they don't overwrite each other, they don't get in fights and cause your machine to crash. They just sit in a directory and are used by various applications as needed. Now you may have specified a PATH and CLASSPATH in your environment variables that point to a particular version, and that's fine, because we will be pointing Tomcat explicitly to the version of Java we want it to use.

Installing Tomcat

For those of you used to using Adobe ColdFusion, you may not be familiar with what Tomcat is. Tomcat is a Java servlet container, meaning it can serve up Java web applications, and ultimately it's where we'll be deploying OpenBD. CFML engines are nothing more than Java web applications, so even if you install Adobe CF you're getting JRun, which is a JEE server as opposed to a servlet container, but the differences are unimportant for our purposes. So even though you aren't having to install it yourself, you still wind up with a "java server" on which the CFML engine is deployed.

To install Tomcat, go to the Tomcat download site and grab version 6.0.18 (most current as of this writing) of the "Windows Service Installer."


 



 


Note that since Tomcat runs on Java you can actually grab any download you want and it will work, but the Windows Service Installer is what you'll want on a production Windows server so Tomcat is installed, as the name implies, as a Windows service. This also gives you a little monitoring tool that lets you click buttons to stop and start Tomcat as opposed to having to--the horror--type a command in a terminal.

After the download, run the installer and again you can accept the defaults for the most part. During the installation process you will provide Tomcat with a user name and password. This is used to access the Tomcat Manager, and we'll cover that in a moment.

The installation will also prompt you for the location of Java on your machine. Although based on what I read Tomcat 6 can run on either a JRE or a JDK, personally I point Tomcat to the JDK as opposed to a JRE. This is perhaps out of habit since older versions required a JDK, but if you do use the JRE realize you may need to set slightly different variables if you start messing with your settings, specifically JRE_HOME instead of JAVA_HOME. Unless you have a reason not to, I'd recommend pointing to your JDK directory, which you downloaded and installed in the step above.

When the install is done you likely will want to go into your services panel and set Tomcat to start automatically when the machine boots. By default it's set to manual startup.

If everything went well and you didn't change the default port from 8080, which I don't recommend doing unless you really need to, you can now browse to http://localhost:8080 and see the Tomcat welcome page.

As with the previous steps if something went awry, uninstall and give it another shot. In my experience this all does go quite smoothly though.


Accessing the Tomcat Manager

Since you provided a user name and password for the Tomcat Manager as part of the installation process, you can browse to http://localhost:8080/manager/html and log in.


 



It ain't fancy, but it does let you see what applications are deployed, active sessions, and some other good information. You can also reload individual web applications from the manager, which is handy if one app needs to be reloaded due to code changes or problems and you don't want to impact all the other applications on Tomcat.

So far so good? Excellent! (You did answer in the affirmative, correct?) Let's get our OpenBD on.

Installing OpenBD

As I said above, OpenBD and all CFML engines are Java web applications, so with Tomcat up and running the "install" of OpenBD is really just a deployment of the OpenBD Java web application, or WAR (which stands for Web ARchive. Yeah, I know--slick acronym eh?).

Much as people complain about the complexities of Java, deployment is one thing Java got 100% right in my opinion. WAR files are actually ZIP files that contain a deployment descriptor, which means any Java server that adheres to the Java standards (and they all do) will understand what a WAR file is and what to do with it. There are differences in how the specific Java servlet container or JEE server will handle the WAR, but we won't dig into that here. Just know that a WAR file is a standard in the Java world, and it means dead-simple deployment. Copy a single file to the right place and your application is deployed.

So where do we get this WAR file? Why from the OpenBD download page of course!


 



 


You can grab the latest stable (1.0.1 as of this writing) or the nightly build; either will work fine. Just make sure to grab the WAR file and not the Ready2Run or source code versions. (In case you're interested, the Ready2Run version of OpenBD is a pre-configured version of Jetty, another servlet container, with OpenBD already deployed.)

With the file downloaded, we now need to deploy it. Tomcat does auto-deployment of any WAR files that are dropped in its webapps directory, so copy the downloaded openbd.war file into {TOMCAT_ROOT}/webapps, where {TOMCAT_ROOT} is wherever you installed Tomcat. Note that for the rest of this how-to guide I'm assuming you did NOT rename the WAR file. This is important once we get into URLs, context paths, and the other fun that comes later. If you know what you're doing and want to call it something else that's fine, just make the necessary adjustments as you go through this.

After copying the WAR file to Tomcat's webapps directory, give Tomcat a few seconds to do its deployment magic. If you're watching the directory you'll notice that an openbd directory will be created automatically; this is Tomcat expanding the WAR file. That's it, you just deployed a WAR file.

You may now browse to http://localhost:8080/openbd and you'll see OpenBD's test page:


 



Congratulations! You now have an instance of OpenBD running on Tomcat.

But what about Apache as the web server? Why is there a port number and that silly "/openbd" bit in the URL? Glad you asked, because the rest of this guide deals with making our base install more "production ready."

Virtual Hosts in Apache

Strictly speaking you don't need to use Apache as your web server. Tomcat has a very nice, very fast, production-quality HTTP connector called Coyote built in, and I'll probably cover using Tomcat without Apache in another blog post at some point. It does simplify the overall setup and if you don't need what Apache has to offer, it's one less moving part in your setup.

That being said, there are plenty of reasons to use Apache. Whether it's familiarity, or you need to run Perl, PHP, etc. on the same physical box as OpenBD, or you need Apache features that don't exist in Tomcat's web server, it's not at all uncommon to use Apache as the front-end web server in a setup like this.

The first thing we'll cover as far as Apache configuration goes is virtual hosting. Simply put, virtual hosting is the way you tell Apache to respond to different host names (e.g. foo.com vs. bar.com) on the same IP address. In IIS speak these are called "web sites," and it's the exact same concept. This is very handy for both production servers and development environments, so as a web developer you owe it to yourself to become well-versed in configuring virtual hosts.

By default virtual hosts are disabled, so the first step is to enable them. Open up {APACHE_HOME}/conf/httpd.conf, which is the main Apache configuration file, in a text editor. Scroll down towards the bottom of the file and look for the line "# Virtual hosts" (without the quotes), or you could also do a search for that text. The line immediately following is "#Include conf/extra/httpd-vhosts.conf" (again without the quotes).

As you might guess, this directive includes the file conf/extra/httpd-vhosts.conf. The # is a comment in the Apache configuration file, so just delete the # from the Include line (NOT from the # Virtual hosts line since that actually is just a comment line), and save the file.
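
For reference, here's roughly what that bit of httpd.conf should look like after the edit. This is a sketch based on the stock Apache 2.2 config; the lines around it in your copy may differ a little.

```apache
# Virtual hosts
# (the line above is a plain comment and keeps its #; only the
# Include line below gets uncommented)
Include conf/extra/httpd-vhosts.conf
```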

Any time you edit any of the Apache configuration files you need to restart Apache, but hold off on that for a minute because we have another file to edit. Now that we have virtual hosts enabled, which is what uncommenting the line in the last step accomplished, we need to configure a virtual host for localhost, because otherwise Apache won't know what to do when we ask for localhost in our browser.

Open {APACHE_HOME}/conf/extra/httpd-vhosts.conf in a text editor. At the bottom of that file you'll see some existing <VirtualHost> blocks. You can leave those as is for reference or delete them; at a minimum I'd say comment them out by adding a # to the beginning of each line, but even if you don't they won't be hurting anything.

After the existing <VirtualHost> blocks we're going to add a new one of our own. So add a couple of line breaks to the end of the file, and type the following:

<VirtualHost *:80>
    ServerName localhost
    DocumentRoot "C:/Program Files/Apache Software Foundation/Apache2.2/htdocs"
</VirtualHost>

Let's explain this a bit. The VirtualHost part is probably obvious; you're just telling Apache that you're defining a virtual host. *:80 means "this virtual host applies to all IP addresses on this server for requests that come in on port 80." If you have specific IPs you need to worry about, you just replace the * with an IP address.

ServerName is the host name that comes to Apache as part of the HTTP request, and this server name will be checked against the host name in the request to see if it's the one that needs to handle the request. In this case we're specifying localhost, but you can imagine for a "real" site that would be something like mydomain.com. Note that if you also want this virtual host to respond to other host names, you can use multiple ServerAlias lines, e.g. ServerAlias www.mydomain.com. And finally, DocumentRoot is the path to the root of this virtual host, meaning where the virtual host's files are located.
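
To make the IP-specific form and ServerAlias a bit more concrete, here's a sketch of what a block for a "real" site might look like. The IP address, the host names, and the docroot subdirectory are all made up for illustration, so substitute your own. I'm also assuming name-based virtual hosts on Apache 2.2, which want a NameVirtualHost line matching the address used in the VirtualHost tag.

```apache
# Hypothetical values throughout: 192.168.1.10, mydomain.com, and the
# "mydomain" docroot folder are placeholders, not part of this guide's setup.
NameVirtualHost 192.168.1.10:80

<VirtualHost 192.168.1.10:80>
    # Primary host name this virtual host answers to
    ServerName mydomain.com
    # Extra host names handled by the same virtual host
    ServerAlias www.mydomain.com
    ServerAlias blog.mydomain.com
    # Where this site's files live
    DocumentRoot "C:/Program Files/Apache Software Foundation/Apache2.2/htdocs/mydomain"
</VirtualHost>
```

The docroot here sits under the default htdocs directory on purpose, so the access rules Apache already has for htdocs apply. If you put a docroot somewhere else you'll likely need a matching Directory block granting access.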

Note that even though this is Windows we're using forward slashes (/) in the docroot path as opposed to backslashes (\). I think you may need to escape the backslashes by doubling them up if you choose to use those, so I'd recommend using forward slashes instead, even though that may look odd in a Windows path.

There are numerous other options involved with virtual hosts, so if you want to learn more (and you likely will), be sure and peruse the Apache docs on virtual hosts.

Save the httpd-vhosts.conf file, and at this point you can restart Apache from your Services panel. To restart Apache you can also right-click on the Apache icon that got installed in the lower right of your system tray (or whatever the heck it's called on Windows), choose "Open Apache Monitor" and then click the Restart button.


 



After the restart, hit http://localhost in your browser to make sure everything is still working.

That's all well and good but at this point we still don't have Apache and Tomcat on speaking terms. There are many ways to accomplish this, so let's look at one now.

Basic HTTP Proxying With mod_proxy

The idea behind proxying is, as the term itself implies, configuring Apache to serve as a go-between that handles requests coming into your web server and hands them off to Tomcat. Note that this is a bit different than the perhaps more familiar scenario of letting the web server handle the static content and only leveraging the CFML engine to do the CFML processing based on file extension. You can do things this way as well, but personally I prefer the proxy route because I find it simpler and more flexible. If you're familiar with mod_proxy and HTTP proxying, and you're thinking "but the performance sucks!", A) it doesn't really suck that badly, and B) we'll go a different route in a bit.

Done wrong, proxying can be very dangerous, so I strongly encourage you to read the Apache docs on mod_proxy. Call me old-fashioned but I like reading the manuals for things I use, at least to the point where I'm dangerous and can write up long blog posts teaching other people to be equally dangerous.

By default mod_proxy is disabled (unless you're on a Mac), and the default settings when you do enable it are safe, but as with everything if you don't know what the pitfalls are you can get into trouble. Specifically make sure you always have ProxyRequests set to Off, which it is by default, so I'm merely reiterating how important this is. If you set ProxyRequests to On, you may wind up with an open proxy server and the miscreants trolling the tubes LOVE open proxy servers.

So let's throw caution to the wind (seriously, it'll be fine) and enable mod_proxy. To accomplish this, navigate to {APACHE_HOME}/conf and open up httpd.conf in a text editor. Scroll down and look (or do a find) for "#LoadModule proxy_module modules/mod_proxy.so" (without the quotes) and uncomment that line.

This only enables the base Apache proxy module, which is necessary for all proxying, but there are add-on modules that handle specific kinds of proxying. At this point we want to do HTTP proxying so we'll need to enable the HTTP proxy module. This allows HTTP requests coming into Apache to be handed off (or "proxied out") to Tomcat. Scroll down a bit more (it's about 5 lines further down in my conf file) and find "#LoadModule proxy_http_module modules/mod_proxy_http.so" Uncomment that line, and save the file.
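
Once both edits are made, the relevant lines in httpd.conf should look something like the following. This is a sketch of the end result rather than a copy of any particular conf file; the other proxy_* lines that sit between these two in the stock config can stay commented out for now.

```apache
# Base proxy module, required for any kind of proxying
LoadModule proxy_module modules/mod_proxy.so
# HTTP proxy module, needed so ProxyPass can hand off to http:// URLs
LoadModule proxy_http_module modules/mod_proxy_http.so
```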

Don't restart Apache just yet--there's one more step involved.

Proxying to OpenBD From Apache

This is the moment you've all been waiting for (and if I didn't include all my poor attempts at witticism we would have gotten here sooner): we're now going to proxy to OpenBD from Apache. What this will do is eliminate the need to use the port and context path in the URL.

In order to accomplish this we simply need to add a new virtual host to Apache that proxies to OpenBD. Remember that virtual hosts are used to allow Apache to respond to multiple host names, and within each virtual host block we provide settings specific to that virtual host.

Navigate to your {APACHE_HOME}/conf/extra directory and open httpd-vhosts.conf in a text editor. We created a virtual host for localhost earlier so this should be semi-familiar. Scroll to the bottom of the file and add the following virtual host:

<VirtualHost *:80>
    ServerName openbd.local

    ProxyRequests Off

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass / http://localhost:8080/openbd/
    ProxyPassReverse / http://localhost:8080/openbd/
</VirtualHost>


Let's look at the new information in this virtual host as compared to the simpler one we created for localhost. The basic VirtualHost, *:80, and ServerName pieces should make sense. Just note the ServerName, which indicates we'll be calling our OpenBD instance via the host name "openbd.local".

Starting with ProxyRequests is where things get interesting. ProxyRequests Off is telling Apache not to proxy requests. Now this may seem like a misnomer, because we are going to be proxying requests, but not in the way that is enabled by setting ProxyRequests to On. By setting ProxyRequests to On, you are allowing Apache as a whole (or a specific virtual host in this case) to act as a forward proxy server. This is bad unless that's really what you want, which 99.9% of the time it isn't. We will be proxying requests but setting ProxyRequests to Off does not disable the ability to use the ProxyPass directive, which is what we'll be using.

The <Proxy *> block is a container for directives that apply to proxied requests, and the * means it covers everything proxied through this virtual host. As you can imagine, and as with everything related to Apache, you have a lot of granular control here, but for our purposes we're going to proxy everything out to Tomcat for this virtual host.

Within the <Proxy> block you can allow and deny traffic so you could, for example, only allow proxying if the request is coming from a specific IP or range of IPs. This refers to the remote client IP, so for publicly available sites this means that you have to allow proxying for requests coming from any IP address.

The Order directive dictates the order in which requests will be allowed or denied to proxy. By specifying an order of deny,allow we're telling Apache to process the Deny directives first, then the Allow directives. We aren't specifying any Deny directives in this case, but we do specify an Allow directive of "Allow from all" which, as I mentioned above, is necessary since on a web server we typically don't know where requests will be coming from. Of course if you do know where requests will be coming from and you want to limit access by subnet or something along those lines, this is a handy way to accomplish that.
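
If you did want to lock proxying down to a known range of clients, a sketch of the Proxy block using the Apache 2.2 Order, Deny, and Allow directives might look like this. The 192.168.1.0/24 subnet is purely an assumption for the sake of the example.

```apache
<Proxy *>
    # Deny directives are evaluated first, then Allow directives
    Order deny,allow
    # Start by refusing everyone...
    Deny from all
    # ...then let in only the (hypothetical) internal subnet
    Allow from 192.168.1.0/24
</Proxy>
```

With this in place a client outside that subnet should be refused when it hits the virtual host, which is exactly what you don't want for a public site, hence the "Allow from all" in our setup.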

Bottom line here is we're telling Apache to allow proxying for all requests that hit this virtual host. Check the Apache docs for more info on the Proxy directive, as well as the mod_authz_host docs for more info on the Order, Allow, and Deny directives.

Finally, we need to tell Apache specifically where it will proxy to as requests come in. Remember above when we hit our OpenBD test page with the URL http://localhost:8080/openbd? Well that's the same URL we're going to tell Apache to use when someone makes a request for the virtual host openbd.local. This is accomplished using the ProxyPass and ProxyPassReverse directives.

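ProxyPass does the actual handing off. Don't confuse this with forward proxying in the ProxyRequests sense; it simply maps incoming URLs onto a backend server. The / following ProxyPass tells Apache that all requests that come into the virtual host, regardless of subdirectories, etc., are to be proxied to http://localhost:8080/openbd/. Please note the slash at the end of the proxy URL! Things don't work properly without that.

ProxyPass gets requests from Apache to Tomcat, and Tomcat's response comes back to Apache over the same connection, but there's one wrinkle. When the OpenBD webapp on Tomcat sends a redirect, the Location header it generates points at Tomcat's own address (http://localhost:8080/openbd/...), which is no use to the client. The ProxyPassReverse directive tells Apache to rewrite the Location, Content-Location, and URI headers on those responses so they point back at the virtual host instead. Technically speaking, this whole arrangement, with Apache sitting in front of Tomcat and relaying Tomcat's responses back to the client, is Apache acting as a reverse proxy for Tomcat.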

Lots of explanation for a few lines of configuration ... bottom line is that Apache and Tomcat are now communicating with one another. Make sure and save your httpd-vhosts.conf file before proceeding.

The last step here is to add an entry for openbd.local to your hosts file so you can hit that URL and have it resolve to an IP address. Open up C:\Windows\system32\drivers\etc\hosts in a text editor, add a line for 127.0.0.1 openbd.local and save the hosts file.

Lastly, restart Apache if you haven't already, and you can now hit http://openbd.local and see the OpenBD test screen. Make note of the CGI variables being dumped on the OpenBD test screen, however, because HTTP proxying is not without its limitations.


Proxying to OpenBD From Apache using Apache JServ Protocol (AJP)

At this point we have Apache passing the entire HTTP request off to Tomcat for processing. As you might guess this isn't the most efficient way of doing things, since the HTTP request is being passed around in all its uncompressed, plain text glory. That being said, I've seen perfectly acceptable performance with HTTP proxying, so if you're a wimp and want to stop there, us cool folks who are going to forge ahead will forgive you.

The great thing about technology, and particularly open source, is that someone always invents a better mousetrap. In the case of proxying, that better mousetrap comes to us in the form of the Apache JServ Protocol (AJP for short). As the name indicates AJP is a protocol just like HTTP, but it's certainly not a common one since its sole purpose is to facilitate communication between Apache and Tomcat. Purpose-driven solutions do solve specific problems quite well, however, and AJP is no exception.

Why is AJP better than HTTP proxying? First and foremost it performs better. AJP enables binary, packet-based TCP connectivity between Apache and Tomcat so it's fast. This is in contrast to the plain-text format of HTTP. Using AJP, Apache can also maintain an open connection with Tomcat which avoids the expensive process of a new socket being opened for every request. This can mean you'll have more connections open, but it also means better performance.

Concerning CGI variables, with HTTP proxying things can get a bit weird. When you have Apache talking to Tomcat instead of the client (meaning a web browser) talking directly to Tomcat, it's probably not surprising that some of the CGI variables get lost due to the proxying, specifically these:




  • Remote Client IP (CGI.REMOTE_ADDR) -- this becomes CGI.HTTP_X_FORWARDED_FOR
  • Host name requested by the client (CGI.HTTP_HOST) -- this becomes CGI.HTTP_X_FORWARDED_HOST, but note that you can use the ProxyPreserveHost directive to pass the original host name through
  • Host name of the proxy server -- this becomes CGI.HTTP_X_FORWARDED_SERVER



In addition to performance advantages, AJP also retains all of the original CGI variables by default. There are still some CGI problems, however, specifically with CGI.PATH_INFO, and this can wreak havoc with SES URLs that rely on this CGI variable. We'll address that a bit later. Just know that if you need important CGI variables like those listed above, they are retained by AJP. You can get the full skinny on AJP in the AJP docs.
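
If you do decide to stick with HTTP proxying, the ProxyPreserveHost directive mentioned in the list above would slot into the virtual host roughly like so. This is a sketch I haven't run against the rest of this setup, and redirects coming back from Tomcat may behave differently once the original host name is passed through, so treat it as a starting point.

```apache
<VirtualHost *:80>
    ServerName openbd.local

    ProxyRequests Off
    # Pass the Host header sent by the client through to Tomcat
    # instead of replacing it with the backend's host and port
    ProxyPreserveHost On

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass / http://localhost:8080/openbd/
    ProxyPassReverse / http://localhost:8080/openbd/
</VirtualHost>
```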

Luckily since we already have the basic pieces for proxying in place, making the switch to AJP is pretty simple.

First, navigate to {APACHE_HOME}/conf and open httpd.conf in a text editor. Find the line "#LoadModule proxy_ajp_module modules/mod_proxy_ajp.so" and uncomment it. While you're in there you can disable HTTP proxying by commenting out the HTTP proxy line, but note that you DO need the base mod_proxy module enabled to do any proxying, which includes AJP. Having HTTP proxying doesn't hurt anything other than perhaps causing Apache to take up a bit more RAM, so either leave it enabled or disable it if you choose.
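
Assuming you go ahead and disable HTTP proxying at this point (which, as noted, is optional), the proxy-related LoadModule lines in httpd.conf would end up looking something like this sketch:

```apache
# Base proxy module: still required, AJP proxying depends on it
LoadModule proxy_module modules/mod_proxy.so
# AJP proxy module, needed so ProxyPass can hand off to ajp:// URLs
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
# HTTP proxy module: optional now, so it's commented back out here
#LoadModule proxy_http_module modules/mod_proxy_http.so
```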

Once you have the AJP module enabled, save the httpd.conf file, but before restarting Apache we have to modify our virtual host settings as well.

Since AJP is its own protocol, you're probably already guessing that the http://localhost:8080/openbd/ bit in our proxy settings from above will no longer work (or won't be using AJP at any rate) since it's proxying over HTTP. To use AJP we have to make some adjustments in our virtual host settings, so open up {APACHE_HOME}/conf/extra/httpd-vhosts.conf in a text editor. Find the <VirtualHost> block we added above for HTTP proxying, and change it to the following:

<VirtualHost *:80>
    ServerName openbd.local

    ProxyRequests Off
    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass / ajp://localhost:8009/openbd/
    ProxyPassReverse / ajp://localhost:8009/openbd/
</VirtualHost>

Pretty darn similar! The only alterations we need to make here are to change the protocol from http:// to ajp:// and change the port from 8080 to 8009. AJP is built into Tomcat and enabled out of the box, so with Tomcat running it's already set up to respond to AJP requests on port 8009.

Save your virtual hosts file, restart Apache, hit http://openbd.local again, and be in awe at the tremendous speed increase. OK, hitting it yourself you might not notice, so just have faith that AJP is indeed going to be a more performant choice than HTTP proxying.

What you will want to make note of on the OpenBD test page is the CGI variables dump. The X_FORWARDED... CGI variables we saw when using HTTP proxying are now gone, and the normal CGI variables are back with their proper values.

But wait! There's more! Things may seem to be working OK at this point (and really they are for the most part), but we still have the little matter of the context path to worry about since it will cause issues in certain scenarios.

Context Paths in Java Web Applications

Let's take a brief detour to discuss context paths in Java web applications, and be aware that this will be far from a complete guide to this topic. I'm only going to explain what I think is the minimum required knowledge so we can understand what issues might come up related to context paths and how we can address them.

As we saw when we first deployed OpenBD on Tomcat what seems like several days ago (how long is this how-to guide anyway?), we accessed Tomcat directly via a URL containing port 8080, but also containing /openbd at the end. The openbd part of the URL is what in the Java web app world is called a context path, and it corresponds to the physical directory name in which the web application is deployed. There is a single ROOT web application in Tomcat that is the one web application that is absent a context path, and you'll see this in action by hitting http://localhost:8080, which displays the welcome page. If you look inside the ROOT directory under Tomcat's webapps, this is where you'll see the code and images for the Tomcat welcome page.

You can certainly run OpenBD as the ROOT application in Tomcat. By deleting the existing ROOT directory, renaming the openbd.war to ROOT.war, and dropping it in Tomcat's webapps directory, you would have an instance of OpenBD running as your ROOT (meaning "context path-less") application in Tomcat. I've found in my own use of Tomcat, however, that I prefer to have my applications isolated as opposed to leveraging a single web application on Tomcat for all my CFML applications, hence why I didn't just tell you to put OpenBD on Tomcat as the ROOT application to begin with.

Since everything is hitting Tomcat on port 8080, you can think of the context path as similar to putting subdirectories in the root of a web server and hitting everything by referring to the server name followed by the subdirectory name. Looking at it from the front end this makes sense; you have /openbd in your URL, and that corresponds to a directory in the docroot of Tomcat. Simple enough. From the backend perspective, however, things work a bit differently, because the backend code isn't really aware of the fact that it's inside this context path.

Let me explain by using the example of creating an instance of a CFC. Consider running this line of CFML within our /openbd context path:

<cfset foo = CreateObject("component", "path.to.Foo") />

As long as inside your openbd directory you have path/to/Foo.cfc, this code will work. Note that you did not have to refer to the CFC by using openbd.path.to.Foo. This differs from a plain old web server, because if you have an openbd directory under your docroot on a web server, and you don't have any virtual hosts dictating different behavior, you would have to use openbd.path.to.Foo to get to that CFC. Since openbd is our context path in this case, and by default an application inside a context path cannot see outside its context path, the area immediately inside the openbd directory is the root of the application as far as the backend code is concerned.

So far so good? Let's examine things from the front end of the equation again. Where things get tricky is with regular front-end assets like images, stylesheets, JavaScript files, etc. This is because where things like image paths are concerned, the context path does come into play since it's part of the request coming from the client. So whereas path.to.Foo works for a CFC inside the context path, <img src="/path/to/image.jpg" /> will not work; in this case you would need to use <img src="/openbd/path/to/image.jpg" /> instead. The killer is the initial / in that image path. To backend code, / resolves to immediately inside the context path. To front-end code, / resolves to one level above the context path, i.e., to the root of Tomcat as a whole.
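If you want to see the front-end half of this for yourself, here's a rough sketch. It's Python rather than CFML, purely because Python's standard urljoin resolves URLs the same way a browser does; the page URL and variable names are just assumptions based on the setup described above.

```python
from urllib.parse import urljoin

# The page being viewed: OpenBD deployed under the /openbd context path.
page = "http://localhost:8080/openbd/index.cfm"

# A leading slash means "root of the server", so the context path is skipped.
root_relative = urljoin(page, "/path/to/image.jpg")

# Spelling out the context path lands inside the application.
with_context = urljoin(page, "/openbd/path/to/image.jpg")

# No leading slash: resolved relative to the page's own directory.
relative = urljoin(page, "path/to/image.jpg")

print(root_relative)  # http://localhost:8080/path/to/image.jpg
print(with_context)   # http://localhost:8080/openbd/path/to/image.jpg
print(relative)       # http://localhost:8080/openbd/path/to/image.jpg
```

The first result is the one that bites: the browser asks Tomcat for a file outside the openbd application, which is why the image doesn't show up.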

How do we resolve this? More importantly, does it even matter? If you've used / in all your front-end code thinking that will get you to the root of your application, then it will definitely matter. If you've left that initial / off all your image, CSS, etc. paths, then you're safe ... for now. You'll still have problems when we do the last few steps in our setup, however, specifically where SES URL parsing is concerned.

Why am I rambling on about all this stuff and having you perform certain steps only to have you change things immediately thereafter? Partially to torture you, sure, but mostly it's because I personally like going through things very systematically so I really understand what's going on and why I am doing each step, not to mention what the pros, cons, and consequences of making various decisions are. Also because if you go through things this way and survive, I hope you truly understand what's going on so you can A) address problems as they arise, and B) extend this knowledge as your configuration needs change.

Our next step--and luckily for both of us it's the penultimate step--is to create a virtual host in Tomcat to set the stage for our final configuration.

Configuring Virtual Hosts in Tomcat

As stated near the beginning of this how-to guide, Tomcat contains a full-blown web server, but we haven't touched that piece of things yet. Since we're interested in removing the context path from our equation, however, we now need to do just a small bit of configuration in Tomcat's server.xml. Check out the Tomcat docs on virtual hosting if you want to learn all about configuring virtual hosts in Tomcat, but for the purposes of this how-to guide we'll keep it very simple.

To review what we're doing and why we're doing it, on Tomcat we're going to set up a virtual host so we can point a host name within Tomcat to a specific directory as its root and not have to worry about the context path. The context path doesn't show up in the URL anymore given how we're proxying, but as described above it can still cause issues. Creating virtual hosts in Tomcat is actually very similar to how it's done in Apache; we simply edit a configuration file and restart Tomcat.

Navigate to {TOMCAT_HOME}/conf and open server.xml in a text editor. Scroll down towards the bottom of the file or search for the line beginning with <Host name="localhost"

Scroll down a bit more to find the closing </Host> line, which should be right before an </Engine> line. Right after the </Host> we're going to add a new host to Tomcat as follows:

<Host name="openbd.local">
    <Context path="" docBase="C:/Program Files/Apache Software Foundation/Tomcat 6.0/webapps/openbd" />
</Host>

And that's all there is to it. This is a bit different from virtual host configuration on Apache, but it's probably similar enough that you can see what's going on.

The name attribute in the <Host> tag will be the host name that this host will respond to, which is the equivalent of the ServerName in Apache. Note that the host name on Tomcat does not need to be identical to the host name on Apache since we'll still be proxying to Tomcat from Apache explicitly (we'll see that in a second). In the <Context> tag we set the context path to ""--meaning it won't have a context path!--and since we're losing the context path we need to tell Tomcat where to find the files for this host, which is the path indicated in the docBase attribute. Note again here we're using forward slashes in our Windows path.

Save the server.xml file and restart Tomcat. You can restart Tomcat via your services panel, or Tomcat has a configuration application similar to the Apache configuration application. If you don't already have it in your system tray, go to Start -> All Programs -> Apache Tomcat 6.0 -> Monitor Tomcat. This will cause a second icon to appear on the right-hand side of your system tray. One of these is for Apache, the other is for Tomcat. Right-click on the one for Tomcat and click on "Stop Service." Give it a few seconds, then right-click again and click on "Start Service." (Yeah, doing this from the Services panel is probably easier.)

After Tomcat restarts, check that things are working by hitting http://openbd.local:8080 in your browser. You should see the OpenBD test page.

The context path is now gone thanks to our host on Tomcat, but we're still hitting Tomcat using the port in order to bypass Apache, which lets us make sure our Tomcat virtual host is working properly. This method of hitting Tomcat remains available even once we hook Apache and Tomcat together, and it's a handy way to test Tomcat without going through Apache. If you ever run into any issues, this can help you troubleshoot which piece of your setup is having problems.

Finalizing the Apache Virtual Host Configuration

By now you're an old hand at this. There's one more tweak we need to make to our virtual host on the Apache side to get it talking to our newly created host on Tomcat. Open up Apache's httpd-vhosts.conf ({APACHE_HOME}/conf/extra/httpd-vhosts.conf) in a text editor and find the openbd.local virtual host we've been editing. All we need to do now is point to the Tomcat host instead of localhost and the context path:

<VirtualHost *:80>
    ServerName openbd.local

    ProxyRequests Off

    <Proxy *>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass / ajp://openbd.local:8009/
    ProxyPassReverse / ajp://openbd.local:8009/
</VirtualHost>

We're still using AJP, and still proxying in the same way, but now we're hitting our Tomcat virtual host without using a context path instead of ajp://localhost:8009/openbd/ as we were before. Note that we still need the port for proxying, but this won't show up in the browser since it's all done behind the scenes.

Save the file, restart Apache, and now--FINALLY!--you can hit http://openbd.local in your browser and see the magic working. The browser talks to Apache, Apache talks to Tomcat, Tomcat does its thing with OpenBD, and all is right with the world.

You're done now! Unless you want to learn the secrets of using CFML-based SES URLs that is ... and oh yeah. You'll want to read the last bit so Tomcat won't crash the second it's under load. (Do I know how to keep you reading or what?)

Search Engine Safe (SES)/Search Engine Friendly (SEF) URLs

This is a bit of a tangent, but some applications and frameworks in the CFML world do their own version of SES/SEF URLs using the CGI variable CGI.PATH_INFO. Well, with all this proxying going on, rather than go into the gory details allow me to summarize: this won't work. Since I'm most familiar with Mach-II's use of SES URLs (although BlogCFC and many other CFML applications do the same thing), I'll use Mach-II as my example to illustrate the problem.

If you enable SES URLs in a Mach-II application, the URL http://localhost/index.cfm?event=foo becomes http://localhost/index.cfm/event/foo. This request will make it through Apache just fine since Apache's not going to handle the request anyway, but once it hits Tomcat and OpenBD, you'll get a 404 because strictly speaking that path doesn't exist, and without us telling the server to translate CGI.PATH_INFO into something that OpenBD can comprehend and handle, it just won't work properly.
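To make the translation problem a little more concrete, here's a minimal sketch of what "turning the path info back into URL variables" amounts to. This is Python for illustration only; it is not Mach-II's parser or OpenBD's filter, and the function names split_ses_path and path_info_to_params are made up for this example.

```python
def split_ses_path(path, script_ext=".cfm"):
    """Split a request path into (script_name, path_info)."""
    marker = path.find(script_ext + "/")
    if marker == -1:
        # Nothing follows the script name, so there is no path info.
        return path, ""
    cut = marker + len(script_ext)
    return path[:cut], path[cut:]


def path_info_to_params(path_info):
    """Pair up /name/value segments into a dictionary."""
    parts = [p for p in path_info.split("/") if p]
    return dict(zip(parts[0::2], parts[1::2]))


script_name, path_info = split_ses_path("/index.cfm/event/foo")
params = path_info_to_params(path_info)

print(script_name)  # /index.cfm
print(path_info)    # /event/foo
print(params)       # {'event': 'foo'}
```

The real file on disk is index.cfm; everything after it is the path info. Unless something on the server does this kind of split, the whole string gets treated as a file path, and that file doesn't exist.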

All is not lost, however. OpenBD ships with a SearchEngineFriendlyURLFilter and enabling it is as simple as uncommenting a chunk of XML in a configuration file, which if you've stuck with me this long, you're an expert at.

Head over to {TOMCAT_HOME}/webapps/openbd/WEB-INF and open the web.xml file in a text editor. Right inside the <web-app> node you'll see a chunk of XML that begins with the following:

<!--
    Uncomment the configuration of this filter in order to support search engine
    friendly URLs like http://127.0.0.1/bdj2eeregr/index.cfm/path/info.

So do what the man (er, file) says and uncomment that chunk of XML! I just move the close comment (-->) up to right below the paragraph of explanatory text so I have the paragraph of text intact. Save that file and restart Tomcat, or you can navigate to Tomcat's web-based manager and reload the openbd webapp individually (see "Accessing the Tomcat Manager" above if you need a refresher on that).

Mission accomplished: you now have SES URL capabilities thrown into the mix.

One more tip I'll throw in for free: If you enable SES URLs you may notice you still have problems with things like image paths resolving correctly. Things like <img src="path/to/my/image.jpg"/> will definitely cause issues because based on the SES URL, it will be looking for images in the wrong place.
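Here's a quick sketch of what "the wrong place" looks like. Again this is Python's urljoin standing in for the browser, and the two page URLs are assumptions based on the Mach-II example above, moved onto our openbd.local host.

```python
from urllib.parse import urljoin

image = "path/to/my/image.jpg"

# Old-style URL: the relative path resolves against the site root.
plain_page = "http://openbd.local/index.cfm?event=foo"
plain_result = urljoin(plain_page, image)

# SES URL: the browser treats /index.cfm/event/ as a directory.
ses_page = "http://openbd.local/index.cfm/event/foo"
ses_result = urljoin(ses_page, image)

print(plain_result)  # http://openbd.local/path/to/my/image.jpg
print(ses_result)    # http://openbd.local/index.cfm/event/path/to/my/image.jpg
```

The browser has no idea that index.cfm is a file and event/foo is data, so with the SES URL it goes looking for the image underneath /index.cfm/event/.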

OpenBD has a nice quick fix for this. Just throw <cfbase> (yes, that's a real tag in OpenBD) in your HTML page's <head> block, and like magic the problem is solved. You can read up on <cfbase> on the OpenBD wiki.

Now, for the final act, let's give Tomcat enough breathing room so it doesn't crash when you try to use it in production. (Aren't you glad you read this whole thing?)

One More Thing ...

Java loves memory. By default Tomcat is given 64MB of memory. Is this a problem? You bet your overflowing stack it is. But luckily it's easy to fix.

On other operating systems you would get the pleasure of editing another text file, but on Windows you get a GUI to accomplish this task. Go to Start -> All Programs -> Apache Tomcat 6.0 -> Configure Tomcat. Click on the "Java" tab to see the JVM settings.

The settings at the bottom are the ones we're concerned with.

  • "Initial memory pool" (equivalent of -Xms in JAVA_OPTS)
  • "Maximum memory pool" (equivalent of -Xmx in JAVA_OPTS)
  • "Thread stack size" (equivalent of -Xss in JAVA_OPTS)

In most cases you should only have to worry about the first two. All the reading I did on thread stack size basically said "don't worry about it unless you have a problem," so that's good enough for me.

So how much memory should you give Tomcat? Well buddy, how much RAM you got? Seriously though, particularly if you're going to be running Tomcat as the main application on a server, I'd give it as much RAM as you can. On a 32-bit machine with 2GB of RAM (just for a point of reference), I'd set the initial memory pool and the maximum memory pool both at 1024MB. If you have 4GB available, you can probably go 1.5 times that (realize as you approach 2GB allocated to 32-bit Java you're hitting its limits), but Tomcat seems to run pretty darn well in 1GB depending on how much you're hammering on it. If you're on 64-bit adjust accordingly.

Why set both the min and max memory settings to the same value? The theory is that you'll get better performance because Java's going to take up all the RAM it will ever use right when Tomcat fires up as opposed to having to grab more memory as the load increases. Also, if Java tries to grab memory at runtime and that memory isn't available, bad things may happen. Some people also believe that garbage collection runs more efficiently when the min and max memory settings are the same. I've had good luck with setting these to be the same so that's what I tend to do.

So set those values, hit "Apply" or "OK," and then restart Tomcat. You're now production ready.
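For those of you on other operating systems, here's a minimal sketch of how those three GUI fields map onto the flags you'd put in JAVA_OPTS. It's written in Python just to spell out the mapping; the function name jvm_memory_flags is hypothetical, and it assumes the memory pools are given in MB and the thread stack size in KB.

```python
def jvm_memory_flags(initial_mb, maximum_mb, stack_kb=None):
    """Build the JVM flags that correspond to the three Java tab fields."""
    if initial_mb > maximum_mb:
        raise ValueError("initial memory pool cannot exceed the maximum")
    flags = [f"-Xms{initial_mb}m", f"-Xmx{maximum_mb}m"]
    # Thread stack size is optional; leave it alone unless you have a problem.
    if stack_kb is not None:
        flags.append(f"-Xss{stack_kb}k")
    return " ".join(flags)


# The 2GB, 32-bit example from above: min and max both at 1024MB.
java_opts = jvm_memory_flags(1024, 1024)
print(f'JAVA_OPTS="{java_opts}"')  # JAVA_OPTS="-Xms1024m -Xmx1024m"
```

Where exactly you set JAVA_OPTS depends on how Tomcat is started on your system, so check the Tomcat docs for your platform.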

If you're interested in learning more about Tomcat memory settings and issues, you can read more in the Tomcat Memory FAQs. I particularly enjoy the answers to the "Why do I get OutOfMemory Errors?" question. So helpful, and yet so snarky!

What About mod_jk?

If you're totally sadistic, you may want to dig into mod_jk. Then again, if you really need mod_jk, chances are you care enough to not be using Windows. If you're interested in mod_jk on Linux, check out Dave Shuck's blog. Either he's way crazier than I am or I'm way lazier than he is. This how-to guide's long enough as it is, so maybe I'll cover mod_jk another time. I personally like AJP proxying (and mod_jk actually uses AJP as its protocol) because it's fast, flexible, and easy to set up, but there are cases where mod_jk will be a better (or necessary) choice, e.g. more advanced load balancing, if you need support for larger packet sizes, etc. For now we'll leave it alone. You can thank me later.

Common Errors

Probably the most common error you'll see as you make configuration changes to your setup is 503 errors from Tomcat. This seems to be related to the fact that since we're using AJP, the connection to Tomcat can stay open so it doesn't restart correctly. That's largely anecdotal evidence, but I've even seen cases where according to the Windows service panel Tomcat wasn't running, but it was still serving up requests just fine. A restart of Tomcat, or Apache, or sometimes both usually fixes this issue.

Other than that I really haven't run into any additional problems enough to call them "common errors." It really is a nice, flexible, fast, powerful, and darn stable setup overall.

Where to Go From Here

Even with all this information this is only the tip of the iceberg in terms of what you can do with this setup. I didn't get into mod_rewrite at all, I only scratched the surface in many of the other areas, and I didn't even mention about 95% of what Tomcat and Apache can do, so I strongly encourage you to read the docs or get a good Tomcat book; I like Professional Apache Tomcat 6 (Wrox), but Tomcat: The Definitive Guide (O'Reilly) is good as well.

Thus endeth the lesson. I hope those of you who survived found it helpful. If there are any errors, incoherencies, or other suggestions you have to make this guide better, please let me know.

We now return you to your non-Windows operating system.


Comments



Very helpful, thanks Matt. A good primer on Apache/Tomcat/OpenBD in general, not just for Windows.


One thing I still have questions about is how the WEB-INF directory is used. I didn't want my document root under the Tomcat installation directory. So when I added a HOST to server.xml, I set the docBase to my real web root. But OpenBD wouldn't work until I copied my WEB-INF directory there. Can you explain this a little?





@Ryan--using this setup the web app won't work unless it has everything it needs in the docBase you set. This isn't the same thing as what you may be used to with a traditional ColdFusion setup. WEB-INF is where all the JAR files you need are, the configuration files, etc. so it won't work unless that's all present in your docBase.


Short answer is to start thinking of your web applications as more self-contained entities, which is the more traditional way of doing things in the Java world.





Matt, my alternate document root with a copied WEB-INF directory does not appear to be a self-contained entity.


I notice that when I browse to /openbd/bluedragon/administrator on the site, I'm actually browsing the administrator files that are under the installed Tomcat root. (I can make changes to those files and see the modified text). So there is an outside dependency there.


That doesn't concern me much; I think it's probably a good idea to have all the sites using the same administrator code. But I also notice that when I change administrator settings on my site, the bluedragon.xml file that gets updated is under {TOMCAT_ROOT}, not under my alternate document root. So it's definitely not self-contained. Where did I go wrong? Do I need to change something in the copied WEB-INF? Maybe something in web.xml? I looked in there but there are no absolute paths.


Could it be my HOST declaration in Tomcat's server.xml file? I notice all my HOST entries have appBase="webapps". I'm not sure what this setting is referring to.


Thanks for your help.





I guess my question is why are you using an alternate document root instead of just having everything live in the Tomcat webapps area? And if you're browsing the wrong administrator, there's something else going on with your setup that I'm missing.


What do you mean by "Tomcat root" specifically?


Wow--if you're hitting the administrator and the wrong bluedragon.xml file is getting updated, I'm really missing something with how you've done your setup, or there's something really wrong with my how-to guide. ;-)


If you have EVERYTHING under a self-contained webapp in Tomcat, you won't have these problems.


Why did you put an appBase of webapps in all your hosts? Unless I'm missing what you're trying to do you don't need it. You can read more on that here:


http://tomcat.apache.org/tomcat-6.0-doc/config/host.html


The only host on my setup that has an appBase specified is the localhost host--other than that I just leave it off.


If there's something confusing about the how-to please let me know, and if you need further assistance I'm happy to help.





My Document Root is not under the Tomcat installation directory because that's just not how I organize my stuff. My websites are located in D:\websites (many people also use C:\inetpub\wwwroot) on my desktop, and in /home/WWW-data on my server. Having the web site located under the Tomcat installation directory doesn't seem very logical to me, but I can't exactly say why. If it were a PHP site would you want all your *.php files located under the PHP installation directory?


By "Tomcat root" I meant something like C:\development\Tomcat6, the installation directory I chose when I installed Tomcat. The administrator files I was referring to under the Tomcat root were in C:\development\Tomcat6\webapps\openbd\bluedragon\administrator.


But I solved the problem of the wrong bluedragon.xml file by removing the appBase="webapps". That was there because I copied it from some other example, maybe when I was setting up Railo.


And I had read that Tomcat document page many times trying to figure out what appBase really indicates, but I just wasn't understanding it.


I also noticed that after removing appBase="webapps", I had to copy the bluedragon directory into my Document Root (the bluedragon directory that contains all the admin cfm files).


So anyway I'm good to go now, thanks a lot.





Regarding the logic of putting stuff under Tomcat, IMO you're still thinking in the traditional CF way, not in the Java web app way. It is TOTALLY NORMAL (and IMO preferred) in the Java world that you DO have EVERYTHING (including your CFML engine!) as part of your Java web app. Then deployment is as simple as dropping a WAR file on your servlet container. The PHP example isn't exactly apt because it works the same way traditional "use the same CFML engine for all my apps" does, and the way PHP does things isn't the way Java does things.


Note that you can set things up in this more "traditional" (as far as non-Java stuff goes) way with OpenBD--that's how our "Ready2Run" download is configured. One OpenBD instance, multiple applications. But that's not really the norm in the Java webapp world.


Also, you can do things like drop OpenBD in as your root application in Tomcat and use rewriting to hand CFML files off to OpenBD for processing, or use mod_jk which is more like the JRun web server connector. Some people prefer that, but it's more of a headache to set up, and I prefer having everything self-contained.


But I'm glad you're bringing all this up--it's a different way of doing things as compared to what people are used to with CF, and it's far more normal in the Java web app world. The way CF is configured has gotten people used to doing things in kind of a "non-Java" way.


That being said, I suppose what you might prefer is using mod_jk which would have things function more like what you're used to. I'll have to dig into that at some point.


Thanks for reading and thanks for all the feedback! As always, happy to help further if need be.





BTW, if you are installing Tomcat on Windows you may run into the issue I just did, where it couldn't detect my timezone correctly. I wrote a blog post about the issue this morning: http://www.stillnetstudios.com/2009/04/11/time-off-railo-bluedragon-tomcat





Thanks Ryan! Good to know.





Matt,


Thanks very much for the presentation and the blog post. Very helpful. One question. What I would like to do is work with one codebase and run it on 2 (or 3) CFML engines to check compatibility as I develop. Is this doable? The hitch might be that the WEB-INF directories couldn't co-exist if they need to share the same parent directory, but maybe this could be worked around using mappings so they can be in separate locations.


?





Nando, I've thought about this too. I think it could be done on Linux or OSX using symbolic links, if you are willing to put your application in a subdirectory. If you don't know what a symbolic link is, it's kind of like a Windows shortcut, but much more powerful.


For example, say your web app is located in /var/www/html. You could set up three Tomcat applications like:

/opt/tomcat6/webapps/openbd
/opt/tomcat6/webapps/railo
/opt/tomcat6/webapps/cfusion

Then inside each of those folders you would make a symbolic link to your application files in /var/www/html:

ln -s /var/www/html /opt/tomcat6/webapps/openbd/myapp
ln -s /var/www/html /opt/tomcat6/webapps/railo/myapp
ln -s /var/www/html /opt/tomcat6/webapps/cfusion/myapp

Then browse to the sites like:

http://localhost:8080/openbd/myapp
http://localhost:8080/railo/myapp
http://localhost:8080/cfusion/myapp

You'll be browsing the same CFML files using the different engines.


Maybe Matt knows of a better way. I'd love to see a way to do this on Windows.





My initial thought was the same as Ryan's--you can do it with a symlink so long as you don't mind your app being in a subdirectory.


You might also be able to get creative with aliases/rewriting at the web server/tomcat level and potentially accomplish this but I'd have to think on that one.





Hi Matt,


First of all, thanks a lot for posting such a detailed tutorial. Even though I'm not on Windows (but OSX), I learned a lot from this post and your presentation on CFMeetup.


One thing I'd like to point out about the SES URL issue. I set up a Tomcat/Railo installation and I found that I could have URLs ending in /index.cfm/handler/action (while testing a ColdBox app) if I simply added the following servlet mapping in the web.xml file of my application:


CFMLServlet


/index.cfm/*


And my cgi.PATH_INFO variables were coming through just fine. I did this hitting Tomcat directly, though, without proxying (i.e. http://mysite.local:8080/index.cfm/handler/action).




Whoops (I wondered if those angle brackets would show up). Here again is what I added to my web.xml file:


<servlet-mapping>
    <servlet-name>CFMLServlet</servlet-name>
    <url-pattern>/index.cfm/*</url-pattern>
</servlet-mapping>





Thanks Tony--great tip!





Ok, I found a way to share code between the CFML engines using symlinks like we discussed, but on WINDOWS! Evidently there is a tool from Sysinternals called Junction that allows you to make NTFS "junctions," which are similar to symlinks on Linux. I said similar, not the same: evidently when you delete a junction it deletes the folder it was linked to! So be sure to use the Junction tool to delete it; don't just delete it from Windows Explorer.


http://technet.microsoft.com/en-us/sysinternals/bb896768.aspx





I was unable to get Tomcat to start as a Windows service on my KickAssVPS account.


I discovered that I needed to copy msvcr71.dll from {JAVA HOME}\bin to {WINDOWS}\system32.


After that, Tomcat fired right up!





Thanks Christian--a quick Scroogle shows that msvcr71.dll is part of the Windows Visual C libraries, so I'm a bit confused as to A) why that was in your java bin directory in the first place, and B) why it would have anything to do with running Tomcat.


Did you update your environment variables so Windows knows where to look for Java? And did you download a fresh copy of Java from Sun's web site?





Hmm...that's interesting. I agree that doesn't make much sense either, given it's a C library.


I am using the latest Java JRE (jdk1.6.0_14, as of this writing) but no, I didn't update my environment variables.


Instead, I used Tomcat's "configure" window to make sure the path to {JAVA}\jre\bin\client\jvm.dll was set correctly.





I don't think you have to point to a DLL in the Tomcat settings, though I'd have to double-check a Windows install to be sure. You should just have to point to the Java lib directory.


You're probably better off downloading a full JDK instead of using a JRE and pointing to the JDK in your environment variables on the server as opposed to pointing Tomcat to a DLL.





These are some great tips. I will be installing Apache in the future. Thanks a lot.





Matt - Did you do a post on eliminating the port number and context path for Tomcat? If not, do you have some resources?





@Mike--that's covered in this post if you use Apache as the web server. If you're wanting to run Tomcat standalone on port 80 that's doable too, but you'd have to read the Tomcat docs for that. I didn't cover that here.





Thanks. I found some good info on Tomcat from Corfield's "Railo For Dummies" series.



3 comments:

Kavitha Ravinuthala said...

Hi Matt,
We are getting out-of-memory issues in Tomcat 6. We are on 64-bit with 8 GB of RAM. Can you please point me to any use case that provides the necessary information as proof, so that we can upgrade RAM from 8 GB to 16 GB in order to avoid out-of-memory PermGen space errors?

Matt Woodward said...

What do your settings look like currently?