Thursday, February 25, 2010

Moving From IIS To Apache: It's Easier Than You Think

I'm in the middle of moving some things from physical servers to a VM infrastructure, and one application makes heavy use of URL rewriting and proxying. This is on Windows Server 2003 and when I first set this app up a few years ago, I used ISAPI Rewrite 2 to handle the rewriting and proxying chores. It's been working fine so when I set up the new VM for this app I got a license for ISAPI Rewrite 3 and started configuring things.

I'll spare you all the gory details but yesterday afternoon--a mere few hours before I was set to do the cutover--ISAPI Rewrite started choking hard. I started getting "Bad Request (Request header too long)" errors, but only some of the time even on the same URL, so I hacked the registry as recommended by Microsoft in an attempt to fix it. That was followed with "Bad Request (Invalid Header Name)" errors, which led to another registry hack. This seemed to fix things for a while, but then suddenly IIS would stop responding and throw one of these two errors if I had any rewrite rules enabled. Things continued a downward spiral from there. I even tried installing the older version of ISAPI Rewrite but that would immediately throw a 500 error whether or not any rewrite rules were enabled.

Needless to say I had to cancel the migration, and after the problems with ISAPI Rewrite I had absolutely zero confidence in that solution. There was no way I could move forward knowing that at any moment and without reason the whole thing would come crashing down.

I don't like being backed into a corner, particularly by Windows, so I shut down IIS and installed Apache. This app has a ton of server configuration to it, but once I don't trust something I simply can't use it, so the configuration work on the Apache side would be well worth the effort since I'd wind up with a solution I can trust. (I would have chucked Windows altogether, but that's not really my call in this case, and given that I'm under a bit of a time crunch that was one more variable I didn't need in the mix right this second.)

Here are the steps I went through, and it actually was easier than I thought it would be.

Download and Install Apache

Actually first, make sure to shut down IIS and set the startup to "Disabled" in your services panel. Now that I have everything set up I'm going to uninstall IIS entirely, but it was handy to have around for a bit so I could fire it up and go into the IIS admin console to check my settings as I moved things to Apache.

So grab the Windows version of Apache (make sure to grab the version with SSL if you need it), run the installer (which takes all of about 10 seconds), and tell it to run as a service for all users. Next make sure when you hit localhost in your browser you get Apache's "It works!" message. Congratulations, you just freed yourself from IIS.

Connect ColdFusion to Apache

This server is running ColdFusion 8 Enterprise, and the OS on the new VM is Windows 2003 64 bit. The easiest way to hook CF into Apache is to open the Web Server Configuration Tool, which is under Start -> Programs -> Adobe -> ColdFusion 8. Since I had previously connected CF to IIS, when I launched the Web Server Configuration Tool it indicated that "localhost:cfusion" was hooked into IIS. I clicked that entry to select it, then clicked "Remove."

Next I clicked "Add," and after about 60 seconds the "Add Web Server Configuration" screen appeared. Choose the JRun Server you want to hook to Apache from the drop-down (if you have more than one), and choose "Apache" from the Web Server drop-down. Click the "..." box next to the "Configuration Directory" box and browse to your Apache conf directory, check the box "Configure web server for ColdFusion 8 applications," and MAKE SURE to check the "Configure 32 bit webserver" box. I don't know this for a fact, but I'm pretty sure Apache for Windows is 32-bit. So even though I'm on a 64-bit box, when I didn't check that box Apache wouldn't start. This could be because I need a different version of the JRun shared object ... who knows. Apache's running great so at least at this point I don't have much motivation to look into it.

Also, click on Advanced, click on the "..." box next to "Directory and file name of server binary," and point to your httpd.exe. This way CF can restart Apache after it modifies your Apache conf file.

That's it--pretty simple stuff. Delete the IIS entry, add one for Apache, and you're done.

Basic Apache Terminology

Before moving forward with the specifics of the configuration, if you're used to IIS terminology like "web site" and "virtual directory," you'll be happy to know all that stuff exists in Apache, but it's called something different and of course you'll be editing a config file instead of clicking through configuration wizards. I prefer the directness of the config file approach anyway, and I bet many others will too once they get the hang of it.

Here's the basic terminology mapping between IIS and Apache:

  • a "web site" in IIS is a VirtualHost in Apache
  • a "virtual directory" in IIS is an Alias in Apache
  • a "home directory" in IIS is a DocumentRoot (or docroot) in Apache
  • a "host header" in IIS is a ServerName or ServerAlias in Apache
  • a "default document" in IIS is a DirectoryIndex in Apache

That should cover about 99% of what you need to know if you're moving from IIS to Apache. Apache is tremendously powerful and highly configurable so of course you can get as deep into things as you need to, but that should get most people going.

Before digging into Apache, at a high level all I did to convert things over was to open up IIS Manager and make note of all my "web sites" and their home directories. These will become virtual hosts and docroots in Apache. Next, in each IIS site take a look to see if you have any virtual directories defined. If so, make note of these--they'll become Aliases in Apache.

With that basic information in hand you're ready to configure Apache.

Apache Configuration Files

One of the things I absolutely love about Apache is that you do all your configuration in configuration files. Once you get the hang of this approach, there's just nothing simpler than being able to open a file and make the changes you need instead of clicking through a mess of popup windows to find the one setting you need to change.

The two main configuration files that most people will need are httpd.conf and extra/httpd-vhosts.conf. These are both under the Apache conf directory. httpd.conf is the main configuration file where you set server-wide configuration details. You can actually shove everything in this one file, and that's how things were done in older versions of Apache, but it's much cleaner to keep things in different files and simply enable these additional files within the main configuration file.

I won't give you the full tour of httpd.conf since the docs do a very nice job of that, but I will go over what you'll likely need to edit in order to get things working the way most people want.

Going top to bottom in httpd.conf, the first thing you'll likely want to do is enable some modules. Specifically in this case since I know I'm going to be doing rewriting and proxying, I need to enable those modules since they're turned off by default. In the long list of LoadModule statements in httpd.conf, you'll want to uncomment (i.e. remove the #) these lines:

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule rewrite_module modules/mod_rewrite.so

Next, if you're doing CFML stuff you'll want to add index.cfm as a DirectoryIndex, so find this section and update accordingly:

<IfModule dir_module>
  DirectoryIndex index.cfm index.html
</IfModule>

You can have as many directory indexes as you want; just separate them with spaces and keep in mind that Apache tries them in the order in which they're declared.

Finally, you'll want to enable name-based virtual hosting so you can have multiple virtual hosts sharing the same IP address. Towards the bottom of httpd.conf, find this section and uncomment the Include directive that will load the virtual hosts configuration file. When you're done it should look like this:

# Virtual hosts
Include conf/extra/httpd-vhosts.conf

Save httpd.conf, and now let's take a look at how to configure your virtual hosts.

Virtual Host Configuration

Open up conf/extra/httpd-vhosts.conf so we can configure some virtual hosts. You'll be spending a lot of time in this file as you use Apache. First, make sure this line right after the big comment block at the top is uncommented:

NameVirtualHost *:80

This enables name-based virtual hosts for all IP addresses on port 80. Next you'll see a couple of examples of virtual hosts. You can either delete those or comment them out by putting a # on each line. I tend to leave them in there but comment them out for reference.

For your first virtual host, let's set one up for localhost because (at least in my experience) once you enable name-based virtual hosting, you have to have a virtual host even for localhost. Add the following section, adjusting the DocumentRoot as needed based on where you installed Apache:

<VirtualHost *:80>
  ServerName localhost
  DocumentRoot "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs"
</VirtualHost>

Save the file, and then restart Apache just to make sure all the changes we've made are working. If Apache doesn't restart don't panic, that just means you have a syntax error somewhere. Double-check everything and try again. If you can hit localhost in your browser and see "It works!", well, that message says it all I guess.

Note that if you have multiple IP addresses on your machine and want to tell a virtual host to use a specific IP, or if you want to run a site on a port other than 80, you can replace the * with an IP, and the 80 with whatever port you need.
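As a rough sketch of that, assuming a hypothetical address of 192.0.2.10 and a hypothetical port of 8080 (neither is from my actual setup), an IP- and port-specific virtual host on Apache 2.2 might look something like the following. Keep in mind that Apache only listens on the ports named in Listen directives, so a non-default port needs one, and in 2.2 the NameVirtualHost argument should match the one used in the <VirtualHost> tag:

```apache
# Hypothetical IP address and port, for illustration only
Listen 8080
NameVirtualHost 192.0.2.10:8080

<VirtualHost 192.0.2.10:8080>
  ServerName localhost
  DocumentRoot "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs"
</VirtualHost>
```

The stock httpd.conf already contains a Listen 80 line, which is why the port 80 examples here don't need one of their own.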

Next let's configure a more real-world virtual host. I'll be using foo.com as my example, and we'll want people to be able to hit the site using foo.com or www.foo.com. I'm also going to tell Apache to use a log file specific to this site to make diagnosing problems and doing reporting easier. There are a few other things in here that I'll explain in a moment.

<VirtualHost *:80>
  ServerName foo.com
  ServerAlias www.foo.com

  DocumentRoot "C:/path/to/foo"

  Alias /CFIDE C:/path/to/CFIDE
 
  <Directory "C:/path/to/foo">
    Order allow,deny
    Allow from all
  </Directory>

  CustomLog "logs/foo-access.log" common
</VirtualHost>

The ServerName and ServerAlias information is pretty self-explanatory--foo.com is the primary name for this virtual host, but with the alias of www.foo.com, either foo.com or www.foo.com will hit this virtual host.

DocumentRoot tells Apache where to find the files that it will be serving when someone hits this virtual host.

I threw an Alias in the mix simply to show how "virtual directories" (in IIS speak) work. Let's say in this case I want foo.com to have access to my CF administrator or maybe the JavaScript files that are stored in the CFIDE directory, but that CFIDE directory is not inside this host's docroot. The Alias directive tells Apache that when someone is requesting /CFIDE on this virtual host, those files will actually be served from somewhere outside the virtual host's docroot.

The <Directory> directive requires a bit of explanation. For security reasons, by default all directory access (other than the default localhost site) is denied by Apache. This is done in the main httpd.conf file, so you can either make the change there, or I prefer to do this on a case-by-case basis inside each virtual host. In the case of a public site you won't know where people are coming from so you have to tell Apache to allow access to that directory from anywhere, which is done with the "Allow from all" line. I left this out, but note that you will likely have to add a <Directory> entry for the C:/path/to/CFIDE directory as well.
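For completeness, here is a minimal sketch of what that extra entry for the aliased directory could look like, using the same Apache 2.2 access syntax and the same placeholder path as the Alias line in the example. It would go inside the same <VirtualHost> block:

```apache
# Grants access to the directory served by "Alias /CFIDE ..."
# (placeholder path; adjust to wherever CFIDE actually lives)
<Directory "C:/path/to/CFIDE">
  Order allow,deny
  Allow from all
</Directory>
```

If the CF administrator lives under that directory you may well want something tighter than "Allow from all" here, but that depends on your environment.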

Finally, I tell Apache to create an access log specific to this site instead of using the global Apache logs.

For a lot of virtual hosts that's literally all there is to it. But since what started this whole process was rewriting issues, let's take a look at some of the cool things you can accomplish (and shoot yourself with) by using mod_rewrite.

URL Rewriting and Proxying

For the app in question we do a lot of URL rewriting and proxying so we can give the users a single site that actually is comprised of multiple sites, potentially on different physical servers. This is also a great way to handle long-term migrations where you have a legacy server that you don't really want to touch but still need content from, and you want to add a newer server in the mix.

As with everything else related to Apache this is powerful stuff, but the basics are relatively simple. I do love this quote from the mod_rewrite docs, however:

"The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail."

Let's start with a basic rewrite rule, and then we'll look at proxying, which is what I have to do a lot of. Let's say for whatever reason in the foo.com virtual host you want requests to foo.html to actually hit bar.html. First we need to enable the rewrite engine in our virtual host, so inside your <VirtualHost> block, add this line:

RewriteEngine on

Next we add a simple RewriteRule to tell requests for foo.html to be rewritten to bar.html:

RewriteRule /foo.html /bar.html [NC]

The [NC] bit at the end stands for "no case," so that way both foo.html and FOO.HTML will be rewritten to bar.html. There are a ton of flags to do various things outlined in the docs, and if you want some nice rewrite examples they have those too.
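One thing worth knowing is that the first argument to RewriteRule is a regular expression, so an unanchored pattern like /foo.html matches anywhere in the URL path and the dot matches any character. If you only want to catch that one exact URL, a stricter variant (a sketch, not something from my production config) would anchor the pattern and escape the dot. I've also added the L flag, which tells mod_rewrite to stop processing further rules once this one matches:

```apache
RewriteEngine on

# ^ and $ anchor the match to the whole URL path; \. matches a literal dot
RewriteRule ^/foo\.html$ /bar.html [NC,L]
```

With this version a request for /some/dir/foo.html is left alone, whereas the unanchored rule would rewrite it as well.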

So far so good? Next let's tackle proxying. Instead of a simple rewrite from foo.html to bar.html, let's say you want everything under a particular directory to be proxied to another server. To make the example more concrete, let's say your company has an intranet on one server and an employee directory that runs on another server, but you want people to be able to access the employee directory directly from your intranet. If you wanted to do a simple redirect from http://intranet/empdirectory to http://empdirectory, that's simple enough:

RewriteRule ^/empdirectory(.*) http://empdirectory$1 [NC,R]

The (.*) after /empdirectory will include anything that comes after /empdirectory, and this is tacked onto the end of the remote URL via the $1. The "R" flag tells Apache to do a redirect for this RewriteRule, and you can even set the status code for the redirect. This does change the URL in the user's browser, however, so what if you didn't want that to happen? This is where proxying comes in.
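To illustrate the status code option: the R flag on its own sends a 302 (temporary) redirect, and you can append a code to change that. A minimal sketch of a permanent redirect for the same rule, with the L flag added so no later rules are applied to the request:

```apache
# R=301 sends "Moved Permanently" instead of the default 302
RewriteRule ^/empdirectory(.*) http://empdirectory$1 [NC,R=301,L]
```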

First, we change the "R" flag to a "P":

RewriteRule ^/empdirectory(.*) http://empdirectory$1 [NC,P]

Now we're proxying instead of doing a redirect (and note that mod_proxy needs to be enabled to use the P flag, which is why we did that earlier), but if this is all you do you may notice that the URL in the browser can still change. This happens when the back-end server responds with a redirect of its own: the Location header in that response points at http://empdirectory, and nothing is in place to rewrite it on the way back to the requestor. So we need to add a ProxyPassReverse directive, which adjusts those headers so we can hit http://intranet/empdirectory and keep that URL while the content is actually served from http://empdirectory.

RewriteRule ^/empdirectory(.*) http://empdirectory$1 [NC,P]
ProxyPassReverse /empdirectory http://empdirectory

With all this in place you can serve content from another server without your users knowing they're hitting another server.
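Putting the pieces together, a complete virtual host for the intranet example might look roughly like this. The DocumentRoot path and the log file name are placeholders I made up for illustration; the host names are the ones used in the example above:

```apache
<VirtualHost *:80>
  ServerName intranet

  # Placeholder path for the intranet site's own files
  DocumentRoot "C:/path/to/intranet"

  <Directory "C:/path/to/intranet">
    Order allow,deny
    Allow from all
  </Directory>

  RewriteEngine on

  # Proxy everything under /empdirectory to the employee directory server
  RewriteRule ^/empdirectory(.*) http://empdirectory$1 [NC,P]

  # Rewrite redirect headers from the back end so the browser stays on intranet
  ProxyPassReverse /empdirectory http://empdirectory

  CustomLog "logs/intranet-access.log" common
</VirtualHost>
```

This assumes mod_proxy, mod_proxy_http, and mod_rewrite have all been enabled in httpd.conf as described earlier.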

There are about a million and one other things you can do with mod_rewrite, but my only intent with this post was to share what I had to do in my specific move from IIS to Apache in the hopes it might help others who want to make this move.

Conclusion

Even though it was under duress, I'm honestly glad ISAPI Rewrite totally failed since that led me to setting up Apache on this box. After seeing ISAPI Rewrite have its various meltdowns I simply would not have felt comfortable using it. I'm sure I could have contacted support and gotten things figured out eventually, but it took me far longer to write this blog post than it did to switch to Apache, particularly since the rewrite syntax of ISAPI Rewrite is largely compatible with Apache's. I'm going to sleep much better at night knowing Apache is powering this app instead of being constantly worried that ISAPI Rewrite will have another meltdown.

I should have made this disclaimer at the beginning but I am in no way an Apache expert, so if there are different or better ways to do any of this, if anything is explained poorly or incorrectly, or if I omitted any important details, please comment.

Monday, February 22, 2010

Acrobat: Making the Simple Incredibly Obtuse for Fun and Profit


Although PDF documents are usually a finished product that completely embeds fonts and images, Adobe Acrobat Professional has extensive tools that allow you to directly edit and add new texts or multimedia to PDF files. You can not only insert sounds, videos and flash files, but you can also easily add images to any part of your PDF. Here is a step-by-step guide.



Sharing this mostly so I have a convenient place to look this up the next time I need to do this. It is beyond comprehension why such a simple task is so stupidly and unnecessarily obtuse in Acrobat.

The worst part about this is that the default print settings in Reader don't include stamps, so if you send a PDF with images in it, the images won't print by default.

There's gotta be an easier way but knowing Acrobat, there probably isn't.

Sunday, February 21, 2010

Changing the Host Name on CentOS

I'm working on a project that is leveraging a third-party Java library to handle payment processing. This library is a for-pay product and the license is tied to the machine name, so the original license was being used on a local development box but now that we want to move things to the production server, the license wouldn't be valid since the production server's host name isn't the same as the local development box.

Apparently the company who sells the payment processing library can update the license file, but it seemed even easier to simply change the host name of the production server to match what the license is expecting. This way we can move the payment processing functionality to a different machine as needed without having to wait for a new license key to be issued.

The production server in this case is CentOS, so to change the host name you simply update /etc/sysconfig/network with the new host name and reboot.

This is slightly different from Ubuntu, which stores the host name in /etc/hostname. On Ubuntu you can also use the hostname command to change the host name temporarily, but it will revert to the value in /etc/hostname when you reboot.

Saturday, February 20, 2010

Resolving CSS Issues With Grails UI Plugin

I'm working on another Grails application and am using the fantastic Grails UI plugin for a lot of the UI controls. Grails UI is a really nice Grails-friendly wrapper around the YUI components and includes things like a dialog box, calendar controls, a rich text editor, and a whole lot more. This was my first real foray into using this plugin, so I started with a simple modal dialog box that would show the contact information details for people in a simple list.

The main point of this post is to outline the simple resolution to the CSS issues I was seeing because it took me a while to figure out what was going on, but I thought I'd outline some Grails and Grails UI magic along the way.

First, in order to use the Grails UI plugin you of course have to install it, which is as simple as:

grails install-plugin grails-ui

Next, on any view page on which you wish to use any Grails UI resources, you have to indicate which resources you're going to use on the page. The nice thing about this is it will only load the JavaScript for the specific UI resources you need on each page. In the case of this example I'm only using a dialog box, so I have this line in the head section of my view page:

<gui:resources components="dialog" />

Also in my head section I need to tell Grails I'll be using some AJAX on this page, so I use the javascript tag to load the Prototype library:

<g:javascript library="prototype" />

With Prototype loaded, pulling the contact details to be shown in the dialog box is dead simple using the Grails remoteLink tag:

<g:remoteLink controller="person"
               action="showDetail"
               id="${person.id}"
               update="personDiv"
               onComplete="showPersonDialog();">${person}</g:remoteLink>

If you're not familiar with Grails, this tells Grails to make an AJAX call to the person controller, call the action showDetail, and pass the ID of the person object. We'll see what's returned by the AJAX call in a moment. The update attribute of the remoteLink tag tells Grails what DOM object to update with the results of the AJAX call, and the onComplete attribute indicates a JavaScript function to call when the AJAX call is complete.

If all I was doing was updating a DIV on the page I wouldn't need this, but since I need to show the Grails UI dialog box after I pull the contact details, I need a JavaScript function to handle that, so I added this to the head section of the view page:

<script type="text/javascript">
    function showPersonDialog() {
        GRAILSUI.personDialog.show();
    }
</script>

Next, let's check out the showDetail action in my Person controller to see what it's doing when the AJAX call is made:

def showDetail = {
    def personInstance = Person.get(params.id)
    if (!personInstance) {
        def message = "No person found with ID ${params.id}"
        render(view:"personDetail", model: [message : message])
    } else {
        render(view:"personDetail", model: [personInstance : personInstance])
    }
}

Here's the simple view that's rendered in the controller action above:

<g:if test="${message}">
    <p>${message}</p>
</g:if>

<g:if test="${personInstance}">
    <p>
        <strong>${personInstance.firstName} ${personInstance.lastName}</strong><br />
        ${personInstance.email}<br />
        Phone: ${personInstance.phone}<br />
        Cell: ${personInstance.cell}
    </p>
</g:if>

And finally, here's the code for the Grails UI dialog box and the DIV that is populated with the view above:

<div class="yui-skin-sam">
    <gui:dialog id="personDialog"
                width="400px"
                title="Contact Details"
                draggable="true"
                update="personDiv"
                modal="true">
        <div id="personDiv"></div>
    </gui:dialog>
</div>

This is all pretty straightforward. What was happening, however, is when the dialog box was shown, there was no CSS being applied to it. The YUI components that are used by the Grails UI plugin have stylesheets associated with them, but when I checked the source code of the rendered page they seemed to be getting included just fine, and as you can see above I wrapped the dialog box in a div with the correct CSS class, which is yui-skin-sam.

At this point it's important to remember that when a Grails page is rendered it uses SiteMesh, which is basically a templating/page decoration framework. As many Grails applications do, I was using a main.gsp layout page, and each individual view page gets woven into this main template.

Therein lies the problem. As I said above this was simple enough in the end but since it took me a while to figure out I thought I'd share. Even though the YUI CSS was being included in the individual view page with the dialog box code on it, for some reason the CSS wasn't getting applied. I decided to experiment and put the yui-skin-sam class in the body tag in my main.gsp layout page, and this solved the problem.

In my case I didn't have any conflicting CSS involved so this solution didn't cause any issues, but if you have other CSS involved and applying a class to the body tag in the main layout page will cause issues, you can add the additional CSS references after the <g:layoutHead /> tag in the main layout page, and this will allow you to override any CSS that came earlier.

With all this in place the Grails UI components are being styled correctly and they're extremely nice additions to any Grails app.

Wednesday, February 10, 2010

ThirstyHead: Free Webinar: Getting Started with Groovy, Grails, and MySQL (February 18, 2010)





On February 18, 2010, join Scott Davis on a Sun/Oracle sponsored webinar: Getting Started with Groovy, Grails, and MySQL. We'll spend some time working with MySQL from Groovy 1.7 scripts (Sql.eachRow(), Sql.withBatch()). Then we'll switch gears and show you how easy it is to skin a MySQL database with Grails 1.2. Hope to see you there!



If you're interested in Grails this would be a great one to attend. Scott's a great presenter!

Monday, February 8, 2010

Windows Server 2003 Security and Files From Other Computers

Another day, another idiotic Windows "security" feature. I'm setting up several new Windows 2003 VMs, so rather than download all the necessary installation files to each machine, I'm copying them from the first one I set up. After mapping a drive and copying some files from the first VM to another VM, I tried to run the Tomcat installer and got the following error:

"Windows cannot access the specified device, path, or file. You may not have the appropriate permissions to access the item."

Even given Windows' stupid notion of what being an "administrator" means (on GNU/Linux either you're root or you're not, which makes perfect sense to me ...), this was a new error to me. Luckily it's easy to work around.

Right-click on the file in question and go to "Properties." At the bottom of the "General" tab you'll see a note next to a "Security" header that reads, "This file came from another computer and might be blocked to help protect this computer." Click the "Unblock" button next to this message and you can execute the file.

Thanks for looking out for me, Windows. Really appreciate it.

InfoQ: Getting Started with Grails, Second Edition - FREE!


Grails is a Java- and Groovy-based web framework that is built for speed. First-time developers are amazed at how quickly you can get a page-centric MVC web site up and running thanks to the scaffolding and convention over configuration that Grails provides. Advanced web developers are often pleasantly surprised at how easy it is to leverage their existing Spring and Hibernate experience.


"Getting Started with Grails" brings you up to speed on this modern web framework. Companies as varied as LinkedIn, Wired, Tropicana, and Taco Bell are all using Grails. Are you ready to get started as well?




The second edition of "Getting Started with Grails" is now available for free on InfoQ, or you can buy the print version for only $22.95. The first edition was great so I'm really looking forward to reading the update.