Thursday, October 29, 2009

Damien Katz: Koala on the loose!

Ubuntu 9.10 Karmic Koala has just been released. This is big news as this version includes Apache CouchDB, used as a replicable database by desktop apps. This means CouchDB will be on over 10 million desktops. Nice :)

WOW! Didn't realize CouchDB was standard in Ubuntu now. Very cool. I'll have to look more into how exactly it's being used. Congrats to the CouchDB team!

Ubuntu 9.10 = easiest, cheapest upgrade ever | Education IT

The point is that I don’t need to budget for this upgrade. I don’t need to obtain volume licenses and decide where to deploy them. I don’t need to do anything except click the Easy, errr, Upgrade button. Even if we had to pay for it and properly license it, wouldn’t it be slick if we could open Windows Update in XP or Vista, choose an optional OS upgrade, enter our volume license key, and then walk away?

Of course, it would be even slicker if it was free.

Every time I *don't* have to hunt around for license keys, worry about if deploying another VM will put me over a volume license limit, etc., etc., etc., I'm so glad I'm using free software for the vast majority of what I do.

The Silent Number: Top things to do after installing Ubuntu Linux 9.10 Karmic Koala

This list of the top things to do immediately after installing your newly acquired copy of Ubuntu doubles as a general list of great software to try out and use, complete with links to any special instructions on how to set them up, Terminal commands for those who prefer a command-line interface (CLI), and when available, personal package archives (PPA), repositories to keep the applications at their newest version, not just the security updates provided for you by default. Feel free to pick and choose; enjoy!

Great "to do" list after installing Karmic. Even if you consider yourself a seasoned Ubuntu user check this out--there might be some apps in the list you weren't aware of!

Nadir Of Western Civilization To Be Reached This Friday At 3:32 P.M. | The Onion - America's Finest News Source

Experts predict that the penultimate catastrophe will occur at approximately 7:15 p.m. Thursday night, when the social networking tool Twitter will be used to communicate a series of ideas so banal they will instantaneously negate the three centuries of the Renaissance.

This would be funny if it weren't true.

Why can't some people make Windows 7 work?


I don't speak Japanese, but "fail" is the same in any language.

Wednesday, October 28, 2009

Sequoia To Publish Source Code For Voting Machines

"Voting machine maker Sequoia announced on Tuesday that they plan to release the source code for their new optical-scan voting machine. The source code will be released in November for public review. The company claims the announcement is unrelated to the recent release of the source code for a prototype voting machine by the Open Source Digital Voting Foundation. According to a VP quoted in the press release, 'Security through obfuscation and secrecy is not security.'"

Not that I care about Diebold, er, Sequoia releasing their source code because if anything ever needed a ground-up grassroots effort it's voting machine software, but has this been a great week for FLOSS FUD killing or what?

First we get the White House moving to Drupal, then we have the Department of Defense stating they prefer open source because it's more flexible and secure, and now this. Very cool.

Personally I think this is a pre-emptive PR move on Diebold/Sequoia's part because of the previous announcement by the Open Source Digital Voting Foundation, so it remains to be seen if they ever actually open source the code. All they say is they have "plans to" do so.

Tuesday, October 27, 2009

powdermonkey: Department of Defense New Guidance on Open Source Software

The DoD CIO office (or ASD-NII) just has posted new open source software guidance for the whole Department of Defense! Only took about 18 months to get through, so worth it. Hopefully this puts the FUD to bed.

If it's secure enough for the Department of Defense ...

I also love that they're embracing the idea of open source as a way to better anticipate new threats. All one needs to do is look around to see that proprietary software has far more exploits than open source software, but that's still a common point of FUD.

OpenBlueDragon releases 1.2

This release has a few notable extras:

  • Plugin Interface overhauled enabling full SpreadSheet support

  • Full Nirvanix Cloud Files Support

  • Core Amazon SQS Support Functions

  • Updated Amazon SimpleDB support for their query language

  • Java 6 support

  • Query Slow Log Monitoring

  • +tons of smaller bug fixes

Great stuff in 1.2, and we're settling into a nice 6-month release cycle, but of course you get the latest stuff in your hot little hands every night with our nightly builds.

Make sure and check out the release notes for all the details on this release!

Monday, October 26, 2009

Why Adobe Needs to Support Linux

I just got back from the wonderfully educational and inspiring BFlex/BFusion conference in Bloomington, IN, and I want to thank Bob Flynn for putting on another great conference and for his tremendous hospitality to Team Mach-II while we were in town. Save us a spot for next year!

BFlex is designed to be all hands-on sessions, some 90-minute and some full-day. This is a great idea and one which other conferences could stand to emulate, at least in a small dose. The hands-on sessions allow participants to bring their own laptops and get concrete experience in CFML and Flex as opposed to simply watching slides and hoping they remember how to do everything when they get back home.

This model is not without its dark side, however, and the hell of hands-on sessions can be summarized in one word: installation.

My hands-on session this year was entitled "Building and Deploying CFML on an Open Source Stack," and as I thought through all the bits and pieces attendees would have to install to get any value out of the session, I saw my 90 minutes being completely consumed by installation and configuration issues. Rather than chance it I decided that I would only expect attendees to install VirtualBox (because it's free and runs on Linux, Mac, and Windows), and I would give them a Linux VM with everything pre-configured. That way we could focus on playing with the open source stack instead of dealing with installation issues in the limited time we had in the session.

So I created my VM for the session, put it on a few thumb drives, and as I was doing some intro slides before the hands-on portion people installed the VM on VirtualBox and I hoped for the best. I expected at least a couple of people to have problems with the setup but to my amazement there were literally ZERO problems with the VirtualBox VM. By the time I was done flapping my gums everyone was ready to get their hands dirty.

The next morning I was a TA for one of the all-day Flex classes, and to call the configuration problems a bloodbath would be being kind. Trying to get a room packed full of people to get BlazeDS up and running when they all have different operating systems and hardware is a nightmare of epic proportions. Between the installation and configuration problems, not to mention some problems with the code files for the course, by the time lunch hit the class still couldn't proceed. Half of the full day class was gone, which put a serious dent in the amount of learning that otherwise would have happened.

My experience handing everyone a pre-configured VM versus a room full of people trying to do a native install of a bunch of software was a major eye-opener for me. It's a simple idea, and it's the reason training centers use pre-configured workstations. Having people, even technical people, bring their own machines and install everything necessary for a course is a recipe for disaster.

So what does this have to do with Linux?

The VM idea is great and worked better than anyone had hoped. So why not do the same thing for the all-day courses?

Because that would be breaking the law.

Here's the problem. I could hand everyone in my session a VM because it was using Ubuntu and all open source software. Heck, I even made the VM downloadable for people who were attending a different session in my slot or who weren't at the conference.

Adobe software, at least the apps that would be used in training courses, only runs on Windows or Mac. Mac licensing prohibits OS X VMs from even being created, and while you can create Windows VMs, the proprietary license prohibits distributing them. Given the platforms on which Adobe software runs, Windows would be the only choice for a courseware VM, but Windows licensing rules that out.

Which leads us straight back to installation and configuration hell.

At this point you might be thinking that pointing to training as a reason for Adobe to support Linux is a weak argument. Adobe has to worry about the overall market for sales of their products, after all, not just training scenarios.

I'll address that notion by asking a question: after struggling for several hours to get things up and running, and in some cases still failing, what does this do to course attendees' opinion of Flex? I doubt many people were hurriedly finishing their lunch to return to another installation torture session, and I'd be willing to bet it left a seriously bad taste in the mouths of beginners in the room (and this was a beginner class after all) who will see Flex as well beyond their reach after such a hugely negative experience.

Being able to run trial versions of Adobe software on a free operating system opens up a huge number of doors in my opinion. I truly hope that Bob and crew will find a way to do everything with VMs for BFlex next year; perhaps there are "lab" licenses available from Microsoft for just this purpose that expire after 30 days or something along those lines. Not ideal compared to a VM students could refer to long after the course is over, but at least it's something.

Using a free OS, however, would eliminate all the license issues that render trainers helpless and people can focus on learning instead of slamming their heads against installers and Java environment settings and all the rest of the nonsense that gets in the way of what they're trying to do.

I realize this is wishful thinking, but after having such a positive experience with free software and seeing the abysmal experience people had due solely to operating system licensing issues, all it did was convince me even more that free software is the way to go. Maybe Adobe will see the light someday as well.

Massive CouchDB Brain Dump

The following is a semi-unorganized brain dump of everything of interest I've come across while learning the incredibly cool CouchDB document-oriented database system. In this brain dump I pull things from many different resources including my own head, so there may be literal quotes from some of these resources without inline attributions. For that I apologize, but rest assured I'm not trying to plagiarize anyone or take credit where it's not due; I was merely taking notes as I perused a lot of different resources and organized them in a way that made sense to me. I do have a complete list of all of the resources I used at the end to let you explore on your own. Again, my apologies to the creators of the resources from which I pull for not attributing inline.

I'll be presenting CouchDB to the ColdFusion Meetup on December 17 (Charlie did a great job of booking a full schedule through the end of the year) so don't miss it!

CouchDB: General Concepts

  • document-oriented
  • schemaless
  • JSON-based
  • REST-based
  • MapReduce
  • basically throw out everything you know about databases and you'll pick this up a lot faster
  • Calls are made to the database via HTTP
    • Yes, that means via a browser, curl, cfhttp ... anything that talks HTTP
  • Responses come back as JSON
  • Lock-free design--reads don't have to wait for writes or other reads
  • Why are these good things?
    • More like how data works in the real world
      • e.g. business cards--if one has a fax and another doesn't, in an RDBMS you have to have a fax field that's going to be null for anyone who doesn't have a fax
      • with CouchDB, you can have one record with a fax field, and another with no fax field, but they're both considered business cards since every document in CouchDB is 100% self-contained
    • Simple
      • some of this stuff is so simple you'll be amazed there isn't more to it
    • Fast
      • Push/Pull of JSON data over HTTP
      • No messy, time-consuming joins between tables--a document contains all its data
    • Scalable
    • Takes the object-relational mismatch out of the picture to a certain extent
    • It works like the web does
  • History of CouchDB
    • development started in 2004
    • originally written in C++, now in Erlang
    • Damien Katz quit his job and self-funded development full-time for 2 years
      • was formerly with IBM working on Lotus Notes, also a brief stint at MySQL
    • Damien Katz is now a full-time employee at IBM and gets paid to work on CouchDB full time
    • CouchDB is a top-level Apache project and is released under the Apache 2.0 license
  • CouchDB's motto: RELAX.
    • we shouldn't have to worry about data so much
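
To make the business card example above concrete, here's a minimal sketch in JavaScript (the language CouchDB views are written in). The field names, the IDs, and the type field are assumptions I made up for illustration; CouchDB doesn't require any of them.

```javascript
// Two "business card" documents; only one of them has a fax number.
// Field names (type, name, phone, fax) are invented for this example.
const cardWithFax = {
  _id: "card-alice",
  type: "businessCard",
  name: "Alice Example",
  phone: "555-0100",
  fax: "555-0101"
};

const cardWithoutFax = {
  _id: "card-bob",
  type: "businessCard",
  name: "Bob Example",
  phone: "555-0200"
  // no fax field at all, rather than a null column
};

// Application code simply checks whether the field is present.
function describeCard(card) {
  const fax = "fax" in card ? ", fax " + card.fax : "";
  return card.name + ": phone " + card.phone + fax;
}

console.log(describeCard(cardWithFax));
console.log(describeCard(cardWithoutFax));
```

Both objects would be stored as-is; neither needs a placeholder for the field the other one has.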

Why Use CouchDB?

  • throws out the relational model and looks at what matters with data in the majority of applications
  • vastly simplifies data modeling and interaction with data
  • extremely flexible since there are no preset schemas
  • in relational databases, data does get to a point where it's unwieldy and slow to access
  • relational model is hard to scale, and doesn't do so very naturally or quickly
  • CouchDB offers ...
    • robust, dead simple replication to any number of servers
    • bi-directional conflict detection and resolution
    • fantastic performance on huge databases
      • number of records in a database has very little impact on performance
    • fantastic scalability
      • Erlang was designed for real-time telecom apps in the 1980s, so it's ideal for high scalability and highly concurrent apps like database servers
      • early testing with CouchDB shows it can handle 20,000 concurrent connections with no problems, and they haven't even done any performance profiling yet
        • lead developer said in an interview that using conventional threading in C++ you'd be lucky to handle 500 concurrent connections
      • Erlang also can help with multi-machine scalability, failover, etc. but CouchDB is not taking advantage of any of this yet
  • CouchDB speaks the language of the web
    • REST, HTTP, and JSON are how CouchDB works natively
  • already gaining a huge amount of traction and becoming very popular

Is the Relational Model Dead?

  • there is an increasing indication that the relational model will begin being seen as a solution, not the solution
  • Map/Reduce is simply a better model for dealing with large datasets and taking advantage of parallel processing
  • "Map/Reduce will kill every traditional data warehousing vendor in the market. Those who adapt to it as a design/deployment pattern will survive, the rest won't."
    • Might think this came from a non-relational database vendor, but it's actually from Brian Aker, one of the original authors of MySQL and currently working on the Drizzle fork of MySQL
  • document-based databases like CouchDB scale far better and easier than relational databases do
    • both Amazon and Google came up with their own database solutions for their cloud computing platforms as opposed to using a traditional RDBMS--this should tell you something
  • Better and more natural fit for applications

More on "Better Fit for Applications"

  • self-contained documents
    • no more taking a real world construct and deconstructing it into a relational model
  • flexible schemas
    • two documents can be of the same type and not contain the same fields--don't have to have a bunch of nulls involved, worry about foreign keys, etc. etc. since every document is self-contained
    • if you need a change in your schema, it's dead simple to do--just start using the new schema
      • if you don't care that the old documents don't have the new field, you don't have to worry about them
  • speaks our language as application developers
    • REST and JSON--doesn't get much simpler than that
  • since it's all web based you take advantage of the following at the database level
    • can handle more traffic since connections aren't left open
    • clustering, proxying, caching, security, etc. behaves just as it would with an HTML document
  • Creator of CouchDB said one of the goals was to have users feel like "you could touch your data ... like it was right there in your hands"
    • eliminating all the layers between your application and your data

Relational Model vs. Document-Based aka "Key/Value Store" Databases

  • relational diagram


  • key/value diagram


  • CouchDB pros
    • ideally suited for cloud computing
    • more natural fit with the code we write -- no ORM mismatch nonsense to worry about
  • CouchDB cons
    • relational databases enforce integrity at the database level
    • schemaless nature of CouchDB means your data integrity is handled at the APPLICATION level
      • bugs in application code using RDBMS don't lead to data integrity issues
      • bugs in application code using CouchDB CAN lead to data integrity issues
      • really this just puts this concern in a different place, but it's something to be aware of
    • no shared standards between key/value database vendors
      • much easier to move from SQL Server to MySQL than it would be to move from CouchDB to Amazon or Google
  • other concerns with cloud databases ...
    • limitations on analytics -- e.g. Amazon queries cannot take longer than 5 seconds to run
    • limitations on data returned -- e.g. Google queries cannot return more than 1000 rows
  • from an application development standpoint, in my experience thus far this does bring your data repository a bit more into the realm of your application
    • again, this isn't SQL, so your code isn't running queries and dealing with query objects; instead it's making HTTP calls and dealing with JSON
    • less friction between your app and your data, but be aware that it's a bit of a whole new world when you're working with CouchDB

Should You Consider CouchDB?

  • yes if you ...
    • have tables with lots of columns of which you typically only use/display a few
    • have lots of joins in your queries
    • are serializing JSON or XML data into single columns in your relational database
    • have data that is more hierarchical or flat than it is relational
    • have systems that require frequent schema changes
    • are reaching the performance capacity of a single database server and need to scale out
    • have an amount of data that is difficult for a single server to hold
    • have background processes running on your database that impact performance of the database as a whole
  • the nice thing about CouchDB is that it's highly and easily scalable by its very nature
    • but if you don't need scalability now, you don't have to worry about it; you just get it when you need it practically for free
  • Why not just dump JSON data into a relational database?
    • because RDBMSes don't know anything about JSON, so you don't get any of the huge efficiency and functionality advantages you get with CouchDB

Other Document-Based Databases

Building/Installing CouchDB

  • basic requirements: Erlang, SpiderMonkey (JavaScript engine), other miscellany
  • on Linux you'll need to install some prerequisites/dependencies if you don't have them; here's the list for Ubuntu ...
    • sudo apt-get install subversion
    • sudo apt-get install libtool
    • sudo apt-get install automake
    • sudo apt-get install libmozjs-dev
    • sudo apt-get install libicu-dev
    • sudo apt-get install curl
    • sudo apt-get install libcurl4-gnutls-dev
    • sudo apt-get install erlang-dev
    • sudo apt-get install erlang-nox
    • sudo apt-get install openssl
    • sudo apt-get install libssl-dev
      • double-check you have openssl and libssl installed; otherwise you may not get an error until you first try to run CouchDB
  • alternatively you can try ...
  • then grab the code and build it (do this from your home directory or wherever you like)
    • svn co couchdb
    • cd couchdb
    • sudo ./bootstrap
      • You should see "You have bootstrapped Apache CouchDB, time to relax." If not, fix any dependency issues it lists (the error messages are very explicit).
    • sudo ./configure
      • You should see "You have configured Apache CouchDB, time to relax."
    • sudo make
    • sudo make install
      • if you don't get any errors with make or make install you should be able to launch CouchDB!
  • Check for information about installing on Windows. Haven't tried this myself, likely won't, so best of luck.

Running CouchDB

  • on Linux the install process puts the couchdb script in your path, so you can open a terminal and type sudo couchdb
    • you should see:
      Apache CouchDB has started. Time to relax.
      [info] [<version_number>] Apache CouchDB has started on
    • If you see an error along the lines of {"init terminating in do_boot",{undef,[{crypto,start,[]} ... that means you don't have erlang-nox and/or libssl-dev installed, so you'll have to go back through the steps above once you have those dependencies resolved.
    • If you have other errors when trying to start CouchDB check

Interacting with CouchDB with CURL

  • some basic examples
    • curl -X GET http://localhost:5984/
      • returns basic server info
    • curl -X GET http://localhost:5984/_all_dbs
      • returns list of all databases on the server
    • curl -X PUT http://localhost:5984/contacts
      • creates a new database called contacts
    • curl -X PUT http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af -d '{"firstName":"Matt", "lastName":"Woodward", "email":""}'
      • creates a new document in the contacts database; the string after the database name is a UUID
    • curl -vX PUT "http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af/headshot.jpg?rev=2-2739352689" --data-binary @headshot.jpg -H "Content-Type: image/jpeg"
      • attaches a headshot jpeg to the document with the ID provided
    • curl -X GET http://localhost:5984/_uuids
      • CouchDB returns a new UUID; can add ?count=N to get back N UUIDs if you need more than one
    • curl -X GET http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af
      • returns the document with the UUID provided
    • curl -X DELETE http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af?rev=2-212344
      • deletes the document with the ID provided; note that you must provide the latest revision number for the document in order for the delete to succeed
    • curl -X DELETE http://localhost:5984/contacts
      • deletes the contacts database
    • curl -X POST http://localhost:5984/_replicate -d '{"source":"contacts","target":"contacts-replica"}'
      • replicates the contacts database to the contacts-replica database
  • of course since this is just HTTP, you can use CURL's -v flag to get a verbose listing of everything CouchDB is doing on each request
  • performing updates is a bit different
    • if you do a PUT of a document with the same ID but don't include a revision number, the update will fail
    • updates only succeed when they include the latest revision number CouchDB has for the document
    • in practice this means you GET the document you want to update, change the JSON (or the data in your application code), and then PUT the whole document back; since the document you pulled contains the most recent _rev, the update goes through
    • the updated document gets a new revision number, and the original document is retained in CouchDB as a previous revision
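
The update rules above can be sketched with a toy in-memory store. This is only an illustration of the revision check, not CouchDB itself: createStore, put, and get are hypothetical names, and the revision strings are a simplified stand-in for real ones like 2-2739352689. The 201 and 409 status codes mirror what CouchDB returns over HTTP for a successful write and an update conflict.

```javascript
// Toy stand-in for a CouchDB database, used only to show the revision check.
// Revisions here are "<generation>-<counter>", a simplification (assumption).
function createStore() {
  const docs = new Map();
  let nextToken = 1;

  function put(id, doc) {
    const current = docs.get(id);
    // An update must carry the revision that is currently stored.
    if (current && doc._rev !== current._rev) {
      return { status: 409, error: "conflict" };
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const rev = generation + "-" + (nextToken++);
    docs.set(id, Object.assign({}, doc, { _id: id, _rev: rev }));
    return { status: 201, ok: true, id: id, rev: rev };
  }

  function get(id) {
    const doc = docs.get(id);
    return doc ? Object.assign({}, doc) : null;
  }

  return { put: put, get: get };
}

const store = createStore();
const created = store.put("matt", { firstName: "Matt" });   // first write, no _rev needed
const stale = store.put("matt", { firstName: "Matthew" });  // no _rev supplied: rejected
const fresh = store.get("matt");                            // pull the document back
fresh.firstName = "Matthew";
const updated = store.put("matt", fresh);                   // carries the latest _rev

console.log(created.status, stale.status, updated.status, updated.rev);
```

The second put fails because it doesn't say which revision it's based on; the third succeeds because the fetched document brings its _rev along.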

Versioning of Documents

  • CouchDB uses a multi-version concurrency control (MVCC) system
  • each document in CouchDB gets a revision number
  • previous versions of documents are saved in CouchDB
    • BUT ... unlike a version control system, there is no guarantee how long the previous versions will be retained
    • you can tell CouchDB you want to retain the previous versions of a document if you need to
  • remember that all communication with CouchDB is done over HTTP
    • HTTP is stateless--you open a connection to CouchDB, make a request, then the connection is terminated
      • this is good because it means CouchDB can handle a lot of traffic since connections are short-lived
  • If you're familiar with ETags in the HTTP world, CouchDB uses its revision numbers as ETags in HTTP responses
    • ETags are very useful for caching
    • since all documents in CouchDB are really just resources in the HTTP/REST sense, your data behaves like any other HTML resource

Futon: CouchDB's Web-Based Interface

  • browse to http://localhost:5984/_utils for the web-based interface to CouchDB
  • handy way of perusing your documents, managing databases, etc.
  • definitely handy for creating design documents and views
    • mini editor for creating temporary and permanent views--can execute temporary views from within the editor
  • can kick off replication and a ton of other stuff from Futon

Creating a Document

  • new documents have _id and _rev fields added automatically
  • documents are versioned much like code is in SVN, so each update creates a new revision (but as noted under Versioning of Documents, older revisions aren't guaranteed to stick around)
  • click "add field" in Futon to add a new field to a document
  • double-click value (default is null) to edit
  • values must be JSON valid data
    • strings have to have quotes around them, e.g. "hello" not just hello
    • valid datatypes are string, number, boolean, null, list (array), and key/value dictionary (object)
  • you can do a "view source" on a document from Futon to see the JSON version of the document
  • as you update a document, the version number will change with each revision
    • if another process changes the document before you save your changes, a conflict will arise
  • CouchDB has no concept of "types"
    • e.g. in a blog application we would think of "posts" and "comments" as types
    • remember that CouchDB is schemaless, so there is no inherent structure to documents contained within the database itself
    • common to use a type field on a document containing a string that defines the type
      • makes it easy to write a view that pulls back specific document types
      • CouchDB does NOT CARE what field you use to define type--you can call this anything because again, CouchDB has no concept of document types
    • remember also that even if you define a document as a type, it does NOT have to literally match the structure of other documents with that same type
      • e.g. music library--could define "album" as a type, and if one album has a year field and another doesn't, they're both still "album" types since we defined the type explicitly
      • BUT, if you do want to require a specific structure for a document type, you can do that with validation functions and, e.g., reject an addition or update of an album to a music library if it didn't contain a year field
    • also handy to infer type based on fields for more flexibility
      • e.g. in a blog app we could use if (doc.title && doc.body) and assume if those fields are present that this is a blog post as opposed to a comment
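
The validation and type-inference ideas above might look something like the sketch below. The (newDoc, oldDoc, userCtx) signature and throwing an object with a forbidden property follow the shape CouchDB uses for validate_doc_update functions; the function names, the error message, and the isAccepted helper are assumptions for trying it out outside CouchDB.

```javascript
// Validation function in the shape CouchDB expects for validate_doc_update.
// The "album" type and "year" field come from the music library example.
function validateAlbum(newDoc, oldDoc, userCtx) {
  if (newDoc._deleted) {
    return; // let deletions through
  }
  if (newDoc.type === "album" && newDoc.year === undefined) {
    throw { forbidden: "Albums must have a year field." };
  }
}

// Inferring a type from the fields present instead of an explicit type field.
function looksLikeBlogPost(doc) {
  return Boolean(doc.title && doc.body);
}

// Hypothetical helper for exercising the validation function locally.
function isAccepted(doc) {
  try {
    validateAlbum(doc, null, {});
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isAccepted({ type: "album", title: "Kind of Blue", year: 1959 }));
console.log(isAccepted({ type: "album", title: "Untitled" }));
console.log(looksLikeBlogPost({ title: "Hello", body: "First post" }));
```

Documents of any other type pass straight through, so the structure is only enforced where you ask for it.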

How Documents Are Not Like Database Records

  • self-contained--no joins across tables to put together a single record
    • documents in CouchDB map directly to an object instance in your application
  • typically documents will automatically have authors and publish dates associated with them
    • very easy to publish documents of any type in the future
    • if you create user accounts for CouchDB it automatically keeps track of who created and modified records
  • don't break documents into smaller units than you need to!
    • a blog post will have an author--don't have the author be a separate document
  • JSON document format
    • CouchDB documents all have _id and _rev fields
      • _id can be anything, so long as it's unique--UUID, plain old string, whatever
      • _rev is the revision number--this changes with each update to a document
        • to update a document, you have to provide the most recent value of _rev so CouchDB knows you're working with the latest revision
    • if users have been configured, documents will have an author field
    • Do I really look like a guy with a plan? You know what I am? I’m a dog chasing cars. I wouldn’t know what to do with one if I caught it. You know, I just… do things. The mob has plans, the cops have plans, Gordon’s got plans. You know, they’re schemers. Schemers trying to control their little worlds. I’m not a schemer. I try to show the schemers how pathetic their attempts to control things really are.

      The Joker, The Dark Knight (this quote is used in the forthcoming O'Reilly CouchDB book)
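
As a quick illustration of the "don't break documents up" advice above, here's a sketch of a blog post stored as one self-contained document with its author embedded. The _id, the _rev value, and all of the field names are made up for the example.

```javascript
// A blog post as one self-contained document (all values are illustrative).
const post = {
  _id: "my-first-post",
  _rev: "1-2739352689",
  type: "post",
  title: "Hello, CouchDB",
  body: "Time to relax.",
  author: { name: "Matt Woodward" },
  tags: ["couchdb", "json"]
};

// The document round-trips through JSON unchanged and maps straight onto an
// object in application code; no join is needed to find the author.
const asSent = JSON.stringify(post);
const asReceived = JSON.parse(asSent);

console.log(asReceived.author.name, asReceived.tags.length);
```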

Running Queries

  • again, forget everything you know
  • there is no SQL here
  • instead of running queries in the traditional sense, data is filtered using map and reduce functions, which are written in JavaScript
  • the map and reduce functions combined create a CouchDB view
  • views are stored as rows sorted by key
    • extremely efficient even for millions of records
  • can create temporary views for testing, but these are rather inefficient, so views that are going to be used regularly are stored in the database as documents
    • once a view is stored in the database as a document, CouchDB indexes behind the scenes for efficiency
  • main points:
    • map functions allow you to sort your data using any key you choose
    • CouchDB is designed to provide extremely fast access to data by key and key range
    • you don't really run queries against CouchDB, you query a view
    • when you query a view, CouchDB runs the map function against every document in the database in which the map function is defined
  • map functions have a single "doc" parameter which is each individual document in your database
  • the emit() function is used to spit out matching documents, and you can specify the fields you want to output
  • if you're querying every document in the database every time, isn't that inefficient?
    • you'd think so, but no
    • CouchDB only runs through all the documents the first time the view is queried
    • as documents change, CouchDB only has to update what's changed
    • everything is stored in a B-Tree, which is very efficient
  • creating multiple views specific to how you want to access the data helps with efficiency
  • to execute a view, you just--surprise--hit a URL over HTTP and get JSON back
    • e.g. http://localhost:5984/database/_design/designdocname/_view/viewname
    • to add a key to this, it's just an argument in the query string of the url, e.g.
      • http://localhost:5984/database/_design/designdocname/_view/viewname?key=value
    • can also retrieve documents by key range
      • http://localhost:5984/database/_design/designdocname/_view/viewname?startkey=startvalue&endkey=endvalue
  • default query engine or "view server" in CouchDB is JavaScript, but you can write your own in any language
    • Remember it's all just HTTP and JSON!
  • MAP FUNCTIONS take a document as an argument and emit key/value pairs
    • Btrees are very efficient--even with lots of documents the tree is "shallow" and it's pre-indexed so searches are very fast
  • REDUCE FUNCTIONS operate on the rows returned by map functions and boil them down to a summary value, e.g. a count or a sum
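
To tie the points above together, here's a toy view engine in JavaScript. The map function (one doc argument, calling emit) and the reduce function (keys, values, rereduce) are in the shape CouchDB uses, but buildView and queryView are hypothetical helpers: CouchDB supplies emit() itself and keeps view rows in a B-tree, which the array sort below only stands in for. The sample albums are made up.

```javascript
let emit; // CouchDB provides emit() to map functions; this sketch supplies its own

// Map function as it would appear in a view: one doc in, key/value rows out.
function mapAlbumsByYear(doc) {
  if (doc.type === "album" && doc.year) {
    emit(doc.year, doc.title);
  }
}

// Reduce function: collapses mapped values into a single count.
function countRows(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  }
  return values.length;
}

// Run the map function over every document and keep the rows sorted by key.
function buildView(docs, mapFn) {
  const rows = [];
  docs.forEach(function (doc) {
    emit = function (key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    };
    mapFn(doc);
  });
  rows.sort(function (a, b) { return a.key - b.key; }); // numeric keys only in this sketch
  return rows;
}

// Look rows up by exact key or by an inclusive key range.
function queryView(rows, options) {
  return rows.filter(function (row) {
    if (options.key !== undefined) return row.key === options.key;
    if (options.startkey !== undefined && row.key < options.startkey) return false;
    if (options.endkey !== undefined && row.key > options.endkey) return false;
    return true;
  });
}

const docs = [
  { _id: "a1", type: "album", title: "Kind of Blue", year: 1959 },
  { _id: "a2", type: "album", title: "Abbey Road", year: 1969 },
  { _id: "a3", type: "album", title: "Nevermind", year: 1991 },
  { _id: "p1", type: "post", title: "Not an album" }
];

const view = buildView(docs, mapAlbumsByYear);
const sixties = queryView(view, { startkey: 1960, endkey: 1969 });
const total = countRows(null, view.map(function (r) { return r.value; }), false);

console.log(sixties.map(function (r) { return r.value; }), total);
```

When you query a real view over HTTP, the key, startkey, and endkey values in the query string are JSON-encoded, so a string key needs its double quotes.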


  • can replicate local -> local, local -> remote, or remote -> remote from Futon
  • as with everything in CouchDB, this is all HTTP/REST based
    • initial replication may be time consuming
    • subsequent replications are diffs only
    • if you trigger replication from Futon, you have to leave the browser window open!
    • but since all this is HTTP based, easy to set up cron jobs that use curl to do replication
  • a POST to CouchDB containing the source and target of replication is all that's needed to kick off replication
    • CouchDB maintains a session history of replication sessions, again in JSON
  • can replicate among local databases or between databases not on the same physical box
  • CouchDB has automatic conflict detection and resolution
    • remember that documents in the database are versioned, so conflicts are handled quite gracefully
    • documents that are in conflict when a replication occurs are flagged with a _conflicts attribute
      • one of the two competing documents is given the latest revision number, the other is given a previous revision number
      • these conflicts are also replicated, so all databases will have the same information
  • CouchDB takes the approach of "eventual consistency"
    • traditional RDBMS systems enforce consistency--put consistency above all in replication situations
    • “Each node in a system should be able to make decisions purely based on local state. If you need to do something under high load with failures occurring and you need to reach agreement, you’re lost… If you’re concerned about scalability, any algorithm that forces you to run agreement will eventually become your bottleneck. Take that as a given.”

      Werner Vogels, Amazon CTO and Vice President
    • consistency between nodes is not guaranteed on writes, but the nodes will eventually be consistent on reads
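As a rough sketch of the "POST containing the source and target" idea above, the following builds the request without sending it. buildReplicationRequest is a hypothetical helper, and the host and database names are examples only; the request targets CouchDB's _replicate resource.

```javascript
// Build (but do not send) the HTTP request that kicks off a replication.
// buildReplicationRequest is a hypothetical helper; hosts and database names are examples.
function buildReplicationRequest(couchUrl, source, target) {
  return {
    method: "POST",
    url: couchUrl.replace(/\/+$/, "") + "/_replicate", // strip trailing slashes first
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ source, target }),
  };
}

const req = buildReplicationRequest(
  "http://localhost:5984/",
  "contacts",
  "http://example.com:5984/contacts"
);
console.log(req.url);  // http://localhost:5984/_replicate
console.log(req.body); // {"source":"contacts","target":"http://example.com:5984/contacts"}
```

The same body could be sent with curl from a cron job, which is what makes scheduled replication easy to set up.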

Design Documents

  • documents that contain application code
  • IDs must start with _design/, e.g. "_design/myapp"
  • possible to write entire apps in HTML/JavaScript, store this code as a design document in CouchDB, and run the entire app from the CouchDB database
    • dynamic code (views and validation) written as JSON and stored as a document in CouchDB
      • MapReduce queries stored in the views field
      • data output CAN be things other than JSON using the shows field (show functions), e.g. CouchDB can output RSS without any middleware
    • static HTML pages stored as attachments to the design document


  • validation functions are used to do things like prevent users who aren't logged into an app from performing document updates
  • validation functions are stored in design documents under the validate_doc_update field
    • can only have one validation function per design document
    • but remember you can have multiple design documents per database
  • documents must pass all the validation rules on all design documents in the database in order to be saved
    • the order in which validation functions are executed is arbitrary
  • most common example, since CouchDB is schemaless, is to require that certain fields be included if a document is declared to be of a particular type
    • e.g. require "title" and "body" for a document type of post
      function(newDoc, oldDoc, userCtx) {
         function require(field, message) {
           message = message || "Document must have a " + field;
           if (!newDoc[field]) throw({forbidden : message});
         }

         if (newDoc.type == "post") {
           require("title");
           require("body");
         }
      }
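To see how a rule like the "title" and "body" requirement behaves, here is a small sketch that runs a validation function outside CouchDB. validatePost follows the rule described above, the inner helper is renamed requireField to avoid clashing with Node's require, and tryValidate plus the sample documents are hypothetical test scaffolding.

```javascript
// Sketch: exercising a validation function outside CouchDB.
// tryValidate is a hypothetical harness; CouchDB itself calls the function on every update.
function validatePost(newDoc, oldDoc, userCtx) {
  function requireField(field, message) {
    message = message || "Document must have a " + field;
    if (!newDoc[field]) throw { forbidden: message };
  }
  if (newDoc.type == "post") {
    requireField("title");
    requireField("body");
  }
}

// Returns null when the document passes, or the rejection message otherwise.
function tryValidate(doc) {
  try {
    validatePost(doc, null, { name: null, roles: [] });
    return null;
  } catch (err) {
    return err.forbidden;
  }
}

console.log(tryValidate({ type: "post", title: "Hi", body: "Text" })); // null
console.log(tryValidate({ type: "post", title: "Hi" }));               // Document must have a body
console.log(tryValidate({ type: "comment" }));                         // null
```

Documents of any other type pass untouched, which is what keeps the database schemaless apart from the rules you opt into.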

Show Functions

  • since everything in CouchDB is JSON, HTTP, and JavaScript, it works well in any programming environment
  • CouchDB doesn't, however, address things like outputting HTML
  • easy enough to have CFML call CouchDB over HTTP and output the results in HTML
  • CouchDB can, however, generate HTML natively using show functions
  • basic show function
    function(doc, req) {
       return '<h1>' + doc.title + '</h1>';
    }
    • the "return" bit here is sent back to the browser as an HTTP response
  • you can even write full HTML templates that embed CouchDB-specific scripting so you don't have to embed HTML in javascript functions


  • Documents in CouchDB, which are JSON, can have file attachments
  • doing an HTTP PUT with curl's -d@filename.ext flag (or --data-binary for binary files) tells CouchDB to attach the file to the document ID provided in the PUT request
    • as with other updates to documents, you DO need to provide the current revision number of the document to attach a file to the document
    • unlike other updates you do NOT need to provide the data for the document itself in order to add an attachment to it
  • documents may have multiple attachments
  • attachments are made available as, surprise, HTTP resources
    • http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af/headshot.jpg would display the headshot.jpg file attached to the document with the ID provided
  • if you pull a document back from CouchDB that has an attachment, the attachment file name and meta information such as type, size, etc. are contained in the JSON with a key of "_attachments"
    • adding ?attachments=true returns the attachments in base64 format as part of the JSON
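A small sketch of how the "_attachments" metadata maps to attachment URLs. attachmentUrls is a hypothetical helper, and the revision, content type, and length in the sample document are made-up values; only the document ID and file name come from the example above.

```javascript
// List the HTTP URLs for a document's attachments based on its "_attachments" metadata.
// attachmentUrls is a hypothetical helper; the _rev, content_type and length are made up.
function attachmentUrls(dbUrl, doc) {
  const attachments = doc._attachments || {};
  return Object.keys(attachments).map(
    (name) => `${dbUrl}/${encodeURIComponent(doc._id)}/${encodeURIComponent(name)}`
  );
}

const contact = {
  _id: "6e1295ed6c29495e54cc05947f18c8af",
  _rev: "2-example",
  name: "Example Contact",
  _attachments: {
    "headshot.jpg": { content_type: "image/jpeg", length: 12345, stub: true },
  },
};

console.log(attachmentUrls("http://localhost:5984/contacts", contact));
// [ 'http://localhost:5984/contacts/6e1295ed6c29495e54cc05947f18c8af/headshot.jpg' ]
```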


  • you can build applications entirely in CouchDB
  • if you replicate your databases to another server, you replicate your app as well
    • means if you replicate to a local instance of CouchDB, you get offline data mode "for free"
  • applications are stored in CouchDB as design documents
  • can use couchapp for developing native CouchDB apps


  • can lock down databases by editing a simple config file
    • by default it's in /usr/local/etc/couchdb/local.ini
    • there's also a default.ini file, but any changes made to default.ini are overwritten when CouchDB is upgraded
  • e.g. adding admin accounts to CouchDB
    • uncomment [admins] section and add authorized users as user = pass for each line
    • when CouchDB is restarted the passwords are hashed so they aren't stored in the config file in plain text
  • remember--this is all just HTTP so you can apply the same HTTP-based security, proxies, reverse proxies, etc. as you would to any web resource
    • e.g. putting a web server in front of CouchDB and using HTTP authentication would be trivial

General Tips/Tricks

  • In your applications, you'll want to create your own UUIDs for document IDs instead of letting CouchDB auto-create them
    • WHY? stated this in book but didn't elaborate
  • Since replication and resynching is so dead simple, easy to replicate to a local DB for offline use, then resynch when back online
  • For bulk conversions of existing databases to CouchDB, couple of performance tips
    • use the bulk document API instead of looping and doing individual document additions
    • don't use CouchDB's auto-assigned IDs--increases db size and has a big performance hit during conversion
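The two bulk-conversion tips can be combined in one step: turn existing rows into a single payload for the bulk document API (_bulk_docs) and supply your own IDs. This is a sketch under assumptions: toBulkDocs, the "type:id" ID scheme, and the sample rows are all hypothetical.

```javascript
// Convert legacy rows into one _bulk_docs payload, reusing existing keys as _id
// instead of letting CouchDB assign IDs. toBulkDocs and the ID scheme are hypothetical.
function toBulkDocs(rows, type) {
  return {
    docs: rows.map(({ id, ...fields }) => ({ _id: `${type}:${id}`, type, ...fields })),
  };
}

const legacyRows = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" },
];

const payload = toBulkDocs(legacyRows, "contact");
console.log(JSON.stringify(payload));
// {"docs":[{"_id":"contact:1","type":"contact","name":"Ann"},{"_id":"contact:2","type":"contact","name":"Bob"}]}
```

The resulting JSON would be sent as one POST to the database's _bulk_docs resource rather than as one request per document.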

There Are Some Cons ...

  • it's new--they call it the bleeding edge for a reason
    • stuff WILL change between versions that will break your apps!
  • it's a completely new skillset--there is no sql here
  • views take a long time to build the first time they're saved, but after that they're incredibly fast regardless of the number of documents involved
  • large databases can take up a lot of disk space
    • raw data is one consideration, but the views often take up much more disk space than the data itself
    • this is trading disk space for performance, which is a good tradeoff, so you just need to plan for disk capacity
  • CouchDB does not deal well with relational data
    • that being said, we all likely spend a lot of time dealing with the shortcomings of relational data, specifically how horrendously bad the relational model is at dealing with hierarchical data, so I'm not sure this is a straight con as compared with the relational model
    • recurring theme in the CouchDB literature is DON'T GO AGAINST THE GRAIN! Don't try to force CouchDB to behave like an RDBMS, because it's not.
  • CouchDB does not support transactions well
    • e.g. check to see if a user name is unique, then assign it--no way to isolate this from another simultaneous request in the same way you can with relational database transactions
  • Reads on single documents, and writes in general, are slower than you might be used to with an RDBMS
    • but, the CouchDB model scales much better
  • Have to write all of your "queries" (views) in advance--no on the fly SQL allowed
  • map-reduce not as flexible as sql--sometimes you'll have to return more data than you want or need and process on the application side

Common Use Case That Is a Bit Odd in CouchDB: Unique Constraints

  • e.g. guarantee a unique user name or email address
  • thread here
  • other than the _id field, CouchDB has no way of guaranteeing uniqueness in any other field
  • you CAN do a check to see if the record already exists, but remember ...
    • This is HTTP. There are no transactions.
    • There is no locking in CouchDB
  • On the other hand, guaranteed uniqueness doesn't scale
  • Some solutions
    • you can use anything as your _id, so use one piece of data that has to be unique AS your _id
    • use a relational table in an RDBMS to store any unique values and put a unique constraint on that field in the RDBMS
      • you could still have problems, but this would reduce the likelihood from slim (without doing this) to extremely slim
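The "use the unique value AS your _id" approach can be illustrated with an in-memory stand-in for a database. FakeDb is hypothetical and only imitates one behavior: writing to an existing _id without the current revision is rejected with a 409 conflict. The "user:" prefix and lowercasing are assumptions made for the example.

```javascript
// In-memory stand-in for a CouchDB database (hypothetical, for illustration only).
// It rejects a write to an existing _id unless the current _rev is supplied.
class FakeDb {
  constructor() {
    this.docs = new Map();
  }
  put(doc) {
    const existing = this.docs.get(doc._id);
    if (existing && existing._rev !== doc._rev) {
      return { status: 409, error: "conflict" };
    }
    const revNumber = existing ? parseInt(existing._rev, 10) + 1 : 1;
    const stored = { ...doc, _rev: `${revNumber}-fake` };
    this.docs.set(doc._id, stored);
    return { status: 201, id: stored._id, rev: stored._rev };
  }
}

// The user name itself becomes the document ID, so a duplicate is a conflict.
function registerUser(db, username, profile) {
  return db.put({ _id: "user:" + username.toLowerCase(), type: "user", ...profile });
}

const db = new FakeDb();
const first = registerUser(db, "alice", { email: "alice@example.com" });
const second = registerUser(db, "Alice", { email: "other@example.com" });
console.log(first.status, second.status); // 201 409
```

No check-then-insert step is needed: the second registration fails on the write itself, so there is no window for a simultaneous request to sneak in.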


Yet Another PDF/Acrobat Pro Rant

I'm in PDF hell yet again today and had to vent. Now that iText fixed my
searchability problems (CFPDFFORM fail), I'm noticing cases where the font
in particular fields in the generated PDF does not in any way match the
settings that are in the PDF form when you look at the settings in Acrobat.

For example, all the form fields in one of the PDFs I'm working with are
set to font face Times New Roman and "Auto" for the font size. Random
fields here and there show up as Arial instead of Times New Roman and come
out some massive font size, even though other fields with the same amount
(or less) text are a reasonable size and are the correct font face.

Since I only recently figured out how to do mass changes of the font face
on multiple fields (usability fail; and this doesn't work consistently by
any means, but it's faster than doing it one by one), I thought I missed
setting the font face correctly on a field or two. But lo and behold when I
open the PDF form in Acrobat Pro the font is CLEARLY set correctly, yet the
generated PDF still renders the font incorrectly.

All that's bad enough, but the PDF size issue is really starting to kill
me. The particular PDF I'm modifying started out at about 500K in size. I'm
having to experiment with some things to figure out these annoying font
issues, so I changed all the fonts from Times New Roman to Arial, saved the
PDF, and the file size went up to 800K. I then changed the font back from
Arial to Times New Roman (which is what it was originally) and the file
size is now 1MB.

What. The. @$&*.

I'm sure there are stupid subtleties or fancy Acrobat Guru tips and tricks
of which I am woefully unaware but my file size shouldn't grow by 200K
every time I save it, so I'll declare this a "fail" and try to suffer
through. Once I get these god-forsaken things working if I never have to
touch Acrobat the rest of my natural life it'll be too soon.

Sunday, October 25, 2009

using Drupal | Dries Buytaert

I think Drupal is a great fit in terms of President Barack Obama's desire to reduce cost and to act quickly. Drupal's flexibility and modularity enables organizations to build sites quickly at lower cost than most other systems. In other words, Drupal is a great match for the U.S. government.

Second, this is a clear sign that governments realize that Open Source does not pose additional risks compared to proprietary software, and furthermore, that by moving away from proprietary software, they are not being locked into a particular technology, and that they can benefit from the innovation that is the result of thousands of developers collaborating on Drupal. It takes time to understand these things and to bring this change, so I congratulate the Obama administration for taking such an important leadership role in considering Open Source solutions.

This is a great move for a whole bunch of reasons, and is yet another example to point to when you run into that thankfully increasingly rare type of person who claims open source isn't secure. The "reduce costs and act quickly" bits are important to remember as well.

Once you get on an open platform the possibilities really open up.

Saturday, October 24, 2009

My Open CFML Presentation from BFusion 2009

Here's a PDF of my "Building and Deploying CFML on an Open Source Stack" presentation from BFusion 2009. The VirtualBox VM we used in the session is available here (2.9 GB zip file), and the user/password on the VM is "floss" (without the quotes) for both. The VM includes:

  • Ubuntu 9.04

  • Java 1.6.0 Update 16

  • MySQL 5

  • Apache

  • Tomcat 6.0.20

  • OpenBD (WAR and pre-deployed)

  • Eclipse with CFEclipse and Subclipse

  • A ColdTonica WAR to practice deployment

The one thing we didn't get to in my session is connecting Apache to Tomcat, but that's simple enough so give me a shout if you have trouble with that.

What we did get to (that I'm glad we did) is monitoring Tomcat and OpenBD with VisualVM, and a lot of people (based on the reaction) seemed not to be aware of that tool.

Once again this is a GREAT conference (I'd be lying if I said Ben Nadel's portion of the keynote didn't get me a little teary eyed) with great technical content and day-long training sessions. If you aren't here this year, get here next year!

InfoQ: Simplifying Java EE with Grails

Graeme Rocher introduces Groovy and its corresponding web framework, Grails, followed by a code writing demo intended to highlight the advantages of using Grails over Java EE in order to develop web applications.

Haven't even watched this one yet but after seeing Graeme present at SpringOne2GX this year, I know this will be a fantastic one to watch if you're interested in Grails.

Open Source Voting Software Concept Released

Wired is reporting that the Open Source Digital Voting Foundation has announced the first release of Linux- and Ruby-based election management software. This software should compete in the same realm as Election Systems & Software, as well as Diebold/Premiere for use by County registrars. Mitch Kapor — founder of Lotus 1-2-3 — and Dean Logan, Registrar for Los Angeles County, and Debra Bowen, California Secretary of State, all took part in a formal announcement ceremony. The OSDV is working with multiple jurisdictions, activists, developers and other organizations to bring together 'the best and brightest in technology and policy' to create 'guidelines and specifications for high assurance digital voting services.' The announcement was made as part of the OSDV Trust the Vote project, where open source tools are to be used to create a certifiable and sustainable open source voting system.

Awesome news. I got to see a presentation by and talk to the folks from Trust the Vote at the Open Source Bridge conference earlier this year, and it's an astonishing effort and the one (that I've seen anyway) that is actually starting to succeed.

Yes, there is a MASSIVE amount of red tape involved with getting voting machines certified (and justifiably so), but this is a huge, huge step in the right direction.

Friday, October 23, 2009

Apple Seeks Patent On Operating System Advertising

In one alarming aspect, the device could be disabled while the advertisements run, thereby forcing users to let the advertisement run its course before the system would unlock and allow further use. In an even more invasive scenario, explained in the patent application, the user could be required to do something, such as click to continue, in order to verify that they are actively watching the advertisement and haven't simply walked away while the ad runs.

People better be paying me to use devices if this kind of crap starts happening. Yet another reason to use GNU Linux.

Thursday, October 22, 2009

Session Notes - Grails Without a Browser

Presenter: Jeff Brown, SpringSource

  • grails typically thought of as a web framework (which it is), but there are significant applications built with grails that have no browser front-end at all
  • interesting work done at LinkedIn in this regard
    • talked about it in public at JavaOne last year
    • primary revenue generating app is a grails app (partner interaction, etc.)
      • this app has no browser front-end to it
      • built a business/service layer all in grails
      • have other applications that sit in front of this that aren't grails
  • lots of stuff in grails doesn't have anything to do with building web apps specifically
    • GORM
    • service layer
    • these make good sense in any application
  • can think of grails as a platform--similar to eclipse platform
    • eclipse IDE is what you think of, but the IDE is really just one app on top of the eclipse platform
    • e.g. substantial parts of grails are being used in griffon for building desktop apps
  • grails 0.4 was the first release that had the plugin system in it
    • interview soon after this release--jeff was asked what was coming up in the following year
    • hope was that they'd see a lot of development in the plugin space
    • turned out that there's a far more active and productive plugin community than they had hoped
    • shows the power of the plugin architecture in grails as well as grails as a platform in general
  • some plugins have nothing to do with an html ui
    • remoting plugins
      • exposes services to remote calls
      • can have the grails side interacting with GORM as per usual, but make these services available to RMI, SOAP, etc.
    • good REST support built right into the framework

Code Demo

  • simple app with Car domain class, CarService
  • services
    • transactional by default
    • instance of service class is automatically added to the spring application context
  • installing xfire SOAP plugin
    • inside a service can declare a property using "expose", e.g. static expose = ['xfire'] in the service
      • this will inspect services that have an expose property declared and make them available via xfire
    • also plugins for exposing as rmi, jms, etc.
  • after firing the app up you can browse to http://localhost:8080/context/services to see a list of service WSDLs
    • xfire generated the wsdl automatically
    • can test with something like SOAP UI--UI tool to allow you to exercise web services
  • as you define other operations within the service, the WSDL gets regenerated as needed to expose the new methods
    • with xfire you do need to bounce the app for it to pick up the changes
  • creating a groovy script as a command-line client to the exposed service
    • create a WSClient, give it the URL to the WSDL, and can then call methods on the web service
    • WSClient is an external project (not core to groovy) so you do need to add it to your classpath
  • note that you don't have access to the Car class when you get back data from the web service
    • you aren't getting back serialized Car objects, it's returning WSClient type of ArrayOfWhatever
      • basically you get back a list of arrays--each array contains the property of the Car object
        • really JAXB elements that contain an xml representation of the car
  • note that by default all of your services are singletons (stateless)
    • if you do happen to have a service that DOES have state, need to worry about thread safety
      • can specify that the bean is scoped in any one of the valid spring scopes
  • controllers are not singletons since they have state--new controllers are created for every request
  • xfire may or may not support (need to check) not exposing specific methods within a service using something like webparam annotations
  • installing jmx plugin
    • can add another entry to the expose property, e.g. static expose = ['xfire', 'jmx']
      • still exposed as SOAP, but also available to jmx
  • creating simple math service
  • can create a griffon client to access the grails service
  • demoing creation of griffon app (results in a swing app)
    • showing two text input fields in swing app, click button to make call to grails service to add the two numbers together
    • actually clicking on the button calls a griffon controller, and the griffon controller makes the web service call to the grails service
      • creating an instance of groovy's WSClient in the griffon controller and calling the same way as from the command line client
  • now creating a RESTful interface to the math service
    • adding a MathController to the grails app with a "product" action to multiply two numbers
    • return from the controller is xml
    • showing calling this in the browser but of course this could be called from anything that can accept xml as the return format
  • now adding another component to the griffon application to do multiplication by calling the REST URL in the grails app
    • plugin for griffon to enable REST support
    • wind up with two buttons in the griffon app, one that makes a soap call, the other making a rest call
  • can use UrlMappings in a grails app to make things more RESTful
  • in controller you can specify the allowed HTTP request types that can be made to the controller by action, e.g. static allowedMethods = [delete:'POST'] would throw a 405 if any request type other than POST is made to the delete action
    • should never do anything destructive in response to a get request
    • if you write controllers from scratch the allowedMethods property is NOT there, and in most cases you'll want to add it
  • in RESTful services, the request will come to the controller since controllers in grails apps are what respond to requests
    • can put logic in services to get it out of your controller, but the REST response since it's based on HTTP request/response needs to be handled in the controller
    • controllers don't have to render a *view* specifically, they just have to provide a response
      • response can be html, xml, json, etc.
  • when you think about grails, building web apps is a huge part of what it's used for, but grails also handles "no browser" apps or multiple UI clients quite well
  • in the end your application is just responding to requests


  • anyone working on a plugin so you can write an app that can render either a grails or griffon app?
    • nothing really going on in this specific area, but moving towards compatibility with grails and griffon plugins
    • in many cases you can do a search/replace on a grails plugin to replace "grails" with "griffon" and it'll work in griffon
  • has linkedin experienced any scalability issues?
    • linkedin app gets hit really hard and is holding up very well
    • being on the jvm means you have great existing solutions for deployment, scalability, monitoring, etc.
      • can do things in other frameworks (e.g. rails, django) perhaps as quickly as with grails, but that's only one piece of the puzzle
  • what happens when an app and a plugin rely on different versions of the same jars?
    • no good solution for this at this point
    • after grails 1.2 is released, significant effort will be put into the plugin system--need to figure out what role OSGi will play in the future of grails
      • will grails apps and/or plugins turn into OSGi bundles? probably ponder this early next year
  • is GORM usable outside of Grails?
    • yes, and isn't difficult to do now
    • wiring between gorm and the rest of grails is pretty decoupled at this point
    • can use GORM in griffon, for example--drop the GORM jar in and annotate classes, and everything works
      • all the gorm dependencies are in 1 JAR now
    • when using GORM outside of grails you do need to create your own bootstrapper or use annotations so GORM knows about your domain model
  • error handling with soap?
    • inside controller action if you get bad data you can render the soap response with the error
    • grails doesn't complicate or simplify any error handling with soap
    • anything you can do to specify timeouts, etc.?
      • rest client for griffon no, wsclient probably does
    • shouldn't assume when something goes wrong in groovy/grails that you always get an exception
      • in many cases grails fails silently, e.g. if you call save() and it fails, it doesn't throw an exception it just returns null
      • can now specify that save() throws an error if save fails
  • what about security with xfire and securing services?
    • can secure access to the webapp at the http level
    • don't know if xfire plugin has specific user/role based access
    • controllers in REST would hook into everything that grails already does--just an http request to the webapp

Session Notes - Not Your Father's Custom Tags

Presenter: Dave Klein

Custom Tags are Valuable

  • build reusable components
  • keep code out of pages
  • make pages more readable
    • even putting presentation/html code into custom tags can make the page more readable
  • encapsulate domain knowledge
    • easy for page designers to use without knowing much about the domain model, etc.

JSP Custom Tags are Painful

  • create handler class for each tag
  • implement one of several interfaces
  • implement interface's methods
  • <pure_evil>define a TLD</pure_evil>
  • add a page directive for each TLD
  • great once they exist, but because they're a pain to create people avoid them and don't get the benefit
  • JSP custom tag for hello world is 2+ pages of code, equivalent GSP tag is about 6 lines

The Power of JSP Without the Pain

  • convention over configuration
  • no tld, no xml
  • no interfaces to implement
  • a TagLibrary is a single groovy class
    • can contain multiple tags within this class
  • each tag is a closure
  • don't need to declare within the page
    • if it's in the project, it's available on every page

GSP: A Quick-Start Guide

What you can do in a GSP tag

  • accept and use attributes
  • accept and conditionally use a body
  • access your domain model including all the GORM methods
  • call service classes
  • call other tags
    • other tags are called as methods from within a tag
  • access the session, request, and response

Implicit Objects in a GSP Tag

  • session (GrailsHttpSession)
  • request (HttpRequest)
  • response (GrailsContentBufferingResponse)
  • out (GSPResponseWriter)

Example Tag -- Output Groovy Group List

  • def groupLinks = {
      def groups = GroovyGroup.list()
      out << "<br/><ul>"
      groups.each { group ->
        out << "<li><a href='"
        out << createLink(action: 'show', id: group.id) // id parameter is an assumption
        out << "'>${group}"
        out << "</a></li>"
      }
      out << "</ul>"
    }
  • call tag as <g:groupLinks />

Example Tag With a Body

  • def ifLoggedIn = { attrs, body ->
      def user = session.user
      if (user) {
        out << body()
      }
    }
  • <g:ifLoggedIn>
      info for logged in users goes here
    </g:ifLoggedIn>
  • note that in the tag code, the first parameter will be treated as a map of attrs so even if you're not using attrs, but you're using the body, still need to have a dummy parameter for attrs in order for things to work correctly

Using Custom Tags Instead of <g:if>

  • def buttonBar = {
      def user = session.user
      out << "<div class='buttons'>"
      if (user) {
        // output buttons for logged in users here
        if (user.isAdmin()) {
          // output admin buttons here
        }
      } else {
        out << "<input type='button' value='Login' />"
      }
      out << "</div>"
    }
  • call tag as <g:buttonBar />

Testing Tags

  • very easy
  • tags when called as methods return a string
  • TagLibUnitTestCase makes it even easier
    • includes mocks for
      • session
      • request
      • response
  • includes other mocks from GrailsUnitTestCase

A Sample Test Class

  • import grails.test.*

    class DemoTagLibTests extends TagLibUnitTestCase {

      public void setUp() {
        super.setUp() // sets up tagLib and the mock session/request/response
        // createLink isn't available in a unit test, so stub it on the tag library;
        // the same is true of custom tags that aren't in the same tag library
        tagLib.metaClass.createLink = { params -> }
      }

      void testButtonBarWithAdmin() {
        mockSession.user = [isAdmin: { -> true }] // map is serving as mock object
        def output = tagLib.buttonBar() // tag itself is a closure, so can call as a method here
        assert output.toString().contains('Create Stuff')
      }

      void testIfLoggedIn() {
        mockSession.user = "Anything can go here" // doesn't matter what goes here since we're just checking session.user
        def output = tagLib.ifLoggedIn([:]) {
          'User is logged in'
        }
        assertEquals 'User is logged in', output.toString()
      }

      void testGroupLinks() {
        mockDomain(GroovyGroup, [ /* new GroovyGroup objects here ... */ ])
        def output = tagLib.groupLinks()
        assert output.toString().contains('<a href=')
        assert output.toString().contains('Group2')
      }
    }

  • remember that unit tests run much faster but don't have everything available, so if you find yourself mocking a ton of stuff in unit tests, that's a good indication you need integration tests


  • if you don't declare a namespace, uses g
  • put static namespace = 'demo' at the top of your taglib class to declare a namespace
  • use the namespace as a prefix when calling the tag as a method, e.g. namespace.tag()

Distributing TagLibs With Plugins

  • create a plugin project
  • create or copy a TagLib to the /taglib directory
  • package your plugin
  • Install your plugin in another application
  • your custom tags are now available in the application


  • showing demo of simple custom tag to replace a lot of the redundant stuff in the grails scaffolding
  • one limitation of gsp tags is you can't have nested tags
    • you can fake this out a bit by leveraging the request scope
  • great use of gsp tags is to bundle up html, css, and javascript that gets output to the page
  • FieldData plugin available that does a lot of these sorts of things
  • check out the custom tags available in the grails core for good examples of how to do things

Wednesday, October 21, 2009

It's Alive! CFML Plugin for Grails

Nothing to distribute quite yet, but I'm psyched that I just successfully hit an index.cfm in my nascent CFML plugin for Grails. What this means is that (when it's done anyway) you'll be able to run CFML from within a Grails application.

Why is this cool? Partially just because it's fun to mess around with, but there are several practical reasons as well. Here's a few off the top of my head:

  • Leverage the power of Grails for building the model and controller layer of an application while using CFML for the view
  • Integrate existing CFML functionality or entire CFML applications within the context of a Grails application
  • Write hybrid Grails and CFML applications, mixing and matching CFML and Groovy/Grails code in various sections of the application
  • Use Grails URL mappings to hit CFML or Groovy code based on URL patterns

Creating the plugin was actually relatively simple. The Grails plugin architecture is amazingly powerful and easy to use, and after Graeme Rocher's two sessions on plugin development today at SpringOne2GX I at least had an idea of how this would be accomplished. I created a new Grails plugin project, dropped the Open BlueDragon JAR files into the Grails plugin's lib directory, defined the servlets, servlet mappings, and a few other things in the Grails plugin config file, and a basic index.cfm file was processed fine.

I'm still thinking through all the possibilities and potential problems I might run into with this but at least it's limping along. Exciting stuff, and shows the extreme power of the Grails plugin architecture.

Session Notes - Grails Plugin System Part 2

Presenter: Graeme Rocher

Technical Plugins

  • typically a bit more complicated to implement
  • can mix and match technical and functional plugins
  • provide programming apis

Key Concepts

  • plugin lifecycle
  • GrailsApplication object
    • encapsulates the conventions in a project
  • Spring BeanBuilder
    • runtime spring configuration--need to understand the spring dsl
  • Metaprogramming

Plugin Lifecycle Hooks

  • closures that allow you to manipulate the grails runtime
  • def doWithWebDescriptor = { xml -> }
    • useful for creating plugin that manipulates web.xml
  • def doWithSpring = {}
    • can use full power of spring within a grails plugin
    • written in groovy so have full access to groovy in configuring spring
  • def doWithDynamicMethods = { applicationContext -> }
    • metaprogramming
    • e.g. could add a method that interacts with hibernate
    • since you have access to the applicationContext you have access to more or less everything
  • def doWithApplicationContext = { applicationContext -> }
    • post-processing


doWithWebDescriptor

  • programmatic modification of generated web.xml
  • useful for integrating third-party servlets/filters
  • groovy markupbuilder style dsl
    • easy to append to existing nodes in xml
  • automates a lot of configuration you'd otherwise have to do by hand
  • doesn't happen during war deployment
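
A plugin that registers a third-party servlet through doWithWebDescriptor might look roughly like the sketch below. The plugin name, servlet name, servlet class, and URL pattern are all placeholders I made up for illustration; only the hook name and the "append to the last existing node" idiom come from the session.

```groovy
// Hypothetical plugin descriptor: names and classes below are placeholders.
class ExampleServletGrailsPlugin {
    def version = '0.1'

    // Grails hands the parsed web.xml to this closure; nothing runs until Grails calls it.
    def doWithWebDescriptor = { xml ->
        // Append a new <servlet> element after the last existing one.
        def servlets = xml.'servlet'
        servlets[servlets.size() - 1] + {
            servlet {
                'servlet-name'('exampleServlet')
                'servlet-class'('com.example.ExampleServlet')
            }
        }

        // Append the matching <servlet-mapping> the same way.
        def mappings = xml.'servlet-mapping'
        mappings[mappings.size() - 1] + {
            'servlet-mapping' {
                'servlet-name'('exampleServlet')
                'url-pattern'('*.example')
            }
        }
    }
}
```

Outside of Grails this only defines the closure; the web.xml changes happen when Grails invokes the hook.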


doWithSpring

  • programmatically modify Spring ApplicationContext
  • groovy dsl that generates spring xml
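
To get a feel for the bean DSL used in doWithSpring, here is a small sketch that runs in plain Groovy. Greeter, GreeterGrailsPlugin, and BeanNameRecorder are hypothetical names; BeanNameRecorder is only a stand-in for Spring's BeanBuilder so the closure can be exercised without Grails, and it records just the bean name and class.

```groovy
// Hypothetical bean class the plugin wants to register.
class Greeter {
    String greeting
}

// Hypothetical plugin descriptor using the bean DSL: beanName(BeanClass) { properties }.
class GreeterGrailsPlugin {
    def doWithSpring = {
        greeter(Greeter) {
            greeting = 'hello from a plugin'
        }
    }
}

// Stand-in for BeanBuilder (an assumption for this demo only):
// every unknown method call is treated as a bean definition.
class BeanNameRecorder {
    Map<String, Class> beans = [:]

    def methodMissing(String name, args) {
        beans[name] = args[0]
    }
}

def recorder = new BeanNameRecorder()
def hook = new GreeterGrailsPlugin().doWithSpring.clone()
hook.delegate = recorder
hook.resolveStrategy = Closure.DELEGATE_FIRST
hook()

assert recorder.beans.greeter == Greeter
```

In a real plugin Grails supplies the delegate, so only the doWithSpring closure itself would appear in the descriptor.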


doWithDynamicMethods

  • exercise your groovy metaprogramming skills
  • add new methods and properties to existing types
  • "delegate" is equivalent to "this" in metaprogramming
  • in java terms a lot of what you have to do is DI, etc. and your code becomes rather cluttered
    • e.g. in GORM there's no reference to the fact that you're using Hibernate in your domain classes
    • specific references are abstracted away
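
The kind of metaprogramming done in doWithDynamicMethods can be tried in plain Groovy. This is a minimal sketch with a made-up Book class and made-up method names; it only shows how a method gets added to an existing type and how "delegate" refers to the instance the method is called on.

```groovy
// Hypothetical class standing in for a domain class.
class Book {
    String title
}

// Add new methods to an existing type; inside the closure,
// `delegate` is the Book instance the method was called on.
Book.metaClass.shout = { -> delegate.title.toUpperCase() + '!' }
Book.metaClass.isLongerThan = { int n -> delegate.title.size() > n }

def book = new Book(title: 'groovy in action')
assert book.shout() == 'GROOVY IN ACTION!'
assert book.isLongerThan(5)
```

The Book class itself contains no reference to the added behavior, which is the same effect the notes describe for GORM and Hibernate.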

Additional Hooks

  • onChange
    • can re-add a method to a class when it changes
    • e.g. specify def observe = ['plugins'] to watch for changes
  • onConfigChange

Plugin Life-Cycle Illustrated

  • grails loads first
  • loadPlugins() called next -- pluginmanager created
  • following this, the plugin lifecycle points are called
  • all plugins, core and otherwise, are loaded at the same time
    • can specify with your own plugins if they should load explicitly before or after core plugins

Programming by Convention

  • every grails project has a set of conventions
  • configuration can be automated through conventions
  • grails supplies an API to inspect conventions via some key interfaces
    • GrailsApplication
    • GrailsClass

The GrailsApplication Interface

  • used for programming by convention
  • not an interface in the strict Java sense; it's a dynamic object
  • inspect GrailsClass instances
  • extensible to new artefact types
  • available as a variable called application in plugins
  • interesting methods
    • Class[] getAllClasses() // gets all loaded classes
    • GrailsClass get*Class(String name) // gets a class by name
    • boolean is*Class(Class c) // checks if a class is of a given type
    • GrailsClass[] get*Classes() // all grailsclass instances of type
    • void add*Class(Class c)
    • all the * stuff in these examples are where you substitute the grails class type, e.g. getControllerClass("fully.qualified.class.Name")
  • can add new artefact handlers to add completely new artefact types to grails
    • can tell grails what the conventions are, e.g. "ends with X" or "lives in this directory"

Artefact Types

  • built in types
    • DomainClass
    • Controller
    • TagLib
    • Service
  • extensible to new by implementing ArtefactHandler

The GrailsClass Interface

  • allows you to further analyze conventions for each class
  • contains methods to get various forms of the class name
    • getShortName(): UserController (no package)
    • getPropertyName(): userController (bean name)
    • getNaturalName(): User Controller
    • others ...
      • getLogicalPropertyName(), getPackageName(), etc.
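
The relationship between these name forms can be approximated in a few lines of plain Groovy. This is my own simplified illustration, not the Grails implementation, and the helper closure names are made up; it ignores edge cases such as acronyms.

```groovy
// Simplified stand-ins for the name conventions (not the Grails implementation).
def shortName    = { String className -> className.tokenize('.').last() }
def propertyName = { String name -> name[0].toLowerCase() + name.substring(1) }
def naturalName  = { String name -> name.replaceAll(/([a-z])([A-Z])/, '$1 $2') }

def fullName = 'com.example.UserController'   // hypothetical class name

assert shortName(fullName) == 'UserController'
assert propertyName(shortName(fullName)) == 'userController'
assert naturalName(shortName(fullName)) == 'User Controller'
```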

Demo: Building a Technical Plugin

  • building an API to twitter
  • using twitter4j -- twitter api for java
    • drop jar in lib directory of project
  • create a new public twitter bean in the doWithSpring plugin point
    • default constructor assumes public timeline
  • in doWithDynamicMethods we're going to add the ability for spring security users to submit to twitter
    • adding dynamic tweet() method to the String class
    • String.metaClass.tweet = { ->
        def auth = SecurityContextHolder.context?.authentication
        if (auth instanceof UsernamePasswordAuthenticationToken) {
          def twitter = new Twitter(auth.principal, auth.credentials)
          twitter.updateStatus(delegate) // assumed completion; delegate is the String being tweeted
        } else {
          throw new TwitterException("user not logged in")
        }
      }

Convention-Based Programming

  • have classes in the system, want to mark these as a "tweeter"
  • want to be able to call twitter methods on a user who's marked as a tweeter
  • implement this in doWithDynamicMethods
  • inside each hook there is an implicit application variable that represents the GrailsApplication
    • application.domainClasses.each { GrailsDomainClass dc ->
        def metaClass = dc.metaClass
        metaClass.tweet = { String message ->
          def twitter = new Twitter(delegate.username, delegate.password)
          twitter.updateStatus(message) // assumed completion; the notes stop before the call that posts the message
        }
        metaClass.getLatestTweets = { ->
          def twitter = new Twitter(delegate.username, delegate.password)
          // the returned list has no text property of its own; GPath collects the text property of each status
          twitter.getPublicTimeline().text
        }
      }
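
The "mark a class as a tweeter" idea can be sketched in plain Groovy without Grails or twitter4j. Everything here is hypothetical: the static tweeter marker is my own choice of convention, and an in-memory outbox list stands in for the Twitter client.

```groovy
// Hypothetical classes: only Account opts in through the marker property.
class Account {
    static tweeter = true
    String username
}

class Address {
    String city
}

// Stand-in for the Twitter client: messages are collected in memory.
def outbox = []

// Convention check (an assumption): a class is a tweeter if it declares a truthy static `tweeter`.
def isTweeter = { Class c ->
    c.declaredFields.any { it.name == 'tweeter' } && c.tweeter
}

// Add tweet() only to the classes that follow the convention.
[Account, Address].findAll(isTweeter).each { Class c ->
    c.metaClass.tweet = { String message ->
        outbox << "${delegate.username}: ${message}".toString()
    }
}

new Account(username: 'alice').tweet('hello')

assert outbox == ['alice: hello']
assert !new Address().respondsTo('tweet')
```

In a plugin the list of candidate classes would come from application.domainClasses rather than a hard-coded list.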


Summary

  • more advanced knowledge of the grails plugin system opens up a world of possibilities
  • combine technical and functional plugins to enable new levels of productivity
  • harness the power of groovy metaprogramming!
  • helps avoid reinvention and duplication across applications


Q&A

  • inside any plugin you can do grails create-script and specify a gant script
    • used to extend the grails command line interface
  • plugins can provide different scaffolding templates to override the default ones and create your own uis
  • "plugins are the most important thing about grails"
  • any thoughts about how to eliminate bad/abandoned plugins from the central repository?
    • a user already runs a hudson build of all tests on all plugins
    • there are thoughts of integrating this to give plugins a health status
  • plans to move plugins into git instead of SVN?
    • in grails 1.2 you can export plugin as a zip only so you can manage your source in git but still deploy to svn
    • complete move to git might be considered further down the line
  • SpringSource officially supports specific plugins

Session Notes - Grails Plugin System Part 1

Presenter: Graeme Rocher


  • plugin quick start

  • technical vs. functional plugins

  • plugins for modularity

  • distribution and dependency management

  • plugin demo

Key Facts

  • grails itself is made up of a collection of plugins

    • hibernate, webflow, gsp, etc.

    • grails core is essentially a set of plugins (about a dozen)

  • there are 300+ plugins available now

    • what a plugin can do is wide and varied

  • plugins are easy to create

    • don't need to spend a ton of time learning the internals of the framework to create a plugin

  • plugins are easy to distribute

  • everyone is a plugin developer

  • well over 25 million lines of user contributed code in the plugin repository

    • searchable

      • automatically makes any domain class searchable

    • taggable

      • adds apis to tag domain classes

    • rateable

      • ratings for domain classes

    • quartz

      • for scheduling tasks

      • interesting because it adds a new concept to grails -- "jobs"

      • adds create-job command to command line, has a jobs directory, etc.

    • gwt

    • weceem cms

      • example of a functional plugin as opposed to just providing new apis

      • this plugin contains views, controllers, domains, etc. as part of the plugin

    • feeds (rss and atom feeds)

    • iwebkit (iphone)

Common Use Case

  • you have a tag library

  • you have two applications

  • you want to share functionality

  • create a plugin!

  • grails create-plugin pluginName creates the plugin skeleton

  • grails 1.2 creates intellij project files

Plugin Structure

  • more or less identical to a grails application structure

  • can run the plugins as standalone apps

    • developing a plugin is the same as developing an application

  • only real difference is there's a plugin descriptor in the root of each plugin

    • contains info about plugin version, grails versions supported

    • can also exclude specific resources when the plugin is installed

    • also numerous plugin hooks that can be accessed (doWithWebDescriptor, doWithSpring, etc.)

  • using packages is important--should always use packages in your applications

Plugin Quick Start -- 5 steps

  • grails create-plugin myPlugin

  • cd myPlugin

  • grails create-tag-lib myTagLib

  • grails package-plugin

  • grails install-plugin
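
The tag library created in the third step could contain something like the sketch below. GreetingTagLib, the greet tag, and its name attribute are made up for illustration; the out property is declared here only so the class runs in plain Groovy, since inside Grails it is provided to tag libraries.

```groovy
// Hypothetical tag library; in a plugin it would live under grails-app/taglib.
class GreetingTagLib {
    // Stand-in for the output writer Grails normally provides.
    def out = new StringWriter()

    // A tag is a closure taking the tag attributes and the tag body.
    def greet = { attrs, body ->
        out << "<span class=\"greeting\">Hello, ${attrs.name}!</span>"
    }
}

def tagLib = new GreetingTagLib()
tagLib.greet.call([name: 'Karmic'], null)

assert tagLib.out.toString() == '<span class="greeting">Hello, Karmic!</span>'
```

Once the plugin is installed, an application would use the tag from a GSP, for example as <g:greet name="Karmic"/> under the default namespace.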

Plugin Extension Points

  • normal grails development

    • tag libraries, controllers, etc.

    • the build

    • these are trivial to do

  • requires additional knowledge

    • spring configuration

    • new apis

      • adding properties/methods to domain classes and controllers

    • new concepts

Plugin Types

  • relate to the "split" above

  • functional plugins

    • taglibs, controllers, extend the build, etc.

  • technical plugins

    • see above--spring config, metaprogramming, etc.


  • functional plugins

    • weceem cms

    • nimble

      • provides user management, integration with facebook, etc.

  • technical plugins

    • gwt

    • functional test

  • both functional and technical

    • spring security

    • searchable

      • adds search to domain classes, but also provides a complete search UI to your application

Functional Plugins

  • easier to understand

  • just like building a normal grails application

  • silos of functionality

    • forums, blog, search, etc.

  • used to create modular applications

Demo: Building a Functional Plugin

  • example: twitter clone

    • two different teams--web UI team, mobile team

    • underlying functionality is the same

  • good candidate for a plugin--need to share the domain model between the two apps

  • can install plugins in other plugins

    • e.g. can install both the iwebkit and the twitter domain model plugin into the iphone version of the app

    • can specify plugins and versions in dependsOn block in the plugin descriptor

    • when you package up your plugin it will bundle up the plugins on which your plugin depends and include them

  • can use BootStrap.groovy within plugins

  • can configure local plugin repositories and can specify order in which plugin repositories are searched

Certain Things that Plugins Cannot Package by Default

  • BootStrap.groovy, UrlMappings.groovy, DataSource.groovy

  • can create something like FooUrlMappings.groovy and that will get packaged

Some Tips

  • inline plugins

    • if you're modularizing your entire application, it becomes impractical to repackage and reinstall every plugin as it changes during development

  • url mappings

  • jar dependencies

  • testing

  • plugin distribution

Inline Plugins

  • use plugins without installing them

  • prevents package/install cycle

  • great for modular application development

  • go to grails-app/conf/BuildConfig.groovy and set grails.plugin.location."pluginName"="../path/to/plugin"

  • BuildConfig.groovy

    • not generated automatically in 1.1.1

    • lets you configure various aspects of the build, locations of generated files, etc.
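
As a quick check of what the inline plugin setting looks like, the snippet below feeds a BuildConfig-style line through Groovy's ConfigSlurper. The plugin name twitter-domain and its relative path are hypothetical; in a real project the line would simply sit in grails-app/conf/BuildConfig.groovy.

```groovy
// The same line you would put in BuildConfig.groovy (plugin name and path are made up).
def buildConfig = '''
grails.plugin.location.'twitter-domain' = '../twitter-domain'
'''

// ConfigSlurper is used here only to show how the setting is read as nested configuration.
def config = new ConfigSlurper().parse(buildConfig)

assert config.grails.plugin.location.'twitter-domain' == '../twitter-domain'
```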

URL Mappings

  • normal UrlMappings.groovy file is excluded from the package plugin process

  • if you want to define url mappings as part of your plugin, create a file called something like MyPluginUrlMappings.groovy and that will be included
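
A plugin-specific mappings file could look like this minimal sketch. The class name, URL pattern, controller, and action are placeholders; the only point is that the file name ends in UrlMappings without being UrlMappings.groovy itself.

```groovy
// Hypothetical file: grails-app/conf/TwitterPluginUrlMappings.groovy
class TwitterPluginUrlMappings {
    static mappings = {
        // Route /tweets and /tweets/<id> to a (hypothetical) tweet controller.
        "/tweets/$id?"(controller: 'tweet', action: 'show')
    }
}
```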

Plugin Dependencies

  • grails supports transitive installs and plugin dependencies using dependsOn

  • can use ranges of versions for dependencies, e.g. hibernate:"1.1 > *"

  • to implement a "soft" dependency use loadAfter

    • def loadAfter = ["hibernate"]

    • can also specify loadBefore
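
Put together in a plugin descriptor, the dependency settings might look like the sketch below. The plugin name and version are made up, and in practice you would normally pick either a hard dependency (dependsOn) or a soft one (loadAfter) for a given plugin; both are shown here side by side only to illustrate the syntax.

```groovy
// Hypothetical plugin descriptor showing both kinds of dependency declaration.
class TwitterDomainGrailsPlugin {
    def version = '0.1'

    // Hard dependency: the hibernate plugin, version 1.1 or later, must be installed.
    def dependsOn = [hibernate: '1.1 > *']

    // Soft dependency: if hibernate is present, load this plugin after it.
    def loadAfter = ['hibernate']
}
```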

Plugin JAR Dependencies

  • you can put jar files in the lib directory of the project just like applications

  • even better, in grails 1.2 you can define dependencies in grails-app/conf/BuildConfig.groovy

  • grails.project.dependency.resolution = {
      inherits "global" // inherit grails' default dependencies
      dependencies {
        runtime "org.apache.ant:ant:1.7.1"
      }
    }

  • different scopes can be involved--build, runtime, test, provided (needed during grails run-app but not during WAR deployment)

  • this means the jar files are NOT included in the plugin packaging

    • allows the application into which the plugin is installed to resolve the dependencies

  • in a traditional java project, the jar dependency list is flat

  • in grails, you have dependencies of the app, of the framework, and of any plugins involved


Testing

  • done just like a normal application

  • sometimes it's useful to have classes used only for testing and not distributed with the plugin

  • use pluginExcludes in the plugin descriptor to achieve this

  • class MyPlugin {
      def pluginExcludes = [
        // resource paths to exclude from packaging (not captured in these notes)
      ]
    }

Plugin Distribution

  • central repository

    • grails release-plugin

  • custom repository

    • can configure custom repos in BuildConfig.groovy

    • can also set this in settings.groovy in your user home directory

  • Grails separates notion of repos used for discovery vs. distribution (e.g. http for discovery, https for distribution)

  • can specify the repository into which to release the plugin when you run grails release-plugin

  • custom repos use svn for distribution

    • can use a basic file server for discovery

  • plugins don't necessarily have to be open source to be in the central repository

    • need to specify the license

    • would distribute jars instead of source
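
A custom repository configuration, as I understand it from the Grails 1.x documentation, uses separate discovery and distribution keys. The repository name myRepository and both URLs below are placeholders, and ConfigSlurper is used only so the snippet runs on its own; in a project the two lines would go in BuildConfig.groovy or settings.groovy.

```groovy
// Placeholder repository name and URLs; http for discovery, https for distribution.
def settings = '''
grails.plugin.repos.discovery.myRepository = 'http://plugins.example.com/grails'
grails.plugin.repos.distribution.myRepository = 'https://plugins.example.com/grails'
'''

def config = new ConfigSlurper().parse(settings)

assert config.grails.plugin.repos.discovery.myRepository.startsWith('http://')
assert config.grails.plugin.repos.distribution.myRepository.startsWith('https://')
```

The repository name is then what you would pass when choosing where grails release-plugin publishes to.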


Summary

  • developing grails plugins is very simple

  • functional plugins are an easy way to modularize your application

  • in part 2 we'll be looking at how to develop technical plugins


Q&A

  • as part of modularization is there any integration with osgi?

    • not yet, but looking into it--probably next year there will be something more concrete

  • are there particular plugins that you would highlight that follow best practices or as good models to follow?

    • depends on what you want to do

    • for a functional plugin check out weceem cms--complete end-to-end with views, js, etc.

    • for a technical plugin check out Build Test Data (creates mock data for you)

    • quartz plugin is a good example of a plugin that adds a new concept to grails, new command line arguments, etc.

      • exercises the artefact api

    • SpringWS plugin adds notion of an endpoint to Grails

    • Ratable, Commentable, Featurable, and Taggable are great examples of modifying domain classes on app load