
Revisiting Retrieving Documents Between Two Dates From CouchDB

In a previous post I outlined how I was retrieving documents from CouchDB with a start date property less than the current date and an end date property greater than the current date. To summarize, in my CouchDB view I created some date/time strings in JavaScript and only emitted documents in the view that met the date criteria.


My previous post got referenced in the Couchbase newsletter, and I'm really glad it did, because while I came up with what I thought was a clever solution, it was also wrong. (D'OH!)


The issue I didn't consider, which some kind commenters on the previous post pointed out, is that my approach creates side effects because I'm emitting documents in the view based on information that isn't in the document itself. Specifically, since the view uses the system date/time at the moment it processes a document, the documents included in the view are the ones for which the criteria were valid at that moment, not at the moment the view is queried.


What this means is that although views get updated as data within documents changes, the entire view isn't regenerated each time it's queried, so the criteria used to determine whether a document is included are evaluated at a fixed point in time. To put it another way, the system date/time that was current when a document was added to the view essentially becomes hard-coded for that document, which isn't at all what I needed. A document that met the date criteria when it was added stays in the view after its end date passes, and a document scheduled for the future never shows up once its start date arrives, because the view only checks the dates when a document is added or changed.
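To make the problem concrete, here's a rough sketch of it in plain JavaScript rather than in CouchDB itself. The `flawedMap` and `buildIndex` functions, the `dtStart` property name, and the ISO 8601 date strings are all my assumptions for illustration, and I pass the clock in as a parameter so the example is repeatable. The only point is that the date check runs once, when a document is indexed, and the result is stored.

```javascript
// Simulation only: mimics how a view index stores the rows a map function
// emitted at indexing time. Nothing here talks to CouchDB.
// Assumptions: dates are ISO 8601 strings in one consistent format (so they
// sort chronologically), and the start property is called dtStart.

// A map function that depends on the clock, like my original view did.
function flawedMap(doc, now, emit) {
  if (doc.dtStart <= now && doc.dtEnd >= now) {
    emit(doc._id, doc);
  }
}

// Run the map function once per document and keep the emitted rows.
function buildIndex(docs, now) {
  const rows = [];
  docs.forEach(function (doc) {
    flawedMap(doc, now, function (key, value) {
      rows.push({ key: key, value: value });
    });
  });
  return rows;
}

const docs = [
  { _id: "a", dtStart: "2013-01-01T00:00:00Z", dtEnd: "2013-01-31T00:00:00Z" },
  { _id: "b", dtStart: "2013-02-01T00:00:00Z", dtEnd: "2013-02-28T00:00:00Z" }
];

// The index is built in mid-January, so only "a" passes the check.
const storedRows = buildIndex(docs, "2013-01-15T00:00:00Z");

// Reading the view in mid-February just returns the stored rows: the
// documents haven't changed, so the map function is not run again.
const keysSeenInFebruary = storedRows.map(function (row) { return row.key; });
console.log(keysSeenInFebruary); // "a" is expired and "b" is missing
```

In mid-February the right answer would be document "b" only, but the stored rows still reflect the January check.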


There are some great suggestions in the comments on my previous post for including data in the document itself that would allow only valid documents to be pulled right from Couch, and you'll certainly want to check those out if you're dealing with a ton of data. The solution I'm using will not be ideal for massive datasets but since that isn't the situation I'm in with this data, I wanted to share the solution I came up with in case this works for other people.


To describe my documents again, I have documents that need to be displayed on a web page if their start date/time property is less than the current date/time and if their end date/time property is greater than the current date/time.


Since the valid ranges go in opposite directions for those fields, I didn't see a way to do something like have an array key that included both the start and end dates that would allow me to get only the documents I want back from Couch. But what I can do is use a single document property as a key in Couch and get close to what I want, and then I can pare the documents down further in the application code.


In my case the end date is the stronger limiting criterion, since over time there will be a large number of documents with both start and end dates in the past, while documents with end dates >= the current date will be much fewer in number (only a handful in the case of this specific data).


The first step to fix my issue was to rewrite my view to eliminate the date/time check in JavaScript since that's the cause of the unwanted side effect, and emit documents using the end date/time property as the key. I have some other criteria as well (checking type and a couple of other fields to pull valid documents for this particular display), but the basic view is now very simple:



function(doc) {
  // Key each document by its end date/time so the view can be range-queried.
  if (doc.dtEnd) {
    emit(doc.dtEnd, doc);
  }
}


With the end date/time as the key, on the application side I can simply use the current date/time as my start key when I call this view, and that gives me all documents with a valid end date/time (>= current date/time).
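As a rough sketch of what that call can look like, here's one way to build the view URL with the current date/time as the `startkey`. The database name (`mydb`), design document name (`display`), and view name (`by_end_date`) are made up for the example, and I'm assuming `dtEnd` is stored in the same ISO 8601 UTC format that `toISOString()` produces so the keys compare correctly.

```javascript
// Builds the URL for querying the view with the current date/time as the
// start key. Hypothetical names: database "mydb", design document "display",
// view "by_end_date". Assumes dtEnd values are stored in the same ISO 8601
// UTC format that Date.prototype.toISOString() produces.
function buildViewUrl(baseUrl, now) {
  // View keys are JSON, so the string key has to be JSON-encoded (quoted)
  // and then URL-encoded.
  const startKey = encodeURIComponent(JSON.stringify(now.toISOString()));
  return baseUrl + "/mydb/_design/display/_view/by_end_date?startkey=" + startKey;
}

const url = buildViewUrl("http://localhost:5984", new Date());
console.log(url);
```

A GET request to that URL should return only the rows whose key (the end date/time) is at or after the start key.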


At this point I may still have documents that shouldn't be displayed based on the start date/time, however, since people entering data into this application can schedule things for future display (i.e. both start and end date/time are in the future). But again, since I'm not dealing with a huge amount of data once I limit by the end date/time, it's simply a matter of looping over the documents I get back from Couch, checking for a valid start date/time (<= current date/time), and displaying only those documents.
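The application-side check can be sketched along these lines. This isn't my actual application code: the `filterStarted` function and the `dtStart` property name are assumptions for the example, and it relies on the dates being ISO 8601 UTC strings in the format `toISOString()` produces, so that plain string comparison matches chronological order.

```javascript
// Keeps only the documents whose start date/time has already passed.
// Assumptions: the view emits the whole document as the row value, the
// start property is named dtStart, and dates are ISO 8601 UTC strings in
// the format toISOString() produces.
function filterStarted(rows, now) {
  const nowIso = now.toISOString();
  return rows
    .filter(function (row) { return row.value.dtStart <= nowIso; })
    .map(function (row) { return row.value; });
}

// Sample rows shaped like the "rows" array in a CouchDB view response.
const sampleRows = [
  { id: "a", key: "2013-03-01T00:00:00.000Z",
    value: { _id: "a", dtStart: "2013-02-01T00:00:00.000Z", dtEnd: "2013-03-01T00:00:00.000Z" } },
  { id: "b", key: "2013-04-01T00:00:00.000Z",
    value: { _id: "b", dtStart: "2013-03-15T00:00:00.000Z", dtEnd: "2013-04-01T00:00:00.000Z" } }
];

const visible = filterStarted(sampleRows, new Date(Date.UTC(2013, 1, 15)));
console.log(visible.map(function (doc) { return doc._id; })); // only "a" has started
```

Both sample documents have end dates in the future, so both would come back from the view; the filter drops "b" because it hasn't started yet.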


The issue my original view code created makes total sense now, so thanks to the commenters on my previous post who pointed out the fatal flaw in my approach. Nothing like doing something wrong as a means of learning.
