How To Always Sort By Date With Apache Solr In Drupal 7

Recently I had a case where Solr search results had to always be sorted by the end-date of a date field in Drupal 7.

The website had two content types, activities and courses, both with start and end-dates in a date field. On a Solr search page I wanted to do two things: only show the courses and activities that have not ended yet (those in the future) and sort them chronologically by end-date. Eventually, with the help of Kristof De Jaeger and Nick Veenhof, I found out how to do it.

The problem is this: Solr detects the date field and extracts the start and end dates as dm_field_date_start and dm_field_date_end. In the Solr schema browser you can see those fields listed. Unfortunately Solr cannot sort on them because they are indexed as dm_*, which means they are "date multiple" (multi-valued) fields. Solr can only sort on single-valued fields. The solution is to add a "date single" field (ds_end_date) to the index and use that in the sort.

There are a couple of things that need to happen in order for it to work:

  1. Add the end-date field to the Solr index so it can be returned in the results and sorted upon.
  2. Create the sort on the date field when running the query.
  3. Add a filter to exclude the past courses.
  4. Add the end-date field to the query results.
  5. Run the sort.

1. Adding A Date Field To The Solr Index

Courses and activities are nodes, and the Apache Solr module exposes an indexing hook for every entity type. You could implement hook_apachesolr_index_document_build() and write a switch-case statement to see whether the $entity parameter is a node, but it is easier to use hook_apachesolr_index_document_build_node(), which is only called for nodes and saves you the switch.
The date field on the activity and course content types is named "field_date". So when the entity being indexed is an activity or a course, add the end-date value of field_date to the document in Apache Solr's date format.
You are free to choose the name of the field in the Solr index, but make sure it starts with ds_ or Solr will not be able to sort on it. "ds" stands for "date single", which means the field can hold only one value, and that is what makes sorting possible.

function mymodule_apachesolr_index_document_build_node(ApacheSolrDocument $document, $entity, $entity_type, $env_id) {
  if ($entity->type == 'activity' || $entity->type == 'course') {
    $entity_date = field_get_items('node', $entity, 'field_date');
    // Add the end date of the entity_date field as a sort field to the solr index
    if (!empty($entity_date)) {
      $document->addField('ds_end_date', apachesolr_date_iso(strtotime($entity_date[0]['value2'])));
    }
  }
}
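
To make the stored value concrete, here is a minimal standalone sketch of the conversion from a date field item to a Solr date string. It assumes apachesolr_date_iso() formats a Unix timestamp as a UTC ISO 8601 string; example_solr_date_iso() is a hypothetical stand-in for it, and the field item values are made up.

```php
<?php
// Hypothetical stand-in for apachesolr_date_iso(). Assumption: it formats a
// Unix timestamp as a UTC ISO 8601 string, the format Solr date fields expect.
function example_solr_date_iso($timestamp) {
  return gmdate('Y-m-d\TH:i:s\Z', $timestamp);
}

// A made-up field item, shaped like one item returned by field_get_items()
// for a date field that has an end date enabled.
$item = array(
  'value' => '2014-03-01 09:00:00',
  'value2' => '2014-03-05 17:00:00',
);

// Parse the stored string as UTC so the result does not depend on the
// server's default timezone.
$end_timestamp = strtotime($item['value2'] . ' UTC');
$solr_end = example_solr_date_iso($end_timestamp);
// $solr_end is "2014-03-05T17:00:00Z"
```

Whether the stored value really is in UTC depends on how the date field is configured, so treat the ' UTC' suffix as an assumption to verify on your own site.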



When you reindex your content you'll see in the Solr schema browser for your Solr core that the field "ds_end_date" will be added.



2. Making The Date Sort Available


Before we can sort we need to "add" the sort to the query we are executing, making sure it exists, telling it on which field to sort and how. In hook_apachesolr_query_prepare() you will want to have some logic to know when to modify the query as you do not want to modify each and every query that is performed. hook_apachesolr_query_prepare() is run for every Solr search.
In this case I check to see whether we're on the "courses" or "activities" page, and only then I add the sort. It will sort ascending.

function mymodule_apachesolr_query_prepare($query) {
  // Add a sort on date.
  if (arg(0) == 'courses' || arg(0) == 'activities') {
    $query->setAvailableSort('ds_end_date', array(
      'title' => t('End date'),
      'default' => 'asc',
    ));
  }
}



3. Filter Out The Past Courses And Activities


One other thing I wanted to do was hide the past courses and activities, since they no longer serve a purpose. This can be done by adding a filter to the query in hook_apachesolr_query_alter(). We want to filter the date field on a range: anything between today and forever is OK. Date ranges are written as [start_date TO end_date]. The start_date is the apachesolr_date_iso() of the timestamp for TODAY, and the end_date can be replaced with the asterisk wildcard.


Again, we only want to filter if we're on the courses or activities Solr search pages.

function mymodule_apachesolr_query_alter($query) {
  if (arg(0) == 'courses' || arg(0) == 'activities') {
    $query->addFilter('dm_field_date_end', '['. apachesolr_date_iso(strtotime('TODAY')) .' TO *]');
  }
}
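
For reference, this is roughly what the range string passed to addFilter() looks like. The sketch below is standalone and builds the string with gmdate() directly, on the assumption that apachesolr_date_iso() produces the same UTC format; example_solr_future_range() is a hypothetical helper, not part of the module.

```php
<?php
// Hypothetical helper: build an open-ended Solr range "[start TO *]"
// from a Unix timestamp, using a UTC ISO 8601 start date.
function example_solr_future_range($from_timestamp) {
  $start = gmdate('Y-m-d\TH:i:s\Z', $from_timestamp);
  return '[' . $start . ' TO *]';
}

// Midnight UTC on a fixed example date, so the result is predictable.
$midnight = gmmktime(0, 0, 0, 3, 1, 2014);
$range = example_solr_future_range($midnight);
// $range is "[2014-03-01T00:00:00Z TO *]"
```

Note that strtotime('TODAY') resolves to midnight in the server's default timezone, while the formatted string is in UTC, so the cut-off may shift by a few hours depending on your timezone settings.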



4. Add The End-Date To The Query Results


In step 1 we added the ds_end_date to the Solr index for courses and activities... but unless we tell the query to add this field to the results it will not be there, and we will not be able to sort on it. So again, in hook_apachesolr_query_alter() we add the line and the function will look like this:

function mymodule_apachesolr_query_alter($query) {
  if (arg(0) == 'courses' || arg(0) == 'activities') {
    $query->addFilter('dm_field_date_end', '['. apachesolr_date_iso(strtotime('TODAY')) .' TO *]');
    $query->addParam('fl', 'ds_end_date');
  }
}


5. Actually Run The Date Sort


Finally we need to run the sort on the query after having added the field to the Solr search index, and made the sort available on the query.

function mymodule_apachesolr_query_alter($query) {
  if (arg(0) == 'courses' || arg(0) == 'activities') {
    $query->addFilter('dm_field_date_end', '['. apachesolr_date_iso(strtotime('TODAY')) .' TO *]');
    $query->addParam('fl', 'ds_end_date');
    $query->setSolrsort('ds_end_date', 'asc');
  }
}
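
Both hooks repeat the same path check. One possible tidy-up, sketched here as a pure function so it can be tried outside Drupal, is to move that check into a small helper. example_is_dated_listing() is a hypothetical name; inside the hooks you would pass it arg(0).

```php
<?php
// Hypothetical helper: decide whether the date filter and sort should apply,
// based on the first path component (what arg(0) returns in Drupal 7).
function example_is_dated_listing($first_path_arg) {
  return in_array($first_path_arg, array('courses', 'activities'), TRUE);
}

$on_courses = example_is_dated_listing('courses');
$on_search = example_is_dated_listing('search');
// $on_courses is TRUE, $on_search is FALSE
```

This keeps the list of affected pages in one place if more search pages need the same treatment later.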


And that's it.