Boost Drupal with Solr
Mon, Jan 11, 2010 by JF
The last decade has seen our consumption of information skyrocket, while our capacity to manually organize our digital life has simply crumbled.
Think of it: ten years ago, we would create folders to organize our files, our music, and our mail. Nowadays, we simply store our information in generic folders, and as long as each piece of information is properly tagged, we let search engines organize our life. One could argue that this was predictable, because this is how the brain works. But there have been several formal studies on the subject, so this evolution wasn’t completely blind.
One of the key ingredients of Drupal’s success is the concept of generic nodes and attributes: we don’t care how or where they are stored. What matters is the capacity to retrieve information using parameters, through the Search and Views modules.
Drupal’s core search module can be replaced with Apache Solr, a web service built on the Lucene engine, which also powers Alfresco. On web sites where Alfresco is used to store documents, this has the advantage of bringing one uniform search syntax to every search query. Furthermore, since Solr is called over HTTP through a REST-style interface, you can install it on a dedicated server, which means that your website’s performance won’t degrade during periods when search activity is more intense.
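Because Solr is just an HTTP service, a search boils down to requesting a URL. Here is a minimal sketch of how such a URL could be built; the host name, port and /solr path are placeholders I made up, not values from this setup, and the keywords are assumed to be URL-encoded already.

```shell
#!/bin/sh
# Build the URL for a Solr keyword search.
# SOLR_BASE is a hypothetical address: replace it with your own host, port and path.
SOLR_BASE="${SOLR_BASE:-http://solr.example.com:8080/solr}"

solr_select_url() {
    # $1: search keywords, already URL-encoded (e.g. "drupal+search")
    # $2: maximum number of results (defaults to 10)
    printf '%s/select?q=%s&rows=%s&wt=json\n' "$SOLR_BASE" "$1" "${2:-10}"
}

# Print the URL for a search on "drupal", limited to 5 results.
solr_select_url "drupal" 5
```

Against a running server, something like curl -s "$(solr_select_url drupal 5)" should fetch the results as JSON from any machine that can reach the Solr host.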
Here are some other features of the Lucene engine:
1. Scored results
2. Highlighter (to show words found in context)
3. Fuzzy search
4. Query spellchecker
Solr turns Lucene into a web service and provides:
1. Faceting of query results
2. “More like this” plugin
3. “Did you mean” plugin
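Several of these features are switched on with plain query parameters. The sketch below composes a query string that combines a fuzzy search, highlighting and faceting. The field names "body" and "type" are hypothetical; use fields that exist in your own schema.xml.

```shell
#!/bin/sh
# Compose Solr query parameters for some of the features listed above.
# The field names used in the example call ("body", "type") are hypothetical.

fuzzy_query() {
    # Lucene fuzzy search: a trailing "~" matches terms with a similar spelling.
    printf 'q=%s~\n' "$1"
}

highlight_params() {
    # hl=true turns on the highlighter; hl.fl names the field to excerpt.
    printf 'hl=true&hl.fl=%s\n' "$1"
}

facet_params() {
    # facet=true turns on faceting; facet.field names the field to count by.
    printf 'facet=true&facet.field=%s\n' "$1"
}

# Fuzzy match on "drupl", highlighted in "body", faceted by "type".
printf '%s&%s&%s\n' "$(fuzzy_query drupl)" "$(highlight_params body)" "$(facet_params type)"
```

The resulting string can be appended after the "?" of a Solr select URL.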
Installation
To install Solr, you must first set up Tomcat. If you have Alfresco already installed, you can run Solr from the same Tomcat instance.
Once Tomcat is installed, you need to download the binary distribution of Solr and copy the example Solr home (the “example/solr” folder) to a location of your choice. On my system, my package manager installs Tomcat into /opt/local/share/java/tomcat6, therefore I chose to install Solr in a subdirectory of /opt/local/share/java/apache-solr/, next to the /opt/local/share/java/apache-ant directory:
tar -xzf apache-solr-1.4.0.tgz
mkdir -p /opt/local/share/java/apache-solr
cp -r apache-solr-1.4.0/example/solr /opt/local/share/java/apache-solr/example.dev
mkdir -p /opt/local/share/java/apache-solr/example.dev/bin
cp apache-solr-1.4.0/example/webapps/solr.war /opt/local/share/java/apache-solr/example.dev/bin/apache-solr-1.4.0.war
Now install the Drupal ApacheSolr module and copy schema.xml and solrconfig.xml to the conf folder of your Solr instance.
cp apachesolr/schema.xml /opt/local/share/java/apache-solr/example.dev/conf
cp apachesolr/solrconfig.xml /opt/local/share/java/apache-solr/example.dev/conf
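Before going further, it can help to confirm that both configuration files actually landed in the conf folder. This is only a sanity-check sketch; check_solr_home is a helper name I made up, not something shipped with Solr or the Drupal module.

```shell
#!/bin/sh
# Report which expected configuration files are missing from a Solr home.
check_solr_home() {
    # $1: path to the Solr home directory
    missing=0
    for f in conf/schema.xml conf/solrconfig.xml; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            missing=1
        fi
    done
    # Returns 0 when every file is present, 1 otherwise.
    return "$missing"
}
```

With the layout used here, you would call it as check_solr_home /opt/local/share/java/apache-solr/example.dev and look for any "missing:" lines.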
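Then create a Tomcat context file named apache-solr-example.dev.xml inside /opt/local/share/java/apache-solr/example.dev. It tells Tomcat where to find the Solr web application and its home directory: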
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="/opt/local/share/java/apache-solr/example.dev/bin/apache-solr-1.4.0.war" debug="0" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/opt/local/share/java/apache-solr/example.dev" override="true"/>
</Context>
Then symlink the configuration to your Tomcat instance:
ln -s /opt/local/share/java/apache-solr/example.dev/apache-solr-example.dev.xml /opt/local/share/java/tomcat6/conf/Catalina/localhost/apache-solr-example.dev.xml
Finally, restart Tomcat and configure Drupal ApacheSolr to call the Solr path /apache-solr-example.dev and you are good to go!
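To check that Solr answers before pointing Drupal at it, you can request its ping handler. The sketch below only builds the URL. It assumes Tomcat listens on its default port 8080 on localhost and that the ping handler is enabled in solrconfig.xml; adjust both to your setup.

```shell
#!/bin/sh
# Build the URL of Solr's ping handler for a given Tomcat context path.
solr_ping_url() {
    # $1: Tomcat base URL (assumed here: http://localhost:8080)
    # $2: context path, which follows the name of the context file
    printf '%s%s/admin/ping\n' "$1" "$2"
}

# Print the ping URL for the instance configured above.
solr_ping_url "http://localhost:8080" "/apache-solr-example.dev"
```

Running curl -sf on that URL should succeed once Tomcat has deployed the web application.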
For more information:
Solr: http://lucene.apache.org/solr/features.html
Lucene: http://lucene.apache.org/
Drupal module: http://drupal.org/project/apachesolr
Ed Dowding posted on February 14, 2010 2:44 pm
Weird - I've just been doing something on 2 out of 3 of your most recent topics! Solr & Proximity. Now I wonder if you'd know how to integrate the two, so there could be Solr filters AND proximity filtering?
admin posted on January 12, 2010 8:11 pm
Hi Jakes
My take is the economics of providing hosting will probably prove not to be that compelling. Development and support are two different tracks and require totally different resources.
On the other hand, you can get Tomcat/PHP/MySQL hosting for under $10 a month, and quick introductions like this one are meant to raise awareness of Solr and break the perceived barrier of complexity. Solr is extremely important for the Drupal platform, but it is viewed by most developers as an added cost, increased complexity, and a risk factor.
What might win more developers is a configuration script (with a good tutorial), just like cloudtools did for EC2.
Thoughts? Comments?
JF
Jakes posted on January 11, 2010 2:17 pm
Hi JF,
Thanks for sharing this information with the world. I am sure it is very helpful when setting up your own server. I have 10 sites, with the largest having 25000 visitors and about 21000 nodes.
I was searching online for a hosted Solr service and none is available, except for Acquia (which is pricey).
I am now thinking of setting up my own Solr server and am aiming to offer it as a hosted solution for other Drupal developers who are either too small or too inexperienced to run their own server.
What is your feeling here?
Jakes