Introduction
Solr is a search platform built on Apache Lucene. It is written in Java and uses the Lucene library for indexing and searching. It is accessed over HTTP, with requests and responses in formats such as XML and JSON. This is the feature list from the project website:
Advanced Full-Text Search Capabilities
Optimized for High Volume Web Traffic
Standards Based Open Interfaces – XML, JSON and HTTP
Comprehensive HTML Administration Interfaces
Server statistics exposed over JMX for monitoring
Linearly scalable, auto index replication, auto failover and recovery
Near Real-time indexing
Flexible and Adaptable with XML configuration
Extensible Plugin Architecture
In this article, I will show you how to install Solr on Ubuntu using two different methods: a simple one and a more advanced one. I recommend the second method because it installs a newer version of Solr on every Ubuntu version, including 14.04, the most recent release at the time of writing.
Installing Solr using apt-get (easy way)
If you want to install Solr the easy way, use this section of the article. Solr doesn’t run on its own; it needs a Java servlet container such as Tomcat or Jetty. In this section, we’ll use Tomcat, although Jetty is just as easy. First, we should install the Java JDK. If you want to install a custom version, please see this article. If you want a simple installation, execute the following commands:
sudo apt-get -y install openjdk-7-jdk
sudo mkdir -p /usr/java
sudo ln -s /usr/lib/jvm/java-7-openjdk-amd64 /usr/java/default
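Before moving on, it can help to confirm that the symlink really points at a working JDK. The following is only a small sketch: the function name check_java_home is hypothetical, and the only path it assumes is the /usr/java/default link created above.

```shell
#!/bin/sh
# check_java_home DIR
# Succeeds if DIR contains an executable bin/java, which is what the
# /usr/java/default symlink is expected to provide.
# (check_java_home is a hypothetical helper name, not part of any package.)
check_java_home() {
    dir=$1
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java is executable"
    else
        echo "MISSING: no executable java under $dir/bin" >&2
        return 1
    fi
}

# Typical use on the server:
#   check_java_home /usr/java/default && /usr/java/default/bin/java -version
```

If the check fails, the most likely cause is that the JDK directory under /usr/lib/jvm has a different name on your architecture.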
Ubuntu provides three Solr packages by default: solr-common, the package that contains the actual Solr code; solr-tomcat, Solr integrated with Tomcat; and solr-jetty, which is just like solr-tomcat but with the Jetty web server. In this section, we will install solr-tomcat, so execute the following command:
sudo apt-get -y install solr-tomcat
Your Solr instance should now be available at http://YOUR_IP:8080/solr. If you want to move straight on to configuring Solr, skip the next section on installing manually.
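If you prefer to check from the command line instead of a browser, something along these lines should work. It is a sketch, not part of the Solr packages: the helper names solr_url and solr_is_up are made up for this example, and using localhost assumes you run it on the server itself.

```shell
#!/bin/sh
# solr_url HOST PORT
# Prints the base URL of a Solr instance (port 8080 for the Tomcat package).
solr_url() {
    printf 'http://%s:%s/solr\n' "$1" "$2"
}

# solr_is_up URL
# Succeeds if URL answers with a successful HTTP response.
# Uses wget with a single attempt and a short timeout.
solr_is_up() {
    wget -q --tries=1 --timeout=5 -O /dev/null "$1"
}

# Typical use on the server:
#   solr_is_up "$(solr_url localhost 8080)/" && echo "Solr is answering"
```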
Installing Solr Manually
Installing Solr manually takes a little more time. For this section, we will be using Jetty instead of Tomcat. First, we should install the Java JDK. If you want to install a custom version, please see this article. If you want a simple installation, execute the following commands:
sudo apt-get -y install openjdk-7-jdk
sudo mkdir -p /usr/java
sudo ln -s /usr/lib/jvm/java-7-openjdk-amd64 /usr/java/default
We can now start the real installation of Solr. First, download the Solr archive, unpack it, copy the bundled example to /opt/solr, and start it once in the foreground:
cd /opt
sudo wget http://archive.apache.org/dist/lucene/solr/4.7.2/solr-4.7.2.tgz
sudo tar -xvf solr-4.7.2.tgz
sudo cp -R solr-4.7.2/example /opt/solr
cd /opt/solr
sudo java -jar start.jar
Check if it works by visiting http://YOUR_IP:8983/solr. Once it works, go back to your SSH session and stop Solr with Ctrl+C. Then open the /etc/default/jetty file (sudo nano /etc/default/jetty) and paste this into it:
NO_START=0 # Start on boot
JAVA_OPTIONS="-Dsolr.solr.home=/opt/solr/solr $JAVA_OPTIONS"
JAVA_HOME=/usr/java/default
JETTY_HOME=/opt/solr
JETTY_USER=solr
JETTY_LOGS=/opt/solr/logs
Save it and open the file /opt/solr/etc/jetty-logging.xml (sudo nano /opt/solr/etc/jetty-logging.xml) and paste this into it:
<New id="ServerLog" class="java.io.PrintStream">
  <Arg>
    <New class="org.mortbay.util.RolloverFileStream">
      <Arg><SystemProperty name="jetty.logs" default="."/>/yyyy_mm_dd.stderrout.log</Arg>
      <Arg type="boolean">false</Arg>
      <Arg type="int">90</Arg>
      <Arg><Call class="java.util.TimeZone" name="getTimeZone"><Arg>GMT</Arg></Call></Arg>
      <Get id="ServerLogName" name="datedFilename"/>
    </New>
  </Arg>
</New>
<Call class="org.mortbay.log.Log" name="info"><Arg>Redirecting stderr/stdout to <Ref id="ServerLogName"/></Arg></Call>
<Call class="java.lang.System" name="setErr"><Arg><Ref id="ServerLog"/></Arg></Call>
<Call class="java.lang.System" name="setOut"><Arg><Ref id="ServerLog"/></Arg></Call>
</Configure>
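Before going further, it may be worth checking that /etc/default/jetty defines every variable from the listing above, since a typo there tends to surface only when the service fails to start. The following is a rough sketch; check_jetty_defaults is a hypothetical helper, and it only looks for an assignment at the start of a line without validating the values.

```shell
#!/bin/sh
# check_jetty_defaults FILE
# Reports each expected variable that has no "NAME=" assignment in FILE.
# Returns 0 if all are present, 1 otherwise.
# (Hypothetical helper; the variable list mirrors the file shown above.)
check_jetty_defaults() {
    file=$1
    missing=0
    for var in NO_START JAVA_OPTIONS JAVA_HOME JETTY_HOME JETTY_USER JETTY_LOGS; do
        if ! grep -q "^${var}=" "$file"; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use on the server:
#   check_jetty_defaults /etc/default/jetty && echo "all variables set"
```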
Then, create the solr user and give it ownership of /opt/solr:
sudo useradd -d /opt/solr -s /bin/false solr
sudo chown solr:solr -R /opt/solr
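To confirm that the ownership change reached every file, you can ask find for anything that does not belong to the solr user. This is a minimal sketch; list_not_owned is a hypothetical helper name, and empty output means everything is owned as expected.

```shell
#!/bin/sh
# list_not_owned USER DIR
# Prints every path under DIR that is NOT owned by USER.
# No output means the whole tree belongs to USER.
# (Hypothetical helper built on the standard find -user test.)
list_not_owned() {
    find "$2" ! -user "$1" -print
}

# Typical use on the server:
#   id solr && list_not_owned solr /opt/solr
```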
After that, download the Jetty init script, make it executable, and register it to start at boot if that hasn’t been done already:
sudo wget -O /etc/init.d/jetty http://dev.eclipse.org/svnroot/rt/org.eclipse.jetty/jetty/trunk/jetty-distribution/src/main/resources/bin/jetty.sh
sudo chmod a+x /etc/init.d/jetty
sudo update-rc.d jetty defaults
Finally, start Jetty/Solr:
sudo /etc/init.d/jetty start
You can now access your installation just as before at http://YOUR_IP:8983/solr.
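Jetty can take a few seconds to deploy Solr after the init script returns, so the first request may fail even though nothing is wrong. A small retry loop avoids false alarms. This is a generic sketch rather than anything shipped with Jetty or Solr; the retry function and the attempt counts in the usage line are assumptions.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between attempts. Returns 0 on the first success, 1 otherwise.
# (Hypothetical helper for illustration.)
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Only sleep if another attempt is coming.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Typical use on the server: try for up to a minute.
#   retry 30 2 wget -q --tries=1 -O /dev/null http://localhost:8983/solr/
```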
Configuring a schema.xml for Solr
First, rename the /opt/solr/solr/collection1 directory to something meaningful, such as apples (use whatever name you’d like). If you installed Solr using apt-get, skip the rename and run cd /usr/share/solr instead. For the manual installation, run:
cd /opt/solr/solr
mv collection1 apples
cd apples
Also, if you installed Solr manually, open the file core.properties (nano core.properties) and change the name value to the new directory name.
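If you would rather script this change than edit the file by hand, the idea can be sketched as follows. The set_core_name function is hypothetical, it assumes the file uses a plain name=... line, and it assumes the new name contains only letters, digits, underscores, or hyphens, because the name is substituted directly into a sed expression.

```shell
#!/bin/sh
# set_core_name FILE NAME
# Rewrites the "name=" line in a core.properties file, or appends one
# if the file has none. Other lines are left untouched.
# (Hypothetical helper; NAME must not contain "/" or "&".)
set_core_name() {
    file=$1
    name=$2
    tmp=$(mktemp) || return 1
    if grep -q '^name=' "$file"; then
        sed "s/^name=.*/name=$name/" "$file" > "$tmp"
    else
        cat "$file" > "$tmp"
        echo "name=$name" >> "$tmp"
    fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Typical use from inside the renamed core directory:
#   set_core_name core.properties apples
```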
Then, remove the data directory and edit schema.xml:
rm -R data
nano conf/schema.xml
Paste your own schema.xml in here. There is a very advanced schema.xml in the Solr Repository. You can probably find a lot more of them on the internet, but I won’t go into depth about that. Once the schema is saved, restart Tomcat or Jetty, depending on how you installed Solr:
For the simple installation:
sudo service tomcat6 restart
For the advanced installation:
sudo /etc/init.d/jetty restart
When you now visit your Solr instance, you should see the dashboard with your collection listed.
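The same check can be made without a browser by asking Solr's CoreAdmin handler for the status of the core. The sketch below assumes the manual installation on port 8983 and a core named apples; core_status_url is a hypothetical helper that only builds the request URL.

```shell
#!/bin/sh
# core_status_url BASE [CORE]
# Prints the CoreAdmin STATUS URL for a Solr base URL, optionally limited
# to one core. The response is requested as JSON.
# (Hypothetical helper; BASE is something like http://localhost:8983/solr.)
core_status_url() {
    url="$1/admin/cores?action=STATUS&wt=json"
    if [ -n "${2:-}" ]; then
        url="$url&core=$2"
    fi
    printf '%s\n' "$url"
}

# Typical use on the server:
#   wget -q -O - "$(core_status_url http://localhost:8983/solr apples)"
```

If the core loaded correctly, the response should contain an entry for it under the status key; a core that failed to load, for example because of a schema error, is usually reported there as well.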
Conclusion
You have now successfully installed Solr and can start using it for your own site! If you don’t know how to make a schema.xml, find a tutorial on how to do that. Then, find a library for your programming language that connects with Solr.