Your Django site can have a useful search tool in no time

Igor Támara

Senior Software Engineer

When you have a Django site and you don't want to torture your database with searches, but you still want to offer your users a nice search tool, take a look at haystack. It is a configurable search layer where you can pick a backend such as xapian, solr or whoosh, among others. If you have plenty of hardware, go for solr; if you don't expect many visits, go for whoosh; but you will surely find xapian really cool. We prefer xapian too.

Wait: you already have a relational database, and now you are adding a document database (so at your next geek party you will be able to speak fluently about it). Note that this second database is specialized for searches. The process goes like this: you add or update a record in your original database, the system adds or updates the matching record in the document database, and the "index" is updated. Searches go only to the document database, which lets your relational database do its job in your deployment without being overwhelmed by user search requests.
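To picture that flow, here is a toy, in-memory sketch in plain Python. It is not how haystack or xapian work internally; the class TinyIndex and its methods are hypothetical names made up for this illustration. It only shows the idea: every add or update of a record refreshes the index, and searches are answered from the index alone.

```python
from collections import defaultdict


class TinyIndex(object):
    """Toy stand-in for the document database: maps words to record ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> ids of records containing it
        self.words_by_id = {}             # id -> words currently indexed for it

    def update(self, record_id, text):
        """Index a new record, or replace what was indexed for it before."""
        self.remove(record_id)
        words = set(text.lower().split())
        self.words_by_id[record_id] = words
        for word in words:
            self.postings[word].add(record_id)

    def remove(self, record_id):
        """Forget everything indexed for this record (no-op if unknown)."""
        for word in self.words_by_id.pop(record_id, set()):
            self.postings[word].discard(record_id)

    def search(self, query):
        """Return the ids of the records that contain every word of the query."""
        words = query.lower().split()
        if not words:
            return set()
        result = set(self.postings.get(words[0], set()))
        for word in words[1:]:
            result &= self.postings.get(word, set())
        return result


index = TinyIndex()
index.update(1, "Land restitution request")   # record 1 saved
index.update(2, "Request for information")    # record 2 saved
index.update(1, "Land claim")                 # record 1 edited, index refreshed
matches = index.search("request")             # answered without touching the records
```

After the edit, record 1 no longer matches "request", because its old words were dropped from the index when it was updated.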

We assume you are working with a virtualenv; if you are not using one yet, go for it first :). If you are using Debian or Ubuntu you can follow this guide as is. Other flavors or OSes should work, but you will need to adapt the steps yourself, or switch to the OSes we use here. We also assume you have Python 2.7; if not, the copy/paste technique won't work directly.

Getting the tools

On your host, install xapian:

apt-get install python-xapian

Assuming your virtualenv is called tr, the following script should work once you tweak the variables to match your installation:

https://gist.github.com/179399...

Interfacing Django with haystack

Add the following lines to your requirements.txt. If you are wondering what requirements.txt is about, take a look at virtualenv.

xapian-haystack==1.1.5beta
django-haystack==1.2.6
translitcodec==0.2

As usual, run:

pip install -r requirements.txt

In your settings.py, add 'haystack' to INSTALLED_APPS.
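As a rough sketch, the relevant fragment of settings.py could look like the one below. Only the 'haystack' entry is required by this guide; the django.contrib entries are just typical defaults, and 'informationgathering' is assumed to be the app whose models get indexed.

```python
# settings.py (fragment)
# Note: the HAYSTACK_* settings go above this tuple.
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.admin',
    'haystack',               # the search framework
    'informationgathering',   # assumed: the app whose models get indexed
)
```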

Then configure the following variables in your settings.py. Make sure these variables are defined BEFORE the apps configuration:

HAYSTACK_SITECONF = 'rt.search_sites'
HAYSTACK_SEARCH_ENGINE = 'xapian'
HAYSTACK_XAPIAN_PATH = '/var/lib/rtsearch/rtindex'

Here rt is your project, which contains a search_sites.py file (more on this later), and the directory

/var/lib/rtsearch/rtindex

exists with the proper permissions (www-data:www-data, or whichever user runs the web server with the virtualenv).
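If you want to double-check those permissions, a small script along these lines can help. It is only a sketch: the helper index_path_problems is a hypothetical name, and the check is meaningful only when the script runs as the same user as the web server.

```python
import os


def index_path_problems(path):
    """Return a list of human-readable problems with the index directory."""
    if not os.path.isdir(path):
        return ['%s does not exist or is not a directory' % path]
    problems = []
    # Creating index files needs write access, and entering the
    # directory needs execute (search) access.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append('%s is not writable by the current user' % path)
    return problems


if __name__ == '__main__':
    # Same path as HAYSTACK_XAPIAN_PATH in settings.py.
    for problem in index_path_problems('/var/lib/rtsearch/rtindex'):
        print(problem)
```

An empty list means the directory exists and the current user can write to it.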

Powering your django site with haystack

Do you recall we talked about a document database? In haystack there is a file, search_indexes.py, that looks a lot like models.py. There we specify which elements of the relational database can be searched, so haystack knows what to compare against the search terms. In other words, the index is generated from this file, and when someone searches, the document database is hit, not your relational database. There is also a template file where we can drop the information that will be used for the search, and of course you can customize your results page and the other usual aspects of a search tool.

You will take a look at four pieces: the autodiscover, search_indexes.py, the template and the configured urls. As a bonus we will show you an approach to searching that works whether or not your users type accents.

Still reading? Let's go for the index. Just for motivation, here is a search_indexes.py:

https://gist.github.com/179385...

There are three ways to update the index: in real time, in batch and through a queue. For this sample we used real time; be warned that you should use a queue on a production site. This file resembles models.py: we define the fields used to index our searches. The easiest one is assignednumber, which maps directly to a field of the model RR. To make your life easier we also included people, a ManyToManyField for which we defined a method called indexsearch() that is used for indexing. On the other hand, territory shows a possible nullable ForeignKey that we index by its name field. Finally, the field text has the attribute use_template set to True. It is of special interest because in a template we can pull in many other fields and aspects, using the template language to iterate and follow relations, which makes our index a little more complete.

The following code shows a sample template, which should be located at templates/search/indexes/informationgathering/restitutionrequest_text.txt


https://gist.github.com/179420...

Finally, when you define your index, make sure you register it for the model, à la admin interface.
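Looking back at the four kinds of field described above, the plain-Python sketch below shows the values that could end up in the index for one record. It is not haystack code and does not use its API: the classes are hypothetical stand-ins for the RR model and its relations, build_document is a made-up helper, and only the names assignednumber, people, territory, text and indexsearch() come from the description.

```python
class Person(object):
    """Hypothetical stand-in for the model behind the people relation."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def indexsearch(self):
        # What we want the search engine to see for this person.
        return '%s %s' % (self.first_name, self.last_name)


class Territory(object):
    """Hypothetical stand-in for the model behind the nullable ForeignKey."""

    def __init__(self, name):
        self.name = name


class RR(object):
    """Hypothetical stand-in for the indexed model."""

    def __init__(self, assignednumber, people, territory=None):
        self.assignednumber = assignednumber
        self.people = people        # a list here, a ManyToManyField in Django
        self.territory = territory  # may be None, like a nullable ForeignKey


def build_document(rr):
    """Collect the values to index for one record."""
    doc = {
        'assignednumber': rr.assignednumber,
        'people': [person.indexsearch() for person in rr.people],
        'territory': rr.territory.name if rr.territory is not None else None,
    }
    # text plays the role of the templated field: one blob built from the rest.
    parts = [doc['assignednumber']] + doc['people']
    if doc['territory']:
        parts.append(doc['territory'])
    doc['text'] = '\n'.join(parts)
    return doc


request = RR('2012-001', [Person('Ana', 'Ruiz')])  # made-up data, no territory
document = build_document(request)
```

The point to notice is the territory line: a nullable relation has to be guarded, otherwise indexing a record without a territory would fail.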

You can activate your url with something like

https://gist.github.com/179423...

Rebuilding the index

If for any reason you need to rebuild your index, your manage.py now has a rebuild_index command, so when necessary you can run

./manage.py rebuild_index

Using accents

Because we (Axiacore) are in a Spanish-speaking country, we need to handle accents, so we offer a urls complement, a form and a template.
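The underlying idea is to fold accented characters into their plain equivalents, so that "Támara" and "Tamara" compare equal. We rely on translitcodec for that; purely as a dependency-free illustration of the same idea, the sketch below swaps in Python's standard unicodedata module instead. The function name strip_accents is hypothetical, and the result may differ from what translitcodec produces for some characters.

```python
# -*- coding: utf-8 -*-
import unicodedata


def strip_accents(text):
    """Return the unicode text with its combining accents removed."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    return u''.join(ch for ch in decomposed if not unicodedata.combining(ch))


folded = strip_accents(u'Támara')  # u'Tamara'
```

Note that this folding also turns ñ into n, which is convenient for matching but loses a real distinction in Spanish.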

Here is a sample form using translitcodec (yes, you saw it earlier in requirements.txt):

https://gist.github.com/179425...

You can customize how the results are shown using your own template, which should be located at templates/search/search.html

https://gist.github.com/179427...

Finally, your urls.py could look something like

More to be done

Yes, you can plug in your own backend, run partial searches and use the advanced features that your favourite document database offers.

Translation by Andrés Cárdenas. Spanish version.

