Add search

Review Request #200 — Created Jan. 2, 2008 and submitted


Review Board SVN (deprecated)


This is an initial stab at full-text and field-specific search.  It uses PyLucene/JCC,
which calls into the Apache Lucene implementation through JNI.  Indexing is
done via an "index" command, which can do either a full or an incremental index
of the database.

There are still a few things left before this is perfect.  I need to figure out:
1. Review indexing.  I'm blocked on the fact that I can't import reviews now due
   to that bug that's being discussed on the list.
2. Filename indexing needs a better tokenizer.  I'll probably have to tokenize it
3. It doesn't index dates yet.
4. This needs to be documented a lot -- especially how to build PyLucene, which is
   a total pain.
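
For item 2, here is a rough sketch of what a filename tokenizer might look like. It is not the tokenizer used in this change. The helper name `tokenize_filename` and the splitting rules are assumptions made for illustration. The idea is to keep the full path as one token, then add each path component and its alphanumeric pieces, so a search for "post-review" or just "review" can both match.

```python
import re


def tokenize_filename(path):
    """Split a repository path into searchable tokens (hypothetical helper)."""
    tokens = [path]  # keep the full path so exact-path searches still match
    for part in re.split(r"[/\\]+", path):
        if not part:
            continue  # skip the empty string produced by a leading slash
        tokens.append(part)
        # Break a component such as "post-review" into "post" and "review".
        pieces = re.split(r"[^A-Za-z0-9]+", part)
        tokens.extend(p for p in pieces if p and p != part)
    return tokens


if __name__ == "__main__":
    print(tokenize_filename("/trunk/reviewboard/contrib/tools/post-review"))
```

The resulting tokens could be joined with spaces and stored in a field of their own, assuming the analyzer used for that field splits on whitespace.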
Used it a lot!  It's sweet!
  2. /trunk/reviewboard/contrib/tools/post-review (Diff revision 1)
    Good fix, though it's a separate change :) I encountered this too. I think we should probably just be setting reviewboard:url to a path without the http://, but we should protect against it here anyway.
  3. /trunk/reviewboard/reviews/ (Diff revision 1)
    Is this part of this change? Seems unrelated and unfinished.
    1. Nope!  I've removed it.
  4. Can you do str( ?
  5. str(
  6. I don't know Lucene. What's the difference between doing this and adding the blob at the end?
    1. The blob at the end is the default field.  This line specifically lets people search the summary field.
  7. Not at all important, but it could be faster to request.bugs_closed.replace(",", " ")
    1. I'm gonna leave it as-is, because it's two lines otherwise.
  8. What happens if get_full_name doesn't return anything but an empty string? Will it get mad if there's nothing there?
    1. Nope.  This string will get tokenized later by lucene.
  9. /trunk/reviewboard/reviews/ (Diff revision 1)
    This function is so generic (aside from the "ReviewRequest") that I think this could be in its own app somewhere, maybe even in djblets. The search function could take an initial queryset so that it knows the object type, and then filter off that.
    I'd sure love to use this in other projects :)
    1. I'll leave it here for now, and figure out how to make this easy to integrate into other projects later.
  10. /trunk/reviewboard/reviews/ (Diff revision 1)
    result_ids = [int(lucene.Hit.cast_(hit).getDocument().get('id'))
                  for hit in hits]
    ? :)
  11. Put a space before the /. Some browsers get upset otherwise.
    1. Done.  I wouldn't mind someone giving this some CSS love ;)
  12. Not sure we should assume there's a full name. In one of the other templates, we call something that gives us the proper display name (full name if possible, falling back on username).
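
The fallback described in comment 12 could look something like the sketch below. The `display_name` helper and the `FakeUser` stand-in are hypothetical names used only so the example runs on its own. In the real code the object would be a Django `User`, whose `get_full_name()` returns an empty string when no first or last name is set.

```python
class FakeUser:
    """Minimal stand-in for a Django User (assumption for this sketch)."""

    def __init__(self, username, first_name="", last_name=""):
        self.username = username
        self.first_name = first_name
        self.last_name = last_name

    def get_full_name(self):
        # Mirrors Django's behaviour: first and last name joined, stripped.
        return ("%s %s" % (self.first_name, self.last_name)).strip()


def display_name(user):
    """Return the full name if there is one, otherwise the username."""
    return user.get_full_name().strip() or user.username


if __name__ == "__main__":
    print(display_name(FakeUser("jdoe", "Jane", "Doe")))  # full name available
    print(display_name(FakeUser("jdoe")))                 # falls back to username
```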
  1. Looks good in general. One small thing (also, the post-review addition is still in there).
    1. I've just committed the post-review change separately.
  2. I'm going to be adding djblets_utils soon and moving things into that. This looks like a good candidate.
    1. I'm actually thinking of making a context processor which would let people do {% if settings.key %} without having to include it in every view.
  3. Seems this would be better as getattr(settings, setting, False)
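
A minimal sketch of the two ideas in this thread: the `getattr(settings, setting, False)` lookup, and a context processor that exposes settings to templates. The `settings` object here is a `SimpleNamespace` standing in for `django.conf.settings`, and `ENABLE_SEARCH`, `setting_enabled`, and `settings_context` are hypothetical names chosen for the example.

```python
from types import SimpleNamespace

# Stand-in for django.conf.settings so the example runs without Django.
settings = SimpleNamespace(ENABLE_SEARCH=True)


def setting_enabled(setting):
    """Return the setting's value, or False if it is not defined at all."""
    return getattr(settings, setting, False)


def settings_context(request):
    """Context processor: makes `settings` available in every template."""
    return {"settings": settings}


if __name__ == "__main__":
    print(setting_enabled("ENABLE_SEARCH"))    # defined, so its value is used
    print(setting_enabled("NO_SUCH_SETTING"))  # undefined, so False
```

The `getattr` default avoids an `AttributeError` when the setting is missing, and the context processor would need to be registered in the project's settings before templates could use `{% if settings.key %}`.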