Sunday, February 17, 2008

How to highlight a phrase in results from Lucene.Net

You probably came here from a Google search or because this page was mentioned on Stack Overflow. However, the post below is from 2008 and I am pretty sure it no longer works: the last time I checked, there was no Lucene.Net.Highlight in Lucene.Net 2.4. Sorry...

If you are an Outlook, Remember The Milk or Producteev user interested in synchronization, you may want to have a look at my Outlook synchronization pages.


Adding Lucene.Net based search to HawkWiki was really easy. The results page showed only the links, and I also wanted to see a part of each matching page. I decided to use the highlighting feature in Lucene.Net.Highlight. I could not find a compiled Lucene.Net.Highlight.dll, so I compiled version 2.0.0 myself and you can download it. The code to highlight part of the page was also very simple. First, instantiate the highlighter based on your search query:

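// Score text fragments by how well they match the query terms in the contents field.
Lucene.Net.Highlight.QueryScorer scorer = new Lucene.Net.Highlight.QueryScorer( query, reader, ContentsField );
// The highlighter uses the scorer to pick and mark up the best fragment.
Highlighter highlighter = new Highlighter( scorer );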
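
If the default markup or fragment length does not suit your results page, the highlighter can probably be tuned when it is created. This is only a minimal sketch: it assumes the Lucene.Net.Highlight 2.0 port exposes the same SimpleHTMLFormatter, SimpleFragmenter and SetTextFragmenter members as the Java contrib highlighter of that time. The HighlighterFactory class, the <em> tags and the 200-character fragment size are hypothetical choices made up for illustration.

```csharp
using Lucene.Net.Highlight;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net or HawkWiki.
static class HighlighterFactory
{
    public static Highlighter Create( Query query, IndexReader reader, string field )
    {
        // Same scorer as above: rates fragments by the query terms they contain.
        QueryScorer scorer = new QueryScorer( query, reader, field );

        // Assumption: wrap each matching term in <em>...</em>.
        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter( "<em>", "</em>" );
        Highlighter highlighter = new Highlighter( formatter, scorer );

        // Assumption: cut the text into fragments of roughly 200 characters.
        highlighter.SetTextFragmenter( new SimpleFragmenter( 200 ) );
        return highlighter;
    }
}
```

With such a helper, the two lines above would shrink to Highlighter highlighter = HighlighterFactory.Create( query, reader, ContentsField ); and the rest of the code stays the same.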

While going through the search hits, read the content of each page and let the highlighter pick and highlight the best fragment:

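// Read the whole page; the using block closes the file when done.
string content;
using ( StreamReader sr = File.OpenText( path ) )
{
    content = sr.ReadToEnd();
}

// Best-scoring fragment with the matches marked up (null if no query term was found).
string fragment = highlighter.GetBestFragment( analyzer, ContentsField, content );
result[ i ] = new Result();
// Path relative to the wiki root directory.
string relPath = path.Replace( rootDir, "" );
result[ i ].Path = relPath;
// Title: drop the leading path separator and the ".wiki" extension.
result[ i ].Title = relPath.Remove( 0, 1 ).Replace( ".wiki", "" );
result[ i ].Fragment = fragment;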

Simple, isn't it?
