Full-Text Search

Overall Comparison

Google Custom Search Engine

http://www.google.com/cse/?utm_campaign=TW-en&utm_medium=et-5&utm_source=TW-en-et-receng-we1-analytics-48&hl=zh-TW

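「If a web page is public, Google is currently still the best choice for searching its content (it has the highest market share), and that includes site search.」 -- 曾保彰, 2009

According to the Google Developers Custom Search documentation, only 100 searches per day are free when using the API; additional searches are billed. Source: https://developers.google.com/custom-search/v1/overview, 2012.6.13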
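As a rough illustration of that quota, the sketch below builds a request URL for the Custom Search JSON API and counts calls against the free daily allowance. The endpoint and the `key`, `cx`, and `q` parameters follow the overview page cited above. The `DailyQuota` helper and the `API_KEY` / `ENGINE_ID` values are hypothetical placeholders, and the limit of 100 is the 2012 figure quoted here, which may have changed since.

```python
from datetime import date
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/customsearch/v1"
FREE_QUERIES_PER_DAY = 100  # figure quoted in the text (2012); verify before relying on it


def build_search_url(query, api_key, engine_id):
    """Return the GET URL for one search request. Nothing is sent."""
    params = {"key": api_key, "cx": engine_id, "q": query}
    return ENDPOINT + "?" + urlencode(params)


class DailyQuota:
    """Hypothetical helper: counts requests and resets when the day changes."""

    def __init__(self, limit=FREE_QUERIES_PER_DAY):
        self.limit = limit
        self.day = date.today()
        self.used = 0

    def try_consume(self, today=None):
        """Return True and count one request if the day's allowance is not used up."""
        today = today or date.today()
        if today != self.day:
            # A new day started: reset the counter.
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


quota = DailyQuota()
if quota.try_consume():
    # Placeholder credentials; a real key and engine ID come from the Google console.
    print(build_search_url("full text search", "API_KEY", "ENGINE_ID"))
```

Calling `try_consume()` before each request keeps a script inside the free tier; once it returns `False`, the caller can stop or queue the search for the next day.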

Elasticsearch

https://www.elastic.co/products/elasticsearch

Built on top of Apache Lucene™

Comparison with Solr: Apache Solr vs ElasticSearch

Haystack

Modular search for Django.

Haystack lets you write your search code once and choose the search engine you want it to run on. With a familiar API that should make any Djangonaut feel right at home and an architecture that allows you to swap things in and out as you need to, it's how search ought to be.

Haystack is BSD licensed, plays nicely with third-party apps without needing to modify the source and supports Solr, Elasticsearch, Whoosh and Xapian.
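In practice the backend swap happens in Django settings rather than in search code. The following is a minimal sketch, assuming Haystack 2.x and its `HAYSTACK_CONNECTIONS` setting. The helper name `haystack_connections`, the URLs, the index name, and the Whoosh path are placeholders chosen for illustration, and only three of the supported engines are shown.

```python
import os

# Engine paths as shipped with Haystack 2.x (assumption: that version is installed).
# URLs, index name, and path below are placeholders, not required values.
BACKENDS = {
    "solr": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr",
    },
    "elasticsearch": {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchEngine",
        "URL": "http://127.0.0.1:9200/",
        "INDEX_NAME": "haystack",
    },
    "whoosh": {
        "ENGINE": "haystack.backends.whoosh_backend.WhooshEngine",
        "PATH": os.path.join(os.getcwd(), "whoosh_index"),
    },
}


def haystack_connections(backend):
    """Hypothetical helper: return a HAYSTACK_CONNECTIONS dict for one backend."""
    return {"default": dict(BACKENDS[backend])}


# In settings.py, switching engines is then a one-word change:
HAYSTACK_CONNECTIONS = haystack_connections("whoosh")
```

Index definitions and query code stay the same when the argument changes from `"whoosh"` to `"solr"` or `"elasticsearch"`; only the index has to be rebuilt against the new engine.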

Lucene

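An Introduction to Lucene, a Free Open-Source Search Engine -- 曾保彰, 2009

「Lucene is one of the Apache Foundation's open-source projects. It is written in Java, supports Unicode and multiple languages, is under continued development by the online community, and serves as the core of many open-source systems. Its drawback is that Lucene is provided as a library: programs must be written in Java to use it, its features are intricate, the learning curve is long, and it is not easy to get started with an implementation.」 -- "A First Look at the Full-Text Search Server Solr" -- 張錦堂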

Ferret

Ferret - Lucene implementation in Ruby

Search results return only the number of matching Documents; the number of times a Chinese keyword occurs is not easy to obtain.
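The difference between the two numbers can be shown without any search engine. The sketch below is plain Python over an in-memory list, not Ferret's API; the sample documents and function names are made up for illustration. It counts non-overlapping substring matches, which sidesteps Chinese word segmentation entirely.

```python
def matching_documents(docs, keyword):
    """Number of documents that contain the keyword at least once."""
    return sum(1 for text in docs if keyword in text)


def keyword_occurrences(docs, keyword):
    """Total number of non-overlapping occurrences across all documents."""
    return sum(text.count(keyword) for text in docs)


# Made-up sample documents.
docs = [
    "全文檢索系統先建立索引,再以全文檢索查詢。",
    "Lucene 是全文檢索程式庫。",
    "這份文件與搜尋無關。",
]

keyword = "全文檢索"
print(matching_documents(docs, keyword))   # 2
print(keyword_occurrences(docs, keyword))  # 3
```

A search engine that reports only the first number would need the stored document text, or per-document term frequencies from the index, to recover the second.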

solrpy

Lightweight Python wrapper for Apache Solr.

solrpy is a Python client for Solr, an enterprise search server built on top of Lucene. solrpy allows you to add documents to a Solr instance, and then to perform queries and gather search results from Solr using your favorite programming language: Python.

django-sphinx

https://github.com/dcramer/django-sphinx/

A transparent layer for full-text search using Sphinx and Django

Xapian

Xapian is an Open Source Search Engine Library, released under the GPL. It's written in C++, with bindings to allow use from Perl, Python, PHP, Java, Tcl, C#, Ruby and Lua (so far!)

「Xapian is very very fast, but less of a complete solution than SOLR.」 -- Ranieri, 2010.7.12

Whoosh

Fast, pure-Python full text indexing, search, and spell checking library.

Hyper Estraier

Python native binding for Hyper Estraier: http://pypi.python.org/pypi/estraiernative/0.2

Swish-e

Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files.

Swish-E API for Python: http://pypi.python.org/pypi/Swish-E/0.5

Postgres + TSearch2