Apache Lucene Publisher's description
Apache Lucene(TM) is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially one that must run cross-platform.
What's New in This Release:
New Features:
LUCENE-4906: PostingsHighlighter can now render to custom Object, for advanced use cases where String is too restrictive
LUCENE-5133: Changed AnalyzingInfixSuggester.highlight to return Object instead of String, to allow for advanced use cases where String is too restrictive
LUCENE-5207, LUCENE-5334: Added expressions module for customizing ranking with script-like syntax. (Jack Conradson, Ryan Ernst, Uwe Schindler via Robert Muir)
LUCENE-5180: ShingleFilter now creates shingles with trailing holes, for example if a StopFilter had removed the last token.
LUCENE-5219: Add support to SynonymFilterFactory for custom parsers.
LUCENE-5235: Tokenizers now throw an IllegalStateException if the consumer does not call reset() before consuming the stream. Previous versions threw a NullPointerException or ArrayIndexOutOfBoundsException on a best-effort basis, which was not user-friendly.
LUCENE-5240: Tokenizers now throw an IllegalStateException if the consumer neglects to call close() on the previous stream before consuming the next one.
LUCENE-5214: Add new FreeTextSuggester, to predict the next word using a simple ngram language model. This is useful for the "long tail" suggestions, when a primary suggester fails to find a suggestion.
LUCENE-5251: New DocumentDictionary allows building suggesters from the contents of existing stored fields in an index: a suggestion field, a weight field and, optionally, a payload field.
LUCENE-5261: Add QueryBuilder, a simple API to build queries from the analysis chain directly, or to make it easier to implement query parsers.
LUCENE-5270: Add Terms.hasFreqs, to determine whether a given field indexed per-doc term frequencies.
LUCENE-5269: Add CodepointCountFilter.
LUCENE-5274: FastVectorHighlighter now supports highlighting against several indexed fields.
LUCENE-5304: SingletonSortedSetDocValues can now return the wrapped SortedDocValues
LUCENE-2844: The benchmark module can now test the spatial module. See spatial.alg
LUCENE-5302: Make StemmerOverrideMap's methods public
LUCENE-5296: Add DirectDocValuesFormat, which holds all doc values on the heap as uncompressed native Java arrays.
LUCENE-5189: Add IndexWriter.updateNumericDocValues, to update numeric DocValues fields of documents, without re-indexing them.
LUCENE-5298: Add SumValueSourceFacetRequest for aggregating facets by a ValueSource, such as a NumericDocValuesField or an expression.
LUCENE-5323: Add .sizeInBytes method to all suggesters (Lookup).
LUCENE-5312: Add BlockJoinSorter, a new Sorter implementation that makes sure to never split up blocks of documents indexed with IndexWriter.addDocuments.
LUCENE-5297: Allow range faceting on any ValueSource, not just NumericDocValues fields.
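To make the next-word idea behind FreeTextSuggester (LUCENE-5214) more concrete, here is a minimal sketch of a bigram model in plain Java. It is not Lucene's implementation and uses no Lucene classes: the class name BigramNextWord and its train and predict methods are hypothetical, and the model only counts which word most often follows another, which is far simpler than a real suggester.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of Lucene.
class BigramNextWord {
    // previous word -> (next word -> number of times it followed)
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    /** Counts every adjacent word pair in the given whitespace-separated text. */
    void train(String text) {
        String[] words = text.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            counts.computeIfAbsent(words[i], k -> new HashMap<>())
                  .merge(words[i + 1], 1, Integer::sum);
        }
    }

    /** Returns the most frequent follower of previousWord, or null if none was seen. */
    String predict(String previousWord) {
        Map<String, Integer> followers = counts.get(previousWord.toLowerCase(Locale.ROOT));
        if (followers == null) {
            return null;
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : followers.entrySet()) {
            int count = e.getValue();
            // Break ties alphabetically so the result is deterministic.
            boolean better = count > bestCount
                    || (count == bestCount && e.getKey().compareTo(best) < 0);
            if (better) {
                best = e.getKey();
                bestCount = count;
            }
        }
        return best;
    }
}
```

After training on a few sentences, predict("search") returns the word that followed "search" most often. A fallback of this kind is what makes long-tail suggestions possible when no complete suggestion matches.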
Bug Fixes:
LUCENE-5272: OpenBitSet.ensureCapacity did not modify numBits, causing false assertion errors in fastSet.
LUCENE-5303: OrdinalsCache did not use coreCacheKey, resulting in over caching across multiple threads.
LUCENE-5307: Fix topScorer inconsistency in handling QueryWrapperFilter inside ConstantScoreQuery, which now rewrites to a query removing the obsolete QueryWrapperFilter.
LUCENE-5330: IndexWriter didn't process all internal events on #getReader(), #close() and #rollback(), which caused files to be deleted at a later point in time. This could cause short-term disk pollution, or OOM if in-memory directories are used.
LUCENE-5342: Fixed bulk-merge issue in CompressingStoredFieldsFormat which created corrupted segments when mixing chunk sizes. Lucene41StoredFieldsFormat is not impacted.
API Changes:
LUCENE-5222: Add SortField.needsScores(). Previously it was not possible for a custom Sort that makes use of the relevance score to work correctly with IndexSearcher when an ExecutorService is specified.
LUCENE-5275: Change AttributeSource.toString() to display the current state of attributes.
LUCENE-5277: Modify FixedBitSet copy constructor to take an additional numBits parameter to allow growing/shrinking the copied bitset. You can use FixedBitSet.clone() if you only need to clone the bitset.
LUCENE-5260: Use TermFreqPayloadIterator for all suggesters; those suggesters that can't support payloads will throw an exception if hasPayloads() is true.
LUCENE-5280: Rename TermFreqPayloadIterator -> InputIterator, along with associated suggest/spell classes.
LUCENE-5157: Rename OrdinalMap methods to clarify API and internal structure.
LUCENE-5313: Move preservePositionIncrements from setter to ctor in Analyzing/FuzzySuggester.
LUCENE-5321: Remove Facet42DocValuesFormat. Use DirectDocValuesFormat if you want to load the category list into memory.
LUCENE-5324: AnalyzerWrapper.getPositionIncrementGap and getOffsetGap can now be overridden.
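The growing and shrinking copy described for LUCENE-5277 can be sketched with a small stand-in class. This is not Lucene's FixedBitSet: the class BitSetCopySketch and all of its methods are hypothetical, and clearing the bits beyond the new size when shrinking is a choice made for this sketch rather than a statement about Lucene's behavior.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; this is not Lucene's FixedBitSet.
class BitSetCopySketch {
    private final long[] words;
    private final int numBits;

    BitSetCopySketch(int numBits) {
        this.numBits = numBits;
        this.words = new long[wordsFor(numBits)];
    }

    /** Copy constructor that grows or shrinks the copy to hold numBits bits. */
    BitSetCopySketch(BitSetCopySketch other, int numBits) {
        this.numBits = numBits;
        // copyOf pads with zero words when growing and truncates when shrinking.
        this.words = Arrays.copyOf(other.words, wordsFor(numBits));
        int usedInLastWord = numBits & 63;
        if (usedInLastWord != 0) {
            // Clear any copied bits at positions >= numBits in the last word.
            words[words.length - 1] &= (1L << usedInLastWord) - 1;
        }
    }

    private static int wordsFor(int numBits) {
        if (numBits < 0) {
            throw new IllegalArgumentException("numBits must be >= 0");
        }
        return (numBits + 63) >>> 6;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= numBits) {
            throw new IndexOutOfBoundsException("index " + index + ", numBits " + numBits);
        }
    }

    void set(int index) {
        checkIndex(index);
        words[index >>> 6] |= 1L << (index & 63);
    }

    boolean get(int index) {
        checkIndex(index);
        return (words[index >>> 6] & (1L << (index & 63))) != 0;
    }

    /** Number of bits currently set. */
    int cardinality() {
        int total = 0;
        for (long word : words) {
            total += Long.bitCount(word);
        }
        return total;
    }
}
```

Copying a 10-bit set into a 100-bit one keeps the original bits and leaves room for new ones, while copying into a 5-bit one drops every bit at position 5 or above. When the size does not change, the note above points to FixedBitSet.clone() instead.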
Optimizations:
LUCENE-5225: The ToParentBlockJoinQuery only keeps track of the child doc ids and child scores if the ToParentBlockJoinCollector is used.
LUCENE-5236: EliasFanoDocIdSet now has an index and uses broadword bit selection to speed up advance().
LUCENE-5266: Improved number of read calls and branches in DirectPackedReader.
LUCENE-5300: Optimized SORTED_SET storage for fields which are single-valued.
Documentation:
LUCENE-5211: Better javadocs and error checking of the 'format' option in StopFilterFactory, as well as comments in all snowball-formatted files about specifying the format option.
Changes in backwards compatibility policy:
LUCENE-5235: Subclasses of Tokenizer have to call super.reset() when implementing reset(). Otherwise the consumer will get an IllegalStateException because the Reader is not correctly assigned. It is important to never change the "input" field on Tokenizer without using setReader(). The "input" field must not be used outside reset(), incrementToken(), or end(), and especially not in the constructor.
LUCENE-5204: Directory doesn't have default implementations for LockFactory-related methods, which have been moved to BaseDirectory. If you had a custom Directory implementation that extended Directory, you need to extend BaseDirectory instead.
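The reset() and close() rules above are easier to follow with a small model. The sketch below is a simplified stand-in written in plain Java, not Lucene's Tokenizer: every name in it (TokenizerContractSketch, BaseTokenizer, WhitespaceSketch, NO_INPUT, tokenize, failsWithoutReset) is hypothetical, and end() is left out. It assumes the checks work roughly as the notes describe: the input reader only becomes usable once reset() runs, and a new reader is rejected until close() has been called.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the reset()/close() contract; not Lucene's Tokenizer.
class TokenizerContractSketch {

    /** Sentinel meaning "no usable input": any read fails with IllegalStateException. */
    static final Reader NO_INPUT = new Reader() {
        @Override public int read(char[] buf, int off, int len) {
            throw new IllegalStateException("contract violation: reset() or close() call missing");
        }
        @Override public void close() { }
    };

    abstract static class BaseTokenizer {
        protected Reader input = NO_INPUT;  // only valid between reset() and close()
        private Reader pending = NO_INPUT;  // stored by setReader(), activated by reset()

        final void setReader(Reader reader) {
            if (input != NO_INPUT) {
                throw new IllegalStateException("contract violation: close() call missing");
            }
            pending = reader;
        }

        void reset() throws IOException {
            input = pending;
            pending = NO_INPUT;
        }

        void close() throws IOException {
            input.close();
            input = NO_INPUT;
            pending = NO_INPUT;
        }

        abstract boolean incrementToken() throws IOException;
    }

    static class WhitespaceSketch extends BaseTokenizer {
        String term;

        @Override void reset() throws IOException {
            super.reset();  // without this call, input stays NO_INPUT and every read fails
            term = null;
        }

        @Override boolean incrementToken() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (!Character.isWhitespace(c)) {
                    sb.append((char) c);
                } else if (sb.length() > 0) {
                    break;  // whitespace after a token ends it
                }
            }
            if (sb.length() == 0) {
                return false;
            }
            term = sb.toString();
            return true;
        }
    }

    /** Correct consumer workflow: setReader, reset, incrementToken loop, close. */
    static List<String> tokenize(String text) {
        WhitespaceSketch tokenizer = new WhitespaceSketch();
        List<String> terms = new ArrayList<>();
        try {
            tokenizer.setReader(new StringReader(text));
            tokenizer.reset();
            while (tokenizer.incrementToken()) {
                terms.add(tokenizer.term);
            }
            tokenizer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }

    /** Returns true if consuming without reset() fails with IllegalStateException. */
    static boolean failsWithoutReset(String text) {
        WhitespaceSketch tokenizer = new WhitespaceSketch();
        tokenizer.setReader(new StringReader(text));
        try {
            tokenizer.incrementToken();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this model, tokenize shows the order a consumer is expected to follow, and failsWithoutReset shows why skipping reset(), or forgetting super.reset() in a subclass, surfaces as an IllegalStateException on the first read.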
System Requirements:
Apache Lucene runs on Java 6 or greater. When using Java 7, be sure to install at least Update 1! With all Java versions it is strongly recommended not to use experimental -XX JVM options. It is also recommended to always use the latest update version of your Java VM, because bugs may affect Lucene. An overview of known JVM bugs can be found on http://wiki.apache.org/lucene-java/JavaBugs.
CPU, disk and memory requirements are based on the many choices made in implementing Lucene (document size, number of documents, and number of hits retrieved to name a few). The benchmarks page has some information related to performance on particular platforms.
Program Release Status: Major Update
Program Install Support: Install and Uninstall