LuSql

From CISTI-ICIST LAB WIKI

Principal Investigator: Glen Newton, glen.newton@gmail.com

Note: LuSql development has moved to Google Code.

LuSql is a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application that constructs a Lucene index from an arbitrary SQL query against a JDBC-accessible SQL database. The user can control a number of parameters, including the SQL query to use, the indexing, storage, and term-vector settings of individual fields, the analyzer, the stop word list, and other tuning parameters. In its default mode it uses multiple threads to take advantage of multiple cores.

LuSql can handle complex queries, allows additional per-record sub-queries, and has a plug-in architecture for arbitrary Lucene document manipulation. Its only dependencies are three Apache Commons libraries, the Lucene core itself, and a JDBC driver.
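
To make the idea of a per-record sub-query concrete, here is a minimal sketch in plain Java. It is not LuSql's code and uses neither JDBC nor Lucene: a `Map` stands in for one row of the main query, a `Function` stands in for the sub-query, and the resulting map of field names to values stands in for a Lucene document. The class name, method name, and column names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class RowToDocumentSketch {

    /**
     * Builds a multi-valued "document" (field name -> values) from one result
     * row, then appends the values returned by a per-record sub-query.
     * Illustrative only: a real indexer would create Lucene fields instead.
     */
    static Map<String, List<String>> toDocument(
            Map<String, String> row,
            String keyColumn,
            String subQueryField,
            Function<String, List<String>> subQuery) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        // One field per column of the main query.
        for (Map.Entry<String, String> column : row.entrySet()) {
            doc.computeIfAbsent(column.getKey(), k -> new ArrayList<>())
               .add(column.getValue());
        }
        // The sub-query is keyed on one column of the current record and may
        // return zero or more values, all stored under a single field name.
        List<String> extra = subQuery.apply(row.get(keyColumn));
        if (!extra.isEmpty()) {
            doc.computeIfAbsent(subQueryField, k -> new ArrayList<>())
               .addAll(extra);
        }
        return doc;
    }

    public static void main(String[] args) {
        // Hypothetical row from the main query.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("id", "42");
        row.put("title", "Indexing relational data");

        // Hypothetical lookup table playing the role of a second SQL query.
        Map<String, List<String>> authorsByArticle = new LinkedHashMap<>();
        authorsByArticle.put("42", Arrays.asList("A. Author", "B. Author"));

        Map<String, List<String>> doc = toDocument(
                row, "id", "author",
                id -> authorsByArticle.getOrDefault(id, new ArrayList<>()));

        System.out.println(doc);
    }
}
```

The point of the sketch is the shape of the data: one record from the main query can become a document whose extra field holds several values drawn from a second, per-record lookup.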

LuSql has been extensively tested, including on a large collection of more than 6 million full-text and article-metadata documents, which produced an 86GB Lucene index.

Features

  • Convert DBMS content into Lucene full-text index
  • Easy to use
  • Flexible
  • High performance
  • Open Source
  • Handles complex queries

License

The LuSql software is licensed under the Apache License v2.

The LuSql software is copyright © 2008 National Research Council Canada.

Getting LuSql

Documentation

User Manual & Tutorial

A user manual of more than 40 pages (PDF, 1.4MB) is available. It includes installation instructions, an explanation of the command-line arguments, an extensive tutorial, and some performance numbers for LuSql.

Quick Start

Example Usage & Command line options

Changes

News and suggested changes are collected here.

  • Version 0.901 released: added -E to allow setting JDBC transaction isolation. See Download
  • v0.9