Java Software Downloads

Crawlers in Java

Heritrix
Heritrix is the Internet Archive’s open-source, extensible, web-scale, archival-quality web crawler project.

http://crawler.archive.org/

WebSPHINX
WebSPHINX (Website-Specific Processors for HTML INformation eXtraction) is a Java class library and interactive development environment for Web crawlers that browse and process Web pages automatically.

License Apache Software License
http://www-2.cs.cmu.edu/~rcm/websphinx/

JoBo
JoBo is a simple program for downloading complete websites to your local computer. Internally it is basically a web spider. Its main advantage over other download tools is that it can automatically fill out forms (e.g. for automated login) and use cookies for session handling.

License GNU Library or Lesser General Public License (LGPL)
http://www.matuschek.net/software/jobo/index.html

Database Connection Pools

Jakarta DBCP
DBCP is a database connection pool built on the Jakarta commons-pool package, which provides the underlying object pool mechanisms. Applications can use the DBCP component directly or through the existing interface of their container or supporting framework.
License Apache Software License
http://jakarta.apache.org/commons/dbcp/
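
The object pool idea behind a connection pool can be sketched with the standard library alone. The following is a minimal illustration, not the DBCP or commons-pool API: the names SimplePool, borrow, giveBack and idleCount are hypothetical, and a StringBuilder stands in for an expensive resource such as a JDBC connection.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical fixed-size object pool; NOT the DBCP or commons-pool API. */
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // create every pooled object up front
        }
    }

    /** Waits up to timeoutMillis for an idle object; returns null on timeout. */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        }
    }

    /** Hands an object back so that another caller can reuse it. */
    void giveBack(T obj) {
        idle.offer(obj);
    }

    int idleCount() {
        return idle.size();
    }
}

class SimplePoolDemo {
    public static void main(String[] args) {
        // StringBuilder is an assumed stand-in for a costly resource.
        SimplePool<StringBuilder> pool = new SimplePool<>(2, StringBuilder::new);
        StringBuilder first = pool.borrow(100);
        pool.borrow(100);
        System.out.println("idle after two borrows: " + pool.idleCount());
        pool.giveBack(first);
        StringBuilder reused = pool.borrow(100);
        System.out.println("same object reused: " + (reused == first));
    }
}
```

A production pool would also validate objects before handing them out and evict stale ones, which this sketch leaves out.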

C3P0
C3P0 is an easy-to-use library for augmenting traditional (DriverManager-based) JDBC drivers with JNDI-bindable DataSources, including DataSources that implement Connection and Statement Pooling, as described by the JDBC 3 spec and the JDBC 2 standard extension.

License GNU Library or Lesser General Public License (LGPL)
http://sourceforge.net/projects/c3p0

Proxool
Proxool is a Java connection pool. It transparently adds connection pooling to your existing JDBC driver.

http://proxool.sourceforge.net/

Command Line Tools

Jakarta Commons CLI
The Apache Commons CLI library provides an API for processing command line interfaces. Command line processing has three stages: definition, parsing and interrogation.

License Apache Software License
http://jakarta.apache.org/commons/cli/
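
The three stages named above can be illustrated with a small standard-library sketch. This is not the Commons CLI API: MiniCli and its methods define, parse, has and valueOf are hypothetical names chosen for illustration, and the sketch assumes single-dash options only.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper mirroring the three stages; NOT the Commons CLI API. */
class MiniCli {
    private final Map<String, Boolean> defined = new HashMap<>(); // name -> takes a value?
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> flags = new HashSet<>();

    /** Stage 1, definition: declare an option and whether it takes a value. */
    MiniCli define(String name, boolean takesValue) {
        defined.put(name, takesValue);
        return this;
    }

    /** Stage 2, parsing: match the raw arguments against the definitions. */
    MiniCli parse(String... args) {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (!arg.startsWith("-") || arg.length() < 2) {
                throw new IllegalArgumentException("Unexpected argument: " + arg);
            }
            String name = arg.substring(1);
            Boolean takesValue = defined.get(name);
            if (takesValue == null) {
                throw new IllegalArgumentException("Unknown option: " + arg);
            }
            if (takesValue) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("Missing value for " + arg);
                }
                values.put(name, args[++i]); // consume the next token as the value
            } else {
                flags.add(name);
            }
        }
        return this;
    }

    /** Stage 3, interrogation: ask which options were supplied. */
    boolean has(String name) {
        return flags.contains(name) || values.containsKey(name);
    }

    String valueOf(String name) {
        return values.get(name);
    }
}

class MiniCliDemo {
    public static void main(String[] args) {
        MiniCli cli = new MiniCli()
                .define("v", false)  // flag without a value
                .define("o", true)   // option with a value
                .parse("-v", "-o", "out.txt");
        System.out.println("verbose=" + cli.has("v") + " output=" + cli.valueOf("o"));
    }
}
```

Keeping definition separate from parsing means the same option set can also drive help output or validation, which is presumably why libraries in this category split the work this way.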

ArgParser
ArgParser is a Java package for specifying the command line options of a Java application. It supports range checking, multiple option names (aliases), single word options, multiple values associated with an option, multiple option invocation, generating help information, custom argument parsing, and reading arguments from a file.

http://www.cs.ubc.ca/spider/lloyd/java/argparser.html

JArgs
JArgs is a comprehensive command line option parsing suite for Java programmers. Initially, it provides parsing compatible with GNU-style ‘getopt’. JArgs is easy to use, thoroughly tested and well documented.

License BSD License
http://jargs.sourceforge.net/