  • License: Freeware
  • Last update: 7 years ago
  • Total downloads: 340
  • Price: Free
  • Operating system: Mac OS X
  • Publisher: Derrick Oswald

HTML Parser for Mac: Publisher's description

Library to parse HTML content

HTML Parser is a free, open source Java library for parsing HTML in either a linear or a nested fashion. It is used primarily for transformation or extraction, and it features filters, visitors, custom tags and easy-to-use JavaBeans. It is a fast, robust and well-tested package.

Extraction encompasses all the information retrieval programs that are not meant to preserve the source page. This covers uses like:
· text extraction, for example as input for text search engine databases
· link extraction, for crawling through web pages or harvesting email addresses
· screen scraping, for programmatic data input from web pages
· resource extraction, collecting images or sound
· a browser front end, the preliminary stage of page display
· link checking, ensuring links are valid
· site monitoring, checking for page differences beyond simplistic diffs

There are several facilities in the HTMLParser codebase to help with extraction, including filters, visitors and JavaBeans.
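
As a rough illustration of what link extraction involves, here is a minimal sketch that uses only the Java standard library. It is not HTML Parser's API: the class name `LinkExtractionSketch` and the method `extractLinks` are hypothetical, and a regular expression is a deliberate simplification that assumes quoted `href` attributes. A real parser such as HTML Parser handles malformed markup far more reliably.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the HTML Parser library.
final class LinkExtractionSketch {

    // Matches the quoted href value inside an <a ...> start tag.
    // Assumption: the attribute value is wrapped in single or double quotes.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns every href value found in the given HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2)); // group 2 is the URL between the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<p><a href=\"http://example.com/\">Home</a> "
                + "<a class='nav' href='/about'>About</a></p>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

The extracted list could then feed a crawler queue or a link checker, two of the uses listed above.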

Transformation includes all processing where the input and the output are HTML pages. Some examples are:
· URL rewriting, modifying some or all links on a page
· site capture, moving content from the web to local disk
· censorship, removing offending words and phrases from pages
· HTML cleanup, correcting erroneous pages
· ad removal, excising URLs referencing advertising
· conversion to XML, moving existing web pages to XML
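
To make the idea of a transformation concrete, the following is a minimal sketch of URL rewriting, again using only the Java standard library rather than HTML Parser's own classes. The names `UrlRewriteSketch` and `rewriteLinks` and the prefix-swapping rule are assumptions chosen for illustration; the regular expression assumes quoted `href` values and leaves all other markup untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the HTML Parser library.
final class UrlRewriteSketch {

    // Group 1: everything up to the opening quote, group 2: the quote character,
    // group 3: the URL. Assumption: href values are quoted.
    private static final Pattern HREF = Pattern.compile(
            "(<a\\b[^>]*?\\bhref\\s*=\\s*)([\"'])(.*?)\\2",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces oldPrefix with newPrefix in every link that starts with oldPrefix.
    // Input and output are both HTML, which is what makes this a transformation.
    static String rewriteLinks(String html, String oldPrefix, String newPrefix) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = m.group(3);
            if (url.startsWith(oldPrefix)) {
                url = newPrefix + url.substring(oldPrefix.length());
            }
            String replacement = m.group(1) + m.group(2) + url + m.group(2);
            // quoteReplacement keeps '$' and '\' in URLs from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out); // copy the remainder of the page unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://old.example.com/a\">A</a> <a href=\"/b\">B</a>";
        System.out.println(
                rewriteLinks(page, "http://old.example.com/", "https://new.example.com/"));
    }
}
```

The same match-and-replace loop could be adapted to other transformations in the list, such as ad removal, by dropping or altering the matched tag instead of rewriting its URL.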

What's New in This Release:

· The htmlparser project has been updated with a new license, a new build environment, a new repository and a new web site. To mark this radical change, the version has been revved to 2.0.

· In response to requests from the Apache community, the htmlparser license has changed from the GNU Library or Lesser General Public License to the more Apache-friendly Common Public License 1.0 (http://opensource.org/licenses/cpl1.0.txt).

· The htmlparser repository has been changed from CVS to Subversion (http://subversion.tigris.org/).

· To support automatic integration in other projects, the build environment has changed from Ant to Maven 2 (http://maven.apache.org/). This has provided an opportunity to update the web site (http://htmlparser.org).

System Requirements:

· Java
