Sisyphus repository
Last update: 25 April 2018 | SRPMs: 18293 | Visits: 11379068

Group :: Development/Java
RPM: boilerpipe


Current version: 1.2.0-alt1_11jpp8
Build date: 15 April 2018, 20:51
Size: 66.87 Kb

Home page:   https://github.com/kohlschutter/boilerpipe

License: ASL 2.0
Summary: Boilerplate Removal and Fulltext Extraction from HTML pages
Description:

The boilerpipe library provides algorithms to detect and
remove the surplus "clutter" (boilerplate, templates)
around the main textual content of a web page.

The library already provides specific strategies
for common tasks (for example: news article extraction) and
may also be easily extended for individual problem settings.

Content extraction is very fast (milliseconds), needs only the
input document (no global or site-level information) and
is usually quite accurate.

Current maintainer: Igor Vlasenko

List of rpms provided by this srpm:

  • boilerpipe
  • boilerpipe-javadoc