Sisyphus repository
Last update: 1 October 2023 | SRPMs: 18631 | Visits: 37420856
ALT Linux repos
S:1.03-alt1

Group :: Development/Perl
RPM: perl-HTML-TagFilter


Current version: 1.03-alt1
Build date: 5 August 2011, 10:19 (663.1 weeks ago)
Size: 18.46 Kb

Home page:   http://www.cpan.org

License: Artistic
Summary: A fine-grained HTML filter, XSS blocker and mailto obfuscator
Description:

HTML::TagFilter is a subclass of HTML::Parser with a single purpose:
it removes unwanted HTML tags and attributes from a piece of text.
It can act in a more or less fine-grained way: you can specify
permitted tags, permitted attributes of each tag, and permitted
values for each attribute in as much detail as you like.

Tags which are not allowed are removed. Tags which are allowed are
trimmed down to only the attributes which are allowed for each tag.
It is possible to allow all or no attributes from a tag, or to allow
all or no values for an attribute, and so on.
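
As a rough illustration of the rule granularity described above, here is a
minimal sketch in plain Perl of a tag / attribute / value allow-list lookup.
The %allow table, the 'any' wildcard key and the is_permitted helper are
hypothetical names chosen for this example. They only model the idea and are
not HTML::TagFilter's actual rule format or implementation.

```perl
use strict;
use warnings;

# Hypothetical rule table: tag => { attribute => [ permitted values ] }.
# An empty value list means "any value"; the key 'any' means "any attribute".
my %allow = (
    p   => { class => [ 'plain', 'sombre' ] },
    a   => { href  => [] },
    em  => {},
    div => { any => [] },
);

# Returns 1 if the tag (and, optionally, the attribute and value) passes the rules.
sub is_permitted {
    my ($rules, $tag, $attr, $value) = @_;

    my $tag_rules = $rules->{ lc($tag) } or return 0;    # unknown tag: remove it
    return 1 unless defined $attr;                       # the bare tag is allowed
    return 1 if exists $tag_rules->{any};                # every attribute allowed

    my $values = $tag_rules->{ lc($attr) } or return 0;  # unknown attribute: strip it
    return 1 unless @$values;                            # empty list: any value

    my $hits = grep { defined $value && $_ eq $value } @$values;
    return $hits ? 1 : 0;
}

print is_permitted(\%allow, 'p', 'class', 'plain'), "\n";   # prints 1
print is_permitted(\%allow, 'script'), "\n";                # prints 0
```

A real filter would run a check like this for every start tag and attribute
the parser reports, keeping what passes and dropping the rest.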

The filter will also guard against cross-site scripting (XSS) attacks
and obfuscate the email addresses in any mailto: links, unless you tell it not to.
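
To give an idea of what mailto obfuscation can look like, here is a small
sketch that encodes every character of an address as a decimal HTML entity.
This is one common approach and is only assumed to resemble what the module
does. The obfuscate_address and obfuscate_mailto helpers are hypothetical and
the regex is deliberately simplistic.

```perl
use strict;
use warnings;

# Hypothetical helper: encode each character of an address as a decimal entity.
sub obfuscate_address {
    my ($address) = @_;
    return join '', map { sprintf '&#%d;', ord $_ } split //, $address;
}

# Hypothetical helper: rewrite double-quoted mailto: hrefs in an HTML fragment.
# A real filter would work on parsed attributes rather than on a regex.
sub obfuscate_mailto {
    my ($html) = @_;
    $html =~ s{(href=")mailto:([^"]+)(")}{ $1 . 'mailto:' . obfuscate_address($2) . $3 }ge;
    return $html;
}

print obfuscate_address('a@b'), "\n";   # prints &#97;&#64;&#98;
```

Browsers decode the entities, so the link still works, while naive address
harvesters that scan the raw markup for an @ sign find nothing.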

The original purpose for this was to screen user input.
In that setting you'll often find that just using:

   use HTML::TagFilter;
   my $tf = HTML::TagFilter->new;
   put_in_database($tf->filter($my_text));

will do. However, it can also be used for display processes
(e.g. text-only translation) or cleanup (e.g. removal of old JavaScript).
In those cases you'll probably want to override the default rule set
with a small number of denial rules:

   my $tf = HTML::TagFilter->new(deny => {img => {'all'}});
   print $tf->filter($my_text);

This, for example, strips out all images but leaves everything
else untouched.

NB (FAQ #1): the filter only removes the tags themselves.
All it does to text which is not part of a tag is to escape
the < and > characters, to guard against false negatives and some common
cross-site attacks.
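
The escaping of non-tag text can be pictured with a tiny sketch like the one
below. The escape_text helper is a hypothetical name used for illustration
and is not part of the module's interface.

```perl
use strict;
use warnings;

# Hypothetical helper: escape angle brackets in text that is not part of a tag,
# so that leftover < and > characters cannot be read as markup.
sub escape_text {
    my ($text) = @_;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_text('if a < b then b > a'), "\n";   # prints: if a &lt; b then b &gt; a
```

The words between the brackets are left exactly as they were; only the
brackets themselves change.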

obPascal: Sorry about the incredibly long documentation, by the way.
When I have time I'll make it shorter.

Current maintainer: Andrey V. Stroganov

List of rpms provided by this srpm:

  • perl-HTML-TagFilter
    design & coding: Vladimir Lettiev aka crux © 2004-2005, Andrew Avramenko aka liks © 2007-2008
    current maintainer: Michael Shigorin