Introduce html2text for extracting plaintext from statuses. #236.

Unlike strip_tags, html2text preserves content that lives outside an
element's text, e.g. the href of an anchor tag:

    [1] pry(main)> str = '<a href="http://www.example.com">A link</a>'
    => "<a href=\"http://www.example.com\">A link</a>"
    [2] pry(main)> Html2Text.convert(str)
    => "[A link](http://www.example.com)"
    [3] pry(main)> include ActionView::Helpers::SanitizeHelper
    => Object
    [4] pry(main)> strip_tags(str)
    => "A link"

Preserving the href of an anchor allows keyword mutes to also match on
URLs, which the frontend regex filter can already do.
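
To make the effect on keyword mutes concrete, here is a minimal sketch using only the Ruby standard library. The `muted?` helper and the keyword list are hypothetical and are not taken from this commit or from the actual mute implementation. The two input strings are the plaintext results copied from the pry session above.

```ruby
# Hypothetical helper (an assumption for illustration, not the project's code):
# returns true when any muted keyword occurs in the given plaintext.
def muted?(plaintext, keywords)
  # An empty keyword list would produce an empty pattern that matches
  # everything, so bail out early.
  return false if keywords.empty?

  # Escape each keyword so characters such as "." are matched literally.
  pattern = Regexp.new(keywords.map { |k| Regexp.escape(k) }.join('|'),
                       Regexp::IGNORECASE)
  pattern.match?(plaintext)
end

# Plaintext renderings copied from the pry session in this commit message.
html2text_output  = '[A link](http://www.example.com)'
strip_tags_output = 'A link'

# Assumed keyword list: a user muting a domain.
keywords = ['example.com']

puts muted?(html2text_output, keywords)   # true: the href survives in the text
puts muted?(strip_tags_output, keywords)  # false: the URL was stripped
```

Under these assumptions, a mute on a domain only fires when the plaintext still carries the link target, which is the case for the html2text rendering but not for the strip_tags one.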
David Yip 2018-02-10 02:32:39 -06:00
parent 53c86b29f0
commit 9105b0c954
2 changed files with 4 additions and 0 deletions

@@ -42,6 +42,7 @@ gem 'fast_blank', '~> 1.0'
 gem 'goldfinger', '~> 2.1'
 gem 'hiredis', '~> 0.6'
 gem 'redis-namespace', '~> 1.5'
+gem 'html2text'
 gem 'htmlentities', '~> 4.3'
 gem 'http', '~> 3.0'
 gem 'http_accept_language', '~> 2.1'

@@ -205,6 +205,8 @@ GEM
     highline (1.7.10)
     hiredis (0.6.1)
     hkdf (0.3.0)
+    html2text (0.2.1)
+      nokogiri (~> 1.6)
     htmlentities (4.3.4)
     http (3.0.0)
       addressable (~> 2.3)
@@ -601,6 +603,7 @@ DEPENDENCIES
   goldfinger (~> 2.1)
   hamlit-rails (~> 0.2)
   hiredis (~> 0.6)
+  html2text
   htmlentities (~> 4.3)
   http (~> 3.0)
   http_accept_language (~> 2.1)