typeclassed String to StringLike #82
Open
BebeSparkelSparkel wants to merge 7 commits into fimad:master from BebeSparkelSparkel:master
Changes from 3 commits
Commits
dd95b70 removed strings (BebeSparkelSparkel)
ee0b370 fixed .travis/stack*.yaml (BebeSparkelSparkel)
f2a69db added ExtendedDefaultRules to benchmarks (BebeSparkelSparkel)
a91324b exposed select (BebeSparkelSparkel)
800c6a9 Merge branch 'expose_select' (BebeSparkelSparkel)
4cad1b3 removed export of `select` (BebeSparkelSparkel)
dfa2fd3 Uses stackage tagsoup (BebeSparkelSparkel)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,3 +10,6 @@ cabal-dev | |
.stack-work/ | ||
cabal.sandbox.config | ||
cabal.config | ||
+ *.swp | ||
+ *.swo | ||
+ *.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,6 +22,7 @@ import Text.HTML.Scalpel.Internal.Select.Types | |
import Control.Applicative | ||
import Control.Monad | ||
import Data.Maybe | ||
+ import Text.StringLike (StringLike, castString) | |
Review comment: Nice! I never really liked that StringLike was a part of the TagSoup package.
Reply: I think StringLike is an abomination as well.
||
|
||
import qualified Control.Monad.Fail as Fail | ||
import qualified Data.Vector as Vector | ||
|
@@ -67,7 +68,7 @@ instance Fail.MonadFail (Scraper str) where | |
|
||
-- | The 'scrape' function executes a 'Scraper' on a list of | ||
-- 'TagSoup.Tag's and produces an optional value. | ||
- scrape :: (TagSoup.StringLike str) | |
+ scrape :: (StringLike str) | |
=> Scraper str a -> [TagSoup.Tag str] -> Maybe a | ||
scrape s = scrapeTagSpec s . tagsToSpec . TagSoup.canonicalizeTags | ||
|
||
|
@@ -77,7 +78,7 @@ scrape s = scrapeTagSpec s . tagsToSpec . TagSoup.canonicalizeTags | |
-- | ||
-- This function will match only the first set of tags matching the selector, to | ||
-- match every set of tags, use 'chroots'. | ||
- chroot :: (TagSoup.StringLike str) | |
+ chroot :: (StringLike str) | |
=> Selector -> Scraper str a -> Scraper str a | ||
chroot selector inner = do | ||
maybeResult <- listToMaybe <$> chroots selector inner | ||
|
@@ -91,30 +92,30 @@ chroot selector inner = do | |
-- | ||
-- > s = "<div><div>A</div></div>" | ||
-- > scrapeStringLike s (chroots "div" (pure 0)) == Just [0, 0] | ||
- chroots :: (TagSoup.StringLike str) | |
+ chroots :: (StringLike str) | |
=> Selector -> Scraper str a -> Scraper str [a] | ||
chroots selector (MkScraper inner) = MkScraper | ||
$ return . mapMaybe inner . select selector | ||
|
||
-- | The 'matches' function takes a selector and returns `()` if the selector | ||
-- matches any node in the DOM. | ||
- matches :: (TagSoup.StringLike str) => Selector -> Scraper str () | |
+ matches :: (StringLike str) => Selector -> Scraper str () | |
matches s = MkScraper $ withHead (pure ()) . select s | ||
|
||
-- | The 'text' function takes a selector and returns the inner text from the | ||
-- set of tags described by the given selector. | ||
-- | ||
-- This function will match only the first set of tags matching the selector, to | ||
-- match every set of tags, use 'texts'. | ||
- text :: (TagSoup.StringLike str) => Selector -> Scraper str str | |
+ text :: (StringLike str) => Selector -> Scraper str str | |
text s = MkScraper $ withHead tagsToText . select s | ||
|
||
-- | The 'texts' function takes a selector and returns the inner text from every | ||
-- set of tags (possibly nested) matching the given selector. | ||
-- | ||
-- > s = "<div>Hello <div>World</div></div>" | ||
-- > scrapeStringLike s (texts "div") == Just ["Hello World", "World"] | ||
- texts :: (TagSoup.StringLike str) | |
+ texts :: (StringLike str) | |
=> Selector -> Scraper str [str] | ||
texts s = MkScraper $ withAll tagsToText . select s | ||
|
||
|
@@ -123,15 +124,15 @@ texts s = MkScraper $ withAll tagsToText . select s | |
-- | ||
-- This function will match only the first set of tags matching the selector, to | ||
-- match every set of tags, use 'htmls'. | ||
- html :: (TagSoup.StringLike str) => Selector -> Scraper str str | |
+ html :: (StringLike str) => Selector -> Scraper str str | |
html s = MkScraper $ withHead tagsToHTML . select s | ||
|
||
-- | The 'htmls' function takes a selector and returns the html string from | ||
-- every set of tags (possibly nested) matching the given selector. | ||
-- | ||
-- > s = "<div><div>A</div></div>" | ||
-- > scrapeStringLike s (htmls "div") == Just ["<div><div>A</div></div>", "<div>A</div>"] | ||
- htmls :: (TagSoup.StringLike str) | |
+ htmls :: (StringLike str) | |
=> Selector -> Scraper str [str] | ||
htmls s = MkScraper $ withAll tagsToHTML . select s | ||
|
||
|
@@ -141,7 +142,7 @@ htmls s = MkScraper $ withAll tagsToHTML . select s | |
-- | ||
-- This function will match only the first set of tags matching the selector, to | ||
-- match every set of tags, use 'innerHTMLs'. | ||
- innerHTML :: (TagSoup.StringLike str) | |
+ innerHTML :: (StringLike str) | |
=> Selector -> Scraper str str | ||
innerHTML s = MkScraper $ withHead tagsToInnerHTML . select s | ||
|
||
|
@@ -150,7 +151,7 @@ innerHTML s = MkScraper $ withHead tagsToInnerHTML . select s | |
-- | ||
-- > s = "<div><div>A</div></div>" | ||
-- > scrapeStringLike s (innerHTMLs "div") == Just ["<div>A</div>", "A"] | ||
- innerHTMLs :: (TagSoup.StringLike str) | |
+ innerHTMLs :: (StringLike str) | |
=> Selector -> Scraper str [str] | ||
innerHTMLs s = MkScraper $ withAll tagsToInnerHTML . select s | ||
|
||
|
@@ -160,22 +161,22 @@ innerHTMLs s = MkScraper $ withAll tagsToInnerHTML . select s | |
-- | ||
-- This function will match only the opening tag matching the selector, to match | ||
-- every tag, use 'attrs'. | ||
- attr :: (Show str, TagSoup.StringLike str) | |
- => String -> Selector -> Scraper str str | |
+ attr :: (Show str, StringLike str) | |
+ => str -> Selector -> Scraper str str | |
attr name s = MkScraper | ||
- $ join . withHead (tagsToAttr $ TagSoup.castString name) . select s | |
+ $ join . withHead (tagsToAttr $ castString name) . select s | |
|
||
-- | The 'attrs' function takes an attribute name and a selector and returns the | ||
-- value of the attribute of the given name for every opening tag | ||
-- (possibly nested) that matches the given selector. | ||
-- | ||
-- > s = "<div id=\"out\"><div id=\"in\"></div></div>" | ||
-- > scrapeStringLike s (attrs "id" "div") == Just ["out", "in"] | ||
- attrs :: (Show str, TagSoup.StringLike str) | |
- => String -> Selector -> Scraper str [str] | |
+ attrs :: (Show str, StringLike str) | |
+ => str -> Selector -> Scraper str [str] | |
attrs name s = MkScraper | ||
$ fmap catMaybes . withAll (tagsToAttr nameStr) . select s | ||
- where nameStr = TagSoup.castString name | |
+ where nameStr = castString name | |
|
||
-- | The 'position' function is intended to be used within the do-block of a | ||
-- `chroots` call. Within the do-block position will return the index of the | ||
|
@@ -211,7 +212,7 @@ attrs name s = MkScraper | |
-- , (2, "Third paragraph.") | ||
-- ] | ||
-- @ | ||
- position :: (TagSoup.StringLike str) => Scraper str Int | |
+ position :: (StringLike str) => Scraper str Int | |
position = MkScraper $ Just . tagsToPosition | ||
|
||
withHead :: (a -> b) -> [a] -> Maybe b | ||
|
@@ -221,27 +222,27 @@ withHead f (x:_) = Just $ f x | |
withAll :: (a -> b) -> [a] -> Maybe [b] | ||
withAll f xs = Just $ map f xs | ||
|
||
- foldSpec :: TagSoup.StringLike str | |
+ foldSpec :: StringLike str | |
=> (TagSoup.Tag str -> str -> str) -> TagSpec str -> str | ||
foldSpec f = Vector.foldr' (f . infoTag) TagSoup.empty . (\(a, _, _) -> a) | ||
|
||
|
||
- tagsToText :: TagSoup.StringLike str => TagSpec str -> str | |
+ tagsToText :: StringLike str => TagSpec str -> str | |
tagsToText = foldSpec f | ||
where | ||
f (TagSoup.TagText str) s = str `TagSoup.append` s | ||
f _ s = s | ||
|
||
- tagsToHTML :: TagSoup.StringLike str => TagSpec str -> str | |
+ tagsToHTML :: StringLike str => TagSpec str -> str | |
tagsToHTML = foldSpec (\tag s -> TagSoup.renderTags [tag] `TagSoup.append` s) | ||
|
||
- tagsToInnerHTML :: TagSoup.StringLike str => TagSpec str -> str | |
+ tagsToInnerHTML :: StringLike str => TagSpec str -> str | |
tagsToInnerHTML (tags, tree, ctx) | ||
| len < 2 = TagSoup.empty | ||
| otherwise = tagsToHTML (Vector.slice 1 (len - 2) tags, tree, ctx) | ||
where len = Vector.length tags | ||
|
||
- tagsToAttr :: (Show str, TagSoup.StringLike str) | |
+ tagsToAttr :: (Show str, StringLike str) | |
=> str -> TagSpec str -> Maybe str | ||
tagsToAttr attr (tags, _, _) = do | ||
guard $ 0 < Vector.length tags | ||
|
I think this is why I chose to mix String and StringLike. It seems a lot less convenient to have to explicitly type each tag name as a `TagName String`. Is there a specific benefit that using StringLike enables?
What if we have a String- or Text-specific file that just declares the type?
Hrmmm, there's already Text.HTML.Scalpel and Text.HTML.Scalpel.Core. I'd rather not create more re-exports of the API if it's not necessary.
Other than API consistency, is there a benefit to having StringLike, Text, and String versions of the API? I'm not super familiar with the implementation of OverloadedStrings, but I think the conversion happens at run time. If that's the case, I don't think there would be a performance difference, since internally the library is already converting to `Text` for comparing nodes.
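For reference, a minimal sketch of what OverloadedStrings does to a literal: it is treated as `fromString` applied to the literal, so the conversion is an ordinary call to the `IsString` instance of the target type. `AttrName` and its lowercasing instance are hypothetical and exist only to make the call visible; they are not part of scalpel or tagsoup. Whether GHC ends up evaluating such a call at compile time depends on optimisation and is not something this sketch measures.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (toLower)
import Data.String (IsString (..))

-- Hypothetical wrapper type, used only for illustration.
newtype AttrName = AttrName String
  deriving (Eq, Show)

-- With OverloadedStrings, a literal at type AttrName stands for
-- `fromString <literal>`, so this function is applied to the literal.
instance IsString AttrName where
  fromString = AttrName . map toLower

-- The literal goes through the instance above and is lowercased.
idAttr :: AttrName
idAttr = "ID"

-- At type String, fromString is the identity, so nothing changes.
plain :: String
plain = "ID"
```

Here `idAttr` ends up wrapping the lowercased name while `plain` keeps the literal unchanged, which shows that the literal's meaning is decided by the instance chosen for the expected type.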
I am using libs that use Text, and it's inconvenient to convert to and from strings.
What if we have a class-based file Text.HTML.Scalpel.StringLike and then just apply String to all the functions in Text.HTML.Scalpel?
Gotcha, I was not really expecting people to be passing dynamic values as attributes. I think having a single `Text.HTML.Scalpel.StringLike` is a bit more palatable, let's go with that.
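A rough sketch of the layout agreed on above, assuming the String API can be obtained just by pinning the type of the class-based functions. To stay self-contained it uses a hypothetical stand-in class `StrLike` rather than tagsoup's real `StringLike`, and the function names are made up for illustration; in the actual change the two halves would live in separate modules.

```haskell
{-# LANGUAGE FlexibleInstances #-}

import Data.Char (toLower)

-- Hypothetical stand-in for a StringLike-style class. The real class in
-- tagsoup's Text.StringLike has a different, larger set of methods.
class StrLike s where
  toStr   :: s -> String
  fromStr :: String -> s

instance StrLike String where
  toStr   = id
  fromStr = id

-- A second instance, standing in for a type such as Text.
newtype Wrapped = Wrapped String
  deriving (Eq, Show)

instance StrLike Wrapped where
  toStr (Wrapped s) = s
  fromStr           = Wrapped

-- What the class-based module would export: works for any StrLike type.
normalizeName :: StrLike s => s -> s
normalizeName = fromStr . map toLower . toStr

-- What the String module would export: the same function with its type
-- fixed to String, so callers need no annotations on literals.
normalizeNameString :: String -> String
normalizeNameString = normalizeName
```

The String version adds no logic of its own, so keeping it in sync with the class-based version is a matter of repeating the type signature.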