
Strings containing emojis produce shifted results #51

Open
david-gorski opened this issue May 12, 2020 · 0 comments

It seems that when emojis are present in the input, the resulting groups and matched strings are shifted.

For example, I was using:

let textRegex = "\\[([^]]*)\\]".r!

let input = """
Mexican culture has lots of rich history and great food! 🌯🌯🌯🌯

Avacado's are already incredibly popular and for good reason: They taste good, work in tons or recipes, and are [good for you]<url>{https://www.healthline.com/nutrition/12-proven-benefits-of-avocado}. So maybe you already have [avocado]<trend>{avocado-intake} every day or maybe just every once in a while, but maybe there's even more reasons to love these green fatty fruits! [Avacado's have been shown to improve sleep]<url>{https://www.cbsnews.com/pictures/foods-that-will-help-you-sleep-better/9/}.
"""

for text in textRegex.findAll(in: input).makeIterator(){
    print(text.matched)
}

This produces:
d for you]<url
cado]<tre
cado's have been shown to improve sleep]<url

Instead of the expected:
good for you
avocado
Avacado's have been shown to improve sleep

The shift is caused by the emojis. Each emoji shifts the result indices by 1, so here they are shifted by 4.
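A plausible explanation, which I have not confirmed against the library's source, is that the match offsets are UTF-16 code unit offsets (as reported by `NSRegularExpression`) but are applied to the string as `Character` offsets. Each 🌯 is one `Character` but two UTF-16 code units, which would account for a shift of exactly 1 per emoji. The sketch below uses Foundation directly rather than this library. `matchedStrings` is a hypothetical helper written for illustration, and it shows the conversion that avoids the shift by mapping each `NSRange` back through `Range(_:in:)`.

```swift
import Foundation

// Hypothetical helper (not part of this library): returns the text of
// capture group 1 for every match of `pattern` in `input`.
func matchedStrings(pattern: String, in input: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    // NSRegularExpression measures ranges in UTF-16 code units.
    let whole = NSRange(input.startIndex..<input.endIndex, in: input)
    return regex.matches(in: input, range: whole).compactMap { match in
        // Range(_:in:) maps UTF-16 offsets back to String.Index values,
        // so characters outside the BMP (such as emojis) do not shift the result.
        Range(match.range(at: 1), in: input).map { String(input[$0]) }
    }
}

let sample = "🌯🌯🌯🌯 are [good for you]"
// One Character per emoji, but two UTF-16 code units each.
print(sample.utf16.count - sample.count)  // 4
print(matchedStrings(pattern: "\\[([^]]*)\\]", in: sample))
```

The difference of 4 between the two counts matches the shift observed above, which is why I suspect the offset conversion.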
