Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spans (from rich text) inside cells cause newlines #7

Open
jywarren opened this issue Jul 29, 2016 · 13 comments
Open

spans (from rich text) inside cells cause newlines #7

jywarren opened this issue Jul 29, 2016 · 13 comments
Labels

Comments

@jywarren
Copy link
Collaborator

Pasting content from an external page into a table in a Woofmark instance (like this one: https://publiclab.github.io/PublicLab.Editor/examples/) often results in spans inside tables.

screenshot 2016-07-29 at 11 55 15 am

That becomes:

| col0 | col1 | col2 |
|------|------|------|
| cell | through the steps  
 | cell |
| cell | cell | cell |
| cell | cell | cell |
| cell | cell | cell |

However, tds may often contain spans, and it seems we could figure out how to parse them gracefully at least. Here it seems like that is almost happening properly.

The span tags are removed. I was going to suggest just that the extra newline could be stripped too. I'm happy to write a test and attempt this change?

@jywarren jywarren added the bug label Jul 29, 2016
@bevacqua
Copy link
Owner

Please do!

@jywarren
Copy link
Collaborator Author

That's odd -- putting a span into the 'tables with complex content still get proper padding' test doesn't trigger the issue. Digging a bit.

@jywarren
Copy link
Collaborator Author

OK -- pasting in even a single word (on a recent Chrome, in ChromeOS) appends a <br> to the table cell content:

<td><span style="color: rgb(170, 170, 170);">steps</span><br></td>

Trying to think through if we should strip that out, strip trailing <br>s out, or what. It definitely doesn't survive the markdown conversion because a newline mid-row breaks the table.

I think it's appropriate to strip <br> tags in this case. Thoughts or counter-cases?

@bevacqua
Copy link
Owner

bevacqua commented Jul 29, 2016

Are you copying a part of a page and getting this rich format? Where are you pasting? It may be more of a contenteditable issue than something specific to domador I think.

I'm fine with stripping <\s*br\s*\/?\s*> from start/end of cell, in any case.

@jywarren
Copy link
Collaborator Author

Yes, copying a single word (mid-sentence). I'd agree that it's contenteditable but it's also an extremely common use case for Woofmark (actually reported by a user of one of my projects). Perhaps it could be a configurable option?

Though if a <br> results in an unparseable markdown table, it seems like an issue regardless of where the
came from, but I guess that's a question for the GFM table spec.

@bevacqua
Copy link
Owner

bevacqua commented Jul 29, 2016

I think the issue specific to domador is that <br> in table cells should be kept as-is, and not as \n. There should, furthermore, be a fix to woofmark where we strip <br> from start/end of pasted content. Thoughts?

@jywarren
Copy link
Collaborator Author

That sounds reasonable, yeah. I didn't find tables represented in Commonmark, but is the GFM tables behavior supposed to be that HTML may be nested in tables, but Markdown may not be?

For example, an unordered list can be placed inside a table in woofmark but it doesn't survive conversion to valid markdown via domador. I tend to think it's unreasonable (impossible?) to expect multiline Markdown to be parseable inside a GFM table cell (see below), but just curious if there's a well-established spec for that.

| col0 | col1 |
|------|------|
| cell | 

- one
- two |
| cell | cell |

@jywarren
Copy link
Collaborator Author

jywarren commented Jul 29, 2016

And for good measure, this works:

col0 col1
cell
Hi!
cell cell
| col0 | col1 |
|------|------|
| cell | <br> Hi! <br> |
| cell | cell |

@bevacqua
Copy link
Owner

bevacqua commented Jul 29, 2016

Yeah, I'd avoid newlines being produced by domador in general when creating Markdown. How about a flag where we tell domador to preserve certain HTML tags if in a cell?

@bevacqua
Copy link
Owner

bevacqua commented Jul 29, 2016

As in:

| col0 | <ul><li>one</li><li>two</li></ul> |
|------|------|
| cell | <ul><li>one</li><li>two</li></ul> |
| cell | cell |

Which would render properly:

col0
  • one
  • two
cell
  • one
  • two
cell cell

@bevacqua
Copy link
Owner

I'm pretty sure we need to preserve all block elements and <br>, any others you can think of?

@bevacqua
Copy link
Owner

bevacqua commented Jul 29, 2016

By the way if you want to join ponyfoo.com/slack that'll make chatting about this stuff easier :)

@jywarren
Copy link
Collaborator Author

jywarren commented Jul 29, 2016

Joined, thanks. Created a test for this in #8, will take a look at coding in a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants