HTML API: Track spans of text with (offset, length) instead of (start, end)

This patch follows up on earlier design questions about how to represent
spans of strings inside the class. It is relevant now as preparation for WordPress#5683.

The mixture of (offset, length) and (start, end) coordinates becomes confusing
at times, and all final string operations are performed with the (offset, length)
pair, since these feed into `substr()`.
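
As an illustration of why the (offset, length) pair is convenient, here is a minimal sketch. The sample document and variable names are hypothetical and not taken from the patch; it only shows that a length can be passed straight to `substr()`, while an end offset has to be converted at each call site.

```php
<?php
// Hypothetical sample document; not taken from the patch.
$html = '<div class="post">Hello</div>';

// The attribute expression `class="post"` starts at byte 5 and spans 12 bytes.
$start  = 5;
$length = 12;

// The equivalent (start, end) form, with `end` as the offset just past the span.
$end = $start + $length;

// With (offset, length), the pair feeds substr() directly.
$by_length = substr( $html, $start, $length );

// With (start, end), every call site has to subtract first.
$by_end = substr( $html, $start, $end - $start );
```

Both calls extract the same bytes; the second form simply repeats the subtraction wherever a string operation happens.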

In preparation for exposing all tokens within an HTML document, this change:
 - Unifies the representation throughout the class.
 - Creates `token_starts_at` to track the start of the current token.
 - Replaces `tag_ends_at` with `token_length` for reuse with other token types.

There should be no functional or behavioral changes in this patch.

For the internal helper classes this patch introduces breaking changes, but those
classes are marked private and should not be used outside of the HTML API itself.
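
For code that did construct these helpers directly, the migration might look like the sketch below. `Example_Span` and `example_span_end()` are hypothetical stand-ins that only mirror the new (start, length) shape of `WP_HTML_Span`; the sketch assumes the old `end` was the offset just past the span.

```php
<?php
/**
 * Hypothetical stand-in mirroring the new shape of WP_HTML_Span.
 */
class Example_Span {
	public $start;
	public $length;

	public function __construct( $start, $length ) {
		$this->start  = $start;
		$this->length = $length;
	}
}

/**
 * Hypothetical helper: recovers the old `end` offset from a span.
 */
function example_span_end( Example_Span $span ) {
	return $span->start + $span->length;
}

$html = '<a rel=noopener>';

// A caller that previously passed (start, end) = (3, 15)
// now passes the length instead: 15 - 3 = 12.
$span = new Example_Span( 3, 15 - 3 );
$text = substr( $html, $span->start, $span->length );
```

Callers that still need an end offset can derive it on demand, as `example_span_end()` does, rather than storing both values.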
dmsnell committed Nov 30, 2023
1 parent 4d19f6c commit 259fa03
Showing 4 changed files with 112 additions and 68 deletions.
30 changes: 24 additions & 6 deletions src/wp-includes/html-api/class-wp-html-attribute-token.php
@@ -15,6 +15,7 @@
  *
  * @access private
  * @since 6.2.0
+ * @since {WP_VERSION} Replaced `end` with `length` to more closely match `substr()`.
  *
  * @see WP_HTML_Tag_Processor
  */
@@ -23,6 +24,7 @@ class WP_HTML_Attribute_Token {
  * Attribute name.
  *
  * @since 6.2.0
+ *
  * @var string
  */
  public $name;
@@ -31,6 +33,7 @@ class WP_HTML_Attribute_Token {
  * Attribute value.
  *
  * @since 6.2.0
+ *
  * @var int
  */
  public $value_starts_at;
@@ -39,6 +42,7 @@ class WP_HTML_Attribute_Token {
  * How many bytes the value occupies in the input HTML.
  *
  * @since 6.2.0
+ *
  * @var int
  */
  public $value_length;
@@ -47,22 +51,36 @@ class WP_HTML_Attribute_Token {
  * The string offset where the attribute name starts.
  *
  * @since 6.2.0
+ *
  * @var int
  */
  public $start;

  /**
- * The string offset after the attribute value or its name.
+ * Byte length of the entire attribute name or name and value pair expression.
+ *
+ * Example:
+ *
+ *     <div class="post">
+ *          ------------ length is 12, including quotes
+ *
+ *     <input type="checked" checked id="selector">
+ *                           ------- length is 7
+ *
+ *     <a rel=noopener>
+ *        ------------ length is 12
+ *
+ * @since {WP_VERSION}
+ *
- * @since 6.2.0
  * @var int
  */
- public $end;
+ public $length;

  /**
  * Whether the attribute is a boolean attribute with value `true`.
  *
  * @since 6.2.0
+ *
  * @var bool
  */
  public $is_true;
@@ -76,15 +94,15 @@ class WP_HTML_Attribute_Token {
  * @param int $value_start Attribute value.
  * @param int $value_length Number of bytes attribute value spans.
  * @param int $start The string offset where the attribute name starts.
- * @param int $end The string offset after the attribute value or its name.
+ * @param int $length Byte length of the entire attribute name or name and value pair expression.
  * @param bool $is_true Whether the attribute is a boolean attribute with true value.
  */
- public function __construct( $name, $value_start, $value_length, $start, $end, $is_true ) {
+ public function __construct( $name, $value_start, $value_length, $start, $length, $is_true ) {
  $this->name = $name;
  $this->value_starts_at = $value_start;
  $this->value_length = $value_length;
  $this->start = $start;
- $this->end = $end;
+ $this->length = $length;
  $this->is_true = $is_true;
  }
  }
17 changes: 9 additions & 8 deletions src/wp-includes/html-api/class-wp-html-span.php
@@ -18,6 +18,7 @@
  *
  * @access private
  * @since 6.2.0
+ * @since {WP_VERSION} Replaced `end` with `length` to more closely align with `substr()`.
  *
  * @see WP_HTML_Tag_Processor
  */
@@ -31,23 +32,23 @@ class WP_HTML_Span {
  public $start;

  /**
- * Byte offset into document where span ends.
+ * Byte length of span.
  *
- * @since 6.2.0
+ * @since {WP_VERSION}
  * @var int
  */
- public $end;
+ public $length;

  /**
  * Constructor.
  *
  * @since 6.2.0
  *
- * @param int $start Byte offset into document where replacement span begins.
- * @param int $end Byte offset into document where replacement span ends.
+ * @param int $start Byte offset into document where replacement span begins.
+ * @param int $length Byte length of span.
  */
- public function __construct( $start, $end ) {
- $this->start = $start;
- $this->end = $end;
+ public function __construct( $start, $length ) {
+ $this->start = $start;
+ $this->length = $length;
  }
  }