From ee3539ad67330a9aaaea6c52f9adc6ea900a356b Mon Sep 17 00:00:00 2001
From: Addison Phillips <addison@lab126.com>
Date: Thu, 14 Dec 2023 08:47:42 -0800
Subject: [PATCH] Add requirement for character encoding in truncation

Addresses #124
Addresses w3c/i18n-actions#62

- Add a requirement, with explanation, that a length limit
  expressed in bytes must specify the character encoding
  used to measure it (and that legacy encodings should be avoided)
- Add links to glossary terms in this section in some places
- Small tweaks to other text
---
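Note (not part of the commit message): a minimal Python sketch of why a
byte-length limit is ambiguous unless the character encoding is named.
The function name `truncate_utf8` and the sample strings are illustrative
assumptions, not taken from the specification. The sketch cuts on code
point boundaries only; truncating on grapheme boundaries, as
char_trunc_grapheme_boundary recommends, would need a text segmentation
library.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate text so its UTF-8 form fits in max_bytes.

    Cuts on a code point boundary (not a grapheme boundary).
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing, incomplete multibyte sequence
    # left behind by the byte-level cut.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


sample = "日本語テキスト"  # 7 code points, all in the BMP

# The same text has a different byte length in each encoding,
# so "at most 10 bytes" means different things.
print(len(sample.encode("utf-8")))      # 21
print(len(sample.encode("utf-16-le")))  # 14

# A 10-byte limit measured in UTF-8 keeps three characters (9 bytes).
print(truncate_utf8(sample, 10))        # 日本語

# Converting to a legacy encoding can lose data before any truncation.
print("日本語".encode("iso-8859-1", errors="replace"))  # b'???'
```

With UTF-16 the same 10-byte limit would leave room for five of these
characters rather than three, which is why the limit needs to name its
encoding.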
 index.html | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/index.html b/index.html
index 5b06603..dd9e8aa 100644
--- a/index.html
+++ b/index.html
@@ -2969,7 +2969,7 @@ <h3>Text truncation in UTF-8</h3>
     	</div>
 
 	<div class="req" id="char_trunc_grapheme_boundary">
-	<p class="advisement">Specifications that limit the length of a string SHOULD require truncation on grapheme boundaries, as truncation in the midst of a combining or joining sequence can alter the meaning of the string.</p>
+	<p class="advisement">Specifications that limit the length of a string SHOULD require truncation on grapheme boundaries, as truncation in the midst of a <a>grapheme</a> or <a>combining character sequence</a> can alter the meaning of the string.</p>
 	</div>
 
 	<div class="req" id="char_trunc_indicator">
@@ -2977,8 +2977,14 @@ <h3>Text truncation in UTF-8</h3>
 	</div>
 
 	<div class="req" id="char_trunc_min_size">
-	<p class="advisement">When specifying a length limitation in code units (such as bytes), specifications SHOULD set the maximum length in a way that accommodates users whose language requires multibyte code unit sequences.</p>
+	<p class="advisement">When specifying a length limitation in code units (such as bytes), specifications SHOULD set the limit in a way that accommodates users whose language requires multibyte code unit sequences.</p>
 	</div>
+	
+	<div class="req" id="char_trunc_character_encoding">
+		<p class="advisement">If a specification specifies a length limit in code units (such as bytes), it MUST specify the <a>character encoding</a> used in measuring the limit; such a limit SHOULD NOT specify a <a>legacy character encoding</a>.</p>
+	</div>
+	
+	<p>If a specification permits or requires truncation of a field, the <a>character encoding</a> determines what the limit means, because the same text can occupy a different number of bytes in different encodings. If the limit is in bytes and <a>legacy character encodings</a> are permitted, note that converting Unicode data to a non-Unicode encoding can also result in data loss (since most <a>legacy character encodings</a> encode only a subset of Unicode).</p>
 </section>
 
 <section id="strcat" class="subtopic">