Group Unicode symbols into browsable categories #110

amyjko · 2023-07-24T03:24:45Z

What's the problem?

From Amy's first comment:
The glyph chooser in the editor is just search. Add a browsing feature that organizes symbols into categories. There are a lot of ways we might organize; start with design work.

From Erica and Bethany's design spec:

What is Unicode? And how is it organized?

Unicode is a standardized character encoding system that aims to represent every character from every writing system in the world. It provides a unique code point for each character, symbol, or ideograph, regardless of the platform, program, or language. This standard allows computers to handle and display text in various languages and scripts.

Unicode includes a vast range of characters, including letters, numbers, symbols, emojis, and special characters. Each character is assigned a unique numerical value known as a "code point," which can be represented in various formats, such as hexadecimal or decimal.
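
As a small illustration of the hexadecimal and decimal forms mentioned above, here is a minimal TypeScript sketch. The helper names `formatCodePoint` and `firstCodePoint` are made up for this example and are not part of Wordplay.

```typescript
// Format a code point in the conventional "U+XXXX" hexadecimal notation.
function formatCodePoint(codePoint: number): string {
    return "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0");
}

// Read the first code point of a string (hypothetical helper).
function firstCodePoint(text: string): number {
    const codePoint = text.codePointAt(0);
    if (codePoint === undefined) throw new Error("empty string");
    return codePoint;
}

const cat = firstCodePoint("🐱");
console.log(cat); // decimal form: 128049
console.log(formatCodePoint(cat)); // hexadecimal form: U+1F431
console.log(String.fromCodePoint(0x231a)); // the WATCH glyph (U+231A)
```

The same number identifies the character in both notations; only the way it is written differs.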

Unicode is crucial for ensuring multilingual support in software and communication technologies, making it possible to display text and symbols from different languages and writing systems consistently and accurately across different devices and platforms.

To see how Unicode is organized, please check out the Unicode Data Format link, which also explains what the numerical values mean.

So what’s the problem? What are we trying to fix?

Currently, as a user searches, there is no organization and no efficient way to browse through the Unicode results. Although the search bar is meant to narrow down the search, the results are cluttered with irrelevant and missing values, with no understandable prioritization. Additionally, although Wordplay aims to be a multilingual platform, the current search results only reflect the English language. For each of these problems, we will expand on the current state and our proposed solutions.

What's the design idea?

The design idea has three main parts; the details are explained in the design spec:

Multi-language inaccessibility
UI redesign (developers can change this now)
Future considerations

Design specification

Propose grouping Unicode symbols into browsable categories (#110)

Designers: Bethany Chum, Erica Ding

Proposed Solutions

1. Multi-Language Inaccessible

In our current design, we incorporate all of the Unicode glyphs and symbols. However, none of them are accessible in other languages, because all the descriptions are in English only. This means that the Wordplay search box does not have multilingual search functionality. For example, when the selected language is English, entering 'Comida' (Spanish for ‘Food’) returns no search results.

But even within the selected language, the appropriate search results are not shown. In the same example, with the Spanish language selected, searching ‘Comida’ does not produce any results, meaning the behavior of the search is limited to English searches. The search should be translated, especially if a language other than English is selected.

(The same goes for a Chinese search: searching ‘猫’, the Chinese character for ‘cat’, does not produce any results. We propose that, regardless of whether English or Mandarin is your selected language, it should produce the same search results as searching ‘cat’ in English.)

To make Wordplay fully accessible in another language, we need to translate the Unicode descriptions, but no work has been done on this yet. We propose to first translate a small amount of Unicode at a time, starting with the glyphs we think would be most frequently used or most popular with our target audience (elementary and middle school students). [Refer to 2.2 for the categories we propose to translate first.] For the actual translation task, we suggest that the locale team tackle one category of Unicode at a time. Alternatively, we could outsource this task to any contributors who would like to translate Unicode into their language.

With translated Unicode descriptions, we propose that the search feature allow users to search in their own language, no matter which language version they are logged in to.
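
A minimal sketch of what such a search could look like, assuming each glyph carries an English description plus whatever translations exist so far. The `GlyphEntry` shape, the `searchAllLocales` function, and the sample translations are all hypothetical and only illustrate the idea.

```typescript
// Hypothetical shape: one description per locale that has been translated.
interface GlyphEntry {
    glyph: string;
    names: { [locale: string]: string };
}

// Match the query against every available description,
// not only the one for the currently selected locale.
function searchAllLocales(entries: GlyphEntry[], query: string): GlyphEntry[] {
    const needle = query.trim().toLocaleLowerCase();
    if (needle.length === 0) return entries;
    return entries.filter((entry) =>
        Object.values(entry.names).some((name) =>
            name.toLocaleLowerCase().includes(needle)
        )
    );
}

// Illustrative data only; these translations are placeholders.
const sample: GlyphEntry[] = [
    { glyph: "🐱", names: { en: "cat face", es: "cara de gato", zh: "猫脸" } },
    { glyph: "🍲", names: { en: "pot of food", es: "olla de comida" } },
];

console.log(searchAllLocales(sample, "comida").map((e) => e.glyph));
console.log(searchAllLocales(sample, "猫").map((e) => e.glyph));
```

Because every description is checked, a search for ‘猫’ and a search for ‘cat’ find the same glyph regardless of the selected language.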

As an immediate step, we propose showing a disclaimer message below the search bar while a user uses the search feature, stating that “Unicode search and descriptions are only available in English” (translated into the selected language).
We understand that translating Unicode is a large task, so once translation begins, we propose adding a disclaimer message to acknowledge that Wordplay is not yet fully accessible in other languages. For example, the message could read, “Some Unicode not yet available in your language; we are currently translating the rest of the Unicode into your language.”
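
The two disclaimer stages described above could be chosen with a small function like the sketch below. It assumes the app can count how many descriptions have been translated for the selected locale; the `chooseDisclaimer` name, the message keys, and the counts are hypothetical.

```typescript
type Disclaimer = "english-only" | "partially-translated" | null;

// English text of the two messages from the spec; in the app these
// would themselves be translated into the selected language.
const MESSAGES: Record<"english-only" | "partially-translated", string> = {
    "english-only": "Unicode search and descriptions are only available in English",
    "partially-translated":
        "Some Unicode not yet available in your language; we are currently translating the rest of the Unicode into your language",
};

// Decide which notice to show, given translation progress for a locale.
function chooseDisclaimer(locale: string, translated: number, total: number): Disclaimer {
    if (locale === "en" || translated >= total) return null; // nothing to disclose
    return translated === 0 ? "english-only" : "partially-translated";
}

const notice = chooseDisclaimer("es", 0, 149813);
console.log(notice === null ? "(no disclaimer)" : MESSAGES[notice]);
```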

2. UI redesign

2.1 Prioritize operators

Operators are important symbols that users use to learn programming, but we have found that when users use the search box, operators disappear until the user deletes everything in the search box. This can make users spend more time rediscovering these operators.

current display of expanded glyphs without operators displayed

To make the operators easy to access, we decided to pin them to the top of the search box at all times, so that they stand out from the search results.

proposed display of Operators with collapsed Symbols dropdown menu
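
One way to picture the pinned operators is to keep them outside the search filter entirely, as in this minimal sketch. The `OPERATORS` list is a placeholder (the real set would come from the language itself), and `ChooserView` and `buildChooserView` are hypothetical names.

```typescript
interface ChooserView {
    operators: string[]; // always shown at the top
    results: string[]; // changes with the query
}

// Placeholder operator list, for illustration only.
const OPERATORS: string[] = ["+", "-", "×", "÷"];

// Only the glyph results are filtered; the operators never depend on the query.
function buildChooserView(
    query: string,
    glyphNames: { [glyph: string]: string }
): ChooserView {
    const needle = query.trim().toLowerCase();
    const results = Object.entries(glyphNames)
        .filter(([, name]) => needle.length === 0 || name.toLowerCase().includes(needle))
        .map(([glyph]) => glyph);
    return { operators: OPERATORS, results };
}

const view = buildChooserView("cat", { "🐱": "cat face", "⌚": "watch" });
console.log(view.operators, view.results);
```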

2.2 Dropdown categories (translated and most frequently used)

Once we have condensed the unicode to valid and relevant results, we think it is best to organize the symbols/glyphs into a few categories: Emojis, Arrows, Shapes, and Other. We think these categories best reflect why a user might want to choose a symbol/glyph in Wordplay. Because Wordplay is designed to be a playful, global, and accessible programming language, we think these categories eliminate the excess symbols/glyphs that will likely not be used given the other alternatives.

We propose using a dropdown menu to present the categories and limit the search feature within those categories so the search process is organized.

Once a category has been chosen, an expanded section will open under the Operators bar. There will be three components in this expanded section: a search bar, recently used (symbols/glyphs), and all the symbols/glyphs that are in this category. The 'recently used' section displays the glyphs most recently used by the user and is positioned at the top of the expanded section. To enhance user accessibility and continuity across various devices, the glyphs saved in each user account are stored in the cloud, ensuring seamless synchronization of recently used glyphs.

The duration set for the 'recently used' section is three months, aligning with the educational context of Wordplay as a web app for class learning. This time frame corresponds to a quarter, providing an optimal period for students to complete substantial programming assignments or cover significant course content. Moreover, to prioritize a streamlined user experience, only eight glyphs can be stored in the 'recently used' column. This limitation ensures that users can easily access and manage a concise set of recently used symbols, promoting efficiency and decluttering the workspace.
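
The eight-glyph limit and three-month window could be sketched as follows. This example approximates "three months" as 90 days, which is an assumption, and it leaves out the cloud synchronization entirely; `recordUse` and `RecentGlyph` are hypothetical names.

```typescript
const MAX_RECENT = 8;
// Assumption: "three months" is approximated as 90 days.
const RETENTION_MS = 90 * 24 * 60 * 60 * 1000;

interface RecentGlyph {
    glyph: string;
    usedAt: number; // milliseconds since the epoch
}

// Return a new list with the glyph moved to the front, expired
// entries dropped, and the total capped at eight.
function recordUse(recent: RecentGlyph[], glyph: string, now: number): RecentGlyph[] {
    const kept = recent.filter(
        (entry) => entry.glyph !== glyph && now - entry.usedAt <= RETENTION_MS
    );
    return [{ glyph, usedAt: now }, ...kept].slice(0, MAX_RECENT);
}

let recent: RecentGlyph[] = [];
recent = recordUse(recent, "🐱", Date.now());
recent = recordUse(recent, "⌚", Date.now());
console.log(recent.map((entry) => entry.glyph));
```

Returning a new array on each use keeps the function easy to test and to persist wherever the per-account data ends up being stored.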

The remainder of the section will be populated with all the symbols/glyphs in the given category until there is a search. As the user begins to search, the search engine will only search through the symbols/glyphs in that category. The recently used section will scroll with the rest of the results (will not be frozen at the top of the expanded section) since they are likely not what the user is looking for, given they are scrolling away from it.
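
The category-scoped search described above might look like this minimal sketch, assuming each glyph already carries a category label. `CategorizedGlyph` and `searchInCategory` are hypothetical names; the sample glyphs come from the category lists in this spec.

```typescript
type Category = "Emojis" | "Arrows" | "Shapes" | "Other";

interface CategorizedGlyph {
    glyph: string;
    name: string;
    category: Category;
}

// With no query, show the whole category; otherwise search only inside it.
function searchInCategory(
    glyphs: CategorizedGlyph[],
    category: Category,
    query: string
): CategorizedGlyph[] {
    const inCategory = glyphs.filter((g) => g.category === category);
    const needle = query.trim().toLowerCase();
    if (needle.length === 0) return inCategory;
    return inCategory.filter((g) => g.name.toLowerCase().includes(needle));
}

const glyphs: CategorizedGlyph[] = [
    { glyph: "⌚", name: "WATCH", category: "Emojis" },
    { glyph: "⇄", name: "RIGHTWARDS ARROW OVER LEFTWARDS ARROW", category: "Arrows" },
    { glyph: "⇳", name: "UP DOWN WHITE ARROW", category: "Arrows" },
    { glyph: "■", name: "BLACK SQUARE", category: "Shapes" },
];

console.log(searchInCategory(glyphs, "Arrows", "").map((g) => g.glyph));
console.log(searchInCategory(glyphs, "Arrows", "white").map((g) => g.glyph));
```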

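Below are our categorization examples. Each row gives the first and last character of a range; a row with a single entry is an individual character: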

Emojis

U+231A WATCH	U+231B HOURGLASS
U+23E9 BLACK RIGHT-POINTING DOUBLE TRIANGLE	U+23FF OBSERVER EYE SYMBOL
U+2614 UMBRELLA WITH RAIN DROPS	U+2615 HOT BEVERAGE
U+2648 ARIES	U+2653 PISCES
U+267F WHEELCHAIR SYMBOL
U+26F2 FOUNTAIN	U+26F5 SAILBOAT
U+26F7 SKIER	U+26FA TENT
U+26FD FUEL PUMP
U+270A RAISED FIST	U+270D WRITING HAND
U+2728 SPARKLES
U+1F300 CYCLONE	U+1F531 TRIDENT EMBLEM
U+1F549 OM SYMBOL	U+1F57A MAN DANCING
U+1F58A LOWER LEFT BALLPOINT PEN	U+1F58D LOWER LEFT CRAYON
U+1F5A5 DESKTOP COMPUTER	U+1F5A8 PRINTER
U+1F5D1 WASTEBASKET	U+1F5D3 SPIRAL CALENDAR PAD
U+1F5FA WORLD MAP	U+1F64F PERSON WITH FOLDED HANDS
U+1F680 ROCKET	U+1F6C5 LEFT LUGGAGE
U+1F6CB COUCH AND LAMP	U+1F6FC ROLLER SKATE
U+1F90C PINCHED FINGERS
U+1F90F PINCHING HAND	U+1F9FF NAZAR AMULET
U+1FA70 BALLET SHOES	U+1FA74 THONG SANDAL
U+1FA78 DROP OF BLOOD	U+1FA86 NESTING DOLLS
U+1FA90 RINGED PLANET	U+1FAAC HAMSA
U+1FAB0 FLY	U+1FABA NEST WITH EGGS
U+1FAC0 ANATOMICAL HEART	U+1FAC5 PERSON WITH CROWN
U+1FAD0 BLUEBERRIES	U+1FAF6 HEART HANDS

Arrows

U+21C4 RIGHTWARDS ARROW OVER LEFTWARDS ARROW	U+21F3 UP DOWN WHITE ARROW
U+2301 ELECTRIC ARROW
U+2303 UP ARROWHEAD	U+2304 DOWN ARROWHEAD

Shapes

U+25A0 BLACK SQUARE	U+25D7 RIGHT HALF BLACK CIRCLE
U+25D9 INVERSE WHITE CIRCLE	U+25F7 WHITE CIRCLE WITH UPPER RIGHT QUADRANT
U+2686 WHITE CIRCLE WITH DOT RIGHT	U+2689 BLACK CIRCLE WITH TWO WHITE DOTS
U+26F6 SQUARE FOUR CORNERS
U+2729 STRESS OUTLINED WHITE STAR	U+274B HEAVY EIGHT TEARDROP-SPOKED PROPELLER ASTERISK
U+1F532 BLACK SQUARE BUTTON	U+1F53F UPPER RIGHT SHADOWED WHITE CIRCLE
U+1F7E0 LARGE ORANGE CIRCLE	U+1F7EB LARGE BROWN SQUARE
U+1F90D WHITE HEART	U+1F90E BROWN HEART
U+1FA75 LIGHT BLUE HEART	U+1FA77 PINK HEART
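
Reading each row above as a first-to-last code point range (an interpretation of the lists, not something the spec states outright), the categories could be stored as data and looked up as in this sketch. Only a few of the rows are included, and `RANGES` and `categorize` are hypothetical names.

```typescript
type CategoryName = "Emojis" | "Arrows" | "Shapes" | "Other";

interface CodePointRange {
    first: number;
    last: number;
    category: CategoryName;
}

// A handful of the ranges listed above; the full lists would be added the same way.
const RANGES: CodePointRange[] = [
    { first: 0x231a, last: 0x231b, category: "Emojis" },
    { first: 0x1f300, last: 0x1f531, category: "Emojis" },
    { first: 0x21c4, last: 0x21f3, category: "Arrows" },
    { first: 0x2301, last: 0x2301, category: "Arrows" },
    { first: 0x25a0, last: 0x25d7, category: "Shapes" },
    { first: 0x1f7e0, last: 0x1f7eb, category: "Shapes" },
];

// Anything outside the listed ranges falls into "Other".
function categorize(codePoint: number): CategoryName {
    for (const range of RANGES) {
        if (codePoint >= range.first && codePoint <= range.last) return range.category;
    }
    return "Other";
}

console.log(categorize(0x1f431)); // CAT FACE falls in the U+1F300 to U+1F531 row
console.log(categorize(0x41)); // LATIN CAPITAL LETTER A is not in any listed row
```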

3. Future Considerations

Tutorials

For a Wordplay user and novice programmer, using the search bar might be initially perplexing, especially when confronted with unicodes and glyphs. Offering tutorials to guide them through our features would be beneficial, helping them navigate and avoid feeling overwhelmed.

Translating everything

As mentioned in our first proposed point about multi-language Unicode descriptions, Unicode is currently only accessible in English. Ideally, we would be able to translate all 149,813 glyphs into all the languages that Wordplay supports and plans to support. The next step would be to translate Unicode by category and push the translations to Wordplay one category at a time (as opposed to one translated description at a time), so that it is less messy for users searching for glyphs.

Expanding dropdown categories to incorporate more of the translated unicode

As our proficiency in translating Unicode advances, we have the opportunity to include a broader range of language symbols, such as Han Zi (the Mandarin Chinese term for Han characters). According to the official Unicode explanation, the comprehensive coverage extends to languages written in various widely-used scripts. (the full explanation can be found in the insert link)

When users switch to a specific language, we can enhance the dropdown menu by incorporating a dedicated section that shows the unique character categories relevant to that language.

External link:

Figma: https://www.figma.com/file/aCN4TggJDma3zW8y1OMTbE/Unicode-Grouping?type=design&node-id=0%3A1&mode=design&t=mILYDFWALqqRHnjo-1

YizhouDing · 2023-11-01T00:45:18Z

Propose grouping Unicode symbols into browsable categories (#110)

Designers: Bethany Chum, Erica Ding

Link on Github: #110

“The glyph chooser in the editor is just search. Add a browsing feature that organizes symbols into categories. There are a lot of ways we might organize; start with design work.” (Github description)

Problem

Currently, there is a search bar that allows a user to search for symbols/glyphs. The symbols/glyphs are pulled from Unicode.

What is Unicode? And how is it organized?

Unicode is a standardized character encoding system that aims to represent every character from every writing system in the world. It provides a unique code point for each character, symbol, or ideograph, regardless of the platform, program, or language. This standard allows computers to handle and display text in various languages and scripts.

Unicode includes a vast range of characters, including letters, numbers, symbols, emojis, and special characters. Each character is assigned a unique numerical value known as a "code point," which can be represented in various formats, such as hexadecimal or decimal.

Unicode is crucial for ensuring multilingual support in software and communication technologies, making it possible to display text and symbols from different languages and writing systems consistently and accurately across different devices and platforms.

To see how Unicode is organized, please check out the Unicode Data Format link, which also explains what the numerical values mean.

So what’s the problem? What are we trying to fix?

Currently, as a user searches, there is no organization and no efficient way to browse through the Unicode results. Although the search bar is meant to narrow down the search, the results are cluttered with irrelevant and missing values, with no understandable prioritization. Additionally, although Wordplay aims to be a multilingual platform, the current search results only reflect the English language. For each of these problems, we will expand on the current state and our proposed solutions.

Proposed Solutions

1.1 Condense the unicode range – remove error unicode

Currently, there are several blank spaces that appear in the search results. Developers should prioritize identifying the reasons behind these errors and resolving them. If necessary, these error unicode results should be removed.

1.2 Condense the unicode range – remove irrelevant unicode

Some less commonly used symbols also appear. At least in the English language, there are no commonly used terms for these symbols/glyphs. They can also be removed if not necessary, because they cause clutter and require users to spend more time searching for the symbol they want.

2.1 Prioritization – Operators

Operators are important symbols that users use to learn programming, but we have found that when users use the search box, operators disappear until the user deletes everything in the search box. This can make users spend more time rediscovering these operators.

To make the operators easy to access, we decided to pin them to the top of the search box at all times, so that they stand out from the search results.

2.2 Prioritization – Dropdown Categories

Once we have condensed the unicode to valid and relevant results, we think it is best to organize the symbols/glyphs into a few categories: Emojis, Arrows, Shapes, and Other. We think these categories best reflect why a user might want to choose a symbol/glyph in Wordplay. Because Wordplay is designed to be a playful, global, and accessible programming language, we think these categories eliminate the excess symbols/glyphs that will likely not be used given the other alternatives.

We propose using a dropdown menu to present the categories and limit the search feature within those categories so the search process is organized.

Once a category has been chosen, an expanded section will open under the Operators bar. There will be three components in this expanded section: a search bar, recently used (symbols/glyphs), and all the symbols/glyphs that are in this category. The ‘recently used’ section contains the most recently used glyphs and will be shown at the top of the expanded section. The remainder of the section will be populated with all the symbols/glyphs in the given category until there is a search. As the user begins to search, the search engine will only search through the symbols/glyphs in that category. The recently used section will scroll with the rest of the results (will not be frozen at the top of the expanded section) since they are likely not what the user is looking for given they are scrolling away from it.

2.3 Prioritization – Using Tooltip Text

Currently, when the user moves the cursor over the corresponding Unicode character, a small tooltip appears, reminding the user to 'insert [Unicode],' which is not very helpful for users.

current tooltip description of a cat emoji

Users still don't know which Unicode they have selected. We suggest changing the tooltip below to display the name of the Unicode character and provide a clearer description. This way, users can have a clearer understanding of the Unicode they are using and can search for the Unicode they want effectively.

proposed tooltip description of a kissing cat emoji

With descriptive tooltip text, we can use the descriptions to help organize/prioritize the search results. By using the description, it will likely be closer to what users will search for and make more sense than the current search. There is currently no observable prioritization of the results, so using the descriptions to match the input search can provide some type of order.
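
The spec does not say how descriptions should order the results, so the scoring below is only one possible scheme, offered as a sketch: an exact name match ranks first, then whole-word matches, then other substring matches. `matchScore` and `rankByDescription` are hypothetical names.

```typescript
// Higher scores mean closer matches; 0 means the name does not match at all.
function matchScore(name: string, query: string): number {
    const n = name.toLowerCase();
    const q = query.trim().toLowerCase();
    if (q.length === 0 || !n.includes(q)) return 0;
    if (n === q) return 3; // the whole description is the query
    if (n.split(/\s+/).includes(q)) return 2; // the query is one whole word
    return 1; // the query appears inside a longer word
}

// Keep only matching names, best matches first, ties in alphabetical order.
function rankByDescription(names: string[], query: string): string[] {
    return names
        .map((name) => ({ name, score: matchScore(name, query) }))
        .filter((item) => item.score > 0)
        .sort((a, b) => b.score - a.score || a.name.localeCompare(b.name))
        .map((item) => item.name);
}

const names = [
    "MULTIPLICATION SIGN",
    "KISSING CAT FACE WITH CLOSED EYES",
    "CAT",
    "CAT FACE",
    "WATCH",
];
console.log(rankByDescription(names, "cat"));
```

With this scheme, a search for "cat" lists CAT before CAT FACE, and MULTIPLICATION SIGN (which only contains the letters "cat") comes last rather than being mixed in with the cat glyphs.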

3. Multi-language friendly

From the current design, it appears that the Wordplay search box does not have multilingual search functionality. For example, when the selected language is English, entering 'Comida' (Spanish for ‘Food’) returns no search results. To make search more user-friendly and accessible, we propose that the future search feature allow users to search in their own language, no matter which language version they are logged in to.

But even within the selected language, the appropriate search results are not shown. In the same example, with the Spanish language selected, searching ‘Comida’ does not produce any results, meaning the behavior of the search is limited to English searches. The search should be translated, especially if a language other than English is selected.

(The same goes for a Chinese search: searching ‘猫’, the Chinese character for ‘cat’, does not produce any results. We propose that, regardless of whether English or Mandarin is your selected language, it should produce the same search results as searching ‘cat’ in English.)

External link:

Figma: https://www.figma.com/file/aCN4TggJDma3zW8y1OMTbE/Unicode-Grouping?type=design&node-id=0%3A1&mode=design&t=mILYDFWALqqRHnjo-1

YizhouDing · 2023-11-01T01:04:09Z

@amyjko Me and @bethanyc32 finished the design specification.

amyjko · 2023-11-03T00:54:33Z

Wonderful design draft! I really like many of its elements. I have several questions that you should resolve before this is ready to build:

1.1 and 1.2 propose to remove some parts of Unicode. Can you precisely state which Unicode ranges you're proposing to remove and why? Many of the glyphs that aren't rendering just lack a font that's installed to display them, so that doesn't seem like a good justification for removal. And many of the unconventional shapes may be useful content for someone, so why remove them instead of organize them? (e.g., using the organizational scheme that Unicode already has?) Section 2.2 could just inherit the Unicode organization scheme, for example.
For 2.2, please define "recently used" and a rationale for the duration you choose. Also, are these recently used glyphs saved in memory, in a browser, or per user account?
Section 3 is English only because Unicode only offers English translations of glyph descriptions. What is your proposal for how to obtain translations of all Unicode glyphs? To my knowledge, Unicode only maintains English descriptions. Is your proposal that we translate all 149,813 glyphs?
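
As a rough illustration of the first question above (reusing the organization Unicode already has), a lookup by Unicode block might look like the following sketch. The ranges are those of the corresponding Unicode blocks, only a handful are included, and `UnicodeBlock` and `blockOf` are hypothetical names.

```typescript
interface UnicodeBlock {
    name: string;
    first: number;
    last: number;
}

// A small subset of Unicode's blocks; a full table would cover every block.
const BLOCKS: UnicodeBlock[] = [
    { name: "Arrows", first: 0x2190, last: 0x21ff },
    { name: "Miscellaneous Technical", first: 0x2300, last: 0x23ff },
    { name: "Geometric Shapes", first: 0x25a0, last: 0x25ff },
    { name: "Miscellaneous Symbols", first: 0x2600, last: 0x26ff },
    { name: "Dingbats", first: 0x2700, last: 0x27bf },
    { name: "Miscellaneous Symbols and Pictographs", first: 0x1f300, last: 0x1f5ff },
    { name: "Emoticons", first: 0x1f600, last: 0x1f64f },
];

// Return the block name for a code point, or undefined if it is not in this subset.
function blockOf(codePoint: number): string | undefined {
    const block = BLOCKS.find((b) => codePoint >= b.first && codePoint <= b.last);
    return block?.name;
}

console.log(blockOf(0x21c4)); // a code point from the Arrows block
console.log(blockOf(0x1f431)); // CAT FACE
```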

YizhouDing · 2023-11-03T01:11:31Z

Thanks Amy. We will figure this out and edit our draft as soon as possible. Best, Erica Ding

YizhouDing · 2023-12-04T08:58:54Z

Propose grouping Unicode symbols into browsable categories (#110)

Designers: Bethany Chum, Erica Ding

Link on Github: #110

“The glyph chooser in the editor is just search. Add a browsing feature that organizes symbols into categories. There are a lot of ways we might organize; start with design work.” (Github description)

Problem

Currently, there is a search bar that allows a user to search for symbols/glyphs. The symbols/glyphs are pulled from Unicode.

What is Unicode? And how is it organized?

Unicode is a standardized character encoding system that aims to represent every character from every writing system in the world. It provides a unique code point for each character, symbol, or ideograph, regardless of the platform, program, or language. This standard allows computers to handle and display text in various languages and scripts.

Unicode includes a vast range of characters, including letters, numbers, symbols, emojis, and special characters. Each character is assigned a unique numerical value known as a "code point," which can be represented in various formats, such as hexadecimal or decimal.

Unicode is crucial for ensuring multilingual support in software and communication technologies, making it possible to display text and symbols from different languages and writing systems consistently and accurately across different devices and platforms.

To see how Unicode is organized, please check out the Unicode Data Format link, which also explains what the numerical values mean.

So what’s the problem? What are we trying to fix?

Currently, as a user searches, there is no organization and no efficient way to browse through the Unicode results. Although the search bar is meant to narrow down the search, the results are cluttered with irrelevant and missing values, with no understandable prioritization. Additionally, although Wordplay aims to be a multilingual platform, the current search results only reflect the English language. For each of these problems, we will expand on the current state and our proposed solutions.

Proposed Solutions

1. Multi-Language Inaccessible

In our current design, we incorporate all of the Unicode glyphs and symbols. However, none of them are accessible in other languages, because all the descriptions are in English only. This means that the Wordplay search box does not have multilingual search functionality. For example, when the selected language is English, entering 'Comida' (Spanish for ‘Food’) returns no search results.

But even within the selected language, the appropriate search results are not shown. In the same example, with the Spanish language selected, searching ‘Comida’ does not produce any results, meaning the behavior of the search is limited to English searches. The search should be translated, especially if a language other than English is selected.

(The same goes for a Chinese search: searching ‘猫’, the Chinese character for ‘cat’, does not produce any results. We propose that, regardless of whether English or Mandarin is your selected language, it should produce the same search results as searching ‘cat’ in English.)

To make Wordplay fully accessible in another language, we need to translate the Unicode descriptions, but no work has been done on this yet. We propose to first translate a small amount of Unicode at a time, starting with the glyphs we think would be most frequently used or most popular with our target audience (elementary and middle school students). [Refer to 2.2 for the categories we propose to translate first.] For the actual translation task, we suggest that the locale team tackle one category of Unicode at a time. Alternatively, we could outsource this task to any contributors who would like to translate Unicode into their language.

With translated Unicode descriptions, we propose that the search feature allow users to search in their own language, no matter which language version they are logged in to.

As an immediate step, we propose showing a disclaimer message below the search bar while a user uses the search feature, stating that “Unicode search and descriptions are only available in English” (translated into the selected language).
We understand that translating Unicode is a large task, so once translation begins, we propose adding a disclaimer message to acknowledge that Wordplay is not yet fully accessible in other languages. For example, the message could read, “Some Unicode not yet available in your language; we are currently translating the rest of the Unicode into your language.”