CJKWORD Learning book

Author's Foreword

According to ancient history books such as the Hwandan Gogi, the Korean people are recorded as having governed the languages of the world. Their contributions to the creation of Hangeul and to the formation of Chinese characters (漢字) reflect this heritage in language and script.

Chinese characters served as the national script for over a thousand years and have profoundly influenced the entire population, so the technology for inputting characters, including Hanja, is a historical task that cannot be abandoned. Documents and books from before the modern era were all written in Chinese characters, and Hanja input technology is essential for digitizing them.

The current input method relies on the phonetic value (音價) of characters, regardless of their original form or meaning, which makes input slow and makes homophones difficult to handle. Because Chinese characters are the oldest and most widely used script, the problem of Hanja input remains a major global issue.

Numerous countries, research institutions, universities, and companies have attempted to solve this problem, and in Korea it was even pursued as a national project, but it remained unsolved. Our company has meticulously analyzed complex Chinese characters, created the CAW alphabet based on their meaning and phonetic values, and developed a technology for entering Chinese characters directly by combining these alphabet characters, which we are now introducing to the public.


Features of This Technology


Company Overview

Anyword Co., Ltd. is a software development company that has made dramatic progress on the Chinese/Hanja input problem, which is emerging as a major global concern. It is a small but strong company that can develop and sell its own products directly and can also sell its technology to software companies worldwide.


Business Goals & Strategy

Business Goals

Business Content

Execution Method

Expected Effects

Domestic:

  1. Fulfill the demand for Hanja learning for everyone from elementary school students to adults preparing for employment and Hanja proficiency tests.
  2. Fulfill the demand for traditional educational content such as the Myeongsimbogam.
  3. Popularize traditional culture and content that has been forgotten by digitalizing classical documents.

International:

  1. Meet the global demand for Chinese language learning and Chinese character processing.
  2. Resolve the inconvenience of Chinese input on smartphones.
  3. Address the Chinese language issues in current application software such as Word, Hangeul, and Apple Word.
  4. Address Chinese language issues in various development languages for internet browsers.

CAW+ General Overview

What is CAW+?

It is a platform that uses the CAW+ alphabet: a collection of the smallest common shapes (character pieces) obtained by decomposing Chinese characters, arranged according to frequency. Because the alphabet characters are combined in the way that gives the most efficient input, users can learn and use the method quickly with minimal effort. As a result, there is no need to memorize a large number of characters: every character can be created simply by combining the 184 alphabet characters.


Understanding the Concept of CAW + ALPHABET 184

The ALPHABET in CAW+ refers to:

The decomposed pieces that make up a single Chinese character. The CAW Alphabet consists of pieces that cannot be broken down any further, and each is assigned a unique code. However, entering difficult and complex Chinese characters using only these fully decomposed pieces led to duplicate codes and confusion. Therefore, some Chinese characters that are themselves combinations of several other pieces were also designated as CAW Alphabet characters.

In total, 184 CAW Alphabet characters are used to write Chinese text. Once you are familiar with them, you can quickly and easily enter even difficult and complex Chinese characters by inferring the code from the CAW Alphabet characters that make up each one.


Alphabet Derivation Method

The 1800 educational Hanja are basic, high-frequency characters, so decomposition started from them. These 1800 characters were not initially decomposed into small shapes like the current ones. Instead, they were first broken into distinguishable component characters, and the collected components were then broken down further to create a preliminary alphabet. Next, the 3500 Hanja used in the Hanja Proficiency Test were decomposed and reassembled using this preliminary alphabet, producing a secondary alphabet. It is about 95% the same as the first, but even a 1% difference carries enough weight to change the whole system: changing one combination method can alter dozens or hundreds of combinations.

The alphabet created in this way was then applied to KS C 5601 (4888 characters), China's commonly used characters (3500 characters), and IICore (9710 characters) to arrive at the current alphabet. This alphabet is therefore designed to apply as efficiently as possible to all of these characters.
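
The derivation described above (decompose characters, decompose the pieces again, then rank the remaining pieces by frequency) can be illustrated with a minimal Python sketch. It is not the procedure actually used to build the CAW alphabet. The decomposition table is a toy assumption assembled from examples that appear elsewhere in this guide, and the names `DECOMPOSITION`, `atoms`, and `piece_frequencies` are hypothetical.

```python
from collections import Counter

# Toy decomposition table (assumption): pairs taken from examples in this
# guide. A real table would cover thousands of characters.
DECOMPOSITION = {
    "相": ["木", "目"],
    "末": ["十", "木"],
    "未": ["一", "木"],
    "耒": ["二", "木"],
}


def atoms(char):
    """Recursively break a character into pieces that have no table entry."""
    parts = DECOMPOSITION.get(char)
    if parts is None:
        return [char]  # nothing left to decompose
    result = []
    for part in parts:
        result.extend(atoms(part))
    return result


def piece_frequencies(chars):
    """Count how often each undecomposable piece occurs across chars."""
    counts = Counter()
    for ch in chars:
        counts.update(atoms(ch))
    return counts


freq = piece_frequencies(DECOMPOSITION.keys())
print(freq.most_common(1))  # [('木', 4)]
```

In this toy data 木 occurs in all four characters, so a frequency ranking would place it first among the candidate alphabet pieces.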


Running CAW + ALPHABET 184

Finding the Code Table:

  1. Double-click CAW25.exe to run it.
  2. Click on "File" at the top.
  3. Click on "Code Table".

Combination Principles of CAW + ALPHABET 184

1. Additive Method

Adding 目 (mu) to 木 (mm) creates the character 相. With CAW technology, you simply type "mmmu", the codes of the well-known characters 木 and 目 in sequence.
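
As a rough illustration of the additive method, the Python sketch below concatenates component codes and looks up the result. The codes for 木 and 目 and the sequence "mmmu" come from the text above. The table and function names (`ALPHABET_CODES`, `CODE_TO_CHAR`, `compose_code`, `lookup`) are hypothetical, and this is not how the CAW software itself is implemented.

```python
# Codes for the two alphabet characters, as given in the text above.
ALPHABET_CODES = {"木": "mm", "目": "mu"}

# Hypothetical lookup table from a typed code sequence to a character.
CODE_TO_CHAR = {"mmmu": "相"}


def compose_code(components):
    """Join the codes of the components in the order they are added."""
    return "".join(ALPHABET_CODES[c] for c in components)


def lookup(components):
    """Return the character for a component sequence, or None if unknown."""
    return CODE_TO_CHAR.get(compose_code(components))


print(compose_code(["木", "目"]))  # mmmu
print(lookup(["木", "目"]))        # 相
```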

2. Substitution/Transformation Method

The character 騰 (to rise) is similar in shape to 勝 (to win), with 力 replaced by 馬. This is called the substitution/transformation method. The part shared with 勝 is formed by 月 + 八, and adding 馬 to it creates 騰.

3. Variant Characters

One of the reasons direct input of Hanja has been considered impractical is the existence of many variant characters. For example, 峰 (peak) and 峯 have the same sound and meaning. When two such characters would use the same combination, an extra component is added to mark the second form. Thus 峰 is created by combining 山 with a component, and 峯 is created by adding one more component. The case of 鳥 (bird) and 烏 (crow) is also resolved with the second-form method: based on frequency, 鳥 becomes the alphabet character and 烏 becomes the second-form character.


CAW + ALPHABET 184 Classification Method

Classification Method:


Classification Details

Prime Code Characters (Classification 1 by Frequency)

This group contains the most frequently used characters, which form the basis of Chinese writing and are typically composed of up to 4 strokes. For example, the knife radical (刀), pinyin [dāo], has the highest frequency among characters starting with 'd', so it is assigned the code 'dd'. This classification consists of 20 codes created by doubling a consonant (bb, cc, dd, etc.), corresponding to all letters of the alphabet except A, F, N, O, T, and V.
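
The count of 20 can be checked with a short Python sketch that doubles every letter outside the exclusion list. It takes the exclusion list literally, which leaves E, I, and U in the set. Whether those particular doubled codes are actually assigned is not stated here, so treat the generated list as an assumption. The names `EXCLUDED`, `PRIME_CODES`, and `KNOWN_ASSIGNMENTS` are hypothetical.

```python
import string

# Letters that have no prime code according to the text above.
EXCLUDED = set("afnotv")

# Double each remaining letter: 26 letters minus 6 excluded gives 20 codes.
PRIME_CODES = [c * 2 for c in string.ascii_lowercase if c not in EXCLUDED]

# The only assignment stated in the text: 刀 (dāo) takes the code 'dd'.
KNOWN_ASSIGNMENTS = {"dd": "刀"}

print(len(PRIME_CODES))  # 20
print(PRIME_CODES[:3])   # ['bb', 'cc', 'dd']
```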

Pronunciation Code Characters (Classification 2 by Frequency)

In this classification, one of the five vowels A, E, I, O, U is applied to the code based on the character's pronunciation and frequency. For example, the character for eight (八), pinyin 'ba', is assigned the code 'ba' because of its high frequency. The characters in this classification are the next most frequently used after the prime code characters. The group has 63 codes in total, such as ba, be, bi, bo, bu, ca, and ce, with initial letters drawn from all letters of the alphabet except A, I, J, K, U, and V.
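
To show the shape of these codes, the Python sketch below enumerates every "initial letter + vowel" pair permitted by the exclusion list. This yields 100 candidate shapes. The guide states that only 63 are assigned and does not list them here, so the sketch checks the shape of a code only, not whether it is in use. The names `VOWELS`, `EXCLUDED_INITIALS`, `CANDIDATES`, and `has_pronunciation_shape` are hypothetical.

```python
import string

VOWELS = "aeiou"
# Initial letters that are not used, according to the text above.
EXCLUDED_INITIALS = set("aijkuv")

INITIALS = [c for c in string.ascii_lowercase if c not in EXCLUDED_INITIALS]

# Every initial + vowel pair: 20 initials x 5 vowels = 100 candidate shapes.
CANDIDATES = [initial + vowel for initial in INITIALS for vowel in VOWELS]


def has_pronunciation_shape(code):
    """True if code is an allowed initial followed by one of the vowels."""
    return code in CANDIDATES


print(len(CANDIDATES))                # 100
print(CANDIDATES[:5])                 # ['ba', 'be', 'bi', 'bo', 'bu']
print(has_pronunciation_shape("ba"))  # True
print(has_pronunciation_shape("ka"))  # False
```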

Vowel Group Code Characters (5 Vowel Group Code Mapping)

This system groups characters into five "vowel groups" based on their proximity on the QWERTY keyboard layout. The code for a character is formed by combining an initial consonant with a letter from one of the vowel groups.


Added Notation and Easily Confused Notation

This section explains parts where the combination method has been changed in recent versions or parts that are easily confused.

  1. 支 (branch) uses the combination of 士 and 乂.
  2. 囱 (chimney) uses the combination of 白 and 夂. A combination of 丶 and 口 is also possible, but priority was given to the one that reveals the overall outline.
  3. For 主 (master), the combination of 亠 and 土 has been added alongside the existing combination of 丶 and 王.
  4. 末 (end) uses 十木, 未 (not yet) uses 一木, and 耒 (plow) uses 二木.
  5. For 甫 (great), the combination of 十 and 用 has been added alongside the existing combination of 車 and 丶.
  6. For 坐 (to sit), the combination of 人人土 has been added alongside the existing combination of 土 and 人.
  7. For 唐 (Tang dynasty), the combination of 广 and 聿 has been added alongside the existing combination of 广 and 彐.
  8. 敖 (to ramble) uses 士方.
  9. Combinations using 丂.
  10. If two characters would have the same combination of components, 二 is added to express the meaning of a "second form."
  11. Variant characters with the same meaning but different forms, and characters with similar forms.
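
Several items in the list above give one character more than one accepted combination. A lookup for this could be modeled as below, as a minimal Python sketch rather than the actual CAW data structure. The component sequences are copied from the list, and the names `COMBINATIONS` and `build_index` are hypothetical.

```python
# Accepted component sequences per character, copied from the list above.
COMBINATIONS = {
    "主": [("丶", "王"), ("亠", "土")],
    "坐": [("土", "人"), ("人", "人", "土")],
    "唐": [("广", "彐"), ("广", "聿")],
    "末": [("十", "木")],
    "未": [("一", "木")],
    "耒": [("二", "木")],
}


def build_index(combinations):
    """Map each component sequence to its character, rejecting clashes."""
    index = {}
    for char, alternatives in combinations.items():
        for components in alternatives:
            if components in index and index[components] != char:
                raise ValueError(f"{components} is claimed by two characters")
            index[components] = char
    return index


INDEX = build_index(COMBINATIONS)
print(INDEX[("亠", "土")])  # 主
print(INDEX[("丶", "王")])  # 主
print(INDEX[("十", "木")])  # 末
```

Raising an error on a clash mirrors the concern stated earlier in the guide that duplicate codes cause confusion: two different characters must never share the same sequence.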

Easily Confused Characters

The characters listed here are ones that are used as components elsewhere or whose combinations are not easily recalled. By typing them yourself, you can learn their combination methods and apply them when similarly shaped characters appear.

1) High-frequency characters

(This section lists examples of high-frequency characters that might be confusing.)

2) Expressions for characters with three repeating shapes

(This section lists examples of characters formed by repeating a component three times.)

3) Expressions for simplified or related characters

(This section lists examples of simplified characters or characters related in form.)

4) Others

(This section lists other miscellaneous characters that are easily confused.)