(hangul compose and decompose)
Rust helpers for composing and decomposing Hangul syllable blocks and words. The crate in lib/ exposes small, focused modules for combining jamo into syllables, grouping syllables into words, and mixing Hangul with arbitrary text while enforcing valid Unicode Hangul composition rules.
hangul-cd focuses heavily on composition through a modular wrapper approach. There are four modules, each providing an API wrapper layer over the previous one:
jamo- The Jamo layer provides utilities for working directly with individual Hangul Jamo; Jamo are stored as enum values, allowing for instantiation from and conversion to either Modern or Compatibility Unicode codepoints, as well as helpers for classification of Jamo.block- The Block layer provides aHangulBlockstruct for composing syllable blocks from individual Jamo, as well as aBlockComposerwhich allows callers to use a simple stack-like push-pop interface to add to or remove from a given block, and functions to convert blocks to Unicode codepoints, as well as take completed blocks and decompose them into individual Modern or Compatibility Jamo codepoints.word- The Word layer wraps the Block layer by keeping track of a list of Hangul blocks and extends the push-pop mechanism from the Block layer, allowing callers to create full Hangul words composed of multiple syllable blocks simply by pushing Jamo repeatedly and print to Unicode codepoints.string- The String layer allows for mixing of Hangul and non-Hangul text and continues to make use of the push-pop mechanism from previous layers.
Work with individual Hangul letters, normalize compatibility codepoints, and compose or decompose composite Jamo.
use hangul_cd::jamo::{
Character,
Jamo,
JamoConsonantComposite,
JamoConsonantSingular,
JamoPosition,
JamoVowelSingular,
modernized_jamo_initial,
};
// Convert compatibility to modern codepoints or classify any char
assert_eq!(modernized_jamo_initial('ㄱ'), '\u{1100}');
assert!(matches!(Character::from_char('ㅙ').unwrap(), Character::Hangul(_)));
// Build and inspect Jamo values directly; works with either
// compatibility or modern Jamo Unicode codepoints
let jamo = Jamo::from_compatibility_jamo('ㅗ').unwrap();
assert_eq!(jamo.char_modern(JamoPosition::Vowel), Some('ᅩ'));
// Compose composite consonants programmatically
let double = JamoConsonantSingular::Giyeok.combine_for_initial(&JamoConsonantSingular::Giyeok);
assert_eq!(double, Some(JamoConsonantComposite::SsangGiyeok));Compose and decompose single syllable blocks.
use hangul_cd::block::{
BlockComposer,
BlockCompletionStatus,
HangulBlock,
HangulBlockDecompositionOptions,
};
use hangul_cd::jamo::{Jamo, JamoConsonantSingular, JamoUnicodeType, JamoVowelSingular};
// Use `BlockComposer` to easily work with `HangulBlock`s
let mut composer = BlockComposer::new();
// Push directly using Jamo or with chars
composer.push(&Jamo::Consonant(JamoConsonantSingular::Giyeok));
composer.push(&Jamo::Vowel(JamoVowelSingular::A));
composer.push_char('ㅇ');
// Complete `BlockComposer` blocks as `HangulBlock` structs
let block = match composer.try_as_complete_block().unwrap() {
BlockCompletionStatus::Complete(block) => block,
_ => unreachable!(),
};
// Convert `HangulBlock`s to syllables
assert_eq!(block.to_char().unwrap(), '강');
// Decompose `HangulBlock`s into their constituent characters
let opts = HangulBlockDecompositionOptions {
decompose_composites: true,
jamo_era: JamoUnicodeType::Compatibility,
};
assert_eq!(block.decomposed_vec(&opts).unwrap(), vec!['ㄱ', 'ㅏ', 'ㅇ']);Compose multiple syllables into a word with push/pop semantics.
use hangul_cd::word::{HangulWordComposer, WordPushResult};
// Use `HangulWordComposer` to push and pop Jamo to words
let mut word = HangulWordComposer::new();
for c in "ㅇㅏㄴㄴㅕㅇ".chars() {
assert_eq!(word.push_char(c).unwrap(), WordPushResult::Continue);
}
assert_eq!(word.as_string().unwrap(), "안녕".to_string());
// Backspace-like pops remove Jamo from a word one at a time
word.pop().unwrap();
assert_eq!(word.as_string().unwrap(), "안녀".to_string());Mix Hangul words with arbitrary text.
use hangul_cd::string::StringComposer;
// Use a `StringComposer` to push and pop both Hangul and non-Hangul chars
let mut s = StringComposer::new();
for c in "ㅎㅏㄴㄱㅡㄹ rocks".chars() {
s.push_char(c).unwrap();
}
assert_eq!(s.as_string().unwrap(), "한글 rocks".to_string());
s.pop().unwrap(); // remove 's'
s.push_char('!').unwrap();
assert_eq!(s.as_string().unwrap(), "한글 rock!".to_string());Add the crate to your project (for a local path):
[dependencies]
hangul = { path = "lib" }Compose a single block or whole words:
use hangul_cd::block::HangulBlock;
use hangul_cd::string::StringComposer;
use hangul_cd::jamo::Jamo;
// Manual block construction
let block = HangulBlock {
initial: 'ㄱ',
vowel: 'ㅏ',
final_optional: Some('ㅇ'),
};
assert_eq!(block.to_char().unwrap(), '강');
// Stream jamo into a string
let mut composer = StringComposer::new();
for c in "ㅎㅏㄴㄱㅡㄹ ㅇㅏㄴㄴㅕㅇ!".chars() {
composer.push_char(c).unwrap();
}
assert_eq!(composer.as_string().unwrap(), "한글 안녕!".to_string());
// Pop behaves like backspace
composer.pop().unwrap(); // removes '!'
assert_eq!(composer.as_string().unwrap(), "한글 안녕".to_string());Work directly with jamo or compatibility jamo:
use hangul_cd::jamo::{create_composite_initial, modernize_jamo_initial, Character};
assert_eq!(modernize_jamo_initial('\u{3131}'), '\u{1100}'); // ㄱ -> modern jamo
assert_eq!(create_composite_initial('ㄱ', 'ㄱ'), Some('ㄲ'));
assert!(matches!(Character::from_char('ㅘ'), Character::Hangul(_)));From lib/, run the library test suite:
cargo test