Problems with the rule-based approach
1. Hard to scale up
2. Brittle (breaks easily)
3. Ambiguity

Types of tagged corpora
Ø Part-of-speech (POS) tagged corpus
Ø Tree-tagged corpus (Penn Treebank)
Ø Semantic code (feature) tagged corpus
Ø Discourse-structure tagged corpus
Ø Parallel corpus for MT (here, a Korean-English parallel corpus)

Application areas of N-grams
Ø Speech recognition
Ø OCR & handwriting recognition
Ø Machine translation
Ø G..
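
As a rough illustration of what an N-gram model computes in the applications listed above, here is a minimal bigram sketch. It assumes whitespace tokenization, lowercasing, and plain maximum-likelihood estimates with no smoothing. The function names (`train_bigram_counts`, `bigram_prob`) and the toy corpus are hypothetical and not taken from the notes.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count bigrams and their histories over whitespace-tokenized sentences.

    Hypothetical helper for illustration; <s> and </s> mark sentence boundaries.
    """
    history_counts = Counter()            # how often each word appears as a history
    bigram_counts = defaultdict(Counter)  # bigram_counts[prev][curr] = count
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[prev][curr] += 1
    return history_counts, bigram_counts


def bigram_prob(prev, curr, history_counts, bigram_counts):
    """Maximum-likelihood estimate P(curr | prev) = count(prev, curr) / count(prev)."""
    if history_counts[prev] == 0:
        return 0.0  # unseen history: no estimate without smoothing
    return bigram_counts[prev][curr] / history_counts[prev]


# Toy corpus, made up for this example only.
toy_corpus = [
    "I am Sam",
    "Sam I am",
    "I do not like green eggs and ham",
]
history_counts, bigram_counts = train_bigram_counts(toy_corpus)
p_am_given_i = bigram_prob("i", "am", history_counts, bigram_counts)
print(f"P(am | i) = {p_am_given_i:.3f}")
```

A speech recognizer or translation system would use such probabilities to prefer word sequences that look more like the training corpus. A practical model would also need smoothing so that unseen bigrams do not get probability zero.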