
As a very superficial example, Korean is agglutinative, so affixes can't easily be separated from the word they attach to. For instance, without substantial knowledge of Korean you can't easily tell that the following sentence contains the noun "언어" and the verb stem "배우-":

    언어를 배운다.
    [topic omitted] Language-(object marker "를") learn-(verb conjugation "ㄴ다").
    (One) learns a language.
Many syntaxes assume easily segmented tokens (words), so they are a poor fit for Korean and other agglutinative languages. A friend of mine made a programming language, Yaksok [1], designed specifically for Korean; it addresses this problem by treating every invocation as a pattern, somewhat like AppleScript:

    # - "약속" is a keyword for procedure declarations.
    # - Unquoted words are formal parameters.
    # - Anything quoted should occur literally, except for slashed alternations
    #   used for affixes varying by the preceding word.
    약속 대상"을/를 배운다"
        ...
Of course this results in a very unconventional parser.
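
To make the idea concrete, here is a rough Python sketch of matching an invocation against a pattern like the one above. This is not how Yaksok itself is implemented; the helper names (has_final_consonant, object_marker, compile_pattern, match_invocation) are made up for illustration, and the regex approach is just my assumption about one simple way to do it. The only solid fact used is how Unicode lays out precomposed Hangul syllables, which lets you tell whether a word ends in a consonant and therefore takes "을" rather than "를".

```python
import re

HANGUL_BASE = 0xAC00  # first precomposed Hangul syllable, "가"
HANGUL_LAST = 0xD7A3  # last precomposed Hangul syllable, "힣"


def has_final_consonant(word: str) -> bool:
    """Return True if the last syllable of `word` ends in a consonant."""
    code = ord(word[-1])
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError(f"not a Hangul syllable: {word[-1]!r}")
    # Each syllable block encodes one of 28 finals; index 0 means "no final".
    return (code - HANGUL_BASE) % 28 != 0


def object_marker(word: str) -> str:
    """Pick the object marker: "을" after a consonant, "를" after a vowel."""
    return "을" if has_final_consonant(word) else "를"


def compile_pattern(literal: str) -> re.Pattern:
    """Hypothetical helper: turn '을/를 배운다' into a one-parameter regex.

    Assumes the quoted part starts with slash-separated affixes, followed
    by a space and the rest of the literal text.
    """
    affixes, _, rest = literal.partition(" ")
    alternation = "|".join(re.escape(a) for a in affixes.split("/"))
    return re.compile(rf"(?P<arg>.+?)(?P<affix>{alternation}) {re.escape(rest)}")


def match_invocation(pattern: re.Pattern, text: str):
    """Return the argument bound to the parameter, or None if no match."""
    m = pattern.fullmatch(text)
    return m.group("arg") if m else None


learn = compile_pattern("을/를 배운다")
print(match_invocation(learn, "언어를 배운다"))  # 언어
print(object_marker("언어"))                     # 를
```

Note that this sketch accepts either affix regardless of the preceding word; a stricter matcher could compare the matched affix against object_marker(arg).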

[1] http://yaksok.org/


