At the start of my talk I said that it was not another entry in the "falsehoods programmers believe about X" series. You can find a list like that for almost any topic, but I am not fond of those articles. They enumerate things the author considers wrong, but they rarely explain why, or what you should do instead. I suspect people read such an article, congratulate themselves, and then go off to find interesting new ways to make mistakes the article did not list, because they never actually understood the issues that cause those errors.
So in my talk I tried to explain a few problems as well as I could and then show how to solve them; I like that approach much better. One topic I touched only in passing (one slide, plus a few mentions on other slides) was the complexity that can be involved in character case. There is an official Correct Answer™ to the problem I was discussing (case-insensitive comparison of identifiers), and in the talk I showed the best solution I know of that uses only the Python standard library.
However, I only briefly mentioned the deeper complexities of case in Unicode, so I want to take some time to go into the details. It is interesting, and understanding it helps with the decisions you make when designing and writing text-processing code. So here is the opposite of a "falsehoods programmers believe about X" article: "truths programmers should know".
One more thing: Unicode is full of terminology. In this article I will mainly use the terms "uppercase" and "lowercase" as the Unicode Standard defines them. If you prefer other terms, such as small and capital letters, that is fine. I will also use the term "character", which some people consider a mistake. Yes, in Unicode the concept of a "character" is not always what people expect, and it is often best to avoid it by using other terms. In this article, though, I use the term the way Unicode does, to describe an abstract entity about which claims can be made. Whenever it matters, I will use a more specific term, such as "code point", to make things clear.
There are more than two cases
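Native speakers of European languages are used to the fact that their languages use uppercase and lowercase to signal particular things. In English [and Russian], for example, a sentence usually begins with an uppercase letter and continues mostly in lowercase. Proper nouns also begin with an uppercase letter, and many acronyms and abbreviations are written in uppercase.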
And we usually assume there are only two cases. There is the letter "A" and the letter "a". One is uppercase and the other is lowercase, right?
However, Unicode has three cases: uppercase, lowercase, and titlecase. In English, titles are written this way, for example "Avengers: Infinity War". Usually this simply means that the first letter of each word is written in uppercase (although, depending on the rules and style in use, some words, such as articles, are not capitalized).
The Unicode Standard gives an example of a titlecase character: U+01F2 LATIN CAPITAL LETTER D WITH SMALL LETTER Z, which looks like this: ǲ.
Characters like this are the result of one of the earliest decisions in the Unicode Standard. They are sometimes needed to deal with the awkward consequences of round-trip compatibility with existing text encodings. Unicode would prefer to build such sequences from combining characters. However, many existing systems had already allocated space for precomposed sequences. For example, in ISO-8859-1 ("latin-1") the character "é" has a precomposed form numbered 0xe9. In Unicode it would be preferable to write this character as a separate "e" followed by an accent mark. But to guarantee full round-trip compatibility with existing encodings such as latin-1, Unicode assigned code points to the precomposed characters, for example U+00E9 LATIN SMALL LETTER E WITH ACUTE.
The code point of this character is the same as its latin-1 byte value, but do not rely on that: an encoding of Unicode is unlikely to preserve those positions. In UTF-8, for example, the code point U+00E9 is written as the byte sequence 0xc3 0xa9.
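The gap between a code point and its encoded bytes can be checked directly. The following is a minimal sketch using only Python's built-in `str.encode` and the standard `unicodedata` module; the variable names are chosen purely for illustration.

```python
import unicodedata

e_acute = "\u00e9"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE

# The code point is 0xE9 regardless of how the text is later encoded.
code_point = ord(e_acute)

# latin-1 stores the character as the single byte 0xE9 ...
latin1_bytes = e_acute.encode("latin-1")

# ... while UTF-8 needs two bytes for the same code point.
utf8_bytes = e_acute.encode("utf-8")

# Canonical decomposition (NFD) splits the precomposed character into
# a plain "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", e_acute)

print(hex(code_point), latin1_bytes, utf8_bytes,
      [f"U+{ord(c):04X}" for c in decomposed])
```

The decomposed form is the "separate e plus accent mark" spelling that Unicode would have preferred if compatibility had not been a concern.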
And of course the existing encodings contained characters that need special handling for titlecase, so they were included in Unicode "as is". If you want to see them, search your favorite Unicode database for characters in the Lt category ("Letter, titlecase").
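If Python's bundled `unicodedata` module counts as your favorite Unicode database, that search could look roughly like the sketch below. The exact list depends on the Unicode version shipped with your Python build, so no particular count is assumed here.

```python
import sys
import unicodedata

# Collect every code point whose General_Category is Lt ("Letter, titlecase").
titlecase_letters = [
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Lt"
]

# Show a few of them with their official names.
for ch in titlecase_letters[:4]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

dz_title = "\u01f2"  # LATIN CAPITAL LETTER D WITH SMALL LETTER Z
# The uppercase, titlecase, and lowercase forms are three distinct code points.
forms = (dz_title.upper(), dz_title, dz_title.lower())
print([f"U+{ord(c):04X}" for c in forms])
```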
There are several ways to define case
The Unicode Standard (§4.2) lists three different definitions of case. Your programming language may already have chosen one of the three for you; otherwise, the choice depends on your particular goal. The definitions, in order, are:
- A character is uppercase if it is in the Lu category ("Letter, uppercase") and lowercase if it is in the Ll category ("Letter, lowercase"). The standard acknowledges the limitation of this definition: each character can belong to only one category, so many characters that "ought" to be uppercase or lowercase fail this test because they belong to some other category.
- A character is uppercase if it has the derived property Uppercase and lowercase if it has the derived property Lowercase. This combines the first definition with other character properties that can imply uppercase or lowercase.
- A character is uppercase if it does not change when mapped to uppercase, and lowercase if it does not change when mapped to lowercase. This is a very general definition, but it can produce unintuitive behavior.
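If you work with a limited subset of characters (specifically, letters), the first definition may be enough. If your repertoire is broader (for example, it includes letter-like symbols that are not letters), the second definition may fit better. Section 4.2 of the Unicode Standard also offers this recommendation:
Programmers manipulating Unicode strings should generally work with string functions such as isLowerCase (and its functional cousin, toLowerCase), unless they are working directly with the properties of individual characters.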
The functions mentioned there are defined in §3.13 of the Unicode Standard. Formally, the third definition uses the isLowerCase and isUpperCase functions of §3.13, which are defined in terms of the fixed points of toLowerCase and toUpperCase respectively.
If your programming language has functions for checking or converting the case of strings or individual characters, it is worth finding out which of the definitions above the implementation uses. If you are curious, Python's isupper() and islower() methods use the second definition.
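To make the three definitions concrete, here is a rough single-character sketch of the uppercase side of each. The function names are invented for this example, and the second one simply leans on the statement above that `str.isupper()` follows the second definition.

```python
import unicodedata

def is_upper_by_category(ch: str) -> bool:
    """Definition 1: the General_Category is Lu."""
    return unicodedata.category(ch) == "Lu"

def is_upper_by_property(ch: str) -> bool:
    """Definition 2: the derived Uppercase property, via str.isupper()."""
    return ch.isupper()

def is_upper_by_mapping(ch: str) -> bool:
    """Definition 3: the character is a fixed point of uppercasing."""
    return ch.upper() == ch

# U+2167 ROMAN NUMERAL EIGHT is in category Nl, not Lu, yet behaves as uppercase.
for ch in ("A", "a", "\u2167", "7"):
    print(
        f"U+{ord(ch):04X}",
        is_upper_by_category(ch),
        is_upper_by_property(ch),
        is_upper_by_mapping(ch),
    )
```

The Roman numeral shows where the first definition falls short, and the digit shows how permissive the third one is.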
You cannot tell a character's case from its appearance or its name
For many characters, you can tell the case just by looking at them. For example, "A" is uppercase. That is also obvious from the character's name: LATIN CAPITAL LETTER A. But sometimes this approach does not work. Take the code point U+1D34, which looks like this: ᴴ. Unicode assigns it the name MODIFIER LETTER CAPITAL H. So it is uppercase, right?
In fact, it has the derived property Lowercase, so under definition #2 it is lowercase, even though it visually resembles an uppercase H and its name contains the word "CAPITAL".
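This is easy to confirm from Python, again assuming that `str.islower()` reflects the derived Lowercase property as described earlier. A short sketch:

```python
import unicodedata

modifier_h = "\u1d34"

name = unicodedata.name(modifier_h)          # official character name
category = unicodedata.category(modifier_h)  # General_Category

# The name says CAPITAL, but the case predicates disagree.
print(name, category, modifier_h.isupper(), modifier_h.islower())
```

Note that the General_Category is Lm ("Letter, modifier"), so definition #1 would call this character neither uppercase nor lowercase.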
Some characters have no case at all
Definition 135 in §3.13 of the Unicode Standard says:
A character C is defined to be cased if and only if C has the Lowercase or Uppercase property or has a General_Category value of Titlecase_Letter.
This means that many Unicode characters (in fact, most of them) are uncased. Asking about their case makes no sense, and case conversions do not affect them. Even so, definition #3 will give you an answer to that question.
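Python does not expose a "cased" predicate for single characters directly, as far as I know, but Definition 135 can be approximated from the pieces it does expose. The helper below, `is_cased`, is a hypothetical name for this sketch, and it assumes `str.isupper()` and `str.islower()` track the Uppercase and Lowercase properties.

```python
import unicodedata

def is_cased(ch: str) -> bool:
    """Approximate Definition 135 for a single character."""
    return (
        ch.isupper()                          # has the Uppercase property
        or ch.islower()                       # has the Lowercase property
        or unicodedata.category(ch) == "Lt"   # is a Titlecase_Letter
    )

# A, the titlecase digraph U+01F2, a digit, U+02BD, and hiragana A.
for ch in ("A", "\u01f2", "1", "\u02bd", "\u3042"):
    print(f"U+{ord(ch):04X}", is_cased(ch))
```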
Some characters behave as if they had more than one case
That is, if you use definition #3 and ask whether an uncased character is uppercase or lowercase, the answer you get is "yes".
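The Unicode Standard gives the example (Table 4-1, row 7) of the character U+02BD MODIFIER LETTER REVERSED COMMA, which looks like this: ʽ. It has neither the derived Lowercase property nor the derived Uppercase property, and it is not in the Lt category, so it is uncased. At the same time, it does not change when converted to uppercase and does not change when converted to lowercase, so under the third definition it answers "yes" to both questions: "are you uppercase?" and "are you lowercase?".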
This looks as if it could cause needless confusion. The point, though, is that definition #3 works on any sequence of Unicode characters and keeps the case conversion algorithms simple: uncased characters are just left as they are.
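The "yes to both" behavior can be reproduced with plain string methods. In this sketch the fixed-point comparisons stand in for the standard's isUpperCase and isLowerCase; that mapping onto Python is my own assumption, not something the standard specifies.

```python
reversed_comma = "\u02bd"  # MODIFIER LETTER REVERSED COMMA

# Definition 2, via str.isupper() / str.islower(): neither one holds.
by_property = (reversed_comma.isupper(), reversed_comma.islower())

# Definition 3: unchanged by both mappings, so "yes" to both questions.
by_mapping = (
    reversed_comma.upper() == reversed_comma,
    reversed_comma.lower() == reversed_comma,
)

# The same fixed-point behavior keeps whole-string conversion simple:
# uncased characters pass through untouched.
mixed = "abc \u02bd 123"
converted = mixed.upper()

print(by_property, by_mapping, converted)
```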
Case depends on context
You might think that, since the Unicode case conversion tables cover every character, conversion is just a matter of finding the right entry in the table. For example, the Unicode database says that the lowercase of U+0041 LATIN CAPITAL LETTER A is U+0061 LATIN SMALL LETTER A. Simple, right?
One example where this approach breaks down is Greek. The character Σ (that is, U+03A3 GREEK CAPITAL LETTER SIGMA) maps to two different characters when converted to lowercase, depending on where it sits in the word. At the end of a word its lowercase is ς (U+03C2 GREEK SMALL LETTER FINAL SIGMA). Anywhere else it is σ (U+03C3 GREEK SMALL LETTER SIGMA).
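CPython's `str.lower()` applies this final-sigma rule, so the context dependence is visible without any third-party library. A small sketch, with the Greek words written as escapes so the code points are unambiguous:

```python
SIGMA = "\u03a3"        # GREEK CAPITAL LETTER SIGMA
SMALL_SIGMA = "\u03c3"  # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA

# Sigma at the end of a word lowercases to the final form.
word = "\u039f\u0394\u039f" + SIGMA  # omicron, delta, omicron, sigma
lowered_word = word.lower()

# Sigma at the start of a word lowercases to the ordinary form.
lowered_initial = (SIGMA + "\u039f\u03a6\u0399\u0391").lower()

# A lone sigma has no preceding cased letter, so it is not treated as final.
lowered_alone = SIGMA.lower()

print(lowered_word, lowered_initial, lowered_alone)
```

The same input character produces two different outputs, which a per-character lookup table cannot express.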
This means that case mapping is neither one-to-one nor reversible. Another example is ß (U+00DF LATIN SMALL LETTER SHARP S, also called Eszett). It has a separate uppercase form (ẞ, U+1E9E LATIN CAPITAL LETTER SHARP S), but its uppercase mapping is "SS". And converting "SS" to lowercase gives "ss". So, using the Unicode names for the case conversions: toLowerCase(toUpperCase(ß)) != ß.
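The failed round trip can be reproduced with Python's built-in string methods, which here play the role of toUpperCase and toLowerCase:

```python
sharp_s = "\u00df"          # LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S

uppercased = sharp_s.upper()     # one character becomes two
round_trip = uppercased.lower()  # and does not come back as the original

print(repr(uppercased), repr(round_trip), round_trip == sharp_s)

# The capital form does lowercase to the sharp s, so the mapping is lopsided.
print(capital_sharp_s.lower() == sharp_s)
```

Note that the uppercase mapping also changes the length of the string, which matters for any code that assumes case conversion preserves length.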
Case depends on locale
Different languages have different case conversion rules. The most common example: i (U+0069 LATIN SMALL LETTER I) and I (U+0049 LATIN CAPITAL LETTER I) convert to each other in most locales. Not all of them, but most. In the az and tr locales (Turkic languages), the uppercase of i is İ (U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE) and the lowercase of I is ı (U+0131 LATIN SMALL LETTER DOTLESS I). Sometimes getting this right is literally a matter of life and death.
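Python's `str.upper()` and `str.lower()` ignore the locale, so Turkic rules have to be layered on top. The two helpers below, `turkic_upper` and `turkic_lower`, are hypothetical names for a deliberately naive sketch that handles only the dotted and dotless i; real applications would more likely rely on a locale-aware library such as ICU.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I

def turkic_upper(text: str) -> str:
    """Uppercase using the i rules of the tr and az locales."""
    # Map i to the dotted capital first; str.upper() then turns the
    # dotless small i into I and leaves the dotted capital alone.
    return text.replace("i", DOTTED_CAPITAL_I).upper()

def turkic_lower(text: str) -> str:
    """Lowercase using the i rules of the tr and az locales."""
    # Map I to the dotless small i and the dotted capital to i first;
    # str.lower() leaves both results unchanged.
    return text.replace("I", DOTLESS_SMALL_I).replace(DOTTED_CAPITAL_I, "i").lower()

default_upper = "istanbul".upper()      # locale-independent result
turkish_upper = turkic_upper("istanbul")
turkish_lower = turkic_lower("ISPARTA")

print(default_upper, turkish_upper, turkish_lower)
```

Without the pre-mapping, the default lowercase of the dotted capital I is an "i" followed by U+0307 COMBINING DOT ABOVE, which is rarely what Turkish text wants.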
Unicode itself does not handle every conceivable case conversion rule for every locale. The Unicode database contains only general, locale-independent rules for converting all characters, plus special rules for a handful of languages and composite forms: Lithuanian, the Turkic languages, and some features of Greek. Everything else is absent. Section 3.13 of the standard mentions this and recommends introducing locale-specific conversion rules where they are needed.
One example will be familiar to English speakers: the titlecase of certain names. "o'brian" should be converted to "O'Brian" (not "O'brian"). On the other hand, "it's" should be converted to "It's", not "It'S". Another example that Unicode does not handle is the Dutch letter combination "ij", which must be converted entirely to uppercase when it starts a word that is being titlecased. So in titlecase the IJsselmeer, the largest lake in the Netherlands, is "IJsselmeer" and not "Ijsselmeer". Unicode does contain the characters Ĳ (U+0132 LATIN CAPITAL LIGATURE IJ) and ĳ (U+0133 LATIN SMALL LIGATURE IJ) if you need them. By default, case conversion maps them to each other (although the Unicode normalization forms that use compatibility equivalence split each of them into two separate characters).
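Python's `str.title()` applies only the default, locale-independent rules, so it reproduces these problems faithfully. The sketch below records what it returns for the examples above rather than what a human editor would want.

```python
import unicodedata

# The apostrophe restarts a "word", which happens to help here ...
title_name = "o'brian".title()

# ... and hurts here.
title_contraction = "it's".title()

# The Dutch digraph written as two letters gets only its first letter capitalized.
title_dutch = "ijsselmeer".title()

# Written with the ligature U+0133, the titlecase mapping yields U+0132.
title_ligature = "\u0133sselmeer".title()

# Compatibility normalization (NFKC) splits the ligature into "I" and "J".
split_ligature = unicodedata.normalize("NFKC", "\u0132")

print(title_name, title_contraction, title_dutch, title_ligature, split_ligature)
```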
Back to the material from the talk. Because Unicode case handling is this complex, you cannot perform a case-insensitive comparison with the standard lowercase or uppercase conversion functions found in many programming languages. For such comparisons Unicode has the concept of case folding, and §3.13 of the standard defines the toCasefold and isCasefolded functions.
You might think that case folding is much the same as converting to lowercase, but it is not. The Unicode Standard warns that a case-folded string is not necessarily lowercase. It gives the Cherokee script as an example: a case-folded Cherokee string contains uppercase characters.
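Both points can be seen with `str.casefold()`. The Cherokee part of this sketch assumes a Python build whose Unicode data includes the Cherokee small letters (Unicode 8.0 or later, which should mean Python 3.5 and up).

```python
sharp_s = "\u00df"

lower_result = sharp_s.lower()    # lowercasing leaves the sharp s alone
fold_result = sharp_s.casefold()  # case folding expands it

# A lowercase-based comparison misses a match that case folding finds.
naive_equal = "STRASSE".lower() == "stra\u00dfe".lower()
folded_equal = "STRASSE".casefold() == "stra\u00dfe".casefold()

# Cherokee folds toward the uppercase letters, not the lowercase ones.
cherokee_small_a = "\uab70"  # CHEROKEE SMALL LETTER A
cherokee_folded = cherokee_small_a.casefold()

print(repr(lower_result), repr(fold_result), naive_equal, folded_equal)
print(f"U+{ord(cherokee_folded):04X}", cherokee_folded.isupper())
```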
One of the slides in my talk implemented the recommendation of Unicode Technical Report #36 as completely as is possible in Python: perform NFKC normalization, then call the casefold() method (available only in Python 3, from version 3.3) on the resulting string. Even so, some edge cases fail, and this is not exactly what is recommended for comparing identifiers. The bad news first: Python does not expose enough Unicode properties to filter out characters that are not XID_Start or XID_Continue, or characters that have the Default_Ignorable_Code_Point property. As far as I know, it does not support the NFKC_Casefold mapping. And there is no easy way to use the modified NFKC described in UAX #31 §5.1.
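The approach described above fits in a few lines. This is a sketch of the idea rather than the exact code from the slide; `fold_identifier` and `identifiers_match` are names made up for this example, and the limitations just listed still apply.

```python
import unicodedata

def fold_identifier(value: str) -> str:
    """NFKC-normalize, then case-fold, as a comparison key."""
    return unicodedata.normalize("NFKC", value).casefold()

def identifiers_match(left: str, right: str) -> bool:
    """Compare two identifiers without regard to case or compatibility forms."""
    return fold_identifier(left) == fold_identifier(right)

print(identifiers_match("Stra\u00dfe", "STRASSE"))  # sharp s against SS
print(identifiers_match("\ufb01le", "FILE"))        # fi ligature against F, I
print(identifiers_match("\uff21dmin", "admin"))     # fullwidth A against a
print(identifiers_match("alice", "bob"))
```

Store and compare the folded key, but keep the original string for display, since folding throws information away.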
幞ããªããšã«ããããã®ãšããžã±ãŒã¹ã®ã»ãšãã©ã«ã¯ãåé¡ã®ã·ã³ãã«ã«ãã£ãŠããããããå®éã®ã»ãã¥ãªãã£ãªã¹ã¯ã¯å«ãŸããŠããŸããããŸããã±ãŒã¹ãã©ãŒã«ãã£ã³ã°ã¯ãååãšããŠæ£èŠåä¿åæäœãšããŠå®çŸ©ãããŠããŸããïŒãããã£ãŠãã±ãŒã¹ãã©ãŒã«ãã£ã³ã°åŸã«NFCã«åæ£èŠåãããNFKC_Casefoldãããã³ã°ïŒãäžè¬ã«ãæ¯èŒããå ŽåãååŠçåŸã«äž¡æ¹ã®æååãæ£èŠåãããŠãããã©ããã¯é¢ä¿ãããŸãããååŠçã«äžè²«æ§ããªããã©ãããããã³åŸã§ãç°ãªãã¯ããã®è¡ã®ã¿ãåŸã§ç°ãªãããšãä¿èšŒããããã©ãããæ°ã«ããŸãããããå¿é ãªå Žåã¯ãã¬ãžã¹ã¿ã®è¿œå åŸã«æåã§åæ£èŠåã§ããŸãã
That is enough for now
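Like the talk before it, this article is not exhaustive, and fitting all of this material into a single post is close to impossible. Still, I hope it is a useful overview of how complicated the topic is, and that it gives you enough starting points to look for more information. So in principle we can stop here.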
Is it naive to hope that other people will stop writing entries in the "falsehoods programmers believe about X" series and start writing articles like "truths programmers should know"?