John A Meinel <john at arbash-meinel.com> writes:
> Basically, it seems we need some sort of unicode normalization.

Maybe something like this:

>>> import unicodedata
>>> unicodedata.normalize('NFKC',u"ra\u0308ksmo\u0308rga\u030as")
u'r\xe4ksm\xf6rg\xe5s'

--Denys
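
A slightly fuller sketch of the same idea, written for Python 3 (the session above is Python 2). The helper name normalize_name and the NFC default are assumptions made only for illustration, not anything decided in this thread. It shows that the decomposed and precomposed spellings compare unequal until normalized, and that NFKC goes further than NFC by also folding compatibility characters such as the "fi" ligature.

```python
import unicodedata


def normalize_name(name, form="NFC"):
    """Return `name` in the requested Unicode normalization form.

    Hypothetical helper for illustration; `form` may be any value
    accepted by unicodedata.normalize (NFC, NFD, NFKC, NFKD).
    """
    return unicodedata.normalize(form, name)


# Same word, two spellings: combining marks vs. precomposed characters.
decomposed = "ra\u0308ksmo\u0308rga\u030as"
composed = "r\xe4ksm\xf6rg\xe5s"

# Without normalization the two strings are different code point sequences.
assert decomposed != composed

# Both NFC and NFKC compose the combining marks.
assert normalize_name(decomposed, "NFC") == composed
assert normalize_name(decomposed, "NFKC") == composed

# NFKC also replaces compatibility characters: U+FB01 (fi ligature)
# becomes the two letters "fi", whereas NFC leaves it alone.
assert normalize_name("\ufb01le", "NFC") == "\ufb01le"
assert normalize_name("\ufb01le", "NFKC") == "file"
```

Whether NFC or NFKC is the better choice depends on whether compatibility characters should be treated as distinct; NFKC is lossy in that respect, so the original text cannot be recovered from the normalized form.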