Since such samples from non-English character sets may cause many problems down the road, one way to mitigate these upfront is to transliterate them before import.
Just as ä->ae, ö->oe, ß->ss, ü->ue, you can do Š->sh, č->ch, etc.
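A minimal sketch of such a pre-import step in Python, in case it is useful. The mapping table and the function name `transliterate` are my own illustration, not part of any library, and the table only covers the characters mentioned above. You would need to extend it for your data.

```python
import unicodedata

# Hypothetical mapping for illustration only; extend it for your own data.
TRANSLIT = {
    "ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "š": "sh", "Š": "Sh", "č": "ch", "Č": "Ch",
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(TRANSLIT)


def transliterate(text: str) -> str:
    """Replace known non-ASCII letters with ASCII digraphs."""
    # Normalize to NFC first so that a base letter followed by a combining
    # mark (e.g. "u" + U+0308) becomes the single precomposed character
    # that the table expects.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(_TABLE)


if __name__ == "__main__":
    for sample in ["Müller", "Straße", "Škoda"]:
        result = transliterate(sample)
        # str.isascii() (Python 3.7+) flags anything the table did not cover.
        print(sample, "->", result, "| ASCII only:", result.isascii())
```

Characters that are not in the table pass through unchanged, so checking `isascii()` on the result is a cheap way to spot entries that still need a mapping.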
As far as I have observed, this is also how Asian languages deal with the character set problem in official international documents.
Email was conceived with 7-bit ASCII (128 characters) in mind, and some of its conventions can be difficult to overcome (see the RFCs).
Hope this helps a little bit.
Some background info, in case anyone is interested:
https://en.wikipedia.org/wiki/International_email
https://en.wikipedia.org/wiki/Unicode#Email