Original problem statement:
This problem raised several further problems while it was being debugged. First of all, there was the question of whether to parse for WikiNames or for free links first. This bug was fixed. Secondly, there was the problem of accepting some foreign characters within free links; this has also been addressed in the upcoming version 0.25 of 'Tavi.
Thirdly, and that's what this page is about, a problem came up regarding how 'Tavi handles character encoding internally. This problem is not solved yet. But let's first clarify what the problem actually is.
As FredrikJonsson stated:
This error is easily reproduced, and can be explained as follows. Tavi internally uses an 8-bit coding scheme, which totally disregards the charset actually in use (that is, it presumes iso-8859-1). This is in accordance with how php is written and interpreted. It leads to the following scenario:
A user enters 'Ansön' (that is, Ans&ouml;n using entities). By the time the Tavi parser sees it on a page using UTF-8, the ö has been recognised as the unicode character 0x00F6 and has been coded as the byte sequence 0xC3 0xB6, which read as iso-8859-1 is Ã followed by ¶, or &Atilde;&para; using entities. Written using entities, Ans&ouml;n has thus been changed into Ans&Atilde;&para;n.
And since Ã actually is an uppercase letter in iso-8859-1, and ¶ is not recognised as a letter, Tavi (correctly according to iso-8859-1) parses this sequence as the WikiWord AnsÃ followed by the extra characters ¶n.
This being said, the problem can be reformulated as follows: Tavi breaks on words starting with an uppercase letter when it sees a properly encoded unicode character whose encoding happens to start with a byte that is an uppercase letter in the iso-8859-1 encoding scheme.
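The scenario above can be reproduced outside Tavi with a small sketch. It is written in Python rather than php purely for illustration, and the WikiWord pattern below is a simplified assumption about how an iso-8859-1 based matcher might look, not Tavi's actual pattern. The names as_latin1 and find_wiki_words are made up for this example.

```python
import re

# Assumed, simplified letter classes over the iso-8859-1 range.
# These are illustrative only and not taken from Tavi's source.
UPPER = r"[A-Z\xc0-\xde]"
LOWER = r"[a-z\xdf-\xff]"

# Assumed WikiWord shape: uppercase, one or more lowercase, uppercase, then any letters.
WIKI_WORD = re.compile(f"{UPPER}{LOWER}+{UPPER}(?:{UPPER}|{LOWER})*")


def as_latin1(text: str) -> str:
    """Encode text as UTF-8, then read each byte as one iso-8859-1 character."""
    return text.encode("utf-8").decode("iso-8859-1")


def find_wiki_words(text: str) -> list:
    """Return what a byte-oriented iso-8859-1 matcher would see as WikiWords."""
    return WIKI_WORD.findall(as_latin1(text))


word = "Ans\u00f6n"  # 'Ansön'
print([hex(b) for b in word.encode("utf-8")])  # the ö becomes 0xc3 0xb6
print(find_wiki_words(word))  # the match stops after the 0xC3 byte
```

Under these assumptions the matcher reports a WikiWord ending at the byte 0xC3, because that byte falls in the uppercase range while 0xB6 falls in neither range.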
There are two solutions to this problem: Avoid the issue or make Tavi unicode-readable.
Avoiding the issue: for the time being, the best solution would be to add $EnableWikiLinks=0 to your config.php and only use free links. Then, as of version 0.25, Tavi doesn't care about the mystic characters (unicode encoded characters, that is). Free links will function, and everything looks nice and not broken...
Making Tavi unicode-readable is a lot more work, but should be done sometime. There are two major issues in making Tavi work well together with unicode: how php handles unicode strings, and how to tell uppercase from lowercase letters within unicode.
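The two issues can be illustrated with a short sketch, again in Python and not as a proposal for how Tavi itself should be changed. It assumes the input is valid UTF-8 and uses the Unicode general categories Lu and Ll to decide case. The function name case_pattern is made up for this example.

```python
import unicodedata


def case_pattern(raw: bytes) -> str:
    """Decode UTF-8 bytes and mark each code point as U (upper), L (lower) or - (other)."""
    marks = []
    for ch in raw.decode("utf-8"):
        category = unicodedata.category(ch)
        if category == "Lu":
            marks.append("U")
        elif category == "Ll":
            marks.append("L")
        else:
            marks.append("-")
    return "".join(marks)


raw = "Ans\u00f6n".encode("utf-8")  # 'Ansön' as UTF-8 bytes

# Issue 1: a byte-oriented string function counts bytes, not characters.
print(len(raw), len(raw.decode("utf-8")))

# Issue 2: case has to be decided per code point, not per byte.
print(case_pattern(raw))
```

Decoding first and classifying code points afterwards treats the ö as a single lowercase letter, so the word is no longer split in the middle.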
If I do find solutions to these issues, and I do hope I will, I will swiftly change Tavi so that it works with unicode encoded characters. But for now, I deeply regret that we have to turn to avoiding the issue rather than fixing it... ;-(