I keep hoping a string API will catch on in which combining marks are mostly treated as indivisible. Handling text one codepoint at a time is as bad an idea as handling US-ASCII one bit at a time--almost everything it lets you do is an elaborate way to misinterpret or corrupt your data.
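For what it's worth, here is a rough sketch of what "combining marks treated as indivisible" can look like in Java, using the standard `java.text.BreakIterator`. The class name `GraphemeDemo` and the helper `graphemes` are made up for illustration, and how well the iterator handles more exotic clusters (emoji sequences and the like) depends on the JDK version, so take it as a sketch rather than a complete answer.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of any library.
class GraphemeDemo {
    // Splits text into user-perceived characters, keeping a base letter
    // and its combining marks together in one string.
    static List<String> graphemes(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "e\u0301a"; // 'e' + COMBINING ACUTE ACCENT, then 'a'
        System.out.println(s.codePointCount(0, s.length())); // 3 code points
        System.out.println(graphemes(s).size());             // 2 perceived characters
    }
}
```

Truncating or reversing by code point would split the accent from its base letter here; working on the list returned by `graphemes` would not.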
It's not so simple: it depends on what you're doing with the text. If you're not trying to do analysis with it, encoded text is more or less a program written in a DSL that, when interpreted by a font renderer, draws symbols in some graphical context. Depending on the analysis you want to do, you need varying amounts of knowledge. Perhaps you only need to know about word boundaries; perhaps you're trying to look things up in a normalized dictionary; maybe even decompose a word into phonemes to try and pronounce it. These require different levels of analysis, and one size won't fit all.
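To make the "different levels" point concrete, here is a small Java sketch of two of them: word boundaries via `java.text.BreakIterator`, and a normalized dictionary key via `java.text.Normalizer`. The names `TextLevels`, `words`, and `dictionaryKey` are invented for the example, and the choice of NFC plus lowercasing as the dictionary's normal form is an assumption; a real dictionary might want something else entirely.

```java
import java.text.BreakIterator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; not part of any library.
class TextLevels {
    // Level 1: only word boundaries matter. Pieces that start with
    // punctuation or whitespace are dropped.
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                out.add(piece);
            }
        }
        return out;
    }

    // Level 2: lookup in a normalized dictionary. Assumes the dictionary
    // stores keys in NFC, lowercased with locale-independent rules.
    static String dictionaryKey(String word) {
        return Normalizer.normalize(word, Normalizer.Form.NFC).toLowerCase(Locale.ROOT);
    }
}
```

Neither helper knows anything about phonemes or rendering, which is rather the point: each task pulls in only as much knowledge of the text as it needs.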
Update: for particular purposes, consider using the Collator class. It builds collation keys (byte arrays) from strings, taking into account locale, case sensitivity, and Unicode decomposition. (At least so says the doc, http://download.oracle.com/javase/6/docs/api/java/text/Colla... )
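A minimal sketch of that, going by my reading of the `java.text.Collator` docs. The class `CollationDemo` and the helper `equivalent` are hypothetical names, and `Locale.ENGLISH` is just an assumption for the example; which differences count as primary, secondary, or tertiary depends on the locale's rules.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class; not part of any library.
class CollationDemo {
    // Compares two strings through their collation keys at the given
    // strength (Collator.PRIMARY, SECONDARY, or TERTIARY).
    static boolean equivalent(String a, String b, int strength) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(strength);
        // Treat canonically equivalent sequences (precomposed vs. combining) alike.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        CollationKey ka = collator.getCollationKey(a);
        CollationKey kb = collator.getCollationKey(b);
        return ka.compareTo(kb) == 0;
    }

    public static void main(String[] args) {
        // Precomposed e-acute vs. 'e' + combining acute accent.
        System.out.println(equivalent("\u00e9", "e\u0301", Collator.TERTIARY));
        // Case is ignored at primary strength but not at tertiary.
        System.out.println(equivalent("a", "A", Collator.PRIMARY));
        System.out.println(equivalent("a", "A", Collator.TERTIARY));
    }
}
```

The byte array itself is available from `CollationKey.toByteArray()` if you want to store or index the keys rather than compare them on the spot.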