www.infradead.org Git - users/willy/pagecache.git/commit
Revert "unicode: Don't special case ignorable code points"
author Linus Torvalds <torvalds@linux-foundation.org>
Wed, 11 Dec 2024 22:11:23 +0000 (14:11 -0800)
committer Linus Torvalds <torvalds@linux-foundation.org>
Wed, 11 Dec 2024 22:11:23 +0000 (14:11 -0800)
commit 231825b2e1ff6ba799c5eaf396d3ab2354e37c6b
tree 3e2895327558e6e54572d2ab00828e4c4c4d7eb2
parent ec8e2d3889114f41d07cd341e80dc6de7f8eb213
Revert "unicode: Don't special case ignorable code points"

This reverts commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91.

It turns out that we can't do this, because while the old behavior of
ignoring ignorable code points was most definitely wrong, we have
case-folding filesystems with on-disk hash values with that wrong
behavior.

So now you can't look up those names, because they hash to something
different.
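
As a rough illustration of the lookup failure described above, here is a
minimal Python sketch. Everything in it is an assumption made for the
example: the IGNORABLE set is a tiny hand-picked subset, str.casefold()
stands in for the kernel's UTF-8 case folding, and zlib.crc32 stands in
for whatever hash a filesystem stores on disk. None of these names come
from the kernel sources.

```python
import zlib

# Hypothetical, hand-picked subset of ignorable code points
# (soft hyphen, zero-width space/non-joiner/joiner, variation selector-16).
IGNORABLE = {"\u00ad", "\u200b", "\u200c", "\u200d", "\ufe0f"}

def fold_old(name: str) -> str:
    """Old behavior: fold case and silently drop ignorable code points."""
    return "".join(ch for ch in name if ch not in IGNORABLE).casefold()

def fold_new(name: str) -> str:
    """Behavior of the commit being reverted: fold case, keep ignorables."""
    return name.casefold()

def name_hash(folded: str) -> int:
    """Stand-in for an on-disk directory hash (not the kernel's function)."""
    return zlib.crc32(folded.encode("utf-8"))

# A name containing a zero-width joiner before the extension.
name = "Caf\u00e9\u200d.txt"

stored = name_hash(fold_old(name))  # hash written when the file was created
lookup = name_hash(fold_new(name))  # hash computed at lookup after the change

# The folded forms differ, so the hashes will almost certainly differ too,
# and a lookup keyed on the hash misses the existing entry.
print(repr(fold_old(name)), repr(fold_new(name)))
print(stored == lookup)
```

The same sketch also shows the opposite risk mentioned next: a name
created while fold_new() was in effect would carry a hash that
fold_old() no longer reproduces.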

Of course, it's also entirely possible that in the meantime people have
created *new* files with the new ("more correct") case folding logic,
and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but
sadly, people seem to never really understand that.  People still see it
as a feature, not a bug.

Reported-by: Qi Han <hanqi@vivo.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219586
Cc: Gabriel Krisman Bertazi <krisman@suse.de>
Requested-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/unicode/mkutf8data.c
fs/unicode/utf8data.c_shipped