libc: locale: fix EUC shift check

wchar_t is unsigned on ARM platforms, and signed pretty much everywhere
else.  On signed platforms, the right shift sign-extends, so `nm` ends
up with bogus upper bits set if we did in fact have a valid CS2 or CS3
(MSB set).  Mask just the low byte to avoid sign bit garbage.
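
The sign-extension problem can be illustrated with a minimal standalone
sketch; this is not the libc code itself.  It assumes int32_t as a stand-in
for a signed 32-bit wchar_t, and the helper names (lead_unmasked,
lead_masked, lead_unsigned) are hypothetical.  Right-shifting a negative
signed value is implementation-defined in C; the sketch assumes the
arithmetic shift that common compilers such as GCC and Clang perform.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.  int32_t stands in for a
 * signed 32-bit wchar_t; `len` is the encoded length in bytes (1..4).
 */

/* Pre-fix behaviour: shift only, so sign bits leak into the result. */
static int32_t
lead_unmasked(int32_t wc, int len)
{
	assert(len >= 1 && len <= 4);
	return (wc >> ((len - 1) * 8));
}

/* Post-fix behaviour: keep only the low byte after shifting. */
static int32_t
lead_masked(int32_t wc, int len)
{
	assert(len >= 1 && len <= 4);
	return ((wc >> ((len - 1) * 8)) & 0xff);
}

/* Unsigned model (e.g. ARM): a logical shift needs no mask. */
static uint32_t
lead_unsigned(uint32_t wc, int len)
{
	assert(len >= 1 && len <= 4);
	return (wc >> ((len - 1) * 8));
}
```

With wc = 0x8e000000 and len = 4, lead_masked() and lead_unsigned() both
yield 0x8e, while lead_unmasked() under an arithmetic shift yields the bit
pattern 0xffffff8e, which cannot compare equal to a lead byte of 0x8e.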

Add a bare-bones test that converts a CS2 wide character in eucCN, which
would previously kick back an EILSEQ.

Reviewed by:	bapt, rew
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D43262
Kyle Evans
2025-04-20 13:29:45 -05:00
parent 40ad87e958
commit c4c562eadf
2 changed files with 14 additions and 1 deletions
+1 -1
@@ -426,7 +426,7 @@ _EUC_wcrtomb_impl(char * __restrict s, wchar_t wc,
/* This first check excludes CS1, which is implicitly valid. */
if ((wc < 0xa100) || (wc > 0xffff)) {
/* Check for valid CS2 or CS3 */
-nm = (wc >> ((len - 1) * 8));
+nm = (wc >> ((len - 1) * 8)) & 0xff;
if (nm == cs2) {
if (len != cs2width) {
errno = EILSEQ;
+13
@@ -41,6 +41,18 @@
#include <atf-c.h>
+ATF_TC_WITHOUT_HEAD(euccs1_test);
+ATF_TC_BODY(euccs1_test, tc)
+{
+wchar_t wc = 0x8e000000;
+char buf[MB_LEN_MAX];
+ATF_REQUIRE(strcmp(setlocale(LC_CTYPE, "zh_CN.eucCN"),
+"zh_CN.eucCN") == 0);
+ATF_REQUIRE(wctomb(&buf[0], wc) == 4);
+}
ATF_TC_WITHOUT_HEAD(wctomb_test);
ATF_TC_BODY(wctomb_test, tc)
{
@@ -104,6 +116,7 @@ ATF_TC_BODY(wctomb_test, tc)
ATF_TP_ADD_TCS(tp)
{
+ATF_TP_ADD_TC(tp, euccs1_test);
ATF_TP_ADD_TC(tp, wctomb_test);
return (atf_no_error());