From: Michael Schutte
Date: Wed, 15 Apr 2009 15:53:44 +0200
To: Linux console tools development discussion
Subject: Re: [kbd] [PATCH] loadkeys: Auto-convert “traditional”/Unicode keysyms
Message-ID: <20090415135344.GA3881@graeme>
In-Reply-To: <49E5021C.5040703@gmail.com>
References: <20090414174549.GA4174@graeme> <49E5021C.5040703@gmail.com>

Hi Alexey,

On Wed, Apr 15, 2009 at 01:37:32AM +0400, Alexey Gladkov wrote:
> 14.04.2009 21:45, Michael Schutte wrote:
> > The Linux kernel distinguishes between K(KTYP, KVAL) keysyms and Unicode
> > characters. This patch makes loadkeys query the console’s Unicode mode
> > and convert between the two keysym types according to the result. The
> > theoretical advantage is that less keymaps need both an 8-bit and a
> > Unicode variant (cf. trq[u], ua[-utf]).
>
> I have a problem with your patch:
>
> LANG: ru_RU.UTF-8
> keymap: data/keymaps/i386/qwerty/ruwin_cplk-UTF-8.map
>
> The difference between the old and the new behavior is attached. This
> is 'dumpkeys -n' output.

Thanks for your testing, I completely missed this. But you can still
type the affected characters, right? As far as I can tell, it’s only
dumpkeys which is wrong here.

> And also, you have not updated the documentation for new behaviour.

loadkeys(1) didn’t even document the old meaning of -u. I’ve added a
short section about the features of this patch. If you want me to add
more information, please let me know.

A fixed version of the patch is attached to this mail.

Cheers,
-- 
Michael Schutte

[Attachment: auto_convert.patch]

From 8139d872c6797d73da3465c9839ee98ae865396d Mon Sep 17 00:00:00 2001
From: Michael Schutte
Date: Tue, 14 Apr 2009 10:46:32 +0200
Subject: [PATCH] loadkeys: Auto-convert “traditional”/Unicode keysyms

The Linux kernel distinguishes between K(KTYP, KVAL) keysyms and Unicode
characters. This patch makes loadkeys query the console’s Unicode mode
and convert between the two keysym types according to the result. The
theoretical advantage is that less keymaps need both an 8-bit and a
Unicode variant (cf. trq[u], ua[-utf]).

A similar patch (read_keymaps_fmt) has been in use in Debian’s version
of kbd since 2004; see for a discussion. Credit for this goes to Denis
Barbier.

Signed-off-by: Michael Schutte
---
 man/man1/loadkeys.1.in |   19 ++++++++
 src/dumpkeys.c         |    5 +-
 src/ksyms.c            |  113 +++++++++++++++++++++++++++--------------------
 src/ksyms.h            |    4 +-
 src/loadkeys.y         |   74 ++++++++++++-------------------
 5 files changed, 117 insertions(+), 98 deletions(-)

diff --git a/man/man1/loadkeys.1.in b/man/man1/loadkeys.1.in
index ab4c973..64031af 100644
--- a/man/man1/loadkeys.1.in
+++ b/man/man1/loadkeys.1.in
@@ -23,6 +23,8 @@ loadkeys \- load keyboard translation tables
 ] [
 .I -s --clearstrings
 ] [
+.I -u --unicode
+] [
 .I -v --verbose
 ] [
 .I filename...
@@ -144,6 +146,23 @@ prints to the standard output a file that may be used as a binary keymap
 as expected by Busybox
 .B loadkmap
 command (and does not modify the current keymap).
+.SH "UNICODE MODE"
+.B loadkeys
+automatically detects whether the console is in Unicode or
+ASCII (XLATE) mode. When a keymap is loaded, literal
+keysyms (such as
+.BR section )
+are resolved accordingly; numerical keysyms are converted to
+fit the current console mode, regardless of the way they are
+specified (decimal, octal, hexadecimal or Unicode).
+.LP
+The
+.I -u
+(or
+.IR --unicode )
+switch tells
+.B loadkeys
+to bypass the check and assume that the console is in Unicode mode.
 .SH "OTHER OPTIONS"
 .TP
 .B \-h \-\-help
diff --git a/src/dumpkeys.c b/src/dumpkeys.c
index 879c96f..580d480 100644
--- a/src/dumpkeys.c
+++ b/src/dumpkeys.c
@@ -135,11 +135,10 @@ print_keysym(int code, char numeric) {
 	t = KTYP(code);
 	v = KVAL(code);
 	if (t >= syms_size) {
-		code = code ^ 0xf000;
-		if (!numeric && (p = unicodetoksym(code)) != NULL)
+		if (!numeric && (p = codetoksym(code)) != NULL)
 			printf("%-16s", p);
 		else
-			printf("U+%04x ", code);
+			printf("U+%04x ", code ^ 0xf000);
 		return;
 	}
 	plus = 0;
diff --git a/src/ksyms.c b/src/ksyms.c
index e8a494a..36196f1 100644
--- a/src/ksyms.c
+++ b/src/ksyms.c
@@ -1644,7 +1644,7 @@ struct cs {
 
 /* Functions for both dumpkeys and loadkeys. */
 
-static int prefer_unicode = 0;
+int prefer_unicode = 0;
 static const char *chosen_charset = NULL;
 
 void
@@ -1685,11 +1685,6 @@ set_charset(const char *charset) {
 	sym *p;
 	unsigned int i;
 
-	if (!strcasecmp(charset, "unicode")) {
-		prefer_unicode = 1;
-		return 0;
-	}
-
 	for (i = 1; i < sizeof(charsets)/sizeof(charsets[0]); i++) {
 		if (!strcasecmp(charsets[i].charset, charset)) {
 			charsets[0].charset = charsets[i].charset;
@@ -1700,7 +1695,7 @@ set_charset(const char *charset) {
 				if(p->name[0])
 					syms[0].table[i] = p->name;
 			}
-			chosen_charset = charset;
+			chosen_charset = strdup(charset);
 			return 0;
 		}
 	}
@@ -1710,38 +1705,67 @@ set_charset(const char *charset) {
 }
 
 const char *
-unicodetoksym(int code) {
+codetoksym(int code) {
 	unsigned int i;
 	int j;
 	sym *p;
 
 	if (code < 0)
 		return NULL;
-	if (code < 0x80)
-		return iso646_syms[code];
-	for (i = 0; i < sizeof(charsets)/sizeof(charsets[0]); i++) {
-		p = charsets[i].charnames;
-		for (j = charsets[i].start; j < 256; j++, p++) {
-			if (p->uni == code && p->name[0])
+
+	if (code < 0x1000) {	/* "traditional" keysym */
+		if (KTYP(code) == KT_META)
+			return NULL;
+		if (KTYP(code) == KT_LETTER)
+			code = K(KT_LATIN, KVAL(code));
+		if (KTYP(code) > KT_LATIN)
+			return syms[KTYP(code)].table[KVAL(code)];
+
+		for (i = 0; i < sizeof(charsets)/sizeof(charsets[0]); i++) {
+			p = charsets[i].charnames;
+			if (!p)
+				continue;
+			p += KVAL(code) - charsets[i].start;
+			if (p->name[0])
 				return p->name;
 		}
 	}
+
+	else {			/* Unicode keysym */
+		code ^= 0xf000;
+
+		if (code < 0x80)
+			return iso646_syms[code];
+
+		for (i = 0; i < sizeof(charsets)/sizeof(charsets[0]); i++) {
+			p = charsets[i].charnames;
+			if (!p)
+				continue;
+			for (j = charsets[i].start; j < 256; j++, p++) {
+				if (p->uni == code && p->name[0])
+					return p->name;
+			}
+		}
+	}
+
 	return NULL;
 }
 
 /* Functions for loadkeys. */
 
-int unicode_used = 0;
-
 int
 ksymtocode(const char *s) {
 	unsigned int i;
 	int j, jmax;
 	int keycode;
 	sym *p;
+	int save_prefer_unicode;
 
 	if (!strncmp(s, "Meta_", 5)) {
+		save_prefer_unicode = prefer_unicode;
+		prefer_unicode = 0;
 		keycode = ksymtocode(s+5);
+		prefer_unicode = save_prefer_unicode;
 		if (KTYP(keycode) == KT_LATIN)
 			return K(KT_META, KVAL(keycode));
 
@@ -1767,10 +1791,8 @@ ksymtocode(const char *s) {
 		for (i = 0; i < sizeof(charsets)/sizeof(charsets[0]); i++) {
 			p = charsets[i].charnames;
 			for (j = charsets[i].start; j < 256; j++, p++)
-				if (!strcmp(s,p->name)) {
-					unicode_used = 1;
-					return (p->uni ^ 0xf000); /* %%% */
-				}
+				if (!strcmp(s,p->name))
+					return (p->uni ^ 0xf000);
 		}
 	} else /* if (!chosen_charset) */ {
 		/* note: some keymaps use latin1 but with euro,
@@ -1821,38 +1843,33 @@ ksymtocode(const char *s) {
 }
 
 int
-unicodetocode(int code) {
-	const char *s;
-
-	s = unicodetoksym(code);
-	if (s)
-		return ksymtocode(s);
-	else {
-		unicode_used = 1;
-		return (code ^ 0xf000); /* %%% */
-	}
+convert_code(int code)
+{
+	const char *ksym;
+
+	if (KTYP(code) == KT_META)
+		return code;
+	else if (prefer_unicode == (code >= 0x1000))
+		return code;		/* no conversion necessary */
+
+	/* depending on prefer_unicode, this will give us either an 8-bit
+	 * K(KTYP, KVAL) or a Unicode keysym xor 0xf000 */
+	ksym = codetoksym(code);
+	if (ksym)
+		return ksymtocode(ksym);
+	else
+		return code;
 }
 
 int
 add_capslock(int code)
 {
-	char buf[7];
-	const char *p;
-
-	if (KTYP(code) == KT_LATIN)
+	if (KTYP(code) == KT_LATIN && (!prefer_unicode || code < 0x80))
 		return K(KT_LETTER, KVAL(code));
-	if ((unsigned) KTYP(code) >= syms_size) {
-		if ((p = unicodetoksym(code ^ 0xf000)) == NULL) {
-			sprintf(buf, "U+%04x", code ^ 0xf000);
-			p = buf;
-		}
-	} else {
-		sprintf(buf, "0x%04x", code);
-		p = buf;
-	}
-#if 0
-	/* silence the common usage dumpkeys | loadkeys -u */
-	fprintf(stderr, _("plus before %s ignored\n"), p);
-#endif
-	return code;
+	else if ((code ^ 0xf000) < 0x100)
+		/* Unicode Latin-1 Supplement */
+		/* a bit dirty to use KT_LETTER here, but it should work */
+		return K(KT_LETTER, code ^ 0xf000);
+	else
+		return convert_code(code);
 }
diff --git a/src/ksyms.h b/src/ksyms.h
index 74cff92..b3c3d0c 100644
--- a/src/ksyms.h
+++ b/src/ksyms.h
@@ -26,10 +26,10 @@ extern const unsigned int syn_size;
 #define CODE_FOR_UNKNOWN_KSYM (-1)
 
 extern int set_charset(const char *name);
-extern const char *unicodetoksym(int code);
+extern const char *codetoksym(int code);
 extern void list_charsets(FILE *f);
 extern int ksymtocode(const char *s);
-extern int unicodetocode(int code);
+extern int convert_code(int code);
 extern int add_capslock(int code);
 
 #endif
diff --git a/src/loadkeys.y b/src/loadkeys.y
index 9ff4759..a4a2f30 100644
--- a/src/loadkeys.y
+++ b/src/loadkeys.y
@@ -63,7 +63,7 @@ static void killkey(int index, int table);
 static void compose(int diacr, int base, int res);
 static void do_constant(void);
 static void do_constant_key (int, u_short);
-static void loadkeys(char *console, int *warned);
+static void loadkeys(char *console);
 static void mktable(void);
 static void bkeymap(void);
 static void strings_as_usual(void);
@@ -73,10 +73,10 @@ static void strings_as_usual(void);
 static void compose_as_usual(char *charset);
 static void lkfatal0(const char *, int);
 extern int set_charset(const char *charset);
+extern int prefer_unicode;
 extern char *xstrdup(char *);
 int key_buf[MAX_NR_KEYMAPS];
 int mod;
-extern int unicode_used;
 int private_error_ct = 0;
 
 extern int rvalct;
@@ -240,11 +240,13 @@ rvalue1 : rvalue
 		}
 		;
 rvalue		: NUMBER
-			{$$=$1;}
-		| UNUMBER
-			{$$=($1 ^ 0xf000); unicode_used=1;}
+			{$$=convert_code($1);}
 		| PLUS NUMBER
 			{$$=add_capslock($2);}
+		| UNUMBER
+			{$$=convert_code($1^0xf000);}
+		| PLUS UNUMBER
+			{$$=add_capslock($2^0xf000);}
 		| LITERAL
 			{$$=$1;}
 		| PLUS LITERAL
@@ -270,7 +272,7 @@ usage(void) {
 " -h --help display this help text\n"
 " -m --mktable output a \"defkeymap.c\" to stdout\n"
 " -s --clearstrings clear kernel string table\n"
-" -u --unicode implicit conversion to Unicode\n"
+" -u --unicode force conversion to Unicode\n"
 " -v --verbose report the changes\n"), PACKAGE_VERSION, DEFMAP);
 	exit(1);
 }
@@ -302,8 +304,9 @@ main(int argc, char *argv[]) {
 		{ NULL, 0, NULL, 0 }
 	};
 	int c;
+	int fd;
+	int mode;
 	char *console = NULL;
-	int warned = 0;
 
 	set_progname(argv[0]);
 
@@ -333,7 +336,7 @@ main(int argc, char *argv[]) {
 				opts = 1;
 				break;
 			case 'u':
-				set_charset("unicode");
+				prefer_unicode = 1;
 				break;
 			case 'q':
 				quiet = 1;
@@ -349,8 +352,20 @@ main(int argc, char *argv[]) {
 		}
 	}
 
+	if (!optm && !prefer_unicode) {
+		/* no -u option: auto-enable it if console is in Unicode mode */
+		fd = getfd(NULL);
+		if (ioctl(fd, KDGKBMODE, &mode)) {
+			perror("KDGKBMODE");
+			fprintf(stderr, _("loadkeys: error reading keyboard mode\n"));
+			exit(1);
+		}
+		if (mode == K_UNICODE)
+			prefer_unicode = 1;
+		close(fd);
+	}
+
 	args = argv + optind - 1;
-	unicode_used = 0;
 	yywrap();	/* set up the first input file, if any */
 	if (yyparse() || private_error_ct) {
 		fprintf(stderr, _("syntax error in map file\n"));
@@ -375,14 +390,14 @@ main(int argc, char *argv[]) {
 			char ch = *e;
 			*e = '\0';
 			if (verbose) printf("%s\n", s);
-			loadkeys(s, &warned);
+			loadkeys(s);
 			*e = ch;
 			s = e;
 		}
 		free(buf);
 	} else
-		loadkeys(NULL, &warned);
+		loadkeys(NULL);
 	exit(0);
 }
 
@@ -811,20 +826,10 @@ compose(int diacr, int base, int res) {
 }
 
 static int
-defkeys(int fd, char *cons, int *warned) {
+defkeys(int fd) {
 	struct kbentry ke;
 	int ct = 0;
 	int i,j,fail;
-	int oldm;
-
-	if (unicode_used) {
-		/* Switch keyboard mode for a moment -
-		   do not complain about errors.
-		   Do not attempt a reset if the change failed. */
-		if (ioctl(fd, KDGKBMODE, &oldm)
-		    || (oldm != K_UNICODE && ioctl(fd, KDSKBMODE, K_UNICODE)))
-			oldm = K_UNICODE;
-	}
 
 	for(i=0; i
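

Note on the keysym encoding the patch relies on: the sketch below is only an illustration, not part of the patch. It restates the K/KTYP/KVAL packing as defined in <linux/keyboard.h> and the "code point xor 0xf000" convention used for Unicode keysyms, and mirrors the early-return test at the top of convert_code(). The helper names is_unicode_keysym(), unicode_keysym() and needs_conversion() are made up for this example; the macros are redefined locally so the snippet builds without kernel headers.

```c
/* Keysym packing as in <linux/keyboard.h>: type in the high byte,
 * value in the low byte.  Redefined here so the example is standalone. */
#define K(t, v)   (((t) << 8) | (v))
#define KTYP(x)   ((x) >> 8)
#define KVAL(x)   ((x) & 0xff)

#define KT_LATIN  0
#define KT_META   8
#define KT_LETTER 11

/* Hypothetical helper: the patch treats every code >= 0x1000 as a
 * Unicode keysym and everything below as a "traditional" K(KTYP, KVAL). */
int is_unicode_keysym(int code)
{
	return code >= 0x1000;
}

/* Hypothetical helper: a Unicode keysym is stored as (code point ^ 0xf000),
 * e.g. U+00E9 becomes 0xf0e9. */
int unicode_keysym(int codepoint)
{
	return codepoint ^ 0xf000;
}

/* Hypothetical helper mirroring the first two tests of convert_code():
 * returns 0 when the code can be used as-is (Meta keysyms, or a code whose
 * kind already matches the console mode), 1 when a name lookup via
 * codetoksym()/ksymtocode() would be attempted. */
int needs_conversion(int code, int prefer_unicode)
{
	if (KTYP(code) == KT_META)
		return 0;
	return (prefer_unicode != 0) != is_unicode_keysym(code);
}
```

For instance, with the console in Unicode mode (prefer_unicode set), the traditional keysym K(KT_LATIN, 'a') would go through the lookup, while unicode_keysym(0xe9) is already in the right form and is passed through unchanged.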