Unix Incompatibility Notes:
Character Type Functions
This page describes portability issues
related to the ctype.h functions in Unix.
There aren't really that many.
Versions of ctype.h exist on every Unix system I've ever
encountered, but there are small differences in the set of functions
defined and their behavior.
Originally these were implemented using the English language alphabet.
One of the bigger changes in the ANSI standard version was the extension
to international alphabets, so that their behavior changes based on the
locale that you set.
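To make the locale dependence concrete, here is a minimal sketch. The helper name alpha_in_locale() is made up for illustration. It switches the LC_CTYPE category with the standard setlocale() call and then asks isalpha() about a character. Which locale names exist varies from system to system, so only the "C" locale is safe to assume.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: report whether ch is alphabetic under the named
 * locale. Returns 1 or 0, or -1 if the locale cannot be selected.
 * Note that it leaves LC_CTYPE set to the requested locale. */
static int alpha_in_locale(const char *locale_name, int ch)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;              /* locale not installed on this system */
    return isalpha(ch) != 0;    /* normalize "non-zero" to 1 */
}
```

In the "C" locale, alpha_in_locale("C", 0xE9) should give 0. In a Latin-1 locale, if one is installed, the same value (e with an acute accent) would normally count as alphabetic.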
Character Classification
In older versions of Unix, the other character classification macros were
defined only for values for which isascii() was true.
So you'd always need to write
(isascii(ch) && isalpha(ch))
Since this limitation still exists in some fairly recent implementations
(e.g., Solaris), you should still do this.
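One way to avoid repeating that guard everywhere is to wrap it once. This is only a sketch: safe_isalpha(), count_alpha() and is_ascii7() are names I made up here. Because isascii() is not part of ISO C, the ASCII test is spelled out by hand; on a Unix system you could call isascii() directly instead.

```c
#include <ctype.h>

/* Hypothetical stand-in for isascii(): true for values 0 through 127. */
static int is_ascii7(int ch)
{
    return (ch & ~0x7f) == 0;
}

/* Guarded isalpha(): only consults isalpha() for 7-bit values, so it
 * is safe on systems where isalpha() is undefined elsewhere. */
static int safe_isalpha(int ch)
{
    return is_ascii7(ch) && isalpha(ch);
}

/* Count the ASCII letters in a string. The cast to unsigned char keeps
 * high-bit bytes from turning negative where plain char is signed. */
static int count_alpha(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++) {
        if (safe_isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

The price is that accented letters are never counted as alphabetic, even in a locale that has them.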
I think in most implementations, even old ones, these functions will
work sensibly if passed an EOF value.
- isalnum()
-
Check if alphanumeric. Equivalent to (isalpha(c) || isdigit(c)).
- isalpha()
-
Check if alphabetic. In the default 'C' locale, this is
equivalent to (isupper(c) || islower(c)), but this is not true in
all locales, since some alphabetic characters are neither upper nor lower case.
- isascii()
-
Check if ASCII. Seven bit character values between 0 and 127 are ASCII.
In older versions of Unix, this was the only one of the character
classification macros defined on non-ASCII characters.
- isblank()
-
Check if a space or tab character.
This originated as a GNU extension and is not in the original ANSI C standard
(it was later added in C99), so it is not available everywhere.
- iscntrl()
-
Check if a control character. Character values between 0 and 31 are control
characters, as is character value 127 (the DEL character).
- isdigit()
-
Check if a digit.
- isgraph()
-
Check if a printable character other than a space.
This didn't exist in early implementations of ctype.h.
- islower()
-
Check if lower case. Which characters are lower case depends on locale.
- isprint()
-
Check if printable. Equivalent to (isgraph(c) || c == ' ')
or, for ASCII characters, to !iscntrl(c).
- ispunct()
-
Check if punctuation. Equivalent to (isgraph(c) && !isalnum(c)).
- isspace()
-
Check if white space. In the "C" and "POSIX" locales, these are space,
form-feed ('\f'),
newline ('\n'),
carriage-return ('\r'),
horizontal-tab ('\t'),
and vertical-tab ('\v').
- isupper()
-
Check if upper case. Which characters are upper case depends on locale.
- isxdigit()
-
Check for hexadecimal digit. That is
'0' through '9',
'A' through 'F', or
'a' through 'f'.
This can differ in different locales.
This didn't exist in early implementations of ctype.h.
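The equivalences listed above are easy to check on a given system. The sketch below (the function names are hypothetical) walks the 128 ASCII codes and counts how many of them break one of the identities. I'd expect zero in the "C" locale on an ASCII system. It also includes a stand-in for isblank() for systems that lack it.

```c
#include <ctype.h>

/* Hypothetical fallback for systems without isblank(). */
static int my_isblank(int c)
{
    return c == ' ' || c == '\t';
}

/* Count the 7-bit codes for which one of the equivalences from the
 * list above fails. The !! turns "non-zero" into exactly 1 so the
 * two sides can be compared directly. */
static int count_ctype_mismatches(void)
{
    int c, bad = 0;

    for (c = 0; c < 128; c++) {
        if (!!isalnum(c) != (isalpha(c) || isdigit(c)))    bad++;
        if (!!isalpha(c) != (isupper(c) || islower(c)))    bad++;
        if (!!isprint(c) != (isgraph(c) || c == ' '))      bad++;
        if (!!ispunct(c) != (isgraph(c) && !isalnum(c)))   bad++;
        if (!!iscntrl(c) != (c < 32 || c == 127))          bad++;
    }
    return bad;
}
```

A non-zero count would point to a non-ASCII character set or a locale other than "C" being in effect.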
Character Conversion
- toascii()
-
Converts a character to ASCII by clearing the high bit.
Not safe outside the standard locales, since it turns accented letters into
unrelated characters.
This does not exist in some of the very old Unix versions, but those are
probably rare enough now not to be worth worrying about.
- tolower()
-
If given an uppercase letter, as defined by isupper(), return the
corresponding lowercase letter.
ANSI versions return the input character if the input character is not upper
case.
However, older versions would return random junk if passed a character that
was not upper case. For compatibility with such implementations, you'd
need always to do:
((isascii(ch) && isupper(ch)) ? tolower(ch) : ch)
Ain't backwards compatibility lovely?
- toupper()
-
If given a lowercase letter, as defined by islower(), return the
corresponding uppercase letter.
ANSI versions return the input character if the input character is not lower
case or if there is no corresponding upper case letter (in German, the
sharp s has no upper case version).
However, older versions would return random junk if passed a character that
was not lower case. For compatibility with such implementations, you'd
need always to do:
((isascii(ch) && islower(ch)) ? toupper(ch) : ch)
- _tolower()
-
This macro version of tolower() is available on most newer versions of
Unix. It behaves like the old-fashioned tolower() function in that
its result is undefined if it is passed a character that is not upper case.
- _toupper()
-
This macro version of toupper() is available on most newer versions of
Unix. It behaves like the old-fashioned toupper() function in that
its result is undefined if it is passed a character that is not lower case.
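The guarded conversions given above for tolower() and toupper() can be wrapped as functions so that the test is written only once. This is a sketch under the same assumptions as those expressions. The names safe_tolower(), safe_toupper(), lower_string() and is_ascii7() are made up, and is_ascii7() stands in for isascii(), which ISO C does not provide.

```c
#include <ctype.h>

/* Hypothetical stand-in for isascii(): true for values 0 through 127. */
static int is_ascii7(int ch)
{
    return (ch & ~0x7f) == 0;
}

/* Only hand tolower() a character already known to be an ASCII
 * upper case letter; pass everything else through unchanged. */
static int safe_tolower(int ch)
{
    return (is_ascii7(ch) && isupper(ch)) ? tolower(ch) : ch;
}

/* Likewise, only hand toupper() an ASCII lower case letter. */
static int safe_toupper(int ch)
{
    return (is_ascii7(ch) && islower(ch)) ? toupper(ch) : ch;
}

/* Lower-case a string in place. The unsigned char cast avoids
 * negative values where plain char is signed. */
static void lower_string(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)safe_tolower((unsigned char)*s);
}
```

With these wrappers, values such as EOF or a Latin-1 sharp s (0xDF) come back unchanged instead of depending on what the local tolower() or toupper() does with them.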
Jan Wolter (E-Mail)
Thu Mar 6 09:38:43 EST 2003
- Original Release.