Unix Incompatibility Notes:
I/O Functions

Jan Wolter

This page describes portability issues related to various Unix I/O functions. It is incomplete.

I suggest using something like Gnu's autoconf package to test for the existence of various functions before compiling, and to define symbols like HAVE_SNPRINTF if the functions exist.
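
A minimal sketch of how such a symbol might be used on the C side. It assumes that the configure step defines HAVE_SNPRINTF (for example in a generated header) when snprintf() exists. The helper name format_count() is hypothetical and used only for illustration.

```c
#include <stdio.h>

/* HAVE_SNPRINTF is assumed to be defined by the configure step when
   snprintf() is available.  format_count() is a hypothetical helper. */
int format_count(char *buf, size_t size, int count)
{
#ifdef HAVE_SNPRINTF
    return snprintf(buf, size, "%d items", count);
#else
    /* Fallback: the caller must guarantee buf can hold any int plus the
       fixed text, since sprintf() does no bounds checking. */
    (void)size;
    return sprintf(buf, "%d items", count);
#endif
}
```

On very old systems the return value of the sprintf() branch may not be meaningful, as discussed under printf() on this page.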

asprintf(), vasprintf()
These work like sprintf() and vsprintf(), except that they dynamically allocate a buffer large enough to hold the result, thus ensuring that there will be neither buffer overflows nor lost data. They seem to be a Gnu innovation and, unfortunately, don't appear to exist on any commercial Unixes. (Redhat 6.1 Linux seems to have them, although they aren't documented in the man pages.) The following completely untested and somewhat inefficient code might simulate them on systems that have vsnprintf() and va_copy():
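  #ifndef HAVE_ASPRINTF
  #include <stdarg.h>
  #include <stdio.h>       /* vsnprintf() */
  #include <stdlib.h>      /* malloc(), realloc(), free() */
  int vasprintf(char **ret, const char *format, va_list ap)
  {
      va_list ap2;
      int len= 100;        /* First guess at the size */
      if ((*ret= (char *)malloc(len)) == NULL) return -1;
      while (1)
      {
          int nchar;
          char *tmp;
          va_copy(ap2, ap);
          nchar= vsnprintf(*ret, len, format, ap2);
          va_end(ap2);     /* Each va_copy() needs a matching va_end() */
          if (nchar > -1 && nchar < len) return nchar;
          if (nchar > len)
              len= nchar+1;    /* C99 style: we were told the size */
          else
              len*= 2;         /* Old style -1 return: guess bigger */
          /* Keep the old pointer so it can be freed if realloc() fails */
          if ((tmp= (char *)realloc(*ret, len)) == NULL)
          {
              free(*ret);
              *ret= NULL;
              return -1;
          }
          *ret= tmp;
      }
  }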

  int asprintf(char **ret, const char *format, ...)
  {
      va_list ap;
      int nc;
      va_start (ap, format);
      nc= vasprintf(ret, format, ap);
      va_end(ap);
      return nc;
  }
  #endif /*HAVE_ASPRINTF*/

Besides being untested, the code above has the problem that it depends on having stdarg.h, va_copy() and vsnprintf(), none of which are fully portable. Lack of stdarg.h can probably be worked around, but the others are likely to be harder. See the variadic functions page for discussion of va_copy().

fgetln()
This is a variant of fgets() that avoids its problems with fixed-size input buffers. However, it appears to exist only in the newer versions of BSD Unix. Even Linux doesn't support it. It probably wouldn't be hard to write your own version to use on systems that don't have it, but I find the interface a bit clunky and so mostly just avoid it.
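
A rough sketch of what such a home-made version could look like. The name my_fgetln() is hypothetical, chosen to avoid clashing with a real fgetln(). Unlike the BSD function, this sketch shares one static buffer across all streams, so it is only an approximation of the interface.

```c
#include <stdio.h>
#include <stdlib.h>

/* my_fgetln() is a hypothetical stand-in for BSD fgetln().
   Returns a pointer to the next line (NOT null-terminated) and stores its
   length, including any trailing newline, in *len.  Returns NULL at end of
   file or if memory runs out.  The buffer is reused by the next call. */
char *my_fgetln(FILE *fp, size_t *len)
{
    static char *buf= NULL;
    static size_t size= 0;
    size_t n= 0;
    int ch;

    while ((ch= getc(fp)) != EOF)
    {
        if (n >= size)
        {
            /* Grow the buffer, keeping the old one if realloc() fails */
            size_t newsize= size ? size*2 : 128;
            char *tmp= (char *)realloc(buf, newsize);
            if (tmp == NULL) return NULL;
            buf= tmp;
            size= newsize;
        }
        buf[n++]= (char)ch;
        if (ch == '\n') break;
    }
    *len= n;
    return n > 0 ? buf : NULL;
}
```

Because the result is not null-terminated, callers have to use the returned length rather than functions like strlen().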

fgetpos(), fsetpos()
Mandated by ANSI C, but not common in pre-ANSI C systems. These are similar to fseek() and ftell(), but handle arbitrarily large files.
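
A small sketch of how the pair is typically used: save a position, read ahead, then return to the saved position. The helper name peek_char() is hypothetical.

```c
#include <stdio.h>

/* peek_char() is a hypothetical helper: it reads the next character from
   fp and then restores the file position, so the same character can be
   read again.  Returns the character, or EOF on end of file or error. */
int peek_char(FILE *fp)
{
    fpos_t pos;
    int ch;

    if (fgetpos(fp, &pos) != 0) return EOF;   /* remember where we are */
    ch= getc(fp);
    if (fsetpos(fp, &pos) != 0) return EOF;   /* go back there */
    return ch;
}
```

The fpos_t value is opaque, so it can only be passed back to fsetpos(); it cannot be used for arithmetic the way an ftell() offset can.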

fopen(), freopen(), fdopen()
Available on all Unixes.

On some non-Unix systems, you must include the 'b' flag when opening binary files. All POSIX Unix systems recognize but ignore the 'b' flag. I don't think any Unix system needs it. I suspect some older systems may reject it, but haven't seen one that does.

fseek(), ftell(), rewind()
Always available. They may have problems on systems that support extremely large files, because the long offsets may not be big enough to represent every position in such a file.
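
A sketch that shows where the long offset limit comes in, assuming a Unix system where seeking to the end of a file is supported. The helper name file_size() is hypothetical.

```c
#include <stdio.h>

/* file_size() is a hypothetical helper: returns the size in bytes of an
   open, seekable file, or -1 on error, and restores the original
   position.  The result is a long, so a file bigger than LONG_MAX bytes
   cannot be measured this way. */
long file_size(FILE *fp)
{
    long here, size;

    if ((here= ftell(fp)) < 0) return -1;
    if (fseek(fp, 0L, SEEK_END) != 0) return -1;
    size= ftell(fp);
    if (fseek(fp, here, SEEK_SET) != 0) return -1;
    return size;
}
```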

getc(), getchar(), fgetc()
All versions of Unix have these.

Be careful to assign the returned value to an int variable, not a char variable. This is needed for EOF to be properly recognized.
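
A short sketch of the usual reading loop. If ch were declared as char, then on systems where char is unsigned the comparison with EOF could never succeed, and on systems where it is signed a data byte of 0xFF could be mistaken for EOF. The helper name count_chars() is hypothetical.

```c
#include <stdio.h>

/* count_chars() is a hypothetical helper that counts the characters
   remaining on a stream. */
long count_chars(FILE *fp)
{
    int ch;          /* int, not char, so EOF is distinct from every byte */
    long n= 0;

    while ((ch= getc(fp)) != EOF)
        n++;
    return n;
}
```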

gets(), fgets()
Available on all versions of Unix. gets() is extremely vulnerable to buffer overflows, because it has no way to know the size of the buffer it is writing into, and should be avoided.
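
A minimal sketch of replacing gets() with fgets(). Since fgets() keeps the newline that gets() would have dropped, the sketch strips it. The helper name read_line() is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* read_line() is a hypothetical helper: reads one line from fp into buf
   (at most size-1 characters) and strips the trailing newline.
   Returns buf, or NULL on end of file or error.  If the line is too long
   for the buffer, the remainder is left for the next read. */
char *read_line(char *buf, int size, FILE *fp)
{
    char *nl;

    if (fgets(buf, size, fp) == NULL) return NULL;
    if ((nl= strchr(buf, '\n')) != NULL) *nl= '\0';
    return buf;
}
```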

getline()
The following description of getline() has only the vaguest agreement with the behavior of that function in modern Unixes. I don't know if it has changed since I wrote this, or if I was wildly confused at the time. It should be ignored until I have time to update it.

Getline() reads a line from standard input and stores it in a buffer. Like fgets(), it is passed the buffer size so that overflows don't occur. Unlike fgets(), if the line is longer than the buffer size, the rest of the line is discarded. This is quite useful, but unavailable on many versions of Unix. I think the following would work as a substitute, but haven't tested it:

  #ifndef HAVE_GETLINE
  #include <stdio.h>      /* fgets(), getchar() */
  #include <string.h>     /* strchr() */
  char *getline(char *buf, int len)
  {
     char *p;
     int ch;
     if ((p= fgets(buf, len, stdin)) == NULL ||
         strchr(p,'\n') != NULL) return p;
     while ((ch= getchar()) != EOF && ch != '\n')
          ;
     return p;
  }
  #endif

printf(), fprintf(), sprintf()
All versions of Unix have these, but there is some variation.

Newer versions return the number of characters printed or -1 if an error occurs. Older versions may not return the size (they return nothing useful). In some cases, you may be able to use the "%n" format string to work around this, but not all versions of printf support that.
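
When the return value can't be trusted and "%n" isn't available either, one fallback not mentioned above is simply to measure the result with strlen(). A minimal sketch, with a hypothetical helper name format_len():

```c
#include <stdio.h>
#include <string.h>

/* format_len() is a hypothetical helper: it formats an int into buf and
   returns the length of the result without relying on sprintf()'s return
   value, which older libraries do not define usefully. */
int format_len(char *buf, int value)
{
    sprintf(buf, "value=%d", value);   /* buf must be large enough */
    return (int)strlen(buf);           /* portable way to get the length */
}
```

This only works for sprintf(), where the output is still around to be measured; it is no help for printf() or fprintf().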

The syntax of the format strings has expanded over the years. Things that may not appear in all versions of printf() include # for alternate forms, the space and + flags to control the printing of signs, ' for grouping, h for short arguments, q or L or ll for long long or long double arguments, Z for size_t arguments, and conversion types EinpxX.

Note that there is little standardization about how to specify long long (64-bit) arguments, with q, L, and ll all supported by Gnu CC. However, since there isn't much standardization with anything to do with 64-bit values, this is the least of your worries.

The sprintf() call is a common cause of buffer overflows. If portability concerns forbid using snprintf() instead, then I recommend that any %s directives in the format string be given with a maximum length (like %.10s) such that the total output size can't exceed the buffer size.
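
A small sketch of the bounded %s technique. The buffer size GREETING_MAX and the helper name make_greeting() are hypothetical. The point is that the worst-case output length can be worked out by hand from the format string.

```c
#include <stdio.h>

#define GREETING_MAX 32   /* hypothetical buffer size for this example */

/* make_greeting() is a hypothetical helper.  The fixed text "Hello, " and
   "!" takes 8 characters, the name is capped at 20 by the precision, and
   the terminating null takes 1, so at most 29 bytes are ever written. */
void make_greeting(char buf[GREETING_MAX], const char *name)
{
    sprintf(buf, "Hello, %.20s!", name);
}
```

Numeric directives need the same kind of worst-case accounting, since a %d can expand to many more characters than the typical value suggests.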

scanf(), fscanf(), sscanf()
These are available on almost all versions of Unix. However, it is often difficult to properly handle erroneous input with them, so I rarely end up using them. Which things work in the format strings varies on different systems, much as the format strings for printf() do.

snprintf(), vsnprintf()
These are wonderful substitutes for sprintf() and vsprintf(), since they avoid the buffer overflow problems that those have. They exist on most modern Unixes, but unfortunately there are still many older systems (e.g. SunOS 4.1) around that don't support them, so programmers who really care about portability are pretty much still stuck using sprintf() (with great care).

Implementing your own portable snprintf() is pretty hard. There are a few around on the net. See http://www.ijs.si/software/snprintf/ for example.

Different versions of snprintf() return different values when the output is too large for the buffer. Versions compliant with the C99 standard return the number of characters that would have been in the string (not including the terminating null) if the buffer had been big enough, so reallocating the buffer to one byte larger than this and trying again should work. Older versions (including glibc through version 2.0.6) simply return -1 if the buffer is too small. This incompatibility is probably best dealt with by always testing both cases, as in the code fragment given for vasprintf() above.

There have been some implementations of snprintf() and vasprintf() in circulation which return incorrect required string lengths when the supplied buffer is too small. Some consistently return a value too small by one, so if you keep trying again with a buffer one byte larger than the value it returned, you fall into an infinite loop. To be maximally paranoid, you should write retry code so that it is assured to always enlarge the buffer, no matter what snprintf() returns, or so that the number of retries is limited.
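
A sketch of such a paranoid retry loop, doing both things at once: the buffer at least doubles on every pass, and the number of passes is capped. It assumes vsnprintf() and stdarg.h are available. The helper name format_alloc() and the limit MAX_TRIES are hypothetical. It restarts the argument list with va_start() on each pass, which avoids needing va_copy().

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRIES 20   /* hypothetical cap on the number of retries */

/* format_alloc() is a hypothetical helper: returns a malloc()ed string
   holding the formatted output, or NULL on failure. */
char *format_alloc(const char *format, ...)
{
    size_t len= 64;            /* first guess at the size */
    char *buf= NULL;
    int tries;

    for (tries= 0; tries < MAX_TRIES; tries++)
    {
        va_list ap;
        int nchar;
        char *tmp= (char *)realloc(buf, len);
        if (tmp == NULL) break;
        buf= tmp;

        va_start(ap, format);
        nchar= vsnprintf(buf, len, format, ap);
        va_end(ap);

        if (nchar > -1 && (size_t)nchar < len) return buf;   /* it fit */

        /* Always grow: at least double, and more if a C99-style
           return value asks for more than that. */
        if (nchar > -1 && (size_t)nchar + 1 > len * 2)
            len= (size_t)nchar + 1;
        else
            len*= 2;
    }
    free(buf);
    return NULL;
}
```

This prevents the infinite loop, but it cannot detect a library that under-reports the length while also truncating the output.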

vprintf(), vfprintf(), vsprintf()
These don't exist on some antiquated Unixes, but are required by ANSI C. Most of those old systems have a function called _doprnt() which can be used to achieve the same effect with less grace.
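
For reference, a sketch of the typical use of these functions on a system that has them: writing your own printf()-like wrapper. The helper name log_msg() is hypothetical, and the sketch assumes stdarg.h is available.

```c
#include <stdarg.h>
#include <stdio.h>

/* log_msg() is a hypothetical helper: it writes a prefixed, formatted
   message to fp and returns whatever vfprintf() reports for the
   formatted part. */
int log_msg(FILE *fp, const char *format, ...)
{
    va_list ap;
    int n;

    fputs("log: ", fp);
    va_start(ap, format);
    n= vfprintf(fp, format, ap);
    va_end(ap);
    return n;
}
```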

vscanf(), vfscanf(), vsscanf()
These are not in the original ANSI C (C89) standard, although the C99 standard adds them. They exist on many modern Unixes, but cannot be counted on.


Jan Wolter (E-Mail)
Fri Aug 24 11:01:22 EDT 2001 - Original Release.
Mon Aug 11 08:04:48 EDT 2003 - Corrections to discussion of asprintf() and snprintf() thanks to Richard Kettlewell