
ISO/IEC TS 17961 [signconv]

Conversion of signed characters to wider integer types before a check for EOF

Description

Rule Definition

Conversion of signed characters to wider integer types before a check for EOF. [1]

Polyspace Implementation

This checker checks for Misuse of sign-extended character value.

Examples


Issue

Misuse of sign-extended character value occurs when you convert a signed or plain char value to a wider integer data type with sign extension, and then use the resulting sign-extended value as an array index, in a comparison with EOF, or as an argument to a character-handling function.

Risk

Comparison with EOF: Suppose that your compiler implements the plain char type as signed. In this implementation, the character with the decimal value 255 is stored as the signed value –1 (two's complement form, assuming an 8-bit char). When you convert the char variable to a wider data type such as int, the sign bit is preserved (sign extension). The character is therefore converted to the integer –1, which cannot be distinguished from EOF on implementations where EOF is defined as –1.
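The sign-extension effect described above can be sketched as follows. This is a minimal illustration, not code from the checker: the helper names widen_sign_extended and widen_as_unsigned are hypothetical, the code uses signed char explicitly so that the result does not depend on how the compiler treats plain char, and it assumes 8-bit bytes.

```c
#include <stdio.h>   /* EOF */

/* Hypothetical helper: widen a signed char to int by plain conversion.
   The sign bit is copied into the upper bits of the int. */
static int widen_sign_extended(signed char ch)
{
    return ch;
}

/* Hypothetical helper: convert to unsigned char first, then widen.
   The result is always in the range 0..UCHAR_MAX, so it is never negative. */
static int widen_as_unsigned(signed char ch)
{
    return (unsigned char)ch;
}
```

For a signed char holding –1 (the byte 0xFF), the first helper returns –1, while the second returns 255 and so cannot collide with EOF, which the C standard requires to be negative.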

Use as array index: By similar reasoning, you cannot use sign-extended plain char variables as an array index. If the sign bit is preserved, the conversion from char to int can result in a negative integer, and a negative index refers to memory before the start of the array. Array index values must be nonnegative.
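As an illustration of the array index case, the following sketch indexes a lookup table by character value. The names seen, mark_seen, and was_seen are hypothetical and chosen only for this example. Each index goes through a cast to unsigned char, so it stays within 0..UCHAR_MAX regardless of whether plain char is signed.

```c
#include <limits.h>  /* UCHAR_MAX */

/* Hypothetical lookup table with one slot per possible character value. */
static unsigned char seen[UCHAR_MAX + 1];

/* Record every character that occurs in the string s. */
static void mark_seen(const char *s)
{
    for (; *s != '\0'; ++s) {
        /* Without the cast, a signed char with the high bit set would
           convert to a negative int and index before the array. */
        seen[(unsigned char)*s] = 1;
    }
}

/* Return nonzero if ch was recorded by an earlier call to mark_seen. */
static int was_seen(char ch)
{
    return seen[(unsigned char)ch];
}
```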

Argument to character-handling function: By similar reasoning, you cannot use sign-extended plain char variables as arguments to the character-handling functions declared in ctype.h, for instance, isalpha() or isdigit(). According to the C11 standard (Section 7.4), if you supply an integer argument that is neither representable as unsigned char nor equal to EOF, the resulting behavior is undefined.
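For the character-handling case, a common pattern is to cast each character to unsigned char at the call site. The sketch below applies this to isdigit(). The function name count_digits is hypothetical and used only for illustration.

```c
#include <ctype.h>

/* Hypothetical helper: count the decimal digits in the string s. */
static int count_digits(const char *s)
{
    int n = 0;
    for (; *s != '\0'; ++s) {
        /* The cast keeps the argument representable as unsigned char,
           which is what the ctype.h functions require. */
        if (isdigit((unsigned char)*s)) {
            ++n;
        }
    }
    return n;
}
```

With the cast in place, strings that contain characters outside the 7-bit range can be passed safely, and those characters are simply not counted as digits.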

Fix

Before conversion to a wider integer data type, cast the signed or plain char value explicitly to unsigned char.

Example - Sign-Extended Character Value Compared with EOF
#include <stdio.h>
#include <stdlib.h>
#define fatal_error() abort()

extern char parsed_token_buffer[20];

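static int parser(char *buf)
{
    int c = EOF;
    if (buf && *buf) {
        /* Noncompliant: if plain char is signed, a character with
           the value 255 is sign-extended to -1 when stored in c. */
        c = *buf++;
    }
    return c;
}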

void func()
{
    if (parser(parsed_token_buffer) == EOF) { 
        /* Handle error */
        fatal_error();
    }
}

In this example, the function parser reads the first character of the string input buf. If plain char is signed and the character has the decimal value 255, its value becomes –1 when converted to the int variable c, which is indistinguishable from EOF. The later comparison with EOF in func can then succeed incorrectly, so a valid character is handled as an error.

Correction - Cast to unsigned char Before Conversion

One possible correction is to cast the plain char value to unsigned char before conversion to the wider int type.

#include <stdio.h>
#include <stdlib.h>
#define fatal_error() abort()

extern char parsed_token_buffer[20];

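static int parser(char *buf)
{
    int c = EOF;
    if (buf && *buf) {
        /* The cast maps the character to the range 0..UCHAR_MAX before
           the conversion to int, so c cannot be negative here. */
        c = (unsigned char)*buf++;
    }
    return c;
}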

void func()
{
    if (parser(parsed_token_buffer) == EOF) { 
        /* Handle error */
        fatal_error();
    }
}

Check Information

Decidability: Undecidable

Version History

Introduced in R2019a


[1] Extracts from the standard "ISO/IEC TS 17961 Technical Specification - 2013-11-15" are reproduced with the agreement of AFNOR. Only the original and complete text of the standard, as published by AFNOR Editions - accessible via the website www.boutique.afnor.org - has normative value.