Therefore, each of the GLS library functions that takes a string argument allows you to pass either a null-terminated string or a string whose end is determined by a separate length argument.
..., mbs, mbs_byte_length, ...
If mbs_byte_length is the value IFX_GL_NULL, then the function assumes that mbs is a null-terminated string; otherwise, the function assumes that mbs_byte_length is the number of bytes in the multi-byte character string. The null terminator of a multi-byte string consists of one byte whose value is zero.
Multi-byte character strings which are not null-terminated are called length-terminated multi-byte strings and can contain null characters, but these null characters do not indicate the end of the string.
If mbs_byte_length is neither IFX_GL_NULL nor greater than or equal to zero, then the function gives the IFX_GL_PARAMERR error.
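The length convention described above can be mimicked in a few lines of plain C. The following is only a sketch and does not call the GLS library: SKETCH_NULL and SKETCH_PARAMERR are hypothetical stand-ins for IFX_GL_NULL and the IFX_GL_PARAMERR outcome (their values here are assumptions), and sketch_byte_length is an illustrative helper, not a GLS function.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NULL      (-1)  /* hypothetical stand-in for IFX_GL_NULL (value assumed) */
#define SKETCH_PARAMERR  (-2)  /* hypothetical stand-in for the IFX_GL_PARAMERR outcome */

/* Return the number of bytes in mbs under the convention described above:
 * SKETCH_NULL means "scan for the zero byte", a value >= 0 is taken as the
 * byte count, and any other value is a parameter error. */
int sketch_byte_length(const unsigned char *mbs, int mbs_byte_length)
{
    assert(mbs != NULL);
    if (mbs_byte_length == SKETCH_NULL) {
        int n = 0;
        while (mbs[n] != 0)       /* null-terminated: stop at the zero byte */
            n++;
        return n;
    }
    if (mbs_byte_length >= 0)
        return mbs_byte_length;   /* length-terminated: zero bytes are ordinary data */
    return SKETCH_PARAMERR;
}
```

For the five-byte buffer { 'a', 'b', 0, 'c', 0 }, passing SKETCH_NULL stops at the first zero byte, while passing 4 treats the embedded zero byte as part of the string.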
..., mb, mb_byte_limit, ...
If mb_byte_limit is IFX_GL_NO_LIMIT, then the function reads as many bytes as necessary from mb to form a complete character; otherwise, it does not read more than mb_byte_limit bytes from mb when trying to form a complete character.
1. If mb is a character in a null-terminated multi-byte string, then mb_byte_limit must be equal to IFX_GL_NO_LIMIT. For example, if mbs points to a null-terminated string of multi-byte characters:
    for ( mb = mbs; *mb != '\0'; mb += bytes ) {
        if ( (bytes = ifx_gl_mblen(mb, IFX_GL_NO_LIMIT)) == -1 ) {
            /* handle error, then leave the loop */
            break;
        }
    }
2. If mb is a character in a multi-byte string that is not null-terminated, or a character in a buffer by itself, then mb_byte_limit must be equal to the number of bytes between where mb points and the end of the buffer that holds the string or character. For example, if mbs points to a string of multi-byte characters that is not null-terminated and mbs_bytes is the number of bytes in that string:
    for ( mb = mbs; mbs_bytes > 0; mb += bytes, mbs_bytes -= bytes ) {
        if ( (bytes = ifx_gl_mblen(mb, mbs_bytes)) == -1 ) {
            /* handle error, then leave the loop */
            break;
        }
    }
Or, if mb points to one multi-byte character and mb_bytes is the number of bytes in the buffer that holds the character:
    if ( (bytes = ifx_gl_mblen(mb, mb_bytes)) == -1 ) {
        /* handle error */
    }
If the function cannot determine whether mb is a valid multi-byte character, either because it would need to read more than mb_byte_limit bytes from mb or because mb_byte_limit is less than or equal to zero, then the function gives the IFX_GL_EINVAL error.
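To make the effect of the byte limit concrete, the following sketch implements a toy character-length function. It is not the GLS implementation. It assumes UTF-8 purely because that encoding is easy to decode, whereas the real library works with whatever code set the current locale defines. The names toy_mblen, TOY_NO_LIMIT, TOY_EINVAL, and TOY_EILSEQ are hypothetical, and the sketch returns distinct negative codes instead of returning -1 and reporting the error separately.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_LIMIT  (-1)  /* hypothetical stand-in for IFX_GL_NO_LIMIT */
#define TOY_EINVAL    (-2)  /* character may be valid but is cut off by the limit */
#define TOY_EILSEQ    (-3)  /* bytes can never form a valid character */

/* Byte length of the UTF-8 character at mb, reading at most mb_byte_limit
 * bytes unless the limit is TOY_NO_LIMIT. */
int toy_mblen(const unsigned char *mb, int mb_byte_limit)
{
    int need, i;

    assert(mb != NULL);
    if (mb_byte_limit != TOY_NO_LIMIT && mb_byte_limit <= 0)
        return TOY_EINVAL;

    if (mb[0] < 0x80)                need = 1;
    else if ((mb[0] & 0xE0) == 0xC0) need = 2;
    else if ((mb[0] & 0xF0) == 0xE0) need = 3;
    else if ((mb[0] & 0xF8) == 0xF0) need = 4;
    else return TOY_EILSEQ;          /* stray trailing byte or invalid lead byte */

    for (i = 1; i < need; i++) {
        if (mb_byte_limit != TOY_NO_LIMIT && i >= mb_byte_limit)
            return TOY_EINVAL;       /* would have to read past the limit */
        if ((mb[i] & 0xC0) != 0x80)
            return TOY_EILSEQ;       /* not a trailing byte; also stops at a zero byte */
    }
    return need;
}
```

With the three-byte character 0xE2 0x82 0xAC, a limit of 3 (or TOY_NO_LIMIT on a null-terminated buffer) yields 3, a limit of 2 yields TOY_EINVAL because the third byte may not be read, and starting at the second byte yields TOY_EILSEQ.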
..., wcs, wcs_char_length, ...
If wcs_char_length is the value IFX_GL_NULL, then the function assumes that wcs is a null-terminated string; otherwise, the function assumes that wcs_char_length is the number of characters in the wide-character string. The null terminator of a wide-character string consists of one gl_wchar_t whose value is zero.
Wide-character strings which are not null-terminated are called length-terminated wide-character strings and can contain null characters, but these null characters do not indicate the end of the string.
If wcs_char_length is neither IFX_GL_NULL nor greater than or equal to zero, then the function gives the IFX_GL_PARAMERR error.
gl_mchar_t mbs[20*IFX_GL_MB_MAX];
gl_mchar_t *mbs = (gl_mchar_t *) malloc(20*IFX_GL_MB_MAX);
gl_mchar_t *mbs = (gl_mchar_t *) malloc(20*ifx_gl_mb_loc_max());
gl_mchar_t mbs[20*IFX_GL_MB_MAX+1];
gl_mchar_t *mbs = (gl_mchar_t *) malloc(20*IFX_GL_MB_MAX+1);
gl_mchar_t *mbs = (gl_mchar_t *) malloc(20*ifx_gl_mb_loc_max()+1);
gl_wchar_t wcs[20];
gl_wchar_t *wcs = (gl_wchar_t *) malloc(20*sizeof(gl_wchar_t));
gl_wchar_t wcs[21];
gl_wchar_t *wcs = (gl_wchar_t *) malloc(21*sizeof(gl_wchar_t));
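The declarations above appear to size buffers for 20 characters, with and without room for a null terminator. The arithmetic behind them can be sketched as follows. The helper names are hypothetical, SKETCH_MB_MAX is an assumed value standing in for IFX_GL_MB_MAX, and the wchar_size parameter stands in for sizeof(gl_wchar_t); none of these come from the GLS headers.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MB_MAX 4   /* assumed value, standing in for IFX_GL_MB_MAX */

/* Worst-case number of bytes needed to hold nchars multi-byte characters.
 * The multi-byte null terminator is a single zero byte, so it adds 1. */
size_t sketch_mbs_bytes(size_t nchars, size_t max_bytes_per_char,
                        int null_terminated)
{
    assert(max_bytes_per_char > 0);
    return nchars * max_bytes_per_char + (null_terminated ? 1 : 0);
}

/* Number of bytes needed to hold nchars wide characters. The wide-character
 * null terminator is one whole element, so it adds wchar_size bytes. */
size_t sketch_wcs_bytes(size_t nchars, size_t wchar_size, int null_terminated)
{
    assert(wchar_size > 0);
    return (nchars + (null_terminated ? 1 : 0)) * wchar_size;
}
```

Passing the locale's maximum character size instead of the compile-time maximum as max_bytes_per_char gives the smaller allocation that the run-time variants of the declarations request.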
However, truncating a string that can contain even one multi-byte character is difficult. This is because truncating at an arbitrary byte location in the string can result in truncating a multi-byte character in its middle such that the truncated string ends with the first 1, 2 or 3 bytes of a character without the character's remaining bytes.
If such a situation occurs, then subsequent traversal of the truncated string could result in reading beyond the end of the buffer.
Therefore, all GLS library functions that traverse one multi-byte character or traverse length-terminated multi-byte character strings give a special error if they detect that an otherwise valid character has been truncated: IFX_GL_EINVAL.
If it is known that no truncation occurred to the string, then IFX_GL_EINVAL can be considered the same as IFX_GL_EILSEQ. However, if it is possible that truncation has occurred, then IFX_GL_EINVAL indicates to the caller that they need to further truncate the string so that the last byte of the string is the last byte of the last character in the string.
Depending upon your application, you may either end up making the truncated string even shorter than originally intended, or you may have to replace the first 1, 2, or 3 bytes of the truncated character with a padding character that is appropriate for your application.
Even though the GLS library functions can be used to detect this situation after it has occurred, it is much better to use them to avoid the situation.
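One way to avoid the situation is to walk the string character by character before cutting it, and to stop at the last character that fits completely. The following is a minimal sketch of that idea, not a GLS routine. The helper toy_mblen is a hypothetical stand-in for a character-length call such as ifx_gl_mblen and assumes UTF-8 only for the sake of a self-contained example. The name truncate_on_boundary is likewise illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a character-length call such as ifx_gl_mblen:
 * returns the byte length of the UTF-8 character at mb, or -1 if it is
 * invalid or does not fit within limit bytes. */
static int toy_mblen(const unsigned char *mb, int limit)
{
    int need, i;
    if (limit <= 0)                  return -1;
    if (mb[0] < 0x80)                need = 1;
    else if ((mb[0] & 0xE0) == 0xC0) need = 2;
    else if ((mb[0] & 0xF0) == 0xE0) need = 3;
    else if ((mb[0] & 0xF8) == 0xF0) need = 4;
    else                             return -1;
    if (need > limit)                return -1;
    for (i = 1; i < need; i++)
        if ((mb[i] & 0xC0) != 0x80)  return -1;
    return need;
}

/* Largest byte count <= max_bytes at which the length-terminated string
 * mbs (mbs_bytes long) can be cut without splitting a character.
 * Returns -1 if an invalid character is met before the cut point. */
int truncate_on_boundary(const unsigned char *mbs, int mbs_bytes, int max_bytes)
{
    int kept = 0;

    assert(mbs != NULL && mbs_bytes >= 0 && max_bytes >= 0);
    while (kept < mbs_bytes) {
        int bytes = toy_mblen(mbs + kept, mbs_bytes - kept);
        if (bytes == -1)
            return -1;
        if (kept + bytes > max_bytes)
            break;                   /* next character would not fit whole */
        kept += bytes;
    }
    return kept;
}
```

For a five-byte string made of a one-byte character, a three-byte character, and another one-byte character, a request to keep at most 3 bytes returns 1, so the truncated string is shorter than requested but ends on a complete character.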
However, fragmenting a string that can contain even one multi-byte character is difficult. This is because fragmenting at arbitrary byte locations in the string can result in fragmenting a multi-byte character in its middle such that one fragment ends with the first 1, 2 or 3 bytes of a character and the next fragment starts with the remaining bytes.
If the only thing you ever will do with these fragments is to concatenate them back together to form one string, then no special processing needs to be done. However, if you traverse the fragments as multi-byte strings, this can result in reading beyond the end of one fragment or finding an illegal character at the beginning of another.
Therefore, all GLS library functions that traverse one multi-byte character or traverse length-terminated multi-byte character strings give a special error if they detect that an otherwise valid character has been truncated at the end of a fragment: IFX_GL_EINVAL. It is impossible to detect that the beginning of a fragment contains the remaining bytes of the last character in the previous fragment without looking at the previous fragment first. This is because the last 1, 2 or 3 bytes of a multi-byte character may look exactly like a valid character.
If it is known that no fragmentation occurred to the string, then IFX_GL_EINVAL can be considered the same as IFX_GL_EILSEQ. However, if it is possible that fragmentation has occurred, then IFX_GL_EINVAL indicates to the caller that they need to fragment the string so that the last byte of each fragment is the last byte of the last character in the fragment and so that the first byte of each fragment is the first byte of the first character in the fragment.
Depending upon your application, you may either end up making a fragment even shorter than originally intended, or you may have to replace the first 1, 2, or 3 bytes of the fragmented character with a padding character that is appropriate for your application and shift these bytes to the beginning of the next fragment.
Even though the GLS library functions can be used to detect this situation after it has occurred, it is much better to use them to avoid the situation.
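Fragmenting only at character boundaries can be sketched in the same spirit: walk the string once and close a fragment whenever the next character would not fit in it whole. The code below is an illustration under stated assumptions, not a GLS routine. The helper toy_mblen is a hypothetical stand-in for a character-length call such as ifx_gl_mblen, and it assumes UTF-8 only because that encoding is easy to decode. In code sets where trailing bytes can look like complete characters, splitting at boundaries found by a forward walk is the only reliable approach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a character-length call such as ifx_gl_mblen
 * (UTF-8 assumed): byte length of the character at mb, or -1 on error. */
static int toy_mblen(const unsigned char *mb, int limit)
{
    int need, i;
    if (limit <= 0) return -1;
    need = mb[0] < 0x80 ? 1
         : (mb[0] & 0xE0) == 0xC0 ? 2
         : (mb[0] & 0xF0) == 0xE0 ? 3
         : (mb[0] & 0xF8) == 0xF0 ? 4 : -1;
    if (need == -1 || need > limit) return -1;
    for (i = 1; i < need; i++)
        if ((mb[i] & 0xC0) != 0x80) return -1;
    return need;
}

/* Split mbs (mbs_bytes long) into fragments of at most frag_bytes bytes,
 * each starting and ending on a character boundary. ends[i] receives the
 * byte offset just past fragment i. Returns the number of fragments, or -1
 * if a character is invalid, is larger than frag_bytes, or ends[] is full. */
int fragment_on_boundaries(const unsigned char *mbs, int mbs_bytes,
                           int frag_bytes, int *ends, int max_frags)
{
    int start = 0, pos = 0, count = 0;

    assert(mbs != NULL && ends != NULL && mbs_bytes >= 0 && frag_bytes > 0);
    while (pos < mbs_bytes) {
        int bytes = toy_mblen(mbs + pos, mbs_bytes - pos);
        if (bytes == -1 || bytes > frag_bytes)
            return -1;
        if (pos + bytes - start > frag_bytes) {
            if (count == max_frags) return -1;
            ends[count++] = pos;     /* close the fragment before this character */
            start = pos;
        }
        pos += bytes;
    }
    if (pos > start) {
        if (count == max_frags) return -1;
        ends[count++] = pos;         /* final fragment */
    }
    return count;
}
```

Because every fragment produced this way is a complete multi-byte string in its own right, each one can be traversed separately, and concatenating the fragments in order restores the original string.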