CX Framework
Cross-platform C utility framework
Encoding

Functions

bool strValidUTF8 (strref s)
 
bool strValidASCII (strref s)
 
size_t strToUTF16 (strref s, _Out_writes_opt_(wsz) uint16 *buf, size_t wsz)
 
uint16 * strToUTF16A (strref s)
 
uint16 * strToUTF16S (strref s)
 
bool strFromUTF16 (strhandle o, _In_reads_(wsz) const uint16 *buf, size_t wsz)
 
bool strB64Encode (strhandle out, const uint8 *buf, uint32 sz, bool urlsafe)
 
uint32 strB64Decode (strref s, _Out_writes_bytes_opt_(sz) uint8 *buf, uint32 sz)
 

Detailed Description

String encoding validation and conversion.

CX strings internally use UTF-8, ASCII, or unspecified/binary encoding. This module provides validation and conversion to/from other encodings.

Function Documentation

◆ strB64Decode()

uint32 strB64Decode ( strref  s,
_Out_writes_bytes_opt_(sz) uint8 *  buf,
uint32  sz 
)

Decodes a base64 string to binary data

Converts base64 text back to binary data. Input in either the standard or the URL-safe alphabet is accepted; no flag is needed to select between them.

This function can be called twice: first with buf=NULL to query the required buffer size, then with an allocated buffer to perform the decoding.

Parameters
s    Base64-encoded string
buf  Output buffer for binary data (NULL to query size)
sz   Size of output buffer in bytes
Returns
Number of bytes required for decoded data, or 0 on error

Example:

uint32 sz = strB64Decode(encoded, NULL, 0); // Query size
if (sz > 0) {                               // 0 means the input could not be decoded
    uint8 *data = xaAlloc(sz);
    strB64Decode(encoded, data, sz);        // Decode
    // ... use data ...
    xaFree(data);
}
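
The statement that both alphabets are accepted can be illustrated with a standalone sketch. It does not use CX and is not the library's implementation: `b64_value` and `b64_decode_sketch` are hypothetical names, and the padding handling and the 0-on-error convention are assumptions modeled on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map one base64 character to its 6-bit value. Both the standard ('+', '/')
 * and URL-safe ('-', '_') alphabets are accepted; anything else yields -1. */
int b64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+' || c == '-') return 62;
    if (c == '/' || c == '_') return 63;
    return -1;
}

/* Hypothetical helper (not a CX function). Decodes len characters from in.
 * With out == NULL only the required size is reported. Returns 0 on invalid
 * input or when cap is too small. */
size_t b64_decode_sketch(const char *in, size_t len, uint8_t *out, size_t cap)
{
    assert(in != NULL);
    while (len > 0 && in[len - 1] == '=')   /* ignore trailing padding */
        len--;
    if (len % 4 == 1)                       /* no byte count maps to this length */
        return 0;
    for (size_t i = 0; i < len; i++)
        if (b64_value(in[i]) < 0)
            return 0;

    size_t need = len / 4 * 3 + (len % 4 ? len % 4 - 1 : 0);
    if (out == NULL)
        return need;                        /* size query */
    if (cap < need)
        return 0;

    uint32_t acc = 0;   /* bit accumulator */
    int bits = 0;       /* bits in acc that have not been emitted yet */
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        acc = (acc << 6) | (uint32_t)b64_value(in[i]);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[n++] = (uint8_t)(acc >> bits);
        }
    }
    return n;
}
```

Because characters 62 and 63 each have two accepted spellings, the same decoder handles "+/8=" and "-_8=" identically, which is one plausible way to support both encodings without a flag.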

◆ strB64Encode()

bool strB64Encode ( strhandle  out,
const uint8 *  buf,
uint32  sz,
bool  urlsafe 
)

Encodes binary data as a base64 string

Converts arbitrary binary data into base64 text encoding. Supports both standard base64 and URL-safe base64 (using '-' and '_' instead of '+' and '/').

Any existing string in the output parameter is destroyed first.

Parameters
out      Pointer to output string variable
buf      Binary data to encode
sz       Size of binary data in bytes
urlsafe  Use URL-safe base64 alphabet if true
Returns
true on success, false on error

Example:

string encoded = 0;
strB64Encode(&encoded, data, dataSize, false);
// ... use encoded ...
strDestroy(&encoded);
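
To make the difference between the two alphabets concrete, here is a standalone sketch of base64 encoding with a `urlsafe` switch. It is not CX code: `b64_encode_sketch` is a hypothetical name, it writes into a caller-supplied char buffer rather than a CX string, and emitting '=' padding in URL-safe mode is an assumption (the text above does not say what strB64Encode() does about padding).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CX function). Encodes sz bytes from buf as
 * base64 text. out must have room for 4 * ((sz + 2) / 3) + 1 chars.
 * Returns the number of characters written, excluding the terminator. */
size_t b64_encode_sketch(const uint8_t *buf, size_t sz, bool urlsafe, char *out)
{
    static const char std_tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    static const char url_tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";
    const char *tbl = urlsafe ? url_tbl : std_tbl;
    size_t n = 0;

    assert(out != NULL && (buf != NULL || sz == 0));
    for (size_t i = 0; i < sz; i += 3) {
        size_t rem = sz - i;                      /* bytes left, at least 1 */
        uint32_t v = (uint32_t)buf[i] << 16;      /* pack up to 3 bytes */
        if (rem > 1) v |= (uint32_t)buf[i + 1] << 8;
        if (rem > 2) v |= buf[i + 2];

        out[n++] = tbl[(v >> 18) & 63];
        out[n++] = tbl[(v >> 12) & 63];
        out[n++] = rem > 1 ? tbl[(v >> 6) & 63] : '=';   /* pad short groups */
        out[n++] = rem > 2 ? tbl[v & 63] : '=';
    }
    out[n] = '\0';
    return n;
}
```

With this sketch the two bytes 0xFB 0xFF encode to "+/8=" in the standard alphabet and to "-_8=" in the URL-safe one; only the characters for values 62 and 63 differ.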

◆ strFromUTF16()

bool strFromUTF16 ( strhandle  o,
_In_reads_(wsz) const uint16 *  buf,
size_t  wsz 
)

Converts a UTF-16 encoded buffer to a UTF-8 string

Decodes UTF-16 code units (including surrogate pairs) into a UTF-8 string. The function validates the UTF-16 encoding and fails if it encounters an invalid sequence. wsz must not count the null terminator, if the buffer has one.

Any existing string in the output parameter is destroyed first.

Parameters
o    Pointer to output string variable
buf  Buffer containing UTF-16 code units
wsz  Number of uint16 elements in buffer (excluding null terminator)
Returns
true on success, false if UTF-16 encoding is invalid

Example:

string s = 0;
if (strFromUTF16(&s, wideBuf, cstrLenw(wideBuf))) {
    // Conversion successful
    // ... use s ...
    strDestroy(&s);
}
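
The surrogate-pair decoding and validation described for this function can be sketched without CX as follows. The names `utf16_next_sketch` and `utf8_put_sketch` are hypothetical, and using UINT32_MAX as the error value is an assumption of this sketch; the arithmetic itself follows the standard UTF-16 and UTF-8 definitions rather than CX's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CX function). Reads one code point starting at
 * buf[*pos] and advances *pos past it. Returns UINT32_MAX for an unpaired
 * or truncated surrogate. */
uint32_t utf16_next_sketch(const uint16_t *buf, size_t len, size_t *pos)
{
    assert(buf != NULL && pos != NULL && *pos < len);
    uint16_t hi = buf[(*pos)++];
    if (hi < 0xD800 || hi > 0xDFFF)
        return hi;                      /* ordinary BMP code unit */
    if (hi >= 0xDC00)
        return UINT32_MAX;              /* low surrogate with no high before it */
    if (*pos >= len)
        return UINT32_MAX;              /* high surrogate at end of buffer */
    uint16_t lo = buf[*pos];
    if (lo < 0xDC00 || lo > 0xDFFF)
        return UINT32_MAX;              /* high surrogate not followed by low */
    (*pos)++;
    return 0x10000u + (((uint32_t)hi - 0xD800u) << 10) + ((uint32_t)lo - 0xDC00u);
}

/* Hypothetical helper. Writes cp as UTF-8 and returns the byte count (1-4).
 * cp is expected to be a value produced by utf16_next_sketch. */
size_t utf8_put_sketch(uint32_t cp, uint8_t out[4])
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

For instance, the code units 0xD83D 0xDE00 combine into U+1F600, which this sketch writes as the four UTF-8 bytes F0 9F 98 80.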

◆ strToUTF16()

size_t strToUTF16 ( strref  s,
_Out_writes_opt_(wsz) uint16 *  buf,
size_t  wsz 
)

Converts a UTF-8 string to UTF-16 encoding

Encodes the string as UTF-16 code units, including surrogate pairs for code points outside the Basic Multilingual Plane. The string must be valid UTF-8 or this function will fail.

This function can be called twice: first with buf=NULL to query the required buffer size, then with an allocated buffer to perform the conversion.

Parameters
s    UTF-8 string to convert
buf  Output buffer for UTF-16 code units (NULL to query size)
wsz  Size of output buffer in uint16 elements
Returns
Number of uint16 elements required (including null terminator), or 0 on error

Example:

size_t sz = strToUTF16(s, NULL, 0); // Query size
if (sz > 0) {                       // 0 means s is not valid UTF-8
    uint16 *buf = xaAlloc(sz * sizeof(uint16));
    strToUTF16(s, buf, sz);         // Convert
    // ... use buf ...
    xaFree(buf);
}
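
The surrogate-pair encoding mentioned above works as in the following standalone sketch. `utf16_put_sketch` is a hypothetical name, not a CX function; it takes an already-decoded code point, whereas strToUTF16() starts from UTF-8 text, so this shows only the final encoding step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CX function). Encodes one code point as UTF-16.
 * Returns the number of code units written (1 or 2), or 0 if cp is not a
 * valid Unicode scalar value. */
size_t utf16_put_sketch(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                               /* out of range or a surrogate */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;                  /* BMP: one code unit */
        return 1;
    }
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF)); /* low surrogate: bottom 10 bits */
    return 2;
}
```

A code point outside the Basic Multilingual Plane therefore costs two uint16 elements, which is why the required size has to be queried rather than assumed to equal the character count.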

◆ strToUTF16A()

uint16 * strToUTF16A ( strref  s)

Converts a UTF-8 string to UTF-16 in an allocated buffer

Convenience wrapper around strToUTF16() that allocates the buffer automatically. The returned buffer must be freed with xaFree() when no longer needed.

Parameters
s  UTF-8 string to convert
Returns
Allocated UTF-16 buffer (null-terminated), or NULL on error. Caller must free with xaFree()

Example:

uint16 *wide = strToUTF16A(s);
if (wide) {
    // ... use wide ...
    xaFree(wide);
}

◆ strToUTF16S()

uint16 * strToUTF16S ( strref  s)

Converts a UTF-8 string to UTF-16 in a scratch buffer

Convenience wrapper around strToUTF16() that uses a temporary scratch buffer. This is useful for passing to OS APIs that require UTF-16 strings.

IMPORTANT: The returned buffer is temporary and may be overwritten by other operations (see cx/utils/scratch.h). Use or copy the result immediately.

Parameters
s  UTF-8 string to convert
Returns
Scratch buffer with UTF-16 encoding (null-terminated), or NULL on error. Do not free it; the buffer is managed by the scratch system.

Example:

uint16 *wide = strToUTF16S(path);
if (wide) {
    CreateFileW(wide, ...); // Use immediately
}

◆ strValidASCII()

bool strValidASCII ( strref  s)

Validates that a string contains only ASCII characters

Verifies that all bytes in the string are in the ASCII range (0x00-0x7F). If validation succeeds, both the ASCII and UTF-8 flags are cached in the string header (since ASCII is a subset of UTF-8).

Parameters
s  String to validate
Returns
true if the string contains only ASCII characters, false otherwise

Example:

if (strValidASCII(filename)) {
    // Safe to use with ASCII-only APIs
}
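
The range check itself is simple, as this standalone sketch shows. `is_ascii_sketch` is a hypothetical name and operates on a raw byte buffer rather than a CX string; the flag caching described above is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CX function). Returns true when every byte is
 * in the ASCII range 0x00-0x7F. An empty buffer counts as valid. */
bool is_ascii_sketch(const uint8_t *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] > 0x7F)
            return false;   /* high bit set: not ASCII */
    }
    return true;
}
```

Every byte that passes this check is also a complete one-byte UTF-8 sequence, which is why a successful ASCII validation implies UTF-8 validity as well.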

◆ strValidUTF8()

bool strValidUTF8 ( strref  s)

Validates that a string contains valid UTF-8 sequences

Verifies that all byte sequences in the string form valid UTF-8 code points. If validation succeeds, the UTF-8 flag is cached in the string header for future reference (if the string was allocated by CX).

Parameters
s  String to validate
Returns
true if the string contains only valid UTF-8, false otherwise

Example:

if (strValidUTF8(s)) {
    // Safe to process as UTF-8 text
}
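
As a rough picture of what validating UTF-8 sequences involves, the standalone sketch below checks lead bytes, continuation bytes, overlong forms, surrogates, and the U+10FFFF limit. `utf8_valid_sketch` is a hypothetical name, and the exact set of checks follows the Unicode definition of well-formed UTF-8; whether strValidUTF8() applies the same rules is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CX function). Returns true when the buffer is
 * well-formed UTF-8. An empty buffer counts as valid. */
bool utf8_valid_sketch(const uint8_t *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        uint8_t b = s[i];
        size_t n;        /* number of continuation bytes expected */
        uint32_t cp;     /* code point being assembled */
        uint32_t min;    /* smallest code point allowed at this length */

        if (b < 0x80) {
            i++;                                 /* single-byte (ASCII) */
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            n = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            n = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            n = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;                        /* stray continuation or bad lead */
        }

        if (len - i <= n)
            return false;                        /* sequence runs past the end */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;                    /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
        }
        if (cp < min)
            return false;                        /* overlong encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;                        /* out of range or surrogate */
        i += n + 1;
    }
    return true;
}
```

Caching the result, as the description mentions, avoids repeating this linear scan each time the same string is checked.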