g_unichar_decompose - のねのBlog

g_unichar_decompose()

gboolean
g_unichar_decompose (gunichar ch,
                     gunichar *a,
                     gunichar *b);
Performs a single decomposition step of the Unicode canonical decomposition algorithm.
This function does not include compatibility decompositions. It does, however, include algorithmic Hangul Jamo decomposition, as well as 'singleton' decompositions which replace a character by a single other character. In the case of singletons *b will be set to zero.
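The algorithmic Hangul step mentioned above can be sketched in a few lines. This is a minimal illustration based on the Hangul syllable decomposition arithmetic from the Unicode Standard, not GLib's internal decompose_hangul_step(). The name hangul_step and its choice to leave the outputs untouched on failure are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable arithmetic. */
#define S_BASE  0xAC00u              /* first precomposed syllable */
#define L_BASE  0x1100u              /* first leading consonant    */
#define V_BASE  0x1161u              /* first vowel                */
#define T_BASE  0x11A7u              /* one before first trailing  */
#define V_COUNT 21u
#define T_COUNT 28u
#define N_COUNT (V_COUNT * T_COUNT)  /* 588   */
#define S_COUNT (19u * N_COUNT)      /* 11172 */

/* Hypothetical helper: performs ONE decomposition step on a Hangul
 * syllable. Returns 1 and fills *a and *b on success; returns 0 and
 * leaves them untouched if s is not a precomposed syllable. */
int hangul_step(uint32_t s, uint32_t *a, uint32_t *b)
{
    assert(a != NULL && b != NULL);

    if (s < S_BASE || s >= S_BASE + S_COUNT)
        return 0;

    uint32_t s_index = s - S_BASE;
    uint32_t t_index = s_index % T_COUNT;

    if (t_index != 0) {
        /* LVT syllable: split into LV syllable + trailing consonant */
        *a = s - t_index;
        *b = T_BASE + t_index;
    } else {
        /* LV syllable: split into leading consonant + vowel */
        *a = L_BASE + s_index / N_COUNT;
        *b = V_BASE + (s_index % N_COUNT) / T_COUNT;
    }
    return 1;
}
```

For example, U+D55C first splits into U+D558 and U+11AB, and a second step on U+D558 yields U+1112 and U+1161. Here too, only the first component can decompose further.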
If ch is not decomposable, *a is set to ch and *b is set to zero.
Note that, because of the way Unicode decomposition pairs are defined, b is guaranteed not to decompose further, but a may itself decompose. To get the full canonical decomposition of ch, call this function recursively on a, or use g_unichar_fully_decompose().
See UAX #15 (Unicode Normalization Forms) for details.
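To illustrate how repeated single steps add up to a full decomposition, here is a small self-contained sketch. It does not call GLib: toy_decompose is a hypothetical stand-in that follows the same contract as g_unichar_decompose() but knows only four hand-picked characters, and full_decompose is likewise a name invented for this example. In real code you would call g_unichar_decompose() in its place, or simply use g_unichar_fully_decompose().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for g_unichar_decompose(), backed by a tiny
 * hand-picked table instead of the full Unicode data. */
typedef struct { uint32_t ch, a, b; } toy_step;

static const toy_step toy_table[] = {
    { 0x00C5, 0x0041, 0x030A },  /* A with ring above               */
    { 0x00DC, 0x0055, 0x0308 },  /* U with diaeresis                */
    { 0x01D5, 0x00DC, 0x0304 },  /* U with diaeresis and macron     */
    { 0x212B, 0x00C5, 0x0000 },  /* Angstrom sign (singleton, b==0) */
};

static int toy_decompose(uint32_t ch, uint32_t *a, uint32_t *b)
{
    for (size_t i = 0; i < sizeof toy_table / sizeof toy_table[0]; i++) {
        if (toy_table[i].ch == ch) {
            *a = toy_table[i].a;
            *b = toy_table[i].b;
            return 1;
        }
    }
    *a = ch;  /* not decomposable: same contract as the real function */
    *b = 0;
    return 0;
}

/* Writes the full decomposition of ch into out. Returns the number of
 * code points written, or 0 if a buffer is too small. */
size_t full_decompose(uint32_t ch, uint32_t *out, size_t out_len)
{
    uint32_t tail[8];  /* second components, collected outermost first */
    size_t n_tail = 0;
    uint32_t a, b;

    /* Keep stepping on the first component; b never needs another step. */
    while (toy_decompose(ch, &a, &b)) {
        if (b != 0) {
            if (n_tail == sizeof tail / sizeof tail[0])
                return 0;
            tail[n_tail++] = b;
        }
        ch = a;
    }

    if (out_len < n_tail + 1)
        return 0;

    /* Innermost base character first, then the tails in reverse order. */
    size_t n = 0;
    out[n++] = ch;
    while (n_tail > 0)
        out[n++] = tail[--n_tail];
    return n;
}
```

With this toy table, U+01D5 takes two steps and ends up as U+0055 U+0308 U+0304, while the singleton U+212B becomes U+0041 U+030A. Because only a can decompose again, a simple loop is enough and no real recursion is needed.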
Parameters
ch: a Unicode character
a: return location for the first component of ch
b: return location for the second component of ch

Returns
TRUE if the character could be decomposed
Since: 2.30

gboolean
g_unichar_decompose (gunichar  ch,
                     gunichar *a,
                     gunichar *b)
{
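  /* Binary search bounds over decomp_step_table: [start, end) */
  gint start = 0;
  gint end = G_N_ELEMENTS (decomp_step_table);

  /* Hangul syllables are decomposed algorithmically, not via the table */
  if (decompose_hangul_step (ch, a, b))
    return TRUE;

  /* TODO use bsearch() */
  /* Only search when ch lies within the range covered by the table
   * (the search relies on the table being sorted by ch) */
  if (ch >= decomp_step_table[start].ch &&
      ch <= decomp_step_table[end - 1].ch)
    {
      while (TRUE)
        {
          gint half = (start + end) / 2;
          const decomposition_step *p = &(decomp_step_table[half]);
          if (ch == p->ch)
            {
              /* Found: copy out the decomposition pair */
              *a = p->a;
              *b = p->b;
              return TRUE;
            }
          else if (half == start)
            break; /* interval cannot shrink further: ch is not in the table */
          else if (ch > p->ch)
            start = half;
          else
            end = half;
        }
    }

  /* Not decomposable: return ch itself with no second component */
  *a = ch;
  *b = 0;

  return FALSE;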
}
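
The TODO in the source suggests replacing the hand-written binary search with bsearch(). The following is a rough sketch of what that could look like, not GLib's actual code. The step_entry type, sample_table, compare_step and lookup_step are all hypothetical names made up for this example, the table holds only four entries, and the Hangul check is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry type mirroring the ch/a/b fields used above. */
typedef struct { uint32_t ch, a, b; } step_entry;

/* Tiny sample table; bsearch() requires it to be sorted by ch. */
static const step_entry sample_table[] = {
    { 0x00C5, 0x0041, 0x030A },
    { 0x00DC, 0x0055, 0x0308 },
    { 0x01D5, 0x00DC, 0x0304 },
    { 0x212B, 0x00C5, 0x0000 },
};

/* Comparison callback: key points to the code point being looked up,
 * elem points to a table entry. */
static int compare_step(const void *key, const void *elem)
{
    uint32_t ch = *(const uint32_t *) key;
    const step_entry *e = elem;

    if (ch < e->ch)
        return -1;
    if (ch > e->ch)
        return 1;
    return 0;
}

/* Same contract as the table lookup in g_unichar_decompose():
 * returns 1 with the pair on a hit, otherwise sets *a = ch, *b = 0. */
int lookup_step(uint32_t ch, uint32_t *a, uint32_t *b)
{
    const step_entry *hit = bsearch(&ch, sample_table,
                                    sizeof sample_table / sizeof sample_table[0],
                                    sizeof sample_table[0],
                                    compare_step);
    if (hit != NULL) {
        *a = hit->a;
        *b = hit->b;
        return 1;
    }

    *a = ch;
    *b = 0;
    return 0;
}
```

The explicit range check and the half == start exit condition both disappear, since bsearch() already handles keys outside the table. The price is one indirect call to the comparison function per probe.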