
Unicode characters and diacritics

The primary role of Excel is analysis and visualization of data, which puts less emphasis on the use of special text characters.  Nonetheless, there will always be some need for special characters in Excel, both Unicode characters and diacritics.

A diacritic in English is a glyph that modifies the sound of the character to which it is attached.  Examples are naïve, résumé, and saké.  In other fields, glyphs modify a letter to convey a specific meaning.  Examples include:

·        In Statistics the sample mean is denoted by x-bar (x̄) and the sample proportion by p-hat (p̂).  Examples of Unicode characters are the population mean (the lowercase Greek letter mu, μ) and the population standard deviation (the lower case Greek letter sigma, σ).

·        In Economics, profit is denoted by the Greek letter pi (π).

·        In Mathematics, well, in Mathematics, there is a plethora of symbols, including the capital sigma (Σ) for sum and ∫ for an integral.

·        Currencies are denoted by symbols such as the US Dollar ($), the Euro (€), the Japanese Yen (¥), the Chinese Yuan (), and the Indian Rupee (₹).
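The difference between a diacritic attached to a letter and a standalone Unicode character can be made concrete by listing the code points involved.  The following is a minimal Python sketch, not part of the original note; the helper name code_points is hypothetical, and the sketch assumes x-bar and p-hat are built from a base letter plus a combining mark (U+0304 and U+0302).

```python
import unicodedata

def code_points(text):
    """Return (hex code, Unicode name) for every code point in text."""
    return [(f"{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# x-bar and p-hat: a base letter followed by a combining mark (two code points).
x_bar = "x\u0304"   # x + COMBINING MACRON
p_hat = "p\u0302"   # p + COMBINING CIRCUMFLEX ACCENT

# The i-with-diaeresis in "naive" also exists as one precomposed code point.
i_diaeresis = "\u00ef"

samples = [
    ("x-bar", x_bar),
    ("p-hat", p_hat),
    ("i-diaeresis", i_diaeresis),
    ("mu", "\u03bc"),
    ("integral", "\u222b"),
]
for label, text in samples:
    print(label, code_points(text))

# NFD splits a precomposed letter into base + combining mark.  NFC does the
# reverse when a precomposed form exists; there is none for x-bar, so it
# stays as two code points.
decomposed = unicodedata.normalize("NFD", i_diaeresis)
recomposed = unicodedata.normalize("NFC", x_bar)
print(len(decomposed), len(recomposed))
```

The practical consequence is that x-bar behaves as two characters in a cell, while a Greek letter such as mu is a single character.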

Figure 1 is a very small sample of Unicode characters and sample diacritics.

Figure 1

This note shows different ways to enter Unicode characters and diacritical marks.

Insert special character with MS Word or MS PowerPoint

Excel Insert symbol

Windows Character Map

Excel Insert equation

References

 

Insert special character with MS Word or MS PowerPoint

It turns out that Word, and PowerPoint even more so, supports an easy way to enter a Unicode character or diacritic.  Sadly, this technique does not work in Excel.

Word supports two ways to enter special characters:

1)      Type in the 4-character hex code and press ALT+x.  This is very convenient since Word will convert the code into the corresponding Unicode character or diacritical glyph.  Unfortunately, this method is not without its problems.

The first problem is that there is one exception where Word does nothing, or at least one exception that I found.  If the code is immediately preceded by the character x, Word will not do anything!  I don’t know why.  So, for example, y0304 ALT+x results in ȳ but x0304 ALT+x does nothing.  The same applies to Unicode characters.

 

The other problem is that if there is a previous character, sometimes, but not always, Word ‘swallows’ that character and creates something totally unexpected.  For example, enter 222B ALT+x and the result is the integral sign, ∫, as expected.  But a222B ALT+x results in 򢈫 whereas y222b ALT+x gives the expected result of y∫.  A likely explanation is that a is itself a valid hex digit, so Word reads a222B as a single five-digit code, while y is not a hex digit and is left alone.  And, of course, if the sequence is x222b ALT+x then Word will leave everything untouched.

 

2)      The second method requires the numerical keypad.  Hold the ALT key and enter the 4- or 5-digit decimal value.  This has worked consistently in my tests.  For example, enter x then ALT+0772 on the numerical keypad and the result will be x̄.  The only downside, of course, is the requirement for a numerical keypad.

Once this character is in the Word document, simply copy and paste it into Excel.
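The two Word methods use the same code in different bases: the ALT+x method takes hexadecimal and the keypad method takes decimal.  The Python sketch below is only an illustration of that relationship; the function names are hypothetical, and the final lines reflect my assumption about why a leading a is absorbed into the code while a leading y is not.

```python
def hex_to_keypad(hex_code):
    """Convert an ALT+x hex code (e.g. '0304') to the decimal keypad value."""
    return int(hex_code, 16)

def keypad_to_hex(decimal_value):
    """Convert a keypad decimal value (e.g. 8984) to a four-digit hex code."""
    return f"{decimal_value:04X}"

def is_hex_digit(ch):
    """True if ch could be read as part of a hexadecimal code."""
    return ch in "0123456789abcdefABCDEF"

print(hex_to_keypad("0304"))   # 772, the combining macron used for x-bar
print(hex_to_keypad("222B"))   # 8747, the integral sign
print(keypad_to_hex(8984))     # 2318

# The same code point appended to a base letter gives x-bar.
x_bar = "x" + chr(hex_to_keypad("0304"))

# Assumption: a preceding 'a' is a hex digit, so 'a222B' can be read as one
# five-digit code, whereas 'y' is not a hex digit and cannot be.
print(is_hex_digit("a"), is_hex_digit("y"))
print(f"{int('a222B', 16):X}")  # A222B
```

Converting in either direction makes it easy to use whichever method the keyboard at hand supports.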

It is worth noting that when necessary Word will change the font to something that supports the required character.  For example, while typing using Calibri, if one enters ALT+8984 on the numerical keypad the result will be the Cloverleaf symbol and the font will change to Cambria Math.

This is where PowerPoint comes in. While it supports only the ALT+decimal digit method for entering special characters, it, somehow, does not require a change in the font.  So, one can get the Peace symbol, for example, while retaining the Calibri font.

 

Insert symbol

The next way to insert a Unicode character or a diacritical mark is to insert a symbol within Excel itself (Insert tab | Symbols group | Symbol button).  This will bring up the Symbol dialog box (Figure 5).

Figure 5

Select the character of interest, optionally after first narrowing the list with the Subset dropdown (top-right).  In addition, the dialog box allows one to directly specify the hex code for a character (use the Character code field).
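If a character is already available somewhere (a web page, say) but its hex code is not known, the code can be looked up programmatically.  This is a small Python sketch under the assumption that the Character code field is set to interpret its input as a Unicode hex value; character_code is a hypothetical helper name.

```python
import unicodedata

def character_code(ch):
    """Return the hex code of a single character, zero-padded to four digits."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return f"{ord(ch):04X}"

# mu, sigma, Euro, Yen, and the integral sign from the examples in this note.
for ch in ["\u03bc", "\u03c3", "\u20ac", "\u00a5", "\u222b"]:
    print(character_code(ch), unicodedata.name(ch))
```

A string such as x-bar consists of two characters, so the helper deliberately rejects it; look up the letter and the diacritic separately.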

To insert a diacritic, type the character then select the desired diacritic as shown in Figure 6.

Figure 6

While this looks very promising, there’s one problem with it.  It’s hierarchical in the sense that the results depend on selecting the correct font.  For example, with the default font of Calibri, some characters from Figure 1 like the Yin-yang, Peace, and Cloverleaf symbols, as well as the currency symbols for the Indian Rupee and the Chinese Yuan, are not available.  To get the first three symbols, select the font MS UI Gothic.  To get the Indian Rupee symbol, select the Arial font.  And, I still don’t know where to find the Chinese Yuan symbol!  This, of course, means that unless one already knows where the symbol resides, there will be some amount of trial and error to locate it.

Windows Character Map

This is similar to the Insert Symbol dialog box in Excel except that it is a Windows utility.  To run it, use Windows start button | All Programs > Accessories > System Tools > Character Map.  It has a similar, though not identical, look and feel to the Symbol dialog box and, in my limited testing, seems to have the same strengths and weaknesses, primarily its hierarchical nature.

 

Insert equation

Excel 2010 supports Microsoft’s own equation editor, while earlier versions such as Excel 2007 and Excel 2003 support Microsoft Equation 3.0.  Either method inserts a mathematical equation as an object in the worksheet.  This object includes a rich set of controls to create fairly sophisticated equations.  The features include Greek letters and accents.

One problem with this approach is that the result is an object in the worksheet and not part of the text in a cell.  It may also be overkill if all one wants to do is use a diacritic or insert one Unicode character.

In Excel 2010, insert an equation with the Insert tab | Symbols group | Equation button (or Equation dropdown) as in Figure 2.

Figure 2

After inserting an equation, click inside the equation box and select the Equation Tools contextual tab.  This shows a large number of controls for use with an equation (Figure 3).

Figure 3

Insert symbols from the Symbols group (more are available by clicking the More down-arrow).  Diacritics are available through the Structures group | Accent dropdown (Figure 4).

 

Figure 4

 

References

There are several excellent references on the Web.  A few that I found helpful, in no particular order:

http://unicode.org/charts/

http://tlt.its.psu.edu/suggestions/international/bylanguage/ipavowels.html

http://www.alanwood.net/demos/symbol.html
