How do you deal with the “special” characters that MS Word adds?

Asked May 6, 2009
7 upvotes
Updated April 14, 2026

Direct Answer

With regard to clients posting copy/pasted text from Word in textareas: The most reliable way to ensure that the client sends you text in any particular encoding (thus hopefully…). This is a short HTML snippet filed in the Word VBA archive, ranked #22 of 32 by community upvote score, from 2009.


The Problem (Q-score 5, ranked #22 of 32 in the Word VBA archive)

The scenario as originally posted in 2009

I’m wondering how you clean the special characters that MS Word adds, such as em and en dashes and curly quotes?

I often find myself copying clients’ content from Word and pasting it into a static HTML page, but the content ends up with weird characters because the special characters are not converted to their correct ASCII codes and therefore show up as garbled text. (For these basic websites, I’m using Dreamweaver.)

I have seen a lot of similar problems when clients copy content from Word into text-only fields (mostly textareas). When I put this into a PDF (through PHP), or when it shows up on the page, it has garbled text too.

How do you deal with this? Is there a cleaning service or program you use?

Why community consensus is tight on this one

Across 32 Word VBA entries in the archive, the accepted answer here scores above the median, which suggests that voters broadly agree on the fix.


The Verified Solution — solid answer (above median) (+7)

HTML form pattern (copy-ready)

With regard to clients posting copy/pasted text from Word in textareas:

The most reliable way to ensure that the client sends you text in a particular encoding (thus hopefully doing any conversion from CP-1252 [or whatever Word uses] for you) is to add the accept-charset="..." attribute to all your <form>s. E.g.:

<form ... accept-charset="UTF-8">
   ...
</form>

Most browsers will obey that and make sure any “Word-specific” characters are converted to the appropriate character set before the text reaches your website.
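To make that conversion concrete, here is a minimal Python sketch (Python is used purely for illustration; the sample string and variable names are assumptions, not part of the original answer). It shows the same Word-style characters as CP-1252 bytes and as UTF-8 bytes, and how reading UTF-8 bytes with the wrong decoder produces the familiar garbled text.

```python
# Illustrative sample: a curly apostrophe (U+2019) and an en dash (U+2013),
# the kind of characters Word substitutes automatically.
sample = "it\u2019s 5\u20136 pages"

# The same text in the Windows code page and in UTF-8.
as_cp1252 = sample.encode("cp1252")  # b"it\x92s 5\x966 pages"
as_utf8 = sample.encode("utf-8")     # b"it\xe2\x80\x99s 5\xe2\x80\x936 pages"

# Decoding UTF-8 bytes as CP-1252 turns each special character
# into a three-character sequence starting with "â€".
garbled = as_utf8.decode("cp1252")

# Decoding with the matching codec restores the original text.
restored = as_utf8.decode("utf-8")
```

The point of the sketch is only that the bytes differ between encodings, so whatever receives the form data has to decode it with the same encoding the browser used to send it.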

Once invalid text gets to your website, there’s very little you can do to fix it reliably, so it’s best to simply check all input for being valid in whatever character set you use, and discard any requests that have invalid text. This is necessary even with accept-charset, because undoubtedly there are some clients out there that will ignore it.
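The validity check described above could look roughly like the following Python sketch, assuming the site has standardized on UTF-8. The function name decode_utf8_or_none is hypothetical, and the original answer does not prescribe a language or implementation.

```python
from typing import Optional


def decode_utf8_or_none(raw: bytes) -> Optional[str]:
    """Return the decoded text if raw is well-formed UTF-8, otherwise None."""
    try:
        # bytes.decode is strict by default and raises on any invalid sequence.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# Well-formed UTF-8, including curly quotes, passes through unchanged.
good = decode_utf8_or_none("caf\u00e9 \u201cquoted\u201d".encode("utf-8"))

# CP-1252 curly quotes (bytes 0x93 and 0x94) are not valid UTF-8, so this is rejected.
bad = decode_utf8_or_none(b"\x93quoted\x94")
```

In a request handler, a None result would be the signal to discard the request, for example by responding with an HTTP 400 status, rather than trying to guess the client's encoding.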


When to Use It — vintage (14+ years old, pre-2013)

Ranked #22 in its category — specialized fit

This pattern sits in the 63% tail relative to the top answer. Reach for it when your scenario closely matches the question title; otherwise browse the Word VBA archive for a higher-consensus alternative.

What changed between 2009 and 2026

The answer is 17 years old. The snippet is plain HTML rather than VBA, so nothing here needs compiling, and accept-charset remains a standard attribute of the <form> element. The VBA-side changes over that period matter only if you also run Word macros: 64-bit API declarations (use PtrSafe), blocked macros in downloaded files (Mark-of-the-Web), and the shift toward Office Scripts for web-first workflows.

Frequently Asked Questions

Is this above-median answer still worth copying?

The answer scores +7 against a Word VBA archive median of about 4, so it sits above the median. Together with the question's own score of +5, that suggests the asker and other voters validated the approach.

Does the 4-line snippet run as-is in Office 2026?

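Not inside Office: the snippet is HTML, so it belongs in the web page that hosts the form rather than in a VBA project. If you pair it with Word macros on Office 365, Office 2024, or Office LTSC 2026, verify two things: (a) references under Tools → References match those in the code, and (b) any Declare statements use PtrSafe on 64-bit Office.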

This answer is 17 years old. Is it still relevant in 2026?

Published in 2009, 17 years before today’s Office 2026 build. The Word VBA object model has had no breaking changes in that window. Three things to re-test: (1) blocked macros on downloaded files (Mark-of-the-Web), (2) 64-bit API declarations (PtrSafe, LongPtr), (3) any shift toward Office Scripts for web scenarios.

Which Word VBA pattern ranks just above this one at #21?

The pattern one rank above is “Developing MS Word add-in”. If your use case overlaps, compare both before committing.

Data source: Community-verified Q&A snapshot. Q-score 5, Answer-score 7, original post 2009, ranked #22 of 32 in the Word VBA archive. Last regenerated April 14, 2026.