Question posted 2008 · +8 upvotes
I’d like to take some RTF input and strip all RTF formatting except \ul, \b and \i, so that I can paste it into Word with only minimal formatting information.
The command used to paste into Word will be something like: oWord.ActiveDocument.ActiveWindow.Selection.PasteAndFormat(0) (with some RTF text already in the Clipboard)
{\rtf1\ansi\deff0{\fonttbl{\f0\fnil\fcharset0 Courier New;}}
{\colortbl ;\red255\green255\blue140;}
\viewkind4\uc1\pard\highlight1\lang3084\f0\fs18 The company is a global leader in responsible tourism and was \ul the first major hotel chain in North America\ulnone to embrace environmental stewardship within its daily operations\highlight0\par
Do you have any idea on how I can clean up the RTF safely with some regular expressions or something? I am using VB.NET to do the processing but any .NET language sample will do.
Accepted answer +6 upvotes
I would use a hidden RichTextBox: set its Rtf property, then read its Text property to strip the RTF in a well-supported way. Then I would manually re-apply the desired formatting afterwards.
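A minimal sketch of that idea in VB.NET, assuming a Windows Forms reference is available. The module name RtfCleaner and the function name KeepBasicStyles are hypothetical, not from the original answer. It loads the RTF into one RichTextBox, copies the plain text into a second one, and re-applies only bold, italic and underline character by character. Anything else (highlight, colours, font sizes) is dropped because the second box starts from plain text.

```vb
Imports System.Drawing
Imports System.Windows.Forms

Module RtfCleaner
    ' Hypothetical helper: returns RTF that keeps only bold/italic/underline.
    Function KeepBasicStyles(ByVal rtf As String) As String
        Using source As New RichTextBox(), target As New RichTextBox()
            ' Let the control parse the RTF; invalid RTF raises ArgumentException.
            source.Rtf = rtf

            ' Start the output from plain text, so all formatting is gone.
            target.Text = source.Text

            Dim keep As FontStyle = FontStyle.Bold Or FontStyle.Italic Or FontStyle.Underline

            ' Walk the text one character at a time and copy the wanted styles.
            For i As Integer = 0 To source.TextLength - 1
                source.Select(i, 1)
                Dim f As Font = source.SelectionFont
                If f Is Nothing Then Continue For

                Dim style As FontStyle = f.Style And keep
                If style <> FontStyle.Regular Then
                    target.Select(i, 1)
                    target.SelectionFont = New Font(target.Font, style)
                End If
            Next

            Return target.Rtf
        End Using
    End Function
End Module
```

The result could then be put back on the Clipboard with Clipboard.SetText(cleanRtf, TextDataFormat.Rtf) before calling PasteAndFormat. The character-by-character loop is simple but slow on long documents, so treat this as a starting point rather than a tuned implementation.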