Question posted 2015 · +7 upvotes
Below is a simple, W3C-validated page that prints “Hello World”:
<!DOCTYPE html>
<html>
<head>
<meta charset = "utf-8">
<title>Hello</title>
</head>
Hello World
</html>
But when I do the same thing with MS Word, the generated code is 449 lines long. Why do all these extra lines appear in the code?
Accepted answer +12 upvotes
Word declares its own XML namespaces:
<html xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns:m="http://schemas.microsoft.com/office/2004/12/omml"
xmlns="http://www.w3.org/TR/REC-html40">
Word keeps the document's metadata:
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Author>xxxxxx</o:Author>
<o:LastAuthor>xxxxx</o:LastAuthor>
<o:Revision>2</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Created>2015-05-25T11:40:00Z</o:Created>
<o:LastSaved>2015-05-25T11:40:00Z</o:LastSaved>
<o:Pages>1</o:Pages>
<o:Words>1</o:Words>
<o:Characters>11</o:Characters>
<o:Company>Sopra Group</o:Company>
<o:Lines>1</o:Lines>
<o:Paragraphs>1</o:Paragraphs>
<o:CharactersWithSpaces>11</o:CharactersWithSpaces>
<o:Version>12.00</o:Version>
</o:DocumentProperties>
</xml><![endif]-->
Word adds CSS styles:
<style>
<!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;
mso-font-charset:0;
mso-generic-font-family:roman;
mso-font-pitch:variable;
mso-font-signature:-536870145 1107305727 0 0 415 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;
mso-font-charset:0;
mso-generic-font-family:swiss;
mso-font-pitch:variable;
mso-font-signature:-536870145 1073786111 1 0 415 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-parent:"";
margin-top:0cm;
margin-right:0cm;
margin-bottom:10.0pt;
margin-left:0cm;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;
mso-fareast-language:EN-US;}
.MsoChpDefault
{mso-style-type:export-only;
mso-default-props:yes; ......
Word then applies the CSS style:
<p class=MsoNormal>Hello World</p>
Keep this information if you need to open and modify the document in Word later. If you only need a simple export, you can delete all of this metadata.
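If you do want to delete it, here is a minimal sketch of that cleanup in Python. It is not an official tool: the function name `strip_word_markup` is hypothetical, and the regular expressions only target the four pieces shown above (conditional-comment metadata, `<style>` blocks, `xmlns` declarations, and `class=Mso...` attributes). Word's real output may contain other Office-specific markup that this does not touch, so treat it as a starting point.

```python
import re


def strip_word_markup(html: str) -> str:
    """Remove the Word-specific pieces described above (hypothetical helper)."""
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    # wrap the <o:DocumentProperties> metadata.
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # <style> blocks hold the @font-face rules and the Mso* style definitions.
    html = re.sub(r"<style\b[^>]*>.*?</style>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)
    # xmlns="..." and xmlns:prefix="..." declarations on the <html> tag.
    html = re.sub(r'\s+xmlns(?::\w+)?\s*=\s*"[^"]*"', "", html)
    # class attributes that refer to Mso* styles, quoted or unquoted.
    html = re.sub(r'\s+class\s*=\s*"?Mso\w+"?', "", html)
    return html


if __name__ == "__main__":
    print(strip_word_markup("<p class=MsoNormal>Hello World</p>"))
    # prints: <p>Hello World</p>
```

Regular expressions are used here only because the patterns are so regular. For anything beyond a one-off export, a real HTML parser would be a safer choice.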