Microsoft to 'Open' Office File Formats

Ratcheting up its support of XML, Microsoft announced on Thursday that it will release a new Extensible Markup Language (XML) technology that replaces the existing Office file formats for Word, Excel and PowerPoint with fully documented, royalty-free formats.

The formats, called Microsoft Office Open XML, will debut in the second half of 2006 with Office 12 and constitute a dramatic upgrade to Office and a broad extension of Microsoft's XML strategy - without the direction of standards bodies.

Outlining the advantages of the change, Microsoft described the formats as compact and robust by design. Integrated ZIP compression reduces file size by up to 50 percent, and files are broken down further into a modular structure that Microsoft says will make data recovery more successful and enhance security.

The new format has a built-in facility that can automatically detect damage, segmenting files into components that can be managed and repaired independently.

Because of this design, undamaged parts of files may still be opened. Microsoft contends that this file structure improves security because potentially dangerous code or sensitive content such as metadata can be stripped out by the content's owner. Microsoft will also rely on partners to build utilities to inspect the XML.
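As a rough illustration of how a content owner might strip one component from such a modular file, the sketch below copies a ZIP container while leaving out a single part. It uses only Python's standard zipfile module. The part names (document.xml, metadata.xml) and the helper names are assumptions made for the example, not names published by Microsoft.

```python
import io
import zipfile


def strip_part(data: bytes, unwanted: str) -> bytes:
    """Copy a ZIP container, leaving out the part named `unwanted`."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != unwanted:
                # Every other part is copied across unchanged.
                dst.writestr(info.filename, src.read(info.filename))
    return out.getvalue()


def make_sample() -> bytes:
    """Build a tiny stand-in container; the part names are hypothetical."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("document.xml", "<doc><p>Hello</p></doc>")
        z.writestr("metadata.xml", "<meta><author>Jane</author></meta>")
    return buf.getvalue()


cleaned = strip_part(make_sample(), "metadata.xml")
with zipfile.ZipFile(io.BytesIO(cleaned)) as z:
    print(z.namelist())
```

Because each component is stored as a separate entry, removing one does not require parsing or rewriting the others.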

BetaNews has learned that files can be renamed with a .ZIP extension, making those components viewable inside of the ZIP containers.
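A minimal sketch of what that means in practice: any ZIP-aware tool should be able to enumerate the components. The Python example below builds a small stand-in container in memory and lists its parts with the standard zipfile module. The part names are hypothetical, since the actual layout of an Office 12 file is not described here.

```python
import io
import zipfile


def list_parts(data: bytes) -> list[str]:
    """Return the names of the parts stored in a ZIP-based document."""
    with zipfile.ZipFile(io.BytesIO(data)) as container:
        return container.namelist()


def build_sample_container() -> bytes:
    """Build a tiny stand-in container; the part names are hypothetical."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as container:
        container.writestr("document.xml", "<doc><p>Hello</p></doc>")
        container.writestr("metadata.xml", "<meta><author>Jane</author></meta>")
    return buffer.getvalue()


print(list_parts(build_sample_container()))
```

For a file on disk, passing its path to zipfile.ZipFile would work the same way, assuming the file really is a standard ZIP archive.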

Interoperability

Because the new formats are XML-based, Microsoft says the interoperability capabilities of Office rise dramatically. Applications and systems such as databases can access the content of documents and spreadsheets for queries or data entry, making those processes autonomous and virtually hands-free.
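To give a sense of what such access could look like, the sketch below pulls paragraph text out of an XML part inside a ZIP container, using only Python's standard library and no Office software. The part name document.xml and the element names doc and p are assumptions for illustration. The real Office schemas define their own element names and namespaces.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def paragraphs(data: bytes, part: str = "document.xml") -> list[str]:
    """Read one XML part from a ZIP container and return its <p> texts."""
    with zipfile.ZipFile(io.BytesIO(data)) as container:
        root = ET.fromstring(container.read(part))
    # Hypothetical schema: each paragraph is a plain <p> element.
    return [p.text or "" for p in root.iter("p")]


def make_report() -> bytes:
    """Build a stand-in document; the layout is hypothetical."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("document.xml", "<doc><p>Q1 revenue</p><p>Q2 revenue</p></doc>")
    return buf.getvalue()


print(paragraphs(make_report()))
```

A database loader or reporting job could run a function like this over a folder of documents without launching Office at all.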

This capability is part of what Bill Gates referred to as "information solutions and IT fundamentals" -- or the promised benefits of open XML standards and rapid deployment tools that extend Office into business information systems -- in a speech given two weeks ago at the Microsoft CEO Summit.

Promising true interoperability, Microsoft says that the Office Open XML formats will work the same way with non-Microsoft software and systems such as Oracle SQL Plus or MySQL. The Office schemas, which define the structure, layout and rules of documents, are built atop the Office 2003 schemas. The new versions are, according to Microsoft, "more complete and better" than their Office 2003 counterparts. Schemas will be open, fully documented and carry a "perpetual" royalty-free license.

Is it Truly Open?

When faced with the decision of adopting open standards or remaining proprietary, Microsoft opted to preserve the "full fidelity" of past Office functionality and instead developed its own "open" XML-based format. In saying that the format is "open," Microsoft acquiesces to the state of Massachusetts. Under the state's "Open Standards" policy, a format is open if it is fully documented and anyone can use it.

For comparison's sake, it should be noted that Adobe's PDF format is also designated as being "open" under the state's guidelines.

"We have legacy here," Jean Paoli, Senior Microsoft XML Architect, told BetaNews. "It is our responsibility to our users to provide a full fidelity format. We didn't see any alternative; believe me we thought about it. Without backward compatibility we would have other problems."

"Yes this is proprietary and not defined by a standards body, but it can be used by and interoperable with others. They don't need Microsoft software to read and write. It is not an open standard but an open format," Paoli explained.

When asked why Microsoft did not use the OASIS (Organization for the Advancement of Structured Information Standards) OpenOffice.org XML file format, Paoli answered, "Sun standardized their own. We could have used a format from others and shoehorned in functionality, but our design needs to be different because we have 400 million legacy users. Moving 400 million users to XML is a complex problem."

"This is a case of reality versus standards - this is reality. We can't do (support) everything. Where does it stop?" Paoli is one of the authors of the original XML specification. The schemas flag all of the features in the current corresponding Office file formats.

The Microsoft Office Open XML Formats are backward compatible with versions as far back as Office 2000. Supported versions of Office will receive updates that let them read the new formats. Gartner estimates that only 1.6 percent of customers in the United States will be using versions of Office that predate Office 2000 by the end of 2005, with even fewer legacy users in Europe.

Customers that do not wish to adopt the format may set Office 12 to save in the legacy formats by default. Existing documents may be converted to the Office 12 formats with a bulk converter.

Likewise, customers that use alternative productivity suites including OpenOffice will require converters to translate OASIS standard document formats into the Office Open XML formats. Microsoft's Paoli said that it was too early to say whether or not Office will support the standards.

A New Ecosystem

Paoli predicts that the formats will generate an ecosystem around Office and create opportunities for ISVs and developers. "We jumpstarted the XML movement in the industry," said Paoli.

Commenting on the formats, Jupiter Research senior analyst Joe Wilcox told BetaNews, "If you look at Office 2003 and XML, Microsoft tried to solve the problem of company information being locked on a legacy datastore like a mainframe. And XML was the secret sauce for getting that done."

"However, Microsoft failed to address another problem, and that is company information stored in its own proprietary file formats. So with the XML format, Microsoft has provided a means for opening up data locked in its own formats."

More information will be available at the Office 12 preview Web site beginning June 6, which is the start of Tech Ed.

© 1998-2014 BetaNews, Inc. All Rights Reserved.