Thread: Fake returns?
October 9th, 2008, 05:55 PM, posted to microsoft.public.word.formatting.longdocs
Klaus Linke

[...] If you look at the Word file in a text editor there's no sign of a
difference, so I figure the difference must be in the header of the Word
file that specifies that there is one type of para-break encoding, when in
fact the file contains mixed para-break encodings.


Yes, something like that is my guess too. If you could look into the binary
*.doc format (or its equivalent in memory once Word has loaded a doc),
functioning paragraph marks would likely have a pointer associated with them
that points to a data structure with the style and all the paragraph
formatting.
In the problematic cases, that pointer wasn't created. Just speculation,
though.


Along these same lines of getting under the hood of what's happening in
the Word file, I'd love to have a better understanding of how to determine
when files contain Unicode versus when they don't, whether files sometimes
*think* they contain Unicode but in fact they don't and vice versa, etc.


Interesting questions... One quick way to tell if a file has "Unicode
characters" (more precisely, characters that aren't in the old Windows code
page 1252) is to try to save it as Plain Text (*.txt), choosing the Windows
(Default) encoding in the File Conversion dialog.
If the file contains such characters, the dialog shows a yellow exclamation
mark, and the characters that can't be saved are marked red in the preview
window.
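
If you'd rather check outside of Word, the same test can be sketched in a few
lines of Python. This is only a rough illustration, not what Word does
internally: it assumes you've already got the document text into a Python
string (say, by pasting it in or saving as Unicode text first), and the
function name non_cp1252_chars is made up for the example.

```python
def non_cp1252_chars(text):
    """Return the distinct characters in `text` that Windows code page 1252
    cannot represent, sorted by code point.

    Hypothetical helper for illustration; it relies only on Python's
    built-in "cp1252" codec.
    """
    bad = set()
    for ch in text:
        try:
            # Encoding raises UnicodeEncodeError for anything outside cp1252.
            ch.encode("cp1252")
        except UnicodeEncodeError:
            bad.add(ch)
    return sorted(bad)


if __name__ == "__main__":
    sample = "na\u00efve \u03a9 test"  # i-diaeresis is in cp1252, Greek Omega is not
    for ch in non_cp1252_chars(sample):
        print(f"U+{ord(ch):04X} {ch!r} is not in code page 1252")
```

An empty result corresponds to Word showing no warning in the dialog; anything
returned corresponds to the characters Word would mark red in the preview.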

Greetings,
Klaus