Getting information on a Word document

I've seen a program that reports statistics on Word documents that are not part of the normal statistics, such as font changes (from bold to normal, etc.) as well as header and footer information. Based on the speed of this program, it appears it is not opening the documents for viewing. Is anyone aware of where I might be able to find information on doing this?

Thank you


First: What tool does this? I would be curious to see how it is accessing the document. It may just know where inside the document this information is stored and open the file to retrieve those bytes (not through Word).

Two: You may want to post in the Word VBA newsgroup, as they would have a far better understanding of the Word document format (and a possible way to extract the information without opening the document in Word) and the Word API.

--
Chris Hanscom - Microsoft MVP (VB)
Veign's Resource Center
http://www.veign.com/vrc_main.asp
Veign's Blog
http://www.veign.com/blog
--

"Suzette" <a***@hotmail.com> wrote in message news:eBNpqwbzFHA.1264@tk2msftngp13.phx.gbl...
> I've seen a program that reports statistics in Word that are not part of
> the normal statistics. Such as font changes (from bold to normal etc.) as
> well as header and footer information. Based on the speed of this
> program, it appears it is not opening the documents for viewing. Is
> anyone aware of where I might be able to find information on doing this?
>
> Thank you


> First: What tool does this as I would be curious to see how it is
> accessing the document. It may just know where inside of the document
> this information is stored and opening the file to retrieve these bytes
> (not through Word).

It's a custom program that I saw. It doesn't appear to open the document to get the info, because it does 20 documents in a directory in less than 30 seconds. I attempted to do it with a reference to Word, but I can't get the information without opening the document.

> Two: May want to post in the Word VBA newsgroup as they would have a far
> better understanding of the Word document format (and possible way to
> extract the information without opening the document in Word) and the Word
> API..

I asked over there without any luck. I didn't think of the Word API. I will look at that avenue. Thanks.


20 docs in 30 secs doesn't seem that fast to me. It sounds like, with those times, the API is being used.

Are you sure that the custom application is not reading a customized setting in the documents? Meaning, maybe the company has modified the Normal.dot template to allow for tracking of those changes and to embed the information inside each document. Can the tool read those changes in any Word document, or only in documents specific to the company?

--
Chris Hanscom - Microsoft MVP (VB)
Veign's Resource Center
http://www.veign.com/vrc_main.asp
Veign's Blog
http://www.veign.com/blog
--

"Suzette" <a***@hotmail.com> wrote in message news:ucyTKZczFHA.3152@TK2MSFTNGP10.phx.gbl...
> It's a custom program that I saw. It doesn't appear to open the document
> to get the info because it does 20 documents in a directory in less than
> 30 seconds. I attempted to do it with a reference to Word but I can't get
> the information without opening the document for the information.


The Microsoft .doc file format is proprietary and there is little published information concerning its internal formats, officially or otherwise. Some formats vary dramatically between versions as well.

However, Microsoft does "license" the binary file format documentation for .doc (and other Office products).
http://support.microsoft.com/default.aspx?scid=kb;en-us;840817

I suspect the company you are talking about did that - or reverse-engineered everything themselves - which would take some serious effort, and thus I don't think they would be too enthusiastic to 'share' it either. <g>

Note: You not only pay a non-trivial amount for the 'license', you are also required to essentially give away your first-born and more if you redistribute the information. <g>

hth
-ralph


You could use Word automation, make Word invisible, and do the work that way, or scan the binary file, but that's a huge task. You can find the Word 97 Binary Format on the web, but not Word 2000+. If you are an MSDN subscriber, perhaps you can download it. If not, you have to fax a "free" license agreement to MS and they will send it to you or give you a link(?). The article that describes how to contact them is no longer on the MS web site, but in your copy of MSDN, look for Q290958. It had one of two titles (same article):

290958 - HOW TO Obtain the Word Binary File Format (BFF) for Word Versions 2002, 2000, and 97
WD2002: How to Obtain the Word Binary File Format

Here is a site which listed its contents. It appears to be the same as the last portion of the article that Ralph posted (at the end of the article):
http://www.tech-geeks.org/list-archive/tech-geeks/12-2003/msg00979.html

Microsoft Word 97 Binary File Format
http://www.aozw65.dsl.pipex.com/generator_wword8.htm
http://www.wotsit.org

There may be a third-party solution that makes it easier to do, but I am not aware of any.


Go to http://www.mvps.org/emorcillo/en/code/vb6/index.shtml -- download the first file there, extract the two tlb files to the system32 folder, and run regtlib against each file (just as you would run regsvr32). Then download the "Reading Document Properties" file on that same page, about halfway down, extract it and run it.

--
Randy Birch
MS MVP Visual Basic
http://vbnet.mvps.org/
----------------------------------------------------------------------------
Read. Decide. Sign the petition to Microsoft.
http://classicvb.org/petition/
----------------------------------------------------------------------------
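
As a rough illustration of the "retrieve the bytes directly, not through Word" idea raised in this thread: Word 97-2003 .doc files are stored in an OLE compound file container, so a program can identify and inspect that container's header without starting Word. The sketch below is in Python rather than VB, and it is not a reconstruction of the custom program discussed above; the names parse_cfb_header and read_header and the returned dictionary keys are made up for illustration. It only reads the fixed header fields and does not decode Word's own formatting structures (font runs, headers, footers), which is the part that needs the binary format documentation.

```python
import struct

# Signature found in the first 8 bytes of every OLE compound file.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_cfb_header(data: bytes) -> dict:
    """Parse a few fixed fields from a compound file header (hypothetical helper)."""
    if len(data) < 512:
        raise ValueError("too short to hold a compound file header")
    if data[:8] != CFB_SIGNATURE:
        raise ValueError("not an OLE compound file")

    # Offsets 24..33: minor version, major version, byte order mark,
    # sector shift, mini sector shift (all 16-bit little-endian).
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24
    )
    # Offsets 44..51: number of FAT sectors, first directory sector.
    fat_sectors, first_dir_sector = struct.unpack_from("<2I", data, 44)

    return {
        "major_version": major,
        "minor_version": minor,
        "little_endian": byte_order == 0xFFFE,
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "fat_sectors": fat_sectors,
        "first_directory_sector": first_dir_sector,
    }


def read_header(path: str) -> dict:
    """Read only the first 512 bytes of a file and parse them."""
    with open(path, "rb") as f:
        return parse_cfb_header(f.read(512))
```

Calling read_header on a .doc path reads 512 bytes per file, which is one reason a tool working at this level can get through a directory of documents much faster than one that opens each document in Word.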