Doc1 Part 2: Binary Chunks.

  • Altered documents contain clues to previous versions – Binary chunks.
  • Also contain misinformation
  • They increase the file size from the original 696 KB to 6.9 MB
  • “Confidential” watermark: from GDIC?
  • MSODatastore contains a Guccifer Clue.

The source of Guccifer’s 1.doc can be viewed in a browser:


We can see that it’s mostly a text format. A real Word .doc is a binary format and would look something like this in a browser.


Contained within the text of 1.doc are several hex-encoded strings that contain the items mentioned above. Brief instructions for extracting them yourself are at the bottom of this post, and I’ve uploaded Base64 versions of some of them to Dropbox.

“Russian Language” Theme

The Russian language in the documents has long been disputed; see for example Adam Carter’s excellent blog and the links therein. In the source we can see the section where the Russian style comes from:

\par }{\rtlch\fcs1 \af0 \ltrch\fcs0 \insrsid12588815\charrsid11758497 
\par }{\*\themedata 504b030414000600080000002100828abc13fa0000001c020000130000005b436f6e74656e745f54797065735d2e786d6cac91cb6ac3301045f785fe83d0b6d8

This is the “Russian” language theme, and it decodes as a zipped XML docx theme, as shown in the tree below. It really is a Russian theme. The only problem is: it’s damn easy to fake, as I’ll show in another post.

├── [Content_Types].xml
├── _rels
│   └── rels
├── theme
│   └── theme
│       ├── _rels
│       │   └── themeManager.xml.rels
│       ├── theme1.xml
│       └── themeManager.xml

Edit: theme1.xml in G2.0’s doc seems to be exactly the same as this one and is mentioned in these MSDocs about XML SDK.
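
As a sanity check on that decode, the first bytes of the \themedata excerpt quoted above can be read as a ZIP local file header (504b0304 is the “PK” signature). This is only a minimal Python sketch, assuming the standard 30-byte local-header layout; the function name parse_local_file_header is illustrative and not from any library.

```python
import struct

# Hex copied from the \*\themedata excerpt quoted above (only its first bytes).
THEMEDATA_PREFIX = (
    "504b030414000600080000002100828abc13fa0000001c020000130000005b436f6e74656e745f54797065735d2e786d6cac91cb6ac3301045f785fe83d0b6d8"
)


def parse_local_file_header(data):
    """Parse a ZIP local file header: 30 fixed bytes followed by the file name."""
    # Little-endian: signature, version, flags, method, mod time, mod date,
    # CRC-32, compressed size, uncompressed size, name length, extra length.
    (signature, _version, _flags, method, _mtime, _mdate,
     crc32, compressed, uncompressed, name_len, _extra_len) = struct.unpack(
        "<4sHHHHHIIIHH", data[:30]
    )
    name = data[30:30 + name_len].decode("ascii")
    return {
        "signature": signature,
        "method": method,  # 8 means deflate
        "crc32": crc32,
        "compressed_size": compressed,
        "uncompressed_size": uncompressed,
        "name": name,
    }


header = parse_local_file_header(bytes.fromhex(THEMEDATA_PREFIX))
print(header)
```

On the excerpt above, the name field comes out as [Content_Types].xml, which is the first entry in the tree.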

The ColorScheme

Not the most thrilling thing in the world, but included for completeness.



BLIP1 – WMF with Confidential .tga file

Link to Base 64 Encoded version

WordPress doesn’t allow .wmf or .tga images “for security reasons” so this is a PNG copy of it.

Strapping on the veterinary rubber gloves and going elbow deep into the WMF that “wraps” the .tga file, we find some interesting stuff:

We’ve got GDIC mentioned at least twice (the only possibly relevant thing I can find for GDIC is General Defence Intelligence Committee), System, IBM, DOS, and what could be a heavily obfuscated file path of some sort. I’ve tried finding Win32FileTimes without success. Unfortunately WMF/EMF is an old format and isn’t very well documented.
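
One hedged way to reproduce this kind of trawl is to pull runs of printable ASCII out of the decoded blob, much like the Unix strings tool, and count the terms of interest. A minimal Python sketch follows; the function names and the sample bytes are made up for illustration and are not taken from the actual WMF. Text stored in a 16-bit encoding would not show up with this approach.

```python
import re


def printable_runs(blob, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long, as str."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode("ascii") + rb",}")
    return [m.decode("ascii") for m in pattern.findall(blob)]


def count_term(blob, term):
    """Count non-overlapping occurrences of an ASCII term in the raw bytes."""
    return blob.count(term.encode("ascii"))


# Made-up sample bytes standing in for the decoded WMF; not real data.
sample = b"\x00\x01GDIC\x00\x02System\xffIBM\x00\x10GDIC\x00ab"
print(printable_runs(sample, min_len=3))
print(count_term(sample, "GDIC"))
```

For a real file, the blob would be the bytes produced by the hex-to-binary step described at the bottom of this post.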

This watermark has been added: it isn’t in the original docx.

  • We presume by G2.0.
  • The question is: why? Does he want us to find GDIC?
  • Is he implying he’s also hacked GDIC?
  • Oh Lord, is it a SIGN?


BLIP2 is just the PNG from the file:


Link to Base64 encoded file [Windows users may want to virus-check this after decoding]


It’s a known problem that if a Word .doc with graphics is saved as an .rtf, the file size balloons because Word stores two copies of each image: the original, and one as a WMF. Even so, this difference in file sizes seems a bit extreme. There may be something else going on here. Guccifer2.0 said he left some surprises. BLIP3 is supposedly the same full-size PNG as BLIP2, just wrapped in a WMF shell for compatibility. Yet BLIP3 is 1.8 MB and BLIP2 is only 247 KB. What’s going on?

Opening it in ImageMagick shows a tiny 69x69px version of BLIP2 (shown top left). Clicking on properties shows data of an unknown type attached to the image:

Screenshot from 2018-02-15 20-51-01

  • The PNG is there right enough, from offset 0x164 to 0x3DF68 (253.8K), but the file continues all the way to 0x1C4154 (1851.7K)
  • I stopped counting occurrences of the string “WMFC” at 60. As far as I can tell the term for a Windows Metafile is WMF, not WMFC. Searching for WMFC only brings up “Wait For Memory Function Complete”… Crikes.
  • Nobody panic. It’s a mystery that’s either irrelevant or utterly vital to the survival of the species.
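
The offsets in the bullets above can be turned into sizes, and the same checks can be run on a decoded blob. This is a rough Python sketch under a few assumptions: the quoted end offset is treated as exclusive, the PNG is located by its standard signature and IEND trailer, and the helper names find_png_span and count_marker are illustrative only.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
IEND_TRAILER = b"IEND\xaeB`\x82"  # IEND chunk type followed by its fixed CRC

# Offsets as reported in the bullets above.
PNG_START = 0x164
PNG_END = 0x3DF68
FILE_END = 0x1C4154


def find_png_span(blob):
    """Return (start, end) of the first embedded PNG, or None if not found."""
    start = blob.find(PNG_SIGNATURE)
    if start == -1:
        return None
    trailer = blob.find(IEND_TRAILER, start)
    if trailer == -1:
        return None
    return start, trailer + len(IEND_TRAILER)


def count_marker(blob, marker=b"WMFC"):
    """Count non-overlapping occurrences of a byte marker."""
    return blob.count(marker)


png_bytes = PNG_END - PNG_START        # size of the PNG itself
trailing_bytes = FILE_END - PNG_END    # everything after the PNG
print(f"PNG: {png_bytes} bytes ({png_bytes / 1024:.1f} KiB)")
print(f"After the PNG: {trailing_bytes} bytes ({trailing_bytes / 1024:.1f} KiB)")
```

With those offsets the PNG works out to 253444 bytes, about 247.5 KiB, which lines up with the size quoted for BLIP2. The remaining 1597932 bytes are the part that is unaccounted for.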

Part 3 looks at the only solid breadcrumbs that G2.0 has left us: the MSODatastore.


How to find the Binaries:

The scheme is essentially the same for all of them. They’re chunks of data you’ll find in the source here:


E.g. for the “Russian” Theme:

  1. Open the above in the browser
  2. Search for “\*\themedata”
  3. Copy the hex to a text file, say russianfile.txt
  4. Convert the hex to binary (on Linux):
    1.  xxd -r -p russianfile.txt > russianfile.bin
  5. Then either
    1. view it in a hex viewer, or
    2. in the case of the Russian theme, it’s zipped XML, so copy it to a file with a .zip extension and unzip it; the contents are then plain text

For blips, search for “blipuid”; for the datastore, err, “datastore”.
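
The same steps can be sketched in Python for anyone without xxd. This is a minimal sketch, not a full RTF parser: it assumes the hex sits directly after a destination such as \*\themedata and runs until the first non-hex character, which may not hold for every chunk (the blips in particular may be laid out differently). The function names extract_hex_group and list_zip_names are illustrative.

```python
import io
import re
import zipfile


def extract_hex_group(rtf_text, control_word):
    """Return the bytes encoded as hex right after \\*\\<control_word>."""
    pattern = re.compile(
        r"\\\*\\" + re.escape(control_word) + r"\s*([0-9a-fA-F\s]+)"
    )
    match = pattern.search(rtf_text)
    if match is None:
        raise ValueError(f"control word {control_word!r} not found")
    # The hex is usually wrapped over many lines, so drop all whitespace.
    hex_digits = "".join(match.group(1).split())
    return bytes.fromhex(hex_digits)


def list_zip_names(data):
    """List the member names of a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

Usage would be along the lines of reading the source of 1.doc as text, calling extract_hex_group(text, "themedata"), writing the result to russianfile.bin, and passing the same bytes to list_zip_names to see the tree shown earlier.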


