Doc1 Part 2: Binary Chunks.

  • Altered documents contain clues to previous versions – Binary chunks.
  • They also contain misinformation
  • They increase the file size from the original 696 KB to 6.9 MB
  • “Confidential” watermark: from GDIC?
  • MSODatastore contains a Guccifer Clue.

The source of Guccifer’s 1.doc can be viewed in a browser:

view-source:https://guccifer2.files.wordpress.com/2016/06/1.doc

We can see that it’s mostly a text format. A real Word .doc is a binary format and would look something like this in a browser:

view-source:https://www.benefits.va.gov/WARMS/docs/admin26/pamphlet/pam26_7/ch06.doc
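
The difference can also be checked from the first few bytes rather than by eye. This is a minimal sketch, not part of the original analysis: the function name sniff_word_format is made up for illustration, and it assumes 1.doc is really RTF saved with a .doc extension, which the \par control words in its source suggest. RTF files begin with the ASCII text {\rtf, binary Word .doc files are OLE2 compound files that begin with the bytes D0 CF 11 E0 A1 B1 1A E1, and .docx files are zip archives that begin with PK.

```python
# Signatures for the three containers a ".doc" download might really be.
RTF_MAGIC = b"{\\rtf"                            # plain-text RTF
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")   # binary Word 97-2003 .doc
ZIP_MAGIC = b"PK\x03\x04"                        # zip container, e.g. .docx


def sniff_word_format(head: bytes) -> str:
    """Guess the container format from the first bytes of a file.

    `head` only needs to hold the first 8 bytes. The labels returned
    are illustrative names, not an official taxonomy.
    """
    if head.startswith(RTF_MAGIC):
        return "rtf"
    if head.startswith(OLE2_MAGIC):
        return "ole2-doc"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"
```

To try it on a downloaded copy, read the first 8 bytes in binary mode, e.g. sniff_word_format(open("1.doc", "rb").read(8)), where the local filename is an assumption.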

The text of 1.doc contains several hex-encoded strings that hold the items listed above. Brief instructions for extracting them yourself are at the bottom of this post, and I’ve uploaded Base64 versions of some to Dropbox.

“Russian Language” Theme

The Russian language in the documents has long been disputed; see, for example, Adam Carter’s excellent blog and the links therein. In the source we can see the section where the Russian style comes from:

\par }{\rtlch\fcs1 \af0 \ltrch\fcs0 \insrsid12588815\charrsid11758497 
\par }{\*\themedata 504b030414000600080000002100828abc13fa0000001c020000130000005b436f6e74656e745f54797065735d2e786d6cac91cb6ac3301045f785fe83d0b6d8
72ba28a5d8cea249777d2cd20f18e4b12d6a8f843409c9df77ecb850ba082d74231062ce997b55ae8fe3a00e1893f354e9555e6885647de3a8abf4fbee29bbd7
2a3150038327acf409935ed7d757e5ee14302999a654e99e393c18936c8f23a4dc072479697d1c81e51a3b13c07e4087e6b628ee8cf5c4489cf1c4d075f92a0b
44d7a07a83c82f308ac7b0a0f0fbf90c2480980b58abc733615aa2d210c2e02cb04430076a7ee833dfb6ce62e3ed7e14693e8317d8cd0433bf5c60f53fea2fe7
065bd80facb647e9e25c7fc421fd2ddb526b2e9373fed4bb902e182e97b7b461e6bfad3f010000ffff0300504b030414000600080000002100a5d6a7e7c00000

This is the “Russian” language theme, and it decodes as an XML docx theme, as shown in the tree below. It really is a Russian theme. The only problem is that it’s damn easy to fake, as I’ll show in another post.

├── [Content_Types].xml
├── _rels
│   └── rels
├── theme
│   └── theme
│       ├── _rels
│       │   └── themeManager.xml.rels
│       ├── theme1.xml
│       └── themeManager.xml

Edit: theme1.xml in G2.0’s doc seems to be exactly the same as this one and is mentioned in these MSDocs about XML SDK.
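
The first bytes of the \themedata hex shown above already show that it is a zip archive. As a rough sketch (the function name and dictionary keys are made up for illustration), the 30-byte zip local file header can be unpacked with struct. The hex in the code is copied from the start of the dump above and split into fields for readability.

```python
import struct

# First 49 bytes of the \themedata hex from 1.doc, split into zip header fields.
THEMEDATA_PREFIX = (
    "504b0304"   # signature "PK\x03\x04"
    "1400"       # version needed to extract
    "0600"       # general purpose flags
    "0800"       # compression method
    "0000"       # DOS modification time
    "2100"       # DOS modification date
    "828abc13"   # CRC-32
    "fa000000"   # compressed size
    "1c020000"   # uncompressed size
    "1300"       # file name length
    "0000"       # extra field length
    "5b436f6e74656e745f54797065735d2e786d6c"  # file name bytes
)


def parse_zip_local_header(data: bytes) -> dict:
    """Decode one zip local file header (all fields are little-endian)."""
    (sig, version, flags, method, mtime, mdate,
     crc, csize, usize, name_len, extra_len) = struct.unpack(
        "<4sHHHHHIIIHH", data[:30])
    if sig != b"PK\x03\x04":
        raise ValueError("not a zip local file header")
    name = data[30:30 + name_len].decode("cp437")
    return {
        "name": name,
        "method": method,              # 8 means deflate
        "compressed_size": csize,
        "uncompressed_size": usize,
        "dos_date": mdate,
        "dos_time": mtime,
    }


header = parse_zip_local_header(bytes.fromhex(THEMEDATA_PREFIX))
print(header)
```

The decoded name is [Content_Types].xml, the first entry in the tree, stored with deflate compression (250 bytes compressed, 540 uncompressed). The date and time fields decode to 1980-01-01 00:00, so this header carries no real timestamp.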

The ColorScheme

Not the most thrilling thing in the world, but included for completeness.


BLIP1 – WMF with Confidential .tga file

Link to Base 64 Encoded version

WordPress doesn’t allow .wmf or .tga images “for security reasons” so this is a PNG copy of it.

Strapping on the veterinary rubber gloves and going elbow deep into the WMF that “wraps” the .tga file, we find some interesting stuff:

We’ve got GDIC mentioned at least twice (the only possibly relevant thing I can find for GDIC is the General Defence Intelligence Committee), plus System, IBM, DOS, and what could be a heavily obfuscated file path. I’ve tried finding Win32FileTimes without success. Unfortunately WMF/EMF is an old format and isn’t very well documented.
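
A quick way to surface strings like these is to scan the binary for runs of printable ASCII, much as the Unix strings tool does. The sketch below is an illustration only, not the method used for the findings above: the function name and the minimum length of 4 are arbitrary choices, and it will miss text stored as UTF-16.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least `min_len` printable ASCII characters."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# Synthetic example: two readable words buried in non-printable bytes.
sample = b"\x00\x01GDIC\x00\x02ab\x00System\xff"
print(printable_strings(sample))
```

On the synthetic sample this picks out GDIC and System and skips the two-character run. On the real BLIP1 binary the output would need sifting by hand.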

This watermark was added later: it isn’t in the original docx.

  • We presume by G2.0.
  • The question is: why? Does he want us to find GDIC?
  • Is he implying he’s also hacked GDIC?
  • Oh Lord, Is it a SIGN?

BLIP2

This is just the PNG from the file:

BLIP3

Link to Base64 encoded file  [Windows users may want to virus check this after decoding]

 

It’s a known problem that if a Word .doc with graphics is saved as an .rtf, the file size balloons because it makes two copies of each image: the original, and one as a WMF. Even so, the difference in file sizes here seems a bit extreme. There may be something else going on. Guccifer2.0 said he left some surprises. BLIP3 is supposedly the same full-size PNG as BLIP2, just wrapped in a WMF shell for compatibility. Yet BLIP3 is 1.8 MB and BLIP2 is only 247 KB. What’s going on?

Opening it in ImageMagick shows a tiny 69x69px version of BLIP2 (shown top left). Clicking on properties shows data of an unknown type attached to the image:

Screenshot from 2018-02-15 20-51-01

  • The PNG is there right enough, from offset 0x164 to 0x3DF68 (253.8K), but the file continues all the way to 0x1C4154 (1851.7K)
  • I stopped counting occurrences of the string “WMFC” at 60. As far as I can tell the term for a Windows Metafile is WMF, not WMFC. Searching for WMFC only brings up “Wait For Memory Function Complete”… Crikes.
  • Nobody panic. It’s a mystery that’s either irrelevant or utterly vital to the survival of the species.
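
The offsets above can be cross-checked by walking the PNG chunk structure and counting the marker. This is a sketch under assumptions: the function names are made up, and it assumes the PNG sits in the BLIP3 binary as one contiguous run, which the offsets suggest. The chunk walk does not verify CRCs.

One possibly relevant detail, offered as an assumption rather than something checked against this file: the MS-WMF specification defines 0x43464D57 (the bytes “WMFC”) as the comment identifier of META_ESCAPE_ENHANCED_METAFILE records, which carry an embedded EMF in pieces. Repeated WMFC markers could simply be those records.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_extent(blob: bytes) -> tuple[int, int]:
    """Return (start, end) byte offsets of the first PNG embedded in `blob`.

    Walks the chunk list (4-byte big-endian length, 4-byte type, data,
    4-byte CRC) until the IEND chunk. `end` is one past the last PNG byte.
    """
    start = blob.find(PNG_SIGNATURE)
    if start < 0:
        raise ValueError("no PNG signature found")
    pos = start + len(PNG_SIGNATURE)
    while pos + 8 <= len(blob):
        length, chunk_type = struct.unpack(">I4s", blob[pos:pos + 8])
        pos += 8 + length + 4          # header + data + CRC
        if chunk_type == b"IEND":
            return start, pos
    raise ValueError("PNG is truncated: no IEND chunk")


def count_marker(blob: bytes, marker: bytes = b"WMFC") -> int:
    """Count non-overlapping occurrences of `marker` in `blob`."""
    return blob.count(marker)
```

For reference, 0x3DF68 - 0x164 is 253,444 bytes (about 247.5 KiB, matching BLIP2), and 0x1C4154 - 0x3DF68 leaves 1,597,932 bytes, roughly 1.5 MiB, after the PNG ends.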

Part 3 looks at the only solid breadcrumbs that G2.0 has left us: the MSODatastore.

————————————————————————————————————-

How to find the Binaries:

Scheme is essentially the same for all. They’re chunks of data you’ll find in the source here:

view-source:https://guccifer2.files.wordpress.com/2016/06/1.doc

E.g. for the “Russian” Theme:

  1. Open the above in the browser
  2. Search for “*\themedata”
  3. Copy hex to a txt file, say russianfile.txt
  4. Convert the hex to binary (on Linux):
    1.  xxd -r -p russianfile.txt > russianfile.bin
  5. Then either
    1. view in a hex viewer, or
    2. In the case of the Russian theme, it’s zipped XML, so copy it to a file with a .zip extension and unzip it; the contents are then plain text

For the blips, search for “blipuid”; for the datastore, err, “datastore”.
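
The steps above can also be scripted. The following is a minimal sketch rather than a tested tool for this particular file: the function names are made up, and it assumes the hex run starts right after the keyword and ends at the first character that is neither a hex digit nor whitespace. For the blips, the first hex run after “blipuid” is probably the short ID itself, with the picture data following the closing brace, so this would need adapting.

```python
import io
import re
import zipfile


def extract_hex_after(rtf_text: str, keyword: str) -> bytes:
    """Return the bytes encoded by the hex run that follows `keyword`.

    Line breaks inside the hex run are ignored, like `xxd -r -p`.
    """
    match = re.search(re.escape(keyword) + r"\s+([0-9a-fA-F\s]+)", rtf_text)
    if match is None:
        raise ValueError(f"no hex data found after {keyword!r}")
    hex_digits = "".join(match.group(1).split())
    return bytes.fromhex(hex_digits)


def list_zip_members(blob: bytes) -> list[str]:
    """List the file names inside a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.namelist()
```

With a saved copy of the source in a string called rtf_text (a hypothetical variable), list_zip_members(extract_hex_after(rtf_text, "\\themedata")) should print the same member names as the theme tree earlier in the post, assuming the whole theme is stored as a single hex run.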
