狂風日誌

Life is meant to be tasted in full: sour, sweet, bitter, and spicy

Java: The Comment of a ZIP Archive

Java's "java.util.zip" package makes it easy to get the comment of each entry inside an archive.
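
As a minimal sketch of that per-entry case: the class name `EntryComments` and its `read` helper are made up for illustration, while `ZipFile.entries()` and `ZipEntry.getComment()` are the standard API. Entry comments live in the central directory, so this sketch goes through `ZipFile` rather than `ZipInputStream`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of the original ZipTest.java.
class EntryComments {
    /** Maps each entry name to its comment (null when the entry has none). */
    static Map<String, String> read(Path zip) throws IOException {
        Map<String, String> comments = new LinkedHashMap<>();
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                comments.put(entry.getName(), entry.getComment());
            }
        }
        return comments;
    }
}
```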

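However, the comment of the Zip archive as a whole is not as easy to get through this API (`ZipFile.getComment()` only arrived in Java 7).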

To extract the archive comment yourself, you need to understand how a Zip file is laid out.

Solution

ZipTest.java: extract the zip file comment
import java.io.IOException;
import java.io.RandomAccessFile;

public class ZipTest {
    public static void main(String[] args) {
      /* use the first parameter as the input zip file */
      if (args.length == 0) {
          System.err.println("usage: java ZipTest <file.zip>");
          return;
      }
      String file = args[0];

      RandomAccessFile zipFile = null;
      try {
          zipFile = new RandomAccessFile(file, "r");

          long fileSize = zipFile.length();
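          /* find the magic number in a reverse manner; the EOCD is at least
             22 bytes long, so the last byte of its signature (0x06) is never
             closer than 19 bytes to the end of the file */
          for (long i = 19; fileSize - i - 3 >= 0; ++i) {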
              zipFile.seek(fileSize - i);
              byte b = zipFile.readByte();
              if (b == 0x06) {
                  zipFile.seek(fileSize - i - 3);
                  /* check for magic "0x06054b50" in little endian */
                  byte[] key = new byte[4];
                  zipFile.readFully(key);
                  if (key[0] != 0x50 || key[1] != 0x4b || key[2] != 0x05) {
                      continue;
                  }
                  /* get the file comment size */
                  byte[] tmp = new byte[18];
                  zipFile.readFully(tmp);
                  /* 2-byte little-endian length: low byte first, high byte shifted left */
                  int commentSize = (tmp[16] & 0xff) | ((tmp[17] & 0xff) << 8);
                  if (commentSize > 0) {
                      byte[] comment = new byte[commentSize];
                      zipFile.readFully(comment);
                      System.out.println("comment: " + new String(comment));
                  }
                  break;
              }
          }
      } catch(Exception ex) {
          ex.printStackTrace();
      } finally {
          if (zipFile != null) {
              try {
                  zipFile.close();
              } catch (IOException ex) {
                  /* nothing more can be done if close fails */
              }
          }
      }
    }
}
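
To try ZipTest, an archive with a known comment is handy. The following is a rough sketch, not part of the original solution: the class name `MakeCommentedZip`, the entry name `hello.txt`, and the default file name `commented.zip` are all made up for illustration. It writes the comment with `ZipOutputStream.setComment` and, assuming Java 7 or later, reads it back with `ZipFile.getComment()` as a cross-check.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical test-data generator for ZipTest.
class MakeCommentedZip {
    /** Writes a one-entry archive whose archive-level comment is `comment`. */
    static void write(Path target, String comment) throws IOException {
        try (OutputStream os = Files.newOutputStream(target);
             ZipOutputStream zos = new ZipOutputStream(os)) {
            zos.setComment(comment);
            zos.putNextEntry(new ZipEntry("hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
    }

    /** Reads the archive comment through the standard API (Java 7+); null if absent. */
    static String readBack(Path zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            return zf.getComment();
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get(args.length > 0 ? args[0] : "commented.zip");
        write(target, "made for ZipTest");
        System.out.println("comment: " + readBack(target));
    }
}
```

Running ZipTest on the generated file should then report the same comment text.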

The ZIP archive comment is stored in the EOCD (End of Central Directory) record.

To see where the EOCD sits, we need to look briefly at the Zip file format.


The ZIP file format

[local file header 1]
[encryption header 1]
[file data 1]
[data descriptor 1]
.
.
.
[local file header n]
[encryption header n]
[file data n]
[data descriptor n]
[archive decryption header]
[archive extra data record]
[central directory header 1]
.
.
.
[central directory header n]
[zip64 end of central directory record]
[zip64 end of central directory locator]
[end of central directory record]

Next, let's look at the structure of the EOCD:

end of central dir signature    4 bytes  (0x06054b50)
number of this disk             2 bytes
number of the disk with the
start of the central directory  2 bytes
total number of entries in the
central directory on this disk  2 bytes
total number of entries in
the central directory           2 bytes
size of the central directory   4 bytes
offset of start of central
directory with respect to
the starting disk number        4 bytes
.ZIP file comment length        2 bytes
.ZIP file comment               (variable size)
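
The field sizes above add up to a fixed part of 22 bytes, with the comment length at byte offset 20. As an illustrative sketch (the class `Eocd` and its field names are made up here, not taken from any library), the record can be decoded with a little-endian `ByteBuffer`, assuming the array starts exactly at the signature:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical value class mirroring some of the EOCD fields listed above.
final class Eocd {
    static final int SIGNATURE = 0x06054b50;
    static final int FIXED_SIZE = 22; // 4 + 2 + 2 + 2 + 2 + 4 + 4 + 2

    final int totalEntries;      // offset 10, 2 bytes
    final long centralDirSize;   // offset 12, 4 bytes
    final long centralDirOffset; // offset 16, 4 bytes
    final int commentLength;     // offset 20, 2 bytes

    private Eocd(int totalEntries, long centralDirSize,
                 long centralDirOffset, int commentLength) {
        this.totalEntries = totalEntries;
        this.centralDirSize = centralDirSize;
        this.centralDirOffset = centralDirOffset;
        this.commentLength = commentLength;
    }

    /** Parses an EOCD record that begins at index 0 of `record`. */
    static Eocd parse(byte[] record) {
        if (record.length < FIXED_SIZE) {
            throw new IllegalArgumentException("EOCD needs at least 22 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt(0) != SIGNATURE) {
            throw new IllegalArgumentException("missing EOCD signature");
        }
        // Mask after reading so the unsigned fields do not come out negative.
        return new Eocd(
                buf.getShort(10) & 0xffff,
                buf.getInt(12) & 0xffffffffL,
                buf.getInt(16) & 0xffffffffL,
                buf.getShort(20) & 0xffff);
    }
}
```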

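From the layout above, we can see that once we locate the "end of central dir signature", we know both the length and the content of the Zip comment. The solution above reads the file backwards from the end, one byte at a time, until it finds the magic number "0x06054b50".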
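
A variation on the same backward search, offered as a sketch rather than a definitive implementation: since the comment length is a 2-byte field, the EOCD must start within the last 22 + 65535 bytes, so the tail can be read once and scanned in memory. The class name `ZipComment` is made up, decoding the comment as UTF-8 is an assumption (the bytes may be in another encoding), and the extra check that the comment ends exactly at end of file is one way to skip signature bytes that merely happen to appear inside the comment.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical alternative to ZipTest: scan an in-memory tail buffer.
class ZipComment {
    private static final int EOCD_MIN = 22;        // fixed part of the EOCD
    private static final int MAX_COMMENT = 0xffff; // largest 2-byte length

    /** Returns the archive comment ("" if empty), or null if no EOCD is found. */
    static String read(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long size = raf.length();
            int tailLen = (int) Math.min(size, EOCD_MIN + MAX_COMMENT);
            byte[] tail = new byte[tailLen];
            raf.seek(size - tailLen);
            raf.readFully(tail);

            // Walk backwards over every position where an EOCD could start.
            for (int pos = tailLen - EOCD_MIN; pos >= 0; pos--) {
                boolean signature = tail[pos] == 0x50 && tail[pos + 1] == 0x4b
                        && tail[pos + 2] == 0x05 && tail[pos + 3] == 0x06;
                if (!signature) {
                    continue;
                }
                int len = (tail[pos + 20] & 0xff) | ((tail[pos + 21] & 0xff) << 8);
                // Accept only if the comment runs exactly to the end of the file.
                if (pos + EOCD_MIN + len == tailLen) {
                    // Assumption: the comment bytes are UTF-8.
                    return new String(tail, pos + EOCD_MIN, len, StandardCharsets.UTF_8);
                }
            }
            return null;
        }
    }
}
```

Reading the tail in one call trades up to about 64 KB of memory for far fewer seeks than the byte-by-byte loop.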

References

PKWARE