File Format:Nitro+ Pak4

From TLWiki

This is a description of the Pak4 format used by Nitro+ games. It is probably not perfect, and it is based only on observations gleaned from the Gekkou no Carnevale project, so the implementation in other games may differ, particularly with regard to file types not present in Gekkou no Carnevale archives. In fact, it is not even certain that "Pak4" is the correct name. This page tries to give a text-only description; for an actual implementation of the operations outlined here, see carnevore (in Python) or nitro.cpp in the ExtractData sources (in C++).

The file format used in Jingai Makyou - which I shall call "Pak2" based on the version number in the files - appears to be a more primitive version of this.

General

Pak4 files are pretty standard archive files; they consist, in this order, of a signature, an index and the actual contents. The archives are both partially compressed (using zlib) and partially encrypted (using a custom encryption algorithm). They have the file extension ".pak". All Pak files in the game directory are combined into a virtual file system with subdirectories; in case of collisions (files with the same name and path in multiple archives), archive files that sort highest alphabetically take precedence. For Gekkou no Carnevale, unpacked files in the game directory and its subdirectories take even higher precedence. The basic unit of information in a Pak4 file is a DWORD (4 bytes), interpreted numerically as little-endian.

Encryption algorithm (approximation)

The encryption algorithm works by transforming a string (or byte sequence) into a DWORD. The string is interpreted as null-terminated (the first 0x00 byte and everything after it are cut off). The key starts at 0; then, for every byte in the input string, the key is multiplied by 0x89 (137) and the value of the current byte is added. This repeats until the input string is exhausted. The algorithm is apparently implemented in C(++), where a 32-bit integer wraps around on overflow; in languages whose integers grow instead of wrapping, you have to take the end result modulo 0x100000000 to get a DWORD.

NOTE: This algorithm isn't perfect; it works only for pure ASCII strings. Strings that contain double-byte characters generate different keys, and the method used for them isn't understood yet.
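
The key derivation described above can be sketched in Python as follows. This is only an illustration of the approximation given here, not code taken from the game or from carnevore; the function name `pak4_key` is made up for this example, and like the description it can only be expected to match for pure ASCII input.

```python
def pak4_key(data: bytes) -> int:
    """Approximate Pak4 key derivation (hypothetical helper name)."""
    key = 0
    # Treat the input as null-terminated: ignore the first 0x00 and the rest.
    for byte in data.split(b"\x00", 1)[0]:
        # Multiply by 0x89, add the byte, and wrap to 32 bits like a C DWORD.
        key = (key * 0x89 + byte) & 0xFFFFFFFF
    return key
```

Masking at every step gives the same result as taking the final value modulo 0x100000000, because multiplication and addition both carry over to arithmetic modulo 2^32. For example, `pak4_key(b"AB")` is 0x41 * 0x89 + 0x42 = 8971.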

Signature

Format signature

A Pak4 file starts with the byte sequence 0x04 0x00 0x00 0x00 ("4" as a little-endian DWORD). Presumably this identifies it as being an actual Pak4 file.

Game signature

After that follows a sequence of 256 bytes. For Gekkou no Carnevale, this is either the string "GekkounoCarnevale" or "GekkouNoCarnevale", filled with null bytes to get to 256 bytes. It seems that the actual content of the string does not matter as far as the game is concerned; it is, however, fed to the encryption algorithm to generate a key used to encrypt the index with (called the "archive key" from now on).
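
As a rough sketch, reading the two signatures and deriving the archive key could look like this in Python. The names `read_signature` and `pak4_key` are invented for this example, and rejecting any version other than 4 is an assumption based on the presumed role of the format signature.

```python
import struct


def pak4_key(data: bytes) -> int:
    """Approximate key derivation from the section above (ASCII only)."""
    key = 0
    for byte in data.split(b"\x00", 1)[0]:
        key = (key * 0x89 + byte) & 0xFFFFFFFF
    return key


def read_signature(header: bytes):
    """Return (game_signature, archive_key) from the first 260 bytes."""
    if len(header) < 260:
        raise ValueError("a Pak4 header needs at least 4 + 256 bytes")
    (version,) = struct.unpack_from("<I", header, 0)
    if version != 4:  # assumption: anything else is not a Pak4 archive
        raise ValueError("unexpected format signature: %d" % version)
    game_field = header[4:260]  # 256 bytes, padded with null bytes
    game_signature = game_field.split(b"\x00", 1)[0]
    return game_signature, pak4_key(game_field)
```

The whole 256-byte field can be passed to the key function because the algorithm stops at the first null byte anyway.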

Index

Index description fields

The index starts with a sequence of 16 bytes (4 DWORDs) that describe the actual index (little-endian for numerical values). These 4 DWORDs mean the following:

  • The first DWORD is presumably generated somehow from the game signature/key; the method for doing this isn't known. It is 0x64 0x00 0x00 0x00 if the game signature is "GekkounoCarnevale", and 0x65 0x00 0x00 0x00 if it is "GekkouNoCarnevale".
  • The second DWORD is the uncompressed size of the index in bytes, XOR'd with the archive key.
  • The third DWORD is the number of files in the archive, XOR'd with the archive key.
  • The fourth DWORD is the compressed size of the index as it is stored in the archive, XOR'd with the first DWORD.
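
The four fields listed above could be decoded along these lines. This is a sketch under the assumptions of this page; the function name and the dictionary keys are invented for illustration, and `archive_key` is expected to be the key derived from the game signature.

```python
import struct


def decode_index_header(fields: bytes, archive_key: int) -> dict:
    """Decode the 16 bytes that follow the 260-byte signature block."""
    first, enc_size, enc_count, enc_packed = struct.unpack_from("<4I", fields, 0)
    return {
        "first_dword": first,  # 0x64 or 0x65 in Gekkou no Carnevale
        "uncompressed_size": enc_size ^ archive_key,
        "file_count": enc_count ^ archive_key,
        # Note: this field is XOR'd with the first DWORD, not the archive key.
        "compressed_size": enc_packed ^ first,
    }
```

Since the compressed size depends only on the first DWORD, the index can be located and inflated even without the archive key; the decoded uncompressed size can then serve as a sanity check on the key.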

Index data

The index proper is compressed using zlib at the default compression level. It is not encrypted further. Each file record in the index is made up of the following:

  • A little-endian DWORD stating the length, in characters, of the file name (including its path).
  • The filename and path; path components are separated using a backslash.
  • 5 DWORDs that describe the file. They are encrypted using a key generated by feeding the filename to the key algorithm (called the "file key" from now on). They have the following meaning (again, little-endian):
    • The first DWORD is the offset of the beginning of the file, counting from the end of the compressed index, XOR'd with the file key. Since this is known when working on an existing archive, it can be used to second-guess the file key when unpacking.
    • The second DWORD is the uncompressed size of the file, XOR'd with the file key.
    • The third DWORD is the offset of the current record, counting from the beginning of the index data, XOR'd with the file key.
    • The fourth DWORD is a boolean value (0x00 or 0x01, little-endian) stating whether the file is compressed (0x01) or not, XOR'd with the file key.
    • The fifth DWORD is the compressed size of the file, XOR'd with the file key. If the file is not compressed to begin with, "0" is used instead of the actual compressed size.
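
Walking the records described above could be sketched as follows. The names `parse_index` and `pak4_key` and the dictionary keys are made up for this example. It also assumes that the name length counts bytes (which coincides with characters for pure ASCII names) and that the file key is derived from the raw name bytes, so it inherits the ASCII-only limitation of the key algorithm.

```python
import struct
import zlib


def pak4_key(data: bytes) -> int:
    """Approximate key derivation from the section above (ASCII only)."""
    key = 0
    for byte in data.split(b"\x00", 1)[0]:
        key = (key * 0x89 + byte) & 0xFFFFFFFF
    return key


def parse_index(compressed_index: bytes, file_count: int) -> list:
    """Inflate the index and decode file_count records from it."""
    index = zlib.decompress(compressed_index)
    records = []
    pos = 0
    for _ in range(file_count):
        (name_len,) = struct.unpack_from("<I", index, pos)
        pos += 4
        name = index[pos:pos + name_len]  # path parts separated by "\"
        pos += name_len
        file_key = pak4_key(name)
        # All five descriptor DWORDs are XOR'd with the same file key.
        offset, size, record_offset, flag, packed_size = (
            value ^ file_key for value in struct.unpack_from("<5I", index, pos)
        )
        pos += 20
        records.append({
            "name": name,
            "file_key": file_key,
            "offset": offset,  # counted from the end of the compressed index
            "size": size,
            "record_offset": record_offset,
            "compressed": flag == 1,
            "packed_size": packed_size,  # 0 if the file is not compressed
        })
    return records
```

If a decoded flag is neither 0 nor 1, the derived file key is probably wrong, for instance because the name contains double-byte characters.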

File data

The file data is pretty straightforward. Files are first encrypted using the file key, with a slight hitch: text files (those with the extension ".nps" (game scripts), ".h" (script header files) or ".ini" (configuration settings)) are cyclically XOR'd with the file key for their entire length, while for other files only the first 1024 bytes are. After that, and if appropriate, files are compressed using zlib at the default compression level. Generally compression is also only used for text files, but since the compression flag is stored per file in the index, the decision may be made on some other basis.
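
Unpacking reverses these two steps: inflate first (if the index says the file is compressed), then undo the XOR. The sketch below is a guess at how that might look, not a verified implementation. The function name is invented, and it assumes that "cyclically XOR'd with the file key" means repeating the key's four bytes in little-endian order, which the description above does not spell out.

```python
import struct
import zlib

TEXT_EXTENSIONS = (".nps", ".h", ".ini")


def unpack_file(name: str, stored: bytes, file_key: int, compressed: bool) -> bytes:
    """Undo compression, then the (partial) XOR encryption, for one file."""
    data = zlib.decompress(stored) if compressed else stored
    # Text files are XOR'd completely, everything else only for 1024 bytes.
    if name.endswith(TEXT_EXTENSIONS):
        limit = len(data)
    else:
        limit = min(len(data), 1024)
    key_bytes = struct.pack("<I", file_key)  # assumption: little-endian order
    head = bytes(b ^ key_bytes[i % 4] for i, b in enumerate(data[:limit]))
    return head + data[limit:]
```

Because XOR is its own inverse, calling the same function with `compressed=False` on plain data also produces the encrypted form, which is handy for round-trip testing.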