Stuart's journal


Saturday, January 19th, 2002
4:28 pm - File 3
Number of entries: 57

Each line is 4 values, the last one is always zero.

0 -31540 0 0
10610 -29753 0 0
0 -29752 -10610 0
21941 -22672 0 0
13309 -25274 -13309 0
0 -22672 -21941 0
29732 -10677 0 0
24346 -14017 -14182 0
14181 -14017 -24346 0
0 -10677 -29732 0
31034 -7777 -6137 0
29387 -8828 -11114 0
26191 -8458 -17487 0
22138 -9314 -22138 0
17486 -8458 -26191 0
11114 -8828 -29387 0
6137 -7777 -31034 0
0 -6363 -31992 0
-10610 -29752 0 0
-13309 -25274 -13309 0
-21941 -22672 0 0
-14181 -14017 -24346 0
-24346 -14017 -14182 0
-29732 -10677 0 0
-6137 -7777 -31034 0
-11114 -8828 -29387 0
-17486 -8458 -26191 0
-22138 -9315 -22138 0
-26190 -8458 -17487 0
-29387 -8828 -11114 0
-31034 -7777 -6137 0
-31992 -6363 0 0
0 -29752 10610 0
-13309 -25274 13309 0
0 -22672 21941 0
-24346 -14017 14182 0
-14181 -14017 24346 0
0 -10677 29732 0
-31034 -7777 6137 0
-29387 -8828 11114 0
-26191 -8458 17487 0
-22138 -9315 22138 0
-17486 -8458 26191 0
-11114 -8828 29387 0
-6138 -7777 31034 0
0 -6364 31992 0
13309 -25274 13309 0
14181 -14017 24346 0
24346 -14017 14182 0
6138 -7777 31034 0
11113 -8828 29387 0
17486 -8458 26191 0
22138 -9314 22138 0
26191 -8458 17487 0
29387 -8828 11114 0
31034 -7777 6137 0
31992 -6363 0 0


4:27 pm - File 2
Number of entries: 80

Again, each line starts with the offset, which can be ignored. Following this is a zero, then a number that
increments in steps of 26 in other files (in this file it is always zero). Then 6 values.

2788 - 0 0 1188 1496 1232 544 576 592
2820 - 0 0 132 176 528 16 48 64
2852 - 0 0 132 528 264 16 64 32
2884 - 0 0 264 528 308 32 64 80
2916 - 0 0 176 220 572 48 96 112
2948 - 0 0 176 572 528 48 112 64
2980 - 0 0 528 572 616 64 112 128
3012 - 0 0 528 616 308 64 128 80
3044 - 0 0 308 616 352 80 128 144
3076 - 0 0 220 2376 396 96 160 176
3108 - 0 0 220 396 572 96 176 112
3140 - 0 0 572 2332 440 112 192 208
3172 - 0 0 572 440 616 112 208 128
3204 - 0 0 616 2288 484 128 224 240
3236 - 0 0 616 484 352 128 240 144
3268 - 0 0 352 2244 88 144 256 272
3300 - 0 0 0 264 704 0 32 288
3332 - 0 0 264 308 968 32 80 304
3364 - 0 0 264 968 704 32 304 288
3396 - 0 0 704 968 748 288 304 320
3428 - 0 0 308 352 1012 80 144 336
3460 - 0 0 308 1012 968 80 336 304
3492 - 0 0 968 1012 1056 304 336 352
3524 - 0 0 968 1056 748 304 352 320
3556 - 0 0 748 1056 792 320 352 368
3588 - 0 0 352 2200 836 144 384 400
3620 - 0 0 352 836 1012 144 400 336
3652 - 0 0 1012 2156 880 336 416 432
3684 - 0 0 1012 880 1056 336 432 352
3716 - 0 0 1056 2112 924 352 448 464
3748 - 0 0 1056 924 792 352 464 368
3780 - 0 0 792 2068 660 368 480 496
3812 - 0 0 0 704 1144 0 288 512
3844 - 0 0 704 748 1408 288 320 528
3876 - 0 0 704 1408 1144 288 528 512
3908 - 0 0 1144 1408 1188 512 528 544
3940 - 0 0 748 792 1452 320 368 560
3972 - 0 0 748 1452 1408 320 560 528
4004 - 0 0 1408 1452 1496 528 560 576
4036 - 0 0 1408 1496 1188 528 576 544
4068 - 0 0 0 132 264 0 16 32
4100 - 0 0 792 2024 1276 368 608 624
4132 - 0 0 792 1276 1452 368 624 560
4164 - 0 0 1452 1980 1320 560 640 656
4196 - 0 0 1452 1320 1496 560 656 576
4228 - 0 0 1496 1936 1364 576 672 688
4260 - 0 0 1496 1364 1232 576 688 592
4292 - 0 0 1232 1892 1100 592 704 720
4324 - 0 0 0 1144 132 0 512 16
4356 - 0 0 1144 1188 1672 512 544 736
4388 - 0 0 1144 1672 132 512 736 16
4420 - 0 0 132 1672 176 16 736 48
4452 - 0 0 1188 1232 1716 544 592 752
4484 - 0 0 1188 1716 1672 544 752 736
4516 - 0 0 1672 1716 1760 736 752 768
4548 - 0 0 1672 1760 176 736 768 48
4580 - 0 0 176 1760 220 48 768 96
4612 - 0 0 1232 1848 1540 592 784 800
4644 - 0 0 1232 1540 1716 592 800 752
4676 - 0 0 1716 1804 1584 752 816 832
4708 - 0 0 1716 1584 1760 752 832 768
4740 - 0 0 1760 2464 1628 768 848 864
4772 - 0 0 1760 1628 220 768 864 96
4804 - 0 0 220 2420 44 96 880 896
4836 - 0 0 1716 1540 1804 752 800 816
4868 - 0 0 1232 1100 1848 592 720 784
4900 - 0 0 1232 1364 1892 592 688 704
4932 - 0 0 1496 1320 1936 576 656 672
4964 - 0 0 1452 1276 1980 560 624 640
4996 - 0 0 792 660 2024 368 496 608
5028 - 0 0 792 924 2068 368 464 480
5060 - 0 0 1056 880 2112 352 432 448
5092 - 0 0 1012 836 2156 336 400 416
5124 - 0 0 352 88 2200 144 272 384
5156 - 0 0 352 484 2244 144 240 256
5188 - 0 0 616 440 2288 128 208 224
5220 - 0 0 572 396 2332 112 176 192
5252 - 0 0 220 44 2376 96 896 160
5284 - 0 0 220 1628 2420 96 864 880
5316 - 0 0 1760 1584 2464 768 832 848


4:25 pm - File 1
Number of entries: 57

Each line starts with the offset in the file, and contains 11 entries. 5 of these seem to be always zero, and the
last in each record is also zero.

280 - 162 -656058 822 0 0 0 0 0 32768 32768 0
324 - 655522 -698 822 0 0 0 0 0 32 32768 0
368 - 162 -698 -654536 0 0 0 0 0 32768 65503 0
412 - 250956 -606172 822 0 0 0 0 0 20237 32768 0
456 - 463574 -464110 822 0 0 0 0 0 9620 32768 0
500 - 605636 -251492 822 0 0 0 0 0 2523 32768 0
544 - 162 -606172 -249976 0 0 0 0 0 32768 45298 0
588 - 162 -464110 -462588 0 0 0 0 0 32768 55915 0
632 - 162 -251492 -604650 0 0 0 0 0 32768 63012 0
676 - 605636 -698 -249976 0 0 0 0 0 2523 45298 0
720 - 463574 -698 -462588 0 0 0 0 0 9620 55915 0
764 - 250956 -698 -604650 0 0 0 0 0 20237 63012 0
808 - 267714 -535800 -266728 0 0 0 0 0 19405 46130 0
852 - 497752 -322664 -278892 0 0 0 0 0 7910 46740 0
896 - 279878 -322664 -496772 0 0 0 0 0 18795 57625 0
940 - -655196 -698 822 0 0 0 0 0 65503 32768 0
984 - -250630 -606172 822 0 0 0 0 0 45298 32768 0
1028 - -463248 -464110 822 0 0 0 0 0 55915 32768 0
1072 - -605310 -251492 822 0 0 0 0 0 63012 32768 0
1116 - -250630 -698 -604650 0 0 0 0 0 45298 63012 0
1160 - -463248 -698 -462588 0 0 0 0 0 55915 55915 0
1204 - -605310 -698 -249976 0 0 0 0 0 63012 45298 0
1248 - -267386 -535800 -266728 0 0 0 0 0 46130 46130 0
1292 - -279544 -322664 -496772 0 0 0 0 0 46740 57625 0
1336 - -497424 -322664 -278892 0 0 0 0 0 57625 46740 0
1380 - 162 -698 656182 0 0 0 0 0 32768 32 0
1424 - 162 -606172 251616 0 0 0 0 0 32768 20237 0
1468 - 162 -464110 464228 0 0 0 0 0 32768 9620 0
1512 - 162 -251492 606290 0 0 0 0 0 32768 2523 0
1556 - -605310 -698 251616 0 0 0 0 0 63012 20237 0
1600 - -463248 -698 464228 0 0 0 0 0 55915 9620 0
1644 - -250630 -698 606290 0 0 0 0 0 45298 2523 0
1688 - -267386 -535800 268366 0 0 0 0 0 46130 19405 0
1732 - -497424 -322664 280530 0 0 0 0 0 57625 18795 0
1776 - -279544 -322664 498410 0 0 0 0 0 46740 7910 0
1820 - 250956 -698 606290 0 0 0 0 0 20237 2523 0
1864 - 463574 -698 464228 0 0 0 0 0 9620 9620 0
1908 - 605636 -698 251616 0 0 0 0 0 2523 20237 0
1952 - 267714 -535800 268366 0 0 0 0 0 19405 19405 0
1996 - 279878 -322664 498410 0 0 0 0 0 18795 7910 0
2040 - 497752 -322664 280530 0 0 0 0 0 7910 18795 0
2084 - 364406 -698 545950 0 0 0 0 0 14575 5537 0
2128 - 128070 -698 643842 0 0 0 0 0 26378 648 0
2172 - -127742 -698 643842 0 0 0 0 0 39157 648 0
2216 - -364078 -698 545950 0 0 0 0 0 50960 5537 0
2260 - -544964 -698 365064 0 0 0 0 0 59998 14575 0
2304 - -642862 -698 128722 0 0 0 0 0 64887 26378 0
2348 - -642862 -698 -127084 0 0 0 0 0 64887 39157 0
2392 - -544964 -698 -363426 0 0 0 0 0 59998 50960 0
2436 - -364078 -698 -544312 0 0 0 0 0 50960 59998 0
2480 - -127742 -698 -642204 0 0 0 0 0 39157 64887 0
2524 - 128070 -698 -642204 0 0 0 0 0 26378 64887 0
2568 - 364406 -698 -544312 0 0 0 0 0 14575 59998 0
2612 - 545292 -698 -363426 0 0 0 0 0 5537 50960 0
2656 - 643190 -698 -127084 0 0 0 0 0 648 39157 0
2700 - 643190 -698 128722 0 0 0 0 0 648 26378 0
2744 - 545292 -698 365064 0 0 0 0 0 5537 14575 0


4:24 pm - Tachyon PAK file cracking - part 2
Previously, I'd worked out how to get textures out of the files, and also the vertex
arrays, but I was only working from one PAK file (phoenix.pak). When I went into work
I got some more, thanks to Shadetree and to Skeeter for sending them.

With these new pak files (tethys, orion, glint laser and blade), I could change
some of the hard coded references in my program to variables and finally fill
in the gaps with the header format.

First off was the header name - this is now the ObjectName, as shown in the Glint Light
Laser pak.

Second were the offsets to the various lists, as well as the object list and texture list.

Here is what I had before, on the TETHYS.pak file...


3DPK
8195/2003/0320/800
Header filename: TETHYS
Unknown:
No of object lists: 1
1: 656456/A0448/48040A/4719626
2: 0/0//0
3: 96/60/60/96
4: 6260/1874/7418/29720
5: 10791/2A27/272A/10026
6: 0/0//0
7: 0/0//0
8: 0/0//0
9: 0/0//0
10: 0/0//0
11: 0/0//0
12: 0/0//0
3000 - 104
1 - 1
0 - 160
0 - 0

Probably makes no sense to you, until I show you this:

3DPK
8195/2003/0320/800
Object name: TETHYS
Unknown: 2
No of object lists: 1
1: 656456/A0448/48040A/4719626
2: 0/0//0
Object list offset: 96
Texture list offset: 6260
Texture list end: 10791
1: 0/0//0
2: 0/0//0
3: 0/0//0
4: 0/0//0
5: 0/0//0
6: 0/0//0
7: 0/0//0
3000 - 104
1 - 1
0 - 160
0 - 0

Now you can see some of the blanks being filled in. I still had to work out
the exact format of the object lists (those that tell me where in the file to
find the start of the objects), but I now knew how long the entire texture
list was.

The 8195 seems constant across all the files, so I think this is definitely
a file version number. I'll tag it as such for now.

Test - ignore :)

100 - 128
200 - 448
500 Start of Actual Objects: 724
3000 - 1000

3000 - 104
1 - 1
0 Start of Actual Objects: 160
0 - 0

3000 - lastItemInList
100 - firstItemInList


The list you see above is what I call the "infoList". Basically, the actual objectList
itself has another section of information about it, for example where the objectList
is in the file, and what the offsets are. It is to this list that the offset 96, which
appears immediately after the name of the object, points.

So, to recap, we have a header structure like so:

Public Type pakFile
Identifier As String * 4 ' Always '3DPK'
Version As Long ' Always equal to 8195
Name As String * 32 ' Name of object - max size of 32, ex: "GALSPAN LIGHT LASER"
Unknown As Long ' Unknown as yet
noOfGroups As Long ' Number of object groupings
Unknown2 As Long ' Always seems to be a large number
Filler As Long ' Always zero
infoList As Long ' Info list offset
textureStart As Long ' Start of texture list
textureEnd As Long ' End of texture list - always seems to be EOF
Filler2 As Long ' Always zero
Unknown3 As Long ' Unknown - always seems to be a large number. (game ID?)
Filler3(1 To 5) As Long ' Always zero (?) - 5 Longs, bringing the header to 96 bytes
End Type
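
A minimal sketch of reading this header, written in Python rather than the Visual Basic used for the real tool. It assumes little-endian, signed 32-bit Longs and a name field padded with NULs or spaces; the function name parse_pak_header and the dictionary keys are illustrative only. The sample bytes are synthetic, built from the TETHYS values shown earlier, not read from a real PAK file.

```python
import struct

# Layout assumed from the pakFile Type: 4-byte tag, Long, 32-byte name,
# then 14 Longs (9 named fields plus the 5 trailing fillers) = 96 bytes.
# Little-endian byte order and signed Longs are assumptions.
PAK_HEADER = struct.Struct("<4si32s14i")


def parse_pak_header(data):
    """Return the named header fields from the first 96 bytes of a PAK file."""
    fields = PAK_HEADER.unpack_from(data, 0)
    identifier, version, raw_name = fields[0], fields[1], fields[2]
    (unknown, groups, unknown2, _filler, info_list,
     tex_start, tex_end, _filler2, unknown3) = fields[3:12]
    return {
        "identifier": identifier.decode("ascii"),
        "version": version,
        # Cut at the first NUL, then drop any trailing space padding.
        "name": raw_name.split(b"\x00", 1)[0].decode("ascii").rstrip(),
        "unknown": unknown,
        "noOfGroups": groups,
        "unknown2": unknown2,
        "infoList": info_list,
        "textureStart": tex_start,
        "textureEnd": tex_end,
        "unknown3": unknown3,
    }


# Synthetic header using the TETHYS values quoted in this entry.
sample = PAK_HEADER.pack(b"3DPK", 8195, b"TETHYS", 2, 1, 656456, 0,
                         96, 6260, 10791, 0, 0, 0, 0, 0, 0, 0)
header = parse_pak_header(sample)
print(header["name"], header["infoList"], header["textureStart"], header["textureEnd"])
```

The struct size works out to 96 bytes, which matches the infoList offset of 96 seen in the dumps.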

In working out the header structure, I realised that the first value in the so-called
"infoList", is *always* the start of the actual objectInfoList, thus I've added them
to the following structure.

Public Type infoList
Filler As Integer ' Always zero
Type As Integer ' Specifies the type of information this is. For example
' 3000 = endOfObjectList, 50/75/100 = startOfObjectList
Value As Long ' Actual value
End Type
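
As a rough illustration of the infoList record, here is a Python sketch (not the Visual Basic from the actual program). It assumes each entry is 8 bytes, little-endian, in the order Filler, Type, Value. How many entries a file holds is not established in these notes, so the count parameter is something the caller has to supply. The test data is synthetic, built from the 100/200/500/3000 pairs listed earlier.

```python
import struct

# Filler (Integer), Type (Integer), Value (Long); little-endian is an assumption.
INFO_ENTRY = struct.Struct("<hhi")


def parse_info_list(data, offset, count):
    """Read `count` infoList entries starting at `offset`; return (type, value) pairs."""
    entries = []
    for i in range(count):
        _filler, kind, value = INFO_ENTRY.unpack_from(data, offset + i * INFO_ENTRY.size)
        entries.append((kind, value))
    return entries


# Synthetic bytes built from the pairs quoted in this entry.
pairs = [(100, 128), (200, 448), (500, 724), (3000, 1000)]
blob = b"".join(INFO_ENTRY.pack(0, kind, value) for kind, value in pairs)
info = parse_info_list(blob, 0, len(pairs))
print(info)
```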

I'll probably rename things later on, because it's confusing - this file format seems
to have a lot of arrays and so on.

Using the information above, I successfully extracted all the header information from
all the PAK files I had, which was good :)

For my next task, I needed to work out the exact format of the objectLists themselves.
This is in case they contain information such as number of polygons, faces and so
on.

Each objectList starts with value 65536, as far as I can tell. This is followed by
the number of objects in this list, for example 6. Each ship, by the way, has a
maximum of 4 object lists - although this is to be verified. Then there is a filler,
always value 12 for ships.

Next, the following is repeated for each object in the objectList:

ObjectOffset As Long ' Actual start of this object
Filler As Long ' x 3. Seem to be always zero.
Flags As Long ' Either 0 or 64, no use verified - flags maybe?
Unknown As Long ' Always 44
Filler2 As Long ' x 5. Seem to be always zero

The most important information - and the only bit we can use at this time - is the
objectOffset. We will examine each object in turn to extract actual polygon
and vertex information.
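
A sketch of walking one objectList and collecting the objectOffsets, in Python rather than Visual Basic. It assumes the layout described above: three Longs (65536, count, 12) followed by one 44-byte entry of eleven Longs per object, all little-endian. The name read_object_offsets and the sample bytes are made up for illustration.

```python
import struct

LIST_HEADER = struct.Struct("<3i")    # marker (65536), number of objects, filler (12)
OBJECT_ENTRY = struct.Struct("<11i")  # ObjectOffset, 3 fillers, Flags, 44, 5 fillers


def read_object_offsets(data, list_offset):
    """Return the ObjectOffset of every entry in the objectList at `list_offset`."""
    marker, count, _filler = LIST_HEADER.unpack_from(data, list_offset)
    if marker != 65536:
        raise ValueError("no objectList marker at offset %d" % list_offset)
    pos = list_offset + LIST_HEADER.size
    offsets = []
    for _ in range(count):
        entry = OBJECT_ENTRY.unpack_from(data, pos)
        offsets.append(entry[0])  # only the first Long is understood so far
        pos += OBJECT_ENTRY.size
    return offsets


# Synthetic two-object list; the offsets are just sample numbers.
blob = LIST_HEADER.pack(65536, 2, 12)
for start in (1056, 79328):
    blob += OBJECT_ENTRY.pack(start, 0, 0, 0, 64, 44, 0, 0, 0, 0, 0)
object_offsets = read_object_offsets(blob, 0)
print(object_offsets)
```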

Random stuff - ignore :)

GLIGHTL
--------
noOfTextures: 7
VertexOffset: 11584
noOfVerts: 55

7 144 150 55 0 1
92 288 6624 11424 12304 12304
6336 4800 880 0 10947
18c0 12c0 370 0 2ac3

offset name length
dec hex dec hex
12324 3024 PCX1 1952 7a0 1157 d9 = 217 1157+217 = 1374 = 55e
14276 37c4 PCX2 (maybe +1) 2120 848
16396 400c PCX3 (maybe +2) 1900 76c
18296 4778 PCX4 (maybe +3) 1456 5b0
19752 4d28 PCX5 1164 48c
20916 51b4 PCX6 (maybe +2) 1472 5c0
22388 5774 PCX7 (maybe +3) 862 35e

00 01 08 00 = 264 or = 256 and 8
40 00 00 00 = 64
10 00 00 00 = 16
85 04 00 00 = 1157

12512 - 1925 - 785

485 = 1157, taking past the main PCX data and onto what seems like another
array. This starts with D9 which is 217. Number of bytes, or number
of entries?

I am inclined to think it's number of entries, since the PCX format states:

"To access a 256 color palette:
First, check the version number in the header, if it contains a 5
there is a palette. Second, read to the end of the file and count
back 769 bytes. The value you find should be a 12 decimal, showing
the presence of a 256 color palette."

In the PCX files contained within the PAK file, 769 bytes back from the end
of the texture entry is not, in fact, 12 decimal, but the D9 I mentioned above.
However, it definitely seems that this is the palette information,
and knowing the format of this I can complete the code for reading the
*entire* PCX file out of the PAK.

Basically 769 - 4 bytes (for the length of 00 00 00 D9), is 765 bytes. Divide
this by 3 and we get 255. Or we could simply take D9 as 1 byte, which seems
to be common in the PCX format, and we have 256. Bingo!

So, now all I have to do is read x bytes as the data, then another 1 byte
then 3 x 256 bytes and I'm at the end of the file!
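
The split described above can be sketched like this, in Python rather than the Visual Basic of the real extractor. It assumes a texture entry ends with one marker byte followed by 256 RGB triples, and that everything before the marker is image data. The name split_texture_record is hypothetical and the sample record is synthetic.

```python
PALETTE_BYTES = 3 * 256  # 256 entries of R, G, B


def split_texture_record(record):
    """Split a texture record into (image_data, marker_byte, palette)."""
    if len(record) < PALETTE_BYTES + 1:
        raise ValueError("record too short to hold a palette")
    image_data = record[:-(PALETTE_BYTES + 1)]
    marker = record[-(PALETTE_BYTES + 1)]  # 12 in a standard PCX, D9 in these files
    raw = record[-PALETTE_BYTES:]
    palette = [tuple(raw[i:i + 3]) for i in range(0, PALETTE_BYTES, 3)]
    return image_data, marker, palette


# Synthetic record: 1157 data bytes, the D9 marker, then a dummy 768-byte palette.
fake = bytes(1157) + b"\xD9" + bytes(range(256)) * 3
image_data, marker, palette = split_texture_record(fake)
print(len(image_data), hex(marker), len(palette))
```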

The PCX reading code isn't perfect though, since on some files it munges
the headers. I still need to work out exactly how to calculate the
numberOfBytesPerScanLine. I'll investigate, but it's a big step forward :)

More later!

Stuart


Friday, January 11th, 2002
3:20 pm - Tachyon: The Fringe mod for Homeworld notes.
Notes - by Stuart Stanfield (aka Delphy)
Written January 11th 2002

These notes follow the progress of how I am extracting
the PAK files from the game Tachyon:The Fringe for the
Tachyon Homeworld mod team. Thanks to Skeeter for
sending me the file.

If it's convoluted, then apologies - I type it as I am
going along. If you want to see final results just
scroll down :)

3DPK indicates beginning of PAK file
3DO1 seems to indicate beginning of object
These seem to have a number of texture entries as PCX files
The string "3DO1" occurs 20 times within the file
The string "PCX" occurs 100 times within the file


       objName   noOfTextures          Offset   Length
3DO1 - PHOXH01 - 20 PCX references -     1056 -  78272
       PHOXH02 -  1 PCX reference  -    79328 -    376
       PHOXH03 -  1 PCX reference  -    79704 -    376
       PHOXH04 -  1 PCX reference  -    80080 -    376
       PHOXH05 -  1 PCX reference  -    80456 -    376
       PHOXH06 -  1 PCX reference  -    80832 -    376
       PHOXH07 -  2 PCX references -    81208 -   4052
       PHOXM01 - 12 PCX references -    85260 -  51024
       PHOXM02 -  1 PCX reference  -   136284 -    376
       PHOXM03 -  1 PCX reference  -   136660 -    376
       PHOXM04 -  1 PCX reference  -   137036 -    376
       PHOXM05 -  1 PCX reference  -   137412 -    376
       PHOXM06 -  1 PCX reference  -   137788 -    376
       PHOXL01 - 12 PCX references -   138164 -  31100
       PHOXL02 -  1 PCX reference  -   169264 -    376
       PHOXL03 -  1 PCX reference  -   169640 -    376
       PHOXL04 -  1 PCX reference  -   170016 -    376
       PHOXL05 -  1 PCX reference  -   170392 -    376
       PHOXL06 -  1 PCX reference  -   170768 -    376
       PHOX_T  -  4 PCX references -   171144 -    ???
       (PHOX_T seems to be last object in file)
       Total   - 65 PCX references


Length is calculated simply by taking the offset of the
next record (ie for PHOXH01 we need the start of PHOXH02 which
is 79,328 bytes into the file), and taking away the start of
this record (ie PHOXH01 starts at 1,056).

This leaves the total size of the record (78,272 bytes)
including headers. For this reason we cannot yet calculate the
size of the last record (PHOX_T), as we do not have an end boundary for it.
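
The subtraction can be done for the whole table in one go. This is a Python sketch (the real tool was in Visual Basic) with the start offsets copied from the table above. The helper name record_lengths is illustrative, and the last object gets no length because nothing follows it.

```python
# (name, start offset) pairs copied from the table above.
objects = [
    ("PHOXH01", 1056), ("PHOXH02", 79328), ("PHOXH03", 79704),
    ("PHOXH04", 80080), ("PHOXH05", 80456), ("PHOXH06", 80832),
    ("PHOXH07", 81208), ("PHOXM01", 85260), ("PHOXM02", 136284),
    ("PHOXM03", 136660), ("PHOXM04", 137036), ("PHOXM05", 137412),
    ("PHOXM06", 137788), ("PHOXL01", 138164), ("PHOXL02", 169264),
    ("PHOXL03", 169640), ("PHOXL04", 170016), ("PHOXL05", 170392),
    ("PHOXL06", 170768), ("PHOX_T", 171144),
]


def record_lengths(entries):
    """Length of each record = start of the next record minus start of this one."""
    return {name: next_start - start
            for (name, start), (_next_name, next_start) in zip(entries, entries[1:])}


lengths = record_lengths(objects)
for name, _start in objects:
    print(name, lengths.get(name, "???"))
```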

The number of PCX references was calculated by searching
manually for all occurrences of "PCX" within the current
object (ie if we are searching PHOXH01 for PCX then if we
start going into PHOXH02 then we know we have the total
number for PHOXH01). However, all texture references seem
to be at the top of the object structure.

There are also more references to PCX files further down in
the file after the last object - these all are prefixed with 1H.

Next we convert both the length and number of PCX references
into hex and look manually within each object to see if we
can find direct matches. To start with PHOXH01 we are looking
for 0x14h (dec 20) as the number of PCX entries, and 0x131C0h
(dec 78272) as the length of this object. Note that the
header may or may not be included, so normally I look for
values within 4 of this figure.

In this particular case when I looked within the record
structure manually for these values I could not find them,
however I found a value close to this in the main header
of 0x13bc0h. This equates to decimal 80,832 - which
happens to exactly match the start of object PHOXH06!

This led me to convert the offset of the first record
PHOXH01 - value of 1056 which in hex is 0x420h. Converting
the endian of this using a function I wrote a long time
ago changes this to 2004. Searching for this *new*
value I matched in the header part! (note I have another
function to convert 2004 into 0420, and thus 1056)

Knowing that at least some of the objects have start offsets
within this header, I convert the offset of the second one
to hex, and got E03501. This matched straight after the
first one. I'd found the header and I now knew exactly
where each object started, but now I needed to try and find
if the number of textures was in this header too.
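
The byte swapping and searching can be sketched in a few lines of Python (my own functions were in Visual Basic and are not shown here). It assumes the offsets are stored as little-endian 32-bit values; le_hex and find_offset_references are illustrative names, and the fake data simply plants two values at the positions mentioned in these notes.

```python
import struct


def le_hex(value):
    """Hex string of a 32-bit value in the byte order it has in the file."""
    return struct.pack("<I", value).hex().upper()


def find_offset_references(data, value):
    """Return every position where `value` appears as a little-endian 32-bit integer."""
    needle = struct.pack("<I", value)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


print(le_hex(1056), le_hex(79328), le_hex(171144))

# Synthetic data: 1056 planted at byte 140 and 79328 at byte 184.
fake = bytes(140) + struct.pack("<I", 1056) + bytes(40) + struct.pack("<I", 79328)
print(find_offset_references(fake, 1056), find_offset_references(fake, 79328))
```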

Starting with the first object which has 20 PCX references,
(since it's a lot easier in a file to look for 0x14 than it is
0x01!), I looked at where I had found the start offset.
Unfortunately, I didn't find anything even resembling this
value - just a lot of blank space and a curious "40 2C" combo
which seemed to be in most records.

So, I turned my attention to the actual record itself. I found
a reference to 0x15h, which is 21, so keeping that in mind
I skipped to looking at PHOXM01 (as this has 12 PCX refs).

I found the reference to 12 PCX refs (0xCh) 44 bytes after
the start of the object in this one. This was verified
by looking at PHOXM02 (which has 1 pcx) and PHOXL01 (which has
12 pcx).

I went back to PHOXH01 and found in the same place the reference
to 0x15h (21 decimal) - this could only mean I must have
miscalculated the number of textures in this object. Since I
could not find more than 20 references to PCX in this object
using ASCII find, I figured that it might be a reference to
a "blank" texture, or perhaps one that doesn't show up.

Okay so now I had both the start of each object, and the number
of textures within, I needed to work out the actual structure
of the file. Up till now I had been using a pure hex editor
and simple searching, along with noting down numbers, converting
them to hex and so on.

Starting again with PHOXH01, I know that the pointer to the
start of this object is located at offset 0x8C (140 decimal)
in the file. Looking at PHOXH02, this is at offset 0xB8 (184
decimal). Subtracting 140 from 184 gives us a record size
for this header portion of 44 bytes. This means that for
each object in the file we have to read at least 44 bytes.

All we needed now were the record boundaries of this header
section. To try and guess these we look at the first and last
records - PHOXH01 and PHOX_T. By comparing the positions of
where the start offset is within this file, we can roughly
work out *where* in the record structure the start offset
is located, and thus where to slide the 44 bytes to get
the true record structure.

So, we need to look for 889C02 in the file as this is the
start offset of PHOX_T (converted to hex). This is near
the end of the header, exactly 44 bytes before the start
of PHOXH01 proper (the object itself, not the header).

Thus, since the offset is 44 bytes before the start of the
actual object records, we know that within the header record
the offset is the first number.

Most programs store values in files on word boundaries - that
is every 4 bytes. In Visual Basic this is known as a "Long",
however a quick visual check can reveal any integers (2 byte
numbers), simply by looking down the mid columns and seeing
if any values fall there. In this case, they did, so I knew
the header structure consisted of at least 1 long and
at least 1 integer.

I wrote a quick and dirty record structure in Visual Basic
to allow for this, and outputted the information.


Public Type pakHeader
startOffset As Long
unknown1 As Long
unknown2 As Long
unknown3 As Long
unknown4 As Long
unknown5 As Long
unknown6 As Long
unknown7 As Long
unknown8 As Long
unknown9 As Long
unknown10 As Long
End Type


After trying this record structure out, I noticed that every
time the object type changes - ie from PHOXH?? to PHOXL?? -
an extra 12 bytes are added to the record. I coped with this
by checking for the value "65536". If the offset that we
attempt to get equals this, then seek 12 bytes forward and
continue from there, but flag each name with a * to indicate
this fact. This let me work out the object type changes - which
means I can further refine the record structure and work out
what goes where.

I made a ListView form and had everything arranged in columns.
This easily allowed me to see what each value was and where.
A screen shot is attached.



You can see from this screenshot the marking of the boundaries,
and the various unknown values.

What I now wanted to do was to try and figure out that 12 byte
discrepancy. I guessed that instead of listing all 20 objects
in one list it was broken up into 3 possibly 4 lists (if you
include PHOX_T as a single object on its own).

Thus we have:

List 1: PHOXH01 - PHOXH07 (7 entries)
List 2: PHOXM01 - PHOXM06 (6 entries)
List 3: PHOXL01 - PHOXL06 (6 entries)
List 4: PHOX_T (1 entry)

This makes a total of 20 objects which matches my estimate
earlier on. Next I had to find some reference to these
values in the header structure.

If we look at the 12 bytes I was previously skipping you'll
notice the format:

00 00 01 00 07 00 00 00 0C 00 00 00

See the 07 in there? That's in its own 4 byte block, which
means it's a long variable. This matches the number of entries
in the first list.

Checking the beginning of the second list we turn up exactly
the same numbers, except with 06 instead of 07. That means we've
found the start of each list, and how many items are in each list.
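
To double-check the reading of those 12 bytes, here they are decoded as three little-endian Longs. This is a Python sketch rather than the Visual Basic I was using, and the variable names are only borrowed for illustration.

```python
import struct

# The 12 bytes quoted above, decoded as three little-endian 32-bit values.
raw = bytes.fromhex("00 00 01 00 07 00 00 00 0C 00 00 00")
unknown_start, no_of_entries, unknown_end = struct.unpack("<3i", raw)
print(unknown_start, no_of_entries, unknown_end)
```

The three values are the 65536 marker, the entry count for the first list, and the trailing 0x0C.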

We now need a second record structure:


Public Type objectList
unknownStart As Long ' I don't know what this is - defaults to 65536
noOfEntries As Long ' Number of objects in this list
unknownEnd As Long ' Defaults to 0x0C (12 decimal)
End Type


Adding this into the program, we get exactly the same display as
before, except now with the correct list structure.

The start of this list structure was at offset 128 in the file, which
still left 100 or so bytes unknown right at the top. My next task
was to try and determine what these were. I was assuming 4 objectLists
but I needed to turn this into a dynamic structure that could handle
any number of objectLists.

At the top of the file, at offset 44 was the number 4, by itself.
Although this matched the number of lists I could not be sure if
it indeed was the correct value - to do that I would have to
look at another PAK file with a different number of objectLists.
Indeed, this still left 80 bytes between that number and the start
of the objectLists, and 28 bytes before it unknown.

The only way to try and work out what these numbers were was
to do a mass convert into hex and try and make sense of them by
comparing them to things like file size and record length and so
on.

So, insert a function to grab x number of longs from the file,
and output them to a text box. This was a raw list and I then
had to look at the data and try and make sense of it.

Basically what I was looking for were any numbers that fell
in the range of 1 to the total size of the file. In most file
formats, there are offsets that point to other lists at the
bottom of the file, so this was what I was looking for.
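
A sketch of that filtering step, in Python rather than the Visual Basic text-box dump I actually used. It assumes little-endian signed Longs; read_longs and offset_candidates are illustrative names. The sample values are the first five unknown header numbers discussed in this entry, with 318389 as the file size.

```python
import struct


def read_longs(data, start, count):
    """Read `count` little-endian 32-bit signed values starting at `start`."""
    return list(struct.unpack_from("<%di" % count, data, start))


def offset_candidates(values, file_size):
    """Keep only values that could be offsets into a file of `file_size` bytes."""
    return [v for v in values if 1 <= v <= file_size]


# The five unknown header numbers quoted in these notes, packed and read back.
header_values = [1084135, 0, 96, 200356, 318389]
blob = struct.pack("<5i", *header_values)
candidates = offset_candidates(read_longs(blob, 0, 5), 318389)
print(candidates)
```

A value that passes this check is only a candidate; it still has to be confirmed by looking at what sits at that position in the file.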

The 4th number in this list fit this category, being 0x030EA4h
(which is 200356 decimal). Scrolling down to this byte in the
file I find what looks like another record structure, except
here in the file are the missing PCX entries (sharp eyed readers
might remember I said that there were 100 occurrences, and
wondered why there were only 65 PCX references in the object list).
However, although I now knew this was the start of a new list I
did not know how long it was.

As it happened, the 5th number in the unknown header was 0x04DBB5h,
which in decimal is 318389. This is exactly the same number as the
total file size. So, I could probably assume that it was either
the end of the list, or the total file size.

Incidentally, if you are wondering what the 1st, 2nd and 3rd
numbers are, the first was 1084135 - way out of the offset
range for this file - the second was 0 and the third 96.

This third value of 96 intrigued me, as it was almost but
not quite the total size of the header up until the start of the
objectLists. I left it for the time being.

Another list offset value came out of the unknown ones too -
173852 decimal. This seemed to point to a new array that
definitely wasn't part of the previous set of data in the
file. Subtracting this value from the offset of the next
list - 200356 - gave a total length of 26,504 (0x6788h).

I tried a search for this length but didn't turn up anything
- which was not really surprising since I hadn't found any
more "lengths". However, I would keep it in mind for
later working out of record structures.

Back to the header.... the 96 did have a meaning - it's the
size of all headers *except* for anything involving objectLists.
You see, there were also the following 8 values:

98: 65536
102: 128
106: 131072
110: 448
114: 32768
118: 724
122: 196608
126: 1000

The values 128, 448, 724 and 1000 correspond exactly with the
start of each objectList record that I discussed earlier. I am
not sure of the relevance of the other values, but next I
decided to try integers instead of longs on some of the numbers.
Here is what I got:

100 - 128
200 - 448
500 - 724
3000 - 1000

I wasn't sure of the significance of these new numbers, so again
they are committed to "try and work it out sometime later".

By this stage I pretty much had the entire header worked out -
except for one value of simply "3" that was in there. I
wasn't sure what this was - it could indicate the number
of different types of list - ie vertex, objects and textures,
but then again maybe not. The reason I said it was number of
lists is that you have the values I mentioned above that
point halfway through the file, but there is a break where
possibly another value could go. I left it alone until such
time as I could compare with another file.

The next task was to work out the 3DO1 objects themselves.
The start of these was the following:

4 character string - always "3DO1"
2 x int - first always 257, second 0
8 character string - object name ie PHOXH01

After that it was pretty unknown. All I knew was that there
was the number of textures and the actual texture list 92
bytes from the start. I decided to switch tacks and look more
at the actual texture data than the file spec.

I wrote a quick routine which listed all textures used by each
object, and I found that textures are used across multiple objects
as follows:


PHOXH01
--------
No of textures: 21
OPODTOPH.PCX
OPODSDEH.PCX
OFNTSDEH.PCX
ONSECNEH.PCX
OTOPMIDH.PCX
OBLUMTLH.PCX
OBTMPANH.PCX
OTPAN02H.PCX
OTAITOPH.PCX
OBIGWNGH.PCX
OTEKPIPH.PCX
OENGSQUH.PCX
OSMLWNGH.PCX
OTPAN01H.PCX
OENGHOLH.PCX
OMIDENGH.PCX
DCLAWG01.PCX
DCLAWE01.PCX
OCPITH.PCX
OCANOPYH.PCX
PHOXH02
--------
No of textures: 1
DCLAWG01.PCX
PHOXH03
--------
No of textures: 1
DCLAWG01.PCX
PHOXH04
--------
No of textures: 1
DCLAWG01.PCX
PHOXH05
--------
No of textures: 1
DCLAWG01.PCX
PHOXH06
--------
No of textures: 1
DCLAWG01.PCX
PHOXH07
--------
No of textures: 2
OPHELMH.PCX
OPFACEH.PCX
PHOXM01
--------
No of textures: 12
OPODTOPM.PCX
OPODSDEM.PCX
OFNTSDEM.PCX
OBTMSHOT.PCX
OTOPSHOT.PCX
OTAITOPM.PCX
OBIGWNGM.PCX
OTEKPIPM.PCX
OSMLWNGM.PCX
DCLAWG01.PCX
DCLAWE01.PCX
ONSECNEM.PCX
PHOXM02
--------
No of textures: 1
DCLAWG01.PCX
PHOXM03
--------
No of textures: 1
DCLAWG01.PCX
PHOXM04
--------
No of textures: 1
DCLAWG01.PCX
PHOXM05
--------
No of textures: 1
DCLAWG01.PCX
PHOXM06
--------
No of textures: 1
DCLAWG01.PCX
PHOXL01
--------
No of textures: 12
OPODTOPM.PCX
OPODSDEM.PCX
OFNTSDEM.PCX
OBTMSHOT.PCX
OTOPSHOT.PCX
OTAITOPM.PCX
OBIGWNGM.PCX
OSMLWNGM.PCX
OTEKPIPM.PCX
DCLAWG01.PCX
DCLAWE01.PCX
ONSECNEM.PCX
PHOXL02
--------
No of textures: 1
DCLAWG01.PCX
PHOXL03
--------
No of textures: 1
DCLAWG01.PCX
PHOXL04
--------
No of textures: 1
DCLAWG01.PCX
PHOXL05
--------
No of textures: 1
DCLAWG01.PCX
PHOXL06
--------
No of textures: 1
DCLAWG01.PCX
PHOX_T
--------
No of textures: 4
DCLAWE01.PCX
OPTOPT.PCX
OPFRNTT.PCX
OPSIDET.PCX


As you can see a lot of them are repeated. I loaded them all into an
Excel spreadsheet to sort out the duplicates. This left me with only
35 actual PCX files listed. (35 + 65 = 100 references to PCX, so
this was verified)

Next I needed to know where the PCX information was held, so I looked
back at the values of arrays I had gotten half way down the file, and
found what I thought was the start of the PCX images at offset 200,356.

The first value I saw was the hex code 0x23, which in decimal is 35 -
matching the total number of PCX files!

When I looked at the PCX raw information in the PAK file the first
thing was to find the header information - size of image, colour
depth and so on. The first 4 bytes after the PCX filename were
either 1 long or 2 ints, being 0x00010800h. This could equate to
either 524,544 (too big for the filesize), or 256 and 8. This
is more like the kind of information we would see in an image
file - however since I did not know how many colours the game
itself is actually played in I left this for now.
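
The two readings of those 4 bytes are easy to reproduce. This is a Python sketch (not the Visual Basic I was working in), assuming little-endian storage as elsewhere in the file.

```python
import struct

raw = bytes.fromhex("00010800")

as_long = struct.unpack("<I", raw)[0]   # one 4-byte value
as_ints = struct.unpack("<HH", raw)     # two 2-byte values
as_bytes = tuple(raw)                   # four single bytes

print(as_long, as_ints, as_bytes)
```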

The next 8 bytes are 2 sets of 4 - 0x40h (64 decimal). 64x64
sprung to mind immediately. However in the PCX format, the
image size is from 0 upwards (meaning a 64 pixel wide image
would be 63 decimal). Also, in the PCX format, the size is
stored as xmin, ymin, xmax and ymax - to calculate the
true size we do (xmax - xmin) + 1, and (ymax - ymin) + 1.
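
As a quick check of that arithmetic, a tiny Python sketch; both helper names are made up for illustration.

```python
def pcx_dimensions(xmin, ymin, xmax, ymax):
    """True image size from an inclusive PCX window."""
    return (xmax - xmin) + 1, (ymax - ymin) + 1


def window_for_size(width, height):
    """Inclusive window, starting at (0, 0), for an image of the given size."""
    return 0, 0, width - 1, height - 1


window = window_for_size(64, 64)
print(window, pcx_dimensions(*window))
```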

One thing bothered me though - the xmax etc are stored as
integers (2 bytes), whereas the values pulled from the PAK
files were longs (4 bytes). Something didn't fit in with
the PCX file format.

I figured that the PAK file maybe didn't have all of the 128
byte PCX header - so I took an existing PCX file (from Civ3,
if you are curious :) and modified simply the xmax, ymax and
what looked like the colour depth.

What I got was an image that actually loaded! This is OPODTOPH.PCX
and I've converted it to gif for your viewing pleasure.



One thing to note was that Kodak Image Preview shows bands of
horizontal grey pixels interspersed with yellow on the right
of the image, whereas PSP (which I converted this picture with)
doesn't show them at all.

I used a default resolution of 100 dots per inch, and didn't change
any other aspects of the header. I would look at those later.

What I needed now was a program that would extract the full raw PCX
data from the PAK file, but in order to do *that* I would need
to know the size of the PCX record.

The final size of the OPODTOPH.pcx file was 4638 bytes. Knocking
off the 128 byte header this left 4510 bytes for actual image
data. This didn't really tie in with the size of 64x64 (4510
divided by 64 is 70.46875), so what I figured was that there was
extra data at the end of the PCX record in the PAK file.

To check exactly how big a 64x64x256 image was in bytes when saved,
I created a new image in PSP. This was saved as a version 5 PCX
Paintbrush file and came to 1089 bytes on the disk. Allowing for
the 128 byte header, this left 961 bytes of actual image data -
a lot less than was "extracted" from the PAK file.

Due to limitations in the PCX format you cannot have 16 bit images
but you can have 32 bit ones. So I created a 64x64x32 image. This
came out to only 768 bytes on disk! Obviously some pixel compression
was going on here - or I had miscalculated.

I looked again at the PCX specification:

ZSoft .PCX FILE HEADER FORMAT

Byte  Item            Size  Description/Comments
   3  Bits per pixel     1  Number of bits/pixel per plane
   4  Window             8  Picture Dimensions
                            (Xmin, Ymin) - (Xmax - Ymax)
                            in pixels, inclusive
  12  HRes               2  Horizontal Resolution of creating device
  14  VRes               2  Vertical Resolution of creating device
  16  Colormap          48  Color palette setting, see text
  64  Reserved           1
  65  NPlanes            1  Number of color planes
  66  Bytes per Line     2  Number of bytes per scan line per
                            color plane (always even for .PCX files)
  68  Palette Info       2  How to interpret palette - 1 = color/BW,
                            2 = grayscale
  70  Filler            58  blank to fill out 128 byte header


Bytes 65 and 66 seemed to match the values I pulled out earlier on
- that of 256 and 8. In fact the hex of 00010800 was 2 single
bytes and 1 integer. The first, byte 64, is reserved so this leaves
us with 010800. The first byte is 01, which is the number of
colour planes, and the second and third are the number of bytes per scan
line - in this case (or so I thought) - set to 8. So 8 bytes per
scan line x 1 colour plane. However, when I looked at a normal
64x64x256 image, this value was actually set to 64 bytes per
colour plane per scan line.
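
Rather than borrowing a header from another PCX file, a fresh 128-byte header could be built directly. This is a Python sketch under some assumptions that are not in these notes: the first three bytes (manufacturer 10, version 5, encoding 1) come from the standard PCX spec rather than the excerpt above, the 48-byte colormap is left blank, and bytes per line is taken as the width rounded up to an even number. The name build_pcx_header is hypothetical.

```python
import struct

# manufacturer, version, encoding, bits per pixel, window (4 x 16-bit),
# HRes, VRes, 48-byte colormap, reserved, planes, bytes per line,
# palette info, 58-byte filler. Little-endian, 128 bytes in total.
PCX_HEADER = struct.Struct("<BBBB4HHH48sBBHH58s")


def build_pcx_header(width, height, bits_per_pixel=8, planes=1, dpi=100):
    """Build a 128-byte PCX header for an image of the given size."""
    bytes_per_line = (width * bits_per_pixel + 7) // 8
    bytes_per_line += bytes_per_line % 2  # the spec says this is always even
    return PCX_HEADER.pack(
        10, 5, 1, bits_per_pixel,          # ZSoft, version 5, RLE encoding
        0, 0, width - 1, height - 1,       # inclusive window
        dpi, dpi, bytes(48),               # resolution, blank colormap
        0, planes, bytes_per_line, 1,      # reserved, planes, line size, colour
        bytes(58))                         # filler


pcx_header = build_pcx_header(64, 64)
print(len(pcx_header), struct.unpack_from("<H", pcx_header, 66)[0])
```

For a 64x64 image with 8 bits per pixel this gives 64 bytes per line, the same value seen in the normal 64x64x256 image.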

Manually changing the bytePerLine value in OFNTSDEH.pcx (another, larger
image I extracted), left us with the following image:



Now, does that look more like a texture? I think so.

Using this same method I then went back to OPODTOPH.pcx and modified
the header. I got this:



The final creation and actual extraction of the PCX and polygon object
data would have to come at a later time :)

-Fin, for now.


Wednesday, January 9th, 2002
5:10 pm
AOL 7 ISDN specific revision number: 4114.55


Wednesday, November 21st, 2001
9:53 pm - Update of Relic Sound Tools to v1.5
Various people have pointed out over the past months that my Relic Sound Tools program doesn't
actually extract the sounds from Homeworld correctly. So, after a post on the Relic Editing forum, I decided to update it.


Basically the changes are as follows:


  • Added codecBitrate to the extract.html file that gets generated (bet not many people even noticed that file!). This will help for future reference with regards the HWSS
  • Added support for pure AIF files (as opposed to just AIFR) - thanks to Ghent for pointing this out. :)
  • Made sure that it extracts the correct values for the Homeworld voices
  • Changed AIFR writing routine to write using a buffer, so should be a *lot* faster than before.
  • Added a status bar, for more visual means of seeing the progress of extraction
