Doc: explain the binary structure of our (new) savegames
This commit is contained in:
		 Patric Stout
					Patric Stout
				
			
				
					committed by
					
						 Patric Stout
						Patric Stout
					
				
			
			
				
	
			
			
			 Patric Stout
						Patric Stout
					
				
			
						parent
						
							1ed2405907
						
					
				
				
					commit
					9643a1b80a
				
			
							
								
								
									
										175
									
								
								docs/savegame_format.md
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										175
									
								
								docs/savegame_format.md
									
									
									
									
									
										Normal file
									
								
							| @@ -0,0 +1,175 @@ | |||||||
|  | # OpenTTD's Savegame Format | ||||||
|  |  | ||||||
|  | Last updated: 2021-06-15 | ||||||
|  |  | ||||||
|  | ## Outer container | ||||||
|  |  | ||||||
|  | Savegames for OpenTTD start with an outer container, to contain the compressed data for the rest of the savegame. | ||||||
|  |  | ||||||
|  | `[0..3]` - The first four bytes indicate what compression is used. | ||||||
|  | In ASCII, these values are possible: | ||||||
|  |  | ||||||
|  | - `OTTD` - Compressed with LZO (deprecated, only really old savegames would use this). | ||||||
|  | - `OTTN` - No compression. | ||||||
|  | - `OTTZ` - Compressed with zlib. | ||||||
|  | - `OTTX` - Compressed with LZMA. | ||||||
|  |  | ||||||
|  | `[4..5]` - The next two bytes indicate which savegame version used. | ||||||
|  |  | ||||||
|  | `[6..7]` - The next two bytes can be ignored, and were only used in really old savegames. | ||||||
|  |  | ||||||
|  | `[8..N]` - Next follows a binary blob which is compressed with the indicated compression algorithm. | ||||||
|  |  | ||||||
|  | The rest of this document talks about this decompressed blob of data. | ||||||
|  |  | ||||||
|  | ## Data types | ||||||
|  |  | ||||||
|  | The savegame is written in Big Endian, so when we talk about a 16-bit unsigned integer (`uint16`), we mean it is stored in Big Endian. | ||||||
|  |  | ||||||
|  | The following types are valid: | ||||||
|  |  | ||||||
|  | - `1` - `int8` / `SLE_FILE_I8` -8-bit signed integer | ||||||
|  | - `2` - `uint8` / `SLE_FILE_U8` - 8-bit unsigned integer | ||||||
|  | - `3` - `int16` / `SLE_FILE_I16` - 16-bit signed integer | ||||||
|  | - `4` - `uint16` / `SLE_FILE_U16` - 16-bit unsigned integer | ||||||
|  | - `5` - `int32` / `SLE_FILE_I32` - 32-bit signed integer | ||||||
|  | - `6` - `uint32` / `SLE_FILE_U32` - 32-bit unsigned integer | ||||||
|  | - `7` - `int64` / `SLE_FILE_I64` - 64-bit signed integer | ||||||
|  | - `8` - `uint64` / `SLE_FILE_U64` - 64-bit unsigned integer | ||||||
|  | - `9` - `StringID` / `SLE_FILE_STRINGID` - a StringID inside the OpenTTD's string table | ||||||
|  | - `10` - `str` / `SLE_FILE_STRING` - a string (prefixed with a length-field) | ||||||
|  | - `11` - `struct` / `SLE_FILE_STRUCT` - a struct | ||||||
|  |  | ||||||
|  | ### Gamma value | ||||||
|  |  | ||||||
|  | There is also a field-type called `gamma`. | ||||||
|  | This is most often used for length-fields, and uses as few bytes as possible to store an integer. | ||||||
|  | For values <= 127, it uses a single byte. | ||||||
|  | For values > 127, it uses two bytes and sets the highest bit to high. | ||||||
|  | For values > 32767, it uses three bytes and sets the two highest bits to high. | ||||||
|  | And this continues till the value fits. | ||||||
|  | In a more visual approach: | ||||||
|  | ``` | ||||||
|  |   0xxxxxxx | ||||||
|  |   10xxxxxx xxxxxxxx | ||||||
|  |   110xxxxx xxxxxxxx xxxxxxxx | ||||||
|  |   1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx | ||||||
|  |   11110--- xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | ## Chunks | ||||||
|  |  | ||||||
|  | Savegames for OpenTTD store their data in chunks. | ||||||
|  | Each chunk contains data for a certain part of the game, for example "Companies", "Vehicles", etc. | ||||||
|  |  | ||||||
|  | `[0..3]` - Each chunk starts with four bytes to indicate the tag. | ||||||
|  | If the tag is `\x00\x00\x00\x00` it means the end of the savegame is reached. | ||||||
|  | An example of a valid tag is `PLYR` when looking at it via ASCII, which contains the information of all the companies. | ||||||
|  |  | ||||||
|  | `[4..4]` - Next follows a byte where the lower 4 bits contain the type. | ||||||
|  | The possible valid types are: | ||||||
|  |  | ||||||
|  | - `0` - `CH_RIFF` - This chunk is a binary blob. | ||||||
|  | - `1` - `CH_ARRAY` - This chunk is a list of items. | ||||||
|  | - `2` - `CH_SPARSE_ARRAY` - This chunk is a list of items. | ||||||
|  | - `3` - `CH_TABLE` - This chunk is self-describing list of items. | ||||||
|  | - `4` - `CH_SPARSE_TABLE` - This chunk is self-describing list of items. | ||||||
|  |  | ||||||
|  | Now per type the format is (slightly) different. | ||||||
|  |  | ||||||
|  | ### CH_RIFF | ||||||
|  |  | ||||||
|  | (since savegame version 295, this chunk type is only used for MAP-chunks, containing bit-information about each tile on the map) | ||||||
|  |  | ||||||
|  | A `CH_RIFF` starts with an `uint24` which together with the upper-bits of the type defines the length of the chunk. | ||||||
|  | In pseudo-code: | ||||||
|  |  | ||||||
|  | ``` | ||||||
|  | type = read uint8 | ||||||
|  | if type == 0 | ||||||
|  |     length = read uint24 | ||||||
|  |     length |= ((type >> 4) << 24) | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | The next `length` bytes are part of the chunk. | ||||||
|  | What those bytes mean depends on the tag of the chunk; further details per chunk can be found in the source-code. | ||||||
|  |  | ||||||
|  | ### CH_ARRAY / CH_SPARSE_ARRAY | ||||||
|  |  | ||||||
|  | (this chunk type is deprecated since savegame version 295 and is no longer in use) | ||||||
|  |  | ||||||
|  | `[0..G1]` - A `CH_ARRAY` / `CH_SPARSE_ARRAY` starts with a `gamma`, indicating the size of the next item plus one. | ||||||
|  | If this size value is zero, it indicates the end of the list. | ||||||
|  | This indicates the full length of the next item minus one. | ||||||
|  | In psuedo-code: | ||||||
|  |  | ||||||
|  | ``` | ||||||
|  | loop | ||||||
|  |     size = read gamma - 1 | ||||||
|  |     if size == -1 | ||||||
|  |         break loop | ||||||
|  |     read <size> bytes | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | `[]` - For `CH_ARRAY` there is an implicit index. | ||||||
|  | The loop starts at zero, and every iteration adds one to the index. | ||||||
|  | For entries in the game that were not allocated, the `size` will be zero. | ||||||
|  |  | ||||||
|  | `[G1+1..G2]` - For `CH_SPARSE_ARRAY` there is an explicit index. | ||||||
|  | The `gamma` following the size indicates the index. | ||||||
|  |  | ||||||
|  | The content of the item is a binary blob, and similar to `CH_RIFF`, it depends on the tag of the chunk what it means. | ||||||
|  | Please check the source-code for further details. | ||||||
|  |  | ||||||
|  | ### CH_TABLE / CH_SPARSE_TABLE | ||||||
|  |  | ||||||
|  | (this chunk type only exists since savegame version 295) | ||||||
|  |  | ||||||
|  | Both `CH_TABLE` and `CH_SPARSE_TABLE` are very similar to `CH_ARRAY` / `CH_SPARSE_ARRAY` respectively. | ||||||
|  | The only change is that the chunk starts with a header. | ||||||
|  | This header describes the chunk in details; with the header you know the meaning of each byte in the binary blob that follows. | ||||||
|  |  | ||||||
|  | `[0..G]` - The header starts with a `gamma` to indicate the size of all the headers in this chunk plus one. | ||||||
|  | If this size value is zero, it means there is no header, which should never be the case. | ||||||
|  |  | ||||||
|  | Next follows a list of `(type, key)` pairs: | ||||||
|  |  | ||||||
|  | - `[0..0]` - Type of the field. | ||||||
|  | - `[1..G]` - `gamma` to indicate length of key. | ||||||
|  | - `[G+1..N]` - Key (in UTF-8) of the field. | ||||||
|  |  | ||||||
|  | If at any point `type` is zero, the list stops (and no `key` follows). | ||||||
|  |  | ||||||
|  | The `type`'s lower 4 bits indicate the data-type (see chapter above). | ||||||
|  | The `type`'s 5th bit (so `0x10`) indicates if the field is a list, and if this field in every record starts with a `gamma` to indicate how many times the `type` is repeated. | ||||||
|  |  | ||||||
|  | If the `type` indicates either a `struct` or `str`, the `0x10` flag is also always set. | ||||||
|  |  | ||||||
|  | As the savegame format allows (list of) structs in structs, if any `struct` type is found, this header will be followed by a header of that struct. | ||||||
|  | This nesting of structs is stored depth-first, so given this table: | ||||||
|  |  | ||||||
|  | ``` | ||||||
|  | type   | key | ||||||
|  | ----------------- | ||||||
|  | uint8  | counter | ||||||
|  | struct | substruct1 | ||||||
|  | struct | substruct2 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | With `substruct1` being like: | ||||||
|  |  | ||||||
|  | ``` | ||||||
|  | type   | key | ||||||
|  | ----------------- | ||||||
|  | uint8  | counter | ||||||
|  | struct | substruct3 | ||||||
|  | ``` | ||||||
|  |  | ||||||
|  | The headers will be, in order: `table`, `substruct1`, `substruct3`, `substruct2`, each ending with a `type` is zero field. | ||||||
|  |  | ||||||
|  | After reading all the fields of all the headers, there is a list of records. | ||||||
|  | To read this, see `CH_ARRAY` / `CH_SPARSE_ARRAY` for details. | ||||||
|  |  | ||||||
|  | As each `type` has a well defined length, you can read the records even without knowing anything about the chunk-tag yourself. | ||||||
|  |  | ||||||
|  | Do remember, that if the `type` had the `0x10` flag active, the field in the record first has a `gamma` to indicate how many times that `type` is repeated. | ||||||
		Reference in New Issue
	
	Block a user