mirror of
https://github.com/Xevion/easy7zip.git
synced 2025-12-09 18:07:00 -06:00
Initialer Commit
This commit is contained in:
328
DOC/lzma.txt
Normal file
328
DOC/lzma.txt
Normal file
@@ -0,0 +1,328 @@
|
||||
LZMA compression
|
||||
----------------
|
||||
Version: 9.35
|
||||
|
||||
This file describes LZMA encoding and decoding functions written in C language.
|
||||
|
||||
LZMA is an improved version of famous LZ77 compression algorithm.
|
||||
It was improved in way of maximum increasing of compression ratio,
|
||||
keeping high decompression speed and low memory requirements for
|
||||
decompressing.
|
||||
|
||||
Note: you can read also LZMA Specification (lzma-specification.txt from LZMA SDK)
|
||||
|
||||
Also you can look source code for LZMA encoding and decoding:
|
||||
C/Util/Lzma/LzmaUtil.c
|
||||
|
||||
|
||||
LZMA compressed file format
|
||||
---------------------------
|
||||
Offset Size Description
|
||||
0 1 Special LZMA properties (lc,lp, pb in encoded form)
|
||||
1 4 Dictionary size (little endian)
|
||||
5 8 Uncompressed size (little endian). -1 means unknown size
|
||||
13 Compressed data
|
||||
|
||||
|
||||
|
||||
ANSI-C LZMA Decoder
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
|
||||
If you want to use old interfaces you can download previous version of LZMA SDK
|
||||
from sourceforge.net site.
|
||||
|
||||
To use ANSI-C LZMA Decoder you need the following files:
|
||||
1) LzmaDec.h + LzmaDec.c + Types.h
|
||||
|
||||
Look example code:
|
||||
C/Util/Lzma/LzmaUtil.c
|
||||
|
||||
|
||||
Memory requirements for LZMA decoding
|
||||
-------------------------------------
|
||||
|
||||
Stack usage of LZMA decoding function for local variables is not
|
||||
larger than 200-400 bytes.
|
||||
|
||||
LZMA Decoder uses dictionary buffer and internal state structure.
|
||||
Internal state structure consumes
|
||||
state_size = (4 + (1.5 << (lc + lp))) KB
|
||||
by default (lc=3, lp=0), state_size = 16 KB.
|
||||
|
||||
|
||||
How To decompress data
|
||||
----------------------
|
||||
|
||||
LZMA Decoder (ANSI-C version) now supports 2 interfaces:
|
||||
1) Single-call Decompressing
|
||||
2) Multi-call State Decompressing (zlib-like interface)
|
||||
|
||||
You must use external allocator:
|
||||
Example:
|
||||
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
|
||||
void SzFree(void *p, void *address) { p = p; free(address); }
|
||||
ISzAlloc alloc = { SzAlloc, SzFree };
|
||||
|
||||
You can use p = p; operator to disable compiler warnings.
|
||||
|
||||
|
||||
Single-call Decompressing
|
||||
-------------------------
|
||||
When to use: RAM->RAM decompressing
|
||||
Compile files: LzmaDec.h + LzmaDec.c + Types.h
|
||||
Compile defines: no defines
|
||||
Memory Requirements:
|
||||
- Input buffer: compressed size
|
||||
- Output buffer: uncompressed size
|
||||
- LZMA Internal Structures: state_size (16 KB for default settings)
|
||||
|
||||
Interface:
|
||||
int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
|
||||
const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
|
||||
ELzmaStatus *status, ISzAlloc *alloc);
|
||||
In:
|
||||
dest - output data
|
||||
destLen - output data size
|
||||
src - input data
|
||||
srcLen - input data size
|
||||
propData - LZMA properties (5 bytes)
|
||||
propSize - size of propData buffer (5 bytes)
|
||||
finishMode - It has meaning only if the decoding reaches output limit (*destLen).
|
||||
LZMA_FINISH_ANY - Decode just destLen bytes.
|
||||
LZMA_FINISH_END - Stream must be finished after (*destLen).
|
||||
You can use LZMA_FINISH_END, when you know that
|
||||
current output buffer covers last bytes of stream.
|
||||
alloc - Memory allocator.
|
||||
|
||||
Out:
|
||||
destLen - processed output size
|
||||
srcLen - processed input size
|
||||
|
||||
Output:
|
||||
SZ_OK
|
||||
status:
|
||||
LZMA_STATUS_FINISHED_WITH_MARK
|
||||
LZMA_STATUS_NOT_FINISHED
|
||||
LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
|
||||
SZ_ERROR_DATA - Data error
|
||||
SZ_ERROR_MEM - Memory allocation error
|
||||
SZ_ERROR_UNSUPPORTED - Unsupported properties
|
||||
SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).
|
||||
|
||||
If LZMA decoder sees end_marker before reaching output limit, it returns OK result,
|
||||
and output value of destLen will be less than output buffer size limit.
|
||||
|
||||
You can use multiple checks to test data integrity after full decompression:
|
||||
1) Check Result and "status" variable.
|
||||
2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
|
||||
3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
|
||||
You must use correct finish mode in that case. */
|
||||
|
||||
|
||||
Multi-call State Decompressing (zlib-like interface)
|
||||
----------------------------------------------------
|
||||
|
||||
When to use: file->file decompressing
|
||||
Compile files: LzmaDec.h + LzmaDec.c + Types.h
|
||||
|
||||
Memory Requirements:
|
||||
- Buffer for input stream: any size (for example, 16 KB)
|
||||
- Buffer for output stream: any size (for example, 16 KB)
|
||||
- LZMA Internal Structures: state_size (16 KB for default settings)
|
||||
- LZMA dictionary (dictionary size is encoded in LZMA properties header)
|
||||
|
||||
1) read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
|
||||
unsigned char header[LZMA_PROPS_SIZE + 8];
|
||||
ReadFile(inFile, header, sizeof(header)
|
||||
|
||||
2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties
|
||||
|
||||
CLzmaDec state;
|
||||
LzmaDec_Constr(&state);
|
||||
res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
|
||||
if (res != SZ_OK)
|
||||
return res;
|
||||
|
||||
3) Init LzmaDec structure before any new LZMA stream. And call LzmaDec_DecodeToBuf in loop
|
||||
|
||||
LzmaDec_Init(&state);
|
||||
for (;;)
|
||||
{
|
||||
...
|
||||
int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
|
||||
const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
|
||||
...
|
||||
}
|
||||
|
||||
|
||||
4) Free all allocated structures
|
||||
LzmaDec_Free(&state, &g_Alloc);
|
||||
|
||||
Look example code:
|
||||
C/Util/Lzma/LzmaUtil.c
|
||||
|
||||
|
||||
How To compress data
|
||||
--------------------
|
||||
|
||||
Compile files:
|
||||
Types.h
|
||||
Threads.h
|
||||
LzmaEnc.h
|
||||
LzmaEnc.c
|
||||
LzFind.h
|
||||
LzFind.c
|
||||
LzFindMt.h
|
||||
LzFindMt.c
|
||||
LzHash.h
|
||||
|
||||
Memory Requirements:
|
||||
- (dictSize * 11.5 + 6 MB) + state_size
|
||||
|
||||
Lzma Encoder can use two memory allocators:
|
||||
1) alloc - for small arrays.
|
||||
2) allocBig - for big arrays.
|
||||
|
||||
For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
|
||||
better compression speed. Note that Windows has bad implementation for
|
||||
Large RAM Pages.
|
||||
It's OK to use same allocator for alloc and allocBig.
|
||||
|
||||
|
||||
Single-call Compression with callbacks
|
||||
--------------------------------------
|
||||
|
||||
Look example code:
|
||||
C/Util/Lzma/LzmaUtil.c
|
||||
|
||||
When to use: file->file compressing
|
||||
|
||||
1) you must implement callback structures for interfaces:
|
||||
ISeqInStream
|
||||
ISeqOutStream
|
||||
ICompressProgress
|
||||
ISzAlloc
|
||||
|
||||
static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
|
||||
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
|
||||
static ISzAlloc g_Alloc = { SzAlloc, SzFree };
|
||||
|
||||
CFileSeqInStream inStream;
|
||||
CFileSeqOutStream outStream;
|
||||
|
||||
inStream.funcTable.Read = MyRead;
|
||||
inStream.file = inFile;
|
||||
outStream.funcTable.Write = MyWrite;
|
||||
outStream.file = outFile;
|
||||
|
||||
|
||||
2) Create CLzmaEncHandle object;
|
||||
|
||||
CLzmaEncHandle enc;
|
||||
|
||||
enc = LzmaEnc_Create(&g_Alloc);
|
||||
if (enc == 0)
|
||||
return SZ_ERROR_MEM;
|
||||
|
||||
|
||||
3) initialize CLzmaEncProps properties;
|
||||
|
||||
LzmaEncProps_Init(&props);
|
||||
|
||||
Then you can change some properties in that structure.
|
||||
|
||||
4) Send LZMA properties to LZMA Encoder
|
||||
|
||||
res = LzmaEnc_SetProps(enc, &props);
|
||||
|
||||
5) Write encoded properties to header
|
||||
|
||||
Byte header[LZMA_PROPS_SIZE + 8];
|
||||
size_t headerSize = LZMA_PROPS_SIZE;
|
||||
UInt64 fileSize;
|
||||
int i;
|
||||
|
||||
res = LzmaEnc_WriteProperties(enc, header, &headerSize);
|
||||
fileSize = MyGetFileLength(inFile);
|
||||
for (i = 0; i < 8; i++)
|
||||
header[headerSize++] = (Byte)(fileSize >> (8 * i));
|
||||
MyWriteFileAndCheck(outFile, header, headerSize)
|
||||
|
||||
6) Call encoding function:
|
||||
res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
|
||||
NULL, &g_Alloc, &g_Alloc);
|
||||
|
||||
7) Destroy LZMA Encoder Object
|
||||
LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);
|
||||
|
||||
|
||||
If callback function return some error code, LzmaEnc_Encode also returns that code
|
||||
or it can return the code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
|
||||
|
||||
|
||||
Single-call RAM->RAM Compression
|
||||
--------------------------------
|
||||
|
||||
Single-call RAM->RAM Compression is similar to Compression with callbacks,
|
||||
but you provide pointers to buffers instead of pointers to stream callbacks:
|
||||
|
||||
SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
|
||||
const CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
|
||||
ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);
|
||||
|
||||
Return code:
|
||||
SZ_OK - OK
|
||||
SZ_ERROR_MEM - Memory allocation error
|
||||
SZ_ERROR_PARAM - Incorrect paramater
|
||||
SZ_ERROR_OUTPUT_EOF - output buffer overflow
|
||||
SZ_ERROR_THREAD - errors in multithreading functions (only for Mt version)
|
||||
|
||||
|
||||
|
||||
Defines
|
||||
-------
|
||||
|
||||
_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.
|
||||
|
||||
_LZMA_PROB32 - It can increase the speed on some 32-bit CPUs, but memory usage for
|
||||
some structures will be doubled in that case.
|
||||
|
||||
_LZMA_UINT32_IS_ULONG - Define it if int is 16-bit on your compiler and long is 32-bit.
|
||||
|
||||
_LZMA_NO_SYSTEM_SIZE_T - Define it if you don't want to use size_t type.
|
||||
|
||||
|
||||
_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in AMSI-C .7z decoder.
|
||||
|
||||
|
||||
C++ LZMA Encoder/Decoder
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
C++ LZMA code use COM-like interfaces. So if you want to use it,
|
||||
you can study basics of COM/OLE.
|
||||
C++ LZMA code is just wrapper over ANSI-C code.
|
||||
|
||||
|
||||
C++ Notes
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
|
||||
you must check that you correctly work with "new" operator.
|
||||
7-Zip can be compiled with MSVC 6.0 that doesn't throw "exception" from "new" operator.
|
||||
So 7-Zip uses "CPP\Common\NewHandler.cpp" that redefines "new" operator:
|
||||
operator new(size_t size)
|
||||
{
|
||||
void *p = ::malloc(size);
|
||||
if (p == 0)
|
||||
throw CNewException();
|
||||
return p;
|
||||
}
|
||||
If you use MSCV that throws exception for "new" operator, you can compile without
|
||||
"NewHandler.cpp". So standard exception will be used. Actually some code of
|
||||
7-Zip catches any exception in internal code and converts it to HRESULT code.
|
||||
So you don't need to catch CNewException, if you call COM interfaces of 7-Zip.
|
||||
|
||||
---
|
||||
|
||||
http://www.7-zip.org
|
||||
http://www.7-zip.org/sdk.html
|
||||
http://www.7-zip.org/support.html
|
||||
Reference in New Issue
Block a user